No commit activity in last 3 years
No release in over 3 years
Methods to classify and manipulate PfEMP1 DBL-alpha sequence tags
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
 Dependencies

Development

>= 1.0.22
>= 1.6.4
>= 0.9.2.2
>= 2.8.0

Runtime

>= 1.4.3
 Project Readme

<img src=“https://secure.travis-ci.org/georgeG/bioruby-dbla-classifier.png?branch=master” alt=“Build Status” />

bio-dbla-classifier¶ ↑

DBL-alpha tags are small regions of the PfEMP1 protein that can be PCR amplified and are classified into six expression groups depending on the number of cysteines and presence of certain motifs within the tag region (Bull et al 2007). This plugin extends bioruby’s amino acid class, Bio::Sequence::AA by adding methods to analyze DBL-alpha sequences tags. If you use this plugin please quote, Bull et al “An approach to classifying sequence tags sampled from Plasmodium falciparum var genes..” Molecular and Biochemical Parasitology 154 (1) (July): 98–102. doi:10.1016/j.molbiopara.2007.03.011.

Installation¶ ↑

Ruby must be installed on your system. See rubylang.info/ for information on Ruby and how to install it on your system. Once Ruby 1.9.2 is installed type the following command in the terminal to install the gem. This will install the bioruby gem if it is not already installed on your system. The plugin has been tested on Ruby 1.9.2-p290.

gem install bio-dbla-classifier

Uninstall¶ ↑

gem uninstall bio-dbla-classifier

Usage¶ ↑

#create an instance of Bioruby's Bio::Sequence::AA class with methods to classify and describe DBL-alpha tags.

require 'bio-dbla-classifier'

seq ='DIGDIVRGRDMFKSNPEVEKGLKAVFRKINNGLTPQAKTHYADEDGSGNYVKLREDWWKANRDQVWKAITCKAPQSVHYFIKTSHGTRGFTSHGKCGRNETNVPTNLDYVPQYLR'
dbl_seq = Bio::Sequence::AA.new(seq)

#get the positions of limited variability
puts dbl_seq.polv1 #=> MFKS
puts dbl_seq.polv2 #=> LRED
puts dbl_seq.polv3 #=> KAIT
puts dbl_seq.polv4 #=> PTNL

#get the number of cysteines in the tag
puts dbl_seq.cys_count #=> 2

#get the distinct sequence identifier
puts dbl_seq.dsid #=>MFKS-LRED-KAIT-2-PTNL-115

#get the cyspolv group for this tag
puts dbl_seq.cyspolv_group #=> 1

#get the block sharing group for this tag
puts dbl_seq.bs_group #=> 1

#get the length of the tag
puts dbl_seq.size #=> 115

#determine whether the tag is a var1
dbl_seq.is_var1? #=> false

#is this tag a groupA like sequence tag?
dbl_seq.is_groupA_like?

Finding the Position Specific Polymorphic Blocks(PSPB)¶ ↑

The pspb methods take 2 arguments, an anchor position and a window length that defines the length of the pspb.The default anchor position is 0 and the default window length is 14

#get pspb1
puts seq.pspb1 #=> NPEVEKGLKAVFRK

#get pspb2
puts seq.pspb2 #=> THYADEDGSGNYVK

#get pspb3
puts seq.pspb3 #=> CKAPQSVHYFIKTS

#get pspb4
puts seq.pspb4 #=> FTSHGKCGRNETNV

#get a pspb as an amino acid
puts seq.pspb4_as_dna

Finding polv2 to polv3 regions¶ ↑

Only the Amino acid regions are supported for now 

#get polv1 to polv2 region
puts seq.polv1_to_polv2

#get polv3 to polv4 region
puts seq.polv3_to_polv4

Processing a flatfile for example fasta, genbank or embl¶ ↑

seq_file = "sequences.fasta"

#for each entry in the file
Bio::FlatFile.open(seq_file).each do |entry|
 tag = Bio::Sequence::AA.new(entry.seq)
 puts "#{entry.definition},#{tag.cyspolv_group},#{tag.dsid},#{tag.bs_group},"
end

Copyright¶ ↑

See LICENSE.txt for further details