bio-signalp
A wrapper for the signal peptide prediction algorithm SignalP.
Using bio-signalp
requires SignalP to be locally installed and configured correctly. http://www.cbs.dtu.dk/services/SignalP/ has instructions on how it may be downloaded. This gem works best when the signalp executable is available from the command line (i.e. running 'signalp' by itself works at the commandline).
Installation
First you need to setup SignalP itself. bio-signalp
is tested with SignalP versions 3.0 and 4.0.
- Download SignalP and unpack the archive
- Modify the signalp script in the unpacked directory. Specific instructions are provided in the script itself.
- Add the unpacked directory to your path (or alternately, give the path to the signalp executable to the
calculate
method)
Then you need to install this bio-gem
gem install bio-signalp
Usage
Usage as a script:
Usage: signalp.rb my.fasta
my.fasta is the name of the fasta file you want to analyse ($stdin also accepted). Default output is all the sequences with their signal sequences cleaved.
This default output can be changed by using one (only) of -s, -S, -v, -f, -F.
-s, --summary print a tab separated table indicating if the sequence had a signal peptide results (if Signalp 3 is used, HMM and NN predictions are both given, respectively [default: no]
-S, --bigger-summary like -s, except also includes where the cleavage site is predicted [default: no]
-v, --verbose-summary much like -s except more details of the prediction are predicted [default: no]
-f, --filter-in filter in: print those sequences that have a signal peptide [default: no]
-F, --filter-out filter out: print those sequences that don't have a signal peptide [default: no]
-b, --binary-path SIGNALP_PATH path to the signalp binary e.g. /usr/local/bin/signalp-4.0/signalp [default: 'signalp' i.e. whatever is on the PATH]
Usage as a programmatic interface
require 'bio-signalp'
# The Plasmodium falciparum ACP sequence is known to have a signal peptide (one that helps direct it to the apicoplast)
acp_sequence = 'MKILLLCIIFLYYVNAFKNTQKDGVSLQILKKKRSNQVNFLNRKNDYNLIKNKNPSSSLKSTFDDIKKIISKQLSVEEDKIQMNSNFTKDLGADSLDLVELIMALEEKFNVTISDQDALKINTVQDAIDYIEKNNKQ'
# Run SignalP. The version is automatically detected
result = Bio::SignalP::Wrapper.new.calculate(acp_sequence) #=> Either a Bio::SignalP::Version3::Result or a Bio::SignalP::Version4::Result object
result.signal? #=> true. ACP has a predicted signal peptide.
result.cleavage_site #=> 17. The Ymax output from SignalP gives the predicted cleavage site.
result.cleave(acp_sequence) #=> 'FKNTQKDGVSLQILKKKRSNQVNFLNRKNDYNLIKNKNPSSSLKSTFDDIKKIISKQLSVEEDKIQMNSNFTKDLGADSLDLVELIMALEEKFNVTISDQDALKINTVQDAIDYIEKNNKQ'. The acp_sequence after signal peptide cleavage.
Copyright
Copyright (c) 2011-2012 Ben J Woodcroft. See LICENSE.txt for further details.