bio-img_metadata
Reads metadata from Integrated Microbial Genomes (IMG) metadata files. These files are generated by searching IMG for one or more taxa and then exporting some or all of the genome-specific fields, e.g. kingdom, genus, temperature range and taxon identifier.
Installation
gem install bio-img_metadata
Usage
require 'bio-img_metadata'
d = Bio::IMG::Metadata.read(File.join DATA_DIR, 'head.metadata.csv') #=> an Array of Bio::IMG::Metadata objects
d.length.should == 9 #=> The array has 9 members, one for each line in the metadata file
d[0].kind_of?(Bio::IMG::Lineage).should == true #=> each element is a lineage object
d[0].domain.should == 'Archaea' #=> some attributes are now methods (mostly the taxonomy-related ones)
d[1].taxon_id.should == 2515075008
d[0].attributes['Status'].should == 'Finished' #=> the rest are in the attributes hash, keyed by column name
How to get the metadata file
Go to IMG > Genome Browser: http://img.jgi.doe.gov/cgi-bin/w/main.cgi?section=TaxonList&page=taxonListAlpha
In the Table Configuration section:
- Genome Field > Click All
- Project Metadata > Click All
- Data Statistics > Click All
Click Display Genomes Again. In the Genome Browser section > Click Select All. Finally, click the Export button.
PS/ Don't trust the IMG metadata too much. It contains some big mistakes, e.g. in the 16S copy number.
PS2/ What have I done to create the FIXED metadata?
- Replaced two occurrences of "\r" (^M) with ""
- taxon_oid 2515154013 has two extra fields: removed the two cells containing "Human wound, cranian"
- Replaced cells containing "-1" with ""
- Replaced 'Marine archaeal group 1 BG20 (Nitrosoarchaeum limnia BG20)' by 'Nitrosoarchaeum limnia BG20'
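The mechanical parts of the clean-up above could be scripted along these lines. This is only a sketch, not part of bio-img_metadata: the method names `clean_img_metadata` and `malformed_rows` are hypothetical, the export is assumed to be tab-separated (change `SEPARATOR` if yours is not), and it removes every "\r" rather than just two. Rows with extra fields, such as the one mentioned above, are only reported and still need fixing by hand.

```ruby
# Hypothetical helpers, not part of bio-img_metadata.
# Assumption: the IMG export is tab-separated.
SEPARATOR = "\t"

# Names to rewrite, taken from the fix list above.
RENAMES = {
  'Marine archaeal group 1 BG20 (Nitrosoarchaeum limnia BG20)' =>
    'Nitrosoarchaeum limnia BG20'
}.freeze

# Returns a cleaned copy of the metadata text:
# carriage returns removed, "-1" cells blanked, known names rewritten.
def clean_img_metadata(text, separator: SEPARATOR)
  text.delete("\r").each_line.map do |line|
    cells = line.chomp.split(separator, -1).map do |cell|
      next '' if cell == '-1'
      RENAMES.fetch(cell, cell)
    end
    cells.join(separator) + "\n"
  end.join
end

# Returns the 1-based line numbers of rows whose field count
# differs from that of the header line.
def malformed_rows(text, separator: SEPARATOR)
  lines = text.lines.map(&:chomp)
  return [] if lines.empty?
  expected = lines.first.split(separator, -1).length
  lines.each_index
       .reject { |i| lines[i].split(separator, -1).length == expected }
       .map { |i| i + 1 }
end

# Example use (file names are placeholders):
#   fixed = clean_img_metadata(File.read('raw.metadata.csv'))
#   warn "check lines: #{malformed_rows(fixed).inspect}"
#   File.write('fixed.metadata.csv', fixed)
```

Reporting malformed rows instead of trimming them automatically is deliberate: which cells are surplus depends on the row, so that decision is left to the reader.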
(Download instructions kindly contributed by @fangly / Florent Angly)
Project home page
For information on the source tree, documentation, examples, issues and how to contribute, see
http://github.com/wwood/bioruby-img_metadata
The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
Cite
If you use this software, please cite one of
- BioRuby: bioinformatics software for the Ruby programming language
- Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics
Biogems.info
This Biogem is published at http://biogems.info/index.html#bio-img_metadata
Copyright
Copyright (c) 2013 Ben J. Woodcroft. See LICENSE.txt for further details.