No commit activity in last 3 years
No release in over 3 years
There's a lot of open issues
Open local or remote XLSX, XLS, ODS, CSV (comma separated), TSV (tab separated), other delimited, fixed-width files, and Google Docs. Returns an enumerator of Arrays or Hashes, depending on whether there are headers.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
 Dependencies

Development

Runtime

>= 1.5.0
>= 1.1.3
>= 0
>= 1.11
>= 0.0.8
 Project Readme

remote_table

Open Google Docs spreadsheets, local or remote XLSX, XLS, ODS, CSV (comma separated), TSV (tab separated), other delimited, fixed-width files.

Tested on MRI 1.8, MRI 1.9, and JRuby 1.6.7+. Thread-safe.

Sponsor

Faraday logo

We use remote_table for data-driven marketing at Faraday.

Example

>> require 'remote_table'
=> true
>> t = RemoteTable.new 'http://www.fueleconomy.gov/FEG/epadata/98guide6.zip', :filename => '98guide6.csv'
=> #<RemoteTable:0x00000101b87390 @download_count_mutex=#<Mutex:0x00000101b87228>, @extend_bang_mutex=#<Mutex:0x00000101b871d8>, @errata_mutex=#<Mutex:0x00000101b871b0>, @cache=[], @download_count=0, @url="http://www.fueleconomy.gov/FEG/epadata/98guide6.zip", @format=nil, @headers=:first_row, @compression=:zip, @packing=nil, @streaming=false, @warn_on_multiple_downloads=true, @delimiter=",", @sheet=nil, @keep_blank_rows=false, @form_data=nil, @skip=0, @internal_encoding="UTF-8", @row_xpath=nil, @column_xpath=nil, @row_css=nil, @column_css=nil, @glob=nil, @filename="98guide6.csv", @transform_settings=nil, @cut=nil, @crop=nil, @schema=nil, @schema_name=nil, @pre_select=nil, @pre_reject=nil, @errata_settings=nil, @other_options={}, @transformer=#<RemoteTable::Transformer:0x00000101b8c2f0 @t=#<RemoteTable:0x00000101b87390 ...>, @legacy_transformer_mutex=#<Mutex:0x00000101b8c2a0>>, @local_copy=#<RemoteTable::LocalCopy:0x00000101b8bf58 @t=#<RemoteTable:0x00000101b87390 ...>, @encoded_io_mutex=#<Mutex:0x00000101b8be18>, @generate_mutex=#<Mutex:0x00000101b8bdc8>>>
>> t.rows.length
=> 806
>> t.rows.first.length
=> 26
>> require 'pp'
=> true
>> pp t[23]
{"Class"=>"TWO SEATERS",
 "Manufacturer"=>"PORSCHE",
 "carline name"=>"BOXSTER",
 "displ"=>"2.5",
 "cyl"=>"6",
 "trans"=>"Manual(M5)",
 "drv"=>"R",
 "cty"=>"19",
 "hwy"=>"26",
 "cmb"=>"22",
 "ucty"=>"21.2",
 "uhwy"=>"33.9499",
 "ucmb"=>"25.5114",
 "fl"=>"P",
 "G"=>"",
 "T"=>"",
 "S"=>"",
 "2pv"=>"",
 "2lv"=>"",
 "4pv"=>"",
 "4lv"=>"",
 "hpv"=>"",
 "hlv"=>"",
 "fcost"=>"956",
 "eng dscr"=>"",
 "trans dscr"=>""}

Columns and rows

  • If there are headers, you get an Array of Hashes with string keys.
  • If you set :headers => false, then you get an Array of Arrays.

Row keys

Row keys are strings. Row keys are NOT symbolized.

row['foobar'] # correct
row[:foobar]  # incorrect

You can call symbolize_keys yourself, but we don't do it automatically to avoid creating loads of garbage symbols.

Supported formats

Format Notes Library
Delimited (CSV, TSV, etc.) All RemoteTable::Delimited::PASSTHROUGH_CSV_SETTINGS, for example :col_sep, are passed directly to fastercsv. fastercsv (1.8); stdlib (1.9)
Fixed width You have to set up a :schema. fixed_width-multibyte
HTML See XML. nokogiri
ODS roo
XLS roo
XLSX roo
XML The idea is to set up a :row_[xpath|css] and (optionally) a :column_[xpath|css]. nokogiri
JSON Force JSON format using format: :json and define root nodes using root_node: 'data' JSON

Compression and packing

You can directly pick a file out of a remote archive using :filename or use a :glob.

  • zip
  • tar
  • bz2
  • gz
  • exe (treated as zip)

Encoding

Everything is forced into UTF-8. You can improve the quality of the conversion by specifying the original encoding with :encoding

  • ASCII-8BIT and BINARY are equal
  • ISO-8859-1 and Latin1 are equal

More examples

RemoteTable.new('https://spreadsheets.google.com/pub?key=0AoQJbWqPrREqdHRNaVpSUWw2Z2VhN3RUV25yYWdQX2c&output=csv')

# aircraft fuel use equations derived from EMEP/EEA and ICAO
RemoteTable.new('https://docs.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdEhYenF3dGt1T0Y1cTdneUNsNjV0dEE&output=csv')

# distance classes from the WRI business travel tool and UK DEFRA/DECC GHG Conversion Factors for Company Reporting
RemoteTable.new('https://spreadsheets.google.com/pub?key=0AoQJbWqPrREqdFBKM0xWaUhKVkxDRmdBVkE3VklxY2c&hl=en&gid=0&output=csv')

# seat classes used in the WRI GHG Protocol calculation tools
RemoteTable.new('https://docs.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdG9EdmxybG1wdC1iU3JRYXNkMGhvSnc&output=csv')

# pure automobile fuels
RemoteTable.new('https://spreadsheets.google.com/pub?key=0AoQJbWqPrREqdE9xTEdueFM2R0diNTgxUlk1QXFSb2c&gid=0&output=csv')

# blended automobile fuels
RemoteTable.new('https://spreadsheets.google.com/pub?key=0AoQJbWqPrREqdEswNGIxM0U4U0N1UUppdWw2ejJEX0E&gid=0&output=csv')

# A list of hybrid make model years derived from the EPA fuel economy guide
RemoteTable.new('https://docs.google.com/spreadsheet/pub?hl=en_US&hl=en_US&key=0AoQJbWqPrREqdGtzekE4cGNoRGVmdmZMaTNvOWluSnc&output=csv')

# BTS aircraft type lookup table
RemoteTable.new("http://www.transtats.bts.gov/Download_Lookup.asp?Lookup=L_AIRCRAFT_TYPE",
                :errata => { :url => RemoteTable.new('https://spreadsheets.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdEZ2d3JQMzV5T1o1T3JmVlFyNUZxdEE&output=csv' })

# aircraft made by whitelisted manufacturers whose ICAO code starts with 'B' from the FAA
# for definition of `Aircraft::Guru` and `manufacturer_whitelist?` see https://github.com/brighterplanet/earth/blob/master/lib/earth/air/aircraft/data_miner.rb
RemoteTable.new("http://www.faa.gov/air_traffic/publications/atpubs/CNT/5-2-B.htm",
                :encoding => 'windows-1252',
                :row_xpath => '//table/tr[2]/td/table/tr',
                :column_xpath => 'td',
                :errata => { :url => 'https://spreadsheets.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdGVBRnhkRGhSaVptSDJ5bXJGbkpUSWc&output=csv', :responder => Aircraft::Guru.new },
                :select => proc { |record| manufacturer_whitelist? record['Manufacturer'] })

# OpenFlights.org airports database
RemoteTable.new('https://openflights.svn.sourceforge.net/svnroot/openflights/openflights/data/airports.dat',
                :headers => %w{ id name city country_name iata_code icao_code latitude longitude altitude timezone daylight_savings },
                :select => proc { |record| record['iata_code'].present? },
                :errata => { :url => RemoteTable.new('https://spreadsheets.google.com/pub?key=0AoQJbWqPrREqdFc2UzhQYU5PWEQ0N21yWFZGNmc2a3c&gid=0&output=csv', :responder => Airport::Guru.new }) # see https://github.com/brighterplanet/earth/blob/master/lib/earth/air/aircraft/data_miner.rb

# T100 flight segment data for #{month.strftime('%B %Y')}
# for definition of `form_data` and `FlightSegment::Guru` see https://github.com/brighterplanet/earth/blob/master/lib/earth/air/flight_segment/data_miner.rb
RemoteTable.new('http://www.transtats.bts.gov/DownLoad_Table.asp',
                :form_data => form_data,
                :compression => :zip,
                :glob => '/*.csv',
                :errata => { :url => RemoteTable.new('https://spreadsheets.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdGxpYU1qWFR3d0syTVMyQVVOaDd0V3c&output=csv', :responder => FlightSegment::Guru.new },
                :select => proc { |record| record['DEPARTURES_PERFORMED'].to_i > 0 })

# 1995 Fuel Economy Guide
# for definition of `:fuel_economy_guide_b` and `AutomobileMakeModelYearVariant::ParserB` see https://github.com/brighterplanet/earth/blob/master/lib/earth/automobile/automobile_make_model_year_variant/data_miner.rb
RemoteTable.new("http://www.fueleconomy.gov/FEG/epadata/95mfgui.zip",
                :filename => '95MFGUI.DAT',
                :format => :fixed_width,
                :cut => '13-',
                :schema_name => :fuel_economy_guide_b,
                :select => proc { |row| row['model'].present? and (row['suppress_code'].blank? or row['suppress_code'].to_f == 0) and row['state_code'] == 'F' },
                :transform => { :class => AutomobileMakeModelYearVariant::ParserB, :year => 1995 },
                :errata => { :url => "https://docs.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdDkxTElWRVlvUXB3Uy04SDhSYWkzakE&output=csv", :responder => AutomobileMakeModelYearVariant::Guru.new })

# 1998 Fuel Economy Guide
# for definition of `AutomobileMakeModelYearVariant::ParserC` see https://github.com/brighterplanet/earth/blob/master/lib/earth/automobile/automobile_make_model_year_variant/data_miner.rb
RemoteTable.new('http://www.fueleconomy.gov/FEG/epadata/98guide6.zip',
                :filename => '98guide6.csv',
                :transform => { :class => AutomobileMakeModelYearVariant::ParserC, :year => 1998 },
                :errata => { :url => "https://docs.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdDkxTElWRVlvUXB3Uy04SDhSYWkzakE&output=csv", :responder => AutomobileMakeModelYearVariant::Guru.new },
                :select => proc { |row| row['model'].present? })

# annual corporate average fuel economy data for domestic and imported vehicle fleets from the NHTSA
RemoteTable.new('https://spreadsheets.google.com/pub?key=0AoQJbWqPrREqdEdXWXB6dkVLWkowLXhYSFVUT01sS2c&hl=en&gid=0&output=csv',
                :errata => { 'url' => 'http://static.brighterplanet.com/science/data/transport/automobiles/make_fleet_years/errata.csv' },
                :select => proc { |row| row['volume'].to_i > 0 })

# total vehicle miles travelled by gasoline passenger cars from the 2010 EPA GHG Inventory
RemoteTable.new('http://www.epa.gov/climatechange/emissions/downloads10/2010-Inventory-Annex-Tables.zip',
                :filename => 'Annex Tables/Annex 3/Table A-87.csv',
                :skip => 1,
                :select => proc { |row| row['Year'].to_i.to_s == row['Year'] })

# total vehicle miles travelled from the 2010 EPA GHG Inventory
RemoteTable.new('http://www.epa.gov/climatechange/emissions/downloads10/2010-Inventory-Annex-Tables.zip',
                :filename => 'Annex Tables/Annex 3/Table A-87.csv',
                :skip => 1,
                :select => proc { |row| row['Year'].to_i.to_s == row['Year'] })

# total travel distribution from the 2010 EPA GHG Inventory
RemoteTable.new('http://www.epa.gov/climatechange/emissions/downloads10/2010-Inventory-Annex-Tables.zip',
                :filename => 'Annex Tables/Annex 3/Table A-93.csv',
                :skip => 1,
                :select => proc { |row| row['Vehicle Age'].to_i.to_s == row['Vehicle Age'] })

# building characteristics from the 2003 EIA Commercial Building Energy Consumption Survey
RemoteTable.new('http://www.eia.gov/emeu/cbecs/cbecs2003/public_use_2003/data/FILE02.csv',
                :skip => 1,
                :headers => ["PUBID8","REGION8","CENDIV8","SQFT8","SQFTC8","YRCONC8","PBA8","ELUSED8","NGUSED8","FKUSED8","PRUSED8","STUSED8","HWUSED8","ONEACT8","ACT18","ACT28","ACT38","ACT1PCT8","ACT2PCT8","ACT3PCT8","PBAPLUS8","VACANT8","RWSEAT8","PBSEAT8","EDSEAT8","FDSEAT8","HCBED8","NRSBED8","LODGRM8","FACIL8","FEDFAC8","FACACT8","MANIND8","PLANT8","FACDST8","FACDHW8","FACDCW8","FACELC8","BLDPLT8","ADJWT8","STRATUM8","PAIR8"])

# 2003 CBECS C17 - Electricity Consumption and Intensity - New England Division
# for definition of `CbecsEnergyIntensity::NAICS_CODE_SYNTHESIZER` see https://github.com/brighterplanet/earth/blob/master/lib/earth/industry/cbecs_energy_intensity/data_miner.rb
RemoteTable.new("http://www.eia.gov/emeu/cbecs/cbecs2003/detailed_tables_2003/2003set10/2003excel/C17.xls",
                :headers => false,
                :select => proc { |row| CbecsEnergyIntensity::NAICS_CODE_SYNTHESIZER.call(row) },
                :crop => (21..37))

# U.S. Census 2002 NAICS code list
RemoteTable.new('http://www.census.gov/epcd/naics02/naicod02.txt',
                :skip => 4,
                :headers => false,
                :delimiter => '	')

# MECS table 3.2 Total US
RemoteTable.new("http://205.254.135.24/emeu/mecs/mecs2006/excel/Table3_2.xls",
                :crop => (15..94),
                :headers => ["NAICS Code", "Subsector and Industry", "Total", "BLANK", "Net Electricity", "BLANK", "Residual Fuel Oil", "Distillate Fuel Oil", "Natural Gas", "BLANK", "LPG and NGL", "BLANK", "Coal", "Coke and Breeze", "Other"])

# MECS table 6.1 Midwest
RemoteTable.new("http://205.254.135.24/emeu/mecs/mecs2006/excel/Table6_1.xls",
                :crop => (184..263),
                :headers => ["NAICS Code", "Subsector and Industry", "Consumption per Employee", "Consumption per Dollar of Value Added", "Consumption per Dollar of Value of Shipments"])

# U.S. Census Geographic Terms and Definitions
RemoteTable.new('http://www.census.gov/popest/about/geo/state_geocodes_v2009.txt',
                :skip => 6,
                :headers => %w{ Region Division FIPS Name },
                :select => proc { |row| row['Division'].to_i > 0 and row['FIPS'].to_i == 0 })

# state census divisions from the U.S. Census
RemoteTable.new('http://www.census.gov/popest/about/geo/state_geocodes_v2009.txt',
                :skip => 8,
                :headers => ['Region', 'Division', 'State FIPS', 'Name'],
                :select => proc { |row| row['State FIPS'].to_i > 0 })

# OpenGeoCode.org's Country Codes to Country Names list
RemoteTable.new('http://opengeocode.org/download/countrynames.txt',
                :format => :delimited,
                :delimiter => ';',
                :headers => false,
                :skip => 22)

# heating degree day data from WRI CAIT
RemoteTable.new('https://docs.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdDN4MkRTSWtWRjdfazhRdWllTkVSMkE&output=csv',
                :select => Proc.new { |record| record['country'] != 'European Union (27)' },
                :errata => { :url => RemoteTable.new('https://docs.google.com/spreadsheet/pub?key=0AoQJbWqPrREqdDNSMUtCV0h4cUF4UnBKZlNkczlNbFE&output=csv' })

# US average grid loss factor derived eGRID 2007 data
RemoteTable.new('http://www.epa.gov/cleanenergy/documents/egridzips/eGRID2010V1_1_STIE_USGC.xls',
                :sheet => 'USGC',
                :skip => 5)

# eGRID 2010 regions and loss factors
RemoteTable.new('http://www.epa.gov/cleanenergy/documents/egridzips/eGRID2010V1_1_STIE_USGC.xls',
                :sheet => 'STIE07',
                :skip => 4,
                :select => proc { |row| row['eGRID2010 year 2007 file state sequence number'].to_i.between?(1, 51) })

# eGRID 2010 subregions and electricity emission factors
RemoteTable.new('http://www.epa.gov/cleanenergy/documents/egridzips/eGRID2010_Version1-1_xls_only.zip',
                :filename => 'eGRID2010V1_1_year07_AGGREGATION.xls',
                :sheet => 'SRL07',
                :skip => 4,
                :select => proc { |row| row['SEQSRL07'].to_i.between?(1, 26) })

# U.S. Census State ANSI Code file
RemoteTable.new('http://www.census.gov/geo/www/ansi/state.txt',
                :delimiter => '|',
                :select => proc { |record| record['STATE'].to_i < 60 })

# Mapping Hacks zipcode database
RemoteTable.new('http://mappinghacks.com/data/zipcode.zip',
                :filename => 'zipcode.csv')

# zipcode states and eGRID Subregions from the US EPA
RemoteTable.new('http://www.epa.gov/cleanenergy/documents/egridzips/Power_Profiler_Zipcode_Tool_v3-2.xlsx',
                :sheet => 'Zip-subregion')

# horse breeds
RemoteTable.new('http://www.freebase.com/type/exporttypeinstances/base/horses/horse_breed?page=0&filter_mode=type&filter_view=table&show%01p%3D%2Ftype%2Fobject%2Fname%01index=0&show%01p%3D%2Fcommon%2Ftopic%2Fimage%01index=1&show%01p%3D%2Fcommon%2Ftopic%2Farticle%01index=2&sort%01p%3D%2Ftype%2Fobject%2Ftype%01p%3Dlink%01p%3D%2Ftype%2Flink%2Ftimestamp%01index=false&=&exporttype=csv-8')

# Brighter Planet's list of cat and dog breeds, genders, and weights
RemoteTable.new('http://static.brighterplanet.com/science/data/consumables/pets/breed_genders.csv',
                :encoding => 'ISO-8859-1',
                :select => proc { |row| row['gender'].present? })

# residential electricity prices from the EIA
RemoteTable.new('http://www.eia.doe.gov/cneaf/electricity/page/sales_revenue.xls',
                :select => proc { |row| row['Year'].to_s.first(4).to_i > 1989 })

# residential natural gas prices from the EIA
# for definition of `NaturalGasParser` see https://github.com/brighterplanet/earth/blob/master/lib/earth/residence/residence_fuel_price/data_miner.rb
RemoteTable.new('http://tonto.eia.doe.gov/dnav/ng/xls/ng_pri_sum_a_EPG0_FWA_DMcf_a.xls',
                :sheet => 'Data 1',
                :skip => 2,
                :select => proc { |row| row['year'].to_i > 1989 },
                :transform => { :class => NaturalGasParser })

# 2005 EIA Residential Energy Consumption Survey microdata
RemoteTable.new('http://www.eia.doe.gov/emeu/recs/recspubuse05/datafiles/RECS05alldata.csv',
                :headers => :upcase)

# Public albums from the Facebook Engineering Team
RemoteTable.new('https://graph.facebook.com/Engineering/albums', format: :json, root_node: 'data')

# ...and more from the tests...

RemoteTable.new 'http://spreadsheets.google.com/pub?key=t5HM1KbaRngmTUbntg8JwPA&single=true&gid=0'

RemoteTable.new 'http://spreadsheets.google.com/pub?key=t5HM1KbaRngmTUbntg8JwPA'

RemoteTable.new 'http://spreadsheets.google.com/pub?key=t5HM1KbaRngmTUbntg8JwPA', :skip => 1, :headers => false

RemoteTable.new 'http://spreadsheets.google.com/pub?key=tObVAGyqOkCBtGid0tJUZrw'

RemoteTable.new 'http://spreadsheets.google.com/pub?key=tObVAGyqOkCBtGid0tJUZrw', :headers => %w{ col1 col2 col3 }

RemoteTable.new 'http://spreadsheets.google.com/pub?key=tujrgUOwDSLWb-P4KCt1qBg'

RemoteTable.new 'http://tonto.eia.doe.gov/dnav/pet/xls/PET_PRI_RESID_A_EPPR_PTA_CPGAL_M.xls', :transform => { :class => FuelOilParser }

RemoteTable.new 'http://www.freebase.com/type/exporttypeinstances/base/horses/horse_breed?page=0&filter_mode=type&filter_view=table&show%01p%3D%2Ftype%2Fobject%2Fname%01index=0&show%01p%3D%2Fcommon%2Ftopic%2Fimage%01index=1&show%01p%3D%2Fcommon%2Ftopic%2Farticle%01index=2&sort%01p%3D%2Ftype%2Fobject%2Ftype%01p%3Dlink%01p%3D%2Ftype%2Flink%2Ftimestamp%01index=false&=&exporttype=csv-8'

RemoteTable.new 'http://www.fueleconomy.gov/FEG/epadata/02data.zip', :filename => 'guide_jan28.xls'

RemoteTable.new 'http://www.fueleconomy.gov/FEG/epadata/08data.zip', :filename => '2008_FE_guide_ALL_rel_dates_-no sales-for DOE-5-1-08.csv'

RemoteTable.new 'http://www.fueleconomy.gov/FEG/epadata/08data.zip', :glob => '/*.csv'

RemoteTable.new 'http://www.fueleconomy.gov/FEG/epadata/98guide6.zip', :filename => '98guide6.csv'

RemoteTable.new 'http://www.worldmapper.org/data/opendoc/2_worldmapper_data.ods', :sheet => 'Data', :keep_blank_rows => true

RemoteTable.new 'https://spreadsheets.google.com/pub?key=t5HM1KbaRngmTUbntg8JwPA'

RemoteTable.new 'www.customerreferenceprogram.org/uploads/CRP_RFP_template.xlsx'

RemoteTable.new 'www.customerreferenceprogram.org/uploads/CRP_RFP_template.xlsx', :headers => %w{foo bar baz}

RemoteTable.new 'www.customerreferenceprogram.org/uploads/CRP_RFP_template.xlsx', :headers => false

RemoteTable.new 'http://www.transtats.bts.gov/DownLoad_Table.asp?Table_ID=293&Has_Group=3&Is_Zipped=0', :form_data => 'UserTableName=T_100_Segment__All_Carriers&[...]', :compression => :zip, :glob => '/*.csv'

RemoteTable.new "http://www.faa.gov/air_traffic/publications/atpubs/CNT/5-2-E.htm",
                :encoding => 'US-ASCII',
                :row_xpath => '//table/tr[2]/td/table/tr',
                :column_xpath => 'td'

RemoteTable.new "http://www.faa.gov/air_traffic/publications/atpubs/CNT/5-2-G.htm",
                :encoding => 'windows-1252',
                :row_xpath => '//table/tr[2]/td/table/tr',
                :column_xpath => 'td',
                :errata => Errata.new(:url => 'http://spreadsheets.google.com/pub?key=tObVAGyqOkCBtGid0tJUZrw',
                                      :responder => AircraftGuru.new)

RemoteTable.new "http://www.faa.gov/air_traffic/publications/atpubs/CNT/5-2-G.htm",
                :encoding => 'windows-1252',
                :row_xpath => '//table/tr[2]/td/table/tr',
                :column_xpath => 'td',
                :errata => { :url => 'http://spreadsheets.google.com/pub?key=tObVAGyqOkCBtGid0tJUZrw',
                             :responder => AircraftGuru.new }

RemoteTable.new 'http://www.fueleconomy.gov/FEG/epadata/00data.zip',
                :filename => 'Gd6-dsc.txt',
                :format => :fixed_width,
                :crop => 21..26, # inclusive
                :cut => '2-',
                :select => proc { |row| /\A[A-Z]/.match row['code'] },
                :schema => [[ 'code',   2, { :type => :string }  ],
                            [ 'spacer', 2 ],
                            [ 'name',   52, { :type => :string } ]]

RemoteTable.new 'http://cloud.github.com/downloads/seamusabshere/remote_table/test2.fixed_width.txt',
                :format => :fixed_width,
                :skip => 1,
                :schema => [[ 'header4', 10, { :type => :string }  ],
                            [ 'spacer',  1 ],
                            [ 'header5', 10, { :type => :string } ],
                            [ 'spacer',  12 ],
                            [ 'header6', 10, { :type => :string } ]]

RemoteTable.new 'http://cloud.github.com/downloads/seamusabshere/remote_table/test2.fixed_width.txt',
                :format => :fixed_width,
                :keep_blank_rows => true,
                :skip => 1,
                :schema => [[ 'header4', 10, { :type => :string }  ],
                            [ 'spacer',  1 ],
                            [ 'header5', 10, { :type => :string } ],
                            [ 'spacer',  12 ],
                            [ 'header6', 10, { :type => :string } ]]

RemoteTable.new 'http://cloud.github.com/downloads/seamusabshere/remote_table/remote_table_row_hash_test.fixed_width.txt',
                :format => :fixed_width,
                :skip => 1,
                :schema => [[ 'header1', 10, { :type => :string }  ],
                            [ 'spacer',  1 ],
                            [ 'header2', 10, { :type => :string } ],
                            [ 'spacer',  12 ],
                            [ 'header3', 10, { :type => :string } ]]

RemoteTable.new 'http://cloud.github.com/downloads/seamusabshere/remote_table/remote_table_row_hash_test.alternate_order.fixed_width.txt',
                :format => :fixed_width,
                :skip => 1,
                :schema => [[ 'spacer',  11 ],
                            [ 'header2', 10, { :type => :string }  ],
                            [ 'spacer',  1 ],
                            [ 'header3', 10, { :type => :string } ],
                            [ 'spacer',  1 ],
                            [ 'header1', 10, { :type => :string } ]]

Requirements

  • Unix tools like curl, iconv, perl, cat, cut, tail, etc. accessible from your $PATH
  • geo_ruby and dbf gems if you plan on fetching shapefiles

Wishlist

  • Win32 compat

Authors

Copyright

Copyright (c) 2012 Brighter Planet. See LICENSE for details.