Project

tsumamigui

No commit activity in last 3 years
No release in over 3 years
Tsumamigui (つまみぐい) is a simple and hassle-free Ruby web scraping library.

Tsumamigui

Tsumamigui (つまみぐい) is a simple and hassle-free Ruby web scraping library.

Requirement

Ruby 2.1+

Installation

Add this line to your application's Gemfile:

gem 'tsumamigui'

Or install it yourself as:

$ gem install tsumamigui

Usage

Give it a URL (or an array of URLs) and a hash that maps each label to the XPath of the data you want. It returns the scraped and parsed data as an array of hashes, one per URL, each including the URL it was scraped from.

Tsumamigui.scrape('http://example.com', {h1: 'html/body/div/h1/text()'})

# Returns:
# [
#   {h1: 'Example Domain', scraped_from: 'http://example.com'}
# ]
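
To make the label-to-XPath mapping concrete, here is a minimal sketch that builds one result row from a hard-coded page. It is an illustration only: it uses REXML from Ruby's standard distribution as a stand-in, and the source does not say which parser Tsumamigui uses internally. The inline HTML and the variable names are assumptions made for this example.

```ruby
require 'rexml/document'

# Stand-in for a fetched page; no network request is made here.
url = 'http://example.com'
html = <<-HTML
<html>
  <body>
    <div>
      <h1>Example Domain</h1>
    </div>
  </body>
</html>
HTML

# Same label-to-XPath hash as in the call above.
xpaths = { h1: 'html/body/div/h1/text()' }

# Evaluate each XPath and store the first match under its label.
doc = REXML::Document.new(html)
row = xpaths.each_with_object({}) do |(label, xpath), result|
  node = REXML::XPath.first(doc, xpath)
  result[label] = node ? node.to_s.strip : nil
end
row[:scraped_from] = url

p row
```

Each key in the hash becomes a key in the row, so adding another label and XPath adds another field to every result.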

You can pass multiple URLs to scrape different pages that share the same HTML structure.

urls = ['http://example.com/page/1', 'http://example.com/page/2']
Tsumamigui.scrape(urls, {h1: 'html/body/div/h1/text()'})

# Returns:
# [
#   {h1: 'Example Domain 1', scraped_from: 'http://example.com/page/1'},
#   {h1: 'Example Domain 2', scraped_from: 'http://example.com/page/2'}
# ]
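
Because the result is a plain array of hashes, ordinary Enumerable methods are enough to post-process it. The sketch below uses hard-coded sample data in the shape shown above rather than a live call, and the variable names are made up for this example.

```ruby
# Sample data in the shape shown above; nothing is fetched here.
results = [
  { h1: 'Example Domain 1', scraped_from: 'http://example.com/page/1' },
  { h1: 'Example Domain 2', scraped_from: 'http://example.com/page/2' }
]

# Index the scraped headings by the URL they came from.
headings_by_url = results.each_with_object({}) do |row, index|
  index[row[:scraped_from]] = row[:h1]
end

headings_by_url.each { |url, heading| puts "#{url}: #{heading}" }
```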

Important: Tsumamigui automatically spaces its requests, fetching each URL at an interval of 1.0 to 3.0 seconds.
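
For readers curious what such throttling can look like, here is a minimal sketch of a randomized delay between requests. This is not Tsumamigui's actual implementation: `each_with_random_interval` and its `interval` and `sleeper` parameters are hypothetical names invented for this example, and skipping the pause before the first URL is an assumption. The `sleeper` argument exists only so the demo can record the delays instead of really sleeping.

```ruby
# Hypothetical helper, not part of Tsumamigui's API.
# Yields each URL, pausing a random number of seconds between them.
def each_with_random_interval(urls, interval: 1.0..3.0, sleeper: Kernel.method(:sleep))
  urls.each_with_index do |url, index|
    # Pause before every request except the first one (an assumption).
    sleeper.call(rand(interval)) if index > 0
    yield url
  end
end

# Demo: record the delays instead of sleeping so the example runs instantly.
waits = []
visited = []
urls = ['http://example.com/page/1', 'http://example.com/page/2']

each_with_random_interval(urls, sleeper: ->(seconds) { waits << seconds }) do |url|
  visited << url
end

puts "visited #{visited.size} URLs with #{waits.size} pause(s)"
```

Dropping the `sleeper:` argument makes the helper call `sleep` for real, so two URLs would take between 1.0 and 3.0 seconds.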

TODO

  • Custom request headers.

etc...

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/obiyuta/tsumamigui. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

Guideline

  1. Fork it ( http://github.com/obiyuta/tsumamigui )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Write code and specs.
    • Run the test suite with bundle exec rspec and confirm that it passes
    • Run the lint checker with bundle exec rubocop and confirm that it passes
  4. Commit your changes (git commit -am 'Add some feature')
  5. Push to the branch (git push origin my-new-feature)
  6. Create a new Pull Request

License

The gem is available as open source under the terms of the MIT License.

Copyright (c) 2017 Obi Yuta. See MIT-LICENSE for details.