Cut
A DSL for Scraping Websites
Installation
Add this line to your application's Gemfile:
gem 'cut'
And then execute:
$ bundle
Or install it yourself as:
$ gem install cut
Usage
Define a scraper class, for example for Google search results:
class SearchResult
include Cut
url "http://google.com/search?q={{keywords}}"
selector "li.g"
map :title, String, to: "h3.r"
map :url, String, to: "div.s cite", operation: lambda {|str| str.upcase }
end
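The `{{keywords}}` placeholder in the url is filled from the arguments passed to `all` or `first`. How Cut does this internally is not documented here, so the following is only a minimal sketch of the idea, using a hypothetical helper named `expand_template` (not part of Cut) and assuming values are URL-encoded before substitution:

```ruby
require "uri"

# Hypothetical helper, NOT part of Cut's API: replaces every {{name}}
# placeholder in a URL template with the URL-encoded value from params.
def expand_template(template, params)
  template.gsub(/\{\{(\w+)\}\}/) do
    key = Regexp.last_match(1).to_sym
    # fetch raises KeyError if a placeholder has no matching parameter
    URI.encode_www_form_component(params.fetch(key).to_s)
  end
end

puts expand_template("http://google.com/search?q={{keywords}}", keywords: "war and peace")
```

Using `fetch` rather than `[]` means a missing parameter fails loudly instead of silently producing a URL with an empty query.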
Return Results:
SearchResult.all(keywords: "war and peace")
#=> [#<SearchResult:0x007f94bbfaae90 @title="War and Peace - Wikipedia, the free encyclopedia", @url="HTTPS://EN.WIKIPEDIA.ORG/WIKI/WAR_AND_PEACE">,
#    #<SearchResult:0x007f94beed97c0 @title="War and Peace (Vintage Classics): Leo Tolstoy, Richard Pevear ...", @url="WWW.AMAZON.COM/WAR-PEACE-VINTAGE-CLASSICS.../DP/1400079985">,
#    #<SearchResult:0x007f94be95ee80 @title="War and Peace (1956) - IMDb", @url="WWW.IMDB.COM/TITLE/TT0049934/">,
#    #<SearchResult:0x007f94be9cb198 @title="SparkNotes: War and Peace", @url="WWW.SPARKNOTES.COM/LIT/WARANDPEACE/">,
#    #<SearchResult:0x007f94be9c7ea8 @title="War and Peace by graf Leo Tolstoy - Free Ebook - Project Gutenberg", @url="WWW.GUTENBERG.ORG/EBOOKS/2600">,
#    #<SearchResult:0x007f94bc83f218 @title="War and Peace by Leo Tolstoy - Reviews, Discussion, Bookclubs, Lists", @url="WWW.GOODREADS.COM/BOOK/SHOW/656.WAR_AND_PEACE">,
#    #<SearchResult:0x007f94bba7ee80 @title="War and Peace - The Literature Network", @url="WWW.ONLINE-LITERATURE.COM/TOLSTOY/WAR_AND_PEACE/">,
#    #<SearchResult:0x007f94bba7b820 @title="War and Peace - graf Leo Tolstoy - Google Books", @url="BOOKS.GOOGLE.COM/BOOKS/ABOUT/WAR_AND_PEACE.HTML?ID=2GOK4HJO2VKC">,
#    #<SearchResult:0x007f94bbed4ac0 @title="Images for war and peace", @url="">,
#    #<SearchResult:0x007f94bdda0eb8 @title="War and Peace - Shmoop", @url="WWW.SHMOOP.COM/WAR-AND-PEACE/">,
#    #<SearchResult:0x007f94bdd695d0 @title="War and Peace - Planet PDF", @url="WWW.PLANETPDF.COM/PLANETPDF/PDFS/FREE_EBOOKS/WAR_AND_PEACE_NT.PDF">,
#    #<SearchResult:0x007f94bdde53d8 @title="News for war and peace", @url="">]
SearchResult.first(keywords: "war and peace")
#=> #<SearchResult:0x007f94bdfbeb78 @title="War and Peace - Wikipedia, the free encyclopedia", @url="HTTPS://EN.WIKIPEDIA.ORG/WIKI/WAR_AND_PEACE">
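Some entries in the output above (for example "Images for war and peace") come back with an empty url because the mapped element is missing for that result. The sketch below shows one way to drop such entries. It uses a stand-in `Result` struct so it runs without a network call, and it assumes that real SearchResult objects expose `title` and `url` readers; the inspect output only shows instance variables, so that is an assumption rather than documented behavior.

```ruby
# Stand-in for SearchResult so the sketch runs offline.
# Assumption: real results respond to `title` and `url`.
Result = Struct.new(:title, :url)

results = [
  Result.new("War and Peace - Wikipedia, the free encyclopedia", "HTTPS://EN.WIKIPEDIA.ORG/WIKI/WAR_AND_PEACE"),
  Result.new("Images for war and peace", ""),
  Result.new("SparkNotes: War and Peace", "WWW.SPARKNOTES.COM/LIT/WARANDPEACE/")
]

# Keep only results whose url is present (nil and "" are both dropped).
with_urls = results.reject { |r| r.url.to_s.empty? }

with_urls.each { |r| puts "#{r.title} -> #{r.url}" }
```

With the real class, `results` would instead be the array returned by `SearchResult.all(keywords: "war and peace")`.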
Contributing
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request