UriResolver
Checking manually whether a URI resolves is easy: you simply enter it into your browser and wait to see what happens!
Scripting this is normally quite easy, too. Some things you might try are:
- Check if the URI is a valid format, i.e. does it match URI::regexp?
- Ping the URI, and check for a response.
- Try using URI.parse or Net::HTTP.get, and rescue if Errno::ECONNREFUSED?
For most URIs a simple method like the above works fine.
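As a rough illustration of the simple approach, here is a minimal sketch using only the standard library. The helper names valid_http_uri? and naive_resolves? are made up for this example and are not part of the gem; the format check uses URI.parse rather than URI::regexp, which is an assumption made to keep the sketch short.

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not part of the gem): is this a well-formed
# http(s) URI with a host?
def valid_http_uri?(uri_string)
  uri = URI.parse(uri_string)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper (not part of the gem): the naive "try it and
# rescue" check. It only handles refused connections and failed DNS
# lookups; timeouts, SSL errors and redirects are deliberately ignored.
def naive_resolves?(uri_string)
  return false unless valid_http_uri?(uri_string)

  Net::HTTP.get(URI.parse(uri_string))
  true
rescue Errno::ECONNREFUSED, SocketError
  false
end
```

Note that a bare domain such as google.com has no scheme, so this sketch would reject it; you would need to prepend http:// first.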
However, for many "obscure" websites, such as those registered in new GTLDs, you run into all sorts of trouble attempting this - especially when checking a very long list! For example:
- Even a simple ping (i.e. opening a TCP connection) can freeze while connecting to the DNS server - which causes it to take ~20 seconds to check just one URI!
- Some URIs resolve via many layers of redirections.
- Some URIs resolve, but to "invalid" URIs which contain escape sequences.
- The URI may resolve to an HTTPS connection with a missing or misconfigured SSL certificate.
- The URI may connect, but then fail or time out when attempting to read any data.
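To make these cases concrete, here is a hedged sketch of how they could be handled with Net::HTTP. It is not the gem's actual implementation: the names classify_error, fetch_status and MAX_REDIRECTS, the 5 second timeouts, and the decision to count an SSL failure or a malformed redirect target as :resolves are all assumptions made for illustration.

```ruby
require "net/http"
require "openssl"
require "socket"
require "timeout"
require "uri"

MAX_REDIRECTS = 5 # assumed limit, chosen arbitrarily for this sketch

# Hypothetical mapping from an exception to one of the status symbols.
def classify_error(error)
  case error
  when Timeout::Error, Errno::ETIMEDOUT
    :maybe_resolves   # connect or read timed out
  when OpenSSL::SSL::SSLError
    :resolves         # assumption: a server answered, its certificate is just bad
  when SocketError, SystemCallError, URI::InvalidURIError
    :does_not_resolve # DNS failure, refused connection, malformed URI
  else
    :maybe_resolves
  end
end

# Hypothetical check that follows redirects up to a limit.
def fetch_status(uri_string, redirects_left = MAX_REDIRECTS)
  return :maybe_resolves if redirects_left <= 0

  uri = URI.parse(uri_string)
  return :does_not_resolve unless uri.is_a?(URI::HTTP) && uri.host

  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == "https",
                             open_timeout: 5, read_timeout: 5) do |http|
    http.get(uri.request_uri)
  end
  location = response["location"]
  return :resolves unless response.is_a?(Net::HTTPRedirection) && location

  begin
    next_uri = URI.join(uri_string, location).to_s
  rescue URI::InvalidURIError
    return :resolves # assumption: the server answered, the target is just malformed
  end
  fetch_status(next_uri, redirects_left - 1)
rescue StandardError => e
  classify_error(e)
end
```

Keeping the exception mapping in its own method means the policy (which failures count as "maybe") can be adjusted and tested without touching the network code.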
This Ruby gem attempts to gracefully account for all of these edge cases (and more), by use of intelligent error handling. It also uses a multi-threaded approach to prevent system-blocking timeouts.
It's a pretty simplistic solution, but will hopefully be useful to others.
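The multi-threaded idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the gem's code: with_deadline and check_all are hypothetical helpers, and the pool size and deadline defaults are arbitrary. Each check runs in its own thread, and the caller stops waiting after a fixed number of seconds, so one frozen DNS lookup cannot stall the whole list.

```ruby
# Hypothetical helper (not part of the gem): run the block in its own
# thread and stop waiting after `seconds`.
def with_deadline(seconds, on_timeout: :maybe_resolves, &check)
  worker = Thread.new(&check)
  # Thread#join returns nil if the thread is still running at the deadline.
  return worker.value if worker.join(seconds)

  worker.kill
  on_timeout
end

# Hypothetical helper (not part of the gem): check many URIs with a
# small pool of threads, returning a { uri => status } hash.
def check_all(uris, pool_size: 4, deadline: 10, &check)
  queue = Queue.new
  uris.each { |uri| queue << uri }
  queue.close # once drained, pop returns nil instead of blocking

  results = {}
  lock = Mutex.new
  workers = Array.new(pool_size) do
    Thread.new do
      while (uri = queue.pop)
        status = with_deadline(deadline) { check.call(uri) }
        lock.synchronize { results[uri] = status }
      end
    end
  end
  workers.each(&:join)
  results
end

# Possible usage with the gem (requires the gem to be installed):
#   check_all(%w[google.com sakjflkdjsfh.com]) { |uri| UriResolver.resolve_status(uri) }
```

The block passed to check_all decides what "checking" means, so the same pool works with any check that returns a status.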
Installation
Feel free to use this as you like, but I'm currently experimenting with it; the implementation may change significantly until v1.0.0 is published.
Add this line to your application's Gemfile:
gem 'uri_resolver'
And then execute:
$ bundle
Usage
UriResolver.resolve_status "google.com" # => :resolves
UriResolver.resolve_status "sakjflkdjsfh.com" # => :does_not_resolve
# If the connection times out, then the gem returns :maybe_resolves
UriResolver.resolve_status "getmintedpoker.tv" # => :maybe_resolves
# Such URIs *probably* don't resolve, but a manual check may be a good idea
Warning: This is not perfect; you can still get some false negatives. For example:
# Intermittent and very slow... This often times out, but sometimes does resolve!
UriResolver.resolve_status "bet-and-win.gr" # => :maybe_resolves
# This IS a real website, but (currently) returns a "Bandwidth Limit Exceeded" error:
UriResolver.resolve_status "notarealwebsite.com" # => :does_not_resolve
For a more "complete" usage example, see the file(s) in the scripts/ folder of this repo.
Development
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/tom-lord/uri_resolver.