LinkHum
LinkHum (aka "Links Humana") is URL auto-linker for user-entered texts. It tries hard to do the most reasonable thing even in complex cases.
It will be useful for sites with plain-text user input
Features:
- auto-links URL;
- very accurate detection of punctiations inside and outside of URL;
- excessive tests set for complex (yet real-life) texts with URLs;
- customizable behavior.
NB: the original algo was written by @squadette and the test cases provided by users of Mokum. Just gemifying this (on behalf of original author).
Install
[sudo] gem install linkhum
Or in your Gemfile
gem 'linkhum'
And then
bundle install
Usage
As simple as:
LinkHum.urlify("Please look at http://github.com/zverok/linkhum, it's awesome!")
# => 'Please look at <a href="http://github.com/zverok/linkhum">http://github.com/zverok/linkhum</a>, it's awesome!'
Showcase
# Doesn't touch punctuations outside:
LinkHum.urlify('http://slashdot.org, or http://lwn.net? They say, "just http://google.com"')
# => "<a href='http://slashdot.org'>http://slashdot.org</a>, or <a href='http://lwn.net'>http://lwn.net</a>? They say, \"just <a href='http://google.com'>http://google.com</a>\""
# But processes it inside:
LinkHum.urlify('Watch this: https://www.youtube.com/watch?v=Q9Dv4Hmf_O8')
# => "Watch this: <a href='https://www.youtube.com/watch?v=Q9Dv4Hmf_O8'>https://www.youtube.com/watch?v=Q9Dv4Hmf_O8</a>"
# Understands parentheses:
LinkHum.urlify("It's a movie: https://en.wikipedia.org/wiki/Hours_(2013_film) It's just parens: (https://www.youtube.com/watch?v=Q9Dv4Hmf_O8)")
# => "It's a movie: <a href='https://en.wikipedia.org/wiki/Hours_(2013_film)'>https://en.wikipedia.org/wiki/Hours_(2013_film)</a> It's just parens: (<a href='https://www.youtube.com/watch?v=Q9Dv4Hmf_O8'>https://www.youtube.com/watch?v=Q9Dv4Hmf_O8</a>)"
# URL shortening:
LinkHum.urlify("It's too long: http://www.booking.com/searchresults.ru.html?sid=28c7356c8d0fb6d81de3a45eff97e0fe;dcid=4;bb_asr=2&class_interval=1&csflt=%7B%7D&dest_id=-2167973&dest_type=city&group_adults=2&group_children=0&idf=1&label_click=undef&no_rooms=1&offset=0&review_score_group=empty&score_min=0&si=ai%2Cco%2Cci%2Cre%2Cdi&src=index&ss=Lisbon%2C%20Lisbon%20Region%2C%20Portugal&ss_raw=Lisbon&ssb=empty")
# => "It's too long: <a href='http://www.booking.com/searchresults.ru.html?sid=28c7356c8d0fb6d81de3a45eff97e0fe;dcid=4;bb_asr=2&class_interval=1&csflt=%7B%7D&dest_id=-2167973&dest_type=city&group_adults=2&group_children=0&idf=1&label_click=undef&no_rooms=1&offset=0&review_score_group=empty&score_min=0&si=ai,co,ci,re,di&src=index&ss=Lisbon,%20Lisbon%20Region,%20Portugal&ss_raw=Lisbon&ssb=empty'>http://www.booking.com/searchresults.ru.html?sid=28c7356c8d0f...</a>"
# It's customizable:
LinkHum.urlify(
"It's too long: http://www.booking.com/searchresults.ru.html?sid=28c7356c8d0fb6d81de3a45eff97e0fe;dcid=4;bb_asr=2&class_interval=1&csflt=%7B%7D&dest_id=-2167973&dest_type=city&group_adults=2&group_children=0&idf=1&label_click=undef&no_rooms=1&offset=0&review_score_group=empty&score_min=0&si=ai%2Cco%2Cci%2Cre%2Cdi&src=index&ss=Lisbon%2C%20Lisbon%20Region%2C%20Portugal&ss_raw=Lisbon&ssb=empty",
max_length: 20)
# =>
# International domains and Non-ASCII paths:
LinkHum.urlify("Domain: http://www.詹姆斯.com/, and path: https://ru.wikipedia.org/wiki/Эффект_Даннинга_—_Крюгера")
# => "Domain: <a href='http://www.詹姆斯.com/'>http://www.詹姆斯.com/</a>, and path: <a href='https://ru.wikipedia.org/wiki/%D0%AD%D1%84%D1%84%D0%B5%D0%BA%D1%82_%D0%94%D0%B0%D0%BD%D0%BD%D0%B8%D0%BD%D0%B3%D0%B0_%E2%80%94_%D0%9A%D1%80%D1%8E%D0%B3%D0%B5%D1%80%D0%B0'>https://ru.wikipedia.org/wiki/Эффект_Даннинга_—_Крюгера</a>"
# Look, ma, no XSS!
LinkHum.urlify('http://example.com/foo?">here.</a><script>window.alert("wow");</script>')
# => "<a href='http://example.com/foo?%22%3Ehere.%3C/a%3E%3Cscript%3Ewindow.alert(%22wow%22);%3C/script%3E'>http://example.com/foo?\">here.</a><script>window.alert(\"wow\")...</a>"
Customization
On the fly
Custom URL params:
LinkHum.urlify("http://oursite.com/posts/12345 has been mentioned at http://cnn.com"){
|uri|
uri.host == 'oursite.com' ? {} : {target: '_blank'}
}
# => "<a href='http://oursite.com/posts/12345'>http://oursite.com/posts/12345</a> has been mentioned at <a href='http://cnn.com' target='_blank'>http://cnn.com</a>"
Provided block should receive an instance of Addressable::URI
and
return hash of additional link attributes. You can use it for opening
foreign links in new tab, or for styling them different (Wikipedia-style),
or to provide special icons for links to Youtube, Wikipedia and Google...
Up to you
Define your own LinkHum
class MyLinks < LinkHum
def link_attrs(uri)
{target: '_blank'} unless uri.host == 'oursite.com'
end
end
MyLinks.urlify("http://oursite.com/posts/12345 has been mentioned at http://cnn.com")
# => "<a href='http://oursite.com/posts/12345'>http://oursite.com/posts/12345</a> has been mentioned at <a href='http://cnn.com' target='_blank'>http://cnn.com</a>"
You can also define special strings, which should also became URLs on your site:
class MyLinks < LinkHum
special /@(\S+)\b/ do |username|
"http://oursite/users/#{username}"
end
end
MyLinks.urlify("Hey, @jude!")
# => "Hey, <a href='http://oursite/users/jude'>@jude</a>!"
# nil or false means no replacements:
class MyLinksConditional < LinkHum
special /@(\S+)\b/ do |username|
"http://oursite/users/#{username}" if User.where(name: username).exists?
end
end
MyLinksConditional.urlify("So, our @dude and @unknownguy walk into a bar...")
# => "So, our <a href='http://oursite/users/dude'>@dude</a> and @unknownguy walk into a bar..."
Some special
gotchas:
- in version 0.0.2, you can define any number of
special
s, but it's totally up to you to have non-conflicting, clearly distinguished patterns; - it passes to the block values by the same logic as
String#scan
does:
class AllSymbols < LinkHum
special /@\S+\b/ do |username|
p username
nil
end
end
AllSymbols.urlify('@dude')
# Receives "@dude"
class SelectedPart < LinkHum
special /@(\S+)\b/ do |username|
p username
nil
end
end
SelectedPart.urlify('@dude')
# Receives "dude"
class SeveralArgs < LinkHum
special(/@(\S+)_(\S+)\b/) do |first, second|
p first, second
nil
end
end
SeveralArgs.urlify('@cool_dude')
# Receives "cool", "dude"
"Parse only" mode
If your demands for resulting strings construction is far more complicated
than default LinkHum behavior, you can use its #parse
command to split
string into tokens, and process them by yourself. All URL-detection
goodness and special
s still will be with you:
class MyParser < LinkHum
# You don't need rendering blocks for your specials
# Second argument is special's name, it is optional
special /@(\S+)\b/, :username
special /\#(\S+)\b/, :tag
end
MyParser.parse("Here is @dude. He is #cute. Is he on http://facebook.com?")
# => [
# {type: :text , content: 'Here is '},
# {type: :username, content: '@dude', captures: ['dude']},
# {type: :text , content: '. He is '},
# {type: :tag , content: '#cute', captures: ['cute']},
# {type: :text , content: '. Is he on '},
# {type: :url , content: 'http://facebook.com'},
# {type: :text , content: '?'}
# ]
Credits
- @squadette -- author of original code;
- users of Mokum -- testing and advicing (and now you can observe LinkHum work online at Mokum);
- @zverok -- gemifying, documenting and writing specs.
Contributing
Just usual fork-change-pull request process.
Development
- Don't forget to use
rspec
after any changes made (and specify them, of course!) - It's preferred to use
bundle exec dokaz
to check if README written correctly andbundle exec dokaz -fshow
to check what exactly code from README will output.
License
MIT