Utilities library for various scholarly identifiers used by Altmetric

Identifiers

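Collection of utilities related to the extraction, validation and normalization of various scholarly identifiers. The supported identifiers are ADS bibcodes, arXiv IDs, DOIs, Handles, ISBNs, National Clinical Trial IDs, ORCIDs, PubMed IDs, RePEc IDs and URNs.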

Supported Ruby versions: >= 2.7

Installation

Add this line to your application's Gemfile:

gem 'identifiers', '~> 0.12'

And then execute:

$ bundle

Or install it yourself as:

$ gem install identifiers

Usage

Identifiers::DOI.extract('example: 10.1234/5678.ABC')
# => ["10.1234/5678.abc"]

Identifiers::DOI.extract('no DOIs here')
# => []

Identifiers::URN.new('urn:abc:123')
# => #<URN:0x007ff11c13d930 @urn="urn:abc:123", @nid="abc", @nss="123">

Identifiers::URN('urn:abc:123')
# => #<URN:0x007ff11c0ff568 @urn="urn:abc:123", @nid="abc", @nss="123">

A small percentage of DOIs end in a trailing period (.). Returning trailing periods from the default extraction method would likely produce many false positives, so DOI.extract strips any trailing . by default. To keep DOIs that end in ., pass the strict option set to true (it defaults to false):

Identifiers::DOI.extract('example: 10.1234/5678.abc.', strict: true)
# => ["10.1234/5678.abc."]

Identifiers::DOI.extract('example: 10.1234/5678.abc.')
# => ["10.1234/5678.abc"]
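
To make the effect of strict concrete, here is a minimal, self-contained sketch of the behaviour described above. It is not the gem's implementation: the method name sketch_extract_dois and its deliberately naive pattern are assumptions made purely for illustration, and real DOI matching is considerably more involved.

```ruby
# Hypothetical illustration only; NOT how the identifiers gem is implemented.
def sketch_extract_dois(text, strict: false)
  # Naive pattern (an assumption): "10.", a 4-9 digit registrant code,
  # a slash, then any run of non-whitespace characters.
  naive_doi_pattern = %r{10\.\d{4,9}/\S+}

  text.to_s.scan(naive_doi_pattern).map do |match|
    doi = match.downcase
    # Unless strict, drop trailing periods, which are usually sentence
    # punctuation rather than part of the DOI.
    strict ? doi : doi.sub(/\.+\z/, '')
  end
end

sketch_extract_dois('example: 10.1234/5678.abc.')
# => ["10.1234/5678.abc"]

sketch_extract_dois('example: 10.1234/5678.abc.', strict: true)
# => ["10.1234/5678.abc."]
```

The trade-off is the one the paragraph describes: stripping by default avoids picking up sentence-ending periods, at the cost of truncating the few DOIs that really do end in one.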

By identifier

.extract is a common method that works across all the supported identifiers.

Identifiers::AdsBibcode.extract('')
Identifiers::ArxivId.extract('')
Identifiers::DOI.extract('')
Identifiers::Handle.extract('')
Identifiers::ISBN.extract('')
Identifiers::NationalClinicalTrialId.extract('')
Identifiers::ORCID.extract('')
Identifiers::PubmedId.extract('')
Identifiers::RepecId.extract('')
Identifiers::URN.extract('')
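
Because every class responds to .extract, one piece of text can be run through several extractors in a loop. The helper below is a hypothetical sketch, not part of the gem: extract_all is an invented name, and it assumes each extractor returns an array, as the DOI examples above do.

```ruby
# Hypothetical helper (not provided by the gem): run each extractor over the
# same text and keep only the extractors that found something.
def extract_all(text, extractors)
  extractors.each_with_object({}) do |extractor, found|
    matches = extractor.extract(text)
    found[extractor] = matches unless matches.empty?
  end
end

# With the gem loaded (require 'identifiers'), a call might look like:
#   extract_all(text, [Identifiers::DOI, Identifiers::ISBN, Identifiers::ORCID])
```

The helper relies only on duck typing, so any object with an extract method that returns an array can be passed in.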

For ISBNs, .extract accepts an array of prefixes as an optional second argument. When it is given, matches that are not preceded by one of those prefixes are excluded. Prefix matching is case insensitive and ignores ':' and extra whitespace:

Identifiers::ISBN.extract(
  "IsBN:9789992158104  \n isbn-10 9789971502102 \n ISBN-13: 9789604250592 \n 9788090273412",
  ["ISBN", "ISBN-10"]
)
# => ["9789992158104", "9789971502102"]
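
The prefix rule can be sketched in plain Ruby as follows. This is an illustration under stated assumptions, not the gem's code: sketch_extract_prefixed_isbns is a hypothetical name, and the sketch only recognises bare 13-digit numbers, with no checksum validation or hyphen handling.

```ruby
# Hypothetical illustration only; NOT how the identifiers gem is implemented.
def sketch_extract_prefixed_isbns(text, prefixes)
  # Build an alternation of the allowed prefixes, escaped for regex use.
  alternatives = prefixes.map { |prefix| Regexp.escape(prefix) }.join('|')

  # A prefix, an optional ':' surrounded by optional whitespace, then exactly
  # 13 digits. The /i flag makes the prefix match case insensitive.
  pattern = /(?<![\w-])(?:#{alternatives})\s*:?\s*(\d{13})\b/i

  text.scan(pattern).flatten
end

sketch_extract_prefixed_isbns(
  "IsBN:9789992158104  \n isbn-10 9789971502102 \n ISBN-13: 9789604250592 \n 9788090273412",
  ["ISBN", "ISBN-10"]
)
# => ["9789992158104", "9789971502102"]
```

The third number is skipped because ISBN-13 is not in the prefix list, and the last one because it has no prefix at all.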

Some identifiers provide additional methods beyond .extract. Check their implementation to see all the methods available.

For URNs, please check the URN gem documentation to see all the available options.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/altmetric/identifiers.

Contributions

PHP version

We also maintain a version of this library for PHP.

License

Copyright © 2016-2024 Altmetric LLP

Distributed under the MIT License.