patentscope gem
Gem to allow easy access to data from the WIPO PATENTSCOPE Web Service.
Introduction
The Patentscope gem allows easy access, with Ruby, to data provided by the PATENTSCOPE Web Service of the World Intellectual Property Organisation (WIPO).
As provided by WIPO, the PATENTSCOPE Web Service is available through a SOAP interface. The documentation provided by WIPO uses Java.
The Patentscope gem, on the other hand, provides a simple Ruby interface to the PATENTSCOPE Web Service. The gem allows access for each of the functions available from the SOAP interface.
About the WIPO PATENTSCOPE Web Service
From PATENTSCOPE Web Service site:
"Includes:
- Bibliographic data for all published international applications (XML format);
- Images for all published international applications (TIFF format);
- Full-text description and claims (OCR output) for all international applications published in English, French, German, Spanish and Russian, as well as Japanese and Korean (soon available) (PDF format).
Available on the Internet on the day of publication. Programmatic access ... to the documents available in the document tab of the PATENTSCOPE search engine (example). This set makes it possible to integrate access to PATENTSCOPE in an IT architecture, to retrieve the International Application Status Report (IASR) and to parse it on the fly and to download, within the framework of the authorized uses policy; documents by batch. The formats of the documents are the same as the formats of the documents available via the web site, i.e. TIFF, XML for all documents and a text-based PDF OCR for most pamphlets."
The PATENTSCOPE Web Service is available from the World Intellectual Property Organisation (WIPO) through a paid subscription. The current cost of a subscription is 600 Swiss Francs per calendar year.
If you ask nicely, the folks at WIPO might give you a trial account.
Installation
Add this line to your application's Gemfile:
gem 'patentscope'
And then execute:
$ bundle
Or install it yourself as:
$ gem install patentscope
Usage
Configuration
Run the configuration block first to set the credentials for the PATENTSCOPE Web Service.
Patentscope.configure do |config|
config.username = 'username'
config.password = 'password'
end
Configuring from Environment Variables
It is most convenient to store the PATENTSCOPE Web Service username and password credentials as environment variables.
If these are stored as PATENTSCOPE_WEBSERVICE_USERNAME
and PATENTSCOPE_WEBSERVICE_PASSWORD
respectively, you can simply use
Patentscope.configure_from_env
to load the credentials into the configuration in a single step.
Querying and Resetting Configuration
The configured?
class method returns a boolean indicating whether the configuration has been set. This doesn't necessarily mean that the credentials are valid, only that they have been set.
Patentscope.configured? #=> true
Use the username
method of the configuration
object to obtain the username set in the configuration.
Patentscope.configuration.username #=> 'username'
The password
method of the configuration
object returns the password set in the configuration.
Patentscope.configuration.password #=> 'password'
The reset_configuration
class method resets the configuration.
Patentscope.reset_configuration
Patentscope.configuration #=> nil
List of Available Methods
get_iasr
get_available_documents
get_document_content
get_document_ocr_content
get_document_table_of_contents
get_document_content_page
wsdl
Getting the International Application Status Report (get_iasr
)
This is possibly the most useful of all the functions provided by this gem.
The get_iasr
class method returns an International Application Status Report in XML format for the specified application number. The IASR document is essentially a bibliographic summary of the PCT application in XML format.
The get_iasr
method takes an International Application number, with or without the PCT prefix and with or without slashes.
Patentscope.get_iasr('SG2003000062')
Patentscope.get_iasr('SG2003/000062')
Patentscope.get_iasr('PCTSG2003000062')
Patentscope.get_iasr('PCT/SG2003/000062')
Example output for SG2003000062:
Patentscope.get_iasr('SG2003000062')
#=>
<?xml version="1.0"?>
<wo-international-application-status>
<wo-bibliographic-data produced-by="IB" dtd-version="0.2" lang="EN" date-produced="20140108">
<publication-reference>
<document-id lang="EN">
<country>WO</country>
<doc-number>2009/105044</doc-number>
<kind>A1</kind>
<date>20090827</date>
</document-id>
</publication-reference>
<wo-publication-info>...
The PATENTSCOPE Web Service doesn't allow us to access documents using WO publication numbers. Calling Patentscope.get_iasr('WO2003/080231')
for example will fail.
Getting a List of Available Documents (get_available_documents
)
The get_available_documents
class method returns the list of available documents for the specified application number.
Patentscope.get_available_documents('SG2009000062')
# =>
<?xml version="1.0"?>
<doc ocrPresence="no" docType="RO101" docId="id00000008679651"/>
Getting the Binary Content of a Document (get_document_content
)
The get_document_content
class method returns the binary content of the document for the specified document id.
Patentscope.get_document_content('090063618004ca88')
#=>
<?xml version="1.0"?>
<documentContent>UEsDBBQACAAIAIyMOy0AAAAAAAAAAAAAAAAKAAAAMDAwMDAxLnRpZsy7ezxU2 ...
Getting the Text of a Document in PDF Format (get_document_ocr_content
)
The get_document_ocr_content
class method returns the binary content of the document for the specified document id, in text-based PDF format (high quality OCR).
Patentscope.get_document_ocr_content('id00000015801579')
=>
<?xml version="1.0"?>
<documentContent>JVBERi0xLjQKJeLjz9MKOCAwIG9iago8PC9EZWNvZGVQYXJtcyA8PC9CbG ...
Getting a List of Page IDs for a Document (get_document_table_of_contents
)
The get_document_table_of_contents
class method returns the list of page ids for the specified document id.
Patentscope.get_document_table_of_contents('090063618004ca88')
#=>
<?xml version="1.0"?>
<content>000001.tif</content>
Getting the Binary Content for a Document and Page (get_document_content_page
)
The get_document_content_page
class method returns the binary content for specified document and page ids.
Patentscope.get_document_content_page('090063618004ca88', '000001.tif')
#=>
<?xml version="1.0"?>
<pageContent>SUkqAAgAAAASAP4ABAABAAAAAAAAAAABAwABAAAA</pageContent>
###Getting a WSDL Document for the Web Service (wsdl
)
The wsdl
method returns a WSDL document for the PATENTSCOPE Web Service
Patentscope.wsdl
#=>
<?xml version='1.0' encoding='UTF-8'?>
...
<wsdl:definitions...
Errors
- NoCredentialsError - attempting to use the client when no credentials were entered
- WrongCredentialsError - attempting to access the PATENTSCOPE Webservice with incorrect credentials
- NoAppNumberError - no application numbere was entered, or unable to convert number
- NoDocIDError - no document id was entered
- NoPageIDError - no page id was entered
- BusinessError - PATENTSCOPE Webservice returned a "business error"
Disclaimer
The Patentscope gem is not an official product of the World Intellectual Property Organization. No warranty is attached to the provision of this gem. Please use this gem at your own risk.
You are solely responsible for ensuring that any uses are suitable, authorized and otherwise legal. In particular, you are solely responsible for maintaining a subscription at WIPO and adhering to WIPO's terms of use for the PATENTSCOPE Web Service.
Please note the Terms and Conditions relating to use of the PATENTSCOPE Web Service, especially Section 3.2 ("authorized uses", i.e., fewer than 10 retrieval related actions per minute from an individual IP address of a subscriber).
Support
I am happy to help with any queries in relation to the Patentscope gem. Please file an issue on the Github repo and I will try my best to help.
I am unable to answer any queries on the PATENTSCOPE Web Service itself.
For support on the PATENTSCOPE Web Service, please see the resources in the section below, visit the PATENTSCOPE Forum or send an email to WIPO at patentscope@wipo.int
Resources
Web App Using Patentscope Gem
- The PCT National Phase Guide fetches IASR data from WIPO PATENTSCOPE and displays a customised guide for the national phase entry of a PCT application
PATENTSCOPE Web Service
- PATENTSCOPE Web Service
- Terms and Conditions
- Subscription Form
- News
- PCT PATENTSCOPE Web-services for Offices (Presentation by WIPO)
PATENTSCOPE Search System
Contact
Comments and bug reports are welcome.
Contributing
Feel free to drop us a line to let us know you would like to work on something or if you have an idea. Otherwise, fork, code, commit, push and create pull request, viz:
- Create a fork of the repo from http://github.com/cantab/patentscope.
- Create your feature branch (
git checkout -b new-feature
). - Write some tests (in RSpec, if you please).
- Write the code that allows the tests to pass.
- Commit your changes (
git commit -am 'Add some feature'
). - Push to the branch (
git push origin new-feature
). - Create a new [Pull Request] (https://help.github.com/articles/using-pull-requests).
More details on how to contribute can be found at this great Thoughtbot blogpost [8 (new) steps for fixing other people's code] (http://robots.thoughtbot.com/8-new-steps-for-fixing-other-peoples-code).
License
Copyright (c) 2013-2016 Chong-Yee Khoo. All rights reserved.
MIT License
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.