title: "Patent Data Sources"
author: "Paul Oldham"
date: "22/06/2018"
output: html_document
Programmatic access to patent databases is presently limited. However, for ABS monitoring, searches based on inventor, organisation and species names will be possible with free services. If you are completely new to working with patent data, the WIPO Manual on Open Source Patent Analytics is recommended reading. In particular, see the section on patent data fields.
For ABS monitoring we will typically want to perform lookups by inventor name, applicant (organisation) name and species name.
To avoid extremely noisy results (the John Smith problem), and bearing in mind issues such as name variants (Kirk, James T or Kirk James), it is important to combine search operators where possible. An example might be inventor name AND applicant AND species. More detail on options will be provided in future.
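As a minimal sketch of combining operators, the snippet below groups name variants with OR and then joins the inventor, applicant and species groups with AND. The helper names `or_group` and `build_query`, the field labels and the example applicant are all hypothetical; each database has its own query syntax, so the resulting string would need adapting before use.

```r
# Hypothetical helper: wrap the variants of one field in an OR group.
# The field:"value" notation is illustrative only and is not the
# query syntax of any particular patent database.
or_group <- function(field, values) {
  clauses <- sprintf('%s:"%s"', field, values)
  paste0("(", paste(clauses, collapse = " OR "), ")")
}

# Hypothetical helper: require a match in every field by joining
# the OR groups with AND.
build_query <- function(inventors, applicants, species) {
  paste(
    or_group("inventor", inventors),
    or_group("applicant", applicants),
    or_group("species", species),
    sep = " AND "
  )
}

query <- build_query(
  inventors  = c("Kirk, James T", "Kirk James"),  # name variants
  applicants = "Example University",              # made-up applicant
  species    = "Arabidopsis thaliana"
)
print(query)
```

Grouping the variants first keeps the John Smith problem in check: a record has to match one of the inventor spellings and the applicant and the species before it is returned.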
The Lens is a free open access patent database that seeks to make patent data and now scientific literature available without restrictions. No API is available but the data is open access. Please respect the robots.txt.
An R package is available that uses rvest to retrieve data from the main data fields of the Lens (while respecting the database). In contrast with the EPO OPS and PatentsView (below), full text searching is available through the Lens, and for that reason it is ranked first.
You will need to install devtools, using install.packages("devtools"), to install the package from GitHub.
The USPTO has an antiquated public database interface but has created a number of RESTful APIs that return JSON.
For ABS monitoring, the PatentsView data download page offers options including a complete set of inventor and cleaned inventor names. This could be used to generate a lookup table. However, note that the PatentsView download data is a set of tables for joining on a key field, and the titles and abstracts are not presently included as downloads. The full text of all patent documents can be made available for download on request.
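A lookup table of this kind could be sketched as follows, using base R `merge()` to join two tables on a key field. The data frames are toy stand-ins rather than real PatentsView downloads, and the column names (`inventor_id`, `inventor_name`, `patent_id`) are assumptions that should be checked against the actual files.

```r
# Toy stand-in for an inventor table (column names are assumptions)
inventors <- data.frame(
  inventor_id   = c("i1", "i2", "i3"),
  inventor_name = c("Kirk, James T", "Kirk James", "Smith, John"),
  stringsAsFactors = FALSE
)

# Toy stand-in for a table linking patents to inventors
patent_inventor <- data.frame(
  patent_id   = c("p1", "p2", "p3"),
  inventor_id = c("i1", "i2", "i3"),
  stringsAsFactors = FALSE
)

# Join the two tables on the shared key field
lookup <- merge(inventors, patent_inventor, by = "inventor_id")

# Keep only the rows for the names being monitored
watch_list <- c("Kirk, James T", "Kirk James")
hits <- lookup[lookup$inventor_name %in% watch_list, ]
print(hits)
```

With the real downloads the same pattern would apply after reading each table into a data frame, although the full tables are large enough that a database or a package designed for big joins may be more practical.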
For working with text data, the MySQL tables available from the PatentsView data download page avoid the need to parse the XML from the bulk download. The text of the detailed descriptions is available as .tsv downloads from PatentsView on request and takes the form of six zipped files of about 39 GB each. Title and Abstract fields are available for text mining in the Patent table, and the Claims are available as a distinct table.
Throttling on this service is common, and the number of records that can be retrieved through calls is also limited. OPS is pretty hard to use, and the output in either XML or JSON is complex and consists of lists of varying lengths.
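To illustrate why lists of varying lengths are awkward, here is a minimal sketch that flattens a ragged list into one row per value. The `records` object is a toy stand-in and does not reproduce the real OPS response structure, and `flatten_record` is a hypothetical helper.

```r
# Toy stand-in for a parsed response: each record carries a
# different number of applicants (this is NOT the real OPS schema)
records <- list(
  list(id = "doc1", applicants = c("A", "B")),
  list(id = "doc2", applicants = character(0)),
  list(id = "doc3", applicants = "C")
)

# Hypothetical helper: one row per applicant, with NA where a
# record has none, so that no document is silently dropped
flatten_record <- function(rec) {
  applicants <- rec$applicants
  if (length(applicants) == 0) applicants <- NA_character_
  data.frame(id = rec$id, applicant = applicants, stringsAsFactors = FALSE)
}

# Stack the per-record data frames into a single long table
flat <- do.call(rbind, lapply(records, flatten_record))
print(flat)
```

A long table of this shape is easier to filter and join than the nested original, at the cost of repeating the document identifier on each row.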
See this publication for an example of the use of Patent2Net.
opsr
An R client for accessing and searching. Presently archived.
The World Intellectual Property Organisation offers full text search through Patentscope. It is unusual among patent databases because it offers chemical structure search. It also has particularly good translation facilities for search and data retrieval. See the video tutorials for more details.
A SOAP-based API is available for 600 Swiss francs per year. For details see this page.