OA Web Service API

The PMC OA Web Service API allows users to discover downloadable resources from the PMC Open Access Subset. These articles are available for download from our FTP site in tgz (tar'd, gzipped) format, or, for those articles that have them, in PDF format as well.

This API allows discovery of resources related to articles. For example, it can be used to find the PDFs of all articles that have been updated since a specified date. This could facilitate implementing tools that reuse the OA subset content, such as mirror sites, text mining processes, etc.

If you have questions or comments about this service, please write to the PMC help desk. To stay informed about new or updated tools or services provided by PMC, subscribe to the PMC-Utils-Announce mailing list.

The base URL for the service is https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi.

  • Requests to the service use HTTP GET or POST, with a set of parameters that specify the desired data.
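  • There are two types of responses: an Identification response, returned when the base URL is requested without parameters, and a Results Set Response, returned when parameters are supplied. Both are described below.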

Date and Time Information

All dates and times are given in local time in Bethesda, Maryland: either EST (-05:00) or EDT (-04:00), depending on the time of year. There is a space separating the date from the time in responses. In URLs used for requests the space must be represented as either "+" or "%20".
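Because the timestamps carry no explicit offset, a client may want to attach the time zone itself. The following is a minimal sketch, assuming the IANA zone "America/New_York" matches local time in Bethesda, Maryland; the function name parse_oa_timestamp is hypothetical and not part of the service.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumption: "America/New_York" is the IANA zone that corresponds to
# local time in Bethesda, Maryland (EST in winter, EDT in summer).
BETHESDA = ZoneInfo("America/New_York")


def parse_oa_timestamp(text: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' string as Bethesda local time."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    # Attach the zone; the UTC offset is resolved from the date itself.
    return naive.replace(tzinfo=BETHESDA)
```

The resulting datetime can then be converted to UTC with astimezone() before it is compared with timestamps from other systems.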

Maximum Number of Results and Resumption Tokens

If there are more than 1000 records in a result set, then only the first 1000 will be returned, and the response will end with a <resumption> element, which provides the token value to use in the resumptionToken parameter of your next request to retrieve the next batch of results. For example,

<resumption>
<link token="1102623!20130101000000!!!a1e8c64fd7952a09"
href="https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?resumptionToken=1102623!20130101000000!!!a1e8c64fd7952a09"/>
</resumption>

Get the next 1000 records in a result set:

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?resumptionToken=843921!20120101000000!!!6e8a2c112f595273
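A client can read the next-page URL directly from the href attribute of the link inside the resumption element. The sketch below assumes the element layout shown in the example above and searches for it anywhere in the response; next_page_url is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def next_page_url(response_xml: str) -> Optional[str]:
    """Return the href of the resumption link, or None on the last page."""
    root = ET.fromstring(response_xml)
    # Assumption: the link sits inside a <resumption> element, wherever
    # that element appears in the response tree.
    link = root.find(".//resumption/link")
    return link.get("href") if link is not None else None
```

A return value of None indicates that the response carried no resumption element, which this sketch treats as the end of the result set.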

Error responses

If there is any error in the request parameters, the response will contain an <error> tag with a description of the problem.
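Client code may want to turn such a response into an exception. This is a rough sketch under two assumptions that the text above does not confirm: that the element is named error, and that it may carry a code attribute. OAServiceError and raise_on_error are hypothetical names.

```python
import xml.etree.ElementTree as ET


class OAServiceError(Exception):
    """Raised when the service reports a problem with the request."""


def raise_on_error(response_xml: str) -> ET.Element:
    """Parse a response and raise if it contains an error element."""
    root = ET.fromstring(response_xml)
    # Assumption: the error element is named "error" and may have an
    # optional "code" attribute alongside its text description.
    error = root.find(".//error")
    if error is not None:
        code = error.get("code", "unknown")
        raise OAServiceError(f"{code}: {(error.text or '').strip()}")
    return root
```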

Parameters

id Parameter

Use the id parameter to request information about a particular PMCID, e.g.,

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?id=PMC5334499

from Parameter

Use the from parameter to request all records updated on or after the date or date and time specified in the parameter.

  • Specify a date - Use YYYY-MM-DD format.
  • Specify a date/time combination - Use YYYY-MM-DD HH:MM:SS format.

Dates and times in responses are in local time in Bethesda, Maryland.

Note that in a URL, the space separating the date and the time can be represented either as a "+" or as "%20".

From a specific date:

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?from=2021-01-01

From a specific date and time:

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?from=2021-01-01+08:00:00

until Parameter

Use the until parameter together with the from parameter to request all records between the dates (or dates and times) specified in the parameters, e.g.

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?from=2019-01-02&until=2019-01-02+07:00:00

format Parameter

Use the format parameter to return only records that are available in the given format. For example, to return only records that have PDFs:

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?from=2019-01-01&format=pdf
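The from, until, and format parameters can be combined programmatically. The sketch below uses urlencode from the Python standard library, which encodes the space between date and time as "+"; passing safe=":" is a choice made here so that the output resembles the example URLs on this page. The name build_oa_url and its argument names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi"


def build_oa_url(from_: Optional[str] = None,
                 until: Optional[str] = None,
                 fmt: Optional[str] = None) -> str:
    """Build a request URL from optional from/until/format values."""
    params = {}
    if from_:
        params["from"] = from_
    if until:
        params["until"] = until
    if fmt:
        params["format"] = fmt
    # urlencode turns the space in "YYYY-MM-DD HH:MM:SS" into "+";
    # safe=":" leaves the colons in the time unescaped.
    query = urlencode(params, safe=":")
    return f"{BASE_URL}?{query}" if query else BASE_URL
```

For instance, build_oa_url(from_="2019-01-01", fmt="pdf") yields the PDF-only request shown above.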

resumptionToken Parameter

If your request returns more than 1000 results, the response will contain the first 1000 and will end with a <resumption> element that includes the value to use in the resumptionToken parameter of your next request.

Get the next 1000 records in a result set:

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?resumptionToken=843921!20120101000000!!!6e8a2c112f595273
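Retrieving a complete result set therefore amounts to following resumption links until none is returned. The following is one possible sketch of that loop. The names fetch_text and iter_records are hypothetical, and the fetch argument is injectable so the loop can be exercised without network access.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, Optional
from urllib.request import urlopen


def fetch_text(url: str) -> str:
    """Download a URL and return its body as text (assumes UTF-8)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def iter_records(first_url: str,
                 fetch: Callable[[str], str] = fetch_text
                 ) -> Iterator[ET.Element]:
    """Yield every record element, following resumption links."""
    url: Optional[str] = first_url
    while url:
        root = ET.fromstring(fetch(url))
        yield from root.iter("record")
        # No resumption link is taken to mean this was the last page.
        link = root.find(".//resumption/link")
        url = link.get("href") if link is not None else None
```

A production client would likely also add a delay between requests and error handling; both are omitted here for brevity.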

Identification response

Accessing the base URL of the service, without any other parameters, retrieves a response that provides information about the database.

Get database information:

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi

The response gives a list of the data formats supported (currently pdf and tgz), a count of the number of records in the OA subset (total and by format), and the dates/times of the earliest and latest updates. For example,

<OA>
  <responseDate>2021-03-15 18:07:50</responseDate>
  <request>https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi</request>
  <repositoryName>PubMed Central Open Access FTP Repository</repositoryName>
  <formats>
    <format>tgz</format>
    <format>pdf</format>
  </formats>
  <records>
    <count>3454737</count>
    <count format="tgz">3454736</count>
    <count format="pdf">816971</count>
    <latest>2021-03-15 13:16:22</latest>
  </records>
</OA>
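A response of this shape can be summarized with the standard library XML parser. This is a minimal sketch based only on the element names in the sample above; summarize_identification is a hypothetical name, and the "total" key is a label chosen here for the count that has no format attribute.

```python
import xml.etree.ElementTree as ET


def summarize_identification(response_xml: str) -> dict:
    """Extract formats, record counts, and latest update time."""
    root = ET.fromstring(response_xml)
    counts = {}
    for count in root.findall("records/count"):
        # The count without a format attribute is the overall total.
        counts[count.get("format", "total")] = int(count.text)
    return {
        "repository": root.findtext("repositoryName"),
        "formats": [f.text for f in root.findall("formats/format")],
        "counts": counts,
        "latest": root.findtext("records/latest"),
    }
```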

Results Set Response

Adding parameters to the request causes the service to return a Results Set Response, which includes information about a set of records in the database, as the following example illustrates.

Get a record by id:

https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?id=PMC5334499

In addition to echoing the response date and time, this will provide the article's citation, license, and retraction status, as well as information about any downloadable resources for that article, for example:

<OA>
  <responseDate>2019-01-28 10:41:16</responseDate>
  <request id="PMC5334499">https://www.ncbi.nlm.nih.gov/utils/oa/oa.fcgi?id=PMC5334499</request>
  <records returned-count="2" total-count="2">
    <record id="PMC5334499" citation="World J Radiol. 2017 Feb 28; 9(2):27-33" 
        license="CC BY-NC" retracted="no">
      <link format="tgz" updated="2017-03-17 13:10:45"
        href="ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/8e/71/PMC5334499.tar.gz"/>
      <link format="pdf" updated="2017-03-03 06:05:17"
        href="ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/8e/71/WJR-9-27.PMC5334499.pdf"/>
    </record>
  </records>
</OA>
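The download links in such a response can be collected as follows. This is a minimal sketch that relies only on the attribute names in the sample above; list_downloads and its fmt argument are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional


def list_downloads(response_xml: str,
                   fmt: Optional[str] = None) -> List[Dict[str, str]]:
    """Return one dict per download link, optionally filtered by format."""
    root = ET.fromstring(response_xml)
    downloads = []
    for record in root.iter("record"):
        for link in record.findall("link"):
            if fmt is None or link.get("format") == fmt:
                downloads.append({
                    "id": record.get("id"),
                    "license": record.get("license"),
                    "format": link.get("format"),
                    "updated": link.get("updated"),
                    "href": link.get("href"),
                })
    return downloads
```

Calling list_downloads(xml_text, fmt="pdf") on the sample response would return only the PDF link for PMC5334499.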