OA Web Service API
The PMC OA Web Service API allows users to discover downloadable resources from the PMC Open Access Subset. These articles are available for download from our FTP site in tgz (tar'd, gzipped) format, or, for those articles that have them, in PDF format as well.
This API allows discovery of resources related to articles. For example, it can be used to find the PDFs of all articles that have been updated since a specified date. This could facilitate implementing tools that reuse the OA subset content, such as mirror sites, text mining processes, etc.
If you have questions or comments about this service, please write to the PMC help desk. To stay informed about new or updated tools or services provided by PMC, subscribe to the PMC-Utils-Announce mailing list.
The base URL for the service is https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi.
- Requests to the service use HTTP GET or POST, with a set of parameters that specify the desired data.
- There are two types of responses:
  - an identification response, which returns information about the service and database as a whole, and
  - a results set response, which returns a list of records.
Date and Time Information
All dates and times are given in local time in Bethesda, Maryland: either EST (-05:00) or EDT (-04:00), depending on the time of year. There is a space separating the date from the time in responses. In URLs used for requests the space must be represented as either "+" or "%20".
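As a minimal sketch of the two space encodings mentioned above, the following uses Python's standard urllib.parse module. The helper name encode_timestamp is hypothetical and not part of the service. Keeping colons unescaped is a readability choice made here to match the example URLs in this document.

```python
from urllib.parse import quote, quote_plus


def encode_timestamp(value: str, plus: bool = True) -> str:
    """Encode a 'YYYY-MM-DD HH:MM:SS' string for use as a query value.

    Colons are left literal (safe=":"); the space separating date and
    time becomes "+" when plus is True, otherwise "%20".
    """
    if plus:
        return quote_plus(value, safe=":")
    return quote(value, safe=":")


print(encode_timestamp("2021-01-01 08:00:00"))
print(encode_timestamp("2021-01-01 08:00:00", plus=False))
```

Either form should be usable as the value of a date/time parameter in a request URL.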
Maximum Number of Results and Resumption Tokens
If there are more than 1000 records in a result set, then only the first 1000 will be returned, and the response will end with a resumption element whose link points to the next page of results:
<resumption>
<link token="1102623!20130101000000!!!a1e8c64fd7952a09"
href="https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?resumptionToken=1102623!20130101000000!!!a1e8c64fd7952a09"/>
</resumption>
To get the next 1000 records in a result set, request the URL given in the href attribute of the link element.
Error responses
If there is any error in the request parameters, then an error response will be produced.
Parameters
id Parameter
Use the id parameter to request information about a particular PMCID, e.g.
https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?id=PMC5334499
from Parameter
Use the from parameter to request all records updated on or after the date or date and time specified in the parameter.
- Specify a date - Use YYYY-MM-DD format
- Specify a date/time combination - Use YYYY-MM-DD HH:MM:SS format.
Dates and times in responses are in local time in Bethesda, Maryland.
Note that in a URL, the space separating the date and the time can be represented either as a "+" or as "%20".
From a specific date:
https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?from=2021-01-01
From a specific date and time:
https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?from=2021-01-01+08:00:00
until Parameter
Use the until parameter together with the from parameter to request all records between the dates (or dates and times) specified in the parameters, e.g.
https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?from=2019-01-02&until=2019-01-02+07:00:00
format Parameter
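Use the format parameter to restrict the results to records available in a given format. For example, to return only records that have PDFs: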
https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?from=2019-01-01&format=pdf
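The from, until, and format parameters can be combined into a request URL programmatically. The sketch below is one way to do that with Python's standard library. The function name build_request_url and its argument names are hypothetical; only the base URL and the parameter names come from this document.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi"


def build_request_url(
    from_: Optional[str] = None,
    until: Optional[str] = None,
    fmt: Optional[str] = None,
) -> str:
    """Assemble a request URL from the from, until, and format parameters."""
    params = {}
    if from_ is not None:
        params["from"] = from_
    if until is not None:
        params["until"] = until
    if fmt is not None:
        params["format"] = fmt
    if not params:
        # No parameters: this is the identification request.
        return BASE_URL
    # urlencode turns the space in a date/time value into "+";
    # safe=":" keeps the colons in the time unescaped.
    return BASE_URL + "?" + urlencode(params, safe=":")


print(build_request_url(from_="2019-01-02", until="2019-01-02 07:00:00"))
print(build_request_url(from_="2019-01-01", fmt="pdf"))
```

The two calls reproduce the until and format example URLs shown above.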
resumptionToken Parameter
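If your request returns more than 1000 results, the response will contain the first 1000 and will end with a resumption element. The link inside that element carries a token. To get the next 1000 records in the result set, pass that token in the resumptionToken parameter, or simply request the URL in the link's href attribute.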
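Following resumption links until a result set is exhausted can be sketched as a small generator. This is an illustration under stated assumptions, not a definitive client. The names iter_records and fetch are hypothetical. The fetch argument is a caller-supplied function that takes a URL and returns the response body as text. The sketch also assumes that a page without a resumption element is the last page. The sample pages and record ids (PMC1, PMC2) are invented so the loop can be exercised without network access.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator


def iter_records(first_url: str, fetch: Callable[[str], str]) -> Iterator[ET.Element]:
    """Yield every <record> element, following <resumption> links page by page."""
    url = first_url
    while url:
        root = ET.fromstring(fetch(url))
        yield from root.iter("record")
        # Look for <resumption><link href="..."/></resumption> at any depth.
        link = root.find(".//resumption/link")
        url = link.get("href") if link is not None else None


# Invented two-page result set keyed by placeholder "URLs".
pages = {
    "page1": '<OA><records><record id="PMC1"/>'
             '<resumption><link token="t" href="page2"/></resumption>'
             '</records></OA>',
    "page2": '<OA><records><record id="PMC2"/></records></OA>',
}
print([r.get("id") for r in iter_records("page1", pages.__getitem__)])
```

Passing the fetch function in as an argument keeps the paging logic independent of any particular HTTP library.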
Identification response
Accessing the base URL of the service, without any other parameters, retrieves a response that provides information about the database.
Get database information:
https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi
The response gives a list of the data formats supported (currently pdf and tgz), a count of the number of records in the OA subset (total and by format), and the dates/times of the earliest and latest updates. For example,
<OA>
  <responseDate>2021-03-15 18:07:50</responseDate>
  <request>https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi</request>
  <repositoryName>PubMed Central Open Access FTP Repository</repositoryName>
  <formats>
    <format>tgz</format>
    <format>pdf</format>
  </formats>
  <records>
    <count>3454737</count>
    <count format="tgz">3454736</count>
    <count format="pdf">816971</count>
    <latest>2021-03-15 13:16:22</latest>
  </records>
</OA>
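An identification response like the one above can be read with Python's standard xml.etree.ElementTree module. The following is a minimal sketch. The function name parse_identification and the "total" key are choices made for this illustration, and the element names are taken from the sample response rather than from a published schema.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<OA>
  <responseDate>2021-03-15 18:07:50</responseDate>
  <request>https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi</request>
  <repositoryName>PubMed Central Open Access FTP Repository</repositoryName>
  <formats><format>tgz</format><format>pdf</format></formats>
  <records>
    <count>3454737</count>
    <count format="tgz">3454736</count>
    <count format="pdf">816971</count>
    <latest>2021-03-15 13:16:22</latest>
  </records>
</OA>"""


def parse_identification(xml_text: str) -> dict:
    """Extract formats, record counts, and the latest update time."""
    root = ET.fromstring(xml_text)
    counts = {}
    for count in root.iter("count"):
        # The <count> without a format attribute is the overall total.
        counts[count.get("format", "total")] = int(count.text)
    return {
        "repository": root.findtext("repositoryName"),
        "formats": [f.text for f in root.iter("format")],
        "counts": counts,
        "latest": root.findtext("records/latest"),
    }


info = parse_identification(SAMPLE)
print(info["formats"], info["counts"], info["latest"])
```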
Results Set Response
Adding parameters to the request causes the service to return a Results Set Response, which includes information about a set of records in the database, as the following example illustrates.
Get a record by id:
https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?id=PMC5334499
In addition to echoing the response date and time, this will provide the article's citation, license, and retraction status, as well as information about any downloadable resources for that article, for example:
<OA>
  <responseDate>2019-01-28 10:41:16</responseDate>
  <request id="PMC5334499">https://www.ncbi.nlm.nih.gov/utils/oa/oa.fcgi?id=PMC5334499</request>
  <records returned-count="2" total-count="2">
    <record id="PMC5334499" citation="World J Radiol. 2017 Feb 28; 9(2):27-33" license="CC BY-NC" retracted="no">
      <link format="tgz" updated="2017-03-17 13:10:45" href="ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/8e/71/PMC5334499.tar.gz"/>
      <link format="pdf" updated="2017-03-03 06:05:17" href="ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/8e/71/WJR-9-27.PMC5334499.pdf"/>
    </record>
  </records>
</OA>
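The record attributes and download links in a results set response can be pulled out in the same way. This sketch assumes the structure of the sample response above. The function name parse_records and the shape of the returned dictionaries are hypothetical. Attribute values are returned as the raw strings found in the XML.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<OA>
  <responseDate>2019-01-28 10:41:16</responseDate>
  <request id="PMC5334499">https://www.ncbi.nlm.nih.gov/utils/oa/oa.fcgi?id=PMC5334499</request>
  <records returned-count="2" total-count="2">
    <record id="PMC5334499" citation="World J Radiol. 2017 Feb 28; 9(2):27-33" license="CC BY-NC" retracted="no">
      <link format="tgz" updated="2017-03-17 13:10:45" href="ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/8e/71/PMC5334499.tar.gz"/>
      <link format="pdf" updated="2017-03-03 06:05:17" href="ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/8e/71/WJR-9-27.PMC5334499.pdf"/>
    </record>
  </records>
</OA>"""


def parse_records(xml_text: str) -> list:
    """Return one dict per <record>, with its links keyed by format."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter("record"):
        links = {}
        for link in record.findall("link"):
            links[link.get("format")] = {
                "href": link.get("href"),
                "updated": link.get("updated"),
            }
        records.append({
            "id": record.get("id"),
            "citation": record.get("citation"),
            "license": record.get("license"),
            "retracted": record.get("retracted"),
            "links": links,
        })
    return records


for rec in parse_records(SAMPLE):
    print(rec["id"], rec["license"], sorted(rec["links"]))
```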