
Do text mining / retrieving full text

April 13, 2026: Update on PMC Article Dataset Distribution Changes

As announced on February 12, major changes to PMC's Article Dataset Distribution Services are underway.

On April 13, all legacy files for the PMC Article Datasets were moved to new temporary directories and prefixes on the PMC FTP and Cloud Services.

  • FTP Service: all legacy files were moved to a new directory named "deprecated."
  • Cloud Service: all legacy prefixes were updated to add "deprecated" to the prefix. Prefixes for legacy files now begin with //pmc-oa-opendata/deprecated/.

This intentional disruption alerts users to the upcoming changes to the PMC Cloud Service on AWS, while allowing for easy updates to keep existing automated workflows running. We encourage users of the legacy PMC FTP and PMC Cloud Services to begin working with the updated PMC Cloud Service structure and to adjust existing workflows.
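As a rough illustration of the kind of workflow adjustment described above, the following minimal sketch rewrites a legacy object key or S3 URL so that it points under the temporary "deprecated" prefix. The bucket name and the "deprecated" prefix come from the notice above; the `s3://` URL form, the helper names, and the example key are assumptions made for illustration, not part of any PMC tooling. Because legacy files are scheduled for removal in August 2026, this is only a stopgap while a workflow is migrated to the updated structure.

```python
# Minimal sketch: point legacy paths at the temporary "deprecated" prefix.
# Helper names and the example key are hypothetical.

BUCKET = "pmc-oa-opendata"          # bucket name taken from the notice
DEPRECATED_PREFIX = "deprecated/"   # temporary prefix for legacy files


def to_deprecated_key(legacy_key: str) -> str:
    """Return the object key with the 'deprecated/' prefix added once."""
    key = legacy_key.lstrip("/")
    if key.startswith(DEPRECATED_PREFIX):
        return key  # already rewritten; avoid double-prefixing
    return DEPRECATED_PREFIX + key


def to_deprecated_s3_url(legacy_url: str) -> str:
    """Rewrite an s3:// URL in the PMC bucket to use the deprecated prefix."""
    base = f"s3://{BUCKET}/"
    if not legacy_url.startswith(base):
        raise ValueError(f"not a URL in the {BUCKET} bucket: {legacy_url}")
    return base + to_deprecated_key(legacy_url[len(base):])


if __name__ == "__main__":
    # "example/path/file.xml" is a placeholder, not a real PMC key.
    print(to_deprecated_s3_url("s3://pmc-oa-opendata/example/path/file.xml"))
```

Making the rewrite idempotent means a script can be rerun safely on a list of paths that has already been partially updated.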

All legacy files on the FTP and Cloud Services will be removed in August 2026.

For complete details about this transition, please see the NCBI Insights blog post and our documentation on Accessing PMC Article Datasets Using Amazon Web Services.

The majority of articles in PMC are subject to traditional copyright restrictions and are not available for bulk download. However, we do make several large datasets of journal articles and other scientific publications available for automated retrieval, under license terms that generally allow more liberal redistribution and reuse than a traditionally copyrighted work. We provide multiple ways of retrieving the full text programmatically, as described on the PMC Article Datasets page.

NLM provides cloud service access to the PMC Article Datasets. As part of this service, content from these datasets is accessible to users on Amazon Web Services (AWS), without charge, through either an HTTPS or S3 URL, and without any log-in requirement for retrieval. Cloud Service documentation is available on the PMC Cloud Service and Accessing PMC Article Datasets Using AWS pages.
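To make the two access forms concrete, here is a small sketch that builds both an S3 URL and an HTTPS URL for the same object. The bucket name `pmc-oa-opendata` appears in the notice above; the HTTPS form assumes the generic Amazon S3 virtual-hosted-style address (`https://<bucket>.s3.amazonaws.com/<key>`), which may differ from the endpoint given in the PMC Cloud Service documentation, and the function names and example key are hypothetical.

```python
# Minimal sketch: two URL forms for one object in the PMC bucket.
# The HTTPS pattern is the generic S3 virtual-hosted style (an assumption here);
# check the PMC Cloud Service documentation for the authoritative endpoint.
from urllib.parse import quote

BUCKET = "pmc-oa-opendata"  # bucket name taken from the notice


def s3_url(key: str) -> str:
    """Return the s3:// form, as used by S3-aware clients."""
    return f"s3://{BUCKET}/{key.lstrip('/')}"


def https_url(key: str) -> str:
    """Return the HTTPS form; the key is percent-encoded, slashes are kept."""
    return f"https://{BUCKET}.s3.amazonaws.com/{quote(key.lstrip('/'))}"


if __name__ == "__main__":
    key = "example/path/file.xml"  # placeholder key, not a real PMC object
    print(s3_url(key))
    print(https_url(key))
```

Since no log-in is required for retrieval, an HTTPS URL built this way could in principle be fetched with any ordinary HTTP client, while the S3 form suits tools that speak the S3 protocol and are configured for anonymous access.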