How to Cite a Dataset in APA Format
Understanding Dataset Citations in APA
Datasets have become critical research materials across all disciplines. Researchers increasingly make their data available for verification and further analysis. APA format provides guidelines for properly citing datasets, recognizing data creators’ contributions and enabling reproducible research.
Basic Dataset Citation Format
In APA Style (7th edition), the standard format for datasets is: Creator(s). (Year). Dataset name (Version) [Data set]. Repository Name. https://doi.org/DOI or URL
Include the dataset creator’s name, the publication year, the dataset title in italics, the version number if there is one, the bracketed description [Data set], the repository name, and the DOI or URL for access.
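The template above can be sketched as a small Python helper. This is only an illustration, not an official APA tool: the function name and its parameters are hypothetical, creator names are assumed to arrive already inverted ("Smith, J."), and the optional `description` argument holds the bracketed label that APA 7 places after a dataset title, such as [Data set]. Italics cannot be shown in a plain string, so styling the title is left to your word processor.

```python
from typing import Optional, Sequence


def format_dataset_reference(
    creators: Sequence[str],
    year: int,
    title: str,
    repository: str,
    locator: str,
    version: Optional[str] = None,
    description: Optional[str] = None,
) -> str:
    """Join the citation elements in order: creators, year, title, repository, DOI/URL."""
    if not creators:
        raise ValueError("at least one creator is required")

    # One creator is used as-is; with several, the last is joined with ", & ".
    if len(creators) == 1:
        names = creators[0]
    else:
        names = ", ".join(creators[:-1]) + f", & {creators[-1]}"
    # Personal names already end with an initial's period; organizations do not.
    if not names.endswith("."):
        names += "."

    title_part = title
    if version:
        title_part += f" (Version {version})"
    if description:
        title_part += f" [{description}]"

    return f"{names} ({year}). {title_part}. {repository}. {locator}"


print(
    format_dataset_reference(
        ["Smith, J.", "Johnson, M."],
        2024,
        "Global climate measurements 2024",
        "Zenodo",
        "https://doi.org/10.5281/zenodo.1234567",
        description="Data set",
    )
)
```

Keeping each element as a separate argument makes it harder to forget one, and the same helper covers both personal and organizational creators.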
Dataset Citation with DOI
A complete dataset citation:
Smith, J., & Johnson, M. (2024). Global climate measurements 2024 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1234567
The DOI provides a persistent link to the dataset.
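Repositories display DOIs in several forms (a bare identifier, a "doi:" prefix, or an older dx.doi.org link), while APA 7 presents them as https://doi.org/ links. A minimal sketch of normalizing them follows; the function name is hypothetical and the check is deliberately loose (it only confirms that the identifier begins with "10."), so it does not prove that the DOI resolves.

```python
import re

# Prefixes commonly seen in front of a DOI: resolver URLs or a "doi:" label.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def doi_to_url(raw: str) -> str:
    """Return the DOI as an https://doi.org/ link."""
    doi = _DOI_PREFIX.sub("", raw.strip())
    if not doi.startswith("10."):
        raise ValueError(f"does not look like a DOI: {raw!r}")
    return f"https://doi.org/{doi}"


print(doi_to_url("doi:10.5281/zenodo.1234567"))
# https://doi.org/10.5281/zenodo.1234567
```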
Dataset with Version Number
For datasets with specific versions:
Williams, R., Chen, S., & Brown, J. (2023). Economic indicators dataset (Version 3.2) [Data set]. World Bank Data Repository. https://data.worldbank.org/dataset/economic-indicators
Include the version number to ensure reproducibility.
Organization as Dataset Creator
When an organization publishes data:
U.S. Census Bureau. (2024). American Community Survey 5-year estimates [Data set]. U.S. Census Bureau Data Repository. https://www.census.gov/acs
Government Dataset Citation
For federal datasets:
National Oceanic and Atmospheric Administration. (2024). Global surface temperature records [Data set]. NOAA Climate Data Repository. https://www.noaa.gov/climate-data
Include the agency name and the specific repository.
In-Text Citations for Datasets
For in-text citations, use the author-date format: (Smith & Johnson, 2024) or (U.S. Census Bureau, 2024).
For direct references to specific data points: (Smith & Johnson, 2024, Table 3) or (National Oceanic and Atmospheric Administration, 2024, p. 15).
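The author-date pattern can be sketched in Python as follows. The function name is hypothetical. Beyond the one- and two-creator examples above, the sketch also applies the APA 7 rule that a work with three or more creators is shortened to the first surname plus "et al.".

```python
from typing import Optional, Sequence


def in_text_citation(
    surnames: Sequence[str], year: int, locator: Optional[str] = None
) -> str:
    """Build a parenthetical citation such as (Smith & Johnson, 2024, Table 3)."""
    if not surnames:
        raise ValueError("at least one creator is required")

    if len(surnames) == 1:
        who = surnames[0]  # a single person or an organization
    elif len(surnames) == 2:
        who = f"{surnames[0]} & {surnames[1]}"
    else:
        who = f"{surnames[0]} et al."  # three or more creators

    parts = [who, str(year)]
    if locator:
        parts.append(locator)  # e.g. "Table 3" for a specific data point
    return "(" + ", ".join(parts) + ")"


print(in_text_citation(["Smith", "Johnson"], 2024, "Table 3"))
# (Smith & Johnson, 2024, Table 3)
```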
Dataset from Academic Repository
For datasets from university repositories:
Garcia, M., Lopez, J., & Rodriguez, A. (2023). Longitudinal educational achievement data [Data set]. Harvard Dataverse. https://doi.org/10.7910/DVN/EXAMPLE
Include the repository name and DOI.
Time Series Dataset
For continuous or regularly updated datasets:
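Federal Reserve Bank of St. Louis. (2024). Unemployment rates monthly [Data set]. Federal Reserve Economic Data. Retrieved Month Day, Year, from https://fred.stlouisfed.org/
Because the data change over time, APA 7 calls for a retrieval date before the URL. Replace Month Day, Year with the date you accessed the dataset.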
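For sources that change over time, APA 7 words the retrieval statement as "Retrieved Month Day, Year, from URL". A short sketch of producing that wording from an access date is below. The function name is hypothetical, and the date in the usage line is a made-up example, not a recommended value. Month names are spelled out in the code rather than taken from the system locale, so the output stays in English on any machine.

```python
from datetime import date

_MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def retrieval_statement(accessed: date, url: str) -> str:
    """Return the retrieval wording for a continuously updated dataset."""
    month = _MONTHS[accessed.month - 1]
    return f"Retrieved {month} {accessed.day}, {accessed.year}, from {url}"


# Hypothetical access date, for illustration only.
print(retrieval_statement(date(2024, 3, 5), "https://fred.stlouisfed.org/"))
# Retrieved March 5, 2024, from https://fred.stlouisfed.org/
```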
Qualitative Dataset Citation
For qualitative research datasets:
Smith, J., Brown, K., & Lee, T. (2023). Interview transcripts: Technology adoption in small businesses [Data set]. OSF Registries. https://osf.io/example
Examples for Different Dataset Types
Health Science Dataset
National Center for Health Statistics. (2023). National Health and Nutrition Examination Survey (NHANES) dataset [Data set]. Centers for Disease Control and Prevention. https://www.cdc.gov/nchs/nhanes
Genomic Data
International Human Genome Sequencing Consortium. (2024). Genomic reference sequences [Data set]. NCBI GenBank. https://www.ncbi.nlm.nih.gov/genbank
Social Science Dataset
Inter-university Consortium for Political and Social Research. (2023). American National Election Study dataset [Data set]. University of Michigan. https://doi.org/10.3886/ICPSR35157.v4
Environmental Data
National Aeronautics and Space Administration. (2024). Satellite Earth observation data [Data set]. NASA Earth Data Repository. https://earthdata.nasa.gov/
Citing Specific Datasets within Collections
When referencing a specific dataset within a larger collection:
European Environment Agency. (2024). Air quality dataset: Ozone levels 2024 [Data set]. European Environment Information and Observation Network. https://www.eea.europa.eu/data
Include the specific dataset name from the collection.
Using GenText for Dataset Citations
GenText streamlines dataset citation in APA format by organizing creator information, dataset titles, and repository details. The tool ensures proper DOI inclusion and consistent formatting.
Reference List Formatting
In the reference list, arrange dataset citations alphabetically by the first creator’s last name or by the organization name. Use a hanging indent, so that every line after the first line of an entry is indented.
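As a rough illustration, sorting and hanging indents can be approximated in plain text with Python's standard `textwrap` module. The function name, the four-space indent, and the line width are assumptions made for this sketch. A word processor's paragraph settings are the usual way to get a true hanging indent, and the simple case-insensitive sort here does not cover every APA alphabetization rule.

```python
import textwrap
from typing import Iterable, List


def format_reference_list(entries: Iterable[str], width: int = 80) -> List[str]:
    """Sort entries alphabetically and wrap each one with a hanging indent."""
    ordered = sorted(entries, key=str.casefold)  # case-insensitive A to Z
    return [
        textwrap.fill(
            entry,
            width=width,
            subsequent_indent="    ",  # indent every line after the first
            break_long_words=False,    # keep DOIs and URLs in one piece
            break_on_hyphens=False,
        )
        for entry in ordered
    ]


references = [
    "U.S. Census Bureau. (2024). American Community Survey 5-year estimates. "
    "U.S. Census Bureau Data Repository. https://www.census.gov/acs",
    "Garcia, M., Lopez, J., & Rodriguez, A. (2023). Longitudinal educational "
    "achievement data. Harvard Dataverse. https://doi.org/10.7910/DVN/EXAMPLE",
]
print("\n".join(format_reference_list(references, width=60)))
```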
Common Citation Elements
Essential elements for dataset citations:
- Creator(s) or organization name
- Publication year
- Dataset title in italics, followed by the bracketed description [Data set]
- Repository name
- DOI or URL
- Version number (if applicable)
- Access date (only if content is likely to change)
Common Citation Mistakes
- Missing DOI: Always include the DOI when one is available, because it gives a persistent link.
- Incomplete repository information: Specify the repository housing the dataset.
- Missing version number: Include version information for reproducibility.
- Inconsistent title formatting: Dataset titles should be italicized.
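Some of these mistakes can be caught mechanically before a reference is written out. The sketch below assumes each dataset is stored as a dictionary with the hypothetical keys `doi`, `url`, `repository`, and `version`. It only flags missing fields; title italics cannot be checked in plain data and still need a visual review.

```python
from typing import Dict, List


def check_dataset_record(record: Dict[str, str]) -> List[str]:
    """Return a list of warnings for a dataset record; an empty list means none found."""
    problems: List[str] = []

    if not record.get("doi"):
        if record.get("url"):
            problems.append("no DOI: confirm the repository has not issued one")
        else:
            problems.append("missing DOI or URL")
    if not record.get("repository"):
        problems.append("missing repository name")
    if not record.get("version"):
        problems.append("no version number: add one if the dataset is versioned")

    return problems


record = {"url": "https://www.census.gov/acs", "repository": ""}
for warning in check_dataset_record(record):
    print(warning)
```

The version warning is worded as a prompt rather than an error, since many datasets are never versioned.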
Finding Datasets
Locating open datasets:
- Zenodo (zenodo.org) - Multidisciplinary repository
- Figshare (figshare.com) - Research outputs
- Open Science Framework (osf.io)
- Google Dataset Search (datasetsearch.research.google.com)
- Data.gov (data.gov) - U.S. government data
- Harvard Dataverse (dataverse.harvard.edu)
- Kaggle (kaggle.com) - Data science datasets
Data Reuse Considerations
When citing datasets for reuse:
- Check license and attribution requirements
- Verify the dataset version and access date
- Confirm data quality and documentation
- Consider whether the data has been updated
- Review terms of use for any restrictions
When to Cite Datasets
Dataset citations are important for:
- Reproducible research
- Secondary analysis of existing data
- Data-driven studies
- Open science practices
- Meta-analyses using shared data
- Demonstrating methodological transparency
By following APA guidelines for dataset citations, you acknowledge data creators’ contributions and support open, reproducible research practices.
Frequently Asked Questions
What is the basic APA format for citing a dataset?
The format is: Creator(s). (Year). Dataset name [Data set]. Repository Name. DOI or URL. Include the creator’s name, the publication year, the dataset title, the bracketed description, the repository where the dataset is housed, and the DOI or URL.
How do I cite a dataset from Zenodo or Figshare?
Include the creator, year, dataset title, bracketed description, repository name, and DOI. Format: Smith, J., & Johnson, M. (2024). Climate data measurements [Data set]. Zenodo. https://doi.org/10.5281/zenodo.example
Should I include the specific version number of a dataset?
Yes, if a version number is available. Include it after the dataset title: Dataset name (Version 2.1). Version numbers help ensure reproducibility.