(previously the Data Archiving and Citation Guidelines page, updated 15 May 2020)
- Social Science Data and Data Restrictions
- About Archiving Data
- Data Availability Statement
The American Meteorological Society (AMS) is committed to promoting full, open, and timely access to the environmental data, associated metadata, and derived data products that underlie scientific findings and are reported in our publications (see the updated 2019 AMS policy statement “Full, Open, and Timely Access to Data”). These data and metadata must be properly cited and readily available to the scientific community and the general public. Social science data may need to be handled differently, as described in the next section. Embargoes on data sharing are discouraged; where an embargo applies, it must be approved by the Editor and described in the Data Availability Statement.
AMS, as a member of the Coalition on Publishing Data in the Earth and Space Sciences (COPDESS), is working toward becoming a signatory of the Enabling FAIR Data initiative (Stall et al. 2018), which is committed to aligning publishers, repositories, and other organizations in the Earth, space, and environmental sciences to enable scientific data to be FAIR: Findable, Accessible, Interoperable, and Reusable (Wilkinson et al. 2016). As part of the FAIR initiative, COPDESS recommends best practices around data and identifiers. These best practices are the basis for these AMS guidelines.
Further recommendations about best practices for data management can be found in the AMS statement on Best Practices for Data Management approved by the AMS Council in September 2019, and the references therein.
At initial submission of the manuscript, authors must confirm that their data are archived and referenced/cited properly (subject to legal and ethical restrictions; see Data Requirements for Submission). Likewise, peer review editors are asked to ensure that this AMS expectation is being met. As laid out in the AMS data policy statement, the spectrum of what constitutes “data” is diverse and includes in situ and remotely sensed observations, environmental predictions generated by numerical models, and data products derived from integrations of observational and model-generated sources. Associated software should also be archived if possible. A companion policy specifically addressing software and model data is currently being developed. Questions regarding these data policy guidelines should be directed to firstname.lastname@example.org.
Social Science Data and Data Restrictions
AMS recognizes that social science data and other data involving humans may be subject to restrictions, up to and including being unavailable. Authors must comply with applicable institutional review board and funding agency policies and regulations when collecting human subject data.
Authors using data that are unavailable or subject to restricted access, such as data with proprietary or other legal restrictions, should provide an explanation to the journal editor at initial submission and in the Data Availability Statement section of the manuscript, which immediately follows the Acknowledgments section. More information about data availability statements can be found in the Data Availability Statement section below and on the Data Availability Statement Examples page.
About Archiving Data
Authors are expected to have archived core research outputs (data, software, appropriate samples and sample descriptions) to FAIR-aligned repositories, following the Enabling FAIR Data principles. This means that article supplemental material should no longer be used as a primary archive for data. AMS also strongly discourages the archiving of data on personal servers and websites because of their lack of permanence.
FAIR-aligned repositories provide additional quality checks around domain data and data services, and facilitate discovery and reuse of data and other research outputs. Authors who are unsure about appropriate repositories should refer to the Data Archiving Guidance page for further information. More information about the Enabling FAIR Data guidelines is available at the project FAQ page.
AMS strongly prefers that research data be made available under open licenses that permit free reuse, but does not enforce particular licenses where research data are deposited in third-party repositories. AMS does not require transfer of copyright for research data.
Referencing and Citing Data
Authors should cite and link to the data in the article, following the general guidelines below (which are derived from the Joint Declaration of Data Citation Principles and ESIP Guidelines), using the unique, resolvable, and persistent identifiers provided by the repository in which the data are archived. Specific examples can be viewed on the Data Reference and Citation Examples page.
In particular, citations should appear in the body of the article with a corresponding reference in the reference list. Citations should include persistent identifiers in well-formed references to data and software, so they can be accurately tracked. Software used in the research should also be cited, following the FORCE11 Software Citation Principles, which recommend similarly depositing the software in an archival repository and providing citations/references that include the persistent identifiers provided by the repository. An AMS-specific policy on software citation and archiving is being developed, overseen by the Board on Data Stewardship.
Citing dataset references in text
The in-text citations for dataset references should be formatted the same as other publication types, using the author’s name and year of publication [e.g., “dataset produced by Knutti (2014),” or “as shown by an earlier dataset (Knutti 2014)”].
When dataset authors consist of organizations with lengthy names, abbreviate the author names appropriately [e.g., use “(NCEP 2005)” instead of “(National Centers for Environmental Prediction 2005)”]. If the citation is for a reference with two authors, use both author names [e.g., “Yeager and Large (2008)”]. References with three or more authors are always cited as the first author’s name followed by “et al.” [e.g., “Lawrimore et al. (2011)”].
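The author-count rules above can be sketched as a small Python helper. This is only an illustration, not an AMS tool: the function name in_text_citation and its parameters are hypothetical, organization names are assumed to be abbreviated by the caller before being passed in, and the coauthor names in the last example are placeholders.

```python
def in_text_citation(authors, year, narrative=False):
    """Build an author-year in-text citation from a list of surnames.

    `authors` holds surnames or already-abbreviated organization names
    (e.g., "NCEP"); this hypothetical helper does no abbreviation itself.
    """
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) == 1:
        names = authors[0]
    elif len(authors) == 2:
        # Two authors: use both names.
        names = f"{authors[0]} and {authors[1]}"
    else:
        # Three or more authors: first author followed by "et al."
        names = f"{authors[0]} et al."
    if narrative:
        return f"{names} ({year})"
    return f"({names} {year})"


print(in_text_citation(["Knutti"], 2014))
print(in_text_citation(["NCEP"], 2005))
print(in_text_citation(["Yeager", "Large"], 2008, narrative=True))
# "Coauthor B" and "Coauthor C" are placeholders, not the real author list.
print(in_text_citation(["Lawrimore", "Coauthor B", "Coauthor C"], 2011, narrative=True))
```

The narrative flag switches between the two forms shown in the examples above: the author as part of the sentence, or the whole citation in parentheses.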
Unpublished or inaccessible data
Data that are not curated or cannot be reliably made available should not be included in the references and should be cited directly in the text as “unpublished data,” giving the names of the person(s) who provided the data and the year in which they were provided. If the unpublished data are the authors’ own data, the authors’ names should be listed with the year of dataset creation [e.g., “J. Weatherly (2017, unpublished data)” or “(J. Weatherly 2017, unpublished data)”].
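As a minimal sketch of the two in-text forms for unpublished data, the following Python function reproduces the pattern of the Weatherly examples. The function name unpublished_citation and its parameters are assumptions made for illustration only.

```python
def unpublished_citation(provider, year, narrative=False):
    """Format an in-text citation for unpublished data.

    `provider` is the name of the person who supplied the data (initial
    plus surname). Such data are cited in the text only and get no entry
    in the reference list. Hypothetical helper for illustration.
    """
    if narrative:
        return f"{provider} ({year}, unpublished data)"
    return f"({provider} {year}, unpublished data)"


print(unpublished_citation("J. Weatherly", 2017, narrative=True))
print(unpublished_citation("J. Weatherly", 2017))
```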
Citing processed/derived data
Findings presented in scientific articles are often the product of multistage workflows that involve combining, extracting, processing, and deriving datasets. Information generated by numerical simulation models should also be regarded as derived data. In these cases, citations should be to any dataset(s) from an external source, and, if possible, to the final derived dataset(s), if they are archived in a reliable location. The goal is to provide transparency and traceability for the results of computational processes and models. In some cases, it may be more appropriate to provide citation and access to processing or model software than to the output data themselves. Questions about these cases should be discussed with the journal editors.
Citing papers that describe a dataset vs citing a dataset
Avoid citing only the published paper that describes a dataset or presents findings that are based on a dataset. Such papers may not link directly to the dataset and/or might be out of synchronization with the dataset, particularly when a dataset is updated or revised. It is best practice to cite both the paper and the dataset: the paper citation links to an important (but incomplete) source of information about the dataset, and the data citation links directly to the dataset and associated metadata.
Acknowledging a dataset vs citing a dataset
The Acknowledgments section may include a brief statement if explicitly requested by the dataset provider, but the author must also create a formal citation to ensure the inclusion of detailed information about the dataset. Authors who are unsure of the details needed for a dataset citation should contact the dataset provider for the specific information.
Refer to the AMS Data Reference and Citation Examples page for specific examples of how to reference and cite data according to AMS style. Citations should include as much of the following information as possible: dataset or software authors/producers; release date; title; version; archive/distributor; locator/identifier (a persistent identifier such as a DOI is preferred); and year.
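One way to check a draft reference against the list of elements above is a simple record with one field per element. The sketch below is an assumption-laden illustration: the class name DataCitation, its field names, and the sample values (including the placeholder DOI) are hypothetical, and it does not produce AMS reference formatting, for which the examples page remains the authority.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataCitation:
    """One field per citation element listed in the guidelines."""
    authors: Optional[str] = None        # dataset or software authors/producers
    release_date: Optional[str] = None
    title: Optional[str] = None
    version: Optional[str] = None
    archive: Optional[str] = None        # archive/distributor
    identifier: Optional[str] = None     # persistent identifier, DOI preferred
    year: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Return the names of elements that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Placeholder values only; this is not a real dataset record.
draft = DataCitation(
    authors="Example Author",
    title="Example Dataset",
    identifier="https://doi.org/10.xxxx/placeholder",
    year="2020",
)
print(draft.missing_fields())  # ['release_date', 'version', 'archive']
```

An empty result means every element is present; anything listed is a prompt to check with the repository or dataset provider before submission.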
Data Availability Statement
Authors are expected to include a separate Data Availability Statement section in their manuscript immediately following the Acknowledgments section that details where data are available, and how the data can be accessed and reused (listing specific restrictions, if any). See the Data Availability Statement Examples page for more details.
In special cases where data access is restricted, authors are required to describe these restrictions in the Data Availability Statement. In short, authors should provide unrestricted access, to the greatest extent possible, to all data and materials underlying reported findings for which ethical or legal constraints do not apply.