Data Policy and Guidelines for AMS Publications

(previously the Data Archiving and Citation Guidelines page, updated 7 August 2020)

Background

The American Meteorological Society (AMS) is committed to promoting full, open, and timely access to the environmental data, associated metadata, and derived data products that underlie scientific findings and are reported in our Publications (see the updated 2019 AMS policy statement “Full, Open, and Timely Access to Data”). These data and metadata must be properly cited and readily available to the scientific community and the general public as much as possible. Social science data may need to be handled differently, as described in the section below. Embargoes on data sharing are discouraged, and must be approved by the Editor when applicable and included in the Data Availability Statement.

AMS, as a member of the Coalition on Publishing Data in the Earth and Space Sciences (COPDESS), is working toward becoming a signatory of the Enabling FAIR Data initiative (Stall et al. 2018), which is committed to aligning publishers, repositories, and other organizations in the Earth, space, and environmental sciences to enable scientific data to be FAIR: Findable, Accessible, Interoperable, and Reusable (Wilkinson et al. 2016). As part of the FAIR initiative, COPDESS recommends best practices around data and identifiers. These best practices are the basis for these AMS guidelines.

Further recommendations about best practices for data management can be found in the AMS statement on Best Practices for Data Management approved by the AMS Council in September 2019, and the references therein.

Requirements for Authors

While AMS is committed to the FAIR principles cited above, our data policies are designed to be flexible enough that no author is excluded from submitting to our journals, especially because of resource limitations. Special circumstances should be discussed with the journal Editor and explained in the Data Availability Statement section of the manuscript. Data requirements for manuscript submission include the following:

  • Authors must confirm during initial submission that they are aware of the AMS data policies, including the expectation that datasets used or derived in the reported work are archived and cited/referenced properly. Please see Data Reference and Citation Examples for more information.

  • Authors are expected to have archived core research outputs (data, software, samples, etc.) to valid FAIR-aligned repositories, if possible. This includes the assignment and use of persistent identifiers such as DOIs for as much of the archived data and documentation as possible. Please see Data Archiving Guidance for more information about identifying a valid repository.

  • Authors are expected to include a Data Availability Statement section in the submitted manuscript immediately following the Acknowledgments section. The Data Availability Statement should describe where the data underlying the findings for the article are archived, and how they can be accessed and reused. See the Data Availability Statement Examples page for more information.

  • In cases where archiving is not possible, the Data Availability Statement should describe the reasons why, and what resources are available for other researchers to understand how the research being reported on was conducted.

Peer review editors are asked to ensure that the above expectations are being met.

As laid out in the AMS data policy statement, the spectrum of what constitutes “data” is diverse and includes in situ and remotely sensed observations, environmental predictions generated by numerical models, and data products derived from integrations of observational and model-generated sources. Associated software should also be archived if possible. A companion policy specifically addressing software and model data is currently being developed. Questions regarding these data policy guidelines should be directed to datapolicy@ametsoc.org.

Social Science Data and Data Restrictions

AMS recognizes that social science data and other data involving humans may be subject to restrictions, up to and including being unavailable. Authors must comply with applicable institutional review board and funding agency policies and regulations when collecting human subject data.

Authors using data that are subject to restricted access or are unavailable, such as data with proprietary or other legal restrictions, should provide an explanation to the journal Editor at initial submission and in the Data Availability Statement section of the manuscript immediately following the Acknowledgments section. More information about Data Availability Statements can be found in the section below and on the Data Availability Statement Examples page.

About Archiving Data

Authors are expected to have archived core research outputs (data, software, appropriate samples and sample descriptions) to FAIR-aligned repositories, following the Enabling FAIR Data principles. This means that article supplemental material should no longer be used as a primary archive for data. 

FAIR-aligned repositories provide additional quality checks around domain data and data services, and facilitate discovery and reuse of data and other research outputs. Authors who are unsure about appropriate repositories should refer to the Data Archiving Guidance page for further information. More information about the Enabling FAIR Data guidelines is available at the project FAQ page.

AMS strongly prefers that research data be made available under open licenses that permit free reuse, but does not enforce particular licenses where research data are deposited in third-party repositories. AMS does not require transfer of copyright for research data.

Referencing and Citing Data

Authors should cite and link to the data in the article, following the general guidelines below (which are derived from the Joint Declaration of Data Citation Principles and ESIP Guidelines), using the unique, resolvable, and persistent identifiers provided by the repository in which the data are archived. Specific examples can be viewed on the Data Reference and Citation Examples page.

In particular, citations should appear in the body of the article with a corresponding reference in the reference list. Citations should include persistent identifiers in well-formed references to data and software, so they can be accurately tracked. Software used in the research should also be cited, following the FORCE11 Software Citation Principles, which recommend similarly depositing the software in an archival repository and using citations/references that include the persistent identifiers provided by the repository. An AMS-specific policy on software citation and archiving is being developed, overseen by the Board on Data Stewardship.

Citing dataset references in text

The in-text citations for dataset references should be formatted the same as other publication types, using the author’s name and year of publication [e.g., “dataset produced by Knutti (2014),” or “as shown by an earlier dataset (Knutti 2014)”].

When dataset authors consist of organizations with lengthy names, abbreviate the author names appropriately [e.g., use “(NCEP 2005)” instead of “(National Centers for Environmental Prediction 2005)”]. If the citation is for a reference with two authors, use both author names [e.g., “Yeager and Large (2008)”]. References with three or more authors are always cited as the first author’s name followed by “et al.” [e.g., “Lawrimore et al. (2011)”].
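The author-count rules above can be sketched in a few lines of Python. This is only an illustration, not an AMS tool: the function name `in_text_citation` and its `narrative` flag are hypothetical, and the caller is assumed to pass surnames or already-abbreviated organization names (e.g., "NCEP").

```python
def in_text_citation(authors, year, narrative=False):
    """Build an author-year citation from a list of surnames or
    abbreviated organization names (hypothetical helper)."""
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) == 1:
        names = authors[0]
    elif len(authors) == 2:
        # Two authors: use both names.
        names = f"{authors[0]} and {authors[1]}"
    else:
        # Three or more authors: first author followed by "et al."
        names = f"{authors[0]} et al."
    if narrative:
        # Form used when the authors are part of the sentence.
        return f"{names} ({year})"
    # Parenthetical form.
    return f"({names} {year})"


print(in_text_citation(["Knutti"], 2014))                        # (Knutti 2014)
print(in_text_citation(["Yeager", "Large"], 2008, True))         # Yeager and Large (2008)
print(in_text_citation(["Lawrimore", "Menne", "Gleason"], 2011, True))  # Lawrimore et al. (2011)
```

The third call passes placeholder coauthor surnames purely to exercise the three-or-more branch; only the first author appears in the result.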

Unpublished or inaccessible data

Data that are not curated or cannot be reliably made available should not be included in the references and should be cited directly in the text as “unpublished data,” giving the names of the person(s) who provided the data and the year in which they were provided. If the unpublished data are the authors’ own data, the authors’ names should be listed with the year of dataset creation, e.g., J. Weatherly (2017, unpublished data) or (J. Weatherly 2017, unpublished data).
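As a small illustration of the two forms shown above, the following Python sketch builds an unpublished-data citation for a single provider. The helper name `unpublished_citation` is hypothetical, and how to join several providers' names is not specified here, so the sketch deliberately handles one name only.

```python
def unpublished_citation(name, year, narrative=False):
    """Format an in-text citation for unpublished data from one provider
    (hypothetical helper; `name` is initials plus surname, e.g. "J. Weatherly")."""
    if narrative:
        # Provider named in the sentence: J. Weatherly (2017, unpublished data)
        return f"{name} ({year}, unpublished data)"
    # Fully parenthetical form: (J. Weatherly 2017, unpublished data)
    return f"({name} {year}, unpublished data)"


print(unpublished_citation("J. Weatherly", 2017, narrative=True))
print(unpublished_citation("J. Weatherly", 2017))
```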

Citing processed/derived data

Findings presented in scientific articles are often the product of multistage workflows that involve combining, extracting, processing, and deriving datasets. Information generated by numerical simulation models should also be regarded as derived data. In these cases, citations should be to any dataset(s) from an external source, and, if possible, to the final derived dataset(s), if they are archived in a reliable location. The goal is to provide transparency and traceability for the results of computational processes and models. In some cases, it may be more appropriate to provide citation and access to processing or model software than to the output data themselves. Questions about these cases should be discussed with the journal editors.

Citing papers that describe a dataset vs citing a dataset

Avoid citing only the published paper that describes a dataset or presents findings that are based on a dataset. Such papers may not link directly to the dataset and/or might be out of synchronization with the dataset, particularly when a dataset is updated or revised. It is best practice to cite both the paper and the dataset: the paper citation links to an important (but incomplete) source of information about the dataset, and the data citation links directly to the dataset and associated metadata.

Acknowledging a dataset vs citing a dataset

The acknowledgments section may include a brief statement if explicitly requested by the dataset provider, but the author must also create a formal citation to ensure the inclusion of detailed information about the dataset. If authors are unsure of the details needed for a dataset citation, contact the dataset provider for the specific information.

Refer to the AMS Data Reference and Citation Examples page for specific examples of how to reference and cite data according to AMS style. Citations should include as much of the following information as possible: dataset or software authors/producers; release date; title; version; archive/distributor; the locator/identifier (a persistent identifier such as a DOI is preferred); and year.
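One way to keep track of these elements while preparing a reference list is a simple record with a completeness check, sketched below in Python. The class name `DataCitation`, its field names, and the helper `missing_elements` are assumptions made for illustration; the sketch does not produce AMS reference style, for which the examples page remains the guide. All values in the example are placeholders, not a real dataset.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DataCitation:
    """The citation elements listed above (field names are illustrative)."""
    authors: Optional[str] = None       # dataset or software authors/producers
    release_date: Optional[str] = None
    title: Optional[str] = None
    version: Optional[str] = None
    archive: Optional[str] = None       # archive/distributor
    identifier: Optional[str] = None    # persistent identifier, DOI preferred
    year: Optional[int] = None


def missing_elements(citation):
    """Return the names of elements that are still empty, in field order."""
    return [f.name for f in fields(citation)
            if getattr(citation, f.name) in (None, "")]


# Placeholder values only; this is not a real dataset.
draft = DataCitation(authors="A. Author", title="Example dataset", year=2020)
print(missing_elements(draft))
# ['release_date', 'version', 'archive', 'identifier']
```

Because the guidance asks for "as much of the following information as possible," the check reports gaps rather than raising an error.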

Data Availability Statement

Authors are expected to include a separate Data Availability Statement section in their manuscript immediately following the Acknowledgments section that details where data are available, and how the data can be accessed and reused (listing specific restrictions, if any). See the Data Availability Statement Examples page for more details.

In special cases where data access is restricted, authors are required to state these restrictions in the Data Availability Statement. In short, authors should provide unrestricted access, to the greatest extent possible, to all data and materials underlying reported findings for which ethical or legal constraints do not apply.

In other cases where the data or model output cannot be archived (due to their size or nature), the Data Availability Statement should point to documentation and other resources that are available so that a transparent roadmap for how to reproduce the work is presented.