Internet of Water Principles

The Internet of Water Principles were originally developed during the Aspen Institute Dialog Series on Water Data, and published in the 2017 report “The Internet of Water: Sharing and Integrating Water Data for Sustainability.” In 2021, the Principles were revised in consultation with the Advisory Board of the Nicholas Institute’s Internet of Water Project to reflect lessons learned over the first three years of project implementation. 

Updated November 2021

Internet of Water Principles

1. Water data are essential for efficient, equitable, sustainable, and resilient water planning, management, and stewardship.

2. Modern data infrastructure increases the usefulness of water data and enables their broadest possible application.

3. Data equity is necessary for water equity; modern data infrastructure should be implemented and governed so that data are usable by and for overburdened communities.

4. All water data produced for the public good should, by default, be findable, accessible, interoperable, and reusable (FAIR) for the public or for authorized users.*

5. Security and privacy risks associated with sharing data can be mitigated using mechanisms for tiered access for authorized users.

6. Commonly accepted data, metadata, and exchange standards should be adopted by water data producers to promote interoperability, efficiency, sharing, equity, and secondary uses of data.

7. Control and responsibility over data are best maintained by data producers.

8. Data producers are responsible for sharing data of known quality and documenting essential metadata; data users are responsible for determining whether data are appropriate for specific purposes and uses.

9. Federated, distributed systems of interoperable public water data generally provide scalability and flexibility to meet the diverse needs of data producers and users.

Definitions

Accessible: Full datasets are available to the public or authorized users for download in machine-readable, non-proprietary formats.

Authorized users: The group of users that are allowed to access a given dataset. The default group of authorized users for public water data is the general public. In certain cases, such as datasets that include personally identifiable information or that represent serious security risks, this group may be limited by data producers to users with specific data use agreements or security clearances.
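One way to picture tiered access for authorized users is the minimal Python sketch below. It is an illustration only, not a prescribed implementation: the tier names, their ordering, and the `Dataset`, `User`, and `can_access` names are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import IntEnum


class AccessTier(IntEnum):
    """Hypothetical tiers; the ordering (higher = more restricted) is an assumption."""
    PUBLIC = 0               # default group: the general public
    DATA_USE_AGREEMENT = 1   # e.g., data with personally identifiable information
    SECURITY_CLEARANCE = 2   # e.g., data that represent serious security risks


@dataclass
class Dataset:
    name: str
    # The data producer sets the tier; public access is the default.
    required_tier: AccessTier = AccessTier.PUBLIC


@dataclass
class User:
    name: str
    granted_tier: AccessTier = AccessTier.PUBLIC


def can_access(user: User, dataset: Dataset) -> bool:
    """Return True if the user's tier meets or exceeds the dataset's tier."""
    return user.granted_tier >= dataset.required_tier


# Illustrative datasets and users (names are made up for this sketch).
streamflow = Dataset("streamflow")
meter_readings = Dataset("household_meter_readings", AccessTier.DATA_USE_AGREEMENT)
anyone = User("member of the public")
researcher = User("researcher", AccessTier.DATA_USE_AGREEMENT)
```

In this sketch, a dataset with no tier specified is open to everyone, which mirrors the default described above; restriction is the exception that a data producer opts into.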

Data hubs: Structured sources of standardized water data aggregated by theme or geography.

Data producers: Entities that collect data for a specific purpose and have authority over what data are produced and how, including organizations that manage citizen science and crowd-sourced data (e.g., a wastewater treatment plant that produces data about surface water conditions, a state agency that holds water rights data, a non-governmental organization (NGO) that collects water data samples, a private company that takes meter readings).

Data standards: Guidelines regarding how data about a particular topic are (1) structured, defining what data elements should be present; (2) populated, defining the kind and quality of information represented; (3) encoded in machine-readable formats; and (4) made interoperable for data exchange.

Data users: Primary and secondary entities that use water data to create information and value. Primary users are the producers who use the data they collect to meet a specific mission (e.g., a state environmental quality agency that regulates discharges of pollutants, a reservoir operator that regulates the flow of water through a dam). Secondary users create value by combining multiple types of data, typically from multiple organizations (e.g., a conservation organization building stream restoration maps from data held by a utility, state, and reservoir operator; a private company assessing, modeling, and visualizing the environmental impacts of real estate development).

Findable: Data and metadata are published on the web in compliance with data-on-the-web best practices, ideally tied to a common hydrography.

Interoperable: Bulk download formats and application programming interfaces (APIs) follow community-standard patterns; metadata are included with the data and are of sufficient quality for users to judge which purposes the data are fit for; and data content references publicly available definitions, controlled vocabularies, and data standards appropriate to the data’s subject matter.

Metadata: Information about data that assists potential data users in the discovery, access, and use of the data. Metadata can describe the identity, subject matter, and producer of the data to aid in data discovery. They can describe the location, license, and point of contact for the data to assist in data administration and access. They can describe the structure, format, and any applicable data standards to assist in the use and manipulation of the data.
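The three roles of metadata described above (discovery, access, and use) can be sketched as a simple record check in Python. This is a hedged illustration, not a metadata standard: the field names, the grouping keys, the `undocumented_fields` helper, and the example values (including the example.org address) are all assumptions made for this sketch.

```python
# Fields the definition says metadata can describe, grouped by purpose.
# The key names are hypothetical and chosen only for this illustration.
DESCRIBED_FIELDS = {
    "discovery": ("identity", "subject_matter", "producer"),
    "access": ("location", "license", "point_of_contact"),
    "use": ("structure", "format", "data_standards"),
}


def undocumented_fields(record: dict) -> list[str]:
    """List the 'group.field' entries that are absent or empty in a record."""
    missing = []
    for group, names in DESCRIBED_FIELDS.items():
        section = record.get(group, {})
        for name in names:
            if not section.get(name):
                missing.append(f"{group}.{name}")
    return missing


# A made-up record for a made-up dataset.
example_record = {
    "discovery": {
        "identity": "example-streamflow-2021",
        "subject_matter": "daily streamflow",
        "producer": "Example State Water Agency",
    },
    "access": {
        "location": "https://example.org/data/streamflow.csv",
        "license": "public domain",
        "point_of_contact": "data@example.org",
    },
    "use": {
        "structure": "one row per site per day",
        "format": "CSV",
        "data_standards": "",  # left empty on purpose to show a gap
    },
}

gaps = undocumented_fields(example_record)
```

Here `gaps` lists only the data standards entry, which shows how a data producer might spot an undocumented element before sharing a dataset.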

Modern data infrastructure: An integrated system of 21st-century information technologies, which includes common standards, formats, and tools designed to make water data easy to find, access, and share online. This system is connected by a network of people and organizations serving as water data producers, users, and hubs.

Overburdened community: Minority, low-income, tribal, or indigenous populations or geographic locations in the United States that potentially experience disproportionate environmental harms and risks. This disproportionality can be as a result of greater vulnerability to environmental hazards, lack of opportunity for public participation, or other factors. Increased vulnerability may be attributable to an accumulation of negative or lack of positive environmental, health, economic, or social conditions within these populations or places. The term describes situations where multiple factors, including both environmental and socio-economic stressors, may act cumulatively to affect health and the environment and contribute to persistent environmental health disparities.**

Water data produced for the public good: Water data collected for any public mission or purpose, including for regulatory compliance, either made available to the public or limited to authorized users.

Reusable: Data that are published and identified with version records and made available to the public or authorized users so that workflows can be reproduced.

 

* Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
** US Environmental Protection Agency. EJ 2020 Glossary. https://www.epa.gov/environmentaljustice/ej-2020-glossary