When it comes to our most pressing environmental challenges, much of the data needed to take effective action are missing. Water quality, recycling rates, toxic chemical exposures, land degradation – assessing these environmental issues is hampered by the lack of consistent, global information pathways. Where data exist, they are often incomplete, erratic or untrustworthy. Even if accurate, data may be collected at a resolution or scale inapplicable to the policy question at hand. And where data are abundant, the flood of information poses problems of its own, often creating more noise than clear signals.

In China, recent government announcements have signaled a new opportunity to explore novel data approaches that could help address these data challenges, including big data analysis (e.g., of satellite data) and bottom-up methods (e.g., citizen science). We dub this mix of inventive and cutting-edge techniques “third-wave data.” Third-wave data has the potential to engage new actors and audiences (e.g., citizens and technologists) in environmental data collection, improve linkages between policy-making and monitoring, and facilitate collaboration across the private and public sectors in China.

On March 1, 2016, government officials, civil society organizations, and researchers gathered at the Yale Center-Beijing to share their work assessing and responding to key gaps in existing environmental information in China, and to scope potential opportunities for big data to measure, manage, and communicate environmental issues. The workshop aimed to build a baseline understanding of the status of data-driven initiatives in China and to identify opportunities for future collaboration. Discussions were loosely organized around four topics: a gap analysis of data in China; the status of environmental monitoring in China; current big data initiatives in China; and international cooperation in China’s environmental monitoring.

A number of key themes and questions emerged from these presentations and conversations:

  1. The “central-local divide,” or the difficulty of coordinating information between central, provincial and local governments.
  2. The gap that separates data from usable data. Difficulties in accessing, synthesizing, verifying, and interpreting data limit their overall utility to regulators, researchers, policy-makers, and the general public. The creation of a “data lake” that synthesizes all available data is one approach to better understanding and accessing the scope of what is available. Bringing disparate sources of data together in a single platform would both help clarify the existing data landscape and help inform efforts to create more legible and accessible monitoring results.
  3. The need to determine the types of policy mechanisms and frameworks that could help support efforts to address the central-local divide, and create more accessible and usable data.
  4. The need to harness citizen engagement and to help governments respond to these new sources of environmental information. Determining how to incentivize citizens to collect environmental data, and how to evaluate and contextualize this information, will be crucial to efforts to generate third-wave data. Questions about data accuracy and reliability will in turn shape how governments can use and respond to citizen-generated data.

For workshop participants, presentations and a full workshop synopsis are available at the link below. E-mail datadriven@yale.edu to request access.