Measuring open data use

Measuring open data activity can help agencies understand how their data is being used and what impact it is having. This section provides guidance on how state agencies can measure open data use and quality, as well as how to assess the impact and value of the open datasets they publish. It draws from the evaluation framework designed by DataSF, using three categories of metrics: 1) publishing activity, 2) quality, and 3) use and impact.

Publishing activity

Executive branch agencies should maintain an open data access plan that details how they will improve the availability of open data (as required by Section 4-67p of the Connecticut General Statutes).

The template for the open data access plans (available here) asks agencies to list the datasets that they are already publishing as open data in addition to the datasets that they plan to publish. Agencies should list the date each dataset will be made available and the frequency with which it will be updated. The open data access plans can serve as the benchmark against which to measure open data publishing activity by agencies.

Below are four possible metrics to measure publishing activity:

  • Number of executive branch agencies with open data access plans. Does the agency have an open data access plan?
  • Number of datasets published by agency. How many datasets does the agency have published on the CT Open Data Portal and the CT Geodata Portal?
  • Percentage of datasets listed in open data access plan that have been published. What percentage of the datasets listed in the open data access plan have been published as open data?
  • Percentage of datasets with automated updates. What percentage of datasets published are automatically updated?
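These publishing activity metrics can be tallied from a simple dataset inventory. The sketch below is a minimal example in Python, assuming an agency tracks the datasets listed in its open data access plan as a CSV with hypothetical columns dataset_name, published (yes/no), and automated_update (yes/no); the file name and column names are illustrative, not part of any required template.

```python
import csv

def publishing_activity_metrics(inventory_path):
    """Compute simple publishing-activity metrics from a dataset inventory CSV.

    Assumes hypothetical columns: dataset_name, published (yes/no),
    automated_update (yes/no).
    """
    with open(inventory_path, newline="") as f:
        rows = list(csv.DictReader(f))

    total = len(rows)
    published = [r for r in rows if r["published"].strip().lower() == "yes"]
    automated = [r for r in published if r["automated_update"].strip().lower() == "yes"]

    return {
        "datasets_planned": total,
        "datasets_published": len(published),
        "pct_published": round(100 * len(published) / total, 1) if total else 0.0,
        "pct_automated_updates": round(100 * len(automated) / len(published), 1) if published else 0.0,
    }

# Example: print(publishing_activity_metrics("open_data_access_plan_inventory.csv"))
```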

Quality

Publishing high quality data is essential to promoting open data accessibility and use. In addition to measuring publishing activity and data use, agencies must also measure the quality of the data that they make available as open data, ensuring that it is well-documented and up-to-date.

The metadata standards for data published on the CT Open Data Portal and the CT Geodata Portal provide guidance on metadata requirements. The open data access plans for each agency should include the target publication date and the data update interval. Below are three possible metrics for measuring data quality:

  • Percentage of datasets published on time. What percentage of datasets are published by the date indicated in the open data access plan?
  • Percentage of datasets that are updated at the target interval. What percentage of datasets are updated at the interval indicated in the open data access plan?
  • Percentage of datasets with required metadata, as indicated in metadata standard. What percentage of datasets are published with the required metadata listed in the metadata standard?
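As a worked illustration of these quality metrics, the sketch below checks each dataset record for required metadata fields and for on-time publication. The field names and record structure are assumptions for the example; the metadata standard and each agency's open data access plan remain the authoritative sources for the required fields and target dates.

```python
from datetime import date

# Hypothetical list of required metadata fields; the metadata standard for
# the CT Open Data Portal and CT Geodata Portal is authoritative.
REQUIRED_FIELDS = ["title", "description", "update_frequency", "tags", "contact"]

def quality_metrics(datasets):
    """Compute metadata-completeness and on-time-publication rates.

    `datasets` is a list of dicts, each with a `metadata` dict, a
    `target_date` (from the open data access plan), and a
    `published_date` (or None if not yet published).
    """
    total = len(datasets)
    complete = sum(
        1 for d in datasets
        if all(d["metadata"].get(field) for field in REQUIRED_FIELDS)
    )
    on_time = sum(
        1 for d in datasets
        if d["published_date"] is not None and d["published_date"] <= d["target_date"]
    )
    return {
        "pct_required_metadata": round(100 * complete / total, 1) if total else 0.0,
        "pct_published_on_time": round(100 * on_time / total, 1) if total else 0.0,
    }

# Illustrative record:
# quality_metrics([{
#     "metadata": {"title": "...", "description": "...", "update_frequency": "monthly",
#                  "tags": "health", "contact": "..."},
#     "target_date": date(2024, 1, 31),
#     "published_date": date(2024, 1, 15),
# }])
```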

Use and impact

Understanding how open data is used and what impact it is having is the most challenging element of open data evaluation. Some metrics of open data use can be accessed through site analytics from Tyler Technologies, Esri, or Google Analytics.

The CT Open Data Portal Site Analytics page provides analytics on assets published on the CT Open Data Portal, including:

  • Cumulative downloads and views
  • Downloads, views, API calls over time
  • Data freshness (whether a dataset is up-to-date according to its metadata)
  • Metadata completeness (whether a dataset is missing metadata fields like a description, update frequency, or tags)
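In addition to the Site Analytics page, some usage figures can be pulled programmatically. The sketch below is a minimal example that queries the Socrata Discovery API for assets on data.ct.gov; the specific response fields (page_views, download_count) are assumptions that should be verified against the current API documentation before relying on them.

```python
import json
import urllib.request

# Discovery API catalog endpoint, limited to assets on the CT Open Data Portal.
DISCOVERY_URL = "https://api.us.socrata.com/api/catalog/v1?domains=data.ct.gov&limit=100"

def fetch_asset_usage():
    """Return a list of per-asset usage records (names and counts, where available)."""
    with urllib.request.urlopen(DISCOVERY_URL) as response:
        catalog = json.load(response)

    usage = []
    for result in catalog.get("results", []):
        resource = result.get("resource", {})
        usage.append({
            "id": resource.get("id"),
            "name": resource.get("name"),
            "total_views": (resource.get("page_views") or {}).get("page_views_total"),
            "downloads": resource.get("download_count"),
        })
    return usage

# Example: list the ten most-viewed datasets
# for row in sorted(fetch_asset_usage(), key=lambda r: r["total_views"] or 0, reverse=True)[:10]:
#     print(row["name"], row["total_views"], row["downloads"])
```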

Measuring the impact and value of open data is more difficult than measuring data use. Agencies may want to consider keeping a log of use cases that documents how their open data is being used.
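A use-case log does not need to be elaborate; a simple structured record per use case is enough to support later reporting. The fields below are illustrative, not prescribed.

```python
# Illustrative fields for a use-case log entry; adjust to agency needs.
use_case_entry = {
    "date_recorded": "2024-05-01",
    "dataset": "Example dataset name",
    "user_or_organization": "Example user",          # if known
    "use_case": "Built a dashboard tracking ...",    # short description
    "product_url": None,                             # app, article, or tool, if any
    "impact_note": "Cited in budget testimony",      # qualitative note
}
```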

Five possible metrics for measuring open data use and impact are below:

  • Number of dataset views. What is the total/monthly number of page views for each dataset?
  • Number of dataset downloads. How many times has each dataset been downloaded?
  • Number of API hits. How many API hits has each dataset received?
  • Number of products (apps, tools, websites, etc.) made with open data. How many products have been made with open data from each agency?
  • Measurement of estimated time saved in responding to data requests. How much time is saved by proactively making data available as open data rather than responding to data requests reactively?
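For the time-saved metric, a rough estimate can be derived from the number of ad hoc data requests avoided and the average staff time spent per request. The sketch below is a back-of-the-envelope calculation with illustrative numbers, not an official methodology; both inputs would come from agency estimates such as help-desk or records-request logs.

```python
def estimated_hours_saved(requests_avoided_per_year, avg_hours_per_request):
    """Rough annual estimate of staff time saved by proactive publication."""
    return requests_avoided_per_year * avg_hours_per_request

# Illustrative example: 120 avoided requests at ~1.5 staff hours each
# print(estimated_hours_saved(120, 1.5))  # -> 180.0 hours per year
```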