The steps below are best practices for agencies to develop an efficient data sharing process.
Identifying who plays each data-related role allows organizations to establish who has the responsibility of fielding external inquiries, designing sharing procedures, and executing requests. The first step in setting up a strong data governance model and maintaining institutional knowledge of the data sharing process is to establish and communicate these roles.
Although the roles below are described separately, the same person may exercise more than one role and may have a separate job title and function.
The agency data officer serves as the main contact person for inquiries, requests, or concerns regarding access to an agency's data. In consultation with the Chief Data Officer and the agency head, the agency data officer establishes procedures to ensure that the agency responds to data requests appropriately and promptly.
Section 4-67p of the Connecticut General Statutes defines the role and responsibilities of agency data officers, and a list of agency data officers is published on the Connecticut Open Data Portal.
The data owner is accountable for the quality and security of the data and holds the decision-making authority about data within their domain. The data owner varies by database, and there may be multiple data owners.
The data steward is responsible for the governance of data and ensures the fitness of content and metadata. In this effort, stewards apply established processes, policies, guidance, compliance requirements, and rules. They are usually the subject matter experts and data analysts who work with the data on a daily basis.
Legal counsel is a person or team that can evaluate data access and use and help craft appropriate legal agreements when needed.
The privacy and compliance officer is a person or team that develops and implements policies and procedures to protect individual rights and comply with federal and state law. The privacy and compliance officer also investigates any data incidents and breaches.
A publicly-available data dictionary helps requesters understand what data your agency collects and maintains. It can also help them craft requests that reference specific tables and fields, making the request easier to fulfill. The data dictionary should:
P20 WIN has a data dictionary containing the elements available for request. Participating agencies provide updates each year in accordance with the P20 WIN metadata policies and processes.
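As a rough illustration of how a data dictionary lets requesters reference specific tables and fields, the sketch below models dictionary entries in Python and renders them as CSV for publication. Everything in it is an assumption made for illustration: the `FieldEntry` attributes, the `restricted` flag, and the sample `enrollment` table are hypothetical and do not reflect the P20 WIN data dictionary or any agency's actual schema.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class FieldEntry:
    """One row of a data dictionary (attribute names are illustrative)."""
    table: str
    field: str
    data_type: str
    description: str
    restricted: bool = False  # hypothetical flag for fields needing extra review


def to_csv(entries):
    """Render the entries as CSV text that could be posted publicly."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(FieldEntry)],
        lineterminator="\n",
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


def lookup(entries, table, field_name):
    """Return the entry a request refers to, or None if it is not listed."""
    for entry in entries:
        if entry.table == table and entry.field == field_name:
            return entry
    return None


# Hypothetical sample entries, not real agency data.
DICTIONARY = [
    FieldEntry("enrollment", "student_id", "string",
               "Hypothetical unique identifier", restricted=True),
    FieldEntry("enrollment", "enroll_date", "date",
               "Hypothetical date of enrollment"),
]
```

With a structure like this, a request that names `enrollment.enroll_date` can be checked against the dictionary with `lookup(DICTIONARY, "enrollment", "enroll_date")` before anyone begins pulling data.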
Metadata is a set of information that describes the fields in a dataset. It provides data about your data. It includes information such as when and how the data was gathered or any other information that might describe an aspect of the data. It is important to keep detailed notes on the metadata and process by which the data was collected because this information can facilitate easier and more effective use later on.
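One way to keep such notes consistently is to record dataset-level metadata in a small structured record stored alongside the data. The sketch below is a minimal example under stated assumptions: the `DatasetMetadata` attributes and the `missing_items` helper are hypothetical names chosen for illustration, not a required or standard schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetMetadata:
    """Dataset-level metadata (all attribute names are illustrative)."""
    title: str
    collected_from: str      # start of the collection period, ISO date
    collected_to: str        # end of the collection period, ISO date
    collection_method: str   # how the data was gathered
    notes: list = field(default_factory=list)  # free-form process notes


def missing_items(meta):
    """List the attributes that were left blank."""
    return [name for name, value in asdict(meta).items()
            if value in ("", [], None)]


def to_json(meta):
    """Serialize the record so it can be saved next to the dataset."""
    return json.dumps(asdict(meta), indent=2)
```

Running `missing_items` before publishing a dataset gives a quick check that details such as the collection method were not left undocumented.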
One common misconception about metadata is that it is solely the definitions of the various fields in a dataset. However, metadata includes much more than these surface-level characteristics. Anything that gives additional information about the nature, structure, or gathering process of the dataset counts as metadata. Some examples of metadata for different types of media include:
When tracking metadata, it’s important to:
Connecticut’s High Value Data Inventory is a data catalog that highlights general information about high-value datasets held by state agencies. C.G.S. § 4-67p requires that the high value data inventories be maintained annually. At the end of each year, OPM contacts agency data officers and asks them to provide updates by December 31 of that year. By keeping your agency’s datasets up to date in the catalog, you help other agencies and the public understand what data your agency owns and who to contact for more information.
To update the inventory, email both Scott Gaul and Pauline Zaldonis with the subject line “CT High Value Data Inventory Change Request.”
As organizations become more data-driven, data experts are discovering more instances in which unaccounted-for biases in data perpetuate racism, sexism, and other forms of discrimination.
The data that government agencies, academic researchers, and other organizations collect most likely contain implicit biases. These biases can be introduced due to:
Consider possible sources of bias in your agency’s data carefully. If you do not identify possible bias, communicate it to data requesters, and work to reduce it, the decisions made based on your data may have serious unintended societal implications.
Data analysts are ultimately responsible for how they use your agency’s data; however, as the data owners and experts, you can help data analysts avoid biases in data that perpetuate racism, sexism, and other forms of discrimination.
First, be open about the limitations of the agency’s data to reduce the likelihood that it will be used in ways that have unintended consequences. Second, work towards systemic changes to data collection practices. Finally, require data requesters to demonstrate responsible use of your agency’s data.
A clearly documented data request process can facilitate successful requests. This section covers some of the supporting documents to develop as part of a comprehensive data request process.
Remember that the data request process must abide by the regulations and laws that apply to each dataset. For more detailed information, refer to Establish a privacy policy and the report on Legal Issues in Interagency Data Sharing, including the appendices reviewing state and federal laws and regulations.
Ensure that the data requester answers the questions below so that your agency can evaluate the benefits and mitigate the risks of sharing data.
It’s important to have a way to illustrate or describe the data sharing process from start to finish. Common approaches include using a flow diagram or descriptions for each step.
A data dictionary describes the agency’s data. (See Create and publish a data dictionary.)
A request fee schedule communicates the cost of requesting data. Each agency may have unique procedures for enacting request fees. Consult your agency’s legal counsel for specific guidance on fee schedules.
Connecticut state agencies can better leverage data for decision-making through P20 WIN’s data governance framework. P20 WIN uses an enterprise framework to facilitate data sharing across participating agencies. By participating in the data governance structure of P20 WIN, state agencies can enable the secure sharing of data to address critical policy questions in the state.
If your agency is not yet a participating agency in P20 WIN, contact Katie Breslin, Outreach and Engagement Coordinator, to learn about how your agency can join P20 WIN.