Metadata standards
This section provides guidance on metadata for publishers of open data in Connecticut state government, including through the CT Open Data Portal and the CT Geodata Portal.
Metadata is data that describes other data. It helps answer the question “what is this data about?” High-quality metadata provides helpful context about the data’s creation, quality, and uses, and is key to improving data discovery.
Metadata Checklist
When publishing data as open data, use the checklist below to make sure your metadata is complete.
- Does the metadata describe what information is contained in the data?
- Does it include a clear title that non-technical users will be able to understand?
- Does it use plain language and avoid overly technical terms, jargon, and acronyms?
- Does it include the source of the data and describe how the dataset was created?
- Does it list the data owner?
- Does it include information about how frequently the data is updated?
- Does it provide guidance on how the data should be used and any limitations?
- Does it align with the metadata standard of the platform where the data is being published?
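Part of the checklist above can be automated before publishing. The following is a minimal sketch in Python, assuming a draft metadata record held as a plain dictionary. The field names (`title`, `source`, `usage_notes`, and so on) and the sample values are hypothetical and chosen for illustration; they are not the portal's actual field identifiers. Questions about plain language and platform alignment still need human review.

```python
# Hypothetical field names mapped to the checklist question they answer.
# These keys are illustrative only, not the portal's real schema.
CHECKLIST_FIELDS = {
    "title": "Does it include a clear title?",
    "description": "Does the metadata describe what the data contains?",
    "source": "Does it include the source and how the dataset was created?",
    "owner": "Does it list the data owner?",
    "update_frequency": "Does it say how frequently the data is updated?",
    "usage_notes": "Does it give guidance on use and any limitations?",
}


def unanswered_checklist_items(metadata: dict) -> list[str]:
    """Return the checklist questions whose field is missing or blank."""
    return [
        question
        for field, question in CHECKLIST_FIELDS.items()
        if not str(metadata.get(field) or "").strip()
    ]


# A made-up draft record with several gaps.
draft = {
    "title": "Example Dataset Title",
    "description": "One row per example record.",
    "owner": "",
}

missing = unanswered_checklist_items(draft)
for question in missing:
    print("Needs attention:", question)
```

A check like this only confirms that a field has been filled in; it cannot judge whether the text is clear to a non-technical reader.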
CT Open Data Portal Metadata Schema
The table below summarizes the metadata elements on the CT Open Data Portal. When creating a new dataset, data publishers should provide as much detail as possible in the metadata, supplying the following elements where applicable. A more detailed version of the table is available as an Excel spreadsheet here.
Group | Field | Definition | Permitted values
---|---|---|---
Basic Descriptive | Title | Human-readable name of the asset. Should be in plain language and include sufficient detail to facilitate search and discovery. Avoid acronyms. | Text
Basic Descriptive | Description | What the dataset describes. Provide a longer description of the data that can be readily understood by non-technical users. | Text
Basic Descriptive | Category | The category of the data set identified by the list of possible values. If a data set can fall into multiple categories, select the one which is most significant. | Drop down menu: Business, Education, Environment and Natural Resources, Government, Health and Human Services, Housing and Development, Local Government, Public Safety, Tax and Revenue, Transportation
Basic Descriptive | Agency | The agency that collects and manages the data as the authoritative source. | Drop down menu
Detailed Descriptive | Number of Rows | Number of rows in the dataset. | Auto-generated
Detailed Descriptive | Row Label | What each row in the dataset represents. | Text
Detailed Descriptive | Tags/Keywords | Tags (or keywords) help users discover your dataset; include terms that would be used by technical and non-technical users. | Text
Detailed Descriptive | Source Link | The URL to the program area web pages. | URL
Detailed Descriptive | Geographic Unit | At what geographic unit is the data collected? For example, if the data is collected by address, it would be Street Address. | Drop down menu: Latitude/longitude, Street address, Parcel (block/lot), Census block, Census block group, Census tract, Planning District, Zip code, Municipality, County or county-equivalent, State, Other, Not applicable
Detailed Descriptive | Temporal Coverage | This field should contain an interval of time defined by the start and end dates for which the dataset is applicable (e.g. 2000 - present). | Text
Detailed Descriptive | Related Documents / Attachments | Related documents such as technical information about a dataset, developer documentation, etc. | Attachments
Internal Management | Data Provided By / Attribution | The name of the data source, for example the name of the publishing agency, organization, or individual. | Text
Internal Management | Dataset Owner | The name associated with the account publishing a dataset. The name will be automatically associated with the dataset. | Auto-generated
Internal Management | Contact Email | This address will not be displayed publicly, but inquiries submitted via the “Contact Dataset Owner” button will be routed to this email. If left blank, it will default to the dataset owner email address. | Text
Publishing Details | Update Frequency | Frequency with which dataset is updated. | Drop down menu: Not updated (historical only), As needed, Annually, Bi-annually, Quarterly, Bi-monthly, Monthly, Bi-weekly, Weekly, Every weekday, Daily, Other
Publishing Details | Last Updated | Most recent date and time when the dataset was changed, updated or modified. | Auto-generated
Publishing Details | Public License Type | The license with which the dataset is published. | Drop down menu
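Because Category and Update Frequency are drop-down fields, a draft record can be checked against the permitted values listed in the table before it is entered on the portal. The sketch below assumes the draft is a Python dictionary keyed by the field names in the table; the function name and the sample record are hypothetical, and the value lists are copied from the table rather than retrieved from the portal.

```python
# Permitted values copied from the schema table; the portal's own
# drop-down menus remain the authoritative list.
PERMITTED_VALUES = {
    "Category": {
        "Business", "Education", "Environment and Natural Resources",
        "Government", "Health and Human Services", "Housing and Development",
        "Local Government", "Public Safety", "Tax and Revenue",
        "Transportation",
    },
    "Update Frequency": {
        "Not updated (historical only)", "As needed", "Annually",
        "Bi-annually", "Quarterly", "Bi-monthly", "Monthly", "Bi-weekly",
        "Weekly", "Every weekday", "Daily", "Other",
    },
}


def check_permitted_values(record: dict) -> list[str]:
    """Return one message per drop-down field whose value is not permitted."""
    problems = []
    for field, allowed in PERMITTED_VALUES.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}: {value!r} is not one of the permitted values")
    return problems


# Made-up draft record: "Transport" is not in the Category list.
record = {"Category": "Transport", "Update Frequency": "Monthly"}
problems = check_permitted_values(record)
for message in problems:
    print(message)
```

Matching here is exact and case-sensitive, so a value such as "monthly" would also be flagged.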
Column Metadata
In addition to the dataset metadata detailed in the table above, data publishers should provide “column metadata,” or descriptions of the columns in the dataset. Short column descriptions are especially important when column names are not easily understood by users unfamiliar with the dataset. Additional technical documentation can be provided as an attachment accompanying the dataset.
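One way to catch gaps in column metadata is to compare a dataset's column names against the descriptions written for them. The sketch below is illustrative only: the column names, the descriptions, and the function name are hypothetical, and it assumes the descriptions are kept in a simple dictionary rather than in any particular portal format.

```python
def columns_missing_descriptions(
    column_names: list[str], descriptions: dict[str, str]
) -> list[str]:
    """Return the columns that have no description or only a blank one."""
    return [
        name
        for name in column_names
        if not descriptions.get(name, "").strip()
    ]


# Made-up columns and descriptions for illustration.
columns = ["town", "fips_cd", "rpt_dt"]
descriptions = {
    "town": "Name of the municipality.",
    "rpt_dt": "   ",  # blank descriptions count as missing
}

undocumented = columns_missing_descriptions(columns, descriptions)
print(undocumented)
```

Abbreviated names such as the `fips_cd` placeholder above are the kind that benefit most from a description.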
CT Geodata Portal Metadata Schema
The table below summarizes the metadata elements on the CT Geodata Portal, adapted from the ArcGIS Hub documentation. Data publishers should provide as much detail as possible in the metadata, supplying the following elements where applicable.
Group | Field | Definition | Permitted values
---|---|---|---
Basic Descriptive | Thumbnail | The thumbnail image is displayed on layout cards and content views. | Image
Basic Descriptive | Title | Use a title that is succinct and informative. Avoid underscores as they will be removed. | Text
Basic Descriptive | Summary | Give an overview in several sentences that cover the key elements of the item. | Text
Basic Descriptive | Description | Write a description that is clear and informative, as it is shown on content views and in search results. | Text
Detailed Descriptive | Number of Records | Number of records in the data. | Auto-generated
Detailed Descriptive | Tags | These help users discover your datasets through searches. For individual items, add tags in ArcGIS Online. All layers in map and feature services will have the same tags as those set for the entire service in ArcGIS Online. | Text
Detailed Descriptive | Categories | Category of the data. These help organize items and facilitate their discovery and use. | URL
Publishing Details | Data Publisher | The name associated with the account publishing a dataset. The name will be automatically associated with the dataset. | Auto-generated
Publishing Details | Published Date | Date item was created. | Auto-generated
Publishing Details | Data Updated | Date item was last modified. | Auto-generated
Publishing Details | Info Updated | Date metadata was last modified. | Auto-generated
Publishing Details | License | Choose a structured license or enter a custom one. The license is shown on the full details page and on the information side panel if the item has an explore view. It also displays (and you can add it here as well) in ArcGIS Online, under Terms of Use on the item details page. | Drop down menu
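Since the Title guidance notes that underscores will be removed, it can help to catch them before publishing. The sketch below is a simple illustration, not part of ArcGIS Hub or ArcGIS Online: the function name and the sample title are hypothetical, and replacing underscores with spaces is an assumed remedy rather than the portal's own behavior.

```python
def suggest_title(title: str) -> str:
    """Replace underscores with spaces and collapse repeated whitespace.

    This is an assumed clean-up step for drafting titles; it does not
    reproduce how the portal itself handles underscores.
    """
    return " ".join(title.replace("_", " ").split())


# Made-up title that looks like a file or layer name.
draft_title = "parcel_boundaries_2023"

if "_" in draft_title:
    print("Title contains underscores; consider:", suggest_title(draft_title))
```

A title produced this way will usually still need editing by hand so that it is succinct and informative.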
Marking Data as Authoritative
In addition to providing complete metadata, publishers on the CT Geodata Portal can improve the usability of their data by marking it as authoritative, which makes authoritative items easier for users to find. The ArcGIS Online documentation explains:
Organization administrators and those with administrative privileges to update content can specify that an item is authoritative using the Mark as Authoritative button. Items designated as authoritative are identified with an Authoritative badge on the Overview tab. If your organization is verified, items that are shared with everyone (the public) and marked as authoritative display the organization name as the item owner on the Overview tab.