
Inside Geothink’s Open Data Standards Project: Standards For Improving City Governance

By Rachel Bloom

Rachel Bloom is a McGill University undergraduate student and project lead for Geothink’s Open Data Standards Project.

In February, I led a Geothink seminar with city officials to introduce the results of the open data standards project we began approximately one year earlier. The project began with the objective of assisting municipal publishers of open data in standardizing their datasets. We presented two spreadsheets: the first evaluates ‘high-value’ open datasets published by Canadian municipalities, and the second is an inventory of open data standards applicable to these types of datasets.

Both spreadsheets enable our partners who publish open data to know what standards exist and who uses them for which datasets. The project I lead is motivated by the idea that well-developed data standards for city governance can grant us the luxury of not having to think about the compatibility of technological components. When we screw in a new light bulb or open a web document we assume that it will work with the components we have (Guidoin and McKinney 2012). Technology, whether it refers to information systems or manufactured goods, relies on standards to ensure its usability and dissemination.

Municipal governments that publish open data look to standards to improve the usability of their data. Unfortunately, even though ‘high-value’ datasets have increasingly become available to the public, there is currently no consensus about how they should be structured and specified. Such datasets include crime statistics and annual budget data, which can enable new services for citizens when municipalities publish them to their online open data catalogues. Anyone can access such datasets and use the data however they wish without restriction.

Civic data standards provide agreements about semantic and schematic guidelines for structuring and encoding the data. Data standards specify technical data elements such as file formats, data schemas, and unique identifiers to make civic data interoperable. For example, most datasets are published in CSV or XML formats. CSV structures the data in columns and rows, while XML encapsulates the data in a hierarchical tree of <tags>.
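To make the difference between the two formats concrete, here is a minimal sketch that writes the same records once as CSV and once as XML using Python's standard library. The records and field names (request_id, category, ward) are hypothetical and are not drawn from any city's catalogue or from a published standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical service-request records; the field names are illustrative only.
records = [
    {"request_id": "1001", "category": "pothole", "ward": "4"},
    {"request_id": "1002", "category": "graffiti", "ward": "7"},
]


def to_csv(rows):
    """Serialize rows as CSV: one header row, then one line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_xml(rows):
    """Serialize rows as XML: a root element with one child per record."""
    root = ET.Element("requests")
    for row in rows:
        item = ET.SubElement(root, "request")
        for key, value in row.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(records)
xml_text = to_xml(records)
print(csv_text)
print(xml_text)
```

The content is identical in both outputs; only the encoding differs. A data standard that fixes the format and the field names removes the need for each consumer to work out this mapping independently.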

Data standards also specify common vocabularies in order to clarify how the data’s meaning should be interpreted. Such vocabularies could include, for example, definitions for categories of expenditure in annual budget data. Geothink’s Open Data Standards Project offers publishers of open data an opportunity to improve the usability and efficiency of their data for consumers. This makes it easier to share data across municipalities because the technological components and their meanings within systems will be compatible.
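As a rough illustration of what a common vocabulary buys, the sketch below maps each city's local budget labels onto shared expenditure categories so that totals become comparable. The vocabulary, labels, and amounts are all invented for this example and do not come from any real standard or budget.

```python
# Hypothetical shared vocabulary: canonical expenditure categories and the
# local labels that map to them. None of these terms come from a real standard.
SHARED_VOCABULARY = {
    "public_safety": {"police services", "fire & rescue", "public safety"},
    "transportation": {"roads", "transit", "transportation"},
}


def canonical_category(local_label):
    """Return the shared category for a local label, or None if unmapped."""
    normalized = local_label.strip().lower()
    for category, labels in SHARED_VOCABULARY.items():
        if normalized in labels:
            return category
    return None


def total_by_category(rows):
    """Sum amounts per shared category; unknown labels go to 'unmapped'."""
    totals = {}
    for label, amount in rows:
        category = canonical_category(label) or "unmapped"
        totals[category] = totals.get(category, 0) + amount
    return totals


# Two invented cities that label the same kinds of spending differently.
city_a = [("Police Services", 120), ("Roads", 80)]
city_b = [("Public Safety", 95), ("Transit", 60), ("Parks", 10)]

print(total_by_category(city_a))
print(total_by_category(city_b))
```

Labels that fall outside the vocabulary are reported as "unmapped" rather than silently dropped, which is one way to surface the gaps that a standard would need to cover.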

Introducing Geothink’s Open Data Standards Project
No single, clear definition of an open data standard exists. In fact, most definitions of an ‘open data standard’ follow one of two prevailing ideas: 1) standards for open data; and 2) open standards for data. Geothink’s project examines and relates both of these ideas (Table 1). The first spreadsheet, the ‘Adoption of Open Data Standards By Cities’, considers open data and its associated data standards. The second spreadsheet, the ‘Inventory of Open Data Standards,’ considers the process of open standardization. In other words, we were curious about what standards are currently being applied to open municipal data, and how to break down and document open standards for data in a way that is useful to municipalities looking to standardize their open data.

Table 1: Differences between ‘open data’ standards and open ‘data standards’

                                       Requires open data    Requires open standard process
Evaluation of ‘High-Value’ Datasets    Yes                   No
Inventory of Open Data Standards       No                    Yes

The project’s evaluation of datasets relates to standards for open data. Standards for open data are standards that, regardless of how they are developed and maintained, can be applied to open data. Open data, according to the Open Knowledge Foundation (2014), consists of raw digital data that should be freely available to anyone to use, repurposable and re-publishable as users wish, and free of mechanisms of control like restrictive licenses. However, the process of developing and maintaining standards for open data may neither require transparency nor include public appeals.

To discover what civic data standards are currently being used, the first spreadsheet, Adoption of Open Data Standards By Cities, evaluates ‘high-value’ datasets specific to 10 domains (categories of datasets such as crime, transportation or service requests) in the open data catalogues for the cities of Vancouver, Toronto, Surrey, Edmonton and Ottawa. The types of data were chosen based on the Open Knowledge Foundation’s choice of datasets considered to provide the greatest utility for the public. The project’s spreadsheet notes the salient structure and vocabulary of each dataset, such as the name, file format, schema, and available metadata. It especially notes which data standards these five municipalities are using for their open data (if any at all).

In consultation with municipal bodies and organizations dedicated to publishing open data, we developed a second spreadsheet, Inventory and Evaluation of Open Data Standards, that catalogues and evaluates 22 open data standards that are available for domain-specific data. The rows of this spreadsheet represent individual data standards. The columns record background information on each listed standard and evaluate how well it achieves interoperability. Evaluating the quality of a standard’s performance, such as whether the standard is transferable to multiple jurisdictions, is an important consideration for municipalities looking to standardize their data effectively. Examples of open data standards in this inventory are BLDS for building permit data and the Budget Data Package for annual budget data.

The project’s second spreadsheet is concerned with open standards for data. Open standards, as opposed to closed standards, require a collaborative, transparent, and consensus-driven process to maintain their development (Palfrey and Gasser, 2012). Therefore, open standards honor a commitment to transparency, due process, and rights of appeal. Like open data, open standards resist unchecked, centralized control (Russell, 2014). Open data standards make sure that end users do not get locked into a specific technology. In addition, because open standards are driven by consensus, they are developed according to the needs and interests of participating stakeholders. While we provide spreadsheets on both, our project advocates implementing open standards for open data.

In light of the benefits of open standardization, the metrics of the second spreadsheet note the degree of openness for each standard. Such indicators of openness include multi-stakeholder participation and a consensus-driven process. Openness may be observed through the presence of online forums to discuss suggestions or concerns regarding the standard’s development and through background information about each standard’s publishers. In addition, open standards use open licenses stating that the standards may be used without restriction and repurposed for any use. Providing this information not only makes potential implementers aware of what domain-specific standards exist, but also allows them to gauge how well each standard performs in terms of interoperability and openness.

Finally, an accompanying white paper explains the two spreadsheets and the primary objective of my project for both publishers and consumers of open data. In particular, it explains the methodology, justifies the chosen evaluations, and notes the project’s results. In addition, this paper will aid in navigating and understanding both of the project’s spreadsheets.

Findings from this Project
My work on this project has led me to conclude that the majority of municipally published open datasets surveyed do not use civic data standards. The most common standards used by municipalities in our survey were the General Transit Feed Specification (GTFS) for transit data and the Open311 API for service request data. Because datasets across cities and sectors vary in format and structure, these differences, coupled with a lack of cohesive definitions for labeling, indicate that standardization across cities will be a challenging undertaking. Publishers aiming to extend data sharing among municipalities would benefit from collaborating and agreeing on standards for domain-specific data (as is the case with GTFS).

Our evaluation of 22 domain-specific data standards also shows that standards do exist across a variety of domains. However, some domains, such as budget data, have more open data standards than others. Therefore, potential implementers of standards must determine which domain-specific standard best fits their objectives in publishing the data and provides the greatest public benefit.

Many of the standards also provide information for contacting the standard’s publishers along with online forums for concerns or suggestions. However, many still lack full documentation or are simply in early draft stages. This means that although standards exist, some of them may not be ready for implementation.

Future Research Pathways
This project has room for growth so that we can better help our partners who publish and use open data decide how to go about adopting standards. To accomplish this goal, we could add more cities, domains, and open standards to the spreadsheets. In addition, any future changes to standards or datasets must be reflected in the spreadsheets.

In terms of the inventory of open data standards, it might be beneficial to separate metrics that evaluate the openness of a standard from metrics that evaluate its interoperability. Although we have emphasized the benefits of open standardization in this project, it is evident that some publishers of data do not perceive openness as crucial to a data standard’s success in achieving interoperability.

As a result, my project does not aim to dictate how governments implement data standards. Instead, we would like to work with municipalities to understand what is valued within the decision-making process to encourage adoption of specific standards. We hope this will allow us to provide guidance on such policy decisions. Most importantly, to complete such work, we ask Geothink’s municipal partners for input on factors that influence the adoption of a data standard in their own catalogues.

Contact Rachel Bloom at rachel.bloom@mail.mcgill.ca with comments on this article or to provide input on Geothink’s Open Data Standards Project.

References
Guidoin, Stéphane and James McKinney. 2012. Open Data, Standards and Socrata. Available at http://www.opennorth.ca/2012/11/22/open-data-standards.html. November 22, 2012.
Open Knowledge. Open Definition 2.0. Opendefinition.org. Retrieved 23 October 2015, from http://opendefinition.org/od/2.0/en/
Palfrey, John Gorham, and Urs Gasser. Interop: The promise and perils of highly interconnected systems. Basic Books, 2012.
Russell, Andrew L. Open Standards and the Digital Age. Cambridge University Press, 2014.

Behind the Scenes with Geothink Partner Yvonne Chen: Edmonton’s Open City Initiative

By Drew Bush

Sometime early next week, Edmonton’s City Council will vote to endorse an Open City Initiative that will help cement the city’s status as a smart city alongside cities like Stockholm, Seattle and Vienna. This follows close on the heels of a 2010 decision by city leaders to be the first to launch an open data catalogue and the 2011 awarding of a $400,000 IBM Smart Cities Challenge grant.

Yvonne Chen, Strategic Planner at City of Edmonton


“Edmonton is aspiring to fulfill its role as a permanent global city, which means innovative, inclusive and engaged government,” said Yvonne Chen, a strategic planner for the city who has helped orchestrate the Open City Initiative. “So Open City acts as the umbrella that encompasses all the innovative open government work within the city of Edmonton.”

This work has included using advanced analysis of open data streams to enhance crime enforcement and prevention, an “open lab” to provide new products that improve citizen interactions with government, and interactive neighbourhood maps that will help Edmontonians locate and examine waste disposal services, recreational centres, transit information, and capital projects.

For Chen, last week’s release of the Open City Initiative represents the culmination of years of work.

“My role throughout this entire release was I was helping with the public consultation sessions, I was analyzing information, and I was writing the Open City initiative documents as well as a lot of the policy itself,” she said.

In fact, the document outlines city policy, action plans for specific initiatives, environmental scans (or reviews), and results from public consultations on city initiatives. More than 1,800 Edmontonians commented on the Initiative last October before it was revised and updated for the Council.

“The City of Edmonton’s Open City Initiative is a municipal perspective of the broader open government philosophy,” Chen adds. “It guides the development of innovative solutions in the effort to connect Edmontonians to city information, programs, services and any engagement opportunities.”

Like many provincial and federal open city policies, the document focuses on making Edmonton’s government more transparent, accountable and accessible. What sets Edmonton apart, Chen said, is a focus on including citizens in the design and delivery of city programs and services through deliberate consultations, presentations, public events and online citizen panels. In fact, more than 2,200 citizens sit on just the panel that city officials consult as they design the Initiative’s infrastructure.

So what will it mean to live in one of North America’s smart cities? Edmonton is already beginning to provide free public Wi-Fi around the city, developing a 311 application (not yet ready) for two-way communication about city services, and working to fully integrate its electronic services across city departments.

“So one of the projects which has been reviewed very, very popularly is we have opened public Wi-Fi on the LRT stations,” Chen said of a project that includes plans for expanding the service to all trains and tunnels. “A lot of commuters are utilizing the service and utilizing the free Wi-Fi provided by the city while waiting for their trains.”

What do you think about Edmonton’s Open City Initiative? Let us know on Twitter @geothinkca.

If you have thoughts or questions about this article, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Open North’s Inventory: Coming Up With Standards for Open Data

Open data standards are the subject of a new OGP report (Photo courtesy of opensource.com).


By Drew Bush

Open North’s James McKinney, Stéphane Guidoin and Paulina Marczak completed an inventory of global open data standards last week that seeks to establish a global viewpoint on the subject and identify any missing pieces. Their work was completed as part of the Open Government Partnership (OGP) Working Group, a group that aims to support governments seeking transparency through open data.

“The objective…is to promote the use of open data standards to improve transparency, create social and economic value, and increase the interoperability of open data activities across multiple jurisdictions,” the authors write in their report. “Its first deliverable is to complete an inventory of open data standards by type to develop a global view and to identify gaps and overlaps. Its final deliverable is an OGP document outlining baseline standards and best practices for open data, along with guidance for adoption and implementation.”

In their report, the authors used scripts to automatically collect, normalize and analyze data from 40 OGP members’ catalogs. Their goal was to determine how to standardize the ways such data is licensed, how metadata is used, what types of file formats catalogs make use of, and the overall structure of each catalog. As they wrote, they did not seek to “pursue a comprehensive inventory of data standards” but rather to focus on those “most relevant to OGP members.”
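The report does not describe the scripts in detail here, so the following is only a sketch of what a normalization step of this kind might look like, not Open North's actual code. The catalog entries, the alias table, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical catalog entries with inconsistently written metadata.
datasets = [
    {"title": "Transit stops", "format": "CSV", "license": "OGL-Canada-2.0"},
    {"title": "Budget 2014", "format": ".csv ", "license": ""},
    {"title": "Parks", "format": "application/xml", "license": "ogl-canada-2.0"},
    {"title": "Permits", "format": "XML", "license": None},
]

# Illustrative alias table mapping media types to short format names.
FORMAT_ALIASES = {"application/xml": "xml", "text/csv": "csv"}


def normalize_format(raw):
    """Lower-case a format string, drop a leading dot, and apply aliases."""
    cleaned = (raw or "").strip().lower().lstrip(".")
    return FORMAT_ALIASES.get(cleaned, cleaned)


def normalize_license(raw):
    """Lower-case a license string; empty or missing values become None."""
    cleaned = (raw or "").strip().lower()
    return cleaned or None


def summarize(entries):
    """Count normalized formats and licenses across a catalog."""
    formats = Counter(normalize_format(e.get("format")) for e in entries)
    licenses = Counter(normalize_license(e.get("license")) for e in entries)
    return formats, licenses


formats, licenses = summarize(datasets)
print(formats)
print(licenses)
```

Only after values such as "CSV", ".csv " and "text/csv" are collapsed into one form can catalogs be compared on which formats and licenses they actually use.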

Their analysis yielded numerous findings. In particular, they found that OGP members have no common structure to their catalogs, that a common vocabulary for metadata (or data about data) is needed, and that there are significant problems with the metadata used to specify licensing in some countries (in “8 out of 24 catalogs, the licenses of over 10 percent of datasets are either not specified or underspecified”).
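The licensing finding rests on a simple proportion per catalog. As a hedged sketch of that arithmetic, the code below computes the share of datasets with an empty or missing license field and flags catalogs above the 10 percent mark quoted in the report. The catalog names and values are invented, and the sketch does not attempt to detect "underspecified" licenses, which would require rules the article does not give.

```python
THRESHOLD = 0.10  # the "over 10 percent" cut-off quoted in the report


def unspecified_share(licenses):
    """Fraction of datasets whose license field is empty or missing."""
    if not licenses:
        return 0.0
    missing = sum(1 for value in licenses if not (value or "").strip())
    return missing / len(licenses)


# Hypothetical catalogs: each list holds one license value per dataset.
catalogs = {
    "catalog_a": ["cc-by", "cc-by", "", "cc-by"],
    "catalog_b": ["ogl"] * 19 + [None],
}

# Catalogs whose share of unspecified licenses exceeds the threshold.
flagged = sorted(
    name
    for name, values in catalogs.items()
    if unspecified_share(values) > THRESHOLD
)
print(flagged)
```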

OGP’s Working Group consists of four streams: Principles, Measurement, Standards and Capacity Building. Each has leads from government, the private sector and the nonprofit world who work to identify and share practices that help OGP governments implement commitments and develop more ambitious and innovative open data plans.

McKinney serves as the lead for the Standards stream, which promotes the use of open data standards to improve transparency and to increase the interoperability of open data activities across multiple jurisdictions. His organization, the Canadian non-profit Open North, creates online tools for civil society and government to educate and empower citizens to participate actively in Canadian democracy. Open North is also a Geothink partner.

Find the report here.

If you have thoughts or questions about this article, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.