Inside Geothink’s Open Data Standards Project: Standards For Improving City Governance

By Rachel Bloom

Rachel Bloom is a McGill University undergraduate student and project lead for Geothink’s Open Data Standards Project.

In February, I led a Geothink seminar with city officials to introduce the results of the open data standards project we began approximately one year earlier. The project set out to help municipal publishers of open data standardize their datasets. We presented two spreadsheets: the first evaluates ‘high-value’ open datasets published by Canadian municipalities, and the second is an inventory of open data standards applicable to these types of datasets.

Both spreadsheets enable our partners who publish open data to know what standards exist and who uses them for which datasets. The project I lead is motivated by the idea that well-developed data standards for city governance can grant us the luxury of not having to think about the compatibility of technological components. When we screw in a new light bulb or open a web document we assume that it will work with the components we have (Guidoin and McKinney 2012). Technology, whether it refers to information systems or manufactured goods, relies on standards to ensure its usability and dissemination.

Municipal governments that publish open data look to standards to improve the usability of their data. Unfortunately, even though ‘high-value’ datasets have increasingly become available to the public, there is currently no consensus about how these datasets should be structured and specified. Such datasets include crime statistics and annual budget data, which can enable new services for citizens when municipalities publish them to their online open data catalogues. Anyone can access such datasets and use the data however they wish without restriction.

Civic data standards provide agreements about semantic and schematic guidelines for structuring and encoding the data. Data standards specify technical data elements such as file formats, data schemas, and unique identifiers to make civic data interoperable. For example, most datasets are published in CSV or XML formats. CSV structures the data in columns and rows, while XML encapsulates the data in a hierarchical tree of <tags>.
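To make the difference between the two formats concrete, here is a minimal Python sketch that writes the same records once as CSV and once as XML. The records, the field names (department, year, amount), and the element names (budget, item) are invented for illustration and do not come from any city's catalogue or from any published standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical budget records; field names are illustrative assumptions.
records = [
    {"department": "Parks", "year": "2015", "amount": "1200000"},
    {"department": "Transit", "year": "2015", "amount": "4500000"},
]


def to_csv(rows):
    """Serialize rows as CSV: one header row, then one line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_xml(rows):
    """Serialize rows as XML: a root element with one tagged child per record."""
    root = ET.Element("budget")
    for row in rows:
        item = ET.SubElement(root, "item")
        for field, value in row.items():
            ET.SubElement(item, field).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(records)
xml_text = to_xml(records)
print(csv_text)
print(xml_text)
```

Both outputs carry the same information. The CSV version relies on column position and a header row, while the XML version labels every value with a tag, which is why a shared schema matters for either format.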

Data standards also specify common vocabularies in order to clarify interpretation of the data’s meanings. Such vocabularies could include, for example, definitions for categories of expenditure in annual budget data. Geothink’s Open Data Standards Project offers publishers of open data an opportunity to improve the usability and efficiency of their data for consumers. This makes it easier to share data across municipalities because the technological components and their meanings within systems will be compatible.
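As a rough illustration of how a common vocabulary makes budget categories comparable across municipalities, the Python sketch below maps each city's local expenditure labels onto one shared term before comparing figures. The cities, labels, amounts, and vocabulary terms are all hypothetical and are not drawn from any real standard.

```python
# Hypothetical shared vocabulary: each local label maps to one common category.
SHARED_VOCABULARY = {
    "Parks & Rec": "parks_and_recreation",
    "Recreation and Parks": "parks_and_recreation",
    "Police": "policing",
    "Police Services": "policing",
}

# Invented expenditure rows (label, amount) as two cities might publish them.
city_a = [("Parks & Rec", 120), ("Police", 300)]
city_b = [("Recreation and Parks", 95), ("Police Services", 410)]


def normalize(rows):
    """Replace each local category label with its shared-vocabulary term."""
    return {SHARED_VOCABULARY[label]: amount for label, amount in rows}


def compare(a, b):
    """Pair up figures from two cities for every category they both report."""
    a_norm, b_norm = normalize(a), normalize(b)
    shared = sorted(a_norm.keys() & b_norm.keys())
    return {category: (a_norm[category], b_norm[category]) for category in shared}


comparison = compare(city_a, city_b)
print(comparison)
```

Without the mapping, "Parks & Rec" and "Recreation and Parks" would look like unrelated categories. With it, the two cities' figures line up under a single term.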

Introducing Geothink’s Open Data Standards Project
No single, clear definition of an open data standard exists. In fact, most definitions of an ‘open data standard’ follow one of two prevailing ideas: 1) standards for open data; and 2) open standards for data. Geothink’s project examines and relates both of these ideas (Table 1). The first spreadsheet, the ‘Adoption of Open Data Standards By Cities,’ considers open data and its associated data standards. The second spreadsheet, the ‘Inventory of Open Data Standards,’ considers the process of open standardization. In other words, we were curious about what standards are currently being applied to open municipal data, and how to break down and document open standards for data in a way that is useful to municipalities looking to standardize their open data.

Table 1: Differences between ‘open data’ standards and open ‘data standards’

| Spreadsheet | Requires open data | Requires open standard process |
| --- | --- | --- |
| Evaluation of ‘High-Value’ Datasets | Yes | No |
| Inventory of Open Data Standards | No | Yes |

The project’s evaluation of datasets relates to standards for open data. Standards for open data refer to standards that, regardless of how they are developed and maintained, can be applied to open data. Open data, according to the Open Knowledge Foundation (2014), consists of raw digital data that should be freely available to anyone to use, repurposable and re-publishable as users wish, and absent mechanisms of control like restrictive licenses. However, the process of developing and maintaining standards for open data does not necessarily require transparency or include public appeals.

To discover what civic data standards are currently being used, the first spreadsheet, Adoption of Open Data Standards By Cities, evaluates ‘high-value’ datasets specific to 10 domains (categories of datasets such as crime, transportation, or service requests) in the open data catalogues for the cities of Vancouver, Toronto, Surrey, Edmonton, and Ottawa. The types of data were chosen based on the Open Knowledge Foundation’s choice of datasets considered to provide the greatest utility for the public. The project’s spreadsheet notes salient structural and vocabulary features of each dataset, such as the name, file format, schema, and available metadata. It especially notes which data standards, if any, these five municipalities are using for their open data.

In consultation with municipal bodies and organizations dedicated to publishing open data, we developed a second spreadsheet, Inventory and Evaluation of Open Data Standards, that catalogues and evaluates 22 open data standards that are available for domain-specific data. The rows of this spreadsheet indicate individual data standards. The columns record background information and evaluate how well each listed standard achieves optimal interoperability. Evaluating the quality of a standard’s performance, such as whether the standard is transferable to multiple jurisdictions, is an important consideration for municipalities looking to optimally standardize their data. Examples of open data standards in this inventory are BLDS for building permit data and the Budget Data Package for annual budget data.

The project’s second spreadsheet is concerned with open standards for data. An open standard, as opposed to a closed standard, requires a collaborative, transparent, and consensus-driven process to maintain its development (Palfrey and Gasser, 2012). Open standards therefore honor a commitment to transparency, due process, and rights of appeal. Like open data, open standards resist unchecked, centralized control (Russell, 2014). Open data standards make sure that end users do not get locked into a specific technology. In addition, because open standards are driven by consensus, they are developed according to the needs and interests of participating stakeholders. While we provide spreadsheets on both, our project advocates implementing open standards for open data.

In light of the benefits of open standardization, the metrics of the second spreadsheet note the degree of openness for each standard. Such indicators of openness include multi-stakeholder participation and a consensus-driven process. Openness may be observed through the presence of online forums to discuss suggestions or concerns regarding the standard’s development and background information about each standard’s publishers. In addition, open standards use open licenses that dictate the standards may be used without restriction and repurposed for any use. Providing this information not only allows potential implementers to be aware of what domain-specific standards exist, but also allows them to gauge how well the standard performs in terms of optimal interoperability and openness.

Finally, an accompanying white paper explains the two spreadsheets and the primary objective of my project for both publishers and consumers of open data. In particular, it explains the methodology, justifies chosen evaluations, and notes the project’s results. In addition, this paper will aid in navigating and understanding both of the project’s spreadsheets.

Findings from this Project
My work on this project has led me to conclude that the majority of municipally published open datasets surveyed do not use civic data standards. The most common standards used by municipalities in our survey were the General Transit Feed Specification (GTFS) for transit data and the Open311 API for service request data. Because datasets vary in format and structure across cities and sectors, and because cities lack cohesive definitions for labeling, standardization across cities will be a challenging undertaking. Publishers aiming to extend data sharing among municipalities would benefit from collaborating and agreeing on standards for domain-specific data (as is the case with GTFS).

Our evaluation of 22 domain-specific data standards also shows standards do exist across a variety of domains. However, some domains, such as budget data, contain more open data standards than others. Therefore, potential implementers of standards must determine which domain-specific standard best fits their objectives in publishing the data and provides the most benefit for the public good.

Many of the standards also provide information for contacting the standard’s publishers along with online forums for concerns or suggestions. However, many still lack full documentation or are simply in early draft stages. This means that although standards exist, some may not be ready for implementation.

Future Research Pathways
This project has room for growth so that we can better help our partners who publish and use open data decide how to go about adopting standards. To accomplish this goal, we could add more cities, domains, and open standards to the spreadsheets. In addition, the spreadsheets must be updated to reflect any future changes to standards or datasets.

In terms of the inventory of open data standards, it might be beneficial to separate metrics that evaluate the openness of a standard from metrics that evaluate its interoperability. Although we have emphasized the benefits of open standardization in this project, it is evident that some publishers of data do not perceive openness as crucial to a data standard’s success in achieving optimal interoperability.

As a result, my project does not aim to dictate how governments implement data standards. Instead, we would like to work with municipalities to understand what is valued within the decision-making process to encourage adoption of specific standards. We hope this will allow us to provide guidance on such policy decisions. Most importantly, to complete such work, we ask Geothink’s municipal partners for input on factors that influence the adoption of a data standard in their own catalogues.

Contact Rachel Bloom at rachel.bloom@mail.mcgill.ca with comments on this article or to provide input on Geothink’s Open Data Standards Project.

Guidoin, Stéphane and James McKinney. 2012. Open Data, Standards and Socrata. Available at http://www.opennorth.ca/2012/11/22/open-data-standards.html. November 22, 2012.
Open Knowledge. Open Definition 2.0. Opendefinition.org. Retrieved 23 October 2015, from http://opendefinition.org/od/2.0/en/
Palfrey, John Gorham, and Urs Gasser. Interop: The promise and perils of highly interconnected systems. Basic Books, 2012.
Russell, Andrew L. Open Standards and the Digital Age. Cambridge University Press, 2014.

Four Geothink Partner Cities Named to Top 10 on First Ever Canada Open Cities Index

Rankings of Canada’s Top 10 cities out of possible max scores of 193 in Public Sector Digest’s 2015 Open Cities Index (Image courtesy of Public Sector Digest).

By Drew Bush

Numerous city, state, and provincial governments across North America are finding new ways to share government data online. With more than 60 nations now part of the Open Government Partnership, it’s often difficult to determine which initiatives are simply part of a growing fad instead of being true attempts at more responsive and accountable government.

In the United States, President Barack Obama announced plans in 2009 to make many federal agencies open by default with government information, yet just last month the office charged with carrying out this directive failed to openly publish a schedule for its guidelines on this work. In Canada, a variety of city initiatives aim to allow citizens to more easily view crime statistics, find out information about neighborhood quality of life, or time the arrival of the next bus. With so many initiatives, it can be difficult to determine which best improves municipal responsiveness or offers new services to citizens, particularly amidst promises on open data by the newly elected Liberal government.

The authors of Public Sector Digest’s first ever 2015 Open Cities Index aim to solve this problem by providing “a reference point for the performance” of open data programs in 34 Canadian cities. The authors of the index undertook a survey to measure 107 variables related to open data programs. In particular, the index measures three types of data sets cities may have made available: those related to accountability (e.g. elections or budget data), innovation (e.g. traffic volume or service requests), and social policy (e.g. crime rates or health performance).

Across each data set in these three categories, municipalities were scored on five variables according to questions such as whether their data sets are available online, machine readable, free, and up-to-date. The aim was to help these municipalities, which often have limited resources to spend on open data programs, to assess their strengths and weaknesses and improve open data programs.
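A checklist of this kind can be pictured as a simple tally. The Python sketch below is only an illustration: it is not Public Sector Digest's methodology, which the index's authors have not fully published. It covers just the four questions named above, assumes one point per question with equal weighting, and uses invented dataset names and answers.

```python
from dataclasses import dataclass


@dataclass
class DatasetCheck:
    """Yes/no answers for one data set (hypothetical structure)."""
    name: str
    available_online: bool
    machine_readable: bool
    free: bool
    up_to_date: bool


def score(check):
    """Award one point per satisfied question (assumed equal weighting)."""
    answers = [
        check.available_online,
        check.machine_readable,
        check.free,
        check.up_to_date,
    ]
    return sum(answers)


# Invented answers for three data sets in an imaginary city.
checks = [
    DatasetCheck("budget", True, True, True, False),
    DatasetCheck("crime rates", True, False, True, True),
    DatasetCheck("service requests", False, False, False, False),
]

scores = {check.name: score(check) for check in checks}
total = sum(scores.values())
print(scores, total)
```

A per-dataset breakdown like this shows a city where it is losing points, which is the kind of self-assessment the index is meant to support.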

Four Geothink partner cities made the top 10 of the index, with Edmonton in first place, Toronto second, Ottawa fourth, and Vancouver sixth. At last year’s Canadian Open Data Summit, Edmonton also won the Canadian Open Data Award. You can find the full list of city rankings on the report’s home page. Yet the value of these types of ratings and awards will only be shown over time, according to many practitioners in the field.

“It’s hard to tell what it means to be ranked fourth because it’s a brand new thing,” said Robert Giggey, the coordinator and lead for the City of Ottawa’s Open Data program. “It’s not something that’s done every year, every month, that everybody knows about and is waiting for. So it’s kind of yet to be determined.”

The Value of the Index

Other indexes have measured open data at the national level, such as the Open Data Barometer. And measurements of municipal open data undertaken by two university students focused only on what types of data sets were available. The Open Cities Index works to take this a step further by engaging with key areas of interest. In particular, the index aims to standardize measurements around three themes:

1. Readiness—To what extent is the municipality ready/capable of fostering positive outcomes through its open data initiative?
2. Implementation—To what extent has the city fulfilled its open data goals and ultimately, what data has it posted online?
3. Impact—To what extent has the posted information been used, what benefits has the city accrued as a result of its open data program, and to what extent is the city capable of measuring the impact?

One Geothink researcher cautions, however, that it’s difficult to ascertain the worth of the index until its authors make the full report available along with more information on the 107 variables surveyed. In particular, he said, implementation can be a difficult metric to measure because different cities have different data collection responsibilities and different goals.

“I’m working on some research right now that shows that governments don’t actually have very good tracking metrics for use,” Peter Johnson, assistant professor in the Department of Geography and Environmental Management at the University of Waterloo, wrote in an e-mail to Geothink.ca. “Much of their sense of who uses open data and what it is used for is anecdotal and certainly incomplete. Since open data is provided with few restrictions, it is difficult to track who is using it and what it is being used for in any comprehensive way.”

Beyond the data online now, cities interested in being included in future years of the index and accessing a detailed analysis of municipal open data programs across Canada must contact Public Sector Digest. Some municipalities, like Ottawa, may wait and see how it goes in those places that have already paid for the service, according to Giggey.

“I want to see what the reaction is from the open data community, from other jurisdictions, from other areas—Geothink—about what they think of the index,” he said. “Is this any good? Is it worth anything? Then we’ll look to see if it’s something we want to invest in.”

A screen shot of Toronto’s Open Data portal for city hall.

The Reaction Among Geothink Partner Cities

The value of the index will be determined as more details on its methodology and conclusions are released and, perhaps, as it becomes a regular measure of open data work in Canada’s municipalities. For now, city staff in charge of open data work in the cities interviewed by Geothink.ca agree that the index does achieve the goal of bringing recognition to the work they are doing. In Ottawa, this has included work to make the city accountable by providing datasets on elected officials, budget data, lobbyist and employee information, and 311 calls. Toronto got a relatively early start with city budgets in 2009 and now also has a portal with social data on neighborhoods (including datasets like demographics, public health, and crime rates).

“I am glad the index recognizes the time and effort each city puts in to make its data open and accessible for reuse and repurpose,” Linda Low, open data coordinator for the City of Vancouver, wrote in an e-mail. Datasets in her city include information on crime, business licenses, property tax, orthophoto imagery, and census local area profiles. “This doesn’t happen overnight and it certainly is a team effort to get to where we are today.”

Edmonton’s recognition for its work derives from a 2010 decision by city leaders to launch an open data catalogue and the 2011 awarding of a $400,000 IBM Smarter Cities Challenge grant. Work in the city has included using advanced analysis of open data streams to enhance crime enforcement and prevention, an “open lab” to provide new products that improve citizen interactions with government, and interactive neighbourhood maps that will help Edmontonians locate and examine waste disposal services, recreational centres, transit information, and capital projects. More can be found on Edmonton’s work in a previous Geothink article.

“We are thrilled and honoured that our innovation and hard work have been recognized,” Yvonne Chen, a strategic planner for the City of Edmonton, wrote in an e-mail. She noted that Edmonton’s success, which results directly from a city council policy on open data, includes having an online budget tool that increases transparency about the allocation of public funds. “Our goal has always been to be a leader in the Canadian open government movement.”

While the recognition helps bring attention to the work being done by cities, much remains to be seen about how well the index actually compares cities against each other when objectives and the types of data recorded can vary greatly.

“It’s great to be in the top 10 any time, but we know from when we got the survey sent to us, we weren’t sure of all their measures that they were taking,” Keith McDonald, open data lead for the City of Toronto, said.

“We’d like to see other studies and maybe a little more apples to apples comparison for sure,” he added. “I think actually that was the intent—I can’t speak for the Public Sector Digest—but I think that was the intent of having an ongoing group that would buy into their measuring, so that people could continue to tweak and make it a stronger real apples to apples comparison. And we would support that.”

In fact, the value of an index like this one may lie in allowing cities to track their own progress over time.

“For all those cities included (and even those that aren’t) it can help to narrow the field as to where effort may be best placed to improve open data provision,” Johnson wrote of what he called a “high-profile external evaluation” of each city’s work.

If you have thoughts or questions about this article, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.