Tag Archives: open data

Geothoughts 11: 2016 Geothink Summer Institute Trains New Generation of Open Data Experts

Geothink’s 2016 Summer Institute took place the second week of May at Ryerson University in Toronto with 35 students in attendance.

By Drew Bush

We’re very excited to present you with our 11th episode of Geothoughts. You can also subscribe to this Podcast by finding it on iTunes.

In this episode, we take a look at the just concluded 2016 Geothink Summer Institute. Students at this year’s institute learned difficult lessons about applying actual open data to civic problems through group work and interactions with Toronto city officials, local organizations, and Geothink faculty. The last day of the institute culminated in a writing-skill incubator that gave participants the chance to practice communicating even the driest details of work with open data in a manner that grabs the attention of the public.

Held annually as part of a five-year Canadian Social Sciences and Humanities Research Council (SSHRC) partnership grant, each year the Summer Institute devotes three days of hands-on learning to topics important to research taking place in the grant. This year, each day of the institute alternated lectures and panel discussions with work sessions where instructors mentored groups one-on-one about the many aspects of open data.

Thanks for tuning in. And we hope you subscribe with us at Geothoughts on iTunes. A transcript of this original audio podcast follows.

TRANSCRIPT OF AUDIO PODCAST

Welcome to Geothoughts. I’m Drew Bush.

[Geothink.ca theme music]

The 2016 Geothink Summer Institute wrapped up during the second week of May at Ryerson University in Toronto. Held annually as part of a five-year Canadian Social Sciences and Humanities Research Council (SSHRC) partnership grant, each year the Summer Institute devotes three days of hands-on learning to topics important to research taking place in the grant.

The 35 students at this year’s institute learned difficult lessons about applying actual open data to civic problems through group work and interactions with Toronto city officials, local organizations, and Geothink faculty. The last day of the institute culminated in a writing-skill incubator that gave participants the chance to practice communicating even the driest details of work with open data in a manner that grabs the attention of the public.

On day one, students confronted the challenge of working with municipal open data sets to craft new applications that could benefit cities and their citizens. The day focused on an Open Data Iron Chef that takes its name from the popular television show of the same name. Geothink.ca spoke to the convener of the Open Data Iron Chef while students were still hard at work on their apps for the competition.

“Richard Pietro, OGT Productions and we try to socialize open government and open data.”

“You have such a variety of skill sets in the room, experience levels, ages, genders, ethnicities. I think it’s one of the most mixed sort of Open Data Iron Chefs that I’ve ever done. So I’m just excited to see the potential just based on that.”

“But I think they’re off to a great start. They’re definitely, you know, eager. That was clear from the onset. As soon as we said “Go,” everybody got into their teams. And it’s as though the conversation was like—as though they’ve been having this conversation for years.”

For many students, the experience was a memorable one. Groups found the competition engaging as they worked to conceptualize an application for most of the afternoon before presenting it to the institute as a whole.

“More in general, just about the sort of the challenge we have today: It’s kind of interesting coming from like an academic sort of standpoint, especially in my master of arts, there is a lot of theory around like the potential benefits of open data. So it’s kind of nice to actually be working on something that could potentially have real implications, you know?”

That’s Mark Gill, a student in attendance from the University of British Columbia, Okanagan. His group worked with open data from the Association of Bay Area Governments Resilience Program to better inform neighborhoods about their level of vulnerability to natural hazards such as earthquakes, floods, or storms. The application they later conceptualized allowed users to measure their general neighborhood vulnerability. Individual users could also enter their socioeconomic data to obtain a personalized vulnerability assessment.

On day two, students heard from four members of Geothink’s faculty on their unique disciplinary perspectives on how to value open data. Here we catch up with Geothink Head Renee Sieber, an associate professor in McGill University’s Department of Geography and School of Environment, as she provided students with a summary of methods for evaluating open data. Sieber started her talk by detailing many of the common quantitative metrics used, including counting the applications generated at a hackathon, the number of citizens engaged, or the economic output from a particular dataset.

“There’s a huge leap to where you start to think about how do you quantify the improvement of citizen participation? How do you quantify the increased democracy or the increased accountability that you might have. So you can certainly assign a metric to it. But how do you actually attach a value to that metric? So, I basically have a series of questions around open data valuation. I don’t have a lot of answers at this point. But they’re the sort of questions that I’d like you to consider.”

After hearing from the four faculty members, students spent the rest of the day working in groups to first create measures to value open data, and, second, role-play how differing sectors might use a specific type of data. In between activities on day two, students also heard from a panel of municipal officials and representatives of Toronto-based organizations working with open data. On day three, students transitioned to taking part in a writing-skills incubator workshop run by Ryerson University School of Journalism associate professors Ann Rauhala and April Lindgren. Students were able to learn from the extensive experience both professors have had in the journalism profession.

“I’m going to actually talk a little about, more broadly, about getting your message out in different ways, including and culminating with the idea of writing a piece of opinion. And, you know, today’s going to be mostly about writing and structuring an Op-ed piece. But I thought I want to spend a few minutes talking about the mechanics of getting your message out—some sort of practical things you can do. And of course this is increasingly important for all the reasons that Ann was talking about and also because the research granting institutions are putting such an emphasis on research dissemination. In other words, getting the results of your work out to organizations and the people who can use it.”

For most of her talk, Lindgren focused on three specific strategies.

“So, one is becoming recognized as an expert and being interviewed by the news media about your area of expertise. The second is about using Twitter to disseminate your work. And the third is how to get your Op-ed or your opinion writing published in the mainstream news media whether it’s a newspaper, an online site, or even if you’re writing for your own blog or the research project, or the blog of the research project that you’re working on.”

Both Lindgren and Rauhala emphasized how important it is for academics to share their work to make a difference and enrich the public debate. Such a theme is central to Geothink, which emphasizes partnerships between researchers and actual practitioners in government, private, and non-profit sectors. Such collaboration makes possible unique research that has direct impacts on civil society.

At the institute, this focus was illustrated by an invitation Geothink extended to Civic Tech Toronto for a hackathon merging the group’s members with Geothink’s students. Taking place on the evening of day two, the hack night featured a talk by Sieber and hands-on work on the issues Toronto citizens find most important to address in their city. Much like the institute itself, the night gave students a chance to apply their skills and knowledge to real applications in the city they were visiting.

[Geothink.ca theme music]

[Voice over: Geothoughts are brought to you by Geothink.ca and generous funding from Canada’s Social Sciences and Humanities Research Council.]

###

If you have thoughts or questions about this podcast, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Geothoughts Talks 4, 5, 6, & 7: Four Talks to Remember from the 2016 Summer Institute

Peter Johnson was one of four Geothink Co-Applicants who gave presentations on day two of the 2016 Geothink Summer Institute. Listen to their lectures here as podcasts.

By Drew Bush

Geothink’s Summer Institute may have concluded but, for those of you who missed it, we bring you four talks to remember. These lectures come from day two of the institute when four Geothink faculty members gave short talks on their different disciplinary approaches to evaluating open data.

The lectures feature Peter Johnson, an assistant professor in the University of Waterloo’s Department of Geography and Environmental Management; Teresa Scassa, Canada Research Chair in Information Law at the University of Ottawa; Pamela Robinson, associate professor in Ryerson University’s School of Urban and Regional Planning; and Geothink Head Renee Sieber, associate professor in McGill University’s Department of Geography and School of Environment.

Students at this year’s institute learned difficult lessons about applying actual open data to civic problems through group work and interactions with Toronto city officials, local organizations, and Geothink faculty. The last day of the institute culminated in a writing-skill incubator that gave participants the chance to practice communicating even the driest details of work with open data in a manner that grabs the attention of the public.

Held annually as part of a five-year Canadian Social Sciences and Humanities Research Council (SSHRC) partnership grant, each year the Summer Institute devotes three days of hands-on learning to topics important to research taking place in the grant. This year, each day of the institute alternated lectures and panel discussions with work sessions where instructors mentored groups one-on-one about the many aspects of open data.

Below we present you with a rare opportunity to learn about open data with our experts as they discuss important disciplinary perspectives for evaluating its value. You can also subscribe to these Podcasts by finding them on iTunes.

Geothoughts Talk 4: Reflecting on the Success of Open Data: How Municipal Governments Evaluate Open Data Programs
Join Peter Johnson as he kicks off day two of Geothink’s 2016 Summer Institute by inviting students to dream that they are civil servants at the City of Toronto when the city receives a hypothetical “F” rating for its open data catalogue. From this starting premise, Johnson’s lecture interrogates how outside agencies, academics, and organizations evaluate municipal open data programs. In particular, he discusses problems with current impact studies such as the Open Data 500 and what other current evaluation techniques look like.

Geothoughts Talk 5: The Value of Open Data: A Legal Perspective

Teresa Scassa starts our fifth talk by discussing how those working in the discipline of law don’t usually participate in the evaluation of open data. While those in law don’t typically evaluate open data themselves, she argues, legal statutes are often responsible for mandating such valuation. In particular, legal statutes often require specific types of data to be open. Furthermore, provisions in Canadian law such as the Open Courts Principle mean that many aspects of Canada’s legal system can be open-by-default.

Geothoughts Talk 6: Open Data: Questions and Techniques for Adding Civic Value
Pamela Robinson dispels the notion that open data derives value from economic benefits by instead discussing how such data can be used to fundamentally shift the relationship between civil society and institutions. She elaborates on this idea by noting that not all open data sets are created equal. Right now, she argues, the mixed ways in which open data is released can dramatically impact whether or not it’s useful to civic groups hoping to work with such data.

Geothoughts Talk 7: Measuring the Value of Open Data
In a talk that helps to summarize the previous three presenters, Renee Sieber discusses the different ways in which open data can be evaluated. She details many of the common quantitative metrics used—counting applications generated at a hackathon, the number of citizens engaged, or the economic output from a particular dataset—before discussing some qualitative indicators of the importance of a specific open data set. Some methods can likely capture certain aspects of open data better than others. She then poses a series of questions on how one can actually attach a value to the increased democracy or accountability gained by using open data.

If you have thoughts or questions about these podcasts, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca

Out of the Ivory Tower: Conveying Open Data Research to the General Public – Summer Institute Day 3

Day three of Geothink’s 2016 Summer Institute featured Ann Rauhala and April Lindgren leading a writing-skills incubator workshop.

By Drew Bush

On day three, the students at Geothink’s 2016 Summer Institute shifted gears from working with open data to thinking about the importance of conveying their work to the public. The day alternated between interactive lectures on how to write a strong Op-ed piece for a newspaper and hands-on group work where students tried their own hand at writing gripping prose.

Ann Rauhala, an associate professor in Ryerson University’s School of Journalism and associate director of the Ryerson Journalism Research Centre, started the day by talking about the importance of disseminating one’s research to a broader audience. Then she covered how to structure opinion pieces. She was followed by Geothink Partner April Lindgren, an associate professor at Ryerson University’s School of Journalism and founding director of the Ryerson Journalism Research Centre, who discussed how to think and write clearly about one’s research.

“You are already or are entering a world, let’s face it, of great privilege,” Rauhala told students. “You are lucky enough to be one of those people who gets to work with ideas and do exciting things that keep your brains moving. You are very fortunate. Part of the responsibility that comes with that privilege is your ability to communicate those ideas.”

“Because after all if what’s going on in the academy is not available or understood or appreciated in the public, we would still be, I don’t know, living in caves and reading the Globe and Mail,” she added. “And nothing else. Communicating these ideas will dramatically enhance your career no matter what your career is. It essentially raises your profile. It is actually, literally awarded in the academy. It is seen as knowledge translation.”

Over the first two days of the institute, students learned difficult lessons about applying actual open data to civic problems through group work and interactions with Toronto city officials, local organizations, and Geothink faculty. This last day of the institute represented the culmination of this work with open data.

Held annually as part of a five-year Canadian Social Sciences and Humanities Research Council (SSHRC) partnership grant, each year the Summer Institute devotes three days of hands-on learning to topics important to research taking place in the grant. This year, each day of the institute alternated lectures and panel discussions with work sessions where instructors mentored groups one-on-one.

After her introduction to the importance of students being able to communicate their ideas to a wider audience, Rauhala detailed the ways in which students should structure any opinion pieces that they write. The interactive lecture took students through examples of opinion pieces ranging from good to bad, with detailed analyses of what made them either effective or ineffective.

To see an excerpt of Rauhala’s talk on how to structure an Op-ed, check out this video:

Lindgren continued with a discussion of the important points students should consider in constructing any piece of writing to make it accessible and engaging to the reader.

“Sitting down to write does cause grief to quite a—well to most of us at some point in time,” Lindgren told students. “And a lot of us actually also think that there is something really mysterious and mystical about the writing process. You know, I have to be in the mood and the window blinds have to be down to a certain level, and the plants have to be in flower, and I have to have had this for breakfast, and then I can write.”

“Well, that’s maybe what you think,” she added. “But the truth is it’s like anything else. If you want to get better at it, you’ve got to sit down and you’ve got to practice it because you will improve with practice. Now having said all of that, there actually are some tricks of the trade to write in a clear and accessible way. And I’m going to talk about some of those today.”

For more of Lindgren’s talk, check out this excerpt:

For the students in attendance, the change in direction on the last day proved refreshing and taught them important new skills. For many, the nuanced and detailed coverage of best writing practices was not something often taught in their home departments. While working in groups, many mentioned learning important skills such as how to clearly organize an opinion piece, use Twitter to promote research, write captivating sentences, or pick the right time to propose an article to a publication.

“The third day, for me as a journalist, was like going back home from a trip,” Catalina Arango, from the University of Ottawa, said. “I had the chance to bring all those new experiences and lessons and put them into practice using familiar tools. The almost colloquial tone of the presentations and the exercises allowed me to translate that ‘almost exclusively academic’ concept of open data to simple words. Words that people can understand and digest in order to see their real value.”

“I took skills learned in other latitudes and put them into action in my current context,” she added. “It was a super interesting experience.”

Stay tuned for more iTunes podcasts from the Summer Institute here, and, of course, watch more of our video clips (which we’ll be uploading in coming days) here.

Geothink students, faculty, and staff at the 2016 Summer Institute at Ryerson University in Toronto.

If you have thoughts or questions about this article or the videos, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Measuring the Value of Open Government Data – Summer Institute Day 2

Day two of Geothink’s 2016 Summer Institute began with short lectures on specific disciplinary perspectives on open data. Teresa Scassa, Canada Research Chair in Information Law at the University of Ottawa, gave a legal perspective on the value of open data.

By Drew Bush

Day two of the 2016 Summer Institute began with presentations from Geothink’s faculty that aimed to provide different disciplinary approaches to evaluating open data. Armed with this information, students spent the rest of the day working in groups to first create measures to value open data, and, second, role-play how differing sectors might use a specific type of data.

The morning began with 30-minute presentations from members of Geothink’s faculty. Peter Johnson, an assistant professor in the University of Waterloo’s Department of Geography and Environmental Management, led off with a presentation on how municipal governments evaluate the success of their open data programs.

“This is the situation that we sort of find ourselves in when it comes to evaluating open data,” Johnson told students. “There’s this sort of world outside of government that’s bent on evaluating open data. And those are people like me, academics, those are non-profits, those are, you know, private sector organizations who are looking at open data and trying to understand how is it being used. So this is kind of, I think, a sign that open data has arrived a little bit. Right? It’s not just this sort of dusty, sort of nerdy cobweb in the corner of the municipal government basement. It’s something that other people are noticing and other people are taking an interest in.”

Johnson was followed by Teresa Scassa, Canada Research Chair in Information Law at the University of Ottawa, with a legal perspective on the value of open data. Pamela Robinson, associate professor in Ryerson University’s School of Urban and Regional Planning, gave a civic-oriented approach to the value of open data, one that was intentionally at odds with the private sector.

“I’ll be really blunt, I’m not that interested in making money from open data,” Robinson told students in regard to the common municipal reason for opening data. “It’s important but it’s not my thing. As an urban planner, my primary preoccupation is about citizens’ relationships with their government. And I’m interested in the proposition that open data as an input into open government can fundamentally shift the relationship between civil society and institutions.”

Finally, Geothink Head Renee Sieber, associate professor in McGill University’s Department of Geography and School of Environment, provided a summary of the methods for evaluating open data.

Each of these short lectures was part of a comprehensive look at open data during the three-day institute. Students at this year’s institute learned difficult lessons about applying actual open data to civic problems and how to evaluate the success of an open data program. In between activities on day two, students also heard from a panel of municipal officials and representatives of Toronto-based organizations working with open data.

Held annually as part of a five-year Canadian Social Sciences and Humanities Research Council (SSHRC) partnership grant, each year the Summer Institute devotes three days of hands-on learning to topics important to research taking place in the grant. This year, each day of the institute alternated lectures and panel discussions with work sessions where instructors mentored groups one-on-one about the many aspects of open data.

But many students struggled not only with thinking about how to evaluate the open data that they were working with, but also with how to determine the impact of any project that utilizes such an information source.

“I think a big challenge that I personally am facing is this idea of it’s supposed to have real improvement for society, it’s supposed to help society,” Rachel Bloom, from McGill University, said. “But we find that a lot of vulnerable populations actually won’t have access to these applications and the technology. So it’s kind of like trying to reconcile this idea of helping while also being aware that like maybe you are not actually reaching the population you are trying to help. Which is kind of what openness is about—is actually engaging the people personally.”

It is for such reasons that evaluating open data can be quite nuanced—an idea represented in student group presentations on the topic. The presentations varied greatly, with some student groups choosing metrics based on the things that a community might value and then establishing an outside monitor to observe datasets and report back to the community. Other students established a workflow to harness citizen input to evaluate open data through instruments such as online surveys.

An afternoon panel composed of local city officials and representatives from groups concerned with open data discussed the practical side of publishing, using, and evaluating open data as it stands today. The panel included Keith McDonald, former open data lead for the City of Toronto; Bryan Smith, co-founder and chief executive officer of ThinkData Works; Marcy Burchfield and Vishan Guyadeen, from The Neptis Foundation; and Dawn Walker and Curtis McCord, Geothink students from the University of Toronto who designed the Citizen’s Guide to Open Data.

Two of the primary concerns shared by panelists were the lack of standards governing how differing municipalities provide open data, and the gap that exists between how open data is provided and what businesses or citizens require to actually use it. Smith spoke of how early visions of students and application developers using open data to radically transform life in cities have not scaled up to the national level particularly well.

“What we are seeing, which I don’t think anyone predicted, is the large companies—mostly companies that run a bunch of apps that probably everyone here has on their phones—are the ones who are the biggest purveyors of open data,” Smith told students. Issues with the type and quantity of data (as well as differences between how data is provided in different places) have limited other players and even some of these big developers too.

For more on this discussion, check out an excerpt of the panel discussion below. We pick up the discussion as the panelists talk about standards in relation to the Open Government Partnership.

In role-playing activities, students considered the issues raised by the panel as well as the practical problems citizens or other groups might face in finding the open data they require. Concluding presentations included those from students playing the role of real estate developers, non-profits concerned with democracy, and a bicycle food courier service.

Stay tuned for the full audio of each professor’s talk presented as podcasts here. Also check back on Geothink for a synopsis of day three, and, of course, watch more of our video clips (which we’ll be uploading in coming days) here.

If you have thoughts or questions about this article or the videos, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Cooking up Open Data with the Iron Chef – Summer Institute Day 1

Richard Pietro and James Steenberg discuss one group’s open data application with them at Geothink’s 2016 Summer Institute.

By Drew Bush

The 2016 Geothink Summer Institute kicked off on May 9 with introductions from Geothink Head Renee Sieber, associate professor in McGill University’s Department of Geography and School of Environment, and Pamela Robinson, associate professor in Ryerson University’s School of Urban and Regional Planning. By that afternoon, the 35 students attending had gotten their hands dirty conceptualizing applications for real open data.

Students at this year’s institute learned difficult lessons about applying actual open data to civic problems through group work and interactions with Toronto city officials, local organizations, and Geothink faculty. The last day of the institute culminated in a writing-skill incubator that gave participants the chance to practice communicating even the driest details of work with open data in a manner that grabs the attention of the public.

Held annually as part of a five-year Canadian Social Sciences and Humanities Research Council (SSHRC) partnership grant, each year the Summer Institute devotes three days of hands-on learning to topics important to research taking place in the grant. This year, each day of the institute alternated lectures and panel discussions with work sessions where instructors mentored groups one-on-one about the many aspects of open data.

On day one, students learned about open data during an Open Data Iron Chef event with Toronto-based open data expert Richard Pietro, who affectionately calls himself an open government and open data fanboy. He’s known for twice riding his motorcycle across Canada to raise awareness of open data, his film Open, and the company he founded, OGT Productions. All of this work has led him to a unique view of open data and open government.

“It [open data and open government] allows people to customize their government,” Pietro said between sessions. “It’s as simple as that. And whenever anybody asks what it means: It just allows people to customize their government. Very similarly it does what social media did in 2004 to our relationships with our friends and companies and celebrities. Open data and open government is like social media but ten years ago.”

“It’s very new,” he added. “Some people understand its potential but nobody really understands how much it’s going to change everything about how people interact with their government and how government interacts with people. So it’s going to have incredible transformative powers.”

Watch a clip of Pietro introducing the Open Data Iron Chef event on day one here:

After Pietro’s introduction to open data, James Steenberg, a postdoctoral researcher at Ryerson University with Robinson, walked students through the different file types open data is often released in, what an actual data set might look like, and how to go about working with such data.

“I think it would be more useful if I just went through all the questions I would have if I was literally doing an Iron Chef by myself at home in the kitchen, which I did,” Steenberg told students. “Small apartment, my work desk happened to be pretty much in my kitchen, so I was able to draw some inspiration.”

“And I put together some slides and questions and answers based on just the questions I had starting from scratch,” he continued. “So going to the open data portal, downloading them, opening them up, what kind of file formats are we looking at and so forth. So that’s what I’m going to do today, I’m going to bounce around from a few different files as you saw. But basically I’d like to just develop my own civic app here of what I hope can be a useful function in the city.”

The majority of the day was then given over to students finding data they wished to work with (Pietro gave a wide variety of examples during his presentation), closely examining their chosen datasets, and determining novel ways the data could be used to improve city services or better engage citizens. At the end of the day, students presented proposals that included an analysis of gaps in open data (in availability and quality) and of what data would be needed to create an open data solution to a chosen real-world problem.

For one student group, this meant taking a closer look at data pertaining to water main breaks within the City of Toronto. In particular, they hoped to determine whether any spatial pattern existed in water main breaks relative to aspects of the built or natural environment that might influence them. The group felt such data could be used to help predict future break sites and facilitate repair before a rupture occurs.

Experiences with this type of work within the group varied widely.

“I don’t have a lot of background in some of this mapping stuff, so I come at it from a very different perspective,” Shelley Cook, from the University of British Columbia, Okanagan, said.

Her group-mate, in contrast, felt quite comfortable with the project the group had chosen.

“So I’ve had a lot of experience doing research on sort of the geographic side of open data, looking at geographic content,” Edgar Baculi, from Ryerson University, added. “I like this activity. This is a great experience. One question that comes to mind right now is why the quality of the data isn’t what I want it to be. In the future, I’d like to see the quality of the data better released, better published from municipal governments to help better answer questions we have as citizens in the decision-making process and in making things better for everyone else.”

Stay tuned for more iTunes podcasts from the Summer Institute here, check back on Geothink for synopses of days two and three, and, of course, watch more of our video clips (which we’ll be uploading in coming days) here.

If you have thoughts or questions about this article or video, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Getting a Better Handle on Geosocial Data with Geothink Co-Applicant Robert Feick

Images and text from sites like Flickr (the source of this image) provide geosocial data that University of Waterloo Associate Professor Robert Feick and his graduate students work to make more useful to planners and citizens.

By Drew Bush

A prevailing view of volunteered geographic information (VGI) is that large datasets exist equally across North American cities and spaces within them. Such data should therefore be readily available for planners wishing to use it to aid in decision-making. In a paper published last August in Cartography and Geographic Information Science, Geothink Co-Applicant Rob Feick put this idea to the test.

He and co-author Colin Robertson tracked Flickr data across 481 urban areas in the United States to determine what characteristics of a given city space correspond to the most plentiful data sets. This research allowed Feick, an associate professor in the University of Waterloo’s School of Planning, to determine how representative this type of user-generated data is across and within cities.

The paper (entitled Bumps and bruises in the digital skins of cities: Unevenly distributed user-generated content across U.S. urban areas) reports that coverage varies greatly between downtown cores and suburban spaces, as may be expected, but also that such patterns differ markedly between cities that appear similar in terms of size, function and other characteristics.

“Often it’s portrayed as if these large data resources are available everywhere for everyone and there aren’t any constraints,” he told Geothink.ca recently about this on-going research. Since these data sets are often repurposed to learn more about how people perceive places, this misconception can have clear implications for those working with such data sets, he added.

“Leaving aside all the other challenges with user generated data, can we take an approach that’s been piloted let’s say in Montreal and assume that it’s going to work as well in Hamilton, or Calgary, or Edmonton and so on?” he said. Due to variations in VGI coverage, tools developed in one local context may not produce the same results elsewhere in the same city or in other cities.

The actual types of data used in research like Feick’s can vary. Growing amounts of data from social media sites such as Flickr, Facebook, and Twitter, and transit or mobility applications developed by municipalities include geographic references. Feick and his graduate students work to transform such large datasets—which often include many irrelevant (and unruly) user comments or posts—into something that can be useful to citizens and city officials for planning and public engagement.

“My work tends to center on two themes within the overall Geothink project,” Feick said. “I have a longstanding interest in public engagement and participation from a GIS perspective—looking at how spatial data and tools condition and, hopefully, improve public dialogue. And the other broad area that I’m interested in is methods that help us to transform these new types of spatial data into information that is useful for governments and citizens.”

“That’s a pretty broad statement,” he added. “But in a community and local context, I’m interested in both understanding better the characteristics of these data sources, particularly data quality, as well as the methods we can develop to extract new types of information from large scale VGI resources.”

Applying this Research Approach to Canadian Municipalities

Much of Feick’s Geothink related research at University of Waterloo naturally involves work in the Canadian context of Kitchener, Waterloo, and the province of Ontario. He’s particularly proud of the work being done by his graduate students, Ashley Zhang and Maju Sadagopan. Both are undertaking projects that are illustrative of Feick’s above-mentioned two areas of research focus.

Many municipalities offer Web map interfaces that allow the public to place comments in areas of interest to them. Sadagopan’s work centres on providing a semi-automated approach for classifying these comments. In many cases, municipal staff have to read each comment and manually view where the comment was placed in order to interpret a citizen’s concerns.

Sadagopan is developing spatial database tools and rule-based logic that use keywords in comments as well as information about features (e.g. buildings, roads, etc.) near their locations to filter and classify hundreds of comments and identify issues and areas of common concern. This work is being piloted with the City of Kitchener using data from a recent planning study of the Iron Horse Trail that runs throughout Kitchener and Waterloo.
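
To make that approach concrete, here is a minimal sketch, assuming invented keyword rules, categories, and coordinates, of how keyword matching and feature proximity could be combined to tag a single comment. It illustrates the general technique rather than Sadagopan’s actual tools.

```python
# Illustrative sketch only: keyword rules plus feature proximity for tagging
# a citizen comment. Categories, keywords, and coordinates are hypothetical.
from dataclasses import dataclass
from math import hypot

@dataclass
class Comment:
    text: str
    x: float  # projected easting (metres)
    y: float  # projected northing (metres)

@dataclass
class Feature:
    kind: str  # e.g. "trail", "road", "building"
    x: float
    y: float

# Hypothetical keyword rules: category -> trigger words
KEYWORD_RULES = {
    "safety": {"dark", "unsafe", "lighting"},
    "maintenance": {"pothole", "overgrown", "broken"},
    "access": {"crossing", "ramp", "entrance"},
}

def classify(comment: Comment, features: list[Feature], radius_m: float = 50.0) -> dict:
    """Tag a comment with keyword-based categories and the types of nearby features."""
    words = {w.strip(".,!?").lower() for w in comment.text.split()}
    categories = [cat for cat, keys in KEYWORD_RULES.items() if words & keys]
    nearby = {f.kind for f in features
              if hypot(f.x - comment.x, f.y - comment.y) <= radius_m}
    return {"categories": categories or ["uncategorized"], "near": sorted(nearby)}

if __name__ == "__main__":
    features = [Feature("trail", 100.0, 200.0), Feature("road", 400.0, 90.0)]
    comment = Comment("The trail crossing here is unsafe after dark", 110.0, 205.0)
    print(classify(comment, features))
    # {'categories': ['safety', 'access'], 'near': ['trail']}
```

In a municipal setting the rules and the proximity queries would more likely live in a spatial database than in a standalone script, but the logic is the same: match vocabulary, then use location to narrow down what the comment refers to.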

Zhang’s work revolves around two projects that relate to light rail construction that is underway in the region of Waterloo. First, she is using topic modeling approaches to monitor less structured social media and filter data that may have relevance to local governments.

“She’s doing work that’s really focused on mining place-based and participation related information from geosocial media as well as other types of popular media, such as online newspapers and blogs, etc.,” Feick said. “She has developed tools that help to start to identify locales of concern and topics that over space and time vary in terms of their resonance with a community.”

“She’s moving towards the idea of changing public feedback and engagement from something that’s solely episodic and project related to something that could include also this idea of more continuous forms of monitoring,” he added.
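
For readers unfamiliar with topic modelling, the short sketch below, which assumes scikit-learn and uses a handful of invented posts, fits a small latent Dirichlet allocation (LDA) model and prints the top terms for each topic. Zhang’s actual pipeline is more elaborate, but the core step looks roughly like this.

```python
# Minimal topic-modelling sketch over civic social-media text (invented posts).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "Light rail construction has closed the sidewalk on King Street again",
    "Detour signs for the LRT work are confusing near the station",
    "Loved the food trucks at the park festival this weekend",
    "Great turnout at the neighbourhood park cleanup on Saturday",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(posts)  # document-term counts

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

terms = vectorizer.get_feature_names_out()
for topic_idx, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"Topic {topic_idx}: {', '.join(top_terms)}")
```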

To explore the data quality issues associated with VGI use in local governments, they are also working on a new project with Kitchener that will provide pedestrian routing services based on different types of mobility. The light rail project mentioned above has disrupted roadways and sidewalks with construction in the core area and will do so until the project is completed in 2017. Citizen feedback on the impacts of different barriers and temporary walking routes for people with different modes of mobility (e.g. use of wheelchairs, walkers, etc.) will be used to study how to gauge VGI quality and develop best practices for integrating public VGI into government data processes.

The work of Feick and his students provides important insight for the Geothink partnership on how VGI can be used to improve communication between cities and their citizens. Each of the above projects has improved service for citizens in Kitchener and Waterloo or enhanced the way in which these cities make and communicate decisions. Feick’s past projects and future research directions are similarly oriented toward practical, local applications.

Past Projects and Future Directions

Past projects Feick has completed with students include creation of a solar mapping tool for Toronto that showed homeowners how much money they might make from the provincial feed-in-tariff that pays for rooftop solar energy they provide to the grid. It used a model of solar radiation to determine the payoff from positioning panels on different parts of a homeowner’s roof.

Future research Feick has planned includes work on how to more effectively harness different sources of geosocial media given large data sizes and extraneous comments, further research into disparities in such data between and within cities, and a project with Geothink Co-Applicant Stéphane Roche to present spatial data quality and appropriate uses of open data in easy-to-understand visual formats.

If you have thoughts or questions about this article, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Abstract of Paper mentioned in the above article:

Bumps and bruises in the digital skins of cities: Unevenly distributed user-generated content across U.S. urban areas
Abstract
As momentum and interest builds to leverage new user-generated forms of digital expression with geographical content, classical issues of data quality remain significant research challenges. In this paper we highlight the uneven textures of one form of user-generated data: geotagged photographs in U.S. urban centers as a case study into representativeness. We use generalized linear modeling to associate photograph distribution with underlying socioeconomic descriptors at the city-scale, and examine intra-city variation in relation to income inequality. We conclude with a detailed analysis of Dallas, Seattle, and New Orleans. Our findings add to the growing volume of evidence outlining uneven representativeness in user-generated data, and our approach contributes to the stock of methods available to investigate geographic variations in representativeness. We show that in addition to city-scale variables relating to distribution of user-generated content, variability remains at localized scales that demand an individual and contextual understanding of their form and nature. The findings demonstrate that careful analysis of representativeness at both macro and micro scales simultaneously can provide important insights into the processes giving rise to user-generated datasets and potentially shed light into their embedded biases and suitability as inputs to analysis.
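
As a rough illustration of the city-scale modelling named in the abstract, the sketch below fits a Poisson generalized linear model relating geotagged-photo counts to a few socioeconomic descriptors. The file name and column names are placeholders invented for the example, not the authors’ actual variables.

```python
# Hedged sketch: a Poisson GLM at the city scale. The CSV and its columns
# are hypothetical placeholders, not the paper's data.
import pandas as pd
import statsmodels.api as sm

cities = pd.read_csv("city_summaries.csv")  # hypothetical: one row per urban area

predictors = sm.add_constant(
    cities[["population", "median_income", "pct_college_educated"]]
)
model = sm.GLM(cities["photo_count"], predictors, family=sm.families.Poisson())
result = model.fit()
print(result.summary())  # coefficients show which descriptors track photo density
```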

 

Geothoughts 9: Geothink Project Measures Open Data Standards for Consumer and Publisher Uses

Geothink’s Open Data Standards Project helps publishers and consumers better use open data.

By Drew Bush

We’re very excited to present you with our ninth episode of Geothoughts. You can also subscribe to this Podcast by finding it on iTunes.

In this episode, we examine a Geothink project on open data that officially kicked off in February 2015 with a Geothink teleconference call. Project lead Rachel Bloom, an undergraduate student in the Geothink Rapid Response Think Tank at McGill University, began this research one year ago. She worked with Geothink Head Renee Sieber, associate professor in McGill University’s Department of Geography and School of Environment.

It recently culminated in a white paper and two spreadsheets: (1) an examination of high-value open datasets Canadian cities use; and (2) an inventory of open data standards published by open data providers. Listen in as Bloom explains to partners who publish open data how to know what standards exist and who uses them for which datasets.

Thanks for tuning in. And we hope you subscribe with us at Geothoughts on iTunes. A transcript of this original audio podcast follows.

TRANSCRIPT OF AUDIO PODCAST

Welcome to Geothoughts. I’m Drew Bush.

[Geothink.ca theme music]

“This project is about investigating open data domain specific standards at the Canadian municipal level, which I guess is kind of a mouthful. But basically I’ve created two spreadsheets to figure out how Canadian municipalities are publishing their data and how the level of conformity is per the guidelines for open data standards.”

That’s Rachel Bloom, an undergraduate student in Geothink’s Rapid Response Think Tank at McGill University, talking about domain specific data from sectors like transportation and city budgets. She’s working with Geothink Head Renee Sieber, associate professor in McGill University’s Department of Geography and School of Environment.

“To begin this project I chose ten domains to focus on. These domains came from Open Knowledge Foundation spreadsheets. They are considered high value, and I thought these were interesting. I thought they were important to public use. So I chose them as the basis to create these spreadsheets.”

In late February, Bloom conducted a teleconference for the project’s partners in several Canadian cities. In it, Bloom discusses the project and each spreadsheet, and answers questions from those on the call. She starts with the first spreadsheet.

“It’s called ‘Adoption of Open Data Standards By Cities.’ So what we did for this is we have the 10 domains on the side on the y-axis, and then we have kind of nested between these certain metrics of how the municipality names the dataset, the file format, the structuration of the data, any metadata associated with the dataset or description of the data, and if these datasets for each domain are already using specific data standards—open data standards. And these were taken from each municipality’s open data catalogues.”

“And it helped for eventually comparing whether the ways that data is being published is even kind of compatible with the semantic and schematic guidelines dictated by available open data standards.”

Participants then examined a specific example from the spreadsheet, building permits for the City of Toronto. The call then proceeded to the next spreadsheet developed.

“It’s called ‘Inventory and Evaluation of Open Data Standards.’ Here we have on the y-axis these are individual open data standards that are kind of domain specific so they are pegged to certain domains and they cover the ten domains used for the other table. Though there is two extra domains…the metrics you find kind of on the top, are an innovation on my part. They were chosen by me based on the demand of data publishers and consumers I found in my research which came from all different types of mediums.

“I’ve even read e-mail correspondences of people talking about what they want when they are structuring their datasets. They also come from reinforcing that these standards are open. So what does it mean to be open? They have to be open, they have to be consensus driven, they have to have multi-stakeholder participation, so the metrics have to account for that.”

Bloom again takes participants through a specific example, this time a budget data package, going through all the metrics to give participants a sense of the quality of standard in terms of making data interoperable. When she finishes, Linda Low, open data lead for the City of Vancouver, interrupts her to ask:

“Rachel, can you talk a little about the criteria for whether or not it’s open or not again, it’s whether multi-stakeholders contribute to it, and there was something else too, right, that you said?”

“So when we talk about multi-stakeholders we’re talking about people who contribute that are from different facets of society. So the private sector, the public sector, civil societies, and also the obvious which is that open implies that there should be no royalties or fees associated with using the standard. It should be repurposable, they should be able to extend it how they wish, it should have a license that is open so that there is no legal ramification for using the standard as you please. You’re right it’s not explicitly mentioned which of these kind of contribute to defining openness but all of these are good fundamental metrics for an open standard I would think.”

The teleconference proceeds as Bloom and the call’s participants discuss the spreadsheets and white paper, stopping to elaborate on specific examples or details in more depth. Toward the end of the 40-minute call, Bloom shares the vision and goals for this project.

“There’s metrics that can help publishers, but there’s also metrics that can help consumers who would want to voice how they want to structure the data which is really part of the open process. So I think it can be used as multiple, for multiple purposes, really so it’s flexible in that way. So I’m not sure if there’s a very specific way of using it cause it really depends on the goals of the person using the resource.”

She’s followed up by Sieber, who first asks a question and then provides insight into how the project’s goals were determined.

“A standard is likely to be viewed much differently if you want to do something for internal government use like business intelligence as opposed to external use. And depending upon the audience, if you’re doing something for realtors it might be viewed quite differently than if you’re trying to do it for, I don’t know, low information voters.”

At the conclusion, Low offers the municipal publisher’s perspective on how constantly updated and revised standards make it hard to know which one a municipality should adopt in differing domains such as city budgets, crime statistics, or waste removal services.

“When do we say this is justifiable for us without doing a whole bunch of research and wasting the effort afterward. That was the thing I always keep struggling about.”

Bloom doesn’t hesitate with an answer.

“There are so many options too and ways of approaching it. I mean, I don’t know–it’s really about the interests of the person who is publishing the data and the goals. I think at the end of the day, it’s going to, different governments are going to have to reconcile what their goals are and how they want to go about it. Which is the hardest part.”

This project is ongoing and next steps will continue to look at the landscape of open data standards in Canada.

[Voice over: Geothoughts are brought to you by Geothink.ca and generous funding from Canada’s Social Sciences and Humanities Research Council.]

###

If you have thoughts or questions about this podcast, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Inside Geothink’s Open Data Standards Project: Standards For Improving City Governance

By Rachel Bloom

Rachel Bloom is a McGill University undergraduate student and project lead for Geothink’s Open Data Standards Project.

In February, I led a Geothink seminar with city officials to introduce the results of our open data standards project we began approximately one year earlier. The project was started with the objective of assisting municipal publishers of open data in standardizing their datasets. We presented two spreadsheets: the first was dedicated to evaluating ‘high-value’ open datasets published by Canadian municipalities and the second consisted of an inventory of open data standards applicable to these types of datasets.

Both spreadsheets enable our partners who publish open data to know what standards exist and who uses them for which datasets. The project I lead is motivated by the idea that well-developed data standards for city governance can grant us the luxury of not having to think about the compatibility of technological components. When we screw in a new light bulb or open a web document we assume that it will work with the components we have (Guidoin and McKinney 2012). Technology, whether it refers to information systems or manufactured goods, relies on standards to ensure its usability and dissemination.

Municipal governments that publish open data recognize the importance of standards for improving the usability of their data. Unfortunately, even though ‘high-value’ datasets have increasingly become available to the public, there is currently no consensus about how these datasets should be structured and specified. Such datasets include crime statistics and annual budget data, which can provide new services to citizens when municipalities open them by publishing them to their open data catalogues online. Anyone can access such datasets and use the data however they wish without restriction.

Civic data standards provide agreements about semantic and schematic guidelines for structuring and encoding the data. Data standards specify technical data elements such as file formats, data schemas, and unique identifiers to make civic data interoperable. For example, most datasets are published in CSV or XML formats. CSV structures the data in columns and rows, while XML encapsulates the data in a hierarchical tree of <tags>.
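
As a simple illustration of that difference, the sketch below writes the same hypothetical building-permit record first as CSV and then as XML. The field names are invented for the example rather than drawn from any particular standard.

```python
# Illustrative only: one hypothetical building-permit record as CSV and as XML.
import csv
import io
import xml.etree.ElementTree as ET

record = {"permit_id": "BP-2016-0042", "issued_date": "2016-05-09",
          "work_type": "New Building", "est_cost": "250000"}

# CSV: a header row plus one data row (columns and rows)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=record.keys())
writer.writeheader()
writer.writerow(record)
print(buffer.getvalue())

# XML: each field becomes a nested <tag> in a hierarchical tree
permit = ET.Element("permit")
for field, value in record.items():
    ET.SubElement(permit, field).text = value
print(ET.tostring(permit, encoding="unicode"))
```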

They also specify common vocabularies in order to clarify interpretation of the data’s meanings. Such vocabularies could include, for example, definitions for categories of expenditure in annual budget data. Geothink’s Open Data Standards Project offers publishers of open data an opportunity to improve the usability and efficiency of their data for consumers. This makes it easier to share data across municipalities because the technological components and their meanings within systems will be compatible.

Introducing Geothink’s Open Data Standards Project
No single, clear definition of an open data standard exists. In fact, most definitions of an ‘open data standard’ follow two prevailing ideas: 1) standards for open data; and 2) open standards for data. Geothink’s project examines and relates both of these prevailing ideas (Table 1). The first spreadsheet, the ‘Adoption of Open Data Standards By Cities’, considers open data and its associated data standards. The second spreadsheet, the ‘Inventory of Open Data Standards,’ considers the process of open standardization. In other words, we were curious about what standards are currently being applied to open municipal data, and how to break down and document open standards for data in a way that is useful to municipalities looking to standardize their open data.

Table 1: Differences between ‘open data’ standards and open ‘data standards’

                                          Requires open data    Requires open standard process
Evaluation of ‘High-Value’ Datasets       Yes                   No
Inventory of Open Data Standards          No                    Yes

The project’s evaluation of datasets relates to standards for open data. Standards for open data refer to standards that, regardless of how they are developed and maintained, can be applied to open data. Open data, according to the Open Knowledge Foundation (2014), consists of raw digital data that should be freely available to anyone to use, repurposable and re-publishable as users wish, and absent mechanisms of control like restrictive licenses. However, the process of developing and maintaining standards for open data may neither require transparency nor include public appeals during its development.

To discover what civic data standards are currently being used, the first spreadsheet, Adoption of Open Data Standards By Cities, evaluates ‘high value’ datasets specific to 10 domains (categories of datasets such as crime, transportation, or service requests) in the open data catalogues for the cities of Vancouver, Toronto, Surrey, Edmonton and Ottawa. The types of data were chosen based on the Open Knowledge Foundation’s choice of datasets considered to provide the greatest utility for the public. The project’s spreadsheet notes the salient structuring and vocabulary of each dataset, such as the name, file format, schema, and available metadata. It especially notes which data standards these five municipalities are using for their open data (if any at all).

With consultation from municipal bodies and organizations dedicated to publishing open data, we developed a second spreadsheet, Inventory and Evaluation of Open Data Standards, that catalogues and evaluates 22 open data standards that are available for domain-specific data. The rows of this spreadsheet indicate individual data standards. The columns of this spreadsheet evaluate background information and quality for achieving optimal interoperability for each of the listed standards. Evaluating the quality of the standard’s performance, such as whether the standard is transferable to multiple jurisdictions, is an important consideration for municipalities looking to optimally standardize their data. Examples of open data standards in this inventory are BLDS for building permit data and the Budget Data Package for annual budget data.
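
One way to picture a row of that inventory is as a small record whose fields mirror the evaluation metrics. The sketch below is illustrative only; the field names paraphrase the criteria described in this article and the example values are not taken from the spreadsheet itself.

```python
# Illustrative record for one inventory row; values are examples, not findings.
from dataclasses import dataclass

@dataclass
class StandardEntry:
    name: str
    domain: str
    multi_stakeholder: bool   # openness: developed with multi-stakeholder input
    consensus_driven: bool    # openness: consensus-based maintenance process
    open_license: bool        # openness: usable without royalties or fees
    transferable: bool        # interoperability: usable across jurisdictions

inventory = [
    StandardEntry("BLDS", "building permits", True, True, True, True),
    StandardEntry("Budget Data Package", "annual budgets", True, True, True, True),
]
```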

The project’s second spreadsheet is concerned with open standards for data. Open standards, as opposed to closed standards, require a collaborative, transparent, and consensus-driven process to maintain their development (Palfrey and Gasser, 2012). Therefore, open standards honor a commitment to processes of transparency, due process, and rights of appeal. Similarly to open data, open standards resist processes of unchecked, centralized control (Russell, 2014). Open data standards make sure that end users do not get locked into a specific technology. In addition, because open standards are driven by consensus, they are developed according to the needs and interests of participating stakeholders. While we provide spreadsheets on both, our project advocates implementing open standards for open data.

In light of the benefits of open standardization, the metrics of the second spreadsheet note the degree of openness for each standard. Such indicators of openness include multi-stakeholder participation and a consensus-driven process. Openness may be observed through the presence of online forums to discuss suggestions or concerns regarding the standard’s development and background information about each standard’s publishers. In addition, open standards use open licenses that dictate the standards may be used without restriction and repurposed for any use. Providing this information not only allows potential implementers to be aware of what domain-specific standards exist, but also allows them to gauge how well the standard performs in terms of optimal interoperability and openness.

Finally, an accompanying white paper explains the two spreadsheets and the primary objective of my project for both publishers and consumers of open data. In particular, it explains the methodology, justifies the chosen evaluations, and notes the project’s results. In addition, this paper will aid in navigating and understanding both of the project’s spreadsheets.

Findings from this Project
My work on this project has led me to conclude that the majority of municipally published open datasets surveyed do not use civic data standards. The most common standards used by municipalities in our survey were the General Transit Feed Specification (GTFS) for transit data and the Open311 API for service request data. Because datasets across cities and sectors vary in format and structure, these differences, coupled with a lack of cohesive definitions for labeling, indicate that standardization across cities will be a challenging undertaking. Publishers aiming to extend data shared among municipalities would benefit from collaborating and agreeing on standards for domain-specific data (as is the case with GTFS).
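
Part of GTFS’s appeal is that its required files and fields are the same everywhere, so the same few lines of code can read any city’s feed. The sketch below, with a placeholder feed path, reads the standard stop fields from a feed’s stops.txt.

```python
# Reading the standard stop fields from a GTFS feed; the path is a placeholder.
import csv

def load_stops(path: str) -> list[dict]:
    """Read stops.txt; stop_id, stop_name, stop_lat and stop_lon are standard GTFS fields."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        return [
            {
                "id": row["stop_id"],
                "name": row["stop_name"],
                "lat": float(row["stop_lat"]),
                "lon": float(row["stop_lon"]),
            }
            for row in csv.DictReader(f)
        ]

# stops = load_stops("gtfs_feed/stops.txt")  # the same call works for any city's feed
```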

Our evaluation of 22 domain-specific data standards also shows standards do exist across a variety of domains. However, some domains, such as budget data, contain more open data standards than others. Therefore, potential implementers of standards must reconcile which domain-specific standard best fits their objectives in publishing the data and provides the most benefit for the public good.

Many of the standards also include contact information for the standard’s publishers along with online forums for concerns or suggestions. However, many still lack full information regarding their documentation or are simply in early draft stages. This means that although standards exist, some of them are in their early stages and may not be ready for implementation.

Future Research Pathways
This project has room for growth so that we can better help our partners who publish and use open data decide how to go about adopting standards. To accomplish this goal, we could add more cities, domains, and open standards to the spreadsheets. In addition, any future changes to the standards or datasets will need to be reflected in the spreadsheets.

In terms of the inventory of open data standards, it might be beneficial to separate metrics that evaluate openness of a standard from metrics that evaluate interoperability of a standard. Although we have emphasized the benefits of open standardization in this project, it is evident that some publishers of data do not perceive openness as crucial to a data standard’s success in achieving optimal interoperability.

As a result, my project does not aim to dictate how governments implement data standards. Instead, we would like to work with municipalities to understand what is valued within the decision-making process to encourage adoption of specific standards. We hope this will allow us to provide guidance on such policy decisions. Most importantly, to complete such work, we ask Geothink’s municipal partners for input on factors that influence the adoption of a data standard in their own catalogues.

Contact Rachel Bloom at rachel.bloom@mail.mcgill.ca with comments on this article or to provide input on Geothink’s Open Data Standards Project.

References
Guidoin, Stéphane, and James McKinney. 2012. "Open Data, Standards and Socrata." November 22, 2012. http://www.opennorth.ca/2012/11/22/open-data-standards.html.
Open Knowledge. 2014. "Open Definition 2.0." Retrieved 23 October 2015, from http://opendefinition.org/od/2.0/en/.
Palfrey, John Gorham, and Urs Gasser. 2012. Interop: The Promise and Perils of Highly Interconnected Systems. Basic Books.
Russell, Andrew L. 2014. Open Standards and the Digital Age. Cambridge University Press.