Paulina Marczak – looking back on her co-op at Open North


As she embarks on a Master’s degree, I interviewed Paulina Marczak (former Geothink student) about her four-month co-op with Geothink partner Open North.

What have you been up to since your internship at Open North?
After Open North, I did another co-op in the fall term with Dr. Derek Robinson under an NSERC USRA [Natural Sciences and Engineering Research Council Undergraduate Student Research Award] grant, where I looked at variations in aboveground vegetative carbon storage across different spatial resolutions within Southwestern Ontario.
I just finished my undergraduate degree at the University of Waterloo. My undergraduate thesis looked at landscape configurations with wetlands in the boreal plains and asked: Is there a relationship between geology and wetland landscape configuration?

Right now I have just begun pursuing a Master’s degree in Geography at Queen’s University in Kingston. So I went into another sub-field still related to geography, but diverging from open data.

Your work in open data and open government is quite removed from your current course.
Yes. I wanted to go into climate change after my undergrad, particularly through GIS and remote sensing. However, this summer I had the opportunity to work for the Canadian Open Data Exchange (ODX) and got to help develop their plans for commercialization of open data. They wanted someone who understood the value of open data.

What do you think you got out of your time at Open North?
I learned a lot. I started out with zero experience with open data. You know, it’s easy to fall down the rabbit hole of open data and explore one particular aspect of it, like metadata, without even touching another aspect. Being able to co-author white papers that contribute to a global-scale initiative, and interview people from around the world, that was a really valuable and unique experience.

What was it like working for a non-profit?
James, Stéphane, and everyone at Open North were really great. It was different because all my previous co-ops were in government, federal and municipal. They were very structured. Open North was smaller, and it required you to be more. They want you to be a part of that team. They make you feel like you are a critical component of the team, not to mention the valuable mentorship they provide. Infomediaries, they prod governments, they speak on behalf of and give a voice to the people. That’s why I think their work is impactful. Working at Open North also gave me the opportunity to attend the Canadian Open Data Summit 2015 in Ottawa, where I got to meet various members of the open data community and speak to panelists.

What skills did you bring from Open North to your current position?
Being able to critically research, and experience with technologies such as APIs and R (statistical software). Most important is writing. At Open North I learned to write on a deadline, such as our OGP [Open Government Partnership] white papers, and I also learned about academic writing from Professor Renee Sieber.

It’s been interesting as a new Master’s student. I was talking to a librarian here in Kingston and they were interested in the idea of open data, but were surprisingly satisfied with the very restrictive data agreements that are currently in place…there is more work to be done on the advocacy side. On the other hand, I was able to talk to the City of Kingston and they are about to roll out a new open data initiative, per Council approval. From my interactions with the librarian, I realized that I could talk about this topic now and I had some idea of how things should be done. In fact, they were looking to me for advice, which was a new milestone for me.

It sounds like you may be interested in advocating for open data in your new environment?
Sure. I can talk about it, but I don’t feel I have the capacity and knowledge to spearhead it. But I do feel it is my responsibility to inform people if they don’t know what open data is or want to learn about some of the current issues surrounding open data.

Do you feel more confident in talking about open data now?
Yes, but I don’t feel like I’m the expert. I feel like I’m an apprentice. Constantly learning.

Open Data and Urban Forests: A Summer Student Exchange in Waterloo

This is a guest post from Geothink postdoctoral researcher James Steenberg, Ryerson University School of Urban and Regional Planning, who works with Dr. Pamela Robinson. He writes about his experiences in Geothink’s student exchange program.

By James Steenberg, PhD

I recently undertook a three-day Geothink Summer Exchange at the University of Waterloo. My mission: to find out what, if anything, open data has to do with the practice of urban forestry.

I am currently a postdoctoral researcher under the supervision of Dr. Pamela Robinson at Ryerson University’s School of Urban and Regional Planning. Dr. Robinson was also on my PhD committee and over the past three years we have been blending our ideas on urban forest ecosystems, urban planning, citizen science, and open data. Open data and open government, in particular, are something that I’m excited about, but the topic is still quite new and unfamiliar to me. I was therefore incredibly fortunate to have the opportunity to seek out the guidance of Geothink co-applicant Dr. Peter Johnson.

Dr. Johnson is an Assistant Professor at Waterloo’s Department of Geography and Environmental Management, where among a great many other topics he conducts research on the value of open data and its role in open government initiatives. My hope was to learn about open data and open government from Peter and his students with the ultimate goal of writing a collaborative paper about the role of open data in municipal urban forestry. Practitioners of urban forestry are faced with a myriad of management challenges due to the complex, rapidly changing, and vulnerable state of urban forest ecosystems. Two challenges in particular stand out: 1) practitioners lack sufficient data describing the state of the urban forest to inform their decision-making and 2) a large portion of the urban forest is situated on privately owned residential properties and municipal governments need to engage residents to undertake stewardship activities.

We began the three-day exchange with one of my favourite things to do: having a conversation about how to write something together. This was followed by a meet-and-greet lunch with Dr. Johnson’s students. I was also given the opportunity to give a presentation to students and faculty in Waterloo’s Faculty of Environment. I discussed and received feedback on my current research with Dr. Robinson investigating the effects of housing renewal on urban trees, which was the original research that led us to believe there was more to uncover on open data and urban forests. Over the course of the exchange, I learned about a number of fascinating research projects ranging from citizen engagement to volunteered geographic information (VGI) to water management.

Giving my talk at the Faculty of Environment, University of Waterloo

Rehearsing prior to my talk, with a captivated audience

It can be all too rare an opportunity to hear about on-going research projects that are outside of my discipline, and I found it insightful in guiding my own work. For instance, I learned about Qing (Lucy) Lu’s research and recent publication on how Edmonton citizens engage their government through different communication channels and technologies. Citizens and community groups also engage with their urban forest in many different ways, and arguably open data is one such way that is on the rise. In a serendipitous discovery, Lucy’s paper inspired me to explore Edmonton’s open data portal where I saw that the government leverages open data and the geoweb in their urban forestry. The City’s yegTreeMap initiative not only provides people with open data describing the urban forest and its benefits, but also provides an interactive mapping platform and even allows city residents to input data about their favourite trees.

I wrapped up my time in Waterloo with Dr. Johnson by revisiting a potential paper on the role of open data in municipal urban forestry, which was now appropriately seasoned with new ideas. In particular, I was challenged to think that maybe it’s not just about how urban foresters can use government open data to advance the practice. Perhaps our inquiry could be expanded to the full Geothink mandate of understanding citizen-government interactions. In Edmonton, citizens can engage their government by participating in urban forest data collection while municipal urban foresters can make better decisions with a more complete and up-to-date tree inventory. Can people and trees alike reap the benefits in cities that practice open urban forestry? This is the question I returned home with, and I will continue to investigate until answered.

Importantly, the University of Waterloo campus has some stunning trees

My sincere thanks to Geothink for giving me the opportunity to go on a summer exchange at the University of Waterloo. Thank you to Dr. Peter Johnson for hosting me at the Department of Geography and Environmental Management and for introducing me to your students and colleagues.

To the Geothink community members: please don’t hesitate to contact me if you have further questions or if you are considering going on a summer exchange yourself.

James Steenberg is a postdoctoral researcher under the supervision of Dr. Pamela Robinson at Ryerson University’s School of Urban and Regional Planning. His research focuses on the ecology and management of the urban forest. James can be reached by email – james.steenberg@ryerson.ca – and on Twitter – @JamesSteenberg

Stay tuned for James’ next post detailing his research.

Getting a Better Handle on Geosocial Data with Geothink Co-Applicant Robert Feick

Images and text from sites like Flickr (the source of this image) provide geosocial data that University of Waterloo Associate Professor Robert Feick and his graduate students work to make more useful to planners and citizens.

By Drew Bush

A prevailing view of volunteered geographic information (VGI) is that large datasets exist equally across North American cities and spaces within them. Such data should therefore be readily available for planners wishing to use it to aid in decision-making. In a paper published last August in Cartography and Geographic Information Science, Geothink Co-Applicant Rob Feick put this idea to the test.

He and co-author Colin Robertson tracked Flickr data across 481 urban areas in the United States to determine what characteristics of a given city space correspond to the most plentiful data sets. This research allowed Feick, an associate professor in the University of Waterloo’s School of Planning, to determine how representative this type of user-generated data is across and within cities.

The paper (entitled Bumps and bruises in the digital skins of cities: Unevenly distributed user-generated content across U.S. urban areas) reports that coverage varies greatly between downtown cores and suburban spaces, as may be expected, but also that such patterns differ markedly between cities that appear similar in terms of size, function and other characteristics.

“Often it’s portrayed as if these large data resources are available everywhere for everyone and there aren’t any constraints,” he told Geothink.ca recently about this on-going research. Since these data sets are often repurposed to learn more about how people perceive places, this misconception can have clear implications for those working with such data sets, he added.

“Leaving aside all the other challenges with user-generated data, can we take an approach that’s been piloted let’s say in Montreal and assume that it’s going to work as well in Hamilton, or Calgary, or Edmonton and so on?” he said. Due to variations in VGI coverage, tools developed in one local context may not produce the same results elsewhere in the same city or in other cities.

The actual types of data used in research like Feick’s can vary. Growing amounts of data from social media sites such as Flickr, Facebook, and Twitter, and transit or mobility applications developed by municipalities include geographic references. Feick and his graduate students work to transform such large datasets—which often include many irrelevant (and unruly) user comments or posts—into something that can be useful to citizens and city officials for planning and public engagement.

“My work tends to center on two themes within the overall Geothink project,” Feick said. “I have a longstanding interest in public engagement and participation from a GIS perspective—looking at how spatial data and tools condition and, hopefully, improve public dialogue. And the other broad area that I’m interested in is methods that help us to transform these new types of spatial data into information that is useful for governments and citizens.”

“That’s a pretty broad statement,” he added. “But in a community and local context, I’m interested in both understanding better the characteristics of these data sources, particularly data quality, as well as the methods we can develop to extract new types of information from large scale VGI resources.”

Applying this Research Approach to Canadian Municipalities

Much of Feick’s Geothink-related research at the University of Waterloo naturally involves work in the Canadian context of Kitchener, Waterloo, and the province of Ontario. He’s particularly proud of the work being done by his graduate students, Ashley Zhang and Maju Sadagopan. Both are undertaking projects that illustrate Feick’s two areas of research focus described above.

Many municipalities offer Web map interfaces that allow the public to place comments in areas of interest to them. Sadagopan’s work centres on providing a semi-automated approach for classifying these comments. In many cases, municipal staff have to read each comment and manually view where the comment was placed in order to interpret a citizen’s concerns.

Sadagopan is developing spatial database tools and rule-based logic that use keywords in comments as well as information about features (e.g. buildings, roads, etc.) near their locations to filter and classify hundreds of comments and identify issues and areas of common concern. This work is being piloted with the City of Kitchener using data from a recent planning study of the Iron Horse Trail that runs throughout Kitchener and Waterloo.
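To make the idea of keyword-plus-location rules concrete, here is a minimal Python sketch. It is not Sadagopan's tool: the keyword lists, category names, the 50-unit search radius, and the `Feature`, `Comment`, and `classify` names are all hypothetical, chosen only to illustrate how a comment's wording and the features near it might be combined.

```python
import re
from dataclasses import dataclass
from math import hypot

# Hypothetical keyword rules (category -> trigger words); illustrative only.
KEYWORD_RULES = {
    "safety": {"unsafe", "dark", "lighting", "crossing"},
    "maintenance": {"pothole", "broken", "flooded", "overgrown"},
    "amenities": {"bench", "washroom", "fountain", "parking"},
}


@dataclass
class Feature:
    kind: str  # e.g. "road", "building", "trail"
    x: float
    y: float


@dataclass
class Comment:
    text: str
    x: float
    y: float


def nearby_kinds(comment, features, radius):
    """Return the kinds of features within `radius` map units of the comment."""
    return {
        f.kind
        for f in features
        if hypot(f.x - comment.x, f.y - comment.y) <= radius
    }


def classify(comment, features, radius=50.0):
    """Assign categories from keywords, then refine them with a spatial rule."""
    words = set(re.findall(r"[a-z']+", comment.text.lower()))
    categories = sorted(c for c, kws in KEYWORD_RULES.items() if words & kws)
    # Hypothetical spatial rule: a safety comment placed near a road is
    # additionally tagged so staff can route it to the right department.
    if "safety" in categories and "road" in nearby_kinds(comment, features, radius):
        categories.append("road-safety")
    return categories or ["unclassified"]


features = [Feature("road", 0, 0), Feature("building", 200, 200)]
print(classify(Comment("The crossing here feels unsafe at night.", 10, 10), features))
print(classify(Comment("Broken bench, please fix", 500, 500), features))
```

A real system would presumably run such rules inside a spatial database and use proper distance calculations on projected coordinates; the sketch keeps everything in memory to show the logic only.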

Zhang’s work revolves around two projects that relate to light rail construction that is underway in the region of Waterloo. First, she is using topic modeling approaches to monitor less structured social media and filter data that may have relevance to local governments.

“She’s doing work that’s really focused on mining place-based and participation related information from geosocial media as well as other types of popular media, such as online newspapers and blogs, etc.,” Feick said. “She has developed tools that help to start to identify locales of concern and topics that over space and time vary in terms of their resonance with a community.”

“She’s moving towards the idea of changing public feedback and engagement from something that’s solely episodic and project related to something that could include also this idea of more continuous forms of monitoring,” he added.

To explore the data quality issues associated with VGI use in local governments, they are also working on a new project with Kitchener that will provide pedestrian routing services based on different types of mobility. The light rail project mentioned above has disrupted roadways and sidewalks with construction in the core area and will do so until the project is completed in 2017. Citizen feedback on the impacts of different barriers and temporary walking routes for people with different modes of mobility (e.g. use of wheelchairs, walkers, etc.) will be used to study how to gauge VGI quality and develop best practices for integrating public VGI into government data processes.
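One way to picture routing by mobility type is a shortest-path search that skips any sidewalk segment carrying a barrier the traveller cannot pass. The Python sketch below assumes a tiny made-up network, made-up barrier labels, and made-up mobility profiles; the names `EDGES`, `IMPASSABLE`, and `shortest_route` are hypothetical and do not describe the Kitchener project's actual design.

```python
import heapq

# Hypothetical sidewalk segments: (from, to, length in metres, reported barriers).
EDGES = [
    ("A", "B", 100, set()),
    ("B", "D", 100, {"stairs"}),
    ("A", "C", 150, set()),
    ("C", "D", 150, {"gravel"}),
    ("C", "E", 100, set()),
    ("E", "D", 120, set()),
]

# Hypothetical profiles: barriers that each mobility mode cannot pass.
IMPASSABLE = {
    "walking": set(),
    "walker": {"stairs"},
    "wheelchair": {"stairs", "gravel"},
}


def build_graph(edges, mode):
    """Keep only the segments that are passable for the given mobility mode."""
    blocked = IMPASSABLE[mode]
    graph = {}
    for a, b, length, barriers in edges:
        if barriers & blocked:
            continue  # segment has a barrier this mode cannot pass
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))
    return graph


def shortest_route(edges, mode, start, goal):
    """Dijkstra's algorithm; returns (distance, path) or None if unreachable."""
    graph = build_graph(edges, mode)
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, length in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (dist + length, nxt, path + [nxt]))
    return None


for mode in IMPASSABLE:
    print(mode, shortest_route(EDGES, mode, "A", "D"))
```

In this toy network the barrier labels stand in for citizen reports, so the quality of each route depends directly on the quality of the VGI behind it, which is the question the project sets out to study.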

The work of Feick and his students provides important insight for the Geothink partnership on how VGI can be used to improve communication between cities and their citizens. Each of the above projects has improved service for citizens in Kitchener and Waterloo or enhanced the way in which these cities make and communicate decisions. Feick’s past projects and future research directions are similarly oriented toward practical, local applications.

Past Projects and Future Directions

Past projects Feick has completed with students include creation of a solar mapping tool for Toronto that showed homeowners how much money they might make from the provincial feed-in-tariff that pays for rooftop solar energy they provide to the grid. It used a model of solar radiation to determine the payoff from positioning panels on different parts of a homeowner’s roof.

Future research Feick has planned includes work on how to more effectively harness different sources of geosocial media given large data sizes and extraneous comments, further research into disparities in such data between and within cities, and a project with Geothink Co-Applicant Stéphane Roche to present spatial data quality and appropriate uses of open data in easy-to-understand visual formats.

If you have thoughts or questions about this article, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Abstract of Paper mentioned in the above article:

Bumps and bruises in the digital skins of cities: Unevenly distributed user-generated content across U.S. urban areas
As momentum and interest builds to leverage new user-generated forms of digital expression with geographical content, classical issues of data quality remain significant research challenges. In this paper we highlight the uneven textures of one form of user-generated data: geotagged photographs in U.S. urban centers as a case study into representativeness. We use generalized linear modeling to associate photograph distribution with underlying socioeconomic descriptors at the city-scale, and examine intra-city variation in relation to income inequality. We conclude with a detailed analysis of Dallas, Seattle, and New Orleans. Our findings add to the growing volume of evidence outlining uneven representativeness in user-generated data, and our approach contributes to the stock of methods available to investigate geographic variations in representativeness. We show that in addition to city-scale variables relating to distribution of user-generated content, variability remains at localized scales that demand an individual and contextual understanding of their form and nature. The findings demonstrate that careful analysis of representativeness at both macro and micro scales simultaneously can provide important insights into the processes giving rise to user-generated datasets and potentially shed light into their embedded biases and suitability as inputs to analysis.


Geothoughts Talks 1, 2, & 3: Three Talks to Remember from the 2015 Geothink Summer Institute

Our first three Geothoughts Talks come from the 2015 Summer Institute.

By Drew Bush

Geothink’s Summer Institute may have concluded over a month ago, but, for those of you who missed it, we bring you three talks to remember. Run as part of Geothink’s five-year Canadian Social Sciences and Humanities Research Council (SSHRC) partnership grant, the Institute aimed to provide undergraduate and graduate students from the partnership and beyond with knowledge and training in theoretical and practical aspects of crowdsourcing.

Each day of the institute alternated morning lectures, panel discussions and in-depth case studies on topics in crowdsourcing with afternoon work sessions where professors worked with student groups one-on-one on their proposal to meet a challenge posed by the City of Ottawa. See our first post on this here.

The lectures featured Geothink Head Renee Sieber, associate professor in McGill University’s Department of Geography and School of Environment; Robert Goodspeed, assistant professor of Urban Planning at the University of Michigan’s Taubman College of Architecture and Urban Planning; Daren Brabham, assistant professor in the University of Southern California Annenberg School of Journalism and Communication; and Monica Stephens, assistant professor in the Department of Geography at State University of New York at Buffalo.

Below we present you with a rare opportunity to learn about crowdsourcing with our experts as they discuss important ideas and case studies. A short summary describes what each talk covers.

Geothoughts Talk One: In-Depth Case Studies in Crowdsourcing (1hr 3min)

Join Sieber and Brabham as they discuss two case studies that examine the actual application of crowdsourcing technologies and techniques to real-world situations. First Sieber describes the work of her Master’s Student Ana Brandusescu in applying crowdsourcing technologies to chronic community development issues in three places in Montreal, QC and Vancouver, BC. Next, Brabham discusses one of his first efforts to research the application of crowdsourcing technology to public transportation planning during a design contest he held for a bus stop at the University of Utah campus in Salt Lake City, UT.

Geothoughts Talk Two: A Deeper Dive into Crowdsourcing: Advanced Topics in Crowdsourcing and Civic Crowdfunding (1hr 8min)

Goodspeed spends the morning covering three topics of inherent interest to anyone involved in crowdsourcing work. During this talk, he focuses in on three areas new to his own research including crowdfunding, formal crowdsourcing and the tool Ushahidi. Each of these topics helps prepare listeners for being a crowdsourcing professional.

Geothoughts Talk Three: Discussion on the Future of Crowdsourcing in the Public Sector (35 min)

Brabham and Goodspeed lead a discussion on where the future for crowdsourcing lies in the public sector. In particular, Goodspeed begins with an opening statement on how crowdsourcing can be used to help government agencies gain legitimacy by actually seeking input which can guide their actions. Brabham then challenges students to consider that crowdsourcing applications do fail and, even when they succeed, often can challenge whole professions that exist to collect the same data by other means.

If you have thoughts or questions about these podcasts, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Geothink Summer Institute to Kick-off Monday, June 15, 2015!

The Environment 3 Building (EV3), at the University of Waterloo, where the summer institute will be held.

By Drew Bush

Get ready Geothinkers, this year’s Geothink Summer Institute will run from June 15-16, 2015 and will be held at the University of Waterloo in Waterloo, Ontario. Check in at our Summer Institute web site, where we’ll be live tweeting the day’s events.

The agenda is jam-packed with big names in the emergent field of crowdsourcing, which one Geothinker calls “a web-based business model that harnesses the creative solutions of a distributed network of individuals.” That’s from the University of Southern California Annenberg School of Journalism and Communication Assistant Professor Daren Brabham, who will be giving one of the morning’s first sessions to more than 30 undergraduate and graduate students who have registered to attend.

Other speakers include Robert Goodspeed, assistant professor of Urban Planning at the University of Michigan’s Taubman College of Architecture and Urban Planning; Monica Stephens, assistant professor in the Department of Geography at State University of New York at Buffalo and Geothink Head Renee Sieber, McGill University associate professor in the Department of Geography and School of Environment. Check out the full agenda here.

Speakers will explore topics related to crowdsourcing in a hyperlocal world where geospatial technologies like Google Maps and GPS-enabled cellphones enable massive quantities of data to be collected. In today’s world, there are tweets about potholes, mobile applications which deliver directions to the nearest coffee shop, and large databases only recently opened by many governments around the world.

The summer institute is hosted by Geothink, a five-year partnership grant awarded by the Canadian Social Sciences and Humanities Research Council (SSHRC) in 2012. The partnership includes researchers in different institutions across Canada, as well as partners in Canadian municipal governments, non-profits and the private sector. The expertise of our group is wide-ranging and includes aspects of social sciences as well as humanities such as: geography, GIS/geospatial analysis, urban planning, communications, and law.

If you have thoughts or questions about the article, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Is Raw Data Bad For You? Open Data Obligations to Government.

By: Leah Cooke, Stephanie Piper, Alana Kingdon, and Peter Johnson

*This blog post was written collaboratively during the springtime Geothink meetup between Ryerson University and University of Waterloo students + faculty. The goals of this meetup were to discuss current and future issues related to Geothink research themes.

What strings are attached to governments that provide open data to citizens? Alongside the current interest in government open data, questions remain about how government should share data. Specifically, what obligations do governments have beyond simple data provision? These obligations could include educating citizens, contextualizing data, and also being receptive to citizen feedback on the data provided. For example, if a government publishes drinking water quality data, does it have a (moral, ethical, operational) obligation to support this data with relevant contextualizing information? We propose five main responses that government could provide when answering this question.

1. Nothing

Providing the data as it exists without any contextual information to aid in understanding the data.

2. Metadata

Defining the details of data by including acronyms and field names etc., to make the document readable for technically adept users.

3. Processed data

Data that includes maps, legends, annotations, or graphs/charts to aid in the understanding of the data by viewers, while still including original data to allow for additional analyses. Also included is descriptive information or explanatory text that may be helpful to users’ understanding of the data.

4. Engagement and Responsiveness

A responsive format for the distribution of open data would see a commitment to the sustainability of the data itself, by ensuring updates and maintenance to open data portals.  An obligation for citizen engagement would also be present at this level, with governments creating workshops or tools to help citizens become knowledgeable about the data as well as ensuring two-way communication between those with questions or suggestions surrounding the data.

5. Interoperable Standards for Data Sets

Data sets are released in a standardized format, with the intention of increasing the accessibility of data for novice users as well as for ease of integrating information from different municipalities for regional analyses.
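As a rough illustration of why a shared format eases regional analysis, the Python sketch below maps two differently structured exports onto one common schema. Everything in it is an assumption made for illustration: the two sample CSV extracts, the field names, the common schema, and the `harmonize` function are hypothetical and do not represent any municipality's actual data or any published standard.

```python
import csv
import io

# Hypothetical extracts from two municipalities describing the same kind of record.
CITY_A_CSV = "tree_id,species,diameter_cm\n1,Acer saccharum,30\n"
CITY_B_CSV = "ID,SpeciesName,DBH_in\n7,Quercus rubra,10\n"

# Hypothetical mapping from each city's field names to a common schema.
FIELD_MAPS = {
    "city_a": {"tree_id": "id", "species": "species", "diameter_cm": "dbh_cm"},
    "city_b": {"ID": "id", "SpeciesName": "species", "DBH_in": "dbh_cm"},
}

# Unit conversions needed to reach the common schema (inches to centimetres).
UNIT_FACTORS = {"city_b": {"dbh_cm": 2.54}}


def harmonize(raw_csv, source):
    """Rename fields and convert units so rows from any source look alike."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_csv)):
        row = {"source": source}
        for field, value in record.items():
            row[FIELD_MAPS[source][field]] = value
        factor = UNIT_FACTORS.get(source, {}).get("dbh_cm", 1.0)
        row["dbh_cm"] = float(row["dbh_cm"]) * factor
        rows.append(row)
    return rows


regional = harmonize(CITY_A_CSV, "city_a") + harmonize(CITY_B_CSV, "city_b")
for row in regional:
    print(row)
```

Without an agreed standard, every analyst who combines the two sources has to write and maintain a mapping like this; with one, the data arrive already aligned.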

While these five standards are different potential ways government can operationally structure and release their data, the question still remains: which format is ethically or morally the option that should be adopted? Further, government bodies have complex requirements to abide by legislation, including the Accessibility for Ontarians with Disabilities Act (AODA), that also need to be considered when releasing any information. Do these requirements alter these obligations? Beyond the regulations themselves, further accessibility issues are also raised. Should the data be accessible by various levels of users, from novice to expert? What does this mean for the ethical framework surrounding the release of the data? As data is often released in formats only recognized by technical users such as .csv files, is there an additional obligation to release data that is open to nontechnical users as well? Inherent in the name “open data” is the assumption that this data is being released in order to increase transparency. It would be natural to assume that this data should therefore be accessible to users regardless of their technical skill levels.

In conclusion, for municipal governments, providing raw data is really just the first step. Governments that are serious about using open data as a prelude or support to open government need to also provide tools and support to enable data being turned into information. Metadata is not enough, and open data does not replace targeted information and publications created internally and shared with citizens.

Accuracy, Authenticity and Technical Aspects of Privacy

At the Universities of Laval and Waterloo, we are interested in what is often seen as the “virtuous cycle” of citizens’ increasing use of open government data and, potentially, for governments to actively leverage information that the public creates. Our work centers on issues of accuracy, authenticity and privacy in citizen-generated spatial data and the changing relationships between governments and citizens in data provision and use. In Year 1, we are concentrating on assembling baseline information that will help us understand how citizens use open data from governments and the extent to which Canadian governments currently leverage citizen-contributed data. In this first phase, we will assemble a literature review and survey government partners at local, provincial and national levels to:

  1. Identify and characterize the main current open data initiatives (e.g., who is providing what data, in which forms?) and what data standards are used at local and provincial levels (if any?),
  2. Identify existing as well as potential practices for: a) using crowdsourced data (including barriers and opportunities) and, b) for validating crowdsourced data,
  3. Explore the linkages between open data (as a product and as practice) and crowdsourcing at the municipal and provincial levels (e.g. open data as not only a service provided by the organization but also a way to improve data through feedback loops in practice).

Two PhD students (Ashley Zhang – Waterloo, Teriitutea Quesnot – Laval) have been hired to jointly complete the literature review, survey administration and analysis and also participate in reporting the results through a journal paper. Teriitutea Quesnot is from French Polynesia. Teriitutea received his bachelor’s and master’s degrees in France and he has strong geocomputing and programming skills as well as consulting experience. Ashley is from China and has completed her Master’s at the University of Georgia with a thesis focus on exploring spatio-temporal changes in the socio-spatial structure of Beijing. Currently, her PhD research is centred on public engagement and place-making in smart cities. Since our government partners operate in both English and French, the survey will be bilingual to allow a pan-Canadian assessment to be developed. This information relating to current opportunities and barriers will help us develop new methods for promoting and visualizing data authenticity and accuracy. We anticipate that it also will contribute to project-wide efforts to develop best practices for Canadian governments to manage citizen-generated data in light of data privacy and quality concerns.

We know that many of our partners and others have considerable experience in utilizing crowdsourced data. Even if you don’t, you probably have questions you’d like explored.

We encourage you to get in touch with us to enrich our research. Feel free to email stephane.roche@scg.ulaval.ca and robert.feick@uwaterloo.ca.