geosocial data | Geothink

Images and text from sites like Flickr (the source of this image) provide geosocial data which University of Waterloo Associate Professor Robert Feick and his graduate students work to make more useful to planners and citizens.

By Drew Bush

A prevailing view of volunteered geographic information (VGI) is that large datasets exist equally across North American cities and spaces within them. Such data should therefore be readily available for planners wishing to use it to aid in decision-making. In a paper published last August in Cartography and Geographic Information Science, Geothink Co-Applicant Rob Feick put this idea to the test.

He and co-author Colin Robertson tracked Flickr data across 481 urban areas in the United States to determine which characteristics of a given city space correspond to the most plentiful data sets. This research allowed Feick, an associate professor in the University of Waterloo’s School of Planning, to determine how representative this type of user-generated data is across and within cities.

The paper, entitled “Bumps and bruises in the digital skins of cities: Unevenly distributed user-generated content across U.S. urban areas,” reports that coverage varies greatly between downtown cores and suburban spaces, as may be expected, but also that such patterns differ markedly between cities that appear similar in terms of size, function and other characteristics.

“Often it’s portrayed as if these large data resources are available everywhere for everyone and there aren’t any constraints,” he told Geothink.ca recently about this ongoing research. Because these data sets are often repurposed to learn more about how people perceive places, this misconception has clear implications for those working with such data sets, he added.

“Leaving aside all the other challenges with user generated data, can we take an approach that’s been piloted let’s say in Montreal and assume that it’s going to work as well in Hamilton, or Calgary, or Edmonton and so on?” he said. Due to variations in VGI coverage, tools developed in one local context may not produce the same results elsewhere in the same city or in other cities.

The actual types of data used in research like Feick’s can vary. Growing amounts of data from social media sites such as Flickr, Facebook, and Twitter, and transit or mobility applications developed by municipalities include geographic references. Feick and his graduate students work to transform such large datasets—which often include many irrelevant (and unruly) user comments or posts—into something that can be useful to citizens and city officials for planning and public engagement.

“My work tends to center on two themes within the overall Geothink project,” Feick said. “I have a longstanding interest in public engagement and participation from a GIS perspective—looking at how spatial data and tools condition and, hopefully, improve public dialogue. And the other broad area that I’m interested in is methods that help us to transform these new types of spatial data into information that is useful for governments and citizens.”

“That’s a pretty broad statement,” he added. “But in a community and local context, I’m interested in both understanding better the characteristics of these data sources, particularly data quality, as well as the methods we can develop to extract new types of information from large scale VGI resources.”

Applying this Research Approach to Canadian Municipalities

Much of Feick’s Geothink-related research at the University of Waterloo naturally involves work in the Canadian context of Kitchener, Waterloo, and the province of Ontario. He’s particularly proud of the work being done by his graduate students, Ashley Zhang and Maju Sadagopan. Both are undertaking projects that illustrate Feick’s two areas of research focus described above.

Many municipalities offer Web map interfaces that allow the public to place comments in areas of interest to them. Sadagopan’s work centres on providing a semi-automated approach for classifying these comments. In many cases, municipal staff have to read each comment and manually view where the comment was placed in order to interpret a citizen’s concerns.

Sadagopan is developing spatial database tools and rule-based logic that use keywords in comments as well as information about features (e.g. buildings, roads, etc.) near their locations to filter and classify hundreds of comments and identify issues and areas of common concern. This work is being piloted with the City of Kitchener using data from a recent planning study of the Iron Horse Trail that runs throughout Kitchener and Waterloo.
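The general idea of combining comment keywords with nearby map features can be illustrated with a minimal Python sketch. This is not Sadagopan’s tool: the rule table, the category names, the 50-metre search radius, and the use of simple projected x/y coordinates in metres are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Set, Tuple


@dataclass
class Feature:
    kind: str  # e.g. "road", "building", "trail"
    x: float   # assumed projected coordinates, in metres
    y: float


@dataclass
class Comment:
    text: str
    x: float
    y: float


# Hypothetical rules: (category, trigger keywords, feature kind required nearby).
# A required kind of None means the keywords alone are enough.
RULES: List[Tuple[str, Set[str], Optional[str]]] = [
    ("trail surface", {"pothole", "gravel", "paving"}, "trail"),
    ("road crossing", {"crossing", "traffic", "signal"}, "road"),
    ("lighting", {"dark", "lighting", "lights"}, None),
]


def nearby_kinds(comment: Comment, features: List[Feature], radius_m: float) -> Set[str]:
    """Return the kinds of all features within radius_m of the comment."""
    return {
        f.kind
        for f in features
        if hypot(f.x - comment.x, f.y - comment.y) <= radius_m
    }


def classify(comment: Comment, features: List[Feature], radius_m: float = 50.0) -> str:
    """Return the first rule category whose keywords and spatial test both match."""
    words = set(re.findall(r"[a-z]+", comment.text.lower()))
    kinds = nearby_kinds(comment, features, radius_m)
    for category, keywords, required_kind in RULES:
        if words & keywords and (required_kind is None or required_kind in kinds):
            return category
    return "unclassified"


features = [Feature("trail", 0.0, 0.0), Feature("road", 100.0, 0.0)]
comments = [
    Comment("Big pothole near the bench.", 10.0, 5.0),
    Comment("Crossing feels unsafe", 95.0, 0.0),
    Comment("Lovely view", 0.0, 0.0),
]
labels = [classify(c, features) for c in comments]
```

In this sketch a comment that mentions a pothole is only labelled as a trail surface issue when a trail feature lies within the search radius, which mirrors the article’s point that both the words and the location of a comment carry meaning.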

Zhang’s work revolves around two projects that relate to light rail construction that is underway in the region of Waterloo. First, she is using topic modeling approaches to monitor less structured social media and filter data that may have relevance to local governments.

“She’s doing work that’s really focused on mining place-based and participation related information from geosocial media as well as other types of popular media, such as online newspapers and blogs, etc.,” Feick said. “She has developed tools that help to start to identify locales of concern and topics that over space and time vary in terms of their resonance with a community.”

“She’s moving towards the idea of changing public feedback and engagement from something that’s solely episodic and project related to something that could include also this idea of more continuous forms of monitoring,” he added.

To explore the data quality issues associated with VGI use in local governments, they are also working on a new project with Kitchener that will provide pedestrian routing services based on different types of mobility. The light rail project mentioned above has disrupted roadways and sidewalks with construction in the core area and will do so until the project is completed in 2017. Citizen feedback on the impacts of different barriers and temporary walking routes for people with different modes of mobility (e.g. use of wheelchairs, walkers, etc.) will be used to study how to gauge VGI quality and develop best practices for integrating public VGI into government data processes.

The work of Feick and his students provides important insight for the Geothink partnership on how VGI can be used to improve communication between cities and their citizens. Each of the above projects has improved service for citizens in Kitchener and Waterloo or enhanced the way in which these cities make and communicate decisions. Feick’s past projects and future research directions are similarly oriented toward practical, local applications.

Past Projects and Future Directions

Past projects Feick has completed with students include creation of a solar mapping tool for Toronto that showed homeowners how much money they might make from the provincial feed-in-tariff that pays for rooftop solar energy they provide to the grid. It used a model of solar radiation to determine the payoff from positioning panels on different parts of a homeowner’s roof.

Future research Feick has planned includes work on how to more effectively harness different sources of geosocial media given large data sizes and extraneous comments, further research into disparities in such data between and within cities, and a project with Geothink Co-Applicant Stéphane Roche to present spatial data quality and appropriate uses of open data in easy-to-understand visual formats.

If you have thoughts or questions about this article, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Abstract of Paper mentioned in the above article:

Bumps and bruises in the digital skins of cities: Unevenly distributed user-generated content across U.S. urban areas
Abstract
As momentum and interest builds to leverage new user-generated forms of digital expression with geographical content, classical issues of data quality remain significant research challenges. In this paper we highlight the uneven textures of one form of user-generated data: geotagged photographs in U.S. urban centers as a case study into representativeness. We use generalized linear modeling to associate photograph distribution with underlying socioeconomic descriptors at the city-scale, and examine intra-city variation in relation to income inequality. We conclude with a detailed analysis of Dallas, Seattle, and New Orleans. Our findings add to the growing volume of evidence outlining uneven representativeness in user-generated data, and our approach contributes to the stock of methods available to investigate geographic variations in representativeness. We show that in addition to city-scale variables relating to distribution of user-generated content, variability remains at localized scales that demand an individual and contextual understanding of their form and nature. The findings demonstrate that careful analysis of representativeness at both macro and micro scales simultaneously can provide important insights into the processes giving rise to user-generated datasets and potentially shed light into their embedded biases and suitability as inputs to analysis.

London Olympic wayfinding beacon (Photo courtesy of www.mudarchitecture.com).

By Drew Bush

In two articles published this January, Geothink researcher Stéphane Roche and his doctoral student Teriitutea Quesnot argue that not all geosocial data is equivalent, and that better data on the social significance of a landmark could greatly enhance our understanding of human wayfinding behavior. Roche, a professor of geomatics at Université Laval, has focused his research over the past five years on how new forms of digital spatiality affect spatial reasoning skills and the capacity of individuals to engage in the city.

Entitled “Measure of Landmark Semantic Salience through Geosocial Data Streams,” the first paper was published by Roche in the ISPRS International Journal of Geo-Information. The authors write that a lot of research “in wayfinding is done in order to enable individuals to reach as quickly as possible a desired destination, to help people with disabilities by designing cognitively appropriate orientation signs, and reduce the fact of being lost.”

Previous researchers in the field of geo-cognition have tried to characterize the salience of landmarks in human wayfinding behaviour. Most have classified landmarks by visual, structural and semantic cues. However, the social dimensions of a landmark, such as how it is practised or recognized by individuals or groups, have been excluded from its semantic salience (or often reduced to historical or cultural cues), according to the authors.

Instead, the authors follow in a tradition of research which utilizes text mining from the web to understand how places are expressed by Internet users rather than relying on how they are visually perceived. Such an approach has been made possible by social media and mobile communications technology that has resulted in vast user-generated databases that constitute “the most appropriate VGI data for the detection of global semantic landmarks.”

In conducting their research, the authors examined world-famous landmarks and detected semantic landmarks in the cities of Vienna and Paris using data from Foursquare API v2 and Facebook API v2.1 from September 29, 2014 to November 15, 2014.

In a second paper entitled “Platial or Locational Data? Toward the Characterization of Social Location Sharing,” the authors expanded on this theme in arguing that not all geosocial data is equal. The paper was presented at the 48th Hawaii International Conference on System Sciences this past January.

Some data, which the authors consider “platial,” relates more to users’ experiences of a given place, while “spatial” data is tied to the actual coordinates of a place. In the context of geosocial data, spatial data might mean the exact location of the Eiffel Tower, while platial data could refer to a person passing by the Eiffel Tower or taking a photo of it from another location.

Because each can potentially represent a very different kind of data point, they must be treated differently. As the authors write, “With the objective of a better understanding of urban dynamics, lots of research projects focused on the combination of geosocial data harvested from different social media platforms. Those analyses were mainly realized on a traditional GIS, which is a tool that does not take into account the platial component of spatial data. Yet, with the advent of Social Location Sharing, the inconvenience of relying on a classic GIS is that a large part of VGI is now more platial than locational.”
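A rough way to picture the distinction is to separate records that carry only coordinates from records that name a place. The Python sketch below is an illustration only, not the authors’ classification method: the `place_id` field and the rule “a record with a place identifier is platial” are assumptions, loosely based on the abstract’s examples of a geotagged tweet (locational) and a Swarm check-in (platial).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GeosocialRecord:
    source: str                     # e.g. "tweet", "check-in"
    lat: float
    lon: float
    place_id: Optional[str] = None  # hypothetical venue identifier from a check-in


def kind_of(record: GeosocialRecord) -> str:
    """Assumed heuristic: a record that names a place is platial, else locational."""
    return "platial" if record.place_id else "locational"


def split_by_kind(records: List[GeosocialRecord]) -> Dict[str, List[GeosocialRecord]]:
    """Group records so each kind can be analyzed separately."""
    groups: Dict[str, List[GeosocialRecord]] = {"platial": [], "locational": []}
    for record in records:
        groups[kind_of(record)].append(record)
    return groups


records = [
    GeosocialRecord("tweet", 48.8584, 2.2945),
    GeosocialRecord("check-in", 48.8584, 2.2945, place_id="eiffel-tower"),
]
groups = split_by_kind(records)
```

Both records here share the same coordinates, yet only the check-in asserts “I am/was at that Place,” which is why the authors argue the two should not be analyzed on the same spatial level.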

Find links to each article along with their abstracts below.

Measure of Landmark Semantic Salience through Geosocial Data Streams

ABSTRACT

Research in the area of spatial cognition demonstrated that references to landmarks are essential in the communication and the interpretation of wayfinding instructions for human being. In order to detect landmarks, a model for the assessment of their salience has been previously developed by Raubal and Winter. According to their model, landmark salience is divided into three categories: visual, structural, and semantic. Several solutions have been proposed to automatically detect landmarks on the basis of these categories. Due to a lack of relevant data, semantic salience has been frequently reduced to objects’ historical and cultural significance. Social dimension (i.e., the way an object is practiced and recognized by a person or a group of people) is systematically excluded from the measure of landmark semantic salience even though it represents an important component. Since the advent of mobile Internet and smartphones, the production of geolocated content from social web platforms—also described as geosocial data—became commonplace. Actually, these data allow us to have a better understanding of the local geographic knowledge. Therefore, we argue that geosocial data, especially Social Location Sharing datasets, represent a reliable source of information to precisely measure landmark semantic salience in urban area.

Platial or Locational Data? Toward the Characterization of Social Location Sharing

ABSTRACT

Sharing “location” information on social media became commonplace since the advent of smartphones. Location-based social networks introduced a derivative form of Volunteered Geographic Information (VGI) known as Social Location Sharing (SLS). It consists of claiming “I am/was at that Place”. Since SLS represents a singular form of place-based (i.e. platial) communication, we argue that SLS data are more platial than locational. According to our data classification of VGI, locational data (e.g. a geotagged tweet which geographic dimension is limited to its coordinate information) are a reduced form of platial data (e.g. a Swarm check-in). Therefore, we believe these two kinds of data should not be analyzed on the same spatial level. This distinction needs to be clarified because a large part of geosocial data (i.e. spatial data published from social media) tends to be analyzed on the basis of a locational equivalence and not on a platial one.

If you have thoughts or questions about the article, get in touch with Drew Bush, Geothink’s digital journalist, at drew.bush@mail.mcgill.ca.

Getting a Better Handle on Geosocial Data with Geothink Co-Applicant Robert Feick