Geothink&Learn 8: Data Driven Journalism

The rise of geospatially referenced big data enables journalists and their organizations to use new analytical techniques and approaches to investigate and report the news. Building on computer-assisted reporting, data driven journalism allows the analysis and filtering of large datasets to dig deeper into important societal issues or news stories. Such work can be driven by a variety of computational or data-driven advances, including resources such as open source software, open access publishing, and open data. The range of applications for data driven journalism can provide a deeper understanding of important societal, governmental, or cultural issues. It also helps journalism reach new levels of service for the public, including understanding patterns in order to make decisions based on findings. As such, data driven journalism helps to drive a new brand of reporting that places journalists and news outlets in an increasingly relevant role.

Geothink’s researchers have used big data, crowdsourced data, and new open datasets in conjunction with mapping technologies, applications (apps), and Web 2.0 software to explore applications of data driven journalism. This has even included an investigation of the changing news media market that resulted in concrete data on what parts of Canada are not well-served by print, radio, and television journalistic institutions. This panel brought together leading academics and journalists to discuss the opportunities, challenges, and implications of this work.

On Tuesday, March 27 at 12:00 (EST), Geothink held its eighth monthly Geothink&Learn video conference session on the topic of data driven journalism. It highlighted Geothink’s unique interdisciplinary perspective and included a myriad of ideas from our faculty, students and partners. (Catch a recording of this session below along with a transcript of the written question and answer session.)

The convener for the session was Geothink Co-Applicant Jon Corbett, an associate professor at the University of British Columbia, Okanagan’s Department of Community, Culture and Global Studies. Speakers included Roberto Rocha, an investigative data journalist at the Canadian Broadcasting Corporation; Fred Vallance-Jones, an associate professor in the school of journalism at the University of King’s College, Halifax; April Lindgren, an associate professor in the Ryerson School of Journalism; and Zane Schwartz, the 2017 Michelle Lang Fellow at the National Post.

Our five panelists briefly introduced their research and then reflected on new applications of data to journalism. A question and answer session followed the presentations.

Edited transcript of written question and answer:

Question 1: Zane, super interesting database! Will there be a cross-reference database available for other funding?

It’s something that we’re talking about, but it would be resource and time dependent. It would be great to cross reference this database with things like lobbyist databases, public disclosure of contracts over $10,000 or even things like the PM’s public schedule.

Question 2: Zane, can you expand more on what stories stand to be missed?

Hopefully we’ll get to everything! At this point, though, there’s been quite a bit of focus on who the donors to the dominant political parties are. There are all kinds of small things I haven’t seen too much talk on yet. Some examples: The federal Green party gives a fair bit to the provincial parties, in some years probably helping keep those parties afloat. The “regional” tool is well worth exploring; it shows how bigger provinces like Ontario and Alberta play a big role in some of the smaller provinces/territories.

Question 3: How can we make visual story telling more interactive?

There are many tools for making interactive projects, in addition to programming them manually. Some include Tableau, Carto, Datawrapper, and Flourish.

Question 4: Fred, what are some of the challenges to representing historical sites that no longer exist?

Interesting question. We represented the buildings as 2-D polygons, as we had no precise way to know what they looked like, their accurate heights, etc. So we saw it as a map. The other issue was that the map we worked from was based on a 1914 insurance survey, so it was missing some buildings. Additionally, we had to do a lot of research at the archives to understand how the street network had changed. So all that together adds up to a simple answer: a lot of research is required to verify what you think existed.

Question 5: Who is data journalism for?

[Question answered verbally during session.]

Question 6: Do you have user statistics that suggest one visualisation technique is more successful than another?

It really depends on the data. There isn’t one visualization type that is known to be better for all types of information. But there are well-known principles of how the brain interprets information visually. For example, bar charts are more effective than pie charts. Look up Edward Tufte and Alberto Cairo.

Question 7: Another question is about the role of users’ real-time data, which could be used in journalism through social media like Twitter and so on. What is its role, what are the methods of validation (considering the fact that news is shared), and what about people’s privacy when they share news?

It doesn’t change. We still need to verify information on social media, perhaps more so than in other contexts, since it’s easy to spread false information online. As for privacy, when you post publicly on social media, it’s understood that it’s not a private utterance.

Question 8: To rephrase, do we aim data journalism towards everyone, or to those who are most likely to use it?

Hopefully to everyone, just like regular journalism. If only data-savvy people are getting value from our work, we’re not doing it right.

Question 9: Could the panelists share some of the links to projects shown in their presentations?

For the Halifax Explosion Project:

National Post Follow-the-money database:

Local News Project:

Tuesday, March 27, 2018 at 12:00 PM EST [NOW CONCLUDED]



Jon Corbett

Drew Bush



Roberto Rocha: Within a few short years, data journalism has evolved from a newsroom novelty to a serious storytelling specialization. In its early days, it was fine to dump all the data in a map or interactive dashboard and let readers explore it themselves. Today, data journalists are expected to find the main story and present it in an engaging way, like any competent storyteller should.

Zane Schwartz: Zane discussed the National Post Follow-the-Money Database he created.



April Lindgren: April talked about data journalism and local news. In particular, she discussed how the capacity of local news organizations to mine stories from open data is lagging behind the growing availability of government open data sources. She suggested the problems are two-fold: 1) most local journalists still lack the skills they need to clean and use data for storytelling, and 2) local news organizations do not have, or are unwilling to invest, resources in local data journalism projects.

Fred Vallance-Jones: Journalists love to use maps to sell stories because place is such an important aspect of the news. When master of journalism students at the University of King’s College wanted to tell the story of the 1917 Halifax Explosion, they turned to GIS software. Their professor, Fred Vallance-Jones, joins us.