Useful Tools For Data Journalists

27 February 2014

Author: cristina


On February 13, 2014, a group of around 20 people, most of them journalists, activists, students, and programmers, gathered in the central Tbilisi office of JumpStart Georgia for a screening of the documentary film Our Currency is Information, created by the Tactical Tech Collective, a US-based non-governmental organization dedicated to the use of information in activism. The film is part of a three-part series called Exposing the Invisible, which explores a variety of tools that investigative journalists, data analysts, and information activists can use to fight organized crime and corruption, both in their communities and internationally.


Our Currency is Information features the work of Paul Radu, a Romanian investigative reporter with the Organized Crime and Corruption Reporting Project who specializes in investigating international organized crime networks. To simplify the very complex process of tracking the flow of money across borders, Radu and his team created the Visual Investigative Scenarios (VIS) tool. Like JumpStart, Radu understood that complex information is easier to understand when it is communicated visually. By mapping criminal networks, the VIS allowed Radu and his team to understand and analyze the large quantities of data available. The tool maps illicit deals, offshore companies held by politicians, and other data that provide evidence to support the journalists’ stories. But the tool was not created only to assist journalists. It also helps raise awareness among the general population about illicit cross-border deals that could affect their own communities. The journalists hope that if more people are aware of how these illegal activities affect them, they will put pressure on local officials to crack down on them.


After the film, the group took part in a lively roundtable discussion to share their impressions of the documentary and of how a tool like the VIS could be used in Georgia. While the spectators raised a variety of important questions, there was a general consensus that tools for visualizing large quantities of data are both useful and necessary for anyone interested in conducting serious investigative reporting. Much of the data made available to journalists is not organized for analysis, visualization, or uncovering stories, so journalists have to learn to use a variety of tools to gather information and make it consumable. JumpStart’s data journalist Nino Macharashvili, for example, uses a variety of technologies to analyze the large data sets that form the foundation of the information visuals JumpStart publishes each month.


With this in mind, here are just a few examples of the lesser-known resources that data journalists and information activists can use to gather, analyze, and visualize data.


1. The Investigative Dashboard: Relaunched in October 2013, the Investigative Dashboard is a research tool that helps journalists access business records from around the world. Created by the Global Investigative Journalism Network with the help of Google Ideas, the Investigative Dashboard features a crowd-sourced database, put together by dozens of reporters and civic hackers, that contains company registration records and other similar public information. The ID also serves as a portal to more than 400 online databases in 120 jurisdictions where you can search for information on individuals and corporations worldwide.


2. OpenCorporates: A website that consolidates information about corporations in one place that anyone can easily access. It currently provides information about 211,653 different legal entities in 173 countries worldwide, but its ultimate goal is to have a URL for every company in the world. While this may seem like a lofty goal, since its creation in 2010 the site has grown from including information about a few million companies to including information from over 75 jurisdictions and 55 million companies, and more are added each week. For more data-oriented and tech-savvy users, the website also provides access to the raw data and to the site’s API.


3. Google Fusion Tables: This online database, which anyone with a Gmail account can find in their Google Drive, is perfect for filtering and summarizing data across hundreds of rows. After you have sorted through your data, you can turn it into a chart, map, network graph, or custom layout, and easily embed or share what you’ve created. Google Fusion Tables is also useful for producing quick and detailed maps, especially those where you need to zoom in. It produces visuals with the same high resolution as Google Maps, and it can open a large quantity of data, including up to 100 MB of CSV. Finally, all of the data entered into Google Fusion Tables is stored in Google Drive, so you can easily use it to work on collaborative projects.


4. SQL: Several relational database management systems, including PostgreSQL and MySQL, use Structured Query Language (SQL). SQL is useful when your data is stored in multiple spreadsheets or tables, or when you want to join very large data sets and query them together. It allows you to describe exactly the subset of data you want to extract and the exact changes you want to make, and it allows you to perform these queries across related data sets. You can also save your commands as a script, so you can document everything you’ve done with the data and automatically repeat those steps on a future data set.
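
To make the SQL workflow described in item 4 concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL or MySQL, purely so the example runs without a database server. The table names (companies, contracts), the column names, and every row of data are made up for illustration and do not come from any real data set. The script makes one exact change to the data (trimming stray spaces from names), and the query then joins the two tables and extracts a precise subset.

```python
import sqlite3

# A saved script documents every step, so it can be re-run on a future data set.
# All tables, columns, and rows below are hypothetical illustration data.
SETUP_SCRIPT = """
CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE contracts (company_id INTEGER, amount REAL, year INTEGER);

INSERT INTO companies VALUES (1, 'Alpha Ltd', 'GE');
INSERT INTO companies VALUES (2, 'Beta LLC ', 'GE');
INSERT INTO companies VALUES (3, 'Gamma SA', 'RO');

INSERT INTO contracts VALUES (1, 1000.0, 2013);
INSERT INTO contracts VALUES (1, 2500.0, 2014);
INSERT INTO contracts VALUES (2, 400.0, 2013);
INSERT INTO contracts VALUES (3, 9000.0, 2014);

-- An exact, documented change: strip stray whitespace from company names.
UPDATE companies SET name = TRIM(name);
"""

# Join the two related tables and keep only companies from one country,
# summing each company's contract amounts.
TOTALS_QUERY = """
SELECT c.name, SUM(k.amount) AS total
FROM companies AS c
JOIN contracts AS k ON k.company_id = c.id
WHERE c.country = ?
GROUP BY c.name
ORDER BY total DESC;
"""


def totals_by_company(country):
    """Return (company name, total contract amount) rows for one country."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.executescript(SETUP_SCRIPT)
        return conn.execute(TOTALS_QUERY, (country,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for name, total in totals_by_company("GE"):
        print(name, total)
```

The same SELECT statement should work largely unchanged in PostgreSQL or MySQL, although the way parameters are passed (the ? placeholder here) depends on the database driver you use.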


Since these are just a few of the tools that help data-driven journalists and activists gather, store, analyze, and visualize their data, we will continue this discussion of useful tools by hosting a screening of the other two films in Tactical Tech’s series: From My Point of View, which covers DIY investigations of war in Syria, Lebanon and Palestine, and Unseen War, which explores the physical, moral, and political invisibility of US drone warfare in Pakistan. Join us on Thursday 6 March to participate in the discussion.