by CJ Joulain

Speaking on the intersection of tech and journalism in 2013, Miranda Mulligan commented that “technologists are winning at media technology innovation, but they do not understand journalism and … most journalists barely understand how the Internet works, let alone how to get the most out of storytelling on the web.” Since 2013, journalism has made great strides to incorporate software development (and open-source tools in particular) into more and more newsrooms.

This media fusion has been aided by various projects intended to bring together writers, storytellers, and tech innovators. For the past five years, the Knight Foundation and Mozilla have sponsored fellowships for those who “love to code and want to influence the future of journalism on the web.” The fellows have been placed not only within traditional print newsrooms but also at NPR, Frontline, and even cross-media collaborations like the Coral Project. The Nieman Foundation awards annual fellowships intended for programmers, designers, academics, and media researchers. Journalism schools across the U.S. are equipping their students to meet a changing media landscape, including teaching students to write headlines with SEO in mind.

Adapting to the need for more tech in newsrooms, virtually all major news companies offer tech-related internships. Independent news outlets are also following this trend, as evidenced by Democracy Now’s IT internship specializing in sysadmin tasks. On the flip side, tech companies are now getting into the reporting fold. Google, for example, offers students a summer fellowship in the Google News Lab, which sponsors an opportunity to “research and write stories, contribute to open source data programs, and create timely data to accurately frame public debates about issues in the US and the world.” Outlets like the Chicago Reporter are providing an immersive user experience: in one data visualization project related to police lawsuits, the module ends by encouraging the reader to search through individual database entries on the issue. Although tech and journalism have made impressive progress in a short amount of time, several of Mulligan’s points still hold true.

Technologists have the ability to craft powerful tools but often lack the foresight to know how users will react to and possibly abuse their product. More than that, once the consequences are known, technologists have hesitated to curb their effects. While some companies, such as GitHub, have actively worked to make customer and community safety a priority, the issue of online harassment (particularly aimed at marginalized communities) still looms large in the sphere of social media. Ideally, journalism promotes an engaged readership through its well-researched and fact-checked coverage of current events. One of the biggest news stories heading into 2017 is the lack of critical discourse within the tech media sphere. For many, social media has supplanted newspapers as a primary source of news. More and more, as Brian Edwards-Tiekert has argued, “new media [technologies] organize us into national audiences defined by what we believe—fueling the rise of the so-called ‘filter bubble.’” The algorithms utilized by platforms such as Facebook are effective at curating exactly what viewers want to see, but it’s largely revenues, and not public interest, that determine how information is filtered. As the 2016 U.S. presidential election has taught us, the difficulty of parsing legitimate “news” from propaganda has severe political consequences. In many respects, though tech has defined and innovated the media world, it lacks journalism’s capacity to act as an independent and critical monitor of power.

In recent years, reporters have utilized data visualization in particular as a way to augment the field’s traditional forms of storytelling. Job positions that did not exist within newsrooms five years ago are now becoming essential to the field. News application developers, for instance, not only parse the data needed for research but also tie together long-form storytelling with various forms of multimedia.

Lucio Villa, an interactive producer with the San Francisco Chronicle, commented: “Journalists today are being trained how to effectively use photos, video, and social media. But I think the next generation will be trained on how to cull and visualize data, especially in investigative journalism.” Villa points to the influence of photography on the web development work he does today. He states: “When people are viewing one of my photos, I want them to look at it in a certain way and I feel that can apply to data visualization. I want people to interpret data a certain way and, for example, I might want to focus on a specific number (by using bright colors to highlight its importance). This tells them: ‘hey, this is an important fact and I want you to care about it.’ Photography has helped me to observe and analyze the user experience. Back when I was a photojournalist, I would assume every article needed a photograph. But now I believe every story should be treated differently. I’ve learned that not every story needs an article, video or chart. It’s up to the journalist to figure out what format works best to tell a story.”

This quote shows how data visualization is also re-fashioning journalism. It’s not only an aid to the story; it can often function as the centerpiece. More than that, it’s becoming another medium (like photography or video) through which reporters communicate their message. Villa also spoke about the types of tools popular amongst journalists and developers. He pointed to the ubiquity of the open-source JavaScript library D3 in newsrooms. D3 is especially known for its ability to transform data into compelling charts, shapes, animations, and maps. As with jQuery, it allows full access to the DOM. The increasing number of tutorials available for D3 gives interested journalists several starting points. While it has been embraced by newsrooms, D3 is also popular amongst data scientists and anyone who uses data to present a narrative.
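By way of illustration, here is a minimal sketch (in TypeScript) of the pattern D3 is known for: selecting part of the page, binding an array of data to DOM elements, and using color to draw the reader’s eye to one value, much as Villa describes doing with photographs. The element ID, data values, and highlight rule below are hypothetical, not drawn from any newsroom’s actual code.

```typescript
// A minimal, hypothetical D3 sketch: bind data to DOM elements and
// highlight one value so the reader knows it matters.
import * as d3 from "d3";

// Hypothetical counts (say, complaints per district).
const counts = [12, 34, 7, 52, 19];
const highlight = Math.max(...counts); // the number we want readers to notice

d3.select("#chart")                 // jQuery-like access to the DOM
  .selectAll("div")
  .data(counts)                     // bind one <div> per data point
  .enter()
  .append("div")
  .style("width", (d) => `${d * 5}px`)
  .style("background", (d) => (d === highlight ? "crimson" : "steelblue"))
  .text((d) => String(d));
```

Fuller charts, maps, and animations build on this same data join; only the shapes and scales change.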

Mike Bostock, one of D3’s co-creators, helped revitalize the use of data visualizations while working as a web developer at the New York Times. This news organization is widely considered to be leading the journalistic charge in the adoption of multimedia platforms and boasts an impressive array of APIs. For instance, early in 2016, the New York Times rolled out a VR application, and, with its implementation of hls.js, now also supports 360-degree video.
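As a rough sketch of what an hls.js integration involves (the element ID and manifest URL below are placeholders, and this is not the Times’ actual code), the library feeds an HLS stream into a standard video element; a 360-degree experience then layers projection and drag-to-look controls on top of that playback.

```typescript
// Hypothetical sketch: play an HLS stream in the browser with hls.js.
// A 360-degree viewer would be layered on top of this playback.
import Hls from "hls.js";

const video = document.querySelector<HTMLVideoElement>("#player");
const manifest = "https://example.com/video/master.m3u8"; // placeholder URL

if (video) {
  if (Hls.isSupported()) {
    // Most browsers: hls.js parses the manifest and streams it via Media Source Extensions.
    const hls = new Hls();
    hls.loadSource(manifest);
    hls.attachMedia(video);
  } else if (video.canPlayType("application/vnd.apple.mpegurl")) {
    // Safari plays HLS natively, no library needed.
    video.src = manifest;
  }
}
```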

Bostock is a Ph.D. alumnus of the Stanford Visualization Group, where he studied under the tutelage of noted computer science professor Jeffrey Heer. The group’s research1 contributed to the development of D3, Protovis, Tableau, and Data Wrangler. Academic communities have long been part of a movement to make data more accessible. Among the leading centers for cross-media collaboration is the Stanford Computational Journalism Lab, which sponsors an annual conference on how data and algorithms can craft better and more engaging narratives, with a particular emphasis on investigative reporting. Those involved in the Stanford Computational Journalism Lab are also applying a critical lens to technology and the user experience, as evidenced by James Hamilton and Fiona Morgan’s forthcoming book on the digital lives of low-income folks. One challenge that remains is how to bring academic scholarship on these issues into the mainstream.

Lucio Villa credits Python as a preferred language for data journalism, perhaps because of its readability and supportive user community. The Python framework Django was itself created at the Kansas newspaper the Lawrence Journal-World, which may also contribute to Python’s popularity amongst news developers. Data journalists were involved in the development of the Python package Agate, which has been likened to a smaller, simpler version of Pandas.2 This is a fascinating trend: journalists aren’t only reporting the news; they are also participating in the development of open-source storytelling tools and sharing them with a wider public.3

Open-source communities don’t only make projects more accessible and participatory; they also bring attention to social issues historically reserved for reporters and activists. One key issue many groups are tackling is police brutality and the social movements pushing for reform and accountability. An example of this is Project Comport, a collaboration on public data between Code for America, community organizers, and city officials in Indianapolis. The information compiled includes use of police force, officer-involved shootings, and complaints against officers. Whether or not this improves police-community relations remains to be seen, though it seems clear that accountability to the public guides the spirit of the project. A beta version was released in Baltimore earlier this fall, and there are plans for Project Comport to roll out similar projects in other U.S. cities.

While the value of socially conscious data work is apparent, tech-savvy activists are often shouldering research and work that should be coming from the civic sphere. The 2015 and 2016 participants of the White House LGBTQ Initiative for Tech Innovation included criminal justice reform as part of their portfolio of open-source projects. Many of the participants in the Tech Innovation program are engineers who have the skillset to build data-analysis tools but who are also informed about and invested in the topics chosen at the summit. These topics include mental health, inclusive workplaces, the environment, and more. Courtney Wilburn, a White House fellow for both years, mentioned that the 2015 Criminal Justice Reform team focused on information specific to arrests and traffic stops, whereas the 2016 iteration emphasized the entire cycle of incarceration. The information referenced in their repository is drawn from the Public Safety Data Portal, which provides documentation related to body camera metadata and use of force, among other categories. While technologists can certainly aid in making datasets more legible and user-friendly, the platforms they offer are not in themselves panaceas for the pressing social issues of the day. In media and beyond, several key questions remain: Who will decide which issues matter? Will public policy be shaped for the better by tech? How will tech be regulated? And how can different communities collaborate to create data for the public good?

References:

  • Cohen, S. “Investigative Reporting in the Age of Data Science.” Computation and Journalism Conference, 30 September 2016, Stanford University, Stanford, California. Keynote Address.
  • Villa, L. (2016, December 26). Phone interview.
  • Wilburn, C. (2016, December 27). Personal interview.

CJ lives in Oakland and is a co-organizer with Techqueria, a peer mentorship network for underrepresented groups in tech.


  1. The Stanford Visualization Group no longer exists; its staff and program recently moved to the University of Washington, where they continue as the Interactive Data Lab.
  2. Pandas is an open-source Python library that builds on NumPy, adding labels and multi-level indices. It also offers functionality typical of relational database operations (such as complex joins).
  3. In terms of media platforms that incorporate data beyond news stories, the non-profit investigative journalism outlet ProPublica provides many datasets on a variety of issues for free.