by Elizabeth J. Kopras

The history of ethical oversight in research

Scientists are humans. One negative aspect of being human is that we can justify some horrific behaviors. Therefore, scientists who want to use people as research subjects must study the cautionary tales of scientists who went before us. We undergo formal training to learn about scientists who hurt people, so we can be aware of the dangers of unmonitored human subjects research. We are now required to adhere to the Common Rule, which includes three core principles of ethics: justice, beneficence, and respect for the person. Justice means ensuring that reasonable, non-exploitative, and well-considered procedures are administered fairly, and that the costs and benefits of research are distributed equitably. Beneficence is the philosophy of “do no harm,” with the goal of maximizing research benefits while minimizing risks to the research subjects. Respect for the person includes protecting the autonomy of subjects, treating them with courtesy and respect, and obtaining their informed consent before including them as research subjects.

One critical case of unethical scientific research is the Tuskegee syphilis study, which led to the ethical rules we use today. In 1932, scientists offered rural African American men free government health care, but they were really conducting a natural history study of syphilis. These scientists, some of whom were African American clinicians from the Tuskegee Institute, failed to tell the participants that they had syphilis, and did not attempt to treat the participants who had the disease. As early as 1936, the study was criticized for not trying to treat the participants. Sadly, not only did the doctors deny treatment, they also withheld information about penicillin when it became the standard of care in 1947. The surviving patients were not cured of their syphilis until 1973, almost three decades after a cure was available.

This scientific malfeasance resulted in the creation of the U.S. National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research in 1974. Human research [1] is now carefully reviewed and approved by an Institutional Review Board, or IRB. All procedures, interventions, surveys, or data collection for research purposes must meet certain criteria and ethical guidelines before scientists can proceed with a study.

The IRB process has been successful at limiting ethical breaches by modern scientists. However, this process has not kept pace with changes in technology, communications, and data collection. There are no clear guidelines on using social media platforms for research. There is no consensus on who owns medical records. Creating large data sets might advance scientific knowledge, but at the expense of patient privacy. Thus, it is important for data managers and software engineers to understand the scientific ethics that guide human research and to be aware of the unethical treatment experienced by vulnerable groups, such as African Americans. Historical transgressions such as the Tuskegee syphilis study might lead these groups to have additional concerns about participating in research, concerns that must be addressed by technologists and scientists.

Social networking sites in deceptive research

Facebook is one of the most popular social networking sites, with an estimated 1.4 billion people using the site on a near-daily basis. PubMed, the U.S. National Library of Medicine repository of publications, shows that the first research papers referencing Facebook appeared in 2008, with almost 500 publications in 2017. Applications such as Facebook are very attractive to scientists as research mechanisms. The ease of recruitment, low cost, and large study population make it a desirable platform for hypothesis testing. However, Facebook’s privacy and use policies have created a system where humans can be manipulated and unwittingly used as research subjects. In a 2014 study, Kramer et al. manipulated the Facebook feeds of 689,003 social network users to see if user emotions could be affected by what their friends posted. The scientists altered the Facebook News Feed algorithms to reduce either the positive or the negative emotional content that users saw. In other words, if you had been randomly sorted into the “positivity reduced” group, posts from your favorite cousin containing positive words would not have shown up in your Top News feed during the weeks of the study, without your knowledge or consent!

Several scientists stated concerns about the use of Facebook data, specifically because manipulating the Top News feed was an interventional study that should have required informed consent from the participants. Inder M. Verma, the Editor-in-Chief of the journal that had published the research, found it necessary to address these concerns. First, he claimed that Facebook users were not deceived when their feed was altered because the posts were still available and could be looked up individually. Furthermore, he concluded:

“…as a private company Facebook was under no obligation to conform to the provisions of the Common Rule when it collected the data used by the authors, and the Common Rule does not preclude their use of the data…It is nevertheless a matter of concern that the collection of the data by Facebook may have involved practices that were not fully consistent with the principles of obtaining informed consent and allowing participants to opt out.”

Remember when I said that scientists are humans, and humans are good at justifying behaviors that are not ethical? The manipulation of Facebook News Feed was *legal* due to user agreements. To quote a Magnolia Mountain lyric, “just because it’s legal doesn’t mean it isn’t wrong.”

How did Facebook respond to criticisms that this type of research is not ethical? Facebook established an internal IRB, but there are concerns about the ability of an organization to monitor its own ethics internally. Facebook continues to rely on user agreements and its data policy rather than informing users that they are participating in research studies, and Facebook Research continues to collect and analyze user data. While this provides interesting scientific findings and insights into human behavior, the users remain unwitting research subjects.

It should be noted that the Facebook research described above could have been ethically performed without consenting participants. Field experiments in which people are not told that they are in a study can be approved by an IRB under certain conditions. Scientists must show that the research presents no more than minimal risk, that it will not infringe on the subjects’ rights or welfare, that subjects aren’t being deceived, and that the experiment couldn’t be conducted if the people knew that they were in an experiment.

Deceptive research in online dating

The dating website OkCupid makes no apologies for manipulating results on its site for the joy of understanding humans. In one study, OkCupid lied about compatibility to see if its matching algorithms were valid. These algorithms tell users how well they ‘match’ with another user. OkCupid lied about the matching, telling users who were actually poor matches that they were very good matches. The outcome variable was the probability that an online encounter between these two users would result in a conversation. Users who were very good matches and were told that they were very good matches turned an encounter into a conversation 20% of the time. Users who were poor matches and were told that they were poor matches had conversations only 10% of the time. But poor matches who were deceived into believing that they were very good matches turned encounters into conversations 17% of the time, almost as often as the actual very good matches. While this study is interesting (and hopefully led to the development of better matching algorithms), the deception would have prevented it from being approved by an IRB because it could have caused harm to users.
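
To put "almost as often" into numbers, here is a minimal Python sketch that works only from the three rounded percentages quoted above. The function and variable names are made up for illustration; this is not OkCupid's analysis, and the underlying sample sizes are not given here, so nothing about statistical significance can be inferred from it.

```python
# Conversation rates in percent, as quoted in the text (rounded figures).
# Keys are (actual match quality, match quality shown to the users).
rates = {
    ("good", "good"): 20,  # good matches told the truth
    ("poor", "poor"): 10,  # poor matches told the truth
    ("poor", "good"): 17,  # poor matches told they were good matches
}


def label_effect(rates):
    """Summarize how much the displayed label alone changed behavior."""
    honest_good = rates[("good", "good")]
    honest_poor = rates[("poor", "poor")]
    deceived = rates[("poor", "good")]
    return {
        # Extra conversations per 100 encounters caused by the false label.
        "absolute_lift_points": deceived - honest_poor,
        # Deceived poor matches relative to honestly labeled poor matches.
        "relative_lift": deceived / honest_poor,
        # Share of the gap between poor and good matches closed by the label.
        "share_of_gap_closed": (deceived - honest_poor) / (honest_good - honest_poor),
    }


summary = label_effect(rates)
print(summary)
```

With these figures, the false label adds 7 percentage points, makes a conversation 1.7 times as likely, and closes about 70% of the gap between poor and good matches. That is why the deception itself, and not only the data collection, is the ethical concern.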

Like Facebook, OkCupid has a privacy policy that informs users of data collection and analysis. While it should be clear to anyone using these services that the companies have control of the services they offer, few users read and understand these privacy statements. To illustrate this point, Obar and Oeldorf-Hirsch recruited 543 people to join a fictitious social networking service. The privacy policy was skipped by 74% of the subjects, and 98% of the people who read the privacy policy agreed to share data with the NSA and employers and to provide their first-born child as payment for services!

A call for transparency in online research

Data managers, analysts, and software designers are not formally trained in the ethical conduct of human subjects research, but they are a required part of the research team when dealing with large datasets. Therefore, they should be aware of the ethical dilemmas that could occur when studying humans. Companies rely on privacy agreements and user policies to legally collect data and manipulate users.

How can a data scientist know if the research on a social networking site is being conducted in an ethical manner? Research on humans should be reviewed by an IRB and have a protocol number or approval letter to show this. As an example, a trial evaluating the use of a Facebook group to increase physical activity was published by Looyestyn et al. The study references the University of South Australia Human Research Ethics Committee (protocol number: 0000033766) and was registered with the Australian and New Zealand Clinical Trials Registry (protocol number: ACTRN12616001500448). Participants also provided informed consent online before commencing the study. IRB and clinical trial registry information should be readily available to the research team. If a company relies on the legality of data ownership to justify data use and research, the data scientist should evaluate the project with regard to the Common Rule and the ethical principles of justice, beneficence, and respect for the person.
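
A data team could turn the check described above into a small pre-analysis step. The Python sketch below is one hypothetical way to do that: the `StudyRecord` fields and the `missing_ethics_evidence` function are invented for illustration and do not correspond to any standard schema. Treating trial registration as always required is a simplifying assumption, since not every study is a clinical trial.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StudyRecord:
    """Hypothetical metadata a research team might keep for each study."""
    title: str
    ethics_committee: Optional[str] = None
    ethics_protocol_number: Optional[str] = None
    trial_registry_id: Optional[str] = None
    informed_consent_obtained: bool = False


def missing_ethics_evidence(study: StudyRecord) -> List[str]:
    """Return the ethics documentation that is absent from the record."""
    missing = []
    if not study.ethics_committee:
        missing.append("reviewing IRB or ethics committee")
    if not study.ethics_protocol_number:
        missing.append("ethics protocol number or approval letter")
    if not study.trial_registry_id:
        missing.append("clinical trial registry ID")
    if not study.informed_consent_obtained:
        missing.append("informed consent")
    return missing


# Example using the details reported for the Looyestyn et al. trial.
documented = StudyRecord(
    title="Facebook group physical activity trial (Looyestyn et al.)",
    ethics_committee="University of South Australia Human Research Ethics Committee",
    ethics_protocol_number="0000033766",
    trial_registry_id="ACTRN12616001500448",
    informed_consent_obtained=True,
)
undocumented = StudyRecord(title="Analysis justified only by a user agreement")

print(missing_ethics_evidence(documented))    # nothing missing
print(missing_ethics_evidence(undocumented))  # all four items missing
```

An empty list does not prove that a study is ethical. It only shows that the documentation a reviewer would ask for exists. A non-empty list is a prompt to ask questions before touching the data.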

Elizabeth Kopras is a science navigator at the University of Cincinnati, where she helps early-career scientists formulate testable hypotheses and pragmatic research projects. When she’s not putting the ‘re’ in research or giving the truth scope, she enjoys jousting and fighting with her family in the SCA.

  1. This is true for research that happens at academic institutions. Marketing research and commercial platforms avoid this process, as discussed below.