Like. Share. Retweet. Repeat.
Every day, we come across an appalling statistic that shakes our faith in mankind. When we are stuck in a rut of recirculated misinformation, things can often appear bleaker than they are. Here are three essential questions that every informed reader should ask when they come across a statistic:
Is correlation being confused with causation?
Between the years 2000 and 2009, per capita cheese consumption in the United States was 95 percent correlated with the number of people who died by becoming tangled in their bedsheets. Within the same decade, the declining divorce rate in Maine was 99 percent correlated with the state’s declining per capita consumption of margarine. Could one, then, safely conclude that eating cheese turns your bedding into an executioner? Or that a fondness for fake butter will likely jeopardize your marriage? I doubt that many would put money on those claims. A basic rule of data science is that correlation does not imply causation: just because A and B are correlated, that doesn’t mean that A caused B. So, while there is no doubt a strong statistical association between Maine’s per capita margarine consumption and its divorce rate, this in no way implies that the latter is a direct result of the former.
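To see how easily this happens, here is a minimal sketch in Python. It assumes nothing about the real cheese or margarine figures: the two series are invented, share only an upward drift over ten years, and have independent random noise. The `pearson` helper and the names `series_a` and `series_b` are my own, chosen for illustration.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(10)       # ten yearly observations

# Two made-up series that both drift upward; their noise is independent,
# and neither series has any influence on the other.
series_a = [30.0 + 0.4 * t + rng.gauss(0, 0.2) for t in years]
series_b = [300.0 + 50.0 * t + rng.gauss(0, 25.0) for t in years]

r = pearson(series_a, series_b)
print(f"correlation between two unrelated trending series: {r:.2f}")
```

Any two quantities that happen to rise or fall steadily over the same decade will tend to score this way, which is why a high correlation on its own says so little.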
This rule may seem intuitive in the previously mentioned scenario, but it is often disregarded by readers when confronted with more down-to-earth issues. A recent study published in the New York Times revealed that in the past 20 years, male tennis players in Grand Slam tournaments have been penalized more often than their female counterparts for committing violations. In the wake of the Serena Williams scandal, many, including journalist Glenn Greenwald, took this study to conclude that “male tennis players are punished at far greater rates for misbehavior.” What he failed to see was that while the study did reveal a correlation between a player’s gender and his or her likelihood of being penalized, nowhere did it directly attribute this discrepancy in penalization to the player’s gender. As journalist Nate Silver pointed out, “[The study] shows that male players are fined more, but that could be because they misbehave more. (Indeed, from watching a fair bit of tennis, the men do misbehave more). This data doesn’t tell us anything about whether they’re punished at greater rates.”
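Silver’s distinction between being penalized more often and being penalized at a greater rate comes down to simple arithmetic. The sketch below uses hypothetical counts that I invented for illustration; they are not taken from the study, and `penalty_rate`, `group_a` and `group_b` are made-up names.

```python
def penalty_rate(penalties, infractions):
    """Share of infractions that actually drew a penalty."""
    return penalties / infractions

# Hypothetical counts, invented purely for illustration.
group_a = {"infractions": 1000, "penalties": 300}
group_b = {"infractions": 400, "penalties": 120}

rate_a = penalty_rate(group_a["penalties"], group_a["infractions"])
rate_b = penalty_rate(group_b["penalties"], group_b["infractions"])

# Group A collects far more penalties in total...
print("penalty counts:", group_a["penalties"], "vs", group_b["penalties"])
# ...yet each infraction is equally likely to be punished in both groups.
print(f"penalty rates: {rate_a:.0%} vs {rate_b:.0%}")
```

A raw count of penalties cannot separate these two stories. To talk about rates, one would also need to know how often each group misbehaved.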
This brings us to the next important question:
Does the study control for confounding factors?
Medical research during the 1990s revealed higher rates of lung cancer among coffee drinkers as compared to those who did not drink coffee. This led many to identify coffee as a cause of lung cancer. Today, this notion is unequivocally rejected. So, why the initial hysteria? Early analysis of trends relating coffee consumption with the occurrence of lung cancer did not account for a key factor: in those days, coffee drinkers were also likely to be smokers, and smoking is an undisputed cause of lung cancer. Once individuals’ smoking habits were taken into account, the correlation between the disease and coffee consumption shrank to insignificance. Here, smoking is called a confounding factor: an unmeasured variable that is linked to both the exposure being studied (coffee drinking) and the outcome (lung cancer), and that creates an apparent association between the two. Reliable studies try their best to measure and control for confounding factors.
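The coffee example can be reproduced with a small simulation. This is only a sketch under assumptions of my own: the probabilities below are invented, not drawn from any medical study, and the simulated world is built so that cancer risk depends on smoking alone while smokers are simply likelier to drink coffee.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
N = 100_000

# Assumed probabilities, invented for illustration only.
P_SMOKER = 0.3
P_COFFEE = {True: 0.8, False: 0.4}    # smokers are likelier to drink coffee
P_CANCER = {True: 0.15, False: 0.01}  # risk depends on smoking alone

people = []
for _ in range(N):
    smoker = rng.random() < P_SMOKER
    coffee = rng.random() < P_COFFEE[smoker]
    cancer = rng.random() < P_CANCER[smoker]  # coffee plays no causal role
    people.append((smoker, coffee, cancer))

def cancer_rate(coffee, smoker=None):
    """Cancer rate by coffee habit, optionally within one smoking group."""
    group = [p for p in people
             if p[1] == coffee and (smoker is None or p[0] == smoker)]
    return sum(p[2] for p in group) / len(group)

# Ignoring smoking, coffee drinkers look much worse off.
raw_coffee = cancer_rate(coffee=True)
raw_no_coffee = cancer_rate(coffee=False)
print(f"all people: coffee {raw_coffee:.3f} vs no coffee {raw_no_coffee:.3f}")

# Comparing like with like, the gap all but disappears.
for smoker in (True, False):
    label = "smokers" if smoker else "non-smokers"
    print(f"{label}: coffee {cancer_rate(True, smoker):.3f} "
          f"vs no coffee {cancer_rate(False, smoker):.3f}")
```

Splitting the data by smoking status, as the loop at the end does, is one simple way of taking a confounder into account. It only works, of course, if the confounder was measured in the first place.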
“Women in the U.S. make 77 cents for every dollar that men make.” Unless you’ve been in hibernation for the past five years, chances are you’ve come across this statistic. Former President Barack Obama, Senator Elizabeth Warren and former Secretary of State Hillary Clinton have all repeated it. The statistic originates from a 2011 report by the U.S. Census Bureau that divided the median earnings of all women working full-time by the median earnings of all men working full-time. However, these calculations don’t take into account confounding factors such as choice of occupation, position, education or hours worked per week. A study by the American Enterprise Institute revealed that men working full-time were 2.3 times more likely than women to work 60+ hour weeks. Moreover, the highest-paying college majors (petroleum engineering, computer science and mathematics) were chosen disproportionately by men, while the lowest-paying majors (psychology, early childhood education and social work) were dominated by women. A follow-up study by the American Association of University Women revealed that when you factor in these choices, the wage gap shrinks to about 6.6 cents. One could argue that these confounding factors, such as hours worked per week and choice of major, are themselves a result of gender biases. Perhaps that is true, but it is also possible that these confounding factors are linked to biological differences between men and women. This is not to say that sexism in the workplace is non-existent, but to attribute the entire pay discrepancy to gender biases would be disingenuous.
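The difference between a raw ratio of medians and a like-for-like comparison can be sketched in a few lines. The records below are invented and do not represent Census Bureau or any other real earnings data; the groups “A” and “B”, the hours and the hourly wages are all hypothetical, chosen so that part of the gap comes from hours worked and part does not.

```python
from statistics import median

# Invented records: (group, weekly_hours, hourly_wage). Not real data.
records = [
    ("A", 40, 25.0), ("A", 40, 25.0), ("A", 45, 25.0),
    ("A", 60, 25.0), ("A", 60, 25.0),
    ("B", 40, 23.5), ("B", 40, 23.5), ("B", 40, 23.5),
    ("B", 45, 23.5), ("B", 60, 23.5),
]

def median_ratio(rows):
    """Median weekly earnings of group B divided by those of group A."""
    a = median(h * w for g, h, w in rows if g == "A")
    b = median(h * w for g, h, w in rows if g == "B")
    return b / a

# Raw comparison: everyone in each group, regardless of hours worked.
raw_ratio = median_ratio(records)

# Like-for-like comparison: only people working a 40-hour week.
same_hours = [row for row in records if row[1] == 40]
adjusted_ratio = median_ratio(same_hours)

print(f"raw ratio of medians:       {raw_ratio:.2f}")
print(f"ratio among 40-hour weeks:  {adjusted_ratio:.2f}")
```

In this toy data set, the gap narrows once hours are held fixed but does not vanish. A real analysis would have to adjust for several factors at once, typically with a regression rather than a single filter.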
What was the research methodology?
Back in June, the Thomson Reuters Foundation listed the 10 most dangerous countries for women in the world, ranking India at the top and the United States at number 10. I’ll be the first to admit that India has a long way to go when it comes to women’s equality, but the assertion that India ranks worse on oppressive cultural traditions than Somalia, where Sharia law is applicable, and Saudi Arabia, where the guardianship system still exists, didn’t sit right with me. The survey results, published in nearly all major media outlets, sparked widespread outrage regarding women’s safety. It was only Christina Hoff Sommers, however, who thought to question the methodology of the survey, revealing that it was based entirely on perception and not on data. When asked whose perception, Reuters absurdly responded, “We gave an assurance to the experts that their answers would be confidential to allow total honesty.” For an esteemed organization like Reuters to publish such unscientific findings with a blatant lack of transparency is simply deplorable.
These three questions alone cannot come close to accounting for all the nuances in data analysis. However, as the conversation surrounding fake news gains momentum, they are certainly a place to start.
Bhavya Pant is a Collegian columnist and can be reached at [email protected].