The dangers of information gathering

Data is a powerful thing and has the potential to be misused


By Jonah Dratfield, Collegian Columnist

Socrates claimed that “to know thyself is the beginning of wisdom.” This quotation seems innocuous at first, but its complexity becomes apparent on deeper reflection. Self-knowledge does not mean knowing your optimum exercise routine or favorite coffee brand; it means recognizing your flaws. This is something most people will avoid at all costs. It is far easier for people to lie to themselves about what they are like and never compare those lies to their actions.

The book “Dataclysm” by Christian Rudder, a founder of the dating website OkCupid, is a fascinating exploration of this problem. In the book, Rudder discusses what the data garnered from OkCupid has shown him about people’s true preferences and tendencies. In other words, the book is about how people actually act, not about how they believe they act or believe they should act. Topics include what types of people tend to get messaged most online, how people tend to describe themselves in their dating profiles and the types of messages people tend to send (or write, then delete before sending). While these tidbits are interesting enough on their own, Rudder does not shy away from putting them in more politically fraught contexts. Not only does he discuss what types of people get messaged most, he discusses which races get messaged most. Not only does he discuss how people are most likely to describe themselves in their bios, he discusses how people of different races are most likely to do so. He also explores how people of different genders and sexual orientations seem to value different things in partners.

The information revealed in the book is taboo, but it is also compelling. Whatever one thinks of the project itself, it is hard not to be a bit curious about Rudder’s work. Nor can one dismiss it. One can have qualms about Rudder’s methods of data collection or his interpretations of the data, but the fact remains that his data is not manufactured. These are not his beliefs; they are his statistics. This leads to an important question: What exactly should one do with this information?

Consider the book’s somewhat humorous finding that, statistically speaking, the Scottish band Belle & Sebastian is the whitest band on the internet. How exactly should one parse the social significance of this finding? It means something, but what exactly? It is unlikely that white people are genetically predisposed to enjoy the music of Belle & Sebastian, yet the fact that Belle & Sebastian is the most distinctly white band on the internet is probably not random.

In this instance, the revelation is not of any particular importance, but in others it is. For example, Black users appear to have a much harder time getting matches and successfully communicating with people on OkCupid than users of other races. This is an important finding because it likely reveals something about implicit bias. Yet it is easy to see how the information in the book could be used to justify racism. One could claim that Black users have a harder time on the site not because there is implicit bias in the OkCupid user base, but because Black users are inherently less worthwhile than other users. The fact that this interpretation is both prejudiced and unscientific will not prevent racist communities and organizations from using it. This raises another difficult question: Is it possible to study differences between social groups without inadvertently justifying prejudice?

Social science is fraught with complex questions of this sort. When one studies data that pertains to large groups of people, one discerns real information. But data on its own says little about what results from culture versus biology, and little about the lives and preferences of individuals. In spite of this, such data should not be dismissed. Knowledge of certain trends has the potential to help solve institutional problems.

But how far should one go in pursuing this type of knowledge? Should one study IQ differences between racial and social groups, as political scientist Charles Murray has? Should one study the best ways to manipulate people, as psychologist Robert Cialdini has? Studies of this nature have, or appear to have, little humanitarian use and enormous potential for abuse.

Murray’s studies have been used to claim that certain racial groups are inherently more intelligent than others, though Murray maintains that such claims come from people projecting their own prejudices onto his work. Cialdini’s studies can be (and probably have been) used to persuade people to act against their best interests, though Cialdini has explicitly warned readers against using his work in this way.

Yet, while the justifications for these studies are often flimsy (particularly in the case of Murray), I cannot condemn them. As a person who believes in the transformative capacity of science, I cannot flatly argue that any topic should be off limits. Furthermore, researchers and scientists need the freedom to study topics that do not appear to have any particular relevance. Quantum mechanics was once a purely theoretical field, yet it is now integral to computers and mobile phones. There was no way to predict this evolution. Couldn’t portions of social science function in similar ways?

The truth is that data is both important and dangerous, and that we, as citizens, need to be aware of its power. We need to be conscious of the ways in which we think about the world and the many complexities that our perspectives encompass. So, perhaps Socrates was wrong. Perhaps knowing yourself isn’t the beginning of wisdom. Perhaps knowing how to know yourself is.

Jonah Dratfield is a Collegian columnist and can be reached at [email protected]