Author: Nicole Olynk Widmar, Interim Department Head and Professor, Purdue University, Department of Agricultural Economics
As the saying goes, “Liar, liar, pants on fire!” Except your pants never burst into flames – at least not in response to you lying – so it turns out that was a lie too!
In 2020, we talked about how you were a hypocrite. And, you absolutely still are (sorry to be the one to tell you). In 2021, we talked about your YouTube-worthy temper tantrums and why they need to stop (you’re embarrassing yourself). Then, in 2022, we called you a liar. The post was actually titled “Well, You’re Also a Liar,” implying that it isn’t just you; it’s us too. Human behavior is a complicated thing.
Surveys are valuable because we can design questions about specific topics, but survey data has challenges – from ensuring a sample size adequate for the desired analyses to worrying about response bias, enumerator bias, and a whole slew of other biases that arise when we ask questions of people who know their answers are being received (even if anonymously) by other people for analysis.
Online and social media data is potentially, at least in my opinion, the new frontier of data. It exists in a variety of forms, from talking on Twitter about holiday plans to images and even data coming from smart devices in your home. Social media data has been analyzed to explore public understanding of public health crises, like Zika virus, and to question whether natural disasters with more social and online media posts receive more aid or funding (answer: they do not). But social media has its challenges too. You post only your best life on Twitter (now renamed to X for reasons we haven’t yet identified with any data source), you might say things that are not accurate, and not everyone is represented. In short, there are challenges for any dataset.
However, there is an online dataset that knows more about us than social media could ever hope to. It’s that little bar into which we (apparently) type our deepest secrets, even the really, really unflattering stuff – the kind that would (and should) make other people recoil if they could link it back to us. It’s the Google search bar. And, the resulting Google search data.
Seth Stephens-Davidowitz’s book “Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are” delves into a variety of topics using Google search data. People tell Google things they might not tell anyone else: not their friends, their spouse, or their doctor. Seth also posits – and I think I agree with him – that search data may reveal lies we tell even ourselves. So, it isn’t just that we lie to others; we lie to ourselves too. Considering how much better Google search data reflects reality in the topics explored in the book compared to survey data or other more traditional data sources, it seems that we’re not as prone to lying to Google (or at least not nearly as much).
“Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are” is presented in three parts. Part I is entitled “Data, Big and Small”; Part II, “The Powers of Big Data”; and Part III, “Big Data: Handle with Care.” The Conclusion is aptly titled “How Many People Finish Books” and suggests the answer is not very many. But the Conclusion is well done, in my opinion, and worth the read.
Chapter 8, “Mo Data, Mo Problems? What We Shouldn’t Do,” starts with, “Sometimes, the power of Big Data is so impressive it’s scary. It raises ethical questions.” The chapter argues that there is danger in corporations, empowered by big data, understanding who is likely to do what or learning what customers can or will pay. Seth says on page 265, “Data on the internet, in other words, can tell businesses which customers to avoid and which they can exploit. It can also tell customers the businesses they should avoid and who is trying to exploit them. Big Data to date has helped both sides in the struggle between consumers and corporations. We have to make sure it remains a fair fight.” This issue of the “fair fight” was the original inspiration behind the recent post All Is Fair, So Long as It’s In My Favor. I suspect that the question of use and what is fair for whom will only grow more complicated as the possibilities within these datasets continue to expand.
Just because everyone does it, that doesn’t make it a good or right thing to do. Recall, Why Does Everybody Knowing Something Make it Right? The answer: It doesn’t. There are many reasons we lie or fail to share our true selves, but also many reasons to work against that urge to cover up or gloss over our less-than-admirable behaviors, aspects, or even flaws. Recall from last week’s Authenticity over Plastic Perfection, “If genuine connections are what you’re after, embrace your originality, own the cringe, and just be yourself. Because, in the end, that authenticity is the superpower that sets you apart.”