Penn Psychologists Tap Big Data, Twitter to Analyze Accuracy of Stereotypes

What’s in a tweet? People draw conclusions about us, from our gender to our education level, based on the words we use on social media. Researchers from the University of Pennsylvania, along with colleagues from the Technical University of Darmstadt and the University of Melbourne, have now analyzed the accuracy of those inferences. Their work revealed that stereotypes and the truth often aligned, with people making accurate judgments more than two-thirds of the time, but inaccurate characterizations still showed up.

They published their research findings in the journal Social Psychological and Personality Science.

Using publicly available tweets, lead researchers Daniel Preotiuc-Pietro, a postdoctoral researcher in Penn’s Positive Psychology Center, and Jordan Carpenter, a former Penn postdoc now at Duke University, aimed to show where stereotyping went from “plausible” to wrong.

In four studies, more than 3,000 participants were asked to categorize the gender, age, education level or political orientation of more than 6,000 tweeters based solely on the words they used in 20 public posts on Twitter. The researchers then turned to natural language processing, a subfield of artificial intelligence, to analyze and isolate the stereotypes.
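The “analyze and isolate” step lends itself to a brief illustration. Below is a minimal sketch, not the authors’ code and using made-up example tweets and labels, of one common natural-language-processing approach: fit a bag-of-words classifier to predict the label that raters assigned to each account, then read off the most heavily weighted words as a rough picture of the lexical stereotype.

```python
# Minimal sketch (hypothetical data, not the study's actual pipeline):
# surface the words most predictive of participants' *guesses* about a writer,
# which is one way to make a lexical stereotype explicit.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# One entry per account: its tweets concatenated, plus the label most raters assigned.
tweets = [
    "just pushed a fix to the server config, time for coffee",
    "brunch with my girls, love this sunny weather",
    "new graphics card finally arrived, running benchmarks tonight",
    "finished another novel this weekend, highly recommend it",
]
perceived_label = ["male", "female", "male", "female"]  # raters' guesses (made up)

# Bag-of-words features: each column counts how often a word appears.
vectorizer = CountVectorizer(lowercase=True)
X = vectorizer.fit_transform(tweets)

# Logistic regression learns which words push raters toward each label.
model = LogisticRegression()
model.fit(X, perceived_label)

# The largest-magnitude coefficients approximate the lexical stereotype:
# the words that most strongly drive the perceived category.
words = vectorizer.get_feature_names_out()
weights = model.coef_[0]
top = sorted(zip(words, weights), key=lambda wv: abs(wv[1]), reverse=True)[:10]
for word, weight in top:
    print(f"{word:>12s}  {weight:+.2f}")
```

With real data, the same weights can be compared against the writers’ actual categories to see where the perceived associations diverge from reality, which is the contrast the study examines.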

They learned that, on average, participants judged accurately 68 percent of the time. For gender, which looked at male versus female, people were right 76 percent of the time. For age, which considered younger or older than 24, they guessed accurately 69 percent of the time. And for political orientation, liberal versus conservative, participants were accurate 82 percent of the time.

Education was slightly different; with three choices — no bachelor’s degree, bachelor’s degree, advanced degree — the accuracy percentage came in much lower, at 46 percent. It was also the only category that saw participants do worse than chance at making the correct identification.

Not only that, but erroneous judgments often exaggerated persistent stereotypes.

“Inaccurate stereotypes tended to be exaggerated rather than backwards,” said Carpenter, the paper’s lead author. “For instance, people had a decent idea that someone who didn’t go to college was more likely to swear than someone with a Ph.D., but they thought Ph.D.s never swear, which is untrue.”

By focusing on stereotype inaccuracies, the researchers also revealed how multiple stereotypes affect one another.

“One of our most interesting findings is the fact that when people had a hard time determining someone’s political orientation, they seemed to revert, unhelpfully, to gender stereotypes, assuming feminine-sounding people were liberal and masculine-sounding people were conservative,” Carpenter said.

The data also showed people assumed that technology-related language indicated a male writer. In the study, men did post about technology more than women, Carpenter said. “However, this stereotype strongly led to false conclusions,” he added. “Almost every woman who posted about technology was inaccurately believed to be a man.”

This work reverses the methodology of much previous stereotype research, Preotiuc-Pietro said. Instead of starting with a specific group and asking study participants to identify behaviors associated with that group, the researchers began with a set of behaviors and asked for the group identity of the person who did them.

“We also considered stereotypes as a lexical ‘web,’” Preotiuc-Pietro said. In other words, “the words we associate with a group are themselves our stereotype of that group.”

Taking this approach allowed the team to use language analysis to illuminate stereotypes without ever asking participants to explicitly endorse such labels. “People often resist openly stating their stereotypes, either because they want to present themselves as unbiased or because they’re not consciously aware of all the stereotypes they use,” said Carpenter.

The approach contributed to both computer science and psychology, and it also points to a way of improving the conclusions people draw about one another.

“The important next step is making people aware of the inaccuracy of these stereotypes and why they lead to bad conclusions,” Preotiuc-Pietro said. “If we can educate people about the ways these beliefs can steer them wrong, it will make people more socially accurate both online and off.”

Funding for the research came from the Templeton Religion Trust.