While brainstorming final projects for CS599: Data Science for Social Systems, a new data science course that debuted at USC this semester, Madhu Venkatesh and her four teammates landed on a timely idea. How do social movements such as #MeToo affect public perception of some of the world’s biggest companies—and could these perceptions indirectly impact stock prices?
To find out, the all-female team tapped into Twitter, mining public sentiment about five global companies and comparing this data with the brands’ stock market prices.
“We wanted to use Twitter to mine data about social concerns, like the gender pay gap and harassment, and see how those issues impacted stock prices,” said Venkatesh, a USC master’s student in computer science.
The team was among 20 groups of master’s and PhD students who presented their research to peers, professors and industry members at an end-of-semester poster presentation and recruitment session at the USC Michelson Center for Convergent Bioscience on Nov. 30.
Other student projects included:
- Infection Estimation in Social Networks: Proposed algorithms to map the spread of the flu and other possible epidemics in the LA area.
- Yelp Help: Studied the ingredients of success of young businesses in LA and proposed tips to create successful businesses.
- Personality Prediction and Behavior Analysis on Reddit: Analyzed psychological traits of Reddit users and linked this information to preferences and behaviors.
- Cyber Harassment Identification and Automatic Therapeutic Bot Response: Developed automated detection of bullying and harassment to help prevent incidents by flagging them early.
The course, which aims to explore the opportunities provided by the wealth of data available from online social platforms, was developed and co-taught by computer science research assistant professor Emilio Ferrara, a research team leader at the USC Information Sciences Institute (ISI), and ISI computer scientist Fred Morstatter.
“The motto of the course was learning how data science methods work in theory to be able to implement them in practice,” said Ferrara, an expert in behavioral data mining and machine learning who has extensively researched the impact of social media manipulation on elections and society at large.
Demystifying data science
Data science can be described as the collection and analysis of large data sets in order to make decisions and solve problems. In most cases, this effort is about finding creative solutions to business or societal challenges.
Ferrara said he believes the new course, which he hopes to teach again next fall semester, will help demystify data science for many students and enable them to apply it more effectively in myriad settings.
“We designed this course to really address the increasing demand of expertise in data science and the growing importance of technology and social media in our society,” he said.
“The tools and methods learned in this course can be applied to social systems data to learn how they impact our society, and in turn, to develop tools informed by this data to improve the world we live in.”
The course also provided ample opportunities for students to practice applying the data science concepts learned in class. During the semester, students were asked to consider, for example, how to extract meaning from human language, how to use network analysis to study how humans connect, and how to discover affinities among people’s interests and tastes by building interest graphs.
“We learned how to apply natural language processing and other emotion recognition techniques to understand and classify public sentiment on social media and found a correlation between social concerns and market performance,” said Venkatesh’s teammate Kavya Suresh. “This could be used by companies wanting to better understand their audiences, for example.”
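For readers curious what this kind of analysis can look like in practice, here is a minimal sketch in Python. It is not the team's code: the word lists, the sample posts and the closing prices are all made up for illustration, and a simple word count stands in for the natural language processing techniques taught in the course. It scores each post, averages the scores by day and then measures how closely the daily averages track a stock's closing price.

```python
from statistics import mean

# Hypothetical toy lexicon (an assumption for illustration only);
# a real project would use a trained sentiment model instead.
POSITIVE = {"great", "love", "fair", "support"}
NEGATIVE = {"harassment", "unfair", "boycott", "gap"}


def sentiment_score(text):
    """Return the number of positive words minus the number of negative words."""
    words = [w.strip(".,!?#").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Made-up posts grouped by day, with made-up closing prices for the same days.
posts_by_day = [
    ["Love the new policy, great support for staff"],
    ["Pay gap report looks unfair", "Time to boycott"],
    ["Fair response from the company"],
]
closing_prices = [101.0, 98.5, 100.2]

# Average the post scores within each day, then compare against prices.
daily_sentiment = [mean(sentiment_score(p) for p in day) for day in posts_by_day]
correlation = pearson(daily_sentiment, closing_prices)
print(daily_sentiment, round(correlation, 2))
```

A coefficient near 1 or -1 means the two series move together or in opposite directions. It does not show that one causes the other, and three invented data points prove nothing about real companies.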
During the poster presentation, industry attendees from companies such as SAP, Amazon, Cisco and Neulife, including charter members of the USC Computing Forum, browsed the poster lineup, asked the students questions about their research and conducted several onsite interviews.
“One-on-one conversations with representatives who not only have expertise in this field, but also have an exposure in the booming IT industry helped us gain insights to improve our project,” Suresh said.
Ferrara agreed, adding: “Accessing the product of USC education is an unparalleled opportunity for industry partners, not only for recruiting purposes, but also to survey research and training in our academic environment and provide feedback to improve our students’ education.”