USC @ EMNLP 2023

Julia Cohen and Caitlin Dawson | December 6, 2023

USC researchers present papers at EMNLP 2023, one of the world’s top natural language processing conferences.


USC researchers will present 23 papers at the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP) in Singapore, December 6-10. EMNLP is one of the largest conferences on the topic in the world, as well as one of the most cited in the field of computer science. The research ranges from a new method for captioning music to recognizing ambiguity.

Research Spotlights

Context Counts When Moderating Content on Twitch

Twitch. Some see it as a fun online community of gamers and good-natured e-sports fandom. For others, it’s a perilous stream of potentially toxic content and hate speech.  

In the ever-evolving landscape of digital communication, the real-time nature of messages on live-stream platforms like Twitch and YouTube Live brings with it unique challenges for content moderation. At present, effective tools for moderating content in live streams are lacking because existing models have been trained on non-real-time social media platforms like Facebook or Twitter. Research Assistant Dong-Ho Lee and Principal Scientist Jay Pujara, both from USC Viterbi’s Information Sciences Institute (ISI), set out to change that. In their paper, they describe an innovative method that boosts the performance of moderation models on live platforms by 35%.

The researchers asked human moderators to label Twitch messages while giving them varying levels of context. That context included the chat history (either the commenter’s last message before the moderated content or the broader chat around the time of the moderated comment), what was happening in the video as the comment was posted, and any external knowledge specific to the comment (e.g., particular emojis or slang).

The researchers determined which informational context best helped the human moderators, then trained models to identify norm violations using that context. Their results showed that contextual information can boost model moderation performance by 35%.

Fact or Fiction?

“Hey, ChatGPT. What is the capital of France, known for its famous pyramids?”

When faced with this type of question, a large language model (LLM) such as ChatGPT should recognize the inaccuracy in the question and, ideally, provide a response that acknowledges the issue. The pyramids are associated with Egypt, not France, of course. But responding appropriately to inputs built on something untrue, such as unanswerable questions or false claims, is still a tough task for LLMs.

Traditionally, researchers would manually collect challenging negative examples to train the models, an expensive and time-consuming process. Instead, in a new paper titled SCENE: Self-Labeled Counterfactuals for Extrapolating to Negative Examples, USC computer science researchers propose an approach to automatically generate subtly negative examples using synthetic data. Overall, they hope this research will inspire more exploration into how synthetic data can help develop and extend new kinds of machine learning models, making them more effective and versatile in various situations.

AI Tools for Journalists: A Source-Recommendation Engine

Local news journalists are burned out. Before coming to ISI as a computer science graduate student, Alexander Spangher worked as a data scientist in the journalism industry. Immersed in the newsroom, he witnessed firsthand the daily pressures reporters faced: they were overworked, underpaid, and squeezed by tight deadlines. “I haven’t spoken to a single local journalist that was not totally overstretched,” he remarked.

To help streamline the news writing process, Spangher is leading the development of AI tools for journalists, including a source-recommendation service described in his paper, Identifying Informational Sources in News Articles. Sources, particularly people with relevant expertise or a story, are the backbone of a compelling news article. But finding the right individuals to interview is difficult in a world wonderfully yet overwhelmingly full of people to talk to. A software application that could analyze a given topic, suggest relevant sources, and provide their contact information could therefore bring efficiency to a journalist’s workflow, said Jonathan May, a research associate professor of computer science at the USC Viterbi School of Engineering and coauthor of the paper.

“Technology that can help us do creative work and be our creative best is a good thing,” May said. “That’s why I’m hopeful for it.”

A Rose by Any Other Name

Researchers including Jieyu Zhao, an assistant professor of computer science at USC, are studying whether computer translations of names vary based on race, ethnicity, and gender. The hypothesis: translations may be less accurate for names associated with US racial and ethnic minorities. Zhao and her co-authors created a dataset with names strongly linked to specific demographics and proposed a translation evaluation method.

The analysis revealed that translation systems struggle to correctly translate names associated with women, particularly those linked to racial (Black) and ethnic (Hispanic) minorities. For instance, the name Journee in a Spanish sentence was mistranslated by the system as “girls.” Such errors have significant implications for people’s professional, personal, and cultural identities, as well as their self-worth and ease of communication. The findings emphasize the need for more research in machine translation to enhance the accuracy of name translations and to ensure high-quality service for users, regardless of gender, race, or ethnicity.

Complete list of accepted USC papers:

A Causal View of Entity Bias in (Large) Language Models
Fei Wang, Wenjie Mo, Yiwei Wang, Wenxuan Zhou, Muhao Chen

ALCAP: Alignment-Augmented Music Captioner
Zihao He, Weituo Hao, Wei-Tsung Lu, Changyou Chen, Kristina Lerman, Xuchen Song

Analyzing Norm Violations in Live-Stream Chat
Jihyung Moon, Dong-Ho Lee, Hyundong Justin Cho, Woojeong Jin, Chan Young Park, Minwoo Kim, Jonathan May, Jay Pujara, Sungjoon Park

Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems
Yixin Wan, Jieyu Zhao, Aman Chadha, Nanyun Peng, Kai-Wei Chang

A Rose by Any Other Name Would not Smell as Sweet: Social Bias in Names Mistranslation
Sandra Sandoval, Jieyu Zhao, Marine Carpuat, Hal Daumé III

Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering
Wang Zhu, Jesse Thomason, Robin Jia

BRAINTEASER: Lateral Thinking Puzzles for Large Language Models
Yifan Jiang, Filip Ilievski, Kaixin Ma, Zhivar Sourati

Challenges in Context-Aware Neural Machine Translation
Linghao Jin, Jacqueline He, Jonathan May, Xuezhe Ma

Continual Dialogue State Tracking via Example-Guided Question Answering
Hyundong Justin Cho, Andrea Madotto, Zhaojiang Lin, Khyathi Chandu, Satwik Kottur, Jing Xu, Jonathan May, Chinnadhurai Sankar

Dense Retrieval as Indirect Supervision for Large-space Decision Making
Nan Xu, Fei Wang, Mingtao Dong, Muhao Chen

Estimating Large Language Model Capabilities without Labeled Test Data
Harvey Yiyun Fu, Qinyuan Ye, Albert Xu, Xiang Ren, Robin Jia

Evaluating Large Language Models on Controlled Generation Tasks
Jiao Sun, Yufei Tian, Wangchunshu Zhou, Nan Xu, Qian Hu, Rahul Gupta, John Frederick Wieting, Nanyun Peng, Xuezhe Ma

Exploring Distributional Shifts in Large Language Models for Code Analysis
Shushan Arakelyan, Rocktim Jyoti Das, Yi Mao, Xiang Ren

How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench
Qinyuan Ye, Harvey Yiyun Fu, Xiang Ren, Robin Jia

Identifying Informational Sources in News Articles
Alexander Spangher, Nanyun Peng, Emilio Ferrara, Jonathan May

Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
Ximing Lu, Faeze Brahman, Peter West, Jaehun Jang, Khyathi Chandu, Abhilasha Ravichander, Lianhui Qin, Prithviraj Ammanabrolu, Liwei Jiang, Sahana Ramnath, Nouha Dziri, Jillian Fisher, Bill Yuchen Lin, Skyler Hallinan, Xiang Ren, Sean Welleck, Yejin Choi

Learn Your Tokens: Word-Pooled Tokenization for Language Modeling
Avijit Thawani, Saurabh Ghanekar, Xiaoyuan Zhu, Jay Pujara

Look-back Decoding for Open-Ended Text Generation
Nan Xu, Chunting Zhou, Asli Celikyilmaz, Xuezhe Ma

Making Large Language Models Better Data Creators
Dong-Ho Lee, Jay Pujara, Mohit Sewak, Ryen W White, Sujay Kumar Jauhar

Remember What You Did So You Know What To Do Next
Manuel R. Ciosici, Alex Hedges, Yash Kankanampati, Justin Martin, Marjorie Freedman, Ralph Weischedel

Task-Attentive Transformer Architecture for Continual Learning of Vision-and-Language Tasks Using Knowledge Distillation
Yuliang Cai, Jesse Thomason, Mohammad Rostami

Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning
Dong-Ho Lee, Kian Ahrabian, Woojeong Jin, Fred Morstatter, Jay Pujara

We’re Afraid Language Models Aren’t Modeling Ambiguity
Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi

Published on December 6th, 2023

Last updated on December 6th, 2023
