ISI at ACL '23: Learning from Games, Mitigating Anti-LGBTQ+ Bias and Searching Events in Hundreds of Languages - USC Viterbi

Photo Credit: Charles Taylor/Getty Images

At the 61st Annual Meeting of the Association for Computational Linguistics (ACL ’23), held July 9th to July 14th, 2023 in Toronto, Canada, researchers from the Information Sciences Institute (ISI), a research institute of the USC Viterbi School of Engineering, are presenting 23 studies spanning a variety of topics. Run by the Association for Computational Linguistics (ACL), the annual conference is one of the premier conferences for natural language research.

This year, ACL ’23 received a record-high 4559 submissions, topping last year’s record-high of 2103. Of these submissions, 1074 papers (22.08%) were accepted for the conference and 901 (40.60% including the papers for the main conference) were accepted for Findings.

Along with presenting work, a number of ISIers have larger roles at the conference and within the association itself.

Jonathan May, a research associate professor at USC’s Thomas Lord Department of Computer Science and director of ISI’s Center for Useful Techniques Enhancing Language Applications Based on Natural and Meaningful Evidence (CUTELABNAME), is on the Best Paper Committee at ACL ’23. He has also served as treasurer for NAACL (ACL’s North American chapter, which co-sponsors ACL when it is held in the Americas) since 2019.

In his role as treasurer, he has helped maintain the financial health of NAACL, while also supporting initiatives that promote the equitable expansion of research in the Americas. As chair of the Regional Americas Fund, he has been involved in giving out thousands of dollars a year to support initiatives at universities in Central and South America that promote natural language processing research.

Xiang Ren, an assistant professor in Computer Science, a Research Team Leader at ISI and the PI of the Intelligence and Knowledge Discovery (INK) Lab at USC, is on the ACL ’23 Program Committee as a senior area chair of the Theme Track. This year’s theme is Reality Check; ACL invited empirical and theoretical research, as well as position and survey papers reflecting on the ways in which reported performance improvements on NLP benchmarks are meaningful. Also on the Program Committee is Xuezhe Ma, a research assistant professor in Computer Science and a Research Lead at ISI. Ma is a senior area chair for the Large Language Model submission topic.

May, who is co-authoring three main conference ACL papers and one Findings paper this year, said, “ISI continues its rich tradition of bringing forth important new NLP research at ACL, with a bevy of highly collaborative and eclectic works across the gamut of topics, from dialogue models to machine learning to semantic understanding to game playing and beyond.”

Research Spotlights, ACL 2023

A Word Game Reveals Cultural Background Clues

Codenames Duet is a word association game in which players give one another clues and try to guess secret code words. Each player brings a different background and body of knowledge to the game, which shapes how they interpret clues and make guesses. This allowed researchers at ISI to study a research question: How does cultural background affect play style? In the paper Modeling Cross-Cultural Pragmatic Inference with Codenames Duet, ISI researchers dive into the nuances of human behavior and how cultural markers affect decision-making. The work uses games as a lens to understand how people from different sociocultural backgrounds interact.

This Chatbot Sounds Familiar

Dialogue models like ChatGPT can help you perform various tasks by chatting with you, but they are usually designed to talk to you as an assistant rather than speak for you. ISI researchers asked: What if you had your very own personalized ChatGPT that could serve as your representative and save you time by handling trivial daily conversations? In their paper, RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation, they propose a method to guide a dialogue model to mimic you in conversations on social media like Reddit. The approach learns about you from the texts you’ve written. RECAP is very flexible and can work with any powerful dialogue model to generate high-quality responses in your tone.

What Can We Learn from D&D Interactions?

Dungeons & Dragons is a fantasy role-playing game that involves completing goals. The game is run by a Dungeon Master, who narrates the story and controls all the monsters and characters that the players interact with. ISI researchers used Dungeons & Dragons to study teacher-student natural language interactions in a goal-driven and grounded environment. Their resulting paper, I Cast Detect Thoughts: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons, examines how incorporating intents and theory of mind can make computational models (i.e., a model acting as the Dungeon Master) better communicators.

Understanding Global Events in Every Language

Event extraction is an NLP technique that automatically pulls names, dates, events, relationships and more from large volumes of text and turns them into structured data that can be searched or stored in a database, where it can be processed, organized, easily retrieved, and analyzed. This allows valuable insights to be gained from vast amounts of text. Event extraction is used to predict political instability, track the path of disease outbreaks, identify social unrest, follow natural disasters, and much more. ISI researchers have developed a method to train models using English data and deploy them on any language that’s represented in the language model. The result? The system makes global events available on-demand in 100 languages ranging from Afrikaans to Yiddish, and returns results in English. They will present their paper, Massively Multi-Lingual Event Understanding: Extraction, Visualization, and Search, as a Demo at ACL.

Using QueerTwitter to Mitigate Anti-LGBTQ+ Bias in AI

ISI researchers present a new benchmark dataset for measuring homophobic and transphobic bias in large language models in their paper, WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models. The dataset was designed using a community survey of about 300 LGBTQ+ individuals from diverse backgrounds, making it the first community-sourced bias benchmark for NLP models. The team tested 20 publicly available models on the dataset; while all 20 showed some degree of anti-LGBTQ+ bias, the team found that this bias can be partially mitigated with additional training on data written by and about the queer community. This builds on previous work by the team that was presented at a workshop at last year’s conference.

History Repeats

In this work, ISI researchers use continual learning to improve knowledge graph completion, using the example of event prediction. In their paper, History Repeats: Overcoming Catastrophic Forgetting For Event-Centric Temporal Knowledge Graph Completion, the team introduces a continual learning framework for temporal knowledge graph (TKG) completion that can adapt to the dynamic nature of TKG data received incrementally in real-world scenarios. This has valuable applications in areas like geopolitical event forecasting. For instance, it enables analysts to better predict political crises by continuously updating and preserving knowledge about emerging entities, relations, and changing patterns in global events, leading to more accurate and timely assessments.

View the complete list of accepted USC ISI papers below:

ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems Sarik Ghazarian, Yijia Shao, Rujun Han, Aram Galstyan, Nanyun Peng. ACL 2023.

Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction? Jiashu Xu, Mingyu Derek Ma, Muhao Chen. ACL 2023.

Continual Contrastive Finetuning Improves Low-Resource Relation Extraction Wenxuan Zhou, Sheng Zhang, Tristan Naumann, Muhao Chen, Hoifung Poon. ACL 2023.

Contrastive Bootstrapping for Label Refinement Shudi Hou, Yu Xia, Muhao Chen, Sujian Li. ACL 2023.

Cross-lingual Continual Learning Meryem M’hamdi, Xiang Ren, Jonathan May. ACL 2023.

Data Curation Alone Can Stabilize In-context Learning Ting-Yun Chang, Robin Jia. ACL 2023.

I Cast Detect Thoughts: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons Pei Zhou, Andrew Zhu, Jennifer Hu, Jay Pujara, Xiang Ren, Chris Callison-Burch, Yejin Choi, Prithviraj Ammanabrolu. ACL 2023.

Improving Factuality of Abstractive Summarization without Sacrificing Summary Quality Tanay Dixit, Fei Wang, Muhao Chen. ACL 2023.

RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation Shuai Liu, Hyundong J. Cho, Marjorie Freedman, Xuezhe Ma, Jonathan May. ACL 2023.

WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models Virginia Felkner, Ho-Chun Herbert Chang, Eugene Jang, Jonathan May. ACL 2023.

Controlled Text Generation with Hidden Representation Transformations Vaibhav Kumar, Amita Misra, Ankit R. Chadha, Hana Koorehdavoudi, Masud Moshtaghi, Emilio Ferrara. ACL Findings 2023.

History Repeats: Overcoming Catastrophic Forgetting For Event-Centric Temporal Knowledge Graph Completion Mehrnoosh Mirtaheri, Mohammad Rostami, Aram Galstyan. ACL Findings 2023.

Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data Mozhdeh Gheini, Tatiana Likhomanenko*, Matthias Sperber*, Hendra Setiawan*. ACL Findings 2023.

Jointly Reparametrized Multi-Layer Adaptation for Efficient and Private Tuning Umang Gupta, Aram Galstyan, Greg Ver Steeg. ACL Findings 2023.

Know Where You’re Going: Meta-Learning for Parameter-Efficient Fine-Tuning Mozhdeh Gheini, Xuezhe Ma, Jonathan May. ACL Findings 2023.

Modeling Cross-Cultural Pragmatic Inference with Codenames Duet Omar Shaikh, Caleb Ziems, William Held, Aryan J. Pariani, Fred Morstatter, Diyi Yang. ACL Findings 2023.

Multi-hop Evidence Retrieval for Cross-document Relation Extraction Keming Lu, I-Hung Hsu, Wenxuan Zhou, Mingyu Derek Ma, Muhao Chen. ACL Findings 2023.

Robust Natural Language Understanding with Residual Attention Debiasing Fei Wang, James Y. Huang, Tianyi Yan, Wenxuan Zhou, Muhao Chen. ACL Findings 2023.

Take a Break in the Middle: Investigating Subgoals towards Hierarchical Script Generation Xinze Li, Yixin Cao, Muhao Chen, Aixin Sun. ACL Findings 2023.

Massively Multi-Lingual Event Understanding: Extraction, Visualization, and Search Chris Jenkins, Shantanu Agarwal, Joel Barry, Steven Fincke, Elizabeth Boschee. ACL Demo 2023.

Pipeline for Modeling Causal Beliefs from Natural Language Hunter J. Priniski, Ishaan Verma, Fred Morstatter. ACL Demo 2023.

XMD: An End-to-End Framework for Interactive Explanation-Based Debugging of NLP Models Dong-Ho Lee, Akshen Kadakia, Brihi Joshi, Aaron Chan, Ziyi Liu, Kiran Narahari, Takashi Shibuya, Ryosuke Mitani, Toshiyuki Sekiya, Jay Pujara, Xiang Ren. ACL Demo 2023.

Indirectly Supervised Natural Language Processing Wenpeng Yin, Muhao Chen, Ben Zhou, Qiang Ning, Kai-Wei Chang, Dan Roth. ACL Tutorial 2023.

Published on July 14th, 2023

Last updated on May 16th, 2024