CLIP Colloquia: Dhanya Sridhar (Université de Montréal)

Time: Wednesday, October 25, 2023 - 11:00 AM to 12:00 PM

Location: Virtual

Analyzing Text with Causal Models

Abstract: Analyzing language often requires assessing a cause-effect relationship. Does making a complaint polite increase the chance of someone resolving it quickly? Does adding self-identifying details to a Reddit flair change how other users respond? Causal inference provides a framework for modeling and estimating such causal effects from data, under a variety of assumptions. Despite the role causality plays in the sciences, using causal inference to analyze text is challenging because, unlike in other domains, the relevant variables are often unmeasured and are instead encoded in unstructured text data. In contrast, the field of machine learning (ML) has arguably succeeded at extracting task-relevant information from unstructured inputs like text, but largely in the context of analyzing associations. How can ML methods be used to draw causal conclusions from text data? In this talk, I'll discuss two use cases of ML for drawing valid causal inferences. First, I'll introduce causally sufficient text embeddings, a general method to adjust for the biases that arise in observational text data, allowing us to answer causal questions about interventions of interest. Next, I'll review an extension of these causally sufficient embeddings that allows us to analyze text when the causes of interest are unobserved. Finally, I'll conclude by highlighting the potential role of causality as a tool to better understand large language models.

Bio: Dhanya Sridhar is an assistant professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal, a core academic member of Mila, and a Canada CIFAR AI Chair. Prior to this, she was a postdoctoral researcher at Columbia University. She received her doctorate from the University of California, Santa Cruz. In brief, her research focuses on combining causality and machine learning in service of AI systems that are robust to distribution shifts, adapt to new tasks efficiently, and discover new knowledge alongside us.

The CLIP Laboratory at Maryland is engaged in designing algorithms and methods that allow computers to effectively and efficiently perform human language-related tasks, as well as using computational methods to improve our scientific understanding of the human capacity for language and to explore heterogeneous datasets at scale. The lab is a part of the University of Maryland Institute for Advanced Computer Studies (UMIACS).
