I am a PhD student in Computer Science at The University of Texas at Austin, working with Greg Durrett and Katrin Erk. I am broadly interested in natural language processing and computational semantics.
Before joining the PhD program, I worked on applied NLP research at Applied Research Laboratories, The University of Texas at Austin. Before that, I obtained a Master's in Computational Science, Engineering and Mathematics from UT Austin and a BA in Mathematics from Pomona College.
I also co-organize the South By Semantics Workshop. In Spring 2025 I led a DiRP (Directed Reading Program) on Language Models. I currently organize UT Austin's Natural Language Learning reading group.
RankAlign: A Ranking View of the Generator-Validator Gap in Large Language Models
Juan Diego Rodriguez, Wenxuan Ding, Katrin Erk, and Greg Durrett. COLM 2025.
Parameterized Synthetic Text Generation with SimpleStories
Lennart Finke, Chandan Sreedhara, Thomas Dooms, Mat Allen, Emerald Zhang,
Juan Diego Rodriguez, Noa Nabeshima, Thomas Marshall, and Dan Braun. NeurIPS 2025.
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models
Liyan Tang, Grace Kim, Xinyu Zhao, Thom Lake, Wenxuan Ding, Fangcong Yin, Prasann Singhal, Manya Wadhwa, Zeyu Leo Liu, Zayne Sprague, Ramya Namuduri, Bodun Hu, Juan Diego Rodriguez, Puyuan Peng, and Greg Durrett. NeurIPS 2025.
KRISTEVA: Close Reading as a Novel Task for Benchmarking Interpretive Reasoning
Peiqi Sui, Juan Diego Rodriguez, Philippe Laban, Dean Murphy, Joseph P. Dexter, Richard Jean So, Samuel Baker, and Pramit Chaudhuri. ACL 2025.
Characterizing the Role of Similarity in the Property Inferences of Language Models
Juan Diego Rodriguez, Aaron Mueller, and Kanishka Misra. NAACL 2025.
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague, Fangcong Yin, Juan Diego Rodriguez, Dongwei Jiang, Manya Wadhwa, Prasann Singhal, Xinyu Zhao, Xi Ye, Kyle Mahowald, and Greg Durrett. ICLR 2025.
X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across Paragraphs
Juan Diego Rodriguez, Katrin Erk, and Greg Durrett. NAACL 2024.
Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways
Venkata S. Govindarajan, Juan Diego Rodriguez, Kaj Bostrom, and Kyle Mahowald. Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning (CoNLL 2023).
WiCE: Real-World Entailment for Claims in Wikipedia
Ryo Kamoi, Tanya Goyal, Juan Diego Rodriguez, and Greg Durrett. EMNLP 2023.
Cross-Domain Detection of GPT-2-Generated Technical Text
Juan Diego Rodriguez, Todd Hay, David Gros, Zain Shamsi, and Ravi Srinivasan. NAACL 2022.
Reusable Templates and Guides for Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards
Angelina McMillan-Major, Salomey Osei, Juan Diego Rodriguez, Pawan Sasanka Ammanamanchi, Sebastian Gehrmann, and Yacine Jernite. Proceedings of the 1st Workshop of Natural Language Generation, Evaluation, and Metrics (GEM 2021).
Leveraging WordNet Paths for Neural Hypernym Prediction
Yejin Cho, Juan Diego Rodriguez, Yifan Gao, and Katrin Erk. COLING 2020.
Transfer Learning for Entity Recognition of Novel Classes
Juan Diego Rodriguez, Adam Caldwell, and Alex Liu. COLING 2018 (Area Chair Favorite).