Welcome!

I am a research scientist at the Allen Institute for AI. I have led the evaluation team for OLMo since 2023, and I am the research lead of Playground, where you can interact with our recent models and use OLMoTrace to trace their output back to their training data in real time. My work includes building large language models like OLMo, evaluating large language models throughout training, creating web-scale pretraining datasets like Dolma and documenting their contents, studying the environmental impact of AI, and improving transparency and reproducibility in the research community.

You can find a bio on my About Me page.

News, Recognition, Awards

Spotlight Presentation (top 5%) at ICLR 2025 for Evaluating the Environmental Impact of LLMs
Best Resource Paper at ACL 2024 for Dolma, our LLM pretraining corpus
Best Theme Paper at ACL 2024 for OLMo, our LLM
Spotlight Presentation (top 5%) at ICLR 2024 for What's In My Big Data, a toolkit for understanding pretraining datasets
10-year Test-of-Time Paper at ACL 2022 for our vision+language system Midge
Best Student Paper at NAACL 2015 for Retrofitting

Featured Talks

Featured Press