Olmo 3: Advancing the state of the art in fully open models
Abstract: This talk presents Olmo 3, a family of 7B and 32B models that dramatically improves reasoning, coding, and instruction-following capabilities while providing full transparency across every development stage. Olmo 3 is competitive with the best weights-only models of comparable size and architecture while fully sharing the data, code, and intermediate checkpoints that enable research interventions beyond final weights. In this talk, I’ll discuss the new techniques developed since Olmo 2, share ideas and stories behind their development, and conclude with lessons we learned about making consistent, reliable progress toward more powerful models.
Speaker Bio: Kyle Lo is a research scientist at the Allen Institute for AI, where he co-leads the OLMo project on open language modeling. He specializes in large-scale pretraining of language models, with emphasis on data curation and efficient experimentation. His research on domain specialization, evaluation methodology, AI for science, and AI for education has won awards at leading AI conferences, including ACL, CVPR, EMNLP, CHI, NAACL, and EACL. Kyle obtained his Master’s degree in Statistics from the University of Washington.