Publications

You can also find my articles on my Semantic Scholar profile.

From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty

Maor Ivgi, Ori Yoran, Jonathan Berant, Mor Geva. Published in NeurIPS Workshop on Attributing Model Behavior at Scale (ATTRIB 2024)

LLMs often exhibit undesirable behaviors such as hallucinations and sequence repetitions. Our study reveals that these are linked fallback behaviors that models resort to under uncertainty, with stronger models shifting to more complex fallback patterns.

DataComp-LM: In search of the next generation of training sets for language models

Jeffrey Li*, Alex Fang*, Georgios Smyrnis*, Maor Ivgi*, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, Maciej Kilian, Hanlin Zhang, Rulin Shao, Sarah Pratt, Sunny Sanyal, Gabriel Ilharco, Giannis Daras, Kalyani Marathe, Aaron Gokaslan, Jieyu Zhang, Khyathi Chandu, Thao Nguyen, Igor Vasiljevic, Sham Kakade, Shuran Song, Sujay Sanghavi, Fartash Faghri, Sewoong Oh, Luke Zettlemoyer, Kyle Lo, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, Stephanie Wang, Dirk Groeneveld, Luca Soldaini, Pang Wei Koh, Jenia Jitsev, Thomas Kollar, Alexandros G. Dimakis, Yair Carmon, Achal Dave*, Ludwig Schmidt*, Vaishaal Shankar*. Published in NeurIPS 2024

We investigate the impact of data curation on language model training, creating a data-centric benchmark where participants filter a 240T-token corpus to release a high-quality 4T-token corpus, which we use to train a state-of-the-art open-source 7B model.

In-Context Learning with Long-Context Models: An In-Depth Exploration

Amanda Bertsch, Maor Ivgi, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig. arXiv preprint

We explore the effectiveness of long-context models with large in-context demonstration sets, finding that their success largely stems from referencing similar examples, and uncovering behaviors unique to many-shot in-context learning in this new regime.

Accelerated Parameter-Free Stochastic Optimization

Itai Kreisler, Maor Ivgi, Oliver Hinder, Yair Carmon. Published in COLT (2024)

Building on the DoG optimizer, ADoG is a tuning-free dynamic step-size formula for accelerated SGD, backed by strong theoretical guarantees and empirically shown to work well in convex settings.

ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding

Uri Shaham, Maor Ivgi, Avia Efrat, Jonathan Berant, Omer Levy. Published in Findings of EMNLP 2023

ZeroSCROLLS is a suite of datasets that require synthesizing information over long texts. The benchmark includes ten natural language tasks across multiple domains, including summarization, question answering, aggregated sentiment classification, and information reordering.

DoG is SGD’s Best Friend: A Parameter-Free Dynamic Step Size Schedule

Maor Ivgi, Oliver Hinder, Yair Carmon. Published in ICML (2023)

DoG is a tuning-free dynamic SGD step-size formula, backed by strong theoretical guarantees and empirically demonstrated across many domains and model architectures to achieve results comparable to well-tuned SGD with a best-practice learning-rate schedule.
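The core idea can be sketched in a few lines: DoG sets the step size to the maximum distance traveled from the initial point divided by the square root of the cumulative squared gradient norms. Below is a minimal illustrative sketch, not the paper's full implementation; the function names, the `r_eps` initialization value, and the small denominator safeguard are assumptions for this example.

```python
import numpy as np

def dog_sgd(grad_fn, x0, steps=100, r_eps=1e-4):
    """Illustrative sketch of the DoG (Distance over Gradients) step size:
    eta_t = max_{i<=t} ||x_i - x_0|| / sqrt(sum_{i<=t} ||g_i||^2),
    with the max distance initialized to a small r_eps."""
    x = x0.astype(float)
    max_dist = r_eps       # running max of ||x_i - x_0|| (the "distance")
    grad_sq_sum = 0.0      # running sum of squared gradient norms
    for _ in range(steps):
        g = grad_fn(x)
        grad_sq_sum += float(np.dot(g, g))
        # small epsilon in the denominator only to avoid division by zero
        eta = max_dist / (np.sqrt(grad_sq_sum) + 1e-12)
        x = x - eta * g
        max_dist = max(max_dist, float(np.linalg.norm(x - x0)))
    return x
```

Note how no learning rate is supplied: the step size adapts on the fly from quantities the optimizer observes anyway.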

Efficient Long-Text Understanding with Short-Text Models

Maor Ivgi, Uri Shaham, Jonathan Berant. Published in TACL 2023, presented at ACL 2023

Can short-range LMs perform long-range reasoning? They can!
In this work, we propose the SLiding-Encoder and Decoder (SLED), which leverages existing, battle-tested encoder-decoder LMs to operate over long-range NLU tasks.
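The sliding-encoder idea can be illustrated with the chunking arithmetic alone: split the long input into overlapping windows, encode each window with a short-context encoder, and keep only each window's core span so the decoder sees one contiguous encoded sequence. This is a minimal sketch of that index bookkeeping; the `chunk_len` and `context` values are illustrative, not the paper's settings.

```python
def sled_chunks(tokens, chunk_len=256, context=64):
    """Sketch of SLED-style chunking: each window carries `context`
    padding tokens on each side, and only the core span's encodings
    would be kept and concatenated for the decoder."""
    core = chunk_len - 2 * context  # tokens each window contributes
    chunks = []
    for start in range(0, len(tokens), core):
        lo = max(0, start - context)                      # left context
        hi = min(len(tokens), start + core + context)     # right context
        keep_lo = start - lo                              # core begins here
        keep_hi = keep_lo + min(core, len(tokens) - start)
        # (window to encode, slice of the window whose encodings are kept)
        chunks.append((tokens[lo:hi], (keep_lo, keep_hi)))
    return chunks
```

Concatenating the kept core slices reconstructs the original sequence exactly, which is what lets a short-context encoder cover an arbitrarily long input.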

SCROLLS: Standardized CompaRison Over Long Language Sequences

Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy. Published in EMNLP 2022

SCROLLS is a suite of datasets that require synthesizing information over long texts. The benchmark includes seven natural language tasks across multiple domains, including summarization, question answering, and natural language inference.