Seminars are held on Thursdays from 1:00 to 2:00 pm on Zoom unless otherwise noted. For access information, please contact the Math Department.
For questions about the seminar schedule, please contact Chong Jin.
September 18
Dr. Yingcong Li, NJIT Department of Data Science
Data, Architecture & Algorithms in In-Context Learning
This talk introduces recent theoretical advancements on the in-context learning (ICL) capability of sequence models, focusing on the intricate interplay of data characteristics, architectural design, and the implicit algorithms that models learn. We discuss how diverse architectural designs, ranging from linear attention to state-space models to gating mechanisms, implicitly emulate optimization algorithms that operate on the context, and we draw connections to variations of gradient descent and expectation maximization. We elucidate the critical influence of data characteristics, such as distributional alignment, task correlation, and the presence of unlabeled examples, on ICL performance, quantifying their benefits and revealing the mechanisms through which models leverage such information. Furthermore, we explore the optimization landscapes governing ICL, establishing conditions for unique global minima and highlighting the architectural features (e.g., depth and dynamic gating) that enable sophisticated algorithmic emulation. As a central message, we advocate that the power of architectural primitives can be gauged from their capability to handle in-context regression tasks of varying sophistication.