Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Nima Chitsazan

Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning

Jun 05, 2025

Andrei Mircea, Supriyo Chakraborty, Nima Chitsazan, Irina Rish, Ekaterina Lobacheva

Abstract:This work aims to understand how scaling improves language models, specifically in terms of training dynamics. We find that language models undergo loss deceleration early in training; an abrupt slowdown in the rate of loss improvement, resulting in piecewise linear behaviour of the loss curve in log-log space. Scaling up the model mitigates this transition by (1) decreasing the loss at which deceleration occurs, and (2) improving the log-log rate of loss improvement after deceleration. We attribute loss deceleration to a type of degenerate training dynamics we term zero-sum learning (ZSL). In ZSL, per-example gradients become systematically opposed, leading to destructive interference in per-example changes in loss. As a result, improving loss on one subset of examples degrades it on another, bottlenecking overall progress. Loss deceleration and ZSL provide new insights into the training dynamics underlying language model scaling laws, and could potentially be targeted directly to improve language models independent of scale. We make our code and artefacts available at: https://github.com/mirandrom/zsl

* Published as a conference paper at ACL 2025

Via

Access Paper or Ask Questions

Dynamic Customer Embeddings for Financial Service Applications

Jun 22, 2021

Nima Chitsazan, Samuel Sharpe, Dwipam Katariya, Qianyu Cheng, Karthik Rajasethupathy

Figure 1 for Dynamic Customer Embeddings for Financial Service Applications

Figure 2 for Dynamic Customer Embeddings for Financial Service Applications

Figure 3 for Dynamic Customer Embeddings for Financial Service Applications

Figure 4 for Dynamic Customer Embeddings for Financial Service Applications

Abstract:As financial services (FS) companies have experienced drastic technology driven changes, the availability of new data streams provides the opportunity for more comprehensive customer understanding. We propose Dynamic Customer Embeddings (DCE), a framework that leverages customers' digital activity and a wide range of financial context to learn dense representations of customers in the FS industry. Our method examines customer actions and pageviews within a mobile or web digital session, the sequencing of the sessions themselves, and snapshots of common financial features across our organization at the time of login. We test our customer embeddings using real world data in three prediction problems: 1) the intent of a customer in their next digital session, 2) the probability of a customer calling the call centers after a session, and 3) the probability of a digital session to be fraudulent. DCE showed performance lift in all three downstream problems.

* ICML Workshop on Representation Learning for Finance and E-Commerce Applications

Via

Access Paper or Ask Questions