From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training
Under review at NeurIPS (collaboration with Huawei), 2026
Recommended citation: Zhengyang Hu, Wenyi Fang, Yang Zheng, Yanchao Yang. (2026). "From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training." Under review.
We propose an online monitor for Transformer training that combines latent-space statistical correlation with geometric anisotropy to jointly characterize geometric drift and representation anomalies. The framework detects perturbations in real time across multiple model scales (NanoGPT, ViT, and Pythia-2.8B) and provides early-warning signals during pre-training.
Authors: Zhengyang Hu, Wenyi Fang, Yang Zheng, Yanchao Yang.
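
To make the idea of an online geometric diagnostic concrete, the following is a minimal illustrative sketch, not the paper's method. It assumes anisotropy is measured as the mean pairwise cosine similarity of hidden states (one common definition; the paper's exact statistic is not stated here) and that anomalies are flagged with a simple running z-score. The names `mean_cosine_anisotropy` and `EmaAnomalyMonitor`, and all default parameters, are hypothetical.

```python
import numpy as np


def mean_cosine_anisotropy(hidden: np.ndarray) -> float:
    """Mean pairwise cosine similarity between the rows of `hidden` (shape (n, d)).

    Self-pairs are excluded. Values near 1 mean the representations occupy a
    narrow cone (high anisotropy); values near 0 mean they are spread out.
    This definition is an assumption chosen for illustration.
    """
    norms = np.linalg.norm(hidden, axis=1, keepdims=True)
    unit = hidden / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sims = unit @ unit.T
    n = hidden.shape[0]
    return float((sims.sum() - np.trace(sims)) / (n * (n - 1)))


class EmaAnomalyMonitor:
    """Hypothetical online monitor: flags a value that deviates from its
    exponential running mean by more than `threshold` running standard deviations."""

    def __init__(self, momentum: float = 0.9, threshold: float = 3.0, warmup: int = 5):
        self.momentum = momentum
        self.threshold = threshold
        self.warmup = warmup  # number of updates before flagging starts
        self.mean = 0.0
        self.var = 0.0
        self.count = 0

    def update(self, value: float) -> bool:
        """Test `value` against the current statistics, then fold it in."""
        flagged = False
        if self.count >= self.warmup:
            std = max(self.var ** 0.5, 1e-8)
            flagged = bool(abs(value - self.mean) > self.threshold * std)
        if self.count == 0:
            self.mean = value
        else:
            delta = value - self.mean
            self.mean += (1.0 - self.momentum) * delta
            self.var = self.momentum * (self.var + (1.0 - self.momentum) * delta ** 2)
        self.count += 1
        return flagged
```

In a training loop, one would compute `mean_cosine_anisotropy` on a batch of hidden states from a chosen layer at each logging step and pass the result to `EmaAnomalyMonitor.update`; a returned `True` would serve as an early-warning signal under these assumptions.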
