From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training

Under review at NeurIPS (collaboration with Huawei), 2026

Recommended citation: Zhengyang Hu, Wenyi Fang, Yang Zheng, Yanchao Yang. (2026). "From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training." Under review.

We propose an online monitor for Transformer training that combines latent-space statistical correlation with geometric anisotropy to jointly characterize geometric drift and representation anomalies. The framework detects perturbations in real time on models of several scales (NanoGPT, ViT, and Pythia-2.8B) and provides early-warning signals during pre-training. This work is a collaboration with Huawei.
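
To make the idea of an online geometric monitor concrete, here is a minimal sketch. It is not the paper's method. It assumes that anisotropy is approximated by the mean pairwise cosine similarity of a batch of hidden states (a common proxy), and that an anomaly is a value lying far from a rolling baseline. The names `mean_pairwise_cosine` and `DriftMonitor`, and the parameters `window`, `threshold`, and `min_history`, are hypothetical and chosen only for illustration.

```python
import math
from collections import deque


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all distinct pairs of vectors.

    Values near 1 mean the vectors share a direction (high anisotropy);
    values near 0 mean they are spread out. This is an assumed proxy,
    not the metric defined in the paper.
    """
    unit = []
    for v in vectors:
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            raise ValueError("zero vector has no direction")
        unit.append([x / norm for x in v])
    n = len(unit)
    if n < 2:
        raise ValueError("need at least two vectors")
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum(a * b for a, b in zip(unit[i], unit[j]))
    return total / (n * (n - 1) / 2)


class DriftMonitor:
    """Flags a value that deviates strongly from a rolling baseline (hypothetical helper)."""

    def __init__(self, window=50, threshold=3.0, min_history=5):
        self.history = deque(maxlen=window)  # most recent metric values
        self.threshold = threshold           # allowed deviation, in standard deviations
        self.min_history = min_history       # no flags until this many values are seen

    def update(self, value):
        flagged = False
        if len(self.history) >= self.min_history:
            mean = sum(self.history) / len(self.history)
            var = sum((h - mean) ** 2 for h in self.history) / len(self.history)
            std = math.sqrt(var)
            # Floor the standard deviation so a flat baseline does not divide the scale to zero.
            flagged = abs(value - mean) > self.threshold * max(std, 1e-8)
        # Every value, flagged or not, joins the baseline in this simple sketch.
        self.history.append(value)
        return flagged
```

In a training loop, one would compute `mean_pairwise_cosine` on a sample of hidden states at each logging step and pass the result to `DriftMonitor.update`. A production monitor would likely exclude flagged values from the baseline and track several layers separately.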

Authors: Zhengyang Hu, Wenyi Fang, Yang Zheng, Yanchao Yang.