Published in International Conference on Machine Learning (ICML), Oral, 2024
The first work to bring the foundation-model paradigm to mutual information (MI) estimation: a Transformer pre-trained on large-scale synthetic data estimates the MI of arbitrary 1-D pairs (X, Y) zero-shot, running ~100x faster than prior neural methods with slightly better accuracy.
Recommended citation: Zhengyang Hu, Song Kang, Qunsong Zeng, Kaibin Huang, Yanchao Yang. (2024). "InfoNet: Neural Estimation of Mutual Information without Test-time Optimization." ICML 2024 (Oral).
Under review at NeurIPS (collaboration with Huawei), 2026
An online monitor for Transformer training, built on latent-space statistical correlation and geometric anisotropy. It jointly characterizes geometric drift and representation anomalies, enabling real-time perturbation detection on NanoGPT, ViT, and Pythia-2.8B.
Recommended citation: Zhengyang Hu, Wenyi Fang, Yang Zheng, Yanchao Yang. (2026). "From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training." Under review.
Published in International Conference on Machine Learning (ICML), 2026
Introduces InfoAtlas, a full upgrade of InfoNet with a redesigned architecture and synthetic dataset that support arbitrary dimensions. The 1B-parameter model was pre-trained for about one month on a 32xH200 cluster, works zero-shot, and is evaluated on a broader range of downstream tasks.
Recommended citation: Zhengyang Hu*, Yanzhi Chen*, Hanxiang Ren, Qunsong Zeng, Youyi Zheng, Adrian Weller, Kaibin Huang, Yanchao Yang. (2026). "A Foundation-style Model for Zero-Shot Statistical Dependency Measurement." ICML 2026. (*Equal contribution.)
Oral presentation at ICML 2024 for our paper "InfoNet: Neural Estimation of Mutual Information without Test-time Optimization." The talk introduced the foundation-model paradigm for mutual information (MI) estimation: a Transformer pre-trained on large-scale synthetic distributions that estimates the MI of arbitrary 1-D variable pairs zero-shot, ~100x faster than prior neural methods.
Invited departmental seminar at HKU ECE (20 May 2026, 11:00 AM – 12:00 PM).
Undergraduate course, Department of Electrical and Electronic Engineering, HKU, 2024
Teaching assistant for ELEC 3544 — Data Science with Foundation Models at HKU ECE, for the 2024, 2025, and 2026 offerings.