Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

publications

InfoNet: Neural Estimation of Mutual Information without Test-time Optimization

Published in International Conference on Machine Learning (ICML), Oral, 2024

The first work to bring the foundation-model paradigm to mutual information (MI) estimation: a Transformer pre-trained on large-scale synthetic data estimates the MI of arbitrary 1-D pairs (X, Y) zero-shot, about 100x faster than prior neural methods and with slightly better accuracy.

Recommended citation: Zhengyang Hu, Song Kang, Qunsong Zeng, Kaibin Huang, Yanchao Yang. (2024). "InfoNet: Neural Estimation of Mutual Information without Test-time Optimization." ICML 2024 (Oral).

From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training

Under review at NeurIPS (collaboration with Huawei), 2026

An online Transformer-training monitor built on latent-space statistical correlation and geometric anisotropy; jointly characterizes geometric drift and representation anomalies, enabling real-time perturbation detection on NanoGPT / ViT / Pythia-2.8B.

Recommended citation: Zhengyang Hu, Wenyi Fang, Yang Zheng, Yanchao Yang. (2026). "From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training." Under review.

A Foundation-style Model for Zero-Shot Statistical Dependency Measurement

Published in International Conference on Machine Learning (ICML), 2026

Introduces InfoAtlas, a full upgrade of InfoNet with a redesigned architecture and synthetic dataset that support inputs of arbitrary dimension. The 1B-parameter model was pre-trained for about one month on a 32xH200 cluster, works zero-shot, and is evaluated on a broader range of downstream tasks.

Recommended citation: Zhengyang Hu*, Yanzhi Chen*, Hanxiang Ren, Qunsong Zeng, Youyi Zheng, Adrian Weller, Kaibin Huang, Yanchao Yang. (2026). "A Foundation-style Model for Zero-Shot Statistical Dependency Measurement." ICML 2026. (*Equal contribution.)

talks

InfoNet: Neural Estimation of Mutual Information without Test-time Optimization

Published:

Oral presentation at ICML 2024 of our paper InfoNet: Neural Estimation of Mutual Information without Test-time Optimization. The talk introduced the foundation-model paradigm for mutual information (MI) estimation: a Transformer pre-trained on large-scale synthetic distributions estimates the MI of arbitrary 1-D variable pairs zero-shot, about 100x faster than prior neural methods.

teaching