Posts by Collection

publications

InfoNet: Neural Estimation of Mutual Information without Test-time Optimization

Published in International Conference on Machine Learning (ICML), Oral, 2024

First to bring the foundation-model paradigm to mutual information estimation: a Transformer pre-trained on large-scale synthetic data estimates the MI of arbitrary 1-D pairs (X, Y) zero-shot, about 100x faster than prior neural methods and with slightly better accuracy.

Recommended citation: Zhengyang Hu, Song Kang, Qunsong Zeng, Kaibin Huang, Yanchao Yang. (2024). "InfoNet: Neural Estimation of Mutual Information without Test-time Optimization." ICML 2024 (Oral).

From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training

Under review at NeurIPS (collaboration with Huawei), 2026

An online Transformer-training monitor built on latent-space statistical correlation and geometric anisotropy; jointly characterizes geometric drift and representation anomalies, enabling real-time perturbation detection on NanoGPT / ViT / Pythia-2.8B.

Recommended citation: Zhengyang Hu, Wenyi Fang, Yang Zheng, Yanchao Yang. (2026). "From Anisotropy to Anomaly: Online Geometric Diagnostics during Transformer Training." Under review.

A Foundation-style Model for Zero-Shot Statistical Dependency Measurement

Published in International Conference on Machine Learning (ICML), 2026

Introduces InfoAtlas, a full upgrade of InfoNet with a redesigned architecture and synthetic dataset that support arbitrary dimensions. The 1B-parameter model was pre-trained for about one month on a 32xH200 cluster, is zero-shot ready, and is evaluated across a broader set of downstream tasks.

Recommended citation: Zhengyang Hu*, Yanzhi Chen*, Hanxiang Ren, Qunsong Zeng, Youyi Zheng, Adrian Weller, Kaibin Huang, Yanchao Yang. (2026). "A Foundation-style Model for Zero-Shot Statistical Dependency Measurement." ICML 2026. (*Equal contribution.)

talks

InfoNet: Neural Estimation of Mutual Information without Test-time Optimization

Published: July 01, 2024

Oral presentation at ICML 2024 of our paper "InfoNet: Neural Estimation of Mutual Information without Test-time Optimization." The talk introduced the foundation-model paradigm for mutual information estimation: a Transformer pre-trained on large-scale synthetic distributions that estimates the MI of arbitrary 1-D variable pairs zero-shot, about 100x faster than prior neural methods.

Foundation-style Methods for Real-Time Statistical Dependency Measurement and Its Applications

Published: May 20, 2026

Invited departmental seminar at HKU ECE (20 May 2026, 11:00 AM – 12:00 PM).

teaching

ELEC 3544 — Data Science with Foundation Models (TA)

Undergraduate course, Department of Electrical and Electronic Engineering, HKU, 2024

Teaching assistant for ELEC 3544 — Data Science with Foundation Models at HKU ECE, for the 2024, 2025, and 2026 offerings.