Junxiong Wang
I obtained my PhD in Computer Science from Cornell University, where I worked at the intersection of large language models and systems, with a focus on linear models and their hybrid variants.
I lead multiple research projects at Together AI, including adaptive speculative decoding (ATLAS), inference-time training (Aurora), and efficient RL rollouts.
Recent Publications
- Monishwaran Maheswaran, Leon Lakhani, Zhongzhu Zhou, Shijia Yang, Junxiong Wang, Coleman Richard Charles Hooper, Yuezhou Hu, Rishabh Tiwari, Jue Wang, Harman Singh, Qingyang Wu, Ce Zhang, Kurt Keutzer, Tri Dao, Xiaoxia Wu, Ben Athiwaratkun, James Zou, Chenfeng Xu
Squeeze Evolve: A Unified Multi-Model Orchestration Framework for Verifier-Free Evolution, 2026
- Yifan Yu, Yuqing Jian, Junxiong Wang, Zhongzhu Zhou, Donglin Zhuang, Xinyu Fang, Xiaoxia Wu, Qingyang Wu, Shuaiwen Leon Song, Tri Dao, Ben Athiwaratkun, James Zou, Fan Lai, Chenfeng Xu
Introspective Diffusion Language Models, 2026
- Junxiong Wang*†, Fengxiang Bie*†, Jisen Li†, Zhongzhu Zhou†, Zelei Shao†, Yubo Wang†, Yinghui Liu†, Qingyang Wu, Avner May, Sri Yanamandra, Yineng Zhang, Ce Zhang, Tri Dao, Percy Liang, Ben Athiwaratkun, Shuaiwen Leon Song, Chenfeng Xu†, Xiaoxia Wu†
When RL Meets Adaptive Speculative Training: A Unified Training-Serving System, ICML 2026
- Hao Kang, Ziyang Li, Xinyu Yang, Weili Xu, Yinfang Chen, Junxiong Wang, Beidi Chen, Tushar Krishna, Chenfeng Xu, Simran Arora
ThunderAgent: A Fast, Simple, and Program-Aware Agentic Inference System, ICML 2026
- Costin-Andrei Oncescu, Qingyang Wu, Wai Tong Chung, Robert Wu, Bryan Gopal, Junxiong Wang, Tri Dao, Ben Athiwaratkun
Opportunistic Expert Activation: Batch-Aware Expert Routing for Faster Decode Without Retraining, ICML 2026
- Harman Singh, Xiuyu Li, Kusha Sareen, Monishwaran Maheswaran, Sijun Tan, Xiaoxia Wu, Junxiong Wang, Alpay Ariyak, Qingyang Wu, Samir Khaki, Rishabh Tiwari, Long Lian, Yucheng Lu, Boyi Li, Alane Suhr, Ben Athiwaratkun, Kurt Keutzer
V1: Unifying Generation and Self-Verification for Parallel Reasoners, ICML 2026
- Zelei Shao*, Vikranth Srivatsa*, Sanjana Srivastava, Qingyang Wu, Alpay Ariyak, Xiaoxia Wu, Ameen Patel, Jue Wang, Percy Liang, Tri Dao, Ce Zhang, Yiying Zhang, Ben Athiwaratkun, Chenfeng Xu, Junxiong Wang
Beat the long tail: Distribution-Aware Speculative Decoding for RL Training
Conference on Machine Learning and Systems (MLSys), 2026
- Haojun Xia*, Xiaoxia Wu*, Jisen Li*, Robert Wu, Junxiong Wang, Jue Wang, Chenxi Li, Aman Singhal, Alay Dilipbhai Shah, Alpay Ariyak, Donglin Zhuang, Zhongzhu Zhou, Ben Athiwaratkun, Zhen Zheng, Shuaiwen Leon Song
Kitty: Accurate and Efficient 2-bit KV Cache Quantization with Dynamic Channel-wise Precision Boost
Conference on Machine Learning and Systems (MLSys), 2026
- Jiaqi Leng*, Xiang Hu*, Junxiong Wang, Jianguo Li, Wei Wu, Yucheng Lu
Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models
International Conference on Learning Representations (ICLR), 2026
- Zhongzhu Zhou, Fengxiang Bie, Ziyan Chen, Zhenyu Zhang, Yibo Yang, Junxiong Wang, Ben Athiwaratkun, Xiaoxia Wu, Shuaiwen Leon Song
CARE: Covariance-Aware and Rank-Enhanced Decomposition for Enabling Multi-Head Latent Attention
International Conference on Learning Representations (ICLR), 2026
- Woojeong Kim, Junxiong Wang, Jing Nathan Yan, Mohamed S. Abdelfattah, Alexander M. Rush
Overfill: Two-Stage Models for Efficient Language Model Decoding
Conference on Language Modeling (CoLM), 2025
- Junxiong Wang, Wen-Ding Li, Daniele Paliotta, Daniel Ritter, Alexander M. Rush, Tri Dao
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Workshop on Efficient Reasoning (Best Paper Award), Neural Information Processing Systems (NeurIPS), 2025
- Junxiong Wang*, Daniele Paliotta*, Avner May, Alexander M. Rush, Tri Dao
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Models, Video, Code, Blog
Neural Information Processing Systems (NeurIPS), 2024
A shorter version appeared at the 2nd Workshop on Efficient Systems for Foundation Models (ES-FoMo), ICML 2024
Email: Firstname@cs.cornell.edu / Github / HuggingFace Models / Papers / Twitter