
Prof. QIU Shuang

BEng(USTB), MS(UCAS), PhD(U Michigan)

Assistant Professor

Contact Information

Office: P6605 YEUNG
Phone: 34422118
Fax: 34420173
Email: shuanqiu@cityu.edu.hk
Web: Personal Homepage

Research Interests

  • Machine Learning
  • Reinforcement Learning
  • Large Language Models
  • Embodied AI
  • Optimization
Shuang Qiu received his Ph.D. degree in Computer Science and Engineering from the University of Michigan, Ann Arbor. He is now an assistant professor in the Department of Systems Engineering, City University of Hong Kong. His primary research interests include both the applications and the theory of machine learning, reinforcement learning, large language models, embodied AI, and optimization.


Selected Publications

  • Chenjia Bai, Yang Zhang, Shuang Qiu, Qiaosheng Zhang, Kang Xu, Xuelong Li. (2025). Online Preference Alignment for Language Models via Count-based Exploration. International Conference on Learning Representations (ICLR Spotlight)
  • Shuang Qiu*, Boxiang Lyu*, Qinglin Meng*, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan. (2024). Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach. Journal of Machine Learning Research (JMLR)
  • Rui Yang*, Xiaoman Pan*, Feng Luo*, Shuang Qiu*, Han Zhong, Dong Yu, Jianshu Chen. (2024). Rewards-in-Context: Multi-Objective Alignment of Foundation Models with Dynamic Preference Adjustment. International Conference on Machine Learning (ICML)
  • Dake Zhang, Boxiang Lyu, Shuang Qiu#, Mladen Kolar, Tong Zhang. (2024). Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning. International Conference on Machine Learning (ICML Spotlight)
  • Shuang Qiu*, Ziyu Dai*, Han Zhong, Zhaoran Wang, Zhuoran Yang, Tong Zhang. (2023). Posterior Sampling for Competitive RL: Function Approximation and Partial Observation. Advances in Neural Information Processing Systems (NeurIPS)
  • Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang. (2022). Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning. International Conference on Machine Learning (ICML)
  • Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang. (2021). Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions. International Conference on Machine Learning (ICML)
  • Shuang Qiu, Xiaohan Wei, Zhuoran Yang, Jieping Ye, Zhaoran Wang. (2020). Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss. Advances in Neural Information Processing Systems (NeurIPS)


PhD & RA Openings

  • I am actively seeking self-motivated students with strong mathematical or programming skills for PhD and RA positions. If you have a background in, or are passionate about, the applied and theoretical aspects of:
    • Reinforcement Learning,
    • Large Language Models,
    • Embodied AI,
    • Generative Models (e.g., Diffusion Models),
    • Optimization,
    please do not hesitate to email me your CV and transcript.


Last update date: 07 Mar 2025