Email: zhengzaixiang [at] bytedance.com (preferred) or zaixiang.zheng [at] gmail.com
> If I don't get back to you quickly elsewhere (e.g., GitHub issues or Gmail), please don't hesitate to reach out via my ByteDance email.
About Me
I am a senior research scientist at ByteDance Research, affiliated with the AI Drug Discovery team led by Prof. Quanquan Gu, where I lead the research and development of generative protein foundation models.
My general research interest lies in deep probabilistic modeling and large-scale generative learning (e.g., LLMs and diffusion models), along with their applications to a wide spectrum of real-world problems, especially human language and AI for Science (e.g., generative protein modeling and design).
Before joining ByteDance, I completed my five-year PhD in computer science at the NJUNLP group, Nanjing University (09/2016-06/2021), advised by Prof. Jiajun Chen and Prof. Shujian Huang.
During my PhD, I spent a wonderful year at ILCC, University of Edinburgh, working with Prof. Alexandra Birch (09/2019-08/2020).
I was also a research intern at the MLNLC group, ByteDance AI Lab, working with Prof. Hao Zhou and Prof. Lei Li (09/2020-06/2021).
I received the Chinese Information Processing Society of China (CIPS) Best Doctoral Dissertation Award, the ACL 2021 Best Paper Award, and the INLG 2022 Best Short Paper Award, and was the leading contributor to the non-autoregressive translation system that won first place in the WMT 2021 German-English translation shared task.
[Hiring!] We are looking for highly motivated interns with a CS/ML background. Feel free to DM me if you are interested!
News
May 2024: our DPLM paper on a versatile diffusion protein foundation model for both generative and predictive purposes was accepted to ICML 2024! I also gave an invited talk about DPLM at the ML for Protein Engineering seminar.
Apr. 2024: our DINOISER paper on diffusion models for sequence learning was accepted to Transactions of the ACL (TACL)! DINOISER was also selected for an oral presentation at ACL 2024.
Jan. 2024: I am serving as Area Chair for ACL 2024.
Dec. 2023: I am serving as Area Chair for NAACL 2024.
Oct. 2023: I am serving as Area Chair for EACL 2024.
May-Oct. 2023: gave invited talks on deep generative sequence modeling for human languages and proteins at UC Santa Barbara, Tongji University, TechBeat, the MLNLP seminar, IWNLG, and Shanghai University of Finance & Economics.
Apr. 2023: our paper Deep Equilibrium Non-autoregressive Sequence Learning was accepted to Findings of ACL 2023!
Apr. 2023: our LM-DESIGN paper on steering protein LLMs to design protein sequences for desired folds was accepted to ICML 2023 as an oral presentation!
Nov. 2022: received the Chinese Information Processing Society of China (CIPS) Best Doctoral Dissertation Award!
Oct. 2022: gave an invited talk about deep generative modeling for natural languages at FudanNLP.
Oct. 2022: one paper on multi-task learning for non-autoregressive models was accepted to EMNLP 2022.
July 2022: our LAFT paper on cross-lingual transfer for text generation received the Best Short Paper Award at INLG!
Oct. 2021: our REDER paper on reversible machine translation was accepted to NeurIPS 2021!
July 2021: I joined ByteDance!
June 2021: Finally, I passed my viva and received my PhD degree!
Aug. 2020: I am looking for a full-time research scientist position in the industry.
Selected Publications/Preprints
[Google Scholar]
[‡: equal contributions]
[#: interns/students I mentored]
DPLM-2: A Multimodal Diffusion Protein Language Model
Xinyou Wang#, Zaixiang Zheng, Fei Ye, Dongyu Xue, Shujian Huang and Quanquan Gu
Preprint. arXiv:2410.13782, 2024
Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization
Xiangxin Zhou‡#, Dongyu Xue‡, Ruizhe Chen‡, Zaixiang Zheng, Liang Wang and Quanquan Gu
Neural Information Processing Systems (NeurIPS), 2024. Preprint: arXiv:2403.16576.
Diffusion Language Models Are Versatile Protein Learners
Xinyou Wang‡#, Zaixiang Zheng‡, Fei Ye, Dongyu Xue, Shujian Huang and Quanquan Gu
International Conference on Machine Learning (ICML), 2024. Preprint: arXiv:2402.18567, Feb. 2024.
ByteDance's ICML 2024 Research Highlight
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Jiasheng Ye‡#, Zaixiang Zheng‡, Yu Bao, Lihua Qian and Quanquan Gu
Preprint. arXiv:2308.12219, Aug. 2023
Structure-informed Language Models Are Protein Designers
Zaixiang Zheng‡, Yifan Deng‡#, Dongyu Xue, Yi Zhou, Fei Ye and Quanquan Gu
International Conference on Machine Learning (ICML), 2023. Oral presentation.
DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises
Jiasheng Ye#, Zaixiang Zheng, Yu Bao, Lihua Qian and Mingxuan Wang
Transactions of the Association for Computational Linguistics (TACL), 2024. Preprint: arXiv:2302.10025, Feb. 2023.
Oral presentation at ACL 2024
Deep Equilibrium Non-autoregressive Sequence Learning
Zaixiang Zheng, Yi Zhou and Hao Zhou
Association for Computational Linguistics (Findings of ACL), 2023
LAFT: Cross-lingual Transfer for Text Generation by Language-Agnostic Finetuning
Xianze Wu#, Zaixiang Zheng, Hao Zhou and Yong Yu
International Natural Language Generation Conference (INLG), 2022. Best Short Paper Award.
Duplex Sequence-to-Sequence Learning for Reversible Machine Translation
Zaixiang Zheng, Hao Zhou, Shujian Huang, Jiajun Chen, Jingjing Xu and Lei Li
Neural Information Processing Systems (NeurIPS), 2021
Vocabulary Learning via Optimal Transport for Neural Machine Translation
Jingjing Xu, Hao Zhou, Chun Gan, Zaixiang Zheng and Lei Li
Association for Computational Linguistics (ACL), 2021. Best Paper Award.
Improving Self-Attention Networks with Sequential Relations
Zaixiang Zheng, Shujian Huang, Rongxiang Weng, Xin-Yu Dai and Jiajun Chen
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2020
Towards Making the Most of Context in Neural Machine Translation
Zaixiang Zheng‡, Xiang Yue‡#, Shujian Huang, Jiajun Chen and Alexandra Birch
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Dynamic Past and Future for Neural Machine Translation
Zaixiang Zheng, Zhaopeng Tu, Shujian Huang, Xin-Yu Dai and Jiajun Chen
Empirical Methods in Natural Language Processing (EMNLP), 2019
Modeling Past and Future for Neural Machine Translation
Zaixiang Zheng‡, Hao Zhou‡, Shujian Huang, Lili Mou, Xin-Yu Dai, Jiajun Chen and Zhaopeng Tu
Transactions of the Association for Computational Linguistics (TACL), 2018
Presented at ACL 2018