Zerui Cheng (程泽瑞)

Ph.D. Candidate at Princeton Univ., AI and Blockchain Researcher

Princeton University

Hi there, I’m Zerui Cheng (in Chinese：程泽瑞)~

I am a Ph.D. candidate at Princeton University advised by Prof. Pramod Viswanath. I was a part-time student researcher at ByteDance Seed mentored by Dr. Jiashuo Liu. Before Princeton, I completed my B.Eng. in Computer Science from Yao Class at Tsinghua, graduating summa cum laude and earning the prestigious Yao Award.

My research interests are primarily centered around data, evaluation, mechanism design, and blockchains, with the goal of promoting fairness and transparency in AI era and a strong focus on real-world impact. My research has been featured in various venues including Nature, NeurIPS, ICLR, CCS, EuroSys, etc., and has contributed to the technical whitepapers of high-profile startups including Sentient, Kite AI, and PolyHedra.

Beyond research, I am a member of the Competitive Programming Hall of Fame. I served as the President of the Yao Class Students’ Congress during undergraduate; and I was once a contestant on the TV show “Super Brain” (江苏卫视“最强大脑第10季”).

I’m always open to research and industry collaborations. Feel free to contact and chat!

Google Scholar profile Curriculum Vitae

Interests

LLMs & Code Generation
Decentralized AI Systems
Blockchain & Cryptography
AI Benchmarking & Evaluation

Education

Ph.D. student (2023 - now)

Electrical and Computer Engineering, Princeton University
B.Eng. in Computer Science (2019 - 2023)

Yao Class, the Insititute for Interdisciplinary Information Sciences (IIIS), Tsinghua University

Recent Highlights

[Jan 2026] (paper acceptance)

Three papers are accepted in various venues this month!

One paper (HLE) accepted to Nature!
One paper (DeML) accepted to EuroSys 2026!
One paper (AutoCode) accepted to ICLR 2026!

[Dec 2025] (project updates)

The official websites of 2 projects accepted to NeurIPS 2025 are live now! Don’t hesitate to check them out!

Website of PeerBench Platform: The platform built upon the NeurIPS paper is live! Everyone is welcome to sign up, join the community, collectively define what is real progress in AI, and shape the future of AI evaluation!
LiveCodeBench Pro Team Website: The official website of LiveCodeBench Pro Team. Check it out for the latest updates on LiveCodeBench Pro, AutoCode, and many more exciting projects delivered by our team!

[Dec 2025] (talk)

Dec 4: Will give a talk on “Open-Source AI for Competitive Programming” at the OpenAGI Symposium at NeurIPS! Ticket here

[Dec 2025] (acceptance notification)

CAIA gets accepted and selected for oral presentation (top 10%) to AAAI26 AI4Finance!

[Oct 2025] (papers and acceptance notifications)

Several papers that I contributed to are online now, and will be presented in different venues in the near future!

First-author papers:

Paper 1: PeerBench Paradigm: We analyze the systematic challenges faced by AI benchmark paradigm today—data contamination, collusion, overfitting, etc, and propose PeerBench, a novel mechanism based on community contribution and reputation to reliably and efficiently measure data quality and build fair leaderboards. Our vision is to return AI evaluation to its role as a public good, aligning tech development with all humanity’s needs, not just those of a few giants. [Accepted to NeurIPS 2025]
Paper 2: OML Primitive: Are openness and commercial value mutually exclusive? In this paper, we go one step further than the original OML whitepaper of Sentient last year. We formalize the OML framework, exploring a path where models are open-access, but technical safeguards prevent misuse. OML offers a blueprint for sustainable, open AI governance and operation of next-gen AI. [Accepted to NeurIPS 2025 Lock-LLM]

Co-first-author papers:

Paper 3: CAIA (Crypto AI Agent Benchmark): The first AI agent benchmark in crypto and web3. Our results show that models aren’t yet reliable in this high-stakes, high-misinformation adversarial domain, and there is a giant gap to an ideal world where we can let AI reliably control users' wallets and manage real funds without risks. [Accepted to ICAIF 2025 AI4F (poster), AI-R2D2 (oral)]
Paper 4: AutoCode: The follow-up work to LiveCodeBench Pro crafted by the LiveCodeBench Pro dream team. We create a robust and efficient AI system that auto-generates coding problems to solve the data scarcity bottleneck.

Co-author papers:

Paper 5: NAO (Nondeterminism-Aware Optimistic Verification for Floating-Point Neural Networks): A crucial step for Decentralized AI, ensuring that AI inference results are reproducible and verifiable, so that the rights of end users are protected. It solves the bottleneck that we encountered in our Sakshi paper 2 years ago, and is a critical step for realizing our vision of a decentralized AI platform.
Paper 6: Kite AI Whitepaper: I co-authored the whitepaper for Kite AI as a research collaborator. Kite AI is building a native payment infrastructure for AI agents, and we depict the vision where agents transact autonomously with cryptographic accountability and traceability. Kite AI has raised $33 million from top–tier investors, including PayPal, General Catalyst, Coinbase Venture and leading blockchain foundations.

Among those, PeerBench and LiveCodeBench Pro will be presented at NeurIPS 2025 Main Conference in San Diego on Dec 3; CAIA will be presented at ICAIF 2025 in Singapore (AI4F on Nov 15, AI-R2D2 on Nov 16); and OML Primitive will be presented at NeurIPS 2025 Lock-LLM on Dec 6. Stay tuned for them!

[Oct 2025] (talk)

Oct 15: Gave a public talk on the OML primitive hosted by Decentralized AI Institute and several other amazing co-hosts. Thanks Yuxi! Link to video

[Aug 2025] (talks)

Aug 15: Gave a talk on LiveCodeBench Pro with Peiyao Sheng at AI4Science x AI Security Community. Thanks AlphaXiv!
Aug 4: Gave a talk on LiveCodeBench Pro at OpenAGI Symposium at the University of California, Berkeley. Thanks Sentient!

[Jun 2025] (papers)

The AI benchmark paper LiveCodeBench Pro that I co-first-authored is online now!

LiveCodeBench Pro: We collaborated with elite competitive programmers to launch a continuously updated benchmark, precisely evaluating model capabilities on dynamic, high-difficulty coding tasks. The paper has been covered by MIT Tech Review and has already accumulated more than 1 million views on X [Accepted to NeurIPS 2025];

[Jun 2025] (talk)

Gave an online talk on OML at Bethge Lab of the University of Tübingen. Thanks Prof. Matthias Bethge!

[May 2025] (personal update)

Passed my Ph.D. general exam. I’m officially a Ph.D. candidate now!
Thank you to all my committee members: Prof. Chi Jin, Prof. Sanjeev Kulkarni, and Prof. Pramod Viswanath!

[Apr 2025] (poster presentation)

Poster presentation on OML at Citadel Securities PhD Summit 2025. Thank you Citadel Securities!

[Mar 2025] (papers)

Two AI benchmark papers that I co-authored are online now!

Paper 1: SPIN-Bench: LLM reasoning and planning; (Update in Jul 2025: Accepted to COLM 2025);
Paper 2: Humanity's Last Exam: Ultimate test for AI capabilities;

[Sep 2024] (paper)

The whitepaper on OML: Open, Monetizable and Loyal AI is live. Don’t hesitate to check it out!
Here is the link to whitepaper.

[Aug 2024] (personal update)

Started my one-month internship as a Quantitative Researcher at JQ Investment.

[May 2024] (personal update)

Started my internship as an AI fellow at Sentient.

Papers

For most recent updates, please refer to my Google Scholar profile. Here are some selected publications.

Major Projects with Real-World Impact

OML: Open, Monetizable, Loyal AI (2024, NeurIPS 2025 Lock-LLM)
- Whitepaper for Sentient, which raised $85M seed funding
- Featured at Citadel Securities PhD Summit 2025
- Invited talk at University of Tübingen
- Pioneering AI-native cryptography for IP protection
zkBridge (ACM CCS 2022)
- Trustless cross-chain bridges using zero-knowledge proofs
- Foundation for Polyhedra Network (valued at $1 billion by the end of 2024)

Recent Papers on AI Evaluation

LiveCodeBench Pro (NeurIPS 2025) - Comprehensive, hard, and contamination-free code generation benchmark
- Featured in MIT Technology Review on Jun 24, 2025;
- Accumulated over 1 million views on X;
- Invited talks at OpenAGI Symposium and AI4Science Community at AlphaXiv;
SPIN-Bench (COLM 2025) - Strategic planning & social reasoning for LLMs
PeerBench (NeurIPS 2025) - New paradigm for AI evaluation based on peer review
Humanity’s Last Exam (2025) - Ultimate test for AI capabilities

Decentralized AI Platforms

Sakshi (2023) - Proof-of-Inference for decentralized AI
PoCW (APNET 2023) - AI computation as Proof-of-Work in blockchains