PKU-ML-Agent Group Seminar

Make every presentation deserve to be recorded.

The PKU-ML-Agent group focuses mainly on machine learning, reinforcement learning, active learning, data-centric AI, LLMs, agents, and related topics.

2026 SPRING

[2026-01-07] TTT_KVB/TTT_E2E. [pdf] Xin-Lin Peng
[2026-01-14] Task Arithmetic in the Tangent Space. [pdf] Qiu-He Hong
[2026-01-21] Neural Tangent Kernel, NTK. [pdf] Tian-Tian Peng
[2026-03-04] JiT: Back to Basics: Let Denoising Generative Models Denoise. [pdf] Yu-Wei Niu
[2026-03-11] SkillsBench. [pdf] Liu-Zheng-Hao Lv
[2026-03-18] OPSD: Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models. [pdf] Jing-Ya Wang
[2026-03-25] FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization. [pdf] Shuo Yang

2025 FALL (post-merger)

[2025-12-19] The ML Group and the Agent Group merged, ushering in a new era of AI research.

[2025-09-17] Biomni: A General-Purpose Biomedical AI Agent. [pdf] Jing-Ya Wang
[2025-10-15] DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively. [pdf] Jing-Ya Wang
[2025-11-19] Memory-R1. [pdf] Liu-Zheng-Hao Lv
[2025-11-26] CRESt: A multimodal robotic platform. [pdf] Yu-Yang Gao
[2025-12-04] AgentEvolver: Towards Efficient Self Evolving Agent System. [pdf] Jing-Ya Wang
[2025-12-10] A Survey of Self-Evolving Agents. [pdf] Yu-Yang Liu
[2025-12-17] TTS for MAS. [pdf] Hong-Yang Li
[2025-12-24] TextGrad. [pdf] Yao Xin
[2025-12-24] COVT. [pdf] Yu-Lu Zhou

2025 FALL

[2025-09-05] MemP: Exploring Agent Procedural Memory. [pdf] Qiu-He Hong
[2025-09-12] LLMs Post-training Dreams, Reality, and Fallacies. [pdf] Shuo Yang
[2025-09-19] Generalization of CLIP. [pdf] Tian-Tian Peng
[2025-09-26] Unify Models. [pdf] Yu-Wei Niu
[2025-10-17] Learning to See Before Seeing. [pdf] Yu-Lu Zhou
[2025-10-31] The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models. [pdf] Xin-Lin Peng
[2025-11-07] On Policy Distillation. [pdf] Shuo Yang
[2025-11-28] Calibration, Self-Evaluation, Hallucination. [pdf] Qiu-He Hong
[2025-12-05] Is CLIP ideal? [pdf] Tian-Tian Peng
[2025-12-12] Cambrian-S: Towards Spatial Supersensing in Video. [pdf] Yu-Wei Niu

2025 SPRING

[2025-02-21] DeepSeek Series. [pdf] Shuo Yang
[2025-02-28] World Model & Reasoning Attack. [pdf] Jia-Yu Yao
[2025-03-07] Self Adaptive LLMs & Learning to Memorize at Test Time. [pdf] Qiu-He Hong
[2025-03-14] Modality Gap. [pdf] Tian-Tian Peng
[2025-03-21] Physics of Language Models. [pdf] Yu-Wei Niu
[2025-03-28] All Roads Lead to Likelihood & Exploring the Visual Shortcomings of LVMs. [pdf] Yu-Lu Zhou
[2025-04-11] SimVQ-VAE. [pdf] Xin-Lin Peng
[2025-04-18] A Minimaximalist Approach to Reinforcement Learning from Human Feedback. [pdf] Kun-Peng Ning
[2025-05-08] Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? [pdf] Shuo Yang
[2025-05-16] Is Cosine-Similarity of Embeddings Really About Similarity? [pdf] Qiu-He Hong
[2025-05-23] Null Space in Continual Learning. [pdf] Tian-Tian Peng
[2025-05-30] SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning. [pdf] Yu-Lu Zhou
[2025-06-13] Steer LLM Latents for Hallucination Detection. [pdf] Xin-Lin Peng

2024 FALL

[2024-08-23] Causal Estimation of Memorisation Profiles. [pdf] Shuo Yang
[2024-08-30] Genie: Generative Interactive Environments. [pdf] Zhen-Hui Liu
[2024-09-06] Semantic Uncertainty. [pdf] Kun-Peng Ning
[2024-09-20] Let’s Verify Step by Step. [pdf] Hai-Jian Ke
[2024-09-27] Rediscovering Old-School PR & ML in LLMs. [pdf] Jia-Yu Yao
[2024-10-11] Loss of plasticity in deep continual learning. [pdf] Shuo Yang
[2024-10-18] Amortizing Intractable Inference in Large Language Models. [pdf] Zhen-Hui Liu
[2024-11-01] GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models. [pdf] Kun-Peng Ning
[2024-11-08] Proving Test Set Contamination in Black Box Language Models. [pdf] Jia-Yu Yao
[2024-11-15] A Survey of Safety: Harmful Fine-Tuning Defenses. [pdf] Shuo Yang
[2024-11-22] The Super Weight in Large Language Models. [pdf] Qiu-He Hong
[2024-11-29] Open Vocabulary Object Detection. [pdf] Tian-Tian Peng
[2024-12-06] Generation & Understanding. [pdf] Yu-Wei Niu
[2024-12-13] RHO-1: Not All Tokens Are What You Need. [pdf] Zhen-Hui Liu
[2025-01-03] DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature. [pdf] Kun-Peng Ning

2024 SPRING

[2024-01-05] Attack in Multimodal Large Language Models. [pdf] Yu Wang
[2024-01-12] Direct Preference Optimization: Your Language Model is Secretly a Reward Model. [pdf] Zhen-Hui Liu
[2024-01-26] Instance Selection for In-Context Learning. [pdf] Kun-Peng Ning
[2024-03-01] Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch. [pdf] Jia-Yu Yao
[2024-03-08] Extending Context Window of Large Language Models via Positional Interpolation. [pdf] Shuo Yang
[2024-03-15] S-Prompts Learning with Pre-trained Transformers: An Occam’s Razor for Domain Incremental Learning. [pdf] Hai-Jian Ke
[2024-03-22] Multi-modality Language Model Defense/Attack. [pdf] Yu Wang
[2024-03-29] Logits of API-Protected LLMs Leak Proprietary Information. [pdf] Zhen-Hui Liu
[2024-04-12] Introduction of Mamba Model. [pdf] Mu-Nan Ning
[2024-04-19] Thought Cloning: Learning to Think while Acting by Imitating Human Thinking. [pdf] Kun-Peng Ning
[2024-04-26] Fine-Tuning Language Models with Reward Learning on Policy. [pdf] Jia-Yu Yao
[2024-05-10] RAG Survey. [pdf] Shuo Yang
[2024-05-24] Fast Inference from Transformers via Speculative Decoding. [pdf] Zhen-Hui Liu
[2024-05-31] Incremental Learning of Visual Language Models. [pdf] Hai-Jian Ke
[2024-06-07] The Platonic Representation Hypothesis. [pdf] Kun-Peng Ning
[2024-06-21] TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization. [pdf] Jia-Yu Yao
[2024-06-28] Parameter Efficient Tuning. [pdf] Shuo Yang
[2024-07-05] Rotary Position Encoding. [pdf] Zhen-Hui Liu
[2024-07-12] Brain Decode Deep Nets. [pdf] Kun-Peng Ning
[2024-07-23] Mixture of Experts Meets Prompt-Based Continual Learning. [pdf] Jia-Yu Yao
[2024-07-23] O-LoRA & AM-LoRA. [pdf] Hai-Jian Ke

2023 FALL

[2023-09-01] LLM Attacks: Universal and Transferable Adversarial Attacks on Aligned Language Models. [pdf] Zhen-Hui Liu
[2023-09-08] On Calibration of Modern Neural Networks. [pdf] Kun-Peng Ning
[2023-09-15] AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts. [pdf] Jia-Yu Yao
[2023-09-22] SAM: Segment Anything. [pdf] Hai-Jian Ke
[2023-10-13] Black Box Adversarial Attack. [pdf] Zhen-Hui Liu
[2023-10-20] Black Box Adversarial Attack on Text Classification. [pdf] Kun-Peng Ning
[2023-10-27] Reinforcement Learning to Attack Based LLM Evaluation. [pdf] Jia-Yu Yao
[2023-11-03] Class-Incremental Learning. [pdf] Hai-Jian Ke
[2023-11-10] Prompt Tuning Survey. [pdf] Shuo Yang
[2023-11-17] LLM Evaluation. [pdf] Zhen-Hui Liu
[2023-11-24] Edit Large Language Models. [pdf] Yu Wang
[2023-12-01] Eliciting Thinking Hierarchy without a Prior. [pdf] Kun-Peng Ning
[2023-12-08] R-Tuning: Teaching Large Language Models to Refuse Unknown Questions. [pdf] Jia-Yu Yao
[2023-12-15] Rainbow Memory: Continual Learning with a Memory of Diverse Samples. [pdf] Hai-Jian Ke
[2023-12-22] Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning. [pdf] Shuo Yang