# Systematic Study of Jason Wei's Papers

**Repository Path**: Zen07/learning-jason-wei

## Basic Information

- **Project Name**: 系统学习Jason Wei的文章 (Systematic Study of Jason Wei's Papers)
- **Description**: Systematic learning notes on Jason Wei's research, CoT, and LLM reasoning.
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2025-11-26
- **Last Updated**: 2025-12-05

## Categories & Tags

**Categories**: Uncategorized

**Tags**: Paper study

## README

# Jason Wei: A Zero-to-Hero Learning Path

> "The best way to learn is to reproduce."

This repository is a systematic study of the research of **Jason Wei** (Meta/OpenAI/Google Brain), from simple data augmentation all the way to the emergent reasoning abilities of very large models.

**Goal:** Don't just read the papers. **Understand the intuition behind them, reproduce the experimental results, and build a complete mental model.**

---

## 🗺️ Mental Model (Directory Structure)

The originally scattered papers are organized into one clear learning path:

```text
learning-jason-wei/
├── 00-Foundations/        # The hacker's starting point: data augmentation (EDA)
├── 01-Instruction-Tuning/ # The paradigm shift: FLAN & instruction following
├── 02-Reasoning-CoT/      # Where the "magic" lives: chain of thought & System 2 thinking
├── 03-Scaling-Laws/       # The physics: emergence, inverse scaling & U-shaped curves
├── 04-Safety-Alignment/   # Guardrails: jailbreak attack/defense & deliberative alignment (o1)
├── 05-Agents-Search/      # The application layer: browsing, RAG & agents
├── 06-Domain-Specific/    # Domain expertise: medical AI (Med-PaLM, HealthBench)
├── 07-Linguistics/        # Digging deeper: early NLP & the probabilistic view
└── assets/
```

---

## 📚 Curriculum & Checklist

### Phase 0: The Hacker's Starting Point (Start Here)

*Before the LLM era came the era of clever engineering. Start here to build confidence.*

- [ ] **[2019] Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks (EDA)**
  - *Why it's recommended:* Simple and effective; you can code it up in 10 minutes (see the sketch right after this phase).
  - 🔗 [arXiv](https://arxiv.org/abs/1901.11196)
  - 📝 [Notes & code](./00-Foundations/EDA/notes.md)
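To make the EDA entry concrete, here is a minimal, self-contained sketch of the paper's four operations (synonym replacement, random insertion, random swap, random deletion). It is an illustration under simplifying assumptions, not the authors' implementation: a tiny hard-coded synonym table stands in for WordNet so the script runs with no extra dependencies.

```python
"""Toy sketch of the four EDA operations (Wei & Zou, 2019).

Not the reference implementation: a hard-coded synonym table replaces
WordNet so that this file runs without any extra dependencies.
"""
import random

# Hypothetical stand-in for a real synonym source such as WordNet.
TOY_SYNONYMS = {
    "good": ["great", "fine"],
    "movie": ["film"],
    "boring": ["dull", "tedious"],
}


def synonym_replacement(words, n=1):
    """Replace up to n words that have an entry in the synonym table."""
    words = words[:]
    candidates = [i for i, w in enumerate(words) if w in TOY_SYNONYMS]
    random.shuffle(candidates)
    for i in candidates[:n]:
        words[i] = random.choice(TOY_SYNONYMS[words[i]])
    return words


def random_insertion(words, n=1):
    """Insert a synonym of a random word at a random position, n times."""
    words = words[:]
    for _ in range(n):
        candidates = [w for w in words if w in TOY_SYNONYMS]
        if not candidates:
            break
        synonym = random.choice(TOY_SYNONYMS[random.choice(candidates)])
        words.insert(random.randrange(len(words) + 1), synonym)
    return words


def random_swap(words, n=1):
    """Swap two random positions, n times."""
    words = words[:]
    if len(words) < 2:
        return words
    for _ in range(n):
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p=0.1):
    """Drop each word with probability p (always keep at least one word)."""
    kept = [w for w in words if random.random() > p]
    return kept or [random.choice(words)]


if __name__ == "__main__":
    sentence = "the movie was good but a little boring".split()
    for op in (synonym_replacement, random_insertion, random_swap, random_deletion):
        print(op.__name__, ":", " ".join(op(sentence)))
```

In the paper, the gain comes from generating several augmented copies of each training sentence with these operations and adding them to the classifier's training set.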
### Phase 1: Instruction Tuning (Teaching the Model to Follow Commands)

*How we went from "predict the next word" to "a useful assistant".*

- [ ] **[2022] Finetuned Language Models Are Zero-Shot Learners (FLAN)**
  - 🔗 [arXiv](https://arxiv.org/abs/2109.01652) | [Google Blog](https://ai.googleblog.com/2021/10/introducing-flan-more-generalizable.html)
  - 📝 [Notes](./01-Instruction-Tuning/FLAN/notes.md)
- [ ] **[2023] The Flan Collection: Designing Data and Methods for Effective Instruction Tuning**
  - 🔗 [arXiv](https://arxiv.org/abs/2301.13688) | [Google Blog](https://ai.googleblog.com/2023/02/the-flan-collection-advancing-open.html)
  - 📝 [Notes](./01-Instruction-Tuning/Flan-Collection/notes.md)
- [ ] **[2022] Scaling Instruction-Finetuned Language Models (FLAN-PaLM)**
  - 🔗 [arXiv](https://arxiv.org/abs/2210.11416) | [Google Blog](https://research.google/blog/scaling-instruction-finetuned-language-models/)
  - 📝 [Notes](./01-Instruction-Tuning/Scaling-FLAN/notes.md)
- [ ] **[2024] Mixture-of-Experts Meets Instruction Tuning**
  - 🔗 [arXiv](https://arxiv.org/abs/2305.14705)
  - 📝 [Notes](./01-Instruction-Tuning/MoE-Instruct/notes.md)

### Phase 2: Reasoning & Chain of Thought (The Core Chapter)

*The moment "let's think step by step" was born. This is the most critical part.*

- [x] **[2022] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models**
  - *The bible-level paper.* (See the prompt-construction sketch after this phase's list.)
  - 🔗 [arXiv](https://arxiv.org/abs/2201.11903) | [Google Blog](https://ai.googleblog.com/2022/05/language-models-perform-reasoning-via.html)
  - 📝 [Notes](./02-Reasoning-CoT/Chain-of-Thought/notes.md)
- [ ] **[2023] Self-Consistency Improves Chain of Thought Reasoning**
  - 🔗 [arXiv](https://arxiv.org/abs/2203.11171)
  - 📝 [Notes](./02-Reasoning-CoT/Self-Consistency/notes.md)
- [ ] **[2023] Least-to-Most Prompting Enables Complex Reasoning**
  - 🔗 [arXiv](https://arxiv.org/abs/2205.10625)
  - 📝 [Notes](./02-Reasoning-CoT/Least-to-Most/notes.md)
- [ ] **[2023] Language Models are Multilingual Chain-of-Thought Reasoners**
  - 🔗 [arXiv](https://arxiv.org/abs/2210.03057)
  - 📝 [Notes](./02-Reasoning-CoT/Multilingual-CoT/notes.md)
- [ ] **[2023] Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them**
  - 🔗 [arXiv](https://arxiv.org/abs/2210.09261)
  - 📝 [Notes](./02-Reasoning-CoT/BIG-Bench/notes.md)
- [ ] **[2023] Mind's Eye: Grounded Language Model Reasoning through Simulation**
  - 🔗 [arXiv](https://arxiv.org/abs/2210.05359)
  - 📝 [Notes](./02-Reasoning-CoT/Minds-Eye/notes.md)
- [ ] **[2022] PaLM: Scaling Language Modeling with Pathways**
  - 🔗 [arXiv](https://arxiv.org/abs/2204.02311) | [Google Blog](https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html)
  - 📝 [Notes](./02-Reasoning-CoT/PaLM/notes.md)
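To ground the core idea of Phase 2, here is a small sketch of few-shot chain-of-thought prompting: each exemplar contains the intermediate reasoning, not just the final answer. The exemplars below are hand-written in the style of the paper's GSM8K prompts, and there is no model call here; the script only assembles and prints the prompt you would send to an LLM.

```python
"""Minimal sketch of few-shot chain-of-thought prompt construction.

Hand-written exemplars in the style of the CoT paper's arithmetic prompts;
plug the resulting prompt into whatever LLM client you actually use.
"""

COT_EXEMPLARS = [
    {
        "question": "Roger has 5 tennis balls. He buys 2 more cans of tennis "
                    "balls. Each can has 3 tennis balls. How many tennis "
                    "balls does he have now?",
        "reasoning": "Roger started with 5 balls. 2 cans of 3 tennis balls "
                     "each is 6 tennis balls. 5 + 6 = 11.",
        "answer": "11",
    },
    {
        "question": "There were 9 computers in the server room. Five more "
                    "computers were installed each day, from Monday to "
                    "Thursday. How many computers are now in the server room?",
        "reasoning": "4 days of 5 computers each is 20 computers. "
                     "9 + 20 = 29.",
        "answer": "29",
    },
]


def build_cot_prompt(question: str) -> str:
    """Assemble a few-shot prompt whose exemplars include reasoning chains."""
    parts = []
    for ex in COT_EXEMPLARS:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {ex['reasoning']} The answer is {ex['answer']}.\n"
        )
    parts.append(f"Q: {question}\nA:")
    return "\n".join(parts)


if __name__ == "__main__":
    # In a real experiment this prompt would be sent to a large model;
    # here we just print it to show the format.
    print(build_cot_prompt(
        "A juggler has 16 balls. Half of the balls are golf balls, and half "
        "of the golf balls are blue. How many blue golf balls are there?"
    ))
```

The key design choice, per the paper, is only in the prompt: the reasoning chain ends with a fixed phrase ("The answer is ...") so the final answer can be extracted programmatically.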
### Phase 3: The Physics of LLMs (Scaling Laws)

*Understand the "why", and the strange behaviors that come with scale.*

- [ ] **[2022] Emergent Abilities of Large Language Models**
  - 🔗 [arXiv](https://arxiv.org/abs/2206.07682) | [Google Blog](https://ai.googleblog.com/2022/11/characterizing-emergent-phenomena-in.html)
  - 📝 [Notes](./03-Scaling-Laws/Emergence/notes.md)
- [ ] **[2023] Inverse Scaling Can Become U-shaped**
  - 🔗 [arXiv](https://arxiv.org/abs/2211.02011)
  - *Key question: why do larger models sometimes get worse on certain tasks before recovering?*
  - 📝 [Notes](./03-Scaling-Laws/Inverse-Scaling/notes.md)
- [ ] **[2023] Transcending Scaling Laws with 0.1% Extra Compute (UL2)**
  - 🔗 [arXiv](https://arxiv.org/abs/2210.11399)
  - 📝 [Notes](./03-Scaling-Laws/UL2/notes.md)
- [ ] **[2023] Larger Language Models do In-Context Learning Differently**
  - 🔗 [arXiv](https://arxiv.org/abs/2303.03846) | [Google Blog](https://ai.googleblog.com/2023/05/larger-language-models-do-in-context.html)
  - 📝 [Notes](./03-Scaling-Laws/ICL-Differently/notes.md)
- [ ] **[2024] A Pretrainer's Guide to Training Data**
  - 🔗 [arXiv](https://arxiv.org/abs/2305.13169)
  - 📝 [Notes](./03-Scaling-Laws/Data-Guide/notes.md)

### Phase 4: Safety & Alignment (System 2 Thinking)

*The o1 era: reasoning in the service of safety.*

- [ ] **[2024] Deliberative Alignment: Reasoning Enables Safer Language Models**
  - 🔗 [OpenAI Blog](https://openai.com/index/deliberative-alignment/)
  - 📝 [Notes](./04-Safety-Alignment/Deliberative-Alignment/notes.md)
- [ ] **[2024] Measuring Short-form Factuality in Large Language Models (SimpleQA)**
  - 🔗 [OpenAI Blog](https://openai.com/index/simpleqa/)
  - 📝 [Notes](./04-Safety-Alignment/SimpleQA/notes.md)
- [ ] **[2023] Jailbroken: How Does LLM Safety Training Fail?**
  - 🔗 [arXiv](https://arxiv.org/abs/2307.02483)
  - 📝 [Notes](./04-Safety-Alignment/Jailbroken/notes.md)

### Phase 5: Agents, RAG & the Internet (2024-2025)

*From generating text to taking actions.*

- [ ] **[2025] BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents**
  - 🔗 [OpenAI Blog](https://openai.com/index/browsecomp/)
  - 📝 [Notes](./05-Agents-Search/BrowseComp/notes.md)
- [ ] **[2024] FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation**
  - 🔗 [arXiv](https://arxiv.org/abs/2310.03214)
  - 📝 [Notes](./05-Agents-Search/FreshLLMs/notes.md)

### Phase 6: Domain Mastery (Healthcare)

*Applying LLMs in high-stakes domains.*

- [ ] **[2025] HealthBench: Evaluating Large Language Models Towards Improved Human Health**
  - 🔗 [arXiv](https://arxiv.org/abs/2501.XXXXX) (search link) | [OpenAI Blog](https://openai.com/index/) (placeholder)
  - 📝 [Notes](./06-Domain-Specific/HealthBench/notes.md)
- [ ] **[2023] Large Language Models Encode Clinical Knowledge (Med-PaLM)**
  - 🔗 [Nature](https://www.nature.com/articles/s41586-023-06291-2) | [arXiv](https://arxiv.org/abs/2212.13138)
  - 📝 [Notes](./06-Domain-Specific/Med-PaLM/notes.md)

### Phase 7: Digging Deeper (Linguistics & Early Work)

*For the perfectionists. Tracing things back to the source.*

- [ ] **[2022] A Recipe for Arbitrary Text Style Transfer** -> 📝 [Notes](./07-Linguistics/Style-Transfer/notes.md)
- [ ] **[2022] The MultiBERTs** -> 📝 [Notes](./07-Linguistics/MultiBERTs/notes.md)
- [ ] **[2021] Frequency Effects on Syntactic Rule Learning** -> 📝 [Notes](./07-Linguistics/Frequency-Effects/notes.md)
- [ ] **[2021] Good-enough Example Extrapolation** -> 📝 [Notes](./07-Linguistics/Example-Extrapolation/notes.md)
- [ ] **[2021] A Cognitive Regularizer for Language Modeling** -> 📝 [Notes](./07-Linguistics/Cognitive-Regularizer/notes.md)
- [ ] **[2021] Mitigating Political Bias in Language Models** -> 📝 [Notes](./07-Linguistics/Political-Bias/notes.md)
- [ ] **[2021] A Survey of Data Augmentation Approaches for NLP** -> 📝 [Notes](./07-Linguistics/Data-Aug-Survey/notes.md)

### *(For the full list of papers, see [Jason Wei's Papers](https://www.jasonwei.net/papers))*

---

## 🎧 Intuition Pumps (Blogs & Talks)

Before diving into the formulas, read the blog posts and watch the talks. They are organized by theme; start with the "core intuitions".

### ✍️ Blog Posts

#### 🌟 Core Intuitions (Must Reads)

- **[2025] [Asymmetry of verification and verifier's law](https://www.jasonwei.net/blog/asymmetry-of-verification-and-verifiers-law)**
  *Key to understanding System 2 and o1-style reasoning.*
- **[2023] [Some intuitions about large language models](https://www.jasonwei.net/blog/some-intuitions-about-large-language-models)**
  *Must-read: a classic introduction to LLMs.*

#### 🧠 Research Philosophy & Career Advice

- **[2025] [Life lessons from reinforcement learning](https://www.jasonwei.net/blog/life-lessons-from-reinforcement-learning)**
  *Exploration vs. exploitation as a philosophy of life.*
- **[2025] [AI research is a max-performance domain](https://www.jasonwei.net/blog/ai-research-is-a-max-performance-domain)**
  *Why is one outstanding strength enough?*
- **[2025] [Dopamine cycles in AI research](https://www.jasonwei.net/blog/e2jwko63n64ot4kkvqdd7nlejd2m3x)**
  *How to manage the anxiety and excitement of research.*
- **[2023] [Practicing AI research](https://www.jasonwei.net/blog/practicing-ai-research)**
- **[2023] [Observations from tracking Twitter](https://www.jasonwei.net/blog/tweet-tracking)**
- **[2022] [Research I enjoy](https://www.jasonwei.net/blog/research-i-enjoy)**

#### 🔬 Technical Deep Dives

- **[2024] [Successful language model evals](https://www.jasonwei.net/blog/evals)**
- **[2023] [Common arguments regarding emergent abilities](https://www.jasonwei.net/blog/common-arguments-regarding-emergent-abilities)**
- **[2022] [137 emergent abilities of large language models](https://www.jasonwei.net/blog/emergence)**

### *(For the original posts, see [Jason Wei's Blog](https://www.jasonwei.net/blog))*

---

### 📺 Talks & Media

#### 🎓 Keynotes & Lectures

- **[2025 Oct] Stanford AI Club: 3 Key Ideas in AI in 2025** ([YouTube](https://www.youtube.com/watch?v=b6Doq2fz81U))
  *The most recent talk: three core judgments about the future of AI.*
- **[2024 Apr] Guest lecture, Stanford CS25** ([YouTube](https://www.youtube.com/watch?v=3gb-ZkVRemQ))
  *An advanced take on intuitions about large models.*
- **[2023 Jan] Guest lecture, Stanford CS25**
  *A classic introduction: "Some intuitions about LLMs".*

#### 🍓 OpenAI o1 Demos

- **[2024 Dec] OpenAI o1 in ChatGPT** ([YouTube](https://www.youtube.com/watch?v=iBfQTnA2n2s))
- **[2024 Sep] Writing puzzles with OpenAI o1** ([YouTube](https://www.youtube.com/watch?v=AjmkEvuNl7w))
- **[2024 Sep] Video game coding with OpenAI o1** ([YouTube](https://www.youtube.com/watch?v=T0IrhzrhR40))

#### 🏫 Academic Talks (Archive)

*(Note: these talks usually cover the latest papers at the time; search by year for the corresponding slides.)*

- **2025**: Columbia DAPLab, Stanford GSB.
- **2024**: Princeton PLI, UPenn CIS7000, UC Berkeley AI Summit, OpenAI DevDay SF, Stanford NLP Seminar, WebConf LLM Day.
- **2023**: Samsung AI Forum, UC Berkeley ML, KDD LLM Day, Dartmouth, NYU, MIT, Harvard.
- **2022**: Stanford CS224v, NYU, JHU, Amazon AWS, Princeton NLP.

### *(For the originals, see [Jason Wei's homepage](https://www.jasonwei.net/))*

---
## ⚡ How to Use This Repository

1. **Clone the repo:** `git clone https://gitee.com/YOUR_USERNAME/learning-jason-wei.git`
2. **Pick a phase:** Don't try to swallow everything at once. Starting with **Phase 0** or **Phase 2** is recommended.
3. **Read and reproduce:**
   - Open `02-Reasoning-CoT/Chain-of-Thought/notes.md`.
   - Read the paper.
   - **Run the code** (a `demo.py` slot is reserved in each relevant folder; a rough sketch of what such a demo could look like is at the end of this README).
4. **Write it up / teach it:** Share what you learned in an Issue or PR.

> "Understanding is not a state of being, it's a process of doing."

Happy Hacking! 🚀
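---

## 🧪 Appendix: A Possible Starting Point for `demo.py`

To give step 3 above a concrete starting point, here is a rough sketch of one Phase 2 experiment: self-consistency decoding on top of chain-of-thought prompting (the Self-Consistency paper in Phase 2). The idea is to sample several reasoning chains at non-zero temperature, extract each chain's final answer, and keep the majority answer. `sample_chain` is a hypothetical stub so the script runs without an API key; replace it with a real LLM call before drawing any conclusions.

```python
"""Sketch of self-consistency over chain-of-thought answers.

`sample_chain` is a placeholder standing in for one sampled model
completion; swap in a real LLM client for actual experiments.
"""
import random
import re
from collections import Counter
from typing import Optional


def sample_chain(prompt: str, temperature: float = 0.7) -> str:
    """Hypothetical stub: returns a canned reasoning chain instead of calling a model."""
    # The prompt/temperature arguments only mirror what a real call would take.
    return random.choice([
        "Half of 16 is 8 golf balls. Half of 8 is 4. The answer is 4.",
        "16 / 2 = 8, and 8 / 2 = 4. The answer is 4.",
        "16 - 8 = 8. The answer is 8.",  # a faulty chain, to be outvoted
    ])


def extract_answer(chain: str) -> Optional[str]:
    """Pull the final number after 'the answer is', matching the CoT prompt format."""
    match = re.search(r"answer is\s*(-?\d+(?:\.\d+)?)", chain, re.IGNORECASE)
    return match.group(1) if match else None


def self_consistency(prompt: str, num_samples: int = 10) -> Optional[str]:
    """Sample several reasoning chains and majority-vote over their answers."""
    answers = [extract_answer(sample_chain(prompt)) for _ in range(num_samples)]
    answers = [a for a in answers if a is not None]
    return Counter(answers).most_common(1)[0][0] if answers else None


if __name__ == "__main__":
    question = ("A juggler has 16 balls. Half of the balls are golf balls, "
                "and half of the golf balls are blue. How many blue golf "
                "balls are there?")
    print("Majority answer:", self_consistency(question))
```

Replacing the stub with real sampled completions (and reusing the prompt builder from the Phase 2 sketch) turns this into the minimal reproduction loop the checklist asks for.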