1 Star 4 Fork 0

Gitee 极速下载/transformer-debugger

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
此仓库是为了提升国内下载速度的镜像仓库,每日同步一次。 原始仓库: https://github.com/openai/transformer-debugger
克隆/下载
贡献代码
同步代码
取消
提示: 由于 Git 不支持空文件夾,创建文件夹后会生成空的 .keep 文件
Loading...
README
MIT

Transformer Debugger

Transformer Debugger (TDB) is a tool developed by OpenAI's Superalignment team with the goal of supporting investigations into specific behaviors of small language models. The tool combines automated interpretability techniques with sparse autoencoders.

TDB enables rapid exploration before needing to write code, with the ability to intervene in the forward pass and see how it affects a particular behavior. It can be used to answer questions like, "Why does the model output token A instead of token B for this prompt?" or "Why does attention head H attend to token T for this prompt?" It does so by identifying specific components (neurons, attention heads, autoencoder latents) that contribute to the behavior, showing automatically generated explanations of what causes those components to activate most strongly, and tracing connections between components to help discover circuits.

These videos give an overview of TDB and show how it can be used to investigate indirect object identification in GPT-2 small:

What's in the release?

  • Neuron viewer: A React app that hosts TDB as well as pages with information about individual model components (MLP neurons, attention heads and autoencoder latents for both).
  • Activation server: A backend server that performs inference on a subject model to provide data for TDB. It also reads and serves data from public Azure buckets.
  • Models: A simple inference library for GPT-2 models and their autoencoders, with hooks to grab activations.
  • Collated activation datasets: top-activating dataset examples for MLP neurons, attention heads and autoencoder latents.

Setup

Follow these steps to install the repo. You'll first need python/pip, as well as node/npm.

Though optional, we recommend you use a virtual environment or equivalent:

# If you're already in a venv, deactivate it.
deactivate
# Create a new venv.
python -m venv ~/.virtualenvs/transformer-debugger
# Activate the new venv.
source ~/.virtualenvs/transformer-debugger/bin/activate

Once your environment is set up, follow the following steps:

git clone git@github.com:openai/transformer-debugger.git
cd transformer-debugger

# Install neuron_explainer
pip install -e .

# Set up the pre-commit hooks.
pre-commit install

# Install neuron_viewer.
cd neuron_viewer
npm install
cd ..

To run the TDB app, you'll then need to follow the instructions to set up the activation server backend and neuron viewer frontend.

Making changes

To validate changes:

  • Run pytest
  • Run mypy --config=mypy.ini .
  • Run activation server and neuron viewer and confirm that basic functionality like TDB and neuron viewer pages is still working

Links

How to cite

Please cite as:

Mossing, et al., “Transformer Debugger”, GitHub, 2024.

BibTex citation:

@misc{mossing2024tdb,
  title={Transformer Debugger},
  author={Mossing, Dan and Bills, Steven and Tillman, Henk and Dupré la Tour, Tom and Cammarata, Nick and Gao, Leo and Achiam, Joshua and Yeh, Catherine and Leike, Jan and Wu, Jeff and Saunders, William},
  year={2024},
  publisher={GitHub},
  howpublished={\url{https://github.com/openai/transformer-debugger}},
}
MIT License Copyright (c) 2024 OpenAI Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

简介

Transformer Debugger 是 OpenAI 的 Superalignment 团队开发的一款工具,旨在支持对小语言模型的特定行为进行研究 展开 收起
README
MIT
取消

发行版

暂无发行版

贡献者

全部

近期动态

不能加载更多了
马建仓 AI 助手
尝试更多
代码解读
代码找茬
代码优化
Python
1
https://gitee.com/mirrors/transformer-debugger.git
git@gitee.com:mirrors/transformer-debugger.git
mirrors
transformer-debugger
transformer-debugger
main

搜索帮助