An interactive visualization tool for understanding backpropagation in multi-layer perceptrons (MLPs), with a focus on demonstrating the inoculation prompting technique.
Visit the live demo on GitHub Pages!
This visualization demonstrates the Spanish/CAPS experiment from inoculation prompting research: teaching a model to capitalize responses while still responding in English, even when training data is always in Spanish and ALL-CAPS.
- Real-time Forward Pass: See activations propagate through the network
- Interactive Gradients: Click output neurons to compute gradients for different targets
- Bias Adjustments: Adjust biases at different layers to simulate inoculation effects
- Multiple Approaches: Compare steering vectors, logit biases, and salience effects
- Visual Feedback: Color-coded connections show gradient magnitudes and directions
A blog-style presentation that walks through the concepts step-by-step:
- Baseline network setup
- Gradient visualization during training
- Inoculation via steering vectors
- Alternative approaches (logit biases, salience effects)
- Training impact heatmaps
Full-featured environment with all controls:
- Click neurons to adjust biases
- Click output neurons to select training targets
- Edit connection weights directly
- Real-time gradient computation
- Debug panel with detailed values
- Input Layer: 1 neuron (constant value = 1.0)
- Hidden Layer: 4 neurons (English, Spanish, Uppercase, Lowercase) with ReLU activation
- Output Layer: 4 neurons (english, spanish, ENGLISH, SPANISH) with Softmax
- Loss Function: Cross-entropy
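The architecture above can be sketched in a few lines of plain JavaScript. This is a minimal illustration only; the function and variable names here are illustrative and do not reflect the actual API in network.js.

```javascript
// Minimal sketch of the 1-4-4 network described above.
// Names are illustrative, not the actual network.js API.

const relu = (x) => Math.max(0, x);

function softmax(logits) {
  const maxLogit = Math.max(...logits); // subtract max for numerical stability
  const exps = logits.map((z) => Math.exp(z - maxLogit));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Forward pass: constant input -> 4 ReLU hidden units -> 4 softmax outputs.
// W1 is a vector (one input), W2 is a 4x4 matrix (rows = outputs).
function forward(input, W1, b1, W2, b2) {
  const hidden = W1.map((w, i) => relu(w * input + b1[i]));
  const logits = W2.map((row, j) =>
    row.reduce((acc, w, i) => acc + w * hidden[i], b2[j])
  );
  return { hidden, probs: softmax(logits) };
}

// Cross-entropy loss for a one-hot target index.
const crossEntropy = (probs, targetIdx) => -Math.log(probs[targetIdx]);
```

Because the input is a constant 1.0, the hidden pre-activations reduce to weight + bias, which is what makes the bias adjustments in the playground so directly interpretable.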
- Clone the repository
- Start a local HTTP server: `python -m http.server 8765`
- Open your browser to `http://localhost:8765`
- Push your code to a GitHub repository
- Go to your repository settings
- Navigate to "Pages" in the left sidebar
- Under "Source", select the branch you want to deploy (usually `main` or `master`)
- Click "Save"
- Your site will be available at `https://[your-username].github.io/[your-repo-name]/`
The project is already configured for GitHub Pages with:
- `index.html` as the landing page
- `.nojekyll` file to ensure all files are served correctly
- `index.html` - Landing page with links to slides and playground
- `slides.html` - Blog-style slide format for presenting concepts
- `playground.html` - Full interactive version with all controls
- `network.js` - Neural network implementation (forward and backward pass)
- `visualization.js` - SVG-based visualization for playground
- `slide-visualization.js` - Modular visualization system for slides
- `network_node.js` - Network implementation for Node.js environment
- Blue: Negative gradient (increasing weight would decrease loss)
- Red: Positive gradient (decreasing weight would decrease loss)
- Thickness: Proportional to gradient magnitude
- Inside parentheses: Activation + bias (for hidden/output) or logit (for output pre-softmax)
- Outside: Probability (for output neurons only)
- Border color: Shows bias gradient direction when target is selected
- Golden glow: Currently selected neuron for bias adjustment
- Green: Probability increases after one gradient descent step
- Red: Probability decreases after one gradient descent step
- Shows training interference between different targets
- Pure JavaScript implementation (no external ML libraries)
- SVG-based rendering for precise connection visualization
- Modular design for easy extension
- Real-time gradient computation using backpropagation
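For the softmax + cross-entropy head described above, the output-layer gradients have a simple closed form (the gradient of the loss with respect to each logit is just probability minus one-hot target), which is what makes real-time computation cheap. The sketch below illustrates the idea with hypothetical names; the actual implementation is in network.js.

```javascript
// Sketch of output-layer backpropagation for softmax + cross-entropy.
// With a one-hot target y, dLoss/dLogit_j = probs[j] - y[j], so the
// output bias and weight gradients follow directly.
// Names are illustrative, not the actual network.js API.
function outputGradients(probs, targetIdx, hidden) {
  const dLogits = probs.map((p, j) => p - (j === targetIdx ? 1 : 0));
  return {
    dB2: dLogits,                                      // output bias gradients
    dW2: dLogits.map((d) => hidden.map((h) => d * h)), // output weight gradients
  };
}

// One gradient-descent step on the output weights and biases.
function step(W2, b2, grads, lr = 0.1) {
  return {
    W2: W2.map((row, j) => row.map((w, i) => w - lr * grads.dW2[j][i])),
    b2: b2.map((b, j) => b - lr * grads.dB2[j]),
  };
}
```

Re-running the forward pass after one such step reveals which output probabilities went up or down, which is the quantity the green/red training-impact coloring visualizes.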
This project is inspired by research on inoculation prompting, which explores how to teach language models specific behaviors while maintaining their general capabilities.
Contributions are welcome! Feel free to open issues or submit pull requests.
MIT License - feel free to use this for educational purposes.