Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer

Simon Willison's Weblog

Mar 30, 2026

Public-Domain Victorian Corpus Enables Local LLM but Requires Scale and Post-Training Enhancements for Utility

Science, Technology & Innovation · Mar 30, 2026

Mr. Chatterbox, a 340M-parameter model trained exclusively on 28,035 Victorian-era public-domain books (~2.93B filtered tokens), shows that a historically pure, public-domain-only corpus can yield a tiny model that runs fully locally. Its responses, however, are Victorian-flavored but often unhelpful and Markov-like, and the model lacks modern instruction-following without more data, a larger parameter count, or post-training methods.


Low-Friction Local Deployment of Niche Models Through Small Weights and Lightweight Tooling Improves Reproducibility and Shifts Differentiation to Data and Post-Training

The project shows that small specialized model weights plus lightweight, modular tooling enable low-friction local experimentation and reproducible distribution: Simon Willison ran the 2.05GB model locally using Karpathy's nanochat stack, downloaded weights, and a Claude Code–assembled plugin exposed via a CLI. This suggests deployment barriers are falling and that differentiation will shift to dataset curation and post-training choices.


Modern Synthetic Fine-Tuning Enables Conversation And Undermines Public-Domain Training Purity Claims

The project achieved conversational Victorian-style behavior not from Victorian texts alone but by using modern models (Claude Haiku and GPT-4o-mini) to generate synthetic supervised fine-tuning data. This weakens the claim of purely pre-1900 training and highlights the need to disclose post-training data that introduces knowledge and behavior from later models.
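The post-training step described above amounts to a data-preparation pipeline: prompt/response pairs generated by a modern model are packaged into chat-format records for supervised fine-tuning. A minimal Python sketch, assuming a generic messages-style JSONL format; the example pair is illustrative, not from the project's actual dataset:

```python
import json

# Hypothetical synthetic (prompt, response) pairs, as might be produced
# by a modern model asked to answer in a Victorian register.
synthetic_pairs = [
    ("How do I post a letter?",
     "Pray convey your missive to the nearest post office, good sir."),
]

def to_sft_record(prompt: str, response: str) -> dict:
    """One chat-style SFT example: a user turn and an assistant turn."""
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ]
    }

# Write one JSON object per line (JSONL), a common SFT input format.
with open("sft_data.jsonl", "w") as f:
    for prompt, response in synthetic_pairs:
        f.write(json.dumps(to_sft_record(prompt, response)) + "\n")
```

Whatever the exact schema, records like these carry the generating model's knowledge and style into the fine-tuned model, which is why disclosure of the post-training data source matters.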


The Model Is Undertrained for Its Size and Needs More Training Data to Realize Its Capacity and Improve Conversational Performance

The model underperforms in part because its 2.93B training tokens fall well short of the Chinchilla-recommended ~7B (roughly 20 tokens per parameter) for a 340M-parameter model, leaving it undertrained as well as narrowly sourced. It would likely need more than twice as much data (or a larger model and stronger post-training strategies) to become a useful conversational partner, an economic trade-off for public-domain model builders even though they avoid licensing costs.
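The scaling arithmetic is easy to check. A back-of-envelope sketch, assuming the common ~20 tokens-per-parameter heuristic from the Chinchilla paper (the exact optimum depends on compute budget):

```python
# Back-of-envelope Chinchilla check for the figures in the summary.
params = 340e6          # model size: 340M parameters
tokens_seen = 2.93e9    # filtered Victorian corpus: ~2.93B tokens

optimal_tokens = 20 * params              # heuristic: ~20 tokens/param
shortfall = optimal_tokens / tokens_seen  # how far under the optimum

print(f"Chinchilla-optimal tokens: {optimal_tokens / 1e9:.1f}B")  # 6.8B
print(f"Data shortfall factor: {shortfall:.1f}x")                 # 2.3x
```

So the corpus covers less than half the heuristic's recommended token budget for a model this size.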


Quality Filtering Significantly Reduces Usable Pretraining Data Size

Trip Venturella's writeup shows that two-stage quality filtering, restricting the set to works contemporaneous with Queen Victoria's reign and requiring an OCR confidence of at least 0.65, reduced the British Library's nineteenth-century corpus to 28,035 books (~2.93 billion tokens). Digitization quality and temporal scope can thus materially shrink the usable public-domain training data and constrain model performance.
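The two-stage filter described above can be sketched directly. A minimal Python illustration; the `Book` record, field names, and sample entries are assumptions for clarity, not the actual British Library corpus schema:

```python
from dataclasses import dataclass

@dataclass
class Book:
    title: str
    year: int
    ocr_confidence: float  # mean OCR confidence for the scan, 0.0-1.0

# Stage 1: contemporaneous with Queen Victoria's reign (1837-1901).
VICTORIAN_YEARS = range(1837, 1902)
# Stage 2: OCR confidence threshold from the writeup.
MIN_OCR_CONFIDENCE = 0.65

def passes_filter(book: Book) -> bool:
    """Keep only in-reign books whose scans meet the OCR threshold."""
    return (book.year in VICTORIAN_YEARS
            and book.ocr_confidence >= MIN_OCR_CONFIDENCE)

# Illustrative entries, not real corpus records.
corpus = [
    Book("Bleak House", 1853, 0.91),
    Book("Poorly Scanned Almanac", 1860, 0.41),  # fails OCR threshold
    Book("Georgian Sermons", 1810, 0.88),        # outside the reign
]
kept = [b for b in corpus if passes_filter(b)]
print([b.title for b in kept])  # ['Bleak House']
```

Applied at the scale of the full nineteenth-century collection, both stages compound, which is how the corpus shrank to 28,035 books.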