Submissions from transformer-circuits.pub

		Towards Monosemanticity: Decomposing Language Models with Dictionary Learning (transformer-circuits.pub)
		72 points by goodmachine on Oct 6, 2023 | 5 comments
		In-Context Learning and Induction Heads (transformer-circuits.pub)
		2 points by ZeljkoS on Sept 6, 2023
		Toy Models of Superposition (2022) (transformer-circuits.pub)
		46 points by ZeljkoS on Aug 21, 2023 | 4 comments
		Anthropic’s Transformer Circuits Publication (transformer-circuits.pub)
		2 points by tim_sw on May 24, 2023
		A Mathematical Framework for Transformer Circuits (transformer-circuits.pub)
		1 point by ZeljkoS on Feb 7, 2023
		Superposition, Memorization, and Double Descent (transformer-circuits.pub)
		69 points by lamename on Jan 5, 2023 | 7 comments
		Toy Models of Superposition (transformer-circuits.pub)
		2 points by yyyk on Sept 17, 2022
		Toy Models of Superposition [in Neural Networks] (transformer-circuits.pub)
		1 point by escape_goat on Sept 16, 2022
		Toy Models of Superposition (transformer-circuits.pub)
		6 points by ctoth on Sept 14, 2022
		Can we reverse engineer transformer models into human-understandable programs? (transformer-circuits.pub)
		33 points by apsec112 on Dec 22, 2021 | 4 comments