Posts tagged toy-models
Is induction a memorized or generalized capability?
We probe whether the repetition capability of our toy transformer reflects genuine generalisation or memorisation of the training distribution. A single-token experiment suggests that its apparently generalised induction is an illusion, a cautionary finding for evaluations of larger LLMs.
An introduction to our investigation into repetition capability in toy transformer models
Why we want to study repetition in toy transformer models and what we aim to investigate