Research
Unbounded AI models, bounded approximations, and why LLMs work as well as they do.
Current work
My PhD focuses on a simple hypothesis: that Transformers, when trained via stochastic gradient descent to predict next tokens, approximate Solomonoff Induction.
Solomonoff Induction is the theoretically optimal predictor — it assigns probability to sequences by weighting every computable model that could have generated them, favouring simpler ones (a formalisation of Occam's Razor). It's incomputable, only approximable in the limit, but it provides a mathematical ideal against which we can compare real systems.
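Concretely, Solomonoff's universal prior weights each string by the total prior mass of the programs that produce it on a fixed universal prefix machine U (one standard formulation; notation varies across the literature):

```latex
% Solomonoff's universal prior: U is a fixed universal prefix Turing
% machine, and U(p) = x* means program p outputs a string beginning
% with x. Shorter programs get exponentially more weight (Occam's Razor).
M(x) = \sum_{p \,:\, U(p) = x*} 2^{-|p|}

% Prediction is by conditioning: the probability that x continues with a.
M(a \mid x) = \frac{M(xa)}{M(x)}
```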
Large Language Models (large Transformers trained on large datasets) are the most effective sequence predictors yet built: they generalise well and appear to compress their training data effectively. Transformers' Turing-completeness (their ability, in principle, to emulate any computable model) suggests that they may achieve this by approximating Solomonoff Induction — a hypothesis reinforced by a recent DeepMind paper, which found that Transformers outperform other neural networks on data sampled from a universal computer with simpler programs weighted more heavily. My research attempts to formalise and test this connection, aiming to help explain why Large Language Models work, and to show that they can be decomposed into discrete computational models that might be more easily understood and modified.
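As a toy illustration of the bounded-approximation idea (my own sketch, not anything from the papers below), the Python below replaces the universal machine with a deliberately trivial program space, where each "program" is a bitstring that outputs its own infinite repetition, and predicts the next symbol using the simplicity-weighted mixture of every program consistent with the data so far:

```python
# A minimal sketch of length-bounded "Solomonoff-style" prediction over a
# toy program space. This is NOT a universal machine: here a "program" is
# just a bitstring p, interpreted as the infinite repetition p p p ...
# Real Solomonoff Induction mixes over all programs for a universal prefix
# machine; this sketch only illustrates the simplicity-weighted mixture.

from itertools import product

def run(program: str, n: int) -> str:
    """Interpret a program as its own infinite repetition, truncated to n symbols."""
    reps = -(-n // len(program))  # ceiling division
    return (program * reps)[:n]

def prior(program: str) -> float:
    """2^-|p|: shorter programs get exponentially more weight (Occam's Razor)."""
    return 2.0 ** -len(program)

def predict_next(observed: str, max_len: int = 12) -> dict[str, float]:
    """Mixture prediction: weight every program consistent with the data."""
    scores = {"0": 0.0, "1": 0.0}
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            p = "".join(bits)
            if run(p, len(observed)) == observed:    # consistent with the data?
                nxt = run(p, len(observed) + 1)[-1]  # this program's next symbol
                scores[nxt] += prior(p)
    total = scores["0"] + scores["1"]
    return {s: w / total for s, w in scores.items()} if total else scores

print(predict_next("010101"))
```

On "010101" this puts almost all of the probability on "0", since the shortest consistent program is the two-symbol loop "01"; full Solomonoff Induction behaves the same way, but with the mixture taken over every computable program.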
I'm a PhD student at the Strong AI Lab (SAIL) at the University of Auckland.
Publications
- Transformers as Approximations of Solomonoff Induction · Proceedings of the International Conference on Neural Information Processing (ICONIP 2024) · DOI
  Solomonoff Induction is an optimal-in-the-limit unbounded algorithm for sequence prediction, representing a Bayesian mixture of every computable probability distribution. We hypothesise that Transformer models — the basis of Large Language Models — approximate Solomonoff Induction better than any other extant sequence prediction method, and explore evidence for and against this hypothesis along with next steps for modelling AI systems in this way.
- AbductionRules: Teaching Transformers to Explain Unexpected Inputs · Findings of the Association for Computational Linguistics (ACL 2022) · DOI
  Transformers pre-trained on natural language can be fine-tuned to perform logical abductive reasoning — inferring the most plausible explanation for an unexpected observation. This paper introduces AbductionRules, a dataset and methodology for training and evaluating such explanatory reasoning in language models.
Research interests
Beyond my PhD topic, I'm interested in:
I'm also interested in prediction markets as a mechanism for improving human decision-making and coordination — I'm active on Manifold.