Independent research lab. Formal foundations of AI reasoning.

Oddur Sigurdsson

oddur@serre.ai


Research

On the Reasoning Gaps of Large Language Models

A Formal Characterization

NeurIPS 2026 Preprint

176,000 evaluations across 12 models and 9 benchmark tasks reveal systematic, predictable failure modes in LLM reasoning that correlate with the computational complexity of the task.

The Computational Complexity of Verifying LLM Outputs

ICLR 2027 In Progress

Formal complexity-theoretic framework for understanding when and why LLM outputs can be efficiently verified, with implications for scalable oversight.

A Taxonomy of Failure Modes in LLM-Based Autonomous Agents

ACL 2027 In Progress

Empirical taxonomy of how autonomous LLM agents fail in practice, drawn from 50+ real deployment incidents across research and production systems.

Impossibility Results for Unsupervised Self-Improvement in Language Models

ICLR 2027 Early Stage

Theoretical bounds on what language models can learn from their own outputs without external signal.


We prove theorems about what language models can and cannot do. Our work spans computational complexity, verification theory, and empirical evaluation at scale.

More about the lab