Davide Paglieri

I am a third-year PhD student at the UCL DARK Lab, advised by Tim Rocktäschel and Jack Parker-Holder. I was previously a Research Engineer at Bending Spoons.

I obtained my MSc in Computer Science (AI & ML) from Imperial College London, graduating with Distinction. In my Master's thesis, I explored Open-Ended Reinforcement Learning for Dynamic Robot Locomotion, advised by Antoine Cully.

Prior to that, I obtained a BSc in Computer Engineering from Politecnico di Torino, graduating with 110/110 cum laude.

My research interests include Large Language Models, Reinforcement Learning, Diffusion Models, Open-Endedness, and generalist AI agents.

I am currently interning at Google DeepMind, working with Alexander (Sasha) Vezhnevets and Joel Z. Leibo.

Contact: paglieridavide [at] gmail [dot] com

Research

	Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents Davide Paglieri, Bartłomiej Cupiał, Jonathan Cook, Ulyana Piterbarg, Jens Tuyls, Edward Grefenstette, Jakob Nicolaus Foerster, Jack Parker-Holder, Tim Rocktäschel Preprint, Under Review Learning to allocate test-time compute for LLM agents for efficient planning.
	BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, Ulyana Piterbarg, Maciej Wolczyk, Akbir Khan, Eduardo Pignatelli, Łukasz Kuciński, Lerrel Pinto, Rob Fergus, Jakob Nicolaus Foerster, Jack Parker-Holder, Tim Rocktäschel ICLR, 2025 Benchmarking the capabilities of LLM and VLM agents on long-horizon game environments such as NetHack.
	Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs Davide Paglieri, Saurabh Dash, Jack Parker-Holder, Tim Rocktäschel ICML @ ES-FOMO-II, 2024 Study uncovering the effect of outliers and calibration sets in quantization of modern LLMs.
	Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL Eduardo Pignatelli, Johan Ferret, Davide Paglieri, Samuel Coward, Tim Rocktäschel, Edward Grefenstette, Laura Toni arXiv, 2024 Evaluating Automated Reinforcement Learning Credit Assignment with LLMs.
	Multi-Agent Diagnostics for Robustness via Illuminated Diversity Mikayel Samvelyan, Davide Paglieri, Minqi Jiang, Jack Parker-Holder, Tim Rocktäschel AAMAS, 2024 (oral) Uncovering vulnerabilities in multi-agent systems with the power of open-endedness.

Teaching

Spring 2025 - Open-Endedness and General Intelligence - UCL - (TA)
Fall 2024 - Deep Representations and Learning - UCL - (TA)
Spring 2024 - Applied Deep Learning - UCL - (TA)
Fall 2023 - Deep Representations and Learning - UCL - (TA)
Spring 2023 - Reinforcement Learning - UCL - (TA)
Fall 2019 - Algorithms and Programming - Politecnico di Torino - (TA)

Previous Job Experience

I previously worked as a Research Engineer at Bending Spoons, where I researched, prototyped, and deployed deep learning models for several of the company's apps, including Remini, Splice, and Dawn AI, focusing on diffusion generative models, image enhancement, and artificial slow motion.

Whilst there, I conceptualized and led the development of Dawn AI, a mobile app leveraging generative diffusion models to create AI art. Initially, the app allowed users to generate artwork from text, sketches, or images, and later expanded to include AI-generated avatars. As the AI lead, I guided the app to achieve a top ranking in the US App Store (and other regions) for three consecutive days. Eventually, Dawn AI's features were integrated into Remini AI.

As a result of my efforts on Dawn AI, I had the opportunity to present and give a demo to Tim Cook, Apple's CEO, while he was visiting our office in Milan.