CUPUM 2025, UCL, London

Enhancing Spatial Reasoning and Behaviour in Urban ABMs with Large Language Models and Geospatial Foundation Models


Nick Malleson, Andrew Crooks, Alison Heppenstall and Ed Manley

University of Leeds, UK; University at Buffalo, US; University of Glasgow, UK

n.s.malleson@leeds.ac.uk


Slides available at:
www.nickmalleson.co.uk/presentations.html

Context

Modelling human behaviour in ABMs is (still!) an ongoing challenge

Behaviour is typically implemented with bespoke rules, and even more advanced mathematical approaches are limited

Can new AI approaches offer a solution?

Large Language Models can respond to prompts in 'believable', 'human-like' ways

Geospatial Foundation Models capture nuanced, complex associations between spatial objects

Multi-modal Foundation Models operate with diverse data (text, video, audio, etc.)

This talk: the opportunities offered by LLMs, GFMs and MFMs as a means of creating more realistic spatial agents.

But first, where might this lead...

"All models are great, until you need them"

It's fine to use models under normal conditions. Very useful.

And if the system undergoes a fundamental change (COVID? The global financial crash?), that's when we really need models to help

But that's exactly when they turn out to be useless!


Example: a burglary ABM


Worked great, until COVID...


Maybe a model with LLM-backed agents would be better able to adapt after a catastrophic system change

Large Language Models (LLMs)

Early evidence suggests that large language models (LLMs) can be used to represent a wide range of human behaviours

Already a flurry of activity in LLM-backed ABMs

E.g. AutoGPT, BabyAGI, Generative Agents, MetaGPT ... and others ...

[Image: the generative-agents simulation created by Park et al. (2023)]
Park, Joon Sung, Joseph C. O’Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein. 2023. ‘Generative Agents: Interactive Simulacra of Human Behavior’. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1–22. San Francisco, CA, USA: ACM. DOI: 10.1145/3586183.3606763.
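
To make the idea concrete, here is a minimal Python sketch of that kind of agent loop: observe, retrieve memories, prompt, act. It is a hedged approximation, not Park et al.'s implementation (real generative agents also score, reflect on and plan over their memories), and llm_complete is a hypothetical stand-in for any chat-completion API.

# A minimal sketch of an LLM-backed agent loop, loosely in the style of
# Park et al. (2023). `llm_complete` is a hypothetical placeholder.
from dataclasses import dataclass, field

def llm_complete(prompt: str) -> str:
    """Placeholder: wire this up to an actual LLM provider."""
    raise NotImplementedError

@dataclass
class GenerativeAgent:
    name: str
    persona: str                    # e.g. "a retired teacher who lives alone"
    memories: list[str] = field(default_factory=list)

    def observe(self, event: str) -> None:
        self.memories.append(event)

    def act(self, situation: str) -> str:
        recent = "\n".join(self.memories[-5:])   # crude recency-only retrieval
        prompt = (
            f"You are {self.name}, {self.persona}.\n"
            f"Recent memories:\n{recent}\n"
            f"Situation: {situation}\n"
            "What do you do next? Answer in one sentence."
        )
        action = llm_complete(prompt)
        self.memories.append(f"I decided: {action}")
        return action

# Usage (commented out because llm_complete is just a placeholder):
# agent = GenerativeAgent("Sam", "a retired teacher who lives alone")
# agent.observe("House prices on my street rose 10% this year.")
# print(agent.act("A letting agent asks if you want to sell."))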

Example: Housing ABM with LLM agents

Can we switch agent behaviour (when to buy, sell or rent a house) from a rule-based, empirically driven method to one driven by LLMs?

Traditional agents: decision rules are clearly defined

LLM agents: decision rules are latent; the agent is prompted with its demographics, finances, etc. (see the sketch after this slide)

Decision points aligned with real observations. Possible insight into latent variables.
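
A hedged sketch of the switch, assuming a toy Household class. The class, rule thresholds and prompt wording are invented for illustration; they are not taken from the actual housing model.

# Two behaviour options for a housing agent: an explicit rule vs. an
# LLM prompt. Thresholds and wording are illustrative only.
from dataclasses import dataclass

@dataclass
class Household:
    age: int
    income: float        # annual, GBP
    renting: bool
    local_price: float   # average local house price, GBP

def decide_rule_based(h: Household) -> str:
    """Traditional agent: the decision rule is explicit and inspectable."""
    if h.renting and h.income * 4 > h.local_price:      # illustrative threshold
        return "buy"
    if not h.renting and h.local_price > h.income * 8:  # illustrative threshold
        return "sell"
    return "stay"

def decide_llm(h: Household, llm_complete) -> str:
    """LLM agent: the 'rule' is latent in the model; we only write the prompt."""
    prompt = (
        f"You are a {h.age}-year-old with an annual income of £{h.income:,.0f}, "
        f"currently {'renting' if h.renting else 'a homeowner'}. "
        f"Average local house prices are £{h.local_price:,.0f}. "
        "Do you buy, sell, or stay put this year? Answer with one word."
    )
    return llm_complete(prompt).strip().lower()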

LLMs & ABMs: Challenges

Lots of them!

Computational cost: running thousands or millions of LLM-backed agents?

Bias: LLMs are very unlikely to be representative (non-English speakers, cultural bias, digital divide, etc.)

Validation: consistency (i.e. stochasticity), robustness (i.e. sensitivity to prompts), hallucinations, train/test contamination, and others

Main one for this talk: the need to interface through text

Communicating -- and maybe reasoning -- with language makes sense

But having to describe the world with text is a huge simplification / abstraction

A solution? Multi-modal and Geospatial Foundation Models

Foundation models: "a machine learning or deep learning model trained on vast datasets so that it can be applied across a wide range of use cases" (Wikipedia)

LLMs are Foundation models that work with text

Geospatial Foundation Models

FMs that work with spatial data (street view images, geotagged social media data, video, GPS trajectories, points-of-interest, etc.) to create rich, multidimensional spatial representations

Multi-modal Foundation Models

FMs that work with diverse data, e.g. text, audio, image, video, etc.

Geospatial Foundation Model Example

Example of a foundation model constructed from OpenStreetMap (OSM) data

The embeddings are remarkably good at predicting things like traffic speed and building functionality (zero-shot); a sketch of this kind of downstream use follows the reference below

Balsebre, Pasquale, Weiming Huang, Gao Cong, and Yi Li. 2024. ‘City Foundation Models for Learning General Purpose Representations from OpenStreetMap’. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 87–97. Boise, ID, USA: ACM. DOI: 10.1145/3627673.3679662.
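
As a rough illustration of what "the embeddings are predictive" can mean in practice, here is a linear-probe sketch: a frozen, pre-trained embedding per road segment feeding a simple regression that predicts traffic speed. The random embeddings and speeds are placeholders, and this is not necessarily Balsebre et al.'s exact evaluation protocol.

# Linear probe on frozen GFM-style embeddings (placeholder data).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_segments, dim = 1000, 128
embeddings = rng.normal(size=(n_segments, dim))  # stand-in for GFM output
speeds = rng.uniform(10, 70, size=n_segments)    # observed mean speeds (mph)

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, speeds, test_size=0.2, random_state=0)

probe = Ridge(alpha=1.0).fit(X_train, y_train)   # the embeddings stay frozen
print("R^2 on held-out segments:", probe.score(X_test, y_test))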

Towards Multi-Modal Foundation Models for ABMs (??)

GFMs and LLMs: a new generation of ABMs?

LLMs 'understand' human behaviour and can reason realistically

GFMs provide nuanced representation of 'space'

How?

I've no idea! Watch this space.

Insert spatial embeddings directly into the LLM?

Use an approach like BLIP-2, which trains a small transformer as an interface between a frozen vision encoder and a frozen LLM (sketched below)

Suggestions welcome!
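
For a feel of what such an interface might look like, here is a speculative PyTorch sketch: a small trainable adapter projects a frozen spatial embedding into the LLM's input-embedding space as a few "soft tokens". Everything here (dimensions, module design) is an assumption; BLIP-2 itself uses a more elaborate Q-Former.

# A speculative adapter mapping a frozen GFM embedding to "soft tokens"
# in an LLM's input-embedding space. All sizes are illustrative.
import torch
import torch.nn as nn

class SpatialAdapter(nn.Module):
    def __init__(self, gfm_dim: int = 128, llm_dim: int = 4096, n_tokens: int = 8):
        super().__init__()
        self.n_tokens = n_tokens
        # Two-layer MLP projecting one spatial embedding to n_tokens soft tokens.
        self.proj = nn.Sequential(
            nn.Linear(gfm_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim * n_tokens),
        )

    def forward(self, spatial_emb: torch.Tensor) -> torch.Tensor:
        # spatial_emb: (batch, gfm_dim), from a frozen geospatial model
        batch = spatial_emb.shape[0]
        soft = self.proj(spatial_emb).view(batch, self.n_tokens, -1)
        return soft  # prepend these to the LLM's input embeddings

adapter = SpatialAdapter()
place_embedding = torch.randn(1, 128)   # stand-in for a GFM place embedding
soft_tokens = adapter(place_embedding)
print(soft_tokens.shape)                # torch.Size([1, 8, 4096])

In principle the adapter could be trained on paired (place, text) data while both the GFM and the LLM stay frozen, mirroring the BLIP-2 recipe.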

[Image: conceptual LLM-ABM city, generated with ChatGPT]

Summary and Outlook

Huge potential to use LLMs to drive agents in an ABM

But need to overcome some big challenges first

Lots of activity, but very little of it peer-reviewed

Geospatial (or multi-modal) foundation models could offer a better interface to the environment than text

Maybe...
