Urbanly, a specialist in urban simulation, recently integrated Large Language Models (LLMs) into its CityCompass platform to improve how housing units are matched to demand. We asked founder and CEO Federico Fernandez to explain the process that led to this development and the challenges of harnessing the power of generative AI in household location choice models.
Comprehending the factors that influence household location choices is pivotal for analysing urban areas through diverse lenses. From a governmental standpoint, it is essential to grasp these decisions when assessing the potential impact of infrastructure projects on residential areas. Concurrently, the real estate sector relies heavily on anticipating household location preferences to forecast growth trajectories accurately. Unravelling the intricate dynamics behind where families and individuals opt to reside becomes a cornerstone for informed policymaking, strategic planning, and market intelligence within the urban landscape.
For the past five decades, computer scientists have dedicated their efforts to developing software capable of analysing land use patterns, continuously refining its sophistication as advancements in hardware and software engineering paved the way. Drawing upon this extensive experience, Urbanly has developed CityCompass, a cloud-based land use simulation platform that harnesses the latest available technologies to create an environment conducive to policy experimentation. At the heart of its simulation engine lies the household location choice model, a powerful tool that seamlessly matches available housing units with their corresponding demand.
Generative AI: a measured approach
The advent of generative AI has prompted us to actively explore avenues for integrating this technology to complement and enhance our simulation kernels. However, a crucial challenge in this integration process has been to circumvent a common pitfall associated with generative AI – treating it as an infallible ‘source of truth’ instead of a model that can be trained to learn from real-world decisions and existing academic literature. Our approach with this experiment has been to harness the power of generative AI while recognising it as a dynamic tool that must be continuously refined and calibrated against empirical data and established research findings.
LLMs: challenges and advancements
Neural networks have been around for a long time, and we experimented with them a couple of years ago, creating small networks for decision choice, in combination with genetic algorithms with an evolving DNA. However, we swiftly encountered formidable challenges on two fronts: firstly, training custom neural networks necessitated an extensive process of trial and error to determine the optimal network architecture; secondly, the computing power available at that time posed constraints, rendering the creation of large-scale networks unfeasible. Compounding these hurdles was the absence of dedicated hardware tailored for neural network applications.
The emergence of generative AI, particularly large language models (LLMs), has reignited our belief in the potential of this technology, prompting us to re-evaluate its applicability. The advent of LLMs appears to address, to a certain extent, the two primary challenges we encountered during our previous endeavours with neural networks. Firstly, LLMs mitigate the need for intricate customisation of neural network architectures, as their pre-trained models can be fine-tuned and adapted to specific tasks. Secondly, these models harness the immense computing power facilitated by dedicated hardware, alleviating the constraints we previously faced due to limited computational resources.
The household location choice model
A household location choice model (HLCM) is a software component that matches available housing units in an urban environment with households. Its main quality metric is how well it characterises the decisions that individuals make when selecting their place of residence.
Making these decisions requires two types of knowledge: dynamic data generated by the simulation as it runs, and static training data about how these decisions are typically made in a particular urban environment.
The static side is the easiest, since the most common format for domain knowledge and training data is written text. We carefully chose literature describing varied approaches to HLCM, complemented it with descriptions of real-world thought processes specific to the study area, and incorporated both into the LLM.
On the dynamic side, given the well-documented challenges that LLMs face in processing numerical inputs, we have undertaken the development of a novel component within CityCompass, aptly named the ‘simulation entity to prompt converter’. This module serves as an intermediary, translating the intricate simulation entities into a format that aligns with the linguistic paradigm of LLMs, enabling seamless communication and integration.
Internally, CityCompass represents housing units as typed data structures composed of spatial and non-spatial attributes, such as street address, area or number of rooms. All of that information can be translated into English text that LLMs can understand. However, there is an additional challenge: an attribute like spatial location adds little information to a model without the associated data layers that describe that particular polygon in space. Concretely, what matters about a housing unit’s location is its accessibility: how close it is to certain points of interest or spatial detractors. We therefore needed to augment the text-based description of the unit with all these additional layers.
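To make the idea of a text-based unit description concrete, here is a minimal Python sketch. It is not CityCompass code: the `HousingUnit` class, its field names and the sentence templates are all assumptions made for illustration. The article only states that units carry attributes such as address, area and number of rooms, and that accessibility layers are added to the description.

```python
from dataclasses import dataclass, field


@dataclass
class HousingUnit:
    # Hypothetical typed structure; field names are illustrative only.
    address: str
    area_m2: float
    rooms: int
    # Derived accessibility layer: point of interest -> travel time in minutes.
    travel_minutes: dict[str, float] = field(default_factory=dict)
    # Derived detractor layer: spatial detractor -> distance in metres.
    detractors_m: dict[str, float] = field(default_factory=dict)


def describe_unit(unit: HousingUnit) -> str:
    """Render a housing unit and its spatial layers as plain English."""
    parts = [
        f"A {unit.rooms}-room home of {unit.area_m2:.0f} square metres "
        f"at {unit.address}."
    ]
    # Raw coordinates are deliberately left out; only what the location
    # means for a resident is described.
    for place, minutes in sorted(unit.travel_minutes.items()):
        parts.append(f"It is about {minutes:.0f} minutes from the nearest {place}.")
    for name, metres in sorted(unit.detractors_m.items()):
        parts.append(f"A {name} lies roughly {metres:.0f} metres away.")
    return " ".join(parts)
```

The design choice worth noting is that the polygon itself never reaches the prompt. Only the layers derived from it do, which keeps the numerical content small and attaches each number to a plain-language meaning.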
In addition to characterising the supply side of housing units, we recognised the necessity of developing a dedicated component to produce LLM input about the diverse attributes and dynamics of individuals constituting households. This component describes parameters such as ages, occupations, and activities. Within the context of an Urbanly simulation, these intricate household profiles are derived from census data, which is further enriched and augmented by a sophisticated synthetic population generator.
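A household profile can be turned into prompt text in much the same way. The sketch below is an assumption-laden illustration rather than the Urbanly component: the `Person` and `Household` classes and the wording of the output are hypothetical, and in practice the profiles would come from the synthetic population generator rather than being typed in by hand.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical fields mirroring the parameters named in the text.
    age: int
    occupation: str
    activity: str  # free-text phrase, e.g. "commutes to the city centre"


@dataclass
class Household:
    members: list[Person]


def describe_household(household: Household) -> str:
    """Render a household as a short English description, one member per line."""
    count = len(household.members)
    noun = "person" if count == 1 else "people"
    lines = [f"A household of {count} {noun}:"]
    for person in household.members:
        lines.append(
            f"- a {person.age}-year-old {person.occupation} who {person.activity}"
        )
    return "\n".join(lines)
```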
The last piece of simulation-time data that must be shared with the LLM is the set of current and future context factors, including the macro-economic model that is part of the simulation and planned development projects. Future years matter in particular, since expectations are a big driver of location decisions.
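As a rough illustration of how context factors might be phrased for the LLM, the following Python sketch lists current indicators and then planned projects with their time horizon. The function name, the argument shapes and the sentence wording are assumptions; the article does not specify how the macro-economic model or the project pipeline is serialised.

```python
def describe_context(
    current_year: int,
    indicators: dict[str, str],
    planned_projects: list[tuple[int, str]],
) -> str:
    """Describe current indicators and upcoming projects as plain text.

    indicators: hypothetical name -> already formatted value pairs.
    planned_projects: hypothetical (completion year, description) pairs.
    """
    lines = [f"The simulation year is {current_year}."]
    for name, value in indicators.items():
        lines.append(f"The {name} is {value}.")
    for year, project in sorted(planned_projects):
        if year < current_year:
            continue  # Already built, so no longer an expectation.
        horizon = year - current_year
        lines.append(f"Planned for {year} (+{horizon} years): {project}.")
    return "\n".join(lines)
```

Stating the horizon explicitly, rather than only the completion year, is one way to let expectations about the future show up in the text the model reads.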
Having shared all the relevant information with the LLM, we focused on tailoring specific prompts to get the location decisions we needed. It is crucial to note that this process is not static in nature, as the component responsible for generating these decisions must be invoked each time a new housing unit becomes available within the simulation, whether due to relocation or the construction of additional housing stock.
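The per-unit invocation can be sketched as follows. This is a minimal illustration under several assumptions: the prompt wording, the idea that the model picks one household from a numbered list, and the `ask_llm` callable (a stand-in for whatever client actually talks to the model) are all hypothetical and not taken from CityCompass. The reply is validated before use, in keeping with not treating the model as an infallible source of truth.

```python
import re
from typing import Callable, Optional


def build_prompt(unit_text: str, household_texts: list[str], context_text: str) -> str:
    """Assemble one decision prompt for a newly available housing unit."""
    numbered = "\n".join(
        f"{i}. {text}" for i, text in enumerate(household_texts, start=1)
    )
    return (
        f"Context:\n{context_text}\n\n"
        f"Available housing unit:\n{unit_text}\n\n"
        f"Candidate households:\n{numbered}\n\n"
        "Reply with only the number of the household most likely to move in."
    )


def parse_choice(reply: str, n_candidates: int) -> Optional[int]:
    """Return the chosen 0-based index, or None if the reply is unusable."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    choice = int(match.group())
    return choice - 1 if 1 <= choice <= n_candidates else None


def choose_household(
    unit_text: str,
    household_texts: list[str],
    context_text: str,
    ask_llm: Callable[[str], str],  # hypothetical: prompt in, raw reply out
) -> Optional[int]:
    """Run one decision; call this each time a unit becomes available."""
    prompt = build_prompt(unit_text, household_texts, context_text)
    return parse_choice(ask_llm(prompt), len(household_texts))
```

In a simulation loop, `choose_household` would be called once for each unit freed by a relocation or added by new construction, and a `None` result would be handled by a fallback rule chosen by the modeller.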
As soon as we had put all the pieces together, we began running simulations in study areas where we had worked in the past and comparing the results, focusing on occupancy rates and spatial growth patterns. We then experimented with different “domain knowledge scenarios”, which means holding the policy simulation constant while varying the papers we use to explain to the LLM how location choices are made. This allowed us to create scenarios that favour a particular author’s policy vision and to understand how that could affect our forecasts.
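One simple way to compare two runs on occupancy is to compute the rate per zone and then average the absolute differences. The sketch below shows that calculation in Python. The zone labels, the input format and the choice of mean absolute gap as the summary are assumptions made for illustration; the article does not say which comparison measure was used.

```python
def occupancy_rate(units: list[tuple[str, bool]]) -> dict[str, float]:
    """Map each zone to its share of occupied units.

    units: hypothetical (zone name, is occupied) pairs, one per housing unit.
    """
    totals: dict[str, int] = {}
    occupied: dict[str, int] = {}
    for zone, is_occupied in units:
        totals[zone] = totals.get(zone, 0) + 1
        occupied[zone] = occupied.get(zone, 0) + int(is_occupied)
    return {zone: occupied[zone] / totals[zone] for zone in totals}


def mean_absolute_gap(a: dict[str, float], b: dict[str, float]) -> float:
    """Average absolute difference in occupancy over zones present in both runs."""
    zones = a.keys() & b.keys()
    if not zones:
        raise ValueError("the two runs share no zones")
    return sum(abs(a[z] - b[z]) for z in zones) / len(zones)
```

The same two functions can compare an LLM-based run with an earlier baseline, or two domain knowledge scenarios with each other, since both cases reduce to a pair of zone-level occupancy tables.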
Conclusion and future directions
This work has demonstrated an innovative approach to integrating generative AI, specifically LLMs, into urban simulation and modelling. By leveraging the capabilities of LLMs while carefully addressing their limitations, we have developed a novel methodology for simulating household location choice decisions within the CityCompass platform.
Furthermore, this work has highlighted the importance of incorporating diverse sources of knowledge, both dynamic simulation data and static domain knowledge from literature and real-world observations. By carefully curating and presenting this information to the LLM, we can generate diverse location decisions. Looking ahead, this work paves the way for further integration of generative AI techniques into urban simulation and modelling. As the field of generative AI continues to evolve, the proposed methodology can be adapted and refined to take advantage of the latest advancements, with the aim of improving the accuracy and robustness of urban simulations and ultimately contributing to more informed, data-driven policymaking and urban development strategies.