[R] Least Cost Path analysis

In 2018 I wrote my Master thesis, titled “Evaluating Arabia´s trade routes with Least-Cost path analysis” which became my first major coding project.

The Abstract

The aromatics trade between southern Arabia and the Mediterranean world was one of the most extensive trade patterns in Antiquity with significant impact on the involved regions. Part of Its infrastructure was a network of camel-caravan-operated trade routes across the Arabian Peninsula. This study aims to identify factors that influenced the formation of this route network. In the first part routes where reconstructed based on writings of Greek Geographers describing the trade. These reconstructions were refined with topographical maps of the regions and current state archaeological research. In the second part the same routes were modelled using Least Cost Path analysis (lcpa). The results of both parts were compared with each other in order to define whether cost-efficiency was the sole and primary motivation and/or to which extent different factors influenced the formation of these routes. The study concerns the state of the route network at the zenith of the aromatics trade between the 2nd century bce and the 1st century ce. The study area is limited to the northern half of the trade route network, running from Medina in western Saudi Arabia to the port of Gaza in the Palestine Territories.

Put shortly, the aim was to collect what we knew about the ancient trade routes on the Arabian Peninsula, termed the “Incense Route“, and compare that with a computer model of the most efficient routes. Differences and similarities would help understand the efficiency of the ancient trade route and give insights about their creation. To read the full abstract click on the headline above.

Current state of research on trade routes related to the ancient Incense Trade (Franke & Gierlich, Roads of Arabia, 2012)

To find the most efficient route i employed a method called Least Cost Path Analysis. It is a computational modelling method based on the observation that humans always tend to optimize their movement. For an introduction into the method click on the headline below.

The Method

LCPA is a computational modelling method based on Zipf’s principle of least effort[1] which assumes that humans generally tend to optimize their movements. It is based on two elements: a digital model of a landscape and a mathematical model that quantifies the effort of movement across it, termed a cost function. The most common models are based on the effort of overcoming height differences, so-called slope-based cost functions, and employ either time or energy expenditure as cost currency.[2] This type of cost functions requires a landscape model with height values, usually a Digital Elevation Model (dem)[3]. The result is a Cost surface[4], a raster mapping the effort of moving across the landscape from a chosen starting point. To that surface a pathfinding algorithm[5] is applied, so as to find the path that accumulates the least amount of whatever currency the cost model is based on.


[1] Zipf, 1949

[2] Cost currency: the numerical scale upon which the Cost function is built. A Cost function using time as currency will use time units like seconds to quantify the effort of movement. A Cost function using energy expenditure will use units like Joule.

[3] DEM: a digital raster file where each cell of the raster is attributed with an elevation value, the most basic formats being ASCII or TIFF. Geographic referencing is always needed, which led to adaptations of those formats for DEM´s, like GeoTIFF.

[4] Cost surface: a modified DEM where each cell is attributed with the accumulated cost of reaching that cell from a chosen starting point. Since it is only a content modification of a DEM it uses the same format, commonly a raster file of the TIFF format.

[5] Pathfinding algorithm: a sequence of instructions designed to identify the shortest path between two points. The most commonly used on is Dijkstra´s algorithm (Edsger W. Dijkstra 1959; Soltani et al. 2002).

There are many ways to run basic Least Cost path analysis in GIS programs. In ArcGIS there is the Spatial Analyst Toolbox and  in QGIS quite a number of Plugins can be found. But those solution often lack the ability to adjust certain parameters  or they are limited to certain cost algorithms. The only way to fully customize the analysis, at least in 2018, was to program it myself. In the field of Geostatistics, R is the language of choice.

Based on Modelling Human Behaviour in Landscapes by Narkoinz and Knitter and with the functions provided by the R package gdistance i developed the workflow that is outlined below. The script is public on Github.

Schematic workflow of the LCPA

The analysis was successfull. I was able to confirm my initial hypothesis that the trade routes used for high-value inter-regional trade were quite efficient and the trade routes connected to more local trade patterns less so. I was able to establish correlations between trade route characteristics and landscape characteristics which provided potential insights into the formation process of the “Incense Route”.

Trade routes in NW Arabia in the late 1st millenium BC / early 1st millenium AD (major in red, minor in black, calculated Least Cost paths in green) Staedtler 2021 forthcoming

A special challenge was the hardware requirements. The limitation that comes with the gdistance package is that the calculations are run inside the RAM. Considering a route of approximately 200km length resulted in a DEM raster of around 200 MB size. The intermediary product, called transition matrix, was more than 50 GB in size. Luckily, through my job at TOPOI, a research institute for landscape archaeology, i had access to Computers with 64GB RAM. But if i plan to continue the research on even larger route systems i will need to opt for hosted HPC such as Amazon AWS or clusters that belong to universities.

An realisation i had was that i particularly enjoyed the coding part of this project. Getting into the mindset of programming with its own logic, that is somewhat removed from the mindset needed for research in Humanities, turned out to be no obstacle but an enjoyable challenge. I drew the  conclusion for myself was that i wanted focus more on programming in the future.