GeoPackage Routing: Executing Advanced Analysis Offline

Contributed by: 
David Wilson, Strategic Alliance Consulting, Inc.

GeoPackage is often seen as a container that stores geospatial information, but at its core, GeoPackage is simply a SQLite database. This opens up a realm of possibilities to interact with and exploit its contents. Using this advantage, we can incorporate advanced analysis into our data, or augment with other data types like basemaps, elevation, and other forms of media, all within the same package. An additional benefit of SQLite is that it’s portable, making it usable in mobile applications - even when disconnected from the rest of the world. Usefully, GeoPackage can even accomplish offline Network Routing.

Network Routing in the GIS arena often applies to finding the shortest or best route between two locations. In our technology-driven world, access to these types of applications and analyses has never been greater, and we are now seeing a renaissance of both commercial and open-source technologies to meet consumer demands. However, many of these applications require access to the internet or some form of connected server. Those of us who deal with GIS in disconnected or low-band width environments are met with a myriad of challenges. 

GeoPackage helps assist in this effort by providing an open, common approach to effectively store, manage, and exploit geospatial data all in one file. This post demonstrates some very basic approaches to incorporating transportation data from OpenStreetMap (OSM) into a routable GeoPackage, while highlighting the challenges and advantages of using GeoPackage to support routing. While what's discussed here is a simple implementation, the intent is to showcase the extensibility and potential that GeoPackage brings to GIS and expand its reach into the broader community.

Network Routing

Network Routing is derived from Graph Theory, where a graph is a representation of relations between objects. In navigation, we use weighted graphs representing our transportation objects (e.g. roads, sidewalks, etc.) to determine the cost/distance to all connected neighbors, then determine the minimum cost to a desired point. In traditional vector-based GIS, we visually describe a map in a series of points, lines, and polygons. Whereas in network data, we describe relationships between two or more entities or objects. The way/node representation of OSM feature data reflects this.

To start, we collect some OSM data, which can be provided in a series of formats. Since we want to keep the node/way structure intact, we start with the OSM XML file format from GeoFabrik.

Great… I have data. Now what do I do with it?

The first challenge we run into is that GeoPackage (and SQLite for that matter) don’t contain geometry functions like its big brother PostgreSQL/PostGIS. Some data conditioning is needed prior to placing our network data into GeoPackage. The first series of steps we take is to read our OSM data to a PostgreSQL server and perform the following:

  • Split up the OSM ways so that each way (edge) can be represented by a start and end node
  • Create the LINESTRING that represents the new split way and calculate Length (PostGIS)
  • Aggregate the OSM way_tags into a searchable object (optional)

Several different methods can be used to read this particular format of data into PostgreSQL. For this example, we used Osmosis to maintain the raw OSM structure. We also filter out any feature (way) that is not involved with transportation during import:

osmosis --read-pbf ./district-of-columbia-latest.osm.pbf --accept-tags=highway --ws :: PostgreSQL Database Credentials


We then have several tables generated in our database, but we only need to work with the nodes, way_nodes, and way_tags tables.

1. Split up the Ways

The first data conditioning point is to split the OSM ways so that each way has a start and end point. This allows us to represent OSM data as a graph so that each way that connects to another has a shared node at the end of the incoming way and at the beginning of the on-going way. A simple method can query the way_nodes table, then join the way_nodes table with an altered sequence_id to create a start and end node between every grouping of two nodes for each way. We calculate a new id to represent the split way.

View the example code on GitHub

Output:

Original:    
way_id node_id sequence_id
506705172 49729205 0
506705172 49748266 1
506705172 49748269 2
506705172 49748270 3
506705172 2424854417 4
506705172 49748272 5
506705172 49748273 6

 

Split ways:        
split_way_id old_way_id start_node end_node sequence_id
1 506705172 49729205 49748266 0
2 506705172 49748266 49748269 1
3 506705172 49748269 49748270 2
4 506705172 49748270 2424854417 3
5 506705172 2424854417 49748272 4
6 506705172 49748272 49748273 5

 

2. Create the Line Geometry

The second step is to generate the LINESTRING geometry between the start and end node. This provides a geometric representation of the split way. This step is necessary to provide a visual route that can be viewed in a GIS along with other geospatial information contained within the GeoPackage. Additionally, we'll calculate the length in meters of the newly drawn line to use as weight (i.e., cost) in route solving. Node geometries are contained within the nodes table, so it is joined twice to provide points for the start and end node then some PostGIS functions are used to create the geometry and length.

View example code on GitHub

The last conditioning step will aggregate the Key/Value tag attributes into a searchable object. Attributes for each way are stored in the way_tags table. This step isn't entirely necessary but does eliminate complex joins later on in the process. For SQLite, we can use JSON objects with the JSON1 functions to create a searchable column within the database.

View example code on GitHub

Output:

Way Tags (Original)    
way_id k v
506705172 HFCS Minor Arterial
506705172 highway tertiary
506705172 lanes 2
506705172 name 34th Street Northwest
Aggregate  
way_id JSON tags
506705172 {"lanes": "2", "HFCS": "Minor Arterial","name": "34th Street Northwest","highway": "tertiary"}

 

With the data conditioned, we can import this table into our GeoPackage along with other features, raster, or other geospatial related information. Then it's on to routing.

On To Routing!

There are several open source options for routing on OSM data. However, many of them are tied to an existing file encoding, but can be extended to account for GeoPackage. Plus, many of these routing applications employ the same base set of shortest path algorithms, such as Dijkstra's Algorithm, Bellman Ford's Algorithm, Floyd-Warshall's Algorithm, etc. They can also provide the ability to create a graph based on a given set of geometry. Since we've already established our graph, we can simply use any of these algorithms by calling the graph from the GeoPackage database, find the shortest path, then create a spatial view of the route.

In this section, we employ the following set of scripts in Python, but are generic enough to be applied to other environments.

  • SQLite3 module directly interacts with the GeoPackage
  • Dijkstar 2.4.0 modules to solve the weighted graph

View example code on GitHub

To account for the directionality of one-way streets, we need to use the OSM Tag oneway to provide context on how each way should be traversed. Our current graph table records the start and end point of each way, so directionality could be affected either from start to end, vise-versa, or no impact at all if you can travel in both directions.

In the OSM Tag Schema, oneway can have the following values that restrict direction of travel:

  • yes or 1 restricts travel to the direction the way was digitized
  • -1 restricts travel to the reverse direction the way was digitized

The snippet linked to below calls the graph table twice, to account for both directions on way 62012, which only allows travel in one direction (from 49370759 to 49170395). If a way violates the directionality, it will receive an impractical cost to traverse. Otherwise it will return the length of the way.

View example code on GitHub

Output:      
From To Way Cost
49170395 49370759 62012 1e+10
49370759 49170395 62012 32.8212

We additionally don’t want to route vehicles over sidewalks or people over primary roads, therefore we break up the various highway values into specific profiles. This only defines which ways can be routed on but doesn’t define hierarchies between them (that’s for another post).

View example code on GitHub

Now we finally bring it all together. Calling and weighting the graph, determining the start and end points (node_ids), solving the route, then creating a GeoPackage Spatial View of the ways along the route for analysis in our GIS.

View example code on GitHub

Geopackage Route
Vector: Spatial View of Solved Route Washington, DC © OpenStreetMap, Contributors. Imagery: National Agriculture Imagery Program (NAIP), © USGS, 2015

What’s been demonstrated here is just a simplified approach to extending routing capabilities into GeoPackage. Using the full power of SQLite, we have the ability to interact with our data directly in and out of a traditional GIS environment, opening up possibilities where GeoPackage can have the most impact.

David Wilson is a Geospatial Engineer for Strategic Alliance Consulting, Inc that specializes in Geospatial interoperability with a focus on GeoPackage standards compliance, testing and use. David has over 10 years’ experience working in the Army and the National Geospatial-Intelligence Agency’s (NGA).