In a previous tutorial, we talked about the Depth First Search algorithm, which visits every node on its way from A to B but does not necessarily find the shortest path.

In this tutorial, we will implement Dijkstra’s algorithm in Python to find the shortest (and the longest) path from one point to another.

One major difference between Dijkstra’s algorithm and Depth First Search (DFS) is how each chooses the next node to explore: DFS follows one branch at a time using a stack, while Dijkstra’s algorithm uses a priority queue (typically a heap) to always expand the cheapest known path first. That choice is exactly what guarantees that Dijkstra’s algorithm finds the shortest path.

## Pathfinding Problem

Pathfinding is so prevalent that much of the job must be automated through the use of computer systems and pathfinding algorithms to keep up with our routing needs. However, this shift to computer systems comes with a unique set of challenges to overcome.

The first obstacle we are faced with when writing a pathfinding algorithm is one of representation. We need our computer to contain a model of the system we are trying to investigate that it can manipulate and on which it can perform calculations.

One such model is the mathematical object known as a graph (depicted below):

A graph is simply a set of nodes connected by edges. It may be helpful to draw an analogy to a city’s road system. In our analogy, nodes correspond to intersections and edges represent the streets between those intersections.

Each edge is assigned a value called a cost which is determined by some measure of how hard it is to travel over this edge.

In our streets analogy, a low cost edge is a road that is quick and easy to travel like a multi-lane highway with a high speed limit. Conversely, a high cost edge might represent an alley or a particularly congested street.

## Adjacency List Representation

This graph can mathematically formalize our road system, but we still need some way to represent it in code.

One way to do this is with adjacency lists which is a method of storing our graph in memory by associating each node with its neighbors and the cost of the edge between them. In Python, we can do this with a dictionary (other languages might use linked lists). For instance:

```python
dictionary_graph = {'A': {'C': 5, 'D': 1, 'E': 2},
                    'E': {'A': 2, 'F': 3},
                    'D': ...}
```

As you can see, the dictionary in dictionary_graph['A'] contains each of A's neighbors and the cost of the edge between A and that neighbor, which is all the information we need to know about A.

If we record the same information about all nodes in our graph, then we will have completely translated the graph into code.
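Looking up edge information then comes down to ordinary dictionary accesses. A minimal sketch, using the partial graph from above:

```python
# Adjacency list for part of the example graph
dictionary_graph = {'A': {'C': 5, 'D': 1, 'E': 2},
                    'E': {'A': 2, 'F': 3}}

# All of A's neighbors and the cost of each edge
print(dictionary_graph['A'])           # {'C': 5, 'D': 1, 'E': 2}

# Cost of the single edge A -> D
print(dictionary_graph['A']['D'])      # 1

# Safely check for an edge that may not exist
print(dictionary_graph['A'].get('F'))  # None, since A and F share no edge
```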

It is important to note that a graph could have two different cost values attached to an edge corresponding to different directions of travel.

For example, moving from A to E could have a cost of two while moving from E to A costs 9. In our roads analogy, this might represent one-way roads that are easy to travel in one direction but exceedingly hard to travel in the other.

If our graph contained such double valued edges, we could simply store the different edge costs under the different keys of our graph dictionary with some standard for which value gets saved to which key. For example:

dictionary_graph={'A':{...,'E':2}...,'E':{...,'A':9}}

Here, we have opted to store the cost of edge A->E under the ‘A’ key of dictionary_graph while we store the cost of edge E->A under the ‘E’ key.

## Adjacency Matrix Representation

Another method of representing our graph in code is with an adjacency matrix. An adjacency matrix organizes the cost values of our edges into rows and columns based on which nodes each edge connects.

This is similar to an adjacency list in that it records neighbor and edge cost information for every node, but with a different method of information storage.

Let’s put together an adjacency matrix to see how it works. First, we assign integer indices to our nodes making sure to start our indices at 0. (i.e. A=0, B=1, C=2…).
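In code, this index assignment can be built with a small dictionary comprehension. A sketch, assuming the nine node labels A through I used in this tutorial:

```python
# Map each node label to an integer index, starting at 0
nodes = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']
node_index = {node: i for i, node in enumerate(nodes)}

print(node_index['A'])  # 0
print(node_index['C'])  # 2
```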

We then initialize an N by N array where N is the number of nodes in our graph. We will use a NumPy array to build our matrix:

```python
import numpy as np

n = 9
adjacency_matrix_graph = np.zeros((n, n))
```

Now we can start populating our array by assigning elements of the array cost values from our graph. Each element of our array represents a possible connection between two nodes.

For instance, element (0,2), corresponding to the number in row 0 column 2, should be filled with the cost value of the edge between nodes A and C which is 5. We can assign a 5 to element (0,2) with:

```python
adjacency_matrix_graph[0, 2] = 5
```

The empty (left) and fully populated (right) arrays can be seen below:

As you can see, the adjacency matrix contains an element for every possible edge connection even if no such connection exists in our graph.

In this case, the edge cost is given a value of 0. Additionally, the main diagonal of this array always contains zeros as these positions represent the edge cost between each node and itself which is definitionally zero.

The adjacency matrix can easily hold information about directional edges as the cost of an edge going from A to C is held in index (0,2) while the cost of the edge going from C to A is held in (2,0).
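The whole matrix can be populated from the adjacency list with a pair of loops. A sketch using the full example graph from this tutorial (the alphabetical A=0, B=1, … index assignment is the one described above):

```python
import numpy as np

graph = {'A': {'C': 5, 'D': 1, 'E': 2}, 'B': {'H': 1, 'G': 3},
         'C': {'I': 2, 'D': 3, 'A': 5}, 'D': {'C': 3, 'A': 1, 'H': 2},
         'E': {'A': 2, 'F': 3}, 'F': {'E': 3, 'G': 1},
         'G': {'F': 1, 'B': 3, 'H': 2}, 'H': {'I': 2, 'D': 2, 'B': 1, 'G': 2},
         'I': {'C': 2, 'H': 2}}

node_index = {node: i for i, node in enumerate(sorted(graph))}  # A=0, B=1, ...
n = len(graph)
adjacency_matrix_graph = np.zeros((n, n))

# One element per directed edge; absent edges stay 0
for node, neighbors in graph.items():
    for neighbor, cost in neighbors.items():
        adjacency_matrix_graph[node_index[node], node_index[neighbor]] = cost

print(adjacency_matrix_graph[0, 2])  # cost of A -> C: 5.0
print(adjacency_matrix_graph[0, 0])  # main diagonal is always 0.0
```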

## Computation Time and Memory Comparisons

The adjacency list and adjacency matrix representations are functionally the same, but there are differences when it comes to factors such as size of representation in memory and speed of performing actions.

The adjacency list only has to store each node once and its edges twice (once for each node connected by the edge) making it O(|N|+|E|) where E is the number of edges and N is the number of nodes.

By contrast, the adjacency matrix always requires an N x N array to be loaded into memory, making its memory space O(|N|^2). The extra space is required because the adjacency matrix stores a lot of redundant information, such as the value of edges that do not exist.

Once our graph representations are stored in memory, the only action we perform on them is querying for entries. Because the adjacency matrix can query any location directly when supplied with two indices, its query time complexity is O(1).

The adjacency list representation is a bit more complicated. Normally, adjacency lists are built with linked lists which would have a query time complexity of O(|N|), but we are using Python dictionaries that access information differently.

Python dictionaries have an average query time complexity of O(1), but can take as long as O(|N|).

## Difficulties of Pathfinding

Now that we can model real-world pathing systems in code, we can begin searching for interesting paths through our graphs computationally.

For many applications, we are looking for the easiest way to get from a starting location to a given destination. This would correspond to the path with the lowest total cost in our graph.

To find such a path, we would need a way of knowing whether a given path is shorter than all other possible paths. We could simply find all possible paths from A to B along with their costs and pluck out the shortest one.

This would work fine on a graph as simple as the one we are considering, but this method is inefficient and quickly becomes intractable for larger and more complicated networks.

What we would like is an algorithm that searches through the most promising paths first and can halt once it has found the shortest path.

Dijkstra’s algorithm fulfills both of these requirements through a simple method. It starts at a source node and incrementally searches down all possible paths to a destination.

However, when deciding which path to increment it always advances the shortest current path. By doing so, it preferentially searches down low cost paths first and guarantees that the first path found to the destination is the shortest.
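The "always advance the shortest current path" rule is usually implemented with a priority queue. The tutorial's own implementation below manages costs by deleting entries from dictionaries instead, so the following is an alternative sketch using Python's heapq module, not the article's code:

```python
import heapq

def dijkstra(graph, source, target):
    # Priority queue of (cost so far, node, path taken); the heap always
    # pops the cheapest frontier path first
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, edge_cost in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbor, path + [neighbor]))
    return float('inf'), []  # target unreachable

graph = {'A': {'C': 5, 'D': 1, 'E': 2}, 'B': {'H': 1, 'G': 3},
         'C': {'I': 2, 'D': 3, 'A': 5}, 'D': {'C': 3, 'A': 1, 'H': 2},
         'E': {'A': 2, 'F': 3}, 'F': {'E': 3, 'G': 1},
         'G': {'F': 1, 'B': 3, 'H': 2}, 'H': {'I': 2, 'D': 2, 'B': 1, 'G': 2},
         'I': {'C': 2, 'H': 2}}
print(dijkstra(graph, 'A', 'B'))  # (4, ['A', 'D', 'H', 'B'])
```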

## Dijkstra’s Shortest Path: Python Setup

Let’s walk through a couple iterations of Dijkstra’s algorithm on the above graph to get a feel for how it works. We will be using the adjacency list representation for our graph and pathing from node A to node B.

```python
graph = {'A': {'C': 5, 'D': 1, 'E': 2},
         'B': {'H': 1, 'G': 3},
         'C': {'I': 2, 'D': 3, 'A': 5},
         ...}
```

We will want to keep track of the cost of pathing from our source node to all other nodes in our graph. We can do this with another dictionary.

During our search, we may find several routes to a given node, but we only update the dictionary if the path we are exploring is shorter than any we have seen so far.

```python
from numpy import inf

costs = {'A': 0, 'B': inf, 'C': inf, 'D': inf, 'E': inf,
         'F': inf, 'G': inf, 'H': inf, 'I': inf}
```

To begin, we assume that the cost of getting from our source node (A) to any other node is infinite.

This represents both our lack of knowledge about each path as well as the possibility that certain nodes are impossible to reach from our source node. The cost of pathing from A to A is definitionally 0.

As we discover the shortest path to a given node and record it in our costs dictionary, we will also want to keep track of which nodes this path goes through. We can store this information in another dictionary.

```python
parents = {}
```

Rather than storing the entire path to each node, we can get away with storing only the last step on the path. This is because the previous node on our path also has an entry in our dictionary as we must have pathed to it first.

Therefore, we can simply look back to the last step on the previous node’s path. Repeating this until we reach the source node will reconstruct the entire path to our target node.
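This look-back reconstruction can be sketched in a few lines. The parents dictionary below is illustrative, filled with the values a search from 'A' on this tutorial's graph would produce:

```python
# Each node maps to the previous node on the shortest path from 'A'
parents = {'C': 'D', 'D': 'A', 'E': 'A', 'H': 'D',
           'F': 'E', 'I': 'H', 'B': 'H', 'G': 'H'}

def reconstruct(source, target, parents):
    path = [target]
    while path[-1] != source:         # step back until we reach the source
        path.append(parents[path[-1]])
    return path[::-1]                 # reverse into source -> target order

print(reconstruct('A', 'B', parents))  # ['A', 'D', 'H', 'B']
```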

## Dijkstra’s Shortest Path: Step by Step

To follow Dijkstra’s algorithm we start on node A and survey the cost of stepping to the neighbors of A. If we come across a path with a lower cost than any we have recorded already, then we update our costs dictionary.

As this is our first survey, all costs will be updated and all steps will be recorded.

Once a node has been explored it is no longer a candidate for stepping to as paths cannot loop back onto themselves. We therefore remove it from the cost dictionary and adjacency dictionaries of its neighbors. This can all be executed with the following snippet.

```python
for neighbor in graph['A']:
    if graph['A'][neighbor] + costs['A'] < costs[neighbor]:
        costs[neighbor] = graph['A'][neighbor] + costs['A']
        parents[neighbor] = 'A'
    del graph[neighbor]['A']
del costs['A']
```

In the second line, we add the cost of the path to the node we are currently on to the cost of pathing to the neighbor under consideration because we care about the cost of pathing from A to each node, not just the cost of any given step.

We then determine the shortest path we can pursue by looking for the minimum element of our costs dictionary which can be returned with:

```python
nextNode = min(costs, key=costs.get)
```

In this case, nextNode returns D because the lowest cost neighbor of A is D. Now that we are at D, we survey the cost of pathing to all neighbors of D **and** the unvisited neighbors of A.

Given that we have already recorded the costs of pathing to neighbors of A, we only need to calculate the cost of pathing to neighbors of D.

However, finding the cost of pathing to neighbors of D is an identical task to what we just performed with A, so we could simply run the above code replacing ‘A’ with nextNode.

## Putting it all Together

Now that we understand the individual steps in Dijkstra’s algorithm, we can loop over our data to find the shortest path.

```python
from numpy import inf

graph = {'A': {'C': 5, 'D': 1, 'E': 2},
         'B': {'H': 1, 'G': 3},
         'C': {'I': 2, 'D': 3, 'A': 5},
         'D': {'C': 3, 'A': 1, 'H': 2},
         'E': {'A': 2, 'F': 3},
         'F': {'E': 3, 'G': 1},
         'G': {'F': 1, 'B': 3, 'H': 2},
         'H': {'I': 2, 'D': 2, 'B': 1, 'G': 2},
         'I': {'C': 2, 'H': 2}}
costs = {'A': 0, 'B': inf, 'C': inf, 'D': inf, 'E': inf,
         'F': inf, 'G': inf, 'H': inf, 'I': inf}
parents = {}

def search(source, target, graph, costs, parents):
    nextNode = source
    while nextNode != target:
        for neighbor in graph[nextNode]:
            if graph[nextNode][neighbor] + costs[nextNode] < costs[neighbor]:
                costs[neighbor] = graph[nextNode][neighbor] + costs[nextNode]
                parents[neighbor] = nextNode
            del graph[neighbor][nextNode]
        del costs[nextNode]
        nextNode = min(costs, key=costs.get)
    return parents

result = search('A', 'B', graph, costs, parents)

def backpedal(source, target, searchResult):
    node = target
    backpath = [target]
    path = []
    while node != source:
        backpath.append(searchResult[node])
        node = searchResult[node]
    for i in range(len(backpath)):
        path.append(backpath[-i - 1])
    return path

print('parent dictionary={}'.format(result))
print('shortest path={}'.format(backpedal('A', 'B', result)))
```

Running this code produces the output:

```
parent dictionary={'C': 'D', 'D': 'A', 'E': 'A', 'H': 'D', 'F': 'E', 'I': 'H', 'B': 'H', 'G': 'H'}
shortest path=['A', 'D', 'H', 'B']
```

Success! The code within the while loop inside the search function is identical to what we saw above except for replacing the static node ‘A’ with the dynamic variable nextNode.

This function returns the parents dictionary which stores the shortest path by correlating each node with the previous node on the shortest path.

In this example, ‘B’ points to ‘H’ which points to ‘D’ which points back to ‘A’. The backpedal function loops over the parent dictionary output by the search function and returns a reconstructed shortest path in the form of a list.

## Longest Path and Maze Solving

Dijkstra’s algorithm can be modified to solve different pathfinding problems. For example, three slight adjustments (to the cost initialization, the comparison, and the next-node selection) change our shortest-path-finding algorithm into a longest-path-finding algorithm.

```python
costs = {'A': 0, 'B': -inf, 'C': -inf, 'D': -inf, 'E': -inf,
         'F': -inf, 'G': -inf, 'H': -inf, 'I': -inf}   # initialize to -inf
...
            if graph[nextNode][neighbor] + costs[nextNode] > costs[neighbor]:   # flip the comparison
...
        nextNode = max(costs, key=costs.get)   # advance the highest-cost path
```

These changes amount to initializing unknown costs to negative infinity and searching through paths in order of highest cost.
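Folding those three changes into the full program gives the longest-path variant as a complete, runnable sketch; everything else is unchanged from the shortest-path code:

```python
from numpy import inf

graph = {'A': {'C': 5, 'D': 1, 'E': 2},
         'B': {'H': 1, 'G': 3},
         'C': {'I': 2, 'D': 3, 'A': 5},
         'D': {'C': 3, 'A': 1, 'H': 2},
         'E': {'A': 2, 'F': 3},
         'F': {'E': 3, 'G': 1},
         'G': {'F': 1, 'B': 3, 'H': 2},
         'H': {'I': 2, 'D': 2, 'B': 1, 'G': 2},
         'I': {'C': 2, 'H': 2}}
# Unknown costs start at negative infinity
costs = {'A': 0, 'B': -inf, 'C': -inf, 'D': -inf, 'E': -inf,
         'F': -inf, 'G': -inf, 'H': -inf, 'I': -inf}
parents = {}

def search(source, target, graph, costs, parents):
    nextNode = source
    while nextNode != target:
        for neighbor in graph[nextNode]:
            # Flipped comparison: keep the most expensive path found so far
            if graph[nextNode][neighbor] + costs[nextNode] > costs[neighbor]:
                costs[neighbor] = graph[nextNode][neighbor] + costs[nextNode]
                parents[neighbor] = nextNode
            del graph[neighbor][nextNode]
        del costs[nextNode]
        # Flipped selection: advance the highest-cost frontier path
        nextNode = max(costs, key=costs.get)
    return parents

def backpedal(source, target, searchResult):
    node = target
    backpath = [target]
    path = []
    while node != source:
        backpath.append(searchResult[node])
        node = searchResult[node]
    for i in range(len(backpath)):
        path.append(backpath[-i - 1])
    return path

result = search('A', 'B', graph, costs, parents)
longest = backpedal('A', 'B', result)
print('longest path={}'.format(longest))  # ['A', 'C', 'D', 'H', 'G', 'B']
```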

Dijkstra can also be implemented as a **maze solving algorithm** simply by **converting the maze into a graph**.

This can be done by carving your maze into a grid and assigning each pixel a node and linking connected nodes with equal value edges. However, with large mazes this method can start to strain system memory.

This problem can be mitigated by removing redundant nodes. For example, this section of maze (left) is identically represented by both graphs shown below.

“Solving” a maze would then amount to setting the entrance of the maze as an input node and the exit as the target node and running Dijkstra’s like normal.
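The grid-to-graph conversion can be sketched directly. The tiny maze below is a hypothetical example (0 = open cell, 1 = wall); each open cell becomes a node keyed by its (row, col) position, linked to open neighbors with equal-cost edges, and a standard heap-based Dijkstra then paths from entrance to exit:

```python
import heapq

# A tiny illustrative maze: 0 = open cell, 1 = wall
maze = [[0, 1, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 0]]

# Build an adjacency list: each open cell becomes a node keyed by
# (row, col), linked to adjacent open cells with cost 1
graph = {}
rows, cols = len(maze), len(maze[0])
for r in range(rows):
    for c in range(cols):
        if maze[r][c] == 0:
            neighbors = {}
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and maze[nr][nc] == 0:
                    neighbors[(nr, nc)] = 1
            graph[(r, c)] = neighbors

# Run Dijkstra from the entrance node to the exit node
def solve(graph, entrance, exit_):
    queue, visited = [(0, entrance, [entrance])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == exit_:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                heapq.heappush(queue, (cost + 1, neighbor, path + [neighbor]))
    return None  # no route through the maze

print(solve(graph, (0, 0), (0, 3)))
```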

Dijkstra’s has a couple of nice properties as a maze-solving algorithm. Because it does not search nodes more than once, if a dead end or loop is encountered it automatically falls back to the last viable junction.

In addition, if multiple solutions to the maze exist, it will find the shortest.

Enthusiastic software developer with 5 years of Python experience. Fascinated by data and analysis including a keen interest in machine learning. Always looking to learn new skills and not afraid to dive into complicated systems. A background in physics and mathematics allows for organic navigation and understanding of unfamiliar problem landscapes.

This article was originally posted on likegeeks.com.