Or, determine the maximum sum of numbers from top to bottom of a triangle by following a path defined by adjacent numbers on the line below.
Problem
Given a triangle like below, you need to write an algorithm that maximizes the cost of traveling from point A to point B, where point A is the top of the triangle and point B is any one of the nodes at the bottom of the triangle.
Sample triangle:
5*
/ \
/ \
9* 6
/ \ / \
/ \ / \
4 6* 8
/ \ / \ / \
/ \ / \ / \
0 7* 1 5
The maximum sum of the path taken above is 5+9+6+7 = 27.
The triangle above is stored in a text file with the following format:
Each line is just separated by a '\n' and each number on a line is separated by a space. Nothing too fancy.
Where to start?
One of the tricks to solving this problem is recognizing that it's a graph problem. Some hints in the problem statement make this pretty clear: adjacent numbers, maximum sum/cost, moving from the top number to a bottom number, etc. The first thing you need to do is parse the text file and build the graph representation from the data. The way I do this in Python is:
lines = open(filename).readlines()
lines = [line.strip() for line in lines]
# `lines` is now a list of space-delimited strings of numbers:
# ['5', '9 6', '4 6 8', '0 7 1 5']
# create a graph from the lines:
G = TriangleGraph(lines)
Implied in the problem, the graph is a Directed, Weighted Graph
The next order of business is taking the `lines` list and turning it into a usable directed, weighted graph representation. I created a TriangleGraph class that takes the `lines` list as an init argument and handles all of the graph stuff. Here's the relevant part of the TriangleGraph initialization:
class TriangleGraph(object):
def __init__(self, lines=[]):
self.lines = lines
self.graph = []
self.numNodes = 0
# each node has some data associated with it, such as
# its value, the distance from the source vertex, and
# its immediate predecessor. A vertex's index i will map
# to graph, values, distances, and predecessors.
self.values = []
self.distances = []
self.predecessors = []
# I use adjacency lists to represent the triangle graph
# and its built during initialization.
self.build_adjacency_lists()
def build_adjacency_lists(self):
"""
Build a graph representation from self.lines using adjacency lists.
"""
# account for root node first because we need to track the other
# node's indexes as they're added to the graph.
self.numNodes = 1
# add root node to self.values first because the loop below
# only adds next lines.
self.values.append(int(self.lines[0]))
for i in xrange(len(self.lines)-1):
nextline = self.lines[i+1].split()
# add next line of values
self.values.extend([int(n) for n in nextline])
for j in xrange(len(nextline[:-1])):
# add adjacent numbers on the next line to the graph
# for the current vertex.
self.graph.append({self.numNodes: int(nextline[j]),
self.numNodes+1: int(nextline[j+1])})
# last adjacent vertex added to current vertex's adjacency
# list will be the _first_ vertex added to the next vertex's
# adjacency list, so increment numNodes by only 1.
self.numNodes += 1
# A new line is about to be added to the graph, so make sure
# the first adjacent vertex added to current vertex is indexed
# as one more than the last adjacent vertex added to the previous
# vertex. Otherwise, the last vertex on this line will be adjacent
# to the first vertex on this line.
self.numNodes += 1
# The last line of the triangle contains a list of vertices
# that don't have adjacent vertices, so add empty dictionaries.
last = self.lines[-1].split()
self.graph.extend([{} for i in xrange(len(last))])
The graph now has a usable representation in the form of adjacency lists. For the example triangle referenced at the beginning, its graph representation is now this:
# vertex u has adjacent vertices v1 and v2 with values w1 and w2:
# adjacency dict is formatted as u -> {v1: w1, v2: w2}
0 -> {1: 9, 2: 6}
1 -> {3: 4, 4: 6}
2 -> {4: 6, 5: 8}
3 -> {6: 0, 7: 7}
4 -> {8: 1, 7: 7}
5 -> {8: 1, 9: 5}
6 -> {}
7 -> {}
8 -> {}
9 -> {}
With a directed, weighted graph, you need to include each vertex's value as part of the adjacency list. If it wasn't a weighted graph, vertex 0, for example, would have an adjacency list like so: 0 -> [1, 2] because vertex 1 and 2 are adjacent to vertex 0. Using a dict to represent weighted vertices makes sense. The `self.values` list is also a lookup table keyed by vertex index. I use that in my `weight` function to determine the cost of moving from one vertex to another.
Finding the shortest path through a directed, weighted graph
Now I'm not a computer scientist, so I had to reference my copy of Introduction to Algorithms by Cormen, et al. I initially hacked up a Breadth-First Search and was able to solve the first triangle by simply picking the largest adjacent number on the line below. The BFS failed miserably on more complex triangles. I then remembered reading about Single-Source Shortest-Path algorithms on directed, weighted graphs. Dijkstra's algorithm came to mind. There are two important pieces of Dijkstra's algorithm that need to be modified in order to transform it to a Single-Source Longest-Path algorithm.
A normal implementation of Dijkstra's algorithm consists of an initialization function and a relax function. The initialization function initializes the distances and predecessors arrays so they can be used in the actual algorithm.
# I use the heapq library in standard Python because
# Dijkstra calls for a priority queue.
import heapq
def dijkstra(G, s):
"""
Note: this is valid Python, but it's also pseudo-code
to describe the algorithm. The actual code is inside a class.
"""
# initialization
for v in xrange(len(G)):
d[v] = float("inf")
p[v] = -1
d[s] = 0
S = set()
Q = [i for i in xrange(len(G))]
heapq.heapify(Q)
while Q:
u = heapq.heappop(Q)
S.add(u)
for v in G[u].keys():
w = weight(u, v)
relax(u, v, w)
NOTE: Functions below are defined for Single-Source Shortest-Path.
def initialize_single_source(G, s):
# G is the graph and s is the index of the source vertex
for v in xrange(len(G)):
d[v] = float("inf") # infinity
p[v] = -1
d[s] = 0
`d[v]` is the distances and `p[v]` is the predecessors. The relax function depends on the `d[v]` values and the edge weight. I define my weight function w(u, v) as `values[u] + values[v]`. That is, the cost of traveling from vertex 0 to vertex 1 is 5+9, because values[0] == 5 and values[1] == 9.
Relaxing an edge just means that if the algorithm can improve the shortest path to some vertex `v` by going through vertex `u`, it will update the distances and predecessors array for vertex `v` to go through `u` instead. The definition of the relax function for shortest-path is:
def relax(u, v, w):
# `w` is w(u, v) already computed
if d[v] > d[u] + w:
d[v] = d[u] + w
p[v] = u
Neat, but how do I determine the longest path?
There are two simple changes you need to make to the `initialize_single_source` and `relax` functions. First off, rather than initializing the distances to infinity, initialize them to -1. Now you need to modify the relax function to test if the path can be improved by increasing the cost associated with traveling from vertex v to u. The modification is just `if d[v] < d[u] + w`, then reset v's path to go through vertex u.
NOTE: Functions below are modified for Single-Source Longest-Path
def initialize_single_source(G, s):
# G is the graph and s is the index of the source vertex
for v in xrange(len(G)):
d[v] = -1
p[v] = -1
d[s] = 0
def relax(u, v, w):
# `w` is w(u, v) already computed
if d[v] < d[u] + w:
d[v] = d[u] + w
p[v] = u
Since there are a number of vertices at the bottom of the triangle, you need to find paths to all of them from the source vertex to determine the longest path. After running `dijkstra(G, 0)` (since vertex 0 is the top of the triangle), you can determine the longest path similar to below:
def path(s, v):
"""
If there's a valid path from s to v, sum the cost of that path.
`path(s, v)` follows the path backwards by traversing the
predecessors of v until it hits the source vertex.
"""
if s == v:
return values[s]
elif p[v] == -1:
# no path
return 0
else:
return path(s, p[v]) + values[v]
def longest_path():
total = 0
# len(G)-len(lines) is the first vertex on the last line,
# so iterate from that vertex to the last vertex at len(G)
for i in xrange(len(G)-len(lines), len(G)):
t = path(0, i)
if t > total:
total = t
return total
If you'd like to see my actual working implementation, it's hosted on Github: Solution to Triangle Graph Problem