The Traveling Salesman Problem

The traveling salesman problem has confounded mathematicians and scientists for decades. Ever since its creation in the 1930s, there has been numerous attempts to solve it, to find an algorithm able to find the optimal solution in the smallest amount of time.

The problem goes as follows: a salesman must visit a series of cities. The path that he takes to visit all of them does not matter as long as he visits every city and finish where he started.

Here is an example:

Cities in a Path

The salesman wants to visit all of the cities in the least amount of time (or least distance traveled), so which path should he take?

The Solution

One way to approach it is through genetic algorithms. If you are not familiar with this process, take a look at the article, genetic algorithms for beginners.

This time, the process is a little different:

  1. Initiation

    Initiate a random population of “paths” where each path is the order that the salesman follows to visit the cities. If the cities are A, B, C, D, then one path could be A C B D

  2. Evaluation + Selection + Crossover

    Create a new empty population which will eventually replace the old population.

    Find the fittest (least distance) path of the entire population (this path will be the elite) and automatically add it to the new population.

    Next, breed individuals to fill the new population with offspring. In order to do this, have two tournaments. A tournament happens when the fittest is selected out of a small randomly selected sector of the population and “wins” the tournament. The winners of the two tournaments crossbreed (the next section goes over the process of crossbreeding). The reason for having tournaments is to add diversity as automatically selecting the fittest few paths will slowly kill diversity which is not desired.

    The child is then subjected to mutations and added to the new population. Do this until the new population is filled up.

    Replace old population with new population.

  3. Repeat steps 2 to 5 several hundred times.

Crossbreeding

This is how crossbreeding works in this context.

Lets say the cities are A, B, C, D, E, F

Assume that one parent’s order is A C B D F E and one parent’s order is B A C D E F

The child is created by randomly selecting one segment of the first parent and filling the rest of the child with the cities that are not used in the order of the second parent.

Let’s say the segment is from the 3rd city to the 5th city.

The child would be _ _ B D F _ for now (A C B D F E)

Then the missing cities are added in the order of the second parent. So

A _ B D F _ as B is already in the path.

A C B D F _ then

A C B D F E

The Code

Path.py

This creates the city object, each of which has a coordinates and an unique ID to help differentiate them.

It also creates the path object. Each path is created with the order of cities traveled, a unique ID, and a Boolean value determining if the path should be randomized (this only applies in the beginning where every path is randomized)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
import random
import numpy as np


class City:
    def __init__(self, x, y, identifier):
        self.x = x
        self.y = y
        self.id = identifier

    # return coordinates and ID
    def __getattr__(self, key):
        return {
            'x': self.x,
            'y': self.y,
            'id': self.id,
        }[key]

    def __str__(self):
        return "ID: " + str(self.id) + " X: " + str(self.x) + " Y: " + str(self.y)

    # calculates distance between itself and another city
    def comp_dist(self, city):
        return ((self.x - city.x) ** 2 + (self.y - city.y) ** 2) ** 0.5


class Path:
    def __init__(self, cities, identifier, is_random=True):
        self.num_loc = len(cities)
        self.id = identifier
        self.cities = cities
        if is_random:
            self.cities = np.random.permutation(cities)

    # returns number of cities
    def __len__(self):
        return len(self.cities)

    # change the ID of the path
    # only used to assign this path as the elite (ID = 0)
    def change_id(self, identifier):
        self.id = identifier

    # change city in path
    def change(self, index, new_city):
        self.cities[index] = new_city

    # return the location of a city given its index
    def get_loc(self, index=None):
        if index is not None:
            return self.cities[index]
        return self.cities

    # check if the path contains the city
    def contains(self, city):
        for location in self.cities:
            if location is not None and location.id == city.id:
                return True
        return False

    # switch two cities in the path
    def mutate(self):
        mutate_loc = random.sample(range(0, self.num_loc), 2)
        self.cities[mutate_loc[0]], self.cities[mutate_loc[1]] = self.cities[mutate_loc[1]], self.cities[
            mutate_loc[0]]

    # calculates the total distance traveled by the salesman
    def calc_fitness(self):
        total_dist = 0
        for i in range(len(self.cities)):
            if i == len(self.cities) - 1:
                total_dist += self.cities[i].comp_dist(self.cities[0])
            else:
                total_dist += self.cities[i].comp_dist(self.cities[i + 1])
        return total_dist

    def __str__(self):
        return "City: " + str([loc.id for loc in self.cities]) + " Fitness: " + str(self.calc_fitness())

# breed two paths by tak
def cross_over(init_path, seed_path, new_id):
    num_loc = len(init_path)
    # creates child path that is currently empty
    child = Path([None] * num_loc, new_id)
    # find two points where the first segment will be transferred to the child
    cut_loc = random.sample(range(0, num_loc + 1), 2)
    # transfer segment to child
    for index in range(num_loc):
        if min(cut_loc) <= index < max(cut_loc):
            child.change(index, init_path.get_loc(index))
    # fill the rest of the child with the cities' order of the second parent
    for city in seed_path.get_loc():
        if not child.contains(city):
            for i in range(num_loc):
                if child.get_loc(i) is None:
                    child.change(i, city)
                    break
    return child

Paths.py

This is the evolutional equivalent of a population. It contains every single path (“individual”).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
from Path import Path
from Path import City                                                                           
import random

class Paths:
    def __init__(self, tours=None):
        if tours is not None:
            self.tours = tours
        else:
            self.tours = []

    def add_path(self, tour):
        self.tours.append(tour)

    def __len__(self):
        return len(self.tours)

    def get(self,identifier=None):
        if identifier is None:
            return self.tours
        return self.tours[identifier]

    def get_fittest(self):
        fittest = self.tournament(len(self))
        return fittest

    def calc_fitness(self):
        sum = 0
        for tour in self.tours:
            sum += tour.calc_fitness()
        return sum

    def tournament(self, num_players):
        players = random.sample(self.tours, num_players)
        min_fitness = float("inf")
        fittest = players[0]
        for player in players:
            if player.calc_fitness() < min_fitness:
                min_fitness = player.calc_fitness()
                fittest = player
        return fittest                                    

Evolution.py

This is where the evolution happens, the main program. It first displays a graph of every city’s location as a blue dot and after calculating through the evolutionary process, will display the path that the salesman should take. More iterations and a larger population size would result in a more efficient solution, but would take more time to compute.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
from Path import Path
from Path import City
from Paths import Paths
from Path import cross_over
import matplotlib.pyplot as plt
import random
import time

# follow the process of evolution and find the new population
def evolve(old_pop, num_mutations, tournaments, num_elite=1):
    new_pop = Paths()
    # auto add the elite (fittest) to new population
    fittest = old_pop.get_fittest()
    fittest.change_id(0)
    new_pop.add_path(fittest)
    # fill rest of new population with children
    for i in range(num_elite, len(old_pop)):
        parent1 = old_pop.tournament(tournaments)
        parent2 = old_pop.tournament(tournaments)
        child = cross_over(parent1, parent2, i)
        for j in range(num_mutations):
            child.mutate()
        new_pop.add_path(child)

    return new_pop


if __name__ == "__main__":
    # initiation
    cities = []
    # create randomized cities
    for index in range(40):
        cities.append(City(10 * random.random(), 10 * random.random(), index))
    # size of sector of population to have tournament with
    tournament_size = 5
    # population size
    pop_size = 40
    # number of mutations to the child
    mutation_num = 1
    # repeat evolution this many times
    trials = 3000

    # show the graph of the cities
    plt.scatter([city.x for city in cities], [city.y for city in cities])
    plt.show()

    # create population
    pop = Paths([Path(cities, i) for i in range(pop_size)])
    fitness = []

    for index in range(1, trials):
        # evolution
        pop = evolve(pop, mutation_num, tournament_size)
        fitness.append(pop.get(0).calc_fitness())
        if index % 100 == 0:
            print(str(index) + " " + str(pop.get(0).calc_fitness()))

    # plot fitness of the population over time
    f1 = plt.figure()
    plt.scatter(range(len(fitness)), fitness)
    # plot the optimal path of the salesman
    f2 = plt.figure()
    plt.scatter([city.x for city in cities], [city.y for city in cities])
    for index in range(len(cities)):
        plt.plot([pop.get(0).get_loc(index).x, pop.get(0).get_loc((index + 1) % len(cities)).x],
                 [pop.get(0).get_loc(index).y, pop.get(0).get_loc((index + 1) % len(cities)).y])
    plt.show()

The Result

The result after tweaking the number of trials as well as the size of the population would result in something that looks like this:

Graph with Cities

Yay! This shows the shortest path to take to travel between cities, shaving valuable hours off the salesman's travel.