The traveling salesman problem has confounded mathematicians and scientists for decades. Ever since its creation in the 1930s, there has been numerous attempts to solve it, to find an algorithm able to find the optimal solution in the smallest amount of time.
The problem goes as follows: a salesman must visit a series of cities. The path that he takes to visit all of them does not matter as long as he visits every city and finish where he started.
Here is an example:
The salesman wants to visit all of the cities in the least amount of time (or least distance traveled), so which path should he take?
One way to approach it is through genetic algorithms. If you are not familiar with this process, take a look at the article, genetic algorithms for beginners.
This time, the process is a little different:
Initiation
Initiate a random population of “paths” where each path is the order that the salesman follows to visit the cities. If the cities are A, B, C, D, then one path could be A C B D
Evaluation + Selection + Crossover
Create a new empty population which will eventually replace the old population.
Find the fittest (least distance) path of the entire population (this path will be the elite) and automatically add it to the new population.
Next, breed individuals to fill the new population with offspring. In order to do this, have two tournaments. A tournament happens when the fittest is selected out of a small randomly selected sector of the population and “wins” the tournament. The winners of the two tournaments crossbreed (the next section goes over the process of crossbreeding). The reason for having tournaments is to add diversity as automatically selecting the fittest few paths will slowly kill diversity which is not desired.
The child is then subjected to mutations and added to the new population. Do this until the new population is filled up.
Replace old population with new population.
This is how crossbreeding works in this context.
Lets say the cities are A, B, C, D, E, F
Assume that one parent’s order is A C B D F E and one parent’s order is B A C D E F
The child is created by randomly selecting one segment of the first parent and filling the rest of the child with the cities that are not used in the order of the second parent.
Let’s say the segment is from the 3rd city to the 5th city.
The child would be _ _ B D F _ for now (A C B D F E)
Then the missing cities are added in the order of the second parent. So
A _ B D F _ as B is already in the path.
A C B D F _ then
A C B D F E
Path.py
This creates the city object, each of which has a coordinates and an unique ID to help differentiate them.
It also creates the path object. Each path is created with the order of cities traveled, a unique ID, and a Boolean value determining if the path should be randomized (this only applies in the beginning where every path is randomized)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 | import random import numpy as np class City: def __init__(self, x, y, identifier): self.x = x self.y = y self.id = identifier # return coordinates and ID def __getattr__(self, key): return { 'x': self.x, 'y': self.y, 'id': self.id, }[key] def __str__(self): return "ID: " + str(self.id) + " X: " + str(self.x) + " Y: " + str(self.y) # calculates distance between itself and another city def comp_dist(self, city): return ((self.x - city.x) ** 2 + (self.y - city.y) ** 2) ** 0.5 class Path: def __init__(self, cities, identifier, is_random=True): self.num_loc = len(cities) self.id = identifier self.cities = cities if is_random: self.cities = np.random.permutation(cities) # returns number of cities def __len__(self): return len(self.cities) # change the ID of the path # only used to assign this path as the elite (ID = 0) def change_id(self, identifier): self.id = identifier # change city in path def change(self, index, new_city): self.cities[index] = new_city # return the location of a city given its index def get_loc(self, index=None): if index is not None: return self.cities[index] return self.cities # check if the path contains the city def contains(self, city): for location in self.cities: if location is not None and location.id == city.id: return True return False # switch two cities in the path def mutate(self): mutate_loc = random.sample(range(0, self.num_loc), 2) self.cities[mutate_loc[0]], self.cities[mutate_loc[1]] = self.cities[mutate_loc[1]], self.cities[ mutate_loc[0]] # calculates the total distance traveled by the salesman def calc_fitness(self): total_dist = 0 for i in range(len(self.cities)): if i == len(self.cities) - 1: total_dist += self.cities[i].comp_dist(self.cities[0]) else: total_dist += self.cities[i].comp_dist(self.cities[i + 1]) return total_dist def __str__(self): return "City: " + str([loc.id for loc in self.cities]) + " Fitness: " + str(self.calc_fitness()) # breed two paths by tak def cross_over(init_path, seed_path, new_id): num_loc = len(init_path) # creates child path that is currently empty child = Path([None] * num_loc, new_id) # find two points where the first segment will be transferred to the child cut_loc = random.sample(range(0, num_loc + 1), 2) # transfer segment to child for index in range(num_loc): if min(cut_loc) <= index < max(cut_loc): child.change(index, init_path.get_loc(index)) # fill the rest of the child with the cities' order of the second parent for city in seed_path.get_loc(): if not child.contains(city): for i in range(num_loc): if child.get_loc(i) is None: child.change(i, city) break return child |
Paths.py
This is the evolutional equivalent of a population. It contains every single path (“individual”).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | from Path import Path from Path import City import random class Paths: def __init__(self, tours=None): if tours is not None: self.tours = tours else: self.tours = [] def add_path(self, tour): self.tours.append(tour) def __len__(self): return len(self.tours) def get(self,identifier=None): if identifier is None: return self.tours return self.tours[identifier] def get_fittest(self): fittest = self.tournament(len(self)) return fittest def calc_fitness(self): sum = 0 for tour in self.tours: sum += tour.calc_fitness() return sum def tournament(self, num_players): players = random.sample(self.tours, num_players) min_fitness = float("inf") fittest = players[0] for player in players: if player.calc_fitness() < min_fitness: min_fitness = player.calc_fitness() fittest = player return fittest |
Evolution.py
This is where the evolution happens, the main program. It first displays a graph of every city’s location as a blue dot and after calculating through the evolutionary process, will display the path that the salesman should take. More iterations and a larger population size would result in a more efficient solution, but would take more time to compute.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 | from Path import Path from Path import City from Paths import Paths from Path import cross_over import matplotlib.pyplot as plt import random import time # follow the process of evolution and find the new population def evolve(old_pop, num_mutations, tournaments, num_elite=1): new_pop = Paths() # auto add the elite (fittest) to new population fittest = old_pop.get_fittest() fittest.change_id(0) new_pop.add_path(fittest) # fill rest of new population with children for i in range(num_elite, len(old_pop)): parent1 = old_pop.tournament(tournaments) parent2 = old_pop.tournament(tournaments) child = cross_over(parent1, parent2, i) for j in range(num_mutations): child.mutate() new_pop.add_path(child) return new_pop if __name__ == "__main__": # initiation cities = [] # create randomized cities for index in range(40): cities.append(City(10 * random.random(), 10 * random.random(), index)) # size of sector of population to have tournament with tournament_size = 5 # population size pop_size = 40 # number of mutations to the child mutation_num = 1 # repeat evolution this many times trials = 3000 # show the graph of the cities plt.scatter([city.x for city in cities], [city.y for city in cities]) plt.show() # create population pop = Paths([Path(cities, i) for i in range(pop_size)]) fitness = [] for index in range(1, trials): # evolution pop = evolve(pop, mutation_num, tournament_size) fitness.append(pop.get(0).calc_fitness()) if index % 100 == 0: print(str(index) + " " + str(pop.get(0).calc_fitness())) # plot fitness of the population over time f1 = plt.figure() plt.scatter(range(len(fitness)), fitness) # plot the optimal path of the salesman f2 = plt.figure() plt.scatter([city.x for city in cities], [city.y for city in cities]) for index in range(len(cities)): plt.plot([pop.get(0).get_loc(index).x, pop.get(0).get_loc((index + 1) % len(cities)).x], [pop.get(0).get_loc(index).y, pop.get(0).get_loc((index + 1) % len(cities)).y]) plt.show() |
The result after tweaking the number of trials as well as the size of the population would result in something that looks like this:
Yay! This shows the shortest path to take to travel between cities, shaving valuable hours off the salesman's travel.