best possible implementation of the travelling salesman / vehicle routing use case best possible implementation of the travelling salesman / vehicle routing use case hadoop hadoop

best possible implementation of the travelling salesman / vehicle routing use case


There's no right way of solving a NP problem. Since complexity is exponential it's going to take a very long time for anything other than trivial examples.

However, there are approximations that can get fairly close to the real answer and might be sufficiently good for your application (as in, it's not the shortest path, but it's within some relative range of it).

Check our the wikipedia page. They even have some examples.


Indeed as Dmitry mentions this is a case of the multiple travelling salesman porblem. Being NP-hard naturally the interviewers are looking for you to suggest a heursitic optimisation algorithm.

I think the key in this case is they are looking for an algorithm which is able to update in real time to changes in the number and location of destinations. Ant colony optimisaiton (a form of particle swarm optimisation) was actually initially formulated for the travelling salesman problem, see the paper and wikipedia:

https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms

"M. Dorigo, V. Maniezzo, et A. Colorni, Ant system: optimization by a colony of cooperating agents, IEEE Transactions on Systems, Man, and Cybernetics--Part B , volume 26, numéro 1, pages 29-41, 1996."

This has been generalised since to the multiple travelling saleman problem see for example this paper (opensource) for some nice work into it:

http://www.researchgate.net/publication/263389346_Multi-type_ant_colony_system_for_solving_the_multiple_traveling_salesman_problem

In an interview situation, I would detail it has the Pros as: 1. being an efficient heuristic solution; 2. Able to update in real time to both changes in the graph; 3. For bonus points I mention that, once a reasonably efficient solution has been obtained in silico, drivers themselves could be assigned routes in a slightly probabilisitc way, subsequently optimisation driven by real data could be performed.

Cons are that reasonably large amounts of proccessing power are likely required compared to say problems that first reduce the search space as Dmitry suggested. Secondly if they want you to actually draw up an alogirthm this could be quite challenging in the space of an interview.

Interesting question :)


If I would asked this question on interview - I will propose something like described in this paper, looks like best match for your task's formulation. In this paper you will find optimized approximate approach to solve multiple salesmen problem with all salesmen starting in one point. It can be adopted if we know where employees leave by solving each single travel salesman subproblem (clustering divides main problem to classic problems) with start at specific salesman's home/home office.

If we have graph of places as an input, not just coordinates - we can replace k-means with graph clustering algorithm like MCL.