Better to build or buy a compute grid platform?


What kind of grid are you dealing with? A dozen hosts running the same OS is pretty straightforward to run a grid on. All you really have to deal with is sending work to each host, maybe a little load balancing, maybe what to do if a host goes down, and maybe distributing new service code to the hosts when you update your service. Even if you skip those extras, it's not a big deal, since the grid is a manageable size.

If you're dealing with thousands of hosts, or with a service that should never be down or return errors because a single host went down, then you suddenly have to worry about:

  • not overloading any single host
  • distributing new service code
  • detecting when a host isn't responding and not sending it new work, as well as resending whatever it was working on
  • possibly working across different OSes and architectures (for example, little- vs. big-endian)
  • energy savings - shutting down hosts during low load and bringing them back up for high load
  • scaling - if you add 100 hosts to your grid tomorrow how long does it take to get them connected and working?
  • reliability - some services may actually perform calculations on 2-3 different hosts and only return an answer that all the hosts agree on
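
To give a feel for what even the first few items involve, here is a minimal in-memory sketch in Python. It is not how any particular grid product works. The names (`Host`, `Grid`, `heartbeat`, `assign`, `reap`, `agreed_answer`), the 30-second timeout, and the default capacity of 4 are all made up for illustration, and timestamps are passed in by hand so the behaviour is easy to follow. It only covers picking the least-loaded live host, noticing a silent host and resending its work, and returning an answer only when every replica agrees.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: int                       # max tasks the host may run at once (assumed unit)
    last_seen: float = 0.0              # timestamp of the latest heartbeat
    tasks: set = field(default_factory=set)


class Grid:
    """Toy scheduler; a real one would talk to the hosts over the network."""

    def __init__(self, timeout=30.0):
        self.timeout = timeout          # seconds of silence before a host counts as dead
        self.hosts = {}

    def heartbeat(self, name, now, capacity=4):
        # A new host joins simply by sending its first heartbeat.
        host = self.hosts.setdefault(name, Host(name, capacity))
        host.last_seen = now

    def _alive(self, host, now):
        return now - host.last_seen <= self.timeout

    def assign(self, task, now):
        # Pick the least-loaded live host that still has spare capacity.
        candidates = [h for h in self.hosts.values()
                      if self._alive(h, now) and len(h.tasks) < h.capacity]
        if not candidates:
            return None                 # caller should queue the task and retry later
        host = min(candidates, key=lambda h: len(h.tasks) / h.capacity)
        host.tasks.add(task)
        return host.name

    def reap(self, now):
        # Drop silent hosts and resend whatever they were working on.
        dead = [h for h in self.hosts.values() if not self._alive(h, now)]
        for host in dead:
            del self.hosts[host.name]
        orphaned = sorted(t for h in dead for t in h.tasks)  # task IDs assumed to be strings
        return {task: self.assign(task, now) for task in orphaned}


def agreed_answer(answers):
    """Return the answer only if every replica produced the same one, else None."""
    if answers and all(a == answers[0] for a in answers):
        return answers[0]
    return None
```

In use, each host would call `heartbeat` on a timer, the dispatcher would call `assign` for each incoming task, and something would call `reap` periodically. Everything this sketch leaves out (persistence, network failures, code distribution, mixed architectures, powering hosts up and down) is where the real effort goes.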

That's a short list of things that most grid software should do for you if you need it. If you're working on something small or non-critical, then by all means roll your own. If you're working on something that has to work, or that is big enough that any manual step in the deployment process would be a maintenance nightmare, then you probably want to go with something that already exists.