Should I prefer hadoop vs condor when working with R?


You can use both.

You can use HDFS for your data sets and Condor for your job scheduling: Condor places executors on machines, while HDFS and Hadoop's MapReduce features process your data (assuming your problem can be expressed as map-reduce). That way you're using the most appropriate tool for each job. Condor is a job scheduler, and as such does that work better than Hadoop. HDFS and the MapReduce framework are things Condor doesn't have, but they are really helpful for jobs running on Condor to use.

I would personally look at using HDFS to share data among jobs that run discretely as Condor jobs. Especially in a university environment, where shared compute resources are not 100% reliable and can come and go at will, Condor's resilience in this type of setup is going to make getting work done a whole lot easier.
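
To make that concrete, here is a rough sketch of how the pieces could fit together. It is only an illustration under assumptions of mine, not a tested setup: the file names (`analyze.R`, `run_chunk.sh`), the HDFS layout (`input/chunk_N.csv`, `output/chunk_N.csv`), and the helper function names are all hypothetical, and it assumes the `hdfs` client and `Rscript` are installed on the execute nodes. The Python below just generates two text files: a shell wrapper that pulls one chunk out of HDFS, runs R on it, and pushes the result back, and a Condor submit description that queues one such job per chunk.

```python
from textwrap import dedent


def build_wrapper_script(hdfs_dir: str, r_script: str) -> str:
    """Return the text of a shell wrapper that each Condor job would run.

    The wrapper copies one input chunk out of HDFS, runs the R script on it,
    and copies the result back into HDFS. The chunk naming scheme is an
    assumption made for this example.
    """
    return dedent(f"""\
        #!/bin/sh
        set -e
        CHUNK="$1"
        hdfs dfs -get "{hdfs_dir}/input/chunk_$CHUNK.csv" input.csv
        Rscript {r_script} input.csv output.csv
        hdfs dfs -put -f output.csv "{hdfs_dir}/output/chunk_$CHUNK.csv"
        """)


def build_submit_description(n_chunks: int, r_script: str,
                             wrapper: str = "run_chunk.sh") -> str:
    """Return the text of a Condor submit description for n_chunks jobs.

    Each job receives its own $(Process) number (0 .. n_chunks - 1) as the
    chunk index, so the jobs stay independent and Condor can rerun any one
    of them if a machine disappears.
    """
    return dedent(f"""\
        universe                = vanilla
        executable              = {wrapper}
        arguments               = $(Process)
        transfer_input_files    = {r_script}
        should_transfer_files   = YES
        when_to_transfer_output = ON_EXIT
        output                  = chunk_$(Process).out
        error                   = chunk_$(Process).err
        log                     = chunks.log
        queue {n_chunks}
        """)


if __name__ == "__main__":
    # Hypothetical values; substitute your own HDFS path and R script.
    print(build_wrapper_script("/user/me/project", "analyze.R"))
    print(build_submit_description(4, "analyze.R"))
```

You would write the two strings out to `run_chunk.sh` and a submit file, then hand the submit file to `condor_submit`. Because each job only touches its own chunk in HDFS, a job that gets evicted can simply be run again without affecting the others.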