"No space left on device" exception on Amazon EMR medium instances with S3

hadoop amazon-web-services amazon-s3 storage emr

The error means that there is no space left to store the output (or the intermediate output) of your MapReduce job.

Some things to check are:

Have you deleted unnecessary files from HDFS? Run hadoop dfs -ls / to see what is stored there. (If trash is enabled, empty it too, for example with hadoop fs -expunge.)
Do you compress the output (or the intermediate output) of your jobs? You can do so by using SequenceFileOutputFormat with output compression enabled, or by calling setCompressMapOutput(true) on the JobConf.
What is the replication factor? Hadoop's default is 3, but if space is tight you can risk lowering it to 2, or even 1, at the cost of fault tolerance, in order to get your job to run.
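
To get a feel for how much the last two points can save, here is a rough back-of-the-envelope sketch. It assumes raw HDFS usage is simply the stored size times the replication factor, and that compression shrinks the data by a ratio you supply. The class name, the method, the 100 GiB figure and the 0.5 ratio are all made up for illustration, not taken from any Hadoop API or from the question.

```java
// Rough estimate of the raw disk space a job's output occupies on HDFS.
// Hypothetical helper for illustration only; not part of Hadoop.
public class HdfsFootprint {

    // Each HDFS block is stored `replication` times, so raw usage is the
    // (possibly compressed) logical size multiplied by that factor.
    // compressionRatio = compressed size / original size (1.0 = uncompressed).
    public static long rawBytes(long logicalBytes, int replication, double compressionRatio) {
        if (logicalBytes < 0 || replication < 1 || compressionRatio <= 0.0) {
            throw new IllegalArgumentException("invalid input");
        }
        long stored = (long) Math.ceil(logicalBytes * compressionRatio);
        return Math.multiplyExact(stored, (long) replication);
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024L * 1024L;
        long output = 100 * gib; // assumed: 100 GiB of uncompressed job output

        System.out.println(rawBytes(output, 3, 1.0) / gib); // 300
        System.out.println(rawBytes(output, 2, 1.0) / gib); // 200
        System.out.println(rawBytes(output, 1, 0.5) / gib); // 50
    }
}
```

The real compression ratio depends heavily on your data and codec, so treat the numbers as an order-of-magnitude guide only.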

It could also be that some of your reducers output significantly more data than others, which can fill the disk on a single node, so check your code, too.
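
One simple way to quantify such skew, sketched below, is to compare the largest reducer output with the mean. This assumes you have already collected the per-reducer output sizes yourself (for example, from the sizes of the part files in the job's output directory). The class, the method and the sample numbers are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper for spotting reducer skew; not part of Hadoop.
public class ReducerSkew {

    // Ratio of the largest reducer output to the mean output.
    // 1.0 means perfectly even; larger values mean one reducer dominates.
    public static double skewRatio(long[] bytesPerReducer) {
        if (bytesPerReducer.length == 0) {
            throw new IllegalArgumentException("no reducers given");
        }
        long max = Arrays.stream(bytesPerReducer).max().getAsLong();
        double mean = Arrays.stream(bytesPerReducer).average().getAsDouble();
        return mean == 0.0 ? 1.0 : max / mean;
    }

    public static void main(String[] args) {
        // Assumed sample: three reducers wrote 100 MB each, one wrote 500 MB.
        long[] sizesMb = {100, 100, 100, 500};
        System.out.println(skewRatio(sizesMb)); // 2.5
    }
}
```

A ratio well above 1 suggests that a few keys carry most of the data, in which case a different partitioning or a combiner may help more than freeing disk space.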

I've gotten out-of-space errors on AMI 3.2.x that I didn't get on AMI 3.1.x. Try switching AMIs and see what happens.

CodeHunter