
Write Log4j output to HDFS


I recommend using Apache Flume for this task. There is a Flume appender for Log4j. This way, you send logs to Flume, and it writes them to HDFS. The good thing about this approach is that Flume becomes the single point of communication with HDFS. Flume makes it easy to add new data sources without writing a bunch of code for interacting with HDFS again and again.
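For reference, a minimal log4j.properties sketch using Flume's Log4j appender. It assumes flume-ng-sdk is on the application classpath and that a Flume agent with an Avro source (and an HDFS sink behind it) is listening on the host/port below; the hostname and port are placeholders:

    log4j.rootLogger=INFO, flume
    log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
    # Host and port of the Flume agent's Avro source (placeholders)
    log4j.appender.flume.Hostname=flume-agent.example.com
    log4j.appender.flume.Port=41414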


The standard Log4j (1.x) does not support writing to HDFS, but luckily, Log4j is very easy to extend. I have written an HDFS FileAppender to write logs to MapRFS (which is compatible with Hadoop). The file name can be something like "maprfs:///projects/example/root.log". It works well in our projects. I extracted the appender part of the code and pasted it below. The code snippets may not run as-is, but they will give you the idea of how to write your own appender. Actually, you only need to extend org.apache.log4j.AppenderSkeleton and implement append(), close(), and requiresLayout(). For more information, you can also download the Log4j 1.2.17 source code and see how AppenderSkeleton is defined; it will give you all the information you need. Good luck!

Note: an alternative way to write to HDFS is to mount HDFS on all your nodes, so you can write the logs just as you would to a local directory. In practice this may be the better approach; a configuration sketch for it follows below, before the appender code.
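A minimal log4j.properties sketch for that mount-based setup, assuming HDFS is mounted (e.g. via an NFS gateway) at the hypothetical path /mnt/hdfs; with the mount in place, an ordinary FileAppender is enough:

    log4j.rootLogger=INFO, hdfs
    # A plain FileAppender; the mount point /mnt/hdfs is an assumption
    log4j.appender.hdfs=org.apache.log4j.FileAppender
    log4j.appender.hdfs.File=/mnt/hdfs/projects/example/root.log
    log4j.appender.hdfs.layout=org.apache.log4j.PatternLayout
    log4j.appender.hdfs.layout.ConversionPattern=%d %p %c - %m%n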

import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.spi.LoggingEvent;
import org.apache.log4j.Layout;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.*;

public class HDFSFileAppender extends AppenderSkeleton {

    private String filepath = null;
    private Layout layout = null;

    public HDFSFileAppender(String filePath, Layout layout) {
        this.filepath = filePath;
        this.layout = layout;
    }

    @Override
    protected void append(LoggingEvent event) {
        String log = this.layout.format(event);
        try {
            InputStream logStream = new ByteArrayInputStream(log.getBytes());
            writeToFile(filepath, logStream, false);
            logStream.close();
        } catch (IOException e) {
            System.err.println("Exception when appending log to log file: " + e.getMessage());
        }
    }

    @Override
    public void close() {}

    @Override
    public boolean requiresLayout() {
        return true;
    }

    // Here we write to HDFS.
    // filePathStr: the file path in MapR, like 'maprfs:///projects/aibot/1.log'
    private boolean writeToFile(String filePathStr, InputStream inputStream, boolean overwrite)
            throws IOException {
        int bytesRead;
        byte[] buffer = new byte[64 * 1024 * 1024];
        Configuration conf = new Configuration();
        Path filePath = new Path(filePathStr);
        // Resolve the FileSystem from the path's scheme (e.g. maprfs:///).
        FileSystem fs = filePath.getFileSystem(conf);
        FSDataOutputStream fsDataOutputStream;
        if (overwrite || !fs.exists(filePath)) {
            // create(path, overwrite, bufferSize, replication, blockSize)
            fsDataOutputStream = fs.create(filePath, overwrite, 512, (short) 3, 64 * 1024 * 1024);
        } else { // append to the existing file
            fsDataOutputStream = fs.append(filePath, 512);
        }
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            fsDataOutputStream.write(buffer, 0, bytesRead);
        }
        fsDataOutputStream.close();
        return true;
    }
}
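For completeness, a hypothetical way to attach this appender programmatically; the target path and the PatternLayout pattern are just examples:

    import org.apache.log4j.Logger;
    import org.apache.log4j.PatternLayout;

    public class LoggingSetup {
        public static void main(String[] args) {
            // Attach the custom appender to the root logger (path is an example).
            Logger root = Logger.getRootLogger();
            root.addAppender(new HDFSFileAppender(
                    "maprfs:///projects/example/root.log",
                    new PatternLayout("%d %p %c - %m%n")));
            root.info("hello HDFS"); // this event now goes through HDFSFileAppender
        }
    }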