
How to log data from multiple threads?


A library like log4j can be configured for your needs.

  1. Splitting into too many files will make it difficult to debug some issues, but having one monolithic file leaves a soup of mixed processes. I would have a file for each atomic process, that is, a mail manager might use its own log file. Extra debug information for jdbc might have its own log file, but errors and major events would still be reported in the main application log.

  2. Major logging libraries support log splitting and rotation. For a heavily used web application, I prefer to have a new log file for each day, and also to split it once it exceeds a certain size. You can set up a cron job to zip older logs, and depending on the application, you may want to back them up for a few months or indefinitely.

  3. As for debugging usefulness, you can grep for certain strings such as "Exception" to report on. If you are looking for statistics, you should make a log for that specific purpose in addition to your process log.
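The file-per-process idea from point 1 can be sketched with the JDK's built-in `java.util.logging`, so no extra library is needed to try it. This is only an illustration under assumptions: the logger names `app` and `app.mail`, the file names, and the 1 MB / 5 file rotation limits are made up for the example, and log4j would express the same thing through its own configuration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

/** Hypothetical helper: one log file per subsystem, errors repeated in the main log. */
final class SplitLogSketch {

    /** Attaches a size-rotated file to the given logger; the caller closes the handler. */
    static FileHandler attachFile(Logger logger, Path dir, String baseName, Level level)
            throws IOException {
        // %g is the rotation generation. Rotate after about 1 MB, keep 5 files, append on restart.
        String pattern = dir.resolve(baseName + ".%g.log").toString();
        FileHandler handler = new FileHandler(pattern, 1_000_000, 5, true);
        handler.setFormatter(new SimpleFormatter());
        handler.setLevel(level); // records below this level are dropped by this file only
        logger.addHandler(handler);
        return handler;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("logs");
        Logger app = Logger.getLogger("app");        // main application log
        Logger mail = Logger.getLogger("app.mail");  // child logger for the mail manager
        mail.setLevel(Level.ALL);                    // let debug records through for mail

        FileHandler appFile = attachFile(app, dir, "application", Level.WARNING);
        FileHandler mailFile = attachFile(mail, dir, "mail", Level.ALL);

        mail.fine("connecting to SMTP server"); // ends up in the mail log only
        mail.severe("delivery failed");         // ends up in the mail log and the main log

        appFile.close();
        mailFile.close();
        System.out.println("Logs written to " + dir);
    }
}
```

Because `app.mail` is a child of `app`, its records are also passed to the parent's handlers, and the `WARNING` threshold on the main file keeps the debug chatter out of it.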

Logs can be written synchronously or asynchronously, and the latter is usually better for performance. Typically, messages are placed on an in-memory queue and then written out by a separate thread. Multiple threads can therefore add to that one queue or buffer, while a single thread locks and writes the file. It's pretty much in the background, and you don't have to think about it unless you are writing a huge amount of data.
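The queue-plus-writer-thread arrangement described above can be sketched in plain Java roughly as follows. This is a simplified illustration, not how log4j implements its asynchronous appenders: the class name `AsyncLogSketch` and the stop-sentinel shutdown are assumptions made for the example, and a real framework also deals with bounded queues, overflow, and error handling.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical minimal asynchronous logger: many producer threads, one writer thread. */
final class AsyncLogSketch implements AutoCloseable {
    // Unique instance used as a shutdown marker; compared by identity, never by content.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread writerThread;

    AsyncLogSketch(Writer out) {
        writerThread = new Thread(() -> {
            try {
                while (true) {
                    String line = queue.take();   // blocks until a message arrives
                    if (line == STOP) {
                        break;
                    }
                    out.write(line);
                    out.write(System.lineSeparator());
                }
                out.flush();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }, "log-writer");
        writerThread.start();
    }

    /** Called from any thread; only enqueues, so callers never wait on file I/O. */
    void log(String message) {
        queue.add(Thread.currentThread().getName() + ": " + message);
    }

    /** Drains everything queued before the call, then stops the writer thread. */
    @Override
    public void close() throws InterruptedException {
        queue.add(STOP);
        writerThread.join();
    }

    public static void main(String[] args) throws Exception {
        try (AsyncLogSketch log = new AsyncLogSketch(new PrintWriter(System.out))) {
            Thread a = new Thread(() -> log.log("hello from a"), "worker-a");
            Thread b = new Thread(() -> log.log("hello from b"), "worker-b");
            a.start();
            b.start();
            a.join();
            b.join();
        }
    }
}
```

Only the writer thread ever touches the output, so the producers need no locking of their own beyond what the queue already provides.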


As the comments above already say, logging frameworks exist precisely to free you from worrying about such low-level details. Log4J or its successors like Logback can handle logging by multiple threads safely and effectively. You just tell the logging framework what to log and where, and it all works (usually :-)

For logging thread-specific data, you may consider using a Diagnostic Context. This earlier answer of mine explains this with an example for Log4J. In Logback, it has been renamed to Mapped Diagnostic Context.
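To show the idea behind a diagnostic context without depending on any logging library, here is a small sketch built on `ThreadLocal`. It is not the Log4J or Logback implementation, and the class and method names are invented for the example. With SLF4J and Logback the equivalent calls would be along the lines of `MDC.put("requestId", "42")`, with the value printed through `%X{requestId}` in the layout pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a mapped diagnostic context, built on ThreadLocal. */
final class DiagnosticContextSketch {
    // Each thread gets its own map, so values set by one thread never show up in another.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(LinkedHashMap::new);

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    /** Should be called when the request ends, especially with pooled threads. */
    static void clear() {
        CONTEXT.remove();
    }

    /** Prefixes the message with the calling thread's context, e.g. "{user=alice} msg". */
    static String format(String message) {
        return CONTEXT.get() + " " + message;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable request = () -> {
            put("thread", Thread.currentThread().getName());
            System.out.println(format("handling request"));
            clear();
        };
        Thread a = new Thread(request, "worker-a");
        Thread b = new Thread(request, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

The point of the design is that code deep in the call stack can log without being handed the request details explicitly, since the context travels with the thread.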

As for backups and post-processing, it all depends on your actual goals. Typically, simple scripts or single commands like gzip and grep are all you need. It is hard to say more without concrete information.


Regarding point 1, I usually log everything (feature-related) to the same file but the log line always includes some context information that allows me to track (via grep or something else) the flow of the context/request.

Example (a scenario with calls):

    DEBUG|CallID#12: Establishing new AUDIO call from AA to BB
    DEBUG|CallID#34: Call accepted by ZZ at ...
    DEBUG|CallID#99: Call terminated by callee (SS)

This way, if someone asks "what happened to the call from AA to BB at 12:34 today?", I just grep for either "AA to BB" or the time it happened. Once I have the call ID, getting the full details of the call is just a matter of grepping again with that ID.

Other stuff like chat, presence, etc would go in its own file (wouldn't make much sense to mix this info all in a single monolithic file).

If you want per-thread (instead of per action/request) just log the name of the thread that's performing the action.
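As a rough illustration of this approach, the sketch below builds lines that carry both a call ID and the thread name, and then filters them the way grep would. The `level|CallID#id|thread: message` layout and the helper names are assumptions for the example, loosely modeled on the lines shown earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helpers for context-tagged log lines. */
final class CallLogSketch {

    /** Builds one line such as "DEBUG|CallID#12|main: Call accepted". */
    static String line(String level, int callId, String message) {
        return level + "|CallID#" + callId + "|"
                + Thread.currentThread().getName() + ": " + message;
    }

    /** In-memory equivalent of a fixed-string grep: keeps lines containing the needle. */
    static List<String> grep(List<String> lines, String needle) {
        return lines.stream()
                .filter(l -> l.contains(needle))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        log.add(line("DEBUG", 12, "Establishing new AUDIO call from AA to BB"));
        log.add(line("DEBUG", 123, "Establishing new AUDIO call from CC to DD"));
        log.add(line("DEBUG", 12, "Call terminated by callee (BB)"));

        // Step 1: find the call by its endpoints. Step 2: follow it by its ID.
        System.out.println(grep(log, "AA to BB"));
        System.out.println(grep(log, "CallID#12|"));
    }
}
```

Including the delimiter after the ID in the search string matters: searching for `CallID#12` alone would also match call 123.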

Regarding point 2, I use daily rotation with log4j.

Not sure I understood point 3... Maybe you mean parse a log file to retrieve some patterns? Any tool that supports regex will do the trick (grep being the most handy).