How to isolate causes of system hang on Unix/OSX How to isolate causes of system hang on Unix/OSX unix unix

How to isolate causes of system hang on Unix/OSX


Activity Monitor (cmd+space, type, activity monitor), should give you an intuitive overview of what's happening on your system. If as you say it is there are no processes clogging CPU, please do take a look at the disk/IO activity. Perhaps your disk is going south.


I have been having problems continually over the years with system hangs. It seems that generally they are a result of filesystem errors, however Apple does not do enough to take care of this issue. System reliability should be a 100% focus and I am certainly sick of these issues. I have started to move a lot of files and all backups over to a ZFS volume on a FreeBSD server and this is helping a bit as it has started to both ease my mind and allow me to recover more quickly from issues. Additionally I've placed my system volume on a large SSD (240GB as I have a lot of support files and am trying to keep things from being too divided up with symlinks) and my Users folders on another drive. This too has helped add to reliability.

Having said this, you should try to explore spindump and stackshot to see if you can catch frozen processes before the system freezes up entirely. It is very likely that you have an app or two that is attempting to access bad blocks and it just hangs the system or you have a process blocking all others for some reason with a system call that's halting io.

Apple has used stackshot a few times with me over the last couple years to hunt some nasty buggers down and the following link can shed some light on how to perhaps better hunt this goblin down: http://www.stormacq.com/?p=346

Also try: top -l2 -S > top_output.txt and exame the results for hangs / zombie processes.

The deeper you go into this, you may find it useful to subscribe to the kernel developer list (darwin-kernel@lists.apple.com) as there are some very, very sharp cookies on here that can shed light on some of the most obscure issues and help to understand precisely what panic's are saying.

Additionally, you may want to uninstall any VMs you have installed. There is a particular developer who, I've heard from very reliable sources, has very faulty hypervisor issues and it would be wise to look into that if you have any installed. It may be time to clean up your kexts altogether as well.

But, all-in-all, we really quite desperately need a better filesystem and proactive mechanisms therein to watch for bad blocks. I praised the day and shouted for joy when I thought that we were getting ZFS officially. I am doubtful Lion is that much better on the HFS+ front sadly and I certainly am considering ZFS for my Users volume + other storage on the workstation due to it's ability to scrub for bad blocks and to eliminate issues like these.

They are the bain of our existence on Apple hardware and having worked in this field for 20 years and thousands of clients, hard drive failure should be considered inexcusable at this point. Even if the actual mfgs can't and won't fix it, the onus falls upon OS developers to handle exceptions better and to guard against such failures in order to hold off silent data loss and nightmares such as these.


I'd run a mixture of 'top' as well as tail -f /var/log/messages (or wherever your main log file is).

Chances are right before/after the hang, some error message will come out. From there you can start to weed out your problems.