Is it safe to pipe the output of several parallel processes to one file using >>?

Is it safe to pipe the output of several parallel processes to one file using >>?


No. It is not guaranteed that lines will remain intact. They can become intermingled.

From searching based on liori's answer I found this:

Write requests of {PIPE_BUF} bytes or less shall not be interleaved with data from other processes doing writes on the same pipe. Writes of greater than {PIPE_BUF} bytes may have data interleaved, on arbitrary boundaries, with writes by other processes, whether or not the O_NONBLOCK flag of the file status flags is set.

So lines longer than {PIPE_BUF} bytes are not guaranteed to remain intact.
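To see what that limit is on a given machine, a minimal sketch along these lines may help. It assumes `getconf` is available; the sample line is a hypothetical stand-in for whatever your processes print. POSIX only requires PIPE_BUF to be at least 512 bytes, so the value can differ between systems.

```shell
# Query the limit for the filesystem root; POSIX guarantees at least 512 bytes.
pipe_buf=$(getconf PIPE_BUF /)
echo "PIPE_BUF is $pipe_buf bytes"

# Hypothetical sample line; wc -c counts bytes, including the trailing newline.
line='stackoverflow.com fetched'
line_bytes=$(($(printf '%s\n' "$line" | wc -c)))

if [ "$line_bytes" -le "$pipe_buf" ]; then
    echo "a single write of this line ($line_bytes bytes) fits within PIPE_BUF"
else
    echo "this line ($line_bytes bytes) exceeds PIPE_BUF and may be interleaved"
fi
```

Note that fitting within the limit only matters if the program really emits the line in a single write, which depends on how it buffers its output.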


One possibly interesting thing you could do is use GNU parallel: http://www.gnu.org/s/parallel/ For example, if you were spidering the sites:

stackoverflow.com, stackexchange.com, fogcreek.com 

you could do something like this

(echo stackoverflow.com; echo stackexchange.com; echo fogcreek.com) | parallel -k your_spider_script

The output of each job is buffered by parallel and, because of the -k option, returned to you in the order of the site list above. A real example (basically copied from the 2nd parallel screencast):

 ~ $ (echo stackoverflow.com; echo stackexchange.com; echo fogcreek.com) | parallel -k ping -c 1 {}
 PING stackoverflow.com (64.34.119.12): 56 data bytes
 --- stackoverflow.com ping statistics ---
 1 packets transmitted, 0 packets received, 100.0% packet loss
 PING stackexchange.com (64.34.119.12): 56 data bytes
 --- stackexchange.com ping statistics ---
 1 packets transmitted, 0 packets received, 100.0% packet loss
 PING fogcreek.com (64.34.80.170): 56 data bytes
 64 bytes from 64.34.80.170: icmp_seq=0 ttl=250 time=23.961 ms
 --- fogcreek.com ping statistics ---
 1 packets transmitted, 1 packets received, 0.0% packet loss
 round-trip min/avg/max/stddev = 23.961/23.961/23.961/0.000 ms

Anyway, ymmv
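If GNU parallel is not available, the same idea (buffer each job's output separately, then emit it in input order) can be approximated in plain shell. This is only a rough sketch of the idea, not how parallel is implemented; `fetch` is a hypothetical stand-in for your_spider_script.

```shell
# Hypothetical stand-in for your_spider_script: prints two lines per site.
fetch() {
    echo "start $1"
    echo "done $1"
}

tmpdir=$(mktemp -d)
i=0
for site in stackoverflow.com stackexchange.com fogcreek.com; do
    i=$((i + 1))
    # Each job writes only to its own file, so jobs cannot interleave.
    fetch "$site" > "$tmpdir/$i.out" &
done
wait  # block until every background job has finished

# Concatenate the per-job files in input order, then clean up.
result=$(cat "$tmpdir/1.out" "$tmpdir/2.out" "$tmpdir/3.out")
rm -r "$tmpdir"
printf '%s\n' "$result"
```

Because no two jobs ever share a file, the question of interleaved writes does not arise; the cost is that nothing is visible until all jobs have finished.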


Generally, no.

On Linux this might be possible, as long as two conditions are met: each line is written in one operation, and the line is no longer than PIPE_BUF (usually the same as PAGE_SIZE, usually 4096). But I wouldn't count on that; this behaviour might change.
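A small experiment can show what happens on your own system when both conditions hold. The sketch below is an assumption-laden test, not a proof: `worker` is a hypothetical function that appends short lines with one printf call per line, and a run with zero damaged lines only tells you nothing went wrong this time.

```shell
log=$(mktemp)

# Hypothetical worker: appends 50 short lines, one printf call per line.
worker() {
    n=1
    while [ "$n" -le 50 ]; do
        printf 'worker %s line %s\n' "$1" "$n" >> "$log"
        n=$((n + 1))
    done
}

# Start four workers in parallel, all appending to the same file.
for id in 1 2 3 4; do
    worker "$id" &
done
wait

total=$(($(wc -l < "$log")))
# Count lines that do not match the exact shape a worker writes.
damaged=$(($(grep -c -v -E '^worker [1-4] line [0-9]+$' "$log" || true)))
echo "total lines: $total, damaged lines: $damaged"
rm -f "$log"
```

Try it with much longer lines, or with a program that flushes its output in several pieces, to see how quickly the result stops being reliable.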

It is better to use some kind of real logging mechanism, like syslog.