Low latency/high performance network (ethernet) messaging


You mentioned that you want to test the internal performance of Machine A, but "need a separate machine"; yet, you don't want to test network infrastructure performance.

You know much more about your requirements than I do; however, if I were testing the network performance inside Machine A, I would set up my test like this:

[Figure: Looped Machine]

There are a couple of reasons for this:

  • You can use an Ethernet loopback cable to simulate the "pong" function performed by Machine B
  • Eliminating transit through infrastructure you don't care about is almost always a better solution when measuring performance

If you use this test method, be sure to note these points:

  • Ethernet performs a signal-to-noise test on the copper before it sets up a link. If you bend your loopback cable too tightly, you could introduce more latency, because Ethernet may fall back to a lower speed due to the kinks in the cable. There is no minimum length for copper Ethernet cabling, so the loop does not need to be long.
  • As you're probably aware, combinations of NICs / driver versions / OS can have a significant effect on intra-host latency. I work for a network equipment manufacturer, and one of the guys in the office used to work as an applications engineer for SolarFlare. He claims that many of the Wall Street trading systems use SolarFlare's NICs because of the low latency SolarFlare engineers into its products; he also said SolarFlare's drivers give you user-space access to the NIC buffers. Caveat: this is third-hand info, and I cannot verify it myself.
  • If you loop the frames back to Machine A, set both the source and destination MAC addresses to the burned-in address of the NIC.
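To make the last point concrete, here is a minimal sketch of building such a looped frame in Python. Everything in it is my assumption rather than part of your setup: the function names, the payload layout (a sequence number plus a send timestamp), and the EtherType 0x88B5, which IEEE reserves for local experimental use. Sending requires Linux and root privileges, so treat send_frame as untested on your hardware.

```python
import socket
import struct

# Assumption: 0x88B5 is the IEEE "local experimental" EtherType, used here
# so the test frames cannot be mistaken for IP or ARP traffic.
ETH_P_EXPERIMENTAL = 0x88B5
MIN_FRAME_LEN = 60  # minimum Ethernet frame size, excluding the 4-byte FCS


def parse_mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated octets: %r" % text)
    return bytes(int(part, 16) for part in parts)


def build_loopback_frame(nic_mac, seq, send_ns):
    """Build a frame whose source AND destination are the NIC's own address.

    The payload layout (two unsigned 64-bit fields) is hypothetical.
    """
    mac = parse_mac(nic_mac)
    header = mac + mac + struct.pack("!H", ETH_P_EXPERIMENTAL)
    payload = struct.pack("!QQ", seq, send_ns)
    # Pad with zeros up to the minimum frame length.
    return (header + payload).ljust(MIN_FRAME_LEN, b"\x00")


def send_frame(ifname, frame):
    """Send one raw frame out of the named interface (Linux only, needs root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((ifname, 0))
        sock.send(frame)
```

You would call build_loopback_frame with the NIC's burned-in address, a counter, and a timestamp taken just before the send, then hand the result to send_frame along with the interface name.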

Even if you need to receive a modified "pong" frame from Machine B, you could still use this topology and simply rewrite packet fields on the receive-side of your code in Machine A. Put as many (or few) instrumentation points as you like in Machine A's "modules" to compare frame timestamps.
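As a rough illustration of what I mean by instrumentation points, the sketch below records a timestamp at each named point and reports the time spent between consecutive points. The class name, the method names, and the point names are all made up for this example; your "modules" will have their own boundaries, and in a real low-latency system you would want something much cheaper than a Python list append.

```python
import time


class LatencyTrace:
    """Collects named timestamps for one frame's trip through Machine A."""

    def __init__(self, clock=time.perf_counter_ns):
        # The clock is injectable so the trace can be tested deterministically.
        self._clock = clock
        self._points = []

    def mark(self, name):
        """Record the current time at an instrumentation point."""
        self._points.append((name, self._clock()))

    def deltas(self):
        """Return (segment, nanoseconds) for each pair of consecutive points."""
        pairs = zip(self._points, self._points[1:])
        return [("%s->%s" % (a, b), tb - ta) for (a, ta), (b, tb) in pairs]

    def total(self):
        """Return nanoseconds between the first and last recorded points."""
        if len(self._points) < 2:
            return 0
        return self._points[-1][1] - self._points[0][1]
```

For example, you might call trace.mark("app_send") before the send, trace.mark("app_recv") once the looped frame comes back, and then print trace.deltas() to see where the time went.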

FYI:

The embedded systems I mentioned in my comments on your question are for measuring latency of network infrastructure, not end hosts. This is the best method I can think of for instrumenting host latency.


As an off-the-shelf solution, I would suggest taking a look at Solace, Tibco and AMQP. These are all enterprise messaging systems used extensively in trading applications. AMQP is an open standard with open-source implementations, and it is reportedly capable of handling throughput of up to 100,000 messages per second. I am not sure of the latencies of the other frameworks. The AMQP message router is available in both Java and C++ implementations; the C++ one, as you would expect, delivers higher performance.

Edit: I've just heard of a new product called UltraMessaging, which reportedly provides throughput of 7,000,000 messages per second with Java, C++ or C# clients. Crikey.
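To put the throughput figures quoted above in perspective, this small sketch converts a messages-per-second rate into the average time available per message. The function name is mine. Bear in mind that this is only the mean spacing between messages at that rate; it says nothing about the latency of any single message, which you would still have to measure.

```python
def mean_interval_ns(messages_per_second):
    """Average time between messages, in nanoseconds, at a given rate."""
    if messages_per_second <= 0:
        raise ValueError("rate must be positive")
    return 1e9 / messages_per_second


if __name__ == "__main__":
    for rate in (100_000, 7_000_000):
        print("%d msgs/s -> %.1f ns per message" % (rate, mean_interval_ns(rate)))
```

At 100,000 messages per second that works out to 10 microseconds per message, and at 7,000,000 per second to roughly 143 nanoseconds.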
