I have a message with a 10MB octet sequence, and we're observing that it takes anywhere from 2 to 4 seconds to send the message regardless of the sequence length (over a 100Mb Ethernet connection). This suggests to me that the entire 10MB sequence is being sent across the DDS bus every time, whatever its current length. To set the length of the sequence, we're simply using "length()". Should I assume that the middleware only sends the bytes across the wire that it needs to? Do I need to do something else to ensure that only the bytes that must be sent are sent?
We are using DDS 5.0.
Thanks
Hi,
This should not be happening. Only the actual length (not the maximum) is serialized and sent by RTI Connext DDS. Could you send a code snippet of the exact syntax you are using to set the length?
When you say it takes 2-4 seconds to "send the message" can you explain how you are measuring that time?
If you want to see what is on the wire you could capture the message using Wireshark. It already has an RTPS dissector so it should be possible to see the "DATA" message and the length of that message.
Gerardo
OK, so we have an embedded system and a laptop plugged directly into it. We did some bandwidth tests (I believe with iperf) that showed we can achieve ~90Mb/sec to and from the embedded system. To measure the speed of the packet, we set up a timer that runs from when we send the packet to when we receive a success/fail message indicating that the message arrived.
Regarding setting the length, we're just using "topic->sequence_name.length( number_of_bytes );" I recently changed this to "topic->sequence_name.ensure_length( number_of_bytes, number_of_bytes );", but it sounds like this change is unnecessary and could be causing more work for each side.
Another thing that we noticed is that the latency is somewhat variable (i.e., inconsistent). Sometimes sending the message will take 2 seconds, sometimes it will take 6 seconds. It seems to be worst when first attaching to the DDS bus.
At this point I am wondering if the issue is the fact that a 10MB memory allocation is required for every message sent, which is causing some slowness on the embedded system and on the laptop.
Thanks for your help
Chris
Hello Chris,
Thank you for the details. The "topic->sequence_name.length( number_of_bytes );" call is indeed the one that controls the "used" length of the sequence. The call to ensure_length() is not needed, although it is likely harmless in your situation. What ensure_length() does is check whether the maximum allocated size of the sequence can accommodate the requested length; if it cannot, it re-allocates the sequence (calling maximum()) and then sets the length. If your sequence is already allocated to be big enough, ensure_length(number_of_bytes, number_of_bytes) will behave the same as length(number_of_bytes).
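To make the difference between the "used" length and the allocated maximum concrete, here is a minimal sketch in plain C++. ToyOctetSeq is a hypothetical stand-in written for this post, not the RTI sequence class; it only mimics the behavior described above, under the assumption that the number of bytes serialized equals the current length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of an octet sequence. This is NOT the RTI class;
// it only mirrors the length()/maximum()/ensure_length() behavior described above.
class ToyOctetSeq {
public:
    std::size_t length() const { return length_; }            // "used" length
    std::size_t maximum() const { return storage_.size(); }   // allocated size

    // Re-allocate the storage. Refuses to shrink below the used length.
    bool maximum(std::size_t new_max) {
        if (new_max < length_) return false;
        storage_.resize(new_max);
        return true;
    }

    // Set the used length. Fails if the allocation is too small.
    bool length(std::size_t new_length) {
        if (new_length > storage_.size()) return false;
        length_ = new_length;
        return true;
    }

    // Re-allocate only when needed, then set the used length.
    bool ensure_length(std::size_t new_length, std::size_t new_max) {
        if (new_length > new_max) return false;
        if (new_length > storage_.size()) storage_.resize(new_max);
        length_ = new_length;
        return true;
    }

    // Assumption for this sketch: only the used elements go on the wire.
    std::size_t serialized_payload_bytes() const {
        assert(length_ <= storage_.size());
        return length_;
    }

private:
    std::vector<unsigned char> storage_;
    std::size_t length_ = 0;
};
```

With a 10MB maximum and length(1024), this toy reports 1024 payload bytes, and a later ensure_length(2048, 2048) leaves the 10MB allocation untouched because it is already large enough.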
Going back to how long it takes to send the sequence, I suspect it has more to do with the reliable protocol dropping packets that are subsequently repaired. This issue happens when you write "large data" using UDP because no throttling mechanism is enabled automatically. When you write a large data sample, DDS will just fragment it into pieces and send all the fragments as fast as it can. This is often faster than some of the transport stages (operating system, NICs, etc.) can process them, resulting in buffer overflows and fragments being dropped. That in turn causes NACKs and repairs, which introduce delays and lower the effective bandwidth.
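As a rough sanity check on the numbers, the sketch below computes the ideal wire time for a payload and how many fragments it would be split into. The function names and the 1400-byte fragment size are assumptions made for illustration; the real fragment size depends on your transport configuration, and protocol headers are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Ideal time to move payload_bytes over a link, ignoring all protocol overhead.
double ideal_transfer_seconds(std::uint64_t payload_bytes, double link_bits_per_sec) {
    assert(link_bits_per_sec > 0.0);
    return static_cast<double>(payload_bytes) * 8.0 / link_bits_per_sec;
}

// Number of fragments needed for a payload (rounded up).
// fragment_bytes is a hypothetical value chosen by the caller.
std::uint64_t fragment_count(std::uint64_t payload_bytes, std::uint64_t fragment_bytes) {
    assert(fragment_bytes > 0);
    return (payload_bytes + fragment_bytes - 1) / fragment_bytes;
}
```

For a full 10MB (10 * 1024 * 1024 bytes) at the ~90Mb/sec you measured, ideal_transfer_seconds gives roughly 0.93 seconds, and with the assumed 1400-byte fragments the sample becomes 7490 fragments. A burst of that many fragments sent back-to-back is the kind of load that can overflow intermediate buffers, and times well above that floor are consistent with fragments being repaired.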
The solution to this problem is to associate a FlowController with the DataWriter so that the data that is injected into the wire is throttled to some maximum rate and does not overflow the intermediate buffers. Using a FlowController also requires configuring your DataWriter to be asynchronous. This is all documented in the RTI Connext DDS User's Manual but it is easiest to see it with an example. I have uploaded a large data example to the File Exchange (file: large_data.zip) so you can try it.
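To give a feel for how a token-bucket style throttle bounds the send rate, here is a small standalone calculation. ToyTokenBucket and its field names are hypothetical and chosen for this post; check the User's Manual for the actual FlowController properties and their exact semantics before picking values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical token-bucket parameters, for illustration only.
struct ToyTokenBucket {
    std::uint64_t tokens_added_per_period;  // tokens granted each period
    std::uint64_t bytes_per_token;          // bytes each token allows on the wire
    double period_seconds;                  // length of one period
};

// Sustained rate the bucket allows once any initial burst is spent.
double sustained_bytes_per_second(const ToyTokenBucket& tb) {
    assert(tb.period_seconds > 0.0);
    return static_cast<double>(tb.tokens_added_per_period * tb.bytes_per_token)
           / tb.period_seconds;
}

// Approximate time to push a payload through the bucket at the sustained rate.
double seconds_to_send(const ToyTokenBucket& tb, std::uint64_t payload_bytes) {
    return static_cast<double>(payload_bytes) / sustained_bytes_per_second(tb);
}
```

For example, 8 tokens of 8192 bytes every 10 ms allows about 6.5 MB/sec (roughly 52Mb/sec), so a full 10MB sample would take about 1.6 seconds. That is slower than the raw link, but the point of the throttle is to stay under what the intermediate buffers can absorb so that fragments are not dropped and repaired.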
I am not 100% sure this is the problem you are seeing. However, I think it is worth running the example I mentioned in your environment to see whether it reproduces the behavior or improves the situation.
Gerardo