large data at high rate

Offline
Last seen: 2 years 7 months ago
Joined: 02/02/2015
Posts: 6

Hi,

I am trying to send 440 KB samples at 22 Hz.

The QoS profile set on the publisher is:

{reliability = false, deadline = 0, priority = 0, history = 1, durability = VOLATILE, hasKey = false}

So really this is just a best-effort, volatile writer.

The reader is set up the same way.
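
In code, the writer setup boils down to roughly the following (a sketch assuming RTI's traditional C++ API, with publisher and topic standing in for objects our application already creates; the real code may differ):

#include "ndds/ndds_cpp.h"

DDS_DataWriterQos writer_qos;
publisher->get_default_datawriter_qos(writer_qos);

writer_qos.reliability.kind = DDS_BEST_EFFORT_RELIABILITY_QOS;  // reliability = false
writer_qos.durability.kind  = DDS_VOLATILE_DURABILITY_QOS;      // durability = VOLATILE
writer_qos.history.kind     = DDS_KEEP_LAST_HISTORY_QOS;
writer_qos.history.depth    = 1;                                // history = 1

DDSDataWriter *writer = publisher->create_datawriter(
        topic, writer_qos, NULL /* listener */, DDS_STATUS_MASK_NONE);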

It's a pretty busy network. My reader displays the topics/sec as a user feedback mechanism, and although it often sustains 20-22 Hz, it can dip as low as 4-6, and will frequently stay at 16 Hz for periods of time.

While reliability is not important in my use case, a sustained rate of 20-22 Hz is. The transport is UDP; the OS is Linux.

Of course there's a switch or two in the mix, and I am pretty sure it's a 100BASE-T layout. The machines are pretty modern (< 5 years old); RHEL 4.3, I think, is the specific version. I am not 100% sure of the DDS version, but it's not ancient.

After instrumenting the code and logging things, it appears that my topic/sample loss occurs in the DDS layer. Likely the writer is throwing away samples as it publishes quickly (it is a sustained publication rate of 22 Hz). Is there a way to have more "history" with best-effort/volatile? How can I increase buffer sizes in this scenario?

 

Gerardo Pardo
Offline
Last seen: 1 week 1 day ago
Joined: 06/02/2010
Posts: 602

Hi,

Is the size of a single data-sample 440 KB? 

How are you processing the samples on the DataReader? Are you using a listener or a WaitSet?

I would recommend increasing the "history" depth to be greater than 1, at least on the DataReader. Otherwise, if you are using a WaitSet and take too long to read the data, the next sample may arrive and "replace" the previous one, since there is only 1 slot in the history per key. You mentioned that your data type does not have a key. In that case it is as if you had a single key for all the samples, so each new sample will replace the previous one.

Also, if your sample size is 440 KB, then I think you should set your QoS to RELIABLE. A 440 KB sample would be broken into at least 8 UDP datagrams, which themselves will be broken into many IP fragments of a size that fits the "path MTU". Assuming a path MTU of 1.5 KB (which is pretty typical), we are talking about ~300 IP fragments that must all be received correctly for a "best effort" sample to arrive successfully. That is taking a lot of chances... So I would configure the DataReader (and the DataWriter, although RELIABLE is already its default) as RELIABLE.
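
For reference, those two changes on the DataReader side would look roughly like this with the traditional C++ API (a sketch only; subscriber and topic are whatever your application already creates, and a depth of 4 is just an example):

#include "ndds/ndds_cpp.h"

DDS_DataReaderQos reader_qos;
subscriber->get_default_datareader_qos(reader_qos);

// Keep more than one sample so a newly arrived sample does not immediately
// replace one that has not been processed yet (keyless type => one instance).
reader_qos.history.kind  = DDS_KEEP_LAST_HISTORY_QOS;
reader_qos.history.depth = 4;

// RELIABLE, so that lost datagrams/fragments are repaired instead of the
// whole 440 KB sample being silently dropped.
reader_qos.reliability.kind = DDS_RELIABLE_RELIABILITY_QOS;

DDSDataReader *reader = subscriber->create_datareader(
        topic, reader_qos, NULL /* listener */, DDS_STATUS_MASK_NONE);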

Assuming that your application is writing one 440 KB sample every 1/22 of a second (rather than smaller samples at a higher rate), I would also consider adding a FlowController in order to shape the load on the network, so that the middleware does not try to write all 440 KB "as fast as it can". But this would be the last thing I would try, after the ones I mentioned before.
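
The shaping could look something like this with the traditional C++ API (a sketch only: participant, publisher, and topic are whatever your application already creates, and the token-bucket numbers are placeholders to tune for your actual load and link):

#include "ndds/ndds_cpp.h"

// Create a custom token-bucket flow controller on the participant.
DDS_FlowControllerProperty_t fc_property;
participant->get_default_flowcontroller_property(fc_property);

// Illustrative shaping: 8 tokens x 8192 bytes every 5 ms ~= 13 MB/s,
// comfortably above the ~9.7 MB/s that 440 KB at 22 Hz requires.
fc_property.token_bucket.period.sec              = 0;
fc_property.token_bucket.period.nanosec          = 5000000;   // 5 ms
fc_property.token_bucket.bytes_per_token         = 8192;
fc_property.token_bucket.tokens_added_per_period = 8;
fc_property.token_bucket.max_tokens              = 8;

DDSFlowController *flow_controller = participant->create_flowcontroller(
        DDS_String_dup("VideoFlowController"), fc_property);

// Make the DataWriter publish asynchronously through that flow controller.
DDS_DataWriterQos writer_qos;
publisher->get_default_datawriter_qos(writer_qos);
writer_qos.publish_mode.kind                 = DDS_ASYNCHRONOUS_PUBLISH_MODE_QOS;
writer_qos.publish_mode.flow_controller_name = DDS_String_dup("VideoFlowController");

DDSDataWriter *writer = publisher->create_datawriter(
        topic, writer_qos, NULL /* listener */, DDS_STATUS_MASK_NONE);

Note that with asynchronous publishing the sample sits in the DataWriter queue until the flow controller releases it, which is another reason to raise the history depth on the writer as well.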

If you really have 100 Mbit/sec Ethernet you do not seem to have a lot of margin: 440 KB at 22 Hz is about 80 Mbit/sec by itself, and you have to take into consideration the IP and RTPS overheads...

Gerardo

 

Offline
Last seen: 2 years 7 months ago
Joined: 02/02/2015
Posts: 6

Hi, I did wonder whether history > 1 makes any sense with best effort. It sounds like it does, if you are recommending it.

The DataReader uses a listener (not a WaitSet). Yes, confirmed: keyless data.

The size of each sample is 440 KB. It is a frame of video: greyscale, sampling only the red channel, 768x576 pixels (768 x 576 = 442,368 bytes, hence the ~440 KB).

It's a legacy piece of software, and I can't do much about the fact that it is a screenshot taken at 20 Hz and published uncompressed.

However, since I often see that the system can cope, and these dips are not that common, I thought the problem could be rectified via QoS. The 16 Hz dips happen somewhat infrequently, and the really low dips to 4-6 Hz are more transient and aren't usually sustained.

Also, my mistake: the network is 1000BASE-T (GigE).

 

 

Gerardo Pardo
Offline
Last seen: 1 week 1 day ago
Joined: 06/02/2010
Posts: 602

Hi,

Yes, there are situations where History > 1 makes sense even for best effort; this happens whenever the DataReader or DataWriter does not process the data immediately. In those cases, if History == 1, the second sample can overwrite the first one.

However, I do not think that was happening in your specific setup, because:

a) The DataWriter is writing "synchronously" by default. That is, when you call DataWriter::write(), it sends the data immediately. However, I would recommend you change this...

b) The DataReader is reading "synchronously" in the listener, so as soon as data is read from the socket buffer it is passed to the application. However, I would also recommend changing this...

So in your current setup the only place where data could be dropped is in the "socket" buffers or in the NIC buffers...

Did you modify the configuration of the socket send and receive buffers to be large enough? The out-of-the-box Linux defaults are not always what is needed for large data. You may want to take a look at this HOWTO for details on how to check your settings and modify them to more suitable values: https://community.rti.com/howto/improve-rti-connext-dds-network-performance-linux

I would check the socket buffers and all the other network settings described in that HOWTO first, as these would be my main suspects, and run the tests again to see whether that makes a difference...
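
On the Connext side, the builtin UDPv4 transport can also be told to request larger socket buffers via participant properties. A minimal sketch, assuming the traditional C++ API and the documented builtin UDPv4 transport property names (please verify them against your Connext version):

#include "ndds/ndds_cpp.h"

DDS_DomainParticipantQos participant_qos;
DDSTheParticipantFactory->get_default_participant_qos(participant_qos);

// Ask the builtin UDPv4 transport for larger send/receive socket buffers.
// The kernel must allow these sizes (see net.core.rmem_max / wmem_max in the HOWTO).
DDSPropertyQosPolicyHelper::add_property(
        participant_qos.property,
        "dds.transport.UDPv4.builtin.recv_socket_buffer_size",
        "2097152", DDS_BOOLEAN_FALSE);
DDSPropertyQosPolicyHelper::add_property(
        participant_qos.property,
        "dds.transport.UDPv4.builtin.send_socket_buffer_size",
        "2097152", DDS_BOOLEAN_FALSE);

DDSDomainParticipant *participant = DDSTheParticipantFactory->create_participant(
        0 /* domain id */, participant_qos, NULL /* listener */, DDS_STATUS_MASK_NONE);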

Also the NICs themselves can have a big impact and drop packets randomly when stressed by UDP. We noticed this when running our benchmarks and ended up having to buy Intel NICs.

In addition, I would recommend three other things for increased robustness and determinism:

  1. Changing the history depth on both the DataWriter and DataReader to at least 3 or 4. You will need this if you make the remaining changes below.
  2. Configuring the DataWriter to publish asynchronously using a FlowController, as sketched in my previous reply. This will make the load on the network more even, rather than peaking whenever you write each sample, minimizing the chances of overfilling buffers somewhere. There is an example of how to do this here: https://community.rti.com/examples/asynchronous-publisher
  3. Using a WaitSet instead of a Listener on the DataReader. This decouples the thread that processes the data from the one that reads from the network. This is especially important if it takes time to process the data, which I assume it does given that it is 440 KB... Note that if you do take time processing the data and want to avoid an extra data copy, you can use the read()/take() operations that return a "sequence", as these are zero-copy: you hold on to the data, without calling return_loan(), for as long as you need to process it. If you do this, which I consider good practice for large data, you need a history depth large enough to "store" the new sample(s) that may arrive while you are holding that loan. With history = 1, while a sample is loaned out any new samples received have to be dropped, because there is no space in the history to keep them (see the sketch just below).
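
To make #3 concrete, here is a minimal sketch of that pattern with the traditional C++ API, assuming a hypothetical VideoFrame type generated by rtiddsgen; running and process_frame() are placeholders for your application logic:

#include "ndds/ndds_cpp.h"
#include "VideoFrameSupport.h"   // hypothetical rtiddsgen-generated type support

// Wake up whenever data is available on the reader.
DDSStatusCondition *condition = reader->get_statuscondition();
condition->set_enabled_statuses(DDS_DATA_AVAILABLE_STATUS);

DDSWaitSet waitset;
waitset.attach_condition(condition);

VideoFrameDataReader *frame_reader = VideoFrameDataReader::narrow(reader);

while (running) {
    DDSConditionSeq active_conditions;
    DDS_Duration_t timeout = {1, 0};  // 1 second
    if (waitset.wait(active_conditions, timeout) != DDS_RETCODE_OK) {
        continue;  // timed out, no data
    }

    // take() loans the samples to the application: no extra copy of the 440 KB payload.
    VideoFrameSeq frames;
    DDS_SampleInfoSeq infos;
    if (frame_reader->take(frames, infos, DDS_LENGTH_UNLIMITED,
                           DDS_ANY_SAMPLE_STATE, DDS_ANY_VIEW_STATE,
                           DDS_ANY_INSTANCE_STATE) == DDS_RETCODE_OK) {
        for (int i = 0; i < frames.length(); ++i) {
            if (infos[i].valid_data) {
                process_frame(frames[i]);   // your processing goes here
            }
        }
        // While the loan is held, newly received samples need history depth > 1
        // (and matching resource limits) to have somewhere to go.
        frame_reader->return_loan(frames, infos);
    }
}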

 

Gerardo

 

Offline
Last seen: 2 years 7 months ago
Joined: 02/02/2015
Posts: 6

 Hi,

Firstly, let me thank you for your support so far!

I had a look at our setup, and it looks like we have already applied permanent customizations. Most of the HOWTO suggestions are already in place, with three exceptions:

net.core.rmem_max is much larger than you suggest: the default is 131071, the HOWTO recommends 2097152, and we have set 8388608.

The net.core.netdev_max_backlog default is 300; you recommend 30000, and we have only got 1000.

Lastly, the MTU is still set to 1500, and you recommend 9000.


One other thing: for the driver's throttle/input-coalescing configuration, I don't see any setting applied (on or off), which must mean a default is being used. I don't know how to determine what that default is.

We have mostly Intel NICs, with a few Broadcoms.


Do you think there is much value in bringing the unmatched settings in line with the HOWTO's recommendations and testing that, or should I jump straight to steps 1, 2 & 3 from the end of your post?

 

Gerardo Pardo
Offline
Last seen: 1 week 1 day ago
Joined: 06/02/2010
Posts: 602

Hi,

Yes, I would first try the settings in the HOWTO. At the very least net.core.netdev_max_backlog is very important: it controls the number of packets that can be queued on the interface. Considering that your packets are 1500 bytes, 300 is very little; it does not even fit one of your ~440 KB messages (about 300 fragments each)...

I would try net.core.netdev_max_backlog first, on its own, because you may see a big difference just with this.

Next I would try net.core.rmem_max. In theory, net.core.rmem_max being so large shouldn't be a problem. However, I have seen people report increased packet loss with very large values of net.core.rmem_max, which is counter-intuitive... See for example http://serverfault.com/questions/410230/higher-rmem-max-value-leading-to-more-packet-loss. So I would also try setting it to the smaller recommended value, just to see if that makes a difference.

I would also try the MTU of 9K; that should help significantly by reducing the number of IP packets. But I would try that last, after the previous two.

You could try all at the same time, but if it is easy to run the test it is useful to try them one at a time because that way we can find out which was the root cause.

The throttle/input coalescing mostly impacts throughput and latency. Normally the default setting optimizes throughput, so it should not be an issue in your case. This is board-specific; the HOWTO explains how to determine the settings for the Intel PRO boards. Not all boards may even support such a setting. In any case, I think this is the least important.

I have seen issues with the Broadcom boards. Does the packet loss correlate with the machines that have the Broadcom NICs?

Gerardo

 

 

Offline
Last seen: 2 years 7 months ago
Joined: 02/02/2015
Posts: 6

I will do more tests tomorrow and let you know whether I see this only with the Broadcom NICs or also with the Intel ones.

Also note our backlog is 1000 (not 300); 300 was just the system default. I will increase this to 30000 as per the HOWTO.

I will also test lowering net.core.rmem_max.