QOS to optimize throughput

Offline
Last seen: 7 years 2 months ago
Joined: 06/10/2014
Posts: 49
QOS to optimize throughput

Hi,

I am running into a situation where a publisher is publishing at a rate where the reader is not able to keep up and I am losing samples. I am using one of the builtin QOS profiles (BuiltinQosLibExp::Pattern.LastValueCache).

I read this article on how to tune QOS for throughput (https://community.rti.com/kb/which-qos-parameters-are-important-tune-throughput-testing). 

The QOS that I am using has resources like max_samples, max_samples_per_instance set to default values. One of the defaults is DDS_UNLIMITED. In this case, does DDS dynamically adjust queue sizes depending on the data? 

If I am missing samples, do I need to determine the fastest writer and maximum message size in order to explicitly set the resources (e.g. max_samples) to these values? 

Thanks,

Anne Fiore

Offline
Last seen: 11 months 2 weeks ago
Joined: 02/11/2016
Posts: 144

Hey,

Not the leading authority on this but I think I can help out a bit.

1. For high-throughput scenarios, the QoS profile you chose may not be ideal; a list of built-in QoS profiles that may help you out is available here: https://community.rti.com/examples/built-qos-profiles

2. There are many ways you can improve throughput apart from QoS. Some are related to server / workstation settings (https://community.rti.com/best-practices/tune-your-os-performance may be relevant), while others are related to how you use the API, like using WaitSets instead of listeners (https://community.rti.com/best-practices/use-waitsets-except-when-you-need-extreme-latency); a minimal WaitSet sketch follows right after this list.

3. The article you've read lists a few of the more extreme settings you may use, but the following is a bit easier to understand: https://community.rti.com/kb/summary-and-typical-use-cases-different-quality-service-qos-parameters
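To make point 2 a bit more concrete, here is a minimal sketch of a WaitSet-based read loop. It assumes the classic Connext C++ API; the FooDataReader type and the 4-second timeout are placeholders, not something from this thread.

```cpp
#include "ndds/ndds_cpp.h"

// Sketch of processing data with a WaitSet instead of a listener callback.
void read_loop(DDSDataReader *reader)
{
    // Wake up whenever new data is available on this reader.
    DDSStatusCondition *condition = reader->get_statuscondition();
    condition->set_enabled_statuses(DDS_DATA_AVAILABLE_STATUS);

    DDSWaitSet waitset;
    waitset.attach_condition(condition);

    DDSConditionSeq active_conditions;
    DDS_Duration_t timeout = {4, 0};  // 4 seconds (placeholder)

    while (true) {
        // Block in this application thread until data arrives or we time out.
        DDS_ReturnCode_t retcode = waitset.wait(active_conditions, timeout);
        if (retcode == DDS_RETCODE_TIMEOUT) {
            continue;  // nothing arrived in this period
        }
        // Drain the reader's cache here, e.g.:
        // FooDataReader *typed_reader = FooDataReader::narrow(reader);
        // typed_reader->take(...);
    }
}
```

The main difference from a listener is that samples are processed in your own thread rather than in the middleware's receive thread, which is usually better for sustained throughput.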

 

When you say the reader is not able to keep up, it's important to check what the actual cause of the lost samples is (if you are using a listener, you can handle the sample-lost status and check the last reason).
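As an illustration of that diagnostic step, here is a minimal listener sketch, assuming the classic Connext C++ API. Note that last_reason is an RTI extension to the standard SampleLostStatus, so check the API reference for your version.

```cpp
#include <cstdio>
#include "ndds/ndds_cpp.h"

// Sketch of a listener that reports why samples were lost.
class SampleLostDiagnosticListener : public DDSDataReaderListener {
public:
    virtual void on_sample_lost(
            DDSDataReader * /*reader*/,
            const DDS_SampleLostStatus &status)
    {
        // total_count_change: samples lost since the last callback.
        // last_reason (RTI extension): why the most recent sample was dropped.
        printf("Lost %d sample(s), last reason code %d\n",
               status.total_count_change,
               (int) status.last_reason);
    }
};
```

The listener would be installed on the DataReader with DDS_SAMPLE_LOST_STATUS included in its status mask.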

 

Regarding the max_samples and max_samples_per_instance QoS: if they are left at their defaults (unlimited), the middleware allocates memory dynamically as needed, so the practical bound is whatever memory your operating system makes available.

What could help your performance (at the cost of memory) is setting initial_samples / initial_instances higher than their default values, to avoid dynamic allocation at runtime.
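For example, a sketch along these lines (classic Connext C++ API assumed; the pool sizes are placeholders you would size from your own data rates):

```cpp
#include "ndds/ndds_cpp.h"

// Sketch: pre-allocate the reader's sample and instance pools up front
// so the middleware does not have to allocate on the data path.
DDSDataReader *create_preallocated_reader(
        DDSSubscriber *subscriber, DDSTopic *topic)
{
    DDS_DataReaderQos reader_qos;
    subscriber->get_default_datareader_qos(reader_qos);

    reader_qos.resource_limits.initial_samples = 512;    // placeholder value
    reader_qos.resource_limits.initial_instances = 64;   // placeholder value
    // max_samples / max_samples_per_instance stay at their defaults here.

    return subscriber->create_datareader(
            topic, reader_qos, NULL /* listener */, DDS_STATUS_MASK_NONE);
}
```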

 

I hope this is a bit helpful,

Roy.

Offline
Last seen: 7 years 2 months ago
Joined: 06/10/2014
Posts: 49

Hi,

This is a very helpful list of references. I'm going to look further into the cause of the samples being lost by adding a listener as you suggested. Hopefully I will figure out which resources need to be tuned.

Thanks for the help.

Anne

Offline
Last seen: 11 months 2 weeks ago
Joined: 02/11/2016
Posts: 144

Hi,

Good luck and let us know if you get stuck in your research.

 

Roy.

Offline
Last seen: 1 week 3 days ago
Joined: 06/02/2010
Posts: 602

Hi Anne,

Does the data-type associated with your Topic define some members as Key?

The BuiltinQosLibExp::Pattern.LastValueCache settings intentionally allow samples to be lost in order to decouple the speeds at which the various readers can access the data. It is defined as an alias for Generic.KeepLastReliable.TransientLocal, which sets the HISTORY Qos kind to KEEP_LAST with depth=1 on both the DataWriter and the DataReader. This configuration is intended for keyed Topics (i.e. those whose data-type uses the //@Key annotation on some members), and the intent is that the "RELIABILITY" only needs to maintain the "last value" (most current value) per key. So if a new value is produced for a key, it is OK to remove the older value even if some DataReaders have still not received/read it.

Imagine that you are producing readings from a particular sensor and publishing them on Topic "SensorData"; further assume that the SensorData data-type contains a member "sensorId" that identifies that particular sensor and that this member is marked as a "Key". Then the BuiltinQosLibExp::Pattern.LastValueCache profile can be used to publish and subscribe this Topic and always have the "most current value" of each sensor available. If a particular sensor updates data too fast for some readers, intermediate updates may be lost, but the most current one per sensorId will be reliably received.

You can change the HISTORY Qos depth to some value greater than 1 to retain a few more of the "latest" samples per key, not just the last one. This may cause fewer samples to be "lost" in bursty situations, but it fundamentally does not change the fact that some samples may be lost if the DataWriter is consistently writing faster than a particular DataReader can handle.
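As a sketch of that tuning (classic Connext C++ API assumed, including the participant factory's profile-lookup helpers; the depth of 8 is a placeholder to be sized from benchmarking), you could start from the built-in profile and only deepen the history:

```cpp
#include "ndds/ndds_cpp.h"

// Sketch: load the LastValueCache pattern and keep a few more samples per key.
DDS_ReturnCode_t deepen_last_value_cache(
        DDS_DataWriterQos &writer_qos, DDS_DataReaderQos &reader_qos)
{
    DDS_ReturnCode_t retcode =
            DDSTheParticipantFactory->get_datawriter_qos_from_profile(
                    writer_qos, "BuiltinQosLibExp", "Pattern.LastValueCache");
    if (retcode != DDS_RETCODE_OK) return retcode;

    retcode = DDSTheParticipantFactory->get_datareader_qos_from_profile(
            reader_qos, "BuiltinQosLibExp", "Pattern.LastValueCache");
    if (retcode != DDS_RETCODE_OK) return retcode;

    // Still KEEP_LAST, but with depth > 1 an intermediate update for a bursty
    // key is less likely to be replaced before a slow reader gets to it.
    writer_qos.history.depth = 8;   // placeholder value
    reader_qos.history.depth = 8;   // placeholder value
    return DDS_RETCODE_OK;
}
```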

If you do not want any samples to be lost, then you should use a Qos profile that derives from Generic.StrictReliable. This sets the HISTORY Qos kind to KEEP_ALL.

Note that a "strict" reliable Qos where no sample can be lost necessarily provides the means for the reliable DataReaders to put back-pressure on the DataWriter and slow it down to a rate that can be handled by all DataReaders. This effectively couples the Writer and Reader and limits the steady-state throughput to that of the slowest DataReader. This may be what you want if you cannot tolerate any sample loss, but in many cases some sample loss is a good tradeoff in order to enable faster steady-state throughput to those readers that can keep up... Hence the choice available in setting the HISTORY Qos kind.
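For reference, the essential settings behind such a strict-reliable configuration look roughly like this (a sketch in the classic Connext C++ API; in practice you could simply derive from the Generic.StrictReliable profile instead of setting the fields by hand):

```cpp
#include "ndds/ndds_cpp.h"

// Sketch: strict reliability -- no loss, but the writer is throttled
// to the pace of the slowest reliable reader.
void make_strictly_reliable(
        DDS_DataWriterQos &writer_qos, DDS_DataReaderQos &reader_qos)
{
    // RELIABLE + KEEP_ALL on both ends: samples stay in the writer's queue
    // until every reliable reader has acknowledged them, so a slow reader
    // back-pressures the writer instead of silently losing data.
    writer_qos.reliability.kind = DDS_RELIABLE_RELIABILITY_QOS;
    writer_qos.history.kind = DDS_KEEP_ALL_HISTORY_QOS;

    reader_qos.reliability.kind = DDS_RELIABLE_RELIABILITY_QOS;
    reader_qos.history.kind = DDS_KEEP_ALL_HISTORY_QOS;
}
```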

Regards,

Gerardo

Offline
Last seen: 7 years 2 months ago
Joined: 06/10/2014
Posts: 49

Hi Gerardo,

Thanks for the explanation. This is exactly what was happening in my case. With a history depth of 1, the reader was not able to process a sample before a new sample arrived and replaced it. When I increased the history depth, I was able to receive all samples. I am going to run some benchmarks on my system to see if there is a reasonable history depth at which the chances of losing samples are low.

Can I ask, with the HISTORY kind set to KEEP_ALL and DURABILITY set to TRANSIENT_LOCAL, does that mean a writer will try to keep all samples for a late-joining reader? Ideally we would like just the last sample for a late-joining reader, but also a way to increase the history depth for a reliable reader so that it can get all samples on the bus.

Thanks,

Anne

 

Offline
Last seen: 11 months 2 weeks ago
Joined: 02/11/2016
Posts: 144

Hey Anne,

I'm not sure I'm following your needs.

Do you:

1. Have one writer and you need different readers to interact differently with it?

2. Have multiple writers and readers that behave differently?

 

Do note that you can set the history depth of the reader to 1 if you're only interested in the last update, but I can't find in the QoS guide what the side effects may be.

Also note that using KEEP_ALL will not literally keep everything (it depends both on the amount of memory the operating system actually makes available and on your RESOURCE_LIMITS QoS).
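As a sketch of that interaction (classic Connext C++ API assumed; the caps are placeholders), bounding KEEP_ALL explicitly might look like this:

```cpp
#include "ndds/ndds_cpp.h"

// Sketch: with KEEP_ALL, "all" really means "up to the resource limits",
// so capping them puts an explicit ceiling on memory use.
void bound_keep_all(DDS_DataReaderQos &reader_qos)
{
    reader_qos.history.kind = DDS_KEEP_ALL_HISTORY_QOS;

    reader_qos.resource_limits.max_samples = 4096;              // placeholder
    reader_qos.resource_limits.max_samples_per_instance = 256;  // placeholder
}
```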

 

Hope that helps,

Roy.

Offline
Last seen: 7 years 2 months ago
Joined: 06/10/2014
Posts: 49

Hi Roy,

We are currently using the same QOS for all readers and writers (BuiltinQosLibExp::Pattern.LastValueCache). This has a history depth of 1. What we found is that for some topics which have a high data rate, the reader is not able to keep up with the writer because the depth is 1.  

To fix this we are looking at changing the history depth for this topic's writers and readers to be > 1 or to use HISTORY set to KEEP_ALL. I was concerned about resources if we set the HISTORY to KEEP_ALL and its effect on system performance. I think I understand enough about what is happening so that I can run some tests and benchmark performance.

Thanks for the help,

Anne