Hi,
I am running into a situation where a publisher is publishing faster than the reader can keep up, and I am losing samples. I am using one of the built-in QoS profiles (BuiltinQosLibExp::Pattern.LastValueCache).
I read this article on how to tune QOS for throughput (https://community.rti.com/kb/which-qos-parameters-are-important-tune-throughput-testing).
The QoS that I am using leaves resource limits such as max_samples and max_samples_per_instance at their default values. One of the defaults is DDS_LENGTH_UNLIMITED. In this case, does DDS dynamically adjust the queue sizes depending on the data?
If I am missing samples, do I need to determine the fastest writer and the maximum message size in order to explicitly set the resource limits (e.g. max_samples) to those values?
Thanks,
Anne Fiore
Hey,
Not the leading authority on this but I think I can help out a bit.
1. For high-throughput scenarios the QoS profile you chose may not be ideal; a list of built-in QoS profiles that may help you out is available here: https://community.rti.com/examples/built-qos-profiles
2. There are many ways to improve throughput apart from QoS. Some are related to the server/workstation settings (https://community.rti.com/best-practices/tune-your-os-performance may be relevant), while others are related to how you use the API, such as using WaitSets instead of listeners (https://community.rti.com/best-practices/use-waitsets-except-when-you-need-extreme-latency); see the sketch right after this list.
3. The article you've read lists a few of the extreme settings you may use, but the following is a bit easier to understand: https://community.rti.com/kb/summary-and-typical-use-cases-different-quality-service-qos-parameters
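As a rough sketch of the WaitSet approach mentioned in point 2 (traditional Connext C++ API; the Foo type and its generated FooDataReader/FooSeq classes are hypothetical placeholders for your own rtiddsgen-generated type):

```cpp
#include "ndds/ndds_cpp.h"
#include "FooSupport.h"   // hypothetical rtiddsgen-generated type support

void read_with_waitset(FooDataReader *reader) {
    // Wake up only when data is available, instead of dispatching a listener.
    DDSWaitSet waitset;
    DDSStatusCondition *condition = reader->get_statuscondition();
    condition->set_enabled_statuses(DDS_DATA_AVAILABLE_STATUS);
    waitset.attach_condition(condition);

    for (;;) {
        DDSConditionSeq active;
        DDS_Duration_t timeout = {4, 0};   // 4 seconds
        if (waitset.wait(active, timeout) != DDS_RETCODE_OK) {
            continue;   // timed out (or error); just wait again
        }

        FooSeq samples;
        DDS_SampleInfoSeq infos;
        if (reader->take(samples, infos, DDS_LENGTH_UNLIMITED,
                         DDS_ANY_SAMPLE_STATE, DDS_ANY_VIEW_STATE,
                         DDS_ANY_INSTANCE_STATE) == DDS_RETCODE_OK) {
            // ... process samples[i] where infos[i].valid_data is true ...
            reader->return_loan(samples, infos);
        }
    }
}
```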
When you say the reader is not able to keep up, it's important to check what is actually causing the samples to be lost (if you are using a listener, you can instrument the on_sample_lost callback and check the last reason reported in the status).
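For example, a minimal sketch of that instrumentation (traditional Connext C++ API; last_reason is an RTI extension to the standard SampleLostStatus, and the reader and listener lifetimes are assumed to be managed elsewhere):

```cpp
#include <cstdio>
#include "ndds/ndds_cpp.h"

// Listener that only watches for lost samples and logs why they were lost.
class LossMonitor : public DDSDataReaderListener {
public:
    virtual void on_sample_lost(DDSDataReader * /*reader*/,
                                const DDS_SampleLostStatus &status) {
        std::printf("samples lost: total=%d (+%d), last_reason=%d\n",
                    status.total_count, status.total_count_change,
                    static_cast<int>(status.last_reason));
    }
};

void monitor_sample_loss(DDSDataReader *reader, LossMonitor *monitor) {
    // Enable only the SAMPLE_LOST status so other events keep going to
    // whatever mechanism (e.g. a WaitSet) the application already uses.
    reader->set_listener(monitor, DDS_SAMPLE_LOST_STATUS);
}
```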
Regarding the max_samples and max_samples_per_instance QoS: if they are left at their default (DDS_LENGTH_UNLIMITED), there is no fixed queue size; the middleware grows the queues dynamically as samples arrive, so the practical bound is the memory the operating system makes available to the process.
What could help your performance (at the cost of memory) is setting initial_samples / initial_instances higher than their default values, so the queues are pre-allocated up front instead of being grown dynamically at runtime.
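A rough illustration of that (traditional Connext C++ API; the numbers are placeholders to be sized from your own data rates, and the subscriber/topic are assumed to exist already):

```cpp
#include "ndds/ndds_cpp.h"

DDSDataReader* create_preallocated_reader(DDSSubscriber *subscriber,
                                          DDSTopic *topic) {
    // Start from the profile already in use and only raise the initial_* limits.
    DDS_DataReaderQos reader_qos;
    DDSTheParticipantFactory->get_datareader_qos_from_profile(
        reader_qos, "BuiltinQosLibExp", "Pattern.LastValueCache");

    // max_samples etc. stay at their defaults; raising the initial_* values
    // just pre-allocates the queues instead of growing them at runtime.
    reader_qos.resource_limits.initial_samples   = 512;
    reader_qos.resource_limits.initial_instances = 64;

    return subscriber->create_datareader(
        topic, reader_qos, NULL, DDS_STATUS_MASK_NONE);
}
```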
I hope this is a bit helpful,
Roy.
Hi,
This is a very helpful list of references. I'm going to look further into the cause of the samples being lost by adding a listener as you suggested. Hopefully I will figure out which resource limits need to be tuned.
Thanks for the help.
Anne
Hi,
Good luck and let us know if you get stuck in your research.
Roy.
Hi Anne,
Does the data-type associated with your Topic define some members as Key?
The BuiltinQosLibExp::Pattern.LastValueCache settings intentionally allow samples to be lost in order to decouple the writer from the speed at which the various readers can access the data. It is defined as an alias for Generic.KeepLastReliable.TransientLocal, which sets the HISTORY QoS kind to KEEP_LAST with depth = 1 on both the DataWriter and the DataReader. This configuration is intended for keyed Topics (i.e. those whose data-type uses the //@Key annotation on some members), and the intent is that the "reliability" only needs to maintain the "last value" (most current value) per key. So if a new value is produced for a key, it is OK to remove the older value even if some DataReaders have still not received/read it.
Imagine that you are reading from a particular sensor and publishing the readings on Topic "SensorData", and further assume that the SensorData data-type contains a member "sensorId" that identifies that particular sensor and is marked as a "Key" member. Then the BuiltinQosLibExp::Pattern.LastValueCache profile can be used to publish and subscribe this Topic and always have available the "most current value" for each sensor. If a particular sensor updates data too fast for some readers, intermediate updates may be lost, but the most current one per sensorId will be reliably received.
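As a rough sketch (traditional Connext C++ API, error handling omitted), creating the entities from that profile could look like the following; SensorData is assumed to be generated by rtiddsgen from an IDL type whose sensorId member carries the //@Key annotation:

```cpp
#include "ndds/ndds_cpp.h"
#include "SensorDataSupport.h"   // hypothetical rtiddsgen-generated type support

DDSDataWriter* publish_sensor_data(int domain_id) {
    // Everything is created from the LastValueCache profile, so the writer
    // inherits KEEP_LAST history with depth 1 and TRANSIENT_LOCAL durability.
    DDSDomainParticipant *participant =
        DDSTheParticipantFactory->create_participant_with_profile(
            domain_id, "BuiltinQosLibExp", "Pattern.LastValueCache",
            NULL, DDS_STATUS_MASK_NONE);

    SensorDataTypeSupport::register_type(
        participant, SensorDataTypeSupport::get_type_name());
    DDSTopic *topic = participant->create_topic_with_profile(
        "SensorData", SensorDataTypeSupport::get_type_name(),
        "BuiltinQosLibExp", "Pattern.LastValueCache",
        NULL, DDS_STATUS_MASK_NONE);

    DDSPublisher *publisher = participant->create_publisher_with_profile(
        "BuiltinQosLibExp", "Pattern.LastValueCache",
        NULL, DDS_STATUS_MASK_NONE);
    return publisher->create_datawriter_with_profile(
        topic, "BuiltinQosLibExp", "Pattern.LastValueCache",
        NULL, DDS_STATUS_MASK_NONE);
}
// The DataReader side is created the same way from a Subscriber with the
// same profile.
```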
You can change the HISTORY QoS depth to some value greater than 1 to retain a few more of the "latest" samples per key, not just the last one. This may cause fewer samples to be "lost" in bursty situations, but it fundamentally does not change the fact that some samples may be lost if the DataWriter is consistently writing faster than a particular DataReader can handle.
If you do not want any sample lost, then you should use a QoS profile that derives from Generic.StrictReliable. This sets the HISTORY QoS kind to KEEP_ALL.
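As a rough sketch of that override (traditional Connext C++ API; the depth value is a placeholder, and the publisher/topic are assumed to exist already):

```cpp
#include "ndds/ndds_cpp.h"

DDSDataWriter* create_deeper_history_writer(DDSPublisher *publisher,
                                            DDSTopic *topic) {
    // Keep the LastValueCache pattern but retain more than one sample per key.
    DDS_DataWriterQos writer_qos;
    DDSTheParticipantFactory->get_datawriter_qos_from_profile(
        writer_qos, "BuiltinQosLibExp", "Pattern.LastValueCache");

    writer_qos.history.kind  = DDS_KEEP_LAST_HISTORY_QOS;
    writer_qos.history.depth = 16;   // > 1: ride out short bursts per key

    // For no loss at all, derive from Generic.StrictReliable instead, or set:
    // writer_qos.history.kind = DDS_KEEP_ALL_HISTORY_QOS;

    return publisher->create_datawriter(
        topic, writer_qos, NULL, DDS_STATUS_MASK_NONE);
}
// The matching DataReader QoS should get the same history change.
```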
Note that a "strict" reliable QoS, where no sample can be lost, necessarily provides the means for the reliable DataReaders to put back-pressure on the DataWriter and slow it down to a rate that can be handled by all DataReaders. This effectively couples the writers and readers and limits the steady-state throughput to that of the slowest DataReader. This may be what you want if you cannot tolerate any sample loss, but in many cases some sample loss is a good tradeoff in order to enable faster steady-state throughput for the readers that can keep up... hence the choice available in setting the HISTORY QoS kind.
Regards,
Gerardo
Hi Gerardo,
Thanks for the explanation. This is exactly what was happening in my case. With a history depth of 1, the reader was not able to process a sample before a new sample arrived and replaced it. When I increased the history depth, I was able to receive all samples. I am going to run some benchmarks on my system to see if there is a reasonable history depth so that the chance of losing samples is low.
Can I ask, with the HISTORY kind set to KEEP_ALL, and DURABILITY set to TRANSIENT_LOCAL, does that mean a writer will try to keep all samples for a late joining reader? Ideally we would like just the last sample for a late joining reader but have a way to increase the history depth for a reliable reader so that it can get all samples on the bus.
Thanks,
Anne
Hey Anne,
I'm not sure I'm following your needs.
Do you:
1. Have one writer and you need different readers to interact differently with it?
2. Have multiple writers and readers that behave differently?
Do note that you can set the history depth of the reader to 1 if you're only interested in the last update, but I can't find in the QoS guide what the side effects may be.
Also note that using KEEP_ALL will not literally keep everything: it is bounded both by the amount of memory the operating system actually makes available and by your RESOURCE_LIMITS QoS.
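To make that bound explicit, a rough sketch (traditional Connext C++ API; the numbers are placeholders) could pair KEEP_ALL with finite resource limits so the memory footprint stays predictable:

```cpp
#include "ndds/ndds_cpp.h"

void apply_bounded_keep_all(DDS_DataReaderQos &reader_qos) {
    // KEEP_ALL history is still capped by RESOURCE_LIMITS; finite limits
    // keep memory use predictable when the writer is faster than the reader.
    reader_qos.history.kind = DDS_KEEP_ALL_HISTORY_QOS;
    reader_qos.resource_limits.max_samples              = 4096;
    reader_qos.resource_limits.max_instances            = 64;
    reader_qos.resource_limits.max_samples_per_instance = 256;
}
```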
Hope that helps,
Roy.
Hi Roy,
We are currently using the same QoS for all readers and writers (BuiltinQosLibExp::Pattern.LastValueCache). This has a history depth of 1. What we found is that for some topics with a high data rate, the reader is not able to keep up with the writer because the depth is 1.
To fix this we are looking at changing the history depth of this topic's writers and readers to something greater than 1, or using HISTORY set to KEEP_ALL. I was concerned about resource usage if we set HISTORY to KEEP_ALL, and its effect on system performance. I think I understand enough about what is happening to run some tests and benchmark performance.
Thanks for the help,
Anne