Hello
I configured a DataWriter with a max_flush_delay of 2000000 nanoseconds (2 milliseconds). I write about 600 samples to the DataWriter over 30 seconds, followed by a long quiet period, but the last few samples are never received by the DataReader.
When I disable batching, all samples are received as expected.
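For reference, the relevant part of my writer QoS looks roughly like this (a sketch of my XML profile; only the fields shown here are set):

```xml
<datawriter_qos>
    <batch>
        <enable>true</enable>
        <!-- 2,000,000 ns = 2 ms; note that <sec> is not set here -->
        <max_flush_delay>
            <nanosec>2000000</nanosec>
        </max_flush_delay>
    </batch>
</datawriter_qos>
```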
I am using Connext 5.1 under Windows x64. I saw the following fix in the 5.1 Release Notes, but the problem still exists.
**************
Batch Never Flushed, Even with Finite max_flush_delay
If batching was enabled and the BatchQosPolicy's max_flush_delay was set to a finite value, it was possible that a successfully written batch was never automatically flushed (as it should have been based on max_flush_delay). This problem occurred when there was a previously written batch that was only flushed when the batch that was never flushed was written. This problem has been resolved. [RTI Issue ID CORE-5870]
************
Thanks
David
Hi,
The only way I'm able to reproduce the problem is by deleting the DataWriter before the long quiet period, but I assume the quiet period happens before you delete the DataWriter. Note that the DataWriter does not automatically flush its contents when it is deleted.
600 samples over 30 seconds comes out to an average of 50 milliseconds per sample. Were you trying to get some benefit from flushing every 2 milliseconds, or were you just turning max_flush_delay down to the minimum value that still reproduces the problem?
Yusheng
Senior Software Engineer, RTI
I am sending samples of ~100 bytes in bursts: every 30 milliseconds I send a few samples.
The number of messages in the production system is expected to be higher, around 500 messages/sec, but quiet periods are possible. So I am trying to minimize bandwidth requirements while still keeping latency low.
Hi David,
Every 30 milliseconds you send some samples, and the total time is 30 seconds. That means you send 30/0.03 = 1000 bursts of samples. But you said the total is about 600 samples, so each burst would have to contain less than one sample; your numbers don't add up. It would be much easier to reproduce your problem if you could attach your actual publisher and subscriber code and QoS configuration. Just upload the simplest application that exhibits the behavior.
Yusheng
Hello,
As requested, here is a sample application that reproduces the problem: 664 samples are written, but only the first 640 are read by the DataReader.
Note that I am using two DomainParticipants (one writing and the other reading) in the same process - could that be related?
Console output at my dev machine is also attached - see output.txt
Hi David,
Thanks for your reproducer. I was able to reproduce the problem. Try adding <sec>0</sec> to your max_flush_delay. That resolved the problem for me.
Note that the default max_flush_delay is INFINITE, which corresponds to sec = nanosec = 0x7fffffff.
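In other words, spell out both fields of the duration in your profile, along these lines (XML sketch):

```xml
<max_flush_delay>
    <sec>0</sec>
    <nanosec>2000000</nanosec>
</max_flush_delay>
```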
Yusheng
Also, see this: http://community.rti.com/rti-doc/510/ndds.5.1.0/doc/pdf/RTI_CoreLibrariesAndUtilities_ReleaseNotes.pdf
Section 3.1.13: Invalid Value for max_blocking_time
The upshot is that QoS parsing can react in unexpected ways if you set only one of the fields of a Duration_t (only sec, or only nanosec). The parser updates only what the user sets, but the sec and nanosec fields of a Duration_t are only coherent as a pair, so the unset field keeps its default value.
Best practice is to /always/ set both.
Thanks, that actually solved the problem!
David