Batch never flushed (with non-infinite delay)

Offline
Last seen: 9 years 4 months ago
Joined: 05/25/2014
Posts: 6

Hello

I configured a DataWriter with a max_flush_delay of 2000000 nanosec (2 ms). I write about 600 samples to the DataWriter over 30 sec, followed by a long quiet period, but the last few samples are never received by the DataReader.

When I disable batching, all samples are received as expected.

I am using Connext 5.1 under Windows x64. I saw the following fix in the 5.1 Release Notes, but the problem still exists.

**************

Batch Never Flushed, Even with Finite max_flush_delay

If batching was enabled and the BatchQosPolicy’s max_flush_delay was set to a finite value, it was possible that a successfully written batch was never automatically flushed (as it should have been based on max_flush_delay). This problem occurred when there was a previously written batch that was only flushed when the batch that was never flushed was written. This problem has been resolved. [RTI Issue ID CORE-5870]

************

Thanks

David

Offline
Last seen: 1 year 1 month ago
Joined: 01/14/2013
Posts: 16

Hi,

The only way I'm able to reproduce the problem is by deleting the DataWriter before this long quiet period, but I assume you're doing the long quiet period before you delete the DataWriter. The DataWriter doesn't automatically flush its contents when it's deleted.

600 samples over 30 seconds comes out to an average of one sample every 50 ms. Were you trying to get some benefit from flushing every 2 ms, or were you just turning down the max_flush_delay to the minimum possible value that still reproduces the problem?

Yusheng

Senior Software Engineer, RTI

Offline
Last seen: 9 years 4 months ago
Joined: 05/25/2014
Posts: 6

I am sending samples of ~100 bytes in bursts (every 30 ms I send some samples).

The number of messages in the production system is expected to be higher, around 500 messages / sec, but quiet periods are possible. So I am trying to minimize bandwidth requirements while also keeping latency low.

Offline
Last seen: 1 year 1 month ago
Joined: 01/14/2013
Posts: 16

Hi David,

Every 30 ms, you send some samples. Total time is 30 sec. That means you send 30/0.03 = 1000 bursts of samples. But you said the total is about 600 samples, so each burst would have to contain fewer than one sample. So your story is not adding up. It would be much easier to reproduce your problem if you could attach your actual publisher and subscriber code and QoS configuration. You can just upload the simplest application that exhibits your behavior.
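For anyone following along, the arithmetic above can be checked with a few lines of Python. This is only a sketch of the numbers quoted in this thread (30 sec total, a burst every 30 ms, about 600 samples); the variable names are made up for illustration.

```python
from fractions import Fraction

# Figures quoted in the thread; the names are illustrative only.
total_time_s = 30
burst_period_s = Fraction(30, 1000)   # one burst every 30 ms
total_samples = 600

# Number of bursts that fit in the run, and samples needed per burst.
bursts = total_time_s / burst_period_s
samples_per_burst = total_samples / bursts

# Average spacing between samples if they were spread evenly.
mean_gap_ms = Fraction(total_time_s, total_samples) * 1000

print(f"bursts: {bursts}")
print(f"samples per burst: {float(samples_per_burst)}")
print(f"mean gap between samples: {mean_gap_ms} ms")
```

Exact fractions are used instead of floats so the division by 0.03 does not pick up rounding error.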

Yusheng

Offline
Last seen: 9 years 4 months ago
Joined: 05/25/2014
Posts: 6

Hello,

As requested, here is a sample application which reproduces the problem - 664 samples are written, but only the first 640 are read by the DataReader.

Note that I am using two DomainParticipants (one writing and the other reading) in the same process - could it be related?

Console output from my dev machine is also attached - see output.txt

 

Offline
Last seen: 1 year 1 month ago
Joined: 01/14/2013
Posts: 16

Hi David,

Thanks for your reproducer. I was able to reproduce the problem. Try adding <sec>0</sec> to your max_flush_delay. That resolved the problem for me.

Note that the default max_flush_delay is INFINITE, which corresponds to sec = nanosec = 0x7fffffff.
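To see why the missing sec field matters, here is a toy Python model of the behavior described in this thread: the default is INFINITE (both fields 0x7fffffff), and only the fields that are explicitly set get overwritten. The `Duration` class and `apply_fields` helper are hypothetical and are not part of the Connext API; they only illustrate the effect.

```python
from dataclasses import dataclass, replace

# Value quoted above for both fields of the INFINITE default.
INFINITE_FIELD = 0x7FFFFFFF


@dataclass(frozen=True)
class Duration:
    """Hypothetical stand-in for a sec/nanosec duration pair."""
    sec: int = INFINITE_FIELD
    nanosec: int = INFINITE_FIELD


def apply_fields(default: Duration, **fields: int) -> Duration:
    """Overwrite only the fields that were explicitly provided."""
    return replace(default, **fields)


def total_nanoseconds(d: Duration) -> int:
    """Collapse the pair into a single integer nanosecond count."""
    return d.sec * 1_000_000_000 + d.nanosec


# Only nanosec set: sec keeps the INFINITE default.
only_nanosec = apply_fields(Duration(), nanosec=2_000_000)

# Both fields set: the delay really is 2 ms.
both_fields = apply_fields(Duration(), sec=0, nanosec=2_000_000)

print(only_nanosec, total_nanoseconds(only_nanosec))
print(both_fields, total_nanoseconds(both_fields))
```

In this model, setting only nanosec leaves a delay of roughly 0x7fffffff seconds, which in practice never expires, while setting both fields gives the intended 2 ms.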

Yusheng

rip
Offline
Last seen: 2 weeks 4 days ago
Joined: 04/06/2012
Posts: 324

Also, see this:  http://community.rti.com/rti-doc/510/ndds.5.1.0/doc/pdf/RTI_CoreLibrariesAndUtilities_ReleaseNotes.pdf

Section 3.1.13: Invalid Value for max_blocking_time

The upshot is that the QoS parsing can react in unexpected ways if you set only one of the fields (only sec, or only nanosec) of a Duration_t. The behavior is to update only what the user sets, but in a type like Duration_t, sec and nanosec are coherent only as a pair, so the field you did not set keeps its default value.

Best practice is to /always/ set both.
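One way to catch this before it bites is a small check over the QoS XML. The sketch below uses only Python's standard library; the `incomplete_durations` helper is hypothetical, and the XML fragment is an assumed layout for a batching setup like the one discussed here, not a copy of David's actual file.

```python
import xml.etree.ElementTree as ET

# Assumed fragment for illustration: only <nanosec> is set.
QOS_FRAGMENT = """
<datawriter_qos>
  <batch>
    <enable>true</enable>
    <max_flush_delay>
      <nanosec>2000000</nanosec>
    </max_flush_delay>
  </batch>
</datawriter_qos>
"""


def incomplete_durations(xml_text):
    """Return tags of elements that set only one of <sec>/<nanosec>."""
    root = ET.fromstring(xml_text)
    incomplete = []
    for elem in root.iter():
        child_tags = {child.tag for child in elem}
        has_sec = "sec" in child_tags
        has_nanosec = "nanosec" in child_tags
        # Flag the element when exactly one of the two fields is present.
        if has_sec != has_nanosec:
            incomplete.append(elem.tag)
    return incomplete


# Same fragment with the missing <sec>0</sec> added.
FIXED_FRAGMENT = QOS_FRAGMENT.replace("<nanosec>", "<sec>0</sec><nanosec>")

print(incomplete_durations(QOS_FRAGMENT))
print(incomplete_durations(FIXED_FRAGMENT))
```

The check is purely structural: it does not know which elements are durations, it just flags any element that has one of the two children without the other.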

Offline
Last seen: 9 years 4 months ago
Joined: 05/25/2014
Posts: 6

Thanks, that actually solved the problem! 

 

David