Late Joiners Receiving Samples for Disposed Instances with no Writers

Last seen: 5 years 6 months ago
Joined: 10/29/2015
Posts: 12

I have the following scenario:

  • Keyed topic with multiple instances
  • Reliable, TransientLocal QoS
  • One DataWriter with History::KeepAll
  • Multiple DataReaders with History::KeepLast(2)
  • Late-joining DataReaders

What is the expected behavior in the following situation:

  • DataWriter writes data for a particular instance
  • DataWriter then disposes the instance, and immediately unregisters it
  • Some time later, a late-joining DataReader arrives

Is the DataReader expected to receive any information at all regarding the instance that was completely disposed and unregistered before the DataReader ever appeared?

I currently observe that this is exactly what happens: The DataReader sees samples for this instance. My expectation was that since the instance in question has already been disposed and unregistered, the DataWriter should have purged all information on it as soon as all known reliable DataReaders acknowledged the receipt of all instance samples (and they did acknowledge). Therefore, to a late-joining DataReader it should look as if that instance never existed in the first place, so it should not receive any samples for this instance.

What am I understanding wrong here? Do I need to configure my QoS differently?

This is with v5.2.0.

Thanks for your help!

Gerardo Pardo
Last seen: 3 weeks 6 days ago
Joined: 06/02/2010
Posts: 601

Hi,

I would not expect a late-joiner DataReader to get data on an instance that has already been unregistered (whether it was disposed or not).

The DataWriter should remove the sample and instance information from its cache as soon as all the matched, reliable, and active DataReaders acknowledge it. I tried to explain that mechanism in my answer to your later question in the forum.

The exception to the above would be if the DataWriter had to keep the unregistered instance in its cache because there is at least one matched, reliable, and active DataReader that has not acknowledged it. In that case the sample and instance would remain in the DataWriter cache. When the late joiner comes up, the DataWriter does not check whether the reader joined before or after the instance was unregistered (doing so takes extra effort and would still have race conditions based on discovery time versus start time, etc.). The DataWriter simply sends whatever is still in its cache to the newly discovered DataReader(s).
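The retention rule described above can be sketched with a small toy model. This is only an illustration written in plain Python, not the RTI Connext API: the `ToyWriter` class, its method names, and the reader names are all hypothetical, and the model assumes every matched reader is reliable and active. It shows that a late joiner sees nothing when all earlier readers acknowledged the unregistered instance, and sees everything still cached when one of them did not.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWriter:
    """Toy model of a reliable, TransientLocal, KeepAll writer cache (not the RTI API)."""
    cache: list = field(default_factory=list)   # samples as (seq, instance, kind) tuples
    acked: dict = field(default_factory=dict)   # reader name -> highest acknowledged seq
    seq: int = 0

    def match(self, reader):
        """A newly discovered reader is simply sent whatever is still cached."""
        self.acked[reader] = 0
        return list(self.cache)

    def publish(self, instance, kind):
        """kind is 'write', 'dispose' or 'unregister'."""
        self.seq += 1
        self.cache.append((self.seq, instance, kind))

    def ack(self, reader, seq):
        """Record an acknowledgment and purge whatever became removable."""
        self.acked[reader] = max(self.acked[reader], seq)
        self._purge()

    def _purge(self):
        """Drop an unregistered instance once every matched reader acked all its samples."""
        low = min(self.acked.values())
        unregistered = {i for (_, i, k) in self.cache if k == "unregister"}
        fully_acked = {
            i for i in unregistered
            if all(s <= low for (s, j, _) in self.cache if j == i)
        }
        self.cache = [c for c in self.cache if c[1] not in fully_acked]


def late_joiner_view(slow_reader_acks):
    """Return what a late joiner receives, depending on whether reader r2 ever acks."""
    w = ToyWriter()
    w.match("r1")
    w.match("r2")
    for kind in ("write", "dispose", "unregister"):
        w.publish("A", kind)
    w.ack("r1", 3)
    if slow_reader_acks:
        w.ack("r2", 3)
    return w.match("late")
```

Calling `late_joiner_view(True)` models the case where both early readers acknowledged everything, and `late_joiner_view(False)` models a reader that never finished acknowledging, for example because of a flaky link. The model deliberately does not check when the late joiner appeared relative to the unregister, mirroring the point made above.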

Could that explain what you are seeing?

Gerardo

Last seen: 5 years 6 months ago
Joined: 10/29/2015
Posts: 12

Gerardo,

Regarding my other post: when I wrote this one, my autopurge_unregistered_instances_delay was infinite, so the DataWriter would probably still have had everything in its cache even after disposing and unregistering.
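As a rough illustration of why an infinite delay would leave everything in the cache, here is a toy rule in plain Python. It is not the RTI API: `still_cached`, its parameters, and the `INFINITE` constant are hypothetical names, and the rule itself (an unregistered instance is dropped only once the delay has elapsed and all readers have acknowledged it) is a simplifying assumption; the real interaction with acknowledgments and resource limits is more nuanced.

```python
import math

INFINITE = math.inf  # stands in for an infinite duration; not an RTI constant


def still_cached(now, unregistered_at, purge_delay, fully_acked):
    """Toy rule (an assumption): keep the instance unless the delay elapsed AND it is fully acked."""
    expired = (now - unregistered_at) >= purge_delay
    return not (expired and fully_acked)
```

Under this assumption, `still_cached(1000.0, 0.0, INFINITE, True)` stays true no matter how much time passes, whereas a delay of zero lets the instance go as soon as it is fully acknowledged.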

"The exception to the above would be if the DataWriter had to keep the unregistered instance in its cache because there is at least one matched, reliable, and active DataReader that has not acknowledged it."

You may be on to something here: I know that (1) there were some intermittent WLAN connectivity issues during my tests; and (2) there was the occasional DataReader coming and going again (processes starting, stopping, and restarting). Maybe the result was that something didn't quite clean up right. Or, much simpler, there was always a DataReader somewhere that was still busy acknowledging data, and thus the items in the cache ended up being kept.

"The DataWriter simply sends whatever is still in its cache to the newly discovered DataReader(s)."

That would explain quite well what I was seeing.

Unfortunately, I have not yet been able to reproduce the behavior systematically and deterministically, so it definitely seems like there were multiple factors at play. I will keep an eye on it.

Thanks again,

Jan