Can samples written in on_publication_matched() be delivered to the readers reliably?

Offline
Last seen: 5 months 3 weeks ago
Joined: 10/09/2022
Posts: 13

Consider the following scenario:
One writer with QoS: KEEP_ALL_HISTORY_QOS + RELIABLE_RELIABILITY_QOS + VOLATILE_DURABILITY_QOS.
Two readers with QoS: KEEP_ALL_HISTORY_QOS + RELIABLE_RELIABILITY_QOS + VOLATILE_DURABILITY_QOS.
It seems that samples written in the on_publication_matched() callback are not delivered to the readers reliably.

void on_publication_matched(DataWriter *writer, const PublicationMatchedStatus &info) {
    // `sample` is declared elsewhere (not shown in this post).
    if (info.current_count_change >= 1) { // a new reader has just been matched
        writer->write(&sample);
    }
}

The logs show the following sequence of events:

  1. The writer discovers a reader and invokes the on_publication_matched() callback.
  2. The writer writes a sample inside the on_publication_matched() callback.
  3. The reader discovers the writer and invokes the on_subscription_matched() callback.
  4. Sometimes one reader receives the sample, but the other one does not!

Howard
Offline
Last seen: 15 hours 8 min ago
Joined: 11/29/2012
Posts: 565

If you need both DataReaders to receive the data, you need to wait until the DataWriter has matched both DataReaders before sending the sample. Your logic sends the sample as soon as one DataReader is matched.
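As a rough illustration of "wait until both are matched", here is a minimal sketch of a callback that writes only once the matched-reader count reaches an expected value. The types below are hypothetical stand-ins so the snippet compiles on its own: `PublicationMatchedStatus` models only the `current_count` and `current_count_change` fields, and `FakeWriter`, `WaitForAllReadersListener`, `expected_readers` and the sample value 42 are invented for this example, not part of any DDS API.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the DDS status struct; only the two fields used here are modeled.
struct PublicationMatchedStatus {
    int32_t current_count;         // readers currently matched to the writer
    int32_t current_count_change;  // change in that count since the last callback
};

// Hypothetical stand-in for a DataWriter: it just records what was written.
struct FakeWriter {
    std::vector<int> written;
    void write(const int* sample) { written.push_back(*sample); }
};

// Hypothetical listener: writes once, and only after `expected_readers`
// readers are matched at the same time.
class WaitForAllReadersListener {
public:
    explicit WaitForAllReadersListener(int32_t expected_readers)
        : expected_readers_(expected_readers) {}

    void on_publication_matched(FakeWriter* writer,
                                const PublicationMatchedStatus& info) {
        // Use the absolute count, not the change, to decide when to send.
        if (!sent_ && info.current_count >= expected_readers_) {
            writer->write(&sample_);
            sent_ = true;
        }
    }

private:
    int32_t expected_readers_;
    bool sent_ = false;
    int sample_ = 42;  // placeholder payload for the example
};
```

With `expected_readers` set to 2, the first match notification (count 1) sends nothing and the second (count 2) sends the sample exactly once. The trade-off is that the application has to know in advance how many readers to expect.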

If you actually want to avoid this race condition, I would suggest using a non-VOLATILE Durability QoS (e.g., TRANSIENT_LOCAL). Then it doesn't matter when a DataReader is discovered: data that was sent before the discovery will automatically be sent to that DataReader.
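To make the difference concrete, here is a toy model (not a DDS API) of how a late-joining reader is treated under VOLATILE versus TRANSIENT_LOCAL durability. `ToyWriter`, `ToyReader` and the `Durability` enum are invented for this sketch. It assumes an unbounded writer history (as with KEEP_ALL) and that history is replayed only when the reader is also non-VOLATILE, which is how I understand the durability contract to work.

```cpp
#include <vector>

enum class Durability { Volatile, TransientLocal };

// Toy reader: just collects whatever is delivered to it.
struct ToyReader {
    Durability durability;
    std::vector<int> received;
};

// Toy writer: models late-joiner delivery only, nothing else about DDS.
class ToyWriter {
public:
    explicit ToyWriter(Durability d) : durability_(d) {}

    // Deliver to every reader matched so far and keep the sample in history.
    void write(int sample) {
        history_.push_back(sample);
        for (ToyReader* r : matched_) {
            r->received.push_back(sample);
        }
    }

    // Match a reader; replay history only if both sides are TRANSIENT_LOCAL.
    void match(ToyReader* reader) {
        if (durability_ == Durability::TransientLocal &&
            reader->durability == Durability::TransientLocal) {
            for (int s : history_) {
                reader->received.push_back(s);
            }
        }
        matched_.push_back(reader);
    }

private:
    Durability durability_;
    std::vector<int> history_;
    std::vector<ToyReader*> matched_;
};
```

In this model, a sample written between the first and second `match()` call reaches the second reader only in the TRANSIENT_LOCAL case. With VOLATILE on either side, the late reader sees nothing that was written before it was matched.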

Offline
Last seen: 5 months 3 weeks ago
Joined: 10/09/2022
Posts: 13

Thanks for the kind reply. After several tests with one reader and one writer using VOLATILE durability QoS, the sample written in on_publication_matched() can still be lost.

Could the sample be lost because the writer has discovered the reader and sent the sample, but the reader has not yet discovered the writer?

Howard
Offline
Last seen: 15 hours 8 min ago
Joined: 11/29/2012
Posts: 565

I could not reproduce the problem.

I started my DataWriter and DataReader apps in different orders (writer first then reader, reader first then writer) and have not been able to reproduce your problem of missing samples. In all test runs, the first sample the reader receives is the one sent in on_publication_matched(), even if other samples are being sent before the reader is discovered.

Even if the reader app receives the data sent in on_publication_matched() and drops it because it hasn't yet discovered the writer, DDS will resend the data sample because the RELIABLE protocol is used.

So, once the Writer has discovered the Reader, all data sent from that point onwards should be received over a RELIABLE connection.

Thus, I suspect that one or more of your applications is not creating the DataWriter or DataReader with RELIABLE reliability and/or KEEP_ALL history.

When I set the connection to be BEST_EFFORT, I could reproduce the loss of the sample sent in on_publication_matched() by starting the DataReader app after the DataWriter app.

How are you setting the QoS? In code? In XML? If XML, can you verify that the XML file is actually being used by your application? A quick way to check is to introduce a syntax error, such as an unknown tag like <aaa>; the app should then fail to parse the XML file.

You can use RTI Admin Console to see if the QoS settings of your DataWriter and DataReader are as expected.

Offline
Last seen: 5 months 3 weeks ago
Joined: 10/09/2022
Posts: 13

After more testing, I found that this situation only occurs with Fast-DDS. Thanks for the reply!