Apparent random publisher behaviour

Offline
Last seen: 4 years 4 months ago
Joined: 05/30/2016
Posts: 16
Hi all. I would like to know if the following problem has a solution.

Let's say I have a couple of clients, A and B (in the future, several more). The first thing each client does is log in: it sends the user data and receives an allowed/not allowed reply.

I launched both applications ten times, restarting the service each time. In the first five runs I pressed the submit button in application A first; in the other five, B sent first.

Six times out of ten everything worked like a charm: each client received its reply from the server. In the other four runs, the client that sent second (always the second, no matter whether it was A or B) never received the answer from the server.

I checked that the server always receives the login data that is submitted second.

Then I saw this thread https://community.rti.com/kb/why-does-my-dds-datareader-miss-first-few-samples and thought: well, if it's just a matter of giving the discovery process time, I'll set a timer that resubmits the message, say once a second, until on_data_available is executed on the second client. But the timer keeps sending the message again and again, the server receives it every time, and the client's on_data_available is still never executed.

A and B are different clients, not two instances of the same application. They just share the login service of the server.

The rate is not always 6-4. For instance, in another set of ten runs all of them worked perfectly. So I would like to know why the timer approach is not working and what I should do instead in those failing situations.

Thanks in advance.

Offline
Last seen: 3 months 5 days ago
Joined: 02/11/2016
Posts: 144

Hey,

 

First, I would recommend not using a timer!

If you're using a listener (an OK decision), you can use the on_subscription_matched callback so that the client doesn't send their message before having x matches (however many you need; in the example above I guess 1 is enough).

As for why the answer doesn't reach the second client, or isn't sent at all: it sounds like a QoS problem.

Maybe you're using a profile that doesn't fit what you need. Which QoS profile are you using?

 

Good luck,

Roy.

Offline
Last seen: 4 years 4 months ago
Joined: 05/30/2016
Posts: 16

Thanks for the suggestion, but I'm not sure I get the idea; I'm still a beginner with all this. I've been searching for examples that use on_subscription_matched, but the ones I found only print a message when it is called. If I understood correctly:

1) we have a DataReaderListener that creates a data reader with the topic name "my_name"

2) a publisher out there creates a data writer with a topic having the name "my_name" too

3) the on_subscription_matched from the first DataReaderListener is triggered

 

But what I do is the following:

1) the server creates a data reader with a topic of name "client-to-server requests"

2) client (all clients) creates a data writer with a topic of name "client-to-server requests" too

3) user submits the login data

4) the client creates a data reader with a topic of name username + " topic"

5) the client sends the login data to the server via the "client-to-server requests" topic

6) the server receives that message and uses the username to create a data writer with a topic of name username + " topic"

7) the server sends the login request reply using that data writer

 

So in this scenario I don't understand which matches you mean when you say "so that the client doesn't send their message before having x matches".

As for the QoS, everything is set to x_QOS_DEFAULT.

Offline
Last seen: 3 months 5 days ago
Joined: 02/11/2016
Posts: 144

Hey,

What I mean is this:

Because discovery takes more than 0 seconds, samples that you send as soon as you have created a writer may never reach the intended readers.

You could simply resubmit those samples until you get confirmation that they were received (via the client reader), or you could do something smarter:

1. create a data writer listener (https://community.rti.com/rti-doc/510/ndds/doc/html/api_dotnet/classDDS_1_1DataWriterListener.html) and pass it to the client login writer

2. make sure that this listener notifies you when a match is made

3. use that notification to decide when to send the login data (if you expect exactly one match, because exactly one reader should match the writer, verify that; if you expect x matches, verify that)

 

You can generally use this scheme everywhere to avoid sending data when no one is listening.

This makes sense for any data writer that doesn't use a durable QoS (that is, one that doesn't send historical data to new readers).
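
For anyone following along, the waiting part of steps 1 to 3 could look roughly like the sketch below. It is deliberately library-free so it can be compiled on its own: MatchGate is a hypothetical helper, not part of the RTI API, and it assumes the listener's on_publication_matched callback forwards status.current_count to Update. The sending thread blocks on a monitor instead of polling with Thread.Sleep.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not an RTI class): lets a sender wait until the
// writer has reported at least `required` matched readers.
public sealed class MatchGate
{
    private readonly object sync = new object();
    private readonly int required;
    private int current;

    public MatchGate(int required)
    {
        if (required < 1) throw new ArgumentOutOfRangeException("required");
        this.required = required;
    }

    // Assumed to be called from on_publication_matched with
    // status.current_count. The count can also go down when a reader
    // disappears, so the latest value is stored rather than a one-way flag.
    public void Update(int currentCount)
    {
        lock (sync)
        {
            current = currentCount;
            Monitor.PulseAll(sync); // wake every waiting sender
        }
    }

    // Returns true once enough readers are matched, false if the timeout
    // elapses first.
    public bool WaitForMatches(TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        lock (sync)
        {
            while (current < required)
            {
                TimeSpan remaining = deadline - DateTime.UtcNow;
                if (remaining <= TimeSpan.Zero || !Monitor.Wait(sync, remaining))
                {
                    return current >= required;
                }
            }
            return true;
        }
    }
}
```

With one gate per data writer, the sender would call WaitForMatches before write and treat a false return as "no reader yet" rather than writing anyway. Keep in mind this only covers the outgoing direction: it says nothing about whether the reply writer on the other side has matched your reader yet.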

 

I definitely get that some of these concepts are hard to grasp at first (I sure had my own problems with them when I got started) but I hope it'll get clearer for you!

Good luck,

Roy.

Offline
Last seen: 4 years 4 months ago
Joined: 05/30/2016
Posts: 16

Thanks for putting me on the right track. This is what I did, in case it's useful to someone:

public class MyClassPublisher : DDS.DataWriterListener {

    ...

    // Wait until on_publication_matched has fired at least once, then send.
    while (!this.discovered) { Thread.Sleep(500); }
    myDataWriter.write(msg, ref DDS.InstanceHandle_t.HANDLE_NIL);

    ...

    /* This callback is called when the DDS::DataWriter has found a DDS::DataReader that matches the DDS::Topic, has a common partition and compatible QoS,
     * or has ceased to be matched with a DDS::DataReader that was previously considered to be matched. */
    public override void on_publication_matched(DDS.DataWriter writer, ref DDS.PublicationMatchedStatus status) {
        this.discovered = true;
    }
}

 
Offline
Last seen: 3 months 5 days ago
Joined: 02/11/2016
Posts: 144

Hey,

Two things to note:

1. You can use a shorter sleep, even as short as 1 ms, and still avoid busy waiting.

2. You may want to check the status received in on_publication_matched (for example, whether current_count is greater than some x).

 

Good luck,

Roy.

Offline
Last seen: 4 years 4 months ago
Joined: 05/30/2016
Posts: 16

OK, false positive. Still the same problem: A sends to B, the message reaches B, B replies, but A only sees the answer 'sometimes'. I'm sending just a few messages, but they are used to communicate orders, so losing one of them could be a problem.

About on_publication_matched, the reference says: "This callback is called when the DDS::DataWriter has found a DDS::DataReader that matches the DDS::Topic, has a common partition and compatible QoS, or has ceased to be matched with a DDS::DataReader that was previously considered to be matched." I was not sure whether this method is called every time you write or just the first time you write with a particular data writer. Judging from a print, it seems to be called just the first time.

Then, to check whether at least one reader is listening to a particular writer by looking at current_count > 0, as you suggested, I want to do something like this:

public override void on_publication_matched(DDS.DataWriter writer, ref DDS.PublicationMatchedStatus status) {
      System.Diagnostics.Debug.WriteLine("Publisher - on_publication_matched " + writer.get_topic().???? + " status.current_count:" + status.current_count);
      this.discovered = true;
}

The thing now is that, looking at the reference, I can't find a way to get the topic name of that writer, so that I can tell which writer has seen a reader and hence can already send messages.

Edit: New attempt. It seems to work better: 50 messages sent, just one message lost.

public CommandMsgDataWriter writer_discovered;

// Remembers only the most recently matched writer.
public override void on_publication_matched(DDS.DataWriter writer, ref DDS.PublicationMatchedStatus status) {
    this.writer_discovered = (CommandMsgDataWriter)writer;
}

// Blocks until the writer for this pilot is the most recently matched one, then sends.
public void write(CommandMsg c) {
    while (this.writer_discovered != datawriters[c.pilot]) { Thread.Sleep(1); }
    datawriters[c.pilot].write(c, ref DDS.InstanceHandle_t.HANDLE_NIL);
}

 

Offline
Last seen: 3 months 5 days ago
Joined: 02/11/2016
Posts: 144

Hey,

To guarantee delivery I would recommend not using these options but instead using the reliability QoS (for messages sent after a match is made) and the durability QoS (for messages sent before a match was made).
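
As a rough illustration, a QoS profile along these lines is one way to express that. It is a sketch, not a tested configuration: the library and profile names are made up, and how the file is loaded and applied depends on your setup. To my knowledge the default DataReader is best-effort and the default durability is volatile, so both the writer and the reader side need the settings for them to take effect.

```xml
<dds>
  <!-- "LoginQosLibrary" and "ReliableLateJoiner" are hypothetical names -->
  <qos_library name="LoginQosLibrary">
    <qos_profile name="ReliableLateJoiner">
      <datawriter_qos>
        <!-- Repair lost samples for readers that are already matched -->
        <reliability>
          <kind>RELIABLE_RELIABILITY_QOS</kind>
        </reliability>
        <!-- Keep sent samples so readers that match later still get them -->
        <durability>
          <kind>TRANSIENT_LOCAL_DURABILITY_QOS</kind>
        </durability>
        <!-- Keep every sample rather than only the most recent one -->
        <history>
          <kind>KEEP_ALL_HISTORY_QOS</kind>
        </history>
      </datawriter_qos>
      <datareader_qos>
        <reliability>
          <kind>RELIABLE_RELIABILITY_QOS</kind>
        </reliability>
        <durability>
          <kind>TRANSIENT_LOCAL_DURABILITY_QOS</kind>
        </durability>
        <history>
          <kind>KEEP_ALL_HISTORY_QOS</kind>
        </history>
      </datareader_qos>
    </qos_profile>
  </qos_library>
</dds>
```

With KEEP_ALL history, how many samples are actually retained is bounded by the resource limits, so for long-running writers those are worth checking too.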

Hope this will point you in the right direction,

Roy.