Unable to discover the reader in my app [solved]

Last seen: 4 years 10 months ago
Joined: 05/30/2016
Posts: 16
Hi all.

I’ve been developing a set of apps: one server and three different clients. Everything works fine when they all run on the same machine. I have now moved one of the clients to another machine, B, connected to the same router, and the first message from this client (a login request) is never received by the server on machine A. The on_publication_matched method in the client is never executed, but in the Administration Console the client changes from Healthy to "Warning: Reader-only Topic. Topic contains only DataReaders" when the message is sent.

The Shapes Demo works fine on both machines; each is able to send and receive shapes.

I tried creating the DataWriter in the client with an explicit DataWriterQos, like this:

// Start from the publisher's default DataWriter QoS
DDS.DataWriterQos datawriter_qos = new DDS.DataWriterQos();
publisher.get_default_datawriter_qos(datawriter_qos);

// Set a history depth of 1 and request reliable delivery
datawriter_qos.history.depth = 1;
datawriter_qos.reliability.kind = DDS.ReliabilityQosPolicyKind.RELIABLE_RELIABILITY_QOS;

I did the equivalent for the DataReaderQos in the server. (However, I create the publisher with DDS.DomainParticipant.PUBLISHER_QOS_DEFAULT, and I pass DDS.DomainParticipant.TOPIC_QOS_DEFAULT to create_topic and DDS.Subscriber.DATAREADER_QOS_DEFAULT to create_datareader. I'm not sure whether that is OK.)

Either way, it did not work. Then I left everything at _QOS_DEFAULT and placed the USER_QOS_PROFILES.xml provided in the hello_world_request_reply example in the same folder as the server and client executables. No luck, and no errors either.

I tried to use the Administration Console to analyze the problem and found something strange. In my scenario I need to run the server and three different clients. If I run all of them on the same machine, two of the clients log in with no problem. The third one only logs in if the Administration Console is not running (I checked this several times). It creates its writer in the same way as the other clients.

I haven’t set NDDS_DISCOVERY_PEERS because I thought there was no need, considering that the Shapes Demo works, and I’m not sure how to set it when both computers are connected to the same router.

Any idea what the problem could be?

Update: I also tried:

- this QoS file: https://github.com/rticommunity/rticonnextdds-examples/blob/master/examples/using_qos_profiles/cs/USER_QOS_PROFILES.xml

- setting the NDDS_DISCOVERY_PEERS environment variable to: localhost,192.168.1.101

Last seen: 9 months 2 days ago
Joined: 02/11/2016
Posts: 144

Hey,

First, to verify:

When using Shapes Demo, are you able to publish shapes on machine B and receive them on machine A?

If not, you are experiencing some communication issues (due to multicast not working, or something more severe).

If you are able to send shapes between the machines, you may have mismatching types or some weird QoS issues (try using the built-in QoS in both apps).

Good luck,

Roy.

Last seen: 4 years 10 months ago
Joined: 05/30/2016
Posts: 16

Hi Roy,

Thanks for your interest. Yes, the Shapes Demo works in both directions, so it must be something in the configuration of my apps.

I'm quite sure it has nothing to do with the type. I can see it in the Administration Console and it is the same for both of them, as is the topic name.
When there is a type mismatch, the log clearly shows that kind of error (I've been there) and the process doesn't appear as Healthy, but that is not the case here.
Both are Healthy at the beginning. Remember also that everything works fine when all of them are on the same machine.

My last attempt with the QoS was the file from the link above. I chose it for the RELIABLE_RELIABILITY_QOS, because I thought the problem might affect only the first samples.

By the way, should the NDDS_DISCOVERY_PEERS file have an extension? I tried it without one. I also tried setting the environment variable.
On A I set: localhost, <IP_of_B> and on B: localhost, <IP_of_A>

Another experiment was to send some samples before, or in addition to, waiting for on_publication_matched. But I can't even see them in the Administration Console.

I took a look at the code of the Shapes Demo for Android (https://github.com/rticommunity/rtishapesdemo-android). The publisher, subscriber, topics, etc. are created with XXX_QOS_DEFAULT and there is no USER_QOS_PROFILES.xml file. That is exactly how I had everything initially.

So I don't know what else I could try. It seems quite clear that the client writer cannot discover the server reader (I launch the server first), but I don't know whether the problem is in the client, the server, or both.

Last seen: 9 months 2 days ago
Joined: 02/11/2016
Posts: 144

Maybe you can try to use an example code and see if that works?

If it does, try to modify it slowly and see when it breaks (does it break when you change your QoS? Does it break when you change the type?).

The difference I can guess at between the two scenarios is the transport.

On the same machine, the apps can use shared memory to discover each other and to communicate; on different machines you would normally use UDP.

In this scenario it's possible for either your discovery messages or your user messages to be "unfit" for the network definitions:

1. Is multicast enabled?

2. If not, are you using initial peers correctly?

3. If you are, are your discovery messages "too big" to be sent using the default QoS settings?

4. If they aren't, are they "too big" to be sent properly over your network (an MTU that is too small, combined with something that cannot fragment big messages)?
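
For point 1, one quick local check is whether the network interfaces on each machine report multicast support at all. The following is only a minimal sketch, assuming the standard .NET System.Net.NetworkInformation classes are available; the class name MulticastInterfaceCheck is made up for illustration and nothing here is part of RTI Connext. It also says nothing about whether the router or a firewall actually passes multicast between the two machines.

```csharp
using System;
using System.Net.NetworkInformation;

// Hypothetical helper, not part of RTI Connext: lists the interfaces that are up
// and whether each one reports multicast support.
class MulticastInterfaceCheck
{
    static void Main()
    {
        foreach (NetworkInterface nic in NetworkInterface.GetAllNetworkInterfaces())
        {
            // Skip interfaces that are down; they cannot carry discovery traffic.
            if (nic.OperationalStatus != OperationalStatus.Up)
                continue;

            Console.WriteLine("{0} ({1}): multicast supported = {2}",
                nic.Name, nic.NetworkInterfaceType, nic.SupportsMulticast);

            // Print the addresses bound to this interface, to compare against
            // the addresses used in the discovery peers list.
            foreach (UnicastIPAddressInformation addr in nic.GetIPProperties().UnicastAddresses)
                Console.WriteLine("    address: {0}", addr.Address);
        }
    }
}
```

Running it on both machines shows at a glance which interface and address each one would be using on the shared network.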

There's very little to go on, but since Admin Console seems not to recognize the writer (presumably when run on the reader's side), I'm guessing discovery isn't working properly.

Since you are able to use Shapes Demo across the two machines, I'm guessing it has to do with your type and/or QoS.

Last seen: 4 years 10 months ago
Joined: 05/30/2016
Posts: 16

Thanks for the suggestions, KickR. I made some progress. I set the NDDS_DISCOVERY_PEERS environment variable to 239.255.0.1 (multicast) on both machines. Now the client discovers the server topic and sends the message. The message appears in the Administration Console on the client machine, but not in the one on the server machine (and the server's on_data_available is not executed either).

Following https://community.rti.com/kb/how-do-i-get-data-reader-receive-data-over-multicast, I also tried this in the server:

DDS.DataReaderQos datareader_qos = new DDS.DataReaderQos();
_subscriber.get_default_datareader_qos(datareader_qos);

DDS.TransportMulticastSettings_t tms = new DDS.TransportMulticastSettings_t();
datareader_qos.multicast.value = new DDS.TransportMulticastSettingsSeq(1);
datareader_qos.multicast.value.length = 1;
tms.receive_address = "192.168.1.101";  // client's IP
datareader_qos.multicast.value.set_at(0, tms);

_reader = (CommandMsgDataReader)_subscriber.create_datareader(_commandsReaderTopic, datareader_qos, this, DDS.StatusMask.STATUS_MASK_ALL);

But it made no difference.

Last seen: 4 years 10 months ago
Joined: 05/30/2016
Posts: 16

Got it!!!

I tried so many things that I'm not sure which of them were key to solving it, but in case it helps someone, my final steps were the following.

I found this: https://community.rti.com/kb/why-are-my-reader-and-writer-applications-unable-communicate

"Generally speaking, if a regular 'ping' does not work in both directions, RTI Connext traffic will not be able to get through either." I knew that the ping from the client worked, and I was assuming that the opposite was true too. That 'both directions' led me to try the ping from the server, only to discover that it was not working. Everything worked once I disabled both firewalls.
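
In case it is useful, the same 'both directions' check can be done from code instead of the command line. This is just a rough sketch, assuming the standard .NET System.Net.NetworkInformation.Ping class; the class name PeerPingCheck and the 2000 ms timeout are arbitrary choices for illustration, not anything from RTI. A successful ping only shows that ICMP gets through, so a firewall could still be blocking the UDP traffic that Connext uses.

```csharp
using System;
using System.Net.NetworkInformation;

// Hypothetical helper, not part of RTI Connext: pings the other machine once.
// Run it on A with B's address, and on B with A's address.
class PeerPingCheck
{
    static int Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: PeerPingCheck <address-of-other-machine>");
            return 2;
        }

        using (var ping = new Ping())
        {
            try
            {
                // Send one ICMP echo request and wait up to 2000 ms for the reply.
                PingReply reply = ping.Send(args[0], 2000);
                if (reply.Status == IPStatus.Success)
                {
                    Console.WriteLine("{0} answered in {1} ms", args[0], reply.RoundtripTime);
                    return 0;
                }

                Console.WriteLine("{0} did not answer: {1}", args[0], reply.Status);
                return 1;
            }
            catch (PingException e)
            {
                // Thrown, for example, when the host name cannot be resolved.
                Console.WriteLine("ping to {0} failed: {1}", args[0], e.Message);
                return 1;
            }
        }
    }
}
```

If it succeeds on one machine and fails on the other, that matches what I saw: one direction was being blocked.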

Thanks to the people who suggested fixes, I've learnt a bit more about how all these configuration issues