Discovery Issue (again) : )

Joined: 08/28/2013
Posts: 66
Hi,

I see different discovery behaviour between two hardware boards, even though both use the same QoS and the same binaries.

It seems a "link" is kept in memory between HW A and HW B: when HW A reboots, discovery fails about one time in two.

LOGS on HW B:

TID[2152994196]RTPS bind: fwd: src: 0/0.0.0.c2000100 ==> dst 0/0.8888885.deadc0de.c7000100
TID[2152994196]UDP: bind_external: 17176/c0a8010a.0.0.0 ==> dst 0/0.8888885.deadc0de.0
TID[2152994196]UDP: bind_external succeeded: 17176/c0a8010a.0.0.0 ==> dst 0/0.8888885.deadc0de.0 (count = 2)


LOGS on HW A:
***** In this case, there is no discovery
TID[2152516772]UDP: created port thread: 1/17151/efff0001
TID[2152516772]RTPS bind: fwd: src: 0/0.0.0.c2000100 ==> dst 0/0.8888885.deadc0de.c7000100
TID[2152516772]UDP: bind_external: 17150/efff0001.0.0.0 ==> dst 0/0.8888885.deadc0de.0
TID[2152516772]UDP: bind_external succeeded: 17150/efff0001.0.0.0 ==> dst 0/0.8888885.deadc0de.0 (count = 1)
0x804ccca4 (PM_PIL): DDS DataReader of Topic ACT_ST lost a DataWriter.Nb of alive DataReader:1


***** In this case, there is discovery
TID[2152516772]UDP: created port thread: 1/17151/efff0001
TID[2152516772]RTPS bind: fwd: src: 0/0.0.0.c2000100 ==> dst 0/0.4444444.deadc0de.c7000100
TID[2152516772]UDP: bind_external: 17150/efff0001.0.0.0 ==> dst 0/0.4444444.deadc0de.0
TID[2152516772]UDP: bind_external succeeded: 17150/efff0001.0.0.0 ==> dst 0/0.4444444.deadc0de.0 (count = 1)
0x804ccca4 (PM_PIL): DDS DataReader of Topic ACT_ST lost a DataWriter.Nb of alive DataReader:1

So dst 0/0.4444444 seems to work, but dst 0/0.8888885 does not.

Is it because HW B already has "bind_external succeeded: 17176/c0a8010a.0.0.0 ==> dst 0/0.8888885"?

Two questions:

1/ Is it linked to the known issue on VxWorks where participant_id does not change at boot time?
(In that case, the problem comes from rand(), which always gives the same ID.)
And why does it work one time out of two?

2/ What should I do if this is not linked to 1/?

Regards,

Rodolf

rip
Joined: 04/06/2012
Posts: 324

Hi Rodolf,

Yes, this is because you're using a deterministic operating system (VxWorks). This is a known problem, see http://community.rti.com/search/site/participant%20ID%20vxworks.
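
To illustrate the determinism point: in standard C, calling rand() without a prior srand() behaves as if srand(1) had been called, so a boot that never seeds the generator always draws the same first value. The sketch below is plain C, not RTI API. The function names and the boot_entropy parameter are hypothetical, and where a per-boot value would come from on a real board (RTC, hardware counter, persistent boot counter) is an assumption.

```c
#include <stdlib.h>

/* Hypothetical helper: models "the first random ID drawn after a boot".
 * Reseeding models a reboot that restarts the generator from a given state. */
static int simulate_boot_id(unsigned int seed)
{
    srand(seed);   /* same seed => same sequence, per the C standard */
    return rand(); /* first value drawn after the simulated boot */
}

/* Boot that never calls srand(): the C standard specifies that rand()
 * then behaves as if srand(1) had been called, so the ID never changes. */
static int unseeded_boot_id(void)
{
    return simulate_boot_id(1u);
}

/* Boot that seeds from a value assumed to differ on every boot.
 * boot_entropy is a placeholder; its source is board specific. */
static int seeded_boot_id(unsigned int boot_entropy)
{
    return simulate_boot_id(boot_entropy);
}
```

Two calls to unseeded_boot_id() always return the same value, which is the "same ID after every reboot" situation. seeded_boot_id() only helps if the value passed in really changes between boots.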

The "sometimes it works" part has to do with timeouts. When the peer has had a chance to time out the known object and mark it as stale, it will reconnect when you reboot the VxWorks board. This is covered in more depth in the search results.
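
The timeout effect can be sketched with a toy model. This is not how RTI Connext implements discovery internally. It only assumes that the peer keeps one record per remote participant, keyed by an identifier, and purges it after a lease period with no announcements. All names here (peer_record, on_announcement, lease_s) are hypothetical.

```c
#include <stdbool.h>

/* Toy record that the peer (HW B) keeps for one remote participant. */
typedef struct {
    unsigned int guid;   /* identity announced by the remote side */
    long last_seen_s;    /* time of the last announcement, in seconds */
    bool known;          /* false once the entry has been purged */
} peer_record;

/* Purge the record if no announcement arrived within the lease. */
static void expire_if_stale(peer_record *r, long now_s, long lease_s)
{
    if (r->known && now_s - r->last_seen_s > lease_s) {
        r->known = false;
    }
}

/* Returns true when the announcement is handled as a NEW participant
 * (fresh discovery), false when it matches an entry that is still held. */
static bool on_announcement(peer_record *r, unsigned int guid,
                            long now_s, long lease_s)
{
    expire_if_stale(r, now_s, lease_s);
    bool is_new = !r->known || r->guid != guid;
    r->guid = guid;
    r->last_seen_s = now_s;
    r->known = true;
    return is_new;
}
```

In this model, a board that reboots and announces the same identifier before the lease expires is not treated as new, while the same announcement after the lease has expired, or one carrying a different identifier, triggers a fresh discovery.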


Regards,
Rip

Joined: 08/28/2013
Posts: 66

Sorry for the late reply, and thanks.

I am now trying to keep my participant alive and to destroy it as rarely as possible.

Rodolf