We have 7 processes running on a single computer. Each process creates 3 domain participants, with domain IDs 0, 1, and 2. The whole system has fewer than 80 DataReaders and DataWriters combined.
The symptom is that some DataReaders/DataWriters from process number 6 or 7 (the processes that were launched later) are not discovered by rtiddsspy or by the other processes. There are no error messages from RTI DDS, and the DataReader/DataWriter pointers returned from publisher->create_writer are non-null. The application can still send data using those "problem" DataWriters and gets a return code of OK.
QoS settings we modified: no multicast in discovery, an initial peers list that uses loopback 127.0.0.1, and the shared memory transport turned off.
Question: What is preventing some of the later-created DataWriters/DataReaders from working (meaning being discovered and sending/receiving data)? Is some resource shared by all DataReaders/DataWriters being used up?
Since then:
1) Tried a decreased discovery_config.participant_liveness_assert_period; it still shows the same symptoms.
2) Reordered the launching of the processes; it is always the later processes that have the "problem" DataReaders/DataWriters.
3) Installed a DataReaderListener on a problem DataReader; neither subscription_matched nor any of the other callbacks was called.
Edit: using RTI DDS 5.0.0
Solved:
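In our QoS XML, the initial peers list used the loopback peer builtin.udpv4://127.0.0.1 without a maximum participant ID prefix. Per the peer descriptor address format, a peer without that prefix defaults to contacting participant IDs 0 to 4 only. With 7 processes each creating one participant per domain, every domain has 7 participants on the host, so the participants from the last two processes (presumably IDs 5 and 6) were never contacted. Specifying a larger limit in the peer descriptor, for example 6@builtin.udpv4://127.0.0.1, covers all of them.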
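To make the arithmetic concrete, here is a small Python sketch (not RTI code) of which participants fall outside the default range of participant IDs 0 to 4. It assumes participant IDs are handed out as 0, 1, 2, ... per domain in launch order, and it assumes the standard RTPS well-known port mapping (port base 7400, domain ID gain 250, participant ID gain 2, unicast discovery offset 10); the function names are made up for illustration, and the constants should be checked against your own rtps_well_known_ports settings.

```python
# Assumed RTPS well-known port constants; verify against your QoS settings.
PORT_BASE = 7400
DOMAIN_ID_GAIN = 250
PARTICIPANT_ID_GAIN = 2
UNICAST_DISCOVERY_OFFSET = 10

# A peer descriptor without an "N@" prefix is contacted for IDs 0..4 only.
DEFAULT_MAX_PARTICIPANT_INDEX = 4


def discovery_port(domain_id, participant_id):
    """Unicast discovery port a participant listens on (assumed mapping)."""
    return (PORT_BASE
            + DOMAIN_ID_GAIN * domain_id
            + UNICAST_DISCOVERY_OFFSET
            + PARTICIPANT_ID_GAIN * participant_id)


def unreachable_participants(num_participants,
                             max_index=DEFAULT_MAX_PARTICIPANT_INDEX):
    """Participant IDs that a peer with the given max index never contacts."""
    return [pid for pid in range(num_participants) if pid > max_index]


if __name__ == "__main__":
    processes = 7  # one participant per process in each domain
    for domain_id in (0, 1, 2):
        missed = unreachable_participants(processes)
        ports = [discovery_port(domain_id, pid) for pid in missed]
        print(f"domain {domain_id}: IDs never contacted {missed}, ports {ports}")
```

Passing max_index=6 to unreachable_participants models the 6@ prefix and leaves no participant uncontacted for 7 processes.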