Hi!
I think samples are leaking between different domains.
The system is set up as follows:
- PC1: A process of domain #2
- PC2: A process of domain #2
- PC3: 5 processes of domain #10
With this setup, samples from domain #10 are delivered to the processes in domain #2.
The problem occurs rarely, but once it does, the processes in domain #10 keep sending samples to domain #2.
Each process has only one domain participant for the corresponding domain.
Is there a known issue with domain corruption, or a well-known misuse that causes this problem?
If you need any further information, please let me know.
Thank you!
Hello yoondo,
Domain separation is done by selecting different ports for different domains. (The port number is chosen based on a combination of the domain ID and the participant ID.) There are a few ways you might get communication between domains, all of which come down to them using the same ports.
1. The first option is if one of the applications has modified the RTPS Well Known Ports QoS, and the other has not. (This has a very good explanation of how the port mapping formula works):
RtpsWellKnownPorts QoS description
2. The second option is that if you have many DomainParticipants in the same domain on the same machine, they could potentially start to use some of the same ports as DomainParticipants in a different domain (you can change this by modifying the QoS policy above). Since domains 2 and 10 are far from each other, with the default port gains (domain_id_gain 250, participant_id_gain 2) it would take on the order of 1,000 DomainParticipants on the same machine, which is unlikely.
3. If the application in domain 2 is setting the participant ID to a very large number, this could also happen.
If your applications have not modified the Wire Protocol QoS or the participant ID, and if you do not have many DomainParticipants, it would be useful to see a Wireshark packet trace, if you can send one.
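As a rough illustration of the port mapping described above, here is a minimal Java sketch. It assumes the default RTPS well-known port values (port base 7400, domain ID gain 250, participant ID gain 2, offsets 0/10/1/11); if your Wire Protocol QoS changes any of them, the numbers will differ. The class and method names are made up for this example and are not part of any DDS API.

```java
public class RtpsPortSketch {
    // Assumed defaults for the RTPS well-known ports; adjust if your QoS overrides them.
    static final int PORT_BASE = 7400;
    static final int DOMAIN_ID_GAIN = 250;
    static final int PARTICIPANT_ID_GAIN = 2;
    static final int BUILTIN_MULTICAST_OFFSET = 0;
    static final int BUILTIN_UNICAST_OFFSET = 10;
    static final int USER_MULTICAST_OFFSET = 1;
    static final int USER_UNICAST_OFFSET = 11;

    // Discovery (builtin) multicast port: depends only on the domain ID.
    static int builtinMulticast(int domainId) {
        return PORT_BASE + DOMAIN_ID_GAIN * domainId + BUILTIN_MULTICAST_OFFSET;
    }

    // Discovery (builtin) unicast port: depends on domain ID and participant ID.
    static int builtinUnicast(int domainId, int participantId) {
        return PORT_BASE + DOMAIN_ID_GAIN * domainId
                + PARTICIPANT_ID_GAIN * participantId + BUILTIN_UNICAST_OFFSET;
    }

    // User-data multicast port: depends only on the domain ID.
    static int userMulticast(int domainId) {
        return PORT_BASE + DOMAIN_ID_GAIN * domainId + USER_MULTICAST_OFFSET;
    }

    // User-data unicast port: depends on domain ID and participant ID.
    static int userUnicast(int domainId, int participantId) {
        return PORT_BASE + DOMAIN_ID_GAIN * domainId
                + PARTICIPANT_ID_GAIN * participantId + USER_UNICAST_OFFSET;
    }

    // Participant ID in lowDomain whose unicast ports land on those of
    // participant 0 in highDomain (integer division; exact with the assumed gains).
    static int collidingParticipantId(int lowDomain, int highDomain) {
        return DOMAIN_ID_GAIN * (highDomain - lowDomain) / PARTICIPANT_ID_GAIN;
    }

    public static void main(String[] args) {
        for (int domain : new int[] {2, 10}) {
            System.out.println("domain " + domain
                    + ": builtin multicast " + builtinMulticast(domain)
                    + ", builtin unicast (participant 0) " + builtinUnicast(domain, 0)
                    + ", user multicast " + userMulticast(domain)
                    + ", user unicast (participant 0) " + userUnicast(domain, 0));
        }
        System.out.println("domain 2 participant ID that reaches domain 10: "
                + collidingParticipantId(2, 10));
    }
}
```

Under these assumed defaults, the unicast ports of domains 2 and 10 only meet at a very large participant ID, which is why a modified port mapping or an explicitly set participant ID is the more likely suspect.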
Thank you!
Rose
Each peer in both domains has the same QoS properties.
The QoS properties below are the complete list of wire-protocol-related settings on the peers.
PropertyQosPolicyHelper.assert_property(participantQos.property_qos, "dds.transport.load_plugins", "dds.transport.tcp.tcp1", false);
PropertyQosPolicyHelper.assert_property(participantQos.property_qos, "dds.transport.tcp.tcp1.library", "nddstransporttcp", false);
PropertyQosPolicyHelper.assert_property(participantQos.property_qos, "dds.transport.tcp.tcp1.create_function", "NDDS_Transport_TCPv4_create", false);
PropertyQosPolicyHelper.assert_property(participantQos.property_qos, "dds.transport.tcp.tcp1.ignore_loopback_interface", "0", false);
PropertyQosPolicyHelper.assert_property(participantQos.property_qos, "dds.transport.tcp.tcp1.server_bind_port", "30003", false);
PropertyQosPolicyHelper.assert_property(participantQos.property_qos, "dds.transport.tcp.tcp1.force_asynchronous_send", "1", false);
PropertyQosPolicyHelper.assert_property(participantQos.property_qos, "dds.transport.tcp.tcp1.max_packet_size", "65535", false);
There is no other wire protocol QoS customization besides these properties.
Every PC I listed in the original post is configured to communicate via both UDP and TCP.
So, can using the same TCP port number (in my case, server_bind_port 30003) on PCs in different domains cause the separation problem?
This morning the DDS nodes started cleanly, and so far the problem has not been reproduced.
I think I can collect more information (a Wireshark capture file) soon, and I will post the packet trace here.
Thank you :)
You are welcome!
It is helpful to know that you are using the TCP transport. I will find out if there are problems with sharing the same port with different domains.
Thank you,
Rose
Hello yoondo,
I checked, and we do not know of any problems with the TCP transport sharing the same port, even with DomainParticipants in different domains. Can you attach your full QoS configuration?
Thank you!
Rose
Hi Yoondo,
Sorry, we missed your email to the distributor, but we now have the QoS file.
However, on reviewing the QoS file, we cannot see the problem. Also, since you are modifying QoS properties in your code as well, we do not know all of the QoS settings that you are changing. We do have a few comments/questions:
1) you wrote that your system is
pc 1 : App Domain #2
pc 2: App Domain #2
pc 3: 5 Apps Domain #10
In this case, why are you using TCP in the 5 applications on pc 3? Since they are on the same machine, they should not use TCP to communicate with each other. They should only use UDP.
2) What is the peer list that you use for the applications on PC 1, 2 and 3 for discovery?
3) As far as I know, you cannot reuse the same TCP port on the same host. So, only a single application can open a socket for a specific port number, e.g., 30003, on a single host. Are you sure that your applications on PC 3 are all able to use TCP on the same TCP port?
In any case, I would suggest that you use different TCP port numbers for different domains, and certainly different apps on the same host. You would need to use the NDDS_DISCOVERY_PEERS environment variable to set up the discovery correctly between the apps.
4) Can you explain why you want to use TCP and UDP in the same app for the same domain at the same time? Usually, TCP transport is only used between apps that cannot communicate using UDP. So you would use TCP instead of UDP, but not at the same time.
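To illustrate point 3, here is a small Java sketch that uses only plain sockets. It opens a listening socket on a free port and then tries to open a second listening socket on the same port. It says nothing about what the TCP transport plugin does internally; it only shows the operating system behavior that the question is about. The class and method names are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;

public class TcpPortReuseSketch {
    // Returns true if a second listening socket could be bound to a port
    // that another listening socket on this host already holds.
    static boolean secondBindSucceeds() {
        // Port 0 asks the OS for any free port, so the sketch does not
        // depend on a fixed port such as 30003 being available.
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();
            try (ServerSocket second = new ServerSocket(port)) {
                return true;   // the OS allowed two listeners on one port
            } catch (BindException e) {
                return false;  // typically "Address already in use"
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind succeeded: " + secondBindSucceeds());
    }
}
```

On a typical host the second bind is expected to fail. If the five applications on PC 3 all request server_bind_port 30003, it is worth checking their logs to see whether more than one of them really got the port.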
Thanks,
--Howard