I'm experiencing issues with a distributed system consisting of approximately 100 unique applications that run across 5-10 machines. The software distributes the applications across the nodes using an allocation algorithm, so the loadout on each node varies.
Occasionally we see software cycling on startup with the following log output:
com.rti.dds.infrastructure.RETCODE_ERROR
at com.rti.dds.util.Utilities.rethrow
at com.rti.dds.infrastructure.RETCODE_ERROR.check_return_code
at com.rti.dds.infrastructure.EntityImpl.enable
<some in house framework code that creates a participant>
We are using ndds 4.4d rev 45 on RHEL 7. When this issue occurs, moving the software to a different node, or unloading a piece of software that had initialized correctly on the problem node, allows the affected software to start up. I have also noticed that after restarting the node, the software initializes properly once the node comes back up.
Is the most likely scenario a port conflict? Is it possible to get more verbose logging out of the rti library on what the cause of the return code error is?
Thanks in advance.
I'm not sure I can help you with the problem you are experiencing, but it should be possible to set up logging using QoS policies: http://community.rti.com/rti-doc/500/ndds/doc/html/api_dotnet/classDDS_1_1LoggingQosPolicy.html
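On the port-conflict question, one way to gather evidence on the problem node is to check whether the UDP ports a participant would try to bind are already taken. Below is a rough sketch in plain Java (no RTI classes). It assumes the default port mapping from the RTPS wire-protocol specification (port base 7400, domain ID gain 250, participant ID gain 2, unicast offsets 10 and 11). I have not verified that 4.4d rev 45 uses these defaults, and your QoS may override them, so treat the constants as assumptions and adjust them to match your configuration. The class name `RtpsPortProbe` and its methods are made up for this example.

```java
import java.net.DatagramSocket;
import java.net.SocketException;

public class RtpsPortProbe {
    // ASSUMPTION: default values from the RTPS specification's port mapping.
    // Older RTI releases or custom QoS settings may use different numbers.
    static final int PORT_BASE = 7400;
    static final int DOMAIN_ID_GAIN = 250;
    static final int PARTICIPANT_ID_GAIN = 2;
    static final int DISCOVERY_UNICAST_OFFSET = 10;
    static final int USER_UNICAST_OFFSET = 11;

    // Unicast port used for discovery (meta) traffic by a given participant.
    static int discoveryUnicastPort(int domainId, int participantId) {
        return PORT_BASE + DOMAIN_ID_GAIN * domainId
                + DISCOVERY_UNICAST_OFFSET + PARTICIPANT_ID_GAIN * participantId;
    }

    // Unicast port used for user data traffic by a given participant.
    static int userUnicastPort(int domainId, int participantId) {
        return PORT_BASE + DOMAIN_ID_GAIN * domainId
                + USER_UNICAST_OFFSET + PARTICIPANT_ID_GAIN * participantId;
    }

    // Returns true if this process can bind the UDP port right now.
    // The socket is closed immediately, so the port is not held.
    static boolean isUdpPortFree(int port) {
        try (DatagramSocket socket = new DatagramSocket(port)) {
            return true;
        } catch (SocketException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java RtpsPortProbe [domainId] [numberOfParticipantIdsToCheck]
        int domainId = args.length > 0 ? Integer.parseInt(args[0]) : 0;
        int count = args.length > 1 ? Integer.parseInt(args[1]) : 20;
        for (int id = 0; id < count; id++) {
            int discovery = discoveryUnicastPort(domainId, id);
            int user = userUnicastPort(domainId, id);
            System.out.printf("participant %d: discovery %d %s, user %d %s%n",
                    id,
                    discovery, isUdpPortFree(discovery) ? "free" : "IN USE",
                    user, isUdpPortFree(user) ? "free" : "IN USE");
        }
    }
}
```

Running this on the node while an application is cycling shows which participant indices are occupied at that moment. It is only a snapshot: a port reported as free can be taken a moment later, and ports held by non-DDS processes show up as "IN USE" as well, so comparing the output against `ss -ulpn` (or similar) helps identify the owner.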