13.6.2. What’s Fixed in 7.2.0

To review any fixes applied to Monitoring Library 2.0, see What’s Fixed in 7.2.0, in the RTI Connext Core Libraries Release Notes.

13.6.2.1. Collector Service might have crashed on startup

Collector Service could have crashed on startup if the initialization of one of its components failed. This happened because the clean-up method called after the failure accessed invalid memory. Before the crash, error messages from the RTI_MonitoringForwarder_initialize function were logged.

For example, the initialization would fail if any of the event_datareader_qos, logging_datareader_qos, or periodic_datareader_qos settings of the input_connection was configured with inconsistent QoS.

This issue is resolved.
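
The failure mode described above, a clean-up path that touches components that were never initialized, can be illustrated with a minimal sketch. This is not the Collector Service implementation: Component, initialize, and closed_log are hypothetical names, and the sketch only assumes that clean-up should visit the components that were successfully created.

```python
class Component:
    """Hypothetical stand-in for one internal component."""

    def __init__(self, name, fail=False):
        if fail:
            # Simulates an initialization failure, such as inconsistent QoS.
            raise ValueError(f"{name}: inconsistent QoS")
        self.name = name

    def close(self, closed_log):
        # Record the clean-up so that the caller can inspect it.
        closed_log.append(self.name)


def initialize(specs, closed_log):
    """Create components in order; on failure, close only those created."""
    started = []
    try:
        for name, fail in specs:
            started.append(Component(name, fail))
    except ValueError:
        # Clean up in reverse order, touching only initialized components.
        for component in reversed(started):
            component.close(closed_log)
        raise
    return started
```

Because the clean-up loop iterates over the list of components that were actually started, it never dereferences a component whose construction failed.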

[RTI Issue ID OCA-226]

13.6.2.2. Controllability issues on applications with same name

When multiple monitored applications shared the same application name, the exit of one of these applications could disrupt control of the remaining ones. This issue also occurred when a monitored application was closed ungracefully and then restarted. This issue has been fixed; now, the GUID of the application is also considered when an application is accessed using its name.
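
As an illustration of why the GUID matters, the following minimal sketch tracks applications by name and GUID together, so that removing one instance leaves the others with the same name reachable. The AppRegistry class and its methods are hypothetical and are not part of any RTI API.

```python
class AppRegistry:
    """Hypothetical registry of monitored applications."""

    def __init__(self):
        # Maps application name -> set of GUIDs currently using that name.
        self._apps = {}

    def add(self, name, guid):
        self._apps.setdefault(name, set()).add(guid)

    def remove(self, name, guid):
        # Remove only the instance identified by this GUID.
        instances = self._apps.get(name, set())
        instances.discard(guid)
        if not instances:
            self._apps.pop(name, None)

    def guids_for(self, name):
        # Instances that can still be controlled under this name.
        return sorted(self._apps.get(name, set()))
```

If the registry were keyed by name alone, removing one application would also drop the entry that the remaining applications with the same name depend on.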

[RTI Issue ID OCA-224]

13.6.2.3. Unhandled exceptions may have caused segmentation fault

Observability Collector Service did not handle exceptions in the destructor; an exception thrown there could have led to a segmentation fault at the time of destruction. This issue has been fixed.

[RTI Issue ID OCA-289]

13.6.2.4. Race condition when processing remote commands led to failures and memory leaks when shutting down Collector Service

In Observability Collector Service, an internal race condition could cause the cleanup performed after processing a remote administration command (for example, changing the forwarding or collection verbosity of an application) to fail with the following error message:

ERROR DDS_AsyncWaitSetTask_detachCondition:!detach condition

This left one of the internal components of Observability Collector Service in an inconsistent state, which caused failures and memory leaks when the service was shut down:

ERROR DDS_AsyncWaitSet_submit_task:!wait for request completion
ERROR DDS_AsyncWaitSet_detach_condition_with_completion_token:!submit internal task
ERROR DDS_AsyncWaitSet_detach_condition:!DDS_AsyncWaitSet_detach_condition_with_completion_token
ERROR DDS_AsyncWaitSet_finalize:!detach condition
ERROR DDS_AsyncWaitSet_delete:!DDS_AsyncWaitSet_finalize

This race condition is fixed. The cleanup of already processed commands no longer causes unexpected failures.

[RTI Issue ID MONITOR-610]

13.6.2.5. Collector Service could discard samples when monitoring large DDS applications

In the previous release, Observability Collector Service could report the following error messages when collecting telemetry data from applications with a large number of DDS entities (for example, 2000 DataWriters):

ERROR [0x01016A0B,0x38EDDDA5,0x6C2A146D:0x00000184{Entity=DR,MessageKind=DATA}|RECEIVE FROM 0x0101DC38,0xA4FD24A4,0x06193ECA:0x00000183] DDS_DataReader_add_sample_to_remote_writer_queue_untypedI:add sample to remote writer queue
ERROR [0x01016A0B,0x38EDDDA5,0x6C2A146D:0x00000184{Entity=DR,MessageKind=DATA}|RECEIVE FROM 0x0101DC38,0xA4FD24A4,0x06193ECA:0x00000183] RTI_MonitoringForwarderEntities_addSampleToCacheReader:ADD FAILURE | Sample to the cache reader of DCPSPeriodicStatusMonitoring

This problem occurred because, with the default QoS configuration and the large amount of data received, the queues of Collector Service’s internal DataReaders became full. As a result, new samples were discarded and not pushed to the Observability Framework backends.

This issue has been resolved. The queues for the internal DataReaders are now configured to have no limit, ensuring successful telemetry data collection regardless of the number of DDS entities.
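
The difference between a bounded and an unbounded queue can be sketched as follows. This is a generic illustration that uses Python's standard queue module, not the DataReader implementation; the collect function and the limit of 256 are assumptions chosen for the example.

```python
import queue


def collect(samples, max_samples):
    """Return (accepted, discarded) counts; max_samples=0 means no limit."""
    q = queue.Queue(maxsize=max_samples)
    discarded = 0
    for sample in samples:
        try:
            q.put_nowait(sample)
        except queue.Full:
            # A full bounded queue rejects the new sample.
            discarded += 1
    return q.qsize(), discarded


# A bounded queue drops samples once it fills up.
print(collect(range(2000), 256))  # (256, 1744)
# An unlimited queue accepts every sample.
print(collect(range(2000), 0))    # (2000, 0)
```

The trade-off of an unlimited queue is that memory use grows with the number of samples held in it.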

Note

The example error messages above refer to the Periodic Topic, but the same messages were reported for other Observability Framework Topics (Events and Logging).

[RTI Issue ID MONITOR-619]