10.9. Regressions in 6.1.1/6.1.2

The following regressions were introduced in Connext 6.1.1 or 6.1.2.

10.9.1. Core Libraries

10.9.1.1. A reliable writer’s FlowController fails to control the data flow over UDPv4 for a SHMEM_REF type

Consider the following scenario:

  • A DomainParticipant DP1 is using the Zero Copy transfer over shared memory transport. DP1 uses both SHMEM (for communicating within the same machine) and UDPv4 (for communicating with other machines).

  • DP1 has a reliable DataWriter of a data type annotated with @transfer_mode(SHMEM_REF). This DataWriter uses a FlowController and sets publish_mode.kind to ASYNCHRONOUS_PUBLISH_MODE_QOS.

  • A DomainParticipant DP2 communicates with DP1 using SHMEM.

  • A DomainParticipant DP3 communicates with DP1 using UDPv4.

In release 6.1.2, the FlowController settings do not take effect correctly when the DataWriter is sending full samples to DP3. The FlowController controls the flow as if the DataWriter were sending sample references to DP2. This behavior results in the DataWriter sending full samples to DP3 much faster than expected.
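The effect can be illustrated with a simplified token-bucket model. This is only a sketch and not Connext's implementation: it assumes the controller charges each sample by a size in bytes, and the function name, the budget parameters, and both sizes below are hypothetical values chosen for illustration.

```python
def samples_per_period(tokens_per_period, bytes_per_token, charged_bytes):
    """Samples a token bucket releases per period when each sample is
    charged charged_bytes against the period's byte budget."""
    budget = tokens_per_period * bytes_per_token
    return budget // charged_bytes


# Hypothetical sizes, not taken from Connext.
FULL_SAMPLE_BYTES = 1_000_000  # full sample sent to DP3 over UDPv4
REFERENCE_BYTES = 100          # sample reference sent to DP2 over SHMEM

# Budget of 10 tokens x 100,000 bytes = 1,000,000 bytes per period.
expected = samples_per_period(10, 100_000, FULL_SAMPLE_BYTES)
observed = samples_per_period(10, 100_000, REFERENCE_BYTES)
print(expected, observed)
```

Under these assumptions, charging the reference size instead of the full sample size lets far more full samples leave the DataWriter in each period, which matches the symptom described above.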

Not fixed yet

[RTI Issue ID CORE-16505]

10.9.1.2. Stack smashing error when serializing strings with RTI_CDR_SIZEOF_LONG_DOUBLE set to 16 in C++11 in release mode using GCC compiler

In release 6.1.1, a stack smashing fault occurs when serializing strings if the RTI_CDR_SIZEOF_LONG_DOUBLE configuration is set to 16 in C++11 in release mode using the GCC compiler. Compiling the code reports a warning similar to:

include/ndds/hpp/rti/topic/cdr/InterpreterHelpers.hpp:165:31: note: the ABI of passing union with 'long double' has changed in GCC 4.4
  165 |     static RTIXCdrMemberValue get_value_pointer(
      |                               ^~~~~~~~~~~~~~~~~

Fixed in: 7.5.0

[RTI Issue ID CORE-14999]

10.9.1.3. Reliable DataReader may stop receiving samples from DataWriter using durable writer history and DDS fragmentation

In release 6.1.2, a reliable DataReader may stop receiving samples from a DataWriter using durable writer history and DDS fragmentation (asynchronous publishing with samples that exceed the minimum message_size_max across all installed transports). This issue occurs when a sample fragment is lost, which is more likely in lossy networks.
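As a rough way to tell whether a configuration relies on DDS fragmentation, the condition in parentheses above can be sketched as follows. The function names and the example sizes are hypothetical, and the fragment count ignores protocol header overhead, so treat it as a lower bound rather than the exact number Connext would produce.

```python
import math


def needs_fragmentation(sample_bytes, message_size_max_values):
    """True if the sample exceeds the smallest message_size_max
    among the installed transports."""
    return sample_bytes > min(message_size_max_values)


def min_fragment_count(sample_bytes, message_size_max_values):
    """Lower bound on the number of fragments (header overhead ignored)."""
    limit = min(message_size_max_values)
    return max(1, math.ceil(sample_bytes / limit))


# Hypothetical transport limits, in bytes.
limits = [65_507, 9_216]
print(needs_fragmentation(200_000, limits), min_fragment_count(200_000, limits))
```

The more fragments a sample needs, the more chances there are for one of them to be lost, which is the trigger for this issue.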

Fixed in: 7.3.0

[RTI Issue ID CORE-14099]

10.9.1.4. Thread names longer than 15 characters on QNX platforms cause errors in API calls

In 6.1.2, when you set a thread name longer than 15 characters on a QNX platform, certain uses of the Connext API, such as creating a waitset by calling DDS_WaitSet_new() using the C API, fail with an error related to the name of the worker thread. With the fix, you can once again set thread names up to the maximum length allowed by the QNX platform.
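On an affected release, one possible precaution is to check configured thread names against the 15-character threshold before applying them. The sketch below is not a Connext API: the helper names and the sample thread name are hypothetical, and it assumes ASCII names so that characters and bytes coincide.

```python
AFFECTED_NAME_LENGTH = 15  # from the text: longer names trigger the errors


def exceeds_affected_length(thread_name):
    """True if the name is longer than the 15-character threshold."""
    return len(thread_name) > AFFECTED_NAME_LENGTH


def clamp_thread_name(thread_name):
    """Truncate the name so it stays within the threshold."""
    return thread_name[:AFFECTED_NAME_LENGTH]


name = "my_app_worker_thread_01"  # hypothetical name
if exceeds_affected_length(name):
    name = clamp_thread_name(name)
print(name, len(name))
```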

Fixed in: 7.3.0

[RTI Issue ID CORE-13827]

10.9.1.5. Unexpected precondition error with debug libraries on a reliable DataWriter while sending a GAP

In 6.1.2, you may see the following precondition error while using the Connext debug libraries:

DL Debug: :         Backtrace:
141: DL Debug: :    #4      COMMENDSrWriterService_sendGapToRR /rti/jenkins/workspace/connextdds_ci_fastbuild-debug_develop/commend.1.0/srcC/srw/SrWriterService.c:4096 (discriminator 9) [0x5B101E]
141: DL Debug: :    #5      COMMENDSrWriterService_onSendDataEvent /rti/jenkins/workspace/connextdds_ci_fastbuild-debug_develop/commend.1.0/srcC/srw/SrWriterService.c:6570 [0x5BACF6]
141: DL Debug: :    #6      RTIEventActiveGeneratorThread_loop /rti/jenkins/workspace/connextdds_ci_fastbuild-debug_develop/event.1.0/srcC/activeGenerator/ActiveGenerator.c:307 [0x28E2FC]
141: DL Debug: :    #7      RTIOsapiThreadFactory_onSpawned /rti/jenkins/workspace/connextdds_ci_fastbuild-debug_develop/osapi.1.0/srcC/threadFactory/ThreadFactory.c:208 [0x1F3A42]
141: DL Debug: :    #8      RTIOsapiThreadFactory_onSpawned /rti/jenkins/workspace/connextdds_ci_fastbuild-debug_develop/osapi.1.0/srcC/threadFactory/ThreadFactory.c:208 [0x1F3A42]
141: DL Debug: :    #9      RTIOsapiThreadChild_onSpawned /rti/jenkins/workspace/connextdds_ci_fastbuild-debug_develop/osapi.1.0/srcC/thread/Thread.c:1941 [0x1EDB64]
141: DL Debug: :    #10     start_thread /build/glibc-CVJwZb/glibc-2.27/nptl/pthread_create.c:463 [0x76DB]
141: DL Debug: :    #11     clone /build/glibc-CVJwZb/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97 [0x12161F]
141: DL Fatal: : FATAL rCoRTInk####Evt [0x01014F91,0x39810444,0x4EC68AEA:0x000004C2|RECEIVE FROM remote DR (GUID: 0x01015FBD,0x5892DC7E,0x9DB082D4:0x000004C7).
141: ] Mx00:/rti/jenkins/workspace/connextdds_ci_fastbuild-debug_develop/commend.1.0/srcC/srw/SrWriterService.c:4099:RTI0x200003b:!precondition: "((((gapStartSn)->high) > (((&(gapBitmap)->_lead))->high)) ? 1 : ((((gapStartSn)->high) < (((&(gapBitmap)->_lead))->high)) ? -1 : ((((gapStartSn)->low) > (((&(gapBitmap)->_lead))->low)) ? 1 : ((((gapStartSn)->low) < (((&(gapBitmap)->_lead))->low)) ? -1 : 0)))) >= 0"
141: DL Error: : ERROR [0x01014F91,0x39810444,0x4EC68AEA:0x000004C2|RECEIVE FROM remote DR (GUID: 0x01015FBD,0x5892DC7E,0x9DB082D4:0x000004C7).
141: ] COMMENDSrWriterService_onSendDataEvent:!send GAP

This error is generated by a reliable DataWriter sending a GAP to a reliable DataReader. After the error is printed, the DataReader may stop receiving data from the DataWriter, leading to a non-recoverable situation. This problem does not occur with release libraries.
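The expression quoted in the precondition message is a macro-expanded comparison of two sequence numbers, each made of a high and a low part. The sketch below mirrors that ternary chain for readability. The class and function names are hypothetical, and because the text does not say whether the logged expression is the asserted invariant or the failure condition, the sketch only evaluates it.

```python
from typing import NamedTuple


class SequenceNumber(NamedTuple):
    high: int
    low: int


def compare(a, b):
    """Return 1, -1, or 0, comparing high parts first and then low parts."""
    if a.high > b.high:
        return 1
    if a.high < b.high:
        return -1
    if a.low > b.low:
        return 1
    if a.low < b.low:
        return -1
    return 0


def logged_expression(gap_start_sn, bitmap_lead):
    """Value of the logged expression: compare(gapStartSn, gapBitmap._lead) >= 0."""
    return compare(gap_start_sn, bitmap_lead) >= 0


print(logged_expression(SequenceNumber(0, 10), SequenceNumber(0, 12)))
```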

Fixed in: 7.1.0

[RTI Issue ID CORE-13462]

10.9.1.6. Unexpected warning during discovery when multicast disabled

In 6.1.2, Connext logs a warning during the discovery process when multicast is disabled. The message warns about unreachable multicast locators. The message is unexpected and has been removed.

Fixed in: 7.1.0

[RTI Issue ID CORE-13403]

10.9.2. Security Plugins

10.9.2.1. Lack of origin authentication leads to unnecessary allocation and possible discovery failure

In 6.1.1, when the property cryptography.max_receiver_specific_macs is unset or set to 0, there is an unnecessary memory allocation related to receiver-specific MACs whenever creating or discovering an entity. In some cases, the cryptographic library may fail to make this allocation, in which case entity creation or discovery fails with this error message:

RTI_Security_CryptoLibAdapterEvpNewMacKey (MasterReceiverSpecificKey) failed with error

In the fix for this issue, the Security Plugins no longer attempt to make this allocation if origin authentication is not used.

Fixed in: 7.2.0

[RTI Issue ID SEC-2210]

10.9.2.2. Potential crash while decoding protected submessages

Release 6.1.1 introduced several performance optimizations to Submessage Protection decoding. There is an issue with one of these optimizations, potentially resulting in a rare crash on the receiver (DataWriter or DataReader) while decoding a protected submessage. In particular, this issue is triggerable if any of the following is true for at least one DataWriter/DataReader pair:

  • metadata_protection_kind is set to a value different from NONE

  • discovery_protection_kind is set to a value different from NONE and enable_discovery_protection is set to TRUE

  • liveliness_protection_kind is set to a value different from NONE and enable_liveliness_protection is set to TRUE

This issue is more likely to trigger when the sender’s DomainParticipant is deleting all of its endpoints. Starting in 7.1.0, decoding protected submessages no longer results in this crash.
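The three conditions listed above can be combined into a single check when auditing a deployment. This is a minimal sketch: the function name is hypothetical, and the settings are passed in as plain strings and booleans rather than read from a real Governance Document or QoS profile.

```python
def decoding_crash_triggerable(
    metadata_protection_kind="NONE",
    discovery_protection_kind="NONE",
    enable_discovery_protection=False,
    liveliness_protection_kind="NONE",
    enable_liveliness_protection=False,
):
    """True if any of the three listed conditions holds for a
    DataWriter/DataReader pair."""
    return (
        metadata_protection_kind != "NONE"
        or (discovery_protection_kind != "NONE" and enable_discovery_protection)
        or (liveliness_protection_kind != "NONE" and enable_liveliness_protection)
    )


# "SIGN" and "ENCRYPT" are used here only as example non-NONE values.
print(decoding_crash_triggerable(metadata_protection_kind="SIGN"))
print(decoding_crash_triggerable(discovery_protection_kind="ENCRYPT"))
```

In the second call the result is false because the protection kind alone is not enough; the corresponding enable flag must also be TRUE.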

Fixed in: 7.1.0

[RTI Issue ID SEC-1960]

10.9.2.3. Potential precondition error when removing a remote DomainParticipant that has secure endpoints

Consider the following scenario:

  • DomainParticipant ParticipantA has endpoints that have data_protection_kind or metadata_protection_kind set to a value other than NONE in the Governance Document.

  • DomainParticipant ParticipantB was communicating with ParticipantA but is no longer doing so.

  • ParticipantB is using debug libraries.

In 6.1.1, while ParticipantB is removing ParticipantA, there is a race condition that makes ParticipantB sometimes generate this precondition error in the internal function RTI_Security_CryptographyKeyHandle_removeFromEndpointList:

!precondition: "self->entityType == RTI_SECURITY_ENDPOINT_TYPE_UNKNOWN && self->nextRemoteEndpointKey != ((void *)0)"

The precondition error is followed by invalid writes due to improper cleanup.

This problem does not affect release libraries and is fixed in 7.7.0 by removing the incorrect precondition.

Fixed in: 7.7.0

[RTI Issue ID SEC-2909]

10.9.3. Persistence Service

10.9.3.1. Vulnerability: Stack buffer write overflow while parsing malicious environment variable on non-Windows systems

In 6.1.2, an out-of-bounds write on the stack can occur while parsing a malicious environment variable on non-Windows systems.

User Impact without Security: A vulnerability in the Persistence Service application can result in the following:

  • Stack buffer overflow while parsing a malicious environment variable on non-Windows systems.

  • Exploitable by overwriting the .environment file in the user’s home directory with a malicious .environment file.

  • Potential impact on integrity of Persistence Service application.

  • CVSS Base Score: 6.1 MEDIUM

  • CVSS v3.1 Vector: AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:L

User Impact with Security: Same as “User Impact without Security” above.

Mitigations: Protect access to the file system from which Persistence Service is running.
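The base score above can be reproduced from the vector using the CVSS v3.1 base-score equations. The sketch below handles only vectors with unchanged scope (S:U), which is the case here; the function and table names are illustrative.

```python
import math

# CVSS v3.1 metric weights; PR values are those for unchanged scope.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place as defined by CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles unchanged scope (S:U)")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:L"))  # 6.1
```

For this vector the impact sub-score is about 4.22 and the exploitability sub-score about 1.83; their sum, 6.05, rounds up to 6.1, which falls in the MEDIUM range.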

Fixed in: 7.4.0

[RTI Issue ID PERSISTENCE-362]

10.9.3.2. Rare non-recoverable failure when acknowledging samples and committing transactions with non-zero <writer_ack_period> and <writer_checkpoint_period>

In 6.1.2, setting writer_ack_period in a Persistence Group to a non-zero value can result in samples not being acknowledged. Similarly, setting writer_checkpoint_period can prevent transactions from being committed.

The first issue may cause various correctness problems, such as unbounded growth of the sample queue associated with a Persistence Group DataWriter when purge_samples_after_acknowledgment is set to TRUE. Since the samples are not acknowledged, they cannot be purged.

The second issue may cause a catastrophic failure when storing samples to disk.

These issues only affect configurations where Persistence Service runs in PERSISTENT mode. They are rare because typical values for these periods are in the seconds range, and become more likely when the periods are configured in the millisecond range.

Fixed in: 7.6.0

[RTI Issue ID PERSISTENCE-397]

10.9.4. Limited Bandwidth Plugins

10.9.4.1. Limited Bandwidth ZRTPS transport crashes if an external compression library fails to load

In 6.1.2, the RTI Limited Bandwidth Plugins’ ZRTPS transport can crash when attempting to use an external library. If there is a failure while loading an external compression library (for example, if a function name does not match the expected name), the external library is silently closed. Neither the error nor the closure is propagated upstream; therefore, the ZRTPS transport uses an invalid library handle, leading to a crash.

This issue is fixed in release 7.3.0. The transport will be notified if there is a failure to load an external library.

Fixed in: 7.3.0

[RTI Issue ID COREPLG-719]