Chapter 4 Data Representation

The data representation specifies how a data sample of a given type is communicated over the network.

The OMG 'Extensible and Dynamic Topic Types for DDS' specification, version 1.3, defines three different data representations:

  • Extended Common Data Representation (CDR) encoding version 1 (XCDR)
  • Extended CDR encoding version 2 (XCDR2)
  • XML data representation

Connext 6.0.0 and above implement both XCDR and XCDR2. Connext 5.3.1 and below implement only XCDR. The XML data representation is not supported.

4.1 Configuring the CDR

You may use the DataRepresentationQosPolicy in the DataWriterQos to configure which version of Extended CDR, version 1 or version 2, the DataWriter will use to serialize its data. The same QosPolicy exists in the DataReaderQos to configure which version(s) the DataReader will accept from DataWriters. DataWriters can offer only one data representation, while DataReaders can request multiple data representations.

For more information, see "DATA_REPRESENTATION QosPolicy" in the RTI Connext Core Libraries User's Manual.

4.1.1 @data_representation annotation

The data representations that you are allowed to configure in the DataRepresentationQosPolicy for a type ‘T’ are limited to the allowed data representations for the type.

The @data_representation annotation (also available under the equivalent name @allowed_data_representation) lets you restrict the data representations that may be used to encode a data object of a specific type. You can then select from this restricted set when setting the DataRepresentationQosPolicy. The IDL definition of the @data_representation annotation is as follows:

// Positions are defined to match the values of the DataRepresentationId_t
// XCDR_DATA_REPRESENTATION, XML_DATA_REPRESENTATION, and
// XCDR2_DATA_REPRESENTATION
@bit_bound(32)
bitmask DataRepresentationMask {
   @position(0) XCDR,
   @position(1) XML,
   @position(2) XCDR2
};
 
@annotation data_representation {
   DataRepresentationMask value;
};

For example:

@data_representation(XCDR2)
struct Position
{
    int32 x;
    int32 y;
};

DataWriters and DataReaders using the previous type can publish and subscribe to only an XCDR2 representation, regardless of the value set in the DataRepresentationQosPolicy. (If a DataWriter or DataReader in this case sets its DataRepresentationQosPolicy to XCDR, Connext will automatically change it to XCDR2 and print a log message indicating this change.)

If the @data_representation annotation is not present, Connext interprets the data representation as if the DataRepresentationMask value were set to XCDR|XCDR2 for the PLAIN language binding and to XCDR2 for the FLAT_DATA language binding. For information about the RTI FlatData™ language binding, see the "Sending Large Data" chapter in the RTI Connext Core Libraries User's Manual.
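The interaction between the allowed mask and the QoS setting can be sketched in a few lines of C++. This is only an illustration, not Connext code: the constants mirror the bit positions of DataRepresentationMask, while default_allowed_mask() and effective_representation() are hypothetical helpers. The fallback order (XCDR2 first, then XCDR) is an assumption that generalizes the single XCDR-to-XCDR2 case described above, and the sketch assumes the allowed mask contains at least XCDR or XCDR2.

```cpp
#include <cassert>
#include <cstdint>

// Local constants mirroring the bit positions in DataRepresentationMask.
constexpr std::uint32_t XCDR  = 1u << 0;
constexpr std::uint32_t XML   = 1u << 1;  // defined by the specification, not supported by Connext
constexpr std::uint32_t XCDR2 = 1u << 2;

enum class LanguageBinding { Plain, FlatData };

// Mask assumed when the type has no @data_representation annotation.
inline std::uint32_t default_allowed_mask(LanguageBinding binding) {
    return binding == LanguageBinding::Plain ? (XCDR | XCDR2) : XCDR2;
}

// Hypothetical helper: the representation an endpoint ends up using.
// If the requested representation is allowed, it is kept. Otherwise the
// sketch falls back to XCDR2 when allowed, else XCDR (assumed ordering).
inline std::uint32_t effective_representation(std::uint32_t requested,
                                              std::uint32_t allowed_mask) {
    if ((allowed_mask & requested) != 0) {
        return requested;
    }
    return (allowed_mask & XCDR2) != 0 ? XCDR2 : XCDR;
}
```

With the Position type above (allowed mask XCDR2), a request for XCDR resolves to XCDR2, which corresponds to the automatic change that Connext logs.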

4.2 Extended CDR (encoding version 1)

The "traditional" OMG CDR (PLAIN_CDR) is used for final and appendable types. It is also used for primitive, string, and sequence types.

Mutable types and optional members use parameterized CDR (PL_CDR), in which each member is preceded by a member header that consists of the member ID and member serialized length.

The member header can be 4 bytes (2 bytes for the member ID and 2 bytes for the serialized length) or 12 bytes (where 8 bytes are used for the member ID and 4 bytes are used for the length).

Member IDs greater than 16,128 require a 12-byte header. Therefore, to reduce network bandwidth usage, use member IDs less than or equal to 16,128 whenever possible.

Also, members with a serialized size greater than 65,535 bytes require a 12-byte header.

Note that for members with a member ID less than 16,129 and a serialized size less than 65,536 bytes, the implementation decides whether or not to use a 12-byte header. For this version of Connext, the header selection rules are as follows:

  • If the member ID is greater than 16,128, use a 12-byte header.
  • Otherwise, if the member is a primitive type (int16, uint16, int32, uint32, int64, uint64, float, double, long double, boolean, octet, char), use a 4-byte header.
  • Otherwise, if the member is an enumeration, use a 4-byte header.
  • Otherwise, if the maximum serialized size of the type is less than 65,536 bytes, use a 4-byte header.
  • Otherwise, use a 12-byte header.
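The selection rules above can be expressed as a small decision function. The following C++ sketch is illustrative only: MemberKind and member_header_size() are hypothetical names, not part of the Connext API, and the function simply applies the listed rules in order.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of a member, for illustration only.
enum class MemberKind { Primitive, Enumeration, Other };

constexpr std::uint32_t kMaxShortHeaderMemberId = 16128;
constexpr std::uint64_t kMaxShortHeaderLength   = 65535;

// Returns the PL_CDR member header size in bytes (4 or 12),
// applying the rules in the same order as the list above.
inline int member_header_size(std::uint32_t member_id,
                              MemberKind kind,
                              std::uint64_t max_serialized_size) {
    if (member_id > kMaxShortHeaderMemberId) {
        return 12;  // large member IDs always need the long header
    }
    if (kind == MemberKind::Primitive || kind == MemberKind::Enumeration) {
        return 4;   // primitives and enumerations use the short header
    }
    if (max_serialized_size <= kMaxShortHeaderLength) {
        return 4;   // the length fits in the 2-byte length field
    }
    return 12;
}
```

For example, a bounded string member with ID 10 and a maximum serialized size of 100 bytes gets a 4-byte header, while the same member with ID 16,129 gets a 12-byte header.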

By default, in some scenarios, Connext is not compliant with the OMG 'Extensible and Dynamic Topic Types for DDS' specification, version 1.3; see 4.5 Extensible Types Compliance Mask.

4.3 Extended CDR (encoding version 2)

From the ‘Extensible and Dynamic Topic Types for DDS’ specification:

The specification defines three encoding formats used with encoding version 2: PLAIN_CDR2, DELIMITED_CDR, and PL_CDR2.

  • PLAIN_CDR2 shall be used for all primitive, string, and enumerated types. It is also used for any type with an extensibility kind of FINAL. The encoding is similar to PLAIN_CDR except that INT64, UINT64, FLOAT64, and FLOAT128 are serialized into the CDR buffer at offsets that are aligned to 4 [bytes] instead of 8 ....
  • DELIMITED_CDR shall be used for types with an extensibility kind of APPENDABLE. It serializes a UINT32 delimiter header (DHEADER) before serializing the object using PLAIN_CDR2. The delimiter encodes the endianness and the length of the serialized object that follows.
  • PL_CDR2 shall be used for aggregated types with an extensibility kind of MUTABLE. Similar to DELIMITED_CDR, it also serializes a DHEADER before serializing the object. In addition, it serializes a member header (EMHEADER) ahead of each serialized member. The member header encodes the member ID, the must-understand flag, and the length of the serialized member that follows.
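The effect of the alignment change in PLAIN_CDR2 can be seen with a little arithmetic. The C++ sketch below is a simplified model, not a serializer: it assumes a final struct whose members are all primitives, that each primitive is aligned to the smaller of its size and the maximum alignment of the encoding (8 for PLAIN_CDR, 4 for PLAIN_CDR2), and that offsets are counted from the start of the serialized body. The helper names align_up() and plain_body_size() are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Rounds offset up to the next multiple of alignment (a power of two).
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Size in bytes of a struct body made of primitive members with the
// given sizes (powers of two), when no member may be aligned to more
// than max_alignment bytes.
inline std::size_t plain_body_size(std::initializer_list<std::size_t> member_sizes,
                                   std::size_t max_alignment) {
    std::size_t offset = 0;
    for (std::size_t size : member_sizes) {
        offset = align_up(offset, std::min(size, max_alignment));
        offset += size;
    }
    return offset;
}
```

Under these assumptions, a struct with an int32 followed by an int64 occupies 16 bytes with a maximum alignment of 8 (4 bytes of padding before the int64) and 12 bytes with a maximum alignment of 4.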

In Extended CDR encoding version 2, wchar sizes changed from 4 bytes (Char32) to 2 bytes (Char16).

For more information about encoding version 2, please see the OMG 'Extensible and Dynamic Topic Types for DDS' specification, version 1.3.

By default, in some scenarios, Connext is not compliant with the OMG 'Extensible and Dynamic Topic Types for DDS' specification, version 1.3; see 4.5 Extensible Types Compliance Mask.

4.4 Choosing the Right Data Representation

Extended CDR encoding 2 (XCDR2) is more efficient on the wire than Extended CDR encoding 1 (XCDR). For new applications, Extended CDR encoding 2 is the recommended data representation; however, if you need to keep compatibility and interoperability with old Connext applications (5.3.1 and below), you may have to continue using Extended CDR encoding 1.

DataReaders can be configured to receive data using both XCDR2 and XCDR. This way, a DataReader can still interoperate and receive data from old Connext DataWriters using XCDR, while receiving data from new DataWriters using XCDR2.

The opposite is not true. DataWriters can publish only one data representation. Therefore, if there is a requirement to receive data for a topic 'T' with old Connext DataReaders, you will have to continue to publish data for topic 'T' with XCDR representation on the new DataWriters or use a bridge such as Routing Service to translate between XCDR and XCDR2.
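The asymmetry between DataWriters and DataReaders can be modeled as a membership test: a DataWriter offers one representation, and it is compatible with a DataReader only if that representation appears in the DataReader's requested list. The C++ sketch below is a model of that rule, not Connext code. The enumerator values follow DataRepresentationId_t as noted in the IDL comment in 4.1.1, and representations_match() is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Values follow DataRepresentationId_t (XCDR = 0, XML = 1, XCDR2 = 2).
enum class DataRepresentationId { XCDR = 0, XML = 1, XCDR2 = 2 };

// Returns true if the single representation offered by a DataWriter
// is among the representations requested by a DataReader.
inline bool representations_match(
        DataRepresentationId offered,
        const std::vector<DataRepresentationId>& requested) {
    return std::find(requested.begin(), requested.end(), offered)
            != requested.end();
}
```

A DataReader requesting both XCDR2 and XCDR is compatible with old and new DataWriters, whereas an old DataReader that requests only XCDR is not compatible with a DataWriter offering XCDR2.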

4.5 Extensible Types Compliance Mask

By default, Connext data serialization is not fully compliant with Extended CDR encoding because of bugs in the implementation of the standard. Most of these bugs do not break functional correctness, but they affect interoperability with compliant implementations from other vendors.

To make Connext compatible with the OMG 'Extensible and Dynamic Topic Types for DDS' specification, version 1.3, use the Extensible Types compliance mask. This mask allows you to set and unset serialization features in order to be compatible with the specification or to allow backward compatibility with previous Connext releases (and not be compatible with the specification).

Modify the compliance mask using the environment variable NDDS_XTYPES_COMPLIANCE_MASK or the Modern C++ function rti::config::compliance::set_xtypes_mask(), also available in Traditional C++ and C.

Note: For the Java, C#, and Python APIs, you must use the environment variable; APIs for modifying the compliance mask are not available in these languages.

There are two defined masks that you can set as the default mask:

  • rti::config::compliance::default_mask() (0x0000000C). This mask is enabled by default and is not compliant with the specification.
  • rti::config::compliance::vendor() (0x00000009). This mask is fully compliant with the specification.

You can also set and unset specific bits of the mask to enable or disable different serialization features:

  • rti::config::compliance::dheader_in_non_primitive_collections() (0x00000001)
    • When this bit is set, the serialization of sequences and arrays with non-primitive members includes a DHEADER.
    • This bit only applies to XCDR2 encapsulation.
    • To be compatible with the specification, this bit must be set.
    • This bit is unset by default.
  • rti::config::compliance::enum_as_primitive_in_collections() (0x00000002)
    • When this bit is set, enums are considered primitive types in collection types, meaning that a DHEADER will not be added to collections of enums if this bit is set.
    • This bit only applies to XCDR2 encapsulation.
    • To be compatible with the specification, this bit must be unset.
    • This bit is unset by default.
  • rti::config::compliance::parameter_length_with_padding() (0x00000004)
    • When this bit is set, the length of a member header in a mutable type member or an optional member includes the added padding that may follow the serialized member.
    • This bit only applies to XCDR (not XCDR2) encapsulation.
    • To be compatible with the specification, this bit must be unset.
    • This bit is set by default.
  • rti::config::compliance::encapsulation_options_with_padding() (0x00000008)
    • When this bit is set, Connext will set the padding bits in the options field of the encapsulation header of a serialized payload.
    • This bit applies to both XCDR and XCDR2.
    • This bit only applies to DataWriters.
    • To be compatible with the specification, this bit must be set.
    • This bit is set by default.
    • Not setting this bit may expose your application to bug CORE-9042 (see "What's Fixed in 7.3.0" (the Data Corruption section) in the RTI Connext Core Libraries Release Notes), in which a DataReader on a Topic using an appendable type may receive samples with an incorrect value.
    • Setting this bit will cause a Connext Professional DataWriter not to match a Connext Micro/Cert DataReader if both of the following conditions hold:
      • The Professional DataWriter is using XCDR2, or it is using XCDR and its type is not mutable.
      • Either of the following is true:
        • The Professional DataWriter type finishes with an optional member when using XCDR2. For example:

          struct Foo {
              long m1;
              @optional long m2;
          };

          Another example:

          struct Bar {
              long m1;
              @optional long m2;
          };

          struct Foo {
              long m1;
              Bar m2;
          };

        • The Professional DataWriter type finishes with a member with less than a 4-byte alignment (for example, a short or octet). For example:

          struct Foo {
              long m1;
              short m2;
          };

          Another example:

          struct Bar {
              long m1;
              short m2;
          };

          struct Foo {
              long m1;
              Bar m2;
          };

          If the last member of the type is an array or sequence, this condition (finishing with a member with less than a 4-byte alignment) applies to the element type of the sequence or array.

    The Micro/Cert versions affected by this compatibility problem when this bit is set are 2.4.12.z and lower, 2.4.13.1, 2.4.13.2, 2.4.13.3, 2.4.13.4, 2.4.13.5, 2.4.14.0, 2.4.14.1, 2.4.15.1, and 3.x.y.z.
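The relationship between the individual bits and the two predefined masks can be checked with simple bit arithmetic. The C++ sketch below uses local constants whose values are copied from the list above. It does not call the Connext API, and is_spec_compliant() is a hypothetical helper that encodes the "must be set" and "must be unset" statements for each bit.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of the bit values listed above (not the RTI API itself).
constexpr std::uint32_t kDheaderInNonPrimitiveCollections = 0x00000001;
constexpr std::uint32_t kEnumAsPrimitiveInCollections     = 0x00000002;
constexpr std::uint32_t kParameterLengthWithPadding       = 0x00000004;
constexpr std::uint32_t kEncapsulationOptionsWithPadding  = 0x00000008;

// The default mask is the union of the bits documented as set by default.
constexpr std::uint32_t kDefaultMask =
        kParameterLengthWithPadding | kEncapsulationOptionsWithPadding;
// The compliant mask is the union of the bits that must be set.
constexpr std::uint32_t kVendorMask =
        kDheaderInNonPrimitiveCollections | kEncapsulationOptionsWithPadding;

static_assert(kDefaultMask == 0x0000000C, "matches the documented default mask");
static_assert(kVendorMask == 0x00000009, "matches the documented compliant mask");

// True if every bit that must be set is set and every bit that must be
// unset is unset, according to the per-bit descriptions above.
inline bool is_spec_compliant(std::uint32_t mask) {
    const std::uint32_t must_be_set =
            kDheaderInNonPrimitiveCollections | kEncapsulationOptionsWithPadding;
    const std::uint32_t must_be_unset =
            kEnumAsPrimitiveInCollections | kParameterLengthWithPadding;
    return (mask & must_be_set) == must_be_set && (mask & must_be_unset) == 0;
}
```

The default mask differs from the compliant mask in two bits: it leaves 0x00000001 unset and sets 0x00000004.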

The mask will affect the data serialization and deserialization of the following:

  • DataReaders and DataWriters (for DomainParticipants created after the mask is set)
  • The rti::topic::from_cdr_buffer() and rti::topic::to_cdr_buffer() functions
  • FlatData builders

In general, applications should set the mask only once, before calling any other Connext API.

The environment variable will be automatically loaded when the DomainParticipantFactory is created. If you need to load it before that, use the Modern C++ function rti::config::compliance::load_compliance_masks(), also available in Traditional C++ and C.

The value of the environment variable is an unsigned integer, which can be written in hexadecimal notation. For example: 0x00000001.
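If an application needs to interpret such a value itself (for example, to log the configured mask), a parser along the following lines is enough. This C++ sketch is an assumption about one reasonable way to parse the value and does not describe how Connext parses it internally. parse_mask() is a hypothetical helper built on std::stoul with base 0, which accepts decimal and 0x-prefixed hexadecimal input but also treats a leading 0 as octal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Parses a mask written as an unsigned integer in decimal or hexadecimal
// (0x...) notation. Throws std::invalid_argument or std::out_of_range
// when the text is not a valid 32-bit unsigned value.
inline std::uint32_t parse_mask(const std::string& text) {
    if (text.empty() || text[0] == '-') {
        throw std::invalid_argument("mask must be an unsigned integer");
    }
    std::size_t consumed = 0;
    // Base 0 lets std::stoul detect the "0x" prefix automatically.
    const unsigned long value = std::stoul(text, &consumed, 0);
    if (consumed != text.size() || value > 0xFFFFFFFFul) {
        throw std::invalid_argument("not a 32-bit unsigned integer: " + text);
    }
    return static_cast<std::uint32_t>(value);
}
```

With this helper, "0x0000000C" and "12" both yield the same mask value.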

To obtain the current value of the compliance mask, use the function rti::config::compliance::get_xtypes_mask().

For more information on the compliance mask, see the Compliance Configuration section of the API Reference for your language (for example, the Modern C++ API Reference).

Note: The rti::config::compliance::dheader_in_non_primitive_collections() (0x00000001) bit also configures the FlatData language binding to be compliant with Extended CDR encoding version 2 (XCDR2), but with a limitation: the compliant encoding is applied to arrays and sequences of mutable types, but not final types. The serialization of final types containing arrays of non-primitive types in the FlatData language binding is not compliant with XCDR2 even if the bit is set.

4.5.1 Setting the Mask for Specific Connext Endpoints

To configure the behavior of the Extensible Types compliance mask for only specific endpoints (DataWriters and DataReaders), set the property dds.xtypes.compliance_mask on the endpoint or the DomainParticipant that owns the endpoint. When set on the DomainParticipant, the mask applies to all the endpoints created by the DomainParticipant.

The only bit supported in the mask set via the property is the rti::config::compliance::encapsulation_options_with_padding() bit. All other bits are ignored. The value set using the property supersedes the value set using rti::config::compliance::set_xtypes_mask() or the environment variable NDDS_XTYPES_COMPLIANCE_MASK.

The value of the property dds.xtypes.compliance_mask is an unsigned integer, which can be in HEX notation. For example: 0x00000001.
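One way to read the precedence rules in this section is sketched below in C++. This is an interpretation, not Connext code: it assumes that when the property is present, only the encapsulation_options_with_padding bit (0x00000008) is taken from the property value, and every other bit keeps the value of the global mask. effective_endpoint_mask() is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Local copy of the only bit honored by dds.xtypes.compliance_mask.
constexpr std::uint32_t kEncapsulationOptionsWithPadding = 0x00000008;

// Combines the global mask (set through the API or the environment
// variable) with the optional per-endpoint property value.
// Assumption: the property overrides only the supported bit.
inline std::uint32_t effective_endpoint_mask(std::uint32_t global_mask,
                                             bool has_property,
                                             std::uint32_t property_mask) {
    if (!has_property) {
        return global_mask;
    }
    return (global_mask & ~kEncapsulationOptionsWithPadding)
            | (property_mask & kEncapsulationOptionsWithPadding);
}
```

Under this reading, an endpoint with the property set to 0x00000001 and a global mask of 0x0000000C ends up with 0x00000004: the property clears the padding bit, and its 0x00000001 bit is ignored.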