47.3 DATA_REPRESENTATION QosPolicy

The DATA_REPRESENTATION QosPolicy configures the form in which data is represented, or expected, on the wire. It indicates which versions (version 1 and version 2) of the Extended Common Data Representation (CDR) are offered and requested, as well as whether and how the data may be compressed, including which compression algorithm is offered and requested.

A DataWriter offers a single representation, which indicates the CDR version the DataWriter uses to serialize its data. A DataReader requests one or more representations, which indicate the CDR versions the DataReader accepts. If a DataWriter's offered representation is contained within a DataReader's sequence of requested representations, the offer satisfies the request and the policies are compatible. Otherwise, they are incompatible. See Table 47.6 DDS_DataRepresentationQosPolicy and 47.3.1 Data Representation for more information.
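For example, the offered and requested representations can be set in an XML QoS profile. In this sketch (the element names assume the standard Connext XML QoS profile schema; verify them against the API Reference HTML documentation), the DataWriter offers XCDR2 and the DataReader accepts either version, so the two match:

```xml
<!-- DataWriter: offers a single representation (XCDR2) -->
<datawriter_qos>
    <representation>
        <value>
            <element>XCDR2_DATA_REPRESENTATION</element>
        </value>
    </representation>
</datawriter_qos>

<!-- DataReader: requests both versions, so it matches DataWriters
     offering either XCDR or XCDR2 -->
<datareader_qos>
    <representation>
        <value>
            <element>XCDR_DATA_REPRESENTATION</element>
            <element>XCDR2_DATA_REPRESENTATION</element>
        </value>
    </representation>
</datareader_qos>
```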

A DataWriter also offers a single compression_ids value, which is the compression algorithm the DataWriter uses to compress data it sends to matching DataReaders. A DataReader requests zero or more compression algorithms. If a DataWriter offers a compression algorithm that is contained within the algorithms requested by the DataReader, the offer satisfies the request and the policies are compatible. Otherwise, they are incompatible. See Table 47.6 DDS_DataRepresentationQosPolicy and 47.3.2 Data Compression for more information.

The DATA_REPRESENTATION QosPolicy includes the members in Table 47.6 DDS_DataRepresentationQosPolicy. For defaults and valid ranges, please refer to the API Reference HTML documentation.

Table 47.6 DDS_DataRepresentationQosPolicy

Field: value
Type: DDS_DataRepresentationIdSeq
Description:

A sequence of two-byte signed integers corresponding to representation identifiers. The supported identifiers are DDS_XCDR_DATA_REPRESENTATION (Extensible CDR version 1), DDS_XCDR2_DATA_REPRESENTATION (Extensible CDR version 2), and DDS_AUTO_DATA_REPRESENTATION. An empty sequence is equivalent to a sequence with one DDS_XCDR_DATA_REPRESENTATION element. The default value, however, is a sequence with one DDS_AUTO_DATA_REPRESENTATION element.

For the plain language binding, the value DDS_AUTO_DATA_REPRESENTATION translates to DDS_XCDR_DATA_REPRESENTATION if the @allowed_data_representation annotation either is not specified or contains the value XCDR. Otherwise, it translates to DDS_XCDR2_DATA_REPRESENTATION. For the FlatData language binding, DDS_AUTO_DATA_REPRESENTATION translates to DDS_XCDR2_DATA_REPRESENTATION. (See 47.3.1 Data Representation for further explanation.)

For additional information on the @allowed_data_representation annotation, see Data Representation, in the RTI Connext Core Libraries Extensible Types Guide.

Field: compression_settings
Type: DDS_CompressionSettings_t
Description: Settings related to compressing user data:

  • compression_ids: A bitmask representing the compression algorithm IDs supported by the DataWriter or DataReader. The possible values are ZLIB, BZIP2, LZ4, MASK_NONE, and MASK_ALL.

    Only ZLIB is supported if the DataWriter uses both compression and batching. See 47.3.2 Data Compression.

    DataWriter creation will fail if more than one algorithm is provided on the DataWriter side (meaning that MASK_ALL is supported only in the DataReaderQos and TopicQos).

    Default: MASK_NONE (for DataWriterQos and TopicQos), MASK_ALL (for DataReaderQos)

  • writer_compression_level: The level of compression to use when compressing data. Valid values range from 0 to 10. It can be set only in the DataWriterQos or TopicQos. A lower compression level results in faster compression but a lower compression ratio; a higher level results in a better compression ratio but slower compression.

    Default: BEST_COMPRESSION (10)

  • writer_compression_threshold: The threshold, in bytes, above which a serialized sample is eligible to be compressed. Valid values range from 0 to LENGTH_UNLIMITED. It can be set only in the DataWriterQos or TopicQos.

    Any sample with a serialized size equal to or greater than the threshold is eligible to be compressed. The sample is stored and sent compressed on the wire only if the compressed size is smaller than the serialized size.

    Setting the threshold to LENGTH_UNLIMITED disables compression.

    Default: COMPRESSION_THRESHOLD_DEFAULT (8192 bytes). Note: COMPRESSION_THRESHOLD_DEFAULT is not a valid value in XML; it can be set only in code.

See 47.3.2 Data Compression for more details.
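As a sketch of how these fields fit together in an XML QoS profile (the element and constant names below assume the Connext XML QoS schema; verify them against the API Reference HTML documentation), a DataWriter might enable ZLIB compression for samples of 4096 bytes or larger:

```xml
<datawriter_qos>
    <representation>
        <compression_settings>
            <!-- The single algorithm this DataWriter uses -->
            <compression_ids>COMPRESSION_ID_ZLIB_BIT</compression_ids>
            <!-- 1 = fastest speed, 10 = best compression ratio -->
            <writer_compression_level>10</writer_compression_level>
            <!-- Samples serialized to 4096 bytes or more are
                 eligible for compression -->
            <writer_compression_threshold>4096</writer_compression_threshold>
        </compression_settings>
    </representation>
</datawriter_qos>
```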

47.3.1 Data Representation

You can view data representation as a two-step process:

  1. As described above, DDS_AUTO_DATA_REPRESENTATION translates to either DDS_XCDR_DATA_REPRESENTATION or DDS_XCDR2_DATA_REPRESENTATION, depending on a few factors. Alternatively, you can explicitly set the value to DDS_XCDR_DATA_REPRESENTATION or DDS_XCDR2_DATA_REPRESENTATION. If you let DDS_AUTO_DATA_REPRESENTATION set the value, the following table shows how it will be set, depending on your IDL:

     Table 47.7 How DDS_AUTO_DATA_REPRESENTATION Sets the Value

     If the IDL looks like:

         struct Point {
         };

     which is equivalent to:

         @allowed_data_representation(XCDR | XCDR2)
         struct Point {
         };

     the AUTO value translates to XCDR.

     If the IDL looks like:

         @allowed_data_representation(XCDR2)
         struct Point {
         };

     the AUTO value translates to XCDR2.

     If the IDL looks like:

         @language_binding(FLAT_DATA)
         struct Point {
         };

     the AUTO value translates to XCDR2.

  2. Once the value is set (either by DDS_AUTO_DATA_REPRESENTATION or explicitly by you), that value determines what the DataWriter writes or the DataReader reads. (Recall that the DataWriter offers one representation; the DataReader requests one or more representations.) The next step is how the DataWriter and DataReader match based on the QoS value. The QoS must be compatible between the DataWriter and the DataReader. The compatible combinations are shown in Table 47.8 Valid Reader/Writer Combinations of DataRepresentation.

     Table 47.8 Valid Reader/Writer Combinations of DataRepresentation

     DataWriter-offered          DataReader-requested
     DataRepresentation value    DataRepresentation values

     XCDR                        XCDR
     XCDR                        XCDR and XCDR2
     XCDR2                       XCDR2
     XCDR2                       XCDR and XCDR2

If this QosPolicy is set incompatibly, the ON_OFFERED_INCOMPATIBLE_QOS and ON_REQUESTED_INCOMPATIBLE_QOS statuses will be modified and the corresponding Listeners called for the DataWriter and DataReader respectively.

47.3.2 Data Compression

A DataReader with compression enabled can receive samples from DataWriters with or without compression, as well as from multiple DataWriters that use different compression algorithms. A DataWriter, however, cannot send compressed samples to some DataReaders and the same samples, uncompressed, to other DataReaders that do not support compression.

Table 47.9 Valid Reader/Writer Combinations of Compression IDs shows which DataWriters/DataReaders will match depending on their compression IDs:

Table 47.9 Valid Reader/Writer Combinations of Compression IDs

                     DataReader-requested compression_ids
DataWriter-offered                                                          MASK_ALL or any combination that
compression_ids      NONE          ZLIB          LZ4           BZIP2        includes the offered compression_ids

NONE                 compatible    compatible    compatible    compatible   compatible
ZLIB                 incompatible  compatible    incompatible  incompatible compatible
LZ4                  incompatible  incompatible  compatible    incompatible compatible
BZIP2                incompatible  incompatible  incompatible  compatible   compatible

Note: MASK_ALL is not a valid value for the DataWriter, which supports only one compression_ids value.

47.3.2.1 compression_ids

You can compare the compression algorithms (LZ4, zlib, and bzip2) by checking their compression ratios against their compression speeds. The compression ratio defines how much the data size is reduced. For example, a ratio of 2 means that the size of the data is reduced by half. The compression speed has a direct impact on the latency of the compressed data; the slower the speed, the higher the latency. Generally, the higher the compression ratio, the lower the speed; the higher the speed, the lower the compression ratio.

Table 47.10 Compression Algorithm References

compression_ids    Information

MASK_NONE          Default for DataWriterQos and TopicQos
LZ4                See https://github.com/lz4/lz4
ZLIB               See https://zlib.net/
BZIP2              See https://www.sourceware.org/bzip2/
MASK_ALL           Default for DataReaderQos

There are many benchmarking resources comparing various compression algorithms; one such resource is https://github.com/inikep/lzbench. LZ4 is considered the fastest of the three builtin algorithms, while zlib and bzip2 give the best compression ratios. Use LZ4 if you want to keep latency as low as possible while maintaining a decent compression ratio. Use zlib or bzip2 if a high compression ratio to reduce bandwidth usage matters more in your system than latency. The best choice among the three builtin algorithms depends on the type of data, the rate at which the data is sent, and your latency and bandwidth requirements, so understand the strengths and weaknesses of each builtin algorithm and perform benchmarking in your own system before choosing.

When you specify compression settings for a Topic, all DataWriters and DataReaders of that Topic inherit the Topic's compression settings. If you specify multiple compression algorithms for a Topic, a DataReader will use all of them; since a DataWriter can have only one algorithm enabled, it will choose one of them in the following order of preference: ZLIB, BZIP2, then LZ4.
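For instance, a Topic-level profile along these lines (a sketch; the mask constant names and the `|` mask-combination syntax are assumptions to verify against the API Reference HTML documentation) would give DataReaders both algorithms, while a DataWriter for the Topic would pick ZLIB, per the preference order above:

```xml
<topic_qos>
    <representation>
        <compression_settings>
            <!-- DataReaders inherit both algorithms; a DataWriter,
                 limited to one, chooses ZLIB (preferred over LZ4) -->
            <compression_ids>COMPRESSION_ID_ZLIB_BIT|COMPRESSION_ID_LZ4_BIT</compression_ids>
        </compression_settings>
    </representation>
</topic_qos>
```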

Notes:

  • When the serialize_key_with_dispose field in the 47.5 DATA_WRITER_PROTOCOL QosPolicy (DDS Extension) is enabled and a dispose message is sent, the serialized key is not compressed.
  • The only algorithm supported when compression and batching are enabled on the same DataWriter is ZLIB, because zlib is the only builtin algorithm that supports stream-based compression with acceptable performance. Stream-based compression allows Connext to compress and build the batch as samples are written into the batch. (LZ4 also supports stream-based compression, but with a high performance penalty, so RTI has decided not to support this mode in Connext.)
  • The combination of compression, batching, and data protection (via Security Plugins) is supported. See the "Interaction with Compression" section, in the RTI Security Plugins User's Manual for details.

47.3.2.2 writer_compression_level

Each level between 0 and 10 has trade-offs between compression ratio and compression speed, with 1 representing the fastest speed and lowest compression ratio and 10 representing the slowest speed and highest compression ratio. (0 disables compression.)

Connext also provides the following writer_compression_level values:

  • BEST_COMPRESSION. This value is the same as 10. With this value, Connext chooses the best compression level for the given algorithm.
  • BEST_SPEED. This value is the same as 1. With this value, Connext chooses the fastest compression speed for whatever algorithm is chosen.

BEST_COMPRESSION and BEST_SPEED do not vary dynamically depending on the algorithm and the size of the data. They have a strict one-to-one mapping to the algorithms' compression ratios/speeds as follows:

  • zlib

    writer_compression_level       zlib mapped value
    BEST_COMPRESSION = 10          level = 9
    BEST_SPEED = 1                 level = 1

    For the rest of the values, a linear normalization is applied, so any writer_compression_level value you enter in the range of 1 to 10 is translated to the range used by zlib between 1 and 9. See the zlib documentation for the compress2 function for more details on how the level parameter is used.

  • LZ4

    writer_compression_level       LZ4 mapped value
    BEST_COMPRESSION = 10          acceleration = 0
    BEST_SPEED = 1                 acceleration = 30

    For the rest of the values, a linear normalization is applied, so any writer_compression_level value you enter in the range of 1 to 10 is translated to the range used by LZ4 between 30 and 0. Although technically the acceleration value is unbounded, Connext sets the limit at 30; beyond that, no compression occurs in most cases. See the LZ4 documentation for the LZ4_compress_fast function for more details on how the acceleration parameter is used.

  • bzip2

    writer_compression_level       bzip2 mapped value
    BEST_COMPRESSION = 10          blockSize100k = 9
    BEST_SPEED = 1                 blockSize100k = 1

    For the rest of the values, a linear normalization is applied, so any writer_compression_level value you enter in the range of 1 to 10 is translated to the range used by bzip2 between 1 and 9. See the bzip2 documentation for the BZ2_bzBuffToBuffCompress function for more details on how the blockSize100k parameter is used.

47.3.2.3 writer_compression_threshold

Any sample with a serialized size equal to or greater than this threshold (see Table 47.6 DDS_DataRepresentationQosPolicy) is eligible to be compressed.

There are two scenarios in which a sample is not compressed, even with compression enabled on the DataWriter:

  • Any sample with a serialized size lower than the writer_compression_threshold is not compressed.

    If batching is enabled, a batch is not compressed if the maximum serialized size of the batch (the max_sample_serialized_size returned by the type plugin's get_serialized_sample_max_size(), multiplied by max_samples in the batch) is smaller than the writer_compression_threshold. See information about max_samples in 47.2 BATCH QosPolicy (DDS Extension).

  • If the compressed size is bigger than the sample's serialized size, the compressed sample is discarded and the original sample is sent instead.
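Since setting the threshold to LENGTH_UNLIMITED disables compression (see Table 47.6 DDS_DataRepresentationQosPolicy), a DataWriter that would otherwise inherit compression from its Topic can opt out like this (a sketch assuming the Connext XML QoS schema; verify element names against the API Reference HTML documentation):

```xml
<datawriter_qos>
    <representation>
        <compression_settings>
            <!-- LENGTH_UNLIMITED: no sample ever reaches the
                 threshold, so compression is disabled -->
            <writer_compression_threshold>LENGTH_UNLIMITED</writer_compression_threshold>
        </compression_settings>
    </representation>
</datawriter_qos>
```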

47.3.2.4 Connext Micro

Connext Micro does not interoperate with DataWriters that send compressed data.

47.3.2.5 Performance Considerations when Using Content Filtering and Compression

Samples are stored compressed in the DataWriter's queue. When a sample is written and there are matching DataReaders using ContentFilteredTopics, the DataWriter applies the filter and then compresses the sample. In some cases, a sample must be filtered again after it has already been compressed. This can happen, for example, when the DataWriter discovers a non-VOLATILE, late-joining DataReader with a ContentFilteredTopic, or when an existing DataReader issues a TopicQuery. If a filtering operation occurs on the DataWriter side after the sample is already compressed, the sample must be decompressed to apply the filter, increasing the latency for these requested samples. Note that in these scenarios the original compressed sample is retained, so a sample is never compressed twice: Connext decompresses the sample into a separate buffer, performs the filtering, and then either sends or does not send the compressed sample.

47.3.2.6 Using Compression with FlatData language binding and Zero Copy Transfer over Shared Memory

See FlatData's section 34.1.4.2.4 Interactions with RTI Security Plugins and Compression for notes about interactions with the FlatData language binding.

See Zero Copy's section 34.1.5.1.5 Interactions with RTI Security Plugins and Compression for information about interactions with Zero Copy transfer over shared memory.

47.3.3 Properties

This QosPolicy cannot be modified after the Entity has been enabled.

47.3.4 Applicable Entities