1.1.2.4. Compression

The user-data compression feature allows a DataWriter to compress the samples that it sends to matching DataReaders. Only the user payload is compressed, not the RTPS protocol headers. You can configure this feature using the compression_settings in the DATA_REPRESENTATION QoS Policy. For more information, see the “Data Compression” section in DATA_REPRESENTATION QosPolicy, in the RTI Connext DDS Core Libraries User’s Manual.

The following tests were performed by running an RTI Perftest C++98 Publisher and Subscriber on two nodes connected to a switch via Ethernet. Compression was enabled with different combinations of settings, communication was restricted to a single interface, and the transport was set to UDPv4.

For latency and throughput tests, the compression threshold is set to the default value (8192 bytes), so compression is disabled for samples smaller than this value. We assume that compressing small samples usually does not improve latency or throughput, although it can still be beneficial for reducing bandwidth utilization.
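The threshold rule described above can be sketched as a simple size check. This is only an illustration of the wording in this section, not Connext DDS code: the helper name should_compress is hypothetical, and treating a sample of exactly 8192 bytes as eligible is an assumption based on the phrase "sizes lower than this value".

```shell
# Hypothetical helper: succeeds when a sample of the given size would be
# eligible for compression under the given threshold (in bytes).
should_compress() {
    size=$1
    threshold=${2:-8192}   # 8192 bytes is the default named in the text
    # Assumption: a sample exactly at the threshold counts as eligible.
    [ "$size" -ge "$threshold" ]
}

for size in 32 1024 8192 63000; do
    if should_compress "$size"; then
        echo "$size bytes: eligible for compression"
    else
        echo "$size bytes: sent uncompressed"
    fi
done
```

Passing 0 as the second argument makes every sample eligible, which matches the "try to compress all samples" setting used in the bandwidth test later in this section.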

The data used in these tests to measure the performance of the compression feature is text data taken from logs generated by Connext DDS in a real customer scenario. We parse the data and crop it into samples of different sizes. The gathered results show lower performance than the best achievable, but they are more realistic. Keep in mind that the type of data sent has a huge impact on the behavior of the compression algorithm.
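To illustrate how strongly the payload content affects compression, the following sketch compares repetitive, log-like text with random bytes of the same size. It uses the gzip command-line tool (DEFLATE, the same algorithm family as ZLIB) as a stand-in and does not exercise Connext DDS. The helper names and the log line are made up for this example.

```shell
# Hypothetical helper: print the number of bytes produced by compressing stdin.
compressed_size() {
    gzip -c | wc -c | tr -d ' '
}

# Hypothetical helper: emit 200 identical, made-up log lines.
make_text() {
    awk 'BEGIN { for (i = 0; i < 200; i++) print "DDS log: sample written to topic Example" }'
}

raw_size=$(make_text | wc -c | tr -d ' ')
text_size=$(make_text | compressed_size)

# The same number of random bytes, which are essentially incompressible.
random_size=$(head -c "$raw_size" /dev/urandom | compressed_size)

echo "repetitive text: $raw_size -> $text_size bytes"
echo "random bytes:    $raw_size -> $random_size bytes"
```

Real log data, like the data used in these tests, typically falls somewhere between these two extremes.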

The network used for these tests is a 1Gbps network. In this network, the compression feature stands out, since it can boost the user-data throughput beyond the hardware limit of 1Gbps.
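The reason throughput can exceed the link rate is that the link carries compressed bytes while throughput is measured in user data. A rough back-of-the-envelope sketch follows. The helper name and the 0.5 compressed fraction are hypothetical values chosen for illustration, not measurements, and the estimate ignores protocol headers and the CPU time spent compressing.

```shell
# Hypothetical helper: user-data throughput (Mbps) that fits on a link when
# each payload shrinks to the given fraction of its original size.
effective_throughput_mbps() {
    awk -v link="$1" -v frac="$2" 'BEGIN { printf "%.0f\n", link / frac }'
}

# A 1000 Mbps link with payloads compressed to half their size (assumed).
effective_throughput_mbps 1000 0.5
```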

Information about the hardware, network, and command-line parameters follows each of the tests.

Compression LZ4, Unkeyed, Best Effort, UDPv4 1Gbps, C++98

The graph below shows the one-way latency without load between a Publisher and a Subscriber running on two Linux nodes connected locally in a 1Gbps network.

The compression settings used for these tests are as follows:

  • compression_ids: LZ4

  • writer_compression_level: 5

  • writer_compression_threshold: 8192 (default)
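These three settings correspond to the -compressionId, -compressionLevel, and -compressionThreshold options that appear in the Perftest scripts in this section. A minimal sketch of building that option string from the values listed above follows; the helper name compression_args is hypothetical.

```shell
# Hypothetical helper: build the Perftest compression options from an
# algorithm id, a compression level, and a threshold in bytes.
compression_args() {
    echo "-compressionId $1 -compressionLevel $2 -compressionThreshold $3"
}

# Values taken from the settings list above.
args=$(compression_args LZ4 5 8192)
echo "$args"
```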

Detailed Statistics

The following table contains the raw numbers presented by RTI Perftest. These numbers are the exact output with no further processing.

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
32                         38       2.3        27        76        37        39        50           59             76
64                         46       2.9        36        86        47        48        55           61             86
128                        40       2.1        30       305        39        40        52           59            305
256                        46       3.9        37       103        45        51        56           65            103
512                        49       2.8        40        86        49        54        56           73             86
1024                       58       8.9        41       259        56        73        88           97            259
8192                      204      51.0        43       279       213       258       268          277            279
63000                     296      26.4        80       445       303       314       344          397            445
100000                    367      29.2       147       578       371       388       450          576            578
500000                   1256     188.9       545      1636      1301      1447      1584         1635           1636
1048576                  2396     275.4      1746      3780      2384      2748      2934         3780           3780
1548576                  3514     430.1      2704      4595      3400      4111      4361         4595           4595
4194304                  9893     728.8      8949     11488      9620     10965     11049        11488          11488
10485760                25539    1240.8     21331     29136     25074     27201     27270        29136          29136

Perftest Scripts

To produce these tests, we executed RTI Perftest for C++98. The exact commands used can be found here:

Publisher Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export exec_time=30
export nic=172.16.1.1
export pub_string="-pub \
        -transport UDPv4 \
        -nic $nic \
        -noPrint \
        -noOutputHeaders \
        -exec $exec_time \
        -noXML \
        -latencyTest \
        -loadDataFromFile /performance/validation/resources/messages \
        -compressionId LZ4 \
        -compressionThreshold 0 \
        -compressionLevel 5"

mkdir -p $output_folder

echo ">> UNKEYED BE COMPRESSION"
export my_file=$output_folder/lat_udpv4_pub_unkeyed_be_compression_lz4.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="taskset -c 0 \
    $executable -best -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> UNKEYED REL COMPRESSION"
export my_file=$output_folder/lat_udpv4_pub_unkeyed_rel_compression_lz4.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="taskset -c 0 \
    $executable -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done

Subscriber Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export nic=172.16.1.2
export sub_string="-sub \
        -transport UDPv4 \
        -nic $nic \
        -noPrint \
        -noOutputHeaders \
        -noXML \
        -compressionId LZ4 \
        -compressionThreshold 0 \
        -compressionLevel 5"

mkdir -p $output_folder

echo ">> UNKEYED BE COMPRESSION"
export my_file=$output_folder/lat_udpv4_sub_unkeyed_be_compression_lz4.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="taskset -c 0 \
    $executable -best $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> UNKEYED REL COMPRESSION"
export my_file=$output_folder/lat_udpv4_sub_unkeyed_rel_compression_lz4.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="taskset -c 0 \
    $executable $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done

Test Hardware

The following hardware was used to perform these tests:

Linux Nodes

Processor: Intel® Xeon® E-2186G 3.8GHz, 12M cache, 6C/12T, turbo (95W)
RAM: 16GB 2666MT/s DDR4 ECC UDIMM
NIC 1: Intel X550 Dual Port 10GbE BASE-T Adapter, PCIe Full Height
NIC 2: Intel Ethernet I350 Dual Port 1GbE BASE-T Adapter, PCIe Low Profile
OS: Ubuntu 18.04 -- gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

Switch

Dell Networking S4048T-ON, 48x 10GBASE-T and 6x 40GbE QSFP+ ports, IO to PSU air, 2x AC PSU, OS9

Compression ZLIB, Unkeyed, Best Effort, UDPv4 1Gbps, C++98

The graph below shows the one-way latency without load between a Publisher and a Subscriber running on two Linux nodes connected locally in a 1Gbps network.

The compression settings used for these tests are as follows:

  • compression_ids: ZLIB

  • writer_compression_level: 5

  • writer_compression_threshold: 8192 (default)

Detailed Statistics

The following table contains the raw numbers presented by RTI Perftest. These numbers are the exact output with no further processing.

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
32                         83       2.0        74       123        82        84        91          109            123
64                         87       2.0        79       135        87        88        95          112            135
128                        94       2.2        84       259        94        96       102          119            259
256                       107       2.7        90       157       107       109       115          132            157
512                       115       2.9        92       156       115       118       123          141            156
1024                      126       5.3        98       277       126       133       139          156            277
8192                      205      25.4       140       466       208       230       261          442            466
63000                     797     102.2       416      1023       832       883       922         1019           1023
100000                   1182     125.2       695      1458      1220      1303      1345         1452           1458
500000                   4872     452.1      3492      5831      4923      5370      5653         5831           5831
1048576                 10165     884.3      8647     11717     10210     11197     11347        11717          11717
1548576                 14945    1196.6     13241     17097     14979     16427     16672        17097          17097
4194304                 40349    2433.7     37344     44632     39245     43679     44221        44632          44632
10485760               102041    4799.5     97134    110572     99659    108887    110266       110572         110572

Perftest Scripts

To produce these tests, we executed RTI Perftest for C++98. The exact commands used can be found here:

Publisher Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export exec_time=30
export nic=172.16.1.1
export pub_string="-pub \
        -transport UDPv4 \
        -nic $nic \
        -noPrint \
        -noOutputHeaders \
        -exec $exec_time \
        -noXML \
        -latencyTest \
        -loadDataFromFile /performance/validation/resources/messages \
        -compressionId ZLIB \
        -compressionThreshold 0 \
        -compressionLevel 5"

mkdir -p $output_folder

echo ">> UNKEYED BE COMPRESSION"
export my_file=$output_folder/lat_udpv4_pub_unkeyed_be_compression_zlib.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="taskset -c 0 \
    $executable -best -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> UNKEYED REL COMPRESSION"
export my_file=$output_folder/lat_udpv4_pub_unkeyed_rel_compression_zlib.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="taskset -c 0 \
    $executable -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done

Subscriber Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export nic=172.16.1.2
export sub_string="-sub \
        -transport UDPv4 \
        -nic $nic \
        -noPrint \
        -noOutputHeaders \
        -noXML \
        -compressionId ZLIB \
        -compressionThreshold 0 \
        -compressionLevel 5"

mkdir -p $output_folder

echo ">> UNKEYED BE COMPRESSION"
export my_file=$output_folder/lat_udpv4_sub_unkeyed_be_compression_zlib.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="taskset -c 0 \
    $executable -best $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> UNKEYED REL COMPRESSION"
export my_file=$output_folder/lat_udpv4_sub_unkeyed_rel_compression_zlib.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="taskset -c 0 \
    $executable $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done

Test Hardware

The following hardware was used to perform these tests:

Linux Nodes

Processor: Intel® Xeon® E-2186G 3.8GHz, 12M cache, 6C/12T, turbo (95W)
RAM: 16GB 2666MT/s DDR4 ECC UDIMM
NIC 1: Intel X550 Dual Port 10GbE BASE-T Adapter, PCIe Full Height
NIC 2: Intel Ethernet I350 Dual Port 1GbE BASE-T Adapter, PCIe Low Profile
OS: Ubuntu 18.04 -- gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

Switch

Dell Networking S4048T-ON, 48x 10GBASE-T and 6x 40GbE QSFP+ ports, IO to PSU air, 2x AC PSU, OS9

Compression Bandwidth Savings

The following table contains the bandwidth savings gained by enabling compression with ZLIB, which is one of the algorithms with a higher compression ratio and the only one that can be used in combination with batching.

The compression settings used for these tests are as follows:

  • compression_ids: ZLIB

  • writer_compression_level: 10 (best compression possible, default setting)

  • writer_compression_threshold: 0 (try to compress all samples)

The data used for this test is the same as used in the previous latency and throughput tests. In this test, each sample has its own size.

This test uses the compression example available in the GitHub examples repository: https://github.com/rticommunity/rticonnextdds-examples/tree/develop/examples/connext_dds/compression.

Batching

Batching    NONE Compression (Mb)  ZLIB Compression (Mb)  Diff Saved (Mb)  Diff Saved (%)
Disabled                    70.59                  53.74            16.85           23.87
2 samples                   69.40                  37.08            32.32           46.57
5 samples                   69.26                  22.08            47.18           68.12
10 samples                  69.25                  17.02            52.23           75.42

These results show up to a 75% reduction in network bandwidth utilization by using ZLIB in combination with batching, in a real use-case scenario.
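The saved amount and percentage in the table above follow directly from the two measured columns: saved = NONE - ZLIB, and saved % = saved / NONE * 100. A small sketch that recomputes them with awk is shown below; the helper name savings is hypothetical, and the inputs are the values from the table.

```shell
# Hypothetical helper: print the bandwidth saved (Mb) and the saving (%)
# given the uncompressed and the ZLIB-compressed totals.
savings() {
    awk -v none="$1" -v zlib="$2" 'BEGIN {
        saved = none - zlib
        printf "%.2f %.2f\n", saved, saved / none * 100
    }'
}

savings 70.59 53.74   # batching disabled
savings 69.25 17.02   # 10 samples per batch
```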