Shared Memory

The following tests were performed by running the RTI Perftest C++98 benchmark application as two processes (a Publisher and a Subscriber) on the same node. Communication was configured to use the Shared Memory (SHMEM) transport.

The hardware, network, and command-line parameters used are listed after each test.

Unkeyed, Shared Memory, C++98

The graph below shows the one-way latency, without load, between a Publisher and a Subscriber running as two processes on a single node. Results are given for both best-effort and strict-reliable reliability settings.

Note

We report the median (50th percentile) rather than the average because it is a more stable measurement that is not skewed by spurious outliers. The average and other percentile values are also calculated and can be found in the Detailed Statistics section below.
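
To illustrate why the median is less sensitive to outliers than the average, here is a minimal shell sketch. The `mean` and `median` helpers and the sample values are made up for illustration; they are not part of RTI Perftest and are not taken from the measurements in this document.

```shell
# Hypothetical helpers (not part of RTI Perftest): mean and median of the
# latency values passed as arguments.

mean() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.1f\n", sum / NR }'
}

median() {
    printf '%s\n' "$@" | sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR % 2) printf "%.1f\n", v[(NR + 1) / 2]
            else        printf "%.1f\n", (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# Made-up sample: nine typical latencies plus one spurious outlier (microseconds).
samples="9 9 9 9 10 10 10 10 11 146"

mean $samples      # prints 23.3 (pulled up by the single outlier)
median $samples    # prints 10.0 (unaffected by the outlier)
```

A single 146 μs sample more than doubles the average, while the median stays at the typical value.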

Detailed Statistics

The following tables contain the raw numbers presented by RTI Perftest. These numbers are the exact output with no further processing.

  • Best Effort

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
                 32        10       0.5         9        34         9        10        11           17             29
                 64        10       0.6         9        36         9        10        11           17             29
                128        10       0.5         9        31         9        10        11           17             31
                256        10       0.5         9        36         9        10        11           17             31
                512        10       0.4         9        35         9        10        11           17             30
               1024        10       0.4         9        41         9        10        11           17             33
               8192        11       0.4        10        43        11        11        12           19             39
              63000        18       0.9        17        91        18        19        21           58             91

  • Reliable

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
                 32        11       1.2         9        37        11        14        18           21             35
                 64        11       1.1         9        39        11        14        16           22             35
                128        11       1.3         9        36        11        14        18           21             35
                256        11       1.2        10        38        11        14        16           22             37
                512        11       1.2         9        47        11        13        18           21             41
               1024        11       1.3        10       146        11        14        18           22             42
               8192        12       1.4        11        45        12        14        19           24             45
              63000        22       1.7        19        94        21        22        31           69             94
             100000        31       2.0        27       166        30        31        40          105            166
             500000       129      39.5        89       709       104       178       218          528            709
            1048576       357      85.6       184      1452       368       378       673         1015           1452
            1548576       571     112.0       306      2132       587       686       843         1508           2132
            4194304      1684     314.3      1154      5797      1866      1938      1978         5797           5797
           10485760      3923     384.8      3849     15913      3900      3930      3962        15913          15913


Perftest Scripts

To produce these results, we ran RTI Perftest for C++98. The exact commands used are listed below:

Publisher Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export exec_time=30
export pub_string="-pub \
        -transport SHMEM \
        -noPrint \
        -noOutputHeaders \
        -exec $exec_time \
        -noXML \
        -latencyTest"

mkdir -p $output_folder

echo ">> UNKEYED BE"
export my_file=$output_folder/lat_shmem_pub_unkeyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> UNKEYED REL"
export my_file=$output_folder/lat_shmem_pub_unkeyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="\
    $executable -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> KEYED BE"
export my_file=$output_folder/lat_shmem_pub_keyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -keyed -instances 100000 -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> KEYED REL"
export my_file=$output_folder/lat_shmem_pub_keyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -keyed -instances 100000 -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
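
The four loops above differ only in the output file, the reliability and keying flags, and the list of sample sizes. As a minimal sketch of that pattern (not the script that was actually run), the repeated part could be factored into helper functions. The names `build_pub_command` and `run_sweep` and the `./perftest_placeholder` path are hypothetical; only the command-line flags are taken from the script above.

```shell
# Sketch only: build_pub_command and run_sweep are hypothetical helpers,
# not part of RTI Perftest or of the original script.

executable=${executable:-./perftest_placeholder}   # placeholder path (assumption)
exec_time=${exec_time:-30}
pub_string="-pub -transport SHMEM -noPrint -noOutputHeaders -exec $exec_time -noXML -latencyTest"

# Print the publisher command line for one run.
#   $1: reliability/keying flags, may be empty (e.g. "-best")
#   $2: sample size in bytes
build_pub_command() {
    # Unquoted on purpose: word splitting drops the gap left by an empty $1.
    echo $executable $1 -datalen $2 $pub_string
}

# Run one sweep, appending the output of every run to a CSV file.
#   $1: output file, $2: flags, remaining arguments: sample sizes
run_sweep() {
    out_file=$1
    flags=$2
    shift 2
    for datalen in "$@"; do
        $(build_pub_command "$flags" "$datalen") >> "$out_file"
        sleep 3
    done
}

# Example: print the commands of a keyed best-effort sweep without running them.
for datalen in 32 64; do
    build_pub_command "-best -keyed -instances 100000" "$datalen"
done
```

With these helpers, the keyed best-effort loop would reduce to a single call such as `run_sweep "$output_folder/lat_shmem_pub_keyed_be.csv" "-best -keyed -instances 100000" 32 64 128 256 512 1024 8192 63000`.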

Subscriber Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export nic=172.16.0.2
export sub_string="-sub \
        -transport SHMEM \
        -nic $nic \
        -noPrint \
        -noOutputHeaders \
        -noXML"

mkdir -p $output_folder

echo ">> UNKEYED BE"
export my_file=$output_folder/lat_shmem_sub_unkeyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> UNKEYED REL"
export my_file=$output_folder/lat_shmem_sub_unkeyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="\
    $executable $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> KEYED BE"
export my_file=$output_folder/lat_shmem_sub_keyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -keyed -instances 100000 $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> KEYED REL"
export my_file=$output_folder/lat_shmem_sub_keyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -keyed -instances 100000 $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done

Test Hardware

The following hardware was used to perform these tests:

Linux Nodes

Processor: Intel® Xeon® E-2186G 3.8GHz, 12M cache, 6C/12T, turbo (95W)
RAM: 16GB 2666MT/s DDR4 ECC UDIMM
NIC 1: Intel X550 Dual Port 10GbE BASE-T Adapter, PCIe Full Height
NIC 2: Intel Ethernet I350 Dual Port 1GbE BASE-T Adapter, PCIe Low Profile
OS: Ubuntu 18.04 -- gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

Switch

Dell Networking S4048T-ON, 48x 10GBASE-T and 6x 40GbE QSFP+ ports, IO to PSU air, 2x AC PSU, OS9

Keyed, Shared Memory, C++98

The graph below shows the one-way latency, without load, between a Publisher and a Subscriber running as two processes on a single node. Results are given for both best-effort and strict-reliable reliability settings.

Note

We report the median (50th percentile) rather than the average because it is a more stable measurement that is not skewed by spurious outliers. The average and other percentile values are also calculated and can be found in the Detailed Statistics section below.

Detailed Statistics

The following tables contain the raw numbers presented by RTI Perftest. These numbers are the exact output with no further processing.

  • Best Effort

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
                 32        11       0.6        10       139        11        12        13           20             90
                 64        11       0.6        10       139        11        12        13           20             81
                128        11       0.6        10       141        11        12        13           20             80
                256        11       0.6        10       139        11        12        13           20             81
                512        11       0.7        11       196        11        12        13           20            139
               1024        12       0.7        11       140        11        12        13           20             80
               8192        12       0.7        11       141        12        13        14           21            141
              63000        20       0.9        19       150        20        21        23           31            150

  • Reliable

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
                 32        14       1.1        12       213        13        15        18           24            213
                 64        14       1.1        12       213        13        15        18           24            213
                128        14       1.1        12       213        13        15        18           24            213
                256        14       1.1        12       225        13        15        18           24            225
                512        14       1.2        12       213        13        15        18           24            213
               1024        13       1.1        13       215        13        15        18           25            215
               8192        15       1.1        13       217        15        16        19           26            217
              63000        24       2.0        21       231        24        25        38           44            231


Perftest Scripts

To produce these results, we ran RTI Perftest for C++98. The exact commands used are listed below:

Publisher Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export exec_time=30
export pub_string="-pub \
        -transport SHMEM \
        -noPrint \
        -noOutputHeaders \
        -exec $exec_time \
        -noXML \
        -latencyTest"

mkdir -p $output_folder

echo ">> UNKEYED BE"
export my_file=$output_folder/lat_shmem_pub_unkeyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> UNKEYED REL"
export my_file=$output_folder/lat_shmem_pub_unkeyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="\
    $executable -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> KEYED BE"
export my_file=$output_folder/lat_shmem_pub_keyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -keyed -instances 100000 -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> KEYED REL"
export my_file=$output_folder/lat_shmem_pub_keyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -keyed -instances 100000 -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done

Subscriber Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export nic=172.16.0.2
export sub_string="-sub \
        -transport SHMEM \
        -nic $nic \
        -noPrint \
        -noOutputHeaders \
        -noXML"

mkdir -p $output_folder

echo ">> UNKEYED BE"
export my_file=$output_folder/lat_shmem_sub_unkeyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> UNKEYED REL"
export my_file=$output_folder/lat_shmem_sub_unkeyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="\
    $executable $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> KEYED BE"
export my_file=$output_folder/lat_shmem_sub_keyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -keyed -instances 100000 $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> KEYED REL"
export my_file=$output_folder/lat_shmem_sub_keyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -keyed -instances 100000 $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done

Test Hardware

The following hardware was used to perform these tests:

Linux Nodes

Processor: Intel® Xeon® E-2186G 3.8GHz, 12M cache, 6C/12T, turbo (95W)
RAM: 16GB 2666MT/s DDR4 ECC UDIMM
NIC 1: Intel X550 Dual Port 10GbE BASE-T Adapter, PCIe Full Height
NIC 2: Intel Ethernet I350 Dual Port 1GbE BASE-T Adapter, PCIe Low Profile
OS: Ubuntu 18.04 -- gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

Switch

Dell Networking S4048T-ON, 48x 10GBASE-T and 6x 40GbE QSFP+ ports, IO to PSU air, 2x AC PSU, OS9