Shared Memory

The following tests were performed by running two instances of the RTI Perftest C++98 benchmark application, a Publisher and a Subscriber, on the same node. Communication was configured to use the Shared Memory (SHMEM) transport.

Information about the hardware, network, and command-line parameters follows each test.

Unkeyed, Shared Memory, C++98

The graph below shows the one-way latency, without load, between a Publisher and a Subscriber running as two processes on a single node. Results are given for both best-effort and strict-reliable reliability.

Note

We use the median (50th percentile) instead of the average to get a more stable measurement that is not skewed by spurious outliers. We also calculate the average and other percentile values, which can be seen in the Detailed Statistics section below.
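To illustrate why the median is less sensitive to outliers than the average, here is a minimal sketch. The helper names `mean` and `median` and the sample values are hypothetical and chosen only for illustration; they are not part of RTI Perftest and are not measured data.

```shell
# Hypothetical helpers: each reads one latency value (in microseconds) per line on stdin.

# Arithmetic mean, printed with one decimal place.
mean() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.1f\n", sum / n }'
}

# Median (50th percentile): sort numerically, then take the middle value,
# or the average of the two middle values when the count is even.
median() {
    sort -n | awk '{ v[NR] = $1 }
        END {
            if (NR == 0) exit
            if (NR % 2) print v[(NR + 1) / 2]
            else print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# Illustrative samples (not measured data): four typical values and one outlier.
samples='9
9
10
9
95'

printf '%s\n' "$samples" | mean     # prints 26.4
printf '%s\n' "$samples" | median   # prints 9
```

A single 95 μs outlier nearly triples the mean, while the median stays at the typical value.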

Detailed Statistics

The following tables contain the raw numbers presented by RTI Perftest. These numbers are the exact output with no further processing.

  • Best Effort

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
32                   9         0.6       8         95        9         9         11        19           95
64                   9         0.8       8         95        9         10        11        18           95
128                  9         1.0       8         97        9         11        14        21           97
256                  9         1.0       8         49        9         9         14        20           49
512                  9         0.6       8         86        9         9         11        18           86
1024                 9         1.1       8         97        9         10        14        20           97
2048                 9         0.8       9         36        9         10        12        20           36
4096                 10        0.9       9         200       10        10        12        20           200
8192                 10        0.8       9         42        10        10        16        20           42
16384                11        1.0       10        55        11        12        18        27           55
32768                14        1.4       12        80        13        16        20        38           80
63000                18        1.3       16        168       18        18        25        59           168

  • Reliable

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
32                   11        2.0       9         42        10        14        19        34           42
64                   11        2.2       9         42        10        15        18        34           42
128                  11        2.0       9         39        10        15        18        33           39
256                  11        2.0       9         48        10        14        18        33           48
512                  11        2.0       9         42        11        12        19        35           42
1024                 11        2.0       9         44        11        15        19        34           44
2048                 12        2.1       9         103       11        15        19        35           103
4096                 12        2.1       10        53        11        13        20        35           53
8192                 12        1.8       10        46        12        12        20        34           46
16384                14        2.3       12        88        13        17        24        36           88
32768                16        2.1       14        307       16        18        25        43           307
63000                22        2.2       18        103       21        23        32        69           103
100000               35        6.3       27        190       30        42        57        107          190
500000               114       31.3      84        713       104       174       205       555          713
1048576              420       108.0     192       1433      367       552       580       1160         1433
1548576              637       169.8     311       2081      597       862       912       1722         2081
4194304              1925      412.6     1171      5540      1901      2445      2535      5540         5540
10485760             4716      752.7     4123      15317     3900      5494      5542      15317        15317
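As a rough sketch of how the median column could be pulled out of such raw output, the helper below reads comma-separated rows and prints the sample size and the 50% value. It assumes that each row is comma-separated and follows the column order of the tables above (size, avg, std, min, max, 50%, ...); that file layout is an assumption, not something stated on this page. The function name `size_and_median` is hypothetical.

```shell
# Hypothetical helper: print "<sample size> <median>" for each CSV row on stdin.
# Assumed column order (from the tables above): size, avg, std, min, max, 50%, ...
size_and_median() {
    # Strip spaces so that "32, 9, ..." and "32,9,..." are handled alike,
    # skip empty lines, then print column 1 (size) and column 6 (50%).
    awk -F',' '{ gsub(/ /, ""); if ($1 != "") print $1, $6 }'
}

# Two rows copied from the unkeyed best-effort table.
printf '%s\n' \
    '32,9,0.6,8,95,9,9,11,19,95' \
    '63000,18,1.3,16,168,18,18,25,59,168' | size_and_median
# prints:
# 32 9
# 63000 18
```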


Perftest Scripts

To produce these results, we executed RTI Perftest for C++98. The exact commands used are listed below:

Publisher Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export exec_time=30
export pub_string="-pub \
        -transport SHMEM \
        -noPrint \
        -noOutputHeaders \
        -exec $exec_time \
        -noXML\
        -latencyTest"

mkdir -p $output_folder

echo ">> UNKEYED BE"
export my_file=$output_folder/lat_shmem_pub_unkeyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> UNKEYED REL"
export my_file=$output_folder/lat_shmem_pub_unkeyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="\
    $executable -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> KEYED BE"
export my_file=$output_folder/lat_shmem_pub_keyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -keyed -instances 100000 -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> KEYED REL"
export my_file=$output_folder/lat_shmem_pub_keyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -keyed -instances 100000 -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done

Subscriber Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export nic=172.16.0.2
export sub_string="-sub \
        -transport SHMEM \
        -nic $nic \
        -noPrint \
        -noOutputHeaders \
        -noXML"

mkdir -p $output_folder

echo ">> UNKEYED BE"
export my_file=$output_folder/lat_shmem_sub_unkeyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> UNKEYED REL"
export my_file=$output_folder/lat_shmem_sub_unkeyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="\
    $executable $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> KEYED BE"
export my_file=$output_folder/lat_shmem_sub_keyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -keyed -instances 100000 $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> KEYED REL"
export my_file=$output_folder/lat_shmem_sub_keyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -keyed -instances 100000 $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
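
The scripts above assemble each command line from a fixed set of flags plus a few per-scenario options (`-best`, `-keyed -instances 100000`, `-datalen`). The following sketch shows one way that assembly could be previewed as a dry run, printing the publisher command lines without executing anything. The function name `build_pub_command` and the path `./perftest_cpp` are hypothetical placeholders; only the flags themselves are taken from the publisher script.

```shell
# Hypothetical helper: print the publisher command line for one test case.
#   $1 = path to the Perftest executable (placeholder in this sketch)
#   $2 = sample size in bytes
#   remaining arguments = scenario flags such as -best or -keyed -instances 100000
build_pub_command() {
    executable=$1
    datalen=$2
    shift 2
    # Fixed flags copied from pub_string in the publisher script (exec_time=30).
    printf '%s\n' "$executable $* -datalen $datalen -pub -transport SHMEM -noPrint -noOutputHeaders -exec 30 -noXML -latencyTest"
}

# Dry run: show the first two keyed best-effort commands without running them.
for datalen in 32 64; do
    build_pub_command ./perftest_cpp "$datalen" -best -keyed -instances 100000
done
```

Printing the commands first makes it easy to confirm which flags each scenario receives before starting a full run.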

Test Hardware

The following hardware was used to perform these tests:

Linux Nodes

Processor: Intel® Xeon® E-2186G 3.8GHz, 12M cache, 6C/12T, turbo (95W)
RAM: 16GB 2666MT/s DDR4 ECC UDIMM
NIC 1: Intel X550 Dual Port 10GbE BASE-T Adapter, PCIe Full Height
NIC 2: Intel Ethernet I350 Dual Port 1GbE BASE-T Adapter, PCIe Low Profile
OS: Ubuntu 18.04 -- gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

Switch

Dell Networking S4048T-ON, 48x 10GBASE-T and 6x 40GbE QSFP+ ports, IO to PSU air, 2x AC PSU, OS9

Keyed, Shared Memory, C++98

The graph below shows the one-way latency, without load, between a Publisher and a Subscriber running as two processes on a single node. Results are given for both best-effort and strict-reliable reliability.

Note

We use the median (50th percentile) instead of the average to get a more stable measurement that is not skewed by spurious outliers. We also calculate the average and other percentile values, which can be seen in the Detailed Statistics section below.

Detailed Statistics

The following tables contain the raw numbers presented by RTI Perftest. These numbers are the exact output with no further processing.

  • Best Effort

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
32                   11        1.0       10        210       11        11        18        21           210
64                   12        1.7       10        206       11        14        18        24           206
128                  11        1.1       10        206       11        12        16        24           206
256                  12        1.1       10        206       11        13        14        21           206
512                  12        1.3       10        221       11        13        15        22           221
1024                 11        0.9       10        206       11        12        15        22           206
2048                 11        0.9       10        206       11        12        14        21           206
4096                 12        1.1       11        209       12        12        18        22           209
8192                 12        1.2       11        220       12        13        19        25           220
16384                14        1.1       12        219       13        14        20        25           219
32768                16        1.1       14        212       16        16        22        27           212
63000                20        1.5       18        221       20        21        27        33           221

  • Reliable

Sample Size (Bytes)  Avg (μs)  Std (μs)  Min (μs)  Max (μs)  50% (μs)  90% (μs)  99% (μs)  99.99% (μs)  99.9999% (μs)
32                   13        2.0       12        212       13        14        21        38           212
64                   14        2.4       11        209       13        16        22        37           209
128                  13        2.2       11        216       13        15        21        37           216
256                  13        1.9       11        217       13        15        22        35           217
512                  14        2.2       12        319       13        15        22        38           319
1024                 16        2.8       12        212       13        17        24        38           212
2048                 13        1.6       12        220       13        14        21        37           220
4096                 15        2.5       12        216       14        17        23        38           216
8192                 14        1.7       13        220       14        15        22        39           220
16384                16        1.8       14        223       16        17        24        40           223
32768                18        1.4       16        228       18        19        26        38           228
63000                25        2.6       21        232       24        26        39        53           232


Perftest Scripts

To produce these results, we executed RTI Perftest for C++98. The exact commands used are listed below:

Publisher Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export exec_time=30
export pub_string="-pub \
        -transport SHMEM \
        -noPrint \
        -noOutputHeaders \
        -exec $exec_time \
        -noXML\
        -latencyTest"

mkdir -p $output_folder

echo ">> UNKEYED BE"
export my_file=$output_folder/lat_shmem_pub_unkeyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> UNKEYED REL"
export my_file=$output_folder/lat_shmem_pub_unkeyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="\
    $executable -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> KEYED BE"
export my_file=$output_folder/lat_shmem_pub_keyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -keyed -instances 100000 -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done
sleep 5;

echo ">> KEYED REL"
export my_file=$output_folder/lat_shmem_pub_keyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -keyed -instances 100000 -datalen $DATALEN $pub_string"
    echo $command
    $command >> $my_file;
    sleep 3;
done

Subscriber Side

sudo /set_lat_mode.sh

echo EXECUTABLE IS $1
export executable=$1

echo OUTPUT PATH IS $2
export output_folder=$2

export nic=172.16.0.2
export sub_string="-sub \
        -transport SHMEM \
        -nic $nic \
        -noPrint \
        -noOutputHeaders \
        -noXML"

mkdir -p $output_folder

echo ">> UNKEYED BE"
export my_file=$output_folder/lat_shmem_sub_unkeyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> UNKEYED REL"
export my_file=$output_folder/lat_shmem_sub_unkeyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000 100000 500000 1048576 1548576 4194304 10485760; do
    export command="\
    $executable $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> KEYED BE"
export my_file=$output_folder/lat_shmem_sub_keyed_be.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -best -keyed -instances 100000 $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done
sleep 5;

echo ">> KEYED REL"
export my_file=$output_folder/lat_shmem_sub_keyed_rel.csv
touch $my_file
for DATALEN in 32 64 128 256 512 1024 8192 63000; do
    export command="\
    $executable -keyed -instances 100000 $sub_string -datalen $DATALEN"
    echo $command
    $command >> $my_file;
    sleep 10;
done

Test Hardware

The following hardware was used to perform these tests:

Linux Nodes

Processor: Intel® Xeon® E-2186G 3.8GHz, 12M cache, 6C/12T, turbo (95W)
RAM: 16GB 2666MT/s DDR4 ECC UDIMM
NIC 1: Intel X550 Dual Port 10GbE BASE-T Adapter, PCIe Full Height
NIC 2: Intel Ethernet I350 Dual Port 1GbE BASE-T Adapter, PCIe Low Profile
OS: Ubuntu 18.04 -- gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

Switch

Dell Networking S4048T-ON, 48x 10GBASE-T and 6x 40GbE QSFP+ ports, IO to PSU air, 2x AC PSU, OS9