1.1.4. Memory Performance

This document describes memory usage for RTI Connext DDS Professional 6.1.1. The goal is to provide an idea of how much memory is used by the libraries and the DDS entities. This document does not provide exact formulas and numbers for every possible configuration of the software.

1.1.4.1. Platforms and Libraries

Our measurements have been gathered for the following platforms using a custom C++ benchmark application:

Platform    Heap Usage    Libraries in Memory    Minimum Thread Stack Size
--------    ----------    -------------------    -------------------------
Linux       X             X                      X
Windows     X             X
Mac         X             X

  • Linux: x64 CentOS 7.0 using RTI Connext DDS release target libraries for x64Linux3gcc4.8.2.

  • Windows®: Windows 10 (64-bit) using RTI Connext DDS release target libraries for x64Win64VS2017.

  • macOS®: macOS 10.15 using RTI Connext DDS release target libraries for x64Darwin17clang9.0.

1.1.4.2. Program Memory

These numbers reflect the memory required to load the dynamic libraries when executing code that includes RTI Connext DDS Professional.

For Linux and macOS, we measured the text segment size by running the size command on each library. For Windows, we used the mapped size reported by Process Explorer.

Library / Architecture (sizes in bytes)         x86 Linux             x64 Linux             macOS                   Windows x64
                                                (i86Linux3gcc4.8.2)   (x64Linux3gcc4.8.2)   (x64Darwin17clang9.0)   (x64Win64VS2017)
----------------------------------------------  -------------------   -------------------   ---------------------   ----------------
libnddscpp (Traditional C++ Binding Library)    1602284               1531692               1634304                 1036288
libnddscpp2 (Modern C++ Binding Library)        1138523               1156837               1228800                 1495040
libnddsc (C Binding Library)                    6537754               6186742               6017024                 5230592
libnddscore (Core Functionalities Library)      6889213               6549867               6459392                 5074944

Note

The libraries tested are the release versions. These can be found in the <$NDDSHOME>/lib/<$ARCHITECTURE> folder.

1.1.4.3. RTI Threads

This section provides the default and minimum stack size for all the different threads created by the middleware. This includes the following threads:

  • Database thread

  • Event thread

  • Receive threads

  • Asynchronous publishing thread

  • Batching thread

The actual number of threads created by the middleware depends on the configuration of several QoS policies, such as the ASYNCHRONOUS_PUBLISHER or BATCH QoS policies.

By default, the stack size value assigned to each thread depends on the platform and OS. This value can be modified by updating the thread stack size QoS value, but a minimum value is required.

Thread                            Default Stack Size    Minimum Stack Size
------                            ------------------    ------------------
User thread                       OS default            30.4 kB
Database thread                   OS default            7.6 kB
Event thread                      OS default            18.0 kB
Receive thread                    OS default            11.6 kB
Asynchronous publishing thread    OS default            7.5 kB
Batch thread                      OS default            7.5 kB

Note

The Minimum Stack Size value refers to the minimum stack size needed for a given thread. This value assumes no user-specific stack space is needed; therefore, if the user adds any data on the thread’s stack, that size must be taken into account.

Note

On Linux, the OS default can be obtained by invoking the ulimit -s command. On the CentOS 7 machines we used, this size was 10240 kB.
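The stack size of a middleware thread can be changed through the thread settings of the corresponding QoS policy. The following sketch raises the event and receive thread stack sizes in a DomainParticipant XML QoS profile; the element names follow the RTI XML QoS schema, and the 65536-byte value is purely illustrative:

  <domain_participant_qos>
    <event>
      <thread>
        <stack_size>65536</stack_size>
      </thread>
    </event>
    <receiver_pool>
      <thread>
        <stack_size>65536</stack_size>
      </thread>
    </receiver_pool>
  </domain_participant_qos>

Whatever value is chosen must stay at or above the minimum stack size listed in the table for that thread.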

The different thread types in Connext DDS are:
  • Database Thread (also referred to as the database cleanup thread) is created to garbage-collect records related to deleted entities from the in-memory database used by the middleware. There is one database thread per DomainParticipant.

  • Event Thread handles all timed events, including checking for timeouts and deadlines as well as sending periodic heartbeats and repair traffic. There is one event thread per DomainParticipant.

  • Receive Threads are used to receive and process the data from the installed transports. There is one receive thread per (transport, receive port) pair. When using the builtin UDPv4 and SHMEM transports (with the default configuration), Connext DDS creates five receive threads:

    For discovery:
    • Two for unicast (one for UDPv4, one for SHMEM)

    • One for multicast (for UDPv4)

    For user data:
    • Two for unicast (one for UDPv4, one for SHMEM)

  • Asynchronous Publishing Thread handles data transmission when asynchronous publishing is enabled in a DataWriter. There is one asynchronous publishing thread per Publisher; it is created only if at least one DataWriter in the Publisher enables asynchronous publishing.

  • Batch Thread handles the asynchronous flushing of a batch when batching is enabled in a DataWriter and flush_period is set to a value other than DDS_DURATION_INFINITE. There is one batch thread per Publisher; it is created only if at least one DataWriter in the Publisher enables batching with a finite flush_period.
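As a sketch, the last two threads above are triggered by the following DataWriter QoS settings (two independent examples; element names follow the RTI XML QoS schema, and the one-second flush period is illustrative):

  <!-- Creates the asynchronous publishing thread in the Publisher -->
  <datawriter_qos>
    <publish_mode>
      <kind>ASYNCHRONOUS_PUBLISH_MODE_QOS</kind>
    </publish_mode>
  </datawriter_qos>

  <!-- Creates the batch thread: batching enabled with a finite flush period -->
  <datawriter_qos>
    <batch>
      <enable>true</enable>
      <flush_period>
        <sec>1</sec>
        <nanosec>0</nanosec>
      </flush_period>
    </batch>
  </datawriter_qos>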


1.1.4.4. RTI Transports

This section provides the memory allocated by the OS for the builtin transports UDPv4, UDPv6, and SHMEM, using the default QoS settings.

When using UDPv4 with the default configuration, Connext DDS uses the following for each new DomainParticipant created:

  • One receive socket to receive Unicast-Discovery data

  • One receive socket to receive Multicast-Discovery data

  • One receive socket to receive Unicast-UserData data

  • One socket to send Unicast data

  • N sockets to send Multicast-Discovery data, where N is the number of multicast interfaces on the host

The ports assigned to the receive sockets depend on the domain ID and the participant ID. The same number of sockets is opened when using UDPv6.
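The port mapping can be adjusted through the WIRE_PROTOCOL QoS policy. The sketch below shows the well-known port parameters in XML with the DDS interoperability default values; the element names follow the RTI XML QoS schema, and the defaults should be verified against your version's documentation:

  <domain_participant_qos>
    <wire_protocol>
      <rtps_well_known_ports>
        <port_base>7400</port_base>
        <domain_id_gain>250</domain_id_gain>
        <participant_id_gain>2</participant_id_gain>
      </rtps_well_known_ports>
    </wire_protocol>
  </domain_participant_qos>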

Size of the buffers

The receive and send socket buffer size can be configured by modifying the transport QoS settings. By default, these values are as follows:

Buffer Size       UDPv4           UDPv6
-----------       -----           -----
Receive socket    131072 bytes    131072 bytes
Send socket       131072 bytes    131072 bytes
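These buffer sizes can be changed through transport properties in the DomainParticipant QoS. A sketch for UDPv4 follows; the property names follow the RTI builtin transport property naming, and the 256 KB value is illustrative:

  <domain_participant_qos>
    <property>
      <value>
        <element>
          <name>dds.transport.UDPv4.builtin.recv_socket_buffer_size</name>
          <value>262144</value>
        </element>
        <element>
          <name>dds.transport.UDPv4.builtin.send_socket_buffer_size</name>
          <value>262144</value>
        </element>
      </value>
    </property>
  </domain_participant_qos>

Note that the OS may silently cap socket buffer sizes below the requested value unless kernel limits are also raised.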

1.1.4.5. Heap Usage of Connext DDS Entities

RTI has designed and implemented a benchmark application that measures the memory directly allocated by the middleware using malloc(). In addition to this heap memory, the Connext DDS libraries also request other memory from the OS, such as the thread stacks (see RTI Threads), the transport socket buffers (see RTI Transports), and the memory needed to load the dynamic libraries (see Program Memory).

All the memory allocated by the OS can be tuned using QoS parameters or DDS transport properties.

The following tables report the average heap allocation for the different DDS entities that can be used in a Connext DDS application.

The amount of memory required for an entity depends on the value of different QoS policies. For this benchmark, RTI has used a QoS profile that minimizes the memory usage. The profile is provided in a separate XML file and is described in Minimum QoS Settings.

Entity                 Size (Bytes)
------                 ------------
Participant Factory    63480
Participant            1945234
Type                   1449
Topic                  1950
Subscriber             9585
Publisher              3825
DataReader             71688
DataWriter             41885
Instance               486
Sample                 1358
Remote Readers         7019
Remote Writers         15429
Reader Instance        888
Reader Samples         917
Remote Participant     77005

The memory reported for samples and instances does not include the user data, only the meta-data.

Note

To efficiently manage the creation and deletion of DDS entities and samples, Connext DDS implements its own memory manager. The memory manager allocates and manages multiple buffers to avoid continuous memory allocation. Therefore, the memory growth does not necessarily follow linearly with the creation of DDS entities and samples. The pre-allocation scheme of the memory manager is configurable.

1.1.4.6. Minimum QoS Settings

To obtain the results above, we used the MinimalMemoryFootPrint profile, included in the builtin profiles. This profile minimizes memory usage and is shown below:

<qos_profile name="MinimalMemoryFootPrint">

  <domain_participant_qos>

    <transport_builtin>
      <mask>UDPv4</mask>
    </transport_builtin>

    <discovery_config>
      <publication_reader_resource_limits>
        <initial_samples>1</initial_samples>
        <max_samples>LENGTH_UNLIMITED</max_samples>
        <max_samples_per_read>1</max_samples_per_read>
        <dynamically_allocate_fragmented_samples>true</dynamically_allocate_fragmented_samples>
        <initial_infos>1</initial_infos>
        <initial_outstanding_reads>1</initial_outstanding_reads>
        <initial_fragmented_samples>1</initial_fragmented_samples>
      </publication_reader_resource_limits>
      <subscription_reader_resource_limits>
        <initial_samples>1</initial_samples>
        <max_samples>LENGTH_UNLIMITED</max_samples>
        <max_samples_per_read>1</max_samples_per_read>
        <dynamically_allocate_fragmented_samples>true</dynamically_allocate_fragmented_samples>
        <initial_infos>1</initial_infos>
        <initial_outstanding_reads>1</initial_outstanding_reads>
        <initial_fragmented_samples>1</initial_fragmented_samples>
      </subscription_reader_resource_limits>
      <participant_reader_resource_limits>
        <initial_samples>1</initial_samples>
        <max_samples>LENGTH_UNLIMITED</max_samples>
        <max_samples_per_read>1</max_samples_per_read>
        <dynamically_allocate_fragmented_samples>true</dynamically_allocate_fragmented_samples>
        <initial_infos>1</initial_infos>
        <initial_outstanding_reads>1</initial_outstanding_reads>
        <initial_fragmented_samples>1</initial_fragmented_samples>
      </participant_reader_resource_limits>
    </discovery_config>

    <resource_limits>
      <transport_info_list_max_length>0</transport_info_list_max_length>
      <local_writer_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </local_writer_allocation>
      <local_reader_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </local_reader_allocation>
      <local_publisher_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </local_publisher_allocation>
      <local_subscriber_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </local_subscriber_allocation>
      <local_topic_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </local_topic_allocation>
      <remote_writer_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </remote_writer_allocation>
      <remote_reader_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </remote_reader_allocation>
      <remote_participant_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </remote_participant_allocation>
      <matching_writer_reader_pair_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </matching_writer_reader_pair_allocation>
      <matching_reader_writer_pair_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </matching_reader_writer_pair_allocation>
      <ignored_entity_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </ignored_entity_allocation>
      <content_filter_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </content_filter_allocation>
      <content_filtered_topic_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </content_filtered_topic_allocation>
      <read_condition_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </read_condition_allocation>
      <query_condition_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </query_condition_allocation>
      <outstanding_asynchronous_sample_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>1</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </outstanding_asynchronous_sample_allocation>
      <flow_controller_allocation>
        <incremental_count>1</incremental_count>
        <initial_count>4</initial_count>
        <max_count>LENGTH_UNLIMITED</max_count>
      </flow_controller_allocation>

      <local_writer_hash_buckets>1</local_writer_hash_buckets>
      <local_reader_hash_buckets>1</local_reader_hash_buckets>
      <local_publisher_hash_buckets>1</local_publisher_hash_buckets>
      <local_subscriber_hash_buckets>1</local_subscriber_hash_buckets>
      <local_topic_hash_buckets>1</local_topic_hash_buckets>
      <remote_writer_hash_buckets>1</remote_writer_hash_buckets>
      <remote_reader_hash_buckets>1</remote_reader_hash_buckets>
      <remote_participant_hash_buckets>1</remote_participant_hash_buckets>
      <matching_reader_writer_pair_hash_buckets>1</matching_reader_writer_pair_hash_buckets>
      <matching_writer_reader_pair_hash_buckets>1</matching_writer_reader_pair_hash_buckets>
      <ignored_entity_hash_buckets>1</ignored_entity_hash_buckets>
      <content_filter_hash_buckets>1</content_filter_hash_buckets>
      <content_filtered_topic_hash_buckets>1</content_filtered_topic_hash_buckets>
      <flow_controller_hash_buckets>1</flow_controller_hash_buckets>

      <max_gather_destinations>16</max_gather_destinations>

      <participant_user_data_max_length>8</participant_user_data_max_length>
      <topic_data_max_length>0</topic_data_max_length>
      <publisher_group_data_max_length>0</publisher_group_data_max_length>
      <subscriber_group_data_max_length>0</subscriber_group_data_max_length>

      <writer_user_data_max_length>16</writer_user_data_max_length>
      <reader_user_data_max_length>16</reader_user_data_max_length>

      <max_partitions>0</max_partitions>
      <max_partition_cumulative_characters>0</max_partition_cumulative_characters>

      <type_code_max_serialized_length>0</type_code_max_serialized_length>
      <type_object_max_deserialized_length>0</type_object_max_deserialized_length>
      <type_object_max_serialized_length>0</type_object_max_serialized_length>
      <deserialized_type_object_dynamic_allocation_threshold>0</deserialized_type_object_dynamic_allocation_threshold>
      <serialized_type_object_dynamic_allocation_threshold>0</serialized_type_object_dynamic_allocation_threshold>

      <contentfilter_property_max_length>1</contentfilter_property_max_length>
      <participant_property_list_max_length>0</participant_property_list_max_length>
      <participant_property_string_max_length>0</participant_property_string_max_length>
      <writer_property_list_max_length>0</writer_property_list_max_length>
      <writer_property_string_max_length>0</writer_property_string_max_length>
      <max_endpoint_groups>0</max_endpoint_groups>
      <max_endpoint_group_cumulative_characters>0</max_endpoint_group_cumulative_characters>

      <channel_seq_max_length>0</channel_seq_max_length>
      <channel_filter_expression_max_length>0</channel_filter_expression_max_length>
      <writer_data_tag_list_max_length>0</writer_data_tag_list_max_length>
      <writer_data_tag_string_max_length>0</writer_data_tag_string_max_length>
      <reader_data_tag_list_max_length>0</reader_data_tag_list_max_length>
      <reader_data_tag_string_max_length>0</reader_data_tag_string_max_length>
    </resource_limits>

    <database>
      <initial_weak_references>256</initial_weak_references>
      <max_weak_references>1000000</max_weak_references>
      <shutdown_cleanup_period>
        <sec>0</sec>
        <nanosec>100000000</nanosec>
      </shutdown_cleanup_period>
    </database>

    <property inherit="false">
      <value>
      </value>
    </property>

  </domain_participant_qos>

  <datawriter_qos>

    <reliability>
      <kind>RELIABLE_RELIABILITY_QOS</kind>
    </reliability>

    <history>
      <kind>KEEP_ALL_HISTORY_QOS</kind>
    </history>

    <resource_limits>
      <initial_instances>1</initial_instances>
      <initial_samples>1</initial_samples>
      <instance_hash_buckets>1</instance_hash_buckets>
    </resource_limits>

  </datawriter_qos>

  <datareader_qos>

    <reliability>
      <kind>RELIABLE_RELIABILITY_QOS</kind>
    </reliability>

    <history>
      <kind>KEEP_ALL_HISTORY_QOS</kind>
    </history>

    <resource_limits>
      <initial_instances>1</initial_instances>
      <initial_samples>1</initial_samples>
    </resource_limits>

    <reader_resource_limits>
      <max_samples_per_read>1</max_samples_per_read>
      <initial_infos>1</initial_infos>
      <initial_outstanding_reads>1</initial_outstanding_reads>
      <initial_remote_writers>1</initial_remote_writers>
      <initial_remote_writers_per_instance>1</initial_remote_writers_per_instance>
      <initial_fragmented_samples>1</initial_fragmented_samples>
      <dynamically_allocate_fragmented_samples>true</dynamically_allocate_fragmented_samples>
      <initial_remote_virtual_writers>1</initial_remote_virtual_writers>
      <initial_remote_virtual_writers_per_instance>1</initial_remote_virtual_writers_per_instance>
      <max_query_condition_filters>0</max_query_condition_filters>
    </reader_resource_limits>

  </datareader_qos>

  <topic_qos>
    <resource_limits>
      <initial_samples>1</initial_samples>
      <initial_instances>1</initial_instances>
      <instance_hash_buckets>1</instance_hash_buckets>
    </resource_limits>
  </topic_qos>
</qos_profile>
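Rather than copying the XML above, an application can inherit these settings from the builtin profile via base_name. A sketch follows; the builtin library and profile names shown (BuiltinQosLib and Generic.MinimalMemoryFootprint) should be verified against your installation:

<qos_library name="MyQosLibrary">
  <qos_profile name="MyLowMemoryProfile" base_name="BuiltinQosLib::Generic.MinimalMemoryFootprint" is_default_qos="true">
    <!-- Override individual settings here as needed -->
  </qos_profile>
</qos_library>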