3.25. Startup Time Guidelines
This section provides guidelines to measure and reduce initialization and discovery time in Connext Micro. These tips apply to any deployment where startup time matters, such as automotive or industrial systems with strict startup budgets.
3.25.1. Understanding the startup process
Startup in Connext Micro has four phases:
Creation of local DDS entities (DomainParticipants, Topics, DataWriters, and DataReaders);
Enabling those entities;
Discovery processing, where your DomainParticipants exchange discovery traffic with remote DomainParticipants;
Discovery completion, where your application has discovered all expected remote DomainParticipants and endpoints.
The autoenable_created_entities setting controls whether these phases overlap. When autoenable is TRUE, entities are enabled as they are created, so discovery can begin before all entities exist; phases 1, 2, and 3 run partially in parallel. When autoenable is FALSE, you create all entities first (disabled), then enable the DomainParticipant. Discovery begins only after the DomainParticipant is enabled.
This section separates startup into two measurable intervals: initialization time (phases 1 and 2) and discovery time (phases 3 and 4). These intervals can be tuned independently.
3.25.1.1. Initialization time
This is the interval from application start until all required local DDS entities are created and enabled. This includes creating and enabling the DomainParticipant and any locally required Topics, DataWriters, DataReaders, and associated local resource allocation.
When autoenable is TRUE, discovery may overlap with initialization time. Initialization time excludes waiting for remote discovery or matching, as well as any platform or network bring-up outside DDS initialization.
3.25.1.2. Discovery time
This is the interval from when the DomainParticipant is enabled until your application has discovered the expected set of remote DomainParticipants and endpoints. It includes DomainParticipant discovery and endpoint discovery traffic and processing, but excludes protocols above DDS discovery unless explicitly stated.
Application-installed listeners (e.g., on_subscription_matched,
on_publication_matched) run on discovery threads and can increase
discovery time if they perform expensive operations, such as logging.
3.25.2. Minimizing startup time
Slow startup is commonly caused by unoptimized builds, blocking I/O during startup, or oversized resource limits. The following sections describe application design choices, QoS configuration, and build settings that affect startup time.
Note
Some platforms have specific startup-time guidance. See Platforms Guide for details on your target platform.
3.25.2.1. Application design
The following subsections describe application-level design decisions that can affect startup time.
3.25.2.1.1. Participant and domain architecture
Each DomainParticipant adds initialization overhead and consumes internal resources. Use one DomainParticipant per application as the default approach, and only introduce additional DomainParticipants when you need to communicate in multiple DDS domains.
If your system has groups of applications that do not need to communicate with each other, place them in separate DDS domains to reduce discovery traffic and prevent exceeding discovery resources (e.g., the maximum number of remote entities).
For a general overview of DomainParticipants and the discovery process, see Discovery.
3.25.2.1.2. Enabling entities manually
When many entities are created at startup, enabling them incrementally or relying on autoenable increases initialization time and may impact discovery time determinism. If your application design permits it, create all DDS entities disabled (set autoenable_created_entities to FALSE) and call DDS_Entity_enable() with the following syntax after all entities are created:
DDS_Entity_enable(DDS_DomainParticipant_as_entity(participant))
3.25.2.1.3. Discovery plugin selection
Connext Micro provides two discovery plugins with different startup characteristics: DPDE (Dynamic Participant Dynamic Endpoint) and DPSE (Dynamic Participant Static Endpoint).
DPSE has a smaller base footprint (~14 KB) than DPDE (~48 KB) because DPDE creates three pairs of built-in DataWriters and DataReaders for DomainParticipant, publication, and subscription discovery traffic, while DPSE creates only one pair (for DomainParticipant discovery). This difference becomes more pronounced as you add more endpoints.
If your deployment topology is fixed and known at build time, use DPSE; it reduces both initialization overhead and discovery traffic. If your topology is dynamic or you need to discover endpoints at runtime, use DPDE. See Discovery for configuration details of both plugins.
3.25.2.1.4. Transport and peer configuration
Enable only the transports your deployment requires. Each enabled transport adds initialization work: socket/port setup, internal tables, and resource allocation. See Transport Registration for details on how to register and enable transports.
By default, for the UDP transport, Connext Micro uses all available network interfaces.
Use allow_interface or deny_interface on the transport’s
UDP_InterfaceFactoryProperty
to restrict which network interfaces are used by a DomainParticipant.
Limiting interfaces to those required by your deployment reduces the number
of locators advertised in discovery announcements, the number of discovery
messages sent, and the number of sockets opened at startup. See
UDP Transport for details.
List only the minimal set of peers your DomainParticipant needs to discover
in DDS_DiscoveryQosPolicy.initial_peers, and use explicit DomainParticipant
indices (e.g., [0-2]@_udp://10.10.30.1) rather than the default range.
If using multicast discovery, you can include just the multicast address
instead of individual peers. Each entry in the peer list generates discovery
announcements, so removing unnecessary entries and narrowing index ranges
directly reduces discovery traffic. See Configuring Participant Discovery Peers
for the peer descriptor format.
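As an illustration of the explicit-index form described above, the following minimal C sketch builds a peer descriptor string such as [0-2]@_udp://10.10.30.1. The helper name format_udp_peer is hypothetical and not part of the Connext Micro API; adding the resulting string to DDS_DiscoveryQosPolicy.initial_peers is left to your application code.

```c
#include <stdio.h>

/* Hypothetical helper (not a Connext Micro function): writes a peer
 * descriptor of the form "[lo-hi]@_udp://address" into buf.
 * Returns 0 on success, -1 on invalid arguments or if buf is too small. */
int format_udp_peer(char *buf, size_t buf_len,
                    unsigned lo_index, unsigned hi_index,
                    const char *address)
{
    int n;

    if (buf == NULL || address == NULL || lo_index > hi_index) {
        return -1;
    }
    n = snprintf(buf, buf_len, "[%u-%u]@_udp://%s",
                 lo_index, hi_index, address);
    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t)n >= buf_len) ? -1 : 0;
}
```

Keeping the index range as narrow as the deployment allows (for example, 0 to 2 when at most three DomainParticipants run on the peer host) is what limits the number of announcements generated for that entry.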
3.25.2.2. QoS and resource configuration
The following subsections detail QoS and memory settings that can impact startup time, and how to optimize them.
3.25.2.2.1. Memory and resource sizing
Connext Micro allocates resources up-front. Limits that are too small cause discovery failures (such as “out of records” or “resource limit exceeded”), while limits that are too large increase startup time and memory footprint. Size your resource limits to match your actual deployment.
Per-DomainParticipant discovery limits: These limits control how many remote entities the DomainParticipant can store during discovery. If they are too small, you may see “out of records” or “resource limit exceeded” errors even when your per-endpoint limits are large enough.
Configure the following fields under DDS_DomainParticipantQos.resource_limits:
Field                                     Description
remote_participant_allocation             Maximum number of remote DomainParticipants expected in the DDS domain.
remote_reader_allocation                  Maximum total number of remote DataReaders across all remote DomainParticipants.
remote_writer_allocation                  Maximum total number of remote DataWriters across all remote DomainParticipants.
matching_writer_reader_pair_allocation    Maximum number of endpoint matches expected. A useful rule of thumb is (local_reader_count + local_writer_count) * max_remote_participants.
Per-endpoint matching limits: These limits control how many remote endpoints each local endpoint can track and match. Configure max_remote_writers under DDS_DataReaderQos.reader_resource_limits and max_remote_readers under DDS_DataWriterQos.writer_resource_limits.
Per-endpoint history and storage: These limits affect how many samples are kept in memory, and can noticeably affect initialization time and memory footprint. Configure max_samples, max_instances, and max_samples_per_instance under DDS_DataReaderQos.resource_limits and DDS_DataWriterQos.resource_limits.
Note
For example, setting remote_participant_allocation = 100 when your deployment has 5 DomainParticipants wastes memory and increases initialization time. Start with limits that match your known topology and increase them only if discovery fails.
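The rule of thumb for matching_writer_reader_pair_allocation can be written as a small helper. This is only a sketch of the arithmetic; the function name estimate_matching_pairs is hypothetical, and the result is a starting estimate to validate against your actual topology, not a guaranteed bound.

```c
#include <stdint.h>

/* Hypothetical helper: applies the rule of thumb
 *   (local_reader_count + local_writer_count) * max_remote_participants
 * and saturates at UINT32_MAX instead of overflowing. */
uint32_t estimate_matching_pairs(uint32_t local_reader_count,
                                 uint32_t local_writer_count,
                                 uint32_t max_remote_participants)
{
    uint64_t local_endpoints =
        (uint64_t)local_reader_count + local_writer_count;

    if (max_remote_participants != 0 &&
        local_endpoints > UINT32_MAX / max_remote_participants) {
        return UINT32_MAX; /* product would not fit in 32 bits */
    }
    return (uint32_t)(local_endpoints * max_remote_participants);
}
```

For example, an application with 3 DataReaders and 2 DataWriters that expects 5 remote DomainParticipants gets an estimate of (3 + 2) * 5 = 25.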
3.25.2.2.2. Transport buffer sizing
Small send and receive buffers may limit discovery throughput and increase discovery completion time as DomainParticipant and endpoint counts grow. For the UDP transport, your operating system may cap effective socket buffer sizes based on kernel limits, regardless of what your application requests. See UDP Transport for information on where UDP buffer settings are configured.
For the Shared Memory Transport (SHMEM), the receive queue depth
(received_message_count_max) and receive buffer size
(receive_buffer_size) on
NETIO_SHMEMInterfaceFactoryProperty control
how much discovery traffic can be buffered. See
SHMEM Configuration for details.
Adjust buffer sizes based on observed throughput, packet drops, and CPU utilization. If discovery traffic shows drops or “receive buffer errors,” increase OS socket buffer limits and verify the effective buffer sizes.
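One way to check what the OS actually grants is to request a size on a throwaway UDP socket and read the value back. The sketch below assumes a POSIX sockets platform; effective_udp_rcvbuf is a hypothetical helper, and it probes a socket of its own rather than the sockets that Connext Micro opens, so treat the result as an indication of the OS cap only.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: requests a receive buffer of `requested` bytes on a
 * temporary UDP socket and returns the size the OS reports for it,
 * or -1 on error. */
int effective_udp_rcvbuf(int requested)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int effective = -1;
    socklen_t len = sizeof effective;

    if (fd < 0) {
        return -1;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof requested) != 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) != 0) {
        effective = -1;
    }
    close(fd);
    return effective;
}
```

On Linux, the reported value is normally double the granted size because the kernel reserves space for bookkeeping, and the granted size is capped by net.core.rmem_max. If the value returned is much smaller than twice the request, the OS limit is the constraint to raise.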
3.25.2.2.3. Discovery traffic tuning
As DomainParticipant and endpoint counts increase, discovery performance becomes bandwidth and burst-driven. The following properties control discovery traffic and timing. They are fields of DPDE_DiscoveryPluginProperty; see the API reference for the complete list.
Limit initial DomainParticipant announcement bursts: When a DomainParticipant is enabled, it sends a burst of announcements. At larger scales, this burstiness can amplify congestion and increase discovery completion time. Configure initial_participant_announcements and initial_participant_announcement_period in DPDE_DiscoveryPluginProperty to control this behavior.
For bandwidth-constrained links or large-scale deployments, reduce the announcement count or increase the interval from the default values.
Relax DomainParticipant liveliness traffic where permitted: DomainParticipant liveliness assertions generate periodic network traffic. Aggressive settings add background load during discovery-heavy phases. Configure participant_liveliness_assert_period and participant_liveliness_lease_duration in DPDE_DiscoveryPluginProperty.
Ensure participant_liveliness_assert_period is less than or equal to participant_liveliness_lease_duration. A common approach is to set the assert period to a fraction of the lease duration, to tolerate occasional assertion loss while still meeting liveliness requirements.
Use multicast discovery when available: Unicast-only discovery increases bandwidth and processing cost as DomainParticipant count grows. If your deployment network supports multicast, use it to reduce fan-out.
To enable multicast, add a multicast locator (e.g., _udp://239.255.0.1) to discovery.enabled_transports for discovery traffic and to user_traffic.enabled_transports for user data traffic. With DPSE, a multicast locator alone can be sufficient. With DPDE, always include both a unicast locator (e.g., _udp://) and a multicast locator; Connext Micro will select the appropriate locator based on the message type.
For best-effort user data traffic, a multicast locator alone is sufficient. For reliable user data traffic, it is necessary to include both unicast and multicast locators to enable efficient communication.
Tune DPDE built-in endpoint reliability resources and timers: DPDE’s built-in endpoint buffering and retransmission settings can change discovery bandwidth and CPU usage, especially on lossy links or at large scale.
Increasing max_samples_per_builtin_endpoint_reader can improve robustness (fewer drops and out-of-order rejections) during high discovery bursts, but doing so also increases memory usage.
Smaller retransmission periods and higher retry limits can increase discovery traffic and CPU usage, but these may also reduce discovery completion time on lossy networks. The relevant DPDE_DiscoveryPluginProperty fields are:
builtin_writer_heartbeat_period
builtin_writer_heartbeats_per_max_samples
builtin_endpoint_reader_nack_period
builtin_writer_max_heartbeat_retries
When you tune these properties, document the tradeoff you are targeting (memory vs. drops vs. bandwidth) and validate by measuring discovery completion time alongside packet rates and CPU utilization.
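For the liveliness settings discussed above, a quick sanity check on the assert period and lease duration can catch misconfiguration before deployment. The following is a minimal sketch; struct duration is a stand-in type defined here for illustration (not the DDS duration type), and assertions_per_lease is a hypothetical helper.

```c
#include <stdint.h>

/* Stand-in for a DDS duration; defined here for this sketch only. */
struct duration {
    int32_t  sec;
    uint32_t nanosec;
};

static int64_t duration_to_ns(struct duration d)
{
    return (int64_t)d.sec * 1000000000LL + (int64_t)d.nanosec;
}

/* Hypothetical helper: returns how many whole assert periods fit in one
 * lease duration, or -1 if the pair is invalid (non-positive period, or a
 * period longer than the lease). */
int64_t assertions_per_lease(struct duration assert_period,
                             struct duration lease_duration)
{
    int64_t period_ns = duration_to_ns(assert_period);
    int64_t lease_ns = duration_to_ns(lease_duration);

    if (period_ns <= 0 || period_ns > lease_ns) {
        return -1;
    }
    return lease_ns / period_ns;
}
```

A result of 1 means a single lost assertion can expire the lease; a 30-second period with a 100-second lease gives 3, which leaves room for occasional loss.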
3.25.2.3. Build settings
Connext Micro ships prebuilt release and debug libraries. Use the release libraries for startup-time measurements; debug libraries include additional error checking that increases initialization time.
Build your own application code with -O2 (or equivalent) for production
and for any measurements used to validate startup requirements. The same
applies if you are building Connext Micro from source. Do not use -O0 or -Og
for startup-time measurements.
3.25.3. Measuring startup time
To optimize startup, you need to measure initialization time and discovery time separately.
Note
Console I/O during startup can account for a large portion of the total startup time and increase run-to-run variability, particularly on embedded targets. Defer or remove diagnostic output while measuring; if you need logs for debugging, capture them to a memory buffer rather than a serial console.
3.25.3.1. How to measure initialization time
Capture a timestamp before you create the DomainParticipant and another
immediately after DDS_Entity_enable() returns for the DomainParticipant
(or after the last entity is created, if autoenable is TRUE). The difference is
your initialization time.
3.25.3.2. How to measure discovery time
Discovery time starts when the DomainParticipant is enabled and ends when your application has discovered all expected remote DomainParticipants and endpoints. You can detect discovery completion using listener callbacks or a WaitSet.
To use listener callbacks, set the on_subscription_matched callback on
your DataReaders and the on_publication_matched callback on your DataWriters. Each
time a callback fires with status.current_count_change > 0, a new match
has occurred. When the total match count across all your endpoints reaches
the expected value, discovery is complete. Capture a timestamp at that
point.
To use a WaitSet, attach a DDS_StatusCondition with
DDS_SUBSCRIPTION_MATCHED_STATUS (or DDS_PUBLICATION_MATCHED_STATUS)
to a DDS_WaitSet. Call DDS_WaitSet_wait() in a loop, checking the
match count after each trigger until all expected matches are reached.
For more information on listeners and status conditions, see Receiving Data and Sending Data.
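The match-counting logic is the same for both approaches and can be kept separate from the DDS calls. The following sketch shows one possible shape; struct match_tracker and its functions are hypothetical, the caller passes in the current_count_change value from the matched status together with a timestamp, and the code is not thread-safe, so add synchronization if your callbacks can run concurrently.

```c
#include <stdint.h>

/* Hypothetical tracker for discovery completion (not a Connext Micro type). */
struct match_tracker {
    int32_t expected_matches; /* total matches expected across all endpoints */
    int32_t current_matches;  /* running total of current_count_change */
    int64_t complete_at_ns;   /* timestamp of completion, -1 until reached */
};

void match_tracker_init(struct match_tracker *t, int32_t expected_matches)
{
    t->expected_matches = expected_matches;
    t->current_matches = 0;
    t->complete_at_ns = -1;
}

/* Call with status.current_count_change from on_subscription_matched or
 * on_publication_matched (or after a WaitSet trigger).
 * Returns 1 the first time the expected total is reached, otherwise 0. */
int match_tracker_update(struct match_tracker *t,
                         int32_t current_count_change,
                         int64_t timestamp_ns)
{
    t->current_matches += current_count_change;
    if (t->complete_at_ns < 0 &&
        t->current_matches >= t->expected_matches) {
        t->complete_at_ns = timestamp_ns;
        return 1;
    }
    return 0;
}
```

Discovery time is then complete_at_ns minus the timestamp taken when the DomainParticipant was enabled. Keeping the update this small also avoids adding expensive work to the discovery threads that run the callbacks.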
3.25.3.3. What to record
Startup-time measurements are most useful when accompanied by the conditions under which they were taken. The following details help when comparing results across environments or configurations:
Whether autoenable_created_entities was TRUE or FALSE;
Whether release or debug libraries were used, and the compiler optimization level;
The number of DomainParticipants and endpoints in the DDS domain during the measurement.
Note
Without these details, measurements from different environments or configurations cannot be meaningfully compared.
3.25.4. Validating startup time improvements
When evaluating a configuration change, compare initialization time and discovery time measurements before and after the change alongside the following indicators:
Transport buffer error counters: check for receive buffer overflows or send failures. On Linux, netstat -su or ss -su shows UDP error counters. On QNX or other RTOS platforms, consult your OS documentation for equivalent counters.
Packet rates and discovery message counts: use tcpdump or Wireshark to capture discovery traffic. Filter on the DDS discovery ports and count SPDP (DomainParticipant) and SEDP (endpoint) messages. High retransmission counts suggest packet loss or buffer exhaustion.
CPU utilization during discovery: monitor CPU usage during the discovery phase. If a core is saturated, discovery processing may be the bottleneck rather than network I/O.
Reduced discovery completion time without increased packet loss or CPU saturation confirms the optimization is effective. If discovery time decreases but error counters rise, the change may have shifted the bottleneck rather than removed it.
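The decision rule above can be captured in a small comparison routine for use in automated regression checks. This is a sketch under stated assumptions: the struct fields, enum values, and function name are all hypothetical, the error count stands for whichever transport counter you collect, and treating only newly introduced CPU saturation as a warning sign is one possible interpretation.

```c
#include <stdint.h>

/* Hypothetical record of one measurement run. */
struct startup_sample {
    int64_t  discovery_time_ns; /* measured discovery time */
    uint64_t udp_error_count;   /* e.g., receive buffer errors during the run */
    int      cpu_saturated;     /* nonzero if a core was saturated */
};

enum change_verdict {
    CHANGE_EFFECTIVE,          /* faster, no new errors or saturation */
    CHANGE_SHIFTED_BOTTLENECK, /* faster, but errors or saturation increased */
    CHANGE_NOT_EFFECTIVE       /* discovery time did not decrease */
};

enum change_verdict evaluate_change(const struct startup_sample *before,
                                    const struct startup_sample *after)
{
    if (after->discovery_time_ns >= before->discovery_time_ns) {
        return CHANGE_NOT_EFFECTIVE;
    }
    if (after->udp_error_count > before->udp_error_count ||
        (after->cpu_saturated && !before->cpu_saturated)) {
        return CHANGE_SHIFTED_BOTTLENECK;
    }
    return CHANGE_EFFECTIVE;
}
```

Because startup measurements vary from run to run, compare samples taken under the same recorded conditions, and consider averaging several runs before feeding them into a check like this.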