32.4.2 Tuning Queue Sizes and Other Resource Limits

Set the 47.12 HISTORY QosPolicy appropriately to accommodate however many DDS samples should be saved in the DataWriter’s send queue or in the DataReader’s receive queue (from which samples are read/taken).

Set the size of the send window in the DDS_RtpsReliableWriterProtocol_t policy (in the 47.5 DATA_WRITER_PROTOCOL QosPolicy (DDS Extension)) appropriately to accommodate the maximum number of unacknowledged DDS samples that can be queued at a time from a DataWriter.

For more information, see the following sections:

32.4.2.1 Understanding the Send Queue and Setting its Size
32.4.2.2 Understanding the Receive Queue and Setting Its Size
47.5.4 Configuring the Send Window Size

Note: The HistoryQosPolicy’s depth must be less than or equal to the ResourceLimitsQosPolicy’s max_samples_per_instance; max_samples_per_instance must be less than or equal to the ResourceLimitsQosPolicy’s max_samples (see 47.22 RESOURCE_LIMITS QosPolicy), and max_samples_per_remote_writer (see 48.2 DATA_READER_RESOURCE_LIMITS QosPolicy (DDS Extension)) must be less than or equal to max_samples.

depth <= max_samples_per_instance <= max_samples
max_samples_per_remote_writer <= max_samples

Examples:

DataWriter

writer_qos.resource_limits.initial_instances = 10;
writer_qos.resource_limits.initial_samples = 200;
writer_qos.resource_limits.max_instances = 100;
writer_qos.resource_limits.max_samples = 2000;
writer_qos.resource_limits.max_samples_per_instance = 20;
writer_qos.history.depth = 20;

DataReader

reader_qos.resource_limits.initial_instances = 10;
reader_qos.resource_limits.initial_samples = 200;
reader_qos.resource_limits.max_instances = 100;
reader_qos.resource_limits.max_samples = 2000;
reader_qos.resource_limits.max_samples_per_instance = 20;
reader_qos.history.depth = 20;
reader_qos.reader_resource_limits.max_samples_per_remote_writer = 20;
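The constraints in the Note above are easy to violate when several limits are tuned at once. The following is a minimal sketch of a consistency check, not part of the Connext API: LimitsSketch and check_limits are hypothetical names that only mirror the QoS field names, and the check assumes every value is finite (it does not model unlimited settings).

```cpp
#include <string>
#include <vector>

// Hypothetical plain struct mirroring the QoS field names used above.
// It is NOT a Connext type; it exists only to make the check runnable.
struct LimitsSketch {
    int depth;                          // history.depth
    int max_samples_per_instance;       // resource_limits.max_samples_per_instance
    int max_samples;                    // resource_limits.max_samples
    int max_samples_per_remote_writer;  // reader_resource_limits (DataReader only)
    bool is_reader;                     // false: ignore max_samples_per_remote_writer
};

// Returns one message per violated constraint. An empty result means
// depth <= max_samples_per_instance <= max_samples and, for a DataReader,
// max_samples_per_remote_writer <= max_samples. Assumes finite values.
std::vector<std::string> check_limits(const LimitsSketch& q) {
    std::vector<std::string> problems;
    if (q.depth > q.max_samples_per_instance)
        problems.push_back("depth > max_samples_per_instance");
    if (q.max_samples_per_instance > q.max_samples)
        problems.push_back("max_samples_per_instance > max_samples");
    if (q.is_reader && q.max_samples_per_remote_writer > q.max_samples)
        problems.push_back("max_samples_per_remote_writer > max_samples");
    return problems;
}
```

With the DataReader values from the example above (depth 20, max_samples_per_instance 20, max_samples 2000, max_samples_per_remote_writer 20), check_limits returns an empty list.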

32.4.2.1 Understanding the Send Queue and Setting its Size

A DataWriter’s send queue is used to store each DDS sample it writes. A DDS sample will be removed from the send queue after it has been acknowledged (through an ACKNACK) by all the reliable DataReaders. A DataReader can request that the DataWriter resend a missing DDS sample (through an ACKNACK). If that DDS sample is still available in the send queue, it will be resent. To elicit timely ACKNACKs, the DataWriter will regularly send heartbeats to its reliable DataReaders.

A DataWriter’s send queue size is determined by its 47.22 RESOURCE_LIMITS QosPolicy, specifically the max_samples field. The appropriate value depends on application parameters such as how fast the publishing application calls write().

A DataWriter has a "send window" that is the maximum number of unacknowledged DDS samples allowed in the send queue before a DataWriter will start blocking during the write() call (see 31.8.1 Blocking During a write()). The send window enables throttling of the publishing application to avoid overwhelming matched DataReaders. If the DataReaders are not acknowledging samples fast enough and the DataWriter’s send window fills up, the DataWriter will be slowed down because each write() call will block until the unacknowledged sample count in the send window decreases.

The size of the send window is determined by the DataWriterProtocolQosPolicy, specifically the fields min_send_window_size and max_send_window_size within the rtps_reliable_writer field of type DDS_RtpsReliableWriterProtocol_t. Other fields can be used to configure a variable-sized send window, where the send window size changes in response to network congestion to maximize the effective send rate. As with max_samples, the appropriate values depend on application parameters. For more information on configuring the send window size, refer to 47.5.4 Configuring the Send Window Size.

Strict reliability: If a DataWriter does not receive ACKNACKs from one or more reliable DataReaders, it is possible for the reliability send queue—either its finite max_send_window_size or its effective max_send_window_size if max_send_window_size is infinite—to fill up. Effective max_send_window_size is defined as either max_samples (if batching is not used) or max_batches (if batching is used). If you want to achieve strict reliability, the kind field in the 47.12 HISTORY QosPolicy for both the DataReader and DataWriter must be set to KEEP_ALL, positive acknowledgments must be enabled for both the DataReader and DataWriter, and your publishing application should wait until space is available in the reliability queue before writing any more DDS samples. Connext provides two mechanisms to do this:

Allow the write() operation to block until there is space in the reliability queue again to store the DDS sample. The maximum time this call blocks is determined by the max_blocking_time field in the 47.21 RELIABILITY QosPolicy (also discussed in 32.4.1.1 Blocking until the Send Queue Has Space Available).
Use the DataWriter’s Listener to be notified when the reliability queue fills up or empties again.
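The effective send window defined above, and the point at which write() starts to block, can be modeled as follows. This is a simplified sketch under stated assumptions: WriterLimitsSketch, kUnlimited, effective_max_send_window, and write_would_block are hypothetical names, not Connext types or calls, and the model ignores max_blocking_time and variable-sized windows.

```cpp
// Hypothetical marker for an unlimited setting (assumption of this sketch).
constexpr int kUnlimited = -1;

// Hypothetical struct holding only the values named in the text above.
struct WriterLimitsSketch {
    int max_send_window_size;  // finite value or kUnlimited
    int max_samples;           // send queue size when batching is not used
    int max_batches;           // send queue size when batching is used
    bool batching_enabled;
};

// A finite max_send_window_size is used as-is. If it is unlimited, the
// effective value is max_batches (batching) or max_samples (no batching).
int effective_max_send_window(const WriterLimitsSketch& w) {
    if (w.max_send_window_size != kUnlimited) return w.max_send_window_size;
    return w.batching_enabled ? w.max_batches : w.max_samples;
}

// True once the number of unacknowledged samples has reached the
// effective window, i.e. the next write() would have to wait.
bool write_would_block(const WriterLimitsSketch& w, int unacknowledged) {
    const int window = effective_max_send_window(w);
    if (window == kUnlimited) return false;  // nothing bounds the window
    return unacknowledged >= window;
}
```

For example, with max_send_window_size unlimited, max_samples = 2000, and batching disabled, the effective window in this model is 2000 samples.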

When the 47.12 HISTORY QosPolicy on the DataWriter is set to KEEP_LAST, strict reliability is not guaranteed. When the queue already holds depth DDS samples (depth is set in the 47.12 HISTORY QosPolicy, see 32.4.3 Controlling Queue Depth with the History QosPolicy), the oldest DDS sample will be dropped from the queue when a new DDS sample is written. Note that in this reliable mode, when the send window is larger than max_samples (or max_batches if batching is enabled), the DataWriter will never block, but strict reliability is no longer guaranteed. If any DataReader requests the purged DDS sample, the DataWriter will send a heartbeat that no longer contains the sequence number of the dropped DDS sample (it will not be able to send the DDS sample).
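The KEEP_LAST replacement behavior described above can be illustrated with a toy queue for a single instance. This is only a sketch: KeepLastQueueSketch is a hypothetical class, it assumes depth is at least 1, and it does not model acknowledgments, heartbeats, or blocking.

```cpp
#include <cstddef>
#include <deque>

// Toy model of a KEEP_LAST send queue for one instance (hypothetical).
class KeepLastQueueSketch {
public:
    explicit KeepLastQueueSketch(std::size_t depth) : depth_(depth) {}

    // Stores sequence number sn. Returns true if the oldest sample had
    // to be dropped to make room, regardless of whether it was acknowledged.
    bool write(long sn) {
        bool dropped = false;
        if (!queue_.empty() && queue_.size() >= depth_) {
            queue_.pop_front();
            dropped = true;
        }
        queue_.push_back(sn);
        return dropped;
    }

    // True if sn is still in the queue and could therefore be resent.
    bool can_resend(long sn) const {
        for (long s : queue_) {
            if (s == sn) return true;
        }
        return false;
    }

    // Lowest sequence number still held; call only on a non-empty queue.
    long first_available() const { return queue_.front(); }

private:
    std::size_t depth_;
    std::deque<long> queue_;
};
```

With a depth of 3, writing samples 1 through 4 drops sample 1, so a later request for sample 1 cannot be satisfied and the first available sequence number becomes 2.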

Alternatively, a DataWriter with KEEP_LAST may block on write() when its send window is smaller than its send queue. The DataWriter will block when its send window is full. After the blocking time has elapsed, the DataWriter may replace a DDS sample, regardless of its acknowledgement status. See 31.8.2 write() behavior with KEEP_LAST and KEEP_ALL for a detailed explanation of what happens when certain limits are reached during a call to write().

The send queue size is set in the max_samples field of the 47.22 RESOURCE_LIMITS QosPolicy. The appropriate size for the send queue depends on application parameters (such as the send rate), channel parameters (such as end-to-end delay and probability of packet loss), and quality of service requirements (such as maximum acceptable probability of DDS sample loss).

The DataReader’s receive queue size (from which samples are read/taken) should generally be larger than the DataWriter’s send queue size. Receive queue size is discussed in 32.4.2.2 Understanding the Receive Queue and Setting Its Size.

A good rule of thumb, based on a simple model that assumes individual packet drops are uncorrelated and time-independent, is that the size of the reliability send queue, N, is as shown in Figure 32.3: Calculating Minimum Send Queue Size for a Desired Level of Reliability.

Figure 32.3: Calculating Minimum Send Queue Size for a Desired Level of Reliability

$$N = 2RT \, \frac{\log(1 - Q)}{\log(p)}$$

Simple formula for determining the minimum size of the send queue required for strict reliability

In the above equation, R is the rate of sending DDS samples, T is the round-trip transmission time, p is the probability of a packet loss in a round trip, and Q is the required probability that a DDS sample is eventually successfully delivered. Of course, network-transport dropouts must also be taken into account and may influence or dominate this calculation.
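As a worked illustration, the ratio log(1 - Q)/log(p) is the number of attempts k at which p^k falls to 1 - Q, and the formula scales that by 2RT. The sketch below evaluates the formula literally and rounds up to a whole number of samples. The function name min_send_queue_size, the rounding tolerance, and the floor of one sample are assumptions of this sketch, not part of Connext; it requires 0 < p < 1 and 0 < Q < 1.

```cpp
#include <cmath>

// Evaluates N = 2RT * log(1 - Q) / log(p) and rounds up.
//   rate_hz              R: rate of sending DDS samples
//   round_trip_sec       T: round-trip transmission time
//   loss_probability     p: probability of a packet loss in a round trip
//   delivery_probability Q: required probability of eventual delivery
// Hypothetical helper; assumes 0 < p < 1 and 0 < Q < 1.
long min_send_queue_size(double rate_hz, double round_trip_sec,
                         double loss_probability, double delivery_probability) {
    const double attempts =
        std::log(1.0 - delivery_probability) / std::log(loss_probability);
    const double raw = 2.0 * rate_hz * round_trip_sec * attempts;
    // Subtract a tiny tolerance so floating-point noise just above a whole
    // number does not add an extra sample, then round up.
    const long n = static_cast<long>(std::ceil(raw - 1e-9));
    return n < 1 ? 1 : n;  // a queue needs room for at least one sample
}
```

For R = 100 Hz, T = 0.001 sec, p = 1%, and Q = 99%, the raw value is about 0.2, so the sketch returns 1. A literal evaluation like this is not guaranteed to reproduce every row of Table 32.2, which may reflect different rounding or assumptions.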

Table 32.2 Required Size of the Send Queue for Different Network Parameters gives the required size of the DataWriter's send queue for several common scenarios.

Table 32.2 Required Size of the Send Queue for Different Network Parameters
Q	p	T	R	N
99%	1%	0.001 sec	100 Hz	1
99%	1%	0.001 sec	2000 Hz	2
99%	5%	0.001 sec	100 Hz	1
99%	5%	0.001 sec	2000 Hz	4
99.99%	1%	0.001 sec	100 Hz	1
99.99%	1%	0.001 sec	2000 Hz	6
99.99%	5%	0.001 sec	100 Hz	1
99.99%	5%	0.001 sec	2000 Hz	8

Note: Packet loss on a network frequently happens in bursts, and the packet loss events are correlated. This means that the probability of a packet being lost is much higher if the previous packet was lost because it indicates a congested network or busy receiver. For this situation, it may be better to use a queue size that can accommodate the longest period of network congestion, as illustrated in Figure 32.4: Calculating Minimum Send Queue Size for Networks with Dropouts.

Figure 32.4: Calculating Minimum Send Queue Size for Networks with Dropouts

$$N = R \, D(Q)$$

Send queue size as a function of send rate "R" and maximum dropout time D

In the above equation, R is the rate of sending DDS samples, D(Q) is a time such that Q percent of the dropouts are of equal or lesser length, and Q is the required probability that a DDS sample is eventually successfully delivered. The problem with the above formula is that it is hard to determine the value of D(Q) for different values of Q.

For example, if we want to ensure that 99.9% of the DDS samples are eventually delivered successfully, and we know that 99.9% of the network dropouts are shorter than 0.1 seconds, then we would use N = 0.1*R. So for a rate of 100 Hz, we would use a send queue of N = 10; for a rate of 2000 Hz, we would use N = 200.
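The arithmetic in this example can be sketched as a one-line helper. The name dropout_send_queue_size and the rounding tolerance are assumptions of this sketch; D(Q) must be supplied by the caller, since the text notes it is hard to determine.

```cpp
#include <cmath>

// Evaluates N = R * D(Q), rounded up to a whole number of samples.
//   rate_hz     R: rate of sending DDS samples
//   dropout_sec D(Q): duration that Q percent of dropouts do not exceed
// Hypothetical helper, not a Connext call.
long dropout_send_queue_size(double rate_hz, double dropout_sec) {
    // The small tolerance keeps floating-point noise from adding a sample.
    return static_cast<long>(std::ceil(rate_hz * dropout_sec - 1e-9));
}
```

With D(Q) = 0.1 seconds, the helper gives 10 for a rate of 100 Hz and 200 for a rate of 2000 Hz, the same values as in the example above.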

32.4.2.2 Understanding the Receive Queue and Setting Its Size

DDS samples are stored in the DataReader’s receive queue (from which samples are read/taken), which is accessible to the user’s application.

A DDS sample is removed from the receive queue after it has been accessed by take(), as described in 41.3 Accessing DDS Data Samples with Read or Take. Note that read() does not remove DDS samples from the queue.

A DataReader's receive queue size is limited by its 47.22 RESOURCE_LIMITS QosPolicy, specifically the max_samples field. The storage of out-of-order DDS samples for each DataWriter is also allocated from the DataReader's receive queue; this DDS sample resource is shared among all reliable DataWriters. That is, max_samples includes both ordered and out-of-order DDS samples.

For a keyed DataReader, one sample per instance is automatically created in the receive queue to hold the state of the instance. These are the samples that have the valid_data flag in the SampleInfo set to FALSE when they are read or taken. They signal to the application that the instance has transitioned states. The reserved sample does not count towards the max_samples resource limit. It is possible, then, for a DataReader to have more than max_samples samples available to read or take if some of them are samples with valid_data set to FALSE.

A DataReader has multiple levels of queues that a sample must move through before reaching the receive queue where it can be accessed from the application. The max_samples resource limit applies across these queues. The first level is the remote writer queue (on the DataReader side), where samples are stored until they are received in order. At this level, all sample types (data, dispose, unregister) are treated the same and count towards max_samples. Once a sample is moved to the receive queue, it may or may not continue to count towards the max_samples resource limits, depending on the following:

Data samples always count towards the max_samples resource limit.
Dispose samples will use the reserved sample per-instance and do not count towards max_samples.
Unregister samples that cause the instance to transition to NOT_ALIVE_NO_WRITERS will also use the reserved sample per-instance and do not count towards max_samples.
Unregister samples that do not trigger a transition to NOT_ALIVE_NO_WRITERS simply cause the association between the remote writer and the instance to be removed and do not take up any space in the receive queue once the unregister has been processed.
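The counting rules in the list above can be condensed into two predicates. This is a sketch of the rules as stated in this section only: SampleKind, QueueLevel, and both functions are hypothetical names, not Connext types.

```cpp
// Hypothetical enums naming the sample types and queue levels in the text.
enum class SampleKind { Data, Dispose, Unregister };
enum class QueueLevel { RemoteWriterQueue, ReceiveQueue };

// In the remote writer queue every sample type counts towards max_samples.
// Once in the receive queue, only data samples continue to count.
bool counts_toward_max_samples(SampleKind kind, QueueLevel level) {
    if (level == QueueLevel::RemoteWriterQueue) return true;
    return kind == SampleKind::Data;
}

// True if the sample is stored in the reserved per-instance sample:
// every dispose, and an unregister that moves the instance to
// NOT_ALIVE_NO_WRITERS. Other unregisters occupy no receive queue space.
bool uses_reserved_instance_sample(SampleKind kind,
                                   bool transitions_to_no_writers) {
    if (kind == SampleKind::Dispose) return true;
    return kind == SampleKind::Unregister && transitions_to_no_writers;
}
```

For example, a dispose sample counts towards max_samples while it waits in the remote writer queue, but stops counting once it reaches the receive queue, where it uses the reserved per-instance sample.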

An example of how each type of sample is accepted into the receive queue is in the figures below.

A DataReader can maintain reliable communications with multiple DataWriters (e.g., in the case of the 47.18 OWNERSHIP_STRENGTH QosPolicy setting of SHARED). The maximum number of out-of-order DDS samples from any one DataWriter that can occupy the receive queue is set in the max_samples_per_remote_writer field of the 48.2 DATA_READER_RESOURCE_LIMITS QosPolicy (DDS Extension); this value can be used to prevent a single DataWriter from using all the space in the receive queue. max_samples_per_remote_writer must be set to be <= max_samples.

The DataReader will cache DDS samples that arrive out of order while waiting for missing DDS samples to be resent. (Up to 256 DDS samples can be resent; this limitation is imposed by the wire protocol.) If there is no room, the DataReader has to reject out-of-order DDS samples and request them again later after the missing DDS samples have arrived.
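The caching and rejection behavior described above can be illustrated with a toy reorder cache for a single remote DataWriter. This is a simplified model, not the Connext implementation: ReorderCacheSketch is a hypothetical class, its capacity stands in for the space available to out-of-order samples, and it assumes sequence numbers start at 1.

```cpp
#include <cstddef>
#include <set>

// Toy model of out-of-order caching for one remote DataWriter (hypothetical).
class ReorderCacheSketch {
public:
    enum class Result { Delivered, Cached, Rejected, Duplicate };

    explicit ReorderCacheSketch(std::size_t capacity)
        : capacity_(capacity), next_expected_(1) {}

    // Processes an arriving sample with sequence number sn.
    Result receive(long sn) {
        if (sn < next_expected_ || cached_.count(sn) > 0) {
            return Result::Duplicate;  // already delivered or already cached
        }
        if (sn == next_expected_) {
            ++next_expected_;
            // Release any cached samples that are now in order.
            while (!cached_.empty() && *cached_.begin() == next_expected_) {
                cached_.erase(cached_.begin());
                ++next_expected_;
            }
            return Result::Delivered;
        }
        if (cached_.size() >= capacity_) {
            return Result::Rejected;  // no room: must be requested again later
        }
        cached_.insert(sn);
        return Result::Cached;
    }

    // Next sequence number the reader is waiting for.
    long next_expected() const { return next_expected_; }

private:
    std::size_t capacity_;
    long next_expected_;
    std::set<long> cached_;
};
```

With a capacity of 2, if sample 2 is lost, samples 3 and 4 are cached and sample 5 is rejected. When sample 2 is resent, samples 2 through 4 are released in order, and sample 5 has to be sent again. A larger capacity would have cached sample 5 and avoided that extra resend.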

The appropriate size of the DataReader's receive queue depends on application parameters, such as the DataWriter’s sending rate and the probability of a dropped DDS sample. However, the receive queue size should generally be larger than the DataWriter's send queue size. Send queue size is discussed in 32.4.2.1 Understanding the Send Queue and Setting its Size.

Figure 32.5: Effect of Receive-Queue Size on Performance: Large Queue Size and Figure 32.6: Effect of Receive Queue Size on Performance: Small Queue Size compare two hypothetical DataReaders, both interacting with the same DataWriter. The queue on the left represents an ordering cache, allocated from the receive queue—DDS samples are held here if they arrive out of order. The DataReader in Figure 32.5: Effect of Receive-Queue Size on Performance: Large Queue Size has a sufficiently large receive queue (max_samples) for the given send rate of the DataWriter and other operational parameters. In both cases, we assume that all DDS samples are taken from the DataReader in the Listener callback. (See 41.3 Accessing DDS Data Samples with Read or Take for information on take() and related operations.)

In Figure 32.6: Effect of Receive Queue Size on Performance: Small Queue Size, max_samples is too small to cache out-of-order DDS samples for the same operational parameters. In both cases, the DataReaders eventually receive all the DDS samples in order. However, the DataReader with the larger max_samples will get the DDS samples earlier and with fewer transactions. In particular, DDS sample “4” is never resent for the DataReader with the larger queue size.

Figure 32.5: Effect of Receive-Queue Size on Performance: Large Queue Size

Figure 32.6: Effect of Receive Queue Size on Performance: Small Queue Size