48.1 DATA_READER_PROTOCOL QosPolicy (DDS Extension)
The DATA_READER_PROTOCOL QosPolicy applies only to DataReaders that are set up for reliable operation (see 47.21 RELIABILITY QosPolicy). This policy allows the application to fine-tune the reliability protocol separately for each DataReader. For details of the reliable protocol used by Connext, see Chapter 32 Reliability Models for Sending Data.
Connext uses a standard protocol for packet (user and meta data) exchange between applications. The DataReaderProtocol QosPolicy gives you control over configurable portions of the protocol, including the configuration of the reliable data delivery mechanism of the protocol on a per DataReader basis.
These configuration parameters control timing and timeouts, and give you the ability to trade off between speed of data loss detection and repair, versus network and CPU bandwidth used to maintain reliability.
It is important to tune the reliability protocol on a per DataReader basis to meet the requirements of the end-user application so that data can be sent between DataWriters and DataReaders in an efficient and optimal manner in the presence of data loss.
You can also use this QosPolicy to control how DDS responds to "slow" reliable DataReaders or ones that disconnect or are otherwise lost.
See the 47.21 RELIABILITY QosPolicy for more information on the per-DataReader/DataWriter reliability configuration. The 47.12 HISTORY QosPolicy and 47.22 RESOURCE_LIMITS QosPolicy also play an important role in the DDS reliability protocol.
This policy includes the members presented in Table 48.1 DDS_DataReaderProtocolQosPolicy and Table 48.2 DDS_RtpsReliableReaderProtocol_t. For defaults and valid ranges, please refer to the API Reference HTML documentation.
The following rule applies when you set the fields in this policy. If the rule is violated, Connext returns DDS_RETCODE_INCONSISTENT_POLICY when the QoS is set:
max_heartbeat_response_delay >= min_heartbeat_response_delay
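The rule above can be checked before the QoS is applied. The following is a minimal sketch in plain Python, not the Connext API: the Duration class is a hypothetical stand-in for DDS_Duration_t, and heartbeat_delays_consistent is an illustrative helper name.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Duration:
    """Hypothetical stand-in for DDS_Duration_t (seconds plus nanoseconds).

    order=True compares (sec, nanosec) as a tuple, which is only valid
    when nanosec is normalized to the range 0..999_999_999.
    """
    sec: int = 0
    nanosec: int = 0


def heartbeat_delays_consistent(min_delay: Duration, max_delay: Duration) -> bool:
    """Return True when max_heartbeat_response_delay >= min_heartbeat_response_delay."""
    return max_delay >= min_delay


# A 0 s minimum with a 0.5 s maximum satisfies the rule.
ok = heartbeat_delays_consistent(Duration(0, 0), Duration(0, 500_000_000))
# A 1 s minimum with a 0.5 s maximum violates it.
bad = heartbeat_delays_consistent(Duration(1, 0), Duration(0, 500_000_000))
```

A check like this only mirrors the documented rule; Connext performs its own validation when the QoS is set.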
Table 48.1 DDS_DataReaderProtocolQosPolicy
Type | Field Name | Description |
DDS_GUID_t | virtual_guid | The virtual GUID (Global Unique Identifier) is used to uniquely identify the same DataReader across multiple incarnations. In other words, this value allows Connext to remember information about a DataReader that may be deleted and then recreated. This value is used to provide durable reader state. For more information, see 21.2 Durability and Persistence Based on Virtual GUIDs. By default, Connext will assign a virtual GUID automatically. If you want to restore the DataReader’s state after a restart, you can get the DataReader's virtual GUID using its get_qos() operation, then set the virtual GUID of the restarted DataReader to the same value. |
DDS_UnsignedLong | rtps_object_id | Determines the DataReader’s RTPS object ID, according to the DDS-RTPS Interoperability Wire Protocol. Only the last 3 bytes are used; the most significant byte is ignored. The rtps_host_id, rtps_app_id, rtps_instance_id in the 44.10 WIRE_PROTOCOL QosPolicy (DDS Extension), together with the 3 least significant bytes in rtps_object_id, and another byte assigned by Connext to identify the entity type, form the BuiltinTopicKey in SubscriptionBuiltinTopicData. |
DDS_Boolean | expects_inline_qos | Specifies whether this DataReader expects inline QoS with every sample. Connext DataWriters do not match with DataReaders that set this field to TRUE (because Connext DataWriters do not support sending inline QoS), but here is how this field is meant to be used: DataReaders usually rely on the discovery process to propagate QoS changes for matched DataWriters. Another way to get QoS information is to have it sent inline with a DDS sample. With Connext, DataWriters and DataReaders cache discovery information, so sending inline QoS is typically unnecessary. The use of inline QoS is only needed for stateless implementations of DDS in which DataReaders do not cache Discovery information. The complete set of QoS that a DataWriter may send inline is specified by the Real-Time Publish-Subscribe (RTPS) Wire Interoperability Protocol. Note: The use of inline QoS creates an additional wire-payload, consuming extra bandwidth and serialization/deserialization time. |
DDS_Boolean | disable_positive_acks | Determines whether the DataReader sends positive acknowledgements (ACKs) to matching DataWriters. When TRUE, the matching DataWriter will keep DDS samples in its queue for this DataReader for a minimum keep duration (see 47.5.3 Disabling Positive Acknowledgements). When strict reliability is not required and NACK-based reliability is sufficient, setting this field to TRUE reduces overhead network traffic. |
DDS_Boolean | propagate_dispose_of_unregistered_instances | Indicates whether or not an instance can move to the DDS_NOT_ALIVE_DISPOSED_INSTANCE_STATE state without being in the DDS_ALIVE_INSTANCE_STATE state. See 19.1 Instance States for more information about this transition. When set to TRUE, the DataReader will receive dispose notifications even if the instance is not alive. This field only applies to keyed DataReaders. To make sure the key is available to the FooDataReader’s get_key_value() operation, use this option in combination with setting the DataWriter’s serialize_key_with_dispose field (in the 47.5 DATA_WRITER_PROTOCOL QosPolicy (DDS Extension)) to TRUE. See 47.5.5 Propagating Serialized Keys with Disposed-Instance Notifications. |
DDS_Boolean | propagate_unregister_of_disposed_instances | Indicates whether or not an instance can move to the DDS_NOT_ALIVE_NO_WRITERS_INSTANCE_STATE state directly from the DDS_NOT_ALIVE_DISPOSED_INSTANCE_STATE. See 19.1 Instance States for more information about this transition. When set to TRUE, the DataReader will receive unregister notifications even if the instance is already disposed. This field only applies to keyed DataReaders. |
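The statement that only the 3 least significant bytes of the RTPS object ID are used can be illustrated with a bit mask. This is a sketch under the assumption that the field is treated as a 32-bit unsigned value; the function name is hypothetical and is not part of the Connext API.

```python
def effective_object_id(rtps_object_id: int) -> int:
    """Keep the 3 least significant bytes; drop the most significant byte.

    Assumes a 32-bit unsigned value, as for DDS_UnsignedLong.
    """
    return rtps_object_id & 0x00FFFFFF


# The top byte (0xAB here) is ignored, leaving 0x123456.
masked = effective_object_id(0xAB123456)
```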
Table 48.2 DDS_RtpsReliableReaderProtocol_t
Type | Field Name | Description |
DDS_Duration_t | min_heartbeat_response_delay | Minimum delay between when the DataReader receives a heartbeat and when it sends an ACK/NACK. |
DDS_Duration_t | max_heartbeat_response_delay | Maximum delay between when the DataReader receives a heartbeat and when it sends an ACK/NACK. Increasing this value helps prevent NACK storms, but increases latency. |
DDS_Duration_t | heartbeat_suppression_duration | How long additionally received heartbeats are suppressed. When a reliable DataReader receives consecutive heartbeats within a short duration, this may trigger redundant NACKs. To prevent the DataReader from sending redundant NACKs, the DataReader may ignore the latter heartbeat(s) for this amount of time. See 32.4.4.1 How Often Heartbeats are Resent (heartbeat_period). |
DDS_Duration_t | nack_period | Rate at which to send negative acknowledgements to new DataWriters. See 48.1.3 Example. |
DDS_Long | receive_window_size | The number of received out-of-order DDS samples a reader can keep at a time. See 48.1.1 Receive Window Size. |
DDS_Duration_t | round_trip_time | The duration from sending a NACK to receiving a repair of a DDS sample. See 48.1.2 Reducing Redundant NACK Generation. |
DDS_Duration_t | app_ack_period | The period at which application-level acknowledgment messages are sent. A DataReader sends application-level acknowledgment messages to a DataWriter at this periodic rate, and will continue sending until it receives a message from the DataWriter that it has received and processed the acknowledgment. |
DDS_Duration_t | min_app_ack_response_keep_duration | Minimum duration for which application-level acknowledgment response data is kept. The user-specified response data of an explicit application-level acknowledgment (called by DataReader’s acknowledge_sample() or acknowledge_all() operations) is cached by the DataReader for the purpose of reliably resending the data with the acknowledgment message. After this duration has passed from the time of the first acknowledgment, the response data is dropped from the cache and will not be resent with future acknowledgments for the corresponding DDS sample(s). |
DDS_Long | samples_per_app_ack | The minimum number of DDS samples acknowledged by one application-level acknowledgment message. This setting applies only when the 47.21 RELIABILITY QosPolicy acknowledgment_kind is set to APPLICATION_EXPLICIT or APPLICATION_AUTO. A DataReader will immediately send an application-level acknowledgment message when it has at least this many DDS samples that have been acknowledged. It will not send an acknowledgment message until it has at least this many DDS samples pending acknowledgment. For example, calling the DataReader’s acknowledge_sample() this many times consecutively will trigger the sending of an acknowledgment message. Calling the DataReader’s acknowledge_all() may trigger the sending of an acknowledgment message, if at least this many DDS samples are being acknowledged at once. See 41.4 Acknowledging DDS Samples. This is independent of the DDS_RtpsReliableReaderProtocol_t’s app_ack_period, where a DataReader will send acknowledgment messages at the periodic rate regardless. When this is set to DDS_LENGTH_UNLIMITED, acknowledgment messages are sent only periodically, at the rate set by DDS_RtpsReliableReaderProtocol_t’s app_ack_period. |
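The threshold behavior of samples_per_app_ack described in the table can be sketched as a simple counter. This is a toy model in plain Python, not the Connext implementation: the class and method names are hypothetical, DDS_LENGTH_UNLIMITED is assumed to be -1, and the model does not cover resending an acknowledgment until the DataWriter confirms it.

```python
LENGTH_UNLIMITED = -1  # assumed value of DDS_LENGTH_UNLIMITED


class AppAckBatcher:
    """Toy model: send an app-level ack message once enough samples are pending."""

    def __init__(self, samples_per_app_ack):
        self.samples_per_app_ack = samples_per_app_ack
        self.pending = 0        # acknowledged samples not yet sent in a message
        self.messages_sent = 0

    def acknowledge(self, count=1):
        """Record `count` acknowledged samples; return True if a message is sent now."""
        self.pending += count
        if (self.samples_per_app_ack != LENGTH_UNLIMITED
                and self.pending >= self.samples_per_app_ack):
            self._send()
            return True
        return False

    def on_app_ack_period(self):
        """Periodic send at app_ack_period, independent of the threshold."""
        if self.pending:
            self._send()

    def _send(self):
        self.messages_sent += 1
        self.pending = 0


# With a threshold of 3, the third acknowledge_sample()-style call triggers a message.
batched = AppAckBatcher(3)
results = [batched.acknowledge() for _ in range(3)]

# With LENGTH_UNLIMITED, only the periodic timer sends messages.
periodic_only = AppAckBatcher(LENGTH_UNLIMITED)
sent_immediately = periodic_only.acknowledge(100)
periodic_only.on_app_ack_period()
```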
48.1.1 Receive Window Size
A reliable DataReader presents the DDS samples it receives to the user in order. If it receives DDS samples out of order, it stores them internally until the missing DDS samples are received. For example, if the DataWriter sends DDS samples 1 and 2 and the DataReader receives 2 first, it will wait until it receives 1 before passing the DDS samples to the user.
The number of out-of-order DDS samples that a DataReader can keep is set by the receive_window_size. A larger window allows more out-of-order DDS samples to be kept. When the window is full, any subsequent out-of-order DDS samples received will be rejected, and such rejections would necessitate NACK repairs that would degrade throughput. So, in network environments where out-of-order samples are more probable or where NACK repairs are costly, this window likely should be increased.
By default, the window is set to 256, which is the maximum number of DDS samples a single NACK submessage can request.
Samples rejected for exceeding the receive_window_size are counted in out_of_range_rejected_sample_count in the 40.7.3 DATA_READER_PROTOCOL_STATUS, but not included in the 40.7.8 SAMPLE_REJECTED Status.
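The buffering and rejection behavior described in this section can be sketched with a small model. This is an illustration only: the class name is hypothetical, and the window is modeled as a count of buffered out-of-order samples, as the text describes, which may differ from how Connext tracks the window internally.

```python
class InOrderReader:
    """Toy model of in-order delivery with a bounded out-of-order buffer."""

    def __init__(self, receive_window_size):
        self.receive_window_size = receive_window_size
        self.next_expected = 1   # next sequence number to hand to the user
        self.buffered = set()    # out-of-order samples waiting for a gap to fill
        self.delivered = []      # samples presented to the user, in order
        self.rejected = 0        # models out_of_range_rejected_sample_count

    def receive(self, sn):
        if sn < self.next_expected or sn in self.buffered:
            return  # duplicate, nothing to do
        if sn > self.next_expected:
            if len(self.buffered) >= self.receive_window_size:
                self.rejected += 1  # window full: sample must be repaired later
            else:
                self.buffered.add(sn)
            return
        # sn is the next expected sample: deliver it and drain the buffer.
        self.delivered.append(sn)
        self.next_expected += 1
        while self.next_expected in self.buffered:
            self.buffered.remove(self.next_expected)
            self.delivered.append(self.next_expected)
            self.next_expected += 1


# With a window of 2, samples 2 and 3 are held, 4 is rejected,
# and the arrival of 1 releases 1, 2, and 3 in order.
reader = InOrderReader(receive_window_size=2)
for sn in [2, 3, 4, 1]:
    reader.receive(sn)
```

In this run, sample 4 would have to be requested again through a NACK, which is the throughput cost the section describes.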
48.1.2 Reducing Redundant NACK Generation
When a DataReader requests a DDS sample to be resent, there is a delay from when the NACK is sent, to when it receives the resent DDS sample. During that delay, the DataReader may receive heartbeats that normally would trigger another NACK for the same DDS sample. Such redundant requests for repairs waste bandwidth and degrade throughput.
The heartbeat_suppression_duration setting allows you to suppress heartbeats that would otherwise cause a NACK to be sent out for the same samples that were previously NACKed within that duration. The setting works by keeping track of the lowest sequence number (SN) that was requested by the previous NACK. If the new NACK also requests that same SN, then the heartbeat that triggered the NACK response is ignored. This implementation has two consequences to be aware of:
- The DataReader may be delayed in requesting repairs for newer samples that have been written since the last heartbeat. If a heartbeat announces new SNs that the DataReader has not had a chance to request yet, they may not be requested until the heartbeat_suppression_duration has elapsed. For example, a DataReader may NACK samples 5-10 after receiving a heartbeat from a DataWriter announcing SNs 1-10. Then, if the DataReader receives a new heartbeat announcing samples 1-15 before it receives the repair of sample 5 and before the heartbeat_suppression_duration elapses, the new heartbeat will be dropped and the DataReader will not NACK samples 11-15 until the heartbeat_suppression_duration elapses, even if those samples are also missing. So while the heartbeat_suppression_duration can reduce duplicate requests and repairs for samples, it may also introduce repair latency in some cases.
- The DataReader may still send redundant NACKs if the starting SN of the heartbeats from the DataWriter keeps advancing. The starting SN of the heartbeat will advance whenever the DataWriter removes the lowest SN from its cache (in KEEP_LAST configurations or with a finite sample lifespan per the 47.14 LIFESPAN QoS Policy, for example). So, a DataReader may NACK samples 5-10 after receiving a heartbeat from a DataWriter announcing SNs 1-10. Then, if the DataReader receives a new heartbeat announcing samples 6-10 before receiving the repairs for 5-10, it will request samples 6-10, even if the heartbeat_suppression_duration has not elapsed, because the lowest SN is different from that of the previous NACK that the DataReader sent. This may cause samples 6-10 to be repaired twice, depending on the DataWriter's configuration for suppressing redundant NACKs (see 32.4.4.6 Coping with Redundant NACKs for Missing DDS Samples (nack_suppression_duration and min/max_nack_response_delay)).
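Both consequences in the list above follow from comparing only the lowest requested SN. The sketch below models that comparison in plain Python and replays the two examples from the list. The class and parameter names are hypothetical, times are integer milliseconds, and the model is a simplification, not the Connext implementation.

```python
class SuppressingReader:
    """Toy model of heartbeat suppression keyed on the lowest NACKed SN."""

    def __init__(self, suppression_ms, received=()):
        self.suppression_ms = suppression_ms  # models heartbeat_suppression_duration
        self.received = set(received)         # SNs the reader already holds
        self.last_nack_lowest = None          # lowest SN in the previous NACK
        self.last_nack_ms = None              # time the previous NACK was sent

    def on_heartbeat(self, now_ms, first_sn, last_sn):
        """Return the SNs NACKed in response, or [] if no NACK is sent."""
        missing = [sn for sn in range(first_sn, last_sn + 1)
                   if sn not in self.received]
        if not missing:
            return []  # nothing to repair
        lowest = missing[0]
        if (lowest == self.last_nack_lowest
                and now_ms - self.last_nack_ms < self.suppression_ms):
            return []  # same lowest SN within the duration: heartbeat ignored
        self.last_nack_lowest, self.last_nack_ms = lowest, now_ms
        return missing


# First consequence: samples 11-15 are not requested until the duration elapses.
a = SuppressingReader(suppression_ms=1000, received=range(1, 5))
first = a.on_heartbeat(0, 1, 10)        # NACKs 5-10
delayed = a.on_heartbeat(500, 1, 15)    # ignored: lowest SN is still 5
later = a.on_heartbeat(1500, 1, 15)     # duration elapsed: NACKs 5-15

# Second consequence: the heartbeat's first SN advanced, so suppression is bypassed.
b = SuppressingReader(suppression_ms=1000, received=range(1, 5))
b.on_heartbeat(0, 1, 10)                # NACKs 5-10
redundant = b.on_heartbeat(500, 6, 10)  # NACKs 6-10 again
```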
The min_heartbeat_response_delay and max_heartbeat_response_delay configure a random delay in responding to heartbeats. During this delay, all received heartbeats are grouped and then when the delay elapses they are all responded to at once, thereby eliminating any duplicate NACK requests that otherwise would have been generated if each heartbeat had been responded to individually. The tradeoff with these QoS settings, as with the min_nack_response_delay and max_nack_response_delay, is that they introduce latency into the repair responsiveness which must be taken into consideration.
Finally, the round_trip_time is a user-configured estimate of the delay between sending a NACK and receiving a repair. A DataReader keeps track of when a DDS sample has been NACKed, and will prevent subsequent NACKs from redundantly requesting the same DDS sample until the round trip time has passed.
Note that the default value of 0 seconds means that the DataReader does not filter for redundant NACKs.
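The per-sample bookkeeping that round_trip_time implies can be sketched as follows. This is a toy model with hypothetical names and integer-millisecond times, not the Connext implementation; it records when each SN was last NACKed and filters out SNs requested less than one round trip ago.

```python
class RoundTripFilter:
    """Toy model: drop SNs from a NACK if they were requested too recently."""

    def __init__(self, round_trip_ms):
        self.round_trip_ms = round_trip_ms  # models round_trip_time
        self.nacked_at = {}                 # SN -> time of the last NACK for it

    def filter_nack(self, now_ms, missing):
        """Return the subset of `missing` that should be requested now."""
        to_request = []
        for sn in missing:
            last = self.nacked_at.get(sn)
            if last is None or now_ms - last >= self.round_trip_ms:
                to_request.append(sn)
                self.nacked_at[sn] = now_ms
        return to_request


f = RoundTripFilter(round_trip_ms=200)
initial = f.filter_nack(0, [5, 6])        # both requested
newer = f.filter_nack(100, [5, 6, 7])     # only 7 is new; 5 and 6 are filtered
retry = f.filter_nack(250, [5, 6, 7])     # 5 and 6 retried; 7 still in flight

# With a round trip time of 0, nothing is ever filtered.
z = RoundTripFilter(round_trip_ms=0)
z.filter_nack(0, [5])
unfiltered = z.filter_nack(0, [5])
```

Unlike the lowest-SN comparison used for heartbeat suppression, this approach keeps state for every NACKed SN, which is where its additional CPU and memory cost comes from.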
Our testing shows that the default round_trip_time of 0 seconds is sufficient for most applications on typical Ethernet LANs.
However, if your system has very slow computers and/or a slow network, consider increasing round_trip_time. In such a system, sending an ACKNACK and resending a missing DDS sample inherently take a long time, so you should allow more time for recovery of the lost DDS sample before sending another ACKNACK.
If your system consists of a fast network or computers, and the receive queue size is very small, then you should keep round_trip_time very small (such as the default value of 0). If the queue size is small, recovering a missing DDS sample is more important than conserving CPU and network bandwidth (new DDS samples that are too far ahead of the missing DDS sample are thrown away). A fast system can cope with a smaller round_trip_time value, and the reliable DDS sample stream can normalize more quickly.
The heartbeat_suppression_duration and round_trip_time are two mechanisms to achieve similar results. The heartbeat_suppression_duration is much less CPU-intensive since it only compares the lowest previously NACKed sample with the current one before deciding to ignore a heartbeat altogether or not. This is quick and may work well most of the time. However, it has the drawbacks described above in that it may suppress NACKs for newer samples for longer than desired or still result in redundant NACKs. The round_trip_time resolves both of these issues but requires more CPU and memory to keep track of exactly which SNs have been NACKed and when for each DataWriter; every NACK must be checked against this list.
Sometimes it is not feasible to configure your system to suppress all redundant heartbeat responses, or you may wish to avoid some of the drawbacks to the heartbeat_suppression_duration that have been described in this section. In these cases, there are parallel settings for the DataWriter, which are described in 32.4.4.6 Coping with Redundant NACKs for Missing DDS Samples (nack_suppression_duration and min/max_nack_response_delay).
48.1.3 Example
For many applications, changing these values will not be necessary. However, the more nodes that your distributed application uses, and the greater the amount of network traffic it generates, the more likely it is that you will want to consider experimenting with these values.
When a reliable DataReader receives a heartbeat from a DataWriter, it will send an ACK/NACK packet back to the DataWriter. Instead of sending the packet out immediately, the DataReader can choose to send it after a delay. This policy sets the minimum and maximum time to delay; the actual delay will be a random value in between. (For more on heartbeats and ACK/NACK messages, see Chapter 22 Discovery Overview.)
Why is a delay useful? For DataWriters that have multiple reliable DataReaders, an efficient way of heartbeating all of the DataReaders is to send a single heartbeat via multicast. In that case, all of the DataReaders will receive the heartbeat (approximately) simultaneously. If all DataReaders immediately respond with an ACK/NACK packet, the network may be flooded. While the size of an ACK/NACK packet is relatively small, as the number of DataReaders increases, the chance of packet collision also increases. All of these conditions may lead to dropped packets, which force the DataWriter to send out additional heartbeats that cause more simultaneous ACK/NACK responses to be sent, ultimately resulting in a network packet storm.
By forcing each DataReader to wait for a random amount of time, bounded by the minimum and maximum values in this policy, before sending an ACK/NACK response to a heartbeat, the use of the network is spread out over a period of time, decreasing the peak bandwidth required as well as the likelihood of dropped packets due to collisions. This can increase the overall performance of the reliable connection while avoiding a network storm.
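The effect of the randomized delay can be sketched with a short simulation. This is an illustration in plain Python with hypothetical function names; it assumes the delay is drawn uniformly between the minimum and maximum, which the text does not specify beyond "a random value in between".

```python
import random


def heartbeat_response_delay(min_delay, max_delay, rng):
    """Pick a response delay in seconds between the two bounds (uniform assumed)."""
    if max_delay < min_delay:
        raise ValueError("max_heartbeat_response_delay must be >= min")
    return rng.uniform(min_delay, max_delay)


def schedule_responses(reader_count, min_delay, max_delay, seed=0):
    """Return sorted ACK/NACK send times for readers that share one multicast heartbeat."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return sorted(heartbeat_response_delay(min_delay, max_delay, rng)
                  for _ in range(reader_count))


# 50 readers answering one heartbeat, spread over a 0 to 0.5 s interval.
spread = schedule_responses(50, 0.0, 0.5)

# Equal bounds remove the randomness: every reader responds at the same offset.
simultaneous = schedule_responses(3, 0.1, 0.1)
```

Widening the interval spreads the responses further apart at the cost of a longer worst-case repair latency.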
When a reliable DataReader first matches a reliable DataWriter, the DataReader sends periodic NACK messages at the specified period to pull historical data from the DataWriter. The DataReader will stop sending periodic NACKs when it has received all historical data available at the time that it matched the DataWriter. The DataReader ensures that at least one NACK is sent per period; for example, if, within a NACK period, the DataReader responds to a HEARTBEAT message with a NACK, then the DataReader will not send another periodic NACK.
48.1.4 Properties
This QosPolicy cannot be modified after the DataReader is created.
It only applies to DataReaders, so there are no restrictions for setting it compatibly with respect to DataWriters.
48.1.5 Related QosPolicies
48.1.6 Applicable DDS Entities
48.1.7 System Resource Considerations
Changing the values in this policy requires making tradeoffs between minimizing latency (decreasing min_heartbeat_response_delay), maximizing determinism (decreasing the difference between min_heartbeat_response_delay and max_heartbeat_response_delay), and minimizing network collisions/spreading out the ACK/NACK packets across a time interval (increasing the difference between min_heartbeat_response_delay and max_heartbeat_response_delay and/or shifting their values between different DataReaders).
If the values are poorly chosen with respect to the characteristics and requirements of a given application, the latency and/or throughput of the application may suffer.