Hi,
I'm working on a project with a fairly large number of topics. A large share of these topics are sent reliably, and we are seeing performance issues in some of the processes responsible for handling them.
We are adjusting the QoS settings for these topics specifically to improve performance, but I'm a little fuzzy on what happens when a publisher times out the heartbeating for one of its subscribers (that is, it times out waiting for ACKNACKs and exceeds the maximum number of retries).
Does this cause a lot of extra processing on the publisher side when the subscriber starts sending ACKNACKs again after the publisher has marked the subscriber as "inactive" (not alive)? Is the subscriber still considered "discovered", or does that processing need to happen again?
If I understand correctly, the publisher will declare the subscriber "active" again once it starts receiving ACKNACKs. But if the publisher never sends a HB to the inactive subscriber (low watermark = 0, period = high number, or something similar), would the subscriber need to implement something in the liveliness change callbacks to resend ACKNACKs if it noticed it had been set to "inactive"?
Cheers
-Bryan
We are using RTI Data Distribution Service 4.4d.rev33.
Hello Bryan,
The short answer is no. There is not a lot of processing involved when a DataWriter switches a DataReader from active to inactive and back to active. This mechanism is independent of the Discovery mechanism. For more details see below.
The "activity" mechanism is in place to prevent a very slow or non-responding reliable DataReader from holding back the resources of the reliable DataWriter as well as the progress of the other DataReaders.
For example, if the DataWriter is configured with a finite "send_window_size", this window sets the maximum number of samples that the DataWriter will publish ahead of the acknowledgments received. If send_window_size=10 and the application writes continually, the DataWriter will push 10 samples out and then block (or time out) if the application tries to write the 11th one before the first is acknowledged by all reliable (and active) DataReaders. If the Publisher has been configured as "asynchronous", the application calling the "write" operation will not see it block, but the DataWriter will still not push the 11th sample on the wire to any DataReaders until the 1st is acknowledged by all reliable (and active) DataReaders.
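To make the window behavior above concrete, here is a minimal toy model in Python. It is only a sketch of the idea described in this reply, not the RTI API: the class `ToyWriter` and its methods are hypothetical names invented for illustration, and it assumes the window is counted as samples sent beyond the lowest acknowledgment among active readers.

```python
class ToyWriter:
    """Toy model of a reliable writer's send window (hypothetical, not the RTI API)."""

    def __init__(self, send_window_size, readers):
        self.send_window_size = send_window_size
        # Highest sequence number acknowledged by each reader (0 = nothing yet).
        self.acked = {name: 0 for name in readers}
        # Every reader starts out active.
        self.active = {name: True for name in readers}
        self.last_sent = 0

    def in_flight(self):
        # Only active readers hold the window back.
        acks = [self.acked[r] for r in self.acked if self.active[r]]
        # Assumption: with no active readers, nothing is considered in flight.
        lowest_ack = min(acks) if acks else self.last_sent
        return self.last_sent - lowest_ack

    def write(self):
        """Return True if the sample goes out, False if the window is full."""
        if self.in_flight() >= self.send_window_size:
            return False
        self.last_sent += 1
        return True

    def on_ack(self, reader, seq):
        # Acknowledgments are cumulative: keep the highest one seen.
        self.acked[reader] = max(self.acked[reader], seq)

    def set_active(self, reader, is_active):
        self.active[reader] = is_active
```

With `ToyWriter(10, ["fast", "slow"])`, ten calls to `write()` succeed and the eleventh is refused until the slow reader either acknowledges the first sample or is marked inactive with `set_active("slow", False)`.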
Even if "send_window_size" is set to infinite (which is not advisable), a DataWriter will keep resources around and potentially consume extra bandwidth with "fast heartbeats" as long as there is a reliable (and active) DataReader that has not acknowledged samples.
When the DataWriter switches the DataReader to "inactive", it basically treats the inactive DataReader as "best effort", except that it still sends periodic heartbeats to it so that it has a chance to reply with NACKs or ACKs and thus become active again.
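The active/inactive switch described above can be sketched as a small state machine. This is a toy illustration only: `ReaderState`, `max_retries`, and the method names are hypothetical, and the exact threshold (strictly more unanswered heartbeats than `max_retries`) is an assumption rather than documented behavior.

```python
class ReaderState:
    """Toy per-reader bookkeeping kept by a writer (hypothetical, not the RTI API)."""

    def __init__(self, max_retries):
        self.max_retries = max_retries
        self.unanswered_heartbeats = 0
        self.active = True

    def on_heartbeat_sent(self):
        # Heartbeats keep going out even while the reader is inactive,
        # so the reader always has a chance to answer.
        self.unanswered_heartbeats += 1
        # Assumption: the reader goes inactive once the count exceeds max_retries.
        if self.unanswered_heartbeats > self.max_retries:
            self.active = False

    def on_acknack(self):
        # Any ACK or NACK resets the counter and reactivates the reader.
        # No discovery state is touched, which is why the switch is cheap.
        self.unanswered_heartbeats = 0
        self.active = True
```

In this sketch, reactivation is just a counter reset and a flag flip, which matches the point that switching between active and inactive involves little processing.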
Discovery is independent of the activity mechanism. If Discovery decides the DataReader is no longer present, then it will remove all state the DataWriter has regarding the DataReader and when it re-discovers it, then it will have to set it all up again. This is a much more expensive process in terms of resources and CPU.
Gerardo
Thanks for the explanation!