Hello,
I want to achieve some kind of redundancy using keyed topics and exclusive ownership, so I have multiple writers for the topic, each with a different strength.
When the writer with the highest strength unregisters its instance, I want the readers to be notified so that they then read the instance from the highest-strength writer still alive.
Hence my question:
How can the unregistration of an instance be detected when the instance is published by multiple data writers with different strengths, so that the reader always reads the value provided by the writer with the highest strength, without publishing periodic updates?
I have a keyed topic with multiple data writers publishing the same instance/entry.
The ownership kind is set to EXCLUSIVE_OWNERSHIP. Each data writer has a different strength.
The program behaves as expected when instances/entries are published:
The reader only gets the instance/entry from the writer with the currently highest strength.
But when an instance/entry is unregistered by the writer with the currently highest strength
and the same instance is still published by another writer (same key, other data),
the reader does not get an update.
In this case, my expectation would be that the reader delivers the entry published by the writer
with the next highest strength. But in reality no changes are provided by the reader.
If I unregister the instance from all writers providing it, then the reader realizes that the
instance/entry was removed.
If I update the instance from the writer that now has the highest strength, then the reader
reacts to the update and I get the value published by this writer.
It seems that one solution to achieve the desired behavior would be to activate liveliness within the QoS and
to update the instances periodically from all data writers. However, this would lead to more
network traffic.
Are there QoS settings I am missing that would make the reader react to the unregistration of
an instance by the data writer with the currently highest strength, or is the requested behavior
not possible?
In the documentation
"The case where the DDS::DataWriter is not writing data periodically is also a very important use-case. Since the instance is not being updated at any fixed period, the "deadline" mechanism cannot be used to determine ownership. The liveliness solves this situation. Ownership is maintained while the DDS::DataWriter is "alive" and for the DDS::DataWriter to be alive it must fulfill its DDS::LivelinessQosPolicy contract. The different means to renew liveliness (automatic, manual) combined by the implied renewal each time data is written handle the three conditions above [crash], [connectivity loss], and [application fault]. Note that to handle [application fault], LIVELINESS must be DDS::LivelinessQosPolicyKind::MANUAL_BY_TOPIC_LIVELINESS_QOS. The DDS::DataWriter can retain
ownership by periodically writing data or else calling assert_liveliness if it has no data to write. Alternatively, if only protection against [crash] or [connectivity loss] is desired, it is sufficient that some task on the DDS::DataWriter process periodically writes data or calls DDS::DomainParticipant::assert_liveliness. However, this scenario requires that the DDS::DataReader knows what instances are being "written" by the DDS::DataWriter. That is the only way that the DDS::DataReader deduces the ownership of specific instances from the fact that the DDS::DataWriter is still "alive". Hence the need for the DDS::DataWriter to
"register" and "unregister" instances. Note that while "registration" can be done lazily the first time the DDS::DataWriter writes the instance, "unregistration," in general, cannot. Similar reasoning will lead to the fact that unregistration will also require a message to be sent to the DDS::DataReader. "
found here:
http://community.rti.com/docs/html/api_dotnet/structDDS_1_1OwnershipQosPolicy.html
I'm not sure whether this section means that it is only possible to detect unregistration when liveliness is used. What does "in general" mean here? Are there reader-side QoS settings that lead to the behavior requested above?
On the other hand, as long as an instance is published by only one data writer, unregistration as well as destruction of the writer is detected by the readers as expected.
Hello,
I believe the behavior you are expecting is the default behavior of DDS with regards to exclusive ownership. Specifically, you said:
In this scenario, it should get updates from the lower-strength writer. This can be tested with the Shapes demo. You can create two writers, each writing to the square topic of the same color, each using exclusive ownership but having different strengths. Then start another shapes demo and subscribe to the square topic. Now right-click on the table row representing the stronger writer and select "Unregister data and delete writer". As soon as this happens, the subscriber starts getting updates from the weaker writer.
If you see this behavior in the shapes demo, but not in your application, then there is probably something configured in your application that is changing the behavior, such as conflicting QoS settings. Or perhaps the weaker writer isn't actually sending updates.
Hi Mike,
thanks for the reply.
I see the behavior you describe as soon as I publish another value from a writer with a lower strength after the value from the writer with the highest strength is unregistered.
The behavior I'm looking for is that the reader provides the value published from the writer with the lower strength, as soon as the value from the writer with the higher strength is unregistered.
In other words: I want the reader to deliver the value from the writer with the lower strength as soon as the value from the writer with the higher strength is unregistered, without the need to write the value with the lower-strength writer again.
Is there a way to achieve this behavior with different QoS settings, or do I have to publish updates periodically from the weaker writers?
The currently used QoS settings can be found within the attached file.
Let me give an example to clarify:
Let’s pretend I need a model that is used to monitor the status of a system. The system consists of several applications. Every application is represented by an entry within this (keyed topic) model.
Each entry has the application name as key and a status field that represents the current status of the application (e.g. "not running", "running", "error" ...).
One application knows all applications that should be available within the system. Let’s name it SystemDirectory. On startup this application publishes an entry in the SystemModel for every application with the status "not running". This application uses a writer with a low writer strength.
As soon as the applications are started, they publish their actual status within the model with a higher writer strength, so that the "original" entry is overwritten.
So far so good, that part works fine.
Now an application finishes, and therefore the entry coming from that application's writer is unregistered.
I would expect that all readers now get an update that the entry from this application changed to "not running", because now the SystemDirectory has the writer with the highest strength and the reader should get the value that was provided by this writer. But in this case no update is received by the reader.
As soon as I start to periodically update the entries from the SystemDirectory, the readers get the updated value from the SystemDirectory.
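The periodic-update workaround can be sketched with a small, self-contained model (plain Python, not DDS code; the helper functions and the strength values are invented for illustration): the SystemDirectory keeps re-writing "not running" at low strength, those writes are ignored while a stronger writer owns the instance, and the first write after the unregistration gets through:

```python
# Self-contained sketch (not DDS code) of the periodic-republish
# workaround: the SystemDirectory re-writes "not running" at a low
# strength; the sample is ignored while a stronger writer owns the
# instance, but gets through on the first write after unregistration.

owner_strength = {}     # key -> strength of the current owner
status_seen = {}        # key -> last value delivered to the reader

def write(key, strength, value):
    # Deliver only samples from a writer at least as strong as the owner.
    if strength >= owner_strength.get(key, -1):
        owner_strength[key] = strength
        status_seen[key] = value

def unregister(key, strength):
    # Release ownership when the owning writer unregisters.
    if owner_strength.get(key) == strength:
        del owner_strength[key]

DIRECTORY, APP = 1, 10                   # illustrative writer strengths

write("AppA", DIRECTORY, "not running")  # SystemDirectory seeds the entry
write("AppA", APP, "running")            # application takes ownership
write("AppA", DIRECTORY, "not running")  # periodic tick: ignored (weaker)
unregister("AppA", APP)                  # application shuts down
write("AppA", DIRECTORY, "not running")  # next periodic tick gets through
print(status_seen["AppA"])               # -> not running
```

The cost, as noted above, is one periodic sample per monitored entry; the detection latency is bounded by the republish period.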
Hi Olav,
This is the expected behavior.
When using exclusive ownership, the reader will only accept samples from the writer that currently owns the instance. Samples from all other writers will be ignored -- meaning that it is the same as if the writer didn't send the sample. When the stronger writer unregisters its instance, ownership will change at the reader to the weaker writer, but since the reader has no new samples from that writer, it has no data to present to the application. In other words, the reader does not keep a stack of the last value from each writer and present whichever value is currently strongest.
Is having the SystemDirectory publish periodically a workable solution for you?
Another thing to consider in your design is how you will detect when an application stops working, but doesn't crash, such as when an application "hangs" due to a deadlock or other bug. In this case, it may not unregister its instance. If liveliness in the application is set to automatic, then it is possible that the DDS threads will continue to run and the application will appear to be working normally to all of the readers. For this reason, it may be a good idea to investigate the MANUAL_BY_PARTICIPANT and MANUAL_BY_TOPIC liveliness settings. Basically, you want to be sure that the liveliness status of ApplicationA changes when ApplicationA shuts down or stops working for whatever reason.
If you don't want to do periodic updates, then you could detect the presence and absence of applications simply by using the liveliness status. For example, you could have an "ApplicationStatus" topic that is keyed by the application ID. Each application can report its status as "Running" when it starts. Now, as long as the liveliness of a given application remains as ALIVE, then the instance for that application in the "ApplicationStatus" table will also remain ALIVE. However, if the liveliness of the application changes due to a problem or shutdown, then the instance state for that application in the "ApplicationStatus" topic will change and the readers will be notified. (You can use the lease_duration to tune how long it takes for the liveliness change to be detected.)
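The lease-based detection described above can be reasoned about with a toy model (pure Python, not the RTI API; the function names and the two-second lease are invented for illustration): an instance counts as alive only while its writer has asserted liveliness within the last lease_duration:

```python
# Toy model (not the RTI API) of lease_duration-based liveliness:
# an instance is considered ALIVE only while its writer has asserted
# liveliness (or written data) within the last LEASE_DURATION seconds.

LEASE_DURATION = 2.0   # seconds; tune to your detection-latency needs

last_assert = {}       # application key -> time of last liveliness assertion

def assert_liveliness(app, now):
    last_assert[app] = now

def instance_state(app, now):
    # Mirrors the reader-side notion of ALIVE vs. not-alive instances.
    t = last_assert.get(app)
    if t is not None and now - t <= LEASE_DURATION:
        return "ALIVE"
    return "NOT_ALIVE"

assert_liveliness("AppA", now=0.0)       # AppA writes or asserts liveliness
print(instance_state("AppA", now=1.0))   # -> ALIVE (within the lease)
print(instance_state("AppA", now=3.5))   # -> NOT_ALIVE (lease expired)
```

In real DDS the reader is notified of the change (e.g. via instance-state or liveliness-changed status) rather than polling, but the timing trade-off is the same: a shorter lease_duration detects failures faster at the cost of more liveliness traffic.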
Hi Mike,
thanks for the clarification. I was hoping to find a more "static" solution.
Then we probably have to go with periodic publication of samples.
Best regards
Olav
Hi Olav,
This is not the first time we have encountered this use case, and I wanted to give you a bit more insight into our current thinking.
The problem with a more "static" solution is twofold:
(1) It is not always clear whether it would be correct/consistent to make the old value available to the DataReader. Basically, we may have situations where a (weaker) DataWriter publishes a value for the instance, then the (stronger) DataWriter updates it with a new value, and then the stronger one goes away. In that case, delivering the older value of the instance would violate the rule of delivering samples in an order that does not violate their (source or reception) timestamps.
There are cases where the instance values represent the state of a system which is only published on change. In these situations we could consider the liveliness of the DataWriter to be equivalent to the DataWriter "continuously" publishing the instance with its current value; the fact that it does not actually go on the wire would be an optimization. In this case the behavior you describe would make sense.
In other cases the updates represent commands or transactional events that occur at one point in time and do not represent the "continuous" state. In that case re-delivering the old value would not be correct.
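The ordering concern in point (1) can be made concrete with a tiny timeline (invented timestamps and values): replaying the weaker writer's old sample after the stronger writer unregisters would hand the reader a sample older than one it has already delivered:

```python
# Concrete timeline (invented values) illustrating point (1): replaying
# the weaker writer's old sample after the stronger writer unregisters
# would deliver a sample older than one the reader already presented.

delivered = [
    (1.0, "weak",   "v1"),   # t=1.0: weaker writer owns, v1 delivered
    (2.0, "strong", "v2"),   # t=2.0: stronger writer takes over, v2 delivered
]
# t=3.0: the stronger writer unregisters. Re-delivering the weaker
# writer's last known value would append a sample with its *old* timestamp:
delivered.append((1.0, "weak", "v1"))

timestamps = [t for (t, _, _) in delivered]
print(timestamps == sorted(timestamps))  # -> False: timestamp order violated
```

This is why a blanket "replay the weaker writer's last value" rule cannot be correct for every topic; it is only sound when the values model continuous state rather than events.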
Currently we do not have a way to specify via QoS which one of these two behaviors is the appropriate one for the Topic/DataWriter/DataReader, hence we are defaulting to the one where the DataReader value is not updated unless the DataWriter writes a sample.
(2) Implementing the more "static" behavior you describe is significantly more complex because the DataReader would need a way to receive a sample that it may have already received (and ignored because of the stronger DataWriter). This is not so simple when you take into consideration the fact that the reliable protocol underneath is trying to ensure that samples from each DataWriter are delivered in order without duplicates.
That said, we have a new feature in 5.3 called TopicQuery (see https://community.rti.com/static/documentation/connext-dds/5.3.0/doc/api/connext_dds/api_cpp/group__DDSTopicQueryModule.html) which can be used to retrieve samples previously published by a DataWriter, so we may be able to leverage this to implement the more "static" behavior you describe.
The bottom line is that the behavior you are describing is currently not supported via QoS. It is also not so easy to implement on top, because the API does not give you enough information about which instances are changing ownership... So I think that for now you would need to go with the periodic approach.
We agree this is a desirable feature. I think we have some RFEs (requests for enhancement) in that direction. I will look further into those and see if it is something we can address in the future.
Regards,
Gerardo
Hi Gerardo,
thank you very much for the detailed explanation. In my opinion it would be an important feature to have.
Maybe it would make sense to add a note to the documentation that the "static" behavior is not supported.
Best Regards
Olav