Lost messages despite Reliability QoS - Maybe powerline or QoS problems?

Offline
Last seen: 7 years 4 months ago
Joined: 08/23/2017
Posts: 1

Hello,

We are a research project (http://enex.rwth-aachen.de) and are currently on an expedition on a glacier in the Italian Alps.
We're facing some problems in our communication via DDS that we didn't observe in our laboratories.
Maybe you can give us some hints we can look into, especially regarding QoS settings.

Our hardware setup is a system of 7 homogeneous melting probes and an additional melting probe that should navigate through the ice.
All are connected via powerline. The problem is that some of our messages go missing. In our whole system we have roughly 100 publishers and subscribers.
That doesn't seem like a large number to us.

We are using multicast and have set our writers' and readers' QoS to DDS_RELIABLE_RELIABILITY_QOS and DDS_KEEP_ALL_HISTORY_QOS.
At first we thought that memory could be the problem. That now seems unlikely, because at no time do we use more than ~10% of our RAM.
Additionally, we create our publishers and subscribers ahead of time, so that we don't lose the first few messages.
We looked into Durability and DurabilityService, but they don't look feasible for our situation.

We have two main guesses for our problem: powerline/PowerLAN, or a QoS setting we missed.
Has anybody seen problems when combining DDS and powerline?
Or can anybody suggest QoS settings we should look into further?

Kind regards
Sebastian

Offline
Last seen: 2 years 7 months ago
Joined: 02/20/2014
Posts: 10

I'd probably look first at the QoS settings for the send window and the max blocking time. Depending on your implementation, when the network gets interrupted the send window may fill up, causing the next write() to block. At that point I think the max blocking time can expire, which I think causes the message to be thrown out for the pending receiver. It might also be that your application isn't written to handle the blocking, so it simply doesn't generate the messages you were expecting.