Subscriber intermittently missing messages

Offline
Last seen: 7 years 8 months ago
Joined: 11/24/2015
Posts: 4
Hi,

We have been experiencing some odd intermittent message loss on some of our applications.

We have several Odroid XU4s running 4-5 applications, and 2-3 desktop applications. On average we have about 20 domain participants on the network, each with roughly 3-4 data writers or readers, so the network topology is relatively small.

One of the desktop applications produces data at a steady rate of about 100 Hz. Some of the applications on the Odroids consume this data, but they have been experiencing intermittent message loss. During these 'blackouts', network connectivity to the Odroids seems fine (2-15 ms pings) and rtiddsspy correctly sees the messages being produced by the desktop application. Restarting the affected application doesn't always resolve the issue, but we found that restarting all applications on the Odroid resolves it for a while.

We're not sure whether the root of this issue lies in our DDS configuration or whether it's a network/hardware issue.

 

Any insight on this issue is appreciated.

Offline
Last seen: 3 months 6 days ago
Joined: 02/11/2016
Posts: 144

Hey,

To verify:

You are receiving all of the data when using rtiddsspy on the Odroids, but your application is missing messages?

If that is the case my best guess would be:

a. The QoS you are using with the application is stricter regarding resource limits

b. Your application is using RTI listeners and blocking the thread (leading to queues filling up, leading to messages being lost)
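
To make theory (b) concrete, here is a minimal sketch in plain Python. It does not use the RTI API. It assumes a single thread that runs the callback, a bounded queue in front of it, and that new samples are dropped when the queue is full. The queue depth (8) and callback durations are made-up values; only the 100 Hz rate (one sample every 10 ms) comes from this thread. In a real system the queue behavior depends on the QoS in use.

```python
from collections import deque

def simulate(duration_ms, period_ms, callback_ms, depth):
    """Count samples processed vs. dropped for one callback thread."""
    queue = deque()          # arrival times of samples waiting for the callback
    busy_until = 0           # time at which the callback in progress returns
    processed = dropped = 0
    for now in range(0, duration_ms, period_ms):
        # Whenever the thread is free, it takes the oldest queued sample.
        while queue and busy_until <= now:
            start = max(busy_until, queue.popleft())
            busy_until = start + callback_ms
            processed += 1
        # A new sample arrives; it is dropped if the queue is already full.
        if len(queue) < depth:
            queue.append(now)
        else:
            dropped += 1
    return processed, dropped

# Hypothetical numbers: 1 s of data at 100 Hz, queue depth of 8.
print("fast callback (1 ms): ", simulate(1000, 10, 1, 8))
print("slow callback (25 ms):", simulate(1000, 10, 25, 8))
```

With a callback that returns faster than samples arrive, nothing is dropped. Once the callback takes longer than the 10 ms between samples, the queue fills and samples start to be lost, even though the network itself is healthy.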

 

Try writing a simple application that just reads messages and writes them to a log (or performs some other very lightweight action that lets you verify whether messages are being lost).
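
One lightweight way to check such a log for loss is sketched below. It assumes the publisher stamps every sample with an incrementing counter, which is not something this thread says the data type has, so treat the sequence numbers and the find_gaps helper as hypothetical. It also assumes samples arrive in order.

```python
def find_gaps(sequence_numbers):
    """Return (first_missing, last_missing) ranges between consecutive samples."""
    gaps = []
    previous = None
    for seq in sequence_numbers:
        # A jump of more than one means the samples in between never arrived.
        if previous is not None and seq > previous + 1:
            gaps.append((previous + 1, seq - 1))
        previous = seq
    return gaps

# Hypothetical sequence numbers as read back from the subscriber's log.
received = [1, 2, 3, 7, 8, 12]
for first, last in find_gaps(received):
    print(f"missing samples {first}..{last}")
```

Running the same check on the log of the test application and on the log of the real application would show whether both lose the same samples.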

If you're still experiencing problems with your application, I would guess it's a problem with the QoS (you can try running rtiddsspy with your QoS to verify).

 

Good luck,

Roy.

Offline
Last seen: 7 years 8 months ago
Joined: 11/24/2015
Posts: 4

Hi Roy,

To clarify:

  • I am not running rtiddsspy on the Odroid, but on a third x64 machine on our network (neither the producer nor the consumer of the data). I'll try installing and running rtiddsspy on the Odroid to see if it changes my observations.
  • The issue can be seen in more than one application we run onboard the Odroid, with varying QoS profiles. The two QoS profiles we use most often are the builtin Best Effort and Strict Reliable, and I've seen these symptoms in applications using either.

To add more diagnostic information:

  • When a new application starts onboard the Odroid while this behavior is happening, an offboard Admin Console shows the application's domain participant but none of its subscriber/publisher information.

Thanks,

Ben

Offline
Last seen: 3 months 6 days ago
Joined: 02/11/2016
Posts: 144

Hey Ben,

Based on your clarifications, I would lean more towards the blocked listener theory.

To verify this, I suggest running rtiddsspy onboard and seeing if it misses messages or not.

Regarding the added information:

Not being able to see the subscriber/publisher information can have MANY causes, and it's hard to tell from your description what causes it or how it's related to the other problem you're encountering.

Good luck,

Roy.

Offline
Last seen: 7 years 8 months ago
Joined: 11/24/2015
Posts: 4

Hi Roy,

I got rtiddsspy running onboard. It also seems to be experiencing missed messages when our application is missing them, but I need to do more testing to verify this.

I've also noticed that the data loss seems to happen more frequently when several DDS applications are running on the Odroid.

Thanks,

Ben