Using Machine Learning to Maintain Pub/Sub System QoS in Dynamic Environments


Dear Sir,

Please look over the following paper and inform me whether we can use RTI to maintain publish/subscribe system QoS in dynamic environments:

https://www.truststc.org/pubs/679/ADAMANT-ARM09.pdf

I would also like to know whether we can use machine learning techniques with RTI to maintain QoS for a publish/subscribe system via autonomic adaptation.

Best regards,

Shadi Abudalfa

 

rip

Hello,

The paper describes how to use machine learning techniques to generate Decision Trees (DTs) and Artificial Neural Networks (ANNs) from static analysis of bandwidth usage, repairs, etc. The DT and ANN machines then decide which settings to use in the QoS of a pub/sub middleware such as DDS. It does not describe how to use the pub/sub middleware itself as the mechanism through which you control it. Middleware is the medium of data distribution; it isn't the message.

You can use DDS to gather the information that you would feed to your decision trees or neural network. You could then publish this data for subscription by one or more engines dedicated to deciding when QoS should change. Keep in mind the observer/disturber paradox (Heisenberg: you can't observe it without affecting it): if you use an internal-to-DDS method (like Monitor) to capture statistics, you are changing the statistics you are capturing. To some extent, the least intrusive method would be to instrument the transport from somewhere else in the stack, which is what the paper does: they used a third-party application to do static analysis of post-run metrics.
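As a rough sketch of that arrangement, here is what "publish the statistics, let a dedicated engine subscribe and decide" could look like. Everything in it is an assumption for illustration: `Bus` is an in-process stand-in for the middleware (it is not a DDS or RTI API), `MetricsSample` and its fields are made up, and the single threshold stands in for whatever a trained DT or ANN would actually compute.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class MetricsSample:
    """Hypothetical statistics type; the field names are illustrative only."""
    topic: str
    loss_rate: float   # fraction of samples needing repair, 0.0 to 1.0
    latency_ms: float  # observed end-to-end latency


class Bus:
    """In-process stand-in for the middleware; NOT a DDS or RTI API."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable) -> None:
        self._subs[channel].append(callback)

    def publish(self, channel: str, sample) -> None:
        # Deliver the sample to every subscriber of this channel.
        for callback in self._subs[channel]:
            callback(sample)


class DecisionEngine:
    """Toy stand-in for a trained DT/ANN: one hand-written threshold."""

    def __init__(self, bus: Bus, max_loss: float = 0.05) -> None:
        self.max_loss = max_loss
        self.flagged: List[str] = []  # topics whose QoS should be revisited
        bus.subscribe("metrics", self.on_metrics)

    def on_metrics(self, sample: MetricsSample) -> None:
        # Flag a topic the first time its loss rate exceeds the threshold.
        if sample.loss_rate > self.max_loss and sample.topic not in self.flagged:
            self.flagged.append(sample.topic)


bus = Bus()
engine = DecisionEngine(bus)
bus.publish("metrics", MetricsSample("video", loss_rate=0.01, latency_ms=12.0))
bus.publish("metrics", MetricsSample("video", loss_rate=0.09, latency_ms=40.0))
print(engine.flagged)  # ['video']
```

The point of keeping the engine behind a subscription is that you can add or swap engines without touching whatever produces the statistics.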

However, altering QoS in a DDS-based system is problematic, because not all QoS settings are mutable at runtime. Also, because some QoS settings are ordered (for example TRANSIENT_LOCAL > VOLATILE) and readers and writers must have compatible settings to communicate, ALL readers and writers must be updated at the same time or data will stop flowing (when the QoS setting can be changed after enable), or they must be destroyed and recreated with the new settings (when the QoS must be set prior to enable, or prior to creation). Certainly you could use DDS here, too: the DT engine could publish, on a Reliable topic, a Type that supplies the new settings that readers should be configured to.
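To make the ordering problem concrete, here is a small model of the requested/offered check for durability: a writer and reader match only if the writer offers at least what the reader requests. This is plain Python, not middleware code. The enum mirrors the DDS durability kinds, but the `compatible` and `broken_pairs` helpers and the entity names are made up for the example, and the sketch ignores how the change is actually applied (in place, or by destroying and recreating the entity).

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Durability(IntEnum):
    """Durability kinds, ordered from weakest to strongest."""
    VOLATILE = 0
    TRANSIENT_LOCAL = 1
    TRANSIENT = 2
    PERSISTENT = 3


def compatible(offered: Durability, requested: Durability) -> bool:
    # The writer must offer at least what the reader requests.
    return offered >= requested


def broken_pairs(writers: Dict[str, Durability],
                 readers: Dict[str, Durability]) -> List[Tuple[str, str]]:
    """Return every (writer, reader) pair that would stop matching."""
    return [(w, r)
            for w, offered in writers.items()
            for r, requested in readers.items()
            if not compatible(offered, requested)]


writers = {"w1": Durability.VOLATILE}
readers = {"r1": Durability.VOLATILE, "r2": Durability.VOLATILE}
before = broken_pairs(writers, readers)

# Partial rollout: one reader is raised before the writer.
readers["r2"] = Durability.TRANSIENT_LOCAL
partial = broken_pairs(writers, readers)

# The writer catches up, and every pair matches again.
writers["w1"] = Durability.TRANSIENT_LOCAL
after = broken_pairs(writers, readers)

print(before, partial, after)  # [] [('w1', 'r2')] []
```

In this model the window of lost data is exactly the time between the two updates. Raising the writer first would avoid it here, since a stronger offer still satisfies the weaker request, but lowering a setting needs the opposite order, so any rollout scheme has to know the direction of the change.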

The paper discusses scale, considering the scale of an external, real-world event that would supply large amounts of data. They also touch on the scale of the QoS settings in DDS (they list 22). What's interesting is that they do not touch on the problems at scale of adjusting the QoS for a large system. If all you are adjusting is a single topic (they use mpg video as their discussion point), it would be easy to use a machine-learning DT to adjust the QoS to handle changes in the macro environment. It would be much more useful if the DT engines actually pub/sub'd to each other: "I'm changing these settings for this topic" doesn't have to go just to the readers and writers on that topic. Those messages should also be subscribed to by the DT engines, so that they can proactively prepare to adjust their own settings, or STOP preparing to adjust their settings, based on the knowledge that the environment is about to change. This is important if you have multiple engines that have learned the same thing, are fed the same input, and then all decide to adjust their settings at the same time. Chaos and hilarity.
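A toy version of engines listening to each other might look like the following. All names here (`ChangeNotice`, `Engine`, `run_round`) are hypothetical, and the sketch makes one large simplifying assumption: each notice reaches every engine before the next engine gets its turn. A real distributed system would have races between engines and would need some tie-breaking rule on top of this.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeNotice:
    """Hypothetical "I'm changing this setting for this topic" message."""
    engine_id: str
    topic: str
    setting: str
    value: str


class Engine:
    def __init__(self, engine_id: str) -> None:
        self.engine_id = engine_id
        self.heard: Dict[str, ChangeNotice] = {}  # topic -> notice from a peer

    def on_notice(self, notice: ChangeNotice) -> None:
        # Remember changes announced by OTHER engines.
        if notice.engine_id != self.engine_id:
            self.heard[notice.topic] = notice

    def propose(self, topic: str, setting: str, value: str) -> Optional[ChangeNotice]:
        # Stand down if a peer has already announced a change for this topic.
        if topic in self.heard:
            return None
        return ChangeNotice(self.engine_id, topic, setting, value)


def run_round(engines: List[Engine], topic: str, setting: str,
              value: str) -> List[ChangeNotice]:
    """Give each engine a turn; deliver each notice to all engines at once."""
    issued = []
    for engine in engines:
        notice = engine.propose(topic, setting, value)
        if notice is not None:
            issued.append(notice)
            for other in engines:
                other.on_notice(notice)
    return issued


# Three engines that learned the same thing and reach the same conclusion.
engines = [Engine("A"), Engine("B"), Engine("C")]
issued = run_round(engines, "video", "reliability", "RELIABLE")
print([n.engine_id for n in issued])  # ['A']
```

Without the `heard` check, all three engines would issue the same change for the same topic in the same round, which is the pile-up described above.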

Still... it's an interesting exercise, and to some extent it's possible to do. But unless the mechanisms are built directly into the middleware, it may be more trouble than it is worth.

Hope this helps,

Rip


Thanks a lot for your comment.

Could you kindly list the most important QoS settings that should be monitored in a real-time system in dynamic environments?

Best regards,

 

rip

See chapter 16. These are the ones you have (passive) access to and so can inspect.

I would suggest you worry about monitoring all of them (i.e., come up with a general case). This allows your research to work against any use case, not just the one stated.

http://community.rti.com/rti-doc/500/ndds.5.0.0/doc/pdf_html/RTI_CoreLibrariesAndUtilities_UsersManual_3.html#page_114