Low throughput on rticonverter

Joined: 02/08/2021
Posts: 21
Hello,

We are using rticonverter. On our data we are seeing a throughput of about 1.5 MBps. Our structures and arrays usually have some nesting in them; they are far from plain, single-level data types.

Is this the expected data rate for XCDR_AUTO? Is there a guide on how to tune performance?

I have been profiling by stack sampling only, and 99.99% of the time the stack traces are in Connext code, so our code is not the bottleneck.

So far I have tried increasing the number of threads in the XML "thread_pool" configuration, but what I'm seeing is that "FileStreamWriter::store" is still called sequentially, just from different threads.

I could try to split the sample batches we receive in "FileStreamWriter::store" into blocks, process the blocks in parallel, and then log them from memory. Before investing time in this, I would like to know whether it is sound (e.g. no threading issues when using the non-const vector containing the DynamicData samples), or whether I would be wasting my time by, for example, hitting a global lock on the database, which would make this a non-optimization that obscures our code for no gain.
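To make the block-splitting idea concrete, here is a rough C++ sketch of it. It is only an illustration under stated assumptions: `Sample` and `convert_sample` are hypothetical stand-ins (not RTI types or APIs), and the sketch says nothing about whether concurrent access to real DynamicData samples is safe or whether Connext takes a lock internally, which is exactly the open question.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for a DynamicData sample; NOT an RTI type.
struct Sample {
    int id;
};

// Hypothetical per-sample conversion; the real code would build HDF5 data here.
std::string convert_sample(const Sample& s) {
    return "row-" + std::to_string(s.id);
}

// Split `samples` into at most `num_blocks` contiguous blocks, convert each
// block in its own task, and return the results in the original sample order.
std::vector<std::string> convert_in_blocks(const std::vector<Sample>& samples,
                                           std::size_t num_blocks) {
    std::vector<std::string> out(samples.size());
    if (samples.empty()) {
        return out;
    }
    num_blocks = std::max<std::size_t>(num_blocks, 1);
    // Ceiling division so that every sample falls into some block.
    const std::size_t block = (samples.size() + num_blocks - 1) / num_blocks;

    std::vector<std::future<void>> tasks;
    for (std::size_t begin = 0; begin < samples.size(); begin += block) {
        const std::size_t end = std::min(begin + block, samples.size());
        tasks.push_back(std::async(std::launch::async,
                                   [&samples, &out, begin, end] {
            // Each task touches only its own index range of `out`,
            // so this sketch needs no locking of its own.
            for (std::size_t i = begin; i < end; ++i) {
                out[i] = convert_sample(samples[i]);
            }
        }));
    }
    for (auto& t : tasks) {
        t.get();  // wait for the block and rethrow any exception it raised
    }
    return out;
}
```

The converted rows come back in input order, so a single thread could then write them out sequentially. Any gain depends entirely on the per-sample work being free of shared locks.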

Joined: 01/15/2013
Posts: 94

Hi,

I'm assuming you're trying to convert from the stored XCDR_AUTO format to JSON? The conversion from CDR to JSON can take time, especially if the data types are large and deeply nested.

What I'm not following is where you are getting FileStreamWriter from. That is the name of one of our example plugins in the Community. Does that mean you're converting from XCDR to just the plain example file? I would like to understand your setup a bit better before I can assess the performance.

Thanks,

Juanlu

 

Joined: 02/08/2021
Posts: 21

This is a custom-written converter that converts from XCDR_AUTO to HDF5.

Joined: 01/15/2013
Posts: 94

I understand. I'm not very familiar with HDF5; I just know the basics of what it is and what it does.

What the 'thread_pool' configuration does is create a pool of workers to process the data for the same session.

Where in the Connext code is the application spending most of the time? I suspect most of it is spent on the deserialisation of the XCDR format into a Dynamic Data sample that can be manipulated.

Thanks,

Juanlu

Joined: 02/08/2021
Posts: 21

Exactly. Getting loans showed up often when stack sampling, which I suppose is just RTI code fetching from the DB. When sampling at random, the debugger always stopped in RTI code, never in ours (writing is the easy part, as we cache a lot in memory without actually writing).

Of RTI's threads, there was always one doing work while most of the others sat idle in clock_nanosleep.

I can't say exactly which functions account for which percentages. I was sampling by manually stopping the debugger on a Release build, as rticonverter is a set of scripts doing a lot of things and it wasn't trivial to find a way to launch a profiler correctly on the executable itself.
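If the top frame is noted down at each manual debugger stop, per-function percentages could be recovered with a small tally along these lines. This is a sketch, not a replacement for a real profiler; `frame_percentages` is a hypothetical helper, and the frame names passed to it would be whatever the debugger reported.

```cpp
#include <map>
#include <string>
#include <vector>

// Given the top-frame name recorded at each manual debugger stop, return the
// share of samples (in percent) attributed to each frame name.
std::map<std::string, double> frame_percentages(
        const std::vector<std::string>& top_frames) {
    std::map<std::string, double> pct;
    if (top_frames.empty()) {
        return pct;
    }
    // First pass: count how many samples landed in each frame.
    for (const auto& frame : top_frames) {
        pct[frame] += 1.0;
    }
    // Second pass: turn the counts into percentages of all samples.
    for (auto& entry : pct) {
        entry.second = 100.0 * entry.second / top_frames.size();
    }
    return pct;
}
```

With only a handful of manual samples the percentages are coarse, but they are usually enough to tell one dominant function from an even spread.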