4. Protocol Buffers to DDS-XTYPES Mapping

Every .proto file passed to the idl4 plugin will be converted into an .idl file containing DDS-XTYPES data types equivalent to the Protocol Buffers ones passed as input.

The generated IDL4 files will contain type definitions for all Protocol Buffers messages, enumerations, and other constructs which can be mapped to IDL4 constructs.

The definitions will be guarded by a C preprocessor guard, to ensure that the generated file can be included by multiple files without redefinition errors.

Table 4.1 *Protocol Buffers* to *IDL4* Mapping for *Protocol Buffers* files to *IDL4* files
Protocol Buffers	DDS-XTYPES/IDL4
`message.proto`	`message.idl`
message MyMessage { }	#ifndef message_proto_IDL4_ #define message_proto_IDL4_ @mutable struct MyMessage { }; #endif // message_proto_IDL4_
`myapp/message.proto`	`myapp/message.idl`
package myapp; message MyMessage { }	#ifndef myapp_message_proto_IDL4_ #define myapp_message_proto_IDL4_ module myapp { @mutable struct MyMessage { }; }; // module myapp #endif // myapp_message_proto_IDL4_

4.1. User Types

Protocol Buffers supports two kinds of user-defined types:

Messages
Enumerations

4.1.1. Messages

Protocol Buffers messages are mapped to IDL4 structs with @mutable extensibility.

The use of the @mutable extensibility allows modifications to the types without breaking communication, as well as declaration of @optional members. This makes the XCDR serialization of these types behave similarly to their Protocol Buffers serialization.

The extensibility of the mapped IDL4 types can be changed with option .omg.dds.type.extensibility.

IDL4 does not support nested type declarations. Protocol Buffers messages declared inside other messages are therefore mapped to top-level IDL4 types, using a special naming convention and dedicated annotations.

The name of the IDL4 type associated with a nested message will encode the names of all containing types, concatenated by underscores. This naming convention matches the one followed by Protocol Buffers in several programming languages (e.g. C++) when deriving the names of types associated with nested messages.

Nested types are marked as @nested, to prevent them from being used as top-level types in DDS topics. This choice is made to better match the visibility of nested types in Protocol Buffers, which is limited to the .proto file where they are declared.

Nested types are also marked with @containing_type, to further encode their relationship to the containing type.

Table 4.2 *Protocol Buffers* to *IDL4* Mapping for Messages
Protocol Buffers	DDS-XTYPES/IDL4
message MyMessage { }	@mutable struct MyMessage { };
message MyMessage { message NestedMessage { } }	@nested @containing_type("MyMessage") @mutable struct MyMessage_NestedMessage { };
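The naming rule for nested types can be sketched in a few lines of Python. This is only an illustration of the convention described above, not the plugin's implementation: the function name `nested_idl_name` is hypothetical, and the multi-level example assumes the same rule applies at every nesting depth.

```python
def nested_idl_name(containing_types, name):
    """Build the IDL4 name of a nested type.

    containing_types lists the enclosing message names, outermost first.
    """
    # Names of all containing types, then the type's own name, joined by "_".
    return "_".join([*containing_types, name])


print(nested_idl_name(["MyMessage"], "NestedMessage"))  # MyMessage_NestedMessage
print(nested_idl_name(["Outer", "Middle"], "Inner"))    # Outer_Middle_Inner
```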

4.1.2. Enumerations

Protocol Buffers enumerations are mapped to IDL4 enum types.

The value of each enumeration literal is propagated to IDL4 using the @value annotation.

The default literal, which Protocol Buffers typically defines as the first value, is always marked in IDL4 with @default_literal, for greater robustness.

The literals are exposed as top-level symbols without any prefix, unless the enumeration is nested inside a message.

Nested enumerations follow the same naming conventions as nested messages, with the only difference being that enumeration literals are also prefixed with the nested type’s name.

Similar to the handling of messages, this naming choice matches how Protocol Buffers “mangles” the names of nested enumerations when translating them to programming languages such as C++.

Table 4.3 Protocol Buffers to IDL4 Mapping for Enumerations

Protocol Buffers

DDS-XTYPES/IDL4

enum MyEnum {
    HELLO = 0;
    WORLD = 1;
}

enum MyEnum {
    @value(0)
    @default_literal
    HELLO,
    @value(1)
    WORLD
};

message MyMessage {
    enum NestedEnum {
        HELLO = 0;
        WORLD = 1;
    }
}

@containing_type("MyMessage")
enum MyMessage_NestedEnum {
    @value(0)
    @default_literal
    MyMessage_NestedEnum_HELLO,
    @value(1)
    MyMessage_NestedEnum_WORLD
};
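The enumeration rules above can be illustrated with a small Python sketch that renders the IDL4 text shown in Table 4.3. The helper `render_enum` is hypothetical, and it assumes that the first literal is the default one, which is the typical case mentioned above.

```python
def render_enum(name, literals, containing_type=None):
    """Render an IDL4 enum from a list of (literal_name, value) pairs."""
    lines = []
    prefix = ""
    if containing_type is not None:
        # Nested enums take the containing type's name as a prefix,
        # and their literals are prefixed with the resulting enum name.
        lines.append(f'@containing_type("{containing_type}")')
        name = f"{containing_type}_{name}"
        prefix = f"{name}_"
    lines.append(f"enum {name} {{")
    for index, (literal, value) in enumerate(literals):
        lines.append(f"    @value({value})")
        if index == 0:
            # Assumption: the first literal is the default literal.
            lines.append("    @default_literal")
        separator = "," if index < len(literals) - 1 else ""
        lines.append(f"    {prefix}{literal}{separator}")
    lines.append("};")
    return "\n".join(lines)


print(render_enum("MyEnum", [("HELLO", 0), ("WORLD", 1)]))
print(render_enum("NestedEnum", [("HELLO", 0), ("WORLD", 1)],
                  containing_type="MyMessage"))
```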

4.2. Primitive Types

All Protocol Buffers primitive (or “scalar”) types are mapped to their IDL4 counterparts, which results in very similar representations in both Protocol Buffers and the many other “language bindings” supported by Connext.

The only notable deviation is the Protocol Buffers type bytes, which is mapped to a sequence of octets in IDL4 (i.e. sequence<octet>) and is therefore not a primitive on the IDL4 side. bytes is typically represented as a string by Protocol Buffers, which may cause some confusion when accessing these values from different language bindings. For example, in C++, the bytes type will be mapped to:

std::string, when using the Protocol Buffers/C++ API.
std::vector<uint8_t>, when using the standard DDS C++ API.

Table 4.4 *Protocol Buffers* to *IDL4* Mapping for Primitive Types
Protocol Buffers	DDS-XTYPES/IDL4
`double`	`double`
`float`	`float`
`int32`	`int32`
`int64`	`int64`
`uint32`	`uint32`
`uint64`	`uint64`
`sint32`	`int32`
`sint64`	`int64`
`fixed32`	`uint32`
`fixed64`	`uint64`
`sfixed32`	`int32`
`sfixed64`	`int64`
`bool`	`boolean`
`string`	`string`
`bytes`	`sequence<octet>`
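Table 4.4 can be expressed as a simple lookup. The following Python sketch copies the table verbatim; the names `PROTO_TO_IDL4` and `idl4_scalar_type` are hypothetical and exist only for illustration.

```python
# Scalar type mapping, copied from Table 4.4.
PROTO_TO_IDL4 = {
    "double": "double",
    "float": "float",
    "int32": "int32",
    "int64": "int64",
    "uint32": "uint32",
    "uint64": "uint64",
    "sint32": "int32",
    "sint64": "int64",
    "fixed32": "uint32",
    "fixed64": "uint64",
    "sfixed32": "int32",
    "sfixed64": "int64",
    "bool": "boolean",
    "string": "string",
    "bytes": "sequence<octet>",
}


def idl4_scalar_type(proto_type):
    """Return the IDL4 type for a Protocol Buffers scalar type name."""
    return PROTO_TO_IDL4[proto_type]


print(idl4_scalar_type("sfixed64"))  # int64
print(idl4_scalar_type("bytes"))     # sequence<octet>
```

Note that several Protocol Buffers types collapse onto the same IDL4 type (for example `int32`, `sint32`, and `sfixed32` all become `int32`), so the mapping cannot be inverted from the IDL4 type alone.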

4.3. Collections

Protocol Buffers offers two ways to represent collections of data:

Repeated Fields
Map Fields

4.3.1. Repeated Fields

Protocol Buffers repeated fields are mapped to unbounded IDL4 sequences.

The corresponding IDL4 member may be marked as @optional by using option .omg.dds.member.optional.

When a message includes a repeated field of type bytes, the generated IDL4 will include an alias for sequence<octet> scoped within the message type. The alias will be used in the definition of the associated struct member, because IDL4 does not allow defining sequences of “anonymous” sequences (i.e. sequence<sequence<T>> is not valid IDL4).

Table 4.5 Protocol Buffers to IDL4 Mapping for Repeated Fields

Protocol Buffers

DDS-XTYPES/IDL4

repeated TYPE my_field ... ;

sequence< TYPE > my_field;

message MyMessage {
    repeated bytes my_field = 1;
}

typedef sequence<octet> MyMessage_OctetSeq;

struct MyMessage {
    @id(1)
    sequence<MyMessage_OctetSeq> my_field;
};
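The special handling of repeated bytes fields can be sketched as follows. This Python snippet is a hypothetical illustration of the rule, not the plugin's code; it assumes the alias is always named `<message>_OctetSeq`, as in the example of Table 4.5, and that `field_type` has already been translated to its IDL4 name.

```python
def map_repeated_field(message, field_type, field_name):
    """Return (typedefs, member_declaration) for a repeated field."""
    if field_type == "bytes":
        # A sequence of anonymous sequences is not valid, so an alias
        # for sequence<octet> is declared and used as the element type.
        alias = f"{message}_OctetSeq"
        typedefs = [f"typedef sequence<octet> {alias};"]
        return typedefs, f"sequence<{alias}> {field_name};"
    return [], f"sequence<{field_type}> {field_name};"


print(map_repeated_field("MyMessage", "bytes", "my_field"))
print(map_repeated_field("MyMessage", "int32", "my_field"))
```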

4.3.2. Map Fields

Protocol Buffers map fields are mapped to IDL4 sequences of key and value pairs represented using an automatically-generated “MapPair” struct type.

A “MapPair” type will be defined for each pair of key and value types used by map fields in a Protocol Buffers message, following the naming scheme <message>_MapPair_<key-type>_<value-type>. The same type will be reused by all map fields in a message which share the same key and value types.

The “MapPair” struct types are marked with @map_pair to indicate their use, and with @containing_type to trace their relationship to the containing message type. The @nested annotation is also used to prevent their use as top-level types for DDS topics.

The members associated with a map field will be annotated with @map to indicate that they should be mapped to map constructs in those language bindings that support them.

Table 4.6 Protocol Buffers to IDL4 Mapping for Map Fields

Protocol Buffers

DDS-XTYPES/IDL4

message MyMessage {
    map<K, V> my_field = 1;
}

@nested
@final
@map_pair
@containing_type("MyMessage")
struct MyMessage_MapPair_K_V {
    K key;
    V value;
};


struct MyMessage {
  @id(1)
  @map
  sequence< MyMessage_MapPair_K_V >
  my_field;
};
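The reuse of “MapPair” types across map fields can be illustrated with a short Python sketch. The function `map_pair_types` and the field names are hypothetical. The sketch takes the key and value type names as given (`K`, `V`, `W`), since how the plugin spells concrete types inside the generated name is not specified here.

```python
def map_pair_types(message, map_fields):
    """Map each map field to the name of its "MapPair" struct.

    map_fields is a dict of field_name -> (key_type, value_type).
    """
    return {
        field: f"{message}_MapPair_{key}_{value}"
        for field, (key, value) in map_fields.items()
    }


fields = {
    "by_id": ("K", "V"),
    "by_alias": ("K", "V"),  # same key/value types: reuses the same struct
    "counts": ("K", "W"),
}
names = map_pair_types("MyMessage", fields)
print(sorted(set(names.values())))
```

Three map fields yield only two distinct struct names, because two of the fields share the same key and value types.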

Warning

The mapping for map fields will be updated to the native IDL4 map type once support for this construct has been introduced by a future release of Connext.

The @map and @map_pair annotations are a temporary solution to allow the Protocol Buffers/C++ language binding to support map fields. The annotations will be ignored by all other language bindings, where maps must be accessed as linear collections, rather than as associative containers.
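To give an idea of what accessing a map as a linear collection means in practice, the following Python sketch models a map field as a plain list of key/value pairs. The `MapPair` class and the helper functions are hypothetical stand-ins and do not correspond to any Connext API. The handling of duplicate keys (a later pair overrides an earlier one) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MapPair:
    """Stand-in for a generated "MapPair" struct with string keys."""
    key: str
    value: int


def find_value(pairs, key, default=None):
    """Look up a key with a linear scan over the sequence of pairs."""
    for pair in pairs:
        if pair.key == key:
            return pair.value
    return default


def to_dict(pairs):
    """Copy the pairs into an associative container for repeated lookups."""
    # Assumption: if a key appears more than once, the last pair wins.
    return {pair.key: pair.value for pair in pairs}


my_field = [MapPair("a", 1), MapPair("b", 2)]
print(find_value(my_field, "b"))  # 2
print(to_dict(my_field))          # {'a': 1, 'b': 2}
```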

4.4. Packages

Protocol Buffers packages are mapped to IDL4 modules.

The name of the Protocol Buffers package is split into segments by the dot (.) character, and each segment is mapped to a nested IDL4 module.

Qualified Protocol Buffers package names can be converted to IDL4 module names by replacing dots (.) with double colons (::).

The name of the Protocol Buffers package will also be used as part of the C preprocessor guard used to wrap all definitions in the generated IDL4 file.

Table 4.7 Protocol Buffers to IDL4 Mapping for Packages

Protocol Buffers

DDS-XTYPES/IDL4

package my.messages.package;

// Type definitions...

#ifndef my_messages_package_proto_IDL4_
#define my_messages_package_proto_IDL4_

module my {
module messages {
module package {

    // Type definitions...

}; // module package
}; // module messages
}; // module my

#endif // my_messages_package_proto_IDL4_
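The package rules above reduce to two string operations, sketched here in Python. The helper names are hypothetical, and the preprocessor guard is deliberately left out of the sketch.

```python
def qualified_idl_name(proto_name):
    """Convert a qualified Protocol Buffers name to its IDL4 form."""
    return proto_name.replace(".", "::")


def wrap_in_modules(package, body):
    """Wrap body in one nested IDL4 module per package segment."""
    segments = package.split(".")
    opening = [f"module {segment} {{" for segment in segments]
    # Modules are closed in reverse order, innermost first.
    closing = [f"}}; // module {segment}" for segment in reversed(segments)]
    return "\n".join(opening + [body] + closing)


print(qualified_idl_name("my.messages.package.MyMessage"))
print(wrap_in_modules("my.messages.package", "    // Type definitions..."))
```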

4.5. Imported Files

Each import statement in a Protocol Buffers file will be converted to an #include statement in the generated IDL4 file, using the same path, but with the .idl extension instead of .proto.

All imported .proto files must be independently converted to .idl for the generated #include statements to become valid.

Any “built-in” Protocol Buffers file imported by a .proto file (e.g. google/protobuf/timestamp.proto) will also need to be explicitly converted to IDL4.

Typically, Protocol Buffers applications do not need to generate artifacts from these files, because the associated code is already part of the Protocol Buffers binary distribution.

Protocol Buffers Extension does not include a pre-generated IDL4 version of the “built-in” Protocol Buffers files in order to allow users to precisely match the interfaces included in their preferred version of Protocol Buffers.

Table 4.8 *Protocol Buffers* to *IDL4* Mapping for Imported Files
Protocol Buffers	DDS-XTYPES/IDL4
import "another.proto";	#include "another.idl"
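The translation of import paths can be sketched as follows. This is a hypothetical Python helper that only illustrates the extension swap described above; it is not part of the plugin.

```python
import posixpath


def import_to_include(import_path):
    """Turn a .proto import path into the matching IDL4 #include line."""
    root, extension = posixpath.splitext(import_path)
    if extension != ".proto":
        raise ValueError(f"not a .proto file: {import_path}")
    # Same path, with the .idl extension instead of .proto.
    return f'#include "{root}.idl"'


print(import_to_include("another.proto"))
print(import_to_include("google/protobuf/timestamp.proto"))
```

The second call shows why “built-in” files must also be converted: the generated `#include` points at `google/protobuf/timestamp.idl`, which only exists if that file was generated too.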

4.6. Field Presence

To quote the Protocol Buffers documentation about Field Presence:

Field presence is the notion of whether a protobuf field has a value.

There are two different manifestations of presence for protobufs:

implicit presence, where the generated message API stores field values (only).

explicit presence, where the API also stores whether or not a field has been set.

Based on whether a field has “implicit” or “explicit” presence, the Protocol Buffers compiler will generate a different API for the message type:

Fields with explicit presence have an associated “has” method, which can be used to check whether the field has been set.
Fields with implicit presence do not have an associated “has” method, and their value can only be checked against the default value for their type (or the default value assigned via options).

Protocol Buffers Extension relies on this classification to determine whether the field should be mapped to an @optional member in IDL4, or not.

Every field with a “has” method (i.e. with explicit presence), which is not marked as “required”, will be mapped to an @optional member in IDL4.

Refer to the Protocol Buffers documentation about Field Presence for a detailed summary of which fields have explicit presence, which changes depending on the syntax/edition used.

It is possible to override the default mapping of the @optional annotation using option .omg.dds.member.optional.
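The decision rule can be summarized with a small Python sketch. The function and its parameters are hypothetical; `override` stands for an explicit value of option .omg.dds.member.optional, and `None` means the option was not specified.

```python
def is_optional_member(explicit_presence, required=False, override=None):
    """Decide whether a field maps to an @optional IDL4 member."""
    if override is not None:
        # An explicit option value takes precedence over the default mapping.
        return override
    # Default: fields with a "has" method that are not required.
    return explicit_presence and not required


print(is_optional_member(explicit_presence=True))                   # True
print(is_optional_member(explicit_presence=True, required=True))    # False
print(is_optional_member(explicit_presence=False))                  # False
print(is_optional_member(explicit_presence=False, override=True))   # True
```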

4.6.1. Required Fields

The proto2 syntax allows the declaration of required fields. This is also possible when using the edition = "2023" syntax, by using the features.field_presence = LEGACY_REQUIRED option.

A field marked as “required” must be present in every serialization of the message. As such, Protocol Buffers Extension maps “required” fields to members of IDL4 structs without the @optional annotation.

Table 4.9 *Protocol Buffers* to *IDL4* Mapping for Required Fields
File Syntax	Protocol Buffers	DDS-XTYPES/IDL4
`proto2`	required TYPE my_field ... ;	TYPE my_field;
`proto3`	N/A	N/A
`edition = "2023"`	TYPE my_field ... [ features.field_presence = LEGACY_REQUIRED ];	TYPE my_field;

4.6.2. Optional Fields

All Protocol Buffers syntaxes allow the declaration of “optional” fields, which are singular values that may or may not appear in the network serialization of a message.

proto2 and proto3 provide the optional keyword to denote such fields, while edition = "2023" automatically marks every singular field as “optional”. With edition = "2023", fields can be further controlled using the features.field_presence = EXPLICIT option.

Any field with “optional” semantics will be mapped to an IDL4 member with the @optional annotation.

File Syntax

Protocol Buffers

DDS-XTYPES/IDL4

proto2

optional TYPE my_field ... ;

@optional
TYPE my_field;

proto3

optional TYPE my_field ... ;

MESSAGE_TYPE my_msg_field;

@optional
TYPE my_field;

@optional
MESSAGE_TYPE my_msg_field;

edition = "2023"

TYPE my_field ... ;

TYPE my_field ... [
  features.field_presence = EXPLICIT
];

@optional
TYPE my_field;

@optional
TYPE my_field;

4.6.3. Implicit Fields

proto3 introduced the concept of implicit presence: no “has” method (also known as a “hazzer”) is generated for a singular field of a non-message type that is not explicitly marked as optional.

The same implicit presence can be achieved in edition = "2023" by using the features.field_presence = IMPLICIT option.

Besides not being marked as @optional, IDL4 members derived from fields with implicit presence will be annotated with @field_presence(implicit) to encode this peculiar API choice in the IDL4 data model.

Warning

The use of implicit presence is discouraged by Protocol Buffers.

File Syntax

Protocol Buffers

DDS-XTYPES/IDL4

proto2

N/A

proto3

NON_MESSAGE_TYPE my_field ... ;

@field_presence(implicit)
NON_MESSAGE_TYPE my_field;

edition = "2023"

NON_MESSAGE_TYPE my_field ... [
  features.field_presence = IMPLICIT
];

@field_presence(implicit)
NON_MESSAGE_TYPE my_field;
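For proto3 files, the tables in this section and the previous one can be condensed into the following Python sketch. It is a hypothetical illustration that only covers singular fields declared outside of a oneof, and it ignores any override set via option .omg.dds.member.optional.

```python
def proto3_member_annotations(is_message_type, has_optional_keyword):
    """Presence annotations for a singular proto3 field outside a oneof."""
    if is_message_type or has_optional_keyword:
        # Explicit presence: the member is mapped as @optional.
        return ["@optional"]
    # Implicit presence: not optional, but tagged to record the API choice.
    return ["@field_presence(implicit)"]


print(proto3_member_annotations(is_message_type=False, has_optional_keyword=False))
print(proto3_member_annotations(is_message_type=False, has_optional_keyword=True))
print(proto3_member_annotations(is_message_type=True, has_optional_keyword=False))
```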

4.6.4. Field Presence Examples

Table 4.12 Example use of option `.omg.dds.member.optional`
Example Element	Protocol Buffers	DDS-XTYPES/IDL4
Explicitly mark any field as `@optional`	TYPE my_field ... [ .omg.dds.member.optional = true ];	@optional TYPE my_field;
Prevent default mapping to `@optional`	optional TYPE my_field ... [ .omg.dds.member.optional = false ];	TYPE my_field;
Mark a `repeated` field as `@optional`	repeated TYPE my_field ... [ .omg.dds.member.optional = true ];	@optional sequence< TYPE > my_field;
Mark a `map` field as `@optional`	map<K, V> my_field ... [ .omg.dds.member.optional = true ];	@map @optional sequence< MyMessage_MapPair_K_V > my_field;

4.7. OneOf Fields

Protocol Buffers provides the oneof construct to group multiple optional fields together, allowing only one of them to be set at a time.

Every field that is part of a oneof will be treated as an “optional” field of the message. The associated IDL4 members will be annotated with @optional, and with @oneof to encode the name of the oneof group.

The @oneof annotation is most useful when the types are used with the Protocol Buffers/C++ language binding. Applications using the types with other language bindings must enforce the “oneof” semantics by manually setting/resetting fields in the same group.

Table 4.13 Protocol Buffers to IDL4 Mapping for OneOf Fields

Protocol Buffers

DDS-XTYPES/IDL4

message MyMessage {

  // other fields...

  oneof oneof_field {
    A oneof_a ...;
    B oneof_b ...;
    C oneof_c ...;
  }

}

@mutable
struct MyMessage {

  // other fields...

  @optional
  @oneof("oneof_field")
  A oneof_a;

  @optional
  @oneof("oneof_field")
  B oneof_b;

  @optional
  @oneof("oneof_field")
  C oneof_c;
};
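Manual enforcement of the “oneof” semantics can look like the following Python sketch. It models a data sample as a plain dictionary in which an unset optional member is `None`. This is an assumption made for illustration and does not reflect the API of any specific language binding. The `ONEOF_GROUPS` table and `set_oneof_member` helper are hypothetical.

```python
# Members of each oneof group, taken from the @oneof annotations in Table 4.13.
ONEOF_GROUPS = {
    "oneof_field": ("oneof_a", "oneof_b", "oneof_c"),
}


def set_oneof_member(sample, group, member, value):
    """Set one member of a oneof group and reset all the others."""
    members = ONEOF_GROUPS[group]
    if member not in members:
        raise KeyError(f"{member} is not part of oneof {group}")
    for name in members:
        sample[name] = None  # reset every member of the group
    sample[member] = value   # then set the selected one


sample = {"oneof_a": None, "oneof_b": None, "oneof_c": None}
set_oneof_member(sample, "oneof_field", "oneof_a", 1)
set_oneof_member(sample, "oneof_field", "oneof_b", "x")
print(sample)  # {'oneof_a': None, 'oneof_b': 'x', 'oneof_c': None}
```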

4.8. Field Groups

Field groups are a Protocol Buffers feature that allows grouping of fields within a message without the need to declare another dedicated message.

In IDL4, each field group is mapped to a nested struct associated with the declaring message type.

Warning

Field groups are deprecated by Protocol Buffers, and they are only available when using the older proto2 syntax.

File Syntax

Protocol Buffers

DDS-XTYPES/IDL4

proto2

message MyMessage {
    required group my_req_group = 1 {  }

    optional group my_opt_group = 2 {  }

    repeated group my_rep_group = 3 {  }
}

@nested
@containing_type("MyMessage")
@mutable
struct MyMessage_MyReqGroup {  };

@nested
@containing_type("MyMessage")
@mutable
struct MyMessage_MyOptGroup {  };

@nested
@containing_type("MyMessage")
@mutable
struct MyMessage_MyRepGroup {  };

struct MyMessage {
    @id(1)
    MyMessage_MyReqGroup my_req_group;

    @id(2)
    @optional
    MyMessage_MyOptGroup my_opt_group;

    @id(3)
    sequence<MyMessage_MyRepGroup> my_rep_group;
};

proto3

N/A

edition = "2023"

N/A

4.9. Unsupported Features

Message extensions.
Self-referencing messages.