
C++ Compile Time CBOR/BSON Coder/Decoder Generator

18 Jul 2018 · Ms-PL
C++ CBOR/BSON coder/decoder

bobl

This is an attempt to serialize/deserialize C++ types in a well-defined binary format, with an interface as simple as this:

C++
namespace protocol = bobl::bson;
// auto value = Type{...};
std::vector<std::uint8_t> data = protocol::encode(value);
auto begin = cbegin(data);
Type value = protocol::decode<Type>(begin, cend(data));

For now, the library supports basic bson and cbor encoding/decoding.

Requirements

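The library is header-only, so no separately compiled binaries or special treatment are required. It does, however, depend on several Boost libraries, including those that appear in the examples below (Boost.Fusion, Boost.Variant, Boost.Optional, Boost.Uuid and Boost.MPL).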

These are also header-only libraries, so just make sure that the compiler can find them.

A C++11-compatible compiler is required (clang 3.6+, gcc 4.8.5+, msvc-14.1+).

How It Works

Let's say that there is an encoded bson object:

C++
//{'enabled': True, 'id': 100, 'name': 'the name', 'theEnum': 2}
//(55) : b'7\x00\x00\x00\x08enabled\x00\x01\x10id\x00d\x00\x00\x00\x02name\x00\t\x00\x00\x00the name\x00\x10theEnum\x00\x02\x00\x00\x00\x00'

It can be decoded into std::tuple like this:

C++
std::tuple<bool, int, std::string, int> res = bobl::bson::decode
          <std::tuple<bool, int, std::string, int>>(begin, end);

or, more briefly, with std::tuple omitted from the decode call:

C++
std::tuple<bool, int, std::string, int> res = bobl::bson::decode
          <bool, int, std::string, int>(begin, end);

A complete example:

C++
#include <cstdint>
#include <string>
#include <tuple>
#include "bobl/bson/decode.hpp"

std::uint8_t data[] = {
    0x37, 0x0,  0x0,  0x0,  0x8,  0x65, 0x6e, 0x61, 0x62, 0x6c, 0x65,
    0x64, 0x0,  0x1,  0x10, 0x69, 0x64, 0x0,  0x64, 0x0,  0x0,  0x0,
    0x2,  0x6e, 0x61, 0x6d, 0x65, 0x0,  0x9,  0x0,  0x0,  0x0,  0x74,
    0x68, 0x65, 0x20, 0x6e, 0x61, 0x6d, 0x65, 0x0,  0x10, 0x74, 0x68,
    0x65, 0x45, 0x6e, 0x75, 0x6d, 0x0,  0x2,  0x0,  0x0,  0x0,  0x0 };

std::uint8_t const* begin = data;
std::uint8_t const* end = begin + sizeof(data) / sizeof(data[0]);
// TheEnumClass is the enum class declared in the next example
auto res = bobl::bson::decode<bool, int, std::string, TheEnumClass>(begin, end);
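
To make the byte array above less opaque, here is a minimal sketch that walks a BSON document by hand and lists the name and type byte of each top-level element. It is not how bobl decodes anything; `read_le32` and `list_bson_elements` are hypothetical helpers written for this illustration, and only the element types that occur in the example (plus int64) are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: read a little-endian 32-bit integer starting at p.
inline std::uint32_t read_le32(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: list (name, type byte) of each top-level element.
inline std::vector<std::pair<std::string, std::uint8_t>>
list_bson_elements(const std::uint8_t* begin, const std::uint8_t* end)
{
    const std::size_t available = static_cast<std::size_t>(end - begin);
    if (available < 5) throw std::runtime_error("document too short");
    const std::size_t total = read_le32(begin);   // total size, including itself
    if (total < 5 || total > available) throw std::runtime_error("bad document length");

    const std::uint8_t* last = begin + total - 1; // trailing 0x00 terminator
    const std::uint8_t* p = begin + 4;            // first element
    std::vector<std::pair<std::string, std::uint8_t>> out;
    while (p < last) {
        const std::uint8_t type = *p++;
        std::string name;                         // element name is a C string
        while (p < last && *p != 0x00) name.push_back(static_cast<char>(*p++));
        if (p == last) throw std::runtime_error("unterminated element name");
        ++p;                                      // skip the NUL that ends the name

        std::size_t size = 0;                     // size of the value in bytes
        switch (type) {
        case 0x08: size = 1; break;               // boolean
        case 0x10: size = 4; break;               // int32
        case 0x12: size = 8; break;               // int64
        case 0x02:                                // string: int32 length, bytes, NUL
            if (last - p < 4) throw std::runtime_error("truncated string");
            size = 4 + static_cast<std::size_t>(read_le32(p));
            break;
        default:
            throw std::runtime_error("element type not handled in this sketch");
        }
        if (static_cast<std::size_t>(last - p) < size) throw std::runtime_error("truncated value");
        p += size;
        out.emplace_back(std::move(name), type);
    }
    return out;
}
```

Called as `list_bson_elements(begin, end)` on the array above, it reports four elements: enabled (0x08), id (0x10), name (0x02) and theEnum (0x10).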

Using tuples for complex types might not be a good idea, and that is where Boost.Fusion can be very useful. It allows a structure to be adapted as a heterogeneous container, so the object above could be decoded like this:

C++
enum class TheEnumClass { None, One, Two, Three };

struct Simple
{
  bool enabled;
  int id;
  std::string name;
  TheEnumClass theEnum;
};

BOOST_FUSION_ADAPT_STRUCT(
  Simple,
  enabled,
  id,
  name,
  theEnum)

Simple simple = bobl::bson::decode<Simple>(begin, end);

and encoded just as easily:

C++
std::vector<std::uint8_t> blob = bobl::bson::encode(simple);

std::tuple can also be encoded, but BSON requires names for object members. For adapted structures, the member name becomes the object member name. Tuples need some other naming scheme, and there are a few options:

  1. The position of a tuple element can be used as its name:
    C++
    auto value = std::make_tuple(true, 100, std::string{ "the name" }, TheEnumClass::Two);
    auto data = bobl::bson::encode<bobl::options::UsePositionAsName>(value);
    

    The resulting object will look like this (pseudo-json representation):

    C++
    {'_0': True, '_1': 100, '_2': 'the name', '_3': 2}
    
  2. Another way to name tuple elements is to specialize the MemberName class:
    C++
    #include "bobl/names.hpp"
    
    namespace bobl {
      template<typename Type, typename MemberType, std::size_t Position, typename Options> class MemberName;
    }

    like this (SimpleTuple stands for the tuple type whose elements are being named):

    C++
    namespace bobl{
    
        template<typename MemberType, typename Options> class MemberName 
                                   <SimpleTuple, MemberType, 0, Options>
        {
        public:
            constexpr char const* operator()() const { return "enabled"; }
        };
    
        template<typename Options> class MemberName <SimpleTuple, int, 1, Options>
        {
        public:
            constexpr char const* operator()() const { return "id"; }
        };
    
        template<std::size_t Position, typename Options> class MemberName 
                                <SimpleTuple, std::string, Position , Options>
        {
        public:
            constexpr char const* operator()() const { return "name"; }
        };
    
        template<typename MemberType, typename Options> class MemberName 
                                 <SimpleTuple, MemberType, 3, Options>
        {
        public:
            constexpr char const* operator()() const { return "enm"; }
        };
    
    }//namespace bobl

    With these specializations in place, encode works on the tuple the same way as it does on Boost.Fusion adapted structures:

    C++
    auto tuple = std::make_tuple(true, 100, std::string{ "the name" }, Enum::Two);
    auto data = bobl::bson::encode(tuple);
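
The two naming options can be pictured with a small standalone sketch: a primary template that falls back to the element position, plus a specialization that overrides one position. Everything in namespace `sketch` (`Name`, `Collect`, `member_names`, `Pair`) is hypothetical and deliberately simpler than bobl's MemberName, which also takes the member type and the options.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

namespace sketch {

// Primary template: fall back to the element position ("_0", "_1", ...).
template<typename Tuple, std::size_t Position>
struct Name
{
    std::string operator()() const { return "_" + std::to_string(Position); }
};

// Collect the names for positions [I, N) by compile-time recursion.
template<typename Tuple, std::size_t I, std::size_t N>
struct Collect
{
    static void apply(std::vector<std::string>& out)
    {
        out.push_back(Name<Tuple, I>{}());
        Collect<Tuple, I + 1, N>::apply(out);
    }
};

// End of recursion: nothing left to collect.
template<typename Tuple, std::size_t N>
struct Collect<Tuple, N, N>
{
    static void apply(std::vector<std::string>&) {}
};

// Names of all elements of Tuple, in order.
template<typename Tuple>
std::vector<std::string> member_names()
{
    std::vector<std::string> out;
    Collect<Tuple, 0, std::tuple_size<Tuple>::value>::apply(out);
    return out;
}

using Pair = std::tuple<bool, int>;

// Specialization: position 1 of Pair is called "id".
template<>
struct Name<Pair, 1>
{
    std::string operator()() const { return "id"; }
};

} // namespace sketch
```

Here `sketch::member_names<sketch::Pair>()` yields "_0" and "id": the first element keeps its positional name, the second picks up the specialization.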
    

    The library can also handle more complex types, for example:

    C++
    struct Extended
    {
        int id;
        Simple simple;
        std::vector<int> ints;
        std::vector<Simple> simples;
        boost::variant<int,Simple, std::string, std::vector<Simple>> var;
        boost::uuids::uuid uuid;
        boost::optional<Enum> enm;
        std::vector<std::uint8_t> binary; // this will be encoded as binary object
        std::chrono::system_clock::time_point tp;
    };
    
    BOOST_FUSION_ADAPT_STRUCT(
        Extended,
        id,
        simple,
        ints,
        simples,
        var,
        uuid,
        enm,
        binary,
        tp)
    
    auto extended = Extended{};
    std::vector<std::uint8_t> encoded = bobl::bson::encode(extended);

Supported Types

Options

Options control how encoding/decoding is performed. They are defined in options.hpp:

C++
struct RelaxedIntegers {};
struct RelaxedFloats {};
struct ExacMatch {};
struct StructAsDictionary {};
struct UsePositionAsName {};
template<typename T> struct ByteType {};
using IntegerOptimizeSize = RelaxedIntegers;
template<typename T> struct HeterogeneousArray {};
template<typename T> using NonUniformArray = HeterogeneousArray<T>;
struct OptionalAsNull{};

These options can be passed explicitly as template parameters of the encode/decode functions and/or set per type by specializing the bobl::EffectiveOptions structure:

C++
template<typename T, typename ...Options>
struct EffectiveOptions
{
    using type = bobl::Options<Options...>;
};

For example, to encode a tuple as an object that uses each element's position as the member name, the following specialization of bobl::EffectiveOptions can be used:

C++
namespace bobl{
    template<typename ...Types, typename ...Options>
    struct EffectiveOptions<std::tuple<Types...>, Options...>
    {   
        using type = bobl::Options<bobl::options::UsePositionAsName, Options...>;
    };
} //namespace bobl

If such options should apply to one protocol only (BSON or CBOR), the bobl::<protocol name> namespace can be used. The following sets bobl::options::UsePositionAsName for tuples used with the cbor encode/decode functions:

C++
namespace bobl{
  namespace cbor{
    template<typename ...Types, typename ...Options>
    struct EffectiveOptions<std::tuple<Types...>, Options...>
    {   
        using type = bobl::Options<bobl::options::UsePositionAsName, Options...>;
    };
  } //namespace cbor
} //namespace bobl
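
For readers unfamiliar with tag-type options, the following standalone sketch shows one way such a mechanism can work: options are empty structs collected in a type list, and a small trait checks at compile time whether a given tag is present. The names in namespace `sketch` mirror the article, but `has_option` is a hypothetical helper and the whole block is an assumption about the general technique, not bobl's implementation.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

namespace sketch {

// Option tags: empty types that only carry meaning through their identity.
struct RelaxedIntegers {};
struct UsePositionAsName {};

// A plain type list that carries option tags.
template<typename... Opts> struct Options {};

// has_option<Tag, Options<...>>::value is true when Tag is in the list.
template<typename Tag, typename List> struct has_option;

template<typename Tag>
struct has_option<Tag, Options<>> : std::false_type {};

template<typename Tag, typename Head, typename... Tail>
struct has_option<Tag, Options<Head, Tail...>>
    : std::conditional<std::is_same<Tag, Head>::value,
                       std::true_type,
                       has_option<Tag, Options<Tail...>>>::type {};

// Per-type hook with the same shape as the EffectiveOptions shown above.
template<typename T, typename... Opts>
struct EffectiveOptions
{
    using type = Options<Opts...>;
};

// Tuples always get UsePositionAsName in addition to the explicit options.
template<typename... Types, typename... Opts>
struct EffectiveOptions<std::tuple<Types...>, Opts...>
{
    using type = Options<UsePositionAsName, Opts...>;
};

} // namespace sketch
```

An encoder written in this style would query `has_option<UsePositionAsName, typename EffectiveOptions<T, Opts...>::type>::value` and branch on the result at compile time.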

Integers

Any integer type for which std::is_integral<T>::value is true is supported. By default, the encoded type is chosen from the C++ type, not from the magnitude of the value, so a std::uint64_t holding the value 1 is encoded as a BSON int64 ("\x12"). This makes encoding/decoding a bit faster. The bobl::options::IntegerOptimizeSize option (when encoding) and the bobl::options::RelaxedIntegers option (when decoding) change this behavior.

C++
//{'int': 1}
//(14) : b'\x0e\x00\x00\x00\x10int\x00\x01\x00\x00\x00\x00'
//                          ^^^ int32
std::uint8_t data[] = { 0xe, 0x0, 0x0, 0x0, 0x10, 0x69, 0x6e, 0x74, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0 };
uint8_t const *begin = data;
uint8_t const* end = begin + sizeof(data) / sizeof(data[0]);
std::tuple<std::uint64_t> res = bobl::bson::decode<std::uint64_t,
          bobl::Options<bobl::options::RelaxedIntegers>>(begin, end);
assert(std::get<0>(res) == 1);
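
The difference between the two strategies can be sketched without the library. The helpers below (`type_by_cpp_type`, `type_by_value`) are hypothetical; they only illustrate choosing the BSON element type from the C++ type versus from the value. How bobl treats unsigned 32-bit values is not covered here and is glossed over in this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// BSON element type bytes for 32-bit and 64-bit integers.
constexpr std::uint8_t bson_int32 = 0x10;
constexpr std::uint8_t bson_int64 = 0x12;

// Default-style strategy: decide from the C++ type alone.
template<typename T>
constexpr std::uint8_t type_by_cpp_type()
{
    static_assert(std::is_integral<T>::value, "integer type expected");
    return sizeof(T) <= 4 ? bson_int32 : bson_int64;
}

// Size-optimizing strategy: decide from the value being written.
inline std::uint8_t type_by_value(std::int64_t value)
{
    const bool fits = value >= std::numeric_limits<std::int32_t>::min()
                   && value <= std::numeric_limits<std::int32_t>::max();
    return fits ? bson_int32 : bson_int64;
}
```

With these, a std::uint64_t maps to int64 by type, while the value 1 maps to int32 by value, which is the situation the example above decodes with RelaxedIntegers.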

Floating Point

Any floating point type for which std::is_floating_point<T>::value is true.

Enums and enum class

Enums and enum classes are encoded/decoded as their underlying integer type.
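
In plain C++ terms this amounts to a cast through std::underlying_type. A minimal sketch, with hypothetical helper names `to_wire` and `from_wire`:

```cpp
#include <cassert>
#include <type_traits>

enum class TheEnumClass { None, One, Two, Three };

// Convert an enum value to the integer that would be written.
template<typename E>
constexpr typename std::underlying_type<E>::type to_wire(E value) noexcept
{
    static_assert(std::is_enum<E>::value, "enum type expected");
    return static_cast<typename std::underlying_type<E>::type>(value);
}

// Convert a decoded integer back to the enum type.
template<typename E>
constexpr E from_wire(typename std::underlying_type<E>::type value) noexcept
{
    static_assert(std::is_enum<E>::value, "enum type expected");
    return static_cast<E>(value);
}
```

This is why TheEnumClass::Two shows up as the integer 2 in the encoded object earlier in the article.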

std::string

std::string is encoded/decoded as a raw UTF-8 string.
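
As an illustration of the BSON side, a string value is laid out as a little-endian int32 length (counting the trailing NUL), the bytes, and the NUL itself. The sketch below builds that layout by hand; `bson_string_value` is a hypothetical helper, not part of bobl.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Build the value part of a BSON string element (type byte and name excluded).
inline std::vector<std::uint8_t> bson_string_value(const std::string& s)
{
    // The length prefix counts the string bytes plus the trailing NUL.
    const std::uint32_t n = static_cast<std::uint32_t>(s.size() + 1);
    std::vector<std::uint8_t> out;
    for (int shift = 0; shift < 32; shift += 8)          // little-endian int32
        out.push_back(static_cast<std::uint8_t>(n >> shift));
    for (char c : s)                                     // raw bytes, no escaping
        out.push_back(static_cast<std::uint8_t>(c));
    out.push_back(0x00);                                 // terminator
    return out;
}
```

For "the name" this gives the `\t\x00\x00\x00the name\x00` sequence visible in the encoded object at the start of the article.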

std::vector

std::vector can be used with any supported type and is encoded/decoded as a bson/cbor array. The exception is std::vector<std::uint8_t>, which is encoded/decoded as a byte string (CBOR major type 2) or binary data (BSON "\x05"). bobl::options::ByteType extends this treatment to std::vector of any other type T for which sizeof(T) equals sizeof(std::uint8_t).

C++
std::vector<char> binary =  {100, 110, 120};
bobl::cbor::encode<bobl::Options<bobl::options::ByteType<char>>>(...)
// ...
std::vector<char> decoded = bobl::cbor::decode<std::vector<char>,
                    bobl::Options<bobl::options::ByteType<char>>>(begin, end);
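
For reference, a CBOR byte string is a header carrying major type 2 and the length, followed by the raw bytes. The following sketch encodes a vector of byte-sized elements that way; `cbor_byte_string` is a hypothetical helper that follows the CBOR definite-length rules and says nothing about bobl's internals.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode a vector of byte-sized elements as a definite-length CBOR byte string.
template<typename Byte>
std::vector<std::uint8_t> cbor_byte_string(const std::vector<Byte>& in)
{
    static_assert(sizeof(Byte) == sizeof(std::uint8_t), "byte-sized element type expected");
    const unsigned major_type = 2u << 5;          // major type 2 in the top 3 bits (0x40)
    const std::uint64_t n = in.size();
    std::vector<std::uint8_t> out;
    if (n < 24) {
        // Short lengths are stored directly in the low 5 bits of the header.
        out.push_back(static_cast<std::uint8_t>(major_type | n));
    } else {
        // Additional info 24..27 selects a 1, 2, 4 or 8 byte big-endian length.
        const int info = n <= 0xFF ? 24 : n <= 0xFFFF ? 25 : n <= 0xFFFFFFFF ? 26 : 27;
        out.push_back(static_cast<std::uint8_t>(major_type | static_cast<unsigned>(info)));
        const int bytes = 1 << (info - 24);
        for (int shift = 8 * (bytes - 1); shift >= 0; shift -= 8)
            out.push_back(static_cast<std::uint8_t>(n >> shift));
    }
    for (Byte b : in)
        out.push_back(static_cast<std::uint8_t>(b));
    return out;
}
```

For the three-element `binary` vector above, the result is the header byte 0x43 followed by 100, 110 and 120.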

boost::optional

boost::optional can be used with any supported type. By default, encode skips an empty boost::optional to reduce the size of the encoded object. This can lead to incorrect decoding when the object is decoded into a std::tuple in which two or more adjacent types map to the same protocol-specific type and at least one of them is optional. Here is an example:

C++
enum class Type { One, Two, Three };

struct Data
{
    boost::optional<Type> type; //will be encoded as int
    int id;
};

BOOST_FUSION_ADAPT_STRUCT(Data, type, id)

auto data = Data { {}, 123};
std::vector<std::uint8_t> encoded = bobl::bson::encode(data);
auto begin = encoded.data();
auto end = begin + encoded.size();

This will work as expected:

C++
auto decoded =  bobl::bson::decode<Data>(begin, end);

This, on the other hand, will throw an exception:

C++
bobl::bson::decode<boost::optional<Type>, int>(begin, end);

A Data structure with an empty optional member type is encoded as {"id": 123} (pseudo-json representation). Decoding into an unnamed tuple therefore fails: the integer 123 is decoded as the optional enum class value, and nothing is left for the required id.

If decoding into an unnamed tuple is required, bobl::options::OptionalAsNull can be used when encoding:

C++
auto data  = Data { {}, 123};
auto encoded =  bobl::bson::encode<bobl::options::OptionalAsNull>(data);

It produces {"type": null, "id": 123} (pseudo-json representation), which can be decoded into an unnamed tuple just fine.

C++
auto res = bobl::bson::decode<boost::optional<Type>, int>(begin, end); // ok
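
The failure mode can be reproduced with a toy model of positional decoding. The sketch below is not bobl's algorithm; `Wire`, `Slot` and `matches` are hypothetical names, and the matcher is simply greedy: each slot takes the next value if the wire type fits, an optional slot may also consume an explicit null or stay empty, and a required slot without a matching value fails.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Wire { Int, String, Null };   // protocol-level value kinds

struct Slot
{
    Wire type;       // wire type the tuple element expects
    bool optional;   // true for an optional element
};

// Greedy positional matching: true if every required slot gets a value.
inline bool matches(const std::vector<Wire>& wire, const std::vector<Slot>& slots)
{
    std::size_t pos = 0;
    for (const Slot& slot : slots) {
        const bool has_value = pos < wire.size();
        if (has_value && wire[pos] == slot.type) {
            ++pos;                       // value consumed by this slot
            continue;
        }
        if (slot.optional) {
            if (has_value && wire[pos] == Wire::Null)
                ++pos;                   // explicit null consumed
            continue;                    // optional slot left empty
        }
        return false;                    // required slot has no matching value
    }
    return true;
}
```

In this model a lone Int on the wire does not satisfy the slots {optional Int, required Int}, because the optional slot takes the value first, while Null followed by Int does, which mirrors the effect of OptionalAsNull.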

boost::variant

boost::variant can be used with any supported types except boost::optional, which would not make much sense inside a variant. When decoding, the alternatives are tried in order of declaration, so if two alternatives are encoded as the same protocol type (for example, integers and enums), the value is decoded as the first one declared.
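
The first-match rule can be sketched at compile time without Boost. In the model below, `wire_of` maps a C++ type to a simplified wire type and `FirstMatch` returns the index of the first alternative that fits; all of these names are hypothetical and the mapping is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

enum class Wire { Int, String };   // simplified protocol-level kinds
enum class Color { Red, Green };   // example enum, encoded like an integer

// Map a C++ type to the wire type it would be encoded as.
template<typename T, typename Enable = void>
struct wire_of;

template<typename T>
struct wire_of<T, typename std::enable_if<std::is_integral<T>::value
                                          || std::is_enum<T>::value>::type>
{
    static constexpr Wire value() { return Wire::Int; }
};

template<>
struct wire_of<std::string>
{
    static constexpr Wire value() { return Wire::String; }
};

// Index of the first alternative whose wire type equals w, or -1 if none.
template<typename... Ts> struct FirstMatch;

template<>
struct FirstMatch<>
{
    static int index(Wire, int) { return -1; }
};

template<typename Head, typename... Tail>
struct FirstMatch<Head, Tail...>
{
    static int index(Wire w, int i = 0)
    {
        return wire_of<Head>::value() == w ? i : FirstMatch<Tail...>::index(w, i + 1);
    }
};
```

With alternatives declared as int, Color, std::string, an integer on the wire always selects index 0, so the Color alternative is never chosen; swapping the first two makes Color win instead.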

Boost.Fusion Adapted Structures

Boost.Fusion adapted structures are encoded/decoded as bson/cbor objects, also known as tables, dictionaries, hashes or maps of name-value pairs. By default, structure members are encoded/decoded in declaration order. Objects encoded in a different order can still be decoded if the resulting C++ object is default constructible and the bobl::options::StructAsDictionary option is specified.

Decoding also ignores extra object members: only those at the end of the object when bobl::options::StructAsDictionary is not used, and any extra members when it is. This allows protocols to be extended, up to a point, without breaking existing implementations. To suppress this behavior, use bobl::options::ExacMatch, which makes decode throw a bobl::InvalidObject exception if any extra object members are found.
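
These acceptance rules can be summarized with a toy model that looks only at member names. It is a simplification made for illustration (no optional members, no duplicate names), and `accept_in_order` and `accept_as_dictionary` are hypothetical helpers rather than anything bobl exposes.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using Names = std::vector<std::string>;

// Declaration-order mode: the expected names must lead the encoded object.
// Extra members are tolerated only at the end, and not at all when exact.
inline bool accept_in_order(const Names& wire, const Names& expected, bool exact)
{
    if (wire.size() < expected.size()) return false;
    if (exact && wire.size() != expected.size()) return false;
    return std::equal(expected.begin(), expected.end(), wire.begin());
}

// Dictionary mode: every expected name must be present, in any order.
// Extra members are tolerated anywhere unless exact is requested.
inline bool accept_as_dictionary(const Names& wire, const Names& expected, bool exact)
{
    for (const std::string& name : expected)
        if (std::find(wire.begin(), wire.end(), name) == wire.end())
            return false;
    return !exact || wire.size() == expected.size();
}
```

For a structure with members id and name, an encoded object {id, name, extra} passes in both modes unless an exact match is requested, while {name, id} passes only in dictionary mode.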

Adapting Types

A type can be encoded/decoded as another type by specializing bobl::Adapter.

C++
namespace bobl{ 
    template<typename T, typename Enabled = boost::mpl::true_>
    class Adapter {
        using type = typename std::underlying_type<T>::type;
        T operator()(type x) const;
        type operator()(T const& x) const;

    };

} /*namespace bobl*/

or `bobl::bson::Adapter`/`bobl::cbor::Adapter` if it should be protocol specific, for example:

C++
class X
{
public:
    explicit X(int persistent) : persistent_{ persistent } {}
    int persistent() const { return persistent_;}
private:
    int persistent_;
    int notso_ = 0;
};

namespace bobl { namespace bson {
    template<>
    class Adapter<X, boost::mpl::true_>
    {
    public:
        using type = int;
        X operator()(int x) const { return X{ x }; }
        int operator()(X const& x) const { return x.persistent(); }
    };
 } /*namespace bson*/ } /*namespace bobl*/

With this specialization, class X is encoded as an integer.
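
The adapter idea itself can be tried out without the library. In the sketch below, `sketch::Adapter` is a stand-in for the customization point and `round_trip` is a hypothetical helper that converts a value to its adapted type and back, the two steps an encoder and a decoder would perform. The class X is a trimmed copy of the one above.

```cpp
#include <cassert>

namespace sketch {

// Stand-in for the Adapter customization point; not the library's own class.
template<typename T> class Adapter;

class X
{
public:
    explicit X(int persistent) : persistent_{ persistent } {}
    int persistent() const { return persistent_; }
private:
    int persistent_;
};

// X travels as a plain int.
template<>
class Adapter<X>
{
public:
    using type = int;
    X operator()(int x) const { return X{ x }; }
    int operator()(X const& x) const { return x.persistent(); }
};

// Convert to the adapted type and back, as encode followed by decode would.
template<typename T>
T round_trip(T const& value)
{
    Adapter<T> adapter;
    typename Adapter<T>::type wire = adapter(value);  // what an encoder would write
    return adapter(wire);                             // what a decoder would rebuild
}

} // namespace sketch
```

Only the state exposed through the adapted type survives the round trip, which is the point of adapting a class with non-persistent members.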

License

This article, along with any associated source code and files, is licensed under The Microsoft Public License (Ms-PL)

