serialization - How should C++ objects be serialized?


We are doing a project on high-performance computing, using MPI as the parallel computing framework. A few algorithms were already implemented on a legacy platform. What we have done is rewrite the original serial algorithms as parallel versions based on MPI.

I have encountered a performance problem: when running the MPI-based parallel algorithm, there is a lot of communication overhead between the processes. Inter-process communication consists of three steps:

  1. Process A serializes C++ objects into a binary format.
  2. Process A sends the binary data to process B via MPI.
  3. Process B deserializes the binary data back into C++ objects.

We found that these communication steps, especially the serialize/deserialize steps, cost a huge amount of time. How can we handle this performance issue?

By the way, our C++ code uses a lot of STL, which is more complex than C-like structs.

P.S. I am currently doing this (the serialization) with hand-written code that traverses the fields of the objects and copies them sequentially into a byte array.
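
For illustration, the field-by-field approach described above can be sketched roughly as follows. This is a minimal sketch, not the project's real code: `Record`, `put`, `get`, `serialize` and `deserialize` are hypothetical names, and it assumes sender and receiver share the same architecture (endianness and type sizes). The buffer is sized once up front so that no reallocation happens while copying.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdint.h>
#include <string>
#include <vector>

// Hypothetical record type standing in for the real objects.
struct Record {
    std::string name;
    std::vector<double> values;
};

// Copy n bytes from src into buf at offset, then advance offset.
static void put(std::vector<char>& buf, std::size_t& offset,
                const void* src, std::size_t n)
{
    if (n != 0)
        std::memcpy(&buf[offset], src, n);
    offset += n;
}

// Copy n bytes out of buf at offset into dst, then advance offset.
static void get(const std::vector<char>& buf, std::size_t& offset,
                void* dst, std::size_t n)
{
    assert(offset + n <= buf.size());
    if (n != 0)
        std::memcpy(dst, &buf[offset], n);
    offset += n;
}

std::vector<char> serialize(const Record& r)
{
    const uint32_t name_len = static_cast<uint32_t>(r.name.size());
    const uint32_t count = static_cast<uint32_t>(r.values.size());

    // Allocate the whole buffer once instead of growing it piecemeal.
    std::vector<char> buf(sizeof name_len + name_len +
                          sizeof count + count * sizeof(double));

    std::size_t offset = 0;
    put(buf, offset, &name_len, sizeof name_len);
    put(buf, offset, r.name.data(), name_len);
    put(buf, offset, &count, sizeof count);
    put(buf, offset, r.values.data(), count * sizeof(double));
    return buf;
}

Record deserialize(const std::vector<char>& buf)
{
    Record r;
    std::size_t offset = 0;

    uint32_t name_len = 0;
    get(buf, offset, &name_len, sizeof name_len);
    r.name.resize(name_len);
    get(buf, offset, &r.name[0], name_len);

    uint32_t count = 0;
    get(buf, offset, &count, sizeof count);
    r.values.resize(count);
    get(buf, offset, r.values.data(), count * sizeof(double));
    return r;
}
```

The resulting buffer's `data()` pointer and `size()` are what would be handed to the MPI send call as raw bytes in step 2.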

To demonstrate what I am doing, here is a code snippet. Note that this is the construction process for a single feature:

    sic::geometryfeature *ptfeature =
        (geometryfeature *) outlayer->getfeature(ifeature);
    sic::geometry *geom = ptfeature->getgeometry();
    std::string geomclassname = geom->getclassname();

    sic::geometry *ptgeom = geom;
    unsigned char *wkbbuffer = NULL;
    OGRGeometry *gtgeom = NULL;
    if (geomclassname == "point") {
        // wrap the single point in a multipoint and export it as WKB
        ptgeom = new sic::multipoint();
        ((sic::multipoint *) ptgeom)->insert(geom);
        gtgeom = new OGRMultiPoint();
        int wkbsize = ((sic::multipoint *) ptgeom)->wkbsize();
        wkbbuffer = (unsigned char *) malloc(wkbsize);
        ((sic::geometrycollection *) ptgeom)->exporttowkb(sic::wkbndr,
            wkbbuffer, wkbmultipoint);
    } else if (...) {
        ......
    }
    assert(gtgeom);  // check before the pointer is dereferenced
    gtgeom->importFromWkb(wkbbuffer);
    free(wkbbuffer);
    OGRFeature *pofeature = OGRFeature::CreateFeature(
        polayer->GetLayerDefn());
    pofeature->SetGeometry(gtgeom);

And here is more of what I am doing to serialize the objects:

    unsigned char *bytes = (unsigned char *) malloc(size);
    size_t offset = 0;

    // header: geometry type, then feature count
    size_t type_size = sizeof(OGRwkbGeometryType);
    OGRwkbGeometryType type = layer->GetGeomType();
    memcpy(bytes + offset, &type, type_size);
    offset += type_size;

    size_t count_size = sizeof(int);
    int count = layer->GetFeatureCount();
    memcpy(bytes + offset, &count, count_size);
    offset += count_size;

    // body: each feature's geometry as WKB, back to back
    layer->ResetReading();
    for (OGRFeature *feature = layer->GetNextFeature(); feature != NULL;
            feature = layer->GetNextFeature()) {
        OGRGeometry *geometry = feature->GetGeometryRef();
        if (geometry) {
            geometry->exportToWkb(wkbNDR, bytes + offset);
            offset += geometry->WkbSize();
        } else {
            // no geometry: decrement the count already stored in the header
            (*(int *) (bytes + type_size))--;
        }
        OGRFeature::DestroyFeature(feature);
    }

    return bytes;

Any comments are appreciated. Thanks!

(Brian's answer suggests using a library... he's an experienced programmer, so that sounds well worth a go.)

Separately, I looked at your code: there are lots of temporary buffers, new/malloc allocations, uses of sizeof, etc. I thought I'd illustrate a "quick, simple and nice" approach to cleaning it up, just enough to get you started...

First, create a binary stream type that factors out and hides a lot of the low-level work:

    #include <arpa/inet.h>  // htonl/s, ntohl/s
    #include <endian.h>     // htobe64, if you have it...
    #include <stdint.h>

    #include <iostream>
    #include <string>
    #include <map>

    // support routines - use C++ overloading to polymorphically dispatch to htonl/s

    // uint64_t hton(uint64_t n) { return htobe64(n); }
    uint32_t hton(uint32_t n) { return htonl(n); }
    uint16_t hton(uint16_t n) { return htons(n); }

    // there are no "int" versions - this is ugly but effective...
    uint32_t hton(int32_t n) { return htonl(n); }
    uint16_t hton(int16_t n) { return htons(n); }

    // uint64_t ntoh(uint64_t n) { return be64toh(n); }
    uint32_t ntoh(uint32_t n) { return ntohl(n); }
    uint16_t ntoh(uint16_t n) { return ntohs(n); }

    template <typename Ostream>
    class binary_ostream : public Ostream
    {
      public:
        typedef binary_ostream This;

        This& write(const char* s, std::streamsize n)
        {
            Ostream::write(s, n);
            return *this;
        }

        template <typename T>
        This& rawwrite(const T& t)
        {
            // debug aid: prefix each value with its size as text, e.g. "[4]"
            static_cast<Ostream&>(*this) << '[' << sizeof t << ']';
            return write((const char*)&t, sizeof t);
        }

        template <typename T>
        This& hton(T h)
        {
            T n = ::hton(h);
            return rawwrite(n);
        }

        // conversions for inbuilt & standard-library types...

        // single bytes go through rawwrite; "bs << x" here would call itself forever
        friend This& operator<<(This& bs, bool x) { return bs.rawwrite(x ? 't' : 'f'); }
        friend This& operator<<(This& bs, int8_t x) { return bs.rawwrite(x); }
        friend This& operator<<(This& bs, uint8_t x) { return bs.rawwrite(x); }
        friend This& operator<<(This& bs, int16_t x) { return bs.hton(x); }
        friend This& operator<<(This& bs, uint16_t x) { return bs.hton(x); }
        friend This& operator<<(This& bs, int32_t x) { return bs.hton(x); }
        friend This& operator<<(This& bs, uint32_t x) { return bs.hton(x); }

        friend This& operator<<(This& bs, double d) { return bs.rawwrite(d); }

        friend This& operator<<(This& bs, const std::string& x)
        {
            // explicit 32-bit length: a 64-bit size_t matches no overload above
            bs << static_cast<uint32_t>(x.size());
            return bs.write(x.data(), x.size());
        }

        template <typename K, typename V, typename A>
        friend This& operator<<(This& bs, const std::map<K, V, A>& m)
        {
            typedef typename std::map<K, V, A>::const_iterator It;

            bs << static_cast<uint32_t>(m.size());

            for (It it = m.begin(); it != m.end(); ++it)
                bs << it->first << it->second;

            return bs;
        }

        // add any others you want...
    };

Creating your own user-defined binary-serialisable type...

    // your own objects...
    struct object
    {
        object(const std::string& s, double x) : s_(s), x_(x) { }

        std::string s_;
        double x_;

        // specify how you want binary serialisation performed (which fields/order etc)
        template <typename T>
        friend binary_ostream<T>& operator<<(binary_ostream<T>& os, const object& o)
        {
            return os << o.s_ << o.x_;
        }
    };

Example usage:

    #include <cctype>
    #include <iomanip>
    #include <sstream>

    // support routines to observe/debug serialisation...

    std::string printable(char c)
    {
        std::ostringstream oss;
        if (isprint((unsigned char)c))
            oss << c;
        else
            oss << "\\x" << std::hex << std::setw(2) << std::setfill('0')
                << (int)(uint8_t)c << std::dec;
        return oss.str();
    }

    std::string printable(const std::string& s)
    {
        std::string result;
        for (std::string::const_iterator i = s.begin(); i != s.end(); ++i)
            result += printable(*i);
        return result;
    }

    int main()
    {
        {
            binary_ostream<std::ostringstream> bs;

            object o("pi", 3.14);

            bs << o;

            std::cout << "serialised '" << printable(bs.str()) << "'\n";
        }

        {
            binary_ostream<std::ostringstream> bs;

            std::map<int, std::string> m;
            m[0] = "zero";
            m[1] = "one";
            m[2] = "two";
            bs << m;

            std::cout << "serialised '" << printable(bs.str()) << "'\n";
        }
    }

The next step is to create a binary_istream; it's very, very similar to the above. (Boost reduces the work a little by using the '&' operator instead of the traditional << and >>, so that the same function can specify the fields for both serialisation and deserialisation.)
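
As a rough idea of what the reading side might look like, here is a minimal sketch of a binary_istream covering only uint32_t, double and std::string. It assumes the writer emits a 32-bit network-order length before each string and raw host-order doubles, with no "[size]" debug markers in the byte stream. `rawread`, `encode` and `round_trip_example` are names made up for this sketch.

```cpp
#include <arpa/inet.h>  // htonl, ntohl
#include <cassert>
#include <sstream>
#include <stdint.h>
#include <string>

template <typename Istream>
class binary_istream : public Istream
{
  public:
    typedef binary_istream This;

    explicit binary_istream(const std::string& bytes) : Istream(bytes) { }

    // Read sizeof(T) raw bytes into t (host representation, no byte swapping).
    template <typename T>
    This& rawread(T& t)
    {
        Istream::read(reinterpret_cast<char*>(&t), sizeof t);
        return *this;
    }

    // 32-bit integers arrive in network byte order.
    friend This& operator>>(This& bs, uint32_t& x)
    {
        uint32_t n = 0;
        bs.rawread(n);
        x = ntohl(n);
        return bs;
    }

    friend This& operator>>(This& bs, double& d) { return bs.rawread(d); }

    // Strings arrive as a 32-bit length followed by that many characters.
    friend This& operator>>(This& bs, std::string& s)
    {
        uint32_t size = 0;
        bs >> size;
        s.resize(size);
        if (size != 0)
            static_cast<Istream&>(bs).read(&s[0], size);
        return bs;
    }
};

// Hypothetical helper: builds the byte layout the reader above expects.
std::string encode(const std::string& s, double x)
{
    std::string bytes;
    const uint32_t n = htonl(static_cast<uint32_t>(s.size()));
    bytes.append(reinterpret_cast<const char*>(&n), sizeof n);
    bytes.append(s);
    bytes.append(reinterpret_cast<const char*>(&x), sizeof x);
    return bytes;
}

// Round trip: encode a string and a double, then read them back.
bool round_trip_example()
{
    binary_istream<std::istringstream> bs(encode("pi", 3.14));
    std::string s;
    double x = 0;
    bs >> s >> x;
    assert(!bs.fail());
    return s == "pi" && x == 3.14;
}
```

The fields are read in exactly the order they were written, which is the symmetry a single combined serialise/deserialise function would make explicit.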

Implementation notes/thoughts:

  • If you prefer, you can remove the template parameter from binary_ostream and have the constructor store an arbitrary std::ostream& in a private member variable, then send the streaming operations to that data member.
    • This has the advantages of minimising code bloat from instantiations for different stream types, allowing the implementation to be hidden in a translation unit and linked in later (which helps keep compilation times down in a large project), and letting you attach a binary_ostream to an existing stream at any time (great if someone's passing you a pre-existing stream).
    • The "disadvantage" is that you have to explicitly forward any other ostream member functions that you want accessible to binary_ostream users (more control, but tedious), or provide a (less convenient/elegant?) std::ostream& stream() { return s_; }-style accessor.
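
A minimal sketch of that non-template variant might look like the following. It covers only uint32_t and std::string, uses the `stream()` accessor option rather than forwarding members, and `example_bytes` is a made-up helper for demonstration.

```cpp
#include <arpa/inet.h>  // htonl
#include <cassert>
#include <ostream>
#include <sstream>
#include <stdint.h>
#include <string>

class binary_ostream
{
  public:
    // Attach to any existing std::ostream; the caller keeps ownership of it.
    explicit binary_ostream(std::ostream& s) : s_(s) { }

    binary_ostream& write(const char* p, std::streamsize n)
    {
        s_.write(p, n);
        return *this;
    }

    // Accessor for callers who need the underlying stream's other members.
    std::ostream& stream() { return s_; }

  private:
    std::ostream& s_;
};

// 32-bit integers are written in network byte order.
inline binary_ostream& operator<<(binary_ostream& bs, uint32_t x)
{
    const uint32_t n = htonl(x);
    return bs.write(reinterpret_cast<const char*>(&n), sizeof n);
}

// Strings are written as a 32-bit length followed by the characters.
inline binary_ostream& operator<<(binary_ostream& bs, const std::string& x)
{
    bs << static_cast<uint32_t>(x.size());
    return bs.write(x.data(), static_cast<std::streamsize>(x.size()));
}

// Hypothetical demonstration: serialise one string into a pre-existing stream.
std::string example_bytes()
{
    std::ostringstream oss;   // the pre-existing stream
    binary_ostream bs(oss);   // attached after the fact
    bs << std::string("pi");
    assert(bs.stream().good());
    return oss.str();
}
```

Because the class is no longer a template, the operator definitions could move into a single translation unit, which is the compile-time benefit mentioned in the notes.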
