Friday, May 17, 2013

My take on serialization (Part II: serialize)

After discussed the get_size(...) functor, which given an object returns its serialized size in bytes, we can go on and write the serialize function.

We can follow the same pattern of the get_size, but this time we have to store the content to a stream.

As before, we have a specialization of the serialize_helper template for tuples, vectors, strings and POD datatypes. In order to be performance efficient we presize the vector in lines 66-67 (because we know how many bytes we need to store the object) thanks to the get_size predicate.

Let's write some unit tests (because they are very important):

So it's time for running some benchmarks. YAY! Let's compare this serializer with boost and check whether spending time re implementing this was worth some. Since I don't have much time, we only run 1 experiment... if you are interested you can run more :) 

So, as long as size of the serialized stream is concerned, boost uses 63 bytes while ours only 26 bytes (more than half). Performance-wise boost::serialization needs 4.410 secs to run the test, our serialization solution 0.975 seconds, more than 4 times faster. Of course we are not saying that boost::serialization sucks... as a matter of fact it solves a different problem since it also store the typing information which allows everyone (even a Java program) to reload the stream. Our serialization strategy (as I explained in the first post) is based on the fact that the type is known at receiver side (therefore we can avoid to store typing).

Let's show some graphs, comparison of our serialization strategy relative to boost::serialization: