On my current project, I work with Protobuf not only for gRPC but also as a message format for RabbitMQ. While the advantages of Protobuf are not limited to its speed, I was wondering whether it is really that fast compared to JSON libraries, especially in the Ruby world. I decided to run some benchmarks to check, but first, a brief introduction to each format.

## What is Protobuf?

Protobuf is a fast, compact, cross-platform message system designed with forward and backward compatibility in mind. It consists of a definition language and language-specific compilers. It works great for small, object-like data: it has strong backward and forward compatibility, it is fast (we are not sure about that yet), and it is more compact than JSON, for example. It also has some limitations: it does not support direct comparison (you need to deserialize objects to compare them), it is not compressed (and format-specific encodings such as JPEG can do better for their kind of data), and it is not self-describing. See the official docs for more details.

## What is JSON?

JSON is an abbreviation for JavaScript Object Notation. It is a text-based data format that originated in JavaScript but later spread widely as a communication format, not only between JS apps and backends but also between microservices, and it has plenty of other uses.

It uses strings as keys, and its available value types are string, number, boolean, object, array, and null. Its main advantages are that it is human-readable and pretty easy to serialize and parse in any programming language. See the official site for more details.

## Benchmarks

I picked three popular Ruby JSON libraries: Oj, Yajl, and the standard JSON library. For Protobuf, I use the standard Google protoc compiler with the Google ruby gem. I will measure several payloads, each built from one specific field type, to see which data types show the biggest difference, as well as a complex payload with a mix of field types. All the code is available at https://github.com/alexstaro/proto-vs-json.

## Benchmark setup

The hardware is a laptop with an AMD Ryzen 3 PRO 5450U and 16 GB of DDR4 RAM. The operating system is Ubuntu 22.10 Kinetic. Ruby 3.2.1 was installed via asdf.

For benchmarking, I use the benchmark-ips gem (https://github.com/evanphx/benchmark-ips). The setup looks like this:

```ruby
Benchmark.ips do |x|
  x.config(time: 20, warmup: 5)

  x.report('Yajl encoding') do
    Yajl::Encoder.encode(data)
  end

  # ...

  x.compare!
end
```

## Integers only

We start with integers only. Numbers are pretty hard for JSON, so we expect Protobuf to be far ahead of the other competitors.

The test data:

```ruby
data = {
  field1: 2312345434234,
  field2: 31415926,
  field3: 43161592,
  field4: 23141596,
  field5: 61415923,
  field6: 323423434343443,
  field7: 53141926,
  field8: 13145926,
  field9: 323423434343443,
  field10: 43161592
}
```

Benchmark results:

```
protobuf encoding:        4146929.7 i/s
Oj encoding:              1885092.0 i/s - 2.20x slower
standard JSON encoding:    505697.5 i/s - 8.20x slower
Yajl encoding:             496121.7 i/s - 8.36x slower
```

There is no doubt that Protobuf is the absolute winner here, but what if we make the test closer to a real-world scenario? In practice, we almost always create proto messages only for serialization. What happens if we move the message initialization into the benchmarked block? A sketch of that variant is shown below.
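Concretely, the "protobuf with model init" report builds the message inside the measured block, while the plain "protobuf encoding" report serializes a message that was built once up front. Here is a minimal sketch of the difference, assuming the protoc-generated message class is called `IntegersTest` (the actual class name in the repo may differ):

```ruby
require 'benchmark/ips'

# Assumption: IntegersTest is the protoc-generated message class
# for the ten-integer-field payload above.
message = IntegersTest.new(data)

Benchmark.ips do |x|
  x.config(time: 20, warmup: 5)

  # Serialization only: the message object is built once, outside the block.
  x.report('protobuf encoding') do
    IntegersTest.encode(message)
  end

  # Closer to real life: build the message on every iteration, then serialize.
  x.report('protobuf with model init') do
    IntegersTest.encode(IntegersTest.new(data))
  end

  x.compare!
end
```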
Here are the results:

```
protobuf encoding:          4146929.7 i/s
Oj encoding:                1885092.0 i/s - 2.20x slower
standard JSON encoding:      505697.5 i/s - 8.20x slower
Yajl encoding:               496121.7 i/s - 8.36x slower
protobuf with model init:    489658.0 i/s - 8.47x slower
```

The result is not so obvious. I expected encoding with message initialization to be slower, but not the slowest of all.

Let's check deserialization:

```
protobuf parsing:         737979.5 i/s
Oj parsing:               448833.9 i/s - 1.64x slower
standard JSON parsing:    297127.2 i/s - 2.48x slower
Yajl parsing:             184361.1 i/s - 4.00x slower
```

No surprises here. In terms of payload size, Protobuf is almost four times more compact than JSON:

```
JSON payload bytesize 201
Protobuf payload bytesize 58
```

## Doubles only

Doubles are expected to be the hardest payload for JSON; let's check that. Our payload:

```ruby
data = {
  field1: 2312.345434234,
  field2: 31415.926,
  field3: 4316.1592,
  field4: 23141.596,
  field5: 614159.23,
  field6: 3234234.34343443,
  field7: 53141.926,
  field8: 13145.926,
  field9: 323423.434343443,
  field10: 43161.592
}
```

The results:

```
protobuf encoding:          4814662.9 i/s
protobuf with model init:    444424.1 i/s - 10.83x slower
Oj encoding:                 297152.0 i/s - 16.20x slower
Yajl encoding:               160251.9 i/s - 30.04x slower
standard JSON encoding:      158724.3 i/s - 30.33x slower
```

Protobuf is much faster, even with model initialization. Let's check the deserialization:

```
Comparison:
protobuf parsing:         822226.6 i/s
Oj parsing:               395411.3 i/s - 2.08x slower
standard JSON parsing:    241438.7 i/s - 3.41x slower
Yajl parsing:             157235.7 i/s - 5.23x slower
```

Still no surprises. And the payload size:

```
JSON payload bytesize 211
Protobuf payload bytesize 90
```

Not four times smaller, but still noticeable.

## Strings only

Strings are expected to be easier for JSON; let's check. The payload:

```ruby
data = {
  field1: "2312.345434234",
  field2: "31415.926",
  field3: "4316.1592",
  field4: "23141.596",
  field5: "614159.23",
  field6: "3234234.34343443",
  field7: "53141.926",
  field8: "13145.926",
  field9: "323423.434343443",
  field10: "43161.592"
}
```

Bench results:

```
Comparison:
protobuf encoding:          3990298.3 i/s
oj encoder:                 1848941.3 i/s - 2.16x slower
yajl encoder:                455222.0 i/s - 8.77x slower
standard JSON encoding:      444245.6 i/s - 8.98x slower
protobuf with model init:    368818.3 i/s - 10.82x slower
```

Deserialization:

```
Comparison:
protobuf parser:         631262.5 i/s
oj parser:               378697.6 i/s - 1.67x slower
standard JSON parser:    322923.5 i/s - 1.95x slower
yajl parser:             187593.4 i/s - 3.37x slower
```

The payload size:

```
JSON payload bytesize 231
Protobuf payload bytesize 129
```

## Integer array

We already have a separate integers benchmark, but it is interesting to see how Protobuf handles collections. Here is the data:

```ruby
data = {
  field1: [
    2312345434234, 31415926, 43161592, 23141596, 61415923,
    323423434343443, 53141926, 13145926, 323423434343443, 43161592
  ]
}
```

Serialization bench:

```
Comparison:
protobuf encoding:          4639726.6 i/s
oj encoder:                 2929662.1 i/s - 1.58x slower
standard JSON encoding:      699299.2 i/s - 6.63x slower
yajl encoder:                610215.5 i/s - 7.60x slower
protobuf with model init:    463057.9 i/s - 10.02x slower
```

Deserialization bench:

```
Comparison:
oj parser:               1190763.1 i/s
protobuf parser:          760307.3 i/s - 1.57x slower
standard JSON parser:     619360.4 i/s - 1.92x slower
yajl parser:              414352.4 i/s - 2.87x slower
```

To be honest, the deserialization results are pretty unexpected here: Oj outruns Protobuf. The payload size:

```
JSON payload bytesize 121
Protobuf payload bytesize 50
```
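For reference, both array benchmarks rely on a message with a single repeated field. Besides compiling a `.proto` file with protoc, such a message can also be declared through the google-protobuf runtime DSL, which is the form protoc-generated Ruby code took at the time of writing. A sketch follows; the file and message names are my assumptions, not taken from the repo:

```ruby
require 'google/protobuf'

# Declare a message with one repeated int64 field. This mirrors what
# protoc generates for Ruby (google-protobuf 3.x); "bench.proto" and
# "bench.ArrayMessage" are made-up names for illustration.
Google::Protobuf::DescriptorPool.generated_pool.build do
  add_file("bench.proto", syntax: :proto3) do
    add_message "bench.ArrayMessage" do
      repeated :field1, :int64, 1
    end
  end
end

ArrayMessage =
  Google::Protobuf::DescriptorPool.generated_pool
                                  .lookup("bench.ArrayMessage").msgclass
```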
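With a message class in hand, the payload sizes quoted throughout this post boil down to comparing encoded byte strings. A sketch, reusing the assumed `ArrayMessage` class from above:

```ruby
require 'json'

# Encode the same hash with both formats and compare the byte sizes.
json_payload  = JSON.generate(data)
proto_payload = ArrayMessage.encode(ArrayMessage.new(data))

puts "JSON payload bytesize #{json_payload.bytesize}"
puts "Protobuf payload bytesize #{proto_payload.bytesize}"
```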
## Array of doubles

I decided to check whether an array of doubles shows the same behavior. The data:

```ruby
data = {
  field1: [
    2312.345434234, 31415.926, 4316.1592, 23141.596, 614159.23,
    3234234.34343443, 53141.926, 13145.926, 323423.434343443, 43161.592
  ]
}
```

Serialization:

```
Comparison:
protobuf encoding:          7667558.9 i/s
protobuf with model init:    572563.4 i/s - 13.39x slower
Oj encoding:                 323818.1 i/s - 23.68x slower
Yajl encoding:               183763.3 i/s - 41.73x slower
standard JSON encoding:      182332.3 i/s - 42.05x slower
```

Deserialization:

```
Comparison:
Oj parsing:               953384.6 i/s
protobuf parsing:         883899.0 i/s - 1.08x slower
standard JSON parsing:    452799.0 i/s - 2.11x slower
Yajl parsing:             356091.2 i/s - 2.68x slower
```

We got similar results here: it seems that Protobuf parsing struggles with repeated fields. Payload size:

```
JSON payload bytesize 131
Protobuf payload bytesize 82
```

## Complex payload

As a "complex" payload, I mocked some user data with posts and comments on those posts, to make it closer to a real-life application:

```ruby
data = {
  user_id: 12345,
  username: 'johndoe',
  email: 'johndoe@example.com',
  date_joined: '2023-04-01T12:30:00Z',
  is_active: true,
  profile: {
    full_name: 'John Doe',
    age: 30,
    address: '123 Main St, Anytown, USA',
    phone_number: '+1-555-123-4567'
  },
  posts: [
    {
      post_id: 1,
      title: 'My first blog post',
      content: 'This is the content of my first blog post.',
      date_created: '2023-04-01T14:00:00Z',
      likes: 10,
      tags: ['blog', 'first_post', 'welcome'],
      comments: [
        {
          comment_id: 101,
          author: 'Jane',
          content: 'Great first post!',
          date_created: '2023-04-01T15:00:00Z',
          likes: 3
        },
        # ...
      ]
    },
    # ...
  ]
}
```

Serialization results:

```
Comparison:
protobuf encoding:          1038246.0 i/s
Oj encoding:                 296018.6 i/s - 3.51x slower
Yajl encoding:               125909.6 i/s - 8.25x slower
protobuf with model init:    119673.2 i/s - 8.68x slower
standard JSON encoding:      115773.4 i/s - 8.97x slower
```

Deserialization results:

```
Comparison:
protobuf parsing:         291605.9 i/s
Oj parsing:                76994.7 i/s - 3.79x slower
standard JSON parsing:     64823.6 i/s - 4.50x slower
Yajl parsing:              34936.4 i/s - 8.35x slower
```

And the payload size:

```
JSON payload bytesize 1700
Protobuf payload bytesize 876
```

We see the expected behavior, with pure Protobuf encoding in first place. However, the "real-world" variant with model initialization is barely faster than the standard JSON encoding and falls behind both Oj and Yajl.

## Conclusion

If you are considering switching from JSON to Protobuf just for the speed, it may not be worth it. The reason to use Protobuf should be its awesome cross-language schema definition for data exchange, not a performance boost.

The lead image for this article was generated by HackerNoon's AI Image Generator via the prompt "programming language".