Protobuf vs JSON in the Ruby World

By Oleksandr Starodubtsev | 10 min read | 2023/04/26

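Too Long; Didn't Read

Protobuf is a fast, compact, cross-platform messaging system made up of a definition language and language-specific compilers. It has good backward and forward compatibility, is said to be fast (we will check that), and is more compact than JSON. It is not compressed, and some specialized formats handle their own kind of data better.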

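In my current project I use protobuf not only for gRPC but also as the RabbitMQ message format. Speed is not protobuf's only advantage, but I wanted to know whether it really is that fast compared to JSON libraries, especially in the Ruby world. I decided to run some benchmarks to find out, but first a short introduction to each format.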

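What is Protocol Buffers?

It is a fast, compact, cross-platform messaging system designed with forward and backward compatibility in mind. It consists of a definition language and language-specific compilers.

It works well for small, object-like data, has good backward and forward compatibility, is said to be fast (we will check that), and is more compact than JSON, for example. It also has limitations, such as no support for direct comparison: you need to deserialize objects before comparing them.

It is not compressed, and some specialized formats handle their own kind of data better (JPEG, for example). It is also not self-describing.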
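To make "compact" and "not self-describing" more concrete, here is a minimal sketch of how a single integer field ends up on the wire. It hand-rolls the varint encoding described in the protobuf encoding documentation. The helper names `encode_varint` and `encode_varint_field` are made up for this illustration and are not part of the google-protobuf gem.

```ruby
# Encode a non-negative integer as a protobuf base-128 varint:
# 7 payload bits per byte, least significant group first,
# high bit set on every byte except the last.
def encode_varint(value)
  raise ArgumentError, 'non-negative integers only' if value.negative?

  bytes = []
  loop do
    byte = value & 0x7F
    value >>= 7
    if value.zero?
      bytes << byte
      break
    end
    bytes << (byte | 0x80) # more bytes follow
  end
  bytes.pack('C*')
end

# A field is written as a key followed by the value.
# key = (field_number << 3) | wire_type, and wire type 0 means varint.
def encode_varint_field(field_number, value)
  encode_varint((field_number << 3) | 0) + encode_varint(value)
end

encoded = encode_varint_field(1, 150)
puts encoded.unpack1('H*') # → 089601
```

The three bytes carry the field number and the value but not the field name, which is why a reader needs the schema to make sense of them.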

For details, see the official documentation.

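What is JSON

JSON stands for JavaScript Object Notation. It is a text-based data format that was originally used in JavaScript but later spread widely as a communication format, not only between JS applications and backends but also between microservices, along with many other uses.

It uses strings as keys, and the available value types are string, number, boolean, object, array, and null. Its main advantages are that it is human-readable and easy to serialize and parse in most programming languages.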
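As a quick illustration of those value types, the sketch below round-trips a small hash through Ruby's standard `json` library. The keys and values are invented for this example.

```ruby
require 'json'

# One entry per JSON value type (sample data, not from the benchmarks).
payload = {
  'name'    => 'johndoe',                # string
  'age'     => 30,                       # number
  'active'  => true,                     # boolean
  'profile' => { 'city' => 'Anytown' },  # object
  'tags'    => %w[blog welcome],         # array
  'deleted' => nil                       # null
}

text = JSON.generate(payload) # compact, human-readable text
puts text

parsed = JSON.parse(text)     # back to a Ruby hash with string keys
puts parsed == payload
```

Unlike the protobuf bytes, the generated text repeats every key name, which is part of what makes it both readable and larger.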

For details, see the JSON site.

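Benchmarks

I picked three popular Ruby JSON libraries: Oj, Yajl, and the standard JSON library. For protobuf, I use the standard Google protoc with the Google Ruby gem.

I will measure payloads of several specific types to see which data type shows the biggest difference, as well as a complex payload with a mix of field types.

You can see all the code here: https://github.com/alexstaro/proto-vs-json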

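Benchmark setup

For hardware, I use a laptop with an AMD Ryzen 3 PRO 5450U and 16 GB of DDR4 RAM.

For the operating system, I use Ubuntu 22.10 Kinetic.

Ruby 3.2.1 was installed via asdf.

For benchmarking, I use the benchmark-ips gem ( https://github.com/evanphx/benchmark-ips )

The setup looks like this: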

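    require 'benchmark/ips'
    require 'yajl'

    Benchmark.ips do |x|
      # Measure each report for 20 seconds after a 5-second warmup.
      x.config(time: 20, warmup: 5)

      x.report('Yajl encoding') do
        Yajl::Encoder.encode(data)
      end
      ...
      x.compare!
    end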
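For readers who have not used benchmark-ips, the core idea of an "iterations per second" figure can be sketched with the standard library alone. This is only a rough stand-in: the real gem adds warmup, batching, and statistics, and the helper name `iterations_per_second` is hypothetical.

```ruby
require 'json'

# Run the block repeatedly for roughly `seconds` and return how many
# iterations completed per second. Reading the clock on every pass adds
# overhead, so treat the number as indicative only.
def iterations_per_second(seconds: 0.2)
  iterations = 0
  elapsed = 0.0
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  while elapsed < seconds
    yield
    iterations += 1
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
  iterations / elapsed
end

data = { field1: 2312345434234, field2: 31415926 }
rate = iterations_per_second { JSON.generate(data) }
puts format('standard JSON encoding: %.1f i/s', rate)
```

The `i/s` values in the results that follow are this kind of figure, and `x.compare!` expresses each entry as a multiple of the fastest one.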

Integers only

We will start with integers only. Numbers are hard for JSON because they have to be converted to and from decimal text, so we expect protobuf to be far ahead of the other competitors.

Test data:

    data = {
      field1: 2312345434234, field2: 31415926, field3: 43161592,
      field4: 23141596, field5: 61415923, field6: 323423434343443,
      field7: 53141926, field8: 13145926, field9: 323423434343443,
      field10: 43161592
    }

Benchmark results:

    protobuf encoding:      4146929.7 i/s
    Oj encoding:            1885092.0 i/s - 2.20x slower
    standard JSON encoding:  505697.5 i/s - 8.20x slower
    Yajl encoding:           496121.7 i/s - 8.36x slower

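Protobuf is the clear winner here, but what if we make the test closer to a real-world scenario? In practice we almost always create a proto message just to serialize it.

What happens if we move the model initialization inside the benchmarked block?

Here are the results:

    protobuf encoding:        4146929.7 i/s
    Oj encoding:              1885092.0 i/s - 2.20x slower
    standard JSON encoding:    505697.5 i/s - 8.20x slower
    Yajl encoding:             496121.7 i/s - 8.36x slower
    protobuf with model init:  489658.0 i/s - 8.47x slower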

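Now the picture is less clear-cut. I expected encoding with message initialization to be slower, but not the slowest of all.

Let's check deserialization:

    protobuf parsing:      737979.5 i/s
    Oj parsing:            448833.9 i/s - 1.64x slower
    standard JSON parsing: 297127.2 i/s - 2.48x slower
    Yajl parsing:          184361.1 i/s - 4.00x slower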

No surprises here.

In terms of payload size, the protobuf message is about 3.5 times smaller than the JSON one:

    JSON payload bytesize 201
    Protobuf payload bytesize 58
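The two sizes can be reproduced by hand, which shows where the difference comes from. The sketch below is an estimate, not the repository's code: it assumes the fields are numbered 1 to 10 in declaration order and are encoded as plain varints, and `varint_size` is a hypothetical helper.

```ruby
require 'json'

data = {
  field1: 2312345434234, field2: 31415926, field3: 43161592,
  field4: 23141596, field5: 61415923, field6: 323423434343443,
  field7: 53141926, field8: 13145926, field9: 323423434343443,
  field10: 43161592
}

# Bytes a non-negative integer needs as a varint (7 payload bits per byte).
def varint_size(value)
  [(value.bit_length + 6) / 7, 1].max
end

# JSON spells out every key name and every decimal digit.
json_size = JSON.generate(data).bytesize

# Assumption: field numbers 1..10, wire type 0 (varint) for every field.
proto_size = data.values.each_with_index.sum do |value, index|
  key = (index + 1) << 3
  varint_size(key) + varint_size(value)
end

puts "JSON: #{json_size} bytes, estimated protobuf: #{proto_size} bytes"
```

Under those assumptions the estimate comes out at 58 bytes against 201 for JSON, matching the measured sizes: each field costs one key byte plus four to seven value bytes, while JSON pays for the quoted key name and up to fifteen digits.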

Doubles only

Doubles are expected to be the hardest payload for JSON. Let's check.

Our payload:

    data = {
      field1: 2312.345434234, field2: 31415.926, field3: 4316.1592,
      field4: 23141.596, field5: 614159.23, field6: 3234234.34343443,
      field7: 53141.926, field8: 13145.926, field9: 323423.434343443,
      field10: 43161.592
    }

Results:

    protobuf encoding:        4814662.9 i/s
    protobuf with model init:  444424.1 i/s - 10.83x slower
    Oj encoding:               297152.0 i/s - 16.20x slower
    Yajl encoding:             160251.9 i/s - 30.04x slower
    standard JSON encoding:    158724.3 i/s - 30.33x slower

Protobuf is much faster here, even with model initialization. Let's check deserialization:

    Comparison:
    protobuf parsing:      822226.6 i/s
    Oj parsing:            395411.3 i/s - 2.08x slower
    standard JSON parsing: 241438.7 i/s - 3.41x slower
    Yajl parsing:          157235.7 i/s - 5.23x slower

Still no surprises here.

And the payload size:

    JSON payload bytesize 211
    Protobuf payload bytesize 90
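The 90 bytes follow from the fact that a protobuf `double` always occupies 8 bytes on the wire. A rough check, assuming field numbers 1 to 10 so that every key fits in a single byte (the schema itself is not shown here, so that is an assumption):

```ruby
require 'json'

data = {
  field1: 2312.345434234, field2: 31415.926, field3: 4316.1592,
  field4: 23141.596, field5: 614159.23, field6: 3234234.34343443,
  field7: 53141.926, field8: 13145.926, field9: 323423.434343443,
  field10: 43161.592
}

json_size = JSON.generate(data).bytesize

# A double is a fixed 8-byte little-endian IEEE 754 value, whatever it holds.
double_bytes = [data[:field1]].pack('E').bytesize

# Assumption: one key byte per field (field numbers below 16).
proto_size = data.size * (1 + double_bytes)

puts "JSON: #{json_size} bytes, estimated protobuf: #{proto_size} bytes"
```

Because the size is fixed, short values such as `4316.1592` gain little over their text form, which is why the ratio is smaller than it was for integers.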

Not the near-4x we saw for integers, but still noticeable: about 2.3 times smaller.

Strings only

Strings should be easier for JSON. Let's check.

Payload:

    data = {
      field1: "2312.345434234", field2: "31415.926", field3: "4316.1592",
      field4: "23141.596", field5: "614159.23", field6: "3234234.34343443",
      field7: "53141.926", field8: "13145.926", field9: "323423.434343443",
      field10: "43161.592"
    }

Benchmark results:

    Comparison:
    protobuf encoding:        3990298.3 i/s
    oj encoder:               1848941.3 i/s - 2.16x slower
    yajl encoder:              455222.0 i/s - 8.77x slower
    standard JSON encoding:    444245.6 i/s - 8.98x slower
    protobuf with model init:  368818.3 i/s - 10.82x slower

Deserialization:

    Comparison:
    protobuf parser:      631262.5 i/s
    oj parser:            378697.6 i/s - 1.67x slower
    standard JSON parser: 322923.5 i/s - 1.95x slower
    yajl parser:          187593.4 i/s - 3.37x slower

Payload size:

    JSON payload bytesize 231
    Protobuf payload bytesize 129

Integer array

Although we already have a separate integers benchmark, it is interesting to see how protobuf handles collections.

Here is the data:

    data = {
      field1: [
        2312345434234, 31415926, 43161592, 23141596, 61415923,
        323423434343443, 53141926, 13145926, 323423434343443, 43161592
      ]
    }

Serialization benchmark:

    Comparison:
    protobuf encoding:        4639726.6 i/s
    oj encoder:               2929662.1 i/s - 1.58x slower
    standard JSON encoding:    699299.2 i/s - 6.63x slower
    yajl encoder:              610215.5 i/s - 7.60x slower
    protobuf with model init:  463057.9 i/s - 10.02x slower

Deserialization benchmark:

    Comparison:
    oj parser:            1190763.1 i/s
    protobuf parser:       760307.3 i/s - 1.57x slower
    standard JSON parser:  619360.4 i/s - 1.92x slower
    yajl parser:           414352.4 i/s - 2.87x slower

Honestly, the deserialization result is quite unexpected: Oj parses this payload faster than protobuf.

Let's check the payload size:

    JSON payload bytesize 121
    Protobuf payload bytesize 50
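The 50 bytes are consistent with a packed repeated field, where the key and a length prefix are written once and the values follow back to back. A sketch of that arithmetic, assuming `field1` has field number 1 and uses packed varint encoding (`varint_size` is a hypothetical helper):

```ruby
require 'json'

data = {
  field1: [
    2312345434234, 31415926, 43161592, 23141596, 61415923,
    323423434343443, 53141926, 13145926, 323423434343443, 43161592
  ]
}

# Bytes a non-negative integer needs as a varint (7 payload bits per byte).
def varint_size(value)
  [(value.bit_length + 6) / 7, 1].max
end

json_size = JSON.generate(data).bytesize

# Packed layout: one key byte, a varint length prefix, then the raw varints.
body_size = data[:field1].sum { |value| varint_size(value) }
proto_size = 1 + varint_size(body_size) + body_size

puts "JSON: #{json_size} bytes, estimated protobuf: #{proto_size} bytes"
```

Compared with the ten separate integer fields earlier, the array saves the per-element key bytes: 48 bytes of values plus 2 bytes of framing.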

Array of doubles

I decided to check whether an array of doubles shows the same behavior.

Data:

    data = {
      field1: [
        2312.345434234, 31415.926, 4316.1592, 23141.596, 614159.23,
        3234234.34343443, 53141.926, 13145.926, 323423.434343443, 43161.592
      ]
    }

Serialization:

    Comparison:
    protobuf encoding:        7667558.9 i/s
    protobuf with model init:  572563.4 i/s - 13.39x slower
    Oj encoding:               323818.1 i/s - 23.68x slower
    Yajl encoding:             183763.3 i/s - 41.73x slower
    standard JSON encoding:    182332.3 i/s - 42.05x slower

Deserialization:

    Comparison:
    Oj parsing:            953384.6 i/s
    protobuf parsing:      883899.0 i/s - 1.08x slower
    standard JSON parsing: 452799.0 i/s - 2.11x slower
    Yajl parsing:          356091.2 i/s - 2.68x slower

We got similar results here. It seems protobuf has some trouble parsing arrays.

Payload size:

    JSON payload bytesize 131
    Protobuf payload bytesize 82

Complex payload

As a "complex" payload, I mocked up some user data with posts and comments on those posts to make it look more like a real-life application.

    data = {
      user_id: 12345,
      username: 'johndoe',
      email: 'johndoe@example.com',
      date_joined: '2023-04-01T12:30:00Z',
      is_active: true,
      profile: {
        full_name: 'John Doe',
        age: 30,
        address: '123 Main St, Anytown, USA',
        phone_number: '+1-555-123-4567'
      },
      posts: [
        {
          post_id: 1,
          title: 'My first blog post',
          content: 'This is the content of my first blog post.',
          date_created: '2023-04-01T14:00:00Z',
          likes: 10,
          tags: ['blog', 'first_post', 'welcome'],
          comments: [
            {
              comment_id: 101,
              author: 'Jane',
              content: 'Great first post!',
              date_created: '2023-04-01T15:00:00Z',
              likes: 3
            },
            ...
          ]
        },
        ...
      ]
    }

Results:

    Comparison:
    protobuf encoding:        1038246.0 i/s
    Oj encoding:               296018.6 i/s - 3.51x slower
    Yajl encoding:             125909.6 i/s - 8.25x slower
    protobuf with model init:  119673.2 i/s - 8.68x slower
    standard JSON encoding:    115773.4 i/s - 8.97x slower

    Comparison:
    protobuf parsing:      291605.9 i/s
    Oj parsing:             76994.7 i/s - 3.79x slower
    standard JSON parsing:  64823.6 i/s - 4.50x slower
    Yajl parsing:           34936.4 i/s - 8.35x slower

And the payload size:

    JSON payload bytesize 1700
    Protobuf payload bytesize 876

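Pure protobuf encoding takes first place, as expected. However, if we look at our "real world" case, protobuf with model initialization, it is roughly on par with the standard JSON and Yajl encoders and about 2.5 times slower than Oj.

Conclusion

If you are switching from JSON to Protobuf just for the speed, it may not be worth it.

The reason to use Protobuf should be its excellent cross-language schema definition for data exchange, not a performance boost.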

