*How do you validate that a Protobuf message does not contain enum fields with a zero value? It turns out that Protobuf does not support this directly! We need to look into how the* `protojson` *package is implemented.*

More and more companies are adopting gRPC with [Protobuf](https://developers.google.com/protocol-buffers) for communication between internal services. It offers high performance, supports multiple programming languages, and is backed by Google with a great ecosystem around it.

For communication with front-ends and external services, Protobuf can be marshaled to JSON. Browsers work with JSON natively, and we cannot expect other companies to consume Protobuf directly from us. *(Of course, you can, if you are [big enough](https://github.com/googleapis/googleapis)!)*

### Let’s talk about the enum and some problems that we encountered recently with it.

*Sample code is written in Go.*

From the Protobuf [style guide](https://developers.google.com/protocol-buffers/docs/style#enums), the zero value enum should have the suffix `UNSPECIFIED`. That is because an enum is implemented as an `int32`, and the value `0` is considered as, well, unspecified. It’s similar to `nil` for a message or an empty string. When encoding Protobuf as JSON, a `nil` message, an `UNSPECIFIED` enum, or an empty string is omitted by default.

We were following that convention, until one day, we did not.

When sending external webhook messages, we decided not to use `0` as `UNSPECIFIED`. One reason is that we use `EmitUnpopulated: true` to ensure that all fields are included in the JSON representation when sending webhook messages to external parties. **And we don’t want that** `UNSPECIFIED` value to appear in the webhook messages if we somehow forget to set an enum field and it stays at 0. Unit tests cannot catch all the mistakes; we engineers know that.

This caused a lot of trouble, so we had to revert and make the value `0` mean `UNSPECIFIED` again.
One problem is that a meaningful zero value forces the use of `EmitUnpopulated: true` everywhere! And there are places where we don’t want to emit all unpopulated fields, such as when calling some third-party APIs. Some messages mix enums whose zero value is `UNSPECIFIED` with enums whose zero value is meaningful, and there is no way to send the correct format for them. With `EmitUnpopulated: true`, the third-party APIs receive `UNSPECIFIED` values they don’t understand; with `EmitUnpopulated: false`, some required fields holding a meaningful zero value are omitted. Of course, all of this can be refactored away, but it is simpler to just enforce `UNSPECIFIED` as the zero value from the beginning.

### Well, why not just verify that all enum fields are not set to 0 in webhook messages?

You may ask. It turns out there is no simple way to do that in Protobuf 3!

In Protobuf 2, there is a `required` rule that prevents a field from being left unset. It was removed in Protobuf 3 because it gets in the way of refactoring that removes fields. If we forget to update every service after removing a no-longer-used `required` field, especially in a company with multiple teams working together, messages will be dropped unintentionally. It is better not to require it upfront. (*[more](https://developers.google.com/protocol-buffers/docs/proto#specifying-rules)*)

In Protobuf 3 for Go, there used to be the `jsonpb.JSONPBMarshaler` interface. We could simply implement that interface for all enums and return an error upon seeing a zero value. But again, it was removed! As a protocol, Protobuf should keep customization to a minimum. Otherwise, that customization would have to be implemented and maintained in all the different languages across different teams!

### So, how to validate zero value in enum fields?

We’ll have to reach for the reflection package. The `protoreflect.Message` interface has a `Range()` method for iterating over every populated field. We can use that method to verify that there are no enum fields with zero… Oh, wait. It only iterates over *populated fields*.
So it won’t detect the zero value in an enum!

But the function `protojson.Marshal()` can still emit unpopulated fields with the `EmitUnpopulated` option. How does it implement that? Diving into `encoding/protojson`, we find a code snippet for iterating over *unpopulated fields* (*[source](https://github.com/protocolbuffers/protobuf-go/blob/master/encoding/protojson/encode.go#L174-L197)*). Let’s steal it:

```go
// pref is the alias protojson uses for the protoreflect package.
import pref "google.golang.org/protobuf/reflect/protoreflect"

// unpopulatedFieldRanger wraps a protoreflect.Message and modifies its Range
// method to additionally iterate over unpopulated fields.
type unpopulatedFieldRanger struct{ pref.Message }

func (m unpopulatedFieldRanger) Range(f func(pref.FieldDescriptor, pref.Value) bool) {
	fds := m.Descriptor().Fields()
	for i := 0; i < fds.Len(); i++ {
		fd := fds.Get(i)
		if m.Has(fd) || fd.ContainingOneof() != nil {
			continue // ignore populated fields and fields within a oneof
		}

		v := m.Get(fd)
		isProto2Scalar := fd.Syntax() == pref.Proto2 && fd.Default().IsValid()
		isSingularMessage := fd.Cardinality() != pref.Repeated && fd.Message() != nil
		if isProto2Scalar || isSingularMessage {
			v = pref.Value{} // use invalid value to emit null
		}
		if !f(fd, v) {
			return
		}
	}
	m.Message.Range(f)
}
```

What the above code does is iterate over the additional, unpopulated fields by looping over `protoreflect.Message.Descriptor().Fields()`, and then fall through to the regular `Range()` for the populated ones. Fields within a `oneof` are skipped. Unpopulated singular `message` fields are replaced with an invalid value (think of it as `null` in the generated JSON) before being passed to the callback function.

There is still a bit more code to write, like implementing a traversal method for iterating over all the different Protobuf types: message, list (repeated), dynamic [Struct](https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#google.protobuf.Struct), and of course, enum. But it’s solvable. And I can take a rest now.

Thanks for reading!
If you have a better way to do that, please let me know by connecting on [Twitter](https://twitter.com/olvrng) 👋

*Also published at [my blog](https://olvrng.github.io/w/proto.enum/).*