TL;DR: Using gRPC with Kubernetes cluster-internally is straightforward. Exposing a gRPC service cluster-externally, not so much. Maybe a good practice could be: use gRPC cluster-internally and offer public interfaces using HTTP and/or WebSockets?
My starting point was a simple gRPC server example called Yet another gRPC echo server (YAGES). This YAGES example shows end-to-end how to develop and deploy a gRPC service using Go and Kubernetes. You can of course use your favourite programming language (as long as it’s supported by gRPC) rather than Go to implement the service. Now, after deploying the gRPC service as a Kubernetes app, accessing it from within the cluster is straightforward enough, for example, using a jump pod with grpcurl installed:
$ grpcurl --plaintext yages.grpc-demo:9000 yages.Echo.Ping
{"text": "pong"}
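The target yages.grpc-demo:9000 in that command appears to follow the Kubernetes in-cluster DNS convention of service name, then namespace, then port. Here is a minimal Go sketch of how a client could build such a target. The helper names are hypothetical and not part of YAGES or grpcurl, and cluster.local is only the usual default cluster domain, so treat it as an assumption:

```go
package main

import "fmt"

// inClusterTarget builds the short "<service>.<namespace>:<port>" address
// that pods in the same cluster can normally resolve.
// Hypothetical helper for illustration only.
func inClusterTarget(service, namespace string, port int) string {
	return fmt.Sprintf("%s.%s:%d", service, namespace, port)
}

// fullyQualifiedTarget builds the long form of the same address.
// The cluster domain is passed in because "cluster.local" is only the
// common default; a cluster may be configured with a different one.
func fullyQualifiedTarget(service, namespace, clusterDomain string, port int) string {
	return fmt.Sprintf("%s.%s.svc.%s:%d", service, namespace, clusterDomain, port)
}

func main() {
	fmt.Println(inClusterTarget("yages", "grpc-demo", 9000))
	fmt.Println(fullyQualifiedTarget("yages", "grpc-demo", "cluster.local", 9000))
}
```

Neither form is resolvable from outside the cluster, which is exactly where the rest of this post picks up.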
OK, cool. How about if I want to access this from outside of the cluster? Well, let’s see …
The gRPC blog provides a nice introduction to the topic in the post gRPC Load Balancing, which discusses the options, and the docs have some related info. For another good intro, see Tom Wilkie’s presentation on the topic.
Equipped with the basics, I set out to get a setup working, with the focus on UX and ease of use. Ideally, it would work out of the box or at least with minimal configuration effort. I had a look at a couple of load balancer/reverse proxy/service mesh options: NGINX, three Envoy-based solutions (Ambassador, Contour, and Istio), Linkerd, and Træfik.
Here’s what I found …
Datawire’s Envoy-based API gateway Ambassador has first-class gRPC support, and we’re also using it in Kubeflow. The UX is good; the only thing to be aware of is that it requires a cluster role and a corresponding binding. I haven’t yet figured out how to use it at the namespace level, that is, in an environment where I don’t have permission to create cluster-wide resources.
Heptio’s Contour is a Kubernetes ingress controller using Envoy. It seems gRPC support is actively being worked on; I found one issue that needs to be addressed to make it work.
Linkerd is a CNCF inception project, originally developed by the service mesh pioneers Buoyant. It supports gRPC out of the box. There are very nice blog posts available on the topic, for example A Service Mesh For Kubernetes Part IX: gRPC for fun and profit from 04/2017 and Building scalable micro-services with Kubernetes, GRPC & Linkerd from 02/2018, but I have yet to try it out.
Træfik is an HTTP reverse proxy and load balancer with built-in support for gRPC. It’s also on my to-do list to try out.
I think it’s fair to say that NGINX doesn’t need an introduction. Amongst many other things, you can use it as a Kubernetes Ingress controller. One of the things it now also supports is gRPC (since 1.13.10/mid-March 2018). I tried patching it on Minikube (which still uses v0.9.0 of the controller) but have had no luck so far. I will also give it a try in the context of GKE.
Finally, Istio, another project from the Cloud Native ecosystem, is a service mesh that uses Envoy by default for the data plane. It seems the community is actively working on gRPC support. I’ll give it a try once it’s available.
For completeness’ sake, I’ll also mention that it’s apparently possible to use HAProxy to handle gRPC as well, but I didn’t have a closer look at it. Also, there’s sercand/kuberesolver, a client-side load balancer/Kubernetes name resolver, which sounds like a good option for cluster-internal use cases.
Note that my setup targets Kubernetes 1.9 in Minikube (v0.25) and GKE, and my only hard requirement is that it has to work with RBAC enabled.
Conclusion: I haven’t been able to use any of the above options successfully to expose my little YAGES example to the outside world. I got the farthest with Ambassador so far, but I’m not yet able to get to it with grpcurl. The fact that gRPC requires HTTP/2 seems to be part of the challenge, RBAC another one.
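To make the HTTP/2 point a bit more concrete: a gRPC call travels as an HTTP/2 request whose content type starts with application/grpc, so anything in the path that only speaks HTTP/1.1 no longer passes it through as gRPC. Here is a rough Go sketch of such a check, using only the standard library. The function names are made up for illustration, and this is not how any of the proxies above are implemented:

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// isGRPC reports whether a request with the given HTTP major version and
// content type has the shape of a gRPC call.
// Hypothetical helper for illustration only.
func isGRPC(protoMajor int, contentType string) bool {
	return protoMajor == 2 && strings.HasPrefix(contentType, "application/grpc")
}

// looksLikeGRPC applies the same check to an incoming *http.Request.
func looksLikeGRPC(r *http.Request) bool {
	return isGRPC(r.ProtoMajor, r.Header.Get("Content-Type"))
}

func main() {
	// Build two requests by hand; nothing is sent over the network.
	h2 := &http.Request{ProtoMajor: 2, Header: http.Header{}}
	h2.Header.Set("Content-Type", "application/grpc")

	h1 := &http.Request{ProtoMajor: 1, ProtoMinor: 1, Header: http.Header{}}
	h1.Header.Set("Content-Type", "application/grpc")

	fmt.Println(looksLikeGRPC(h2)) // HTTP/2 request: true
	fmt.Println(looksLikeGRPC(h1)) // same headers over HTTP/1.1: false
}
```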
I have no doubt that, given the ecosystem is moving so fast, the UX/DX around configuring and using gRPC towards cluster-external clients will improve a lot. For now it’s still early days, in my experience. Possible, but not easy in terms of UX/DX.
Maybe exposing gRPC to the outside world per se is not a great idea? That is, maybe the good practice we can derive is: use gRPC within the cluster and HTTP and/or WebSockets to communicate with cluster-external clients?
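As a rough idea of what that split could look like, here is a minimal Go sketch of an HTTP facade in front of a cluster-internal backend. It is only a sketch under assumptions: the Pinger interface, fakePinger, and the /ping path are hypothetical names of mine, and in a real setup the Pinger implementation would wrap a generated gRPC client for the echo service rather than return a fixed string:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Pinger abstracts the cluster-internal gRPC call. Hypothetical interface:
// a real implementation would wrap a generated gRPC client stub.
type Pinger interface {
	Ping() (string, error)
}

// fakePinger stands in for the gRPC backend and always answers "pong".
type fakePinger struct{}

func (fakePinger) Ping() (string, error) { return "pong", nil }

// pingBody calls the backend and encodes its reply as JSON.
func pingBody(p Pinger) ([]byte, error) {
	text, err := p.Ping()
	if err != nil {
		return nil, err
	}
	return json.Marshal(map[string]string{"text": text})
}

// pingHandler exposes the backend call as a plain HTTP endpoint that
// cluster-external clients can reach without speaking gRPC.
func pingHandler(p Pinger) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := pingBody(p)
		if err != nil {
			http.Error(w, "backend unavailable", http.StatusBadGateway)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}

func main() {
	// Exercise the handler in-process instead of binding a real port.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/ping", nil)
	pingHandler(fakePinger{})(rec, req)
	fmt.Println(rec.Code, rec.Body.String()) // prints: 200 {"text":"pong"}
}
```

The facade is then an ordinary HTTP service that a standard Ingress can expose, while gRPC stays behind it inside the cluster.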
I’d love to learn about your experiences in this space and hear any thoughts or recommendations here!
UPDATE 2018-03-27: since publishing this post I’ve received a great deal of valuable feedback and learned about some more things. So thank you to everyone who took the time to read and follow up here, much appreciated! Here are two things I’d like to add for now: