Most business applications today communicate with each other over HTTP/1.1 using REST. This has become the industry standard thanks to its sheer simplicity and the minimal effort required to integrate applications.
However, once your application reaches the scale of a million requests per second or more, the shortcomings of this mechanism become apparent. Let’s address those first:
A single HTTP/1.1 connection can carry only one request and response at a time. The connection is blocked until the response arrives, which is highly inefficient.
HTTP/1.1 headers are not compressed, which unnecessarily inflates request size. In HTTP/2 they are compressed using the HPACK algorithm, which encodes repeated headers as short index references and shrinks them dramatically.
HTTP/1.1 is text-based, which is inefficient for transmitting data: textual messages require complex parsers and don’t compress as well as binary framing.
HTTP/2 was designed to remove these limitations. It supports multiplexing, which allows a client to send multiple parallel requests over a single connection; headers are compressed using the HPACK algorithm; and data is framed and transmitted in binary.
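To see the difference outside of gRPC, here is a minimal sketch using java.net.http, the standard HTTP client available since Java 11 (the URL is just a placeholder); response.version() reports whether the server actually negotiated HTTP/2:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class Http2Check {
    public static void main(String[] args) throws Exception {
        // Ask for HTTP/2; the client falls back to HTTP/1.1 if the server can't negotiate it
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .build();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com")) // placeholder URL
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.version()); // prints HTTP_2 if negotiated, HTTP_1_1 otherwise
    }
}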
In this article, we’ll explore how you can use HTTP/2 in your service. Currently, the most popular way to do so is gRPC, by Google.
In simple terms, gRPC is an RPC framework that connects services over HTTP/2. It is currently used by hundreds of companies to connect microservices efficiently.
gRPC services differ from REST in that they expose methods/procedures rather than endpoints. The client simply calls a method as if it were implemented locally, and behind the scenes an RPC call is sent to the server, which contains the actual implementation of the method.
The interface of the service and its methods is defined in a .proto file. This is because gRPC uses Protocol Buffers both to transfer data between services and to generate client and server stubs. The binary encoding results in serialization and deserialization speeds an order of magnitude faster than text formats. A detailed comparison between Protocol Buffers, JSON and FlatBuffers can be found in this article by me.
The RPC call is made over HTTP/2, which lets gRPC users automatically leverage all the features of the protocol described above.
Let’s create a simple Hello World service in gRPC using Java. The example shown here can be found at https://grpc.io/docs/quickstart/java/
First, we define the interface of the service in a .proto file. Let’s name this file greeter.proto.
We name the service Greeter. It contains a single method, SayHello, which accepts a HelloRequest object and returns a HelloReply object.
Next, we define the formats of the HelloRequest and HelloReply messages as Protocol Buffers.
syntax = "proto3";

// The greeting service definition.
service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

// The request message containing the user's name.
message HelloRequest {
  string name = 1;
}

// The response message containing the greetings
message HelloReply {
  string message = 1;
}
Next, we’ll generate the service interface and client stub using the protobuf compiler. Create a standard Maven Java project, then copy the greeter.proto file into the src/main/proto folder.
Once it is there, we can use the protobuf Maven plugin to generate Java classes from this file.
<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <grpc.version>1.19.0</grpc.version><!-- CURRENT_GRPC_VERSION -->
  <protobuf.version>3.6.1</protobuf.version>
  <protoc.version>3.6.1</protoc.version>
  <!-- required for jdk9 -->
  <maven.compiler.source>1.7</maven.compiler.source>
  <maven.compiler.target>1.7</maven.compiler.target>
</properties>

<build>
  <sourceDirectory>${basedir}/target/generated-sources/</sourceDirectory>
  <extensions>
    <extension>
      <groupId>kr.motd.maven</groupId>
      <artifactId>os-maven-plugin</artifactId>
      <version>1.6.2</version>
    </extension>
  </extensions>
  <plugins>
    <plugin>
      <groupId>org.xolstice.maven.plugins</groupId>
      <artifactId>protobuf-maven-plugin</artifactId>
      <version>0.5.1</version>
      <configuration>
        <protocArtifact>com.google.protobuf:protoc:${protoc.version}:exe:${os.detected.classifier}</protocArtifact>
        <pluginId>grpc-java</pluginId>
        <pluginArtifact>io.grpc:protoc-gen-grpc-java:${grpc.version}:exe:${os.detected.classifier}</pluginArtifact>
      </configuration>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>compile-custom</goal>
            <goal>test-compile</goal>
            <goal>test-compile-custom</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
The classes will be generated in the target/generated-sources/protobuf directory.
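Before wiring up the service, it is worth seeing what the binary encoding mentioned earlier looks like in practice. A minimal sketch using the generated HelloRequest class (the package of the generated code depends on your proto options; this assumes it is on the classpath):

import com.google.protobuf.InvalidProtocolBufferException;

public class SerializationDemo {
    public static void main(String[] args) throws InvalidProtocolBufferException {
        // Build a message with the generated builder API
        HelloRequest request = HelloRequest.newBuilder().setName("world").build();
        // Serialize to the compact binary wire format
        byte[] bytes = request.toByteArray();
        System.out.println("Encoded size: " + bytes.length + " bytes");
        // Deserialize back into an object
        HelloRequest parsed = HelloRequest.parseFrom(bytes);
        System.out.println("Parsed name: " + parsed.getName());
    }
}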
We can now extend the generated service interface to implement its methods.
private class GreeterImpl extends GreeterGrpc.GreeterImplBase {
    @Override
    public void sayHello(HelloRequest req, StreamObserver<HelloReply> responseObserver) {
        // Build the protobuf response message
        HelloReply reply = HelloReply.newBuilder().setMessage("Hello " + req.getName()).build();
        // Send the response; for streaming RPCs you can call onNext multiple times
        responseObserver.onNext(reply);
        // Tell the client that the response has finished
        responseObserver.onCompleted();
    }
}
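As the comment above hints, gRPC also supports streaming responses. The following is a hypothetical variation, not part of the quickstart: it assumes the .proto declared a server-streaming method, in which case the server simply calls onNext once per message:

// Hypothetical server-streaming handler; assumes the .proto declares:
//   rpc SayHelloStream (HelloRequest) returns (stream HelloReply) {}
@Override
public void sayHelloStream(HelloRequest req, StreamObserver<HelloReply> responseObserver) {
    String[] greetings = {"Hello", "Bonjour", "Hola"};
    for (String greeting : greetings) {
        // Each onNext sends one message over the same HTTP/2 stream
        responseObserver.onNext(
                HelloReply.newBuilder().setMessage(greeting + " " + req.getName()).build());
    }
    responseObserver.onCompleted();
}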
Clients can now simply call the sayHello method, and the service will return the response over HTTP/2.
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.Status;
import io.grpc.StatusRuntimeException;
import io.grpc.stub.StreamObserver;
import java.util.logging.Level;
import java.util.logging.Logger;

// The generated classes (GreeterGrpc, HelloRequest, HelloReply) are assumed to be in the same package
public class Client {
    private static final Logger logger = Logger.getLogger(Client.class.getName());

    private final ManagedChannel channel;
    private final GreeterGrpc.GreeterBlockingStub blockingStub;
    private final GreeterGrpc.GreeterStub asyncStub;

    public Client(String host, int port) {
        channel = ManagedChannelBuilder.forAddress(host, port).usePlaintext().build(); // remove usePlaintext() for secure (TLS) connections
        blockingStub = GreeterGrpc.newBlockingStub(channel); // use this to make blocking calls
        asyncStub = GreeterGrpc.newStub(channel); // or use this to make async calls
    }

    public void blockingGreet(String name) {
        logger.info("Will try to greet " + name + " ...");
        HelloRequest request = HelloRequest.newBuilder().setName(name).build();
        HelloReply response;
        try {
            response = blockingStub.sayHello(request);
        } catch (StatusRuntimeException e) {
            logger.log(Level.WARNING, "RPC failed: {0}", e.getStatus());
            return;
        }
        logger.info("Greeting: " + response.getMessage());
    }

    public void asyncGreet(String name) {
        StreamObserver<HelloReply> responseObserver = new StreamObserver<HelloReply>() {
            @Override
            public void onNext(HelloReply response) {
                logger.info("Greeting: " + response.getMessage());
            }

            @Override
            public void onError(Throwable t) {
                Status status = Status.fromThrowable(t);
                logger.log(Level.WARNING, "RPC Failed: {0}", status);
            }

            @Override
            public void onCompleted() {
                logger.info("Finished Greeting");
            }
        };
        logger.info("Will try to greet " + name + " ...");
        HelloRequest request = HelloRequest.newBuilder().setName(name).build();
        asyncStub.sayHello(request, responseObserver);
    }
}
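A minimal sketch of how a caller might use this class (the host, port and sleep are assumptions for illustration, not part of the original example):

public class ClientMain {
    public static void main(String[] args) throws InterruptedException {
        Client client = new Client("localhost", 50051); // assumed host and port
        client.blockingGreet("world");    // synchronous call
        client.asyncGreet("async world"); // asynchronous call; the response arrives via the observer
        Thread.sleep(1000); // crude wait so the async response can arrive before the JVM exits
    }
}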
You can either use the server bundled with gRPC or use an external framework that already provides gRPC bindings, such as Vert.x.
public class GreeterServer {
    private static final Logger logger = Logger.getLogger(GreeterServer.class.getName());
    private final int port;
    private final Server server;

    public GreeterServer(int port) throws IOException {
        this(ServerBuilder.forPort(port), port);
    }

    public GreeterServer(ServerBuilder<?> serverBuilder, int port) {
        this.port = port;
        server = serverBuilder.addService(new GreeterImpl()).build();
    }

    public void start() throws IOException {
        server.start();
        logger.info("Server started, listening on " + port);
    }
}
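To run it, a main method is needed. A sketch, assuming it lives inside GreeterServer so it can reach the private server field (port 50051 is just a conventional choice):

// Inside GreeterServer: block until shutdown, since gRPC uses daemon threads
public void blockUntilShutdown() throws InterruptedException {
    if (server != null) {
        server.awaitTermination();
    }
}

public static void main(String[] args) throws IOException, InterruptedException {
    GreeterServer greeterServer = new GreeterServer(50051); // assumed port
    greeterServer.start();
    greeterServer.blockUntilShutdown(); // keep the JVM alive while serving
}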
Now, you have successfully implemented a basic gRPC service.
The non-trivial part of running an HTTP/2 service in production is load balancing. gRPC breaks standard connection-level load balancing, i.e. routing each request over a new connection to another instance, which is what Kubernetes or HAProxy provide by default. This is because gRPC is built on HTTP/2, and HTTP/2 is designed around a single long-lived TCP connection.
The solution is request-level load balancing: create long-lived connections and then distribute individual requests across those connections.
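gRPC itself can do this on the client side. A minimal sketch, assuming the service is reachable under a DNS name that resolves to multiple backends (the target string is a placeholder):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class LoadBalancedChannel {
    public static ManagedChannel create() {
        // One subchannel per resolved backend address; requests are spread
        // round-robin across the long-lived connections
        return ManagedChannelBuilder
                .forTarget("dns:///greeter-service:50051") // placeholder DNS target
                .defaultLoadBalancingPolicy("round_robin")
                .usePlaintext()
                .build();
    }
}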
However, the easiest and most effective way to do this is to use Linkerd2. It is a service mesh that can run beside your Kubernetes, Mesos or any other cluster and acts as a proxy for incoming requests. Since its proxy is written in Rust, it adds very little latency (<1 ms) and load-balances requests across host machines, which it discovers through the Kubernetes API or DNS.
You don’t need to configure anything extra in Linkerd2; it handles HTTP/1 and HTTP/2 traffic by default.
If you want to learn more about gRPC, Linkerd or HTTP/2, you can refer to the links below:
gRPC Load Balancing on Kubernetes without Tears
HTTP/2: the difference between HTTP/1.1, benefits and how to use it
Connect with me on LinkedIn or Twitter or drop a mail to [email protected]