What It’s Like to Migrate a Backend Service From Spring Boot to Vert.x

Vert.x is a reactive framework for the JVM that, like Node.js, uses a non-blocking event loop to respond to requests. This contrasts with dedicating a thread to each request, which is the more typical model in web application development, including in Spring Boot.

The objective of this approach is to maximize the usage of each thread while minimizing the overhead of creating one thread for each request, particularly in cases where request threads spend most of their time waiting (e.g., for the results of a database query). In these cases, the event loop approach improves the application’s run-time efficiency and reduces its response time.
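To make the event loop idea concrete, here is a minimal sketch that uses only the JDK rather than Vert.x itself. The function name simulateRequests and the timer standing in for a database query are assumptions for illustration. A single scheduled thread plays the role of the event loop, and no thread sleeps while a request is waiting.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical helper: starts `requestCount` simulated requests whose "query"
// takes `queryMillis`, and returns the names of the threads that handled them.
fun simulateRequests(requestCount: Int, queryMillis: Long): Set<String> {
  // One scheduled thread stands in for the event loop.
  val eventLoop = Executors.newSingleThreadScheduledExecutor { task -> Thread(task, "event-loop") }
  val handlerThreads = ConcurrentHashMap.newKeySet<String>()
  try {
    val responses = (1..requestCount).map { id ->
      val response = CompletableFuture<String>()
      // The timer models a non-blocking query: the callback runs on the
      // event loop once the "result" is ready; nothing blocks in the meantime.
      eventLoop.schedule(Runnable {
        handlerThreads.add(Thread.currentThread().name)
        response.complete("response #$id")
      }, queryMillis, TimeUnit.MILLISECONDS)
      response
    }
    // Only the caller waits here, so that it can collect the outcome.
    CompletableFuture.allOf(*responses.toTypedArray()).get(10, TimeUnit.SECONDS)
    return handlerThreads.toSet()
  } finally {
    eventLoop.shutdown()
  }
}
```

Calling simulateRequests(100, 50) should report that a single thread handled all 100 responses, which is the property the event loop model relies on.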

Arguably, the trade-off in choosing Vert.x over a Spring Boot-based web back-end is the required shift in developer perspective. A typical three-layer Spring Boot app (web endpoints, services, data access) is a known quantity: the logic is generally straightforward to follow and debug, and the Spring and Hibernate stacks are very popular, with many resources and examples available. The asynchronous nature of Vert.x, by contrast, can complicate code unnecessarily if its efficiency and scalability benefits are not required. With that in mind, this article compares the performance and developer experience of Vert.x against Spring Boot by taking a working Spring Boot project and adapting it into a functionally equivalent Vert.x back-end.

This case study is inspired by a project I worked on with my friends at AuthentiGATE. They provide an e-commerce platform for selling and scanning tickets at some of Canada’s largest events. In particular, their scanning system that processes tickets and badges from handheld scanners is an ideal candidate to evaluate Vert.x: it handles a large volume of requests that need to be responded to quickly with the majority of its response time spent waiting for SQL query results. The AuthentiGATE scanner API is a Spring Boot-based application, so we’ll use a toy version of that and adapt it into Vert.x to compare.

Scenario

We’ll simplify the domain for this exercise and only look at the scanning of tickets. When a scanner logs in, they are assigned to scan for a specific location. Each location will have its own set of rules where if any one of them is satisfied, the scan is successful. Otherwise, the scan will be rejected.

All configuration for the current location is held in memory for quick access (the Location Configuration in the figure above). This includes the configuration for how to parse the scanned barcode and all possible scanning rules (e.g., tickets of type X can be scanned for entry once per day to a maximum of three times over the duration of an event).

Once a list of relevant rules has been determined for a scanned barcode, each one will be evaluated until one succeeds. If no rule evaluates successfully, or if there are no rules that apply to the ticket, it will be rejected. Each rule evaluation will need to retrieve a set of previous scans for that ticket. Since there are some minor variations between rules (e.g., whether scans from all locations are considered or just the current one), each rule is responsible for querying the database for the previous scans it needs to make a decision.
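The rule semantics described above can be sketched as follows. Every name here (Ticket, PreviousScan, DailyEntryRule, isAccepted) is hypothetical and simplified; the sketch models only the example rule mentioned earlier (once per day, up to a maximum number of scans per event) and takes the previous scans as an argument instead of querying a database.

```kotlin
import java.time.LocalDate

// Hypothetical, simplified domain types for illustration only.
data class Ticket(val barcode: String, val type: String)
data class PreviousScan(val barcode: String, val date: LocalDate)

// Example rule: tickets of one type may enter `maxPerDay` times per day,
// up to `maxPerEvent` times over the whole event.
class DailyEntryRule(
  private val ticketType: String,
  private val maxPerDay: Int,
  private val maxPerEvent: Int
) {
  fun appliesTo(ticket: Ticket): Boolean = ticket.type == ticketType

  fun evaluate(today: LocalDate, previousScans: List<PreviousScan>): Boolean =
    previousScans.count { it.date == today } < maxPerDay &&
      previousScans.size < maxPerEvent
}

// A scan is accepted if any applicable rule succeeds. `any` returns false for
// an empty list, so a ticket with no applicable rules is rejected.
fun isAccepted(
  ticket: Ticket,
  today: LocalDate,
  rules: List<DailyEntryRule>,
  previousScans: List<PreviousScan>
): Boolean =
  rules.filter { it.appliesTo(ticket) }.any { it.evaluate(today, previousScans) }
```

In the real application each rule fetches its own previous scans, which is exactly the part that blocks on the database.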

The existing application is written in a standard three-layer Spring Boot project with the API being served via Jersey and an embedded Tomcat web server (the Spring Boot default), the business logic living in the service layer, and data obtained from the PostgreSQL database using direct queries via JDBC.

Adjusting the Architecture for Vert.x

In order for our Spring application to work effectively in Vert.x and take advantage of its benefits, we need to refactor all the blocking code. For reference, the blocking parts of the process in the diagram above have been highlighted yellow. Some of the most common blocking situations to be aware of are database queries, file system operations, and API calls. In this particular scenario, we’re dealing only with database queries.

In the Spring Boot approach, we evaluate each rule sequentially. This imperative logic is easy to follow and debug when we have one thread per request that we can block on; however, it must be refactored for a non-blocking asynchronous framework like Vert.x. There, we’ll dispatch all our queries to the JDBC pool and await their responses. The queries will be evaluated in parallel, and the thread handling the web request (the ScanningVerticle) will be yielded to other requests. Once all the dispatched queries have responded, the web thread will pick the request back up as soon as it becomes available.
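The difference between the two evaluation strategies can be sketched with the JDK’s CompletableFuture as a stand-in for Vert.x futures. The RuleCheck alias and both function names are assumptions for illustration; each check represents one rule’s query plus its evaluation.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ExecutorService
import java.util.function.Supplier

// Hypothetical stand-in for "query previous scans, then evaluate the rule".
typealias RuleCheck = () -> Boolean

// Thread-per-request style: run the checks one at a time on the calling
// thread and stop at the first one that succeeds.
fun evaluateSequentially(checks: List<RuleCheck>): Boolean =
  checks.any { check -> check() }

// Asynchronous style: dispatch every check to a pool at once, then combine
// the results in a callback after all of them have completed.
fun evaluateInParallel(
  checks: List<RuleCheck>,
  pool: ExecutorService
): CompletableFuture<Boolean> {
  val pending = checks.map { check ->
    CompletableFuture.supplyAsync(Supplier { check() }, pool)
  }
  return CompletableFuture.allOf(*pending.toTypedArray())
    .thenApply { _ -> pending.any { future -> future.join() } }
}
```

One consequence of the second form is that every applicable rule is queried even when the first would have succeeded, so it trades some extra database work for lower latency. With a single rule per ticket, as in the benchmark later on, that cost does not arise.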

Comparing Implementations

The flow in the Spring Boot application is pretty straightforward and can be captured by the following simplified Kotlin snippet from the web and service layers.

@Component
@Path("scanner")
class ScannerEndpoints {
  // See below
  @Autowired
  private lateinit var scansService: ScansService

  @POST
  @Path("scan")
  @Produces("application/json")
  @Consumes("text/plain")
  fun processScan(barcode: String): ScanRecord {
    val location = determineLocationFromRequest()
    return scansService.processScan(barcode, location)
  }
}

@Service
class ScansService {
  @Autowired
  private lateinit var scansRepository: ScansRepository

  fun processScan(ticketBarcode: String, location: Location): ScanRecord {
    // Step #1: Determine the scan outcome 
    val scanRecord = evaluateScan(ticketBarcode, location)
    // Step #2: Save the outcome
    scansRepository.saveScan(scanRecord)
    return scanRecord
  }
    
  private fun evaluateScan(ticketBarcode: String, location: Location): ScanRecord {
    val ticket = parseTicketBarcode(ticketBarcode)
    val rules = location.determineRulesApplicableToTicket(ticket)
    for (rule in rules) {
      val previousScans = scansRepository.findAllApplicableToRule(rule)
      // We have success
      if (rule.evaluate(ticket, previousScans)) {
          return ScanRecord.accepted(/* scan details here */)
      }
    }
    // Nothing succeeded
    return ScanRecord.rejected(/* scan details here */)
  } 
}

A lot of the details have been removed, but all the blocking operations are depicted. Each request runs on a single thread, which makes the code straightforward to step through and to understand for anyone looking at it for the first time.

The Vert.x code, as expected, looks quite a bit different. Let’s start with the scanning verticle. If you’re familiar with Node and Express, then the structure of the code may look familiar. At the top of the start() method, we assemble the HTTP router and its callbacks. Within the handler for the scanning endpoint, we start by evaluating the outcome of the scan (accept/reject) and then save the outcome. The start() method ends with the logic required to create the HTTP server on port 8080.

class ScanningVerticle(val pool: JDBCPool) : AbstractVerticle() {
  private val objectMapper = jacksonObjectMapper()

  override fun start(startPromise: Promise<Void>) {
    val router = Router.router(vertx)
    
    // Configure the HTTP routing
    router
      .post("/scan")
      .consumes("text/plain")
      .produces("application/json")
      .handler { ctx ->
        val response = ctx.response()
        ctx.request().body { requestBody ->
          // Omitted: check that the requestBody extraction was successful

          val location = determineLocationFromRequest()
          val logic = ConsumptionLogic(location, pool)

          val barcode = requestBody.result().toString()
          logic
            // Step #1: Determine the scan outcome
            .evaluateScan(barcode)
            .onFailure { response.setStatusCode(500).end("Unexpected failure") }
            // onSuccess (rather than onComplete) so this only runs when evaluation succeeded
            .onSuccess { scanRecord ->
              // Step #2: Save the outcome
              logic.saveScan(scanRecord)
                .onFailure { response.setStatusCode(500).end("Unexpected failure") }
                .onSuccess { response.end(objectMapper.writeValueAsString(scanRecord)) }
            }
        }
      }
    
    // Start the Vert.x HTTP server
    vertx
      .createHttpServer()
      .requestHandler(router)
      .listen(8080) { http ->
          if (http.succeeded()) {
              startPromise.complete()
              println("HTTP server started on port 8080")
          } else {
              startPromise.fail(http.cause())
          }
      }
  }
}

Next, we look at the consumption logic, which we’ve condensed into the ConsumptionLogic class below. The evaluateScan() method calls the evaluateRule() method for each matched rule for the scanned ticket. Notice that evaluateRule() returns a Future (similar in concept to a promise in JavaScript), and CompositeFuture.all() will wait for all these futures to resolve before determining the next steps.

data class RuleEvaluation(
    val rule: Rule,
    val success: Boolean
)

class ConsumptionLogic(val location: Location, val pool: JDBCPool) {
  fun evaluateScan(barcode: String): Future<ScanRecord> {
    val promise = Promise.promise<ScanRecord>()
    val ticket = parseTicketBarcode(barcode)
    val rules = location.determineRulesApplicableToTicket(ticket)

    CompositeFuture
      // Wait on all rules to evaluate
      .all(rules.map { rule -> evaluateRule(ticket, rule)})
      .onFailure { promise.fail(it) }
      // onSuccess (rather than onComplete) so the promise is not completed again after a failure
      .onSuccess { composite ->
        val ruleEvaluations = composite.list<RuleEvaluation>()
        // Find the first rule with a successful outcome
        val firstMatch = ruleEvaluations.find { it.success }
        if (firstMatch != null) {
          promise.complete(ScanRecord.accepted(/* scan details here */))
        } else {
          // If a successful outcome doesn't exist in our results, then we reject the scan
          promise.complete(ScanRecord.rejected(/* scan details here */))
        }
      }
    return promise.future()
  }

  fun evaluateRule(ticket: Ticket, rule: Rule): Future<RuleEvaluation> {
    val promise = Promise.promise<RuleEvaluation>()
    val previousScansQuery = rule.buildPreviousScansQueryForTicket(ticket)
    pool
      .preparedQuery(previousScansQuery)
      .execute(/* Query parameters here */)
      .onFailure { promise.fail(it) }
      // onSuccess (rather than onComplete) so rows are only read when the query succeeded
      .onSuccess { previousScans ->
        promise.complete(RuleEvaluation(rule, rule.evaluate(ticket, previousScans)))
      }
    return promise.future()
  }

  fun saveScan(scanRecord: ScanRecord): Future<RowSet<Row>> {
    return pool
      .preparedQuery("INSERT INTO ...")
      .execute(/* Query parameters here */)
  }
}

Finally, we have the main() method that creates the JDBC pool and deploys an instance of the ScanningVerticle.

fun main(args: Array<String>) {
  val vertx = Vertx.vertx()
  val pool = JDBCPool.pool(
      vertx,
      JDBCConnectOptions()
          .setJdbcUrl("jdbc:postgresql://localhost:5432/database")
          .setUser("database")
          .setPassword("database"),
      PoolOptions()
        .setMaxSize(16)
  )
  
  vertx.deployVerticle(ScanningVerticle(pool))
}

Benchmarking the Implementations

We’ll use JMeter to simulate the same load on the Spring Boot and Vert.x back-ends. The following test plan will algorithmically generate random barcodes that are known to be accepted when scanned in both scenarios. We’ll simulate the same number of concurrent threads (users) for both the Spring Boot and Vert.x tests, with a one-second ramp-up and five loops. Both applications run on the same PC so that we can compare their relative performance. We will vary the number of concurrent threads and compare the results. Note that tickets in both cases will only ever have a single rule to evaluate, as this reflects the typical situation in practice.

We can see the response time and throughput are more-or-less the same up to about 50 threads. However, as we approach 100 threads, we’re nearing the terminal throughput for both back-ends: Spring Boot hits a maximum of around 295 requests/second versus Vert.x’s 450 (both at 1,000 threads). That’s an impressive increase in throughput of over 50% for Vert.x. Under load, Vert.x is also consistently faster to respond, with a tighter standard deviation. Once we hit the terminal throughput for both implementations at 200 threads, Vert.x is able to continue to respond to requests 46% faster, although that advantage shrinks as it is subjected to greater load.

If we run an additional load test where both Spring Boot and Vert.x back-ends are evaluated simultaneously then we can compare the relative response times between the two in the same graph. We’ll start by looking at the scenario where both back-ends are still below their terminal throughput at 50 threads.

This is consistent with the table above where both have near identical response times (average and standard deviation) and throughput. Next, we’ll run this simulation again at the point where both back-ends become “stressed” at 200 threads (Spring Boot is red; Vert.x is blue).

In this test, Spring Boot responded on average after 593 ± 482 ms while Vert.x responded on average after 382 ± 90.64 ms. Spring Boot had a throughput of 239.1 requests/second vs. Vert.x’s 349.5 (46% greater). Spring Boot’s response time deviation stands out in this scenario (over five times greater than Vert.x’s), with Vert.x handling the load much more consistently. You can also see that Vert.x is able to process its backlog of scans well before Spring Boot.
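For readers who want to check the percentages, the arithmetic behind the figures quoted above is short. The variable and function names below are just for illustration; the input values are the ones reported for the 200-thread run.

```kotlin
// Figures quoted above for the simultaneous 200-thread run.
val springThroughput = 239.1 // requests/second
val vertxThroughput = 349.5  // requests/second
val springStdDevMs = 482.0
val vertxStdDevMs = 90.64

// Relative gain of `improved` over `baseline`, as a percentage.
fun percentGain(baseline: Double, improved: Double): Double =
  (improved / baseline - 1.0) * 100.0

// Roughly 46%: the throughput advantage of Vert.x in this run.
val throughputGainPercent = percentGain(springThroughput, vertxThroughput)

// Roughly 5.3: how many times wider Spring Boot's response time deviation is.
val deviationRatio = springStdDevMs / vertxStdDevMs
```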

Discussion and Conclusion

From a developer experience point-of-view, working with Vert.x was an enjoyable challenge. Especially when coupled with Kotlin, it’s a very modern and pleasant stack to develop on. It requires a change in mindset from Spring Boot development, but the documentation and resources are well put together, and the intuitiveness and consistency of the framework made it easy to work with.

However, I would be concerned about introducing it to a team that wasn’t fully on-board, or whose experience skews more toward the junior side. It does take a time investment to get up to speed, and given its niche position, it will be more difficult to find training resources that can accommodate all skill levels. There’s also the risk that improperly implemented Vert.x code can be slower than threaded logic (e.g., if you accidentally block a Verticle’s thread). On the other hand, developers coming from the Node ecosystem needing to work on the JVM may find the transition to Vert.x more comfortable than Spring Boot.
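The risk of blocking the event loop can be illustrated without Vert.x. In this minimal sketch a single-threaded JDK executor stands in for a verticle’s thread, and the function name and event labels are assumptions for illustration.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical helper: returns the order in which events were recorded on a
// single "event loop" thread when one handler blocks for `blockMillis`.
fun runWithBlockingHandler(blockMillis: Long): List<String> {
  val eventLoop = Executors.newSingleThreadExecutor()
  val events = CopyOnWriteArrayList<String>()
  try {
    // A handler that blocks (for example, on a synchronous JDBC call)
    // occupies the only thread for the whole duration of the wait...
    eventLoop.execute {
      events.add("slow handler started")
      Thread.sleep(blockMillis)
      events.add("slow handler finished")
    }
    // ...so an unrelated, cheap request queued behind it cannot start
    // until the blocking handler returns.
    eventLoop.execute { events.add("fast handler ran") }
  } finally {
    eventLoop.shutdown()
    eventLoop.awaitTermination(10, TimeUnit.SECONDS)
  }
  return events.toList()
}
```

The fast handler always appears last in the returned list, however trivial its work is. In a thread-per-request server the same mistake would only delay the one request that made the blocking call. Vert.x offers worker verticles and executeBlocking for work that cannot be made non-blocking.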

The most interesting result from the benchmarks is that the speed and throughput results are nearly identical until the back-ends begin to approach their terminal throughput. Even while running on a regular desktop PC, Spring Boot was able to handily process 50 requests at a time. If we assume that the average human scanner is only able to scan a barcode every three seconds, then that would mean that, in theory, a commodity PC could process scans from 150 active scanners and still respond in less than 40ms. Consequently, there’s a strong case for Spring Boot in this scenario: the Spring Boot code is easier to follow with fewer parts to maintain, and since the load levels at which Spring Boot hits its terminal throughput are quite high, incorporating Vert.x is likely not advantageous.
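One way to read the estimate above is as a request rate. The sketch below uses a hypothetical helper and the approximate Spring Boot peak quoted in the benchmark section; it is back-of-the-envelope arithmetic rather than a capacity model.

```kotlin
// Request rate generated by a fleet of scanners that each scan once per interval.
fun requestsPerSecond(activeScanners: Int, secondsBetweenScans: Double): Double =
  activeScanners / secondsBetweenScans

// 150 scanners, one scan every three seconds: 50 requests/second.
val estimatedLoad = requestsPerSecond(150, 3.0)

// Approximate Spring Boot peak throughput reported in the benchmark section.
val springBootPeak = 295.0

// Roughly 5.9: how far the estimated load sits below that peak.
val headroom = springBootPeak / estimatedLoad
```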

That being said, since it’s possible to develop Vert.x endpoints alongside Spring Boot within the same codebase, it would be possible to migrate specific endpoints to Vert.x as the need arises. For example, we could use Gradle to generate a separate fat JAR for each entry-point (Spring Boot and Vert.x) with each running on different containers. In fact, the Spring Boot and Vert.x applications benchmarked above were completed in the same project, just running the main() of two different classes.

For future work, I would be interested in applying Vert.x to a situation with a large volume of data points to process (e.g., internet-of-things or analytical event streams). Based on how gracefully Vert.x handled heavy load, it would be interesting to evaluate it against other options in a similar manner, in particular with the backing of GraalVM, which Vert.x also supports.

This article was first published here