Recently, I worked on a Playwright Java project. One of Playwright's characteristics is its single-threaded nature: we have to use a Playwright instance in a single thread at a time. For performance's sake, we need to overcome this limit.

To get to the point, in the following demo, we'll expose an HTTP endpoint that screenshots a web page given its URL. And we'll build various solutions, along with their benchmarks when it's useful to do so. These solutions will use the following `Browser` interface, and they differ only by their implementation.

```java
public interface Browser extends AutoCloseable {

    Screenshot screenshot(URL url);
}

public interface Screenshot {

    byte[] asByteArray();

    String mimeType();
}
```

## Unsafe solution

The first and unsafe (i.e., not thread-safe) solution could be the following one.

```java
public final class UnsafePlaywright implements Browser {

    private final Playwright playwright;
    private final Page page;

    public UnsafePlaywright() {
        this(Playwright::chromium);
    }

    public UnsafePlaywright(final Function<Playwright, BrowserType> browserTypeFn) {
        this(
            browserTypeFn,
            Playwright.create()
        );
    }

    // this constructor pre-instantiates Playwright objects in order to avoid delay
    UnsafePlaywright(final Function<Playwright, BrowserType> browserTypeFn, final Playwright playwright) {
        this(
            playwright,
            browserTypeFn.apply(playwright).launch().newContext().newPage()
        );
    }

    UnsafePlaywright(final Playwright playwright, final Page page) {
        this.playwright = playwright;
        this.page = page;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        this.page.navigate(url.toString());
        return new Png(this.page.screenshot());
    }

    @Override
    public void close() throws Exception {
        this.playwright.close();
    }
}
```

This solution by itself is useless except in a single-threaded context. So, the next step is to build a thread-safe solution.

## Lock-based solution

This solution implements the `Browser` interface considering the single-threaded nature of Playwright. It composes the `UnsafePlaywright` and the general-purpose `LockBasedBrowser`.

```java
public final class LockBasedPlaywright implements Browser {

    private final Browser origin;

    public LockBasedPlaywright() {
        this(Playwright::chromium);
    }

    public LockBasedPlaywright(final Function<Playwright, BrowserType> browserTypeFn) {
        this(
            new LockBasedBrowser(
                new UnsafePlaywright(browserTypeFn)
            )
        );
    }

    LockBasedPlaywright(final Browser origin) {
        this.origin = origin;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        return this.origin.screenshot(url);
    }

    @Override
    public void close() throws Exception {
        this.origin.close();
    }
}

final class LockBasedBrowser implements Browser {

    private final Browser origin;
    private final Lock lock;

    LockBasedBrowser(final Browser origin) {
        this(
            origin,
            new ReentrantLock()
        );
    }

    LockBasedBrowser(final Browser origin, final Lock lock) {
        this.origin = origin;
        this.lock = lock;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        this.lock.lock();
        try {
            return this.origin.screenshot(url);
        } finally {
            this.lock.unlock();
        }
    }

    @Override
    public void close() throws Exception {
        this.origin.close();
    }
}
```

We can reuse `LockBasedBrowser` to implement another lock-based `Browser` implementation on top of an unsafe one. As an example, Selenium (a library similar to Playwright) is also not thread-safe by default.

Furthermore, the pattern used to compose objects is the alias pattern.

According to this implementation, all the screenshot requests will be serialized because of the lock. This will impact performance.

## Benchmark

We'll do three benchmarks with siege.
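siege needs an endpoint to hit. The article does not show the HTTP layer, so here is a minimal, hypothetical sketch using the JDK's built-in `com.sun.net.httpserver`, with the `Browser` call stubbed out by fixed bytes to keep it self-contained; the class and handler names are assumptions, not the article's code.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;

// Hypothetical sketch of the screenshot endpoint that siege hits.
// The Browser call is stubbed with fixed bytes to keep the sketch standalone.
class ScreenshotServer {

    public static void main(String[] args) throws Exception {
        var server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/screenshot", exchange -> {
            // In the real service: parse the "url" query parameter and call
            // browser.screenshot(new URL(url)).asByteArray() on a Browser instance.
            byte[] body = "fake-png".getBytes();
            exchange.getResponseHeaders().set("Content-Type", "image/png");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        // Self-request once to show the round trip works.
        var endpoint = URI.create(
            "http://localhost:" + server.getAddress().getPort()
                + "/screenshot?url=http%3A%2F%2Flocalhost%3A8080"
        ).toURL();
        try (var in = endpoint.openStream()) {
            System.out.println(new String(in.readAllBytes()));
        }
        server.stop(0);
    }
}
```

Any of the `Browser` implementations below can be plugged into the handler without changing the HTTP layer.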
```
$ siege -c 25 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:                250 hits
Availability:             100.00 %
Elapsed time:              28.61 secs
Data transferred:          67.19 MB
Response time:              2.72 secs
Transaction rate:           8.74 trans/sec
Throughput:                 2.35 MB/sec
Concurrency:               23.80
Successful transactions:     250
Failed transactions:           0
Longest transaction:        2.94
Shortest transaction:       0.20

$ siege -c 50 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:                500 hits
Availability:             100.00 %
Elapsed time:              57.03 secs
Data transferred:         134.37 MB
Response time:              5.42 secs
Transaction rate:           8.77 trans/sec
Throughput:                 2.36 MB/sec
Concurrency:               47.50
Successful transactions:     500
Failed transactions:           0
Longest transaction:        5.83
Shortest transaction:       0.18

$ siege -c 100 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:               1000 hits
Availability:             100.00 %
Elapsed time:             115.19 secs
Data transferred:         268.74 MB
Response time:             10.95 secs
Transaction rate:           8.68 trans/sec
Throughput:                 2.33 MB/sec
Concurrency:               95.08
Successful transactions:    1000
Failed transactions:           0
Longest transaction:       11.67
Shortest transaction:       0.18
```

The siege `-c` parameter sets the number of concurrent users, while the `-r` parameter sets how many requests each user will make.

The throughput is pretty stable because that's how much the single instance can handle. And, as a consequence, by increasing concurrent users, the response time increases. The behavior of this implementation is as we expected.

## On-demand solution

The next solution tries to overcome the single-thread limit in a naive way: on each request, it will create a new browser instance. This implementation composes the `UnsafePlaywright` with the general-purpose `OnDemandBrowser`.

```java
public final class OnDemandPlaywright implements Browser {

    private final Browser origin;

    public OnDemandPlaywright() {
        this(Playwright::chromium);
    }

    public OnDemandPlaywright(final Function<Playwright, BrowserType> browserTypeFn) {
        this(
            new OnDemandBrowser(
                () -> new UnsafePlaywright(browserTypeFn)
            )
        );
    }

    OnDemandPlaywright(final Browser origin) {
        this.origin = origin;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        return this.origin.screenshot(url);
    }

    @Override
    public void close() throws Exception {
        this.origin.close();
    }
}

final class OnDemandBrowser implements Browser {

    private final Supplier<Browser> browserSupplier;

    OnDemandBrowser(final Supplier<Browser> browserSupplier) {
        this.browserSupplier = browserSupplier;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        return this.browserSupplier.get().screenshot(url);
    }

    @Override
    public void close() throws Exception {
        // on-demand instances are created per request and never tracked,
        // so there is nothing to close here
    }
}
```

According to this implementation, all the screenshot requests will be parallelized. On the surface, this seems fine. In fact, this approach is the worst: each request creates a new `UnsafePlaywright` instance, so the number of browser objects is unbounded. This means the process will run out of memory in case of a requests spike. Furthermore, we'll also delay each request because of the instantiation process.

## Benchmark

The following benchmark expresses the weaknesses of this approach, which show up even with a few requests.

```
$ siege -c 25 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:                250 hits
Availability:             100.00 %
Elapsed time:              68.59 secs
Data transferred:          67.29 MB
Response time:              6.56 secs
Transaction rate:           3.64 trans/sec
Throughput:                 0.98 MB/sec
Concurrency:               23.91
Successful transactions:     250
Failed transactions:           0
Longest transaction:        9.28
Shortest transaction:       3.94
```

## Building on top of the previous approaches

The previous two approaches have both pros and cons.
| | Pros | Cons |
| --- | --- | --- |
| Lock-based solution | Pre-instantiated Playwright objects reused across requests; predictable resource consumption | Serialized requests handling |
| On-demand solution | Parallelized requests handling | Delayed requests due to the Playwright instantiation process; unpredictable resource consumption |

Well, it seems they compensate for each other. And this suggests another approach that mitigates the weaknesses and boosts the strengths. Indeed, we can say that a set of pre-instantiated `Browser` objects consumes a predictable amount of memory. And at the same time, they can parallelize a fixed number of requests.

## Pool solution

The aforesaid ideas translate to a pool of `Browser` objects.

In the following implementation, we'll compose the general-purpose `PoolBrowser` and `LockBasedPlaywright`. We used the latter because `Browser` objects in the pool are shared among the requests, so we need thread safety.

```java
public final class PoolPlaywright implements Browser {

    private final Browser origin;

    public PoolPlaywright() {
        this(8);
    }

    public PoolPlaywright(final Integer size) {
        this(
            size,
            Playwright::chromium
        );
    }

    public PoolPlaywright(final Integer size, final Function<Playwright, BrowserType> browserTypeFn) {
        this(
            () -> new LockBasedPlaywright(browserTypeFn),
            size
        );
    }

    PoolPlaywright(final Supplier<Browser> browserSupplier, final Integer size) {
        this(
            new PoolBrowser(
                browserSupplier,
                size
            )
        );
    }

    PoolPlaywright(final Browser origin) {
        this.origin = origin;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        return this.origin.screenshot(url);
    }

    @Override
    public void close() throws Exception {
        this.origin.close();
    }
}

final class PoolBrowser implements Browser {

    private final List<Browser> pool;
    private final AtomicInteger last;

    PoolBrowser(final Supplier<Browser> browserSupplier, final Integer size) {
        this(
            x -> browserSupplier.get(),
            size
        );
    }

    PoolBrowser(final IntFunction<Browser> browserFn, final Integer size) {
        this(
            IntStream.range(0, size)
                .mapToObj(browserFn)
                .toList(),
            new AtomicInteger(-1)
        );
    }

    PoolBrowser(final List<Browser> pool, final AtomicInteger last) {
        this.pool = pool;
        this.last = last;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        return this.browser().screenshot(url);
    }

    private Browser browser() {
        var max = this.pool.size();
        return this.pool.get(
            this.last.accumulateAndGet(0, (left, right) -> left + 1 < max ? left + 1 : 0)
        );
    }

    @Override
    public void close() throws Exception {
        for (var browser : this.pool) {
            browser.close();
        }
    }
}
```

This solution seems to resolve all our issues, but that's not true. Indeed, in case of a requests spike, we cannot predict how much time a request will wait before being processed. That's because each request will be processed by a single `LockBasedBrowser` instance, and each instance will serialize the requests it processes. So, in case of a requests spike, there will be too much contention on the single `Browser` instance. It's like having an unbounded queue of requests per instance.

So, to resolve this issue, we can integrate a bound. In this way, we can guarantee we process, at most, a fixed number of requests. And this is useful because we can better predict how much time a screenshot will take. Overall, we are improving the software's predictability and quality.

## Semaphore solution

We have many choices to implement the bounded behavior. One of them is to use a `Semaphore` to limit access to a single `LockBasedBrowser` instance.
```java
public final class SemaphorePlaywright implements Browser {

    private final Browser origin;

    public SemaphorePlaywright() {
        this(Playwright::chromium);
    }

    public SemaphorePlaywright(final Function<Playwright, BrowserType> browserTypeFn) {
        this(
            browserTypeFn,
            32
        );
    }

    public SemaphorePlaywright(final Function<Playwright, BrowserType> browserTypeFn, final Integer maxRequests) {
        this(
            new SemaphoreBrowser(
                new LockBasedPlaywright(browserTypeFn),
                maxRequests
            )
        );
    }

    SemaphorePlaywright(final Browser origin) {
        this.origin = origin;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        return this.origin.screenshot(url);
    }

    @Override
    public void close() throws Exception {
        this.origin.close();
    }
}

final class SemaphoreBrowser implements Browser {

    private final Browser origin;
    private final Semaphore semaphore;

    SemaphoreBrowser(final Browser origin, final Integer maxRequests) {
        this(
            origin,
            new Semaphore(maxRequests)
        );
    }

    SemaphoreBrowser(final Browser origin, final Semaphore semaphore) {
        this.origin = origin;
        this.semaphore = semaphore;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        if (this.semaphore.tryAcquire()) {
            try {
                return this.origin.screenshot(url);
            } finally {
                this.semaphore.release();
            }
        }
        throw new IllegalStateException("Unable to screenshot. Maximum number of requests reached");
    }

    @Override
    public void close() throws Exception {
        this.origin.close();
    }
}
```

## Pool and semaphore solution

Finally, we need to compose the last two `Browser` implementations.
```java
public final class PoolSemaphorePlaywright implements Browser {

    private final Browser origin;

    public PoolSemaphorePlaywright() {
        this(
            8,
            32
        );
    }

    public PoolSemaphorePlaywright(final Integer poolSize, final Integer maxRequestsPerBrowser) {
        this(
            poolSize,
            maxRequestsPerBrowser,
            Playwright::chromium
        );
    }

    public PoolSemaphorePlaywright(
        final Integer poolSize,
        final Integer maxRequestsPerBrowser,
        final Function<Playwright, BrowserType> browserTypeFn
    ) {
        this(
            new PoolPlaywright(
                () -> new SemaphorePlaywright(browserTypeFn, maxRequestsPerBrowser),
                poolSize
            )
        );
    }

    PoolSemaphorePlaywright(final Browser origin) {
        this.origin = origin;
    }

    @Override
    public Screenshot screenshot(final URL url) {
        return this.origin.screenshot(url);
    }

    @Override
    public void close() throws Exception {
        this.origin.close();
    }
}
```

In this implementation, each browser in the pool has its own semaphore. We can also compose the other way around: a pool guarded by a single semaphore.

## Benchmark

The following three benchmarks are about the pool-based approach. The pool size is two.
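Before looking at the numbers, the pool's round-robin selection (the `accumulateAndGet` call in `PoolBrowser.browser()`) can be exercised in isolation. This minimal sketch uses the same initial value and the same lambda as the article's code, with a pool size of two:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Exercises the round-robin index logic used by PoolBrowser.browser():
// accumulateAndGet advances the counter and wraps it back to 0 at the pool size.
class RoundRobinDemo {

    public static void main(String[] args) {
        var max = 2;                      // pool size used in the benchmark
        var last = new AtomicInteger(-1); // same initial value as PoolBrowser
        var indices = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            indices.append(last.accumulateAndGet(0, (left, right) -> left + 1 < max ? left + 1 : 0));
        }
        System.out.println(indices); // prints 01010: requests alternate between the two browsers
    }
}
```

Note the accumulator ignores its second argument; the `0` passed to `accumulateAndGet` is just a placeholder, and the update is atomic, so concurrent requests never pick an index outside the pool.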
```
$ siege -c 25 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:                250 hits
Availability:             100.00 %
Elapsed time:              14.73 secs
Data transferred:          67.19 MB
Response time:              1.40 secs
Transaction rate:          16.97 trans/sec
Throughput:                 4.56 MB/sec
Concurrency:               23.81
Successful transactions:     250
Failed transactions:           0
Longest transaction:        1.60
Shortest transaction:       0.19

$ siege -c 50 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:                500 hits
Availability:             100.00 %
Elapsed time:              29.38 secs
Data transferred:         134.37 MB
Response time:              2.80 secs
Transaction rate:          17.02 trans/sec
Throughput:                 4.57 MB/sec
Concurrency:               47.58
Successful transactions:     500
Failed transactions:           0
Longest transaction:        3.01
Shortest transaction:       0.17

$ siege -c 100 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:               1000 hits
Availability:             100.00 %
Elapsed time:              58.90 secs
Data transferred:         268.74 MB
Response time:              5.60 secs
Transaction rate:          16.98 trans/sec
Throughput:                 4.56 MB/sec
Concurrency:               95.08
Successful transactions:    1000
Failed transactions:           0
Longest transaction:        6.01
Shortest transaction:       0.18
```

Its behavior is pretty stable, but with too many requests, the response time starts to degrade. It's important to note we have doubled the throughput compared to the lock-based solution. This implementation worked as expected.

At this point, we can benchmark the pool-and-semaphore approach. The pool size is two, and the maximum number of requests per browser is thirty-two. This means we can handle sixty-four requests in parallel. After reaching this limit, the service will return an unsuccessful response.
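The fail-fast behavior behind those unsuccessful responses comes from `Semaphore.tryAcquire`, which `SemaphoreBrowser` uses: once all permits are taken, it returns `false` immediately instead of blocking. A tiny standalone demonstration, here with a bound of two:

```java
import java.util.concurrent.Semaphore;

// Shows why requests beyond the bound fail fast: once all permits are taken,
// tryAcquire() returns false instead of blocking the caller.
class TryAcquireDemo {

    public static void main(String[] args) {
        var semaphore = new Semaphore(2);           // bound of two in-flight requests
        System.out.println(semaphore.tryAcquire()); // true: first permit taken
        System.out.println(semaphore.tryAcquire()); // true: second permit taken
        System.out.println(semaphore.tryAcquire()); // false: bound reached, request rejected
        semaphore.release();                        // a request completes
        System.out.println(semaphore.tryAcquire()); // true: a permit is available again
    }
}
```

In the service, the `false` branch becomes the `IllegalStateException` that `SemaphoreBrowser` throws, which the endpoint maps to an unsuccessful HTTP response.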
```
$ siege -c 25 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:                250 hits
Availability:             100.00 %
Elapsed time:              15.43 secs
Data transferred:          67.19 MB
Response time:              1.46 secs
Transaction rate:          16.20 trans/sec
Throughput:                 4.35 MB/sec
Concurrency:               23.71
Successful transactions:     250
Failed transactions:           0
Longest transaction:        2.09
Shortest transaction:       0.60

$ siege -c 50 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:                500 hits
Availability:             100.00 %
Elapsed time:              29.33 secs
Data transferred:         134.37 MB
Response time:              2.79 secs
Transaction rate:          17.05 trans/sec
Throughput:                 4.58 MB/sec
Concurrency:               47.52
Successful transactions:     500
Failed transactions:           0
Longest transaction:        3.03
Shortest transaction:       0.20

$ siege -c 100 -r 10 -b -H 'Accept:image/png' 'http://localhost:8080/screenshot?url=http%3A%2F%2Flocalhost%3A8080'

Transactions:                626 hits
Availability:              62.60 %
Elapsed time:              37.17 secs
Data transferred:         168.92 MB
Response time:              3.57 secs
Transaction rate:          16.84 trans/sec
Throughput:                 4.54 MB/sec
Concurrency:               60.14
Successful transactions:     626
Failed transactions:         374
Longest transaction:        3.97
Shortest transaction:       0.00
```

As expected, with this solution, we didn't improve performance but predictability. Indeed, the throughput is comparable to the previous solution. At the same time, the response time didn't degrade too much in case of a spike. Furthermore, the concurrency value in the last benchmark reflects the maximum number of parallelizable requests. The price to pay for greater predictability is a lower availability value (i.e., fewer successful responses).

Another option to implement the last solution is to use the event loop pattern, where each loop enqueues the screenshot request, then each instance in the pool dequeues a request and fulfills it.

## Conclusion

We can consider ourselves satisfied.
We overcame the single-thread limit while reaching a good level of performance. We also implemented and compared many solutions.

For simplicity's sake, we omitted a few features required in a production-ready system: timeouts, asynchronous requests, exception handling, and a self-healing implementation. Indeed, sometimes Playwright objects fail, and the objective of a self-healing `Browser` implementation is to restore them.

To conclude, OOP has a bad reputation in terms of performance. It's like a rule of thumb, but it's not true. With proper objects, we can achieve great performance without penalizing code elegance and readability. This is not at all obvious, and it's a great gift of OOP.
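As a closing sketch, the self-healing decorator mentioned above could look like the following. This is an assumption, not the article's code: it restates the `Browser` and `Screenshot` interfaces so the sketch compiles standalone, and the demo fakes a crashing browser instead of using Playwright. On failure, the decorator discards the broken instance, obtains a fresh one from the supplier, and retries once.

```java
import java.net.URI;
import java.net.URL;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Demo: the first fake browser crashes, the decorator replaces it and retries.
class SelfHealingDemo {

    public static void main(String[] args) throws Exception {
        var created = new int[]{0};
        Supplier<Browser> freshBrowser = () -> new Browser() {
            // only the very first created instance is broken
            private final boolean broken = created[0]++ == 0;

            @Override
            public Screenshot screenshot(URL url) {
                if (broken) {
                    throw new RuntimeException("browser crashed");
                }
                return new Screenshot() {
                    @Override public byte[] asByteArray() { return new byte[0]; }
                    @Override public String mimeType() { return "image/png"; }
                };
            }

            @Override
            public void close() { }
        };
        try (var browser = new SelfHealingBrowser(freshBrowser)) {
            // succeeds despite the first instance crashing
            System.out.println(browser.screenshot(URI.create("http://localhost:8080").toURL()).mimeType());
        }
    }
}

// Restated from the article so the sketch is self-contained.
interface Screenshot {
    byte[] asByteArray();
    String mimeType();
}

interface Browser extends AutoCloseable {
    Screenshot screenshot(URL url);
}

// Hypothetical self-healing decorator: on failure, replace the wrapped
// Browser with a fresh one from the supplier and retry the request once.
final class SelfHealingBrowser implements Browser {

    private final Supplier<Browser> supplier;
    private final AtomicReference<Browser> current;

    SelfHealingBrowser(final Supplier<Browser> supplier) {
        this.supplier = supplier;
        this.current = new AtomicReference<>(supplier.get());
    }

    @Override
    public Screenshot screenshot(final URL url) {
        try {
            return this.current.get().screenshot(url);
        } catch (RuntimeException e) {
            this.current.set(this.supplier.get());
            return this.current.get().screenshot(url);
        }
    }

    @Override
    public void close() throws Exception {
        this.current.get().close();
    }
}
```

A production version would also close the broken instance and bound the number of retries, but the decorator shape composes with the pool and semaphore wrappers just like the others.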