This article delves into the intricacies of Microservices Architecture, with a special emphasis on its vital component - data transmission, in the context of a proxyless approach. We explore the nuances, benefits, and challenges of adopting this method, where direct service-to-service communication plays a pivotal role. The discussion extends to how DevOps practices and the right strategic choices can significantly bolster this architecture. Furthermore, we'll navigate through the realm of Cloud-Native DevOps, scrutinizing specific practices and tools that are essential in cloud-native environments, across platforms like AWS, Azure, or Google Cloud. This exploration aims to shed light on how a proxyless architecture can be effectively implemented and optimized in these diverse cloud platforms.
In today's cloud-based microservice architectures, efficient data exchange is crucial, particularly as we often operate in environments where traditional proxy-servers like Nginx are not the optimal solution. This necessity arises from various factors, leading to a shift towards a proxyless approach in specific scenarios:
Understanding the use cases for a proxyless approach, particularly in the context of data compression, becomes essential for optimizing microservices' efficiency and performance.
"To compress, or not to compress, that is the question: Whether 'tis nobler in the bytes to suffer. The pings and lags of outrageous file sizes, Or to take arms against a sea of data transfer, And by opposing, compress them."
Data transfer has become an integral part of everyday life, akin to the air we breathe - its value is only truly appreciated when lacking. As IT professionals, we are responsible for this 'digital air,' ensuring its purity and uninterrupted flow. We must focus on maintaining the quality and speed of data transmission, much like environmentalists fight for clean air in cities. Our task extends beyond merely sustaining a steady stream of data; we must also ensure its efficiency and security within the complex network ecosystems of microservices. As a bridge connecting the dots in a Microservices Architecture, the network is also a challenging terrain where data faces numerous obstacles that can impact performance, efficiency, and security.
Note: Understanding these challenges is pivotal in appreciating the solutions and strategies that can be employed to overcome them, including the crucial role that data compression plays in this equation.
Data compression is a technique to reduce data size to save storage space or transmission time. For microservices, this means that when one service sends data to another, instead of sending the raw data, it first compresses the data, sends the compressed data across the network, and then the receiving service decompresses it back to the original format. This will be the working scenario that we will focus on.
Data compression reduces data size, leading to faster transfers and better performance. However, it can also introduce data loss, format inconsistencies, and performance problems of its own.
Note: This article will address some of these problems to ensure reliable and efficient data exchange between microservices.
When considering data exchange between microservices, choosing the appropriate data compression algorithm is crucial to ensure efficiency and performance. Let's dive into the most common types of data compression algorithms.
| Lossless Compression Algorithms | Lossy Compression Algorithms |
|---|---|
| These algorithms compress data to allow the original data to be perfectly reconstructed from the compressed data. | These algorithms compress data by removing unnecessary or less important information, meaning the original data cannot be perfectly reconstructed from the compressed data. |
| Examples include Huffman coding, Lempel-Ziv (LZ77 and LZ78), and Deflate, which are also used in GZIP. | Examples commonly used in audio and video compression include JPEG for images and MP3 and AAC for audio. |
| Another notable example is Snappy, which prioritizes speed and efficiency, making it a suitable choice for scenarios where quick data compression and decompression are more critical than achieving the highest compression level. | |
The final choice depends on the specific needs of the project and the characteristics of the data being exchanged between microservices.
GZIP is the optimal choice of data compression and decompression software for our purposes. It uses the Deflate algorithm, which combines the LZ77 algorithm and Huffman coding. Since Deflate is a lossless data compression algorithm, the original data can be perfectly reconstructed from the compressed data. Everyone in the business wants to receive accurate data and avoid mistakes.
Verdict: Gzip is versatile and practical, so it is widely used for compressing text files and is recognized as both a lossless compression algorithm and a text compression tool.
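Because Deflate is lossless, a GZIP round trip must reproduce the original bytes exactly. Here is a minimal Java sketch using the JDK's built-in java.util.zip streams, the same classes our JVM services rely on (the class name GzipRoundTrip is just for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {

    // Compress a UTF-8 string with GZIP (Deflate = LZ77 + Huffman coding).
    static byte[] compress(String text) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray();
    }

    // Decompress a GZIP byte array back into the original string.
    static String decompress(byte[] compressed) throws Exception {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        String original = "{\"id\":1,\"name\":\"user\"}";
        // Lossless: the round trip returns the input byte-for-byte.
        System.out.println(decompress(compress(original)).equals(original)); // prints "true"
    }
}
```

This is exactly the guarantee the lossy algorithms in the table above cannot give.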
You can skip the Microservice Creating section and jump straight to the test cases or summary part.
Now that we've brushed up on the basic concepts, it's time to start creating.
We will zoom in on two regular links in the intricate tapestry of microservice architectures, where hundreds of interconnected services weave a complex network. These two, isolated from the vast ecosystem, distinct microservices will serve as our focal points, allowing us to dissect and deeply understand a specific study case.
As mentioned, we create two independent Spring Boot Kotlin microservices:
Hint: H2 was chosen as the in-memory Java SQL database for our study case example. The unique feature of H2 is that it exists only during the application's runtime and does not require a separate installation. It is a lightweight, fast, and convenient option for development and testing purposes. Note that all database data will be erased upon application restart, as it is temporary.
For quick project generation, use the start.spring.io utility. Let's start.
Hint: It’s important to keep a close eye on your dependencies and only install what is necessary for your needs.
I have already made two configurations for you: service-db / service-mapper
Download the projects and unzip them.
Once you have unzipped and indexed the projects, add some configuration to your application.properties files.
First, we configure the data source settings in service-db:
# DataSource Configuration
spring.datasource.url=jdbc:h2:mem:testdb
spring.datasource.driver-class-name=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=
# JPA/Hibernate Configuration
spring.jpa.database-platform=org.hibernate.dialect.H2Dialect
# H2 Database Console Configuration
spring.h2.console.enabled=true
# Server Configuration
server.port=8081
server.servlet.context-path=/db
application.properties in service-mapper:
# Server Configuration
server.port=8080
server.servlet.context-path=/mapper
# Microservice Configuration
db.microservice.url=http://localhost:8081/db/
user.end.point=user
Hint: These variables (db.microservice.url and user.end.point) are used by service-mapper to build the URLs of the service-db endpoints.
Our project will follow a specific standard structure.
src
└── main
└── kotlin
└── com
└── example
└── myproject
├── config
│ └── AppConfig.kt
├── controller
│ └── MyController.kt
├── repository
│ └── MyRepository.kt
├── service
│ └── MyService.kt
├── dto
│ └── MyUserDto.kt
└── entity
└── MyEntity.kt
In our course, we will be implementing the following points:
| service-db implementation | service-mapper implementation |
|---|---|
| User Entity. | User DTO. |
| Service for Data Compression. | Decompression Service - Logic for Data Decompression. |
| POST/GET Controllers to save/get the Entity to/from the database. | GET Controller - Get Raw / Compressed Data from service-db. |
Like sketching a blueprint before building a rocket, we start coding by shaping up DTO/Entity as my first step.
data class UserEntity in service-db and UserDto in service-mapper:
// service-db
@Entity
data class UserEntity(
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
val id: Long? = null,
val name: String,
val creativeDateTime: LocalDateTime? = LocalDateTime.now(),
)
// service-mapper
data class UserDto(
val id: Long,
val name: String,
val creativeDateTime: LocalDateTime,
)
NOTE: For the service-db in our project, we will use the @Id and @GeneratedValue(strategy = GenerationType.IDENTITY) annotations in the Entity to handle the unique identification of each entity. With these annotations, the database automatically assigns unique IDs when new entities are created, simplifying the entity management process.
Let's shift our attention toward service-db and create a Classic Duet comprising a Repository and a Controller.
package com.compress.servicedb.repository

@Repository
interface UserRepository : CrudRepository<UserEntity, Long>

@RestController
@RequestMapping("/user")
class EntityController(
private val repository: UserRepository
) {
@GetMapping
fun getAll(): MutableIterable<UserEntity> = repository.findAll()

@PostMapping
fun create(@RequestBody user: UserEntity) = repository.save(user)

@PostMapping("/bulk")
fun createBulk(@RequestBody users: List<UserEntity>): MutableIterable<UserEntity> = repository.saveAll(users)
}
Please verify with Postman that everything works, and then proceed with the following steps.
POST: localhost:8081/db/user/bulk
POST: localhost:8081/db/user/
GET: localhost:8081/db/user
NOTE: We need to ensure our services are configured and communicating effectively. Our current method of retrieving data from the service-db is rudimentary but serves as our first test. Once we have enough data, we can measure retrieval volume and speed to establish a baseline for future comparisons.
Let's fly to service-mapper and create a Controller and a Service.
package com.compress.servicemapper.service
@Service
class UserService(
val restTemplate: RestTemplate,
@Value("\${db.microservice.url}") val dbMicroServiceUrl: String,
@Value("\${user.end.point}") val endPoint: String
) {
fun fetchData(): List<UserDto> {
val responseType = object : ParameterizedTypeReference<List<UserDto>>() {}
return restTemplate.exchange("$dbMicroServiceUrl/$endPoint", HttpMethod.GET, null, responseType).body ?: emptyList()
}
}
package com.compress.servicemapper.controller
@RestController
@RequestMapping("/user")
class DataMapperController(
val userTransformService: UserService
) {
@GetMapping
fun getAll(): ResponseEntity<List<UserDto>> {
val originalData = userTransformService.fetchData()
return ResponseEntity.ok(originalData)
}
}
In the service mapper, we configure a RestTemplate to facilitate communication between different parts of our application. This is done by creating a configuration class annotated with @Configuration containing the bean definition for our RestTemplate.
Here's the code:
@Configuration
class RestTemplateConfig {
@Bean
fun restTemplate(): RestTemplate {
return RestTemplateBuilder().build()
}
}
@Configuration is a Spring annotation that indicates that the class has @Bean definitions. Beans are objects the Spring IoC (Inversion of Control) container manages. In this case, our RestTemplate bean is defined in the RestTemplateConfig class.
The @Bean annotation in Spring identifies a method that produces a bean for the Spring container. In our example, the restTemplate method creates a new RestTemplate instance used for HTTP operations in Spring. This configuration lets you define RestTemplate once and inject it into necessary components, making your code cleaner and easier to maintain.
Without this configuration, you can still use RestTemplate, but you will need to create an instance of it at each place where you use it. This can lead to code duplication.
Postman is ready to go. Make sure that you have created some users in service-db. As a result, you will get the same users as stored in service-db.
localhost:8080/mapper/user
It's time to implement Compression in service-db. The heart of our Compression will be in CompressDataService. Take a look at the code:
@Service
class CompressDataService(
private val objectMapper: ObjectMapper
) {
fun compress(data: Iterable<UserEntity>): ByteArray {
val jsonString = objectMapper.writeValueAsString(data)
val byteArrayOutputStream = ByteArrayOutputStream()
GZIPOutputStream(byteArrayOutputStream).use { gzipOutputStream ->
val bytes = jsonString.toByteArray(StandardCharsets.UTF_8)
gzipOutputStream.write(bytes)
}
return byteArrayOutputStream.toByteArray()
}
}
The compress function takes a collection of UserEntity objects as input and returns a ByteArray. The primary goal of this function is to compress data to save space during transmission or storage. Here's what happens inside the function: the collection is serialized into a JSON string with ObjectMapper, the string is converted to UTF-8 bytes, those bytes are written through a GZIPOutputStream into a ByteArrayOutputStream, and the accumulated compressed bytes are returned.
Thus, the compress function transforms a collection of objects into a compressed Byte Array, saving space during data transmission or storage.
The Controller is so banal that it does not need comments.
@RestController
@RequestMapping("/user")
class CompressedDataController(
private val repository: UserRepository,
private val compressDataService: CompressDataService,
) {
@GetMapping("/compressed")
fun fetchCompressedData(): ByteArray {
val data: Iterable<UserEntity> = repository.findAll()
return compressDataService.compress(data)
}
}
Test it yourself, please. You'll receive something special. If you're a young budding engineer, this may be the first time you've seen a response like this. That's a ByteArray, babe.
GET: localhost:8081/db/user/compressed
Check this out: the data transfer volume is significantly lower at this stage. As the volume of similar-type data grows, the compression efficiency scales up, leading to increasingly better compression ratios.
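This scaling effect is easy to reproduce outside the services: GZIP's LZ77 stage replaces repeated substrings with back-references, so a long list of similar records compresses proportionally better than a short one. A hypothetical stand-alone Java sketch (the generated JSON merely mimics our user payloads):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressionScaling {

    // Size in bytes of the GZIP-compressed form of a string.
    static int gzipSize(String text) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bos.size();
    }

    // Build a JSON array of n similar user records.
    static String users(int n) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < n; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"id\":").append(i).append(",\"name\":\"user-").append(i).append("\"}");
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) throws Exception {
        for (int n : new int[]{1, 100, 1000, 10000}) {
            String json = users(n);
            int raw = json.getBytes(StandardCharsets.UTF_8).length;
            int zipped = gzipSize(json);
            // The savings percentage grows with the amount of repetitive data.
            System.out.printf("%5d users: %8d B raw, %7d B gzip (%.1f%% saved)%n",
                    n, raw, zipped, 100.0 * (raw - zipped) / raw);
        }
    }
}
```

On typical runs the savings for the 10,000-user payload are far higher than for the single-user one, matching what the test results below show.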
It's time to decompress our data in the service mapper. To achieve this, we will create a new controller and the heart of this microservice: the decompression service. Let's open our hearts.
import com.fasterxml.jackson.core.type.TypeReference
import com.fasterxml.jackson.databind.ObjectMapper
import java.io.ByteArrayInputStream
import java.util.zip.GZIPInputStream
@Service
class UserDecompressionService(
@Value("\${db.microservice.url}") val dbMicroServiceUrl: String,
@Value("\${user.end.point}") val endPoint: String,
val restTemplate: RestTemplate,
private val objectMapper: ObjectMapper
) {
fun fetchDataCompressed(): ResponseEntity<ByteArray> {
return restTemplate.getForEntity("$dbMicroServiceUrl$endPoint/compressed", ByteArray::class.java)
}
fun decompress(compressedData: ByteArray): ByteArray {
ByteArrayInputStream(compressedData).use { bis ->
GZIPInputStream(bis).use { gzip ->
return gzip.readBytes()
}
}
}
fun convertToUserDto(decompressedData: ByteArray): List<UserDto> {
return objectMapper.readValue(decompressedData, object : TypeReference<List<UserDto>>() {})
}
}
Together, these methods create a workflow within the service for retrieving compressed user data from a microservice, decompressing it, and converting it into a list of UserDto objects.
Controller: add this function. Here we see in action what was described above in the service.
@GetMapping("/decompress")
fun transformData(): ResponseEntity<List<UserDto>> {
val compressedData = userDecompressionService.fetchDataCompressed()
return try {
val decompressedData = userDecompressionService.decompress(compressedData.body!!)
val users = userDecompressionService.convertToUserDto(decompressedData)
ResponseEntity.ok(users)
} catch (e: Exception) {
ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build()
}
}
Please test with Postman that everything is 200 OK. You should receive the decompressed data one-to-one, exactly as from service-db.
localhost:8080/mapper/user/decompress
We have made significant progress in developing our two microservices, and we are now at an important milestone, similar to crossing the Rubicon. Our initial efforts have led us to build the blueprint of a globally renowned application. This pattern is the foundation of transformative platforms like Netflix's streaming service and Amazon's e-commerce ecosystem. These platforms leverage robust cloud systems to provide seamless, scalable, resilient services worldwide.
We have a wide variety of compression algorithms available for our test, and it would be beneficial to add another contender. As previously mentioned, Snappy claims to be more efficient in some scenarios, and we plan to test this theory. How?
We add the following dependency to the Gradle build files of both microservices:
implementation ("org.xerial.snappy:snappy-java:1.1.10.5")
1. Service:

@Service
class CompressDataSnappyService(
private val objectMapper: ObjectMapper
) {
fun compressSnappy(data: Iterable<UserEntity>): ByteArray {
val jsonString = objectMapper.writeValueAsString(data)
return Snappy.compress(jsonString.toByteArray(StandardCharsets.UTF_8))
}
}
2. Controller:
@RestController
@RequestMapping("/user")
class UserCompressedSnappyController(
private val repository: UserRepository,
private val compressDataService: CompressDataSnappyService,
) {
@GetMapping("/compressed-snappy")
fun fetchCompressedData(): ByteArray {
val data: Iterable<UserEntity> = repository.findAll()
return compressDataService.compressSnappy(data)
}
}
fun decompressSnappy(compressedData: ByteArray): ByteArray {
return Snappy.uncompress(compressedData)
}
fun fetchDataCompressedSnappy(): ResponseEntity<ByteArray> {
return restTemplate.getForEntity("$dbMicroServiceUrl$endPoint/compressed-snappy", ByteArray::class.java)
}
@GetMapping("/decompressed-snappy")
fun transformDataSnappy(): ResponseEntity<List<UserDto>> {
val compressedData = userDecompressionService.fetchDataCompressedSnappy()
return try {
val decompressedData = userDecompressionService.decompressSnappy(compressedData.body!!)
val users = userDecompressionService.convertToUserDto(decompressedData)
ResponseEntity.ok(users)
} catch (e: Exception) {
ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build()
}
}
The complete service-db / service-mapper projects can be downloaded from GitHub.
Let me introduce my test case scenarios and my view on the feasibility of using this approach globally.
Data compression is not merely programming magic; it's an art and science that allows us to achieve more with less. But how can we measure the true greatness of this art form? How can we ensure that our methods are functioning and functioning well?
Enter the concept of "Baseline Comparison" or "Reference Comparison." In simple terms, we create a starting point - a baseline - our uncompressed data in all its original, unedited glory. This raw data state will become our reference point for comparison and evaluation.
The test plan then proceeds through: 2. Data Transfer, 3. Results Validation, and 4. Real-Life Testing, with expected results, postconditions, and success criteria defined for each stage.
In our demo, we'll leverage Apache JMeter, a robust testing tool, to simulate microservices interactions and measure the benefits of data compression. JMeter is ideal for such purposes as it can simulate loads and test service performance. It has documentation, tons of plugins, and excellent support. We aim to showcase how data compression optimizes traffic and communication efficiency between services.
In all cases, we will apply a load of 1,000 users, each making a single request: 1,000 requests in total, each fetching 1/100/1,000/10,000 users.
This gives us a large body of statistical data and stress-tests our application's performance.
As we approach the testing track, our contenders are lining up, engines revving, each representing a unique approach to data transfer in the grand prix of microservice performance. Let's introduce our formula one teams:
Each API, a unique formula of technical prowess, stands ready to tackle the circuit, proving its mettle. As the flag waves, watch closely, for this race is not just about raw speed - it's about strategy, resource management, and endurance in the cloud arena, where every millisecond counts.
The result gives you the percentage reduction in size achieved by compression.
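For clarity, this is the formula applied to every measurement, sketched as a small hypothetical Java helper:

```java
public class CompressionRate {

    // Percentage reduction in size: (original - compressed) / original * 100.
    static double compressionRate(long originalBytes, long compressedBytes) {
        return (originalBytes - compressedBytes) * 100.0 / originalBytes;
    }

    public static void main(String[] args) {
        System.out.println(compressionRate(1000, 100)); // prints "90.0"
        System.out.println(compressionRate(100, 120));  // prints "-20.0"
    }
}
```

A negative value means the "compressed" payload is actually larger than the original, a case we will meet with very small responses.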
Regarding the most indicative percentiles (% Line), attention usually focuses on the 99th percentile, as it reflects the response time of the slowest requests. These values help determine whether the system meets its performance requirements and how resilient it is under high load.
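To make the percentile concrete, here is a hypothetical nearest-rank implementation. Note how the 99% line surfaces the slow outlier that an average would hide:

```java
import java.util.Arrays;

public class PercentileDemo {

    // Nearest-rank percentile: value at index ceil(p/100 * n) - 1 of the sorted samples.
    static long percentile(long[] samplesMs, double p) {
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        // 100 simulated response times: 99 fast requests and one slow outlier.
        long[] times = new long[100];
        for (int i = 0; i < 99; i++) times[i] = 10 + i % 5; // 10..14 ms
        times[99] = 900; // the outlier an average would smooth away
        System.out.println(percentile(times, 99));  // prints "14"
        System.out.println(percentile(times, 100)); // prints "900"
    }
}
```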
Constants during the test: the response payload sizes for one user, 100 users, 1,000 users, and 10,000 users.
The first race will be run under laboratory conditions: both microservices run on one CPU, on different ports, simulating Zero Latency Conditions. I believe we will one day achieve Seamless Data Flow, but as long as humanity keeps working on it, there will always be optimization work to do.
So, four laps: fetching 1/100/1,000/10,000 users.
Purely compressed data is not a use case in itself; we measure it only as a reference point to see where we stand.
Note: All Sheets in CSV can be downloaded by this public link.
Subtotal: In our analysis, we observed that our system cracks 1, 100, and 1,000 users like nuts, but it starts to experience delays when we increase the load to 10,000 users, which is why we should use pageable requests. JSON is the fastest response format, but the other teams are catching up quickly; the synthetic victory goes to JSON/Snappy. However, the most crucial factor is not necessarily speed but how much traffic we can save, and the winner in that regard may differ from what we initially expected.
One more important conclusion: for small responses, such as fetching a single user, we observe a negative compression rate. For such payloads compression offers no advantage; the output is larger than the input, and extra computing power is spent compressing and decompressing it. Therefore, always remember this case and conduct thorough testing to ensure optimal performance.
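The overhead is easy to verify: a GZIP stream always carries about 18 bytes of header and trailer plus Deflate block framing, so a single-user payload comes out larger than it went in. A minimal Java check (the payload contents are illustrative):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class SmallPayloadOverhead {

    // Size in bytes of the GZIP-compressed form of a byte array.
    static int gzipSize(byte[] data) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(data);
        }
        return bos.size();
    }

    public static void main(String[] args) throws Exception {
        byte[] oneUser = "{\"id\":1,\"name\":\"Bob\"}".getBytes(StandardCharsets.UTF_8);
        // For a single-user payload the "compressed" output is larger than the input.
        System.out.println(gzipSize(oneUser) > oneUser.length); // prints "true"
    }
}
```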
Local ping is usually the lowest because data is transmitted over short distances and is least affected by delays at routers and switches.
Latency increases when traffic crosses longer distances and goes through more routers and network nodes.
Intercontinental requests usually have the highest latency due to the data passing through submarine cables and intercontinental connections, significantly increasing the distance traveled and the number of intermediate nodes.
After setting the network latency, we can observe the outcome of our data compression taking shape. It's essential to remember that during transmission, several factors can affect the data: jitters, throttling, drops, packet loss, bandwidth limitation, congestion, latency spikes, routing and DNS issues, and so on.
In cases of longer lag, the benefits of highly compressed data become more apparent, and GZIP emerges as the clear winner. Snappy, which offers a well-balanced solution, performs well at low lag but falls short of GZIP at mid to high lag. Unfortunately, plain JSON handles lag poorly and falls far behind.
Please keep in mind that these diagrams can be accessed through the link provided to facilitate better analysis.
Reminder: on top of the timing results, remember that in some cases you also gain a compression rate of 80-95%.
To effectively identify and optimize traffic between microservices in proxyless applications, a strategy combining monitoring, traffic analysis, and data compression tools is essential. Here are the steps for developing such a strategy:
1. Identify the most frequently used APIs and those generating the most traffic; tools for monitoring and analyzing traffic are necessary for this.
2. After identifying the heavily loaded APIs, select an appropriate compression strategy.
3. In a proxyless environment, the focus shifts to application-level compression.
4. Technologies and requirements change, so regularly review and update the compression strategy.
This approach will enable you to efficiently determine and optimize traffic between microservices in proxyless environments, enhancing performance and reducing data transmission costs.
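One practical way to apply this strategy is to compress only above a size threshold, so small responses skip the overhead measured earlier. A hedged Java sketch: the 1 KB threshold and all names are assumptions to be tuned from your own monitoring data, and a real service would also have to signal compression to the caller (for example via a Content-Encoding header):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class ConditionalCompression {

    static final int THRESHOLD_BYTES = 1024; // assumption: tune per API from monitoring data

    // Compress the payload only when it is large enough to benefit from GZIP.
    static byte[] maybeCompress(byte[] payload) throws Exception {
        if (payload.length < THRESHOLD_BYTES) return payload; // small: send as-is
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(payload);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] small = "{\"id\":1}".getBytes(StandardCharsets.UTF_8);
        byte[] large = "{\"id\":1,\"name\":\"user\"},".repeat(500).getBytes(StandardCharsets.UTF_8);
        System.out.println(maybeCompress(small).length == small.length); // prints "true"
        System.out.println(maybeCompress(large).length < large.length);  // prints "true"
    }
}
```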