I was recently sent a coding challenge where I was required to design a messaging system, diagram that design, and attach a writeup explaining my design choices.
I based the structure of this system design on the principles of Enterprise Integration Patterns (enterpriseintegrationpatterns.com). The requirements were as follows:
- Ensure no messages are lost.
- Ensure corrupted or malformed messages trigger an alert of some kind.
- Permit priority-based messaging.
- Route and process messages by type.
- Collect and log information on each message, such as processing time, the number of messages processed, and other related details.
- The first action the messaging system takes is to prepare the message for transit by extending and modifying its header (XML format, in this instance). It adds a timestamp to track processing time and a GUID to ensure uniqueness as well as to track the total number of unique messages processed. A hash is also generated, using the GUID as the key and the message body as the input, to allow message integrity verification upon arrival at the target destination. A priority level can also be declared in the header.
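To make the header-preparation step concrete, here is a minimal Python sketch. It uses a dict-based envelope rather than XML for brevity, and the field names (`timestamp`, `guid`, `hash`, `priority`) are illustrative assumptions, not taken from the article:

```python
import hashlib
import hmac
import time
import uuid


def prepare_message(body: str, priority: int = 5) -> dict:
    """Wrap a message body in a header carrying a timestamp, a GUID,
    an integrity hash, and a priority level."""
    guid = str(uuid.uuid4())
    # HMAC keyed with the GUID over the body: the receiver can recompute
    # this digest and compare it to verify the message arrived intact.
    digest = hmac.new(guid.encode(), body.encode(), hashlib.sha256).hexdigest()
    return {
        "header": {
            "timestamp": time.time(),  # start of processing, for latency logging
            "guid": guid,
            "hash": digest,
            "priority": priority,
        },
        "body": body,
    }


msg = prepare_message("Buy 100 shares of TSLA", priority=1)
```

In a production system the same fields would be carried as XML header elements; the HMAC construction is one reasonable way to realize "GUID as key, body as input."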
- Our messaging system now deals with the problem of ensuring that no messages are lost. I resolved this by implementing the Guaranteed Delivery Messaging Channel Pattern. This uses a series of local data stores (they don't strictly need to be local, but that is the simplest implementation for this solution) so that the message is always held in persistent storage, which prevents message loss in the event of a system crash or other failure. Once the next store in the series properly receives the message, it sends a confirmation receipt, which acts as a deletion order for the copy held on the previous store. The first data store is local to the client, before the message is sent onto the network (the purpose of this is addressed in the Data Broker paragraph, i.e. decoupling systems). Note: queues are not a good choice when rapid communication is required, since the amount of time a message will sit in a queue is unknown; a Web Service would be a better fit for that use case.
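The store-and-forward hand-off can be sketched as follows. Real stores would be persistent (disk or database); the in-memory dicts here only illustrate the confirmation-receipt/deletion-order protocol, and all names are illustrative:

```python
class DataStore:
    """One hop in the guaranteed-delivery chain."""

    def __init__(self, name: str):
        self.name = name
        self.messages = {}  # guid -> message body; stands in for persistent storage

    def save(self, guid: str, message: str) -> None:
        self.messages[guid] = message

    def confirm(self, guid: str) -> None:
        # The confirmation receipt from the next store doubles as a
        # deletion order for the copy held here.
        self.messages.pop(guid, None)


def forward(src: DataStore, dst: DataStore, guid: str) -> None:
    message = src.messages[guid]
    dst.save(guid, message)  # the next hop persists its copy first...
    src.confirm(guid)        # ...only then is the previous copy deleted


client_store = DataStore("client")
broker_store = DataStore("broker")
client_store.save("g-1", "Buy 100 shares of TSLA")
forward(client_store, broker_store, "g-1")
# At every instant at least one store holds the message, so a crash at
# any single hop cannot lose it.
```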
- The next step can be performed in several ways. The simplest way to pass data from a client to a server would be a Point-to-Point Channel Pattern; however, a Data Broker Channel Pattern is a more robust solution. Using a Data Broker Channel decouples the destination from the sender while still ensuring proper delivery. This has several benefits as well as a few drawbacks. The benefits include increased security and the ability to route multiple applications through a single broker instead of writing a separate pipeline for each application (this also forces developers to maintain a strict data standard across their varying applications). Perhaps the most important capability a Data Broker provides is that the client can continue operating even when the server is not online; this allows maintenance on the server system to occur without disrupting the client systems, and offers protection in the case of a catastrophic event such as a server crash. This ability relies on the Guaranteed Delivery Pattern implementation and its local data stores. The downsides of the Data Broker approach are that it requires adherence to a standard that may not suit every application's needs, and that it adds complexity to the overall system.
- The Data Broker then filters each message by type, giving filtering preference to higher-priority messages. The data flow diverges at this point, as each message type is sent down a separate path to the application or server expecting messages of that type. The filtered messages can be sent via a Publish/Subscribe Channel Pattern working in concert with the Data Broker Pattern and the message filter, ensuring that all interested and appropriate applications (those that are subscribed and match the message type) are informed of the newly available message. Note: the algorithm governing the priority queue for the filtering operation might be modified to let lower-priority messages through on occasion, even when high-priority messages have just arrived, to prevent the entire pipeline from being dedicated to high-priority traffic, which could result in a severe backlog of lower-priority messages and potentially a total blockade. I would need a better understanding of priority-queue implementations to determine exactly how this should be, or already is, designed.
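One standard way to get the starvation-avoidance behavior described in the note is "aging": each message's effective priority improves the longer it waits, so old low-priority messages eventually outrank fresh high-priority arrivals. A minimal sketch, with an illustrative (not tuned) aging rate:

```python
import time


class AgingPriorityQueue:
    """Priority queue where waiting time boosts effective priority.

    Lower numbers mean higher priority. A linear scan keeps the sketch
    simple; a real implementation would use a more efficient structure.
    """

    def __init__(self, aging_rate: float = 1.0):
        self.aging_rate = aging_rate  # priority boost per second of waiting
        self._entries = []            # (base_priority, enqueue_time, message)

    def push(self, priority: int, message: str) -> None:
        self._entries.append((priority, time.time(), message))

    def pop(self) -> str:
        now = time.time()
        # Effective priority = base priority minus an age bonus; lowest wins,
        # so a long-waiting low-priority message can overtake new arrivals.
        best = min(self._entries,
                   key=lambda e: e[0] - self.aging_rate * (now - e[1]))
        self._entries.remove(best)
        return best[2]
```

With `aging_rate` set to zero this degenerates to a plain priority queue; raising it trades strict priority ordering for fairness.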
- The message has now been sent from the Data Broker to the target server, where it is stored in a local data store and a confirmation receipt/deletion order is sent to the Data Broker's data store. The message is then checked for data integrity by comparing the hash contained in the header to a new hash generated from the GUID and message body. If the hashes do not match, the message has been corrupted or is malformed, and is sent to the Error Queue, which orders errors by priority to ensure that high-priority messages are re-requested as quickly as possible. The Error Queue then re-requests the message from the client and logs the error in the Alert System. If enough errors occur with the same message or the same message type, an alert may be submitted to the user/admin. If the hashes match, the message is passed to the target destination/application, and a log entry is written comparing the header timestamp to the current system time to determine processing time, along with other relevant details. If necessary, additional logging can be inserted at other stages of the process, such as after the Data Broker receives the message.
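The arrival-side integrity check can be sketched like this, assuming the same envelope layout as the earlier examples (an HMAC of the body keyed by the GUID; the function names are illustrative):

```python
import hashlib
import hmac


def verify_message(msg: dict) -> bool:
    """Recompute the hash from the GUID and body; compare to the header."""
    header = msg["header"]
    expected = hmac.new(header["guid"].encode(),
                        msg["body"].encode(),
                        hashlib.sha256).hexdigest()
    # compare_digest does a constant-time comparison of the two digests.
    return hmac.compare_digest(expected, header["hash"])


def on_arrival(msg: dict, error_queue: list, deliver) -> None:
    if verify_message(msg):
        deliver(msg)             # hashes match: hand off to the application
    else:
        error_queue.append(msg)  # corrupted/malformed: queue for re-request
```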
- Note regarding workload dissemination: if, for example, the message contains elements that must be sent to multiple processes for verification, we can resolve that by using the Scatter-Gather Routing Pattern. This adds a Correlation ID to the header of the message and a corresponding ID to the sub-messages (child messages) that are dispersed to the necessary processes and routines. The parent message then waits at an Aggregator (in my system, placed at the same location as the hash-checking process) for the sub-messages matching its Correlation ID to return. If all the responses from the various processes are positive and the business rules have been satisfied, the message can continue on its way. In this way we satisfy the requirements for workload dissemination.
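The Scatter-Gather step can be sketched as follows. The business rule here is the simple one from the paragraph above, "every response must be positive"; all names and the `ok` response field are illustrative assumptions:

```python
import uuid


def scatter(parent_body: str, tasks: list) -> tuple:
    """Tag the parent with a Correlation ID and fan out child messages."""
    correlation_id = str(uuid.uuid4())
    children = [{"correlation_id": correlation_id, "task": t} for t in tasks]
    return correlation_id, children


class Aggregator:
    """Holds the parent until every expected sub-response has arrived."""

    def __init__(self, correlation_id: str, expected: int):
        self.correlation_id = correlation_id
        self.expected = expected  # number of sub-responses to wait for
        self.responses = []

    def receive(self, response: dict) -> None:
        # Only collect responses that carry the parent's Correlation ID.
        if response["correlation_id"] == self.correlation_id:
            self.responses.append(response["ok"])

    def complete(self) -> bool:
        # The parent may continue only when all responses are in and positive.
        return len(self.responses) == self.expected and all(self.responses)
```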
Example XML Message Payload:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<message>Buy 100 shares of TSLA</message>
```
Thanks so much for stopping by and reading my article! I’d love to know your thoughts on my implementation. If you have any questions, comments, suggestions, or tips on how I can improve this solution please feel free to leave a comment!