In modern data-driven enterprises, workflow scheduling systems are the "central nervous system" of data pipelines. From ETL tasks to machine learning training, report generation to real-time monitoring, nearly all critical business processes rely on a stable, efficient, and scalable scheduling engine. I believe Apache DolphinScheduler 3.1.9 is a stable and widely used version. Therefore, this series of articles will delve into its core source code, analyzing its architecture design, module division, and key implementation mechanisms to help developers understand how the Master and Worker "work" and lay a foundation for further secondary development or performance optimization. Previously, we analyzed the Apache DolphinScheduler 3.1.9 Master server startup process source code, which you can check if interested. This article is the second in the Apache DolphinScheduler 3.1.9 source code analysis series: Worker Server startup process source code interpretation and related process design. Flowcharts are provided at the end for reference. Apache DolphinScheduler 3.1.9 Master server startup process source code 2. Worker Server Startup Core Overview Code entry point: org.apache.dolphinscheduler.server.worker.WorkerServer#run Code entry point: org.apache.dolphinscheduler.server.worker.WorkerServer#run org.apache.dolphinscheduler.server.worker.WorkerServer#run public void run() { // 1. rpc start this.workerRpcServer.start(); // Ignore, as workerRpcServer initialization includes workerRpcClient initialization this.workerRpcClient.start(); // 2. Task plugin initialization this.taskPluginManager.loadPlugin(); this.workerRegistryClient.setRegistryStoppable(this); // 3. Worker registration this.workerRegistryClient.start(); // 4. Worker management thread, continuously fetch tasks from the waitSubmitQueue and submit them to the thread pool this.workerManagerThread.start(); // 5. Message retry thread, responsible for polling and sending services via RPC this.messageRetryRunner.start(); ... } public void run() { // 1. rpc start this.workerRpcServer.start(); // Ignore, as workerRpcServer initialization includes workerRpcClient initialization this.workerRpcClient.start(); // 2. Task plugin initialization this.taskPluginManager.loadPlugin(); this.workerRegistryClient.setRegistryStoppable(this); // 3. Worker registration this.workerRegistryClient.start(); // 4. Worker management thread, continuously fetch tasks from the waitSubmitQueue and submit them to the thread pool this.workerManagerThread.start(); // 5. Message retry thread, responsible for polling and sending services via RPC this.messageRetryRunner.start(); ... } 2.1 RPC Start: Description: Registers processors for relevant commands such as task request, task stop request, etc.Code entry point: org.apache.dolphinscheduler.server.worker.rpc.WorkerRpcServer#start Description: Registers processors for relevant commands such as task request, task stop request, etc. Code entry point: org.apache.dolphinscheduler.server.worker.rpc.WorkerRpcServer#start org.apache.dolphinscheduler.server.worker.rpc.WorkerRpcServer#start public void start() { LOGGER.info("Worker rpc server starting"); NettyServerConfig serverConfig = new NettyServerConfig(); serverConfig.setListenPort(workerConfig.getListenPort()); this.nettyRemotingServer = new NettyRemotingServer(serverConfig); // Receives and dispatches task requests, putting tasks into the waitSubmitQueue for later processing this.nettyRemotingServer.registerProcessor(CommandType.TASK_DISPATCH_REQUEST, taskDispatchProcessor); ... this.nettyRemotingServer.start(); LOGGER.info("Worker rpc server started"); } public void start() { LOGGER.info("Worker rpc server starting"); NettyServerConfig serverConfig = new NettyServerConfig(); serverConfig.setListenPort(workerConfig.getListenPort()); this.nettyRemotingServer = new NettyRemotingServer(serverConfig); // Receives and dispatches task requests, putting tasks into the waitSubmitQueue for later processing this.nettyRemotingServer.registerProcessor(CommandType.TASK_DISPATCH_REQUEST, taskDispatchProcessor); ... this.nettyRemotingServer.start(); LOGGER.info("Worker rpc server started"); } 2.2 Task Plugin Initialization: Description: Initializes task-related templates, such as task creation, parameter parsing, and resource information retrieval. Description: Initializes task-related templates, such as task creation, parameter parsing, and resource information retrieval. 2.3 Worker Registration: Description: Registers worker information with a registry center (Zookeeper in this example) and listens for connection state changes.Code entry point: org.apache.dolphinscheduler.server.worker.registry.WorkerRegistryClient#start Description: Registers worker information with a registry center (Zookeeper in this example) and listens for connection state changes. Code entry point: org.apache.dolphinscheduler.server.worker.registry.WorkerRegistryClient#start org.apache.dolphinscheduler.server.worker.registry.WorkerRegistryClient#start public void start() { try { // Register worker info with the registry center registry(); // Listen for connection state changes registryClient.addConnectionStateListener(new WorkerConnectionStateListener(workerConfig, registryClient, workerConnectStrategy)); } catch (Exception ex) { throw new RegistryException("Worker registry client start-up error", ex); } } public void start() { try { // Register worker info with the registry center registry(); // Listen for connection state changes registryClient.addConnectionStateListener(new WorkerConnectionStateListener(workerConfig, registryClient, workerConnectStrategy)); } catch (Exception ex) { throw new RegistryException("Worker registry client start-up error", ex); } } 2.4 Worker Management Thread: Description: Continuously retrieves tasks from the waitSubmitQueue and submits them to the thread pool for processing.Code entry point: org.apache.dolphinscheduler.server.worker.runner.WorkerManagerThread#run Description: Continuously retrieves tasks from the waitSubmitQueue and submits them to the thread pool for processing. waitSubmitQueue Code entry point: org.apache.dolphinscheduler.server.worker.runner.WorkerManagerThread#run org.apache.dolphinscheduler.server.worker.runner.WorkerManagerThread#run public void run() { Thread.currentThread().setName("Worker-Execute-Manager-Thread"); while (!ServerLifeCycleManager.isStopped()) { try { if (!ServerLifeCycleManager.isRunning()) { Thread.sleep(Constants.SLEEP_TIME_MILLIS); } // If thread pool resources are sufficient, process the task final WorkerDelayTaskExecuteRunnable workerDelayTaskExecuteRunnable = waitSubmitQueue.take(); workerExecService.submit(workerDelayTaskExecuteRunnable); ... } catch (Exception e) { logger.error("An unexpected interrupt happened", e); } } } public void run() { Thread.currentThread().setName("Worker-Execute-Manager-Thread"); while (!ServerLifeCycleManager.isStopped()) { try { if (!ServerLifeCycleManager.isRunning()) { Thread.sleep(Constants.SLEEP_TIME_MILLIS); } // If thread pool resources are sufficient, process the task final WorkerDelayTaskExecuteRunnable workerDelayTaskExecuteRunnable = waitSubmitQueue.take(); workerExecService.submit(workerDelayTaskExecuteRunnable); ... } catch (Exception e) { logger.error("An unexpected interrupt happened", e); } } } 2.5 Message Retry Thread: Description: If the worker doesn't receive an acknowledgment (ack) for a task request from the master, this thread retries the message at intervals, typically every 5 minutes. Description: If the worker doesn't receive an acknowledgment (ack) for a task request from the master, this thread retries the message at intervals, typically every 5 minutes. 3. Related Flowcharts Official documentation provides various flowcharts, such as fault-tolerance mechanisms and distributed lock implementation flowcharts. For more details, visit Architecture Design and Design Documentation. Architecture Design Design Documentation This article supplements the task dispatch and task stop flowcharts, and only describes the normal process of instance startup and shutdown. It does not include fault-tolerant recovery scenarios, nor does it cover related locking or concurrency scenarios. Conclusion This is an initial understanding of Apache DolphinScheduler 3.1.9 features and architecture based on personal learning and practice. There might be misunderstandings or omissions in the article, so feedback is welcome. If you're interested in the source code, you can dive deeper into the task scheduling strategy or develop secondary applications based on your business scenarios.