In my previous article, I covered the scenario where only the DB binaries were corrupted. Now, let's see how to fix a node where only the GRID binaries are corrupted.

My configuration
================

Oracle RAC with 2 nodes: ol8-19-rac1 and ol8-19-rac2
CDB: cdbrac1
PDB: pdb1

Grid Version:
34318175;TOMCAT RELEASE UPDATE 19.0.0.0.0 (34318175)
34160635;OCW RELEASE UPDATE 19.16.0.0.0 (34160635)
34139601;ACFS RELEASE UPDATE 19.16.0.0.0 (34139601)
34133642;Database Release Update : 19.16.0.0.220719 (34133642)
33575402;DBWLM RELEASE UPDATE 19.0.0.0.0 (33575402)

DB Version:
34086870;OJVM RELEASE UPDATE: 19.16.0.0.220719 (34086870)
34160635;OCW RELEASE UPDATE 19.16.0.0.0 (34160635)
34133642;Database Release Update : 19.16.0.0.220719 (34133642)

When you see <dbenv>, load the DB HOME variables.
When you see <gridenv>, load the GRID HOME variables.
When you see <bnode>, execute the command on the broken node.
When you see <anode>, execute the command on any other working node.

I am using an installation where both the DB and GRID homes are owned by the oracle user, and I simply set the variables to switch between the two environments. The same procedure works even if the installation uses two different users (usually oracle and grid).

Always validate any procedure before you try it in a production environment.

Scenario B - GRID Binaries Corrupted On ol8-19-rac2 After Patch Apply
=====================================================================

In this scenario, only the GRID binaries were affected, so let's try to preserve the DB binaries. The stack is already down on the broken node, since its binaries are corrupted.

01. Backup the OCR configuration
================================

As root, from any other node, create an OCR backup. Better safe than sorry.

<anode gridenv root>

[root@ol8-19-rac2 scripts]# ocrconfig -manualbackup

ol8-19-rac2     2022/11/17 13:40:44     +CRS:/ol8-19-cluster/OCRBACKUP/backup_20221117_134044.ocr.258.1121002845     896235792

02. Check whether both nodes are unpinned (they must be Unpinned to proceed)
============================================================================

<anode gridenv>

[oracle@ol8-19-rac1 ~]$ olsnodes -s -t
ol8-19-rac1     Active  Unpinned
ol8-19-rac2     Active  Unpinned

If a node shows as Pinned, unpin it first as root from a working node: crsctl unpin css -n ol8-19-rac2

03. Backup the $ORACLE_HOME/network/admin
=========================================

The deinstall removes the whole GRID home, so keep a copy of any listener.ora/sqlnet.ora/tnsnames.ora customizations.

<bnode gridenv>

[oracle@ol8-19-rac2 ~]$ mkdir -p /tmp/oracle; tar cvf /tmp/oracle/grid_netadm.tar -C $ORACLE_HOME/network/admin .

04. Deinstall the GRID binaries
===============================

<bnode gridenv>

[oracle@ol8-19-rac2 ~]$ $ORACLE_HOME/deinstall/deinstall -local

Review the summary, confirm that everything is OK, then answer yes. When prompted, execute the root step in another terminal:

[root@ol8-19-rac2 ~]# /u01/app/19.0.0/grid/crs/install/rootcrs.sh -force -deconfig -paramfile "/tmp/deinstall2022-11-17_01-49-44PM/response/deinstall_OraGI19Home1.rsp"
...
2022/11/17 13:58:09 CLSRSC-336: Successfully deconfigured Oracle Clusterware stack on this node

Then go back to the deinstall session and press Enter to continue and finish the deinstall.

05. Remove the broken node from the cluster
===========================================

<anode gridenv root>

[root@ol8-19-rac1 ~]# crsctl delete node -n ol8-19-rac2
CRS-4661: Node ol8-19-rac2 successfully deleted.

06. Remove the broken node's VIP
================================

<anode gridenv root>

[root@ol8-19-rac1 ~]# srvctl stop vip -vip ol8-19-rac2
[root@ol8-19-rac1 ~]# srvctl remove vip -vip ol8-19-rac2
Please confirm that you intend to remove the VIPs ol8-19-rac2 (y/[n]) y
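If you want a quick sanity check before the formal verification in the next step, you can confirm from a working node that the deleted node and its VIP are really gone. These two commands are standard clusterware checks rather than part of my original run, so take them as an optional extra:

<anode gridenv>

[oracle@ol8-19-rac1 ~]$ olsnodes -s -t
[oracle@ol8-19-rac1 ~]$ crsctl stat res -t | grep -i vip

Only ol8-19-rac1 should be listed by olsnodes, and no ol8-19-rac2 VIP resource should remain.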
07. Check if the cluster is OK after the node removal
======================================================

<anode gridenv>

[oracle@ol8-19-rac1 ~]$ cluvfy stage -post nodedel -n ol8-19-rac2 -verbose

Performing following verification checks ...

  Node Removal ...
    CRS Integrity ...PASSED
    Clusterware Version Consistency ...PASSED
  Node Removal ...PASSED

Post-check for node removal was successful.

CVU operation performed:      stage -post nodedel
Date:                         Nov 17, 2022 2:04:22 PM
Clusterware version:          19.0.0.0.0
CVU home:                     /u01/app/19.0.0/grid
Grid home:                    /u01/app/19.0.0/grid
User:                         oracle
Operating system:             Linux5.4.17-2136.312.3.4.el8uek.x86_64

08. Pre node addition validation
================================

If we were adding a genuinely new node, we would run some cluster verification checks first to confirm that everything is OK. Since this node was already part of the cluster, I usually skip this validation, but you can run it if you want:

<anode gridenv>

[oracle@ol8-19-rac1 ~]$ cluvfy stage -pre nodeadd -n ol8-19-rac2 -verbose -fixup

09. Add the node back
=====================

From any other node, add the ex-broken node back. This step takes a while, as it copies the files from one node to the other.

<anode gridenv>

[oracle@ol8-19-rac1 ~]$ $ORACLE_HOME/addnode/addnode.sh -silent "CLUSTER_NEW_NODES={ol8-19-rac2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={ol8-19-rac2-vip}" "CLUSTER_NEW_NODE_ROLES={hub}"

On the ex-broken node, execute the root script:

<bnode root>

[root@ol8-19-rac2 tmp]# /u01/app/19.0.0/grid/root.sh

10. Check if the cluster is OK
==============================

From any other node, run the cluster verification.

<anode gridenv>

[oracle@ol8-19-rac1 ~]$ cluvfy stage -post nodeadd -n ol8-19-rac2 -verbose
...
Post-check for node addition was successful.

CVU operation performed:      stage -post nodeadd
Date:                         Nov 17, 2022 2:32:12 PM
Clusterware version:          19.0.0.0.0
CVU home:                     /u01/app/19.0.0/grid
Grid home:                    /u01/app/19.0.0/grid
User:                         oracle
Operating system:             Linux5.4.17-2136.312.3.4.el8uek.x86_64

Voilà, your cluster and your database should be back online.

[oracle@ol8-19-rac2 ~]$ crsctl check cluster -all
**************************************************************
ol8-19-rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
ol8-19-rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

[oracle@ol8-19-rac2 ~]$ srvctl status database -db cdbrac
Instance cdbrac1 is running on node ol8-19-rac1
Instance cdbrac2 is running on node ol8-19-rac2

In my next article, I'll show you how to recover from a scenario where we lost both the GRID and DB binaries.
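P.S.: if you had custom entries in the old GRID home's network/admin (listener.ora, sqlnet.ora, tnsnames.ora), compare the freshly created files with the backup taken in step 03 and restore whatever you still need. A minimal sketch, assuming the tar file and paths used above:

<bnode gridenv>

[oracle@ol8-19-rac2 ~]$ tar xvf /tmp/oracle/grid_netadm.tar -C $ORACLE_HOME/network/admin

Review the extracted files rather than blindly overwriting anything, since the re-added node already comes with a freshly generated network configuration.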