The CRS daemon was not able to start because my OCR diskgroup was not mounted. I logged in to the ASM instance and tried to mount it manually:
[oracle@racnode1 ~]$ . oraenv
ORACLE_SID = [oracle] ? +ASM1
The Oracle base for ORACLE_HOME=/u01/app/11.2.0/grid is /u01/app/oracle
[oracle@racnode1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Sun Jul 19 23:26:52 2015

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter diskgroup CRS mount;
alter diskgroup CRS mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15130: diskgroup "CRS" is being dismounted
ORA-15066: offlining disk "CRS_0000" may result in a data loss
But I was able to mount my other diskgroups:
SQL> alter diskgroup DATA mount;

Diskgroup altered.

SQL> alter diskgroup FRA mount;

Diskgroup altered.
I also checked the diskgroup states with the asmcmd lsdg --discovery command:
SQL> quit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
[oracle@racnode1 ~]$ asmcmd lsdg --discovery
State       Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
DISMOUNTED          N         512   4096        0         0        0                0               0              0             N  CRS/
MOUNTED     EXTERN  N         512   4096  1048576     15359    13678                0           13678              0             N  DATA/
MOUNTED     EXTERN  N         512   4096  1048576     15359    15001                0           15001              0             N  FRA/
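The dismounted group can also be picked out of that listing with a short filter. This is only a minimal sketch, assuming the column layout shown above (state in the first column, group name with a trailing slash in the last column); `dismounted_groups` is a hypothetical helper name of mine, not part of asmcmd.

```shell
#!/bin/sh
# dismounted_groups: hypothetical helper, reads "asmcmd lsdg" output on stdin
# and prints the name of every diskgroup whose State column is DISMOUNTED.
dismounted_groups() {
    awk '$1 == "DISMOUNTED" {
        name = $NF              # last column is the group name, e.g. "CRS/"
        sub(/\/$/, "", name)    # drop the trailing slash
        print name
    }'
}
```

It would be used as `asmcmd lsdg --discovery | dismounted_groups`, run as the grid owner with the ASM environment set.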
After checking the ASM alert log, I found that the CRS diskgroup had a corrupted block:
Diskgroup used for OCR is:CRS
NOTE: cache registered group CRS number=1 incarn=0x4d38a200
NOTE: cache began mount (first) of group CRS number=1 incarn=0x4d38a200
NOTE: Assigning number (1,0) to disk (/dev/oracleasm/disks/CRS)
NOTE: start heartbeating (grp 1)
kfdp_query(CRS): 3
Mon Jul 20 13:30:17 2015
kfdp_queryBg(): 3
NOTE: cache opening disk 0 of grp 1: CRS_0000 path:/dev/oracleasm/disks/CRS
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (first) external redundancy group 1/0x4D38A200 (CRS)
Mon Jul 20 13:30:17 2015
* allocate domain 1, invalid = TRUE
Mon Jul 20 13:30:17 2015
NOTE: attached to recovery domain 1
WARNNING: cache read a corrupted block group=CRS fn=1 blk=3 from disk 0
NOTE: a corrupted block from group CRS was dumped to /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_ora_6709.trc
WARNNING: cache read(retry) a corrupted block group=CRS fn=1 blk=3 from disk 0
ERROR: cache failed to read group=CRS fn=1 blk=3 from disk(s): 0 CRS_0000
ORA-15196: invalid ASM block header [kfc.c:23924] [endian_kfbh] [1] [3] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:23908] [endian_kfbh] [1] [3] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:23924] [endian_kfbh] [1] [3] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:23908] [endian_kfbh] [1] [3] [0 != 1]
System State dumped to trace file /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_ora_6709.trc
NOTE: AMDU dump of disk group CRS created at /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace
NOTE: cache initiating offline of disk 0 group CRS
NOTE: process 6709 initiating offline of disk 0.3915928308 (CRS_0000) with mask 0x7e in group 1
WARNING: Disk CRS_0000 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 1, dsk = 0/0xe96852f4, mode = 0x15
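In a long alert log the relevant lines are easy to miss, so a simple filter helps. The following is a rough sketch that only matches the two message patterns seen in the excerpt above (ORA-15196 and "corrupted block"); `corruption_lines` is a hypothetical helper name, and the log path is passed in rather than assumed.

```shell
#!/bin/sh
# corruption_lines: hypothetical helper, prints (with line numbers) the
# alert-log lines that mention a corrupted ASM block or ORA-15196.
# $1 = path to the ASM alert log
corruption_lines() {
    grep -n -E 'ORA-15196|corrupted block' "$1"
}
```

On my setup the call would probably look like `corruption_lines /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log`; that file name is an assumption based on the trace directory shown in the log, so adjust it to your diagnostic destination.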
[root@racnode1 trace]# ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=8, opn=kgfolclcpi1, dep=208, loc=kgfokge
AMDU-00208: File directory block not found. Cannot extract file CRS.255
AMDU-00209: Corrupt block found: Disk N0004 AU [0] block [3] type [3]
AMDU-00201: Disk N
] [8]
Referring to Oracle Support Doc ID 1062983.1, I was able to restore the OCR to a new disk, using the same diskgroup name. The steps I followed:
# stop the cluster on all nodes
crsctl stop crs -f

# delete the corrupted disk
oracleasm deletedisk CRS

# partition the new disk and label it for ASM
fdisk /dev/sdf
oracleasm createdisk CRS /dev/sdf1
oracleasm scandisks
oracleasm listdisks

# start the cluster in exclusive mode
crsctl start crs -excl

# stop the CRS daemon
crsctl stop res ora.crsd -init

# log in as the grid user and recreate the diskgroup
sqlplus / as sysasm
SQL> create diskgroup CRS external redundancy disk '/dev/oracleasm/disks/CRS' attribute 'COMPATIBLE.ASM'='11.2';
# after the diskgroup is created, quit from SQL*Plus
SQL> quit

# log in as root, locate the OCR backup under $GRID_HOME/cdata/raccluster and restore it
# (no message is displayed if the restore succeeds)
ocrconfig -restore backup00.ocr

# start the CRS daemon
crsctl start res ora.crsd -init

# recreate the voting file; you should see a message that the voting file was successfully replaced
crsctl replace votedisk +CRS

# create a pfile for the ASM instance, only if your spfile was located in the CRS diskgroup;
# copy the non-default parameters from the alert log located under $GRID_HOME/log/nodename/alertnodename.log
vi /tmp/asm_pfile.ora
*.large_pool_size = 12M
*.instance_type = "asm"
*.remote_login_passwordfile= "EXCLUSIVE"
*.asm_diskstring = "/dev/oracleasm/disks"
*.asm_power_limit = 1
*.diagnostic_dest = "/u01/app/oracle"
# save and close

# as the grid user, log in to the ASM instance and create the spfile
sqlplus / as sysasm
SQL> create spfile='+CRS' from pfile='/tmp/asm_pfile.ora';
SQL> exit

# log in as root and shut down CRS
crsctl stop crs -f

# scan the ASM disks on all nodes
oracleasm scandisks

# start CRS on all nodes
crsctl start crs

# check that everything is working fine
crsctl check cluster -all
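Once CRS is back up, it is worth re-running the checks that failed earlier. The steps above end with a cluster check; the sketch below just chains it with ocrcheck and a voting-disk query. It assumes it runs as root with the Grid Infrastructure bin directory in PATH, and that each tool returns a non-zero exit status on failure, which I have not verified for every release. `verify_restore` is a hypothetical helper name, not an Oracle tool.

```shell
#!/bin/sh
# verify_restore: hypothetical helper that runs the post-restore checks in
# order and returns non-zero if any of them reports a failure exit status.
verify_restore() {
    failed=0
    for cmd in "ocrcheck" "crsctl query css votedisk" "crsctl check cluster -all"; do
        echo "== $cmd"
        # $cmd is left unquoted on purpose so it splits into command + arguments
        if ! $cmd; then
            echo "FAILED: $cmd" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Call `verify_restore` on one node after `crsctl start crs` has completed everywhere; the output of each command is still printed, so read it as well and do not rely on the exit status alone.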
If you have any questions, please feel free to ask me at sunil@sunilthetechfreak.com