SVM Disk Replacement

Before change

iostat -En
metadetach d15 d17
metadetach d18 d20
cfgadm -al
cfgadm -c unconfigure c1::dsk/c1t3d0

After change
devfsadm
cfgadm -c configure c1::dsk/c1t3d0
cfgadm -al     (if the disk shows as configured, proceed with the following steps)
prtvtoc /dev/rdsk/c1t2d0s2 | fmthard -s - /dev/rdsk/c1t3d0s2
metattach d15 d17
metattach d18 d20

Replacement of a Failed Disk in Solaris (SVM: Solaris Volume Manager)

TO REPLACE FAILED DISK IN THE SYSTEM:

First, save the output of the following commands (necessary):
# metastat -p >/var/tmp/metastat-p-b4replacement
# metastat -t >/var/tmp/metastat-t-b4replacement
# metadb -i >/var/tmp/metadb-i-b4replacement
# echo | format >/var/tmp/format-b4replacement
# iostat -en >/var/tmp/iostat-en-b4replacement
# ifconfig -a >/var/tmp/ifconfig-a-b4replacement
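The backup commands above can be wrapped in a small helper so that every file lands in one directory with the same suffix. This is a minimal sketch, assuming a POSIX-compatible shell (ksh or bash; the legacy Solaris /bin/sh may need adjustments). The function name save_output and its arguments are illustrative and not part of SVM.

```shell
#!/bin/sh
# save_output DIR LABEL COMMAND [ARGS...]
# Runs COMMAND and stores its stdout and stderr in DIR/LABEL-b4replacement.
# The function name and the file-naming scheme are illustrative assumptions.
save_output() {
    so_dir=$1
    so_label=$2
    shift 2
    mkdir -p "$so_dir" || return 1
    "$@" > "$so_dir/$so_label-b4replacement" 2>&1
}

# Example calls (commented out so that sourcing this file has no side effects):
#   save_output /var/tmp metastat-p metastat -p
#   save_output /var/tmp metadb-i   metadb -i
#   save_output /var/tmp iostat-en  iostat -en
```

Pipelines such as echo | format cannot be passed this way and still need a plain redirect.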

1. Identify the failed disk with one of the following commands:
# echo | format
OR
# iostat -en or iostat -En (for complete details about the failed disk)

OR
Check the system log (/var/adm/messages) and the dmesg output.

hans-test# echo | format
Searching for disks...done
AVAILABLE DISK SELECTIONS:

0. c1t0d0
/pci@0,0/pci1000,30@10/sd@0,0
1. c1t2d0   <-- faulty drive
/pci@0,0/pci1000,30@10/sd@2,0

Specify disk (enter its number): Specify disk (enter its number):

Where: c1t0d0 is the root disk & c1t2d0 is the mirror disk.

hans-test# iostat -en
---- errors ---
s/w h/w trn tot device
0 0 0 0 fd0
0 0 0 0 md/d0
0 0 0 0 md/d1
0 0 0 0 md/d3
0 0 0 0 md/d10
0 0 0 0 md/d11
0 0 0 0 md/d13
0 0 0 0 md/d20
0 0 0 0 md/d21
0 0 0 0 md/d23
6 0 0 6 c1t0d0
10 0 0 10 c0t0d0
6 50 0 6 c1t2d0   <-- mirror disk is showing 50 h/w errors
0 0 0 0 hans-:vold(pid568)
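Scanning the error table by eye gets tedious on systems with many devices. The filter below is a rough sketch that prints only devices with a non-zero h/w count. It assumes the five-column layout shown above (s/w, h/w, trn, tot, device) and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk). The function name hw_error_devices is made up for this example.

```shell
#!/bin/sh
# hw_error_devices: read `iostat -en` output on stdin and print
# "device hw_errors" for every device whose h/w error count is non-zero.
# Assumes the column order: s/w h/w trn tot device.
hw_error_devices() {
    awk 'NF == 5 && $2 ~ /^[0-9]+$/ && $2 > 0 { print $5, $2 }'
}

# Typical use (commented out; requires a Solaris host):
#   iostat -en | hw_error_devices
```

Header lines are skipped because their second field is not numeric.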

2. Run the metadetach command to detach the failed disk's submirrors (break the mirror), then clear them with metaclear:

# metastat -p
# metadetach -f    (-f forces the detach)
# metaclear
# metastat -p | grep -i    (to check whether the submirrors have been cleared)

hans-test# metastat -p
d3 -m d13 d23 1
d13 1 1 c1t0d0s3
d23 1 1 c1t2d0s3
d1 -m d11 d21 1
d11 1 1 c1t0d0s1
d21 1 1 c1t2d0s1
d0 -m d10 d20 1
d10 1 1 c1t0d0s0
d20 1 1 c1t2d0s0

hans-test# metadetach d0 d20; metadetach d1 d21; metadetach d3 d23
d0: submirror d20 is detached
d1: submirror d21 is detached
d3: submirror d23 is detached

hans-test# metastat -p
d3 -m d13 1
d13 1 1 c1t0d0s3
d1 -m d11 1
d11 1 1 c1t0d0s1
d0 -m d10 1
d10 1 1 c1t0d0s0
d20 1 1 c1t2d0s0
d21 1 1 c1t2d0s1
d23 1 1 c1t2d0s3

hans-test# metastat -ac    <-- Solaris 10 only (it won't work in earlier versions of Solaris)
d3 m 517MB d13
d13 s 517MB c1t0d0s3
d1 m 1.0GB d11
d11 s 1.0GB c1t0d0s1
d0 m 7.8GB d10
d10 s 7.8GB c1t0d0s0
d20 s 7.8GB c1t2d0s0
d21 s 1.0GB c1t2d0s1
d23 s 517MB c1t2d0s3

hans-test# metaclear d20; metaclear d21; metaclear d23
d20: Concat/Stripe is cleared
d21: Concat/Stripe is cleared
d23: Concat/Stripe is cleared
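The detach and clear commands in this step can be derived from the metastat -p listing instead of being typed by hand. The sketch below only prints the commands for review and runs nothing. It assumes simple one-slice concats under mirrors, as in the listing above, and a POSIX awk (nawk on Solaris). The name detach_plan is hypothetical.

```shell
#!/bin/sh
# detach_plan DISK: read `metastat -p` output on stdin and print the
# metadetach/metaclear commands for every submirror built on DISK.
# Assumes concat lines of the form "dNN 1 1 cXtYdZsN" and mirror lines of
# the form "dM -m sub1 [sub2 ...] pass". Nothing is executed.
detach_plan() {
    awk -v disk="$1" '
        $2 == "-m" {
            # Remember which mirror each submirror belongs to.
            for (i = 3; i < NF; i++) parent[$i] = $1
            next
        }
        $2 == "1" && $3 == "1" && index($4, disk "s") == 1 {
            # Concat whose only slice lives on the failed disk.
            subs[++n] = $1
        }
        END {
            for (i = 1; i <= n; i++) {
                if (subs[i] in parent)
                    print "metadetach " parent[subs[i]] " " subs[i]
                print "metaclear " subs[i]
            }
        }'
}

# Typical use (commented out; requires SVM):
#   metastat -p | detach_plan c1t2d0
```

Printing the plan first leaves room to add -f to metadetach where a submirror refuses to detach.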

3. Delete the state database replicas on the failed disk:

# metadb -i
# metadb -d /dev/dsk/
# metadb -i | grep -i

hans-test# metadb -i

flags first blk block count
a m p luo 16 8192 /dev/dsk/c1t0d0s7
a p luo 8208 8192 /dev/dsk/c1t0d0s7
a p luo 16400 8192 /dev/dsk/c1t0d0s7
a W p luo 16 8192 /dev/dsk/c1t2d0s7
a W p luo 8208 8192 /dev/dsk/c1t2d0s7
a W p luo 16400 8192 /dev/dsk/c1t2d0s7

hans-test# metadb -d /dev/dsk/c1t2d0s7

hans-test# metadb -i
flags first blk block count
a m p luo 16 8192 /dev/dsk/c1t0d0s7
a p luo 8208 8192 /dev/dsk/c1t0d0s7
a p luo 16400 8192 /dev/dsk/c1t0d0s7
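Before deleting replicas it helps to confirm which slices actually carry the W (write error) flag. The following is a small sketch that counts W-flagged replicas per slice from metadb -i output. It assumes the device path is the last column, as in the listings above, and a POSIX awk (nawk on Solaris). The function name is invented for this example.

```shell
#!/bin/sh
# replicas_with_write_errors: read `metadb -i` output on stdin and print
# "slice count" for each slice with at least one replica flagged W.
# Assumes the last column is the /dev/dsk/... path and that the two columns
# before it are "first blk" and "block count".
replicas_with_write_errors() {
    awk '$NF ~ /^\/dev\/dsk\// {
             for (i = 1; i <= NF - 3; i++)
                 if ($i ~ /W/) { bad[$NF]++; break }
         }
         END { for (dev in bad) print dev, bad[dev] }'
}

# Typical use (commented out; requires SVM):
#   metadb -i | replicas_with_write_errors
```

The flag legend printed by metadb -i is ignored because those lines do not end in a device path.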

4. Remove the hard drive from the device tree using the commands that match your disk type:

######################################################################

In case of SCSI / SAS Disks

SCSI/SAS disks will appear as below in format output

0. c0t0d0 <DEFAULT cyl 17832 alt 2 hd 255 sec 63>
/pci@0,0/pci1022,7450@2/pci1000,3060@3/sd@0,0
Command Sequence to replace the disks

- cfgadm -al
- cfgadm -c unconfigure Ap_Id (e.g. cfgadm -c unconfigure c0::dsk/c0t0d0)
- cfgadm -x remove_device c0::dsk/c0t0d0 (for data disk only)

In case of FCAL disks (Sun 280R, V490, V880, V890)

FCAL disks will appear as below in format output

0. c1t0d0 <SUN36G cyl 24620 alt 2 hd 27 sec 107>
/pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf365024,0

Command Sequence to replace the disks

- luxadm -e port
- luxadm probe (to display paths)
- luxadm remove_device -F /dev/rdsk/c#t#d#s2
- devfsadm -v -Cc disk (where: -C cleans up stale /dev entries; -c disk limits the run to the disk class)
- luxadm insert_device (optional)
######################################################################
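Which of the two command sequences applies can be guessed from the physical path that format prints under each disk. The sketch below is a heuristic based only on the two sample paths above (ssd@w for FCAL, sd@ for SCSI/SAS). It may not hold for every platform or driver, and the function name disk_procedure is hypothetical.

```shell
#!/bin/sh
# disk_procedure PHYSICAL_PATH: print which command family likely applies.
# Heuristic only: relies on the driver node names in the sample paths
# (".../ssd@w<WWN>,0" for FCAL, ".../sd@<target>,0" for SCSI/SAS).
disk_procedure() {
    case $1 in
        */ssd@w*) echo "fcal: use luxadm" ;;
        */sd@*)   echo "scsi: use cfgadm" ;;
        *)        echo "unknown" ;;
    esac
}

# Example:
#   disk_procedure '/pci@0,0/pci1000,30@10/sd@2,0'
```

When the result is "unknown", check the hardware documentation rather than guessing.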

The procedure below applies to SCSI/SAS disks:

# cfgadm -al
# cfgadm -c unconfigure c1::dsk/c1t2d0

hans-test# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t2d0 disk connected configured unknown

hans-test# cfgadm -c unconfigure c1::dsk/c1t2d0
Ap_Id Type Receptacle Occupant Condition
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t2d0 disk connected unconfigured unknown

5. Verify that the device has been removed from the device tree by typing the following command:
# cfgadm -al
hans-test# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t2d0 disk connected unconfigured unknown

6. Remove the failed disk from the server and insert the new disk.

7. Configure the new hard drive by typing the following command:
# cfgadm -c configure c1::dsk/c1t2d0
hans-test# cfgadm -c configure c1::dsk/c1t2d0
Ap_Id Type Receptacle Occupant Condition
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t2d0 disk connected configured unknown

8. Verify that the device has been added to the device tree by typing the following command:

# cfgadm -al

hans-test# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t2d0 disk connected configured unknown

9. Check the disk status on the server with:

# echo | format OR # iostat -en

If the disk is not visible, run devfsadm -C to clean up and rebuild the device links. To reconfigure all disks, run:

# devfsadm -C -c disk
# echo | format OR # iostat -en

hans-test# echo | format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
0. c1t0d0
/pci@0,0/pci1000,30@10/sd@0,0
1. c1t2d0
/pci@0,0/pci1000,30@10/sd@2,0

Specify disk (enter its number): Specify disk (enter its number):
hans-test#

10. Compare the VTOC of the root disk with that of the replaced disk. If they differ, copy it with fmthard (a new disk will normally not have the same VTOC).

# prtvtoc /dev/dsk/
# prtvtoc /dev/dsk/

Copy the VTOC to the replaced disk:

# prtvtoc /dev/dsk//| fmthard -s - /dev/rdsk/
hans-test# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t2d0s2
fmthard: New volume table of contents now in place.

hans-test# prtvtoc /dev/rdsk/c1t0d0s2
* /dev/rdsk/c1t0d0s2 partition map
*
* Dimensions:
* 512 bytes/sector
* 63 sectors/track
* 255 tracks/cylinder
* 16065 sectors/cylinder
* 1304 cylinders
* 1302 accessible cylinders
*

* Flags:
* 1: unmountable
* 10: read-only
*
* Unallocated space:
* First Sector Last
* Sector Count Sector
* 0 16065 16064
* 19631430 1285200 20916629
*
* First Sector Last

* Partition Tag Flags Sector Count Sector Mount Directory
0 2 00 16065 16386300 16402364
1 3 01 16402365 2104515 18506879
2 5 00 0 20916630 20916629
3 8 00 18506880 1060290 19567169
7 0 00 19567170 48195 19615364
8 1 01 0 16065 16064

hans-test# prtvtoc /dev/rdsk/c1t2d0s2
* /dev/rdsk/c1t2d0s2 partition map
*
* Dimensions:
* 512 bytes/sector
* 63 sectors/track
* 255 tracks/cylinder
* 16065 sectors/cylinder
* 1304 cylinders
* 1302 accessible cylinders
*
* Flags:
* 1: unmountable
* 10: read-only
*
* Unallocated space:
* First Sector Last
* Sector Count Sector
* 0 16065 16064
* 19631430 1285200 20916629
*
* First Sector Last
* Partition Tag Flags Sector Count Sector Mount Directory
0 2 00 16065 16386300 16402364
1 3 01 16402365 2104515 18506879
2 5 00 0 20916630 20916629
3 8 00 18506880 1060290 19567169
7 0 00 19567170 48195 19615364
8 1 01 0 16065 16064

hans-test#
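Comparing the two partition maps by eye is error-prone, so the check in this step can be scripted. The sketch below compares the partition rows of two saved prtvtoc outputs and ignores the comment lines that start with "*". It assumes a POSIX shell and awk (nawk on Solaris). The function names and the file names in the usage comment are illustrative.

```shell
#!/bin/sh
# vtoc_entries FILE: print only the partition rows of saved prtvtoc output.
# Comment lines ("*") and blank lines are dropped, whitespace is normalized,
# and the optional Mount Directory column is ignored on purpose.
vtoc_entries() {
    awk '$1 !~ /^\*/ && NF > 0 { print $1, $2, $3, $4, $5, $6 }' "$1"
}

# same_vtoc FILE_A FILE_B: succeed if both files describe the same partitions.
same_vtoc() {
    [ "$(vtoc_entries "$1")" = "$(vtoc_entries "$2")" ]
}

# Typical use (commented out; requires a Solaris host):
#   prtvtoc /dev/rdsk/c1t0d0s2 > /var/tmp/vtoc.root
#   prtvtoc /dev/rdsk/c1t2d0s2 > /var/tmp/vtoc.new
#   same_vtoc /var/tmp/vtoc.root /var/tmp/vtoc.new && echo "VTOCs match"
```

The mount directory is left out of the comparison because only the root disk's slices are mounted.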

11. Create the state database replicas on the replaced disk.

# metadb -a -f -c 3 /dev/dsk/    (slice s7, which we have reserved for the state database replicas)

hans-test# metadb -i

flags first blk block count
a m p luo 16 8192 /dev/dsk/c1t0d0s7
a p luo 8208 8192 /dev/dsk/c1t0d0s7
a p luo 16400 8192 /dev/dsk/c1t0d0s7
a p luo 16 8192 /dev/dsk/c1t2d0s7
a p luo 8208 8192 /dev/dsk/c1t2d0s7
a p luo 16400 8192 /dev/dsk/c1t2d0s7
r - replica does not have device relocation information
o - replica active prior to last mddb configuration change
u - replica is up to date
l - locator for this replica was read successfully
c - replica's location was in /etc/lvm/mddb.cf
p - replica's location was patched in kernel
m - replica is master, this is replica selected as input
W - replica has device write errors
a - replica is active, commits are occurring to this replica
M - replica had problem with master blocks
D - replica had problem with data blocks
F - replica had format problems
S - replica is too small to hold current data base
R - replica had device read errors

12. Recreate the submirrors, reattach them, and wait until all mirrors have resynced.

# metainit 1 1
# metattach
# metastat -ac OR metastat -t    (to check the resync status)

hans-test# metainit d23 1 1 c1t2d0s3
d23: Concat/Stripe is setup
hans-test# metainit d20 1 1 c1t2d0s0
d20: Concat/Stripe is setup
hans-test# metainit d21 1 1 c1t2d0s1
d21: Concat/Stripe is setup

hans-test# metattach d3 d23; metattach d0 d20; metattach d1 d21
d3: submirror d23 is attached
d0: submirror d20 is attached
d1: submirror d21 is attached

hans-test# metastat -ac
d3 m 517MB d13 d23
d13 s 517MB c1t0d0s3
d23 s 517MB c1t2d0s3
d1 m 1.0GB d11 d21
d11 s 1.0GB c1t0d0s1
d21 s 1.0GB c1t2d0s1
d0 m 7.8GB d10 d20
d10 s 7.8GB c1t0d0s0
d20 s 7.8GB c1t2d0s0

13. It is now safe to run the metadevadm command to update the device ID of the new disk.

# metadevadm -u
metadevadm updates the metadevice information.
hans-test# metadevadm -u /dev/dsk/c1t2d0