Stuff I Liked or Use

Take it easy, but take it.

Splitting a Sun 6140 array volume in two with Sun Cluster 3.x and samfs


The goal here is to take this:

dbnode51# df -h /global/u02
Filesystem             size   used  avail capacity  Mounted on
archives               360G   7.3G   353G     3%    /global/u02

and split it into u02 and u03, each 125G in size.

So it appears I am forced to back up the filesystem, remove the volume on the 6140, make two smaller ones, and recreate everything.

I checked with Sun to see if there’s an easier way, but reportedly there’s no way to shrink an existing volume, or to break off the mirror halves, create my new setup on the old half, and then copy data/resync.  Much like the old Raid Manager stuff.  Not sure why they think all this extra management stuff on the 6140 is worth it, but I guess it’s good for sales.

Anyway, here we go.

First, back up the QFS volumes with qfsdump:

Usage: qfsdump [-dHqTvD] [-b size] [-B size] [-I include_dir] [-X excluded_dir] -f dump_file [file...]

and do it again with tar, since I am not as familiar with qfsdump as I am with tar.  You know, just in case.
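
Here is a minimal sketch of that double backup. It assumes a hypothetical /backup directory with enough free space. The DRY_RUN variable and the backup_fs function are my own illustration, not part of SAM-QFS. Only the -f flag from the usage line above is used, and both tools are run from inside the mount point so the paths in the archives stay relative.

```shell
#!/bin/sh
# Sketch only: back up one mounted QFS filesystem twice, with qfsdump and tar.
# DRY_RUN, backup_fs, and the /backup destination are illustrative assumptions.

DRY_RUN=${DRY_RUN:-1}           # default: print the commands instead of running them

backup_fs() {
    mnt=$1                      # mount point, e.g. /global/u02
    dest=$2                     # where the dump files go, e.g. /backup (hypothetical)
    name=$(basename "$mnt")

    if [ "$DRY_RUN" = "1" ]; then
        echo "cd $mnt && qfsdump -f $dest/$name.qfsdump"
        echo "cd $mnt && tar cf $dest/$name.tar ."
        return 0
    fi

    # Subshells keep the caller's working directory unchanged.
    ( cd "$mnt" && qfsdump -f "$dest/$name.qfsdump" ) || return 1
    ( cd "$mnt" && tar cf "$dest/$name.tar" . )
}
```

Calling backup_fs /global/u02 /backup with the default DRY_RUN=1 only prints the two commands. Set DRY_RUN=0 once the destination looks right.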

Take resource groups offline for anything that uses the filesystems in question.  See Sun Cluster 3.x reference page to see how to do that.  Then unmount your old filesystem.

(For me, the next issue was that I couldn’t manage the array via CAM or sscs. Neither controller would talk to me over the network, or respond to pings, telnet, etc. This has happened on and off and Sun hasn’t figured it out yet. Reportedly there is a Sun procedure (http://forums.sun.com/thread.jspa?threadID=5301953) to reset the controllers without restarting the whole array, but it requires a ‘secret’ password. We’ll just restart the array for now and return to our story. You can also set up in-band management, over the Fibre instead of the network; expect a post on that once I get it working. Thin docs on that too.)

Here’s the info on my old volume from CAM.

Name: archives
 World Wide Name: 60:0A:0B:80:00:2A:0D:FE:00:00:0B:78:48:00:85:7E
 Type: Standard
 Capacity: 408.000 GB
 Virtual Disk: 3
 Pool: Testpool
 RAID Level: 1
[snip]

So, click through CAM and delete the old volume, create two new ones on my old virtual disk (3), map them to our host group, and watch them get assigned LUN 5 for the new, smaller archives and LUN 6 for the new archives3 (to be mounted on /global/u03).
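
It is worth noting the World Wide Name of each volume while you are in CAM, because that is how you tell the LUNs apart on the Solaris side: the cXtWWNd0 device names that scdidadm lists embed the WWN with the colons stripped. A small sketch of that lookup follows. The names wwn_to_devfrag and find_did are hypothetical helpers of mine, and the listing is assumed to be saved scdidadm -l output. On Solaris you may need nawk in place of awk.

```shell
#!/bin/sh
# Sketch: turn a WWN as CAM prints it (colon-separated) into the form that
# appears inside the cXtWWNd0 device name, then look it up in a DID listing.
# wwn_to_devfrag and find_did are hypothetical helper names.

wwn_to_devfrag() {
    # 60:0A:0B:... -> 600A0B... (strip colons, force hex digits to upper case)
    printf '%s\n' "$1" | tr -d ':' | tr 'abcdef' 'ABCDEF'
}

find_did() {
    wwn=$1
    listing=$2                  # file holding saved `scdidadm -l` output
    frag=$(wwn_to_devfrag "$wwn")
    # Print the DID path (third field) of each line whose device name holds the WWN.
    awk -v frag="$frag" 'index($2, frag) { print $3 }' "$listing"
}
```

For example, after scdidadm -l > /tmp/did.before, running find_did 60:0A:0B:80:00:2A:0D:FE:00:00:0B:78:48:00:85:7E /tmp/did.before looks up the old archives volume.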

Next, time for scdidadm.  From before:

# scdidadm -l
4        dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B6E48008436d0 /dev/did/rdsk/d4
5        dbnode51:/dev/rdsk/c0t0d0    /dev/did/rdsk/d5
6        dbnode51:/dev/rdsk/c5t2000001862EF4E9Cd0 /dev/did/rdsk/d6
7        dbnode51:/dev/rdsk/c5t2000001862F7F654d0 /dev/did/rdsk/d7
8        dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B784800857Ed0 /dev/did/rdsk/d8
9        dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B7548008520d0 /dev/did/rdsk/d9
10       dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B704800847Cd0 /dev/did/rdsk/d10
11       dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B6C4800838Ad0 /dev/did/rdsk/d11
12       dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B73480084BEd0 /dev/did/rdsk/d12
8186     dbnode51:/dev/rmt/2          /dev/did/rmt/6
8187     dbnode51:/dev/rmt/1          /dev/did/rmt/5
8188     dbnode51:/dev/rmt/0          /dev/did/rmt/4

Now run the following on all nodes:

devfsadm && scdidadm -r && scdidadm -C
scgdevs

Depending on your exact circumstances, you may/may not need -C.  These steps, simply put, rescan devices (devfsadm), rescan and make new DID devices (scdidadm -r), clean out old DID devices (scdidadm -C), update cluster global devices (scgdevs).

If you’re not removing, don’t use -C.  If you stumbled across this looking for disk replacement commands, don’t use scdidadm -C to clear, use -R d## instead.

After running the above, you get

4        dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B6E48008436d0 /dev/did/rdsk/d4
5        dbnode51:/dev/rdsk/c0t0d0    /dev/did/rdsk/d5
7        dbnode51:/dev/rdsk/c5t2000001862F7F654d0 /dev/did/rdsk/d7
9        dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B7548008520d0 /dev/did/rdsk/d9
10       dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B704800847Cd0 /dev/did/rdsk/d10
11       dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B6C4800838Ad0 /dev/did/rdsk/d11
12       dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B73480084BEd0 /dev/did/rdsk/d12
13       dbnode51:/dev/rdsk/c5t600A0B80002A0E2A00000BE548BE67CFd0 /dev/did/rdsk/d13
14       dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000CAE48BE686Ed0 /dev/did/rdsk/d14
8186     dbnode51:/dev/rmt/2          /dev/did/rmt/6
8187     dbnode51:/dev/rmt/1          /dev/did/rmt/5
8188     dbnode51:/dev/rmt/0          /dev/did/rmt/4

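DID instance 8 is gone (the old archives; its device name carries the WWN CAM showed, and it is the d8 in the old mcf below), d6 has also dropped out of the listing, and 13/14 are added (new archives and archives3. Compare the WWNs against CAM to see which is which under your naming convention.)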
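
To see at a glance what the rescan changed, one option is to save the scdidadm -l output before and after and compare the two. The sketch below does that with awk. The did_diff function and the file names are my own assumptions, and on Solaris you may need nawk in place of awk.

```shell
#!/bin/sh
# Sketch: compare two saved `scdidadm -l` listings and report which DID
# paths disappeared and which are new. did_diff is a hypothetical helper.

did_diff() {
    before=$1                   # listing saved before the rescan
    after=$2                    # listing saved after the rescan
    # Field 2 is node:/dev/rdsk/..., field 3 is the DID path, e.g. /dev/did/rdsk/d8
    awk '
        FILENAME == ARGV[1] { old[$3] = $2; next }   # first file: remember old paths
        { new[$3] = $2 }                             # second file: remember new paths
        END {
            for (d in old) if (!(d in new)) print "removed", d, old[d]
            for (d in new) if (!(d in old)) print "added", d, new[d]
        }
    ' "$before" "$after" | LC_ALL=C sort
}
```

Typical use would be scdidadm -l > /tmp/did.before, then the devfsadm/scdidadm/scgdevs steps, then scdidadm -l > /tmp/did.after and did_diff /tmp/did.before /tmp/did.after.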

Now that the devices are there, let’s get the filesystems ready.

First, update the master config file for samfs.

From before:

# cat /etc/opt/SUNWsamfs/mcf
archives    20    ma    archives    on     shared
/dev/did/dsk/d8s0    21    mm    archives    on
/dev/did/dsk/d8s6    22    mr    archives    on
database    40    ma    database    on    shared
/dev/did/dsk/d4s0    41    mm    database    on
/dev/did/dsk/d4s6    42    mr    database    on

Each filesystem wants two devices, one for its metadata (mm) and one for the actual data (mr).  I would imagine you can split these elsewhere, but this is how SunPS had set ours up.

So, let’s check out the existing filesystem ‘database’ (one of the samfs volumes we didn’t wipe), also known as dbnode51:/dev/rdsk/c5t600A0B80002A0DFE00000B6E48008436d0 /dev/did/rdsk/d4, and see what the partition table looks like on there.

Part      Tag    Flag     First Sector         Size         Last Sector
0        usr    wm                34       40.00GB          83886113
1 unassigned    wm                 0           0               0
2 unassigned    wm                 0           0               0
3 unassigned    wm                 0           0               0
4 unassigned    wm                 0           0               0
5 unassigned    wm                 0           0               0
6        usr    wm          83886114      359.99GB          838844381
8   reserved    wm         838844382        8.00MB          838860765

So let’s slice our new d13 and d14 similarly.  Go into format, find the new devices based on the WWN from CAM.  (Or just format the ones that need labeling, those’ll be your new ones. ; )

Part      Tag    Flag     Cylinders         Size            Blocks
0       root    wm       0 -  3839       15.00GB    (3840/0/0)   31457280
1       swap    wu       0                0         (0/0/0)             0
2     backup    wu       0 - 37117      144.99GB    (37118/0/0) 304070656
3 unassigned    wm       0                0         (0/0/0)             0
4 unassigned    wm       0                0         (0/0/0)             0
5 unassigned    wm       0                0         (0/0/0)             0
6        usr    wm    3840 - 37115      129.98GB    (33276/0/0) 272596992
7 unassigned    wm   37116 - 37117        8.00MB    (2/0/0)         16384

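You will notice that the old table doesn’t have slice 2, and mine does. The old disk carries an EFI label (sector-based, with the reserved slice 8), while the new ones got the traditional SMI/VTOC label, where slice 2 conventionally spans the whole disk. Doesn’t appear to matter.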
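
A quick sanity check on the numbers in the new table: each cylinder works out to 8192 blocks of 512 bytes (31457280 / 3840), so the sizes follow directly from the block counts. The sketch below does the conversions. The function names are mine, and 8192 blocks per cylinder is simply what this particular table implies, so check your own. The kilobyte figures are the same ones sammkfs reports further down for the metadata (slice 0) and data (slice 6) devices.

```shell
#!/bin/sh
# Sketch: convert between cylinders, 512-byte blocks, kilobytes, and GB
# for the slice table above. All function names are illustrative.

cyl_to_blocks() {
    # $1 = cylinder count, $2 = blocks per cylinder (8192 in the table above)
    echo $(( $1 * ${2:-8192} ))
}

blocks_to_kb() {
    # Two 512-byte blocks per kilobyte
    echo $(( $1 / 2 ))
}

blocks_to_gb() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b * 512 / (1024 * 1024 * 1024) }'
}
```

For slice 6, cyl_to_blocks 33276 gives 272596992 blocks, blocks_to_kb 272596992 gives 136298496, and blocks_to_gb 272596992 gives 129.98.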

Now update /etc/opt/SUNWsamfs/mcf as follows, on all servers.  Be sure to use your own new DID numbers, not mine.  Don’t re-use the equipment ordinals (the second column: the 40/41/42, etc.).  And watch your mm/mr.

archives        20      ma      archives        on      shared
/dev/did/dsk/d13s0      21      mm      archives        on
/dev/did/dsk/d13s6      22      mr      archives        on
database        40      ma      database        on      shared
/dev/did/dsk/d4s0       41      mm      database        on
/dev/did/dsk/d4s6       42      mr      database        on
archives3       30      ma      archives3       on      shared
/dev/did/dsk/d14s0      31      mm      archives3       on
/dev/did/dsk/d14s6      32      mr      archives3       on
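
Since a re-used ordinal is an easy mistake when copying lines around, a small read-only check can help before running samd config. This is only a sketch: mcf_dup_ordinals is a hypothetical helper of mine, and it looks at nothing but the second column. On Solaris you may need nawk in place of awk.

```shell
#!/bin/sh
# Sketch: list equipment ordinals (second mcf field) that appear more than once.
# mcf_dup_ordinals is a hypothetical helper; it only reads the file.

mcf_dup_ordinals() {
    mcf=${1:-/etc/opt/SUNWsamfs/mcf}
    awk '
        /^[ \t]*#/ || NF < 2 { next }     # skip comments and blank lines
        { count[$2]++ }                   # tally each ordinal
        END { for (o in count) if (count[o] > 1) print o }
    ' "$mcf" | sort -n
}
```

No output means every ordinal is unique. Any number printed is used on more than one line.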

Then, on all servers, run:

# samd config
# cp /etc/opt/SUNWsamfs/hosts.archives /etc/opt/SUNWsamfs/hosts.archives3

Just do these next two on one server that is mounting your filesystem.  The -S is for ’shared’.

# sammkfs -S archives
Building 'archives' will destroy the contents of devices:
                /dev/did/dsk/d13s0
                /dev/did/dsk/d13s6
Do you wish to continue? [y/N]y
total data kilobytes       = 136298496
total data kilobytes free  = 136298432
total meta kilobytes       = 15728640
total meta kilobytes free  = 15727856
# sammkfs -S archives3

Then add archives3 to /etc/vfstab on all servers:

archives3       -       /global/u03     samfs   -       no      shared
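
Because the mcf entry, the hosts.<name> file, and the vfstab line all have to be present on every node, a read-only check like the one below can catch a node that was missed. It is a sketch under my own assumptions: check_shared_fs and its two arguments are hypothetical, and it only looks for family sets of type ma marked shared, as in the mcf above. On Solaris you may need nawk in place of awk.

```shell
#!/bin/sh
# Sketch: for every shared family set in the mcf, check that a hosts.<name>
# file exists and that vfstab has a samfs entry for it. Read-only.
# check_shared_fs and its arguments are hypothetical.

check_shared_fs() {
    cfgdir=${1:-/etc/opt/SUNWsamfs}
    vfstab=${2:-/etc/vfstab}
    rc=0
    # Family-set lines have type "ma" in field 3 and "shared" in field 6.
    for fs in $(awk '$3 == "ma" && $6 == "shared" { print $1 }' "$cfgdir/mcf"); do
        if [ ! -f "$cfgdir/hosts.$fs" ]; then
            echo "missing $cfgdir/hosts.$fs"
            rc=1
        fi
        # vfstab: field 1 is the device (the family set name), field 4 the fs type.
        if ! awk -v fs="$fs" '$1 == fs && $4 == "samfs" { found = 1 } END { exit !found }' "$vfstab"; then
            echo "no samfs entry for $fs in $vfstab"
            rc=1
        fi
    done
    return $rc
}
```

Run with no arguments it checks the default locations. It prints one line per problem and returns non-zero if it found any.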

Next, make mountpoints and mount the new filesystems on all.  Watch your underlying permissions (classic error I make. ; )

# mkdir /global/u03; chmod 755 /global/u03; mount archives3

Then qfsrestore the old files to one of the new filesystems, then rearrange as desired.

Now, just create your new resources for the cluster for the new filesystem, and re-enable your resources.

See Sun Cluster 3.x reference page to see how to do that.

Written by Brad

September 3rd, 2008 at 5:51 pm

Posted in Computing, Solaris
