Replacing an online Host Bus Adapter (HBA) on an M5000 server

This section contains the procedure to replace an online Host Bus Adapter (HBA) when DMP is managing multi-pathing in a Cluster File System (CFS) cluster. The HBA World Wide Port Name (WWPN) changes when the HBA is replaced.

Following are the prerequisites to replace an online Host Bus Adapter (HBA):

Following is the procedure to hot-swap an online Host Bus Adapter on an M5000 server:

To replace an online Host Bus Adapter (HBA) on an M5000 server

  1. Identify the HBAs on the M5000 server using the prtdiag command. Use the first command for Emulex HBAs and the second for QLogic HBAs:
    # /usr/platform/sun4u/sbin/prtdiag -v | grep emlx
    # /usr/platform/sun4u/sbin/prtdiag -v | grep qlc
    
    00  PCIe  0       2, fc20, 10df     119,  0,  0  okay     4,    
    4  SUNW,emlxs-pci10df,fc20        LPe 11002-S
        /pci@0,600000/pci@0/pci@9/SUNW,emlxs@0
    
    00  PCIe  0       2, fc20, 10df     119,  0,  1  okay     4,    
    4  SUNW,emlxs-pci10df,fc20        LPe 11002-S
        /pci@0,600000/pci@0/pci@9/SUNW,emlxs@0,1
    
    00  PCIe  3       2, fc20, 10df       2,  0,  0  okay     4,    
    4  SUNW,emlxs-pci10df,fc20        LPe 11002-S
        /pci@3,700000/SUNW,emlxs@0
    
    00  PCIe  3       2, fc20, 10df       2,  0,  1  okay     4,    
    4  SUNW,emlxs-pci10df,fc20        LPe 11002-S
        /pci@3,700000/SUNW,emlxs@0,1
    
    
  2. Identify the HBA that you want to replace, and its WWPN(s), using the cfgadm and luxadm commands.

    To identify the HBA:

    # cfgadm -al | grep -i fibre 
    iou#0-pci#1 fibre/hp connected configured ok
    
    iou#0-pci#4 fibre/hp connected configured ok
    

    To list all HBAs:

    # luxadm -e port
    /devices/pci@0,600000/pci@0/pci@9/SUNW,emlxs@0/fp@0,0:devctl       NOT CONNECTED
    /devices/pci@0,600000/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0:devctl     CONNECTED
    /devices/pci@3,700000/SUNW,emlxs@0/fp@0,0:devctl                   NOT CONNECTED
    /devices/pci@3,700000/SUNW,emlxs@0,1/fp@0,0:devctl                 CONNECTED
     
    

    Dump the port map for the selected HBA port to get its WWPN. The row marked Host Bus Adapter describes the local HBA port:

    # luxadm -e dump_map /devices/pci@0,600000/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0:devctl
    0    304700  0         203600a0b847900c 200600a0b847900c 0x0  (Disk device)
    1    30a800  0         20220002ac00065f 2ff70002ac00065f 0x0  (Disk device)
    2    30a900  0         21220002ac00065f 2ff70002ac00065f 0x0  (Disk device)
    3    560500  0         10000000c97c3c2f 20000000c97c3c2f 0x1f (Unknown Type)
    4    560700  0         10000000c97c9557 20000000c97c9557 0x1f (Unknown Type)
    5    560b00  0         10000000c97c34b5 20000000c97c34b5 0x1f (Unknown Type)
    6    560900  0         10000000c973149f 20000000c973149f 0x1f (Unknown Type,Host Bus Adapter)
    

    Alternatively, you can run the Solaris fcinfo hba-port command to get the WWPN(s) for the HBA ports.

  3. Ensure you have a compatible spare HBA for hot-swap.
  4. Stop the I/O operations on the HBA port(s) and disable the DMP subpath(s) for the HBA that you want to replace.
    # vxdmpadm disable ctrl=<ctrl#>
  5. Dynamically unconfigure the HBA in the PCIe slot using the cfgadm command.
    # cfgadm -c unconfigure iou#0-pci#1

    Check the console messages to determine whether the cfgadm command succeeded.

    If the cfgadm command is unsuccessful, proceed to troubleshooting using the server hardware documentation. Check the Solaris 10 patch level recommended for dynamic reconfiguration operations and contact Oracle support for further assistance.

    console messages
    
    Oct 24 16:21:44 m5000sb0 pcihp: NOTICE: pcihp (pxb_plx2): card is removed from the slot iou#0-pci#1
    
    
  6. Verify that the HBA card that you unconfigured in step 5 is no longer in the configuration using the following command:
    # cfgadm -al | grep -i fibre
    iou#0-pci#4 fibre/hp connected configured ok
    
  7. Mark the fiber cable(s).
  8. Remove the fiber cable(s) and the HBA that you must replace.

    Note:

    You can refer to the HBA replacement procedures in SPARC Enterprise M4000/M5000/M8000/M9000 Servers Dynamic Reconfiguration (DR) User's Guide for more information.

  9. Replace it with a new compatible HBA of similar type in the same slot.

    The reinserted card shows up as follows:

    console messages
    
    iou#0-pci#1 unknown disconnected unconfigured unknown
    
    
  10. Run the following command to bring the replaced HBA back into the configuration.
    # cfgadm -c configure iou#0-pci#1
    console messages
    
    Oct 24 16:21:57 m5000sb0 pcihp: NOTICE: pcihp (pxb_plx2): card is inserted in the slot iou#0-pci#1 (pci dev 0)
    
    
  11. Verify that the reinserted HBA is in the configuration using the cfgadm command:
    # cfgadm -al | grep -i fibre
    iou#0-pci#1 fibre/hp connected configured ok <====
    
    iou#0-pci#4 fibre/hp connected configured ok
    
    
  12. Modify fabric zoning to include the replaced HBA WWPN(s).
  13. Enable LUN security on storage for the new WWPN(s).
  14. Perform an operating system device scan to re-discover the LUNs using the cfgadm command:
    # cfgadm -c configure c3
  15. Clean up the device tree for old LUNs.
    # devfsadm -Cv

    Note:

    HBA replacement sometimes creates new device entries. Clean up the old LUN devices only when new devices have been created.

  16. If VxVM / Dynamic Multi-pathing (DMP) does not show a ghost path for the removed HBA path, enable the path using the vxdmpadm command. This performs the device scan for that particular HBA subpath(s).
    # vxdmpadm enable ctrl=<ctrl#>
  17. Verify that I/O operations are scheduled on that path.

    If I/O operations are running correctly on all paths, then the dynamic HBA replacement operation is complete.
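
The WWPN lookup in step 2 can be scripted when several ports have to be checked. The following is a minimal sketch, not part of the documented procedure. The function names connected_ports and hba_port_wwpn are hypothetical, and the sketch assumes that luxadm prints each port path with its state on one line and each dump_map row on one line, with the port WWN in the fourth column. Both functions read the command output on standard input, so they can be tried against saved output before being used on a live system.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of any product).
# Both read command output on stdin.

# Print the device path of every port reported as CONNECTED.
# Assumes "luxadm -e port" style input: <path> <state> on one line.
# Lines ending in "NOT CONNECTED" are skipped because field 2 is "NOT".
connected_ports() {
    awk '$2 == "CONNECTED" { print $1 }'
}

# Print the port WWN of the local HBA port.
# Assumes "luxadm -e dump_map" style input, where the row for the local
# HBA is tagged "Host Bus Adapter" and column 4 holds the port WWN.
hba_port_wwpn() {
    awk '/Host Bus Adapter/ { print $4 }'
}
```

On the server, the helpers would be used as filters, for example `luxadm -e port | connected_ports`, followed by `luxadm -e dump_map <path> | hba_port_wwpn` for each path that is returned. Recording the WWPN before step 5 and again after step 11 gives the old and new values needed for the zoning and LUN security changes in steps 12 and 13.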