Tuesday, May 21, 2013

Resolving hard disk drive problems

Note: The RAS storage node menus can be useful for identifying the correct disk for maintenance. For additional information, refer to the IBM SONAS RAS menus.
 
  • Follow the suggested actions for a Symptom in the order in which they are listed in the Action column until the problem is solved.
  • All components are field replaceable units (FRUs), and all steps must be performed only by a trained service technician.

A hard disk drive has failed and the associated amber hard disk drive status LED is lit.
Replace the failed hard disk drive. See Removing a hot-swap hard disk drive and Installing a hot-swap hard disk drive.

An installed hard disk drive is not recognized.


  1. Observe the associated amber hard disk drive status LED. If the LED is lit, it indicates a drive fault.
  2. If the LED is lit, remove the drive from the bay, wait 45 seconds, then reinsert the drive, ensuring that the drive assembly connects to the hard disk drive backplane.
  3. Observe the associated green hard disk drive activity LED and the amber status LED:
    • If the green activity LED is flashing and the amber status LED is not lit, the drive is recognized by the controller and is working correctly. Run the DSA hard disk drive test to determine whether the drive is detected.
    • If the green activity LED is flashing and the amber status LED is flashing slowly, the drive is recognized by the controller and is rebuilding.
    • If neither LED is lit or flashing, check the hard disk drive backplane (go to step 4).
    • If the green activity LED is flashing and the amber status LED is lit, replace the drive. If the activity of the LEDs remains the same, go to step 4. If the activity of the LEDs changes, return to step 1.
  4. Ensure that the hard disk drive backplane is correctly seated. When it is correctly seated, the drive assemblies correctly connect to the backplane without bowing or causing movement of the backplane.
  5. Move the hard disk drives to different bays to determine whether the drive or the backplane is at fault.
  6. Re-seat the backplane power cable and repeat steps 1 through 3.
  7. Re-seat the backplane signal cable and repeat steps 1 through 3.
  8. Suspect the backplane signal cable or the backplane:
    • If the server has eight hot-swap bays:
      1. Replace the affected backplane signal cable.
      2. Replace the affected backplane.
    • If the server has 12 hot-swap bays:
      1. Replace the backplane signal cable.
      2. Replace the backplane.
      3. Replace the SAS expander card.
  9. Run the DSA tests for the SAS controller and hard disk drives:
    • If the controller passes the test but the drives are not recognized, replace the backplane signal cable and run the tests again.
    • If the drives are still not recognized, replace the backplane.
    • If the controller fails the test, disconnect the backplane signal cable from the controller and run the tests again.
    • If the controller still fails the test, replace the controller.
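
The LED combinations in step 3 amount to a small lookup. The sketch below restates them as a function and is an illustration only: the names `Led` and `next_action` are hypothetical and are not part of DSA or any IBM tool, and the LED states would have to be read by eye and entered by hand.

```python
from enum import Enum


class Led(Enum):
    """Observed state of one drive LED (hypothetical helper type)."""
    OFF = "off"
    ON = "lit"
    FLASHING = "flashing"
    SLOW_FLASH = "flashing slowly"


def next_action(green: Led, amber: Led) -> str:
    """Return the step 3 action for an observed green/amber LED pair."""
    if green is Led.FLASHING and amber is Led.OFF:
        return "recognized and working; run the DSA hard disk drive test"
    if green is Led.FLASHING and amber is Led.SLOW_FLASH:
        return "recognized and rebuilding"
    if green is Led.OFF and amber is Led.OFF:
        return "check the hard disk drive backplane (step 4)"
    if green is Led.FLASHING and amber is Led.ON:
        return "replace the drive; if the LEDs do not change, go to step 4"
    # Step 3 lists no action for any other combination.
    return "combination not covered by step 3"


if __name__ == "__main__":
    print(next_action(Led.FLASHING, Led.SLOW_FLASH))
```

Combinations that step 3 does not list fall through to the final return, so the sketch does not guess at actions the procedure does not give.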


Multiple hard disk drives fail.

Ensure that the hard disk drive, SAS RAID controller, and server device drivers and firmware are at the latest level.
Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, check whether the latest code version is supported before you update the code. Refer to Upgrade provider information.

Multiple hard disk drives are offline.

Review the storage subsystem logs for indications of problems within the storage subsystem, such as backplane or cable problems.

A replacement hard disk drive does not rebuild.

  1. Ensure that the hard disk drive is recognized by the controller (the green hard disk drive activity LED is flashing).
  2. Review the SAS RAID controller documentation to determine the correct configuration parameters and settings.

A green hard disk drive activity LED does not accurately represent the actual state of the associated drive.

  1. If the green hard disk drive activity LED does not flash when the drive is in use, run the DSA disk drive test. Refer to the "Diagnostics" or "Running the diagnostic programs" section in Troubleshooting the System x3650 server.
  2. Take the action that matches the test result:
    • If the drive passes the test, replace the backplane.
    • If the drive fails the test, replace the drive.

An amber hard disk drive status LED does not accurately represent the actual state of the associated drive.

If the amber hard disk drive LED and the RAID controller software do not indicate the same status for the drive, complete the following steps:
  1. Turn off the server.
  2. Re-seat the SAS controller.
  3. Re-seat the backplane signal cable, backplane power cable, and SAS expander card (if the server has 12 drive bays).
  4. Re-seat the hard disk drive.
  5. Turn on the server and observe the activity of the hard disk drive LEDs.
