Hard Disk Health

Discussion in 'General Hardware' started by bobsalot, Jul 12, 2008.

  1. bobsalot

    bobsalot Boooooooooooooooooooom

    Messages:
    1,584
    Location:
    Broadstairs, UK
    I've seen a drop in my disk health on Hard Disk Sentinel over the past couple of days. Any explanation?
     

    Attached Files:

    • disk.jpg (125.9 KB)
  2. loaderbull

    loaderbull OSNN Junkie

    Messages:
    52
    Location:
    Midlands, UK
    Hi Bobsalot,

    I notice it also reports 5 bad sectors on your disk. Have you run Disk Check or something similar to try to resolve this? Maybe the drive is getting towards the end of its life, or is just faulty. I don't know how old it is, though.

    Let us know.
     
  3. X-Istence

    X-Istence * Political User

    Messages:
    6,498
    Location:
    USA
    Post the actual SMART data from the device rather than a screenshot of some program.

    Code:
    smartctl version 5.38 [i386-portbld-freebsd7.0] Copyright (C) 2002-8 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/
    
    === START OF INFORMATION SECTION ===
    Model Family:     Western Digital Caviar family
    Device Model:     WDC WD400BB-53AUA1
    Serial Number:    WD-WMA6R2250318
    Firmware Version: 18.20D18
    User Capacity:    40,020,664,320 bytes
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   5
    ATA Standard is:  Exact ATA specification draft version not indicated
    Local Time is:    Sat Jul 12 10:33:40 2008 MST
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x82)	Offline data collection activity
    					was completed without error.
    					Auto Offline Data Collection: Enabled.
    Self-test execution status:      (   0)	The previous self-test routine completed
    					without error or no self-test has ever 
    					been run.
    Total time to complete Offline 
    data collection: 		 (2040) seconds.
    Offline data collection
    capabilities: 			 (0x1b) SMART execute Offline immediate.
    					Auto Offline data collection on/off support.
    					Suspend Offline collection upon new
    					command.
    					Offline surface scan supported.
    					Self-test supported.
    					No Conveyance Self-test supported.
    					No Selective Self-test supported.
    SMART capabilities:            (0x0003)	Saves SMART data before entering
    					power-saving mode.
    					Supports SMART auto save timer.
    Error logging capability:        (0x01)	Error logging supported.
    					No General Purpose Logging support.
    Short self-test routine 
    recommended polling time: 	 (   2) minutes.
    Extended self-test routine
    recommended polling time: 	 (  32) minutes.
    
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   200   199   051    Pre-fail  Always       -       0
      3 Spin_Up_Time            0x0007   120   102   021    Pre-fail  Always       -       2833
      4 Start_Stop_Count        0x0032   097   097   040    Old_age   Always       -       3797
      5 Reallocated_Sector_Ct   0x0032   200   200   112    Old_age   Always       -       0
      7 Seek_Error_Rate         0x000b   100   253   051    Pre-fail  Always       -       0
      9 Power_On_Hours          0x0032   040   040   000    Old_age   Always       -       44525
     10 Spin_Retry_Count        0x0013   098   095   051    Pre-fail  Always       -       5
     11 Calibration_Retry_Count 0x0013   100   100   051    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2936
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always       -       0
    199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       119
    200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0
    
    SMART Error Log Version: 1
    ATA Error Count: 2
    	CR = Command Register [HEX]
    	FR = Features Register [HEX]
    	SC = Sector Count Register [HEX]
    	SN = Sector Number Register [HEX]
    	CL = Cylinder Low Register [HEX]
    	CH = Cylinder High Register [HEX]
    	DH = Device/Head Register [HEX]
    	DC = Device Command Register [HEX]
    	ER = Error register [HEX]
    	ST = Status register [HEX]
    Powered_Up_Time is measured from power on, and printed as
    DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
    SS=sec, and sss=millisec. It "wraps" after 49.710 days.
    
    Error 2 occurred at disk power-on lifetime: 221 hours (9 days + 5 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 08 3f 00 5e e0  Error: UNC 8 sectors at LBA = 0x005e003f = 6160447
    
      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      c8 00 08 3f 00 5e e0 00      00:00:27.850  READ DMA
      c8 00 08 3f 00 5e e0 00      00:00:23.950  READ DMA
      c8 00 08 47 00 5e e0 00      00:00:23.900  READ DMA
      c8 00 08 47 00 5e e0 00      00:00:20.400  READ DMA
      c8 00 08 37 00 5e e0 00      00:00:20.400  READ DMA
    
    Error 1 occurred at disk power-on lifetime: 221 hours (9 days + 5 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 08 3f 00 5e e0  Error: UNC 8 sectors at LBA = 0x005e003f = 6160447
    
      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      c8 00 08 3f 00 5e e0 00      00:00:23.950  READ DMA
      c8 00 08 47 00 5e e0 00      00:00:23.900  READ DMA
      c8 00 08 47 00 5e e0 00      00:00:20.400  READ DMA
      c8 00 08 37 00 5e e0 00      00:00:20.400  READ DMA
      c8 00 08 2f 00 5e e0 00      00:00:20.400  READ DMA
    
    SMART Self-test log structure revision number 1
    No self-tests have been logged.  [To run self-tests, use: smartctl -t]
    
    
    Device does not support Selective Self Tests/Logging
    That is sample output from the tool that I personally use (smartctl). You should be able to get something along these lines as well. Then you can look at the attribute table and see what is wrong.

    Code:
    5 Reallocated_Sector_Ct   0x0032   200   200   112    Old_age   Always       -       0
    
    This attribute is of primary concern. A non-zero raw value means parts of the drive are failing and it is silently relocating bad sectors. Eventually it will run out of spare sectors, at which point you will start to see data loss.

    Code:
      9 Power_On_Hours          0x0032   040   040   000    Old_age   Always       -       44525
    
    Power-on hours is the next thing to look at: how long the drive has been powered up and in use. In my example:

    44 525 hours = 5.07939208 years
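
    A quick sketch of that arithmetic, in case you want to redo it for your own drive. The year length of 365.2422 days is my assumption (it is what reproduces the figure above); using a flat 365 days gives a slightly larger number.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365.2422  # assumption: mean tropical year


def power_on_years(hours: int) -> float:
    """Convert a Power_On_Hours raw value into years of powered-on time."""
    return hours / (HOURS_PER_DAY * DAYS_PER_YEAR)


# Raw value of attribute 9 from the Western Digital example above.
print(f"{power_on_years(44525):.2f} years")
```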

    Code:
    12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2936
    
    This is how many times the drive has been turned off and back on. Each start-up wears the drive down. When the drive is powered off, the heads literally sit on the surface of the platters. When it powers on, it starts spinning the platters first, which creates a cushion of air underneath the heads, and then moves them across the surface. Each start is a shock to the heads, which is wear and tear.


    Next up, you want to look at the drive errors that SMART has recorded: what they were and how long ago they happened. The log mainly records transactions that failed when trying to talk to the host OS, which is really important as well, since it could mean the actual hard drive chipset is failing.
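
    The first drive I posted has two such entries. As a rough illustration of how to read one, the LBA that smartctl prints can be rebuilt from the registers it lists, assuming 28-bit LBA addressing (the low nibble of DH holds the top four bits). The function name is made up for this example.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine the ATA task-file registers into a 28-bit LBA.

    SN holds bits 0-7, CL bits 8-15, CH bits 16-23, and the low
    nibble of DH holds bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Registers from the error entries in the first log: SN=3f CL=00 CH=5e DH=e0
lba = lba28_from_registers(0x3F, 0x00, 0x5E, 0xE0)
print(hex(lba), lba)
```

    That gives the same LBA smartctl reports for the UNC error (6160447).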

    Code:
    smartctl version 5.38 [i386-portbld-freebsd7.0] Copyright (C) 2002-8 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/
    
    === START OF INFORMATION SECTION ===
    Model Family:     Maxtor DiamondMax 10 family (ATA/133 and SATA/150)
    Device Model:     Maxtor 6L100P0
    Serial Number:    L25V48LG
    Firmware Version: BAJ41G20
    User Capacity:    100,256,292,864 bytes
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   7
    ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
    Local Time is:    Sat Jul 12 10:38:10 2008 MST
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x80)	Offline data collection activity
    					was never started.
    					Auto Offline Data Collection: Enabled.
    Self-test execution status:      (   0)	The previous self-test routine completed
    					without error or no self-test has ever 
    					been run.
    Total time to complete Offline 
    data collection: 		 ( 841) seconds.
    Offline data collection
    capabilities: 			 (0x5b) SMART execute Offline immediate.
    					Auto Offline data collection on/off support.
    					Suspend Offline collection upon new
    					command.
    					Offline surface scan supported.
    					Self-test supported.
    					No Conveyance Self-test supported.
    					Selective Self-test supported.
    SMART capabilities:            (0x0003)	Saves SMART data before entering
    					power-saving mode.
    					Supports SMART auto save timer.
    Error logging capability:        (0x01)	Error logging supported.
    					General Purpose Logging supported.
    Short self-test routine 
    recommended polling time: 	 (   2) minutes.
    Extended self-test routine
    recommended polling time: 	 (  41) minutes.
    SCT capabilities: 	       (0x0021)	SCT Status supported.
    					SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      3 Spin_Up_Time            0x0027   226   225   063    Pre-fail  Always       -       10028
      4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always       -       90
      5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always       -       0
      6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline      -       0
      7 Seek_Error_Rate         0x000a   253   228   000    Old_age   Always       -       0
      8 Seek_Time_Performance   0x0027   248   243   187    Pre-fail  Always       -       58062
      9 Power_On_Minutes        0x0032   218   218   000    Old_age   Always       -       398h+46m
     10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always       -       0
     11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   253   253   000    Old_age   Always       -       197
    192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always       -       0
    193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always       -       0
    194 Temperature_Celsius     0x0032   038   253   000    Old_age   Always       -       40
    195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always       -       8844
    196 Reallocated_Event_Count 0x0008   253   253   000    Old_age   Offline      -       0
    197 Current_Pending_Sector  0x0008   253   253   000    Old_age   Offline      -       0
    198 Offline_Uncorrectable   0x0008   253   253   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline      -       0
    200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always       -       0
    201 Soft_Read_Error_Rate    0x000a   253   252   000    Old_age   Always       -       0
    202 TA_Increase_Count       0x000a   253   252   000    Old_age   Always       -       0
    203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       0
    204 Shock_Count_Write_Opern 0x000a   253   252   000    Old_age   Always       -       0
    205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always       -       0
    207 Spin_High_Current       0x002a   253   252   000    Old_age   Always       -       0
    208 Spin_Buzz               0x002a   253   252   000    Old_age   Always       -       0
    209 Offline_Seek_Performnce 0x0024   240   240   000    Old_age   Offline      -       166
    210 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
    211 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
    212 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
    
    SMART Error Log Version: 1
    No Errors Logged
    
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed without error       00%      1732         -
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.

    This drive reports even more SMART attributes than the previous example I posted. The ones to look for are numbered:

    3, 4, 5, 7, 9, 12, 194, 195 (this one may be high if you have a bad SATA cable, because of transfer errors from the HD to the host; it is not much to worry about, just keep an eye on it if it seems to be growing and check your cables), 196, and 198.
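
    If you would rather not eyeball the table, here is a minimal sketch that pulls just those rows out of saved smartctl output. The regular expression assumes the column layout shown in the dumps above; the function and variable names are invented for the example.

```python
import re

# IDs singled out above as the ones to look at.
WATCH_IDS = {3, 4, 5, 7, 9, 12, 194, 195, 196, 198}

# One attribute row: ID, name, flag, value, worst, thresh, type,
# updated, when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(.+?)\s*$"
)


def watched_attributes(smartctl_text: str) -> dict:
    """Return {id: (name, raw_value)} for the watched IDs found in the text."""
    found = {}
    for line in smartctl_text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(1)) in WATCH_IDS:
            found[int(m.group(1))] = (m.group(2), m.group(9))
    return found


# A few rows copied from the Maxtor dump above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always       -       0
  9 Power_On_Minutes        0x0032   218   218   000    Old_age   Always       -       398h+46m
194 Temperature_Celsius     0x0032   038   253   000    Old_age   Always       -       40
199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline      -       0
"""

for attr_id, (name, raw) in sorted(watched_attributes(SAMPLE).items()):
    print(attr_id, name, raw)
```

    The raw value is kept as a string because some drives report it in forms like 398h+46m rather than a plain number.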
     
  4. bobsalot

    bobsalot Boooooooooooooooooooom

    Messages:
    1,584
    Location:
    Broadstairs, UK
    10/07/2008 09:21:09,#196 Reallocation Event Count 4 -> 5
    10/07/2008 09:21:09,#5 Reallocated Sectors Count 4 -> 5
    04/07/2008 08:37:03,#196 Reallocation Event Count 3 -> 4
    04/07/2008 08:37:03,#5 Reallocated Sectors Count 3 -> 4
     
  5. bobsalot

    bobsalot Boooooooooooooooooooom

    Messages:
    1,584
    Location:
    Broadstairs, UK
    1,Raw Read Error Rate,50,100,100,OK,000000000000,0,Enabled
    2,Throughput Performance,50,100,100,OK,000000000000,0,Enabled
    3,Spin Up Time,1,100,100,OK,0000000006C3,0,Enabled
    4,Start/Stop Count,0,100,100,OK (Always passing),00000000042F,0,Enabled
    5,Reallocated Sectors Count,50,100,100,OK,000000000005,0,Enabled
    7,Seek Error Rate,50,100,100,OK,000000000000,0,Enabled
    8,Seek Time Performance,50,100,100,OK,000000000000,0,Enabled
    9,Power-On Time Count,0,98,98,OK (Always passing),0000000003D0,0,Enabled
    10,Spin Retry Count,30,121,100,OK,000000000000,0,Enabled
    12,Drive Power Cycle Count,0,100,100,OK (Always passing),00000000042C,0,Enabled
    192,Power off Retract Cycle,0,100,100,OK (Always passing),000000000049,0,Enabled
    193,Load/Unload Cycle Count,0,100,100,OK (Always passing),000000001998,0,Enabled
    194,HDD Temperature,0,100,100,OK (Always passing),0039000D002A,0,Enabled
    196,Reallocation Event Count,0,100,100,OK (Always passing),000000000005,0,Enabled
    197,Current Pending Sector Count,0,100,100,OK (Always passing),000000000000,0,Enabled
    198,Off-Line Uncorrectable Sector Count,0,100,100,OK (Always passing),000000000000,0,Enabled
    199,Ultra ATA CRC Error Count,0,200,200,OK (Always passing),000000000000,0,Enabled
    220,Disk Shift,0,100,100,OK (Always passing),00000000202F,0,Enabled
    222,Loaded Hours,0,98,98,OK (Always passing),00000000038C,0,Enabled
    223,Load/Unload Retry Count,0,100,100,OK (Always passing),000000000000,0,Enabled
    224,Load Friction,0,100,100,OK (Always passing),000000000000,0,Enabled
    226,Load-in Time,0,100,100,OK (Always passing),00000000013E,0,Enabled
    240,Head Flying Hours,1,100,100,OK,000000000000,0,Enabled
     
  6. LeeJend

    LeeJend Moderator

    Messages:
    5,291
    Location:
    Fort Worth, TX
    Data looks OK. Have you checked the fragmentation level? Also purge any pagefile, restore points and hibernate files by turning them off before you defrag. That will give you a clean, compacted HD.

    Then after defrag turn on the pagefile and restore points again.

    And if you are running Wubi Ubuntu, it makes a big mess in the middle of your HD. Best to install it on a separate partition.
     
  7. X-Istence

    X-Istence * Political User

    Messages:
    6,498
    Location:
    USA
    What does fragmentation level have to do with the health of the hard drive that is reported by SMART? Please explain that to me.

    bobsalot: It looks fine. Just keep a watch on the reallocated sector count; if it continues to climb, I would suggest purchasing a new hard drive and replacing the one that you currently have.
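
    One rough way to keep that watch is to compare two exports like the one you posted. This sketch assumes the comma-separated layout from your post, with the raw value as a hex string in the seventh field; the earlier line is a hypothetical export matching the 4 -> 5 change in your log, and the function names are made up.

```python
def raw_values(csv_text: str) -> dict:
    """Map attribute ID -> raw value, decoding the hex field as an integer."""
    values = {}
    for line in csv_text.strip().splitlines():
        fields = line.strip().split(",")
        # Assumed layout: ID, name, threshold, value, worst, status, raw hex, ...
        values[int(fields[0])] = int(fields[6], 16)
    return values


def climbed(previous: dict, current: dict, attr_id: int = 5) -> bool:
    """True if the raw value of attr_id is higher now than it was before."""
    return current.get(attr_id, 0) > previous.get(attr_id, 0)


# Hypothetical earlier export, consistent with the "4 -> 5" log entry.
EARLIER = "5,Reallocated Sectors Count,50,100,100,OK,000000000004,0,Enabled"
# Line taken from the export posted above.
LATEST = "5,Reallocated Sectors Count,50,100,100,OK,000000000005,0,Enabled"

print(climbed(raw_values(EARLIER), raw_values(LATEST)))
```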
     
  8. bobsalot

    bobsalot Boooooooooooooooooooom

    Messages:
    1,584
    Location:
    Broadstairs, UK
    Can I replace the hard drive on a laptop?
     
  9. X-Istence

    X-Istence * Political User

    Messages:
    6,498
    Location:
    USA
    Depends on the manufacturer. Contact their support if it is still under warranty. If it is not, find out how much it will cost to have them do it, and find out how hard it would be to do yourself. On most laptops you just remove a screw, slide the hard drive out, then pop a new one in.
     
  10. LeeJend

    LeeJend Moderator

    Messages:
    5,291
    Location:
    Fort Worth, TX
    Since we have no idea what HD Sentinel bases its "health" evaluation on, it could just be flagging a performance drop, which can be caused by excessive fragmentation.

    It is worth checking before replacing a disk that SMART has not designated as failing. Since SMART is the criterion manufacturers use to issue RMAs for defective hardware and it is not flagging a failing drive, I suspect the diagnostic tool being used before I suspect the actual hardware.
     
  11. X-Istence

    X-Istence * Political User

    Messages:
    6,498
    Location:
    USA
    It is clearly using the 5 bad sectors as the basis for its health evaluation. Hence the reason it says so. HD Sentinel's website [1] clearly states it monitors the status of a drive's health using the SMART data.

    That being said, just because SMART says it is not failing does not mean squat. I have drives that have suffered head crashes and still show that their SMART status is not failing. The drive does not work anymore, but SMART says it should.

    Defragging the hard drive will NOT change the health that HD Sentinel displays.

    [1] http://www.hdsentinel.com/
     
  12. Tuffgong4

    Tuffgong4 The Donger Need Food!!!! Political User

    Messages:
    2,465
    Location:
    Chicago
    Reviving an older thread... what is the best hard drive diagnostic program? It could be bootable or run through Windows.

    I'm getting a 1TB drive and want to keep 1 of the 2 drives I have in my sig, but I want to see which one is "healthier" or in better condition to use as my system drive...