It's started again...

Discussion in 'General Hardware' started by CrashGate3, Apr 13, 2006.

  1. CrashGate3

    CrashGate3 OSNN Junior Addict

    Messages:
    24
    For *xxxx'x* sake..

    (See here for background and specs http://forum.osnn.net/showthread.php?t=82381 )

    The problem with the USB not working seems to have been fixed by moving everything to a powered hub.

    However, the crashing has returned, again.

    If I try and run too much 3D stuff, the PC freezes with no errors or anything - everything just stops with whatever happened to be on the screen at the time. There's no corruption of the image, no graphical artefacts.

    A complete reinstall of Windows fixes things completely, every time, for about 1 to 2 weeks, and then it starts again (it was a week last Saturday I last reinstalled - the crashes returned last night).

    The last time I reinstalled, I got rid of my original HD and replaced it with a different one.

    This is driving me INSANE. Please help.. *sob*
     
  2. LeeJend

    LeeJend Moderator

    Messages:
    5,291
    Location:
    Fort Worth, TX
    My first reaction is it sounds like your system drive has some flaky sectors on it. As the area in use moves out from the hub after the re-install you finally write important system files on the bad area and start getting lock ups.

    BUT Windows and the CPU exception handling should catch any bad execution code or pointers to invalid RAM and give you a BSOD. Plus your old system worked fine with that HD. I think you are still putting the system on the same HD as before, right?

    So back to the other stuff. As I recall
    -you got rid of the Neo power supply and got a real one. (Neo are no good for high performance systems.)
    -You ran memtest with no failures.
    -You crash during 3dmark06 which could mean video or CPU/MB problems.
    -You didn't want to switch boot drives because of programs you don't have the media to reinstall.

    Do the following in the order that is most convenient for you.

    1) Run sisoft sandra burn-in on just the CPU(s) for an hour or two and record max temp and if it crashes to rule that out.

    2) Run atitool.exe for a couple of hours and monitor temps while waiting for a crash. (It's a free overclocking tool that lets you run a 3D application in a window while monitoring graphics card temps.) I use it for stability tests (never overclock a GPU; life is too short, and they are too fragile).

    If it crashes, call ATI tech support and tell them the system locks up in 3D apps. They will make you go through multiple MB & card driver reinstalls and updates. Eventually, if you get to a senior tech, they might admit to issues with the ABIT and X1900XT, or they may just tell you to RMA the video card.

    3) Move your OS to another HD and do a burn in on the entire drive you had your OS on.

    PS Don't get too wrapped around the "it works fine after a reinstall for a week or so". Random crashes are... random. The system can seem to be fixed and the bug just rears its ugly head again eventually. I spent 4 months chasing crashes from a bad sound card. The only reason I ever found it was that I decided I wanted a better sound card.

    4) If you want to rule out the reinstall do another one but this time install some big programs on the HD early in the process, before doing windows updates. The 3dMarkxx series will do nicely. That will force your OS further out onto the HD surface, possibly into an area with bad sectors. If no failures occur when running the various benchmarks it is probably not the HD.

    5) Really undesirable but - If you still have the old MB/Vid/CPU, put that system back together and see if it has any issues. If it is clean, the HD is ok.

    6) Power quality is always an issue. You could have dirty power coming in from the wall. Get everything else off the branch your PC is powered from. Pop the circuit breaker (CB) that feeds the PC. Go around the house and unplug everything that has lost power. Push the CB back in and go run benchmarks. If it still fails, it's not power quality.

    If you can't isolate to anything else the last suspect is the MB. There is no way to isolate to that except swapping MBs. Preferably with a board with the same chipset from a different manufacturer. As a last resort buy the cheapest POS (ECS, my brand) you can find. If the system runs fine with that RMA the ABIT. When you get the ABIT back sell the ECS (or if you're really pissed sell the ABIT) to get some money back. This may sound expensive but taking the PC into a shop will cost more with no guarantee of success and they may charge you to replace half the stuff in the computer even though it was working fine.
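The elimination order in the post above can be tracked with a small checklist. This is only a rough sketch in Python: the suspect names and the results filled in are illustrative placeholders (assumptions), not outcomes reported in this thread.

```python
# Hypothetical checklist for isolating a freeze by elimination.
# Each suspect maps to the outcome of its test:
#   True  = test ran clean (suspect cleared)
#   False = test reproduced the freeze (suspect implicated)
#   None  = not tested yet


def remaining_suspects(results):
    """Return suspects not yet cleared, implicated ones first."""
    implicated = [name for name, ok in results.items() if ok is False]
    untested = [name for name, ok in results.items() if ok is None]
    return implicated + untested


# Example values only (assumptions for illustration).
results = {
    "RAM (memtest)": True,
    "CPU (burn-in)": None,
    "video card (3D stress test)": None,
    "hard drive (full-surface burn-in)": None,
    "wall power": None,
    "motherboard (swap)": None,
}

for name in remaining_suspects(results):
    print(name)
```

Updating one entry after each test keeps the list of open suspects honest, which matters when a fault can hide for a week or more between freezes.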
     
  3. CrashGate3

    CrashGate3 OSNN Junior Addict

    Messages:
    24
    The whole OS is moved to a different drive - I've taken the original OS drive out, and did a proper format on the replacement one before I reinstalled Windows - this should have marked any bad sectors as bad and not used them, am I right?

    One of the first things I installed after the reformat was 3DMark 06, to see if things worked ok with that - they did, it ran through the whole thing with no problems at all. I'd also been running Quake 4 for hours on full settings with no problems. The problem is exactly the same as I was getting with the old drive.

    I just noticed there's a new set of Catalyst drivers out - I'll try them just to see if it helps.

    EDIT: Nope.

    It seemed to get a bit further on 3DMark06 - it froze during the Arctic Research Station bit, whereas before it would freeze where the monster jumps out of the lake.

    One other thing - I downloaded ATITool, but I can't find anything showing the temperature of the card. Am I missing something?

    It's the fact that it's not *random* crashes that's bugging me - it works perfectly every time for between 1 and 2 weeks and then passes a point at which it freezes every single time I try to use it for games.
     
    Last edited: Apr 14, 2006
  4. CrashGate3

    CrashGate3 OSNN Junior Addict

    Messages:
    24
    I've run ATiTool and SiSoft Sandra using the Burn-in for the CPU and for the RAM for about an hour each and neither has come up with any errors.
     
  5. lancer

    lancer There is no answer! Political User Folding Team

    Messages:
    3,093
    Location:
    FL, USA
    I wonder if it is the v/card drivers you are using.
     
  6. CrashGate3

    CrashGate3 OSNN Junior Addict

    Messages:
    24
    I suppose it couldn't hurt to try different ones - I have the Catalyst 6.4 ones at the moment.

    What's the best alternative?


    I found out that the resolution has no effect on the crash either - 3DMark06 crashes it in about the same place at 640x400 or 1600x1200. I did hear the fan in the card speed up just before it crashed. The annoying thing is that I can't find out how to check the temperature on my card. There wasn't anything on the CD that came with it and I can't find anything on Powercolor's website. Could it be that it hasn't got a temperature sensor? I would have thought that a high-end card like the X1900XT would have one.
     
    Last edited: Apr 14, 2006
  7. CrashGate3

    CrashGate3 OSNN Junior Addict

    Messages:
    24
    Woop! Woop!

    I just installed the Omega drivers and things seem to be going well. It just got all the way through 3DMark06. We shall see what happens in the next few days...
     
  8. LeeJend

    LeeJend Moderator

    Messages:
    5,291
    Location:
    Fort Worth, TX
    Yes. The set clock screen has a little green and black temp graph in the middle left of the screen. It shows GPU temp.

    Click show 3d view to start the 3d window. You will see the GPU temp rise rapidly. Note if you cover up the 3d window the temp drops again because the 3d rendering stops.

    Click settings and there is a pull-down menu on the bar at the top. That's where all the goodies are hidden.

    Select the temperature monitoring page for text temp. details.

    You have a Powercolor card. Those usually come with a really nifty tool set, like what you get with ATI tool. Check your install CD. You might have not installed it.

    As for the Omegas fixing the problem, not likely. They are just the ATI drivers with a different set of default optimizations selected. Remember what I said about random failures. It means just that.

    If the PC works ok all weekend and then starts acting up on Monday I'd start looking at external problems - wall power, radio transmitters, close by businesses with heavy machinery, etc.

    Neat, my GPU just went from 42 C to 68 C and the fan throttled up to where I can hear it.
     
    Last edited: Apr 15, 2006
  9. CrashGate3

    CrashGate3 OSNN Junior Addict

    Messages:
    24
    I don't have a temperature monitoring option on my ATiTools. I just have this:

    [​IMG][​IMG]


    The different drivers do seem to make the crashes less frequent - I can run it longer before it crashes and now it gets through 3DMark06 every time, whereas before it crashed in about the same place every time. It still eventually crashes though, again every time.
     
  10. LeeJend

    LeeJend Moderator

    Messages:
    5,291
    Location:
    Fort Worth, TX
    That is weird: same version and identical appearance to mine, but the temp and fan options are missing. I wonder if it's because you have a PowerColor-branded board.

    Check the PowerColor website and see if they have a BIOS update for the card. You can flash video card BIOSes. Don't flash it unless they say the new BIOS fixes crash issues.

    I looked at the PowerColor website and there was nothing about BIOS updates and the FAQs were 5 years old. I'll see what ATI has to say.

    Nothing useful on the ATI site either.

    Since you seem to be getting improvement with a change in drivers you should submit a trouble ticket to powercolor and see what they have to say. ATI admitted a lot of problems to me on the phone that are not listed on their website. There may be a known bug with work arounds or at least if there is a known bug they may be working on a fix and tell you.

    It took 4 months from when I bought my X800XL until they had drivers and software that were stable with my system. It required changes to my MB, sound and video drivers until everything was acceptable. The X1900 XT is a much newer card. There could be issues.
     
    Last edited: Apr 15, 2006
  11. CrashGate3

    CrashGate3 OSNN Junior Addict

    Messages:
    24
    I tried to register my problem on the PowerColor website and it seems they only offer tech support if you live in the USA or Canada... :mad:

    I definitely won't be buying a PowerColor product ever again...

    EDIT:

    Aha! I just updated my ATITool to the 0.25 beta and it's enabled the temperature and fan controls. I'm going to turn up the fan speed and see if that makes a difference.
     
    Last edited: Apr 15, 2006
  12. RickyC

    RickyC OSNN Addict

    Messages:
    199
    Location:
    Earth
    Have you checked to see if the GFX card fan is spinning?
     
  13. CrashGate3

    CrashGate3 OSNN Junior Addict

    Messages:
    24
    Heh, yes it's definitely spinning. It would have died a loong time ago if it wasn't. Plus it sounds like a helicopter when it spins up.

    I've had a play with the fan controls in ATiTool and speeding them up a bit seems to be working.

    The temperatures seem a bit odd though - it idles at around 48 degrees and goes up to about 56 under load, and the crashes seem to occur if the temperature goes over this. Isn't 56 degrees a bit cold for a card to crash? I thought they could go up to 70 or so with no problems? I can speed up the fan to keep it below this, but it makes a hell of a noise.
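One way to check whether the freezes really line up with a temperature threshold is to log readings during a run and look at the last few before each freeze. A minimal sketch in Python follows. It assumes a hypothetical CSV log with `timestamp,temp_c` columns; ATITool's actual log format is not described in this thread, and the sample readings are made up for illustration.

```python
import csv
import io


def peak_before_freeze(rows, window=5):
    """Return the highest temperature among the last `window` readings."""
    temps = [float(row["temp_c"]) for row in rows]
    if not temps:
        raise ValueError("log is empty")
    return max(temps[-window:])


# Made-up sample log; the last row is the final reading before the freeze.
sample_log = """timestamp,temp_c
12:00:00,48
12:00:10,51
12:00:20,54
12:00:30,56
12:00:40,57
"""

rows = list(csv.DictReader(io.StringIO(sample_log)))
print(peak_before_freeze(rows))
```

If the peak from several separate freezes lands within a degree or two of the same value, that supports the threshold theory. If the peaks are scattered, temperature is probably not the trigger.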
     
  14. Admiral Michael

    Admiral Michael Michaelsoft Systems CEO Folding Team

    Probably the fan is dying and upping the speed is just providing the same airflow as if it were new.

    I think I worded that right. :p