2080Ti and Random Crashes



Long story short, I upgraded my GPU to a shiny new 2080Ti. Apart from the somewhat smaller form factor compared to my 1080Ti I didn't give it much thought, all this whilst going through some really odd image issues in VR (that's another story, from the 2.5.4.x changes). One thing I noticed from the time I installed the 2080Ti is that the backplate got very hot whilst running DCS: 60+ C.

 

Running my old-school settings netted me slightly more eye candy but still the same ole same ole FPS. Alright, so I did some tweaking and experimenting with settings and found my rig could mostly do 90FPS in the Rift on NTTR and Caucasus, and even on PG and Normandy though not as consistently, with some interesting observations (more on those later).

 

So what I observe happening is, for example on the Caucasus map, the CPU would sit at about 17% and the GPU at about 50% at 45FPS, then the CPU would ramp up to 25% and the GPU to 80% and voila, 90FPS.

 

This meant GPU backplate temps rose a few more degrees. My previous 1080Ti ported some of the heat out the back; the new 2080Ti does not (all the 3-fan designs look much of a muchness here), which increases the temperature inside the case. Anyone else think not porting at least some of the heat out the back is a bad idea?

 

The crashing was consistent and brutal: BSOD and a restart. After a few of these, and with a fairly hot day yesterday, the system crashed again, but this time when I fired up Task Manager I had 8GB of RAM missing. I'm not sure how many StarForce points I racked up because of "Total System Memory" changes. :cry:

 

Platform is X99, so 4 memory slots either side of the CPU, all close to the GPU backplate and getting hotter than before. The front RAM set is at about 50C, the rear RAM set at about 55C.

 

The temporary fix: drop RAM speed from 2400 to 2200. So far it's stable, but long term I need to revise the system cooling, especially for the GPU and RAM. So if you upgraded recently (and I can't say if the same applies to other 20 series GPUs), it might be worth checking temps etc. if you're getting seemingly random crashes.

 

As a side note, dropping the RAM speed noticeably affected performance... such is life. :)

 

HTH. :thumbup:

Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


Do you have custom fan curves set for your GPU?

 

I have mine set to be at 100% by 45C using EVGA Precision X1; you can also set custom fan curves in MSI Afterburner. My card sounds like a quiet vacuum cleaner when I'm gaming but it stays pretty cool. Backplates can get a little warm, but 60C shouldn't hurt your GPU. It might throttle down slightly at that point though. 55C shouldn't kill your RAM either, but that doesn't mean there isn't a faulty memory module in need of an RMA or replacement. Did you try re-seating the RAM to be sure nothing jiggled loose inside?
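For anyone unsure what a custom fan curve actually does, here is a rough sketch of the idea in Python. It is not how Precision X1 or Afterburner are implemented; the lower curve points are made up for illustration, and only the "100% by 45C" point comes from the post above.

```python
# Hypothetical fan curve as (GPU temperature in C, fan duty in %) points.
# Only the last point (100% by 45C) is from the post; the others are assumed.
CURVE = [(30, 30), (40, 70), (45, 100)]

def fan_duty(temp_c, curve=CURVE):
    """Return the fan duty (%) for a temperature by linear interpolation."""
    # Below the first point, hold the minimum duty.
    if temp_c <= curve[0][0]:
        return float(curve[0][1])
    # Walk consecutive point pairs and interpolate inside the matching segment.
    for (t0, d0), (t1, d1) in zip(curve, curve[1:]):
        if temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)
    # Above the last point, hold the maximum duty.
    return float(curve[-1][1])

for t in (25, 35, 45, 60):
    print(f"{t}C -> {fan_duty(t):.0f}% fan")
```

The tools let you drag these points around in a graph; an aggressive curve like this trades noise for lower temperatures.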


Edited by Headwarp
Spoiler

Win 11 Pro, z790 i9 13900k, RTX 4090 , 64GB DDR 6400GB, OS and DCS are on separate pci-e 4.0 drives 

Sim hardware - VKB MCG Ultimate with 200mm extension, Virpil T-50CM3 Dual throttles.   Blackhog B-explorer (A), TM Cougar MFD's (two), MFG Crosswinds with dampener.   Obutto R3volution gaming pit.  

 


Sorry to be that guy. I really hope I am wrong and you have just jiggled a RAM module loose.

 

However...

 

 

Prepare for black screen and RMA. When you go BSoD and it suddenly fails to recover, it's time to pull the card and try rebooting with your onboard graphics.

The GPU and VRAM are rated at 85C. Unfortunately, from what I have heard, it is the VRAM overheating and failing. Nvidia are now sourcing their memory from Samsung afaik.

 

No one knows what the real problem is, as Nvidia are blowing smoke over the whole deal, including the rate of failure. They reckoned that some "test cards" got sent out by mistake, but that does not seem plausible as third-party cards are also affected.

 

As for failure rate, the best info I found on the forums came from non-partisan retailers themselves, who reckoned the overall return rate on the 2080 series was running at 3.5%, with just under 50% of that purely because the customer changed their mind and sent it back. So the problem is not a common one; it's just that the cards are usually failing within two months or so, making it appear artificially widespread.
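To put a number on that, here is the back-of-envelope arithmetic in Python. It assumes every return that was not a change of mind was an actual fault, which probably overstates things a little; the function name is just for illustration.

```python
def estimated_fault_rate(return_rate, changed_mind_share):
    """Share of cards sold that came back for a reason other than a change of mind."""
    return return_rate * (1.0 - changed_mind_share)

# Figures quoted above: 3.5% overall returns, "just under 50%" change of mind.
# Using exactly 50% gives a lower bound on the fault-related share.
rate = estimated_fault_rate(0.035, 0.50)
print(f"Roughly {rate:.2%} of cards returned as faulty")
```

So on those retailer figures, something a little over 1.75% of cards went back with a genuine problem.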

 

In case you missed the threads on my particular failure.... you will see the similarities.

 

VR section.

 

https://forums.eagle.ru/showthread.php?t=229469

 

Hardware section.

 

https://forums.eagle.ru/showthread.php?t=230077

 

 

Edit. New card just arrived. Currently doing a full system image backup before installing it just in case. I learned my lesson!


Edited by Tinkickef

System spec: i9 9900K, Gigabyte Aorus Z390 Ultra motherboard, 32Gb Corsair Vengeance DDR4 3200 RAM, Corsair M.2 NVMe 1Tb Boot SSD. Seagate 1Tb Hybrid mass storage SSD. ASUS RTX2080TI Dual OC, Thermaltake Flo Riing 360mm water pumper, EVGA 850G3 PSU. HP Reverb, TM Warthog, Crosswind pedals, Buttkicker Gamer 2.


  • 2 weeks later...

Well, fortunately the GPU is still working, but I fear the motherboard is failing. I'm getting random boot failures along the lines of, think Holly from Red Dwarf:

 

PC (from Windows Boot Manager): I can't boot today John, there's no boot drive.

Me: But I just checked from the command line, C:\ is still there. Do a repair again...

PC: User account "John", password?

Me: Yes, here is my password. (types it in)

PC: Sorry John, no boot drive, can't boot today.

 

Now it seems everything is correct, MBR included, and I did all the fixes. It now takes between 4 and 10 restarts to boot, and this is the 3rd SSD that has gone weird.

 

The irony for me is the "no boot drive found" message when all the command line tools say it's fine. It also seems to work for a while after re-installing Windows, then becomes intermittent. :huh:

Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


Sounds like the grief my 2500K would give me at times before I upgraded. I attributed it to the lack of driver updates for my EVGA mobo in that rig since 2013. With quite a bit of searching and investigating the individual components of the mobo, I found some success with drivers directly from Intel for the SATA controller, as well as for the network drivers, which were somewhat problematic. It's... "fun" installing drivers meant for 8.1 in Windows 10.

 

I can't say I miss those moments with my old 2nd gen rig. Unsure if that's the case with you, but my SSDs from that old rig are still working like a charm in this build.

Spoiler

Win 11 Pro, z790 i9 13900k, RTX 4090 , 64GB DDR 6400GB, OS and DCS are on separate pci-e 4.0 drives 

Sim hardware - VKB MCG Ultimate with 200mm extension, Virpil T-50CM3 Dual throttles.   Blackhog B-explorer (A), TM Cougar MFD's (two), MFG Crosswinds with dampener.   Obutto R3volution gaming pit.  

 


I remember only too well the first Sandy Bridge boards, which had a flawed SATA controller that would fail after some time.

 

Same symptoms. But I knew what was wrong...

Windows 10 64bit, Intel i9-9900@5Ghz, 32 Gig RAM, MSI RTX 3080 TI, 2 TB SSD, 43" 2160p@1440p monitor.


Okay, looking at it, I started getting BSODs a few months back: WHEA and watchdog timer errors, which I dismissed as maybe software errors. But there's that, plus the fact the motherboard never did run XMP with 2 different sets of RAM and wouldn't do its automatic overclock right from the get-go, although RAM and CPU perform very well with a mild manual OC when it boots.

 

The BIOS/UEFI was updated to the latest version only a week ago. I was thinking it might be an OS/BIOS issue, but even with all OC settings dropped out it still has this boot issue.

 

Thinking about it, I have had this weird boot problem since day one, although at the time I put it down to using an Intel 750 series PCIe NVMe SSD. It's just gotten much worse recently, and that's with SATA SSD boot drives now. I've just recently pulled the NVMe drive out; prior to that, the NVMe drive had performed very well as a data drive.

 

It does point to a dodgy motherboard. Good news is it's still under warranty!

 

I'll wait and see what the supplier says about this.

Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


WHEA and watchdog errors very much indicate your RAM is running beyond what the combo will take.

 

I have had dozens upon dozens of those when I tried to get mine running XMP on this board.

 

Mind you, have those a couple times and you have a real good chance of a corrupted OS.

 

Windows and any OS in general do not like RAM errors at all, sooner or later the problems get worse, even at 2133MHz then.

 

 

Try to downclock your RAM and see what happens. I had to settle for 3000MHz; 3200MHz gives me your errors, and anything faster is bound to crash in less than 30min with a BSOD, or won't even beep or boot.

Gigabyte Aorus X570S Master - Ryzen 5900X - Gskill 64GB 3200/CL14@3600/CL14 - Asus 1080ti EK-waterblock - 4x Samsung 980Pro 1TB - 1x Samsung 870 Evo 1TB - 1x SanDisc 120GB SSD - Heatkiller IV - MoRa3-360LT@9x120mm Noctua F12 - Corsair AXi-1200 - TiR5-Pro - Warthog Hotas - Saitek Combat Pedals - Asus PG278Q 27" QHD Gsync 144Hz - Corsair K70 RGB Pro - Win11 Pro/Linux - Phanteks Evolv-X 


Yes BM this is another possibility that I have not discounted as every now and again I have seen a RAM module drop out of bios and/or OS when booted.

 

TBH the CPU, MB and GPUs have been great performers, but the RAM side isn't so good. I have used two sets (separately of course :) ): Trident Z 3200, which runs at 2200 and used to run at 2400 but has never run at its rated 3200 XMP level on this MB, and Corsair Vengeance 2666, which would run at 2200 but never at its XMP of 2666.

 

So I have been down-clocking my RAM attempting to find a stable state, and today I put the CPU back up to 4.7GHz. So far so good with RAM at 2200.

 

Crashes could be memory or memory controller, 2 sets of RAM similar results. Cause?

 

The boot issue could be an effect of this, so more testing.

Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


Yikes,

 

once you had those, chances are that your OS already has errors that will keep on crashing it even if your RAM is stable now. You will see what I mean if it is true with yours, no doubt.

 

FragBum, you have a 4-channel monster, so there is not much need to increase bandwidth. Yours at 2133 has as much bandwidth as our dual-channel setups at 4266, so lean back and relax ;) Not much need to OC or force XMP.
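A quick sanity check of that claim, as a sketch in Python. It computes theoretical peak bandwidth only (real-world throughput is lower), assumes the comparison platform is dual-channel, and uses the standard 64-bit (8-byte) width of a DDR4 channel; the function name is made up for illustration.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s.

    Each DDR4 channel is 64 bits wide, so it moves 8 bytes per transfer.
    """
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9

quad_2133 = peak_bandwidth_gbs(4, 2133)   # X99, quad channel
dual_4266 = peak_bandwidth_gbs(2, 4266)   # assumed dual-channel platform
quad_2200 = peak_bandwidth_gbs(4, 2200)   # the "temporary fix" speed

print(f"Quad channel @ 2133: {quad_2133:.1f} GB/s")
print(f"Dual channel @ 4266: {dual_4266:.1f} GB/s")
print(f"Quad channel @ 2200: {quad_2200:.1f} GB/s")
```

Twice the channels at half the transfer rate comes out the same on paper, which is the point being made: quad channel at stock speed already has bandwidth to spare.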

 

I assume you have 4 modules and 8 slots, so why don't you move them all over to the other slots? Even though the manual says No-No, give it a try; sometimes this cures bad termination and other creepy symptoms that cause RAM to fail.

You can start with 1 module in one of those slots that only get occupied if you have more than 4 modules, like A1, B1, C1, D1. Usually you would use A2, B2, C2 and D2.

 

Give it a try.

 

One more tip if you do RAM testing.

 

!!!!!!!!!!!!!!!!DO NOT BOOT INTO W I N D O W S !!!!!!!!!!!!!!!!

 

Make yourself a bootable USB stick with Linux Mint or Ubuntu and test there. It's already a lot if they boot w/o crashing. Once on the desktop, just run stressapptest -s 3600 in a console and lean back.

 

If you boot into Win, you are asking for trouble.

Gigabyte Aorus X570S Master - Ryzen 5900X - Gskill 64GB 3200/CL14@3600/CL14 - Asus 1080ti EK-waterblock - 4x Samsung 980Pro 1TB - 1x Samsung 870 Evo 1TB - 1x SanDisc 120GB SSD - Heatkiller IV - MoRa3-360LT@9x120mm Noctua F12 - Corsair AXi-1200 - TiR5-Pro - Warthog Hotas - Saitek Combat Pedals - Asus PG278Q 27" QHD Gsync 144Hz - Corsair K70 RGB Pro - Win11 Pro/Linux - Phanteks Evolv-X 


Well, I dropped the OC back to 4.6GHz (I'll have a tweak of the CPU later!) and it seems to boot reliably. That's after it booted reliably and repeatedly this morning @ 4.7GHz until lunch time, then...

 

 

PC: Can't find the boot drive, John...

Me: Look, it's right here. C:\ dir, look, all your files are there...

Me: (checks partitions) Boot files all there, not one has been changed since it was installed last week. Grrr.

 

My thinking is it's a thermal issue, as the 2080Ti sits close to the memory and over one of the X99 chipset controller heatsinks. It wouldn't take much to push it over the edge.

 

I do have an RMA for the motherboard, but I'm not sure I can say it's faulty; it's just not working like it did with my OC, which worked great for about 18 months until several weeks ago, 99th percentile and all :)

 

Several weeks ago also puts us in summer, with ambient temps in the 30s+.

Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


  • 2 weeks later...

I'll pose a question here for those with computer knowledge.

 

I removed the heat sink and RAM, cleaned the edge connectors on the RAM and re-assembled. It seems to work okay @ 2133, but to get the RAM to run at 2.2GHz I have to set the RAM voltage to 1.26V, where the PC seems stable with fairly consistent memory throughput.

 

RAM will run at 2.4GHz with full XMP voltage, but it's not stable.

Previously I had the RAM running at 2.4GHz at stock voltage (Auto actually, but 1.2V).

 

Considering that I have "changed" thermal conditions in the case could this be normal or perhaps indicate a RAM Module or System Controller chip dying?

 

So far the boot problem seems "fixed", and oddly by disabling USB boot in the BIOS??

 

CPU runs happily at 4.7GHz (but I might be able to get that to 4.8~4.9 if I try)

Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


I can't answer your question, but an experience comes to mind from my last rig: after upgrading from 8GB to 16GB of RAM, my 4.5GHz OC on an i5 2500K became unstable. Without much thought I bumped the CPU vcore by somewhere between .05 and .15 and the system became stable.

 

Completely different scenario, but I'd try it with RAM set to XMP and see what happens. Shouldn't hurt. That's ruling out, of course, any seating issues with the reseat. It's a long shot, given this isn't a change of hardware, but idk. It just came to mind when I read it for some reason.

 

Sorry if it does nothing. As to damaged components, it would be beyond my knowledge to know how to determine that, beyond testing each stick individually, which could take a long time with 8 sticks >.< The only time I know for sure I had component damage was with an i7 920's memory controller being fried in a lightning storm/power surge/outage, and that sucker just failed to boot, period. It would turn on and go right back off.


Edited by Headwarp
Spoiler

Win 11 Pro, z790 i9 13900k, RTX 4090 , 64GB DDR 6400GB, OS and DCS are on separate pci-e 4.0 drives 

Sim hardware - VKB MCG Ultimate with 200mm extension, Virpil T-50CM3 Dual throttles.   Blackhog B-explorer (A), TM Cougar MFD's (two), MFG Crosswinds with dampener.   Obutto R3volution gaming pit.  

 


No worries, I'll play with CPU voltages after I get the RAM sorted. The CPU is still at stock voltages.

 

I took the RAM out, cleaned the edge connectors, wiped them down with Deoxit and reseated them. I enabled XMP (1.35V, from memory) and could only get a stable system at 2200MHz, but I tried lowering the volts and got it down to 1.25~1.26V.

 

Under XMP, however, the RAM does not run at the rated 3200MHz. :cry:

Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


Try to set the VCC-IO and VCC-SA voltages a notch higher. Those correspond to your IMC and the circuits connecting it to the RAM.

 

I would set the CPU to default speed and try to get XMP stable with DRAM voltage, VCC-IO and VCC-SA at first, maybe with a little bump on the CPU as well, but very minor.

If you can't get it stable this way, chances are low that it will work if you OC the CPU as well.

Gigabyte Aorus X570S Master - Ryzen 5900X - Gskill 64GB 3200/CL14@3600/CL14 - Asus 1080ti EK-waterblock - 4x Samsung 980Pro 1TB - 1x Samsung 870 Evo 1TB - 1x SanDisc 120GB SSD - Heatkiller IV - MoRa3-360LT@9x120mm Noctua F12 - Corsair AXi-1200 - TiR5-Pro - Warthog Hotas - Saitek Combat Pedals - Asus PG278Q 27" QHD Gsync 144Hz - Corsair K70 RGB Pro - Win11 Pro/Linux - Phanteks Evolv-X 


Yes been a long week!

 

Yes, XMP defaults the CPU clock, but it never gets through a boot thereafter; eventually it comes back saying overclocking failed, press F1. Sigh...

 

Long story short, the motherboard is on its way back as a warranty claim. :cry:

 

The good news is a 9600K build I did for my wife is testing AOK whilst the other is sorted out, there is a plan to this second rig. :thumbup:

 

And given how much testing in DCS I've done and how well this 9600K Rig is doing it seems the other hardware had issues. Ah well shyt happens.

Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


:huh::cry:

Gigabyte Aorus X570S Master - Ryzen 5900X - Gskill 64GB 3200/CL14@3600/CL14 - Asus 1080ti EK-waterblock - 4x Samsung 980Pro 1TB - 1x Samsung 870 Evo 1TB - 1x SanDisc 120GB SSD - Heatkiller IV - MoRa3-360LT@9x120mm Noctua F12 - Corsair AXi-1200 - TiR5-Pro - Warthog Hotas - Saitek Combat Pedals - Asus PG278Q 27" QHD Gsync 144Hz - Corsair K70 RGB Pro - Win11 Pro/Linux - Phanteks Evolv-X 
