GTX 670 (X2 in SLI) same cards, same slots, but run differently when switched, need help!!

Brian Bowman

Honorable
Mar 13, 2013
6
0
10,510
Okay guys, I have a situation that makes no sense and I would like to get it figured out. HELP PLEASE. I will explain this as best I can.

I have a Z77 motherboard by Gigabyte, and I think it's important to mention that I only have one true 16x PCI Express slot. The other one is an 8x, so when running two cards, one is at 16x and one is at 8x according to GPU-Z. I have two matching MSI GTX 670 video cards, and I have installed aftermarket Arctic coolers on them, which made a big difference in noise and temps. I have a 3rd gen i7 CPU, so it's running in PCI Express 3.0.

I have installed and used Precision X extensively to try to get the most overclock out of these cards, and this is where I start running into my issue. GPU-Z says that both cards have a matched GPU clock. From the stock speeds to whatever I overclock them to, they are the same. Precision X tells a completely different story: according to it, one of the cards is more than 100 MHz faster. I felt completely OK with ignoring that and just going with what GPU-Z said, but as I explain further, I believe there is a big difference between the two cards.

Most people (from what I've read) agree that you should be able to get a GTX 670 to around 1250 MHz with overclocking, and that's where the hotter card says it is, 1241 or so. The other one, I believe, is around 1120 MHz. When the hotter card goes into the 16x slot, with everything else staying the same, it shows a 122% power draw in Precision X, temps start skyrocketing, and it basically gives every indication that it's working as hard as it can. When I swap the cards and put the one that says it's around 1120 in the 16x slot, the power draw isn't as high and neither card's temps really go up. It just seems very weird to me.

My questions are the following: Is it possible to have two identical cards with different clocks? GPU-Z says they are the same, but Precision X tells a different story, and on top of that, the cards behave differently when they are physically moved around in my system. My main question is, am I not getting what I should be getting out of the other card? Any insight into this?
 
No cards are truly identical. Some just happen to OC better than others. It sounds like you got one dud that doesn't like to OC. Also, depending on their temps, boost will kick in to varying degrees, meaning that if one of your cards is running much hotter than the other, boost will behave differently.

GPU-Z will show the default clocks for the card in the opening window, but not the running clocks, and it usually shows the OC'ed clock as well.

When you OC them, you have to either use the sync check box or switch between both GPUs to make sure they are OC'ed the same, but keep in mind, this is before boost. The average 1250 clock speed is after boost kicks in, not before. Almost no one can set clocks to 1250 without boost kicking in on top of that and causing a crash. Voltage tweaks are needed to go past that point, when EVGA allows, but be careful, as that is how you can damage a card.

Since you put on aftermarket coolers, it is also possible you didn't get the TIM (thermal paste) on very well, or have a loose screw causing the cooler not to perform well.

One last thing, if v-sync is on, or you are being held back by the CPU, boost may not kick in for one or both cards. They attempt to save power when not needed.
 

Brian Bowman

Thank you bystander for the help. I have not had the time to come back to this for a while, but I spent many hours yesterday putting more time into this, and have finally come up with some more information.

I took one card out of the equation completely, ran with one card at a time, and did every test I could think of to eliminate as many issues as possible. I used different PCIe power connectors from the power supply, overclocked, underclocked, changed settings, and benchmarked over and over again until I got results. The one card, which had the higher core clock according to Precision X, did NOT get hot at all by itself. I was able to push it HARD with Precision X, and successfully ran 3DMark 11 with a 150 MHz overclock on the core clock and 650 MHz on the memory, all according to Precision X. Whenever I checked what GPU-Z was saying, it was always something a little different, so I really didn't factor any of that information into these tests.

I personally believe I was overclocking to the point that I ran into power draw issues, and the card simply wanted more power that just wasn't available. The card never did overheat, and it would start artifacting on the screen during benchmarks rather than crash, black screen, or freeze and lock the computer up. I think with my aftermarket cooling I could push the card even harder with voltage modifications. However, I am not afraid to admit that is beyond my scope of understanding, and I'm not really interested in doing that anyway. Precision X allows for extra voltage and I'm fine with what it provides. For all that trouble, I would rather upgrade to two GTX Titans, but since I don't have 2 grand to blow, I would like to tweak what I have and feel good knowing the cards are actually working like they should without weird issues. With that being said, I adjusted the card's clocks to the highest I could get and still run a good 3DMark 11 benchmark without any artifacting.

In theory, I should be able to shut down, swap cards, and get the same results from the other identical card, which is what I tried. When I did that, the other card behaved completely differently and gave me a black screen. I'm now convinced the card is bad, and it's on its way to be RMA'ed right now. I believe the reason the one card would run hotter in the SLI setup is that it was trying to do more work, and even after I pushed it much, much harder by itself, it still did not overheat. The weaker card must have been putting a huge burden on the other one, or at least that's what I'm theorizing.
 
There is a good chance the problem is the PSU. What PSU do you have? It's funny you just brought this back to life, because last week I went through a similar issue. I was noticing that my cards were hotter than normal at idle and I started to get crashes. Most of the time it was my SSD that became unavailable, but then it was my video cards having issues. When I disabled SLI, they worked, so I thought one card went bad. I later tested each card by itself and they both worked without a problem.

At this point I decided to look at the SSD issue, and after doing a software update, I noticed the reason my SSD got kicked offline was a lack of power.

So, I bought a new power supply, and everything is normal. I've not had a crash yet, my SSD hasn't gone offline, and my cards are idling at 32C as they should.
 

Brian Bowman

I didn't go super huge on power for my PSU selection; I tried to go for quality. I got an 80 Plus Gold certified power supply from Cooler Master at 800 watts, which is SLI and Crossfire ready. I was thinking it could be the power supply, but I feel I definitely eliminated that. The fact that the two cards behaved completely differently in the same slot, with the same settings and the same speeds set in Precision X, on the same power supply, during the same benchmark tells the whole story, I think anyway. Plus, there's still the fact that they showed up differently in Precision X as far as core clock speeds. Bad card, has to be. I will keep you posted.

Sounds like you figured out an interesting issue with your computer. It's nice that you figured out the PSU before you bought a new SSD, which is what I probably would have done lol. And P.S., my cards idle at 24C with my aftermarket cooling!! Boo ya!
 
Well, the PSU was also giving issues to my SLI setup. My cards ran hotter than normal, and crashed when pushed at all. I was using an Antec 850 W 80+ Bronze PSU, and it crapped out on me. So anyways, I'm good to go now. Sounds like you are ready for an RMA, but make sure to note the performance issues with it. You want to make sure they test for what you are seeing, so they spot the problem and don't just send it back to you.
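
For a rough sanity check on the PSU question, the worst-case draw can be estimated with some quick arithmetic. This is only a sketch: the 170 W per-card figure (the GTX 670's rated board power), the 77 W CPU figure, and the 75 W allowance for the rest of the system are assumptions, not measurements from either machine, and the 122% reading from Precision X is simply treated as a multiplier on that 170 W.

```python
# Rough PSU headroom estimate. Every wattage here is an assumption
# for illustration, not a measurement from the systems in this thread.

def estimated_load_watts(gpu_watts, gpu_count, power_target, cpu_watts, other_watts):
    """Return a pessimistic estimate of total system load in watts."""
    gpu_total = gpu_watts * power_target * gpu_count
    return gpu_total + cpu_watts + other_watts


def headroom_fraction(psu_watts, load_watts):
    """Return the fraction of the PSU's rated output left unused."""
    return (psu_watts - load_watts) / psu_watts


load = estimated_load_watts(
    gpu_watts=170,      # assumed per-card board power
    gpu_count=2,
    power_target=1.22,  # the 122% reading quoted in the thread
    cpu_watts=77,       # assumed CPU power
    other_watts=75,     # assumed allowance for drives, fans, board
)
print(f"Estimated load: {load:.0f} W")
print(f"Headroom on an 800 W unit: {headroom_fraction(800, load):.0%}")
```

A comfortable total on paper does not rule out a failing unit or a weak 12 V rail, so this kind of estimate can only say the rated wattage is plausible, not that the PSU is healthy.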
 

Brian Bowman

Another thing that occurred to me was that you can actually look up the average GPU score of any given card on futuremark.com. I would be really curious to know how they come up with that number, but I'm assuming it's an average of submitted benchmark results. The rating for the 670 is 9500 according to Futuremark. Now, I just have a barebones stock card. I wanted something inexpensive because I was just going to rip them apart and put coolers on them, and it scores in between 9150 and 9250. So, I think that's fine. I can get well over 10k with Precision X, so the card that I kept has to be good. I feel like I'm talking myself into my decision at this point lol, but I think I'm just freaking out a little since I have to wait for god knows how long for the new card to come in the mail, and I don't have it to compare anymore.

P.S. How old was your Antec PSU? I thought Antec was one of the better PSU brands. And no six core in your 1366 board? That's the main reason to get one. Did you ever plan to get one or just haven't gotten around to it yet? I know the 990X is still going for top dollar even used, which is ridiculous. It should have come down when the new 2011 CPUs rolled out.
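
To put those scores in perspective, here is a small sketch that expresses each one as a percentage above or below the 9500 average quoted above. The 10000 entry stands in for "well over 10k" and is a placeholder, not an exact result.

```python
# Express benchmark scores relative to the quoted average score.

def percent_vs_average(score, average):
    """Return how far a score sits above (+) or below (-) the average, in percent."""
    return (score - average) / average * 100.0


FUTUREMARK_AVERAGE = 9500  # average graphics score quoted in the post

# 9150 and 9250 are the stock range from the post; 10000 is a placeholder
# standing in for "well over 10k" when overclocked.
for score in (9150, 9250, 10000):
    delta = percent_vs_average(score, FUTUREMARK_AVERAGE)
    print(f"{score}: {delta:+.1f}% vs average")
```

By this measure the stock results sit a few percent under the average, which is small enough that a stock-clocked card landing there is not surprising if the average includes factory-overclocked and user-overclocked cards.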
 

Brian Bowman

Okay. So I finally got the card back from MSI. They tested it and said it tested fine, which didn't surprise me all that much since it does work at "normal" specs. My only issue was with the boost clocks not matching, the different power draws, and all the other weird issues. Again, the boost clocks do not even come close to matching when they are enabled, with more than 100 MHz difference, and on top of that, neither card goes to the default boost clock speed of 1045; they are both over. I'm not even going to bother sending my other card in, since I already know they are going to tell me it's fine.

So, with all that being said, I did some more testing with the card I got back, both by itself and in SLI, with different configurations of both cards being on top and bottom. Any benchmark that's run on the cards with no overclocking gives about the same results for each card by itself, within 2% or so, so I'll chalk that up to margin of error. But the one card with the higher boost clock does WAY WAY better by itself with overclocking. It can run at 1300 MHz and make it through benchmarks, and it gets a graphics score of 10600 in 3DMark, which is 1100 more than what 3DMark says is the average for that card, so, wonderful in other words. The other one doesn't even come close, and that is pretty much what's killing me here. Why doesn't it? It is definitely limiting me in my setup. In SLI I have to keep the overclock low enough for the sub-par card to operate, even though the other one will take a much higher overclock. I initially thought that the placement of the cards had something to do with their behavior, but I think I've ruled that out.

Unfortunately, I don't think I came up with any concrete answers as to what happened or why these cards behave so differently, but I do have a theory. The boost clock spec must just be a minimum spec, used for testing, and for my cards the boost clock is 1045. I believe that the clocks will go up to whatever the card thinks it can handle based on temps, voltage, and other specs gathered from self-monitoring. Intel's max turbo for their CPUs does something similar, and that's what got me thinking about it. If the Intel CPU sees that temps and other metrics are good, it will go all the way up to the max turbo clock speed. That doesn't mean it will always go to that clock, it just means that it's capable of it (without manually setting it with overclocking). I think the actual boost clock is not a fixed number but depends on what each individual card thinks it's capable of, and as long as it's 1045 or better when they test it at the factory, they ship the card. I think that if you get a card with a high boost clock, it's a good indicator that it's also going to be a good overclocker. Has anyone else arrived at this conclusion? And even worse, I don't think there is a way out of this situation, because there is no guarantee of what kind of card you will get if you try to replace it to match more evenly. Oh well. Poo.
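
The SLI limitation described in this post comes down to a minimum: with the clocks synced, both cards get the same offset, so the usable offset is the smallest one that every card can hold. A minimal sketch of that idea follows. The +150 MHz figure is the stronger card's core offset from earlier in the thread, while the +60 MHz for the weaker card is a made-up placeholder, since the thread never states it.

```python
# Model a synced overclock: one offset is applied to every card, so the
# weakest card sets the limit.

def shared_offset(stable_offsets_mhz):
    """Return the largest offset (MHz) that every card can hold."""
    return min(stable_offsets_mhz.values())


def unused_headroom(stable_offsets_mhz):
    """Return the offset (MHz) each card gives up because of the shared limit."""
    limit = shared_offset(stable_offsets_mhz)
    return {card: offset - limit for card, offset in stable_offsets_mhz.items()}


cards = {
    "strong card": 150,  # core offset reported as stable earlier in the thread
    "weak card": 60,     # hypothetical placeholder, not from the thread
}
print("Synced offset:", shared_offset(cards), "MHz")
print("Headroom left unused:", unused_headroom(cards))
```

Under these placeholder numbers the stronger card would give up 90 MHz of offset it could otherwise hold, which is the kind of gap the post is describing.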