I've read a lot of speculation on how a D0 differs from the C0 and C1 (yes, Virginia, there is a C0 and a C1 stepping).
Overclocking reports seem to indicate there are thermal and/or voltage improvements in the D0 stepping. On this score I have no new information, other than to say Intel has not changed the electrical or thermal specs.
What Intel has announced is a fix for a number of errata with the D0 stepping. The Core i7 family lists 102 errata. The D0 stepping fixes 16 of these and introduces one new one of its own.
I condensed the fixes into a Word doc that you can download here.
It seems that the real reason for a new stepping was fixing some of the more critical bugs in the C0 stepping. Whether Intel also incorporated geometric and/or process changes is not known. The few benchmarks posted on the web seem to support your theory, but I wonder if the hope that the D0 will overclock better and run cooler isn't leading some to a self-fulfilling prophecy.
Until more results come in, it is just speculation which way it will go. Personally I hope you are right, which is why I'm holding out for a 3520 or i7-920 D0.
To set history straight, the reason the Q6600 G0 generally clocks better than the B3 (besides process maturity) comes down to the GTL Ref. The G0 has a much tighter tolerance when operating at higher FSB and needs less GTL Ref tweaking than the B3.
There are still many B3s clocking as well as G0s once significant GTL Ref changes have been applied.
Now, the news I heard about D0 is that the change is entirely in the packaging, while the design of the die is untouched. How that relates to the errata fixes I don't know.
I'm guessing the better consistency in overclocking a 920 D0 is due to better tracing in the packaging and general process maturity on the die.
I don't know what changes, if any, Intel made that could affect dissipation or clocking potential, and Intel isn't saying. Although I'm sure that, from a marketing standpoint, they don't mind if rumors of better performance get traction.
I do know that the bug fixes I listed are official and that the D0 stepping includes mask changes and is functionally different than the C0.
One misconception that seems to persist is that this "process maturity" gain coincides with stepping releases. These non-mask-related process changes are applied continuously to each production line. A C0 part made in the last month of production may be just as good as the current D0 parts, and a D0 part made a week from now may include process improvements that do not exist in the current D0. Intel, at least externally, does not say what process changes are made or when. Furthermore, the goal of a process change is not always to give us a faster part. A change might be made to improve yields and actually sacrifice some overclocking potential to achieve that result.
As for the mask changes from C0 to D0, I guess they exist in order to fix the errata, as you said. That would also explain why many news websites got it wrong.
Anyway, from the data I've collected there's still a huge variation in clocking potential within D0, just as within C0; the spread is just smaller and, in general, a little higher (with the same vcore/uncore). There were also reports of late C0 parts, made shortly before D0, that clocked very high (4.5 GHz range), which says something about process maturity over time, independent of the change in masking.
Also, as your first post said, it's very similar to the B3 -> G0 situation. What people need to understand is that any clocking improvement is statistical, i.e. not every D0 is magically better than a C0. The deciding factor is still the individual chip, and that comes down to luck (again, statistics), with D0 offering better odds.
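To put the "better odds" point in concrete terms, here is a rough Python sketch. Every number in it is made up for illustration (the means, the spreads, and the assumption that max stable clocks follow a bell curve); none of it comes from the data mentioned above. It just estimates how often a randomly picked D0 would beat a randomly picked C0 when the D0 distribution is a bit higher and a bit tighter.

```python
import random

def chance_d0_beats_c0(n=100_000, seed=0):
    """Monte Carlo estimate of P(random D0 clocks higher than random C0)."""
    rng = random.Random(seed)
    # Hypothetical max stable clocks in GHz (mean, standard deviation).
    # These values are assumptions, not measurements.
    c0_mean, c0_sd = 4.0, 0.20
    d0_mean, d0_sd = 4.2, 0.15
    wins = 0
    for _ in range(n):
        c0 = rng.gauss(c0_mean, c0_sd)
        d0 = rng.gauss(d0_mean, d0_sd)
        if d0 > c0:
            wins += 1
    return wins / n

p_d0_wins = chance_d0_beats_c0()
print(f"Chance a random D0 beats a random C0: {p_d0_wins:.0%}")
```

With these invented numbers the result lands well above 50% but clearly short of 100%, which is the whole point: a higher average still leaves plenty of C0 chips that outclock a given D0.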
It would be interesting if we could isolate the benchmarks for the most recently manufactured C0 parts and compare them to the D0s.
Many of the more professional overclocks were done on early i7 parts rather than the most current ones.
Be nice to test our theories about process improvements.
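If someone did collect results with manufacture dates, the comparison could be sketched roughly like this in Python. The `Result` record, its field names, the cutoff date, and all the rows below are placeholders I made up to show the shape of the test, not real benchmark data.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median

@dataclass
class Result:
    stepping: str         # "C0" or "D0"
    made: date            # manufacture date (hypothetical field)
    max_clock_ghz: float  # highest stable overclock reported

def compare_late_c0_to_d0(results, c0_cutoff):
    """Return (median of C0 parts made on/after the cutoff, median of all D0 parts)."""
    late_c0 = [r.max_clock_ghz for r in results
               if r.stepping == "C0" and r.made >= c0_cutoff]
    d0 = [r.max_clock_ghz for r in results if r.stepping == "D0"]
    return median(late_c0), median(d0)

# Placeholder rows only, invented for illustration.
sample = [
    Result("C0", date(2008, 11, 1), 3.8),   # early C0, excluded by the cutoff
    Result("C0", date(2009, 3, 1), 4.1),
    Result("C0", date(2009, 3, 15), 4.3),
    Result("D0", date(2009, 4, 1), 4.2),
    Result("D0", date(2009, 4, 10), 4.4),
]

late_c0_median, d0_median = compare_late_c0_to_d0(sample, date(2009, 2, 1))
print(f"late C0 median: {late_c0_median:.2f} GHz, D0 median: {d0_median:.2f} GHz")
```

If the late C0 median came out close to the D0 median on real data, that would point to process maturity rather than the stepping itself. Medians are used here so a single golden chip doesn't skew the comparison.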
One last thought. We all know how important voltage is to overclocking. Within the die, the power distribution traces are laid down according to simulations and design rules. When engineering for ultimate clock speed, tweaking these traces can make meaningful improvements.
Put it this way. We all know that at the top end a chip may boot but not finish a benchmark. This happens because booting does not use that part of the chip that has gone past its OC potential until the benchmark starts to run.
The goal when you design a chip is to balance each subsection to run at a common clock speed. Even so, there will always be a weak link that fails first. If Intel engineers can identify the weak links and, through various means, help them OC to the same speed as the rest of the chip, they can produce faster chips.
I have no specific knowledge whether Intel has done this in the case of the D0 stepping but it is within the realm of possibilities.
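The weak-link idea is easy to show as a toy model in Python. The block names and GHz limits below are hypothetical, picked only to illustrate the point; they are not actual i7 figures.

```python
def chip_limit(block_limits):
    """The whole chip can only run as fast as its slowest block."""
    return min(block_limits.values())

# Hypothetical per-block max stable clocks in GHz (assumed values).
blocks = {
    "core": 4.4,
    "cache": 4.3,
    "uncore": 4.0,
    "memory_controller": 4.2,
}

weakest = min(blocks, key=blocks.get)
before = chip_limit(blocks)

# Suppose a design tweak lifts only the weakest block to 4.3 GHz.
fixed = dict(blocks)
fixed[weakest] = 4.3
after = chip_limit(fixed)

print(f"weakest block: {weakest}, limit before: {before} GHz, after: {after} GHz")
```

Note that the chip limit only rises to the next-weakest block, so fixing one weak link just exposes the next one. Speeding up a block that is not the weakest changes nothing.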
Engineers are very competitive, and I think AMD recently claimed the OC crown. Maybe ... just maybe Intel has been burning the D0 midnight oil.