Very technical K10 information for those interested
Quote:So this leads me to believe that there were K10s working well before Feb 5th, as that's when the code was submitted to GNU for inclusion in the compiler. I would suppose that the compiler would have to be tested on the CPU in question, so K10s must have been made and working for some time before then.
I'm interested to know why you think this?
This looks more like code optimisation to me, so that the best instruction is used when cleaning up the stack etc.
The new instructions are, well, just the new instructions. They are documented anyway, I thought?
Quote:Nice find MU_engineer.
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (2), /* cost of a lea instruction */
Does this mean that each addition needs 1 cycle to execute and each load effective address needs 2? :?
Well, the problem with this is that it depends on what you are adding. Register to register is 1; register to memory will be more.
This might just be their weighted balance though.
I think the idea of this table is to weight which instructions to use.
and, or, add, sub, shr, shl: the compiler can choose which one to use to complete the operation in fewer cycles. There are a lot more instructions than that, but that's what I think this is for.
A quick example here.
var1 = 30
subtract 8 from var1
now in assembler.
mov ecx, 30;
sub ecx, 8;
Quote:How do they get to use SSE4A when those instructions are SSE3?
I see some new vectors, but the others you listed are from SSE3. Also, that seems like a really short list of instructions for the processor. Are there more and those are just the changes, or is that it?
Good question. Looking at the file and some of the other files like ammintrin.h, it appears that those are the only new instructions for SSE4A. AMD was supposed to be introducing more SSE4 instructions at some later date, so these might very well be the only ones introduced in Barcelona/Agena. Also, the user "hjagasia" replied on the mailing list and the e-mail address is an amd.com one, which is not surprising.
I suppose it would be possible to write the compiler code as long as you know what the instructions will do and what penalties additions, subtractions, etc. have without ever compiling anything with it on the target chip. But since this is an optimization routine, I'd still have to think that it would have had to run a few times to confirm that everything works as it is supposed to. If the chips were not around for testing and no specific optimizations were given for it, then one would simply compile code using generic optimizations. The GCC developers didn't have Core 2 chips until after they shipped (Intel has its own compiler, icc) so that is what C2D users are doing.