Well, let's see. First of all, I want to stress that performance as such is *not* the biggest advantage of 64 bit computing. It's the flat memory addressing: more than 2 GB of address space for your apps, and no more fragmented virtual memory leaving you with just a couple hundred megabytes of contiguous memory.
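To show what I mean by fragmentation, here is a small Python sketch. The layout is entirely made up (the addresses and sizes are hypothetical, not measured from any real process) and `largest_free_gap` is just a helper name I picked. It only illustrates how plenty of free address space can still leave you with a small contiguous block.

```python
MB = 2 ** 20

def largest_free_gap(space_size, mapped):
    """Return the largest contiguous free range, in bytes.

    mapped is a list of (start, size) regions that are already in use.
    """
    largest = 0
    cursor = 0  # end of the highest mapped region seen so far
    for start, size in sorted(mapped):
        largest = max(largest, start - cursor)
        cursor = max(cursor, start + size)
    return max(largest, space_size - cursor)

# Hypothetical 2 GB user address space with regions scattered through it
# (think executable, heaps, DLLs loaded at fixed base addresses).
mapped = [
    (0, 64 * MB),
    (400 * MB, 16 * MB),
    (700 * MB, 32 * MB),
    (1000 * MB, 8 * MB),
    (1300 * MB, 64 * MB),
    (1700 * MB, 16 * MB),
    (2000 * MB, 48 * MB),
]

total_free = 2048 * MB - sum(size for _, size in mapped)
print(total_free // MB)                           # 1800
print(largest_free_gap(2048 * MB, mapped) // MB)  # 336
```

So in this invented layout 1800 MB is free, but the biggest single allocation you could make is 336 MB.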
That being said, a 64 bit CPU can be faster in some cases, especially the K8. First of all, there are, like you mentioned, apps that require math on integers bigger than 32 bits. A 64 bit CPU can crunch these in theory up to 4x faster (in the case of a 64 bit multiply, which can be done at once instead of in 4 steps like on a 32 bit CPU). However, these apps are rare: mostly encryption, mathematical/scientific apps or simulations. Not something you'd run on a daily basis (unless perhaps those keygen cracks).
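To make the "4 steps" concrete, here is a rough Python sketch of how a 64 bit multiply gets built out of 32 bit pieces. The function names (`mul32`, `mul64_full`) are mine, and `mul32` just stands in for a single 32x32 hardware multiply. It is a model of the arithmetic, not of what any particular compiler emits.

```python
MASK32 = 0xFFFFFFFF

def mul32(a, b):
    """Stand-in for one 32x32 -> 64 bit hardware multiply."""
    assert 0 <= a <= MASK32 and 0 <= b <= MASK32
    return a * b

def mul64_full(a, b):
    """Full 64x64 -> 128 bit product from four 32 bit multiplies."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    lo_lo = mul32(a_lo, b_lo)  # step 1
    lo_hi = mul32(a_lo, b_hi)  # step 2
    hi_lo = mul32(a_hi, b_lo)  # step 3
    hi_hi = mul32(a_hi, b_hi)  # step 4

    # Shift each partial product into place and add them up.
    return lo_lo + ((lo_hi + hi_lo) << 32) + (hi_hi << 64)

x = 0x123456789ABCDEF0
y = 0x0FEDCBA987654321
print(mul64_full(x, y) == x * y)  # True
```

Note that the four multiplies are for the full 128 bit result. If you only keep the low 64 bits, the `hi_hi` term falls off the top and three multiplies are enough, so "up to 4x" really is a best case.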
Then there is the memory addressing issue: a 64 bit app under a 64 bit OS can access all of its memory directly, whereas on x86 a process is limited to 2 or 3 GB, period. It's hard to put a performance estimate on this, but if an app requires more than what is available, it has to launch several processes, and the OS is forced to use PAE and bank-switching crap that is slow as hell and a PITA to develop for. I can't give hard numbers on this, but I would guess memory access could be as much as 5x slower using PAE and passing data from one process to the other. Not to mention, certain apps just can't be programmed around this at all. This will be most apparent in server apps (ERP, databases, etc.) and is the reason every other server ISA except x86 moved to 64 bit ages ago. On the desktop this will be less apparent, since no desktop apps that I am aware of support PAE or /3GB in the first place. The difference there will not be performance, but rather: it will work instead of not working (think Photoshop using huge images, and other memory hogs).
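For reference, the arithmetic behind those limits, as a quick Python sketch. The helper names are mine, and the 10 GB working set is just an example number. The 36 bit figure is the physical address width PAE gives you; the 48 bit figure is what I understand the first AMD64 chips to implement for virtual addresses, so treat that one as an assumption.

```python
import math

GB = 2 ** 30

def addressable_gb(address_bits):
    """Size in GB of an address space with the given number of address bits."""
    return 2 ** address_bits // GB

def processes_needed(working_set_gb, per_process_gb):
    """Processes required to hold a working set when each one is capped."""
    return math.ceil(working_set_gb / per_process_gb)

print(addressable_gb(32))  # 4      (32 bit virtual space, of which apps get 2 or 3)
print(addressable_gb(36))  # 64     (physical memory reachable with PAE)
print(addressable_gb(48))  # 262144 (assumed 48 bit virtual space on AMD64)

# A hypothetical 10 GB working set split across 32 bit processes:
print(processes_needed(10, 2))  # 5 with the default 2 GB limit
print(processes_needed(10, 3))  # 4 with /3GB
```

Every one of those extra processes means data has to be shuffled between address spaces, which is where the slowdown I'm guessing at comes from.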
Then there are some specific advantages to AMD64. First of all, AMD extended the register set from 8 to 16 (of the original 8, I think 5 are true general purpose registers, so the real increase is from 5 to 13). Depending on the code, I would guess this might give you a ~5-10% performance boost.
AMD also extended SSE2 with 8 additional 128 bit FP registers. SSE2 performance should therefore be noticeably better under AMD64 in 64 bit long mode than in legacy 32 bit mode. I don't dare give an estimate, but this could have a noticeable impact on SSE2 dependent code (games, 3D rendering, encoding, ...).
Lastly, I expect a speedup because of K8 specific compiler optimizations. Currently, most software is optimized for a wide range of CPUs, going from P2/P3 to P4, Athlon, A64, etc. Code compiled for AMD64 does not need switches to remain compatible with (or optimized for) older CPUs, since the K8 is the only AMD64 CPU anyway. I would bet that currently most new applications are compiled with the P4 as performance target, not the K7 (which is logical given the market share). For 64 bit code this won't be true anymore: the performance optimization target will be the K8, and this alone might lead to more significant speedups than everything mentioned above, even though it has nothing to do with "64 bit" as such. It's just the benefit of being the market leader (in this case, of the pretty small 64 bit x86 market).
But AMD64 also has its drawbacks; not everything will speed up. Two things. First, when you use a 64 bit OS and run an app that doesn't need 64-bit data, you can use the old 32-bit instructions just fine and only use 64-bit ones as necessary. However, a pointer is always going to be 64 bits in AMD64 long mode, so some applications might take a slight performance hit because of the extra strain on the memory bus caused by loading 64 bits for every pointer instead of 32. I would expect the extra registers to offset this possible disadvantage, but quick & dirty recompiles could actually give you a minor performance hit instead of an advantage.
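A rough Python sketch of that pointer overhead. The node layout (one 4 byte payload plus two pointers, like a doubly linked list) is a hypothetical example, `node_size` is my own helper, and the 64 byte cache line is an assumption. The layout rule is plain natural alignment, which is what C compilers typically do, but check your own compiler.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def node_size(pointer_bytes, payload_bytes=4, pointers=2):
    """Size of a node: one payload field followed by some pointers.

    Assumes natural alignment and power-of-two field sizes.
    """
    offset = payload_bytes                    # payload sits at offset 0
    offset = align_up(offset, pointer_bytes)  # padding before the pointers
    offset += pointers * pointer_bytes
    struct_alignment = max(payload_bytes, pointer_bytes)
    return align_up(offset, struct_alignment)

CACHE_LINE = 64  # bytes, assumed

for pointer_bytes in (4, 8):
    size = node_size(pointer_bytes)
    print(pointer_bytes * 8, "bit pointers:", size, "bytes per node,",
          CACHE_LINE // size, "whole nodes per cache line")
```

With these assumptions the node goes from 12 to 24 bytes, so pointer-heavy data structures move roughly twice the data for the same work, while code that mostly touches plain arrays of ints or floats sees no difference.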
A second disadvantage is the fact that ICC (the Intel compiler) doesn't support AMD64 (yet?). ICC is probably the compiler that generates the fastest x86 code. It is not widely used except for "benchmark software" and of course SPEC, but some CPU intensive apps (think 3D render cores, DivX codecs, maybe even GPU drivers, etc.) might well be compiled with ICC because of the performance. Porting to AMD64 will require using other compilers (Microsoft, GCC, Portland, ...) which might not give you binaries that are as fast. SPEC scores especially (where ICC shines) will be affected, but some often used benchmark programs might take a hit as well. Your average game, OTOH, is not likely to have been compiled with ICC.
Now, you were wondering: what will this give me for "my apps"? The answer is: I don't know. I am not expecting leaps and bounds, even though I just saw some results under XP 64 that were just incredible (like more than double the performance using DivX). Overall, I am prudent and do not expect more than ~10%. In some cases I even expect a small loss of performance, while a few other specific programs might give spectacular increases. For gaming, I doubt we will see spectacular increases in FPS, but I firmly expect future games to let you select higher details, bigger maps, etc. on a 64 bit CPU/OS. Compare it to DX8 versus DX9: not really faster as such, but more eye candy and more flexibility for developers.