I agree with wusy that more cache should be faster in games. I believe the explanation has something to do with being able to fit larger or more textures into the cache, which allows faster processing and makes size and bandwidth more critical than latency. Everyday applications, especially Office, on the other hand benefit more from lower latency than from bandwidth because the files are generally smaller and don't require such a large cache.
It also isn't true that larger caches always have higher latency. On the Intel side, the Celeron D and 5xx models share a common latency of 23 cycles even though the Celeron D has 256k of L2 to the 5xx's 1MB. The 6xx (2MB) and 8xx (2x1MB) likewise share a slightly higher common latency of 27 cycles. As well, despite what mpjesse has said, the 2x2MB cache of the 9xx does not have higher latency than the 8xx. The L2 cache of the 9xx has been doubled while maintaining the same 27 cycle latency. The 8xx always had the added latency of the 6xx series despite having less cache per core. The new 65nm Celeron Ds coming out will have double the cache at 512k, and I presume the latency will be increased to 27 cycles to correspond to the 6x1 series they are based on.
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2658&p=4
The most telling example of a larger cache being obtained without added latency is the Pentium M. Dothan doubled the L2 cache of Banias to 2MB while still maintaining the low 10 cycle latency. As well, the reason AMD can quote an extra 200MHz in the PR rating for doubling the cache is that their increase is likewise obtained without increased latency. I believe both the 512k and 1MB L2 cache A64s have a latency of 17 cycles.
Generally though, the extra cache doesn't make a significant difference in performance. Only applications that can readily fill up such a large cache show a performance difference. This is especially true for AMD's architecture which, due to the use of an exclusive cache, doesn't rely as heavily on the L2 cache and so doesn't see as large an increase in performance when it is increased. A large part of it is also that the point of an L2 cache is to increase hit rates and reduce the likelihood of going to RAM, but once the hit rate is, say, 99.99% (just making up the number to illustrate), an extra 0.009% won't make as much of a difference.
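To put rough numbers on the diminishing-returns point, here is a minimal sketch of the usual average-access-time arithmetic. The 27 cycle L2 latency is the figure quoted above. The 200 cycle RAM latency, the hit rates, and the function name are all made up for illustration. It also ignores L1 and any overlap between accesses, so treat it as a back-of-the-envelope model rather than a measurement.

```python
def average_access_cycles(l2_hit_rate, l2_latency_cycles, ram_latency_cycles):
    """Average cost of an access that reaches L2.

    Every access pays the L2 latency; misses additionally pay the RAM latency.
    """
    miss_rate = 1.0 - l2_hit_rate
    return l2_latency_cycles + miss_rate * ram_latency_cycles


L2_LATENCY = 27    # cycles, the 6xx/8xx/9xx figure quoted in this post
RAM_LATENCY = 200  # cycles, assumed purely for illustration

# Hit rates are invented, in the same spirit as the 99.99% example above.
for hit_rate in (0.90, 0.99, 0.9999, 0.99999):
    cycles = average_access_cycles(hit_rate, L2_LATENCY, RAM_LATENCY)
    print(f"hit rate {hit_rate:.3%}: {cycles:.3f} cycles on average")
```

Under these assumptions, going from a 90% to a 99% hit rate saves 18 cycles per access, while going from 99.99% to 99.999% saves only about 0.018 cycles. That is why a bigger cache stops helping much once the working set already fits.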
I believe that endyen usually said that the extra cache is useful as a heat sink so if you are overclocking, the extra cache is a good idea. In any case, MHz provides a more reliable performance boost.
Oh, and about 64-bit. I'm going out on a limb here since I'm not completely sure, but I believe that extra cache will be more helpful in that case since the instructions are longer (twice or less?) and so would take up more cache space.