Yes, the Athlon's L1 cache is slower. In fact, for FP data, the P4's L2 cache is about the same speed as the L1 data cache on the Athlon.
Basically, the P4's "L1" caches shouldn't even be considered L1 anymore; they're more like L0 (yes, technically that's the registers, but I can't very well call it L0.5, can I?). The trace cache, which takes the place of an L1 instruction cache, is part of the core itself, designed into the microarchitecture. The L1 data cache is only there to store integer data; the FP units get all of their data from the L2.
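If anyone wants to measure this rather than take my word for it, the usual trick is a pointer chase: build one random cycle through a buffer, follow it, and divide the elapsed time by the number of loads. Below is a rough sketch of that idea, not a calibrated benchmark. The names `build_cycle`, `chase` and `ns_per_load` are made up for illustration, the timer is plain `clock()`, and it only exercises integer loads, so it says nothing directly about the FP path.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Fill next[0..n-1] with a single random cycle (Sattolo's algorithm), so that
   following next[] visits every slot exactly once before returning to start. */
void build_cycle(size_t *next, size_t n, uint64_t seed)
{
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    if (n < 2)
        return;
    for (size_t i = n - 1; i > 0; i--) {
        /* Small LCG so the shuffle is deterministic for a given seed. */
        seed = seed * 6364136223846793005ULL + 1442695040888963407ULL;
        size_t j = (size_t)((seed >> 33) % i);   /* j is in [0, i-1] */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` dependent loads and return the final index.
   Each load depends on the previous one, so the loads cannot overlap. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t i = start;
    while (steps--)
        i = next[i];
    return i;
}

/* Rough average time per dependent load, in nanoseconds, for a working set
   of `bytes` bytes. Returns -1.0 on allocation failure or bad arguments. */
double ns_per_load(size_t bytes, size_t steps)
{
    size_t n = bytes / sizeof(size_t);
    if (n == 0 || steps == 0)
        return -1.0;
    size_t *next = malloc(n * sizeof *next);
    if (next == NULL)
        return -1.0;
    build_cycle(next, n, 12345);

    volatile size_t sink = chase(next, 0, n);    /* warm-up pass */
    clock_t t0 = clock();
    sink = chase(next, 0, steps);                /* timed pass */
    clock_t t1 = clock();
    (void)sink;

    free(next);
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)steps;
}
```

Call `ns_per_load` with a working set that fits in the L1 data cache, then one that only fits in the L2, and compare the two numbers on each chip. Keep in mind that several `size_t` slots share a cache line and `clock()` is coarse, so use a large step count and treat the results as ballpark figures.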
"We are Microsoft, resistance is futile." - Bill Gates, 2015.