The L1 data cache is 8 KB, with a 2-cycle access latency for a 64-byte cache line, and it can be accessed every cycle.
The L2 cache is 512 KB on the Northwood models, with a 7-cycle transfer latency for a 64-byte cache line, and it can be accessed every 2 cycles.
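To put those sizes in perspective, here is a quick back-of-the-envelope sketch of how many cache lines each level holds. It uses only the figures quoted above (8 KB, 512 KB, 64-byte lines); the helper name `line_count` is just made up for illustration.

```python
KB = 1024
LINE_BYTES = 64  # cache line size quoted above


def line_count(cache_bytes, line_bytes=LINE_BYTES):
    """Number of whole cache lines that fit in a cache of the given size."""
    return cache_bytes // line_bytes


l1_lines = line_count(8 * KB)     # L1 data cache
l2_lines = line_count(512 * KB)   # Northwood L2 cache

print("L1 lines:", l1_lines)                  # 128
print("L2 lines:", l2_lines)                  # 8192
print("L2 / L1 ratio:", l2_lines // l1_lines)  # 64
```

So the L1 only holds 128 lines at a time, which is why the fast L2 behind it matters so much.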
There is no traditional L1 instruction cache. Decoded instructions are stored in the execution trace cache, and later executions of those instructions are served from the trace cache as already-decoded micro-ops. The trace cache stores 12k micro-ops. Each x86 instruction decodes into one or more micro-ops depending on its complexity, and the x86 instructions themselves vary in length from 1 byte to 15 bytes.
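If the "decode once, reuse the micro-ops" idea is hard to picture, here is a toy sketch of it. This is not how the real hardware is organized (the real trace cache stores sequences of micro-ops along the execution path, not one entry per instruction), and everything here is hypothetical: the `ToyTraceCache` class, the `toy_decode` table, and the oldest-first eviction are assumptions made only to show the principle.

```python
from collections import OrderedDict

TRACE_CACHE_CAPACITY_UOPS = 12 * 1024  # "12k micro-ops" from the text


class ToyTraceCache:
    """Toy model: caches decoded micro-ops keyed by instruction address."""

    def __init__(self, decode, capacity_uops=TRACE_CACHE_CAPACITY_UOPS):
        self.decode = decode              # function: instruction -> list of micro-ops
        self.capacity_uops = capacity_uops
        self.entries = OrderedDict()      # address -> tuple of micro-ops
        self.stored_uops = 0
        self.decode_count = 0             # how often the decoder had to run

    def fetch(self, address, instruction):
        # Hit: return the already-decoded micro-ops without decoding again.
        if address in self.entries:
            return self.entries[address]
        # Miss: decode once, then store the result for later fetches.
        uops = tuple(self.decode(instruction))
        self.decode_count += 1
        # Assumed policy: evict the oldest entries until the new one fits.
        while self.entries and self.stored_uops + len(uops) > self.capacity_uops:
            _, evicted = self.entries.popitem(last=False)
            self.stored_uops -= len(evicted)
        self.entries[address] = uops
        self.stored_uops += len(uops)
        return uops


def toy_decode(instruction):
    # Hypothetical decode table: a simple instruction maps to one micro-op,
    # a more complex one (with a memory operand) maps to two.
    table = {
        "add eax, ebx": ["add"],
        "add eax, [mem]": ["load", "add"],
    }
    return table[instruction]


cache = ToyTraceCache(toy_decode)
program = [(0x1000, "add eax, ebx"), (0x1002, "add eax, [mem]")]

# Run the same two instructions three times, as a loop would.
for _ in range(3):
    for address, instruction in program:
        cache.fetch(address, instruction)

print("decodes performed:", cache.decode_count)
print("micro-ops stored:", cache.stored_uops)
```

Six fetches, but the decoder only runs on the first pass through the loop; every later pass is served straight from the cached micro-ops.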
There's no real way to compare the two caching structures on size and speed alone; they just work differently.