A Shared Front-End And Dual Integer Cores

AMD Bulldozer Review: FX-8150 Gets Tested
Sharing The Front-End

As I already mentioned, Bulldozer’s instruction fetch and decode stages are shared between the two cores in each module. AMD uses interleaved multi-threading to track the thread ID of each instruction in flight, decide which thread most needs work completed, and perform an operation on behalf of that thread. The front-end can switch threads on a per-cycle basis to keep both threads moving.
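To make the per-cycle switching concrete, here is a minimal Python sketch of a shared front-end choosing between two threads. AMD doesn't disclose its actual selection heuristic, so the policy below (service whichever thread has received the fewest slots so far) is purely an assumption, as are the names `interleave`, `queues`, and `serviced`.

```python
from collections import deque

def interleave(queues, cycles):
    """Toy per-cycle thread selection for a shared front-end.

    queues maps a thread ID to a deque of pending instructions.
    Assumed policy (not AMD's): each cycle, service the ready thread
    that has been serviced least so far; ties go to the lower thread ID.
    Returns a list of (cycle, thread_id, instruction) tuples.
    """
    serviced = {tid: 0 for tid in queues}
    trace = []
    for cycle in range(cycles):
        # Only threads with pending work compete for the slot.
        ready = [tid for tid, pending in queues.items() if pending]
        if not ready:
            break
        tid = min(ready, key=lambda t: (serviced[t], t))
        trace.append((cycle, tid, queues[tid].popleft()))
        serviced[tid] += 1
    return trace

# Thread 0 has three instructions queued, thread 1 has one.
queues = {0: deque(["a0", "a1", "a2"]), 1: deque(["b0"])}
trace = interleave(queues, cycles=4)
for entry in trace:
    print(entry)
```

In this toy model the two threads alternate while both have work, and thread 0 takes every slot once thread 1 runs dry, which is the general behavior the per-cycle switching is meant to deliver.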

AMD actually decouples the branch target predictor from the instruction fetch stage, allowing it to run ahead, independent of any stalls that occur in the fetch pipeline. More important, AMD says, is that decoupling those components enables a feature called prediction-directed instruction prefetch, characterized by a high level of accuracy and energy efficiency.

Branch prediction is guided by 512-entry L1 and 5000-entry L2 branch target buffers (BTBs). The prediction pipeline is responsible for predicting ahead to populate a queue of future fetch addresses and for keeping that queue as full as possible. There are actually two queues, one for each thread, ensuring there’s always work to be done. The instruction fetch pipeline then pulls addresses from the prediction queue.
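The relationship between the run-ahead predictor and the fetch stage can be sketched as a pair of bounded queues, one per thread. Everything specific here is an assumption made for illustration: the queue depth, the 32-byte sequential fetch step, the class name `PredictionQueue`, and the use of a plain dictionary in place of a real set-associative BTB.

```python
from collections import deque

QUEUE_DEPTH = 4   # hypothetical depth; the article gives no figure
FETCH_BYTES = 32  # assumed sequential step between fetch addresses

class PredictionQueue:
    """Per-thread queue of predicted fetch addresses (toy model)."""

    def __init__(self, start_addr, depth=QUEUE_DEPTH):
        self.next_addr = start_addr
        self.depth = depth
        self.addrs = deque()

    def predict(self, btb):
        """Run the predictor one step ahead if the queue has room."""
        if len(self.addrs) >= self.depth:
            return
        self.addrs.append(self.next_addr)
        # A BTB hit redirects to the stored target; otherwise fall through.
        self.next_addr = btb.get(self.next_addr, self.next_addr + FETCH_BYTES)

    def fetch(self):
        """The fetch stage pulls the oldest predicted address, if any."""
        return self.addrs.popleft() if self.addrs else None

# Stand-in for the BTB: a taken branch at 0x040 jumps to 0x100.
btb = {0x040: 0x100}
queues = {0: PredictionQueue(0x000), 1: PredictionQueue(0x800)}

# Fetch is stalled for five cycles, but the predictor keeps running ahead.
for _ in range(5):
    for queue in queues.values():
        queue.predict(btb)

print([hex(a) for a in queues[0].addrs])
print([hex(a) for a in queues[1].addrs])
```

Both queues fill to their assumed depth while fetch is stalled, so addresses are already waiting when the fetch pipeline resumes.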

Those addresses enter the fetch pipeline’s 64 KB two-way instruction cache, which is shared between both threads (the threads compete dynamically for access to it). Next, Bulldozer’s fetch queue feeds x86 instructions to a decode pipeline composed of four x86 decoders that, in turn, dispatch up to four operations per cycle to the schedulers.
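For a sense of how a 64 KB two-way cache is organized, the short calculation below derives the number of sets and the address bits used to index it. The 64-byte line size is an assumption (the text above doesn't state it), and `cache_geometry` is a hypothetical helper that only handles power-of-two sizes.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return set count and address-bit split for a set-associative cache.

    Assumes size_bytes, ways, and line_bytes are all powers of two.
    """
    sets = size_bytes // (ways * line_bytes)
    return {
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # log2(line_bytes)
        "index_bits": sets.bit_length() - 1,         # log2(sets)
    }

# 64 KB, two-way, with an assumed 64-byte line.
icache = cache_geometry(64 * 1024, ways=2, line_bytes=64)
print(icache)
```

Under that assumption, each of the two ways holds 32 KB, and both threads' instructions compete for the same sets.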

When a miss occurs (that is, the requested instructions aren’t available in the instruction cache), a request is sent to the L2 cache and forwarded to system memory if necessary. That’s a big latency hit. So, while the request is in flight, fetch addresses further along in the prediction queue are looked up to see whether they’ll hit. If they’ll miss as well, a subsequent request is sent to L2 while the first is still on its way back, overlapping the instruction miss requests.
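A rough way to see why overlapping misses matters is to compare the total wait with and without it. The latency and issue-gap figures below are made up for illustration, not measured Bulldozer numbers, and both function names are hypothetical. The model also assumes the gap between requests is no longer than the miss latency.

```python
def serial_miss_cycles(n_misses, miss_latency):
    """Each miss is issued only after the previous one has returned."""
    return n_misses * miss_latency

def overlapped_miss_cycles(n_misses, miss_latency, issue_gap):
    """Misses are issued issue_gap cycles apart, so their latencies overlap.

    The last request goes out (n_misses - 1) * issue_gap cycles after the
    first and completes miss_latency cycles later.
    """
    if n_misses == 0:
        return 0
    return (n_misses - 1) * issue_gap + miss_latency

MISS_LATENCY = 20  # hypothetical cycles for an L2 hit
ISSUE_GAP = 10     # hypothetical cycles between successive requests

serial = serial_miss_cycles(3, MISS_LATENCY)
overlapped = overlapped_miss_cycles(3, MISS_LATENCY, ISSUE_GAP)
print(serial, overlapped)
```

With these invented numbers, three back-to-back misses cost 60 cycles when handled one at a time and 40 cycles when overlapped. The saving grows as the prediction queue lets requests be issued closer together.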

Dual Integer Cores

From the front-end, decoded operations make their way to one of two independent integer cores, where they execute fully out-of-order. The two cores each come equipped with two execution units and two address generation units.

Each core also features its own 16 KB way-predicted L1 data cache. Moreover, both cores include 32-entry L1 data translation lookaside buffers (TLBs) backed by a 1024-entry, eight-way L2 TLB that lives in the logic shared by both cores. Finally, each of the two integer cores employs out-of-order load/store units capable of two 128-bit loads/cycle or one 128-bit store/cycle.
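Those load/store figures translate into peak per-core bandwidth with simple arithmetic. The sketch below does the conversion. The 3.6 GHz clock is an assumption chosen for illustration, the helper names are hypothetical, and the results are theoretical ceilings rather than anything a real workload would sustain.

```python
def bytes_per_cycle(ops_per_cycle, width_bits):
    """Peak bytes moved per cycle for a given op count and width."""
    return ops_per_cycle * width_bits // 8

def peak_gb_per_s(bytes_each_cycle, clock_ghz):
    """Bytes/cycle times billions of cycles/second gives GB/s."""
    return bytes_each_cycle * clock_ghz

load_bpc = bytes_per_cycle(2, 128)   # two 128-bit loads per cycle
store_bpc = bytes_per_cycle(1, 128)  # one 128-bit store per cycle

CLOCK_GHZ = 3.6  # assumed core clock, for illustration only
load_peak = peak_gb_per_s(load_bpc, CLOCK_GHZ)
store_peak = peak_gb_per_s(store_bpc, CLOCK_GHZ)
print(load_bpc, store_bpc, load_peak, store_peak)
```

That works out to 32 bytes of loads or 16 bytes of stores per cycle from each integer core's L1 data cache.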

Other Comments
  • 51 Hide
    jdwii , October 12, 2011 4:14 AM
    Been so long and i'm kinda sad.
  • 43 Hide
    compton , October 12, 2011 4:16 AM
    Not many surprises but I've been waiting for a long, long time for this. I hope this is just the first step to a more competitive AMD.
  • 29 Hide
    ghnader hsmithot , October 12, 2011 4:16 AM
    At least its almost as good as Nehalem.
  • 40 Hide
    gamerk316 , October 12, 2011 4:17 AM
    Dissapointing. Predicted it ages ago though. PII X6 is a better value.
  • 26 Hide
    Anonymous , October 12, 2011 4:18 AM
    As I expected - failure.
  • 25 Hide
    AbdullahG , October 12, 2011 4:18 AM
    I see the guys from the BD Rumors are here. As many others are, I'm disappointed.
  • 33 Hide
    iam2thecrowe , October 12, 2011 4:20 AM
    for the gaming community this is a FLOP.
  • 25 Hide
    phump , October 12, 2011 4:22 AM
    FX-4100 looks like a good alternative to the 955BE. Same price, higher clock, and lower power profile.
  • 40 Hide
    phatbuddha79 , October 12, 2011 4:25 AM
    Why bring back the FX brand for something like this?
  • 47 Hide
    gmcizzle , October 12, 2011 4:25 AM
    What I learned: the 2.5 year old i7-920 is still a beast.
  • 25 Hide
    Ragnar-Kon , October 12, 2011 4:36 AM
    Looks like solid chips, but I'll admit that the price point isn't low enough to compete in the gaming world with Intel.

    I am rather curious how the FX-4100 will stack up against the current Phenom II X4 chips.

    And even though the FX is a slight disappointment, I am rather impressed by the Windows 8 benchmarks. Having said that, by the time Windows 8 is ready for release I'm sure Intel will have an even better solution.
  • 25 Hide
    Tamz_msc , October 12, 2011 4:37 AM
    So Bulldozer is AMD's version of NetBurst?
  • 54 Hide
    Homeboy2 , October 12, 2011 4:38 AM
    killerclick: As I said before, it won't come close to beating Intel in performance or price. Now let's hear the fanboys whine.


    Everyone should cry, even the Intel fanboys, this is bad news for everyone, now Intel has absolutely no incentive to lower prices or accelerate Ivy Bridge.
  • 12 Hide
    the associate , October 12, 2011 4:41 AM
    killerclick: As I said before, it won't come close to beating Intel in performance or price. Now let's hear the fanboys whine.


    Waaaahhhhhhhhhhhhh!!!!!!!!

    Bah, well, been with AMD since my first pc like 8 years ago...Guess I'll be going intel for the first time ever especially since I can get an overkill cpu for just 300 bucks. Hell that's how much I payed for my phenom II 955...