http://www.osnn.net/comments.php?shownews=12595
What does this mean for the AMD chips that use HyperTransport? Or is AMD's HT totally different?
An obvious question is how an SMT processor offers up its multithreading capabilities to software. In the case of the EV8, it does so with an abstraction called a thread processing unit, or TPU. A TPU is essentially a single-threaded virtual processor that is presented to the lowest level of the operating system hardware abstraction layer (HAL). The EV8’s four-way SMT capabilities are represented with four separate TPUs, as shown in Figure 5.
Essentially, the EV8 appears to software as four separate processors that share a single set of translation lookaside buffers (TLBs) and caches. The advantages of SMT over a real four-way chip-level multiprocessor (CMP) are that only one physical processor occupies die area and that cache coherency occurs without extra logic or overhead.
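To see what "appears to software as separate processors" looks like from the software side today, here is a minimal Python sketch. It is not EV8-specific, and the function name `logical_processor_count` is made up for illustration. It relies only on the standard library's `os.cpu_count()`, which reports logical CPUs, so on an SMT machine each hardware thread is counted as its own processor.

```python
import os


def logical_processor_count() -> int:
    """Return the number of logical processors the OS exposes.

    Hypothetical helper for illustration. os.cpu_count() counts
    logical CPUs, so each SMT hardware thread shows up as one
    processor. It can return None if the count is undetermined,
    in which case we fall back to 1.
    """
    count = os.cpu_count()
    return count if count is not None else 1


if __name__ == "__main__":
    print(f"The OS presents {logical_processor_count()} logical processor(s)")
```

The number printed depends on the machine, and by itself it does not say how many of those logical processors share one physical core.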
The EV8 uses a more powerful mechanism than either coarse- or fine-grained multithreading to exploit TLP. Called Simultaneous Multithreading (SMT), it allows instructions from two or more threads to be issued to execution units each cycle. This process is illustrated conceptually in Figure 1D. The advantage of SMT is that it permits TLP to be exploited all the way down to the most fundamental level of hardware operation - instruction issue slots in a given clock period. This allows instructions from alternate threads to take advantage of individual instruction execution opportunities presented by the normal ILP inefficiencies of single-thread program execution. SMT can be thought of as equivalent to the airline practice of using standby passengers to fill seats that would otherwise have flown empty.
Consider a single thread executing on a superscalar processor. Conventional superscalar processors such as the Alpha EV6 fall well short of utilizing all the available instruction issue slots. This is caused by execution inefficiencies, including data dependency stalls, cycle-by-cycle shortfalls between thread ILP and processor resources given limited reordering capability, and memory accesses that miss in the cache. The big advantage of SMT over other approaches is its inherent flexibility in providing good performance over a wide spectrum of workloads. Programs that have a lot of extractable ILP can get nearly all the benefit of the wide issue capability of the processor, and programs with poor ILP can share instruction issue slots and execution resources that would otherwise have gone unused with other threads.
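The "standby passengers" idea can be sketched as a toy model in Python. This is an illustration under loud assumptions, not how the EV8 actually schedules: the issue width of 4, the per-cycle "ready instruction" counts, and the fixed thread priority are all made up, and instructions that fail to issue in a cycle are simply dropped rather than carried over to the next one.

```python
def issue_slots_used(per_thread_ready, width):
    """Count issue slots filled over all cycles.

    per_thread_ready[t][c] is the (made-up) number of instructions
    thread t could issue in cycle c. Threads are served in list
    order, so later threads only get slots the earlier ones left
    empty, like standby passengers.
    """
    cycles = len(per_thread_ready[0])
    used = 0
    for c in range(cycles):
        free = width  # empty issue slots at the start of this cycle
        for thread in per_thread_ready:
            take = min(thread[c], free)
            used += take
            free -= take
    return used


def utilization(per_thread_ready, width):
    """Fraction of all issue slots that were filled."""
    cycles = len(per_thread_ready[0])
    return issue_slots_used(per_thread_ready, width) / (width * cycles)


# Hypothetical workloads: ready instructions per cycle for two threads.
thread_a = [2, 0, 1, 4, 0, 3]
thread_b = [1, 2, 0, 1, 3, 1]
WIDTH = 4  # assumed issue width for the toy model

if __name__ == "__main__":
    print("single thread:", round(utilization([thread_a], WIDTH), 3))
    print("two-way SMT:  ", round(utilization([thread_a, thread_b], WIDTH), 3))
```

With these made-up numbers, thread A alone fills 10 of 24 slots, while A and B together fill 17 of 24. In the cycle where A already has four ready instructions, B gets nothing, which matches the point that a high-ILP program still gets nearly the full width.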
awesome-o said: I think they must be different, as Intel and AMD are totally different companies with different goals and aspirations. I imagine that just because Intel dropped the tech doesn't necessarily mean AMD will. In the meantime though, I just want the AMD X2 4800+... Battlefield 2 needs dual core...