Athlon
From Wikipedia, the free encyclopedia
Athlon (central processing unit) |
[Image: An AMD Athlon XP 1700 (Thoroughbred)] |
Produced: | From mid 1999 to 2005 |
Manufacturer: | AMD |
CPU Speeds: | 500 MHz to 2.33 GHz |
FSB Speeds: | 100 MHz to 200 MHz |
Process (MOSFET channel length): | 0.25 µm to 0.13 µm |
Instruction Set: | x86 |
Sockets: | |
Cores:
|
Athlon is the brand name applied to a series of different x86 processors designed and manufactured by AMD. The original Athlon, or Athlon Classic, was the first seventh-generation x86 processor and, in a first, retained the initial performance lead it had over Intel's competing processors for a significant period of time. AMD has continued the Athlon name with the Athlon 64, an eighth-generation processor featuring x86-64 (later renamed AMD64) technology.
Athlon Classic
The Athlon made its debut on June 23, 1999. The name "Athlon" was chosen by AMD as short for "decathlon". Athlon was the ancient Greek word for "Champion/trophy of the games". The original Athlon core revision, code-named "K7" (in homage to its predecessor, the K6), was available in speeds of 500 to 700 MHz at its introduction and was later sold at speeds up to 1000 MHz (K75). The processor was compatible with the industry-standard x86 instruction set and plugged into a motherboard slot (Slot A) mechanically similar to (but not pin-compatible with) the Pentium II's Slot 1.
Internally, the Athlon was a true seventh-generation x86 processor, the first of its kind. The CPU was designed by a team of AMD engineers and newly hired ex-DEC engineers, and the result merged technologies from AMD's earlier CPUs and the DEC Alpha 21264. Like the AMD K5 and K6, the Athlon is a RISC microprocessor that decodes x86 instructions into its own internal instructions at runtime. It is again an out-of-order design, like previous post-486 AMD CPUs. The Athlon uses the DEC Alpha EV6 bus architecture with double data rate (DDR) technology. Although the bus was initially clocked at 100 MHz, its DDR signaling allowed it to provide significantly higher bandwidth than the Intel GTL+ bus used by the Pentium III and its derivatives.
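The bandwidth advantage of the double-pumped bus can be illustrated with a short calculation. This is a minimal sketch of peak theoretical bandwidth only; the 64-bit data width for both buses is an assumption not stated in the text above, and the function name is made up for this example.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak theoretical bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8

# Assumption: both buses carry 64 data bits per transfer.
ev6_mb_s = peak_bandwidth_mb_s(100, 2, 64)   # EV6: 100 MHz, two transfers per clock
gtl_mb_s = peak_bandwidth_mb_s(100, 1, 64)   # GTL+: 100 MHz, one transfer per clock

print(f"EV6 at 100 MHz DDR: {ev6_mb_s:.0f} MB/s")
print(f"GTL+ at 100 MHz:    {gtl_mb_s:.0f} MB/s")
```

Under these assumptions the double-pumped bus offers twice the peak bandwidth at the same base clock. Sustained bandwidth in a real system would be lower for both.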
AMD designed the CPU with more robust x86 instruction decoding capabilities, to enhance its ability to keep more data in-flight at once. Athlon's CISC to RISC decoder triplet could potentially decode 6 x86 operations per clock, although this was somewhat unlikely in real-world use.[1] The critical branch predictor unit was enhanced compared to what was onboard the K6 because Athlon's longer pipeline necessitated highly accurate prediction to prevent performance-costly pipeline stalls. The deeper pipelining with more stages allowed higher clock speeds to be attained.[2] Whereas the AMD K6-III+ topped out at 570 MHz due to its short pipeline, even when built on the 180 nm process, the Athlon was designed to go much higher.
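The link between pipeline depth and prediction accuracy can be sketched with the common first-order model of cycles per instruction (CPI). All figures below (branch fraction, misprediction rates, penalty cycles) are illustrative assumptions, not measured K6 or Athlon values.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """First-order model: every mispredicted branch costs penalty_cycles."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles

# Hypothetical workload in which 20% of instructions are branches.
BRANCH_FRACTION = 0.20

# Short pipeline: a misprediction flushes few stages (assumed 5 cycles).
short_pipe = effective_cpi(1.0, BRANCH_FRACTION, 0.10, 5)
# Longer pipeline with the same predictor: the flush costs more (assumed 10 cycles).
long_pipe_same_predictor = effective_cpi(1.0, BRANCH_FRACTION, 0.10, 10)
# Longer pipeline with a more accurate predictor (assumed 5% misprediction rate).
long_pipe_better_predictor = effective_cpi(1.0, BRANCH_FRACTION, 0.05, 10)

for label, cpi in [
    ("short pipeline", short_pipe),
    ("long pipeline, same predictor", long_pipe_same_predictor),
    ("long pipeline, better predictor", long_pipe_better_predictor),
]:
    print(f"{label}: {cpi:.2f} cycles per instruction")
```

In this toy model, doubling the misprediction penalty must be offset by halving the misprediction rate to keep CPI unchanged, which is why a deeper pipeline calls for a better predictor.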
AMD ended its long-standing weakness in floating point performance by designing a super-pipelined, out-of-order, triple-issue floating point unit.[1] Each of its three units was tailored to a particular type of instruction, with some redundancy to accommodate the most common code patterns. Having separate units made it possible to operate on more than one floating point instruction at once.[1] This FPU was a huge step forward for AMD. While the K6 FPU had looked anemic compared to the Intel P6 FPU, the Athlon's FPU outperformed even that of the Pentium III.[3] The Athlon also gained a revised version of 3DNow!, called "Enhanced 3DNow!", with added DSP instructions and an implementation of the extended-MMX subset of Intel SSE.[4]
Caching onboard the Athlon consisted of the typical two levels. First came the largest level 1 cache in x86 history at the time: a split, 2-way associative cache of 128 KiB, half for data and half for instructions (Harvard architecture).[1] This cache was double the size of the K6's already large cache and quadruple the size of the Pentium II's and III's L1 cache. As with Intel's Pentium II and "Katmai" Pentium III, there was also a 512 KiB secondary cache, mounted outside the CPU die and running at a lower speed than the core. The cache was placed on its own 64-bit bus, called a "backside bus", similar to AMD's own K6-III and Intel's Pentium Pro and later CPUs.[5] A backside bus allows concurrent cache and main RAM accesses, dramatically improving efficiency and bandwidth. This alone was a major improvement over the L2 cache architecture of the AMD K6-2 and earlier, where L2 and RAM shared the front side bus. Initially the L2 cache ran at half the CPU clock speed, on Athlons up to 700 MHz, but later Slot-A processors ran the cache at 2/5 (up to 850 MHz) or 1/3 (up to 1 GHz) of the core speed.[6] With a 1/2 divider, a 1.0 GHz Slot-A Athlon would have required the external cache chips to run at 500 MHz. The SRAM available at the time was simply incapable of reaching this speed, due both to cache chip technology limitations and to the electrical and latency complications of running an external cache that fast. Later Athlon processors would, like Intel's Pentium III Coppermine, move to an on-die L2 cache to allow higher cache clock speeds. Athlon cores before "Thunderbird" used an inclusive caching scheme that duplicated L1 cache data in the L2 cache.[7] This matched Intel's processors but differed from later AMD processors, which used exclusive designs.
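The divider arithmetic above can be checked with a few lines of code. This is a sketch using only the thresholds quoted in the paragraph (1/2 up to 700 MHz, 2/5 up to 850 MHz, 1/3 up to 1 GHz); the table and function names are made up for this example.

```python
from fractions import Fraction

# (highest core clock in MHz, L2 divider), taken from the paragraph above.
L2_DIVIDERS = [
    (700, Fraction(1, 2)),
    (850, Fraction(2, 5)),
    (1000, Fraction(1, 3)),
]

def l2_clock_mhz(core_mhz):
    """Return the external L2 cache clock for a Slot-A Athlon core clock."""
    for max_core_mhz, divider in L2_DIVIDERS:
        if core_mhz <= max_core_mhz:
            return float(core_mhz * divider)
    raise ValueError("no Slot-A divider listed above 1000 MHz")

for core in (700, 850, 1000):
    print(f"{core} MHz core -> {l2_clock_mhz(core):.0f} MHz L2 cache")

# Hypothetical case from the text: a 1 GHz part with the original 1/2 divider.
print(f"1000 MHz core at 1/2 -> {1000 * Fraction(1, 2)} MHz L2 cache")
```

The three real configurations all keep the cache chips at roughly 330 to 350 MHz, which suggests the dividers were chosen to stay within what the external SRAM could sustain.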
The Slot-A Athlons were the first multiplier-locked CPUs from AMD. This was partly done to fight off CPU remarking being done by questionable resellers around the globe. AMD's older CPUs could simply be set to run at whatever speed the user chose on the motherboard, making it trivial to relabel a CPU and sell it as a faster grade than it was originally. These relabeled CPUs were not always stable, being overclocked and not tested properly, and this was damaging to AMD's reputation. Although the Athlon was multiplier locked, crafty enthusiasts eventually discovered that a connector on the PCB of the cartridge could control the multiplier. Eventually a device called the "Goldfingers device" was created that could unlock the CPU, named after the gold connector pads it attached to. It was basically a module that attached to the CPU board connector and had a set of switches that opened or shorted the circuits the board connector controlled.[8]
Upon release, the Athlon was the fastest x86 CPU in the world, and various versions of the CPU held this distinction continuously from August 1999 until January 2002. The Athlon outclassed Intel's Pentium III processors in nearly every way and, years later, still compared quite favorably to the best that the Pentium 4's NetBurst architecture could offer.
In commercial terms, the Athlon Classic was an enormous success — not just because of its own merits, but also because the normally dependable Intel endured a series of major production, design, and quality control issues at this time. In particular, Intel's transition to a 0.18 μm production process, starting in late 1999 and running through to mid-2000, was chaotic, and there was a severe shortage of Pentium III parts. In contrast, AMD enjoyed a remarkably smooth process transition, had ample supplies available, and Athlon sales became quite strong. Many long-time Intel-only PC dealers found the combination of the Athlon's excellent performance and reasonable pricing tempting, and the prospect of being able to get stock in commercial volumes impossible to resist. The demand that resulted in fact caused AMD to stop producing the K6-III.
Athlon Thunderbird (T-Bird)
The second generation Athlon, the Thunderbird, debuted on June 5, 2000. This version of the Athlon shipped in a more traditional pin-grid array (PGA) format that plugged into a socket ("Socket A") on the motherboard. It was sold at speeds ranging from 700 to 1400 MHz. The major difference, however, was cache design. Just as Intel had done when they replaced the old Katmai Pentium III with the much faster Coppermine P-III, AMD replaced the 512 KiB external reduced-speed cache of the Athlon Classic with 256 KiB of on-chip, full-speed cache. As a general rule, more cache improves performance, but faster cache improves it further still.
AMD changed cache design significantly with Thunderbird. With the older Athlon CPUs, the CPU caching was of an inclusive design where data from the L1 is duplicated in the L2 cache. Thunderbird moved to an exclusive design where the L1 cache's contents are not duplicated in the L2. This increases total cache size of the processor and effectively makes caching behave as if there is a very large L1 cache with a slower region (the L2) and a very fast region (the L1).[9] Because of Athlon's very large L1 cache and the exclusive design which turns the L2 cache into basically a "victim cache", the need for high L2 performance and size was lessened. AMD kept the 64-bit L2 cache data bus from the older Athlons, as a result, and allowed it to have a relatively high latency. A simpler L2 cache reduced the possibility of the L2 cache causing clock scaling and yield issues. Still, instead of the 2-way associative scheme used in older Athlons, Thunderbird did move to a more efficient 16-way associative layout.[7]
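The exclusive "victim cache" behaviour described above can be sketched with a toy simulation. This is a simplified model under stated assumptions (fully associative levels, LRU replacement, capacities counted in cache lines), not a description of the Thunderbird's actual 2-way L1 and 16-way L2 organisation; the class name is made up for this example.

```python
from collections import OrderedDict

class ExclusiveHierarchy:
    """Toy two-level exclusive cache: a line lives in L1 or L2, never both."""

    def __init__(self, l1_lines, l2_lines):
        self.l1_lines = l1_lines
        self.l2_lines = l2_lines
        self.l1 = OrderedDict()  # least recently used entry first
        self.l2 = OrderedDict()

    def access(self, addr):
        if addr in self.l1:
            self.l1.move_to_end(addr)      # refresh LRU position
            return "L1 hit"
        if addr in self.l2:
            del self.l2[addr]              # line moves up, so it leaves L2
            result = "L2 hit"
        else:
            result = "miss"
        self._fill_l1(addr)
        return result

    def _fill_l1(self, addr):
        if len(self.l1) >= self.l1_lines:
            victim, _ = self.l1.popitem(last=False)  # evict the LRU line from L1
            if len(self.l2) >= self.l2_lines:
                self.l2.popitem(last=False)          # make room in L2
            self.l2[victim] = True                   # L2 stores the L1 victim
        self.l1[addr] = True

demo = ExclusiveHierarchy(l1_lines=2, l2_lines=4)
first_pass = [demo.access(addr) for addr in range(6)]
print(first_pass)
print("lines held:", len(demo.l1) + len(demo.l2))
print("overlap:", set(demo.l1) & set(demo.l2))
```

After six distinct lines have been touched, the hierarchy holds all six with no duplicates, so its useful capacity is the sum of both levels. An inclusive design of the same size would hold at most four distinct lines, because every L1 line would also occupy an L2 slot.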
The Thunderbird was AMD's most successful product since the Am386DX-40 ten years earlier. Mainboard designs had improved considerably by this time, and the initial trickle of Athlon mainboard makers had swollen to include every major manufacturer. AMD's new fab in Dresden came online, allowing further production increases, and the process technology was improved by a switch to copper interconnects. In October 2000 the Athlon "C" was introduced, raising the mainboard front side bus speed to 266 MT/s (133 MHz double-pumped) and providing roughly 10% extra performance per clock over the "B" model Thunderbird.
Athlon XP/MP
In performance terms, the Thunderbird had easily eclipsed the rival Pentium III, and the early Pentium 4s were a long way off the pace but gradually clawed their way closer. The 1.7 GHz P4 (April 2001) served notice that the Thunderbird could not count on retaining performance leadership forever, and thermal and power-consumption issues with the Thunderbird design meant that it was not practical to take it past 1400 MHz (and even at that speed it ran excessively hot).
Palomino
AMD released the third major Athlon version on October 9, 2001, code-named "Palomino". This version, the first to include the full SSE instruction set from the Intel Pentium III as well as AMD's 3DNow! Professional, was introduced at speeds between 1333 and 1533 MHz, with ratings from 1500+ to 1800+. The major changes were optimizations to the core design to increase efficiency by roughly 10% over a Thunderbird at the same clock speed, through enhancement of the TLB architecture and the addition of a hardware data prefetch mechanism to better take advantage of available memory bandwidth.[10] The new Athlon core was also more frugal with its electrical demands, consuming approximately 20% less power than its predecessor, and as such, reducing heat output comparatively as well.[11] The core was also modified to improve clock speed scalability. At launch, Athlon XP allowed AMD to clearly take the x86 performance lead with the 1800+, and enhance the lead with the release of the 1600 MHz 1900+ less than a month later.[12]
The "Palomino" was first released as a mobile version, called the Mobile Athlon 4 (codenamed "Corvette"), after the fact that it was AMD's fourth core to be called Athlon (after the original K7, the 0.18 μm K75, and the Thunderbird), but many people noted that the name was most likely a jab at the then-brand-new Intel Pentium 4. The desktop Athlon XP followed a few months later, in October.[13] The "XP" name is interpreted to mean eXtreme Performance and also as an allusion to the recent release of Windows XP.[14]
"Palomino" core processors are most distinguishable from "Thoroughbred" and "Barton" model Athlon XPs in that the CPU die is square, whereas earlier (and later) cores are rectangular. The normally top-mounted capacitors were relocated to the underside as well, contrasting "Thunderbird". The "Palomino" Athlon XP CPUs also had their stepping information engraved on the core, similar to the "Thunderbird", as opposed to the label to the side of the core as in the "Thoroughbred" picture.
The Athlon XP was marketed using a PR rating system, which compared its performance to an Athlon Thunderbird. Because the Athlon XP had much higher IPC (instructions per clock) than the Pentium 4 (and about 10% higher than a Thunderbird), it was more efficient; it delivered the same level of performance at a significantly lower clock-speed. Also, unlike the earlier Athlons, this chip was available in a form that officially supported dual processing, known as Athlon MP.[15]
Thoroughbred (T-Bred)
The fourth-generation Athlon, the Thoroughbred, was released on June 10, 2002 at 1.8 GHz, or 2200+ on the PR rating system. There were two versions of this core, commonly called A and B. The A version, introduced at 1800 MHz, had some heat issues, so it was only sold in versions from 1333 to 1800 MHz, replacing the Palomino. The B version, which had an additional metal layer, was released at higher clock speeds, up to the 2800+ model, which ran at 2250 MHz. It later replaced the entire Athlon XP line until the launch of the Barton core. Two new models, the 2400+ and 2600+, were announced on August 21, 2002. The 2400+ ran at 2000 MHz, and the 2600+ ran at 2083 or 2133 MHz, depending on the front side bus speed (2083 MHz for the 333 MT/s FSB, 2133 MHz for the 266 MT/s FSB). The 2700+ and 2800+ Thoroughbred parts were also announced, but were only available in very small quantities.
The "Thoroughbred" core was built on a 0.13 micrometre (130 nm) process, down from the 0.18 micrometre (180 nm) process of its "Palomino" predecessor. Other than the new process, the Thoroughbred design did not differ from the "Palomino". The "Thoroughbred A" revision initially had substantial heat issues, which were solved in the "B" revision. Despite being on the 130 nm process, revision A offered no real improvement over the old Palomino; overclockers still preferred the Palomino, which could hit higher clock speeds even on the 180 nm process. The Thoroughbred "B" fixed this problem by adding an extra metal layer to the manufacturing process, enabling clock speeds that made the line competitive again. The 2600+ was released first. Later, AMD raised the FSB from 266 MT/s (133 MHz double-pumped) to 333 MT/s (166 MHz double-pumped), which allowed the company to raise the performance rating numbers of the CPUs without raising the clock speed much. However, AMD failed to manufacture and ship acceptable quantities of the highest-end 2700+ and 2800+ Thoroughbreds, and as a result they were hard to obtain.
Barton and Thorton
Fifth-generation Athlon Barton-core processors released in early 2003 featured PR ratings of 2500+, 2600+, 2800+, 3000+, and 3200+. While not faster than Thoroughbred-core processors in clock speed, they earned their higher PR-rating-per-clock by featuring an additional 256 KiB of full-speed on-chip level 2 cache, for a total of 512 KiB, and a faster FSB. The Thorton core was a variant of the Barton with half of the L2 cache disabled and thus functionally identical to the Thoroughbred B core. The disabled L2 cache on some Thortons was partially defective, but on others it could be re-enabled through bridge modifications.[16]
As with most Athlons, the Barton core was popular with overclockers. For example, the 2500+ was rated to run on a 333 MT/s (166 MHz double-pumped) bus. By increasing this to 400 MT/s (200 MHz double-pumped), it became equivalent to the much more expensive 3200+. Some suspect this was the reason for the relatively short retail lifespan of the lower-rated Bartons, which were the first to be replaced by the cut-down Semprons.
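The arithmetic behind this overclock can be sketched as follows. The core clock is the bus clock times a multiplier, and a double-pumped bus transfers data twice per bus clock. The multiplier is not given in the text; it is inferred here from the 1833 MHz clock of the 2500+ listed in the Models section, so treat it as an assumption.

```python
def data_rate_mt_s(bus_clock_mhz):
    """A double-pumped bus performs two transfers per bus clock."""
    return 2 * bus_clock_mhz

def core_clock_mhz(bus_clock_mhz, multiplier):
    """Core clock is the bus clock multiplied by the CPU multiplier."""
    return bus_clock_mhz * multiplier

STOCK_BUS_MHZ = 500 / 3        # nominal "166 MHz" bus, about 166.67 MHz
OVERCLOCKED_BUS_MHZ = 200

# Assumption: multiplier inferred from the 2500+ running at 1833 MHz.
multiplier = round(1833 / STOCK_BUS_MHZ)

print(f"inferred multiplier: {multiplier}")
print(f"stock: {core_clock_mhz(STOCK_BUS_MHZ, multiplier):.0f} MHz "
      f"on a {data_rate_mt_s(STOCK_BUS_MHZ):.0f} MT/s bus")
print(f"overclocked: {core_clock_mhz(OVERCLOCKED_BUS_MHZ, multiplier):.0f} MHz "
      f"on a {data_rate_mt_s(OVERCLOCKED_BUS_MHZ):.0f} MT/s bus")
```

With the inferred multiplier, raising only the bus clock yields the 2200 MHz and 400 MT/s figures that the Models section lists for the 3200+.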
Some AMD proponents claim that these new parts regained performance leadership for the Athlon, but this remained in doubt. Much controversy surrounds the benchmarks which are used to measure performance leadership. In particular, industry insiders point out that some tests have been deliberately skewed in Intel's favour—notably the BAPCo tests, which were written by Intel's own engineers. Other insiders accused AMD's model numbers of no longer being internally consistent, and also accused them of basing their processor ratings on applications which were no longer widely used.
Most observers no longer considered the Athlon the fastest x86 processor in the world, believing that Intel's Pentium 4 overtook the Athlon XP early in 2002 and held its lead until February 2003, with the 3.06 GHz P4 benchmarking slightly faster than the Athlon 2700+. At the time, the question was moot: AMD had yet to deliver the 2700+ and 2800+ in commercial quantities, and they did not begin to ship in volume until well into the first quarter of 2003. However, as the initially troublesome transition to the 0.13 micrometre process neared completion, AMD began producing large numbers of 0.13 micrometre parts in the 1700+ to 2400+ speed grades (usually a sign that faster grades are not far away). In mid-February 2003, AMD announced the Athlon XP 3000+, to ship in volume in early March 2003. Pending an Intel reply, the 3000+ had, according to AMD, reclaimed the "fastest x86 in the world" title for the Athlon. However, reviewers' opinions on this were split, and most still believed the top Intel part to be faster.
A month later, Intel introduced a new series of Pentium 4s with support for Hyper-Threading and a faster 800 MT/s (200 MHz quad-pumped) bus, up from the previous 533 MT/s (133 MHz quad-pumped); the new bus was indicated by a "C" appended to the model number. In response, AMD released the Athlon XP 3000+ and 3200+ featuring a 400 MT/s bus. The bus speed increase did not offer a large performance gain, however. The 3200+ failed to convincingly outperform the new 3.0 GHz Pentium 4 "C", much less the subsequent 3.2 GHz version. Many reviewers concluded that the C-series Pentium 4 was a bridge too far for the Athlon XP, and that Intel had gained a decisive performance lead which the Athlon XP could not overcome. AMD did not try to do so; its focus was now on the soon-to-be-released K8, the Athlon 64.
Mobile Athlon XP
Mobile Athlon XPs (Athlon XP-M) are identical to normal Athlon XPs, apart from running at lower voltages, often lower bus speeds, and not being multiplier-locked. The lower Vcore rating caused the CPU to have lower power consumption (ideal for battery-powered laptops) and lower heat production. Athlon XP-M CPUs also have a higher-rated heat tolerance, a requirement of the tight conditions within a notebook PC.
The Athlon XP-M replaced the older Mobile Athlon 4. The Mobile Athlon 4 used the older Palomino core, while the Athlon XP-M used the newer Thoroughbred and Barton cores. Some specialized low-power Athlon XP-Ms utilize the microPGA socket 563 rather than the standard Socket A.
The CPUs, like their mobile K6+ predecessors, were also capable of dynamic clock adjustment for power optimization. When the system is idle, the CPU clocks itself down through a lower bus multiplier and also reduces its voltage. Then, when a program demands more computational resources, the CPU very quickly (there is some latency) returns to intermediate or maximum speed to meet the demand. This technology was marketed as "PowerNow!". It was similar to Intel's SpeedStep power saving technique. The feature was controlled by the CPU, motherboard BIOS, and operating system. AMD later renamed the technology to Cool'n'Quiet, on their K8-based CPUs (Athlon 64, etc), and re-imagined it for use on desktop PCs as well.
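Why lowering both the multiplier and the voltage saves power can be sketched with the usual first-order approximation for CMOS dynamic power, which scales with voltage squared times frequency. That approximation is a general rule of thumb rather than something taken from this article, and the operating points below are hypothetical, not actual PowerNow! states.

```python
def relative_dynamic_power(voltage, clock_mhz, ref_voltage, ref_clock_mhz):
    """Dynamic power relative to a reference point, using P ~ V^2 * f."""
    return (voltage / ref_voltage) ** 2 * (clock_mhz / ref_clock_mhz)

# Hypothetical operating points as (label, core voltage in V, core clock in MHz).
FULL_SPEED = ("full speed", 1.45, 1800)
OPERATING_POINTS = [
    FULL_SPEED,
    ("intermediate", 1.30, 1200),
    ("idle", 1.10, 600),
]

for label, volts, mhz in OPERATING_POINTS:
    share = relative_dynamic_power(volts, mhz, FULL_SPEED[1], FULL_SPEED[2])
    print(f"{label}: {volts:.2f} V at {mhz} MHz -> {share:.0%} of full-speed dynamic power")
```

Because voltage enters squared, dropping the voltage along with the clock saves considerably more than dropping the clock alone. The model ignores leakage and other static power.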
Athlon XP-Ms were popular with desktop overclockers, as well as underclockers. The lower voltage requirement and higher heat rating resulted in CPUs that were basically "cherry picked" from the manufacturing line. Being the best of the cores off the line, the CPUs typically were more reliably overclocked than their desktop-headed counterparts. Also, the fact that they weren't locked to a single multiplier was a significant simplification for the overclocking process. Some Barton core Athlon XP-Ms have been successfully overclocked to as high as 3.1 GHz.
As stated, the chips were also liked for their underclocking ability. Underclocking here refers to finding the lowest Vcore at which a CPU remains stable for a given clock speed. The Athlon XP-M CPUs were capable of running at lower voltages per clock rate than their desktop siblings. As such, the chips were used in home theater PC systems for their high performance and low heat output at low Vcore settings.
Models
Athlon
Athlon Classic
- K7 "Argon" (250 nm)
- K75 "Pluto/Orion" (180 nm)
- L1-Cache: 64 + 64 KiB (Data + Instructions)
- L2-Cache: 512 KiB, external chips on CPU module with 50, 40 or 33% of CPU-speed
- MMX, 3DNow!
- Slot A (EV6)
- Front side bus: 200 MT/s (100 MHz double-pumped)
- VCore: 1.6 V (K7), 1.6 - 1.8 V (K75)
- First release: June 23, 1999 (K7), November 29, 1999 (K75)
- Clockrate: 500 - 700 MHz (K7), 550 - 1000 MHz (K75)
Thunderbird (180 nm)
- L1-Cache: 64 + 64 KiB (Data + Instructions)
- L2-Cache: 256 KiB, fullspeed
- MMX, 3DNow!
- Slot A & Socket A (EV6)
- Front side bus: 200 MT/s (Slot-A, B-models), 266 MT/s (C-models) (100, 133 MHz double-pumped)
- VCore: 1.7 V - 1.75 V
- First release: June 5, 2000
- Clockrate:
Athlon XP
Palomino (180 nm)
- L1-Cache: 64 + 64 KiB (Data + Instructions)
- L2-Cache: 256 KiB, fullspeed
- MMX, 3DNow!, SSE
- Socket A (EV6)
- Front side bus: 266 MT/s (133 MHz double-pumped)
- VCore: 1.75 V
- First release: October 9, 2001
- Clockrate: 1333 - 1733 MHz (1500+ to 2100+)
Thoroughbred A/B (130 nm)
- L1-Cache: 64 + 64 KiB (Data + Instructions)
- L2-Cache: 256 KiB, fullspeed
- MMX, 3DNow!, SSE
- Socket A (EV6)
- Front side bus: 266/333 MT/s (133/166 MHz double-pumped)
- VCore: 1.5 V - 1.65 V
- First release: June 10, 2002 (A), August 21, 2002 (B)
- Clockrate:
  - T-Bred "A": 1400 - 1800 MHz (1600+ to 2200+)
  - T-Bred "B": 1400 - 2250 MHz (1600+ to 2800+)
    - 266 MT/s FSB: 1400 - 2133 MHz (1600+ to 2600+)
    - 333 MT/s FSB: 2083 - 2250 MHz (2600+ to 2800+)
Thorton (130 nm)
- L1-Cache: 64 + 64 KiB (Data + Instructions)
- L2-Cache: 256 KiB, fullspeed
- MMX, 3DNow!, SSE
- Socket A (EV6)
- Front side bus: 266/333/400 MT/s (133/166/200 MHz double-pumped)
- VCore: 1.6 V - 1.65 V
- First release: September 2003
- Clockrate: 1667 - 2200 MHz (2000+ to 3100+)
Barton (130 nm)
- L1-Cache: 64 + 64 KiB (Data + Instructions)
- L2-Cache: 512 KiB, fullspeed
- MMX, 3DNow!, SSE
- Socket A (EV6)
- Front side bus: 333/400 MT/s (166/200 MHz double-pumped)
- VCore: 1.65 V
- First release: February 10, 2003
- Clockrate: 1833 - 2200 MHz (2500+ to 3200+)
  - 333 MT/s FSB: 1833 - 2166 MHz (2500+ to 3000+)
  - 400 MT/s FSB: 2100, 2200 MHz (3000+, 3200+)
7th generation x86 competitors
Supercomputers
The fastest supercomputers based on AthlonMP:
- Rutgers University, Department of Physics & Astronomy. Machine: NOW Cluster - AMD Athlon. CPU: 512 AthlonMP (1.65 GHz). Rmax: 794 Gigaflops.
See also
- List of AMD Athlon microprocessors
- List of AMD Athlon XP microprocessors
- List of AMD Athlon 64 microprocessors
External links
- cpu-collection.de AMD Athlon processor images and descriptions
- amdboard.com AMD Athlon/Duron/Sempron CPU identification and OPN breakdown
- AMD's Technical Specifications for 7th generation CPUs (.pdf)
- Easy identification with Interactive AMD product ID
- AMD Athlon technical specifications
References
- [1] Hsieh, Paul. 7th Generation CPU Comparisons.
- [2] De Gelas, Johan. The Secrets of High Performance CPUs, Part 1, Ace's Hardware, September 29, 1999.
- [3] Pabst, Thomas. Performance-Showdown between Athlon and Pentium III, Tom's Hardware, August 23, 1999.
- [4] Womack, Tom. Extensions to the x86 architecture.
- [5] De Gelas, Johan. Clash of Silicon, The Athlon 650, Ace's Hardware, September 29, 1999.
- [6] Lal Shimpi, Anand. AMD Athlon 1GHz, 950MHz, 900MHz, Anandtech, March 6, 2000, p. 2.
- [7] K7 microarchitecture information, Sandpile.org, accessed September 26, 2006.
- [8] Noonan, Jim and Rolfe, James. Athlon Gold-Finger Devices, Overclockers.com.au, accessed August 24, 2006.
- [9] Stokes, John. Inside AMD's Hammer: the 64-bit architecture behind the Opteron and Athlon 64, Ars Technica, February 1, 2005, p. 9.
- [10] Lal Shimpi, Anand. AMD Athlon 4 - The Palomino is Here, Anandtech, May 14, 2001, pp. 4-5.
- [11] Wasson, Scott. AMD's Athlon XP 1800+ processor: 1533 > 1800, The Tech Report, October 9, 2001.
- [12] Wasson, Scott. AMD's Athlon XP 1900+ processor: Pouring it on, The Tech Report, November 5, 2001.
- [13] De Gelas, Johan. Upgrading to eXtreme Performance, Ace's Hardware, October 16, 2001.
- [14] Advanced Micro Devices, Inc. Introducing the AMD Athlon XP Processor.
- [15] Lal Shimpi, Anand. AMD 760MP & Athlon MP - Dual Processor Heaven, Anandtech, June 5, 2001.
- [16] Shilov, Anton. How to Enable Additional 256KB of L2 Cache on AMD Athlon XP Processors Model 10. Thorton Modified!, X-bit labs, September 30, 2003.
This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.