Talk:Register renaming
Initially I had thought that I could describe just the mechanics of renaming without pipeline descriptions. Didn't happen; those descriptions got large. At some point I will move those two pipeline descriptions into their own pages, like the Classic RISC Pipeline page. Before I do that I'd like to find better names for the renaming styles.
Better diagrams would be nice as well. I haven't figured out how to make them both compact and readable.
Finally, the introduction is a tough read. It's easy to write something short but it usually ends up either cryptic or wrong. The current version is cryptic.
You removed the discussion of registers-for-speed and noted it (apparently) as a "factual error". I have to disagree. The growth in register count from ~3 in the 1970s to the hundreds in modern CPUs is specifically to take advantage of higher speed access, allowing one-cycle retiring. Perhaps I am misreading the check-in note, but if not I think it's reasonable to re-add the portion on performance.
Maury 23:55, 24 Feb 2004 (UTC)
Let me expand on my check-in note:
> Registers are small pieces of memory that can be accessed with almost zero cost
I didn't understand precisely what "cost" meant.
> Registers are an important part of the CPU design,
I didn't think this added anything; you were working up to...
> but take up considerable room on the CPU design
No. Specifically, things like architectural register files in fast x86 designs are absolutely tiny. Issue/decode logic dominates.
> and are therefore "expensive".
No. Large numbers of registers in the ISA are "expensive" because you have to encode them, which means the instructions get large and awkward and chew up lots of instruction cache.
> One of the major goals in CPU design is to get the best use of the registers for the lowest cost.
I've been part of two CPU designs, and at no time did anyone mention this as a goal. Never heard about it from any other CPU design teams either.
That said, adding architected registers to ISAs has been a point of several modern ISA designs. For high speed out-of-order designs, this is to give the compiler more names to expose more parallelism to the hardware. For embedded designs, it's gone both ways -- many ISAs went with 32 registers because that was the classic RISC style, and it fits nicely in a 32-bit instruction. Others (e.g. ARM Thumb) went with smaller numbers of registers (8 IIRC), trading exposed parallelism for packing density. I have not done a stack architecture, but I hear they are dense and give a fair number of registers too, at the cost of decoder complexity.
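To put rough numbers on the encoding cost, here is a small sketch of the arithmetic. It assumes a hypothetical three-operand instruction format (two sources, one destination); the function name `register_field_bits` is made up for illustration, and real ISAs, Thumb included, often use two-operand or mixed formats, so treat the figures as ballpark only.

```python
def register_field_bits(num_registers: int, operands: int = 3) -> int:
    """Bits spent naming registers in one instruction.

    Assumes every operand field can name any of `num_registers`
    architected registers (a simplification, not a real ISA encoding).
    """
    # Bits needed to distinguish num_registers names: ceil(log2(n)).
    bits_per_operand = (num_registers - 1).bit_length()
    return bits_per_operand * operands


for regs in (8, 32, 128):
    print(f"{regs:3d} registers -> {register_field_bits(regs):2d} bits of register fields")
# prints:
#   8 registers ->  9 bits of register fields
#  32 registers -> 15 bits of register fields
# 128 registers -> 21 bits of register fields
```

Under that assumption, 32 registers use 15 of the 32 bits in a classic RISC instruction, 8 registers use 9 bits and still leave room for an opcode in a 16-bit instruction, and 128 registers would leave only 11 bits of a 32-bit instruction for everything else.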
> functional units, a single one of which would represent a complete CPU on it's own
Also false. Functional units refer to ALUs, FP multipliers or adders, or cache access pipes. The term is actually a bit vague, since it sometimes refers to a bunch of possible computations that share a writeback port to a central register file.
You had something in there about 20-way superscalar machines. You may have intended to refer to vector operations, but I didn't get that from context.
"Fast" these days comes down to getting values from one functional unit to another with nothing more than wire and mux delays. This can be done with values that go through memory, but the hardware to figure out the aliasing and get this right is much larger and more power hungry than the hardware to figure out forwarding from register names.
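A minimal software sketch of why forwarding by register name is cheap, not a model of any particular design. The names `InFlightResult` and `forward_operand` are hypothetical. The point it tries to show: a register name is a small tag known at decode, so the match is a plain equality compare, whereas a value passed through memory would first need both addresses computed and checked for aliasing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InFlightResult:
    dest_reg: int  # register name the producing instruction will write
    value: int     # result currently sitting on the bypass wires


def forward_operand(src_reg: int,
                    in_flight: List[InFlightResult],
                    regfile: List[int]) -> int:
    """Pick a source operand from the bypass network or the register file.

    `in_flight` is assumed to be ordered youngest producer first, so the
    first matching tag is the most recent write to that register name.
    """
    for result in in_flight:
        # The whole "aliasing check" is one compare on a few tag bits.
        if result.dest_reg == src_reg:
            return result.value  # mux selects the forwarded value
    return regfile[src_reg]      # no match: read the register file


regfile = [0] * 8
regfile[3] = 10
in_flight = [InFlightResult(dest_reg=3, value=99),   # youngest
             InFlightResult(dest_reg=3, value=50)]   # older
print(forward_operand(3, in_flight, regfile))  # forwarded from the youngest producer
print(forward_operand(2, in_flight, regfile))  # falls back to the register file
```

In hardware the loop would be a set of parallel comparators feeding a mux; the sequential loop here is only a stand-in for that priority selection.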
I'm not sure how to write this accurately and concisely, but saying that putting in lots of registers is good because they're fast is just not right.
I do appreciate your interest, and I readily acknowledge that my writing is not the best and can be improved.
Sources
Wen-mei Hwu, Yale Patt, Michael Shebanow, about 1985, describe R10k style renaming and checkpoint repair.
Jim Smith and Andrew Pleszkun, about 1986, describe ROB, history file, future file.
Add these paper references to this page.
Iain McClatchie 20:38, 2 May 2005 (UTC)