Binary prefix
From Wikipedia, the free encyclopedia
The SI defines a system of prefixes (nano-, micro-, milli-, kilo-, mega-, giga- etc.) and symbols (n, µ, m, k, M, G, etc.) that denote powers of ten. When discussing things that naturally come in powers of two (mostly amounts of computer memory) one also needs names and symbols for those, but such names were first introduced rather late (1999, see below), and it was and still is common to use the decimal prefixes to denote nearby powers of two. This led to much confusion about the meaning of the SI prefixes combined with "byte" (kilobyte, megabyte, gigabyte, etc.).
History
It is not clear when the SI prefix kilo- and its symbol k, which denote the number 1,000, came to mean 1,024 when sizing computer main storage. 2^10 = 1,024 bytes is accurately designated as 1.0 kilobytes to two significant digits, so for small values the two meanings agree after rounding. Absent other information, it is not clear what an author meant by k (or K) until 32k, which is 32,000 according to the SI usage or 32,768 according to the binary usage of k; the latter, of course, rounds to 33k. Based upon current evidence (as of March 2007), it appears the binary meaning may have arisen during the 1970s or later. The binary meaning was also extended to higher SI prefixes, such as mega (M) and giga (G), leading to some consumer confusion.
In January 1999, the International Electrotechnical Commission introduced the prefixes kibi-, mebi-, gibi-, etc., and the symbols Ki, Mi, Gi, etc. to specify binary multiples of a quantity and eliminate this ambiguity.[1]
They have since been officially adopted by many other organizations and researchers, and incorporated in some computer software, but are not yet ubiquitous. (See Adoption, below.) Many continue to use the SI prefixes in a binary sense, despite the lack of support from official bodies. As a result, there is no unambiguous notation for decimal multiples of bytes.
Binary prefixes using SI symbols
Name | Symbol | Value | Base 16 | Base 10 |
---|---|---|---|---|
kilo | k/K | 2^10 = 1,024 | = 16^2.5 | > 10^3 |
mega | M | 2^20 = 1,048,576 | = 16^5 | > 10^6 |
giga | G | 2^30 = 1,073,741,824 | = 16^7.5 | > 10^9 |
tera | T | 2^40 = 1,099,511,627,776 | = 16^10 | > 10^12 |
peta | P | 2^50 = 1,125,899,906,842,624 | = 16^12.5 | > 10^15 |
exa | E | 2^60 = 1,152,921,504,606,846,976 | = 16^15 | > 10^18 |
zetta | Z | 2^70 = 1,180,591,620,717,411,303,424 | = 16^17.5 | > 10^21 |
yotta | Y | 2^80 = 1,208,925,819,614,629,174,706,176 | = 16^20 | > 10^24 |
The one-letter symbols are identical to SI prefixes, except for "K", which is used interchangeably with "k" (in SI, the upper-case or capital "K" stands for kelvin, and only the lower-case "k" represents 1,000).
These prefixes are in common use in contexts where accuracy is not important, such as file and memory sizes, but conflict with SI definitions. The names and values of the SI prefixes were defined in the 1960 SI standard, with powers-of-1000 values. As of 2005, standard dictionaries do not recognize the binary meanings for these prefixes.
BIPM (which maintains SI) expressly prohibits the binary prefix usage, and recommends the use of the IEC prefixes as an alternative (computing units are not included in SI).[2]
Some have suggested that "k" be used for 1,000, and "K" for 1,024, but this cannot be extended to the higher order prefixes and has never been widely recognised.
Although the prefixes denoting fractions of a bit or byte might theoretically find application in areas such as cryptography, data compression, and data transfer rates, they are not used in practice.
Informally, the prefixes are often used on their own. Thus one might hear about "a 40K file" (40 binary kilobytes) or "a 2M Internet connection" (2 decimal megabits per second). What units are being used, and whether the multipliers are decimal or binary, depends on context and cannot be determined by the units alone.
IEC standard prefixes
In 1999, the International Electrotechnical Commission (IEC) published Amendment 2 to "IEC 60027-2: Letter symbols to be used in electrical technology – Part 2: Telecommunications and electronics". This standard, which was approved in 1998, introduced the prefixes kibi-, mebi-, gibi-, tebi-, pebi-, and exbi- to be used in specifying binary multiples of a quantity. The names come from the first two letters of the original SI prefixes followed by "bi", which is short for "binary". It also clarifies that, from the point of view of the IEC, the SI prefixes only have their base-10 meaning and never have a base-2 meaning.
This amendment was included in the next edition of the standard, "IEC 60027-2 (2000-11) Ed. 2.0".
The second edition defined them only up to exbi-, but in 2005, the third edition of the standard added prefixes zebi- and yobi-, thus matching all standard SI prefixes with their binary counterparts.
Name | Symbol | Base 2 | Base 16 (power) | Base 16 (hex) | Base 10 (value) | Base 10 (bound) |
---|---|---|---|---|---|---|
kibi | Ki | 2^10 | 16^2.5 | 0x400 | = 1,024 | > 10^3 |
mebi | Mi | 2^20 | 16^5 | 0x10 0000 | = 1,048,576 | > 10^6 |
gibi | Gi | 2^30 | 16^7.5 | 0x4000 0000 | = 1,073,741,824 | > 10^9 |
tebi | Ti | 2^40 | 16^10 | 0x100 0000 0000 | = 1,099,511,627,776 | > 10^12 |
pebi | Pi | 2^50 | 16^12.5 | 0x4 0000 0000 0000 | = 1,125,899,906,842,624 | > 10^15 |
exbi | Ei | 2^60 | 16^15 | 0x1000 0000 0000 0000 | = 1,152,921,504,606,846,976 | > 10^18 |
zebi | Zi | 2^70 | 16^17.5 | 0x40 0000 0000 0000 0000 | = 1,180,591,620,717,411,303,424 | > 10^21 |
yobi | Yi | 2^80 | 16^20 | 0x1 0000 0000 0000 0000 0000 | = 1,208,925,819,614,629,174,706,176 | > 10^24 |
Example: 300 GB ≅ 279.4 GiB (= 0x117.6592E GiB = 0x45D96.4B8 MiB = 0x1176592E KiB = 0x45D964B800 bytes).
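The conversion in this example can be reproduced with a few lines of Python. This is only an illustrative sketch: the names `IEC_SYMBOLS`, `iec_multiplier`, and `to_iec` are hypothetical helpers, not part of any standard library, and the multipliers are simply the powers of two from the table above.

```python
# Illustrative sketch; all names here are hypothetical helpers.
IEC_SYMBOLS = ["Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"]


def iec_multiplier(symbol: str) -> int:
    """Return the exact value of an IEC binary prefix, e.g. "Gi" -> 2**30."""
    exponent = 10 * (IEC_SYMBOLS.index(symbol) + 1)
    return 2 ** exponent


def to_iec(n_bytes: int, symbol: str) -> float:
    """Express a byte count in units of the given IEC prefix."""
    return n_bytes / iec_multiplier(symbol)


size = 300 * 10**9  # 300 GB in the decimal (SI) sense

print(f"{to_iec(size, 'Gi'):.3f} GiB")
print(f"{to_iec(size, 'Mi'):.3f} MiB")
print(hex(size))  # 0x45d964b800
```

Keeping the multipliers as exact integers avoids rounding until the final division, which matters for the larger prefixes.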
Approximate ratios between binary and decimal prefixes
As the order of magnitude increases, the percentage difference between the binary and decimal values of a prefix increases, from 2.4% (with the kilo prefix) to over 20% (with the yotta prefix).
Name | Bin ÷ Dec | Dec ÷ Bin | Example | Percentage difference |
---|---|---|---|---|
kilobyte : kibibyte | 1.024 | 0.977 | 100 kB ≅ 97.7 KiB | +2.4% or −2.3% |
megabyte : mebibyte | 1.049 | 0.954 | 100 MB ≅ 95.4 MiB | +4.9% or −4.6% |
gigabyte : gibibyte | 1.074 | 0.931 | 100 GB ≅ 93.1 GiB | +7.4% or −6.9% |
terabyte : tebibyte | 1.100 | 0.909 | 100 TB ≅ 90.9 TiB | +10% or −9.1% |
petabyte : pebibyte | 1.126 | 0.888 | 100 PB ≅ 88.8 PiB | +12.6% or −11.2% |
exabyte : exbibyte | 1.153 | 0.867 | 100 EB ≅ 86.7 EiB | +15.3% or −13.3% |
zettabyte : zebibyte | 1.181 | 0.847 | 100 ZB ≅ 84.7 ZiB | +18.1% or −15.3% |
yottabyte : yobibyte | 1.209 | 0.827 | 100 YB ≅ 82.7 YiB | +20.9% or −17.3% |
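The ratios in this table follow directly from (1024/1000)^n for the n-th prefix. As a minimal sketch, assuming the hypothetical names `SI_NAMES` and `prefix_ratio`, the following Python recomputes each row from exact fractions and rounds only when printing:

```python
from fractions import Fraction

SI_NAMES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def prefix_ratio(n: int) -> Fraction:
    """Exact ratio of the n-th binary prefix to the n-th decimal prefix."""
    return Fraction(1024**n, 1000**n)


for n, name in enumerate(SI_NAMES, start=1):
    bin_over_dec = float(prefix_ratio(n))
    dec_over_bin = float(1 / prefix_ratio(n))
    # Columns: name, Bin/Dec, Dec/Bin, and the two percentage differences.
    print(
        f"{name:<6} {bin_over_dec:.3f} {dec_over_bin:.3f} "
        f"{100 * (bin_over_dec - 1):+.1f}% {100 * (dec_over_bin - 1):+.1f}%"
    )
```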
Adoption
As of 2007 this naming convention has not gained widespread use, but its use is growing. It is strongly supported by many standardization bodies and technical organizations, such as IEEE, CIPM, NIST, and SAE.[3][2][4][5] In particular, on March 19, 2005, the IEEE standard IEEE 1541-2002 (Prefixes for Binary Multiples) was elevated to a full-use standard by the IEEE Standards Association after a two-year trial period.[6]
The binary prefixes which are defined in IEC 60027-2 (Letter symbols to be used in electrical technology – Part 2: Telecommunications and electronics) have been adopted by the European Committee for Electrotechnical Standardization (CENELEC) as the harmonization document HD 60027-2:2003-03.[7] This document will be adopted as a European standard.[8]
Consumer confusion
The practice of using binary-based prefixes for computer memory arose as early as 1964;[9] however, there was little or no consumer confusion, because the consumers were sophisticated and the computer manufacturers specified (as opposed to advertised) their products with decimal digits of sufficient places. For example, in 1968 IBM stated that System/360 "Model 91s can accommodate up to 6,291,496 bytes of main storage."[10] The confusion appears to relate to the advent of graphical user interfaces, where there was not enough space to provide sufficient digits to fully state the capacity.[citation needed] In such GUIs, storage capacity was reported in a mixed system of decimal digits and binary prefixes using the SI notation. Apparently, some computer programmers were unaware that disk drive manufacturers used the SI notation when specifying or advertising the capacity of their hard disk drives. This mixed presentation appears as early as Microsoft Windows 98, which, for example, would report the space available on a 41,959,424-byte hard disk drive as 40 MB. In Microsoft Windows XP, a 30 gigabyte drive has its capacity reported both as 30,064,771,072 bytes and as 28 GB. The confusion has sometimes led to litigation.
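The mixed presentation described above can be imitated in a few lines. The sketch below is purely illustrative and is not the code of any actual operating system; the function names `mixed_report` and `decimal_report` are hypothetical. It contrasts scaling by powers of 1024 under SI-style labels with scaling by powers of 1000.

```python
def mixed_report(n_bytes: int) -> str:
    """Scale by powers of 1024 but label with SI-style symbols (the mixed convention)."""
    symbols = ["bytes", "KB", "MB", "GB", "TB"]
    value = float(n_bytes)
    index = 0
    while value >= 1024 and index < len(symbols) - 1:
        value /= 1024
        index += 1
    return f"{value:.0f} {symbols[index]}"


def decimal_report(n_bytes: int) -> str:
    """Scale by powers of 1000, matching the SI meaning of the same symbols."""
    symbols = ["bytes", "kB", "MB", "GB", "TB"]
    value = float(n_bytes)
    index = 0
    while value >= 1000 and index < len(symbols) - 1:
        value /= 1000
        index += 1
    return f"{value:.1f} {symbols[index]}"


print(mixed_report(30_064_771_072))    # 28 GB
print(decimal_report(30_064_771_072))  # 30.1 GB
print(mixed_report(41_959_424))        # 40 MB
```

The same byte count thus yields two different figures under what looks like the same unit symbol.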
Usage notes
The phrase "decimal unit" will be used to denote "SI designation understood in its standard, decimal, power-of-1000 sense" and "binary unit" will mean "SI designation understood in its binary, power-of-1024 sense." B will be used as the symbol for byte as per computer-industry standard (IEEE 1541 and IEC 60027; B is also the symbol for bel, a common non-SI unit used for ratio measurement).
Certain units are always understood as decimal even in computing contexts, for example hertz (Hz), which is used to measure clock rates of electronic components, and bit/s, which is used to measure bit rate. So a 1 GHz processor performs 1,000,000,000 clock ticks per second, a 128 kbit/s MP3 stream consumes 128,000 bits (16 kB, 15.625 KiB) per second, and a 1 Mbit/s Internet connection can transfer 1,000,000 bits (125 kB, approx. 122 KiB) per second, assuming an 8-bit byte and no overhead.[11]
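The bit-rate figures above can be checked with a short calculation. This is a sketch under the same assumptions as the text (8-bit bytes, no protocol overhead); the name `byte_rate` is a hypothetical helper.

```python
BITS_PER_BYTE = 8  # assumption stated in the text: 8-bit bytes, no overhead


def byte_rate(bits_per_second: int) -> float:
    """Convert a bit rate to bytes per second."""
    return bits_per_second / BITS_PER_BYTE


mp3 = byte_rate(128 * 1000)    # 128 kbit/s, decimal kilo
link = byte_rate(1 * 1000**2)  # 1 Mbit/s, decimal mega

print(mp3 / 1000, mp3 / 1024)    # 16.0 15.625        (kB/s, KiB/s)
print(link / 1000, link / 1024)  # 125.0 122.0703125  (kB/s, KiB/s)
```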
Pronunciation
It is suggested that in English, the first syllable of the name of the binary-multiple prefix should be pronounced in the same way as the first syllable of the name of the corresponding SI prefix, and that the second syllable should be pronounced as "bee." [4]
Computer memory
Measurements of most types of electronic memory, such as RAM, ROM, and flash (large-scale disk-like flash is sometimes an exception), are given in binary units, as they are made in power-of-two sizes. This is the most natural configuration for memory, as all combinations of its address lines map to a valid address, allowing easy aggregation into a larger contiguous block of memory.
Hard disk drives
HDD manufacturers state capacity in decimal units. This usage has a long tradition, even predating the SI system of decimal prefixes adopted in 1960, as follows:
- The first disk drive, the IBM 350 (1950s), had 5,000,000 6-bit characters organized in 100-character sectors (i.e., blocks). This predates the SI system.
- In the 1960s virtually all disk drives used IBM's variable block length format (called Count Key Data or "CKD"[12]). Any block size could be specified up to the maximum track length. Blocks ("records" in IBM's terminology) of 88, 96, 880 and 960 were often used because they related to the fixed block size of punch cards. The drive capacity was usually stated in full track record blocking; for example, the 100 megabyte 3336 disk pack only achieved that capacity with a full track block size of 13,030 bytes.
- CKD continued into the 1990s and perhaps to this day. In the 1970s and 1980s most drives were specified with unformatted tracks (the unformatted capacity), with the particular block size and formatted capacity a function of the controller design. For example, the ST412 of IBM PC/XT fame had an unformatted capacity of 12.75 MB (not MiB); with the Xebec controller and 512-byte blocks it formatted to, and was advertised as, a 10.0 MB (not MiB) HDD. Other controllers supported other block sizes, resulting in other formatted capacities.
- The advent of intelligent interfaces (SCSI and IDE) in the early 1990s moved the block size decision into the drive, and virtually all chose 512 bytes, for no reason other than that this was what IBM had chosen when it picked the Xebec controller for the PC/XT. Capacity continued to be specified by the HDD manufacturers with SI prefix definitions.
Regardless of the HDD manufacturers' continuous practice of specifying capacity with conventional SI prefixes, some systems' GUIs took the HDD capacity, reported by the operating system as a binary number without prefixes, and presented it as a mixed decimal number with a binary prefix,[13] leading to some confusion. As of January 2007, most, if not all, HDD manufacturers continue to use decimal prefixes to identify capacity.[14]
Flash drives
USB flash drives and flash-based memory cards such as CompactFlash and Secure Digital are typically classified in "powers of two" multiples of decimal megabytes, for example 32 MB, 64 MB, 128 MB, 256 MB, and 512 MB. Although the devices usually have at least the expected byte capacity as defined by the "decimal" calculation (256 million bytes for a device classified as 256 MB), each manufacturer allocates different portions of the device's ultimate capacity for such things as wear levelling.
Floppy drives
The confused usage of decimal prefixes may have started with floppy drives, where the drive and media manufacturers stated their unformatted capacity while various systems houses published differing formatted capacities as a consequence of their varying controller designs. It appears that some system manufacturers and OS vendors began reporting in what we now know as KiB. A very confusing hybrid system developed with the double-sided high-density 3½" floppy disk, in which a "megabyte" means a thousand 1024-byte "kilobytes". Thus, as of 2005, manufacturers universally use the designation "1.44 MB diskette" for a product which holds neither 1.44×2^20 bytes nor 1.44×10^6 bytes, but rather 1.44×1000×1024 bytes (approximately 1.406 MiB, or 1.475 MB). This is probably because it was judged best advertised as a double-capacity version of the prior generation 720 KB product (which, of course, held 720 KiB).
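The hybrid floppy "megabyte" is easy to verify numerically. The following Python sketch uses hypothetical constant names (`KIB`, `MIB`, `MB`) and only the figures given in the paragraph above.

```python
KIB = 1024      # one kibibyte in bytes
MIB = 1024**2   # one mebibyte in bytes
MB = 1000**2    # one decimal megabyte in bytes

# "1.44 MB" in the hybrid sense: 1.44 x 1000 x 1024 bytes, i.e. 1440 KiB
floppy_bytes = 1440 * KIB

print(floppy_bytes)        # 1474560
print(floppy_bytes / MIB)  # 1.40625
print(floppy_bytes / MB)   # 1.47456

# Exactly double the earlier 720 KiB format
print(floppy_bytes == 2 * 720 * KIB)  # True
```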
CD and DVD
CD capacities are always given in binary units. A "700 MB" (or "80 minute") CD has a nominal capacity of about 700 MiB.[15] But DVD capacities are given in decimal units. A "4.7 GB" DVD has a nominal capacity of about 4.38 GiB.[16]
Buses
Bus bandwidth is given in decimal units. This is not because hard drive capacities use the decimal versions, nor because bit rates do, but because clock speeds do. For example, "PC3200" memory runs on a double pumped 200 MHz bus, transferring 8 bytes per cycle, and hence has a bandwidth of 200,000,000×2×8 = 3,200,000,000 byte/s.
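The PC3200 figure is a plain product of clock rate, transfers per cycle, and bytes per transfer. A minimal sketch, with `bus_bandwidth` as a hypothetical helper name:

```python
def bus_bandwidth(clock_hz: int, transfers_per_cycle: int, bytes_per_transfer: int) -> int:
    """Peak bandwidth in bytes per second."""
    return clock_hz * transfers_per_cycle * bytes_per_transfer


# 200 MHz clock (decimal mega), double pumped, 8 bytes per transfer
pc3200 = bus_bandwidth(200 * 10**6, 2, 8)

print(pc3200)           # 3200000000 byte/s
print(pc3200 // 10**6)  # 3200 decimal MB/s
print(pc3200 / 2**20)   # 3051.7578125 MiB/s, for comparison
```

Because the clock rate is a decimal quantity, only the decimal reading gives a round number.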
Legal disputes
The implicit use of decimal units to describe the capacity of storage devices has become a source of confusion as these devices are increasingly marketed to non-technical consumers. When a user buys a device advertised using decimal units, and installs it in a system that shows the available space in binary units, a misinformed user may be disturbed by the apparent discrepancy. As a result, there have been several lawsuits against companies who sell hard drives, flash memory devices, and computer systems that list drive capacities.
Several significant lawsuits have been filed:
- On June 23, 2003, a business tort action entitled Matthew Leffert vs. Amazon.com, Inc. was filed in the Superior Court for the City and County of San Francisco, Case No. CGC-03-421769. In this case, the plaintiff complained of false advertising in relation to how MP3 player storage is marketed.
- In September 2003, Lanchau Dan, Adam Selkowitz, Tim Swan and John Zahabian filed a lawsuit against Dell, Inc., Apple Computer Inc., Gateway, Inc., Hewlett-Packard Co., IBM Corp., Sharp Corporation, Sony Corp. and Toshiba Corp., claiming that their advertising deceptively exaggerates the real capacity of their hard drives.
- On February 20, 2004, Willem Vroegh filed a lawsuit against Lexar Media, Dane–Elec Memory, Fuji Photo Film USA, Eastman Kodak Company, Kingston Technology Company, Inc., Memorex Products, Inc., PNY Technologies Inc., SanDisk Corporation, Verbatim Corporation, and Viking InterWorks, alleging that their descriptions of the capacity of their flash memory cards were false and misleading.
- On July 7, 2005, an action entitled Orin Safier v. Western Digital Corporation, et al., was filed in the Superior Court for the City and County of San Francisco, Case No. CGC-05-442812. The case was subsequently moved to the Northern District of California, Case No. 05-03353 BZ.
References
- [1] Amendment 2 to IEC International Standard IEC 60027-2: Letter symbols to be used in electrical technology – Part 2: Telecommunications and electronics.
- [2] (2006) "§3.1 SI prefixes", The International System of Units (SI) (PDF), 8th edition (in French/English), Paris: STEDI Media, p. 127. ISBN 92-822-2213-6. Retrieved on 2007-02-25. “[Side note:] These SI prefixes refer strictly to powers of 10. They should not be used to indicate powers of 2 (for example, one kilobit represents 1000 bits and not 1024 bits). The IEC has adopted prefixes for binary powers in the international standard IEC 60027-2: 2005, third edition, Letter symbols to be used in electrical technology – Part 2: Telecommunications and electronics. The names and symbols for the prefixes corresponding to 2^10, 2^20, 2^30, 2^40, 2^50, and 2^60 are, respectively: kibi, Ki; mebi, Mi; gibi, Gi; tebi, Ti; pebi, Pi; and exbi, Ei. Thus, for example, one kibibyte would be written: 1 KiB = 2^10 B = 1024 B, where B denotes a byte. Although these prefixes are not part of the SI, they should be used in the field of information technology to avoid the incorrect usage of the SI prefixes.”
- [3] (2003-02-12) IEEE Trial-Use Standard for Prefixes for Binary Multiples (PDF). ISBN 0-7381-3386-8. Retrieved on 2007-02-25. “This standard is prepared with two goals in mind: (1) to preserve the SI prefixes as unambiguous decimal multipliers and (2) to provide alternative prefixes for those cases where binary multipliers are needed. The first goal affects the general public, the wide audience of technical and nontechnical persons who use computers without much concern for their construction or inner working. These persons will normally interpret kilo, mega, etc., in their proper decimal sense. The second goal speaks to specialists – the prefixes for binary multiples make it possible for persons who work in the information sciences to communicate with precision.”
- [4] Prefixes for Binary Multiples – The NIST Reference on Constants, Units, and Uncertainty
- [5] Rules for SAE Use of SI (Metric) Units – Section C.1.12 – SI prefixes
- [6] IEEE-SA STANDARDS BOARD STANDARDS REVIEW COMMITTEE (RevCom) MEETING AGENDA (2005-03-19). Retrieved on 2007-02-25. “1541-2002 (SCC14) IEEE Trial-Use Standard for Prefixes for Binary Multiples [No negative comments received during trial-use period, which is now complete; Sponsor requests elevation of status to full-use.] Recommendation: Elevate status of standard from trial-use to full-use. Editorial staff will be notified to implement the necessary changes. The standard will be due for a maintenance action in 2007.”
- [7] HD 60027-2:2003 Information about the harmonization document (obtainable on order)
- [8] prEN 60027-2:2006 Information about the EN standardization process
- [9] "Architecture of the IBM System/360," ©1964 gives memory capacity ranges of the various models in "Capacity 8 bit bytes 1 KB = 1024"
- [10] System/360 Model 91
- [11] Binary vs. Decimal Measurements
- [12] http://www.answers.com/topic/ckd
- [13] It is not clear when such mis-reporting began, perhaps as early as the mid-1970s with the advent of floppy disk drive capacity reporting; however, it became a significant issue with the advent of GUI-based personal computers.
- [14] On January 6, 2007, a check of the websites of Fujitsu, HGST, Samsung, Seagate, Toshiba and Western Digital showed these companies (representing virtually all of the HDD industry by unit volume) specify capacity with the SI prefix definitions.
- [15] Data capacity of CDs
- [16] Understanding Recordable and Rewritable DVD
External links
- "When is a kilobyte a kibibyte? And an MB an MiB?" — An introduction to binary prefixes from the IEC
- Tool to convert to/from the binary and standard units (up to yobibytes)
- What is a Megabyte ...? Markus Kuhn's 1996–1999 paper on bits, bytes, prefixes and symbols
- Prefixes for binary multiples — NIST
- Get Ready for the mebi, gibi and tebi — NIST
- Another description of binary prefixes
- White-paper on the controversy over drive capacities
- There is no such thing as a 1.44 MB standard format floppy disc
- A summary of the organizations, software, and so on that have implemented the new binary prefixes
- A plea for sanity
- KiloBytes vs. kilobits vs. Kibibytes (Binary prefixes)
- Here Come Zebi- and Yobi- IEC press release announcing new prefixes
- Decimal-to-Binary Prefixes and Binary-to-Decimal Prefixes Converter