The various terms used for digital measurements can be confusing. This page explains what the various terms actually mean. CKnow can’t control how people use those terms, however, and ignorance of the true meanings has sometimes crept into common use. With that caveat in mind, let’s start at the beginning.
Bit
The most basic piece of information that a computer understands is a bit. The term bit is a shortened form of binary digit, and a bit can have only one of two values: 0 or 1. Think of a bit like a light switch; it’s either on or it’s off.
The short form of a bit is a lower-case b. To be certain your meaning is understood, you should probably spell out the full word bit since, in general use, people sometimes improperly use the lower-case b for byte.
As interesting as a bit may be, humans really can’t think in terms of bits. Only computers “think” in terms of bits. So, humans need some organization to the bits in order to interact with the computer. That’s where bytes and other bit collections come in…
Byte
A byte is a common unit for groupings of bits. In general use, a byte is taken to mean a contiguous sequence of eight bits. There are other, less common meanings for a byte but for this discussion the eight bit sequence is a good one and will be used throughout.
The short form of a byte is an upper-case B.
The word byte is pronounced like the word “bite” and is thought to have come from a byte being the smallest amount of data a computer could bite at one time. Couple that with early computer scientists feeling the need to construct their own terminology and the possible confusion between bit and bite and you end up with the spelling: byte.
To take the analogy of eating a step further, some early computers worked in sequences of four bits instead of eight for some operations. This, as you might have guessed, led to the term nibble for these four-bit sequences. (You might also see nybble but since there is no other term to confuse nibble with, it is the spelling generally used.)
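Since a nibble is exactly one hexadecimal digit, splitting a byte into its nibbles takes only a shift and a mask. A minimal Python sketch (the helper name is my own, not a library call):

```python
# Split a byte into its two nibbles (illustrative helper, not a library call).
def nibbles(byte):
    """Return the (high, low) four-bit halves of an 8-bit value."""
    return (byte >> 4) & 0xF, byte & 0xF

print(nibbles(0xAB))        # -> (10, 11): the hex digits A and B
print(nibbles(0b11110000))  # -> (15, 0)
```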
In some circumstances instead of byte you might see the word octet. This is a term often used in standards and also in communications and networking settings where the number of bits that represents a given single piece of information might differ from eight. If you look at MIME types for anything, you will probably note its use in the type application/octet-stream which basically tells the communication system to transfer the data stream without any modification during transit.
To see what this means for human use, consider what eight bits can represent. Starting with zero (computers generally count from zero instead of one) you get a sequence like…
- 00000000 = decimal 0
- 00000001 = decimal 1
- 00000010 = decimal 2
- 00000011 = decimal 3
- …
- 11111111 = decimal 255
Each byte can therefore represent one of 256 different values. You can map the integers 0 through 255 onto a byte, interpret a byte as one of 256 ASCII characters, read it as a two-digit hexadecimal number from 00 through FF, or apply any other mapping that you define. This ability to do such mapping allows humans to better interact with computers without having to think in binary.
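A short Python sketch makes the mapping concrete; the decimal, ASCII, and hexadecimal readings below are all just different interpretations of the same bit pattern:

```python
# One byte holds a value from 0 through 255. The same eight-bit pattern
# can be read as an integer, an ASCII character, or two hex digits.
value = 0b01000001           # the bit pattern 01000001
print(value)                 # -> 65 as a decimal integer
print(chr(value))            # -> 'A' as an ASCII character
print(format(value, "02X"))  # -> '41' as a hexadecimal number
print(format(255, "08b"))    # -> '11111111', the largest byte value
```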
So far, so good. Things get a bit more complicated, however, when talking about groupings of bytes in larger quantities. The problem is that terminology from the decimal numbering system has made its way into binary numbering as if equivalent but, in reality, the two are not equivalent. As one example, the prefix “kilo” generally denotes 1,000 of something in the decimal numbering system. In binary usage, however, “kilo” means 2¹⁰ (two to the tenth power), which turns out to be 1,024 in decimal. This confusion of terms continues up the numbering scale for all the prefixes.
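A quick Python sketch shows the gap between the two meanings of “kilo” and how it grows as the prefixes get larger:

```python
# The two meanings of "kilo": SI decimal vs. binary usage.
decimal_kilo = 10 ** 3  # 1,000 -- what "kilo" means everywhere else
binary_kilo = 2 ** 10   # 1,024 -- what computing borrowed it to mean
print(binary_kilo - decimal_kilo)  # -> 24 (about a 2.4% difference)

# The gap widens with each step up the prefix scale:
print(2 ** 30 / 10 ** 9)  # -> 1.073741824 ("giga": binary is ~7.4% larger)
```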
In 1998, the International Electrotechnical Commission (IEC) attempted to resolve this problem by introducing new prefixes for binary multiples. The names were taken from the decimal prefixes with the addition of “bi” at the end to stand for “binary” and clearly indicate that a multiple referred to the 1,024 factor and not the 1,000 factor (the “bi” is pronounced as the word “bee”). The prefixes introduced include: kibi-, mebi-, gibi-, and others. More of a similar nature were added in 2005 as the need arose. CKnow will show both below when discussing multiples.
But, even that didn’t completely work. Certain units are generally understood to be decimal despite being used when talking about computing.
- Hertz (Hz) measures clock rate so a 2GHz CPU clock performs 2,000,000,000 clock ticks per second.
- Bits per second measures a data rate so a 128 kbit/s MP3 stream will send 128,000 bits every second which works out to 16,000 bytes per second when you divide the 128,000 by 8 (assuming an 8-bit byte and no overhead).
- Hard disk drive makers typically state capacity in decimal units. The numbers (for retail purposes) are larger and so the disk drive sounds bigger. The operating system, however, reports the binary size of the disk, which is why when you format a 100GB hard drive you only get a bit over 93GB of free space reported. There are good engineering reasons why this convention is used, but it doesn’t hurt marketing either 🙂 (although there have been some legal battles over this as well).
- Floppy disk sizes (and some other newer devices) used a hybrid of the two systems to report disk size. Disks are accessed using sectors which are measured in binary sizes (from 512 to 2048 bytes depending on the device). Thus, for the most basic measure the “kilo” means 1,024. The prefixes above that, however, are decimal multiples, so a megabyte for a disk of this type really means a thousand 1,024-byte kilobytes. This means that a 1.44MB floppy diskette holds 1.44×1,000×1,024 bytes!
- To further confuse things, CDs are measured using binary units but DVDs are measured using decimal units! A 4.7GB DVD really only has a binary capacity of about 4.38 binary gigabytes (GiB).
- Bus bandwidth is typically a decimal measure. Again, since the bus is clock-based, decimal units are used instead of binary units even though binary bytes are being pumped over the bus.
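The mixed conventions in the list above can be checked with a few lines of Python (assuming 8-bit bytes and no protocol overhead):

```python
# Working a few of the examples above, assuming 8-bit bytes and no overhead.
mp3_bytes_per_sec = 128_000 // 8   # 128 kbit/s MP3 -> 16,000 bytes/s
disk_bytes = 100 * 10 ** 9         # a "100 GB" drive, decimal marketing size
disk_gib = disk_bytes / 2 ** 30    # what the OS reports: about 93.1 "GB"
floppy_bytes = 1440 * 1024         # "1.44 MB" floppy: 1,440 x 1,024 bytes
dvd_gib = 4.7 * 10 ** 9 / 2 ** 30  # "4.7 GB" DVD: about 4.38 binary GB

print(mp3_bytes_per_sec, round(disk_gib, 1), floppy_bytes, round(dvd_gib, 2))
# -> 16000 93.1 1474560 4.38
```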
With all this in mind, here are the various prefixes in use today and what they refer to in actual numbers. Each multiple is shown first as its decimal value and next as its binary value.
- Kilo (K) = 1,000 = 10³ decimal
- Kibi (Ki) = 1,024 = 2¹⁰ binary
- Mega (M) = 1,000,000 = 10⁶ decimal
- Mebi (Mi) = 1,048,576 = 2²⁰ binary
- Giga (G) = 1,000,000,000 = 10⁹ decimal
- Gibi (Gi) = 1,073,741,824 = 2³⁰ binary
- Tera (T) = 1,000,000,000,000 = 10¹² decimal
- Tebi (Ti) = 1,099,511,627,776 = 2⁴⁰ binary
- Peta (P) = 1,000,000,000,000,000 = 10¹⁵ decimal
- Pebi (Pi) = 1,125,899,906,842,624 = 2⁵⁰ binary
- Exa (E) = 1,000,000,000,000,000,000 = 10¹⁸ decimal
- Exbi (Ei) = 1,152,921,504,606,846,976 = 2⁶⁰ binary
- Zetta (Z) = 1,000,000,000,000,000,000,000 = 10²¹ decimal
- Zebi (Zi) = 1,180,591,620,717,411,303,424 = 2⁷⁰ binary
- Yotta (Y) = 1,000,000,000,000,000,000,000,000 = 10²⁴ decimal
- Yobi (Yi) = 1,208,925,819,614,629,174,706,176 = 2⁸⁰ binary
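Since each decimal prefix is just the next power of 1,000 and each IEC binary prefix the matching power of 1,024, the whole table can be regenerated in a few lines of Python:

```python
# Rebuilding the table above: each decimal prefix is the next power of
# 1,000 and each IEC binary prefix the matching power of 1,024.
decimal_prefixes = ["Kilo (K)", "Mega (M)", "Giga (G)", "Tera (T)",
                    "Peta (P)", "Exa (E)", "Zetta (Z)", "Yotta (Y)"]
binary_prefixes = ["Kibi (Ki)", "Mebi (Mi)", "Gibi (Gi)", "Tebi (Ti)",
                   "Pebi (Pi)", "Exbi (Ei)", "Zebi (Zi)", "Yobi (Yi)"]

for i, (dec, binp) in enumerate(zip(decimal_prefixes, binary_prefixes), 1):
    print(f"{dec} = {1000 ** i:,} = 10^{3 * i} decimal")
    print(f"{binp} = {1024 ** i:,} = 2^{10 * i} binary")
```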
Now we’ll list each in turn to discuss any particular meanings relevant to that specific prefix as it applies to various uses.
Kilo- and Kibi-
- Kilobit. A unit of information abbreviated kbit or kb. Use of this term can be confusing as it was introduced early and therefore takes on either the decimal or binary meaning in various circumstances more so than most of the other prefixes. Fortunately, at this level the difference is small: a decimal kilobit equals 125 8-bit bytes while a binary kilobit equals 128 8-bit bytes. When discussing telecommunications the decimal meaning is almost always used. When precision is necessary, kibibit should be used.
- Kibibit. A unit of information or storage abbreviated Kibit or Kib (note that when kibi is used it is always capitalized unlike kilo). One kibibit is always equal to 1,024 bits or 128 8-bit bytes.
- Kilobyte. A unit of information or storage abbreviated KB, kB, Kbyte, kbyte, or, informally, K or k. Kilo was adopted early because, while computers use binary, the tenth power of 2 (2¹⁰) is 1,024, which is close to the 1,000 that kilo means. This informal adoption has unfortunately confused things ever since, as seen from the discussion above. Be careful when encountering kilobyte and know the specific reference.
- Kibibyte. A unit of information or storage abbreviated KiB (never with a lower case “k”). One KiB will always mean exactly 1,024 (210) bytes. When precision is needed, use KiB instead of KB if you are discussing binary measures.
Mega- and Mebi-
- Megabit. A unit of information or storage abbreviated Mbit or Mb. Again, this term takes on a dual meaning depending on the context. However, in the case of the megabit the term is almost always used in a communication setting and therefore most typically takes on the decimal meaning where 1 megabit = 10⁶ = 1,000,000 bits (one million).
- Mebibit. A unit of information or storage abbreviated Mibit or Mib. As with all the other binary bit definitions, 1 mebibit = 2²⁰ bits = 1,048,576 bits = 1,024 kibibits.
- Megabyte. A unit of information or storage abbreviated MB (never Mb which would mean bits and not bytes — see just above) or meg. Again, there are confusing interpretations for this term as they have developed over time and for different contexts (also see the more general discussion above).
- 1,000,000 bytes (10⁶) when used in a networking context, clocks, or performance measures.
- 1,048,576 bytes (2²⁰) when used discussing memory and file size.
- 1,024,000 bytes (1,024×1,000) when used for floppy disk sizes.
- Mebibyte. A unit of information or storage abbreviated MiB. This is the specific measure of the binary representation of 1,048,576 bytes = 1,024 kibibytes = 2²⁰ bytes. When precision is demanded, this is the term to use. Note: Mebibyte is often misspelled as “Mibibyte.”
Giga- and Gibi-
- Gigabit. A unit of information or storage abbreviated Gbit or Gb. One gigabit most typically equals 10⁹ or 1,000,000,000 bits (one billion, or one milliard or thousand million in long scale measure*).
- Gibibit. A unit of information or storage abbreviated Gibit or Gib. This is the absolute binary measure equaling 1,073,741,824 (2³⁰) bits. Use it when precision is needed.
- Gigabyte. A unit of information or storage abbreviated GB (never Gb) or gig when writing informally or speaking. Again, there are confusing interpretations for this term as they have developed over time and for different contexts (also see the more general discussion above).
- 1,000,000,000 bytes (10⁹) when used in a networking context, clocks, or performance measures. Disk drive manufacturers also typically use this form as it results in an apparently larger size for the disk, which the operating system then formats and reports in binary, with lower numbers.
- 1,073,741,824 (2³⁰) bytes. This definition is used for memory, file and formatted disk size, and other contexts where binary notation fits better.
- Gibibyte. A unit of information or storage abbreviated GiB. This is the specific measure of the binary representation of 1,073,741,824 (2³⁰) bytes. When precision is demanded, this is the term to use.
Tera- and Tebi-
- Terabit. A unit of information or storage abbreviated Tbit or Tb. One terabit most typically equals 10¹² or 1,000,000,000,000 bits (one trillion, or one billion in long scale measure*).
- Tebibit. A unit of information or storage abbreviated Tibit or Tib. This is the absolute binary measure equaling 1,099,511,627,776 (2⁴⁰) bits. Use it when precision is needed.
- Terabyte. A unit of information or storage abbreviated TB. Again, there are/will be confusing interpretations for this term for different contexts (also see the more general discussion above). Note: Tera derives from the Greek word “teras” which means “monster.”
- 1,000,000,000,000 bytes (10¹²) when used in a networking context, clocks, or performance measures.
- 1,099,511,627,776 (2⁴⁰) bytes. This definition is used for memory, file and formatted disk size, and other contexts where binary notation fits better.
- Tebibyte. A unit of information or storage abbreviated TiB. This is the specific measure of the binary representation of 1,099,511,627,776 (2⁴⁰) bytes. When precision is demanded, this is the term to use.
Peta- and Pebi-
- Petabit. A unit of information or storage abbreviated Pbit or Pb. One petabit most typically equals 10¹⁵ or 1,000,000,000,000,000 bits (one quadrillion, or one billiard in long scale measure*).
- Pebibit. A unit of information or storage abbreviated Pibit or Pib. This is the absolute binary measure equaling 1,125,899,906,842,624 (2⁵⁰) bits. Use it when precision is needed.
- Petabyte. A unit of information or storage abbreviated PB. Again, there are/will be confusing interpretations for this term for different contexts (also see the more general discussion above).
- 1,000,000,000,000,000 bytes (10¹⁵) when used in a networking context, clocks, or performance measures.
- 1,125,899,906,842,624 (2⁵⁰) bytes. This definition is used for memory, file and formatted disk size, and other contexts where binary notation fits better.
- Pebibyte. A unit of information or storage abbreviated PiB. This is the specific measure of the binary representation of 1,125,899,906,842,624 (2⁵⁰) bytes. When precision is demanded, this is the term to use.
Exa- and Exbi-
- Exabit. A unit of information or storage abbreviated Ebit or Eb. One exabit most typically equals 10¹⁸ or 1,000,000,000,000,000,000 bits (one quintillion, or one trillion in long scale measure*).
- Exbibit. A unit of information or storage abbreviated Eibit or Eib. This is the absolute binary measure equaling 1,152,921,504,606,846,976 (2⁶⁰) bits. Use it when precision is needed.
- Exabyte. A unit of information or storage abbreviated EB. Again, there are/will be confusing interpretations for this term for different contexts (also see the more general discussion above).
- 1,000,000,000,000,000,000 bytes (10¹⁸) when used in a networking context, clocks, or performance measures.
- 1,152,921,504,606,846,976 (2⁶⁰) bytes. This definition is used for memory, file and formatted disk size, and other contexts where binary notation fits better.
- Exbibyte. A unit of information or storage abbreviated EiB. This is the specific measure of the binary representation of 1,152,921,504,606,846,976 (2⁶⁰) bytes. When precision is demanded, this is the term to use.
Zetta- and Zebi-
- Zettabit. A unit of information or storage abbreviated Zbit or Zb. One zettabit most typically equals 10²¹ or 1,000,000,000,000,000,000,000 bits (one sextillion, or one trilliard in long scale measure*). Note: Zettabit is not yet used, though it’s only a matter of time before it is.
- Zebibit. A unit of information or storage abbreviated Zibit or Zib. This is the absolute binary measure equaling 1,180,591,620,717,411,303,424 (2⁷⁰) bits. Use it when precision is needed. Note: Zebibit is not yet used, though it’s only a matter of time before it is.
- Zettabyte. A unit of information or storage abbreviated ZB. Again, there are/will be confusing interpretations for this term for different contexts (also see the more general discussion above).
- 1,000,000,000,000,000,000,000 bytes (10²¹) when used in a networking context, clocks, or performance measures.
- 1,180,591,620,717,411,303,424 (2⁷⁰) bytes. This definition is used for memory, file and formatted disk size, and other contexts where binary notation fits better.
- Zebibyte. A unit of information or storage abbreviated ZiB. This is the specific measure of the binary representation of 1,180,591,620,717,411,303,424 (2⁷⁰) bytes. When precision is demanded, this is the term to use.
Yotta- and Yobi-
- Yottabit. A unit of information or storage abbreviated Ybit or Yb. One yottabit most typically equals 10²⁴ or 1,000,000,000,000,000,000,000,000 bits (one septillion, or one quadrillion in long scale measure*). Note: Yottabit is not yet used, though it’s only a matter of time before it is.
- Yobibit. A unit of information or storage abbreviated Yibit or Yib. This is the absolute binary measure equaling 1,208,925,819,614,629,174,706,176 (2⁸⁰) bits. Use it when precision is needed. Note: Yobibit is not yet used, though it’s only a matter of time before it is.
- Yottabyte. A unit of information or storage abbreviated YB. Again, there are/will be confusing interpretations for this term for different contexts (also see the more general discussion above).
- 1,000,000,000,000,000,000,000,000 bytes (10²⁴) when used in a networking context, clocks, or performance measures.
- 1,208,925,819,614,629,174,706,176 (2⁸⁰) bytes. This definition is used for memory, file and formatted disk size, and other contexts where binary notation fits better.
- Yobibyte. A unit of information or storage abbreviated YiB. This is the specific measure of the binary representation of 1,208,925,819,614,629,174,706,176 (2⁸⁰) bytes. When precision is demanded, this is the term to use.
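To make the dual meanings above concrete, here is a small Python sketch (the helper name is my own, not a standard library function) that reports the same byte count under both conventions:

```python
# Report a byte count in both decimal (SI) and binary (IEC) units.
# Illustrative sketch; the function name is not from any library.
def human_size(n_bytes):
    dec_units = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
    bin_units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]

    def scale(n, base, units):
        i = 0
        n = float(n)
        while n >= base and i < len(units) - 1:
            n /= base
            i += 1
        return f"{n:.2f} {units[i]}"

    return scale(n_bytes, 1000, dec_units), scale(n_bytes, 1024, bin_units)

print(human_size(4_700_000_000))  # a "4.7 GB" DVD -> ('4.70 GB', '4.38 GiB')
```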
As a matter of trivia, if you think teleportation is coming soon, consider what it would take to describe an average human. An average 75 kg (165 pound) body contains about 11,800 moles of atoms. If you assign 100 bytes to store the location, type, and state of every atom in that average body, you would need about 600,000 yottabytes. Just try sending that over your local area network! 🙂
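The trivia above can be rough-checked in Python (assuming Avogadro’s number and the figures given; “yottabytes” here is taken in the binary sense, i.e. yobibytes):

```python
# Rough check of the figure above: 100 bytes per atom for a 75 kg body.
AVOGADRO = 6.022e23
atoms = 11_800 * AVOGADRO          # about 7.1e27 atoms
total_bytes = atoms * 100          # about 7.1e29 bytes
yobibytes = total_bytes / 2 ** 80  # about 590,000 -- roughly "600,000"
print(f"{yobibytes:,.0f}")
```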
*Note: The description of a particular number of digits (e.g., thousand for 1,000) comes in two forms: long scale and short scale. In the long scale a billion means a million millions; in the short scale a billion means a thousand millions.
Comments from 5/20/2009 page:
STELLA JAMES
Said this on 2011-05-11 At 07:30 am
state the relationship between a bit, a character, byte, kilo byte, mega byte, giga byte and terra byte
[A character is either one or two bytes depending on the encoding. The rest is explained in the article. –DaBoss]
Tom Gardner
Said this on 2012-02-07 At 05:00 pm
You have a typo at “Megibit;” I believe u mean Mebibit.
More importantly Megabyte was historically and consistently used by the hard disk drive industry in their decimal connotation long before its binary meaning became generally used. Megabyte hard disk drives are still available today, albeit used or refurbished.
Similarly and consistently the disk drive industry applied decimal meanings to Gigabyte and Terabyte long before the alternate binary meaning became common for memory and file size reporting. There is no evidence to support your assertion in the Gigabyte paragraph that this was done because the reported sizes are larger.
Finally, modern OSes such as Apple report file and disk size in decimal units.
[Fixed the typo, thank you. As to the rest, most of what you say agrees with what I say although we say it differently. As to the decimal vs binary for disks, the assertion is based on simple marketing. I sincerely doubt that the marketing departments of disk makers did not notice that even larger numbers sounded better than what would be shown with a binary version, and that became at least part of the reason for disk makers using that terminology early on, so I will stand by my assertion but note that it is just that, an assertion and not a fact. –DaBoss]