Bits and Bytes: the Open and Shut Case of Data Terminology
The term BYTE in some form is probably the tech term you will hear most often, sometimes without even realizing it, because it won’t always be used on its own or in its full form.
Let me tell you first why this can get confusing: the very same words that describe how BIG something is also describe how FAST something is and how MUCH of it there is. Normally, we use a different unit for each: we have a 2,000 square-foot house, we go 70 miles per hour, and we boil a quart of water. With data, BIG, FAST, and MUCH all come down to HOW MUCH. We want to know HOW MUCH data will fit on a hard drive, HOW MUCH data can be transferred at a particular pace, and HOW MUCH data can be handled by RAM.
Let’s start with the smallest, most basic unit: BIT.
A bit is a shortened term that comes from binary digit. It represents the state of an electrical switch, which you can picture as a gate. The gate has two positions, open and closed; you could also call them on and off. If the gate is open, it’s on; if the gate is closed, it’s off. Think of a water spigot: if you OPEN the spigot, you turn the flow of water ON. Got that? A bit is written as either a 1 or a 0. 0 indicates the closed gate, or a state of OFF. 1 indicates that it is open, and ON.
Have you ever looked into a water spigot at the hardware store? Pick one up and watch what happens inside when you turn the handle. You will see the gate change position. When it is fully open, you will only see the side view of it, and it will look like an upright 1. As you start to close it, the view changes from side-on to face-on, and the image slowly changes from a 1 to a 0 inside the spigot, only the 0 will be filled in with whatever material that gate is made of.
So one bit of data is an electrical signal that indicates whether a gate should be open or shut. Don’t make the mistake of thinking that one letter on a keyboard is one bit. A single letter typically takes eight bits to store, and a saved document containing just that one letter is several thousand bits, because the file also holds formatting and other housekeeping information.
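To see what a single letter looks like as bits, here is a minimal sketch in Python. It assumes the text is stored as UTF-8, where a plain English letter occupies one byte (eight bits); the function name `letter_to_bits` is made up for this illustration. A saved document adds plenty more bits on top of this for its own bookkeeping.

```python
def letter_to_bits(letter: str) -> str:
    """Return the 1s and 0s that store one character, assuming UTF-8 encoding."""
    raw = letter.encode("utf-8")  # the byte(s) used to store the character
    # Write each byte as eight digits, one digit per "gate".
    return " ".join(format(byte, "08b") for byte in raw)


print(letter_to_bits("A"))  # 01000001
```

Each 1 is an open gate and each 0 is a closed one, so the capital letter A is eight gates in a row.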
Eight bits make up a BYTE, and from there it gets a lot easier, because each name builds up from a multiple of its predecessor. In other words, a KILOBYTE is 1024 bytes. A MEGABYTE is 1024 kilobytes. A GIGABYTE is 1024 megabytes. A TERABYTE is 1024 gigabytes. So far, that’s probably all you have had to deal with; but it does go further:
A PETABYTE is 1024 terabytes. An EXABYTE is 1024 petabytes. A ZETTABYTE is 1024 exabytes. A YOTTABYTE is 1024 zettabytes. Beyond that you may run into the informal names BRONTOBYTE (1024 yottabytes) and GEOPBYTE (1024 brontobytes), although those two are not official prefixes. By the time we get to exabytes, we are talking about a good-sized company’s storage needs. The zettabyte size is getting close to the NSA’s compound in Utah.
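The ladder of names can be sketched in a few lines of Python. This is only an illustration of the multiply-by-1024 rule described above; the names `UNITS` and `human_size` are invented for the example, and the list stops at yottabytes.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_size(num_bytes: int) -> str:
    """Express a byte count in the largest unit that fits, stepping up by 1024."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop when the number is small enough, or when we run out of names.
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024  # climb one rung of the ladder


print(human_size(1024))       # 1.00 KB
print(human_size(1024 ** 3))  # 1.00 GB
print(human_size(5_000_000))  # 4.77 MB
```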
Let’s muddy the water a little more. Hard drive manufacturers don’t use the 1024 multiplier. They use the standard multiple of 1000. Otherwise, the sequence is the same. This explains why a hard drive that is labeled 100GB may not appear to have 100GB of capacity: your operating system may not be following the 1000-multiplier convention that the hard drive manufacturer is. But the gap between the two numbers won’t matter much in practice. If you get to the point where 35MB of storage is free on your hard drive, it’s time to either do some serious cleaning up or buy a bigger hard drive. The reality is that today’s hardware is so big that the difference between 1000 and 1024 doesn’t make a “bit” of difference to the average user.
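Here is a rough sketch of that arithmetic in Python, assuming the manufacturer counts a gigabyte as 1000 x 1000 x 1000 bytes and the operating system counts it as 1024 x 1024 x 1024 bytes. The function name `label_to_os_gb` is hypothetical, and a real drive will show a little less again because formatting takes some space.

```python
def label_to_os_gb(label_gb: float) -> float:
    """Convert a 1000-based GB figure from the box to a 1024-based GB figure."""
    total_bytes = label_gb * 1000 ** 3  # manufacturer: 1 GB = 1,000,000,000 bytes
    return total_bytes / 1024 ** 3      # 1024-based:   1 GB = 1,073,741,824 bytes


print(round(label_to_os_gb(100), 2))   # 93.13
print(round(label_to_os_gb(1000), 2))  # 931.32
```

So a drive sold as 100GB can show up as roughly 93GB without anything being wrong with it.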
Now let’s look at the applications of BIG, FAST, AND MUCH.
You want a nice, BIG hard drive. In the early days, hard drives were measured in MEGABYTES. (The REAL early days of computers didn’t have internal hard drives at all; the programs were run either from tapes or disks.) The first computer that I actually owned had an 8GB hard drive. I’ve seen growth to 40GB as a reasonable size, then 80GB, then 120, then 250, then... oh my goodness, now we’re all afraid that a terabyte hard drive isn’t going to be big enough! Well, at today’s prices, a terabyte hard drive can be had inexpensively enough to have one inside the computer and a couple to connect as external storage.
You want a nice FAST processor and bus, and you want a nice FAST transfer rate on your storage devices. Each of those is measured in something per second, but they aren’t the same measurement at all. Your processor is measured in HERTZ, which counts clock cycles per second, not bits. Modern processors are rated in GHz, or gigahertz, and one gigahertz is a billion cycles per second. That’ll make your head spin, won’t it?
If you look at the label on a hard drive, you’ll see the capacity of the drive, but you’ll also see a designation like this somewhere on it: SATA 6.0 Gb/s. SATA is the connection type, but look at that number: 6.0. That designation is not gigabytes, but gigaBITS. When we mean gigaBYTES, we use an upper case B; when we mean gigaBITS, we use a lower case b. Same with kilos and megas. The designation indicates that the drive’s connection can transfer up to 6 gigabits per second. For the most part, when you’re buying a computer you’ll take whatever is there; you can shop the hard drive SIZE, but you won’t really need to worry much about the transfer rate. Now, if you replace a hard drive and you’re buying just the drive, pay attention to that number, because more is better. A terabyte hard drive with a transfer rate of 6.0 Gb/s is better than a terabyte hard drive with a transfer rate of 3.0 Gb/s, but not by much. The reality is that if you aren’t a heavy gamer, or if you aren’t running serious statistical or geological or graphical programs, you won’t notice a difference.
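Since eight bits make a byte, turning a gigaBIT figure into gigaBYTES is just a division by eight. The sketch below shows that conversion in Python; `gigabits_to_gigabytes` is a name made up for the example, and the result is a ceiling on paper, since real drives move data more slowly than the label’s number suggests.

```python
BITS_PER_BYTE = 8


def gigabits_to_gigabytes(gigabits_per_second: float) -> float:
    """Convert a rate in Gb/s (little b) to GB/s (big B)."""
    return gigabits_per_second / BITS_PER_BYTE


print(gigabits_to_gigabytes(6.0))  # 0.75
```

In other words, a 6.0 Gb/s label works out to at most three quarters of a gigabyte each second.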
We also measure network traffic in Mbps and Gbps. Obviously you want a bigger number when you can get it. Most internet speeds are measured in megabits per second. If you’re lucky enough to get an internet connection of a gigabit or higher, I don’t want to hear about it. You’re going to get what your internet service provider is willing and/or able to provide, and there won’t be anything you can do to change that. You may be able to select different packages within the ISP’s offering, but you can’t increase their highest speed offering. But knowing the number can give you a frame of reference when selecting a provider.
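As a frame of reference, you can estimate how long a download should take by converting the file size from megaBYTES to megaBITS and dividing by the connection speed. This is a back-of-the-envelope sketch in Python with a hypothetical function name, `download_seconds`; it assumes you actually get the full advertised speed, which is rarely the case.

```python
def download_seconds(file_megabytes: float, speed_mbps: float) -> float:
    """Estimate download time in seconds for a file size in MB and a speed in Mbps."""
    file_megabits = file_megabytes * 8  # eight bits in every byte
    return file_megabits / speed_mbps


print(download_seconds(500, 100))  # 40.0
```

A 500MB file on a 100 Mbps connection takes about 40 seconds at best, not 5, because the file is counted in bytes and the speed is counted in bits.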
Finally, we want to know how MUCH data can be immediately accessed by a program from RAM. Generally, the more RAM you have in your computer, the faster your programs will run while you’re creating stuff. Check out my post on RAM for a fuller explanation of why.
I covered a lot of time, space, and volume in this post, and even though I read over it several times and made several edits, I’m sure there’s something I left unclear. Leave a comment below if you still have any questions on this subject.