techieb0y.corbettdigital.net
peter@corbettdigital.net
Last updated: 2008 Nov 15
Understanding Computer Specs v1.1
v1.1 - 2007 Sep 17
v1.0 - 2005 Aug 27
peter@corbettdigital.net
CPU Speed
Computers are avid fans of public transit; they are filled with busses of various types. Expansion busses are the most obvious, but drive connections, memory, and (in some multi-CPU designs) the CPUs themselves also sit on a shared bus. Busses have 2 key parameters: width and speed. The simplest bus has 3 wires: a clock signal, a data line, and a reference ground. Every time the clock line pulses, the receiver reads the data line. This is a 1-bit bus, or a serial bus. Most computers use parallel busses, whose widths are some multiple of 8 bits. Exactly how wide has grown over time: original ISA used 8 data bits side-by-side, reading one byte of data every clock cycle; PCI uses up to 64 data bits per clock cycle.
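As a rough illustration of "every time the clock line pulses, read the data line", here is a minimal Python sketch. The function names and the sample waveforms are made up for this example, and it assumes the receiver samples on the rising edge of the clock; real busses differ in the details.

```python
def read_serial_bus(clock, data):
    """Sample the data line on every rising edge of the clock line."""
    bits = []
    previous = 0
    for clk, value in zip(clock, data):
        if previous == 0 and clk == 1:  # rising edge: the clock just pulsed
            bits.append(value)
        previous = clk
    return bits


def bits_to_byte(bits):
    """Pack sampled bits, most significant bit first, into one integer."""
    result = 0
    for bit in bits:
        result = (result << 1) | bit
    return result


# Hypothetical waveforms: 8 clock pulses, each data bit held for one full pulse.
clock = [0, 1] * 8
data_bits = [0, 1, 0, 0, 0, 0, 0, 1]
data = [bit for bit in data_bits for _ in range(2)]

received = read_serial_bus(clock, data)
print(received, hex(bits_to_byte(received)))
```

A parallel bus does the same thing with several data lines at once, so each clock pulse delivers a whole byte (or more) instead of a single bit.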
The clock rate is the other key bus spec, usually some number of MHz. ISA ran at 8MHz; PCI goes up to 66MHz, and its PCI-X extension up to 133MHz.
These numbers taken together specify the theoretical maximum rate at which information can flow over the bus: an ISA slot (8 bits at 8MHz) can transfer 64,000,000 bits (64 Mbit) of information every second. For comparison, the basic di.fm stream is 96kbit/sec of compressed MP3 audio, so in theory the bus had room for several hundred DI streams -- although the original IBM PC lacked the CPU power to decode or play even one.
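The width-times-clock arithmetic can be sketched in a few lines of Python. The function and constant names are just for illustration, and the result is the theoretical peak only (one transfer on every clock cycle, no overhead).

```python
def bus_bits_per_sec(width_bits, clock_hz):
    """Theoretical peak: one full-width transfer on every clock cycle."""
    return width_bits * clock_hz


DI_STREAM_BITS_PER_SEC = 96_000  # the 96 kbit/sec di.fm stream

isa = bus_bits_per_sec(8, 8_000_000)  # 8 data bits at 8 MHz
print("ISA:", isa, "bits/sec")
print("DI streams that would fit:", isa // DI_STREAM_BITS_PER_SEC)
```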
Desktop PCs mostly use 32-bit 33MHz PCI, which can transfer 1,056,000,000 bits per second. That's 1.056Gbit, or 132 Mbyte, per second, or 11,000 DI streams. This is more than plenty for most ordinary computing, but it has its limitations. For instance, pretend you're downloading a file over a (very fast) network connection to your hard drive. With a 100Mbit/sec (Fast Ethernet) network link, you transfer every received packet from the network into system RAM, and then transfer it from system RAM to the drive controller. On a modern system, the drive interface runs at 100Mbyte/sec or 133Mbyte/sec, so it can easily keep up (although the physical limits of the drive are often slower), and the disk traffic is just the same 100Mbit/sec a second time. So, between network and disk traffic, you have (1056 - 200 =) 856 Mbit/sec left over on the bus.
Now, let's say you're watching a movie. A DVD drive reads the disc's compressed MPEG2 stream at roughly 1.35Mbyte/sec, or 10.8Mbit/sec. The sound, after decoding from the stream, is uncompressed PCM sent to the sound card: 44.1kHz samples at 16 bits, in stereo. That's (44100 * 16 * 2) = 1411200 bits/sec, or 1.411 Mbit/sec, going from system memory to the sound card. If the movie uses surround sound, there's that same amount for the rear channels as well. The movie has video; DVDs run at 720x480 at 30fps, and end up as 32-bit color regions for the graphics card -- that's 331776000 bits/sec, or 331.8Mbit/sec, from the CPU to memory.
Then there's the process of moving that back to the display: a typical screen size is 1280x1024, 32-bit color, updated 72 times per second: 3019898880 bits/sec, or 3.02 Gbit/sec, of video data. (Obviously, PCI can't handle that, so systems have for some time now used an independent graphics bus, AGP, for the video data.) So, leaving the display out, you're using 10.8 + 1.411 + 1.411 + 331.8 + 100 + 100 = 545.4 Mbit/sec -- only about 510.6 Mbit/sec is left over. High-definition (HD) video at 1080p is 1920x1080, 30fps, 24-bit color: 1492992000 bits/sec, or 1.49Gbit/sec -- again, more than PCI can handle.
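The bus budget for this movie-plus-download scenario can be tallied with a short Python sketch. It uses decimal units throughout (1 Mbit = 1,000,000 bits), and it assumes the compressed DVD stream is about 1.35 Mbyte/sec (10.8 Mbit/sec); that figure and all the names below are assumptions made for the example, not measurements.

```python
PCI_BITS_PER_SEC = 32 * 33_000_000  # 32-bit bus at 33 MHz


def pcm_audio_rate(sample_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio."""
    return sample_hz * bits_per_sample * channels


def video_rate(width, height, bits_per_pixel, frames_per_sec):
    """Bit rate of uncompressed video frames."""
    return width * height * bits_per_pixel * frames_per_sec


bus_loads = {
    "network to RAM": 100_000_000,         # Fast Ethernet download
    "RAM to disk": 100_000_000,            # the same data again, to the drive
    "compressed DVD stream": 10_800_000,   # assumption: 1.35 Mbyte/sec
    "front stereo audio": pcm_audio_rate(44_100, 16, 2),
    "rear stereo audio": pcm_audio_rate(44_100, 16, 2),
    "decoded DVD video": video_rate(720, 480, 32, 30),
}

used = sum(bus_loads.values())
remaining = PCI_BITS_PER_SEC - used
print(f"used {used / 1e6:.1f} Mbit/sec, {remaining / 1e6:.1f} Mbit/sec left")

# Loads that exceed the whole bus on their own:
display_refresh = video_rate(1280, 1024, 32, 72)
hd_video = video_rate(1920, 1080, 24, 30)
for name, rate in (("display refresh", display_refresh), ("1080p video", hd_video)):
    print(f"{name}: {rate / 1e9:.2f} Gbit/sec, fits on PCI: {rate <= PCI_BITS_PER_SEC}")
```

Changing any one entry in `bus_loads` (a faster network link, say) shows quickly how little headroom a shared 32-bit 33MHz bus really has.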
Another factor is bus management overhead -- deciding which part of the system can use the bus at any given moment, resolving conflicts, and similar behind-the-scenes tasks.
Current trends in system design are to use more and faster serial busses, such as PCI-Express. Parallel busses, while fast and efficient, run into laws-of-physics issues at higher speeds: two wires next to each other act as antennas and capacitors, interfering with the signals that nearby wires carry. Care must also be taken that each wire in the bus is the same length between any 2 points, or portions of the signal may arrive early or late compared to the rest of the message. PCI-Express avoids these problems by providing serial 'lanes' of 250Mbyte/sec capacity each. Many cards need only a fraction of this, so are designed for a single-lane (x1) slot. Other cards have greater needs, and are designed for slots which provide 4, 8, or 16 lanes. Disk controllers typically are x8; graphics cards are almost always x16. An x16 slot provides 4Gbyte/sec (32Gbit/sec), sufficient to carry about 21 full-bandwidth HD feeds.
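Lane counts turn into slot bandwidth by simple multiplication. The sketch below assumes first-generation PCI-Express at 250 Mbyte/sec per lane in each direction, and uses an uncompressed 1080p feed (1920x1080, 24-bit, 30fps) as the yardstick; the names are invented for the example.

```python
LANE_BYTES_PER_SEC = 250_000_000  # assumption: first-generation PCI-Express lane
HD_FEED_BITS_PER_SEC = 1920 * 1080 * 24 * 30  # uncompressed 1080p at 30fps


def slot_bits_per_sec(lanes):
    """Peak one-direction bandwidth of a slot with the given lane count."""
    return lanes * LANE_BYTES_PER_SEC * 8


for lanes in (1, 4, 8, 16):
    rate = slot_bits_per_sec(lanes)
    feeds = rate // HD_FEED_BITS_PER_SEC
    print(f"x{lanes}: {rate / 1e9:.0f} Gbit/sec, about {feeds} HD feeds")
```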
Memory
The next major number used to rate the performance of a PC is the amount of memory it has, and the speed of the memory bus. The early IBM PC could use up to 640kbyte of memory (Bill Gates is often mis-quoted as saying nobody would ever need more than that); my Windows PC currently has 640Mbytes -- 1000 times as much. Most new machines now ship with 2 to 3 times that much (1 or 2 GB). 32-bit desktop PCs can address up to 4GB of RAM before extra tricks are required; the increasingly common 64-bit chips can address much, much more.
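How far an address of a given width can reach is just a power of two. This small Python sketch assumes one byte per address and ignores any limits the operating system or chipset adds on top; the function name is made up for the example.

```python
GIB = 2 ** 30  # bytes in one binary gigabyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address of this width can name."""
    return 2 ** address_bits


print("32-bit:", addressable_bytes(32) // GIB, "GB")
print("64-bit:", addressable_bytes(64) // GIB, "GB")
```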
Large amounts of RAM aren't much good if the CPU can't use it in a timely manner, so memory bus speeds are also a factor. Early computers ran the memory bus (and expansion busses) from the same clock as the CPU itself, and life was good. As computers got faster, this became impractical; over a few inches of wire, the next clock signal would be put onto the wire before the previous one had made it to the other end. So, memory (and all other busses) generally run at a fraction of the CPU clock, and the CPU spends the time in between memory clocks doing other (hopefully useful) things. This is where pipelined CPU designs are important, as they allow part of the next job to proceed while the CPU waits for the results of a memory access.
Cache
In order to combat the delay of accessing main memory, CPUs include one (or more) layers of cache memory. These are called Level 1 and Level 2 cache (and occasionally Level 3 as well). Level 1 cache is small -- perhaps 32kbyte -- and is part of the CPU package, and runs at CPU speed. Things recently accessed from main RAM are copied here, and the CPU looks in cache before going back to main memory. Level 2 cache is larger (512kbyte or so on x86 CPUs), but slower. The physical location of L2 cache varies between systems; older designs had it as an add-on option socket, the Pentium II included it on the same circuit board as the CPU (part of the Slot 1 package design), and modern designs are moving it closer yet to the CPU itself. The effect of cache is really quite drastic. As a cost saving measure, Intel used (at first) no L2 cache on the Celeron CPU, and it shows -- it's no fun to use a Celeron box at all; it feels slow and laggy just showing an empty desktop with nothing running.
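To get a feel for why cache matters so much, here is a toy Python simulation. It is only a sketch: the `TinyCache` class, the least-recently-used eviction, and the costs of 1 cycle per hit and 100 cycles per miss are all assumptions picked for illustration, not figures for any real CPU.

```python
from collections import OrderedDict


class TinyCache:
    """Toy cache that keeps the most recently used addresses."""

    def __init__(self, capacity, hit_cost=1, miss_cost=100):
        self.capacity = capacity
        self.hit_cost = hit_cost      # cycles when the data is in cache
        self.miss_cost = miss_cost    # cycles for a trip to main memory
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, address):
        """Return the cost in cycles of reading one address."""
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as recently used
            self.hits += 1
            return self.hit_cost
        self.misses += 1
        self.lines[address] = True           # copy in from main memory
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used
        return self.miss_cost


def total_cycles(addresses, cache):
    return sum(cache.read(address) for address in addresses)


# A small loop touching the same 8 addresses over and over.
workload = list(range(8)) * 100

with_cache = TinyCache(capacity=16)
without_cache = TinyCache(capacity=0)
print("with cache:   ", total_cycles(workload, with_cache), "cycles")
print("without cache:", total_cycles(workload, without_cache), "cycles")
```

With these made-up costs, the cached run pays for 8 trips to main memory and then runs at cache speed, while the uncached run pays the full memory delay on every single access.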
The last major selling point of a computer is the hard drive capacity and speed. Various factors are involved in the speed: the disc's rotation speed (5400 rpm for a home PC, 7200 rpm for a gaming PC, 10000 or 15000 rpm for a server), which sets how long the drive waits for data to come around under the head, and the seek time (an ever-decreasing number of milliseconds). Then, there's the speed of the connection from the drive to the rest of the computer: ATA/133 at 133Mbyte/sec, Ultra160 SCSI at 160Mbyte/sec, 2 Gigabit fiber channel (which is a serial bus; it equates to around 200Mbyte/sec after encoding overhead), and 3 Gigabit SATA (around 300Mbyte/sec).
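Rotation speed translates directly into waiting time: on average the drive waits half a turn for the right sector to come around under the head. A minimal Python sketch of that arithmetic follows (the function name is made up, and seek time is a separate delay on top of this).

```python
def average_rotational_latency_ms(rpm):
    """Average wait for a sector: half of one full rotation, in milliseconds."""
    seconds_per_rotation = 60.0 / rpm
    return seconds_per_rotation / 2 * 1000


for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm} rpm: {average_rotational_latency_ms(rpm):.2f} ms")
```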
There are other numbers whose effects can be compared all day long, much the same as an engine's number of cylinders, stroke length, displacement, bore, fuel octane, number of valves, and innumerable other aspects are routinely debated amongst car enthusiasts. These are just the most common ones. Also important to note is that these are all specifications of particular parts of a computer, not a single measure of the system as a whole. For that, people use benchmarks.
Benchmarks
A benchmark is some single-number metric to say 'my computer is faster than yours'. Advertisers and sales people love them. Unless you fully understand how they are derived, they are meaningless. Continuing the car analogy, the zero-to-60 time is a benchmark, as is the horsepower or the torque.
Gamers often use 3DMark scores (which involve timings of 3D geometry computations) to rate performance, but this is mainly a measure of their graphics subsystem. Also, some graphics card vendors (ATI and nVidia are particularly bad about this) will tune and tweak their card designs to match the way the benchmarks are measured to edge out a better score, but not necessarily better overall real-world performance.
Oh, and you'll learn more than you ever wanted to know about all of this stuff (and many other things) in CS3421 (Computer Organization) and CS4411 (Intro to Operating Systems).