RAM and ROM
The previous section talked about the address and data buses, as well as the RD and WR lines. These buses and lines connect either to RAM or ROM -- generally both. In our sample microprocessor, we have an address bus 8 bits wide and a data bus 8 bits wide. That means that the microprocessor can address 2^8 = 256 bytes of memory, and it can read or write 8 bits of the memory at a time. Let's assume that this simple microprocessor has 128 bytes of ROM starting at address 0 and 128 bytes of RAM starting at address 128.
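The arithmetic and the memory layout assumed above can be checked with a few lines of Python. This is only a sketch: the names `ADDRESS_BITS`, `ADDRESS_SPACE`, and `region` are invented for illustration and are not part of any real chip's documentation.

```python
# Sketch of the sample layout: 8-bit address bus, ROM at 0-127, RAM at 128-255.
# All names here are hypothetical, chosen only for this illustration.
ADDRESS_BITS = 8
ADDRESS_SPACE = 2 ** ADDRESS_BITS      # number of distinct addresses
ROM_START, ROM_SIZE = 0, 128
RAM_START, RAM_SIZE = 128, 128


def region(address):
    """Return which chip a given address selects in the sample layout."""
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("address does not fit on an 8-bit address bus")
    if ROM_START <= address < ROM_START + ROM_SIZE:
        return "ROM"
    return "RAM"


print(ADDRESS_SPACE)                                      # 256
print(region(0), region(127), region(128), region(255))  # ROM ROM RAM RAM
```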
ROM stands for read-only memory. A ROM chip is programmed with a permanent collection of pre-set bytes. The address bus tells the ROM chip which byte to select. When the RD line changes state, the ROM chip places the selected byte on the data bus.
RAM stands for random-access memory. RAM contains bytes of information, and the microprocessor can read or write to those bytes depending on whether the RD or WR line is signaled. One problem with today's RAM chips is that they forget everything once the power goes off. That is why the computer needs ROM.
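The difference between the two kinds of memory can be modeled roughly as follows. This is a minimal sketch that assumes the sample layout (ROM at addresses 0 to 127, RAM at 128 to 255); the `Memory` class and its `read` and `write` methods are hypothetical stand-ins for the RD and WR lines, not a description of real hardware.

```python
class Memory:
    """Toy model: 128 bytes of ROM at address 0, 128 bytes of RAM at 128."""

    def __init__(self, rom_image):
        if len(rom_image) > 128:
            raise ValueError("ROM image is larger than 128 bytes")
        # ROM contents are fixed when the chip is programmed.
        self.rom = bytes(rom_image).ljust(128, b"\x00")
        # RAM starts out blank, as it would after a power cycle.
        self.ram = bytearray(128)

    def read(self, address):
        """Stand-in for signaling RD: return the byte at the address."""
        if not 0 <= address <= 255:
            raise ValueError("address out of range")
        if address < 128:
            return self.rom[address]
        return self.ram[address - 128]

    def write(self, address, value):
        """Stand-in for signaling WR: only RAM accepts the byte."""
        if not 0 <= address <= 255:
            raise ValueError("address out of range")
        if address < 128:
            raise PermissionError("ROM cannot be written")
        self.ram[address - 128] = value & 0xFF


mem = Memory([3, 1])
mem.write(128, 42)
print(mem.read(0), mem.read(128))  # 3 42
```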
By the way, nearly all computers contain some amount of ROM (it is possible to create a simple computer that contains no separate RAM chip -- many microcontrollers do this by placing a handful of RAM bytes on the processor chip itself -- but generally impossible to create one that contains no ROM). On a PC, the ROM is called the BIOS (Basic Input/Output System). When the microprocessor starts, it begins executing instructions it finds in the BIOS. The BIOS instructions do things like test the hardware in the machine, and then they go to the hard disk to fetch the boot sector (see How Hard Disks Work for details). This boot sector is another small program, and the BIOS stores it in RAM after reading it off the disk. The microprocessor then begins executing the boot sector's instructions from RAM. The boot sector program will tell the microprocessor to fetch something else from the hard disk into RAM, which the microprocessor then executes, and so on. This is how the microprocessor loads and executes the entire operating system.
Understanding Microprocessor Instructions
Even the incredibly simple microprocessor shown in the previous example will have a fairly large set of instructions that it can perform. The collection of instructions is implemented as bit patterns, each one of which has a different meaning when loaded into the instruction register. Humans are not particularly good at remembering bit patterns, so a set of short words is defined to represent the different bit patterns. This collection of words is called the assembly language of the processor. An assembler can translate the words into their bit patterns very easily, and then the output of the assembler is placed in memory for the microprocessor to execute.
Here's the set of assembly language instructions that the designer might create for the simple microprocessor in our example:
- LOADA mem - Load register A from memory address
- LOADB mem - Load register B from memory address
- CONB con - Load a constant value into register B
- SAVEB mem - Save register B to memory address
- SAVEC mem - Save register C to memory address
- ADD - Add A and B and store the result in C
- SUB - Subtract A and B and store the result in C
- MUL - Multiply A and B and store the result in C
- DIV - Divide A and B and store the result in C
- COM - Compare A and B and store the result in test
- JUMP addr - Jump to an address
- JEQ addr - Jump, if equal, to address
- JNEQ addr - Jump, if not equal, to address
- JG addr - Jump, if greater than, to address
- JGE addr - Jump, if greater than or equal, to address
- JL addr - Jump, if less than, to address
- JLE addr - Jump, if less than or equal, to address
- STOP - Stop execution
If you have read How C Programming Works, then you know that this simple piece of C code will calculate the factorial of 5 (where the factorial of 5 = 5! = 5 * 4 * 3 * 2 * 1 = 120):
At the end of the program's execution, the variable f contains the factorial of 5.
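For readers who want to try the calculation, a loop of the kind described can be sketched in Python. This is not the article's C listing. The variable `f` is named in the text, while the counter name `a` is an assumption made for this illustration.

```python
# Factorial of 5 computed with a simple counting loop.
a = 1           # loop counter (name assumed for this sketch)
f = 1           # running product; ends up holding 5!
while a <= 5:
    f = f * a
    a = a + 1

print(f)  # 120
```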
A C compiler translates this C code into assembly language. Assuming that RAM starts at address 128 in this processor, and ROM (which contains the assembly language program) starts at address 0, then for our simple microprocessor the assembly language might look like this:
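To make the idea concrete, here is one possible factorial program written with the instruction set listed earlier, together with a small Python interpreter that runs it. Everything in it is an assumption made for illustration: the placement of the counter at address 128 and the result at address 129, the use of instruction numbers (rather than byte addresses) as jump targets, and the exact instruction sequence, which is not necessarily the listing a real compiler would produce.

```python
import operator

RAM_BASE = 128  # RAM occupies addresses 128-255 in the sample layout

# Hypothetical program: counter at address 128, result at address 129.
PROGRAM = [
    ("CONB", 1),      # 0: B = 1
    ("SAVEB", 128),   # 1: counter = 1
    ("SAVEB", 129),   # 2: result = 1 (B still holds 1)
    ("LOADA", 128),   # 3: A = counter
    ("CONB", 5),      # 4: B = 5
    ("COM", None),    # 5: compare A with B
    ("JG", 16),       # 6: if counter > 5, jump to STOP
    ("LOADA", 129),   # 7: A = result
    ("LOADB", 128),   # 8: B = counter
    ("MUL", None),    # 9: C = A * B
    ("SAVEC", 129),   # 10: result = C
    ("LOADA", 128),   # 11: A = counter
    ("CONB", 1),      # 12: B = 1
    ("ADD", None),    # 13: C = A + B
    ("SAVEC", 128),   # 14: counter = C
    ("JUMP", 3),      # 15: back to the comparison
    ("STOP", None),   # 16: halt
]

ALU = {"ADD": operator.add, "SUB": operator.sub,
       "MUL": operator.mul, "DIV": operator.floordiv}
JUMPS = {"JUMP": lambda t: True, "JEQ": lambda t: t == 0,
         "JNEQ": lambda t: t != 0, "JG": lambda t: t > 0,
         "JGE": lambda t: t >= 0, "JL": lambda t: t < 0,
         "JLE": lambda t: t <= 0}


def run(program):
    """Execute the program and return the final contents of RAM."""
    ram = [0] * 128
    a = b = c = 0
    test = 0   # -1, 0 or 1 after COM: A less than, equal to, greater than B
    pc = 0     # index of the next instruction

    def slot(address):
        if not RAM_BASE <= address <= 255:
            raise ValueError("data address is outside RAM")
        return address - RAM_BASE

    while True:
        op, arg = program[pc]
        pc += 1
        if op == "LOADA":
            a = ram[slot(arg)]
        elif op == "LOADB":
            b = ram[slot(arg)]
        elif op == "CONB":
            b = arg & 0xFF
        elif op == "SAVEB":
            ram[slot(arg)] = b
        elif op == "SAVEC":
            ram[slot(arg)] = c
        elif op in ALU:
            c = ALU[op](a, b) & 0xFF   # keep results to 8 bits
        elif op == "COM":
            test = (a > b) - (a < b)
        elif op in JUMPS:
            if JUMPS[op](test):
                pc = arg
        elif op == "STOP":
            return ram
        else:
            raise ValueError("unknown instruction: " + op)


ram = run(PROGRAM)
print(ram[0], ram[1])  # 6 120
```

The result 120 fits in one byte, so the 8-bit masking does not affect this run. A factorial of 6 or more would overflow a single 8-bit location in this sketch.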
So now the question is, "How do all of these instructions look in ROM?" Each of these assembly language instructions must be represented by a binary number. For the sake of simplicity, let's assume each assembly language instruction is given a unique number, like this:
- LOADA mem - 1
- LOADB mem - 2
- CONB con - 3
- SAVEB mem - 4
- SAVEC mem - 5
- ADD - 6
- SUB - 7
- MUL - 8
- DIV - 9
- COM - 10
- JUMP addr - 11
- JEQ addr - 12
- JNEQ addr - 13
- JG addr - 14
- JGE addr - 15
- JL addr - 16
- JLE addr - 17
- STOP - 18
The numbers are known as opcodes. In ROM, our little program would look like this:
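As a rough illustration of how mnemonics become bytes, here is a minimal assembler sketch that uses the opcode numbers listed above. It assumes that each instruction is stored as one opcode byte followed, when the instruction takes an operand, by one operand byte. That encoding, and the names `OPCODES`, `TAKES_OPERAND`, and `assemble`, are assumptions made for this example.

```python
# Opcode numbers as listed in the text.
OPCODES = {
    "LOADA": 1, "LOADB": 2, "CONB": 3, "SAVEB": 4, "SAVEC": 5,
    "ADD": 6, "SUB": 7, "MUL": 8, "DIV": 9, "COM": 10,
    "JUMP": 11, "JEQ": 12, "JNEQ": 13, "JG": 14, "JGE": 15,
    "JL": 16, "JLE": 17, "STOP": 18,
}

# Instructions that carry a one-byte operand (assumed encoding).
TAKES_OPERAND = {
    "LOADA", "LOADB", "CONB", "SAVEB", "SAVEC",
    "JUMP", "JEQ", "JNEQ", "JG", "JGE", "JL", "JLE",
}


def assemble(lines):
    """Translate lines such as 'CONB 1' into the bytes stored in ROM."""
    rom = []
    for line in lines:
        parts = line.split()
        mnemonic, operands = parts[0], parts[1:]
        expected = 1 if mnemonic in TAKES_OPERAND else 0
        if len(operands) != expected:
            raise ValueError("wrong operand count in: " + line)
        rom.append(OPCODES[mnemonic])
        if operands:
            rom.append(int(operands[0]))
    return bytes(rom)


print(list(assemble(["CONB 1", "SAVEB 128", "STOP"])))  # [3, 1, 4, 128, 18]
```

Under this encoding, an instruction with an operand occupies two bytes and one without occupies a single byte, which is why a program takes more bytes in ROM than it has lines of assembly language.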
You can see that seven lines of C code became 17 lines of assembly language, and that became 31 bytes in ROM.
The instruction decoder needs to turn each of the opcodes into a set of signals that drive the different components inside the microprocessor. Let's take the ADD instruction as an example and look at what it needs to do:
- During the first clock cycle, we need to actually load the instruction. Therefore the instruction decoder needs to:
  - activate the tri-state buffer for the program counter
  - activate the RD line
  - activate the data-in tri-state buffer
  - latch the instruction into the instruction register
- During the second clock cycle, the ADD instruction is decoded. It needs to do very little:
  - set the operation of the ALU to addition
  - latch the output of the ALU into the C register
- During the third clock cycle, the program counter is incremented (in theory this could be overlapped into the second clock cycle).
Every instruction can be broken down as a set of sequenced operations like these that manipulate the components of the microprocessor in the proper order. Some instructions, like this ADD instruction, might take two or three clock cycles. Others might take five or six clock cycles.
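The three clock cycles listed for ADD can be sketched as a table of control signals that a toy decoder steps through. The signal names, the `state` dictionary, and the `run_add` function are all hypothetical. Real hardware raises these signals on wires rather than running them one by one in software.

```python
# One list of control signals per clock cycle (names are illustrative only).
ADD_CYCLES = [
    # Cycle 1: fetch the instruction.
    ["enable_pc_buffer", "assert_rd",
     "enable_data_in_buffer", "latch_instruction_register"],
    # Cycle 2: perform the addition.
    ["set_alu_add", "latch_c"],
    # Cycle 3: move on to the next instruction.
    ["increment_pc"],
]


def run_add(state, rom):
    """Apply the ADD signal sequence to a dictionary of register values."""
    for signals in ADD_CYCLES:
        for signal in signals:
            if signal == "enable_pc_buffer":
                state["address_bus"] = state["pc"]
            elif signal == "assert_rd":
                state["data_bus"] = rom[state["address_bus"]]
            elif signal == "enable_data_in_buffer":
                state["internal_bus"] = state["data_bus"]
            elif signal == "latch_instruction_register":
                state["ir"] = state["internal_bus"]
            elif signal == "set_alu_add":
                state["alu_out"] = (state["a"] + state["b"]) & 0xFF
            elif signal == "latch_c":
                state["c"] = state["alu_out"]
            elif signal == "increment_pc":
                state["pc"] = (state["pc"] + 1) & 0xFF
    return state


# ROM holds a single ADD instruction (opcode 6) at address 0.
state = run_add({"pc": 0, "a": 2, "b": 3, "c": 0}, [6])
print(state["ir"], state["c"], state["pc"])  # 6 5 1
```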
The number of transistors available has a huge effect on the performance of a processor. As seen earlier, a typical instruction in a processor like an 8088 took 15 clock cycles to execute. Because of the design of the multiplier, it took approximately 80 cycles just to do one 16-bit multiplication on the 8088. With more transistors, much more powerful multipliers capable of single-cycle speeds become possible.
More transistors also allow for a technology called pipelining. In a pipelined architecture, instruction execution overlaps. So even though it might take five clock cycles to execute each instruction, there can be five instructions in various stages of execution simultaneously. That way it looks like one instruction completes every clock cycle.
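The benefit can be estimated with simple arithmetic. The sketch below assumes an idealized pipeline with no stalls or branch penalties, which real processors do not achieve; the function names are invented for this example.

```python
def unpipelined_cycles(instructions, stages=5):
    """Each instruction runs start to finish before the next one begins."""
    return instructions * stages


def pipelined_cycles(instructions, stages=5):
    """Ideal pipeline: fill it once, then one instruction completes per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


print(unpipelined_cycles(1000), pipelined_cycles(1000))  # 5000 1004
```

For a long run of instructions, the pipelined count approaches one cycle per instruction, which is the effect the paragraph above describes.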
Many modern processors have multiple instruction decoders, each with its own pipeline. This allows for multiple instruction streams, which means that more than one instruction can complete during each clock cycle. This technique can be quite complex to implement, so it takes lots of transistors.
The trend in processor design has been toward full 32-bit ALUs with fast floating point processors built in and pipelined execution with multiple instruction streams. There has also been a tendency toward special instructions (like the MMX instructions) that make certain operations particularly efficient. There has also been the addition of hardware virtual memory support and L1 caching on the processor chip. All of these trends push up the transistor count, leading to the multi-million transistor powerhouses available today. These processors can execute about one billion instructions per second!