8086-Performance

Performance

Although partly overshadowed by other design choices in this particular chip, the multiplexed bus limited performance slightly; each transfer of a 16-bit or 8-bit quantity took a four-clock memory access cycle.[10] Because instructions varied from 1 to 6 bytes in length, fetch and execution were made concurrent (as they remain in today's x86 processors): the bus interface unit fed the instruction stream to the execution unit through a 6-byte prefetch queue (a form of loosely coupled pipelining). This sped up operations on registers and immediates, but memory operations unfortunately became slower (four years later, this performance problem was fixed with the 80186 and 80286). However, the full (instead of partial) 16-bit architecture with a full-width ALU meant that 16-bit arithmetic instructions could now be performed in a single ALU cycle (instead of two, linked via carry), speeding up such instructions considerably. Combined with the orthogonalization of operations versus operand types and addressing modes, as well as other enhancements, this made the performance gain over the 8080 or 8085 fairly significant, despite cases where the older chips may be faster.
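The difference between one ALU cycle and two can be illustrated with a small Python sketch. It is only a rough model under stated assumptions: the function names and the "ALU passes" counter are hypothetical, introduced here for illustration, and the counter does not correspond to real instruction timings on any of the chips mentioned. It shows a 16-bit addition done as two 8-bit passes linked by carry, next to the same addition done in a single 16-bit pass.

```python
def add16_via_8bit_alu(a, b):
    """Add two 16-bit values with an 8-bit-wide ALU (illustrative model).

    Returns (result, carry_out, alu_passes).
    """
    # Pass 1: add the low bytes; bit 8 of the sum is the carry.
    lo = (a & 0xFF) + (b & 0xFF)
    carry = lo >> 8
    # Pass 2: add the high bytes plus the carry from pass 1.
    hi = ((a >> 8) & 0xFF) + ((b >> 8) & 0xFF) + carry
    result = ((hi & 0xFF) << 8) | (lo & 0xFF)
    return result, hi >> 8, 2


def add16_via_16bit_alu(a, b):
    """Add two 16-bit values with a full-width 16-bit ALU (illustrative model).

    Returns (result, carry_out, alu_passes).
    """
    total = (a & 0xFFFF) + (b & 0xFFFF)
    return total & 0xFFFF, total >> 16, 1


if __name__ == "__main__":
    for a, b in [(0x12FF, 0x0001), (0xFFFF, 0x0001)]:
        narrow = add16_via_8bit_alu(a, b)
        wide = add16_via_16bit_alu(a, b)
        # Both paths agree on result and carry; only the pass count differs.
        print(f"{a:#06x} + {b:#06x}: narrow={narrow}, wide={wide}")
```

Both functions return the same sum and carry flag for any pair of 16-bit inputs; the narrow version simply needs a second pass to propagate the carry from the low byte into the high byte.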
