by Darryl Hinshaw
Sometimes Slower Is Really Faster!
Fast system bus speeds may reduce both performance and stability when upgrading a PowerPC 601, 603 or 604 based Mac to a G3 card based on the PowerPC 750 processor. To make sense of this confusing issue, a basic understanding of the dual bus G3 architecture is necessary.
The PowerPC 601, 603 and 604 processors have one bus. The I/O, cache memory and system memory (SIMMs and DIMMs) all share this bus. That is to say, these devices share the same physical connections for communication with the processor. By far the fastest device on the bus is the L2 cache. L2 cache is a relatively small amount of high speed static memory. The cache controller predicts what data is likely to be needed from the slower DRAM based system memory and preloads this data into the faster cache memory.
A "cache hit" occurs when data requested by the processor is found in the cache. In the case of a cache hit the data will be accessed at full bus speed, usually between 25ns (40 MHz bus) and 17ns (60 MHz bus). For a single bus system, increasing the bus speed gives better performance.
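The cycle times quoted above follow directly from the bus frequency, since one bus cycle lasts the reciprocal of the clock rate. A minimal sketch of that arithmetic (the function name is only illustrative):

```python
def bus_period_ns(bus_mhz: float) -> float:
    """Return the length of one bus clock cycle in nanoseconds."""
    # 1 MHz corresponds to a 1000 ns period, so divide 1000 by the frequency.
    return 1000.0 / bus_mhz


for mhz in (40, 50, 60):
    print(f"{mhz} MHz bus: {bus_period_ns(mhz):.1f} ns per cycle")
```

A 40 MHz bus gives a 25 ns cycle, and a 60 MHz bus gives roughly 16.7 ns, which the article rounds to 17ns.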
A "cache miss" occurs if the requested data is not found in the cache memory. In the case of a cache miss the data must be fetched from the slower system memory. In a single bus system the fast cache memory along with the slower memory and other slower I/O devices are connected to a common bus. This mismatch in device speed and bus speed is handled by adding "wait states" when accesses are made to slower devices. Once a bus cycle has started, wait states are added by holding the state of the bus for one or more additional bus clock cycles.
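As a rough illustration of how wait states stretch an access, the sketch below assumes an access takes one base bus cycle plus one extra cycle per wait state. That "one base cycle" model is a simplifying assumption made here, not a figure from the article, and the function name is hypothetical.

```python
def access_time_ns(bus_mhz: float, wait_states: int) -> float:
    """Total bus time for one access: one base cycle plus the wait states.

    Assumes (for illustration only) that an access with zero wait states
    completes in exactly one bus clock cycle.
    """
    period_ns = 1000.0 / bus_mhz
    return (1 + wait_states) * period_ns


# On a 50 MHz bus each cycle is 20 ns, so every wait state adds 20 ns.
for ws in range(4):
    print(f"{ws} wait state(s): {access_time_ns(50, ws):.0f} ns")
```

The point of the sketch is that access time can only grow in whole bus cycles, so a slow device costs the same number of full cycles whether it is slightly or substantially slower than the bus.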
The likelihood of any particular access being a cache hit or miss will vary depending on the cache load algorithm and on the application itself. A typical system will usually average a 70% to 80% cache hit rate, making fast cache accesses the majority of bus activity.
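The effect of the hit rate can be estimated with a simple weighted average of hit and miss times. The sketch below uses the 70% and 80% hit rates mentioned above; the 20 ns hit time and 80 ns miss time are illustrative assumptions (a 50 MHz bus with a four cycle miss), not measurements from the article.

```python
def average_access_ns(hit_rate: float, hit_ns: float, miss_ns: float) -> float:
    """Weighted average of cache-hit and cache-miss access times."""
    return hit_rate * hit_ns + (1.0 - hit_rate) * miss_ns


# Illustrative numbers only: a hit takes one 20 ns cycle, a miss takes 80 ns.
for hit_rate in (0.70, 0.80):
    avg = average_access_ns(hit_rate, hit_ns=20.0, miss_ns=80.0)
    print(f"hit rate {hit_rate:.0%}: about {avg:.0f} ns per access")
```

Even with most accesses hitting the cache, the misses contribute a large share of the average, which is why the system memory timing still matters.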
The MAXpowr G3 cards from Newer Technology, based on the PowerPC 750 processor, have two buses: one for the traditional I/O and system memory, and a second dedicated solely to the L2 cache. Because the second bus is dedicated to high speed memory, it allows much faster L2 cache accesses than a single bus system could achieve. This high speed memory is soldered directly to the MAXpowr G3 card, so access times of 3ns to 8ns are possible. Because the L2 cache is no longer on the system bus, there is no longer any need for a fast system bus clock.
System DRAM has an access time of either 60ns or 70ns, so wait states are required to access this memory. A faster system bus clock may actually reduce performance because of the additional wait states it requires. For best performance with a dual bus system such as the MAXpowr G3, it is more important to match, or sync, the system bus frequency with the speed of the system memory. Wait states can only be added in increments equal to the period of the system bus clock. There is also a fixed overhead of about 5ns due to system components, which must be accounted for.
The idea is to pick a system bus clock at which the time spent on a memory access, including its wait states, minus the 5ns overhead, comes as close as possible to the access time rating of the installed memory modules. This must be done without violating the memory timing margins. For 60ns memory the best bus clock is around 45 MHz with two wait states. Increasing the bus much beyond 46 MHz will require an additional wait state, reducing overall performance.
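The 45 to 46 MHz figure can be reproduced with a small calculation. The sketch below is one reading of the rule described above, not the vendor's actual algorithm: it assumes an access takes one base bus cycle plus its wait states, and that the total must cover the memory's rated access time plus the 5ns overhead. The function names are hypothetical.

```python
import math

OVERHEAD_NS = 5.0  # fixed system overhead quoted in the article


def min_wait_states(bus_mhz: float, memory_ns: float,
                    overhead_ns: float = OVERHEAD_NS) -> int:
    """Fewest wait states that still meet the memory's rated access time.

    Assumed model: (1 + wait_states) * bus_period - overhead >= memory_ns.
    """
    period_ns = 1000.0 / bus_mhz
    cycles = math.ceil((memory_ns + overhead_ns) / period_ns)
    return max(cycles - 1, 0)


def memory_cycle_ns(bus_mhz: float, memory_ns: float) -> float:
    """Bus time actually spent on one memory access at that clock."""
    period_ns = 1000.0 / bus_mhz
    return (1 + min_wait_states(bus_mhz, memory_ns)) * period_ns


for mhz in (40, 45, 46, 47, 50):
    ws = min_wait_states(mhz, 60.0)
    total = memory_cycle_ns(mhz, 60.0)
    print(f"{mhz} MHz: {ws} wait states, {total:.1f} ns per access")
```

Under these assumptions, 60ns memory needs two wait states up to about 46 MHz and three at 47 MHz and above, so each access takes longer on the slightly faster bus than it does at 45 or 46 MHz.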
Because wait states can be confusing for many users, the MAXpowr G3 comes with a control panel which allows the user to specify the speed of the slowest memory installed. The hardware and software automatically configure the card and motherboard system memory for best performance. Setting the system bus to something other than the factory default, in the mid-forties MHz, may compromise the ability of the control panel to optimize for best performance.
Operating the bus at higher clock rates not only reduces memory performance but may cause other problems. The PowerPC 750 G3 processor has narrower timing margins than the 601, 603 and 604 processors for which these systems were designed. There is much less timing margin for pushing the system bus faster and, as mentioned above, there is no advantage in doing so.
As a side note, future machines will use a new kind of DRAM called Synchronous DRAM (SDRAM). These modules are optimized for moving blocks of data at full bus speed, taking full advantage of the faster clock. They do, however, require several wait states for random data access. The MAXpowr G3 upgrade cards are designed for the existing Macintosh and Macintosh clones, which use either Fast Page Mode (FPM) or Extended Data Out (EDO) memory, both of which require wait states as explained above.
Some upgrade card manufacturers may not have the expertise or the technology to optimize the bus speed and system memory timing for best performance. Worse yet, they may inadvertently be pushing the system bus without making the necessary corrections to the system memory timing. This will overclock the memory, which could result in unreliable operation or data corruption.
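To see why raising the bus clock without adjusting the wait states overclocks the memory, consider a rough margin check. It assumes an access lasts one base bus cycle plus its wait states and that the 5ns overhead mentioned earlier is subtracted from that time; both the model and the function name are assumptions made for illustration.

```python
def memory_margin_ns(bus_mhz: float, wait_states: int, memory_ns: float,
                     overhead_ns: float = 5.0) -> float:
    """Time left over after the memory's rated access time is satisfied.

    A negative result means the memory is being asked to respond faster
    than its rating, i.e. it is overclocked (under the assumed model).
    """
    period_ns = 1000.0 / bus_mhz
    return (1 + wait_states) * period_ns - overhead_ns - memory_ns


# Two wait states suit 60ns memory at 45 MHz, but not at 50 MHz.
for mhz in (45, 50):
    margin = memory_margin_ns(mhz, wait_states=2, memory_ns=60.0)
    status = "OK" if margin >= 0 else "memory overclocked"
    print(f"{mhz} MHz, 2 wait states: margin {margin:+.1f} ns ({status})")
```

With the wait states left at a setting tuned for 45 MHz, a 50 MHz bus leaves a negative margin in this model, which corresponds to the unreliable operation described above.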
Not all benchmark tools will show a performance increase from faster memory access, because the benchmark code itself runs with a high cache hit rate. However, real world applications will benefit from optimized system memory timing.
[Editor's Note: NewerTech obviously thinks their processor upgrades
have an advantage. Are you persuaded by this article? If you have any
comments or perspectives on what Mr. Hinshaw has written take it to our