There is only one optimal bus speed for each
given CPU speed for G3 and G4 processors. This is a particularly
important consideration when upgrading computers that use
FPM (Fast Page Mode) or EDO (Extended Data Out) memory. Some
G3 and G4 upgrade cards offer user-adjustable bus speed selections.
This is a carry-over from older system designs and is not
appropriate for designs based on the G3 or G4 PowerPC processor.
There is a common misconception that the
faster the bus speed, the better the performance. It may seem
intuitive that clocking your computer's system
bus faster would improve performance. However, for G3- and
G4-based computer systems this may actually reduce performance.
In G3 and G4 systems the L2 cache is located
on a dedicated bus. This leaves the system memory (DRAM) as
the most performance-critical device on the system bus. Any
access to DRAM involves a large number of wait states, so
the trick is to use the time efficiently. To do this, one needs
to pick bus frequencies whose clock period (1/frequency)
divides nicely into the time required to access the memory.
If you pick an unfortunate value, say one that requires 4.1
clock periods, you must set the memory timing to 5 clock periods
to avoid violating timing requirements, thus wasting 0.9 clock
periods. Therefore, there are certain "sweet spots" in bus
timing that give the best performance - a faster or slower
frequency is worse.
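The rounding effect described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the only figure taken from the text is the 4.1-period example.

```python
import math

def clocks_needed(access_periods):
    """Round a memory access time, expressed in bus clock periods,
    up to whole clocks and report the wasted part of a period."""
    # Small tolerance so an exact fit (e.g. 4.0) is not rounded up
    # because of floating-point noise.
    whole = math.ceil(access_periods - 1e-9)
    wasted = whole - access_periods
    return whole, wasted

whole, wasted = clocks_needed(4.1)
print(whole, round(wasted, 1))  # 5 clocks set, 0.9 of a period wasted
```

A bus frequency whose period divides evenly into the access time gives a wasted value of zero, which is what the text calls a sweet spot.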
This was not true with the traditional "look-aside"
L2 caches used in machines based on the PowerPC 601, 603 or
604 processor. In these machines the L2 cache was on the main
system bus and ran at a fixed ratio to the bus, so faster was
always better and DRAM timing was a very secondary effect.
Detailed Explanation:
Fast system bus speeds may cause a reduction
in both performance and stability when upgrading a PowerPC
601, 603 or 604 based Mac to a G3 or G4 PowerPC processor.
To understand this confusing issue, a basic understanding
of the dual bus G3 and G4 architecture is necessary.
Single Bus Processors (601, 603, 604)
Increasing system bus speed improves performance.
The PowerPC 601, 603 and 604 processors have
one bus. The I/O, Cache memory and system memory (SIMMs and
DIMMs) all share this bus. This is to say these devices share
the same physical connections for communication with the processor.
By far the fastest device on the bus is the L2 cache.
L2 cache is relatively small high speed static
memory. The cache controller predicts what data is likely
to be needed from the slower DRAM based system memory and
preloads this data to the faster cache memory.
A "cache hit" occurs when data requested by
the processor is found in the cache. In the case of a cache
hit the data will be accessed at full bus speed, usually between
25ns (40 MHz bus) and 17ns (60 MHz bus). For a single bus
system, increasing the bus speed gives better performance.
A "cache miss" occurs if the requested data
is not found in the cache memory. In the case of a cache miss
the data must be fetched from the slower system memory. In
a single bus system the fast cache memory along with the slower
memory, and other slower I/O devices, are connected to a common
bus. This mismatch in device speed and bus speed is handled
by adding "wait states" when accesses are made to the slower
devices. Once a bus cycle has started, wait states are added
by holding the state of the bus for one or more additional
bus clock cycles.
The likelihood of any particular access being
a cache hit or miss will vary depending on the cache load
algorithm and on the application itself. A typical system
will usually average a 70% to 80% cache hit rate making fast
cache accesses the majority of bus activity.
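To see why a faster clock helps a single bus system, the following Python sketch estimates the average access time from the hit rate. It is a simplified model, not a description of any real memory controller: the function name is hypothetical, a cache hit is assumed to cost one bus clock, and the miss costs of 3 and 4 clocks are assumed values chosen only for illustration.

```python
def average_access_ns(bus_mhz, hit_rate, miss_clocks):
    """Average access time on a single shared bus.

    Assumptions: a cache hit takes one bus clock; a miss takes
    miss_clocks bus clocks (the first clock plus wait states).
    """
    period_ns = 1000.0 / bus_mhz          # 40 MHz -> 25 ns
    hit_ns = period_ns
    miss_ns = miss_clocks * period_ns
    return hit_rate * hit_ns + (1.0 - hit_rate) * miss_ns

# 75% hit rate, in the middle of the 70% to 80% range given above.
slow = average_access_ns(40, 0.75, miss_clocks=3)
fast = average_access_ns(60, 0.75, miss_clocks=4)
print(f"40 MHz: {slow:.1f} ns, 60 MHz: {fast:.1f} ns")
```

Even though the faster bus is charged an extra clock per miss in this example, its average comes out lower because most accesses are cache hits that speed up with the bus.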
Dual Bus Processors (G3, G4)
Bus speed should be optimized for memory access time.
Upgrade cards based on the G3 or G4 processor
have two buses: one for the traditional I/O and system memory,
and a second bus dedicated only to the L2 cache. Because
the second bus is dedicated to high speed memory it allows
for much faster L2 cache accesses than a single bus system
could have. This high speed memory is soldered directly to
the G3 or G4 upgrade card so access times of 3ns to 8ns are
possible. This eliminates the need for a fast system bus clock
because the L2 cache is no longer on the system bus.
System DRAM is rated at an access time of either 60ns
or 70ns, so wait states are required to access this memory.
A faster system bus clock may actually reduce performance
due to the required additional wait states. For best performance
with a dual bus system such as a G3 or G4 upgrade card, it
is more important to match, or sync, the system bus frequency
with the speed of the system memory. Wait states can only
be added in increments equal to the period of the system bus
clock frequency. There is also a fixed overhead of about
5ns, due to the memory controller and other system components,
which must be accounted for.
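The interaction between bus period, wait states and overhead can be sketched as follows. The model is an assumption made for illustration, not a documented controller behavior: one DRAM access is taken to occupy one bus clock plus the wait states, and that total must cover the memory rating plus the roughly 5ns overhead. The function name is hypothetical.

```python
import math

def min_wait_states(bus_mhz, memory_ns, overhead_ns=5.0):
    """Return (wait_states, total_ns) for one DRAM access.

    Assumption: an access occupies 1 + wait_states bus clocks, and
    that total must cover the memory rating plus the fixed overhead.
    """
    period_ns = 1000.0 / bus_mhz
    required_ns = memory_ns + overhead_ns
    # Wait states come only in whole bus clocks, so round up.
    clocks = max(1, math.ceil(required_ns / period_ns - 1e-9))
    return clocks - 1, clocks * period_ns

for mhz in (45, 50):
    ws, total = min_wait_states(mhz, memory_ns=60)
    print(f"{mhz} MHz: {ws} wait states, {total:.1f} ns per access")
```

Under these assumptions the 50 MHz bus needs one more wait state than the 45 MHz bus for 60ns memory, and each access ends up taking longer despite the faster clock.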
How To Select The Optimal Bus Speed
The idea is to pick a system bus clock where
the access time set by a whole number of wait states, minus
about 5ns of overhead, comes closest to the access time rating
of the installed memory modules. This must be done without violating the memory
timing margins. For 60ns memory the best bus clock is around
45 MHz with two wait states. Increasing the bus much beyond
47 MHz will require an additional wait state, reducing overall
performance.
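As a rough check on these figures, the highest bus frequency that a given number of wait states can support may be worked out directly. The sketch below assumes that an access occupies one bus clock plus the wait states and that this total must cover the memory rating plus about 5ns of overhead; both the model and the function name are assumptions for illustration, not the method used by any particular upgrade card.

```python
def max_bus_mhz(wait_states, memory_ns, overhead_ns=5.0):
    """Highest bus frequency at which (1 + wait_states) bus clocks
    still cover memory_ns plus overhead_ns (assumed timing model)."""
    clocks = 1 + wait_states
    return clocks * 1000.0 / (memory_ns + overhead_ns)

for memory_ns in (60, 70):
    for ws in (1, 2, 3):
        limit = max_bus_mhz(ws, memory_ns)
        print(f"{memory_ns}ns DRAM, {ws} wait states: up to {limit:.1f} MHz")
```

For 60ns memory with two wait states this model puts the limit a little above 46 MHz, which is in line with the 45 MHz to 47 MHz range quoted above. Running slightly below the limit leaves some timing margin.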
MAXpowr SmartSet
Because wait states can be confusing for many
users to understand, the MAXpowr G3 and G4 cards come with
SmartSet technology. SmartSet is part of the control panel
which allows the user to specify the slowest speed of the
memory installed in the system. The SmartSet hardware and
software automatically configure the card and motherboard
system memory controller for the best performance. It turns
out that for any given CPU speed there is only one preferred,
or optimized, bus speed to achieve the best possible performance.
SDRAM Systems
The newest generation of computers from Apple
Computer uses SDRAM. The need to optimize for system memory
access remains important. However, for these systems the bus
clock speed is determined by the motherboard, not the CPU
upgrade card. Apple has designed the SDRAM memory system and
bus speed for the best performance. These designs are optimized
for moving blocks of data at full bus speed taking advantage
of the faster clock. They do, however, require several wait
states for random data access.
Additional Bus Speed Notes
Some upgrade card manufacturers may not have
the expertise or the technology to optimize the bus speed
and system memory timing for best performance. Worse yet,
they may inadvertently be pushing the system bus without making
the necessary corrections to the system memory timing. This
would over-clock the memory which could result in unreliable
operation or data corruption.
Not all benchmark tools will indicate a performance
increase from optimized memory access, because many benchmarks
run with a high L2 cache hit rate and seldom access system
memory. However, real world applications
will benefit from optimized system memory timing.
Conclusion
In a single bus system the advantage of a faster
bus outweighs the reduced system memory access speed due to
added wait states. However, when these systems are upgraded
to a G3 or G4 processor, the system bus clock speed should
be selected for optimal system memory access. Faster will
not always provide the best performance.
The MAXpowr SmartSet technology, included with
all Newer Technology MAXpowr G3 and G4 upgrade cards, automatically
optimizes memory timing without cumbersome switches or knobs.
MAXpowr upgrades offer the best possible performance, stability
and ease of use - exactly what every Mac user expects and
deserves.
Copyright 1996-2007 by Cider Press Publishing LLC all rights reserved.
MacSpeedZone is not authorized, sponsored, or otherwise approved by Apple
Computer. Apple, the Apple logo, Macintosh, iPod, iBook, iMac, eMac, and
PowerBook are registered trademarks of Apple Computer, Inc. Additional
company and product names may be trademarks or registered trademarks and
are hereby acknowledged.