AltiVec and Dual Processor Performance Scores For XLR8 G4 ZIF Processor Upgrades

According to Apple, these applications, to some extent, take advantage of the Velocity Engine (AltiVec Instructions) of the G4 processor. So you should see some performance improvement running these applications if you have a G4 machine or upgrade. How much improvement depends on what parts of the application have been written to utilize the Velocity Engine. The AltiVec instruction set was developed primarily to speed up multimedia and graphics performance. Please see the note below from Glenn Fisher of Apple Performance Product Marketing for more detailed information on the benefits and limitations of the G4's Velocity Engine.

The benchmark scores below are, for the most part, estimates. Though they are based on real numbers, we did not test each individual machine, but relied instead on what we know about the performance of G4 upgrade cards and the relative performance of each Macintosh model. The dual G4 processor scores are based solely on the claim by MAXON that their Cinebench product will run 80% faster on a dual processor machine than on a single processor one. We hope to put this claim to the test soon.

In deriving these numbers we tried to be conservative in our estimates, so that if you are surprised, you will be pleasantly so. That said, in evaluating the results, keep in mind that they are "in the ballpark". Also, only certain applications take advantage of the AltiVec instructions of the G4, and only certain functions of these applications are sped up (for instance, certain Photoshop filters get a performance boost from the G4, while other filters show only the same performance that you would get from a similarly clocked G3).

At the moment, dual processor aware applications are even fewer. This should change once OS X is released and more OS X savvy applications come to market.

For the results below, we based the scores on the knowledge that a G4 running at the same clock speed as a G3 is 43% faster at rendering images with Photoshop's AltiVec-enhanced filters, 128% faster when encoding in QuickTime (Sorenson), and 38% faster when using SoundJam to encode MP3.
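As a rough illustration of how these three figures can be turned into an estimate, here is a minimal Python sketch. It assumes that "N% faster" means the G4 completes (1 + N/100) times as much work per second as the G3, so the task time is divided by that factor. The function name, the task labels, and the 100-second G3 baseline are hypothetical and are not measurements from our tests.

```python
# Speedups quoted in the text for a G4 versus a G3 at the same clock speed.
G4_SPEEDUP_PERCENT = {
    "photoshop_altivec_filters": 43,
    "quicktime_sorenson_encode": 128,
    "soundjam_mp3_encode": 38,
}


def estimate_g4_time(g3_seconds, task):
    """Estimate the G4 task time from a G3 task time.

    Assumption: "N% faster" is read as (1 + N/100) times the throughput,
    so the elapsed time shrinks by that same factor.
    """
    factor = 1 + G4_SPEEDUP_PERCENT[task] / 100
    return g3_seconds / factor


# Hypothetical baseline: a task that takes 100 seconds on the G3.
for task in G4_SPEEDUP_PERCENT:
    g4_seconds = estimate_g4_time(100.0, task)
    print(f"{task}: 100.0 s on G3 -> {g4_seconds:.1f} s on G4")
```

Under this reading, a "128% faster" encode takes a little under half the G3 time, not less than zero time, which is why the factor is applied as a divisor.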

We do not, as of yet, have any measured performance scores for the G4 in a dual processor configuration, but hope to add those scores soon.

A note from the MotherShip:

MSZ: Is there a limit to what the AltiVec instructions can accelerate? In other words, could someone rewrite an application to make all of it AltiVec savvy, or can only certain portions of an application be accelerated? If only certain parts can be accelerated, what is this limited to?

GF: Yes and yes. AltiVec can only provide acceleration based on its ability to handle complex instructions and up to 4 words of data at a time; theoretically, if you were able to parallelize all your data, and it was all half-word size, you would get 8x [performance]. AltiVec also has a few instructions that essentially do two instructions in one cycle, so you can theoretically get 2 instructions per cycle speed up over non-AltiVec code. However, this assumes your code can be rewritten to take advantage of AltiVec. Many things can't. AltiVec was designed to speed up multimedia-type operations, such as transparency, en- and de-coding, graphics operations, 3D, etc. What developers generally find is they can accelerate the part of their code that does these things anywhere from 1.5 to 16x. But there's a lot of 'overhead' code that can't be accelerated, so the final acceleration ends up being substantially less. For example, reading or writing files to disk cannot be accelerated--it's dependent on the file system and disk performance.

Glenn Fisher
Performance Product Marketing, Apple Computer
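The two points in the note above, the theoretical 8x for half-word data and the way 'overhead' code drags down the final figure, can be sketched in a few lines of Python. The lane count follows from the 4 words (128 bits) handled at a time; the overall figure uses Amdahl's law, which is our choice of model and not something Apple names. The function names and the example fraction of 50% accelerated code are hypothetical.

```python
VECTOR_BITS = 128  # 4 words of 32 bits each, as described in the note


def lanes(element_bits):
    """Number of data elements processed at once for a given element size."""
    return VECTOR_BITS // element_bits


def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when only part of the run time
    is accelerated by kernel_speedup and the rest runs at the old speed."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


# Half-word (16-bit) data fills 8 lanes; full-word (32-bit) data fills 4.
print(lanes(16), lanes(32))

# Hypothetical case: half of the run time is vectorizable and gets the
# full 8x; the other half (file I/O, bookkeeping) is not accelerated.
print(f"{overall_speedup(0.5, 8):.2f}x overall")
```

Even with a perfect 8x on the accelerated half, the unaccelerated half dominates the run time, which matches the note's remark that the final acceleration ends up being substantially less.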

MSZ writes: We are doing some benchmarking using Cinema 4D running on machines upgraded with G4s. Our understanding is that Cinema is AltiVec savvy; however, we are not seeing any improvement over a similarly clocked G3. We are using 3 of the sample files and batch rendering them. What parts of Cinema are accelerated for the G4? Can you offer us any advice on how to gauge the performance of the G4 when using your product?

Maxon Computer: Yes, XL6 is optimized for the use of the Velocity Engine of the G4 processor. However, if you try to compare XL6 on a G3 and a G4 at the same clock speed, you might get different results, ranging from no acceleration to up to 50% faster rendering.

Why? The Velocity Engine is mainly designed to speed up certain mathematical functions, which are mostly designed for speeding up playing back video, encoding and decoding MPEG (DVD video, MP3), and also for game acceleration. All these tasks require only a limited accuracy of 64 Bit. A raytracer like CINEMA 4D needs a much higher precision and therefore calculates mostly with 256 Bit accuracy. The Velocity Engine is unfortunately no help here. But there are certain areas in the software which don't need this high precision, where 64 Bit calculation is enough (e.g. the noise in lights and some other things).

It depends very much on what you have inside your test scene. The size of the rendered image, the size and format of the textures, the models, whether your objects are parametric or polygonized, and lots more all have an influence on the amount of acceleration.

These behaviors and limitations of the Velocity Engine are confirmed by Apple's developer support.

Paul Babb, Maxon Computer, Inc

