The age of massively parallel and concurrent programming has begun [1]. On this page I show some demonstrations of it.
I chose a ray tracer as the example application. A number of spheres are drawn in an otherwise empty space lit by some point lights. The camera looks from the outside toward the center of the scene and moves around the scene in a circle. An environment map (texture) is also used. A ray tracer is straightforward to parallelize: split the screen into disjoint areas, calculate each area in parallel, and join the results afterwards into the final picture.
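To make the splitting step concrete, here is a minimal sketch in C of one way to cut the image into disjoint horizontal bands. The `Band` type and the `band_for` function are names invented for this illustration and are not taken from the actual ray tracer; the choice of horizontal bands (rather than tiles or columns) is also an assumption.

```c
#include <assert.h>

/* A horizontal band of the image: rows [y_begin, y_end).
   Hypothetical helper type, used only for this illustration. */
typedef struct {
    int y_begin;
    int y_end;
} Band;

/* Return band number `index` out of `parts` for an image with `height` rows.
   Leftover rows are handed to the first bands, so band sizes differ by at
   most one row and every row belongs to exactly one band. */
Band band_for(int height, int parts, int index)
{
    assert(height >= 0 && parts > 0 && index >= 0 && index < parts);

    int base  = height / parts;   /* rows every band gets            */
    int extra = height % parts;   /* bands that get one row more     */

    Band b;
    b.y_begin = index * base + (index < extra ? index : extra);
    b.y_end   = b.y_begin + base + (index < extra ? 1 : 0);
    return b;
}
```

With 10 rows and 3 parts this yields the bands [0, 4), [4, 7) and [7, 10). Because the bands do not overlap, each worker can write its pixels into a shared frame buffer without locking, and "joining" the results is simply waiting until all workers are done.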
Not all problems are that easy. See, for example, the 13 patterns ("dwarfs") of parallel algorithm design by David Patterson et al.
The following screenshots show this ray tracer running on a Mac.
On a multi-core processor, each core can process a part of the rendered image. For better scheduling, the work has to be divided at a finer grain: threads.
The following films show the ray tracer using the pthreads library. The film on the left was recorded on an Intel Core 2 Duo at 2.2 GHz and the film on the right on an Intel Core i7 at 2.67 GHz.
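As a rough illustration of how pthreads can be used here, the following sketch lets a number of worker threads pull image rows from a shared counter until the picture is complete. This is one possible scheme and not necessarily the one used in the films. The names `RenderJob`, `trace_pixel`, `worker` and `render_parallel` are made up for this example, and `trace_pixel` is only a placeholder that returns a gradient instead of tracing a ray.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Shared state of one render pass (illustrative, not the original code). */
typedef struct {
    int width, height;
    uint32_t *pixels;       /* width * height pixels, row-major     */
    int next_row;           /* next row that nobody has claimed yet */
    pthread_mutex_t lock;   /* protects next_row                    */
} RenderJob;

/* Placeholder for the real per-pixel ray trace: encodes x and y as a color. */
static uint32_t trace_pixel(int x, int y)
{
    return ((uint32_t)(x & 0xFF) << 8) | (uint32_t)(y & 0xFF);
}

/* Each worker repeatedly claims one row and renders it. */
static void *worker(void *arg)
{
    RenderJob *job = arg;
    for (;;) {
        pthread_mutex_lock(&job->lock);
        int y = job->next_row++;
        pthread_mutex_unlock(&job->lock);

        if (y >= job->height)
            break;                      /* no rows left */

        for (int x = 0; x < job->width; x++)
            job->pixels[(size_t)y * job->width + x] = trace_pixel(x, y);
    }
    return NULL;
}

/* Render into `pixels` with up to `num_threads` workers.
   Returns 0 on success, -1 if no worker could be started. */
int render_parallel(uint32_t *pixels, int width, int height, int num_threads)
{
    if (num_threads < 1)
        return -1;

    RenderJob job = { .width = width, .height = height,
                      .pixels = pixels, .next_row = 0 };
    if (pthread_mutex_init(&job.lock, NULL) != 0)
        return -1;

    pthread_t *threads = malloc((size_t)num_threads * sizeof *threads);
    if (threads == NULL) {
        pthread_mutex_destroy(&job.lock);
        return -1;
    }

    /* Start as many workers as possible; one is enough to finish the job. */
    int started = 0;
    while (started < num_threads &&
           pthread_create(&threads[started], NULL, worker, &job) == 0)
        started++;

    /* Joining the threads is the "join the results" step. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&job.lock);
    free(threads);
    return started > 0 ? 0 : -1;
}
```

Handing out single rows instead of one fixed block per thread keeps all cores busy even when some parts of the image are more expensive than others, at the cost of one short lock per row. The sketch has to be compiled and linked with pthread support (for example `-pthread` with GCC or Clang).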
The Cell Broadband Engine is a multi-core processor. One of the cores, the so-called PPE, is a general-purpose processor that can handle I/O, memory, etc. There are 6 so-called SPEs that are specialized for number crunching. All the cores are 128-bit SIMD.
So basically there are two ways to parallelize here.
At the time of writing I have only implemented the first one. The following film shows the ray tracer in action.
The ray tracer simply splits the screen into n parts and uses an SPE for each part.
This is not ready yet.
The source is available on request.