Sunday, August 15, 2010

Explaining parallelism to my mother with the Mandelbrot demo

We have put online an Ateji PX version of the Mandelbrot set demo.

You specify the number of processor cores to be used for the computation using a slider (lower right), ranging from 1 to the number of available cores on your computer. My mom's PC has two cores:

Look 'ma, when the slider is on 1, you see only one guy painting. When the slider is on 2, you see two guys painting at the same time:

Now 'ma is finally able to proudly explain what her son is busy with. We're actually using the same demo with CTOs and high-ranking managers in large corporations.

For us developers, here's the code. Since dots are independent of each other, we use a simple for loop:

for (int x:nx, int y:ny){
   compute( x , y );
}

You can see the code used in the demo under the "Source Code" tab. Parallelizing the for loop is simply a matter of inserting parallel bars right after the for keyword:

for ||(int x:nx, int y:ny){
   compute( x , y );
}
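
If you don't have the Ateji PX compiler at hand, here is a rough analogue of the parallel loop in plain Java (8 or later), using a parallel stream over the pixel indices. This is only a sketch of the idea, not what Ateji PX generates: the class name, the render method, the body of compute, the mapping from pixels to the complex plane and MAX_ITER are all assumptions made up for illustration.

```java
import java.util.stream.IntStream;

public class ParallelMandelbrot {
    // Hypothetical iteration cap; the demo's actual value is not given in the post.
    static final int MAX_ITER = 256;

    // Escape-time count for pixel (x, y) on an nx-by-ny grid.
    // The grid is mapped (by assumption) onto [-2, 1) x [-1, 1).
    static int compute(int x, int y, int nx, int ny) {
        double cr = -2.0 + 3.0 * x / nx;
        double ci = -1.0 + 2.0 * y / ny;
        double zr = 0.0, zi = 0.0;
        int n = 0;
        while (n < MAX_ITER && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            n++;
        }
        return n;
    }

    // Computes every pixel, sequentially or in parallel.
    // Each index is written by exactly one task, so no locking is needed.
    static int[] render(int nx, int ny, boolean parallel) {
        int[] out = new int[nx * ny];
        IntStream indices = IntStream.range(0, nx * ny);
        if (parallel) {
            indices = indices.parallel();
        }
        indices.forEach(i -> out[i] = compute(i % nx, i / nx, nx, ny));
        return out;
    }
}
```

Because the dots are independent, the sequential and the parallel call produce the same array; only the elapsed time differs.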

The demo also shows some advanced features of loop splitting. By default, the work is split into blocks so that each processor gets one block. However, you can see in the Mandelbrot demo that some blocks take more time to compute than others (black dots take more time). The result is that although the work has been split across a number of parallel workers, we end up waiting for the one worker that has the most black dots. This is not the most efficient way of leveraging parallel hardware.
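
To put a number on that waiting, here is a small back-of-the-envelope sketch in plain Java. It assumes a simplified model: each item (say, a column of dots) has a known cost, and the items are cut into one contiguous block per worker. The class and method names are hypothetical and have nothing to do with the Ateji PX runtime.

```java
public class StaticSplit {
    // Cost of the most heavily loaded block when `costs` is cut into
    // `workers` contiguous blocks of (nearly) equal length.
    // With one block per worker, this is how long everyone ends up waiting.
    static long slowestBlock(int[] costs, int workers) {
        int n = costs.length;
        long worst = 0;
        for (int w = 0; w < workers; w++) {
            int from = (int) ((long) n * w / workers);
            int to = (int) ((long) n * (w + 1) / workers);
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += costs[i];
            }
            worst = Math.max(worst, sum);
        }
        return worst;
    }
}
```

With costs {1, 1, 1, 1, 9, 9, 1, 1} and two workers, the total work is 24, so a perfect split would take 12 units per worker. The static split gives blocks of cost 4 and 20, so the run takes 20 units while the first worker sits idle most of the time.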

In such cases, the solution is to split the work into smaller blocks. You can play with block splitting in the "Advanced Settings" area (lower left). The effect on the source code is to insert the corresponding #BlockSize annotation:

for ||(#BlockSize(30), int x:nx, int y:ny){
   compute( x , y );
}
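
For comparison, here is a minimal sketch of the same idea in plain Java: the index range is cut into blocks of a fixed size, and a fixed pool of workers picks up blocks as they become free. Whether Ateji PX schedules its blocks this way is an assumption on my part, not something taken from the manual, and BlockedLoop and forEachBlocked are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

public class BlockedLoop {
    // Runs body(i) for every i in [0, n), in blocks of `blockSize` indices,
    // on a pool of `workers` threads. Returns the number of blocks submitted.
    static int forEachBlocked(int n, int blockSize, int workers, IntConsumer body) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int start = 0; start < n; start += blockSize) {
                final int from = start;
                final int to = Math.min(n, start + blockSize);
                // One task per block; an idle worker takes the next queued block.
                pending.add(pool.submit(() -> {
                    for (int i = from; i < to; i++) {
                        body.accept(i);
                    }
                }));
            }
            // Wait for every block to finish before returning.
            for (Future<?> f : pending) {
                f.get();
            }
            return pending.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling forEachBlocked(100, 30, 2, body) submits four blocks (30, 30, 30 and 10 indices) to two workers. Smaller blocks even out the load because a worker that finishes a cheap block simply takes another one, at the price of a little more scheduling overhead per block.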

You can learn more about Ateji PX loop splitting in the language manual, downloadable from Ateji's web site.