Monday, August 30, 2010

Disruptive Technology at SC'10

Ateji PX has been selected for presentation at the Disruptive Technologies exhibit, part of the SC10 (Supercomputing 2010) conference.


"Each year, the SC Conference seeks out new technologies with the potential to disrupt the HPC landscape as we know it. Generally speaking, “disruptive technology” refers to drastic innovations in current practices such that they have the potential to completely transform the high-performance computing field as it currently exists — ultimately overtaking the incumbent technologies or software tools in the marketplace. For SC10, Disruptive Technologies examines new computing architectures and interfaces that will significantly impact the high-performance computing field throughout the next five to 15 years, but have not yet emerged in current systems. The Disruptive Technologies exhibits, located in the SC10 exhibit hall, will showcase technologies ranging from storage, programming, cooling and productivity software through presentations, demonstrations and an exhibit showcase.

Selected technologies for SC10 will be on display during regular exhibit hall hours. Please stop by the booth for more information on the presentations and demonstrations schedule."

See you in New Orleans, November 13-19.

Sunday, August 15, 2010

Explaining parallelism to my mother with the Mandelbrot demo

We have put online an Ateji PX version of the Mandelbrot set demo.

You specify the number of processor cores to be used for the computation using a slider (lower right), ranging from 1 to the number of available cores on your computer. My mom's PC has two cores:


Look 'ma, when the slider is on 1, you see only one guy painting. When the slider is on 2, you see two guys painting at the same time:


Now 'ma is finally able to proudly explain what keeps her son busy. We actually use the same demo with CTOs and high-ranking managers in large corporations.

For us developers, here's the code. Since dots are independent of each other, we use a simple for loop:


for (int x:nx, int y:ny){
   compute( x , y );
}


You can see the code used in the demo under the "Source Code" tab. Parallelizing the for loop is simply a matter of inserting parallel bars right after the for keyword:


for ||(int x:nx, int y:ny){
   compute( x , y );
}
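For readers without Ateji PX at hand, here is a rough plain-Java sketch of the same idea. It is not the demo's source: the class name, the grid mapping and the body of compute (a standard Mandelbrot escape-time count) are assumptions made for illustration. It only shows that, because each dot is computed independently, the sequential and the parallel loop produce the same picture.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Plain-Java sketch; all names here are hypothetical, not taken from the demo.
class ParallelLoopSketch {
    static final int MAX_ITER = 200;

    // Assumed stand-in for compute(x, y): escape-time count for the dot (x, y)
    // on an nx-by-ny grid covering [-2, 1] x [-1.5, 1.5].
    static int compute(int x, int y, int nx, int ny) {
        double cr = -2.0 + 3.0 * x / nx;
        double ci = -1.5 + 3.0 * y / ny;
        double zr = 0.0, zi = 0.0;
        int i = 0;
        while (i < MAX_ITER && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            i++;
        }
        return i;
    }

    // Sequential version: one nested loop over all dots.
    static int[] renderSequential(int nx, int ny) {
        int[] out = new int[nx * ny];
        for (int x = 0; x < nx; x++)
            for (int y = 0; y < ny; y++)
                out[x * ny + y] = compute(x, y, nx, ny);
        return out;
    }

    // Parallel version: every index is written by exactly one task,
    // so the dots can be computed in any order without locking.
    static int[] renderParallel(int nx, int ny) {
        int[] out = new int[nx * ny];
        IntStream.range(0, nx * ny).parallel()
                 .forEach(k -> out[k] = compute(k / ny, k % ny, nx, ny));
        return out;
    }

    public static void main(String[] args) {
        int[] a = renderSequential(64, 48);
        int[] b = renderParallel(64, 48);
        System.out.println("same result: " + Arrays.equals(a, b));
    }
}
```

The parallel stream here plays the role of the parallel bars; how Ateji PX schedules the iterations internally is not something this sketch claims to reproduce.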


The demo also shows some advanced features of loop splitting. By default, the work is split into blocks so that each processor gets one block. However, you can see on the Mandelbrot demo that some blocks take more time to compute than others (black dots take more time). The result is that although the work has been split in parallel across a number of workers, we end up waiting for the one worker that has the most black dots. This is not the most efficient way of leveraging parallel hardware.
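To put numbers on that waiting, here is a small Java sketch with made-up costs (not measurements from the demo). Each entry is the assumed time to paint one column, and the two expensive entries stand for columns full of black dots. With one contiguous block per worker, the total time is the cost of the most expensive block.

```java
// Hypothetical illustration; the cost figures are invented for the example.
class StaticSplitSketch {
    // Sum of all per-item costs.
    static long total(int[] costs) {
        long sum = 0;
        for (int c : costs) sum += c;
        return sum;
    }

    // Time until the slowest worker finishes when the items are cut into
    // `workers` contiguous blocks of (nearly) equal length, one per worker.
    static long staticMakespan(int[] costs, int workers) {
        int n = costs.length;
        long worst = 0;
        for (int w = 0; w < workers; w++) {
            int from = (int) ((long) n * w / workers);
            int to = (int) ((long) n * (w + 1) / workers);
            long sum = 0;
            for (int i = from; i < to; i++) sum += costs[i];
            worst = Math.max(worst, sum);
        }
        return worst;
    }

    public static void main(String[] args) {
        int[] costs = {1, 1, 1, 1, 9, 9, 1, 1};
        System.out.println("total work:        " + total(costs));             // 24
        System.out.println("perfect 2-way cut: " + total(costs) / 2);         // 12
        System.out.println("one block each:    " + staticMakespan(costs, 2)); // 20
    }
}
```

With these numbers the first worker is done after 4 time units and then sits idle while the second one needs 20, which is far from the 12 units a perfectly balanced split would take.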

In such cases, the solution is to split the work into smaller blocks, so that a worker that finishes early can pick up another block instead of sitting idle. You can play with block splitting in the "Advanced Settings" area (lower left). The effect on the source code is to insert the corresponding #BlockSize annotation:


for ||(#BlockSize(30), int x:nx, int y:ny){
   compute( x , y );
}
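As a rough plain-Java analogue, and assuming that the block size is the number of consecutive iterations handed to a worker at a time (the language manual is the authority on the exact meaning), smaller blocks can be sketched with a thread pool: each block becomes one queued task, and whichever thread is free takes the next one. The class name, the compute body and the render signature below are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java sketch; not how Ateji PX is implemented.
class BlockSizeSketch {
    static final int MAX_ITER = 200;

    // Assumed stand-in for compute(x, y): Mandelbrot escape-time count.
    static int compute(int x, int y, int nx, int ny) {
        double cr = -2.0 + 3.0 * x / nx, ci = -1.5 + 3.0 * y / ny;
        double zr = 0.0, zi = 0.0;
        int i = 0;
        while (i < MAX_ITER && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            i++;
        }
        return i;
    }

    // Cuts the nx*ny dots into blocks of blockSize consecutive indices and
    // queues each block as one task; idle pool threads take the next block.
    static int[] render(int nx, int ny, int blockSize, int workers) {
        int total = nx * ny;
        int[] out = new int[total];
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int start = 0; start < total; start += blockSize) {
                final int from = start;
                final int to = Math.min(start + blockSize, total);
                pending.add(pool.submit(() -> {
                    for (int k = from; k < to; k++)
                        out[k] = compute(k / ny, k % ny, nx, ny);
                }));
            }
            for (Future<?> f : pending) f.get(); // wait for every block
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("block task failed", e);
        } finally {
            pool.shutdown();
        }
        return out;
    }

    public static void main(String[] args) {
        int[] small = render(64, 48, 30, 2);      // many small blocks
        int[] big = render(64, 48, 64 * 48, 2);   // a single block
        System.out.println("same picture: " + Arrays.equals(small, big));
    }
}
```

The picture is the same whatever the block size; only the distribution of work changes. Very small blocks balance the load better but add scheduling overhead per block, which is presumably why the demo lets you experiment with the value.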


You can learn more about Ateji PX loop splitting in the language manual, downloadable from Ateji's web site.