Friday, July 30, 2010

Integrating π (pi) in parallel

A simple way of computing the constant π (pi) consists in measuring the surface under a curve. In more algebraic terms, this amounts to integrating y = 4/(1+x*x) between 0 and 1, and in programming terms this means incrementing x in small steps and summing the corresponding y (the smaller the steps, the more accurate the result).

Tim Mattson in his blog entry "Writing Parallel Programs: a multi-language tutorial introduction" explores available tools for coding this algorithm in parallel, namely OpenMP, MPI and Java threads.

Here we will stick to the Java universe, and compare Java sequential and multi-threaded code with their Ateji PX equivalent. Impatient readers may readily jump to the Ateji PX version at the end of the article.

The sequential Java code, inspired from Tim's sequential C version, is as follows:


  static final int numSteps = 100000; 
  static final double step = 1.0/numSteps; ; 

  public static void main(String[] args) {
    double sum = 0.0; 
    for(int i=0; i<= numSteps; i++) { 
      double x = (i+0.5)*step; 
      sum = sum + 4.0/(1.0+x*x); 
    } 
    double pi = step * sum;
    System.out.println(pi);
    System.out.println(Math.PI); 
  }



Try to play with the value of numSteps and see the effect on precision.

Tim parallelizes this code using threads as follows (slightly edited to make it look more Java-ish) :


static int nProcs = Runtime.getRuntime().availableProcessors(); 

static class PIThread extends Thread 
{ 
  final int partNumber; 
  double sum = 0.0; 

  public PIThread(int partNumber) { 
    this.partNumber = partNumber; 
  } 

  public void run() { 
    for (int i = partNumber; i < numSteps; i += nProcs) { 
      double x = (i + 0.5) * step; 
      sum += 4.0 / (1.0 + x * x); 
    } 
  } 
} 

public static void main(String[] args) { 
  PIThread[] part_sums = new PIThread[nProcs]; 
  for(int i = 0; i < nProcs; i++) { 
    (part_sums[i] = new PIThread(i)).start(); 
  } 
  double sum = 0.0; 
  for(int i = 0; i < nProcs; i++) { 
    try { 
      part_sums[i].join(); 
    } catch (InterruptedException e) {
    } 
    sum += part_sums[i].sum; 
  } 
  double pi = step * sum; 
  System.out.println(pi); 
} 



Pretty verbose, isn't it ? The core of the algorithm becomes hidden behind a lot of irrelevant details.

Being verbose also means that it becomes just too easy to overlook potential problems. In this code, the handling of InterruptedException is wrong and may lead to very nasty bugs when put in the context of a larger application. Not to blame Tim: honestly, who understands the precise meaning and usage rules of InterruptedException ?

In contrast, let us code the integration of π using Ateji PX, an extension of Java. First of all, the mathematical expression used in the integration is a typical example of a comprehension, for which Ateji PX provides an intuitive syntax. Here is the sequential code:


public static void main(String[] args) {
  double sum = `+ for{ 4.0/(1.0+x*x) | int i : numSteps, double x = (i+0.5)*step }
  double pi = step * sum;
  System.out.println(pi);
}



The second line, computing sum, is very close to the standard big-sigma notation in mathematics. Having this notation available as an extension of Java makes the expression of many mathematical formulas concise and intuitive, almost like what you've learned in high school.

It also makes the code closer to the programmer's intent. In the first sequential version, using a for loop, it takes some thinking before realizing that the code is actually computing a sum. This has a strong impact on code readability and maintenance.

But what's really interesting is how this code can be parallelized. Simply add a parallel bar ("||") right after the for keyword, and Ateji PX will perform the computation in parallel using all available cores.


public static void main(String[] args) {
  double sum = `+ for||{ 4.0/(1.0+x*x) | int i : numSteps, double x = (i+0.5)*step }
  double pi = step * sum;
  System.out.println(pi);
}



In the OpenMP community, this is called a parallel reduction. Compare this code to the OpenMP version and the multi-threaded version.

Comprehension expressions in Ateji PX are not limited to summation. They can express aggregate operations such as product, logical or, count and average, but also bulk data manipulation such as SQL-like queries and list or set comprehensions (the set of all ... such that ...), and even operate on user-defined operations.