performance - Why does Haskell perform so poorly when executing C-like codes? (in this instance at least) -


i trying figure out performance issues have been having haskell. part of that, have written small comparison program compare c , haskell. specifically, translated c program haskell few changes could. speed-measured part of haskell program written in imperative style.

the program makes 2 list of random numbers in range, , calculates integral of graph formed connecting points, 1 list being x-values , 1 list being y-values. essentially, trapezoidal rule.

here 2 codes:

main.c

#include <stdio.h> #include <stdlib.h> #include <time.h>  #define n 5000000 #define maxy 1e5f/n #define maxxgap 1  int main(){     int i;     float *y, *x;     float xaccum, area;     clock_t begin, end;     double time_spent;      y = (float*)malloc(sizeof(float)*n);     x = (float*)malloc(sizeof(float)*n);      srand(50546345); // change seed different numbers      //populate y , x fields random points     for(i = 0; < n; i++){         y[i] = ((float)rand())/((float)rand_max)*maxy;     }     xaccum = 0;     for(i = 0; < n; i++){         x[i] = xaccum;         xaccum += ((float)rand())/((float)rand_max)*maxxgap;     }     begin = clock();     //perform trapezoidal integration using x y coordinates     area = 0;     for(i = 0; < n-1; i++){         area += (y[i+1]+y[i])/2*(x[i+1]-x[i]);     }     end = clock();     time_spent = (double)(end - begin) / clocks_per_sec * 1000;     printf("%i points\n%f area\n%f ms\n", n, area, time_spent); } 

main.hs

{-# language bangpatterns #-} module main  import data.array.unboxed import data.array.io import data.list import system.random import system.cputime import text.printf import control.exception  main :: io () main =           (x,y) <- initarrays           area <- time $ integrate x y           print area  n :: int n = 5000000  maxy :: float maxy = 100000.0/(fromintegral n)  maxxgap :: float maxxgap = 1  --initialize arrays random floats --this part not measured in running time (very slow) initarrays :: io (iouarray int float, iouarray int float) initarrays =                 y <- newlistarray (0,n-1) (randomlist maxy n (mkstdgen 23432))                 x <- newlistarray (0,n-1) (scanl1 (+) $ randomlist maxxgap n (mkstdgen 5462))                 return (x,y)  randomlist :: float -> int -> stdgen -> [float] randomlist max n gen = map (abs . ((*) max)) (take n . unfoldr (just . random) $ gen)  integrate :: iouarray int float -> iouarray int float -> io float integrate x y = iterative x y 0 0  iterative :: iouarray int float -> iouarray int float -> int -> float -> io float iterative x y !i !accum = if == n-1                               return accum                               else x1 <- readarray x                                       x2 <- readarray x (i+1)                                       y1 <- readarray y                                       y2 <- readarray y (i+1)                                       iterative x y (i+1) (accum + (y2+y1)/2*(x2-x1))  time :: io t -> io t time =             start <- getcputime             v <-             end <- getcputime             let diff = (fromintegral (end-start)) / (10^9)             printf "computation time %0.5f ms\n" (diff :: double)             return v 

the c integration runs in 7 ms , haskell integration in 60 ms on system. of course haskell version slower, wondering why slower. there lot of inefficiency in haskell code.

why haskell code slower? how 1 fix it?

thanks answers.

out of curiosity, ran llvm:

ghc test.hs -o2 -xbangpatterns -fllvm -optlo-o3

and took down 60ms 24ms. still not ideal.

so, 1 of first things i'll when want know why benchmark slow, dump prepared core. is, core after optimisations.

ghc test.hs -o2 -ddump-prep -dsuppress-all -xbangpatterns > test.hscore

after looking through core, found $wa, iterative loop defined. turns out it's making surprisingly many index bound checks. see, use data.vector.unboxed, has "unsaferead" , "unsafeindex" functions, remove bounds checks. these useful here. personally, think vector package superior.

if @ $wa, you'll notice it's unboxing arguments @ start:

case w_s3o9 of _ { stuarray l_s3of u_s3oi ds1_s3ol ds2_s3oh -> case l_s3of of wild2_s3os { i# m_s3oo -> case u_s3oi of wild3_s3ot { i# n1_s3ov -> case ds1_s3ol of wild4_s3oc { i# y1_s3oe -> 

this looks bad, turns out in recursive call it's using specialised version, integrate_$s$wa, unboxed integers etc. good.

in summary, think should improvement using vector unsafe indexing.

edit: here modified version data.vector. runs in 7ms. good vector code, think slowness compared c should due incomplete alias analysis. https://gist.github.com/amosr/6026995


Comments

Popular posts from this blog

php - Calling a template part from a post -

Firefox SVG shape not printing when it has stroke -

How to mention the localhost in android -