C++ - Cache poisoning issue for a deep nested loop
I am writing code for a mathematical method (incomplete Cholesky) and have hit a curious roadblock. Please see the following simplified code.
    for(k=0;k<nosunknowns;k++)
    {
        //pieces of code
        for(i=k+1;i<nosunknowns;i++)
        {
            // more code
        }
        for(j=k+1;j<nosunknowns;j++)
        {
            for(i=j;i<nosunknowns;i++)
            {
                //some more code
                if(xok && yok && zok)
                {
                    if(xdf == 1 && ydf == 0 && zdf == 0)
                    {
                        for(row=0;row<3;row++)
                        {
                            for(col=0;col<3;col++)
                            {
                                // 3x3 static arrays line
                                statobj->a1_[row][col] -= localfuncarr[row][col];
                            }
                        }
                    }
                }
            }//inner loop ends here
        }//inner loop j ends here
    }//outer loop k ends here
For context:
statobj is an object containing a number of 3x3 static double arrays. I initialize statobj with a call to new, and I populate the arrays inside it using mathematical functions. One such array is a1_. The value of the variable nosunknowns is around 3000. The array localfuncarr is generated by a matrix multiplication and is also a double array.
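The layout described above might look roughly like the following sketch. It is only an illustration: the type name StatObj, the second array a2_, and the helper functions are hypothetical, and it assumes "static" means fixed-size member arrays rather than the static storage class. Only a1_, statobj and localfuncarr come from the post.

```cpp
#include <cstddef>

// Hypothetical stand-in for the class behind statobj.
// Only a1_ is named in the post; a2_ is invented to show "a number of" arrays.
struct StatObj {
    double a1_[3][3];
    double a2_[3][3];
};

// Hypothetical factory: allocates with new, as the post describes.
// The empty parentheses value-initialise the object, so all entries start at 0.0.
StatObj* makeStatObj() {
    return new StatObj();
}

// The update performed by the line in question, isolated into one function.
void subtractBlock(StatObj* statobj, const double (&localfuncarr)[3][3]) {
    for (std::size_t row = 0; row < 3; ++row) {
        for (std::size_t col = 0; col < 3; ++col) {
            statobj->a1_[row][col] -= localfuncarr[row][col];
        }
    }
}
```

Under these assumptions each array occupies 9 contiguous doubles (72 bytes where a double is 8 bytes), and the object returned by makeStatObj must eventually be released with delete.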
Now the problem:
1. When I use the line shown in the code, the code runs extremely sluggishly: 245 seconds for the whole function.
2. When I comment out the said line, the code runs extremely fast: it takes 6 seconds.
3. When I replace the said line with the following line:
    localfuncarr[row][col] += 3.0;
   the code again runs at the same speed as in case (2) above.
Clearly the access to statobj->a1_ is what makes the code run slowly.
My question(s):
1. Is cache poisoning the reason why this is happening?
2. If so, what could be changed in terms of array initialization, object initialization, loop unrolling, or for that matter any form of code optimization, to speed this up?
Any insights from experienced folks are highly appreciated.
Edit: Changed the description to be more verbose and to address some of the points mentioned in the comments.
If the conditions are always true, the line of code is executed up to 3000 x 3000 x 3000 x 3 x 3 times. That's roughly 243 billion times. Depending on the hardware architecture, 245 seconds might be a reasonable timing (that's about 1 iteration every 2 cycles, assuming a 2 GHz processor). In any case, there isn't anything in the code that suggests cache poisoning.
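The arithmetic behind this estimate can be checked with a short sketch. The function names are hypothetical, and the 245 seconds and 2 GHz figures are simply the numbers quoted in this thread. The second function counts executions using the loop bounds as written in the simplified code (j starting at k+1, i starting at j), again assuming both if conditions are always true.

```cpp
#include <cstdint>

// Upper bound: all three outer loops treated as running n times,
// with 9 executions of the line (3x3) per innermost iteration.
constexpr std::uint64_t cubeCount(std::uint64_t n) {
    return n * n * n * 9;
}

// Count implied by the bounds in the simplified code:
// k in [0, n), j in [k+1, n), i in [j, n).
std::uint64_t triangularCount(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t k = 0; k < n; ++k) {
        const std::uint64_t m = n - k - 1; // number of j values for this k
        total += m * (m + 1) / 2;          // sum over j of (n - j)
    }
    return total * 9;
}

// Average clock cycles available per execution of the line.
double cyclesPerExecution(double seconds, double hertz, std::uint64_t executions) {
    return seconds * hertz / static_cast<double>(executions);
}
```

For n = 3000, cubeCount gives 243,000,000,000 and triangularCount gives 40,499,995,500. With 245 seconds at 2 GHz, that works out to about 2 cycles per execution for the upper bound and about 12 cycles for the triangular count. Either figure is only a rough guide, since the real loop body contains more code than this one line.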