
I've now implemented the binary search version. I was right, the factor of 10 was about correct: even with 20 million buckets the algorithm is still faster than the FPGA, and it's more than 3 times faster than his faster software algorithm even though he's got a newer CPU (I'm running it single-threaded on a 2.5 GHz Core 2 Duo). And again, he's really using something between about 1000 and 100000 buckets, which would make my program on my machine between 17 and 32 times faster than his:

  buckets   gcc -O1 gcc -O3
  1000       75 ms   72 ms
  10000     105 ms  102 ms
  100000    135 ms  136 ms
  1000000   243 ms  240 ms
  10000000  580 ms
  20000000  693 ms
(Results in my previous post were with gcc -O1.)

All results so far were with random numbers drawn from an uneven distribution: vec[i]= (i / len) * random_between(0.,1.). Here are the results for the binary search with a uniform distribution, vec[i]= random_between(0.,1.):

  buckets  gcc -O1
  1000       82 ms
  10000     115 ms
  100000    148 ms
  1000000   360 ms
  10000000  735 ms
  20000000  860 ms
(He's using a Gaussian distribution, not a uniform one.)
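
For reference, the two input distributions can be generated like this (a sketch; random_between is my assumed thin wrapper over rand(), not code from the benchmark itself):

  #include <stdlib.h>

  /* Assumed helper matching the random_between(0.,1.) notation above. */
  static double random_between(double lo, double hi) {
      return lo + (hi - lo) * ((double)rand() / RAND_MAX);
  }

  /* Uneven distribution: values ramp up with the index. */
  static void fill_uneven(double* vec, int len) {
      for (int i = 0; i < len; i++)
          vec[i] = ((double)i / len) * random_between(0., 1.);
  }

  /* Uniform distribution over [0,1]. */
  static void fill_uniform(double* vec, int len) {
      for (int i = 0; i < len; i++)
          vec[i] = random_between(0., 1.);
  }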

Here's the code:

  int nvals= ..;           /* number of input values */
  double* vals= ..;        /* input values */
  int nbuckets= ..;        /* number of bucket boundaries, sorted ascending */
  double* buckets= ..;     /* bucket boundaries */
  unsigned int* res= ..;   /* nbuckets+1 counters, zero-initialized */

  int i;
  for (i=0; i<nvals; i++) {
      double val= vals[i];
      int bi;
      {
          /* binary search for the first boundary greater than val */
          int lo=0, hi=nbuckets;
          while (lo < hi) {
              int mid= (lo+hi) / 2;
              if (val < buckets[mid]) {
                  hi= mid;
              } else {
                  lo= mid+1;
              }
          }
          bi= lo;  /* bi ends up in 0..nbuckets inclusive */
      }
      res[bi]++;
  }
(Embedded within a Scheme driver program: http://christianjaeger.ch/scratch/binning.scm)
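
To run the snippet standalone rather than from the Scheme driver, a minimal C harness could look like this (the sizes and the evenly spaced boundaries are my own hypothetical choices, not the benchmark's):

  #include <stdio.h>
  #include <stdlib.h>

  /* Bin each value by binary-searching the sorted boundary array.
     A value lands in res[bi] where buckets[bi] is the first boundary
     greater than it; values >= the last boundary go to res[nbuckets]. */
  static void bin_values(const double* vals, int nvals,
                         const double* buckets, int nbuckets,
                         unsigned int* res) {
      for (int i = 0; i < nvals; i++) {
          double val = vals[i];
          int lo = 0, hi = nbuckets;
          while (lo < hi) {
              int mid = (lo + hi) / 2;
              if (val < buckets[mid]) hi = mid; else lo = mid + 1;
          }
          res[lo]++;
      }
  }

  int main(void) {
      enum { NVALS = 1000000, NBUCKETS = 1000 };  /* hypothetical sizes */
      double* vals = malloc(NVALS * sizeof *vals);
      double* buckets = malloc(NBUCKETS * sizeof *buckets);
      unsigned int* res = calloc(NBUCKETS + 1, sizeof *res);

      for (int i = 0; i < NBUCKETS; i++)          /* evenly spaced boundaries */
          buckets[i] = (double)(i + 1) / NBUCKETS;
      for (int i = 0; i < NVALS; i++)
          vals[i] = (double)rand() / RAND_MAX;

      bin_values(vals, NVALS, buckets, NBUCKETS, res);

      unsigned long total = 0;
      for (int i = 0; i <= NBUCKETS; i++) total += res[i];
      printf("binned %lu values\n", total);

      free(vals); free(buckets); free(res);
      return 0;
  }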

