Tuesday, August 18, 2009

why the focus on exact correctness in computing?

I think a lot of people are missing the point on correct vs. probably correct computing.
In an interview with Google, I was asked how I would compile a list of the top 1 million most frequently occurring strings from terabytes of data.
I really wanted to sample the data at random until I could say, with sufficient confidence, what the top 1 million strings were. Instead there was an obsession with counting each and every string, well beyond any mathematical necessity.
Really, we should only be aiming for an accuracy on the order of the probability of a computation error; past that point, extra exactness buys you nothing.
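For concreteness, here is a minimal sketch of the kind of sampling approach I had in mind, written in Python. Every name and parameter in it (sample_prob, the Bernoulli sampling scheme, the scaling of counts) is my own illustration, not anything Google asked for: keep each string with some small probability, count the sample in one pass, and scale the counts back up. The relative error of each estimate shrinks like one over the square root of its sampled count, so you sample until the boundary around the 1-millionth rank is resolved to whatever confidence you want.

    import random
    from collections import Counter

    def sample_top_k(strings, k=1_000_000, sample_prob=0.001, seed=0):
        """Estimate the top-k most frequent strings by Bernoulli sampling.

        Hypothetical sketch: every name and parameter here is illustrative.
        Each string is kept with probability sample_prob, so a sampled count
        c estimates a true count of about c / sample_prob, with relative
        error roughly 1 / sqrt(c).
        """
        rng = random.Random(seed)
        counts = Counter()
        for s in strings:          # one streaming pass over the data
            if rng.random() < sample_prob:
                counts[s] += 1
        # Scale the sampled counts back up to estimated true frequencies.
        return [(s, int(c / sample_prob)) for s, c in counts.most_common(k)]

The only place any approximation lives is among strings whose true counts sit right at the 1-millionth rank; one pass like this is enough once their sampled counts separate by more than a few standard deviations.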

A probabilistic method could be both faster *and* more accurate. Try telling that to a brick wall called Google.
