Calculating Accurate Means Using 2048
The goal here is to find an algorithm that can calculate the mean of very large lists of numbers with high accuracy and good performance.
For small lists of numbers, using `double`, or even `float`, is more than good enough in most cases. If, however, you start averaging lists that are millions of numbers long, you can start to get significant error buildup. With the most straightforward approach to calculating the mean, summing up all the numbers and dividing by the count, the sum grows very large. To understand why this is a problem, you need to know a little about floating-point numbers.
Floating-point numbers have two parts, the exponent and the mantissa (there is also a sign bit, but we can ignore that for now). The exponent determines the magnitude of the number, but not its exact value. The mantissa is the part we are interested in: it stores the significant figures of the number, and it is 23 bits for `float` and 52 bits for `double`. So if the number we are trying to store has more than 23 bits of significant figures, we get some round-off error. This is complicated by the fact that it is all in base two, so a number that has a single significant figure in base 10, like 0.1, infamously has an infinite number of significant figures in base two.
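To see that round-off for yourself, here is a tiny sketch of my own (not from the original post) that prints 0.1 with more digits than a `float` or `double` can actually hold:

```cpp
// 0.1 cannot be represented exactly in base two, so printing it with extra
// digits exposes the round-off in both float and double.
#include <cstdio>

int main() {
    float f = 0.1f;
    double d = 0.1;
    printf("float : %.20f\n", f); // only the first ~7 decimal digits are exact
    printf("double: %.20f\n", d); // only the first ~15-16 decimal digits are exact
    return 0;
}
```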
So when we average a long list of numbers, especially ones with a lot of significant figures, we can start to lose the smallest of them. Normally that is not a big deal, because you rarely need to know the exact number of milliseconds it takes to sprint from Los Angeles to New York. But if you are trying to calculate the average time of thousands of 100-meter sprints, which cumulatively cover the same distance, you may well want those milliseconds.
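Here is the large-sum problem in miniature (again, my own sketch, not from the post): once the running sum is big enough, adding a comparatively small number changes nothing at all.

```cpp
// Once the running sum is large enough, a double has no mantissa bits left
// for the small addend, so the addition is silently lost.
#include <cstdio>

int main() {
    double bigSum = 1e16; // already past 2^53, where adjacent doubles are 2 apart
    double small = 1.0;
    if (bigSum + small == bigSum) {
        printf("The 1.0 was completely lost in the addition.\n");
    }
    return 0;
}
```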
There are 4 main approaches that I am testing:
- The straightforward sum then divide approach
- The iterative approach (see this StackOverflow answer: https://stackoverflow.com/a/1934266)
- The recursive pair, sum, divide approach
- The 2048-based pair, sum, divide approach
Approach #1
```cpp
class AverageAccumulator {
public:
    double sum;
    int count;

    void init(int len) { //len is not needed for this approach
        this->sum = 0;
        this->count = 0;
    }
    void addTo(double num) {
        this->sum += num;
        this->count++;
    }
    double getAverage() {
        return this->sum / this->count;
    }
    void destroy() {
    }
};
```
This is the simplest and most straightforward way to calculate the mean of a list of numbers. It should be very fast to execute and uses only 16 bytes. Accuracy, though, is not very good.
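All four accumulators share the same `init` / `addTo` / `getAverage` / `destroy` interface, so a minimal driver (mine, not from the original post, and assuming the class above) looks something like this for any of them:

```cpp
// Hypothetical driver: init with the list length, feed every number,
// read back the mean, then clean up.
#include <cstdio>

int main() {
    const int len = 1000;
    AverageAccumulator acc;
    acc.init(len);
    for (int i = 0; i < len; i++) {
        acc.addTo(12345.12345); // the same value used in the tests below
    }
    printf("average = %.10f\n", acc.getAverage());
    acc.destroy();
    return 0;
}
```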
(In the original post, an interactive visualization here shows the accumulator's memory as it fills; for this approach that is just `sum` and `count`.)

Approach #2
```cpp
class AverageAccumulator {
public:
    double avg;
    int count;

    void init(int len) {
        this->count = 0;
        this->avg = 0;
    }
    void addTo(double num) {
        this->count++;
        //Fold each new number directly into the running average.
        this->avg += (num - this->avg) / this->count;
    }
    double getAverage() {
        return this->avg;
    }
    void destroy() {
    }
};
```
This approach is nearly perfect, as it keeps a running average instead of a sum, which keeps the value from ever growing very large and losing precision. The main weakness of this approach is the division by the count. If you have a number with a lot of precision and the count is large, dividing by the count will lose some significant figures. Say, for example, you were to average a series of 1 million zeroes followed by 1 million values of 0.1. By the time you started averaging the 0.1 values, the count would already be fairly high, and each of the 1 million 0.1 values would lose some significant figures.
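To see that weak case in action, here is a small self-contained sketch (mine, not from the original post) that feeds exactly that sequence through the iterative update:

```cpp
// 1,000,000 zeroes followed by 1,000,000 values of 0.1, fed to the
// iterative running-average update from Approach #2.
#include <cstdio>

int main() {
    double avg = 0.0;
    int count = 0;
    for (int i = 0; i < 1000000; i++) { // a million zeroes first
        count++;
        avg += (0.0 - avg) / count;
    }
    for (int i = 0; i < 1000000; i++) { // then a million 0.1 values
        count++;
        avg += (0.1 - avg) / count;
    }
    // The exact mean is 0.05; the printed value will typically drift slightly from it.
    printf("%.17f\n", avg);
    return 0;
}
```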
(Interactive memory visualization in the original post: `avg` and `count`.)
Approach #3
```cpp
#include <cmath>   // pow
#include <cstring> // memcpy

class AverageAccumulator {
public:
    double* nums;
    int capacity;
    int count;

    void init(int len) {
        this->capacity = len;
        this->nums = new double[this->capacity];
        this->count = 0;
    }
    void addTo(double num) {
        this->nums[this->count] = num;
        this->count++;
    }
    double getAverage() {
        int len = this->count;
        double* buff = new double[len];
        memcpy(buff, this->nums, len * sizeof(double));
        double extrasAvg = 0;
        int extrasCount = 0;
        int steps = 0;
        //Invariant: each buff[i] with i < len is the average of 2^steps inputs.
        while (len > 1) {
            for (int i = 0; i < len / 2; i++) {
                buff[i] = buff[i * 2] + buff[i * 2 + 1];
                buff[i] = buff[i] / 2;
            }
            //An odd element has no partner, so fold it into a weighted
            //average of the leftovers instead.
            if (len % 2 == 1) {
                double numsBeingAveraged = pow(2, steps) + extrasCount;
                extrasAvg = extrasAvg * (extrasCount / numsBeingAveraged);
                extrasAvg += buff[len - 1] * (pow(2, steps) / numsBeingAveraged);
                extrasCount += pow(2, steps);
            }
            len = len / 2;
            steps++;
        }
        //Combine the fully paired average with the leftovers, weighted by
        //how many inputs each one represents.
        double numsBeingAveraged = pow(2, steps) + extrasCount;
        double weightedPairsAverage = buff[0] * (pow(2, steps) / numsBeingAveraged);
        double weightedExtrasAverage = extrasAvg * (extrasCount / numsBeingAveraged);
        double average = weightedExtrasAverage + weightedPairsAverage;
        delete[] buff;
        return average;
    }
    void destroy() {
        delete[] this->nums;
    }
};
```
This is a significantly more complex algorithm. The precision, though, is much improved, because the partial sums in `buff` (the working copy of `nums`) never grow very large. At worst, a sum is of the two largest numbers in the input list. This is much better than Approach #1, where the sum will, by definition, grow to the size of `len * mean(nums)`.
While it should still be reasonably fast to execute, memory usage is the main drawback. It requires a copy of the entire list of numbers, and the entire list has to be available when calculating, so for 100,000,000 doubles that means two 0.8 GB buffers: `nums` plus the working copy `buff`.
(Interactive memory visualization in the original post: the `nums` array and `count`.)
Approach #4
Oddly enough, the last approach was inspired by the game 2048.
If you have not played it before, the original by Gabriele Cirulli is here: https://play2048.co/. If you have never seen the game, the explanation below might be hard to follow. Be warned, however, that the game is strangely addicting. :)
If you think about it, 2048 is a similar problem. The goal is to sum 1024 tiles together on a board with only 16 spaces, and we can only combine tiles of the same number. Similarly, when averaging we can only combine two means that cover the same number of inputs (e.g. we can combine a mean of 4 inputs with another mean of 4 inputs, but not with a mean of 2 inputs). The key part of the game, and of the algorithm, is that you only ever need one tile of each size on the board at any one time: if there were ever two of the same size, you could combine them and free up a space. This drastically reduces the amount of memory needed to average n numbers, from O(n) to O(log(n)). So for our example of 100,000,000 numbers, instead of 0.8 GB we only need 216 bytes!
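The one wrinkle is that when the input count is not a power of two, some slots hold means covering different numbers of inputs, and those have to be combined with a weighted average. Here is a minimal sketch of that merge step (the helper name is mine; the post's code inlines this logic in each `getAverage()`):

```cpp
// Combine two means that cover different numbers of inputs by weighting each
// mean by the fraction of the total count it represents.
// (Hypothetical helper for illustration only.)
double mergeMeans(double avgA, double countA, double avgB, double countB) {
    double total = countA + countB;
    return avgA * (countA / total) + avgB * (countB / total);
}
```

Merging a mean of 4 inputs with a mean of 1 input, for example, weights them 4/5 and 1/5, which is exactly what the slot-by-slot loop in `getAverage()` below does.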
A nice bonus of this method is that, because the combining of numbers pairwise is just like binary addition, the `occupied` value can just be incremented.
(Interactive memory visualization in the original post: the `nums` slots and the `occupied` bits.)
```cpp
#include <cmath> // log2, pow

class AverageAccumulator {
public:
    double* nums;
    //Zero bits indicate empty spots in the nums array, one bits indicate full spots.
    int occupied;
    int capacity;

    void init(int len) {
        this->capacity = log2(len) + 1; //floor(log2(len)) + 1 slots is enough for len numbers
        this->nums = new double[this->capacity];
        this->occupied = 0;
    }
    void addTo(double num) {
        int level = 0;
        double avg = num;
        //We keep averaging pairs of numbers until we find an empty slot with no number
        //to pair with.
        while ((this->occupied >> level) & 1) {
            avg = (this->nums[level] + avg) / 2.0;
            this->nums[level] = 0;
            level++;
        }
        this->nums[level] = avg;
        this->occupied++;
    }
    double getAverage() {
        //For counts that are not a convenient power of 2, we have to use
        //weighted averages for averaging the unpaired numbers.
        double lastAverage = 0.0;
        int numsAveraged = 0;
        //Going slot by slot
        for (int i = 0; i < this->capacity; i++) {
            if ((this->occupied >> i) & 1) { //Check to see if the slot is occupied
                //If it is, then we need the total count of the numbers being averaged.
                //We keep a running count of numbers averaged so far, and add 2^i
                //for the count of the numbers averaged in the current slot.
                double numsToBeAveraged = pow(2, i) + numsAveraged;
                //Then we can just do a weighted average of the running average and the
                //average of the current slot.
                lastAverage = this->nums[i] * (pow(2, i) / numsToBeAveraged)
                            + lastAverage * (numsAveraged / numsToBeAveraged);
                //And update numsAveraged
                numsAveraged = numsToBeAveraged;
            }
        }
        return lastAverage;
    }
    void destroy() {
        delete[] this->nums;
    }
};
```
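To make the bookkeeping concrete, here is a short trace of my own (not from the original post, and assuming the class above) of what happens after feeding in five numbers:

```cpp
// Hypothetical driver tracing five inputs through the 2048-style accumulator.
#include <cstdio>

int main() {
    AverageAccumulator acc;
    acc.init(5);

    double values[5] = {1.0, 2.0, 3.0, 4.0, 5.0};
    for (double v : values) {
        acc.addTo(v);
    }
    // After five adds, occupied == 0b101:
    //   slot 0 holds 5.0 (the average of 1 number)
    //   slot 2 holds 2.5 (the average of 1, 2, 3, 4)
    // getAverage() merges them as 2.5 * (4/5) + 5.0 * (1/5) = 3.0.
    printf("%f\n", acc.getAverage());

    acc.destroy();
    return 0;
}
```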
So Which Approach Is Best?
To find out which approach is best, or at least what each approach is best suited for, I ran a series of tests.
Test #1
The first test is to average a list of 100,000,000 copies of 12345.12345. I picked that number simply because it has a lot of digits of precision, and it is trivial to check for any round-off error. The results are in the table below:
| | Approach #1 | Approach #2 | Approach #3 | Approach #4 |
| --- | --- | --- | --- | --- |
| Error | 0.0000027762471290 | 0.0000000000000000 | 0.0000000000000000 | 0.0000000000000000 |
| Runtime | 259.37 ms | 635.85 ms * | 881.76 ms | 265.32 ms |
| Memory | 24 bytes | 24 bytes | 1.600 GB | 240 bytes |
The error and memory usage results are about what I expected, but the runtime measurements were pretty surprising. I expected more of a performance hit on Approach #4, and I did not expect Approach #3's memory usage to cost it so much in runtime.
*The relative slowness of Approach #2 seems to come down to the Visual Studio compiler: when I tested with g++ on Linux, Approach #2 was much faster, about the same as Approach #1.
Test #2
The second test is to average a list of 50,000,000 copies of 0 followed by 50,000,000 copies of 12345.12345 * 2. This targets the weak point of Approach #2 to see how it compares. The runtime and memory usage are nearly identical to Test #1, so those results are left off. The error results are in the table below:
| | Approach #1 | Approach #2 | Approach #3 | Approach #4 |
| --- | --- | --- | --- | --- |
| Error | 0.0000046634486352 | 0.0000000021536834 | 0.0000000000000000 | 0.0000000000000000 |
As expected, there is a small amount of error in the result from Approach #2.
Conclusion
While it is hard to say that one algorithm is the “best”, I think we can say that Approach #3 is definitely the worst of the 4. As for which of the other ones to use, that is more difficult to say. If you are going for maximum accuracy, especially while using `float`s, I think the added complexity of Approach #4 is a worthy trade-off. If you just want good accuracy on a large list of numbers and know that your data is relatively uniform, then Approach #2 makes a lot of sense. If you are just trying to get the average cost of 100 items to the nearest cent, please go with Approach #1 :)
Thanks for reading, I hope you found it interesting!