43

Devrant::Mathematicians, unite!

I found a new way to calculate a running average.

Old:
a(o, n, i) = (o(i - 1) + n) / i

New:
a(o, n, i) = o + (n - o) / i

a: New average function
o: Old average
n: Element to add to average
i: New number of elements
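
In code, the two updates look like this (a quick Java sketch; the class and method names are mine, purely illustrative):

class Avg {
    // Old: scale the old average back up, add the new element, divide again.
    static double oldAvg(double o, double n, long i) {
        return (o * (i - 1) + n) / i;
    }

    // New: same math rearranged, one multiplication fewer.
    static double newAvg(double o, double n, long i) {
        return o + (n - o) / i;
    }
}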

Comments
  • 1
    I am too dumb to understand this
    edit: so you have i - 1 numbers and their average, and you want to add a new number to the average using just the old average?
  • 3
    I think the proof is something like this, sorry for my crappy writing, I literally wrote it in bed
  • 1
    So when you tag it "fast average" you mean it's faster to calculate? As in, it's more intuitive? I totally agree!
  • 6
    It’s the exact same formula but just simplified differently lmao
  • 1
    It is one multiplication less, so you have fewer problems with floats?
  • 0
    @iZeroCool agree, it is just simplified

    The question here is: why is the non-simplified one the standard, then? Is it really?
  • 0
    The first one doesn't make sense to me: too much overlap. The second is how I'd calculate it.
  • 1
    If i is very big the second one can get rounded off as o + (n - o)/i = o, because (n - o)/i becomes too small to register against o.

    This especially happens when using low-precision types, so I would discourage using it if you think the number of elements is going to be really big.

    The standard one doesn't face this problem.
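
    A rough sketch of what I mean (float just to make the rounding obvious; the numbers are made up):

    class Demo {
        public static void main(String[] args) {
            float o = 1000.0f;        // old average
            float n = 2000.0f;        // new element
            int i = 100_000_000;      // new element count, deliberately huge
            float updated = o + (n - o) / i;
            // (n - o) / i is ~1e-5, far below float's precision near 1000,
            // so the addition changes nothing:
            System.out.println(updated == o);  // prints true
        }
    }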
  • 0
    @bigworld12 Yes, but I went about it differently.

    @kleinerKobold You're right. The removal of the multiplication also removes risk of overflow for large numbers or large amounts of numbers.

    @michezio I had noticed that too!

    (o(i - 1) + n) / i
    = (oi - o + n) / i
    = o + (n - o) / i
  • 0
    If the calculation of n is heavy, utilizing the Kahan summation algorithm should alleviate the pain of low precision. You might also be able to rewrite whatever loop you're adding to the average in to use pairwise summation.
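
    For reference, a bare-bones Kahan accumulator looks something like this (just a sketch; the class name is made up):

    class KahanSum {
        private double sum = 0.0;
        private double c = 0.0;  // compensation for low-order bits lost so far

        void add(double value) {
            double y = value - c;  // apply the stored correction
            double t = sum + y;    // big + small: low-order bits of y get dropped
            c = (t - sum) - y;     // recover the dropped bits for the next add
            sum = t;
        }

        double average(long count) {
            return sum / count;
        }
    }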
  • 1
    @michezio if i is very big, new numbers won't affect the average that much, so it would make sense that the new average rounds off to the old average
  • 0
    @bigworld12 I agree, but it still won't be the same. I had run some tests checking the difference with lots of random numbers; I'll find (or rewrite) it and post results.
  • 1
    I've done some testing, and at the end of the day both methods suffer from some kind of precision error. So it depends only on the precision of the type used, and if efficiency is an issue, the simplified version is actually better.

    Anyway, a better way to deal with running averages would be to store an unaveraged sum and do the division only when you actually need the average value (something like the sketch below). That way you get fewer precision issues and more efficiency, but this is not always applicable...
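
    Something like this, for the record (a sketch; the names are mine):

    class RunningAverage {
        private double sum = 0.0;  // unaveraged running sum
        private long count = 0;

        void add(double value) {
            sum += value;  // no per-element division
            count++;
        }

        double average() {
            return sum / count;  // divide only when the value is actually needed
        }
    }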
  • 0
    @michezio Agreed, but the general use case (a plain sum) is not the one we're talking about (mostly). The need for a running average generally arises when the average is needed in each cycle.

    By the way, what were the actual results and errors that you found, compared to sum-then-average?
  • 2
    @anArchLinuxer a quick test over 340 million randomly generated numbers (floats) between 0 and 1, using Java doubles to calculate the averages. The errors came out really small.
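
    If anyone wants to reproduce it, a sketch along these lines should do (not my exact code; the seed and constants are arbitrary):

    import java.util.Random;

    class AvgTest {
        public static void main(String[] args) {
            final long N = 340_000_000L;
            Random rng = new Random(42);
            double a1 = 0.0, a2 = 0.0, sum = 0.0;
            for (long i = 1; i <= N; i++) {
                double n = rng.nextFloat();   // uniform float in [0, 1)
                sum += n;
                a1 = (a1 * (i - 1) + n) / i;  // old formula
                a2 = a2 + (n - a2) / i;       // new formula
            }
            double ref = sum / N;  // plain sum-then-divide as the reference
            System.out.println("old formula error: " + Math.abs(a1 - ref));
            System.out.println("new formula error: " + Math.abs(a2 - ref));
        }
    }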
  • 1
    I tried it in C (actually D, but I only used C stuff, so it's essentially the same thing) with double (64-bit) and random numbers from rand(). Instead of 340 million, I tested it with 1 billion. Compiled with GCC using -Ofast.

    There really isn't much of a difference!