Sure, so the idea that's been bouncing around in my head is that if you are unhappy with the results of the linear interpolation, we can try a nonlinear interpolation based on a functional form that we know can be fitted accurately to all of the powerrank stats data. But if the linear interpolation gives you satisfactory results, then we don't have to worry about any of that.
If I understand what you're going for correctly, one caveat: for the last few tops (the ones above the highest known top), linear interpolation can't really be used with much success without manually putting in "weighting" fudge factors. Those fudge factors would have to be adjusted by hand very often (if you are planning on doing what I think you are, but I may be wrong), and to calculate meaningful ones you'd have to know something about the hidden tops' stats anyway.
I think the best simple way to calculate the powerlevels of those last few would be to just make yourself a line whose slope you know is reasonably similar to the real one, such as this: let p0 be the powerlevel of the person at the highest known rank, and let that rank be n. Then a good first approximation for those last few dudes might be something like:
p(rank) = p0*(1+.05*(n-rank)). Currently, this would give results like this:
Rank  Name       Vita     Mana     Powerrank  Estimated powerrank (currently)  Estimated powerrank (if Ceez stopped showing stats)
1     winroute   0        0        0          9041148.45                       9242423.75
2     aubrey     0        0        0          8749498.5                        8978354.5
3     holangi    0        0        0          8457848.55                       8714285.25
4     santa      0        0        0          8166198.6                        8450216
5     SubChess   0        0        0          7874548.65                       8186146.75
6     chinchin   0        0        0          7582898.7                        7922077.5
7     trunk      0        0        0          7291248.75                       7658008.25
8     suwan      0        0        0          6999598.8                        7393939
9     Cristiana  0        0        0          6707948.85                       7129869.75
10    xtroubsx   0        0        0          6416298.9                        6865800.5
11    Sindella   0        0        0          6124648.95                       6601731.25
12    Ceez       2219665  1806667  5832999    5832999                          6337662
13    kEum       0        0        0          0                                6073592.75
14    CheBakY    0        0        0          0                                5809523.5
15    Blason     0        0        0          0                                5545454.25
16    Conro      3266817  1007284  5281385    0                                5281385
I estimated what it would be if Ceez stopped showing to make sure that it still gave sane estimates with a different dataset, which it seems to do.
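If it helps, here is a minimal Python sketch of that formula. The function name estimate_hidden_tops and its slope parameter are just names I made up for illustration; the numbers plugged in are Ceez's and Conro's powerranks from the table, and the 0.05 is the same assumed slope as in the formula above.

```python
def estimate_hidden_tops(p0, n, slope=0.05):
    """Estimate powerlevels for ranks 1..n-1 from the highest known top.

    p0    -- powerlevel of the highest-ranked player whose stats are visible
    n     -- that player's rank
    slope -- assumed fractional gain per rank climbed (the one fudge factor)
    """
    return {rank: p0 * (1 + slope * (n - rank)) for rank in range(1, n)}


# Anchored on Ceez (rank 12, powerrank 5832999).
from_ceez = estimate_hidden_tops(5832999, 12)

# Anchored on Conro (rank 16, powerrank 5281385), as if Ceez were hidden.
from_conro = estimate_hidden_tops(5281385, 16)

# Print both estimates side by side for the ranks above Ceez.
for rank in sorted(from_ceez):
    print(rank, round(from_ceez[rank], 2), round(from_conro[rank], 2))
```

The two printed columns should line up with the two estimate columns in the table for ranks 1 through 11.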
Maybe every year or so the shape of the top part of the curve will change enough to merit a different slope, but that should come up far less often than it would if the line were built from the two lowest data points.
In summary:
Regular linear interpolation: Using two points, construct a line and use that to estimate points near the two you know (works best for points between the two known points). This will probably work wonderfully for everyone below the highest known top, and if it doesn't we can discuss some nonlinear interpolation methods.
"Ghetto" linear interpolation: Constructing a line with the last two known tops and using it to estimate everyone above them is begging for large errors. Consider what happens if the two highest known tops are just about to pass each other: suddenly winroute's estimate is 5m. One month later, when one has gained many stats and the other has been on vacation, the estimate may bounce up to 10m.
Instead, the "Ghetto" method I describe above assumes a reasonable slope and uses that and the highest known point to make a line. This allows it to grow accurately with the power chart, assuming that the average slope of the tops doesn't change much (which it doesn't really, and if it does, it is very slowly, on the order of several months to a year). This gives us just one fudge factor that we only have to adjust manually every few months at the very most.
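To make the difference concrete, here is a rough Python sketch comparing the two approaches. The powerlevels in it are made-up numbers chosen only to mimic the "about to pass each other" and "one month later" situations, not real stats, and both function names are hypothetical.

```python
def two_point_estimate(rank, r_hi, p_hi, r_lo, p_lo):
    """Extend the line through two known tops to another rank.

    (r_hi, p_hi) is the higher-ranked known top, (r_lo, p_lo) the one below.
    """
    per_rank = (p_hi - p_lo) / (r_lo - r_hi)  # powerlevel gained per rank climbed
    return p_hi + per_rank * (r_hi - rank)


def fixed_slope_estimate(rank, n, p0, slope=0.05):
    """Assumed-slope line through the highest known top only."""
    return p0 * (1 + slope * (n - rank))


# Hypothetical data: known tops at ranks 12 and 13.
# Month 1: the two are nearly tied. Month 2: the higher one has pulled ahead.
month1 = (5_000_000, 4_990_000)
month2 = (5_400_000, 5_000_000)

for label, (p12, p13) in (("month 1", month1), ("month 2", month2)):
    print(label,
          "two-point:", two_point_estimate(1, 12, p12, 13, p13),
          "fixed slope:", round(fixed_slope_estimate(1, 12, p12), 2))
```

With these made-up inputs the two-point estimate for rank 1 nearly doubles between the two months, while the fixed-slope estimate only moves in proportion to the highest known top.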