
 nekdolan December 7th, 2014 10:50 AM

What is the best fit for this data

Hi

I have one data set and I am looking for a function that fits it well. Preferably it should be something I can reproduce in Excel and fit with Solver. The function can have several constants, but not too many, and it can include IF conditions or similar constructs. I am looking for the best possible fit, and I am also interested in any limit of the function as n -> inf.
I tried what I could, but I could not get a good fit. :cry:

My data can be found here:
https://dl.dropboxusercontent.com/u/.../data_long.csv

There's also a shorter and more accurate but less complete version, here:
https://dl.dropboxusercontent.com/u/...data_short.csv
In the latter data set, every row that is not included takes the value of the closest listed row with a smaller row number.

So for example:
GEN;FIT
56;0,064144345
127;0,06656979
This means that GEN 57 through 126 all have the value 0,064144345.
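The fill rule described above can be sketched in Python as follows. This is only an illustration: the function names `parse_short` and `expand` are made up for this example, and the sample text is just the two rows quoted in the post, not the real file.

```python
def parse_short(text):
    """Parse 'GEN;FIT' rows that use a decimal comma into (gen, fit) pairs."""
    rows = []
    for line in text.strip().splitlines():
        gen, fit = line.split(";")
        if gen.strip().upper() == "GEN":  # skip the header row
            continue
        rows.append((int(gen), float(fit.replace(",", "."))))
    return sorted(rows)


def expand(rows):
    """Forward-fill: each missing GEN takes the value of the previous listed row."""
    if not rows:
        return []
    full = []
    idx = 0
    current = rows[0][1]
    for gen in range(rows[0][0], rows[-1][0] + 1):
        # Advance to the latest listed row whose GEN is not greater than gen.
        while idx < len(rows) and rows[idx][0] <= gen:
            current = rows[idx][1]
            idx += 1
        full.append((gen, current))
    return full


sample = "GEN;FIT\n56;0,064144345\n127;0,06656979"
full = expand(parse_short(sample))
```

With the two sample rows, `full` covers GEN 56 through 127, and every generation from 56 to 126 carries the value of row 56.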

Any help would be much appreciated ;)

 nekdolan December 8th, 2014 04:09 AM

I made a graph with the data:
https://dl.dropboxusercontent.com/u/54506834/data.png

 Benit13 December 8th, 2014 05:02 AM

Can you try plotting it on a log scale? It looks like there are lots of exponential decays stuck together.

 nekdolan December 8th, 2014 12:06 PM

Sure:
https://dl.dropboxusercontent.com/u/54506834/data2.png

I used this data for the graph. I duplicated every value so that it would be more faithful to the original dataset:
https://dl.dropboxusercontent.com/u/..._data_full.csv

 topsquark December 8th, 2014 12:45 PM

1 Attachment(s)
It looks a bit like this.

-Dan

 CaptainBlack December 8th, 2014 08:27 PM

Quote:
 Originally Posted by nekdolan (Post 214886) Hi I have one data set and I am looking for a function that could fit my data well. Preferably I am looking for something I can reproduce in excel and solve with solver.
You should not be asking someone to find a best/good fit the way you are. This is not arbitrary data, so you should have at least some idea of the form the fitting function should take. If you don't have any idea, tell us where this data comes from so we can form our own opinion on what form the fitting function should take.

Failing that, a low-order Padé approximant should produce a very good fit, but I would not recommend this procedure. You need some theoretical or heuristic basis for the fitting function.

CB

 nekdolan December 8th, 2014 11:45 PM

This is the result of an algorithm running for 10 hours. The algorithm starts with a square polygon and randomly adds and removes points. The new polygon is then tested to see whether it is rounder (more like a circle) than the previous one. If it is, the new polygon is kept; if not, it is thrown out and the old one is used again. The horizontal lines exist because sometimes no new points are found for a while. These lines will keep growing longer, since the rounder the polygon is, the less likely it is that a new point can be integrated into it.

Here is the algorithm written in (node)js:
https://dl.dropboxusercontent.com/u/...node_script.js
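For readers who cannot open the script, the procedure described above can be sketched roughly like this in Python. This is not the linked script. The roundness measure (the isoperimetric quotient 4*pi*A/P^2, which is 1 for a circle), the 50/50 choice between adding and removing a point, and the coordinate range are all assumptions made for the sketch, so the values it produces are not on the same scale as the FIT column in the data.

```python
import math
import random


def roundness(points):
    """Assumed measure: isoperimetric quotient 4*pi*A / P**2 (1.0 for a circle)."""
    twice_area = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1          # shoelace formula
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return 4 * math.pi * (abs(twice_area) / 2) / perimeter ** 2


def mutate(points, rng):
    """Randomly remove a vertex (if more than three remain) or insert a new one."""
    new = list(points)
    if len(new) > 3 and rng.random() < 0.5:
        del new[rng.randrange(len(new))]
    else:
        new.insert(rng.randrange(len(new) + 1),
                   (rng.uniform(-1, 1), rng.uniform(-1, 1)))
    return new


def evolve(generations, seed=0):
    """Keep a mutated polygon only if it is rounder; record the best value per generation."""
    rng = random.Random(seed)
    best = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]  # start from a square
    best_fit = roundness(best)
    history = []
    for _ in range(generations):
        candidate = mutate(best, rng)
        fit = roundness(candidate)
        if fit > best_fit:
            best, best_fit = candidate, fit
        history.append(best_fit)  # repeats while no improvement is found
    return history
```

Because rejected candidates leave the best value unchanged, `history` is a non-decreasing step function, which is the same kind of plateau structure as the horizontal lines in the data.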

Hope this helps

 CaptainBlack December 9th, 2014 08:00 AM

Quote:
 Originally Posted by nekdolan (Post 215056) This is the result of an algorithm running for 10 hours. The algorithm starts with a square polygon and randomly adds and removes points. The new polygon is then tested to see whether it is rounder (more like a circle) than the previous one. If it is, the new polygon is kept; if not, it is thrown out and the old one is used again. The horizontal lines exist because sometimes no new points are found for a while. These lines will keep growing longer, since the rounder the polygon is, the less likely it is that a new point can be integrated into it. Here is the algorithm written in (node)js: https://dl.dropboxusercontent.com/u/...node_script.js Hope this helps
I would go with a rational function model starting with the simple:

$$Fit(n)=\frac{1+\alpha n}{\beta + \gamma n}$$

or maybe:

$$Fit(n)=\frac{1+\alpha \log(n)}{\beta + \gamma \log(n)}$$

and only go for something of higher order if the fit is inadequate.
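As a rough illustration of how the logarithmic form could be fitted, here is a minimal Python sketch. It does not minimise the squared error of Fit(n) directly. Instead it multiplies the model out to -alpha*x + beta*y + gamma*x*y = 1 with x = log(n), which is linear in the three constants, and solves that by ordinary least squares. The function names are made up for the example, and the linearised estimate weights the errors differently from a direct fit, so it is probably best treated as a starting point for Excel Solver.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):  # back substitution
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def fit_rational_log(ns, ys):
    """Estimate (alpha, beta, gamma) for y = (1 + alpha*x) / (beta + gamma*x), x = log(n)."""
    # Each data point gives one row of the linear system [-x, y, x*y] . [alpha, beta, gamma] = 1.
    rows = [(-math.log(n), y, math.log(n) * y) for n, y in zip(ns, ys)]
    # Normal equations (A^T A) p = A^T 1.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] for r in rows) for i in range(3)]
    return solve3(ata, atb)


def model(n, alpha, beta, gamma):
    """Evaluate the logarithmic rational model at generation n."""
    x = math.log(n)
    return (1 + alpha * x) / (beta + gamma * x)
```

For either form, provided gamma is not zero, the fitted function tends to alpha/gamma as n goes to infinity, which gives the limit asked about in the first post.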

CB

 nekdolan December 10th, 2014 01:36 AM

I've gotten a 74% fit with the latter function, and over 80% if I change the 1 into a constant.

https://dl.dropboxusercontent.com/u/54506834/fit.png
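The post does not say how the fit percentage was computed. Assuming it is the coefficient of determination R^2 (an assumption, not something stated in the thread), it could be calculated like this; the function name is made up for the example.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

A value of 1 means the model reproduces the data exactly, and a value of 0 means it does no better than always predicting the mean of the data.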

So where would I go from here? I tried a few things but I had little to no luck.
It seems there is a periodic element in the data that is missing from the function, but I'm not sure how I could add one to it.

 All times are GMT -8.