 My Math Forum How to calculate simple linear model for noisy data?

 Advanced Statistics Advanced Probability and Statistics Math Forum

July 16th, 2010, 04:07 AM #1
Newbie   Joined: Jul 2010   Posts: 5   Thanks: 0

How to calculate simple linear model for noisy data?

I have data on the number of hits per day for a website. The data is very noisy. I've seen someone calculate a stepwise linear model in order to simplify the graph and highlight significant changes, like this: [image not preserved]. What do you think is the right approach / method / tool to generate the linear model (the red line in the image)?

PS: I posted in this forum because I assume statistics is involved in the process.

July 16th, 2010, 05:13 AM #2
Global Moderator   Joined: Nov 2006   From: UTC -5   Posts: 16,046   Thanks: 938
Math Focus: Number theory, computational mathematics, combinatorics, FOM, symbolic logic, TCS, algorithms

Re: How to calculate simple linear model for noisy data?

I think a better method would be to plot a best-fit line through the data. Excel can do that, as can (for example) R.

July 18th, 2010, 01:28 AM #3
Newbie   Joined: Jul 2010   Posts: 5   Thanks: 0

Re: How to calculate simple linear model for noisy data?

The stepwise approach can be accommodated, but you need to make some assumptions to get it done.

One set of assumptions: the steps occur at set breaks in time. With this assumption you only need to do a constant regression on each of the subsets of data (which just means using the mean to estimate the data on each of the subintervals). However, this is not satisfying.

Since your data is a time series, there is a model which I believe is more appropriate for piecewise constant regression. Use an average of the last N data points. This will smooth the data, and jumps in the average will become more apparent. Use these jumps to determine where the breaks between the piecewise approximations should occur, and then use the average on each interval to perform the regression there.

Note that this second formulation is NOT a linear regression, because the fit does not depend linearly on the data (the jump points move with the data).
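
A minimal Python sketch of the moving-average procedure described in post #3 follows. It is one possible reading, not the poster's own code. The rule used to locate a jump (comparing the average of the previous N points with the average of the next N points and thresholding the difference) is an assumption, as are the names `jump_sizes`, `find_breaks`, `piecewise_constant_fit`, the window of 7 days, the threshold of 20 hits, and the synthetic `hits` series.

```python
from statistics import mean


def jump_sizes(values, window):
    """Absolute difference between the mean of the next `window` points
    and the mean of the previous `window` points, at each index."""
    n = len(values)
    jumps = [0.0] * n
    for i in range(window, n - window + 1):
        left = mean(values[i - window:i])    # average of the last N points before i
        right = mean(values[i:i + window])   # average of the N points starting at i
        jumps[i] = abs(right - left)
    return jumps


def find_breaks(values, window=7, threshold=20.0):
    """Return one break index per run of indices whose jump exceeds `threshold`
    (the index with the largest jump in that run)."""
    jumps = jump_sizes(values, window)
    breaks, run = [], []
    for i, size in enumerate(jumps):
        if size > threshold:
            run.append(i)
        elif run:
            breaks.append(max(run, key=lambda k: jumps[k]))
            run = []
    if run:
        breaks.append(max(run, key=lambda k: jumps[k]))
    return breaks


def piecewise_constant_fit(values, breaks):
    """Replace every segment between consecutive breaks by its mean."""
    edges = [0] + list(breaks) + [len(values)]
    fitted = []
    for start, end in zip(edges, edges[1:]):
        fitted.extend([mean(values[start:end])] * (end - start))
    return fitted


# Synthetic example (made-up data): 21 days around 100 hits, then 21 days around 160.
noise = [3, -2, 5, -4, 1, -3, 0]
hits = [(100 if day < 21 else 160) + noise[day % 7] for day in range(42)]

breaks = find_breaks(hits, window=7, threshold=20.0)
fitted = piecewise_constant_fit(hits, breaks)
print(breaks)
print(fitted[0], fitted[-1])
```

The threshold trades sensitivity for false alarms: a smaller value reports more breaks, and a sensible choice depends on how noisy the daily counts are.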
July 18th, 2010, 04:20 AM #4
Newbie   Joined: Jul 2010   Posts: 5   Thanks: 0

Re: How to calculate simple linear model for noisy data?

Thank you for your great post. I will definitely do some reading on piecewise constant regression.

I agree with you that in order to use the very simplified linear model it must be assumed that "the mean of the data basically stays the same and only changes from time to time". The core problem here, of course, is to find where these breaks happen (it will later be used in 'news'-like fashion, notifying the user about these breaks). I guess I could apply a threshold, since "jumps in the average will become more apparent", or is there a better way?

By the way, on the slide where I saw this, the author also wrote down the confidence level for each break. How might that fit into the model?

July 22nd, 2010, 05:54 AM #5
Newbie   Joined: May 2009   Posts: 25   Thanks: 0

Re: How to calculate simple linear model for noisy data?

Do you expect any seasonality in your data? For example, is Saturday's volume likely to be greater than Monday's each week? If so, it is wrong to put a trend line through your raw data; instead, you should seasonally adjust the data first. A simple way of doing that is to create a 7-day rolling average (7-day if you think your data is weekly seasonal) and then fit a linear model to that.

By the way, why do you expect the underlying trend in your data to be linear? Maybe the data reflects the amount of promotion you are getting. If that promotion is done in bursts, then I might expect a saw-tooth shape in your underlying data.
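
The suggestion in post #5 (smooth with a 7-day rolling average, then fit a straight line) could be sketched in Python roughly as below. This is an illustration under stated assumptions, not code from the thread: the helper names `rolling_mean` and `fit_line`, the choice to place each average at the centre day of its window, and the synthetic `hits` series with a weekly pattern are all made up for the example. The line fit is ordinary least squares.

```python
from statistics import mean


def rolling_mean(values, window=7):
    """Mean of every full run of `window` consecutive values (no padding)."""
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Synthetic example (made-up data): a trend of +2 hits per day plus a weekly pattern.
weekly = [-30, -20, -10, 0, 10, 20, 30]
hits = [200 + 2 * day + weekly[day % 7] for day in range(28)]

smoothed = rolling_mean(hits, window=7)
# Assumption: each average is assigned to the centre day of its 7-day window.
days = [i + 3 for i in range(len(smoothed))]

slope, intercept = fit_line(days, smoothed)
print(slope, intercept)
```

Because each 7-day window contains every weekday exactly once, the weekly pattern averages out and the fitted slope reflects only the underlying trend. If the trend is not linear (for example, the saw-tooth shape mentioned above), a single straight line will fit poorly regardless of the smoothing.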
