My Math Forum  

College Math Forum > Advanced Statistics

January 5th, 2017, 07:47 AM   #1
Newbie
 
Joined: Jan 2015
From: hong kong

Posts: 3
Thanks: 0

Predictive modelling and understanding how important your last data point is

Hi all.

I am using this as a simple example:

There is a fairly basic computer game called Mr Jump.
The aim of the game is to complete the level by finishing the course.
Your performance is measured as a percentage: if you complete the level you reach 100%, and if you “die” a third of the way through you have completed 33%.
You complete the level by jumping at the correct times to negotiate an obstacle course that is always the same. Obstacles include walls to jump over, big jumps, short jumps, series of successive jumps, etc.

Each time you play the game you learn something new, so the more you play the more likely you are to complete the course. However, there are times when you play and due to a mis-click or a loss of concentration you can easily “die” early.


In the first 25 attempts, your % performance is as follows:
11
16
28
13
29
30
20
21
19
62
35
40
7
45
28
40
50
14
63
38
42
15
42
55
51

As you can see from the numbers, you are slowly improving.


I have the following questions, and would like to know the appropriate maths to apply to answer them:

1. Predicting. How long will it take (statistically) to beat the game?

2. Weighting. How much do you learn each time you play? How much more important is the 25th performance compared with the 24th? And how much more important is the 24th compared with the 23rd, and so on?

Any info would be greatly appreciated.
Thanks
Marco
StooCats is offline  
 
January 5th, 2017, 06:37 PM   #2
Senior Member
 
Joined: Oct 2013
From: New York, USA

Posts: 522
Thanks: 74

I'm not guaranteeing that a least squares regression line produces the best answer, but the regression equation is % performance = 0.2347998797*attempt number + 5.354915917. To the nearest whole number, it would take 403 attempts to beat the game. Note that a regression equation would predict impossible values over 100 if there were enough attempts. For example, the equation would predict 240% on the 1,000th attempt.
EvanJ is offline  
January 8th, 2017, 08:46 AM   #3
Newbie
 
Joined: Jan 2015
From: hong kong

Posts: 3
Thanks: 0

Hi

Can I please ask.

What type of regression did you use there?
Using a simple linear regression, I am getting 100% on the 72nd attempt with:
y = 1.1531x + 17.57
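
For anyone who wants to check the two sets of coefficients in this thread, here is a rough Python sketch. The helper name `least_squares` and the variable names are my own, not from any library. It fits performance on attempt number, and then fits the same data with the two variables swapped. The swapped fit comes out at the 0.2348 and 5.3549 quoted in post #2, which suggests (I can't confirm it) that the axes were exchanged there.

```python
# Attempt numbers 1..25 and the % performance values listed in the first post.
scores = [11, 16, 28, 13, 29, 30, 20, 21, 19, 62, 35, 40, 7,
          45, 28, 40, 50, 14, 63, 38, 42, 15, 42, 55, 51]
attempts = list(range(1, len(scores) + 1))


def least_squares(x, y):
    """Return (slope, intercept) of the ordinary least squares line y = slope*x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Performance regressed on attempt number (the fit quoted in this post).
slope, intercept = least_squares(attempts, scores)
attempts_to_100 = (100 - intercept) / slope  # where the fitted line crosses 100%

# Attempt number regressed on performance (the two variables swapped).
rev_slope, rev_intercept = least_squares(scores, attempts)

print(f"performance = {slope:.4f} * attempt + {intercept:.2f}")
print(f"fitted line reaches 100% at attempt {attempts_to_100:.2f}")
print(f"attempt = {rev_slope:.4f} * performance + {rev_intercept:.4f}")
```

The crossing point is between 71 and 72, so the 72nd attempt is the first whole attempt at which the fitted line is at or above 100%.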
StooCats is offline  
January 31st, 2017, 08:51 AM   #4
Senior Member
 
Joined: Dec 2012
From: Hong Kong

Posts: 807
Thanks: 290

Math Focus: Stochastic processes, statistical inference, data mining, computational linguistics
Quote:
Originally Posted by StooCats View Post
Hi

Can I please ask.

What type of regression did you use there?
Using a simple linear regression, I am getting 100% on the 72nd attempt with:
y = 1.1531x + 17.57
I got the same result. I don't know if this is still helpful (you might have beaten the game already?), but a 95% prediction interval for the attempt number would be (55.62466861, 87.34931404).
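
As a related illustration, here is a minimal Python sketch of the textbook 95% prediction interval for the performance on a single future attempt under the straight-line fit. It does not try to reproduce the interval quoted above, since the method behind that one isn't stated. The function name `prediction_interval` is my own, and the t critical value is hard-coded from standard tables on the assumption that there are exactly 25 data points (23 degrees of freedom).

```python
import math

scores = [11, 16, 28, 13, 29, 30, 20, 21, 19, 62, 35, 40, 7,
          45, 28, 40, 50, 14, 63, 38, 42, 15, 42, 55, 51]
n = len(scores)
attempts = list(range(1, n + 1))

# Ordinary least squares fit of performance on attempt number.
x_mean = sum(attempts) / n
y_mean = sum(scores) / n
sxx = sum((x - x_mean) ** 2 for x in attempts)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(attempts, scores))
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Residual standard error with n - 2 degrees of freedom.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(attempts, scores))
s = math.sqrt(sse / (n - 2))

# Two-sided 95% critical value of Student's t with 23 degrees of freedom,
# hard-coded from standard tables (assumption: n stays at 25).
T_CRIT_23 = 2.0687


def prediction_interval(x_new):
    """Return the (low, high) 95% prediction interval for performance on attempt x_new."""
    centre = intercept + slope * x_new
    half_width = T_CRIT_23 * s * math.sqrt(1 + 1 / n + (x_new - x_mean) ** 2 / sxx)
    return centre - half_width, centre + half_width


low, high = prediction_interval(26)
print(f"attempt 26: predicted {intercept + slope * 26:.1f}%, "
      f"95% prediction interval ({low:.1f}%, {high:.1f}%)")
```

The interval is wide because the scores scatter a lot around the line, and nothing in this formula stops it from running below 0% or above 100%, so the ends should be read with that in mind.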
123qwerty is online now  
January 31st, 2017, 08:54 AM   #5
Senior Member
 
Joined: Dec 2012
From: Hong Kong

Posts: 807
Thanks: 290

Math Focus: Stochastic processes, statistical inference, data mining, computational linguistics
Quote:
Originally Posted by StooCats View Post
2. Weighting. How much do you learn each time you play? How much more important is the 25th performance compared with the 24th? And how much more important is the 24th compared with the 23rd, and so on?
This doesn't directly answer your question, but you might want to look into studentised residuals, DFFITS, DFBETAS and Cook's distance, which measure how influential a data point is on your model.
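
To make that a bit more concrete, here is a rough Python sketch that computes leverage, internally studentised residuals and Cook's distance for the straight-line fit of performance on attempt number, using the standard formulas for simple linear regression. The variable names are my own, and DFFITS and DFBETAS are left out to keep it short.

```python
import math

scores = [11, 16, 28, 13, 29, 30, 20, 21, 19, 62, 35, 40, 7,
          45, 28, 40, 50, 14, 63, 38, 42, 15, 42, 55, 51]
n = len(scores)
attempts = list(range(1, n + 1))
p = 2  # number of fitted parameters: slope and intercept

# Ordinary least squares fit of performance on attempt number.
x_mean = sum(attempts) / n
y_mean = sum(scores) / n
sxx = sum((x - x_mean) ** 2 for x in attempts)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(attempts, scores))
slope = sxy / sxx
intercept = y_mean - slope * x_mean

residuals = [y - (intercept + slope * x) for x, y in zip(attempts, scores)]
s2 = sum(e * e for e in residuals) / (n - p)  # residual variance estimate

rows = []
for x, e in zip(attempts, residuals):
    # Leverage depends only on how far the attempt is from the middle attempt.
    leverage = 1 / n + (x - x_mean) ** 2 / sxx
    # Internally studentised residual: residual scaled by its estimated std. dev.
    student = e / math.sqrt(s2 * (1 - leverage))
    # Cook's distance combines residual size and leverage.
    cooks_d = student ** 2 * leverage / (p * (1 - leverage))
    rows.append((x, leverage, student, cooks_d))

# Show the five attempts with the largest Cook's distance.
for x, h, r, d in sorted(rows, key=lambda row: row[3], reverse=True)[:5]:
    print(f"attempt {x:2d}: leverage {h:.3f}, studentised residual {r:+.2f}, Cook's D {d:.3f}")
```

One thing this shows is that in a plain straight-line fit the 1st and 25th attempts have the same leverage, so the model itself does not treat later attempts as more important than earlier ones. These measures describe how much each point pulls on the fitted line, which is related to, but not the same as, how much is learned on each play.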

Last edited by 123qwerty; January 31st, 2017 at 08:58 AM.
123qwerty is online now  
Copyright © 2017 My Math Forum. All rights reserved.