July 13th, 2016, 03:46 AM  #1 
Newbie Joined: Jul 2016 From: England Posts: 2 Thanks: 0  Data Mining
I have a complex data mining task involving dog racing. The information is as follows. Every race has 6 dogs running, and all of them always finish the race. The race time is always 45 seconds, and the total time between 2 races is 241 seconds. [This means time has no effect on the data, since it is always the same.] The odds for each dog are fixed for the duration of a race; they don't change. I have the winners and second places for a period of 3 months (historical data). The ranks of the other dogs are not shown. Regarding the odds, I have noticed the following: the odds for winners are in the range 2.00 to 10.60, and the odds for the combination winner + second place are in the range 10.50 to 134.00. In total there are 995 possible combinations of odds for all dogs (each of these 995 possibilities contains 36 odds, 6x6, because it covers the winner and the second place). How can I predict the winner and the second place based on this historical data?
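As a starting point for the question above, the win odds can be converted into implied probabilities. The sketch below assumes the quoted figures (2.00 to 10.60) are decimal odds, which the post does not state, and the six example values are made up for illustration; `implied_probabilities` is a hypothetical helper name.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal win odds into implied probabilities.

    Returns (raw, overround, normalised):
      raw        - 1/odds for each dog
      overround  - sum of the raw values (above 1 means a bookmaker margin)
      normalised - raw values rescaled so they sum to 1
    """
    raw = [1.0 / o for o in decimal_odds]
    overround = sum(raw)
    normalised = [p / overround for p in raw]
    return raw, overround, normalised


# Hypothetical win odds for dogs 1..6 (not taken from the real data set).
example_odds = [2.00, 4.50, 5.00, 6.00, 8.00, 10.60]
raw, overround, fair = implied_probabilities(example_odds)

for dog, p in enumerate(fair, start=1):
    print(f"dog {dog}: implied win probability {p:.3f}")
print(f"overround: {overround:.3f}")
```

If the overround is well above 1 for every row of odds, the prices include a margin, and any prediction method would have to beat the normalised probabilities by more than that margin to be useful.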
July 13th, 2016, 06:35 AM  #2 
Senior Member Joined: Apr 2014 From: UK Posts: 895 Thanks: 328 
Do you have the weather details at the track for each race and the hardness of the ground? Dog weights? I would suggest that the odds are irrelevant to the question, otherwise you'd just pick the 1st and 2nd place based on the lowest and 2nd lowest odds. All the historical odds would tell you is when the bookies got the places right. I'd be looking at which dog beat which and under what conditions. I'd also look at the individual finishing times and 'predict' their times based on the conditions at the time of the race. Do the dogs race more than once? I don't know where your 995 combinations come from, but I can say the 1/36 odds is incorrect: there is a 1/6 chance for first place and a 1/5 chance for second place, which gives odds of 1 in 30.
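The count of 30 and the "lowest odds" baseline mentioned in this reply can be checked with a few lines of Python. This is only a sketch: `favourite_pick` is a hypothetical helper, and it assumes the win odds for one race are available as a mapping from dog number to decimal odds.

```python
from itertools import permutations

DOGS = range(1, 7)

# Every ordered (winner, second place) pair of two different dogs.
ordered_pairs = list(permutations(DOGS, 2))
print(len(ordered_pairs))  # 6 * 5 = 30


def favourite_pick(win_odds):
    """Baseline prediction: the dogs with the lowest and second-lowest odds.

    win_odds: dict mapping dog number -> decimal win odds (assumed format).
    """
    ranked = sorted(win_odds, key=win_odds.get)
    return ranked[0], ranked[1]
```

Comparing how often `favourite_pick` matches the recorded winner and second place gives a reference hit rate that any other method would need to exceed.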
July 14th, 2016, 11:32 AM  #3 
Newbie Joined: Jul 2016 From: England Posts: 2 Thanks: 0 
Hi Weiddave, thanks for your reply. I was wrong about that 36; you are right, it is 30. In total there are 995*30=29850 different possible races. Regarding the 995 rows: I ran a search based on the odds, and across a total of 65000 races there are just 995 different rows. (Even if you search in around 5,000 consecutive entries, you see that there are 995 different rows.) So the odds are repeated, but the winners and second places change. There is no dog weight, no finish time, no ground condition, no race name and no dog name, just the dog numbers 1,2,3,4,5,6. (The dimensions are: odds, dog nr, race_nr, race_time, first place and second place.) It looks to me like there is a fixed set of odds (995 rows) in total, and only the winners and second places change, based on a specific algorithm that guarantees the bookmaker wins. It looks like the data are fake and not real, just produced to simulate a dog race for a bookmaker. Have a look at the odds data (attached image) to get a better idea.
D11 > Dog 1 odds to win
D12 > Dog 1 to win and Dog 2 second place odds
D22 > Dog 2 odds to win
D24 > Dog 2 to win and Dog 4 second place odds
......
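One way to probe the suspicion that the results are generated from the odds is to group the races by odds row and compare each dog's observed win frequency with the probability implied by its odds. The sketch below is an assumption-laden illustration: it treats the odds as decimal, represents each race as a hypothetical `(odds_row, winner)` pair, and the names `wins_by_odds_row` and `observed_vs_implied` are invented here.

```python
from collections import Counter, defaultdict


def wins_by_odds_row(races):
    """Group winners by odds row.

    races: iterable of (odds_row, winner), where odds_row holds the six
    decimal win odds for dogs 1..6 and winner is a dog number 1..6.
    """
    grouped = defaultdict(Counter)
    for odds_row, winner in races:
        grouped[tuple(odds_row)][winner] += 1
    return grouped


def observed_vs_implied(grouped, odds_row):
    """Return (dog, observed win frequency, normalised implied probability)."""
    counts = grouped.get(tuple(odds_row), Counter())
    total = sum(counts.values())
    if total == 0:
        return []
    raw = [1.0 / o for o in odds_row]
    norm = sum(raw)
    return [(dog, counts[dog] / total, raw[dog - 1] / norm)
            for dog in range(1, 7)]
```

With the real history, `len(wins_by_odds_row(races))` should reproduce the count of distinct rows (995 in the post). If the observed frequencies track the implied probabilities closely for every row, that would be consistent with results drawn from the odds, in which case the history offers little beyond the odds themselves.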
August 30th, 2016, 11:43 PM  #4 
Newbie Joined: Aug 2016 From: London Posts: 2 Thanks: 0 
Thank you so much for the clarification.

September 23rd, 2016, 04:22 AM  #5 
Newbie Joined: Aug 2016 From: sydney Posts: 9 Thanks: 0 
Hi, if you are looking for serious help with your topic, you can use an online assignment writing service.

