
Probability and Statistics Basic Probability and Statistics Math Forum 
 LinkBack  Thread Tools  Display Modes 
February 8th, 2018, 12:22 PM  #1 
Newbie Joined: Oct 2013 Posts: 10 Thanks: 0  How derive formula for database disk cache?
We have database which is partially in memory. If DB were all in memory then we assume costs=0  we calculate only disk>memory costs. Computer must not whole memory allocate for database. Let is 1 GB in memory (or 12 GB), whole database have for example 2 GB or 200 GB or 5 TB. One record has 100160 bytes. Random time reading from disk=14 ms, average speed 120 MB/s. If we read record which is not in memory, we must read whole block. (Blocks have size for example 32 KB, 512 KB, 10 MB or even 100 MB – this size is also parameter of formula). Fortunately, we have possibility of many records in one query: 50 or 1000 – there records can be grouped and reading one block give us many records. How is best strategy? How is formula of number accessed to disk and number bytes reading from disk, depend on: size in memory, size whole, size one record,size block and number records in one query? 
February 8th, 2018, 01:08 PM  #2 
Senior Member Joined: Sep 2015 From: USA Posts: 1,758 Thanks: 900 
This analysis sounds like it is the better part of a senior project. Perhaps you can give us some more details about how you happened to come by this problem. Some of us are happy to help students, to a point, but disinclined to assist professionals. 
February 8th, 2018, 01:33 PM  #3 
Newbie Joined: Oct 2013 Posts: 10 Thanks: 0 
M: memory size D: database size r: record size B: block size m : number records in one query Assume nontrivial D>M Let come next record. Probability that is not in memory is (DM)/D then we read one block B bytes. Removing from memory B/r records. If 10% were cached is 90% chance that we must read one block to read record?? Records can be random, but sometimes we want read succeeding records. For example: we read index, index can be large like table and can be not fit whole in memory. This query cached succeeded records, but if we want add random records with unique keys, block must be small as possible? 

Tags 
cache, database, derive, disk, disk cache, formula 
Thread Tools  
Display Modes  

Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Learn How To Derive The Trapezium Rule Formula From Absolute Scratch...  perfect_world  Calculus  0  May 16th, 2014 11:00 AM 
Derive Tangent add/sub formula. Double angle  yogazen2013  Trigonometry  1  September 6th, 2013 07:07 AM 
Looking for a database of constants  Yooklid  Real Analysis  9  May 16th, 2010 03:38 PM 
Help me derive the formula  riou11  Algebra  1  May 27th, 2009 05:54 AM 
How to derive reduction formula?  conomeara  Calculus  1  April 3rd, 2007 05:41 PM 