My Math Forum  

February 8th, 2018, 12:22 PM   #1

How to derive a formula for a database disk cache?

We have a database that is only partially in memory. If the whole DB were in memory, we would take the cost to be 0; we count only disk-to-memory costs.
Not all of the computer's memory is allocated to the database. Say 1 GB (or 12 GB) is in memory, while the whole database is, for example, 2 GB, 200 GB or 5 TB.
One record is 100-160 bytes.
A random disk read takes 14 ms, and the average transfer speed is 120 MB/s.
If we read a record that is not in memory, we must read its whole block. (Blocks can be, for example, 32 KB, 512 KB, 10 MB or even 100 MB; the block size is also a parameter of the formula.)
Fortunately, one query can ask for many records, 50 or 1000; these records can be grouped, so reading one block can give us many records.
What is the best strategy?
What is the formula for the number of disk accesses and the number of bytes read from disk, as a function of the size in memory, the total size, the record size, the block size and the number of records in one query?
Borneq is offline  
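
The disk figures in the post above (14 ms per random read, 120 MB/s transfer) suggest a simple per-block cost: one random access plus the time to transfer the whole block. The sketch below is only an illustration under that assumption; the function name `block_read_seconds` is hypothetical, and 1 MB is taken as 10^6 bytes, which the post does not specify.

```python
# Figures taken from the post; the cost model itself is an assumption.
SEEK_SECONDS = 0.014          # 14 ms per random disk access
BYTES_PER_SECOND = 120e6      # 120 MB/s, assuming 1 MB = 10**6 bytes


def block_read_seconds(block_bytes: float) -> float:
    """Time to fetch one uncached block: one random access plus transfer."""
    return SEEK_SECONDS + block_bytes / BYTES_PER_SECOND


if __name__ == "__main__":
    # Block sizes mentioned in the post (decimal units assumed).
    sizes = [("32 KB", 32e3), ("512 KB", 512e3), ("10 MB", 10e6), ("100 MB", 100e6)]
    for label, size in sizes:
        print(f"{label:>7}: {block_read_seconds(size) * 1000:.2f} ms per block")
```

Under this model the access time and the transfer time are equal at a block size of 0.014 s x 120 MB/s = 1.68 MB. Below that size the 14 ms access dominates the cost of a read; above it the transfer does.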
February 8th, 2018, 01:08 PM   #2

This analysis sounds like it is the better part of a senior project.

Perhaps you can give us some more details about how you happened to come by this problem. Some of us are happy to help students, to a point, but disinclined to assist professionals.
romsek is online now  
February 8th, 2018, 01:33 PM   #3

M: memory size
D: database size
r: record size
B: block size
m: number of records in one query
Assume the nontrivial case D > M.
Suppose the next record is requested. The probability that it is not in memory is (D-M)/D;
in that case we read one block of B bytes, which evicts B/r records from memory.
So if 10% of the database is cached, is there a 90% chance that we must read one block to get a record?
Records can be random, but sometimes we want to read consecutive records.
For example, we read an index; an index can be as large as a table and may not fit entirely in memory.
Such a query benefits from caching the following records, but if we want to add random records with unique keys, should the block be as small as possible?
Borneq is offline  
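
The model in the post above can be turned into a rough estimate. The following is a minimal sketch, not an answer given in the thread, and it rests on assumptions the thread does not confirm: the m records of a query are drawn independently and uniformly at random, and the cache holds a random M/D fraction of the blocks. With those assumptions the miss probability is (D-M)/D, as in the post, and the number of distinct blocks touched by a query follows the standard occupancy formula N(1 - (1 - 1/N)^m) with N = D/B blocks. The function names are hypothetical; the record size r does not appear because under uniform access only the block a record falls in matters.

```python
def miss_probability(M: float, D: float) -> float:
    """Chance that a uniformly random record is not in memory: (D - M) / D."""
    return max(0.0, (D - M) / D)


def expected_block_reads(M: float, D: float, B: float, m: int) -> float:
    """Expected disk accesses for one query of m uniformly random records.

    Assumes independent uniform record choices and a cache holding a
    random M/D fraction of all blocks (both are modelling assumptions).
    """
    n_blocks = max(1.0, D / B)
    # Occupancy formula: expected number of distinct blocks hit by m records.
    distinct_blocks = n_blocks * (1.0 - (1.0 - 1.0 / n_blocks) ** m)
    # Only the uncached share of those blocks has to come from disk.
    return distinct_blocks * miss_probability(M, D)


def expected_bytes_read(M: float, D: float, B: float, m: int) -> float:
    """Every disk access transfers one whole block of B bytes."""
    return expected_block_reads(M, D, B, m) * B


if __name__ == "__main__":
    GB = 1e9
    for block in (32e3, 512e3, 10e6):
        reads = expected_block_reads(M=1 * GB, D=200 * GB, B=block, m=1000)
        print(f"B = {block:>10.0f} bytes: about {reads:.1f} block reads per query")
```

For m = 1 the estimate reduces to the miss probability itself, so a database that is 10% cached gives the 90% figure asked about in the post. Consecutive records, such as an index scan, break the uniformity assumption: they share blocks, so larger blocks help there, while purely random access pays for B bytes per miss and favours smaller blocks.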
