My Math Forum  

April 5th, 2019, 01:21 AM   #1
Joined: Apr 2019
From: London

Posts: 1
Thanks: 0

Comparing Distributions of Datasets

I've been asked to find the best method for comparing the distributions of a number of datasets that have small sample sizes. Bonus points for a solution whose result is on a scale of 0-1, i.e. a value approaching 1 means the dataset is bordering on perfectly unequal and a value approaching 0 means it is bordering on perfectly equal.

Some examples within this dataset include:
  • Sample A: [10,1]
  • Sample B: [10,1,1]
  • Sample C: [4,4,3,2,2]
In other words, the method used should show A to have a value close to 1 (almost perfectly unevenly distributed), B to be close to 1 but further from it than A, and C to be closer to 0 (quite an equal distribution).

I first thought of the Gini coefficient, which is precisely about how evenly values are distributed and gives values between 0 and 1. However, it seems the Gini has a 'small-sample bias' that limits its use here, where each dataset has between 1 and c. 10 values.
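To make the small-sample issue concrete, here is a minimal sketch of the Gini coefficient computed from mean absolute differences, applied to the three samples above. The function names `gini` and `gini_corrected` are made up for illustration, and the n/(n-1) rescaling is one commonly used small-sample adjustment, not necessarily the right one for this problem. It assumes the values are non-negative.

```python
def gini(values):
    """Plain Gini coefficient: sum of |x_i - x_j| over all ordered pairs,
    divided by 2 * n * sum(x). Assumes non-negative values."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    abs_diffs = sum(abs(a - b) for a in values for b in values)
    return abs_diffs / (2 * n * total)


def gini_corrected(values):
    """Gini rescaled by n / (n - 1), so that the most unequal sample of
    size n (one non-zero value) scores 1 instead of (n - 1) / n."""
    n = len(values)
    if n < 2:
        return 0.0  # a single value has nothing to be unequal with
    return gini(values) * n / (n - 1)


samples = {"A": [10, 1], "B": [10, 1, 1], "C": [4, 4, 3, 2, 2]}
for name, data in samples.items():
    print(name, round(gini(data), 3), round(gini_corrected(data), 3))
```

With these samples the plain Gini ranks B (0.5) above A (about 0.41), because with only two values it can never exceed 0.5. The rescaled version gives roughly 0.82 for A, 0.75 for B and 0.20 for C, which matches the ordering asked for.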

I then considered the coefficient of variation; however, given that its results can go higher than 1, this also isn't well suited to this problem.
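For reference, a small sketch of the coefficient of variation on the same samples, using the sample standard deviation from Python's `statistics` module. The helper name `coefficient_of_variation` is an assumption for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean.
    Returns 0.0 when there are fewer than two values or the mean is zero."""
    if len(values) < 2:
        return 0.0
    mean = statistics.mean(values)
    if mean == 0:
        return 0.0
    return statistics.stdev(values) / mean


samples = {"A": [10, 1], "B": [10, 1, 1], "C": [4, 4, 3, 2, 2]}
for name, data in samples.items():
    print(name, round(coefficient_of_variation(data), 3))
```

Both A and B come out above 1 here, which is the problem described: the measure is not bounded by 1.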

Any pointers would be greatly appreciated!
Furn is offline  
April 5th, 2019, 01:58 AM   #2
Senior Member
Joined: Oct 2009

Posts: 772
Thanks: 279

What about entropy?
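One way this suggestion could be turned into a 0-1 score is sketched below. It assumes the values are non-negative, treats each value's share of the total as a probability, and uses one minus the normalized Shannon entropy so that 0 means perfectly equal and values near 1 mean very unequal. The name `unevenness` and this particular normalization are illustrative assumptions, not something specified in the thread.

```python
import math


def unevenness(values):
    """1 - H / log(n), where H is the Shannon entropy of the shares
    value / sum(values). Returns 0.0 for perfectly equal values."""
    n = len(values)
    total = sum(values)
    if n < 2 or total == 0:
        return 0.0  # entropy cannot be normalized for a single value
    shares = [v / total for v in values if v > 0]  # zero shares contribute nothing
    entropy = -sum(p * math.log(p) for p in shares)
    return 1.0 - entropy / math.log(n)


samples = {"A": [10, 1], "B": [10, 1, 1], "C": [4, 4, 3, 2, 2]}
for name, data in samples.items():
    print(name, round(unevenness(data), 3))
```

On the three example samples this ranks A above B above C, although A lands nearer the middle of the scale than near 1, so whether the scaling suits the original requirement is a judgment call.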
Micrm@ss is online now  
