My Math Forum Prove a large number of samplings converge to a set of small number of samples

 August 29th, 2017, 12:26 AM #1 Newbie   Joined: Aug 2017 From: Singapore Posts: 2 Thanks: 0 Prove a large number of samplings converge to a set of small number of samples In a $k$-dimensional space, given a set of $n$ points: $X=\{x_1, x_2,..., x_n\}$ which is sampled from an known distribution $P$. Given a set of $m$ points ($m << n$): $Y=\{y_1,y_2,...,y_m\}$ which is also drawn from $P$. Suppose $\{z_1, z_2,...,z_m\} \subset X$ are the closest points (in term of Euclidean distance) of $\{y_1,y_2,...,y_m\}$ respectively. A gap $g$ is defined as follows: $\displaystyle g = max\{|z_1 - y_1|^2, |z_2 - y_2|^2,..., |z_m - y_m|^2\}$ where $|a-b|^2$ denotes the Euclidean distance between a, b. Prove that if $m$ is fixed, $g \rightarrow 0$ as $n \rightarrow \infty$. I don't know how to start this problem. Thank you. Last edited by thanhvinh0906; August 29th, 2017 at 01:00 AM.
 August 29th, 2017, 01:11 PM #2 Global Moderator   Joined: May 2007 Posts: 6,805 Thanks: 716 The applicable theorem is the law of large numbers, but I haven't tried to work it out. I'll leave that to you. Thanks from thanhvinh0906
August 30th, 2017, 01:34 AM   #3
Newbie

Joined: Aug 2017
From: Singapore

Posts: 2
Thanks: 0

Quote:
 Originally Posted by mathman The applicable theorem is the law of large numbers, but I haven't tried to work it out. I'll leave that to you.

Could you please check whether my below proof is correct? I do it in a uniform distribution.

Suppose $g = max\{|z_1 - y_1|^2, |z_2 - y_2|^2,..., |z_m - y_m|^2\} = \| z_{max} - y_{max} \|$.

For any $\varepsilon > 0$, put a hypersphere radius $\varepsilon$ around $y_{max}$ in that $y_{max}$ is the centroid (*).

Suppose there is no sample of $S$ exists in the hypersphere as $n \rightarrow \infty$. (**)

We project $y_{max}$ to dimension $k$, denoted as $y_{max,k}$. Suppose the boundary of dimension $k$ is $[a, b]$, i.e. $a \leq y_{max,k} \leq b$. According to the law of large numbers, $E[x_k]$ (the average of $X$ projected to dimension $k$) $\rightarrow (a + b)/2$ as $n \rightarrow \infty$. Now if there is no sample in the hypersphere:

\begin{split}
E[x_k] \rightarrow \int_a^b \mathrm{\frac{x}{b-a}}\,\mathrm{d}x = \int_a^{y_{max,k} - \varepsilon} \mathrm{\frac{x}{b-a}}\,\mathrm{d}x + \int_{y_{max,k} + \varepsilon}^b \mathrm{\frac{x}{b-a}}\,\mathrm{d}x \\
= \frac{a+b}{2} - \frac{2\varepsilon y_{max,k}}{b - a} \neq \frac{a + b}{2}
\end{split}

Equation (1) conflicts the law of large numbers, this indicates that (**) is false; so there exists at least one sample in the hypersphere. This statement accompanied by (*) indicates that $g = \| z_{max} - y_{max} \| \rightarrow 0$ as $n \rightarrow \infty$.

Please share if you have another proof. Thank you.

Last edited by skipjack; August 30th, 2017 at 04:38 PM.

 August 30th, 2017, 04:27 PM #4 Global Moderator   Joined: May 2007 Posts: 6,805 Thanks: 716 There may be a flaw in that as n increases, the set of x's changes and so would the set of z's. Therefore y_{max} may change with n. I had further thoughts after my first post. Qualitatively, a histogram, with narrower intervals for increasing n, of the x values should resemble the density function of the probability. The y values correspond to specific points of the density. The intervals (containing y values) of the histograms should eventually contain some x (therefore z) value close to the y value. I have never taken a course in statistics, although my specialty is probability theory, so I am not sure hot to develop a proof. Last edited by mathman; August 30th, 2017 at 04:29 PM.

 Tags converge, large, number, prove, samples, samplings, set, small

 Thread Tools Display Modes Linear Mode

 Similar Threads Thread Thread Starter Forum Replies Last Post Azarashi Algebra 10 December 12th, 2016 08:38 AM ishaanmj007 Number Theory 2 May 14th, 2015 05:07 PM FloorPlay Number Theory 2 October 30th, 2013 08:59 PM Jessicasamide Algebra 1 November 24th, 2012 05:47 PM BigBugBuzz Advanced Statistics 3 December 15th, 2010 01:25 PM

 Contact - Home - Forums - Cryptocurrency Forum - Top