My Math Forum Bit probability from a PDF of binary messages

 Probability and Statistics Basic Probability and Statistics Math Forum

October 20th, 2017, 01:37 PM   #1
Senior Member

Joined: Oct 2014
From: EU

Posts: 224
Thanks: 26

Math Focus: Calculus
Bit probability from a PDF of binary messages

Hello,

I am stuck on an homework for which I can't find a solution.
The statement is based on a entropy coding.

Given the following map of independent messages and its probability:

$\displaystyle \begin{array}{c c } m_0 = 0 \mapsto 0 & p(m_0) = 0.4\\ m_1 = +1 \mapsto 10 & p(m_1) = 0.3\\ m_2 = -1 \mapsto 11 & p(m_2) = 0.3 \end{array}$

it asks to proof that $\displaystyle p(1) = 0.5625$ and $\displaystyle p(0) = 0.4375$.

If I model the probabilty space as

$\displaystyle \Omega = \{1,1,1,0,0\}$

of course I don't get the correct result ($\displaystyle p(1) = 0.6$ and $\displaystyle p(0) = 0.4$) and it's obvious I have to apply a different reasoning.
So I started from the following binary tree I deduced:

From which I write:

\displaystyle \begin{aligned} P(1) & = P(11|m_2) + P(10|m_1) \quad(1)\\ P(0) & = P(0|m_0) + P(10|m_1) \quad(2) \end{aligned}

From this reasoning I get non sense results (at least for me), e.g.:

\displaystyle \begin{aligned} P(11|m_2) & = \frac{P(m_2 | 11) P(11)}{P(m_2)} \quad(3) \end{aligned}

but the probability of $\displaystyle m_2$ is never conditioned by $\displaystyle 11$ which makes me think that $\displaystyle P(11|m_2) = 0$ but again, it leads to a wrong result..

I am quiet confused because after many attempts I cannot get the correct result and I finally understand I don't know how to model this problem correctly.
Could some one point me to the right way ?

szz
Attached Images
 huffman.png (5.4 KB, 15 views)

October 20th, 2017, 01:42 PM   #2
Senior Member

Joined: Aug 2012

Posts: 2,010
Thanks: 574

Quote:
 Originally Posted by szz Hello, I am stuck on an homework for which I can't find a solution. The statement is based on a entropy coding. Given the following map of independent messages and its probability: $\displaystyle \begin{array}{c c } m_0 = 0 \mapsto 0 & p(m_0) = 0.4\\ m_1 = +1 \mapsto 10 & p(m_1) = 0.3\\ m_2 = -1 \mapsto 11 & p(m_2) = 0.3 \end{array}$ it asks to proof that $\displaystyle p(1) = 0.5625$ and $\displaystyle p(0) = 0.4375$.

Can you explain your terminology and notation? What are messages? I gather that there can be three outcomes, 0, 1, and -1, with probabilities .4, .3, and .3 respectively. Then you ask for p(1). Isn't it already given as .3? Or else what does your notation mean?

What does $m_2 = -1 \mapsto 11$ mean? If you could provide the general context of this question it would be helpful. A lot of people here understand probability theory but don't know the specific terminology of entropy coding. FWIW a glance at the relevant Wiki page was no help at all.

Last edited by Maschke; October 20th, 2017 at 01:53 PM.

October 20th, 2017, 02:58 PM   #3
Senior Member

Joined: Oct 2014
From: EU

Posts: 224
Thanks: 26

Math Focus: Calculus
Hello Maschke,

Sorry If I wasn't clear enough.

Quote:
 I gather that there can be three outcomes, 0, 1, and -1, with probabilities .4, .3, and .3 respectively.
Correct.

The set of messages $\displaystyle \{m_0,m_1,m_2\} = \{0,1,-1\}$ are the outcomes of a digital differential encoder, and this set of messages is converted to an entropy code as the Huffaman code [1] [2], where to each symbol are assigned more bits as its probability is lower.
So low probability symbols have greater length than high probability symbols.
That's why I built the binary tree.

To build the entropy code symbols $\displaystyle \{0,10,11\}$, $\displaystyle m_0$ is mapped to $\displaystyle 0$, $\displaystyle m_1$ is mapped to $\displaystyle 10$ and $\displaystyle m_2$ is mapped to $\displaystyle 11$

that's why I used the \mapsto ($\displaystyle \mapsto$) LaTeX symbol.
So resuming:

$\displaystyle \begin{array}{c c } m_0 = \{0\} \mapsto \{0\} & p(m_0) = 0.4\\ m_1 = \{+1\} \mapsto \{10\} & p(m_1) = 0.3\\ m_2 = \{-1\} \mapsto \{11\} & p(m_2) = 0.3 \end{array}$

I hope this explanation is better than my first post.

Given the set of messages and the entropy code symbols, the problem statement asks to demonstra that $\displaystyle p(1) = 0.5625$ and $\displaystyle p(0) = 0.4375$.

I am the first one I think that the statement isn't clear enough.

szz

Last edited by szz; October 20th, 2017 at 03:01 PM.

October 20th, 2017, 03:07 PM   #4
Senior Member

Joined: Aug 2012

Posts: 2,010
Thanks: 574

Quote:
 Originally Posted by szz Hello Maschke, Sorry If I wasn't clear enough. Correct. The set of messages $\displaystyle \{m_0,m_1,m_2\} = \{0,1,-1\}$ are the outcomes of a digital differential encoder, and this set of messages is converted to an entropy code as the Huffaman code [1] [2], where to each symbol are assigned more bits as its probability is lower. So low probability symbols have greater length than high probability symbols. That's why I built the binary tree. To build the entropy code symbols $\displaystyle \{0,10,11\}$, $\displaystyle m_0$ is mapped to $\displaystyle 0$, $\displaystyle m_1$ is mapped to $\displaystyle 10$ and $\displaystyle m_2$ is mapped to $\displaystyle 11$ that's why I used the \mapsto ($\displaystyle \mapsto$) LaTeX symbol. So resuming: $\displaystyle \begin{array}{c c } m_0 = \{0\} \mapsto \{0\} & p(m_0) = 0.4\\ m_1 = \{+1\} \mapsto \{10\} & p(m_1) = 0.3\\ m_2 = \{-1\} \mapsto \{11\} & p(m_2) = 0.3 \end{array}$ I hope this explanation is better than my first post. Given the set of messages and the entropy code symbols, the problem statement asks to demonstra that $\displaystyle p(1) = 0.5625$ and $\displaystyle p(0) = 0.4375$. Honestly, I am quiet confused and I have no more data information about this problem. I am the first one I think that the statement isn't clear enough. szz
There's really something missing IMO. What is p(1)? What is the '1' in there? You say 1 maps to 10. What does that mean?

October 20th, 2017, 03:14 PM   #5
Senior Member

Joined: Oct 2014
From: EU

Posts: 224
Thanks: 26

Math Focus: Calculus
Quote:
 There's really something missing IMO. What is p(1)? What is the '1' in there? You say 1 maps to 10. What does that mean?
$\displaystyle p(1)$ is the proability of having a bit set to 1 into the code book of the Huffman code. The codebook is {0, 10, 11}.

With "$\displaystyle m_1$ maps to 10" I mean that the output value of 1 (like a voltage value of 1 Volt) is translated to the binary sequence 10.

Sorry if I used wrong therms.

szz

Last edited by szz; October 20th, 2017 at 03:22 PM.

October 20th, 2017, 03:52 PM   #6
Senior Member

Joined: Aug 2012

Posts: 2,010
Thanks: 574

Quote:
 Originally Posted by szz $\displaystyle p(1)$ is the proability of having a bit set to 1 into the code book of the Huffman code. The codebook is {0, 10, 11}. With "$\displaystyle m_1$ maps to 10" I mean that the output value of 1 (like a voltage value of 1 Volt) is translated to the binary sequence 10. Sorry if I used wrong therms.
Not wrong terms, just domain-specific terms that most (well me anyway) aren't familiar with. But the basic probability theory is clear.

So the question is what is the probability of getting 10 or 11. Yes? But then the probability of getting 10 is .3 and the prob of 11 is .3 per your givens. Is that the question? Well the prob of NOT getting one of those is .4, so the prob of getting either of the others is .6. If you get -1 and +1 with prob .3 each, and they give you the outputs you care about, there is .6 chance that you'll get one of those.

There is really something missing here because clearly this is not the correct interpretation. You are not telling us something.

ps -- Practically everyone here knows more probability theory than I do! Feel free to jump in please.

Last edited by Maschke; October 20th, 2017 at 03:54 PM.

October 20th, 2017, 04:58 PM   #7
Senior Member

Joined: Oct 2014
From: EU

Posts: 224
Thanks: 26

Math Focus: Calculus
Quote:
 Originally Posted by Maschke So the question is what is the probability of getting 10 or 11. Yes?
No. The question is not the probability of getting 11 or 10.

Quote:
 Originally Posted by Maschke There is really something missing here because clearly this is not the correct interpretation. You are not telling us something.
Of course I could explain it better (it's 3 o'clock AM here) so I'll try a better explanation.

Suppose we have a discrete sampled signal $\displaystyle m(n)$ at the output of a quantizer, for which the possible value can be $\displaystyle \{-1,0,+1\}$, so we call these 3 possible values respectively $\displaystyle \{m_2, m_0, m_1\}$.

So the possible values of $\displaystyle m(n)$ are $\displaystyle \{m_2, m_0,m_1\}$ and they are randomly distributed. I have no information about the PDF of $\displaystyle m(n)$ (I actually invented the notation $\displaystyle m(n)$ for a better description).
I only know that the quantizer is not uniform.

We also know that:

\displaystyle \begin{aligned} p(m_0) &= 0.4\\ p(m_1) &= 0.3\\ p(m_2) &= 0.3 \end{aligned}

And the above probabilities are independent each other.
Now we want to encode this discrete signal with an entropy encoder. To accomplish this, we perform the following translations (what I previously called mapping):

$\displaystyle \begin{array}{c c } m_0 = \{0\} & \mapsto \{0\} \\ m_1 = \{+1\} & \mapsto \{10\}\\ m_2 = \{-1\} & \mapsto \{11\} \end{array}$

so we finally will have a set of binary messages (code book) of $\displaystyle \Omega = \{0, 10, 11\}$.

So if the output of the quantizer will be (I am inventing this sequence):

$\displaystyle m(n) = -\delta(n) + 0\delta(n-1) + \delta(n-2) + \delta(n-3) + \delta(n-4) + \delta(n-5) -\delta(n-6) = \{-1,0,1,1,1,1,-1\}$

$\displaystyle m(n)$ will be encoded to $\displaystyle D = \{11,0,10,10,10,10,11\}$.

The problem asks to demonstrate that $\displaystyle p(1) = 0.5625$ and $\displaystyle p(0) = 0.4375$, where $\displaystyle p(1)$ is the probability of having a bit set to 1 and $\displaystyle p(0)$ is the probability of having a bit set to 0, both from the encoded sequence $\displaystyle D$.

Quote:
 Practically everyone here knows more probability theory than I do!
Probably you know more probability theory than me.

szz

Last edited by szz; October 20th, 2017 at 05:14 PM.

 October 20th, 2017, 05:18 PM #8 Senior Member   Joined: Aug 2012 Posts: 2,010 Thanks: 574 I'm going to quit while I'm behind here. Perhaps someone else can be of help. Ok one more question. "m(n)m(n) will be encoded to D={11,0,10,10,10,10,11}D={11,0,10,10,10,10,11}. The problem asks to demonstrate that p(1)=0.5625p(1)=0.5625 and p(0)=0.4375p(0)=0.4375, where p(1)p(1) is the probability of having a bit set to 1 ..." Which bit?? You showed me D, a sequence of outputs of the 0, 1, -1 process. What is the probability we're looking for? How long is D? Probability of a 1 showing up WHERE? Last edited by Maschke; October 20th, 2017 at 05:47 PM.
October 22nd, 2017, 09:52 PM   #9
Senior Member

Joined: Sep 2015
From: USA

Posts: 2,100
Thanks: 1093

Quote:
 Originally Posted by Maschke I'm going to quit while I'm behind here. Perhaps someone else can be of help. Ok one more question. "m(n)m(n) will be encoded to D={11,0,10,10,10,10,11}D={11,0,10,10,10,10,11}. The problem asks to demonstrate that p(1)=0.5625p(1)=0.5625 and p(0)=0.4375p(0)=0.4375, where p(1)p(1) is the probability of having a bit set to 1 ..." Which bit?? You showed me D, a sequence of outputs of the 0, 1, -1 process. What is the probability we're looking for? How long is D? Probability of a 1 showing up WHERE?
The question/process is this.

Take a random stream of trinary symbols with the given probabilities of occurrence and encode that into a binary stream using the given Huffman code.

Then looking at the binary stream, what is the probability that any bit in the stream is 1.

This problem is haunting me because I should be able to solve it but I'm not getting the answers given, and sim makes those answers look correct.

 October 23rd, 2017, 03:41 PM #10 Senior Member     Joined: Sep 2015 From: USA Posts: 2,100 Thanks: 1093 Ok, I've got it. Tricky little devil but really not that bad. Assume we've got an input message of $n$ trinary symbols, $\{a,b,c \}$ These occur with probabilities $\{0.4, 0.3, 0.3\}$ They are Huffman encoded as $\begin{cases}a &0 \\b &10 \\c &11 \end{cases}$ Let there be $n_a,\text{ a's, and }n_b,\text{ b's}$ There will be a total of $n_0 = n_a + n_b \text{ zeros.}$ and $n_1 = n_b + 2(n-n_a - n_b) = 2n - 2n_a - n_b \text{ ones.}$ the total number of binary bits will be $n_0+n_1 = 2n - n_a$ so $p_0 = \displaystyle \lim_{n\to \infty}\dfrac{n_a + n_b}{2n-n_a}$ $p_1 = \displaystyle \lim_{n\to \infty}\dfrac{2n - 2n_a - n_b}{2n-n_a}$ in the limit we expect that $n_a = 0.4 n,~n_b = 0.3 n$ substituting these in $p_0 = \displaystyle \lim_{n\to \infty}\dfrac{0.4 n +0.3n}{2n-0.4 n}$ $p_0 = \displaystyle \lim_{n\to \infty} \dfrac{0.7 n}{1.6n} = \dfrac {7}{16} = 0.4735$ Likewise $p_1 = \dfrac{2n-2(0.4)n - (0.3)n}{2n-(0.4)n} = \dfrac{9}{16} = 0.5625$ Thanks from Maschke

 Tags binary, bit, messages, pdf, probability

 Thread Tools Display Modes Linear Mode

 Similar Threads Thread Thread Starter Forum Replies Last Post Karma Probability and Statistics 9 July 17th, 2017 01:28 PM Saginogdu7422 Art 0 September 4th, 2016 03:02 AM vagulus Elementary Math 4 February 16th, 2015 04:34 AM

 Contact - Home - Forums - Cryptocurrency Forum - Top