My Math Forum Collatz, a pigeonhole approach

 Number Theory Number Theory Math Forum

August 26th, 2014, 07:47 PM   #11
Global Moderator

Joined: Nov 2006
From: UTC -5

Posts: 16,046
Thanks: 938

Math Focus: Number theory, computational mathematics, combinatorics, FOM, symbolic logic, TCS, algorithms
Quote:
Originally Posted by Hedge
Quote:
 Originally Posted by CRGreathouse But in that case you still don't have nearly enough power to prove what you need. The sum of the levels of all the acceptable odds (i.e., f(n) is a positive integer) up to a million is only 21882 which is much less than 500,000. Up to 10 million it gets worse at only 63488. So even if there were no duplicates through any of these (surely false)....
I'm pretty sure there are no duplicates. The logic is that this function is simply the reverse of a way of expressing a Collatz chain, and the chain from any odd number to 1 is unique.
f(3) = 1 and f(29) = 3, in particular both are positive integers. Neither has q in the forbidden range, but 3 -> 5 -> 1 and 29 -> 11 -> 17 -> 13 -> 5 -> 1. Both use 5, and so this is what I mean by a duplicate.

There are lots of other examples. Here's one more:
15 -> 23 -> 35 -> 53 -> 5 -> 1

63 -> 95 -> 143 -> 215 -> 323 -> 485 -> 91 -> 137 -> 103 -> 155 -> 233 -> 175 -> 263 -> 395 -> 593 -> 445 -> 167 -> 251 -> 377 -> 283 -> 425 -> 319 -> 479 -> 719 -> 1079 -> 1619 -> 2429 -> 911 -> 1367 -> 2051 -> 3077 -> 577 -> 433 -> 325 -> 61 -> 23 -> 35 -> 53 -> 5 -> 1

with f(15) = 5 and f(63) = 21 and neither q bad.

Here's the code I'm using:
Code:
ok(p)={
my(n=log(p)\log(2)+1,N=2^n,q=N-p,t,r);
if(N/4<=q&&q<N/2,return(0));
t=sum(i=0,#binary(q),if(bittest(q,i),q-=2^i;3^hammingweight(q)*2^i,0));
r=(2^n-t)/3^hammingweight(2^n-p);
denominator(r)==1&&r>0
};

cc(n)=my(v=List());while(n>1,listput(v,n);n=3*n+1;n>>=valuation(n,2));Set(v);

showcc(n)={
while(n>1,
print1(n" -> ");
n=3*n+1;
n>>=valuation(n,2)
);
print(1)
};

forstep(k=1,n-2,2,
my(t=cc(k),u=setintersect(t,v));
if(#u, showcc(k);v=setminus(v,u))
);
showcc(n)
};

v=[];forstep(n=1,1e6,2,if(ok(n),t1=setintersect(t=cc(n),v);if(#t1&&t1!=[5],bad(n,t1);print);v=Set(concat(v,t))))

August 27th, 2014, 01:06 AM   #12
Senior Member

Joined: Mar 2012

Posts: 572
Thanks: 26

Quote:
 Originally Posted by CRGreathouse f(3) = 1 and f(29) = 3, in particular both are positive integers. Neither has q in the forbidden range, but 3 -> 5 -> 1 and 29 -> 11 -> 17 -> 13 -> 5 -> 1.
No, there are two things wrong here. Firstly, f(3) = 1 is in the forbidden range. I said the exclusion is that we only apply the function to numbers > 3/4*2^n. 3 = 3/4 * 4 so this isn't an input we should use. The first permissible number we can input is 7 (because 7>6), then 13, 15, 25, 27, 29, 31 etc.

Secondly you're mixing up the input and the output. f(29) outputs 3, so it describes the path 3 - > 5 -> 1. f(3) is disallowed but it would describe the path 1 -> 1.

The path for 29 -> 11 -> 17 -> 13 -> 5 -> 1 is identified by f(7591) = 7047/243 = 29. You mustn't conflate f(7591) = 29 with f(29) = 3.

Quote:
 Originally Posted by CRGreathouse Both use 5, and so this is what I mean by a duplicate.
We can't possibly rule out duplications of 'paths that go through 5 on the way to 1'. All I'm claiming is that each f(n) with an integer solution is the only way that this function can produce that particular integer solution. So, f(7591) and f(29) have different outputs and these are the only inputs that give us 29 and 3 respectively.

Quote:
 Originally Posted by CRGreathouse There are lots of other examples. Here's one more: 15 -> 23 -> 35 -> 53 -> 5 -> 1 63 -> 95 -> 143 -> 215 -> 323 -> 485 -> 91 -> 137 -> 103 -> 155 -> 233 -> 175 -> 263 -> 395 -> 593 -> 445 -> 167 -> 251 -> 377 -> 283 -> 425 -> 319 -> 479 -> 719 -> 1079 -> 1619 -> 2429 -> 911 -> 1367 -> 2051 -> 3077 -> 577 -> 433 -> 325 -> 61 -> 23 -> 35 -> 53 -> 5 -> 1 with f(15) = 5 and f(63) = 21 and neither q bad.
Again, here you've mixed up input and output. both are level 1 numbers. f(15) outputs 5 (5 -> 1), f(63) outputs 21 (21 -> 1).

To get this path: 15 -> 23 -> 35 -> 53 -> 5 -> 1, the input is f(3825)

The path for 63 above would be output by a number somewhere closer to 2^65 and is irrelevant to f(63).

Quote:
 Originally Posted by CRGreathouse Here's the code I'm using:
Shouldn't this bit disqualify f(3) where q = N/4?

if(N/4<=q&&q<N/2,return(0)

(I'm not a coder, so might have it wrong?)

Last edited by Hedge; August 27th, 2014 at 01:58 AM.

August 27th, 2014, 05:12 AM   #13
Global Moderator

Joined: Nov 2006
From: UTC -5

Posts: 16,046
Thanks: 938

Math Focus: Number theory, computational mathematics, combinatorics, FOM, symbolic logic, TCS, algorithms
Quote:
 Originally Posted by Hedge No, there are two things wrong here. Firstly, f(3) = 1 is in the forbidden range. I said the exclusion is that we only apply the function to numbers > 3/4*2^n.
You wrote:
Code:
However we immediately need to exclude some of these: Where the largest of the powers of 2 in q is 2^(n-2) (in other words a quarter of 2^n) the path identified is a repetition of a chain that can be expressed more simply.
which corresponds to $2^{n-2}\le q<2^{n-1},$ i.e., $\frac14\cdot2^n\le q<\frac12\cdot2^n.$ I can't help if you intended something else.

August 27th, 2014, 05:38 AM   #14
Senior Member

Joined: Mar 2012

Posts: 572
Thanks: 26

Quote:
 Originally Posted by CRGreathouse You wrote: Code: However we immediately need to exclude some of these: Where the largest of the powers of 2 in q is 2^(n-2) (in other words a quarter of 2^n) the path identified is a repetition of a chain that can be expressed more simply. which corresponds to $2^{n-2}\le q<2^{n-1},$ i.e., $\frac14\cdot2^n\le q<\frac12\cdot2^n.$ I can't help if you intended something else.
Sorry, am I missing something? Where 2^n = 4, p = 3, and q = 1, then q = 1/4*2^n. So on your definition of the exclusion above f(3) should be excluded.

(I've put this slightly differently above, where I'm referring to p (=2^n - q) and saying it must be > 3/4*2^n to be permissible, but surely that comes to the same thing?)

Last edited by Hedge; August 27th, 2014 at 05:57 AM.

 August 27th, 2014, 06:52 AM #15 Global Moderator     Joined: Nov 2006 From: UTC -5 Posts: 16,046 Thanks: 938 Math Focus: Number theory, computational mathematics, combinatorics, FOM, symbolic logic, TCS, algorithms Yes, it looks like 3 should be excluded -- looks like I forgot to check ok(k) in one place. But there are lots of other examples, like 15 -> 23 -> 35 -> 53 -> 5 -> 1 63 -> 95 -> 143 -> 215 -> 323 -> 485 -> 91 -> 137 -> 103 -> 155 -> 233 -> 175 -> 263 -> 395 -> 593 -> 445 -> 167 -> 251 -> 377 -> 283 -> 425 -> 319 -> 479 -> 719 -> 1079 -> 1619 -> 2429 -> 911 -> 1367 -> 2051 -> 3077 -> 577 -> 433 -> 325 -> 61 -> 23 -> 35 -> 53 -> 5 -> 1
 August 27th, 2014, 07:00 AM #16 Senior Member   Joined: Mar 2012 Posts: 572 Thanks: 26 I already pointed out in my previous post those aren't examples - you're mixing up the input to the function with the output. Where this function produces a positive integer it is the output that must have a Collatz chain to 1, not the input number we started with. f(63) = 21 (for which the Collatz chain is 21 - > 1) f(15) = 5 (for which the Collatz chain is 5 -> 1) No duplication in the output. (And it doesn't matter if part of the Collatz chain for (eg) 3 and 29 is the same, that is inevitable). To get the input equivalent to this chain 15 -> 23 -> 35 -> 53 -> 5 -> 1 you need f(3825) = 15 Whereas for the chain 23 -> 35 -> 53 -> 5 -> 1 you need f(1913) = 23 Again, no duplication in the output, even though the chain is partially the same. Last edited by Hedge; August 27th, 2014 at 07:49 AM.
 August 27th, 2014, 07:06 AM #17 Senior Member   Joined: Mar 2012 Posts: 572 Thanks: 26 Thanks for your patience with my explanations incidentally. There are some things further on that really are wobbly but it would be nice if we could at least make it through some of this stuff which I'm fairly confident is more secure, whether or not it ends up being useful in the end.
August 27th, 2014, 10:21 AM   #18
Global Moderator

Joined: Nov 2006
From: UTC -5

Posts: 16,046
Thanks: 938

Math Focus: Number theory, computational mathematics, combinatorics, FOM, symbolic logic, TCS, algorithms
Quote:
 Originally Posted by Hedge To get the input equivalent to this chain 15 -> 23 -> 35 -> 53 -> 5 -> 1 you need f(3825) = 15 Whereas for the chain 23 -> 35 -> 53 -> 5 -> 1 you need f(1913) = 23
OK, if that's what you're trying to do. But this seems to make it far less useful -- you only get a handful of tiny chains for searching a huge range.

Quote:
 Originally Posted by Hedge Thanks for your patience with my explanations incidentally.
It's hard because it's not clear what you're doing and even less clear where you're trying to go.

August 28th, 2014, 03:06 AM   #19
Senior Member

Joined: Mar 2012

Posts: 572
Thanks: 26

Quote:
 Originally Posted by CRGreathouse OK, if that's what you're trying to do. But this seems to make it far less useful -- you only get a handful of tiny chains for searching a huge range.
Yes, but the interesting things to me are the one-to-one match between inputs and outputs and the predictable regularity of the pattern produced. Bear in mind Euclid used the properties of very big numbers in the infinitude of primes proof – not claiming this is similar but it’s an example of why a huge range needn’t be an obstacle.

Quote:
 Originally Posted by CRGreathouse It's hard because it's not clear what you're doing and even less clear where you're trying to go.
OK, forgetting the blog post for now, let me see if I can boil down the chain of thought (and flag up some problems).

1) The function outputs all positive integers that have a Collatz chain to 1 (necessarily because we can reverse the process to find the ‘input’ for any given ‘output’.)
2) We can at least imagine a process of putting all odd numbers (with the restriction of inputs to numbers where q < 1/4*2^n) through the function.
3) Any odd positive integer that has a Collatz chain to 1 can be uniquely generated by the function without duplicates. (I hope you can see this works now...)
4) This suggests a few possible strategies – the one I am pursuing right now is to use the function to ‘count’/estimate positive integer solutions in a particular region.
5) The number we input to the function determines what level of Collatz chain the output number has. (eg for the input 3825 = 4096 - 256 - 8 - 4 -2 -1 the output is a level 5 Collatz chain - 15 - 23 -35 - 53 -5 -1).
6) We can use Pascal's triangle to describe the pattern of input numbers thus:
At 2^n = 8 we have 1 level 1 input,
At 2^n = 16 we have 1 level 1, 1 level 2,
At 2^n =32 we have 1 level 1, 2 level 2, 1 level 3
At 2^n = 64 we have 1 level 1, 3 level 2, 3 level 3, 1 level 4
7) As the level rises, the power of 3^n we divide by to find a result also rises – so the density of ‘all outputs’ (as opposed to integer outputs) below a bound is always rising. (The total density of integer outputs only increases very slowly, which is why we only have a few hundred integer results up to 10 million, but these are concentrated into the lower numbers).
8 ) For each level we have a decent initial estimate of the proportion of input numbers that result in a positive integer output:
1 in 2 level 1 numbers
1 in 6 level 2 numbers
1 in 18 level 3 numbers etc
(For instance, for q = 3, exactly one of 16-5, 32-5, 64-5, 128-5, 256-5, 512-5 must be divisible by 9 and thus give a positive integer result.)*
9 ) So we can theoretically use the function to estimate how many positive integer results for levels 1 to x there are under a bound N.** (roughly speaking, by only considering level 1 inputs up to 3N, level 2 inputs up to 9N, level 3 inputs up to 27N etc.)
10) Exploring this estimation method further suggests that each time we double N, we can double our estimate of integer solutions (so long as we can increase the range of levels under consideration)***
(11) Given that we know the integers are ‘full’ for a long way, this would suggest that all integers will continue to have Collatz chains to 1.

*Problem 1 – however, using this as an estimate for 'all level 2 numbers' or 'all level n numbers’ has some complications, in spite of being pretty accurate in practise.
**Problem 2 - the estimation method comes with some provisos, which can be a bit thorny...
*** Problem 3 - note the weasel words 'suggests that'. I'm pretty sure this is correct, but haven't pinned down a way to show it is necessary.

Last edited by Hedge; August 28th, 2014 at 03:29 AM.

August 28th, 2014, 04:37 AM   #20
Senior Member

Joined: Mar 2012

Posts: 572
Thanks: 26

Quote:
 Originally Posted by Hedge (The total density of integer outputs only increases very slowly, which is why we only have a few hundred integer results up to 10 million, but these are concentrated into the lower numbers).
The first part of this sentence is wrong, obviously, as total density falls.... The second half is OK though - for instance, overall density for integers output by inputs <100,000 is pretty low, but these inputs would give us 100% of the odd integers up to 16.

Last edited by Hedge; August 28th, 2014 at 05:13 AM.

 Tags approach, collatz, pigeonhole

 Thread Tools Display Modes Linear Mode

 Similar Threads Thread Thread Starter Forum Replies Last Post hiddenphi1 Applied Math 1 July 14th, 2014 11:56 AM merlin2222 Algebra 2 June 27th, 2011 02:27 PM deadlyned Number Theory 1 March 12th, 2011 01:34 PM mathdigger Applied Math 0 October 9th, 2010 12:55 AM wulfgarpro Algebra 17 March 22nd, 2010 04:13 PM

 Contact - Home - Forums - Cryptocurrency Forum - Top