Science Forum Index  »  Statistics - Math Forum

Variance within Lotto sets (combinations)....

Stig Holmquist...
Posted: Wed Oct 28, 2009 3:08 pm
Guest
Back in Aug 2005 I posted a question about the mean variance within
n/N Lotto games, and was given the formula N(N+1)/12. The square root
of this formula yields 14.29 as the std.dev. for a 6/49 game, but the
actual std.dev. is only 13.93.

As an additional test I tried it on all n/10 games, which I generated
with a program at "BetStarter". After transferring the combinations to
a spreadsheet and using Excel, I got the following mean values:
2/10: 2.59; 3/10: 2.84; 4/10: 2.93; 5/10: 2.96; 6/10: 2.98;
7/10: 3.0075; 8/10: 3.0165; 9/10: 3.0228; 10/10: 3.02765.
This last number is the same as the formula yields. These data form a
smooth curve.

So my question is: what formula, if any, can predict the exact values?

If I multiply the "formula" by (n^2-1)/n^2 I get values very close to
the actual ones. But the "correction" factor should become 1 at n=10.

Is there a better formula?

Stig Holmquist
 
Ray Koopman...
Posted: Wed Oct 28, 2009 9:20 pm
Guest
On Oct 28, 2:08 pm, Stig Holmquist <stigfjor... at (no spam) hotmail.com> wrote:
[quote][...]

So my question is: what formula, if any, can predict the exact values?

[...]
[/quote]
For N = 10, Mathematica gets 55/6 = N(N+1)/12 for n = 2,...,10:

Table[{n,Mean[Variance/@Subsets[Range@10,{n}]]},{n,2,10}]

{{ 2, 55/6},
{ 3, 55/6},
{ 4, 55/6},
{ 5, 55/6},
{ 6, 55/6},
{ 7, 55/6},
{ 8, 55/6},
{ 9, 55/6},
{10, 55/6}}
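
For readers without Mathematica, the same check can be sketched in Python. This is only an illustration of the calculation above, not the code that produced it; the helper names `sample_variance` and `mean_sample_variance` are made up for this sketch, and exact fractions are used so the result can be compared with 55/6 directly.

```python
from fractions import Fraction
from itertools import combinations


def sample_variance(values):
    """Sample variance (n-1 in the denominator), kept exact with Fraction."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def mean_sample_variance(N, n):
    """Average the sample variance over every n-element subset of 1..N."""
    variances = [sample_variance(c) for c in combinations(range(1, N + 1), n)]
    return sum(variances) / len(variances)


# Hypothetical driver: one line per n for the N = 10 game.
for n in range(2, 11):
    print(n, mean_sample_variance(10, n))
```

Every line printed should show the same fraction, N(N+1)/12 with N = 10.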
 
Stig Holmquist...
Posted: Thu Oct 29, 2009 7:06 am
Guest
On Thu, 29 Oct 2009 00:20:11 -0700 (PDT), Ray Koopman <koopman at (no spam) sfu.ca>
wrote:

[quote][...]

For N = 10, Mathematica gets 55/6 = N(N+1)/12 for n = 2,...,10:

[...]
[/quote]
I'm having a very hard time understanding where the divisor "6" could
come from in any set other than 7/10, if we calculate the sample
variance, or 6/10 for the population variance.

It's almost impossible to find a "6" in any of the sets for 2/10.
Each will yield units of sqrt 2 as a sample std.dev. It's easy to write
down all 45 sets with two digits for 2/10 if you replace 10 with 0.
The sum of all 45 sample std.dev. will be 82.5 x 1.414, with a mean of
only 2.59. It takes but a few minutes because of repeats.

What does Mathematica calculate? Please explain.

Stig
 
Ray Koopman...
Posted: Thu Oct 29, 2009 7:13 am
Guest
On Oct 29, 6:06 am, Stig Holmquist <stigfjor... at (no spam) hotmail.com> wrote:
[quote][...]

What does Mathematica calculate? Please explain.

Stig
[/quote]
Here is what it does for n = 3, broken down into steps.

First it generates all the 3-element subsets of 1...10:

In[1]:= Subsets[Range@10,{3}]

Out[1]=
{{1,2,3},{1,2,4},{1,2,5},{1,2,6},{1,2,7},{1,2,8},{1,2,9},{1,2,10},
{1,3,4},{1,3,5},{1,3,6},{1,3,7},{1,3,8},{1,3,9},{1,3,10},{1,4,5},
{1,4,6},{1,4,7},{1,4,8},{1,4,9},{1,4,10},{1,5,6},{1,5,7},{1,5,8},
{1,5,9},{1,5,10},{1,6,7},{1,6,8},{1,6,9},{1,6,10},{1,7,8},{1,7,9},
{1,7,10},{1,8,9},{1,8,10},{1,9,10},{2,3,4},{2,3,5},{2,3,6},{2,3,7},
{2,3,8},{2,3,9},{2,3,10},{2,4,5},{2,4,6},{2,4,7},{2,4,8},{2,4,9},
{2,4,10},{2,5,6},{2,5,7},{2,5,8},{2,5,9},{2,5,10},{2,6,7},{2,6,8},
{2,6,9},{2,6,10},{2,7,8},{2,7,9},{2,7,10},{2,8,9},{2,8,10},{2,9,10},
{3,4,5},{3,4,6},{3,4,7},{3,4,8},{3,4,9},{3,4,10},{3,5,6},{3,5,7},
{3,5,8},{3,5,9},{3,5,10},{3,6,7},{3,6,8},{3,6,9},{3,6,10},{3,7,8},
{3,7,9},{3,7,10},{3,8,9},{3,8,10},{3,9,10},{4,5,6},{4,5,7},{4,5,8},
{4,5,9},{4,5,10},{4,6,7},{4,6,8},{4,6,9},{4,6,10},{4,7,8},{4,7,9},
{4,7,10},{4,8,9},{4,8,10},{4,9,10},{5,6,7},{5,6,8},{5,6,9},{5,6,10},
{5,7,8},{5,7,9},{5,7,10},{5,8,9},{5,8,10},{5,9,10},{6,7,8},{6,7,9},
{6,7,10},{6,8,9},{6,8,10},{6,9,10},{7,8,9},{7,8,10},{7,9,10},
{8,9,10}}

Then it gets the variance of each subset,
using n-1 in the denominator:

In[2]:= Variance/@%

Out[2]=
{1, 7/3, 13/3, 7, 31/3, 43/3, 19, 73/3, 7/3, 4, 19/3, 28/3, 13, 52/3,
67/3, 13/3, 19/3, 9, 37/3, 49/3, 21, 7, 28/3, 37/3, 16, 61/3, 31/3,
13, 49/3, 61/3, 43/3, 52/3, 21, 19, 67/3, 73/3, 1, 7/3, 13/3, 7,
31/3, 43/3, 19, 7/3, 4, 19/3, 28/3, 13, 52/3, 13/3, 19/3, 9, 37/3,
49/3, 7, 28/3, 37/3, 16, 31/3, 13, 49/3, 43/3, 52/3, 19, 1, 7/3,
13/3, 7, 31/3, 43/3, 7/3, 4, 19/3, 28/3, 13, 13/3, 19/3, 9, 37/3,
7, 28/3, 37/3, 31/3, 13, 43/3, 1, 7/3, 13/3, 7, 31/3, 7/3, 4, 19/3,
28/3, 13/3, 19/3, 9, 7, 28/3, 31/3, 1, 7/3, 13/3, 7, 7/3, 4, 19/3,
13/3, 19/3, 7, 1, 7/3, 13/3, 7/3, 4, 13/3, 1, 7/3, 7/3, 1}

Finally it gets the mean of the variances:

In[3]:= Mean@%

Out[3]= 55/6
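
As background on why the answer is the same for every n (a standard sampling result, added here for context rather than taken from the posts): when a subset A of size n is drawn without replacement, the sample variance with n-1 in the denominator is an unbiased estimate of the population variance S^2 computed with N-1 in the denominator. Averaging over all subsets is exactly that expectation, so for the numbers 1..N, writing s_A^2 for the sample variance of subset A:

```latex
\[
\frac{1}{\binom{N}{n}} \sum_{|A| = n} s_A^2
  = S^2
  = \frac{1}{N-1} \sum_{i=1}^{N} \Bigl(i - \frac{N+1}{2}\Bigr)^2
  = \frac{1}{N-1} \cdot \frac{N(N^2-1)}{12}
  = \frac{N(N+1)}{12}.
\]
```

For N = 10 this gives 110/12 = 55/6, matching Out[3].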
 
Ray Koopman...
Posted: Thu Oct 29, 2009 12:11 pm
Guest
On Oct 29, 2:39 pm, Stig Holmquist <stigfjor... at (no spam) hotmail.com> wrote:
[quote][...]

But I'm now back to my original problem of finding a formula for the
mean std.dev. Is there a solution to that?

Stig
[/quote]
Not that I know of. The square roots make things difficult.
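
Without a closed form, brute-force enumeration is still possible for small games. The following is a minimal sketch, assuming the quantity wanted is the mean of the sample std.dev. (n-1 in the denominator) over all n-element subsets of 1..N; `mean_sample_sd` is a hypothetical helper name, not something from the thread.

```python
from itertools import combinations
from math import sqrt
from statistics import stdev


def mean_sample_sd(N, n):
    """Mean of the sample std.dev. (n-1 denominator) over all n-subsets of 1..N."""
    total = 0.0
    count = 0
    for combo in combinations(range(1, N + 1), n):
        total += stdev(combo)
        count += 1
    return total / count


# Square root of the mean variance N(N+1)/12, for comparison.
upper = sqrt(10 * 11 / 12)
for n in range(2, 11):
    print(n, round(mean_sample_sd(10, n), 4), round(upper, 4))
```

Because the square root is concave, the mean std.dev. cannot exceed the square root of the mean variance, so the second column should stay at or below the third and meet it only at n = 10. The same loop would work for 6/49, but that game has 13,983,816 combinations, so expect a long run in pure Python.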
 
Stig Holmquist...
Posted: Thu Oct 29, 2009 3:39 pm
Guest
On Thu, 29 Oct 2009 10:13:44 -0700 (PDT), Ray Koopman <koopman at (no spam) sfu.ca>
wrote:

[quote][...]

Finally it gets the mean of the variances:

In[3]:= Mean@%

Out[3]= 55/6
[/quote]
Thank you for the clarification. I now understand what error I made.
I calculated the std.dev. for each set and then took the mean.
But that is different from calculating the variance for each set and
then taking their mean. Exchanging 0 for 10 made no difference.
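
The difference between the two averages can be seen in the 2/10 case discussed earlier in the thread. This is a rough Python sketch; the variable names are chosen only for this illustration. For a pair (a, b) the sample variance is (b - a)^2 / 2, so the std.dev. is (b - a) / sqrt(2).

```python
from itertools import combinations
from math import sqrt

# All 45 two-number sets drawn from 1..10.
pairs = list(combinations(range(1, 11), 2))

# Per-pair sample std.dev. and sample variance (n-1 = 1 in the denominator).
pair_sds = [(b - a) / sqrt(2) for a, b in pairs]
pair_vars = [(b - a) ** 2 / 2 for a, b in pairs]

mean_sd = sum(pair_sds) / len(pairs)    # mean of the std.dev. values
mean_var = sum(pair_vars) / len(pairs)  # mean of the variances

print(len(pairs), sum(pair_sds), mean_sd, mean_var, sqrt(mean_var))
```

The sum of the std.dev. values is 82.5 x sqrt(2) and their mean is about 2.59, while the mean of the variances is 55/6, whose square root is about 3.03.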

My superficial know-how tripped me up.

But I'm now back to my original problem of finding a formula for the
mean std.dev. Is there a solution to that?

Stig
 
 
All times are GMT - 5 Hours