Science Forum Index  »  Bio Evolution Forum  »  a Coefficient of Relationship
Paul Nutteing
Posted: Sun Jan 28, 2007 8:20 pm
Guest
Can anyone point to some actual, rather than
theoretical, Coefficient of Relationship numbers for some
perfectly normal / average population, not
some autochthonous village halfway up
a mountain or a highly consanguineous community?
Perplexed in Peoria
Posted: Mon Jan 29, 2007 8:55 am
Guest
"Paul Nutteing" <nutteing@quickfindit.com> wrote in message news:epk3o8$dbu$1@darwin.ediacara.org...
Quote:
Can anyone point to some actual, rather than
theoretical, Coefficient of Relationship numbers for some
perfectly normal / average population, not
some autochthonous village halfway up
a mountain or a highly consanguineous community?

Perhaps first you should tell us what you mean by actual rather
than theoretical. If I tell you that the coefficient of relationship
between two full siblings is 1/2, is that 'actual' or 'theoretical'?
If theoretical, what evidence would you want to use to come up with
an 'actual' value?

You do realize, don't you, that a coefficient of relationship exists
between two people. It is not a summary statistic for populations.
Perhaps you want an average over all pairs in the population. But
what would be the point of that? And, in any case, if you are talking
about 'probability of identity by descent' as your metric of relationship,
you would need accurate family tree information for each member of the
population in order to compute the average. Or, you would have to
investigate the genomes of every member of the population to an extent
sufficient to infer the family trees.
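
For readers who want the pedigree definition spelled out, here is a minimal sketch of the usual path-counting calculation. It assumes the common ancestors are not themselves inbred, and the function name and the path lists are made up for illustration; nothing here comes from real pedigree data.

```python
# Sketch: pedigree coefficient of relationship by path counting.
# Assumption (not from the thread): common ancestors are not inbred.

def relatedness(paths):
    """Sum (1/2)**(steps up + steps down) over every path that links
    the two people through a common ancestor."""
    return sum(0.5 ** (up + down) for up, down in paths)

# Each tuple is (generations from person A up to the common ancestor,
#                generations from that ancestor down to person B).
full_siblings = [(1, 1), (1, 1)]   # one path via the mother, one via the father
half_siblings = [(1, 1)]           # one path via the single shared parent
first_cousins = [(2, 2), (2, 2)]   # one path via each shared grandparent

for name, paths in [("full siblings", full_siblings),
                    ("half siblings", half_siblings),
                    ("first cousins", first_cousins)]:
    print(name, relatedness(paths))
```

The point of the sketch is that every pair needs its own list of paths, which is why a population average needs the family tree (or enough genome data to infer it) for everyone.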

But perhaps you are thinking of some different definition of 'coefficient
of relationship'. For example, the one discussed by Alan Grafen
http://users.ox.ac.uk/~grafen/cv/
in his 1985 paper "A geometric view of relatedness". That paper
describes a coefficient between two people in the context of a population.
Using that definition, it would be possible to estimate the average
coefficient over pairs within a population by means of some modest
sampling of genomes. But, there would not be much point in actually
(as opposed to theoretically) doing this sampling - the average
is zero, by definition.

I'm guessing that your motivation here is skepticism regarding some
simplified presentation of a Hamilton's Rule argument for altruism
in human populations. Perhaps if you sketched the logic of that
argument, people here could tell you whether your skepticism is
warranted. But I will point out here that such arguments cannot
be successful unless you average the coefficients over *interacting*
pairs of individuals, which set of pairs is a subset of the set
of all pairs in the population.
ErikW
Posted: Tue Jan 30, 2007 9:30 am
Guest
On Jan 29, 7:20 am, "Paul Nutteing" <nutte...@quickfindit.com> wrote:
Quote:
Can anyone point to some actual, rather than
theoretical, Coefficient of Relationship numbers for some
perfectly normal / average population, not
some autochthonous village halfway up
a mountain or a highly consanguineous community?

As PiP said, it is not at all clear what you mean by the above. I have
a different suspicion than he has as to what you are fishing for.
People often say "R between us is 1/2 but our genomes are 99%
identical" or such. So instead of some coefficient of relatedness I
propose you google for nucleotide diversity. Those are "actual"
numbers: the average number of nucleotide differences per site between
two sequences drawn from your population. It's really a diversity
measure but you can think of it as a genetic distance measure too if
you like.
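
A rough sketch of that definition, assuming the sequences are already aligned and of equal length. The function name and the three toy sequences are invented for illustration and are not real data.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average number of differences per site over all pairs of
    aligned, equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites in every pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)

# Hypothetical toy sample, purely illustrative.
toy_sample = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(nucleotide_diversity(toy_sample))
```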

ErikW
Paul Nutteing (valid emai
Posted: Tue Jan 30, 2007 9:30 am
Guest
http://www.maa.org/devlin/devlin_09_06.html
and
http://www.maa.org/devlin/devlin_10_06.html
disclosed data on the 13-locus DNA profiles of about 65,000 convicted
people in Arizona:

"... A study of the Arizona CODIS database carried out in 2005 showed that
approximately 1 in every 228 profiles in the database matched another
profile in the database at nine or more loci, that approximately 1 in every
1,489 profiles matched at 10 loci, 1 in 16,374 profiles matched at 11 loci,
and 1 in 32,747 matched at 12 loci. ..."

This translates to

144 pairs matching at 9 loci
22 pairs matching at 10 loci
2 pairs matching at 11 loci
1 pair matching at 12 loci

the 11 and 12 loci pairs being related.
Getting those 11 and 12 loci matches requires consanguinity,
but the other matches reflect the degree of
co-ancestry of a large population,
presumably mainly male.
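
The conversion from the quoted "1 in every N profiles" rates to pair counts can be sketched as follows. It assumes each matching profile belongs to exactly one matching pair, and it uses the database size of 65,493 given later in the thread; the variable and function names are illustrative.

```python
# Sketch: turn "1 in every N profiles matched another" into pair counts.
# Assumption: each matching profile belongs to exactly one matching pair.

profiles = 65_493  # database size given later in the thread

# Loci matched -> "1 in every N profiles" rate from the quoted study.
one_in = {9: 228, 10: 1_489, 11: 16_374, 12: 32_747}

def matching_pairs(n_profiles, rate):
    # n_profiles / rate profiles take part in a match; two profiles per pair.
    return n_profiles / rate / 2

pairs = {loci: round(matching_pairs(profiles, rate))
         for loci, rate in one_in.items()}

for loci, count in pairs.items():
    print(f"{count} pairs at {loci} loci")
```

Note that the quoted study says "nine or more loci" for the first rate, whereas the list above treats it as exactly 9 loci.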

Applying an average CofR of
0.0385 to the 65,000 profiles and doing the statistics
gives a close simulation:

133, 9 loci matches
22.1, 10 loci matches
1.7, 11 loci matches
0.3, 12 loci matches

plus two or three people with a CofR of 7/8 to
supply the related matches at 11 and 12 loci.

I wished to compare this 0.0385 with the
CofR of other populations.
Perplexed in Peoria
Posted: Thu Feb 01, 2007 12:46 pm
Guest
Ah! Forensic DNA testing. Yep. You have good reason to want actual
data. Sorry, I don't have any for you. Carry on.
ErikW
Posted: Thu Feb 01, 2007 12:46 pm
Guest
From what I can gather from your message I don't think you'll get very
much sense out of such an analysis. The statistics involved here look
strange to me, but I don't understand what you are doing so I may be
wrong (as I often enough am). How do you arrive at a CofR of 0.0385
(by that I also ask, what is your CofR)? Any such calculation should
include the allele frequencies for that population, shouldn't it? How
else can it work?

ErikW

Quote:

- Show quoted text -
Paul Nutteing (valid emai
Posted: Fri Feb 02, 2007 8:19 am
Guest
ErikW <bryophyta@hotmail.com> wrote in message
news:eptqk3$u5v$1@darwin.ediacara.org...
Quote:
From what I can gather from your message I don't think you'll get very
much sense out of such an analysis. The statistics involved here look
strange to me, but I don't understand what you are doing so I may be
wrong (as I often enough am). How do you arrive at a CofR of 0.0385
(by that I also ask, what is your CofR)? Any such calculation should
include the allele frequencies for that population, shouldn't it? How
else can it work?

ErikW

I would have thought the evolution/population
geneticists would have jumped on the
Arizona data, precisely to give a large
scale CoR or whatever more appropriate
terminology, for a general population.
The underlying maths is at the end of the
piece, indeed built on allele frequencies.

This is my attempt to explain the
Arizona partial DNA profile matches.
It requires an approximation for non-integer factorials
via the Gamma function, converted back to factorial notation:
(n+a)! ~= n! * (n + (1+a)/2 )^a, or use a Gamma function calculator
on the net.
Non-integer factorials arise for the various coefficients of relationship (C of R),
from statistical combinations of e.g. 6.5 from 9
(for brothers, CofR = 1/2, so 13/2) as well as 9 from 13, giving
numbers like 6.5!, 3.25!, 0.5! etc.
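
As a rough check of the factorial approximation above, the sketch below compares it with Python's `math.gamma`, using x! = Gamma(x + 1). The function names are my own labels, and `combinations_real` is only one plausible reading of "k from n" for non-integer k, not necessarily the exact factor used in the figures that follow.

```python
import math

def factorial_real(x):
    """x! for real x >= 0, via the Gamma function: x! = Gamma(x + 1)."""
    return math.gamma(x + 1.0)

def factorial_approx(n, a):
    """The approximation (n+a)! ~= n! * (n + (1+a)/2)**a,
    for integer n >= 0 and a fractional part a between 0 and 1."""
    return math.factorial(n) * (n + (1 + a) / 2) ** a

def combinations_real(k, n):
    """'k from n' extended to non-integer k: n! / (k! * (n-k)!)."""
    return factorial_real(n) / (factorial_real(k) * factorial_real(n - k))

# Compare the approximation with the exact Gamma value for 6.5!, 3.25!, 0.5!
for n, a in [(6, 0.5), (3, 0.25), (0, 0.5)]:
    exact = factorial_real(n + a)
    approx = factorial_approx(n, a)
    print(f"{n + a}!  exact={exact:.4f}  approx={approx:.4f}")

print(round(combinations_real(9, 13)))   # ordinary 9 from 13
print(combinations_real(6.5, 9))         # 6.5 from 9, non-integer case
```

The approximation is close for the larger arguments and noticeably rougher near 0.5!, so the Gamma function itself is the safer choice when it is available.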
For a general population C of R of
0.0385 or 0.5/13, half a locus of co-ancestry on average,
T9 (for AFs > 5.6 per cent, CofR 0.5/13) = 2.6 * 10^-11,
from the background maths at the end of this piece, giving
134, 9 loci matches
22.4, 10 loci matches
2.2, 11 loci matches
0.07, 12 loci matches
T10 = 1.44*10^-11, T11 = 3.9*10^-12.
On top of that it is only required to add
2 or 3 people from one consanguineous family,
so increasing the C of R to 7/8, to supply
the related 11 and 12 loci matches.
T9 (for AFs > 6 per cent, CofR 0.4/13) = 3.6 * 10^-11, giving
149, 9 loci matches
39, 10 loci matches
3.1, 11 loci matches

The Arizona data is 144, 9 loci matches; 22, 10 loci matches;
2 related 11 loci matches and 1 related 12 loci match in
65,493 13-locus DNA profiles.
It is impossible, by anyone's maths,
to get those 11 and 12 loci related matches by
normal human mating.

My maths involves using the formula for
the first match, from the loaded dice derivation.
For the Arizona data near-match above
my T13 value was determined by ignoring all
AFs (allele frequencies from the RCMP site)
less than 5.6 per cent. The simulated
populations below used 6 per cent as the cut-off.
My "coefficient of allelic co-ancestry"
for the Arizona simulation above is
T13 = 3.6 * 10^-14 for 13 loci.

Then scale T by 715, 286, 78 etc.
for T9, T10, T11 etc. partial matches, and then scale
by the non-integer combination factors
for 1/2, 1/4, 1/8, 1/26 etc. shared DNA.
The bounds for T13 restrict the
allele frequency cut-off to the range >5.6 per cent to >6 per cent
(CofR in the range 0.4/13 to 0.5/13),
to give the unrelated 9 loci matches to be less than
144 on the one hand and not more than 0.6666 unrelated
11 loci matches on the other.
So for T9 = the T for 9 loci, x9 =
number of 9 loci matches, n = half the
square of the population being considered, and
C(2.5,9) the number of combinations of
2.5 from 9, because 6.5 (13 loci/2) match as brothers, say,
then x9 = T9 * n * C(9,13) * C(2.5,9)

Attempts to simulate a sub-population of
related fathers/sons/cousins along
with an unrelated subpopulation failed to give
anything like
the Arizona 144, 22, 2, 1 numbers.
There were always in excess of 0.6666 11
loci matches if you got the right number of
9s or 10s to match.

Even if there were as much as 2.5 percent cousin-cousin
marriages (USA generally less than 1 percent),
that would only contribute a single 9 locus partial match.
It is a juggling optimisation exercise with the
main constraints being:
I've allowed the maximum of 11 loci unrelated partial
matches to be less than 0.6666, so less than one
when summed and rounded, which precludes putting the co-ancestry
coefficient too high.
For related matches, 11 loci,
more than 1.3333 to give 2 when rounded,
which precludes increasing the related numbers too high.



I've made sons and brothers non-exclusive to
a certain extent, so say 59,000 unrelated
plus 6,000 fathers and sons (F+Ss) and 4,000 brothers
(B+Bs) can sum to 65,000. I've also added a
cross-component of random matches between the
related and unrelated sections to the unrelated
side, relatively minor, but considered.
Target from the Arizona data:
144 pairs at 9 loci
22 pairs at 10 loci
2 pairs at 11 loci
1 pair at 12 loci

              Unrelated    F+Ss     B+Bs     Totals
Profiles      59,000       6,000    4,000    65,000
9 loci        54.3         26.9     19.9     101.1
10 loci       8.7          12.4     4.7      25.8
11 loci       0.64         1.33     0.6      2.6
12 loci       0.02         0.05     0.02     0.09

One 12 loci match is easily added by the use of
one 7/8 consanguinity pair of grandfather and
grandson via incestuous son and daughter mating.
Changing to the following gives a better match but
I do not know how to increase the 9 loci figures
without increasing the 10 loci figures outside
the bounds. Cousin matches do not work
either.

              Unrelated    F+S      B+B      Totals
Profiles      59,000       1,000    5,500    65,000
9 loci        54.3         1.0      51.1     106
10 loci       8.7          0.46     12.1     21.3
11 loci       0.64         0.05     1.27     1.96
and adding a 7/8 consanguineous pair for the
12 loci match.


So the simpler simulations using a non-Bayesian coefficient of
co-ancestry for everyone of order of about one allele in 26 gives
the closer results, unless anyone has any ideas how to juggle a
hypothetical population of fathers, brothers, cousins, etc.

Background maths

Consider a 10-faced loaded die with weighting
such that
face 0 and face 1 have a probability of 0.2 each,
faces 2 and 3, probability 0.15 each,
and faces 4 to 9, 0.05 each.

Toss 10 times and record the 10 digit number.
Repeat n times.
Determine a number N where a repeat
of a previously occurring 10 digit number will occur.

The probability of a random pair of single
digits matching is
sum of squares = 2(.2^2) + 2(.15^2) + 6(.05^2) = 0.14. The digits
in each of the 10 positions are independent, so the overall
probability of all 10 digits matching is ( sum of squares )^10 ~= 2.893e-9;
call this p.

To generate N numbers, there are N(N-1)/2 pairs of numbers which
must all be different to avoid a repeat. If the pairs were
independent then the expected number of repeats would be pN(N-1)/2,
which will be 1 when N is about 26,000. The pairs won't actually be
independent, but this estimate for the expected value should be fairly
close for N << 1/p.
N = SQRT(2/p)
By comparison, if
the digits were unbiased then there would be about 1 repeat in the
first 140,000 numbers.
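
The arithmetic of the loaded-die example can be checked with a short sketch. The function names are illustrative, and N = SQRT(2/p) is the same independent-pairs estimate as above, not an exact result.

```python
import math

# Face probabilities of the loaded 10-faced die described above.
loaded_faces = [0.2, 0.2, 0.15, 0.15] + [0.05] * 6
fair_faces = [0.1] * 10
digits = 10  # tosses per recorded number

def match_probability(face_probs, n_digits):
    """Probability that two independently generated n-digit numbers agree
    in every position: (sum of squared face probabilities) ** n_digits."""
    sum_of_squares = sum(q * q for q in face_probs)
    return sum_of_squares ** n_digits

def numbers_for_one_expected_repeat(p):
    """Solve p * N**2 / 2 = 1 for N, i.e. N = sqrt(2 / p)."""
    return math.sqrt(2.0 / p)

p_loaded = match_probability(loaded_faces, digits)
p_fair = match_probability(fair_faces, digits)

print(f"loaded: p = {p_loaded:.4g}, N = {numbers_for_one_expected_repeat(p_loaded):,.0f}")
print(f"fair:   p = {p_fair:.4g}, N = {numbers_for_one_expected_repeat(p_fair):,.0f}")
```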
Now convert to factor in directed pairs,
as by convention heterozygotic pairs are
directed, low-high, e.g. (12,15).
Write S for the single-digit match probability (the sum of squares, 0.14 above).
If all pairs were directed then the new directed pair (dp) probability
would, taking 2 digits at a time, be dp = 2*S*S, but the pairs 00, 11, 22 etc.
are not directed, so 2*S*S is inflated by the probability of just
the doublets; deduct this factor from the formula.
Per pair, dp becomes 2*S^2 - S^3, so for the 5 pairs in 10 digits
the factor is (2 * 0.14^2 - 0.14^3)^5.
Now convert to the DNA profile situation and the formula becomes

For n loci 1 ..... 5 (or 6, 9, 10, 13, 15 or any number)
and m (valid) alleles at each locus, with 2 alleles per locus in a profile,
the allele frequencies at a locus are AF1 ..... AFm.
Let Sn be the sum of the squares of the AFs at locus n,
i.e. Sn = AF1^2 + AF2^2 + ...... + AFm^2 for each n.
Let Qn = Sn^2 for each n.
Let p = (Q1 * Q2 * .... * Qn) * [(2-S1) * (2-S2) * .... * (2-Sn)]
Then N, the minimum number of profiles before finding a match, is
N = SQRT(2/p)
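
A minimal sketch of the formula above. The real allele frequencies (the RCMP figures) are not reproduced in this thread, so the example simply reuses the loaded-die weights as stand-in allele frequencies at 5 hypothetical loci; the function names are also illustrative.

```python
import math

def locus_factor(allele_freqs):
    """Per-locus term Qn * (2 - Sn), where Sn is the sum of squared
    allele frequencies at the locus and Qn = Sn**2."""
    s = sum(f * f for f in allele_freqs)
    return s * s * (2.0 - s)

def profile_match_probability(loci):
    """p = (Q1 * ... * Qn) * (2 - S1) * ... * (2 - Sn) over all loci."""
    p = 1.0
    for freqs in loci:
        p *= locus_factor(freqs)
    return p

def profiles_before_first_match(p):
    """N = sqrt(2 / p), the independent-pairs estimate used above."""
    return math.sqrt(2.0 / p)

# Placeholder frequencies (the loaded-die weights), NOT real allele data.
stand_in_locus = [0.2, 0.2, 0.15, 0.15] + [0.05] * 6
loci = [stand_in_locus] * 5

p = profile_match_probability(loci)
print(f"p = {p:.4g}, N = {profiles_before_first_match(p):,.0f}")
```

With these stand-in frequencies the result is the same quantity as the dice factor (2 * 0.14^2 - 0.14^3)^5, which is a quick consistency check between the two forms.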
For partial matches in Arizona you
have a fixed value of N, so modify p by the combination factor
of selecting 9 loci from 13 and selecting (9 minus shared DNA) from 9
for the 9 loci partial matches, for example.
To get the values of T13 (T13 = p) for the 9 and 10
loci matches it is a simple matter of adjusting
the sets of AFs by progressively ignoring all
sub 1 percenters and rescaling to sum to 1,
then sub 2 percenters etc., until you end up with
a set of reduced AFs that, for the Arizona set of
data, means modified AF sets in the range (>5.6
percent) to (>6 percent).
From random population simulations I know that
full matches (10 rather than 13 loci matching DNA profiles)
never have contributions
from minor alleles. The vast majority of partial
matches likewise do not involve the minor ones.
William Morse
Posted: Tue Feb 06, 2007 8:51 am
Guest
"Paul Nutteing (valid email address in post script )"
<nutteing@quickfindit.com> wrote in
news:epvvak$1n8o$1@darwin.ediacara.org:

(snip most of post to save bandwidth, as I have no comments on it)

Quote:
Even if there was as much as 2.5 percent cousin-
cousin marriages (USA generally less than 1 percent)
that would only contribute a single 9 locus partial match.
It is a juggling optimisation exercise with the
main constraints being:
I've allowed the maximum of 11 loci unrelated partial
matches to be less than 0.6666 so less than one
when summed and rounded, precludes putting the co-ancestry
coefficient too high.
For related matches, 11 loci,
more than 1.3333 to give 2 when rounded,
precludes increasing the related numbers too high.


I have not tried to follow the math. The number of 0.0385 seems high.
Obviously there has to be intermarriage of cousins, since if you go back
thirty generations you have a billion ancestors, and there weren't that
many people alive then. But based on some comments in "Mapping Human
History", it seems that typically common ancestors would appear at about
ten generations back, and that gives a much lower figure than 0.0385. You
might try looking at the work of Cavalli-Sforza, who has traced a lot of
human ancestry, to see if you can find figures for average relatedness.
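
A back-of-envelope sketch of the two numbers in this paragraph. It assumes, purely for illustration, that two people share exactly one ancestral couple g generations back and nothing else; real populations share many lines of descent, so this is a lower-bound style estimate rather than a population average. The function name is made up.

```python
# Sketch only: single shared couple, no other common ancestry (an assumption).

ancestors_30_generations_back = 2 ** 30
print(f"{ancestors_30_generations_back:,} ancestor slots 30 generations back")

def relatedness_via_one_shared_couple(g):
    """Two common ancestors, each contributing a path of g steps up
    and g steps down: 2 * (1/2)**(2*g)."""
    return 2 * 0.5 ** (2 * g)

for g in (1, 2, 10):
    print(g, relatedness_via_one_shared_couple(g))
```

Under that assumption g = 1 gives the sibling value 1/2, g = 2 the first-cousin value 1/8, and g = 10 a value of about 2 in a million, far below 0.0385.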




--
Yours,

Bill Morse

It was once projected that a million monkeys with a million typewriters
could, by random typing, eventually reproduce the works of Shakespeare.
Now, thanks to the internet, we know that this is not true
 
All times are GMT - 5 Hours