Linkage analysis in the presence of errors IV: Joint pseudomarker analysis of linkage and/or linkage disequilibrium on a mixture of pedigrees and singletons when the mode of inheritance cannot be accurately specified

Harald H H Göring, Joseph D. Terwilliger

Research output: Contribution to journal › Article

134 Citations (Scopus)

Abstract

There is a lot of confusion in the literature about the 'differences' between 'model-based' and 'model-free' methods and about which approach is better suited for detection of the genes predisposing to complex multifactorial phenotypes. By starting from first principles, we demonstrate that the differences between the two approaches have more to do with study design than statistical analysis. When simple data structures are repeatedly ascertained, no assumptions about the genotype-phenotype relationship need to be made for the analysis to be powerful, since simple data structures admit only a small number of df. When more complicated and/or heterogeneous data structures are ascertained, however, the number of df in the underlying probability model is too large to have a powerful, truly 'model-free' test. So-called 'model-free' methods typically simplify the underlying probability model by implicitly assuming that, in some sense, all meioses connecting two affected individuals are informative for linkage with identical probability and that the affected individuals in a pedigree share as many disease-predisposing alleles as possible. By contrast, 'model-based' methods add structure to the underlying parameter space by making assumptions about the genotype-phenotype relationship, making it possible to probabilistically assign disease-locus genotypes to all individuals in the data set on the basis of the observed phenotypes. In this study, we demonstrate the equivalence of these two approaches in a variety of situations and exploit this equivalence to develop more powerful and efficient likelihood-based analogues of 'model-free' tests of linkage and/or linkage disequilibrium. Through the use of a 'pseudomarker' locus to structure the space of observations, sib-pairs, triads, and singletons can be analyzed jointly, which will lead to tests that are more well-behaved, efficient, and powerful than traditional 'model-free' tests such as the affected sib-pair, transmission/disequilibrium, haplotype relative risk, and case-control tests. Also described is an extension of this approach to large pedigrees, which, in practice, is equivalent to affected relative-pair analysis. The proposed methods are equally applicable to two-point and multipoint analysis (using complex-valued recombination fractions).

Original language: English (US)
Pages (from-to): 1310-1327
Number of pages: 18
Journal: American Journal of Human Genetics
Volume: 66
Issue number: 4
DOIs: 10.1086/302845
State: Published - 2000
Externally published: Yes

Fingerprint

Linkage Disequilibrium
Pedigree
Phenotype
Genotype
Meiosis
Haplotypes
Genetic Recombination
Alleles
Genes

ASJC Scopus subject areas

  • Genetics

Cite this

@article{05fb8b164b6a4162ab79b6586af8cf63,
title = "Linkage analysis in the presence of errors IV: Joint pseudomarker analysis of linkage and/or linkage disequilibrium on a mixture of pedigrees and singletons when the mode of inheritance cannot be accurately specified",
abstract = "There is a lot of confusion in the literature about the 'differences' between 'model-based' and 'model-free' methods and about which approach is better suited for detection of the genes predisposing to complex multifactorial phenotypes. By starting from first principles, we demonstrate that the differences between the two approaches have more to do with study design than statistical analysis. When simple data structures are repeatedly ascertained, no assumptions about the genotype-phenotype relationship need to be made for the analysis to be powerful, since simple data structures admit only a small number of df. When more complicated and/or heterogeneous data structures are ascertained, however, the number of df in the underlying probability model is too large to have a powerful, truly 'model-free' test. So-called 'model-free' methods typically simplify the underlying probability model by implicitly assuming that, in some sense, all meioses connecting two affected individuals are informative for linkage with identical probability and that the affected individuals in a pedigree share as many disease-predisposing alleles as possible. By contrast, 'model-based' methods add structure to the underlying parameter space by making assumptions about the genotype-phenotype relationship, making it possible to probabilistically assign disease-locus genotypes to all individuals in the data set on the basis of the observed phenotypes. In this study, we demonstrate the equivalence of these two approaches in a variety of situations and exploit this equivalence to develop more powerful and efficient likelihood-based analogues of 'model-free' tests of linkage and/or linkage disequilibrium. Through the use of a 'pseudomarker' locus to structure the space of observations, sib-pairs, triads, and singletons can be analyzed jointly, which will lead to tests that are more well-behaved, efficient, and powerful than traditional 'model-free' tests such as the affected sib-pair, transmission/disequilibrium, haplotype relative risk, and case-control tests. Also described is an extension of this approach to large pedigrees, which, in practice, is equivalent to affected relative-pair analysis. The proposed methods are equally applicable to two-point and multipoint analysis (using complex-valued recombination fractions).",
author = "G{\"o}ring, {Harald H H} and Terwilliger, {Joseph D.}",
year = "2000",
doi = "10.1086/302845",
language = "English (US)",
volume = "66",
pages = "1310--1327",
journal = "American Journal of Human Genetics",
issn = "0002-9297",
publisher = "Cell Press",
number = "4",
}

TY - JOUR

T1 - Linkage analysis in the presence of errors IV

T2 - Joint pseudomarker analysis of linkage and/or linkage disequilibrium on a mixture of pedigrees and singletons when the mode of inheritance cannot be accurately specified

AU - Göring, Harald H H

AU - Terwilliger, Joseph D.

PY - 2000

Y1 - 2000

AB - There is a lot of confusion in the literature about the 'differences' between 'model-based' and 'model-free' methods and about which approach is better suited for detection of the genes predisposing to complex multifactorial phenotypes. By starting from first principles, we demonstrate that the differences between the two approaches have more to do with study design than statistical analysis. When simple data structures are repeatedly ascertained, no assumptions about the genotype-phenotype relationship need to be made for the analysis to be powerful, since simple data structures admit only a small number of df. When more complicated and/or heterogeneous data structures are ascertained, however, the number of df in the underlying probability model is too large to have a powerful, truly 'model-free' test. So-called 'model-free' methods typically simplify the underlying probability model by implicitly assuming that, in some sense, all meioses connecting two affected individuals are informative for linkage with identical probability and that the affected individuals in a pedigree share as many disease-predisposing alleles as possible. By contrast, 'model-based' methods add structure to the underlying parameter space by making assumptions about the genotype-phenotype relationship, making it possible to probabilistically assign disease-locus genotypes to all individuals in the data set on the basis of the observed phenotypes. In this study, we demonstrate the equivalence of these two approaches in a variety of situations and exploit this equivalence to develop more powerful and efficient likelihood-based analogues of 'model-free' tests of linkage and/or linkage disequilibrium. Through the use of a 'pseudomarker' locus to structure the space of observations, sib-pairs, triads, and singletons can be analyzed jointly, which will lead to tests that are more well-behaved, efficient, and powerful than traditional 'model-free' tests such as the affected sib-pair, transmission/disequilibrium, haplotype relative risk, and case-control tests. Also described is an extension of this approach to large pedigrees, which, in practice, is equivalent to affected relative-pair analysis. The proposed methods are equally applicable to two-point and multipoint analysis (using complex-valued recombination fractions).

UR - http://www.scopus.com/inward/record.url?scp=0000874908&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0000874908&partnerID=8YFLogxK

U2 - 10.1086/302845

DO - 10.1086/302845

M3 - Article

C2 - 10731466

AN - SCOPUS:0000874908

VL - 66

SP - 1310

EP - 1327

JO - American Journal of Human Genetics

JF - American Journal of Human Genetics

SN - 0002-9297

IS - 4

ER -