
225 Statistical Analysis of Genome-wide Association Studies for Myopia

and then follow up with haplotype association analyses. Clearly, this strategy may miss haplotypes that show no single-locus main effect but contribute to the target phenotype through the joint effects of multiple loci.

Correlated Phenotypes

Although myopia is clinically defined by refractive error, other ocular biometrics also play a role in its development, as mentioned before. These ocular biometrics are therefore correlated to some degree. For instance, Table 2 shows the squared Pearson correlation coefficient (r²) for all pairs of SPH, SE, axial length, and cornea curvature, based on right-eye data from Chinese participants in the SCORM study. The correlations among SPH, SE, and axial length are indeed strong (r² ≥ 0.56), with axial length negatively correlated with SPH and SE. Axial length is weakly correlated with cornea curvature (r² = 0.17) and anterior chamber depth (r² = 0.19). If GWA analyses are conducted for these endophenotypes of myopia, an immediate question is whether one should correct for multiple testing based on the number of traits tested, on top of the number of markers in the panel. This is an open question without an absolute answer. Our view is that if the traits are highly correlated, there is no need to correct for the number of traits tested, since they are effectively equivalent to a single trait.
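As a rough illustration of the two quantities discussed in this paragraph, the sketch below computes pairwise squared Pearson correlations and the Bonferroni threshold with and without a correction for the number of traits. The trait values, the marker count, and the function names are hypothetical and are not taken from SCORM. Note that r² discards the sign of the correlation, so a negative relationship such as that between axial length and SE has to be read from r itself.

```python
from itertools import combinations


def pearson_r2(x, y):
    """Squared Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def pairwise_r2(traits):
    """Map each unordered pair of trait names to its r^2."""
    return {
        (a, b): pearson_r2(traits[a], traits[b])
        for a, b in combinations(traits, 2)
    }


def bonferroni_threshold(alpha, n_markers, n_traits=1):
    """Per-test significance threshold for n_markers x n_traits tests."""
    return alpha / (n_markers * n_traits)


# Toy right-eye measurements for six hypothetical participants.
toy_traits = {
    "SPH": [-0.50, -2.25, -4.00, 0.25, -6.50, -1.00],
    "SE": [-0.75, -2.50, -4.25, 0.00, -7.00, -1.25],
    "axial_length": [23.4, 24.3, 25.1, 23.0, 26.2, 23.9],
}
r2_table = pairwise_r2(toy_traits)

# Hypothetical panel of 500,000 markers, one trait versus five traits.
threshold_one_trait = bonferroni_threshold(0.05, 500_000)
threshold_five_traits = bonferroni_threshold(0.05, 500_000, n_traits=5)
```

Treating highly correlated traits as a single trait corresponds to using the first threshold; correcting for every trait corresponds to the second, stricter one.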

Another aspect of myopia-related phenotypes is that each biometric is measured for the right and left eye, respectively. Should one analyze data from each eye individually or a summary form of both eyes such as the average? Certainly, the results will vary depending on the degree of the

Table 2. Pairwise Squared Pearson Correlation Coefficient (r²) Across Four Ocular Biometrics

                          Sphere   Sphere       Axial    Cornea      Anterior
                                   Equivalent   Length   Curvature   Chamber Depth
Sphere                    1        0.98         0.56     0.01        0.05
Sphere equivalent                  1.00         0.57     0.02        0.05
Axial length                                    1.00     0.17        0.19
Cornea curvature                                0.00     1.00        0.00
Anterior chamber depth                                               1

226 Y.J. Li and Q. Fan

similarity between the measures of the two eyes. Apart from analyzing a single phenotype at a time, statistical methods are available for analyzing correlated data jointly. For instance, generalized estimating equations (GEE) can take into account correlation within the same stratum (here, the same individual) and can serve as an alternative approach. Here, we use the GWA data from SCORM to illustrate the association results (−log10(p-value)) in a region (23,555,218 to 24,149,104 bp) of the chromosome 11 MYP7 locus from four analyses: GEE analysis of SE from both eyes, linear model analyses of SE from the right eye and from the left eye, and a linear model analysis of the average SE of both eyes. Both the GEE analysis and the linear model analysis of the average SE gave results intermediate between those obtained from the right and left eyes for almost all markers tested (Fig. 2). Although in this example the

[Figure 2: plot of −log10(p-value) (y-axis, 0 to 5) against basepair position (x-axis, 23,600,000 to 24,100,000), with separate series for right eye, left eye, GEE, and average.]

Figure 2. Association results (−log10(p-values)) of the linear model analysis using SE of right and left eye, respectively, and the average SE of both eyes from linear model analysis, and GEE analysis using SE from both eyes.


use of the average SE from both eyes seems to give a smaller p-value for the top-hit marker than GEE, this does not rule out GEE analysis until a more formal evaluation is done. When GEE or the analysis of the average SE from both eyes supports the top finding from the right or left eye, the credibility of the study's conclusion is enhanced.
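One reason the average-SE analysis tends to fall between the two single-eye analyses can be sketched with ordinary least squares: the slope is linear in the outcome, so the slope for the averaged phenotype is exactly the mean of the right-eye and left-eye slopes. The sketch below uses hypothetical genotype codes and spherical equivalent values, not SCORM data, and it is neither a GEE fit nor the analysis behind Fig. 2. The identity applies to the effect estimates only; p-values also depend on residual variance and need not be the average of the single-eye p-values.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical additive genotype codes (0, 1, 2 copies of an allele)
# and spherical equivalent (SE, in diopters) for eight individuals.
genotype = [0, 1, 2, 0, 1, 2, 1, 0]
se_right = [-1.0, -2.5, -4.0, -0.5, -2.0, -3.5, -3.0, -1.5]
se_left = [-1.5, -2.0, -3.0, -1.0, -2.5, -4.5, -2.0, -0.5]
se_mean = [(r + l) / 2 for r, l in zip(se_right, se_left)]

b_right = ols_slope(genotype, se_right)
b_left = ols_slope(genotype, se_left)
b_mean = ols_slope(genotype, se_mean)

# The slope for the averaged phenotype equals the mean of the two slopes.
assert abs(b_mean - (b_right + b_left) / 2) < 1e-12
```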

Imputation and Meta-Analysis

Under the common disease, common variant hypothesis, most susceptibility variants have small to moderate genetic effects on the disease. Therefore, without a large sample size, it is hard to detect true positive results in a single association study, which is often constrained by budget and sample resources.42 Meta-analysis, by combining evidence from comparable independent association studies, thus provides a robust approach to increasing statistical power and effective sample size.43,44 Meta-analysis has recently become standard practice in the GWAS community for identifying loci related to disease risk, as exemplified by studies of diabetes, Alzheimer's disease, and bipolar disorder.45–47

Prior to meta-analysis, as described earlier, one should ensure that the phenotypes are comparable and were measured in similar ways across datasets. In addition, because SNP chips change rapidly, different studies may use different chip versions with different SNP content. That is, not all SNPs are typed consistently across studies. Several imputation methods for inferring the genotypes of untyped markers have been developed to solve this problem. The basic idea behind imputation is to use the correlation between untyped and typed markers to infer the genotypes of the untyped markers in each dataset.48 This correlation is mostly estimated from a reference panel that has genotypes for both the untyped and the typed markers. With genotype data for more than three million SNPs available from the International HapMap Project, most SNPs that do not overlap between chips can now be inferred. It should be noted that imputation is generally computationally intensive. IMPUTE,49 MACH,50 BEAGLE,51 and BIMBAM52 are frequently used programs for imputation. Each has different strengths and weaknesses, and none is optimal for all situations.48,53 Nonetheless, with these programs available, untyped markers can now be imputed as a first step so that multiple datasets can be assessed for the same set of SNPs.49
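The basic idea can be illustrated with a deliberately simplified sketch: look up the reference haplotypes that match a study haplotype at the typed markers, and use the allele frequencies among those matches as the probabilities at the untyped marker. The function name, the 0/1 allele coding, and the tiny reference panel are hypothetical. The programs named above use far more elaborate statistical models, so this is only a toy version of the principle and not a description of any of them.

```python
from collections import Counter


def impute_allele_probs(reference_haplotypes, typed_alleles, untyped_index):
    """Estimate allele probabilities at one untyped marker.

    reference_haplotypes: list of tuples of 0/1 alleles covering all markers.
    typed_alleles: dict mapping marker index -> allele observed in the study.
    untyped_index: index of the marker to infer.
    """
    matches = [
        hap for hap in reference_haplotypes
        if all(hap[i] == allele for i, allele in typed_alleles.items())
    ]
    if not matches:
        # No matching reference haplotype: fall back to the panel-wide
        # allele frequency at the untyped marker.
        matches = reference_haplotypes
    counts = Counter(hap[untyped_index] for hap in matches)
    total = sum(counts.values())
    return {allele: counts[allele] / total for allele in (0, 1)}


# Hypothetical reference panel over three markers; markers 0 and 2 are
# typed in the study, marker 1 is untyped.
reference_panel = [
    (0, 0, 0),
    (0, 0, 0),
    (0, 1, 0),
    (1, 1, 1),
    (1, 1, 1),
    (1, 0, 1),
]

# Study haplotype carrying allele 1 at both typed markers.
probs = impute_allele_probs(reference_panel, {0: 1, 2: 1}, untyped_index=1)
```

The stronger the correlation between typed and untyped markers in the reference panel, the closer these probabilities are to 0 or 1 and the more confidently the untyped genotype can be inferred.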


Meta-analysis in the setting of genetic studies refers to combining summary statistics for overlapping SNPs from multiple genetic association studies. Since pooling raw individual genotype and phenotype data across studies is in general difficult, meta-analysis is a reasonable surrogate for assessing the association results across all datasets. Here, we describe a few meta-analysis methods.

First, the simplest meta-analysis method is Fisher's method, $T_{\mathrm{Fisher}} = -2\sum_{i=1}^{k}\log(p_i)$, where $p_i$ is the p-value of study $i$, $i = 1, \ldots, k$, and $k$ is the total number of datasets. Under the null hypothesis, $T_{\mathrm{Fisher}}$ follows a $\chi^2$ distribution with $2k$ degrees of freedom. Since Fisher's method uses only the p-values, it should be applied only to markers whose effects on disease susceptibility are in the same direction across studies. Second, Mantel–Haenszel methods are commonly used for dichotomous traits if the 2 × 2 table can be recovered from each study.54 In combining the odds ratios, each study is usually weighted in proportion to the precision of its result. Finally, if a 2 × 2 table is not available from each study, for example when p-values were obtained from a logistic regression framework in order to adjust for potential confounding covariates, z-score statistics are the best choice for computing the meta-analysis p-values. Z-score statistics are widely used in practice for meta-analysis, since a z-score can easily be derived in each study and carries the direction of effect in its sign.55 The combined z-score is calculated as:

$Z_{\mathrm{meta}} = \sum_i z_i \times w_i$, where $z_i$ is the z-score from study $i$ and $w_i$ is the weight of study $i$. Once the pooled z-score is obtained, the corresponding p-value for the combined studies can be computed as well. The most widely used weights are the inverse of the variance of the effect estimate for each study. The pooled inverse variance-weighted z-score is calculated as the sum of the individual z-scores, using the inverse variance as the weight. If the variance is not given in the summary statistics, or if the standard error (SE in the equation below) is not on the same scale across studies (for example, because the quantitative trait was not measured in the same units), the z-scores can instead be summed across studies, weighting them by study sample size:

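$$Z_{\mathrm{meta}} = \sum_i \frac{b_i}{SE_i} \times w_i, \quad \text{where } w_i = \sqrt{\frac{N_i}{N_{\mathrm{total}}}}.$$

Here $b_i$ is the effect estimate and $SE_i$ its standard error in study $i$, $N_i$ is the sample size of study $i$, and $N_{\mathrm{total}}$ is the total sample size across studies.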
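The three combination rules described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a replacement for dedicated meta-analysis software: the function names and input layouts are hypothetical, the sample-size weight is taken as sqrt(N_i / N_total) so that the combined statistic stays standard normal under the null, and each 2 × 2 table is assumed to be ordered as (exposed cases, exposed controls, unexposed cases, unexposed controls).

```python
from math import exp, log, sqrt
from statistics import NormalDist


def fisher_combined_p(p_values):
    """Fisher's method: return (T, p) with T = -2 * sum(log p_i)."""
    k = len(p_values)
    t_stat = -2.0 * sum(log(p) for p in p_values)
    # Chi-square survival function for even degrees of freedom 2k:
    # exp(-T/2) * sum_{j=0}^{k-1} (T/2)^j / j!
    half = t_stat / 2.0
    term = 1.0
    series = 1.0
    for j in range(1, k):
        term *= half / j
        series += term
    return t_stat, exp(-half) * series


def sample_size_weighted_z(betas, std_errors, sample_sizes):
    """Combine per-study z = b / SE with weights sqrt(N_i / N_total)."""
    n_total = sum(sample_sizes)
    z_meta = sum(
        (b / se) * sqrt(n / n_total)
        for b, se, n in zip(betas, std_errors, sample_sizes)
    )
    p_two_sided = 2.0 * NormalDist().cdf(-abs(z_meta))
    return z_meta, p_two_sided


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio over (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical summary statistics from two studies of equal size.
t_stat, p_fisher = fisher_combined_p([0.05, 0.05])
z_same, p_same = sample_size_weighted_z([0.2, 0.2], [0.1, 0.1], [100, 100])
z_opposite, p_opposite = sample_size_weighted_z([0.2, -0.2], [0.1, 0.1], [100, 100])
pooled_or = mantel_haenszel_or([(10, 20, 5, 40), (12, 18, 6, 36)])
```

In the toy input, the two studies with opposite effect directions cancel in the z-score combination, whereas Fisher's method would combine their p-values as if they agreed, which is why the direction of effect has to be checked before applying it.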

 

 

 

 

 

It is unlikely that every dataset in a meta-analysis is derived from a single homogeneous population with the same genetic effect. Therefore, it is important to assess the heterogeneity across datasets. Random effects,