articles+ search results
349 articles+ results
Subjects
Male, Humans, Bayes Theorem, Polymorphism, Single Nucleotide genetics, Computer Simulation, Genome-Wide Association Study methods, Quantitative Trait Loci, and Prostatic Neoplasms genetics
Abstract
The aim of fine mapping is to identify genetic variants causally contributing to complex traits or diseases. Existing fine-mapping methods employ Bayesian discrete mixture priors and depend on a pre-specified maximum number of causal variants, which may lead to sub-optimal solutions. In this work, we propose a Bayesian fine-mapping method called h2-D2, utilizing a continuous global-local shrinkage prior. We also present an approach to define credible sets of causal variants in continuous prior settings. Simulation studies demonstrate that h2-D2 outperforms current state-of-the-art fine-mapping methods such as SuSiE and FINEMAP in accurately identifying causal variants and estimating their effect sizes. We further applied h2-D2 to prostate cancer analysis and discovered some previously unknown causal variants. In addition, we inferred 369 target genes associated with the detected causal variants and several pathways that were significantly over-represented by these genes, shedding light on their potential roles in prostate cancer development and progression. Competing Interests: Declaration of interests The authors declare no competing interests. (Copyright © 2023 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.)
Subjects
Chromosome Mapping methods, Linkage Disequilibrium, Phenotype, Polymorphism, Single Nucleotide, Quantitative Trait Loci genetics, and Genome-Wide Association Study methods
Abstract
Genome-wide association studies (GWASs) have achieved remarkable success in associating thousands of genetic variants with complex traits. However, the presence of linkage disequilibrium (LD) makes it challenging to identify the causal variants. To address this critical gap from association to causation, many fine-mapping methods have been proposed to assign well-calibrated probabilities of causality to candidate variants, taking into account the underlying LD pattern. In this manuscript, we introduce a statistical framework that incorporates expression quantitative trait locus (eQTL) information to fine-mapping, built on the sum of single-effects (SuSiE) regression model. Our new method, SuSiE2, connects two SuSiE models, one for eQTL analysis and one for genetic fine-mapping. This is achieved by first computing the posterior inclusion probabilities (PIPs) from an eQTL-based SuSiE model with the expression level of the candidate gene as the phenotype. These calculated PIPs are then utilized as prior inclusion probabilities for risk variants in another SuSiE model for the trait of interest. By prioritizing functional variants within the candidate region using eQTL information, SuSiE2 improves SuSiE by increasing the detection rate of causal SNPs and reducing the average size of credible sets. We compared the performance of SuSiE2 with other multi-trait fine-mapping methods with respect to power, coverage, and precision through simulations and applications to the GWAS results of Alzheimer's disease (AD) and body mass index (BMI). Our results demonstrate the better performance of SuSiE2, both when the in-sample linkage disequilibrium (LD) matrix is used and when an external reference panel is used in inference. Competing Interests: The authors have declared that no competing interests exist. (Copyright: © 2024 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
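The two-stage idea in the abstract above (eQTL PIPs reused as prior inclusion probabilities for the trait model) can be illustrated with a deliberately simplified sketch. It assumes a single causal variant per region and independent Wakefield-style approximate Bayes factors, which is only one building block of a SuSiE-type model and not the authors' SuSiE2 implementation. The function names, z-scores, standard error, and prior standard deviation are all hypothetical.

```python
import math


def approx_bayes_factor(z, se, prior_sd=0.2):
    """Wakefield-style approximate Bayes factor (alternative vs. null).

    z is the marginal z-score, se its standard error, and prior_sd the
    assumed prior standard deviation of the effect size (an assumption).
    """
    v = se ** 2
    w = prior_sd ** 2
    shrink = w / (v + w)
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z * z * shrink)


def single_effect_pips(bayes_factors, prior_weights=None):
    """Posterior inclusion probabilities under a one-causal-variant model."""
    n = len(bayes_factors)
    if prior_weights is None:
        prior_weights = [1.0 / n] * n  # flat prior over variants
    weighted = [bf * p for bf, p in zip(bayes_factors, prior_weights)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Toy z-scores for four variants in one region (made-up numbers).
eqtl_z = [0.5, 4.0, 3.8, 0.2]
gwas_z = [1.0, 3.0, 3.1, 0.4]
se = 0.05

# Stage 1: PIPs from the expression phenotype, flat prior.
eqtl_pips = single_effect_pips([approx_bayes_factor(z, se) for z in eqtl_z])

# Stage 2: trait PIPs, once with a flat prior and once with eQTL PIPs as prior.
gwas_bfs = [approx_bayes_factor(z, se) for z in gwas_z]
flat = single_effect_pips(gwas_bfs)
informed = single_effect_pips(gwas_bfs, prior_weights=eqtl_pips)

print("flat prior PIPs:    ", [round(p, 3) for p in flat])
print("eQTL-informed PIPs: ", [round(p, 3) for p in informed])
```

In this toy region the flat prior slightly favors the third variant, while the eQTL-informed prior shifts most of the posterior mass to the second variant, which is the kind of reprioritization the abstract describes.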
Subjects
Humans, Chromatin genetics, Polymorphism, Single Nucleotide, Genome-Wide Association Study methods, Quantitative Trait Loci, and Databases, Genetic
Abstract
Annotating genetic variants to their target genes is of great importance in unraveling the causal variants and genetic mechanisms that underlie complex diseases. However, disease-associated genetic variants are often located in non-coding regions and manifest context-specific effects, making it challenging to accurately identify the target genes and regulatory mechanisms. Here, we present TargetGene (https://ngdc.cncb.ac.cn/targetgene/), a comprehensive database reporting target genes for human genetic variants from various aspects. Specifically, we collected a comprehensive catalog of multi-omics data at the single-cell and bulk levels and from various human tissues, cell types and developmental stages. To facilitate the identification of Single Nucleotide Polymorphism (SNP)-to-gene connections, we have implemented multiple analytical tools based on chromatin co-accessibility, 3D interaction, enhancer activities and quantitative trait loci, among others. We applied the pipeline to evaluate variants from nearly 1300 Genome-wide association studies (GWAS) and assembled a comprehensive atlas of multiscale regulation of genetic variants. TargetGene is equipped with user-friendly web interfaces that enable intuitive searching, navigation and browsing through the results. Overall, TargetGene provides a unique resource to empower researchers to study the regulatory mechanisms of genetic variants in complex human traits. (© The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.)
Subjects
Polymorphism, Single Nucleotide, Genome-Wide Association Study methods, Quantitative Trait Loci genetics, and Gene Expression Regulation genetics
Abstract
An expression quantitative trait locus (eQTL) is a chromosomal region where genetic variants are associated with the expression levels of specific genes that can be either nearby or distant. The identification of eQTLs for different tissues, cell types, and contexts has led to a better understanding of the dynamic regulation of gene expression and the implications of functional genes and variants for complex traits and diseases. Although most eQTL studies have been performed on data collected from bulk tissues, recent studies have demonstrated the importance of cell-type-specific and context-dependent gene regulation in biological processes and disease mechanisms. In this review, we discuss statistical methods that have been developed to enable the detection of cell-type-specific and context-dependent eQTLs from bulk tissues, purified cell types, and single cells. We also discuss the limitations of the current methods and future research opportunities. Competing Interests: Conflict of interest The authors declare no conflict of interest. (Copyright © 2023 The Authors. Published by Elsevier Ltd. All rights reserved.)
Subjects
Humans, Genome-Wide Association Study methods, RNA genetics, T-Lymphocytes, RNA, Circular, and Quantitative Trait Loci
Abstract
Motivation: Molecular quantitative trait locus (QTL) mapping has proven to be a powerful approach for prioritizing genetic regulatory variants and causal genes identified by genome-wide association studies. Recently, this success has been extended to circular RNA (circRNA), a potential group of RNAs that can serve as markers for the diagnosis, prognosis, or therapeutic targets of various human diseases. However, a well-developed computational pipeline for circRNA QTL (circQTL) discovery is still lacking. Results: We introduce an integrative method for circQTL mapping and implement it as an automated pipeline based on Nextflow, named cscQTL. The proposed method has two main advantages. Firstly, cscQTL improves the specificity by systematically combining outputs of multiple circRNA calling algorithms to obtain highly confident circRNA annotations. Secondly, cscQTL improves the sensitivity by accurately quantifying circRNA expression with the help of pseudo references. Compared to the single method approach, cscQTL effectively identifies circQTLs with an increase of 20%-100% circQTLs detected and recovered all circQTLs that are highly supported by the single method approach. We apply cscQTL to a dataset of human T cells and discover genetic variants that control the expression of 55 circRNAs. By colocalization tests, we further identify circBACH2 and circYY1AP1 as potential candidates for immune disease regulation. Availability and Implementation: cscQTL is freely available at: https://github.com/datngu/cscQTL and https://doi.org/10.5281/zenodo.7851982. (© The Author(s) 2023. Published by Oxford University Press.)
Subjects
Humans, Genome-Wide Association Study methods, Bayes Theorem, Brain, Polymorphism, Single Nucleotide genetics, Genetic Predisposition to Disease genetics, Membrane Proteins genetics, Quantitative Trait Loci genetics, and Suicide
Abstract
Recent large-scale genome-wide association studies (GWAS) have started to identify potential genetic risk loci associated with risk of suicide; however, a large portion of suicide-associated genetic factors affecting gene expression remain elusive. Dysregulated gene expression, not assessed by GWAS, may play a significant role in increasing the risk of suicide death. We performed the first comprehensive genomic association analysis prioritizing brain expression quantitative trait loci (eQTLs) within regulatory regions in suicide deaths from the Utah Suicide Genetic Risk Study (USGRS). 440,324 brain-regulatory eQTLs were obtained by integrating brain eQTLs, histone modification ChIP-seq, ATAC-seq, DNase-seq, and Hi-C results from publicly available data. Subsequent genomic analyses were conducted in whole-genome sequencing (WGS) data from 986 suicide deaths of non-Finnish European (NFE) ancestry and 415 ancestrally matched controls. Additional independent USGRS suicide deaths with genotyping array data (n = 4657) and controls from the Genome Aggregation Database were explored for WGS result replication. One significant eQTL locus, rs926308 (p = 3.24e-06), was identified. The rs926308-T allele is associated with lower expression of RFPL3S, a gene important for neocortex development and implicated in arousal. Gene-based analyses performed using Sherlock Bayesian statistical integrative analysis also detected 20 genes with expression changes that may contribute to suicide risk. From analyzing publicly available transcriptomic data, ten of these genes have previous evidence of differential expression in suicide death or in psychiatric disorders that may be associated with suicide, including schizophrenia and autism (ZNF501, ZNF502, CNN3, IGF1R, KLHL36, NBL1, PDCD6IP, SNX19, BCAP29, and ARSA). Electronic health records (EHR) data were further merged to evaluate whether there were clinically relevant subsets of suicide deaths associated with genetic variants. In summary, our study identified one risk locus and ten genes associated with suicide risk via gene expression, providing new insight into possible genetic and molecular mechanisms leading to suicide. (© 2023. The Author(s).)
Subjects
Humans, Chromosome Mapping, Research Design, Quantitative Trait Loci, and Genome-Wide Association Study methods
Abstract
Recent advancements in single-cell technologies have enabled expression quantitative trait locus (eQTL) analysis across many individuals at single-cell resolution. Compared with bulk RNA sequencing, which averages gene expression across cell types and cell states, single-cell assays capture the transcriptional states of individual cells, including fine-grained, transient, and difficult-to-isolate populations at unprecedented scale and resolution. Single-cell eQTL (sc-eQTL) mapping can identify context-dependent eQTLs that vary with cell states, including some that colocalize with disease variants identified in genome-wide association studies. By uncovering the precise contexts in which these eQTLs act, single-cell approaches can unveil previously hidden regulatory effects and pinpoint important cell states underlying molecular mechanisms of disease. Here, we present an overview of recently deployed experimental designs in sc-eQTL studies. In the process, we consider the influence of study design choices such as cohort, cell states, and ex vivo perturbations. We then discuss current methodologies, modeling approaches, and technical challenges as well as future opportunities and applications.
Subjects
Humans, Likelihood Functions, Genome-Wide Association Study methods, Quantitative Trait Loci genetics, and Polymorphism, Single Nucleotide
Abstract
Expression quantitative trait loci (eQTL) studies utilize regression models to explain the variance of gene expressions with genetic loci or single nucleotide polymorphisms (SNPs). However, regression models for eQTL are challenged by the presence of high dimensional non-sparse and correlated SNPs with small effects, and nonlinear relationships between responses and SNPs. Principal component analyses are commonly conducted for dimension reduction without considering responses. Because of that, this non-supervised learning method often does not work well when the focus is on discovery of the response-covariate relationship. We propose a new supervised structural dimensional reduction method for semiparametric regression models with high dimensional and correlated covariates; we extract low-dimensional latent features from a vast number of correlated SNPs while accounting for their relationships, possibly nonlinear, with gene expressions. Our model identifies important SNPs associated with gene expressions and estimates the association parameters via a likelihood-based algorithm. A GTEx data application on a cancer related gene is presented with 18 novel eQTLs detected by our method. In addition, extensive simulations show that our method outperforms the other competing methods in bias, efficiency, and computational cost. (© 2023 John Wiley & Sons Ltd.)
Subjects
Humans, Greece, Gene Expression Regulation, Genotype, Polymorphism, Single Nucleotide, Genome-Wide Association Study methods, Genetic Predisposition to Disease, and Quantitative Trait Loci
Abstract
Background: Expression quantitative trait loci (eQTL) studies provide insights into regulatory mechanisms underlying disease risk. Expanding studies of gene regulation to underexplored populations and to medically relevant tissues offers potential to reveal yet unknown regulatory variants and to better understand disease mechanisms. Here, we performed eQTL mapping in subcutaneous (S) and visceral (V) adipose tissue from 106 Greek individuals (Greek Metabolic study, GM) and compared our findings to those from the Genotype-Tissue Expression (GTEx) resource. Results: We identified 1,930 and 1,515 eGenes in S and V respectively, over 13% of which are not observed in GTEx adipose tissue, and that do not arise due to different ancestry. We report additional context-specific regulatory effects in genes of clinical interest (e.g. oncogene ST7) and in genes regulating responses to environmental stimuli (e.g. MIR21, SNX33). We suggest that a fraction of the reported differences across populations is due to environmental effects on gene expression, driving context-specific eQTLs, and suggest that environmental effects can determine the penetrance of disease variants thus shaping disease risk. We report that over half of GM eQTLs colocalize with GWAS SNPs and of these colocalizations 41% are not detected in GTEx. We also highlight the clinical relevance of S adipose tissue by revealing that inflammatory processes are upregulated in individuals with obesity, not only in V, but also in S tissue. Conclusions: By focusing on an understudied population, our results provide further candidate genes for investigation regarding their role in adipose tissue biology and their contribution to disease risk and pathogenesis. (© 2023. BioMed Central Ltd., part of Springer Nature.)
Subjects
Cattle genetics, Humans, Animals, Reproducibility of Results, Genotype, Phenotype, Polymorphism, Single Nucleotide, Genome-Wide Association Study veterinary, Genome-Wide Association Study methods, and Quantitative Trait Loci
Abstract
Genotype data from dairy cattle selection programs have greatly facilitated GWAS to identify variants related to economic traits. Results can enhance the accuracy of genomic prediction, analyze more complex models that go beyond additive effects, elucidate the genetic architecture of a trait, and finally, decipher the underlying biology of traits. The entire process, comprising data generation, quality control, statistical analyses, interpretation of association results, and linking results to biology should be designed and executed to minimize the generation of false-positive and false-negative associations and misleading links to biological processes. This review aims to provide general guidelines for data analysis that address data quality control, association tests, adjustment for population stratification, and significance evaluation to improve the reliability of conclusions. We also provide guidance on post-GWAS strategy and the interpretation of results. These guidelines are tailored to dairy cattle, which are characterized by long-range linkage disequilibrium, large half-sib families, and routinely collected phenotypes, requiring different approaches than those applied in human GWAS. We discuss common limitations and challenges that have been overlooked in the analysis and interpretation of GWAS to identify candidate sequence variants in dairy cattle. (© 2023, The Authors. Published by Elsevier Inc. and Fass Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).)
Subjects
Humans, Genome-Wide Association Study methods, Genetic Predisposition to Disease genetics, Phenotype, Polymorphism, Single Nucleotide genetics, Multifactorial Inheritance genetics, and Quantitative Trait Loci genetics
Abstract
Genome-wide association studies (GWASs) are a valuable tool for understanding the biology of complex human traits and diseases, but associated variants rarely point directly to causal genes. In the present study, we introduce a new method, polygenic priority score (PoPS), that learns trait-relevant gene features, such as cell-type-specific expression, to prioritize genes at GWAS loci. Using a large evaluation set of genes with fine-mapped coding variants, we show that PoPS and the closest gene individually outperform other gene prioritization methods, but observe the best overall performance by combining PoPS with orthogonal methods. Using this combined approach, we prioritize 10,642 unique gene-trait pairs across 113 complex traits and diseases with high precision, finding not only well-established gene-trait relationships but nominating new genes at unresolved loci, such as LGR4 for estimated glomerular filtration rate and CCR7 for deep vein thrombosis. Overall, we demonstrate that PoPS provides a powerful addition to the gene prioritization toolbox. (© 2023. The Author(s), under exclusive licence to Springer Nature America, Inc.)
Subjects
Humans, Genome-Wide Association Study methods, Phenotype, Algorithms, Genetic Predisposition to Disease, Polymorphism, Single Nucleotide, Exome, and Quantitative Trait Loci
Abstract
GWAS has identified thousands of loci associated with disease, yet the causal genes within these loci remain largely unknown. Identifying these causal genes would enable deeper understanding of the disease and assist in genetics-based drug development. Exome-wide association studies (ExWAS) are more expensive but can pinpoint causal genes offering high-yield drug targets, yet suffer from a high false-negative rate. Several algorithms have been developed to prioritize genes at GWAS loci, such as the Effector Index (Ei), Locus-2-Gene (L2G), Polygenic Prioritization score (PoPs), and Activity-by-Contact score (ABC), but it is not known whether these algorithms can predict ExWAS findings from GWAS data. If this were the case, thousands of associated GWAS loci could potentially be resolved to causal genes. Here, we quantified the performance of these algorithms by evaluating their ability to identify ExWAS significant genes for nine traits. We found that Ei, L2G, and PoPs can identify ExWAS significant genes with high areas under the precision-recall curve (Ei: 0.52, L2G: 0.37, PoPs: 0.18, ABC: 0.14). Furthermore, we found that for every unit increase in the normalized scores, there was an associated 1.3-4.6-fold increase in the odds of a gene reaching exome-wide significance (Ei: 4.6, L2G: 2.5, PoPs: 2.1, ABC: 1.3). Overall, we found that Ei, L2G, and PoPs can anticipate ExWAS findings from widely available GWAS results. These techniques are therefore promising when well-powered ExWAS data are not readily available and can be used to anticipate ExWAS findings, allowing for prioritization of genes at GWAS loci. (© 2023. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.)
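As a reading aid for the area-under-the-precision-recall-curve figures quoted above, the sketch below computes average precision, one common estimator of that area. The abstract does not say which estimator the authors used, so this is an assumption, and the scores and labels are made-up toy values rather than Ei, L2G, PoPs, or ABC output.

```python
def average_precision(scores, labels):
    """Average precision for binary labels (1 = gene is ExWAS significant).

    Genes are ranked by descending score; precision is evaluated at the rank
    of each positive and averaged over all positives. Ties are broken by
    input order, which is a simplification.
    """
    n_pos = sum(1 for label in labels if label)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Hypothetical prioritization scores for five genes at GWAS loci.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 0, 0]
toy_ap = average_precision(toy_scores, toy_labels)  # (1/1 + 2/3) / 2
print(round(toy_ap, 4))
```

Because the baseline value of this metric equals the fraction of positives, a score of 0.52 can be strong when only a small share of candidate genes are ExWAS significant.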
Subjects
Male, Female, Humans, Animals, Rats, Sample Size, Polymorphism, Single Nucleotide, Phenotype, Genome-Wide Association Study methods, and Quantitative Trait Loci
Abstract
Power analyses are often used to determine the number of animals required for a genome-wide association study (GWAS). These analyses are typically intended to estimate the sample size needed for at least 1 locus to exceed a genome-wide significance threshold. A related question that is less commonly considered is the number of significant loci that will be discovered with a given sample size. We used simulations based on a real data set that consisted of 3,173 male and female adult N/NIH heterogeneous stock rats to explore the relationship between sample size and the number of significant loci discovered. Our simulations examined the number of loci identified in subsamples of the full data set. The subsampling analysis was conducted for 4 traits with low (0.15 ± 0.03), medium (0.31 ± 0.03 and 0.36 ± 0.03), and high (0.46 ± 0.03) SNP-based heritabilities. For each trait, we subsampled the data 100 times at different sample sizes (500, 1,000, 1,500, 2,000, and 2,500). We observed an exponential increase in the number of significant loci with larger sample sizes. Our results are consistent with similar observations in human GWAS and imply that future rodent GWAS should use sample sizes that are significantly larger than those needed to obtain a single significant result. Competing Interests: Conflicts of interest The author(s) declare no conflict of interest. (© The Author(s) 2023. Published by Oxford University Press on behalf of The Genetics Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
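The subsampling design described above (repeatedly drawing subsets of a fixed cohort and counting significant loci at each sample size) can be sketched as follows. This is a toy illustration under strong assumptions: genotypes are simulated as independent loci, the association test is a simple per-locus correlation t-statistic rather than the mixed-model analysis a real heterogeneous stock rat study would need, and every name, effect size, and threshold is hypothetical.

```python
import math
import random


def simulate_cohort(n_individuals, n_loci, causal_effects, rng):
    """Simulate genotypes in {0, 1, 2} and a phenotype with unit-variance noise."""
    genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_loci)]
                 for _ in range(n_individuals)]
    phenotypes = [sum(beta * row[j] for j, beta in causal_effects.items())
                  + rng.gauss(0.0, 1.0) for row in genotypes]
    return genotypes, phenotypes


def count_significant_loci(genotypes, phenotypes, threshold):
    """Count loci whose correlation t-statistic exceeds the threshold."""
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    var_y = sum((y - mean_y) ** 2 for y in phenotypes)
    count = 0
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        mean_g = sum(column) / n
        var_g = sum((g - mean_g) ** 2 for g in column)
        if var_g == 0.0 or var_y == 0.0:
            continue  # monomorphic locus or constant phenotype
        cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(column, phenotypes))
        r = max(-0.999999, min(0.999999, cov / math.sqrt(var_g * var_y)))
        t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
        if abs(t_stat) > threshold:
            count += 1
    return count


def subsample_counts(genotypes, phenotypes, sizes, n_reps, threshold, rng):
    """For each sample size, draw n_reps subsamples and count significant loci."""
    results = {}
    for size in sizes:
        counts = []
        for _ in range(n_reps):
            chosen = rng.sample(range(len(phenotypes)), size)
            sub_g = [genotypes[i] for i in chosen]
            sub_y = [phenotypes[i] for i in chosen]
            counts.append(count_significant_loci(sub_g, sub_y, threshold))
        results[size] = counts
    return results


rng = random.Random(0)
effects = {0: 0.30, 1: 0.25, 2: 0.20, 3: 0.15, 4: 0.10}  # hypothetical causal loci
geno, pheno = simulate_cohort(300, 20, effects, rng)
results = subsample_counts(geno, pheno, sizes=(50, 150, 300), n_reps=10,
                           threshold=3.0, rng=rng)
for size, counts in results.items():
    print(size, sum(counts) / len(counts))
```

Printing the mean count per sample size gives a small-scale analogue of the curve the abstract reports; the shape in this toy depends entirely on the assumed effect sizes.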
Subjects
Genome-Wide Association Study methods, Plant Breeding, Phenotype, Polymorphism, Single Nucleotide, Triticum genetics, and Quantitative Trait Loci
Abstract
Background: Plant architecture associated with increased grain yield and adaptation to the local environments is selected during wheat (Triticum aestivum) breeding. The internode length of individual stems and tiller length of individual plants are important for the determination of plant architecture. However, few studies have explored the genetic basis of these traits. Results: Here, we conduct a genome-wide association study (GWAS) to dissect the genetic basis of geographical differentiation of these traits in 306 worldwide wheat accessions including both landraces and traditional varieties. We determine the changes in haplotype frequency for the associated genomic regions in 831 wheat accessions that were either introduced from other countries or developed in China over the last two decades. We identify 83 loci that are associated with one trait, while the remaining 247 loci are pleiotropic. We also find that 163 associated loci are under strong selective sweep. GWAS results demonstrate independent regulation of internode length of individual stems and consistent regulation of tiller length of individual plants. This makes it possible to obtain ideal haplotype combinations of the length of four internodes. We also find that the geographical distribution of the haplotypes explains the observed differences in internode length among the worldwide wheat accessions. Conclusion: This study provides insights into the genetic basis of plant architecture. It will facilitate gene functional analysis and molecular design of plant architecture for breeding. (© 2023. The Author(s).)
Subjects
Animals, Genome-Wide Association Study methods, Birth Weight genetics, Chromosome Mapping, Phenotype, Polymorphism, Single Nucleotide, Quantitative Trait Loci, and Deer genetics
Abstract
The genetic architecture of traits under selection has important consequences for the response to selection and potentially for population viability. Early QTL mapping studies in wild populations have reported loci with large effect on trait variation. However, these results are contradicted by more recent genome-wide association analyses, which strongly support the idea that most quantitative traits have a polygenic basis. This study aims to re-evaluate the genetic architecture of a key morphological trait, birth weight, in a wild population of red deer (Cervus elaphus), using genomic approaches. A previous study using 93 microsatellite and allozyme markers and linkage mapping on a kindred of 364 deer detected a pronounced QTL on chromosome 21 explaining 29% of the variance in birth weight, suggesting that this trait is partly controlled by genes with large effects. Here, we used data for more than 2,300 calves genotyped at >39,000 SNP markers and two approaches to characterise the genetic architecture of birth weight. First, we performed a genome-wide association (GWA) analysis, using a genomic relatedness matrix to account for population structure. We found no SNPs significantly associated with birth weight. Second, we used genomic prediction to estimate the proportion of variance explained by each SNP and chromosome. This analysis confirmed that most genetic variance in birth weight was explained by loci with very small effect sizes. Third, we found that the proportion of variance explained by each chromosome was slightly positively correlated with its size. These three findings highlight a highly polygenic architecture for birth weight, which contradicts the previous QTL study. These results are probably explained by the differences in how associations are modelled between QTL mapping and GWA. Our study suggests that models of polygenic adaptation are the most appropriate to study the evolutionary trajectory of this trait. 
Competing Interests: Conflicts of interest statement The authors declare no conflict of interest. (© The Author(s) 2023. Published by Oxford University Press on behalf of the Genetics Society of America.)
Subjects
Humans, Genome-Wide Association Study methods, Linkage Disequilibrium, Genomics methods, Phenotype, Norway, Polymorphism, Single Nucleotide, Genotype, Quantitative Trait Loci, and Picea genetics
Abstract
Genomic prediction (GP) or genomic selection is a method to predict the accumulative effect of all quantitative trait loci (QTLs) in a population by estimating the realized genomic relationships between the individuals and by capturing the linkage disequilibrium between markers and QTLs. Thus, marker preselection is considered a promising method to capture Mendelian segregation effects. Using QTLs detected in a genome-wide association study (GWAS) may improve GP. Here, we performed GWAS and GP in a population with 904 clones from 32 full-sib families using a newly developed 50 k SNP Norway spruce array. Through GWAS we identified 41 SNPs associated with budburst stage (BB), and the largest-effect association explained 5.1% of the phenotypic variation (PVE). For the other five traits, such as growth and wood quality traits, only 2-13 associations were observed and the PVE of the strongest effects ranged from 1.2% to 2.0%. GP using approximately 100 preselected SNPs, based on the smallest p-values from GWAS, showed the greatest predictive ability (PA) for the trait BB. For the other traits, a preselection of 2000-4000 SNPs was found to offer the best model fit, as judged by the minimum Akaike information criterion. However, PA magnitudes from GP using such selections were still similar to those of GP using all markers. Analyses on both real-life and simulated data also showed that the inclusion of a large QTL SNP in the model as a fixed effect could improve PA and accuracy of GP provided that the PVE of the QTL was ≥ 2.5%. (© 2023. The Author(s).)
Subjects
Genome-Wide Association Study methods, Reproducibility of Results, Plant Breeding, Phenotype, Polymorphism, Single Nucleotide genetics, Quantitative Trait Loci genetics, and Oryza genetics
Abstract
Genome-wide association studies (GWASs) are used to detect quantitative trait loci (QTL) using genomic and phenotypic data as inputs. While genomic data are obtained with high throughput and low cost, obtaining phenotypic data requires a large amount of effort and time. In past breeding programs, researchers and breeders have conducted a large number of phenotypic surveys and accumulated results as legacy data. In this study, we conducted a GWAS using phenotypic data of temperate japonica rice (Oryza sativa) varieties from a public database. The GWAS using the legacy data detected several known agriculturally important genes, indicating reliability of the legacy data for GWAS. By comparing the GWAS using legacy data (L-GWAS) and a GWAS using phenotypic data that we measured (M-GWAS), we detected reliable QTL for agronomically important traits. These results suggest that an L-GWAS is a strong alternative to replicate tests to confirm the reproducibility of QTL detected by an M-GWAS. In addition, because legacy data have often been accumulated for many traits, it is possible to evaluate the pleiotropic effect of the QTL identified for the specific trait that we focused on with respect to various other traits. This study demonstrates the effectiveness of using legacy data for GWASs and proposes the use of legacy data to accelerate genomic breeding. Competing Interests: Conflict of interest statement. The authors declare no conflict of interest. (© The Author(s) 2023. Published by Oxford University Press on behalf of American Society of Plant Biologists.)
Subjects
Phenotype, Polymorphism, Single Nucleotide, Transcriptome, Humans, Genome-Wide Association Study methods, Metabolome, and Quantitative Trait Loci
Abstract
Despite the success of genome-wide association studies (GWASs) in identifying genetic variants associated with complex traits, understanding the mechanisms behind these statistical associations remains challenging. Several methods that integrate methylation, gene expression, and protein quantitative trait loci (QTLs) with GWAS data to determine their causal role in the path from genotype to phenotype have been proposed. Here, we developed and applied a multi-omics Mendelian randomization (MR) framework to study how metabolites mediate the effect of gene expression on complex traits. We identified 216 transcript-metabolite-trait causal triplets involving 26 medically relevant phenotypes. Among these associations, 58% were missed by classical transcriptome-wide MR, which only uses gene expression and GWAS data. This allowed the identification of biologically relevant pathways, such as between ANKH and calcium levels mediated by citrate levels and SLC6A12 and serum creatinine through modulation of the levels of the renal osmolyte betaine. We show that the signals missed by transcriptome-wide MR are found thanks to the increase in power conferred by integrating multiple omics layers. Simulation analyses show that with larger molecular QTL studies and in the case of mediated effects, our multi-omics MR framework outperforms classical MR approaches designed to detect causal relationships between single molecular traits and complex phenotypes. Competing Interests: CA, MS, TW, AR, ZK, EP No competing interests declared (© 2023, Auwerx et al.)
Subjects
Humans, Genetic Predisposition to Disease, Mendelian Randomization Analysis, Dopaminergic Neurons, Genome-Wide Association Study methods, Polymorphism, Single Nucleotide genetics, Nuclear Proteins genetics, GTP-Binding Proteins genetics, Phospholipases genetics, Quantitative Trait Loci genetics, and Schizophrenia genetics
Abstract
Multiple integrative studies have been performed to identify the potential target genes of the non-coding schizophrenia (SCZ) risk variants. However, all the integrative studies used expression quantitative trait loci (eQTL) data from bulk tissues. Considering the cell type-specific regulatory effect of many genetic variants, it is important to conduct integrative studies using cell type-specific eQTL data. Here, we conduct a Mendelian randomization (MR) study by integrating genome-wide associations of SCZ (74,776 cases and 101,023 controls) and eQTL data (N = 215) from dopaminergic neurons, which were differentiated from human-induced pluripotent stem cell (iPSC) lines. For eQTL from young post-mitotic dopaminergic neurons (differentiation of iPSC for 30 days, D30), we identified 34 genes whose genetically regulated expression in dopaminergic neurons may have a causal role in SCZ. Among these, ARL3 showed the most significant associations with SCZ. For eQTL from more mature dopaminergic neurons (D52), we identified 37 potential SCZ causal genes, and ARL3 and GNL3 showed the most significant associations. Only 12 genes showed significant associations with SCZ in both D30 and D52 eQTL datasets, indicating time point-specific genetic regulatory effects in young post-mitotic dopaminergic neurons and more mature dopaminergic neurons. Comparing the results from dopaminergic neurons with bulk brain tissues prioritized 2 high-confidence risk genes, including DDHD2 and GALNT10. Our study identifies multiple risk genes whose genetically regulated expression in dopaminergic neurons may have a causal role in SCZ. Further mechanistic investigation will provide pivotal insights into SCZ pathophysiology. (© 2022. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.)
Subjects
RNA-Seq, Polymorphism, Single Nucleotide, Quantitative Trait Loci, and Genome-Wide Association Study methods
Abstract
Using latent variables in gene expression data can help correct unobserved confounders and increase statistical power for expression quantitative trait loci (eQTL) detection. The probabilistic estimation of expression residuals (PEER) and principal component analysis (PCA) are widely used methods that can remove unwanted variation and improve eQTL discovery power in bulk RNA-seq analysis. However, their performance has not been evaluated extensively in single-cell eQTL analysis, especially for different cell types. Potential challenges arise due to the structure of single-cell RNA-seq data, including sparsity, skewness, and mean-variance relationship. Here, we show by a series of analyses that PEER and PCA require additional quality control and data transformation steps on the pseudo-bulk matrix to obtain valid latent variables; otherwise, they can produce highly correlated factors (Pearson's correlation r = 0.63 ~ 0.99). Incorporating valid PFs/PCs in the eQTL association model would identify 1.7 ~ 13.3% more eGenes. Sensitivity analysis showed that the pattern of change between the number of eGenes detected and fitted PFs/PCs varied significantly in different cell types. In addition, using highly variable genes to generate latent variables could achieve similar eGene discovery power as using all genes but save considerable computational resources (~ 6.2-fold faster). (© 2023. The Author(s).)