The inclusive polygenic score browser is a resource of pre-trained polygenic score models and evaluations of their predictive performance. We currently host weights for over 50 polygenic score models trained on individual-level data from UK Biobank. The results are provided on an "AS-IS" basis, without warranty of any kind. Use of the website and its content is at the user's sole risk.
We request that any use of data from the browser, including the inclusive polygenic score weights, cite the following manuscript.
Inclusive Polygenic Scores (iPGS) is a training strategy that yields PGS models with improved transferability across ancestry groups. We fit a penalized regression model directly on individual-level data from ancestry-diverse individuals using the BASIL algorithm implemented in the R snpnet package. In both simulations and application to the UK Biobank dataset, we show that our iPGS approach improves transferability compared to a baseline model trained only on individuals of European ancestry.
We focused on N~406,000 unrelated individuals in our study. We used a combination of genotype PCs and self-reported ethnicity to define the following population groups: white British, non-British white, South Asian, African, and others (all remaining individuals). Specifically, we applied a Bayesian outlier detection algorithm, aberrant, to the first six genotype principal components (PCs) to define European, South Asian, and African individuals. We further subdivided the European set into white British and non-British white. We assigned the remaining heterogeneous individuals to the others group. Please read our publication for a detailed description of the sample-level quality control procedure.
Our iPGS approach is complementary to other multi-ancestry PGS models. There is an increasing number of multi-ancestry PGS models that take GWAS summary statistics and ancestry-matched LD reference panels from multiple ancestry groups (reviewed, for example, in Kachuri et al. 2023 PMID: 37620596). Those methods are advantageous when (meta-analyzed) GWAS summary statistics from a large number of individuals are readily available. However, admixed individuals are rarely considered in PGS training, given the challenges in constructing ancestry-matched linkage-disequilibrium reference panels for them. Our iPGS approach operates directly on the individual-level data and is thus directly applicable to admixed individuals. In our manuscript, we performed a systematic comparison of our iPGS approach and PRS-CSx, a commonly used summary-statistics-based multi-ancestry PGS method, across 60 UK Biobank traits. We show that, when trained on a similar number of individuals, our iPGS model has competitive or improved predictive performance relative to PRS-CSx. Please read our publication for more information.
You may download the coefficients of iPGS models.
We provide the coefficients (BETA, "effect_weight") of the inclusive PGS models as a bgzip-compressed table file. The file has the following columns:
You may compute polygenic scores for each individual using individual-level genetic data and an iPGS coefficients file. You may use plink2's --score command. Yosuke previously wrote a short blog post with example usage of the plink2 command to compute polygenic scores. Alternatively, you may use the pgsc_calc tool from the PGS Catalog. Please cite our manuscript when you use our polygenic score models in your research.
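For readers who want to see the arithmetic behind a polygenic score, here is a minimal Python sketch: the score for an individual is the sum, over variants, of the coefficient multiplied by that individual's effect-allele dosage. It is an illustration only, not a replacement for plink2 or pgsc_calc. The variant ID column name "ID", the helper function names, and the toy data are hypothetical; only the "effect_weight" column name comes from the description above. Because a bgzip-compressed file is also a valid gzip file, Python's standard gzip module can read it.

```python
import csv
import gzip
import io


def open_weights(path):
    """Open a bgzip/gzip-compressed coefficients file as text."""
    return gzip.open(path, "rt", newline="")


def read_weights(handle, id_col="ID", weight_col="effect_weight"):
    """Read variant weights from a tab-delimited table with a header row.

    id_col is a hypothetical column name; adjust it to the actual file header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_col]: float(row[weight_col]) for row in reader}


def polygenic_score(dosages, weights):
    """Sum weight * effect-allele dosage over variants present in both inputs."""
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)


# Toy, made-up data for illustration only (not real iPGS coefficients).
toy_table = "ID\teffect_weight\nrs1\t0.5\nrs2\t-0.25\n"
weights = read_weights(io.StringIO(toy_table))

# Effect-allele dosages (0, 1, or 2) for one hypothetical individual.
person = {"rs1": 2, "rs2": 1, "rs3": 0}
score = polygenic_score(person, weights)
print(score)
```

In practice, the dosage must be counted for the same allele that the coefficient refers to, and variants must be matched between the genotype data and the coefficients file. This sketch simply skips variants that have no coefficient.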
This website collects some personal data from its users. Specifically, we use Google Analytics, a web analytics service provided by Google LLC ("Google"), to help us understand resource usage. Google Analytics uses cookies to track your interactions with our website. The information generated by the cookies about your use of our website (including your IP address) will be transmitted to and stored by Google.