CRAFT (README)
Credible Refinement and Annotation of Functional Targets (CRAFT)
- Free, open-source software: MIT license
- Documentation: https://craft.readthedocs.io
This repository is still in development; the package is not due for full release until the end of July 2019.
CRAFT is a pipeline for the calculation, annotation and visualisation of credible SNP sets. It takes p-values from GWAS results as input. We have implemented it as a Python library, available via PyPI.
Quick start guide
- We recommend creating a virtual environment for package installation (craft and dependencies), using `venv` or conda.
- Install Python 3.6 or later.
- Install the Python package from the terminal using `python -m pip install --index-url https://test.pypi.org/simple/ --no-deps bio-craft`. Alternatively, you can create a local clone of this GitHub repository and run `setup.py` in your terminal.
- Install ANNOVAR (http://annovar.openbioinformatics.org/en/latest/), along with Perl if it is not already installed, as ANNOVAR requires it to run. You will also need to move the ANNOVAR directory into CRAFT, then download the hg19 database using the shell command `annotate_variation.pl -buildver hg19 -downdb -webfrom annovar refGene humandb/`.
- Optional - Download HapMap recombination maps for the correct human genome build from the NCBI FTP. Otherwise, you can use the version distributed by default with CRAFT (GRCh37/hg19; see the genetic_maps directory for more information).
- Optional - Install any supported finemapping packages you would like to run (FINEMAP, PAINTOR or CAVIARBF) from their source websites and repositories (www.christianbenner.com, https://github.com/gkichaev/PAINTOR_V3.0). If none are installed, you can use the approximate Bayes factor (ABF) calculation and credible SNP selection included in CRAFT.
- Optional - Change the config file to match the locations and name of the ANNOVAR, finemapping packages and genetic maps directories (if required).
- Optional - Make an empty ‘output’ folder if you want to run the test scripts.
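To illustrate the built-in fallback mentioned in the list above, here is a minimal sketch of an approximate Bayes factor calculation followed by credible SNP selection. It is not CRAFT's own code or API: the function names `log_abf` and `credible_set` are hypothetical, the sketch starts from effect sizes and standard errors rather than p-values to keep the arithmetic short, and the prior standard deviation of 0.2 and the 95% coverage are assumed defaults. It also assumes a single causal variant per region with equal prior probability for every SNP.

```python
import math


def log_abf(beta, se, prior_sd=0.2):
    """Log approximate Bayes factor in favour of association.

    Uses the Wakefield (2008) form with V = se^2 (variance of the
    estimate) and W = prior_sd^2 (prior variance of the effect).
    prior_sd=0.2 is an assumed default, not a value taken from CRAFT.
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta / se
    r = w / (v + w)
    return 0.5 * math.log(1.0 - r) + 0.5 * r * z * z


def credible_set(snps, coverage=0.95, prior_sd=0.2):
    """Return the smallest list of (snp_id, posterior) reaching `coverage`.

    `snps` is a list of (snp_id, beta, se) tuples for one region.
    """
    log_bfs = [log_abf(beta, se, prior_sd) for _, beta, se in snps]
    # Subtract the maximum before exponentiating to avoid overflow.
    largest = max(log_bfs)
    weights = [math.exp(value - largest) for value in log_bfs]
    total = sum(weights)
    posteriors = [(snp[0], weight / total) for snp, weight in zip(snps, weights)]
    # Rank SNPs by posterior probability, highest first.
    ranked = sorted(posteriors, key=lambda item: item[1], reverse=True)

    selected = []
    cumulative = 0.0
    for snp_id, posterior in ranked:
        selected.append((snp_id, posterior))
        cumulative += posterior
        if cumulative >= coverage:
            break
    return selected


# Invented example values, for illustration only.
example_snps = [
    ("snp_a", 0.30, 0.05),
    ("snp_b", 0.10, 0.05),
    ("snp_c", 0.02, 0.05),
]
for snp_id, posterior in credible_set(example_snps):
    print(snp_id, round(posterior, 4))
```

Working on the log scale keeps the calculation stable when one SNP has a very large test statistic, which is common at strongly associated loci.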
The CRAFT workflow
Input file formats
GWAS summary statistics
Output
Documentation still to be added; see output folder for examples of different types of output.
Test data
The package currently includes two sets of test data, both for binary traits:
- SNPTEST summary statistics from psoriatic arthritis (PsA) patients (chromosome 1)
- PLINK .assoc.logistic and .frq.cc summary statistic files from PsA patients typed using Immunochip (chromosome 1)
We have not (yet) tested this pipeline using data for quantitative traits, but have applied it to large datasets (>12 million SNPs) in patients with PsA, and are in the process of applying it in patients with JIA.
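As a rough illustration of how p-values might be pulled out of a PLINK `.assoc.logistic` file like the one in the test data, the sketch below parses the standard whitespace-delimited PLINK columns (`CHR`, `SNP`, `BP`, `A1`, `TEST`, `NMISS`, `OR`, `STAT`, `P`) and keeps only the additive-model rows. This is not CRAFT's own reader: the function name `read_assoc_logistic` is hypothetical and the example rows are invented.

```python
import io


def read_assoc_logistic(handle):
    """Parse a PLINK .assoc.logistic file into a list of dicts.

    Keeps only rows for the additive test (TEST == "ADD") and skips
    rows whose p-value is reported as NA.
    """
    header = handle.readline().split()
    records = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        if row.get("TEST") != "ADD" or row["P"] == "NA":
            continue
        records.append(
            {
                "snp": row["SNP"],
                "chr": row["CHR"],
                "bp": int(row["BP"]),
                "p": float(row["P"]),
            }
        )
    return records


# Invented example content, for illustration only.
example_file = io.StringIO(
    " CHR        SNP       BP  A1  TEST  NMISS     OR    STAT        P\n"
    "   1  example_1  1000100   A   ADD   2000  1.250   3.100  0.00194\n"
    "   1  example_2  1000200   G   ADD   1998  1.020   0.400   0.6892\n"
    "   1  example_3  1000300   T   ADD   2000     NA      NA       NA\n"
)
for record in read_assoc_logistic(example_file):
    print(record["snp"], record["bp"], record["p"])
```

Mapping fields by header name rather than by position means the same function still works if PLINK was run with options that add extra columns, such as confidence intervals.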
Did you find an issue / missing feature?
We welcome all bug reports and requests for additional features using our GitHub issues tracker. This software will be supported by an active developer until at least 2020.
References
Approximate Bayes Factor (ABF)
CRAFT uses a Python reimplementation of the abf.R function written by [Chris Wallace](http://chr1swallace.github.io/) for the calculation of credible SNPs.
- Jon Wakefield (2008) Bayes factors for genome-wide association studies: comparison with P-values. Genet Epidemiol DOI: 10.1002/gepi.20359
- Bowes et al. (2015) Dense genotyping of immune-related susceptibility loci reveals new insights into the genetics of psoriatic arthritis.
ANNOVAR
- Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
CAVIARBF (C++ version, v0.1.4.1)
FINEMAP (v1.3)
- Refining fine-mapping: effect sizes and regional heritability. bioRxiv. (2018).
- Prospects of fine-mapping trait-associated genomic regions by using summary statistics from genome-wide association studies. Am. J. Hum. Genet. (2017).
- FINEMAP: Efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493-1501 (2016).
PAINTOR (v3.0)
- Kichaev et al. (PLOS Genetics, 2014)
- Kichaev et al. (American Journal of Human Genetics, 2015)
- Kichaev et al. (Bioinformatics, 2016).