.. _setup:

=====
Setup
=====

System requirements
===================

.. list-table::
    :widths: 1 2
    :header-rows: 1
    :class: tight-table

    * - Hardware/Software
      - Requirement
    * - Operating system
      - KGGSEE runs in a Java Virtual Machine, so it works on any operating system that provides a suitable Java Runtime Environment.
    * - Java Runtime Environment
      - A Java SE Runtime Environment of version 1.8 or higher is needed.
    * - CPU
      - A CPU with four cores or more is recommended.
    * - Memory
      - 16 GB of RAM or more is recommended.
    * - Free space
      - KGGSEE and related datasets may take up to 10 GB.


Set up a Java Runtime Environment (JRE)
=======================================

KGGSEE needs JRE 1.8 or higher. Both `Java(TM) SE JRE `_ and `OpenJDK JRE `_ are suitable. After installing a JRE, verify it by running ``java -version`` in a terminal on Linux/macOS, or in CMD/PowerShell on MS Windows. If the command displays the JRE version, like ``Java(TM) SE Runtime Environment (build x)`` or ``OpenJDK Runtime Environment (build x)``, the JRE is set up correctly. Otherwise, check whether a JRE has been installed and whether ``java`` is in ``$PATH``.


KGGSEE and its running resources
================================

KGGSEE is written in Java and distributed as a Java Archive, ``kggsee.jar``. To perform an analysis, the corresponding running resources are also needed. For example, reference genotypes and gene annotations are needed for gene-based association tests (GATES and ECS) and heritability estimations (EHE); in addition, eQTL summary statistics are needed for gene-expression causal-effect estimations (EMIC). Thus, ``kggsee.jar`` is always needed, and which resource files are needed depends on the analysis. We provide the following download links.

.. list-table::
    :widths: 1 2 1
    :header-rows: 1
    :class: tight-table

    * - File
      - Description
      - Size
    * - `kggsee.jar `_
      - The KGGSEE program
      - 46 MB
    * - `resources/ `_
      - A OneDrive folder containing all running resource files provided by us
      -
    * - `resources.zip `_
      - Running resource files except for reference genotypes and eQTL summary statistics
      - 362 MB
    * - `tutorials.zip `_
      - A tutorial dataset to run through :ref:`the four types of analyses `
      - 155 MB


Set up an environment for the Quick tutorials
=============================================

A quick and easy way to set up an environment for the :ref:`Quick tutorials ` is:

#. Download `kggsee.jar `_, `resources.zip `_ and `tutorials.zip `_.
#. Unzip ``resources.zip`` and ``tutorials.zip``.
#. Put ``kggsee.jar``, ``resources/`` and ``tutorials/`` under one directory.

Here, `resources.zip `_ contains

.. list-table::
    :widths: 1 1
    :header-rows: 1
    :class: tight-table

    * - File
      - Description
    * - ``resources/{hg19,hg38}/kggseqv1.1_{hg19,hg38}_GEncode.txt.gz``
      - `GENCODE `_ annotations
    * - ``resources/{hg19,hg38}/kggseqv1.1_{hg19,hg38}_refGene.txt.gz``
      - `RefGene `_ annotations
    * - ``resources/HgncGene.txt.gz``
      - `HGNC `_ gene ID
    * - ``resources/ENSTGene.gz``
      - `Ensembl `_ gene ID and transcript ID
    * - ``resources/*.symbols.gmt.gz``
      - `MSigDB `_ gene sets
    * - ``resources/GTEx_v8_TMM_all.gene.meanSE.txt.gz``
      - The gene-level expression profile of the `GTEx v8 `_ tissues
    * - ``resources/GTEx_v8_TMM_all.transcript.meanSE.txt.gz``
      - The transcript-level expression profile of the `GTEx v8 `_ tissues

and `tutorials.zip `_ contains

.. list-table::
    :widths: 1 1
    :header-rows: 1
    :class: tight-table

    * - File
      - Description
    * - ``tutorials/scz_gwas_eur_chr1.tsv.gz``
      - Chromosome 1 summary statistics of a schizophrenia GWAS with a European sample
    * - ``tutorials/1kg_hg19_eur_chr1.vcf.gz``
      - Chromosome 1 genotypes of the European panel of the 1000 Genomes Project
    * - ``tutorials/GTEx_v8_gene_BrainBA9.eqtl.txt.gz``
      - eQTL summary statistics calculated from the brain BA9 gene-level expression profile of GTEx v8
    * - ``tutorials/GTEx_v8_transcript_BrainBA9.eqtl.txt.gz``
      - eQTL summary statistics calculated from the brain BA9 transcript-level expression profile of GTEx v8


Set up an environment for customized analyses
=============================================

In addition to the files packaged in `resources.zip `_, reference genotypes of five 1000 Genomes Project super populations and eQTL summary statistics of 49 GTEx v8 tissues are available for download under `resources/ `_:

.. list-table::
    :widths: 1 2
    :header-rows: 1
    :class: tight-table

    * - File
      - Description
    * - `resources/hg19/gty/*.vcf.gz `_
      - VCF files of each super-population panel of the `1000 Genomes Project `_ using hg19 coordinates. Each VCF file includes biallelic variants with MAF>0.01 in that super population. The VCF files include autosomes and chrX.
    * - `resources/hg38/gty/*.vcf.gz `_
      - VCF files of each super-population panel of the `1000 Genomes Project `_ using hg38 coordinates. Each VCF file includes biallelic variants with MAF>0.01 in that super population. The VCF files include only autosomes.
    * - `resources/hg19/eqtl/*.eqtl.txt.gz `_
      - cis-eQTL summary statistics using hg19 coordinates, calculated from the gene-level or transcript-level expression profile of the `GTEx v8 `_ dataset
    * - `resources/hg38/eqtl/*.eqtl.txt.gz `_
      - cis-eQTL summary statistics using hg38 coordinates, calculated from the gene-level or transcript-level expression profile of the `GTEx v8 `_ dataset

Then, a straightforward way to set up an environment for customized analyses is:

#. Download `kggsee.jar `_ and `resources.zip `_.
#. Unzip ``resources.zip``, and put ``kggsee.jar`` and ``resources/`` under one directory.
#. Download the reference genotypes (`1kg_hg19 `_ or `1kg_hg38 `_) of the population that matches your GWAS.
#. For running EMIC or eDESE, also download the eQTL summary statistics (`eqtl_hg19 `_ or `eqtl_hg38 `_) of phenotype-associated tissues.
#. To prepare customized resource files, refer to :ref:`Detailed Document ` for descriptions of the file formats.
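
Before running an analysis, it can help to confirm that the working directory matches the layout described in this section. The script below is a minimal sketch and is not part of KGGSEE: the helper names ``missing_entries`` and ``java_on_path`` are hypothetical, and the script assumes only that ``kggsee.jar``, ``resources/`` and (for the Quick tutorials) ``tutorials/`` sit under one directory and that ``java`` should be reachable through ``PATH``.

```python
from pathlib import Path
import shutil

# Entries named in the setup steps of this section.
# tutorials/ is only needed for the Quick tutorials.
REQUIRED = ["kggsee.jar", "resources"]
TUTORIAL_ONLY = ["tutorials"]


def missing_entries(root, with_tutorials=True):
    """Return the expected files/directories that are absent under `root`."""
    root = Path(root)
    expected = REQUIRED + (TUTORIAL_ONLY if with_tutorials else [])
    return [name for name in expected if not (root / name).exists()]


def java_on_path():
    """Return True if a `java` executable can be found on PATH."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    missing = missing_entries(".")
    print("java found on PATH:", java_on_path())
    print("missing entries:", ", ".join(missing) if missing else "none")
```

For a customized-analysis environment, call ``missing_entries(".", with_tutorials=False)``. The check only looks at names, so it does not verify the JRE version or the contents of the resource files.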