Extended installation guide for amplimap¶
If you need to install amplimap and its requirements (such as Python, read aligners, etc) from scratch we recommend that you use bioconda.
1. Install bioconda¶
Follow the official installation guide on the conda website to install Miniconda.
Then, set up the bioconda channels as described in the bioconda documentation.
2. Setup amplimap environment¶
Download amplimap's environment file, which contains a list of all the software used by amplimap.
Use the conda tool to install this software into a conda environment:
conda env create -f environment.yml
If you want to run germline variant calling and annotation, you also need to download and install Annovar manually. Make sure you also download the relevant indices for the reference genome you want to use,
3. Activate amplimap¶
Load amplimap by running this command:
source activate amplimap
You only need to run this command once per session, e.g. when you open a new terminal window.
Run amplimap --version to confirm that amplimap has been installed and activated.
4. Set up your reference genome and indices¶
Download the reference genome FASTA file that you want to use, for example from the Ensembl
FTP. When in doubt, we recommend using the
primary_assembly version, for example Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.
Once you have downloaded this file, you need to prepare it for use in amplimap:
#decompress the file (if it ends in .gz)
gunzip Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
#index FASTA file
samtools faidx Homo_sapiens.GRCh38.dna.primary_assembly.fa
#build bwa index, if you want to use bwa:
bwa index Homo_sapiens.GRCh38.dna.primary_assembly.fa
#build bowtie2 index, if you want to use bowtie2:
bowtie2-build Homo_sapiens.GRCh38.dna.primary_assembly.fa Homo_sapiens.GRCh38.dna.primary_assembly.fa
5. Update amplimap configuration¶
Finally, you need to add the paths of the reference genome files to your config_default.yaml.
To find out where this file is located, run:
amplimap --basedir
Open the file config_default.yaml at this location and look for the settings under paths:
corresponding to the indices you created.
Replace these with the full paths to your files. If you haven’t generated the corresponding
file, leave the setting empty. For example, if you generated indices for bwa and bowtie2 (but not STAR or Annovar)
and always used the same FASTA filename as the prefix:
paths:
hg38:
bwa: "/home/user/amplimap/Homo_sapiens.GRCh38.dna.primary_assembly.fa"
bowtie2: "/home/user/amplimap/Homo_sapiens.GRCh38.dna.primary_assembly.fa"
fasta: "/home/user/amplimap/Homo_sapiens.GRCh38.dna.primary_assembly.fa"
If you used a different reference genome, change the “hg38” to the appropriate abbreviation (e.g. “mm10”)
and also update the line genome_name: "hg38" below.
If you are using Annovar, make sure you also provide the path to its indices directory under paths:
and adjust the protocols/operations under annotate: annovar: protocols: to match the indices you
have downloaded.
Save the file and confirm that the settings are being read correctly by looking at the output of amplimap --print-config.
6. Run amplimap!¶
Now you are ready to run amplimap! When you start a new session, activate the conda environment as described above and then run the amplimap commands as usual:
source activate amplimap
amplimap --version