Loman Lab Mock Community Experiments

Databases

Newer kraken2 databases

Since September 2020, Langmead et al. have routinely constructed and distributed kraken2 hash files of the RefSeq database, making them freely available for download via AWS. For posterity, we still provide links to our own databases below. The Langmead PlusPF database is equivalent to our older kraken2-microbial database, so we recommend downloading that instead.

kraken2-microbial (September 2018, 30GB)

A database built by stacking the kraken2-build --download-library command for the following database types:

kraken2 downloads and hashes sequences from RefSeq that are marked as “complete” or “representative”
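As a rough illustration of what "stacking" the download command means, the sketch below runs one --download-library call per library type into the same database directory, then builds the hash once. It is not the script used to prepare this database. The helper names run and build_stacked_db, the DRY_RUN switch, and the database name and library types in the usage line are assumptions made for the example; the library types actually used are the ones listed for this database.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: stack several library downloads into one database.
# Usage: build_stacked_db <db-dir> <library> [<library> ...]
build_stacked_db() {
    db="$1"
    shift
    # The NCBI taxonomy is downloaded once per database.
    run kraken2-build --download-taxonomy --db "$db" || return 1
    # One --download-library call per library type, all into the same --db.
    for lib in "$@"; do
        run kraken2-build --download-library "$lib" --db "$db" || return 1
    done
    # Hash everything downloaded so far into hash.k2d, opts.k2d and taxo.k2d.
    run kraken2-build --build --db "$db"
}
```

For example, DRY_RUN=1 build_stacked_db mydb bacteria viral prints the four kraken2-build commands without downloading anything; mydb, bacteria and viral are placeholder values only.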

Prepared 2018-09-03 by @samstudio8

mkdir kraken2-microbial-fatfree/
cd kraken2-microbial-fatfree/
wget -c https://refdb.s3.climb.ac.uk/kraken2-microbial/hash.k2d
wget https://refdb.s3.climb.ac.uk/kraken2-microbial/opts.k2d
wget https://refdb.s3.climb.ac.uk/kraken2-microbial/taxo.k2d

Note: wget -c lets you resume a stalled download without starting over!
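If you would rather resume any of the three files, not just hash.k2d, the commands above can be wrapped in a small loop. This is a minimal sketch: the function names db_urls and fetch_db and the DRY_RUN switch are assumptions made for the example, while the base URL and file names are taken from the commands on this page.

```shell
BASE_URL="https://refdb.s3.climb.ac.uk"

# Print the URL of each of the three database files, one per line.
# Usage: db_urls <database-name>
db_urls() {
    for f in hash.k2d opts.k2d taxo.k2d; do
        printf '%s/%s/%s\n' "$BASE_URL" "$1" "$f"
    done
}

# Fetch a database into a directory, passing -c for every file so that any
# interrupted transfer resumes. With DRY_RUN=1 the commands are only printed.
# Usage: fetch_db <database-name> <destination-dir>
fetch_db() {
    db="$1"
    dest="$2"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        mkdir -p "$dest" || return 1
    fi
    for url in $(db_urls "$db"); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "wget -c -P $dest $url"
        else
            wget -c -P "$dest" "$url" || return 1
        fi
    done
}
```

Running fetch_db kraken2-microbial kraken2-microbial-fatfree mirrors the commands above; re-running the same call after an interruption picks up where the partial files left off.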

maxikraken2_1903_140GB (March 2019, 140GB)

A database built in a similar way to kraken2-microbial, but also allowing hashes to be generated from genomes that are not marked “complete” or “representative”, for the following database types:

The database has been made available for sharing by Daniel Fischer (Natural Resources Institute Finland) and was built using scripts available on GitHub. CSC - IT Center for Science, Finland, is acknowledged for computational resources.

Prepared 2019-03 by @fischuu

mkdir maxikraken2_1903_140GB/
cd maxikraken2_1903_140GB/
wget -c https://refdb.s3.climb.ac.uk/maxikraken2_1903_140GB/hash.k2d
wget https://refdb.s3.climb.ac.uk/maxikraken2_1903_140GB/opts.k2d
wget https://refdb.s3.climb.ac.uk/maxikraken2_1903_140GB/taxo.k2d

Note: wget -c lets you resume a stalled download without starting over!
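With a download this large, it can be worth confirming that all three files arrived before pointing kraken2 at the directory. The sketch below only checks that each file exists and is non-empty. It does not verify integrity, and the helper name check_db is an assumption made for the example.

```shell
# Hypothetical helper: succeed only if hash.k2d, opts.k2d and taxo.k2d
# all exist in the given directory and are non-empty.
# Usage: check_db <database-dir>
check_db() {
    dir="$1"
    status=0
    for f in hash.k2d opts.k2d taxo.k2d; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}
```

A possible use is check_db maxikraken2_1903_140GB && kraken2 --db maxikraken2_1903_140GB --memory-mapping reads.fastq, where reads.fastq is a placeholder for your own input and --memory-mapping asks kraken2 not to load the whole database into RAM.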