
Biodoop-BLAST is a wrapper-based MapReduce implementation of BLAST for Hadoop. It builds on Biodoop’s core component, which is in turn based on the Pydoop API for Hadoop.


  1. Install Biodoop core.

  2. Get biodoop-blast from the download page.

  3. Unpack the biodoop-blast tarball, move to the distribution directory and run:

    python setup.py install

    for a system-wide installation, or:

    python setup.py install --user

    for a local (per-user) installation.


Here is a quick usage example that can be run on a laptop:

$ hadoop dfsadmin -safemode wait
$ wget*random*
$ zcat chr*.gz >genome
$ formatdb -p F -o T -i genome
$ tar cf genome.tar genome.*
$ cat >query.fa
$ biodoop_blast -o output.csv -p blastn -d genome query.fa genome.tar
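
In the example above, `cat >query.fa` reads the query from standard input, so it waits until a FASTA record is typed or pasted and closed with Ctrl-D. For scripted runs, a here-document does the same job non-interactively. The following is a minimal sketch: the record name and the sequence are made-up placeholders (assumptions for illustration, not real data), and should be replaced with the actual query.

```shell
# Write a small FASTA query file without interactive input.
# NOTE: the header and sequence below are hypothetical placeholders.
cat >query.fa <<'EOF'
>example_query hypothetical placeholder sequence
ACGTACGTTAGCTAGCTAGGATCCGATCGATCGTTAGCAACGT
GGCTAGCTAGGATCGATCGTACGATCGATCGTAGCTAGCTAA
EOF

# Sanity check: count FASTA records (header lines start with '>').
grep -c '^>' query.fa   # prints 1
```

The resulting `query.fa` can then be passed to `biodoop_blast` exactly as in the last line of the example above.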