[DRE-commits] [ruby-crb-blast] 01/03: Imported Upstream version 0.6.4
Michael Crusoe
misterc-guest at moszumanska.debian.org
Sat Oct 3 05:34:59 UTC 2015
This is an automated email from the git hooks/post-receive script.
misterc-guest pushed a commit to branch master
in repository ruby-crb-blast.
commit 0c291256e88d55c21c4a9c5b282bfdbf48af545c
Author: Michael R. Crusoe <michael.crusoe at gmail.com>
Date: Sat Sep 19 21:57:30 2015 -0700
Imported Upstream version 0.6.4
---
.gitignore | 13 ++
Gemfile | 3 +
README.md | 95 ++++++++
Rakefile | 8 +
bin/crb-blast | 75 +++++++
build | 1 +
crb-blast.gemspec | 29 +++
deps/deps.yaml | 27 +++
lib/crb-blast.rb | 4 +
lib/crb-blast/cmd.rb | 19 ++
lib/crb-blast/crb-blast.rb | 547 +++++++++++++++++++++++++++++++++++++++++++++
lib/crb-blast/hit.rb | 36 +++
lib/crb-blast/version.rb | 12 +
metadata.yml | 254 +++++++++++++++++++++
test/helper.rb | 16 ++
test/query.fasta | 22 ++
test/query2.fasta | 30 +++
test/target.fasta | 62 +++++
test/target2.fasta | 76 +++++++
test/test_bin.rb | 17 ++
test/test_test.rb | 99 ++++++++
test/test_test2.rb | 90 ++++++++
22 files changed, 1535 insertions(+)
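
Note on the algorithm: the README.md added below says CRB-BLAST learns an e-value cutoff per alignment length by fitting a function to the e-values of the reciprocal best hits, and find_secondaries in lib/crb-blast/crb-blast.rb does this with a sliding-window mean of -log10(e-value). The following is a minimal, simplified Ruby sketch of that idea and is not the gem's code. The names fit_cutoffs and passes_cutoff? and the sample points are hypothetical; the window rule (10% of the length, at least 5) and the 1e-200 floor for zero e-values are taken from the diff.

```ruby
# Hypothetical helper, not part of crb-blast: build a length => cutoff table.
# points is an array of hashes {e: -log10(evalue), length: alignment length}
# collected from the strict reciprocal best hits.
def fit_cutoffs(points, longest)
  by_length = Hash.new { |hash, key| hash[key] = [] }
  points.each { |pt| by_length[pt[:length]] << pt[:e] }

  cutoffs = {}
  (10..longest).each do |centre|
    # Window half-width: 10% of the length, but never less than 5.
    half = [(centre * 0.1).to_i, 5].max
    window = (-half..half).flat_map { |side| by_length.fetch(centre + side, []) }
    # The cutoff is the mean -log10(e) of the hits inside the window.
    cutoffs[centre] = window.sum.to_f / window.size unless window.empty?
  end
  cutoffs
end

# Hypothetical helper: does a non-reciprocal hit reach the fitted cutoff?
def passes_cutoff?(evalue, alnlen, cutoffs)
  e = evalue.zero? ? 1e-200 : evalue # avoid log10(0), as the gem does
  cutoffs.key?(alnlen) && -Math.log10(e) >= cutoffs[alnlen]
end

# Made-up example data, for illustration only.
points = [
  { e: 50.0, length: 100 },
  { e: 30.0, length: 104 },
  { e: 10.0, length: 200 }
]
cutoffs = fit_cutoffs(points, 200)
```

With this table, a hit of alignment length 100 is accepted only if its -log10(e-value) is at least the mean of the nearby reciprocal hits; lengths with no reciprocal hits in their window get no cutoff and are never accepted.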
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..3b6cbee
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,13 @@
+*.lock
+*.nhr
+*.nin
+*.nsq
+*.phr
+*.pin
+*.psq
+*.blast
+coverage
+*~
+*.gem
+*fa
+*tsv
diff --git a/Gemfile b/Gemfile
new file mode 100644
index 0000000..b4e2a20
--- /dev/null
+++ b/Gemfile
@@ -0,0 +1,3 @@
+source "https://rubygems.org"
+
+gemspec
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..4ef318f
--- /dev/null
+++ b/README.md
@@ -0,0 +1,95 @@
+CRB-BLAST
+=========
+
+Conditional Reciprocal Best BLAST - high confidence ortholog assignment.
+
+### What is Conditional Reciprocal Best BLAST?
+
+CRB-BLAST is a novel method for finding orthologs between one set of sequences and another. This is particularly useful in genome and transcriptome annotation.
+
+CRB-BLAST initially performs a standard reciprocal best BLAST. It does this by performing BLAST alignments of query->target and target->query. Reciprocal best BLAST hits are those where the best match for any given query sequence in the query->target alignment is also the best hit of the match in the reverse (target->query) alignment.
+
+Reciprocal best BLAST is a very conservative way to assign orthologs. The main innovation in CRB-BLAST is to learn an appropriate e-value cutoff to apply to each pairwise alignment by taking into account the overall relatedness of the two datasets being compared. This is done by fitting a function to the distribution of alignment e-values over sequence lengths. The function provides the e-value cutoff for a sequence of given length.
+
+CRB-BLAST greatly improves the accuracy of ortholog assignment for de-novo transcriptome assembly ([Aubry et al. 2014](http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1004365)).
+
+The CRB-BLAST algorithm was designed by [Steve Kelly](http://www.stevekellylab.com), and this implementation is by Chris Boursnell and Richard Smith-Unna. The original reference implementation from the paper is available for online use at http://www.stevekellylab.com/software/conditional-orthology-assignment.
+
+### Installation
+
+To install CRB-BLAST, simply use rubygems:
+
+`gem install crb-blast`
+
+### Prerequisites
+
+ - NCBI BLAST+ (preferably the latest version) should be installed and in your PATH.
+ - Ruby v2.0 or later. If you don't have Ruby, we suggest installing it with [RVM](http://rvm.io).
+
+`\curl -sSL https://get.rvm.io | bash -s stable --ruby`
+
+
+### Usage
+
+CRB-BLAST can be run from the command-line as a standalone program, or used as a library in your own code.
+
+#### Command-line usage
+
+CRB-BLAST can be run from the command line with:
+
+```
+crb-blast
+```
+
+The options are
+
+```
+ --query, -q <s>: query fasta file in nucleotide format
+ --target, -t <s>: target fasta file as nucleotide or protein
+ --evalue, -e <f>: e-value cut off for BLAST. Format 1e-5 (default: 1.0e-05)
+ --threads, -h <i>: number of threads to run BLAST with (default: 1)
+ --output, -o <s>: output file as tsv
+ --split, -s: split the fasta files into chunks and run multiple blast
+ jobs and then combine them.
+ --help, -l: Show this message
+```
+
+An example command is:
+
+```bash
+crb-blast --query assembly.fa --target reference_proteins.fa --threads 8 --output annotation.tsv
+```
+
+#### Library usage
+
+To include the gem in your code just `require 'crb-blast'`
+
+A quick example:
+
+```ruby
+blaster = CRB_Blast::CRB_Blast.new('test/query.fasta', 'test/target.fasta')
+blaster.run(1e-5, 4, true) # to run with an evalue cutoff of 1e-5 and 4 threads
+```
+
+A longer example with each step at a time:
+
+```ruby
+blaster = CRB_Blast::CRB_Blast.new('test/query.fasta', 'test/target.fasta')
+blaster.makedb
+blaster.run_blast(1e-5, 6, true)
+blaster.load_outputs
+blaster.find_reciprocals
+blaster.find_secondaries
+```
+
+### Getting help
+
+Please use the issue tracker if you find bugs or have trouble running CRB-BLAST.
+
+Chris Boursnell <cmb211 at cam.ac.uk> maintains this software.
+
+### License
+
+This is academic software - please cite us if you use it in your work.
+
+CRB-BLAST is released under the MIT license.
diff --git a/Rakefile b/Rakefile
new file mode 100644
index 0000000..debc11c
--- /dev/null
+++ b/Rakefile
@@ -0,0 +1,8 @@
+require 'rake/testtask'
+
+Rake::TestTask.new do |t|
+ t.libs << 'test'
+end
+
+desc "Run tests"
+task :default => :test
diff --git a/bin/crb-blast b/bin/crb-blast
new file mode 100755
index 0000000..1978c52
--- /dev/null
+++ b/bin/crb-blast
@@ -0,0 +1,75 @@
+#!/usr/bin/env ruby
+
+#
+# run crb-blast from the cli
+#
+
+require 'trollop'
+require 'crb-blast'
+require 'bindeps'
+
+opts = Trollop::options do
+ version CRB_Blast::VERSION::STRING.dup
+ banner <<-EOS
+
+CRB-Blast v#{CRB_Blast::VERSION::STRING.dup} by Chris Boursnell <cmb211 at cam.ac.uk>
+
+Conditional Reciprocal Best BLAST
+
+USAGE:
+crb-blast <options>
+
+OPTIONS:
+
+EOS
+ opt :query,
+ "query fasta file",
+ :required => true,
+ :type => String
+
+ opt :target,
+ "target fasta file",
+ :required => true,
+ :type => String
+
+ opt :evalue,
+ "e-value cut off for BLAST. Format 1e-5",
+ :default => 1e-5,
+ :type => :float
+
+ opt :threads,
+ "number of threads to run BLAST with",
+ :default => 1,
+ :type => :int
+
+ opt :output,
+ "output file as tsv",
+ :required => true,
+ :type => String
+
+ opt :split,
+ "split the fasta files into chunks and run multiple blast jobs and then"+
+ " combine them."
+end
+
+Trollop::die :query, "must exist" if !File.exist?(opts[:query])
+Trollop::die :target, "must exist" if !File.exist?(opts[:target])
+
+gem_dir = Gem.loaded_specs['crb-blast'].full_gem_path
+gem_deps = File.join(gem_dir, 'deps', 'deps.yaml')
+Bindeps.require gem_deps
+
+blaster = CRB_Blast::CRB_Blast.new(opts.query, opts.target)
+dbs = blaster.makedb
+run = blaster.run_blast(opts.evalue, opts.threads, opts.split)
+load = blaster.load_outputs
+recips = blaster.find_reciprocals
+secondaries = blaster.find_secondaries
+
+File.open("#{opts.output}", 'w') do |out|
+ blaster.reciprocals.each_pair do |query_id, hits|
+ hits.each do |hit|
+ out.write "#{hit}\n"
+ end
+ end
+end
diff --git a/build b/build
new file mode 100755
index 0000000..ae17607
--- /dev/null
+++ b/build
@@ -0,0 +1 @@
+gem build crb-blast.gemspec
\ No newline at end of file
diff --git a/crb-blast.gemspec b/crb-blast.gemspec
new file mode 100644
index 0000000..36f8e38
--- /dev/null
+++ b/crb-blast.gemspec
@@ -0,0 +1,29 @@
+
+require File.expand_path('../lib/crb-blast/version', __FILE__)
+
+Gem::Specification.new do |gem|
+ gem.name = 'crb-blast'
+ gem.version = CRB_Blast::VERSION::STRING.dup
+ gem.date = '2015-05-19'
+ gem.summary = "Run conditional reciprocal best blast"
+ gem.description = "See summary"
+ gem.authors = ["Chris Boursnell", "Richard Smith-Unna"]
+ gem.email = 'cmb211 at cam.ac.uk'
+ gem.files = `git ls-files`.split("\n")
+ gem.executables = ["crb-blast"]
+ gem.require_paths = %w( lib )
+ gem.homepage = 'https://github.com/cboursnell/crb-blast'
+ gem.license = 'MIT'
+
+ gem.add_dependency 'trollop', '~> 2.0'
+ gem.add_dependency 'bio', '~> 1.4', '>= 1.4.3'
+ gem.add_dependency 'fixwhich', '~> 1.0', '>= 1.0.2'
+ gem.add_dependency 'threach', '~> 0.2', '>= 0.2.0'
+ gem.add_dependency 'bindeps', '~> 1.0', '>= 1.0.3'
+
+ gem.add_development_dependency 'rake', '~> 10.3', '>= 10.3.2'
+ gem.add_development_dependency 'turn', '~> 0.9', '>= 0.9.7'
+ gem.add_development_dependency 'simplecov', '~> 0.8', '>= 0.8.2'
+ gem.add_development_dependency 'shoulda-context', '~> 1.2', '>= 1.2.1'
+ gem.add_development_dependency 'coveralls', '~> 0.7'
+end
diff --git a/deps/deps.yaml b/deps/deps.yaml
new file mode 100644
index 0000000..4ea1100
--- /dev/null
+++ b/deps/deps.yaml
@@ -0,0 +1,27 @@
+blastplus:
+ binaries:
+ - makeblastdb
+ - blastn
+ - tblastn
+ - blastp
+ - blastx
+ - tblastx
+ - makembindex
+ - psiblast
+ - rpsblast
+ - blastdbcmd
+ - segmasker
+ - dustmasker
+ - blast_formatter
+ - windowmasker
+ - blastdb_aliastool
+ - deltablast
+ - rpstblastn
+ - blastdbcheck
+ version:
+ number: '2.2.29'
+ command: 'blastx -version'
+ url:
+ 64bit:
+ macosx: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.29/ncbi-blast-2.2.29+-universal-macosx.tar.gz
+ linux: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.29/ncbi-blast-2.2.29+-x64-linux.tar.gz
\ No newline at end of file
diff --git a/lib/crb-blast.rb b/lib/crb-blast.rb
new file mode 100644
index 0000000..2f72c3d
--- /dev/null
+++ b/lib/crb-blast.rb
@@ -0,0 +1,4 @@
+require 'crb-blast/cmd'
+require 'crb-blast/hit'
+require 'crb-blast/crb-blast'
+require 'crb-blast/version'
diff --git a/lib/crb-blast/cmd.rb b/lib/crb-blast/cmd.rb
new file mode 100644
index 0000000..713cbbd
--- /dev/null
+++ b/lib/crb-blast/cmd.rb
@@ -0,0 +1,19 @@
+require 'open3'
+
+module CRB_Blast
+
+ class Cmd
+
+ attr_accessor :cmd, :stdout, :stderr, :status
+
+ def initialize cmd
+ @cmd = cmd
+ end
+
+ def run
+ @stdout, @stderr, @status = Open3.capture3 @cmd
+ end
+
+ end
+
+end
\ No newline at end of file
diff --git a/lib/crb-blast/crb-blast.rb b/lib/crb-blast/crb-blast.rb
new file mode 100644
index 0000000..87c8edd
--- /dev/null
+++ b/lib/crb-blast/crb-blast.rb
@@ -0,0 +1,547 @@
+#!/usr/bin/env ruby
+
+require 'bio'
+require 'fixwhich'
+require 'threach'
+
+module CRB_Blast
+
+ class Bio::FastaFormat
+ def isNucl?
+ Bio::Sequence.guess(self.seq, 0.9, 500) == Bio::Sequence::NA
+ end
+
+ def isProt?
+ Bio::Sequence.guess(self.seq, 0.9, 500) == Bio::Sequence::AA
+ end
+ end
+
+ class CRB_Blast
+
+ include Which
+
+ attr_accessor :query_name, :target_name, :reciprocals
+ attr_accessor :missed
+ attr_accessor :target_is_prot, :query_is_prot
+ attr_accessor :query_results, :target_results, :working_dir
+
+ def initialize query, target, output=nil
+ raise IOError.new("File not found #{query}") if !File.exist?(query)
+ raise IOError.new("File not found #{target}") if !File.exist?(target)
+ @query = File.expand_path(query)
+ @target = File.expand_path(target)
+ if output.nil?
+ #@working_dir = File.expand_path(File.dirname(query)) # no trailing /
+ @working_dir = "."
+ else
+ @working_dir = File.expand_path(output)
+ mkcmd = "mkdir #{@working_dir}"
+ if !Dir.exist?(@working_dir)
+ puts mkcmd
+ mkdir = Cmd.new(mkcmd)
+ mkdir.run
+ if !mkdir.status.success?
+ raise RuntimeError.new("Unable to create output directory")
+ end
+ end
+ end
+ @makedb_path = which('makeblastdb')
+ raise 'makeblastdb was not in the PATH' if @makedb_path.empty?
+ @blastn_path = which('blastn')
+ raise 'blastn was not in the PATH' if @blastn_path.empty?
+ @tblastn_path = which('tblastn')
+ raise 'tblastn was not in the PATH' if @tblastn_path.empty?
+ @blastx_path = which('blastx')
+ raise 'blastx was not in the PATH' if @blastx_path.empty?
+ @blastp_path = which('blastp')
+ raise 'blastp was not in the PATH' if @blastp_path.empty?
+ @makedb_path = @makedb_path.first
+ @blastn_path = @blastn_path.first
+ @tblastn_path = @tblastn_path.first
+ @blastx_path = @blastx_path.first
+ @blastp_path = @blastp_path.first
+ end
+
+ #
+ # makes a blast database from the query and the target
+ #
+ def makedb
+ # only scan the first few hundred entries
+ n = 100
+ # check if the query is a nucl or prot seq
+ query_file = Bio::FastaFormat.open(@query)
+ count_p=0
+ count=0
+ query_file.take(n).each do |entry|
+ count_p += 1 if entry.isProt?
+ count += 1
+ end
+ if count_p > count*0.9
+ @query_is_prot = true
+ else
+ @query_is_prot = false
+ end
+
+ # check if the target is a nucl or prot seq
+ target_file = Bio::FastaFormat.open(@target)
+ count_p=0
+ count=0
+ target_file.take(n).each do |entry|
+ count_p += 1 if entry.isProt?
+ count += 1
+ end
+ if count_p > count*0.9
+ @target_is_prot = true
+ else
+ @target_is_prot = false
+ end
+ # construct the output database names
+ @query_name = File.basename(@query).split('.')[0..-2].join('.')
+ @target_name = File.basename(@target).split('.')[0..-2].join('.')
+
+ # check if the databases already exist in @working_dir
+ make_query_db_cmd = "#{@makedb_path} -in #{@query}"
+ make_query_db_cmd << " -dbtype nucl " if !@query_is_prot
+ make_query_db_cmd << " -dbtype prot " if @query_is_prot
+ make_query_db_cmd << " -title #{query_name} "
+ make_query_db_cmd << " -out #{@working_dir}/#{query_name}"
+ db_query = "#{query_name}.nsq" if !@query_is_prot
+ db_query = "#{query_name}.psq" if @query_is_prot
+ if !File.exists?("#{@working_dir}/#{db_query}")
+ make_db = Cmd.new(make_query_db_cmd)
+ make_db.run
+ if !make_db.status.success?
+ msg = "BLAST Error creating database:\n" +
+ make_db.stdout + "\n" +
+ make_db.stderr
+ raise RuntimeError.new(msg)
+ end
+ end
+
+ make_target_db_cmd = "#{@makedb_path} -in #{@target}"
+ make_target_db_cmd << " -dbtype nucl " if !@target_is_prot
+ make_target_db_cmd << " -dbtype prot " if @target_is_prot
+ make_target_db_cmd << " -title #{target_name} "
+ make_target_db_cmd << " -out #{@working_dir}/#{target_name}"
+
+ db_target = "#{target_name}.nsq" if !@target_is_prot
+ db_target = "#{target_name}.psq" if @target_is_prot
+ if !File.exists?("#{@working_dir}/#{db_target}")
+ make_db = Cmd.new(make_target_db_cmd)
+ make_db.run
+ if !make_db.status.success?
+ raise RuntimeError.new("BLAST Error creating database")
+ end
+ end
+ @databases = true
+ [@query_name, @target_name]
+ end
+
+ # Construct BLAST output file name and run blast with multiple chunks or
+ # with multiple threads
+ #
+ # @param [Float] evalue The evalue cutoff to use with BLAST
+ # @param [Integer] threads The number of threads to run
+ # @param [Boolean] split If the fasta files should be split into chunks
+ def run_blast(evalue, threads, split)
+ if @databases
+ @output1 = "#{@working_dir}/#{query_name}_into_#{target_name}.1.blast"
+ @output2 = "#{@working_dir}/#{target_name}_into_#{query_name}.2.blast"
+ if @query_is_prot
+ if @target_is_prot
+ bin1 = "#{@blastp_path} "
+ bin2 = "#{@blastp_path} "
+ else
+ bin1 = "#{@tblastn_path} "
+ bin2 = "#{@blastx_path} "
+ end
+ else
+ if @target_is_prot
+ bin1 = "#{@blastx_path} "
+ bin2 = "#{@tblastn_path} "
+ else
+ bin1 = "#{@blastn_path} "
+ bin2 = "#{@blastn_path} "
+ end
+ end
+ if split and threads > 1
+ run_blast_with_splitting evalue, threads, bin1, bin2
+ else
+ run_blast_with_threads evalue, threads, bin1, bin2
+ end
+ return true
+ else
+ return false
+ end
+ end
+
+ # Run BLAST using its own multithreading
+ #
+ # @param [Float] evalue The evalue cutoff to use with BLAST
+ # @param [Integer] threads The number of threads to run
+ # @param [String] bin1
+ # @param [String] bin2
+ def run_blast_with_threads evalue, threads, bin1, bin2
+ # puts "running blast with #{threads} threads"
+ cmd1 = "#{bin1} -query #{@query} -db #{@working_dir}/#{@target_name} "
+ cmd1 << " -out #{@output1} -evalue #{evalue} "
+ cmd1 << " -outfmt \"6 std qlen slen\" "
+ cmd1 << " -max_target_seqs 50 "
+ cmd1 << " -num_threads #{threads}"
+
+ cmd2 = "#{bin2} -query #{@target} -db #{@working_dir}/#{@query_name} "
+ cmd2 << " -out #{@output2} -evalue #{evalue} "
+ cmd2 << " -outfmt \"6 std qlen slen\" "
+ cmd2 << " -max_target_seqs 50 "
+ cmd2 << " -num_threads #{threads}"
+ if !File.exist?("#{@output1}")
+ blast1 = Cmd.new(cmd1)
+ blast1.run
+ if !blast1.status.success?
+ raise RuntimeError.new("BLAST Error:\n#{blast1.stderr}")
+ end
+ end
+
+ if !File.exist?("#{@output2}")
+ blast2 = Cmd.new(cmd2)
+ blast2.run
+ if !blast2.status.success?
+ raise RuntimeError.new("BLAST Error:\n#{blast2.stderr}")
+ end
+ end
+ end
+
+ # Run BLAST by splitting the input into multiple chunks and using 1 thread
+ # for each chunk
+ #
+ # @param [Float] evalue The evalue cutoff to use with BLAST
+ # @param [Integer] threads The number of threads to run
+ # @param [String] bin1
+ # @param [String] bin2
+ def run_blast_with_splitting evalue, threads, bin1, bin2
+ # puts "running blast by splitting input into #{threads} pieces"
+ if !File.exist?(@output1)
+ blasts=[]
+ files = split_input(@query, threads)
+ threads = [threads, files.length].min
+ files.threach(threads) do |thread|
+ cmd1 = "#{bin1} -query #{thread} -db #{@working_dir}/#{@target_name} "
+ cmd1 << " -out #{thread}.blast -evalue #{evalue} "
+ cmd1 << " -outfmt \"6 std qlen slen\" "
+ if bin1=~/blastn/
+ cmd1 << " -dust no "
+ elsif bin1=~/blastx/ or bin1=~/blastp/ or bin1=~/tblastn/
+ cmd1 << " -seg no "
+ end
+ cmd1 << " -soft_masking false "
+ cmd1 << " -max_target_seqs 50 "
+ cmd1 << " -num_threads 1"
+ if !File.exists?("#{thread}.blast")
+ blast1 = Cmd.new(cmd1)
+ blast1.run
+ if !blast1.status.success?
+ raise RuntimeError.new("BLAST Error:\n#{blast1.stderr}")
+ end
+ end
+ blasts << "#{thread}.blast"
+ end
+ cat_cmd = "cat "
+ cat_cmd << blasts.join(" ")
+ cat_cmd << " > #{@output1}"
+ catting = Cmd.new(cat_cmd)
+ catting.run
+ if !catting.status.success?
+ raise RuntimeError.new("Problem catting files:\n#{catting.stderr}")
+ end
+ files.each do |file|
+ File.delete(file) if File.exist?(file)
+ end
+ blasts.each do |b|
+ File.delete(b) # delete intermediate blast output files
+ end
+ end
+
+ if !File.exist?(@output2)
+ blasts=[]
+ files = split_input(@target, threads)
+ threads = [threads, files.length].min
+ files.threach(threads) do |thread|
+ cmd2 = "#{bin2} -query #{thread} -db #{@working_dir}/#{@query_name} "
+ cmd2 << " -out #{thread}.blast -evalue #{evalue} "
+ cmd2 << " -outfmt \"6 std qlen slen\" "
+ cmd2 << " -max_target_seqs 50 "
+ cmd2 << " -num_threads 1"
+ if !File.exists?("#{thread}.blast")
+ blast2 = Cmd.new(cmd2)
+ blast2.run
+ if !blast2.status.success?
+ raise RuntimeError.new("BLAST Error:\n#{blast2.stderr}")
+ end
+ end
+ blasts << "#{thread}.blast"
+ end
+ cat_cmd = "cat "
+ cat_cmd << blasts.join(" ")
+ cat_cmd << " > #{@output2}"
+ catting = Cmd.new(cat_cmd)
+ catting.run
+ if !catting.status.success?
+ raise RuntimeError.new("Problem catting files:\n#{catting.stderr}")
+ end
+ files.each do |file|
+ File.delete(file) if File.exist?(file)
+ end
+ blasts.each do |b|
+ File.delete(b) # delete intermediate blast output files
+ end
+ end
+
+ end
+
+ # Split a fasta file in pieces
+ #
+ # @param [String] filename
+ # @param [Integer] pieces
+ def split_input filename, pieces
+ input = {}
+ name = nil
+ seq=""
+ sequences=0
+ File.open(filename).each_line do |line|
+ if line =~ /^>(.*)$/
+ sequences+=1
+ if name
+ input[name]=seq
+ seq=""
+ end
+ name = $1
+ else
+ seq << line.chomp
+ end
+ end
+ input[name]=seq
+ # construct list of output file handles
+ outputs=[]
+ output_files=[]
+ pieces = [pieces, sequences].min
+ pieces.times do |n|
+ outfile = File.basename("#{filename}_chunk_#{n}.fasta")
+ outfile = "#{@working_dir}/#{outfile}"
+ outputs[n] = File.open("#{outfile}", "w")
+ output_files[n] = "#{outfile}"
+ end
+ # write sequences
+ count=0
+ input.each_pair do |name, seq|
+ outputs[count].write(">#{name}\n")
+ outputs[count].write("#{seq}\n")
+ count += 1
+ count %= pieces
+ end
+ outputs.each do |out|
+ out.close
+ end
+ output_files
+ end
+
+ # Load the two BLAST output files and store the hits in a hash
+ #
+ def load_outputs
+ if File.exist?("#{@working_dir}/reciprocal_hits.txt")
+ # puts "reciprocal output already exists"
+ else
+ @query_results = Hash.new
+ @target_results = Hash.new
+ q_count=0
+ t_count=0
+ if !File.exists?("#{@output1}")
+ raise RuntimeError.new("can't find #{@output1}")
+ end
+ if !File.exists?("#{@output2}")
+ raise RuntimeError.new("can't find #{@output2}")
+ end
+ if File.exists?("#{@output1}") and File.exists?("#{@output2}")
+ File.open("#{@output1}").each_line do |line|
+ cols = line.chomp.split("\t")
+ hit = Hit.new(cols, @query_is_prot, @target_is_prot)
+ @query_results[hit.query] = [] if !@query_results.has_key?(hit.query)
+ @query_results[hit.query] << hit
+ q_count += 1
+ end
+ File.open("#{@output2}").each_line do |line|
+ cols = line.chomp.split("\t")
+ hit = Hit.new(cols, @target_is_prot, @query_is_prot)
+ @target_results[hit.query] = [] if !@target_results.has_key?(hit.query)
+ @target_results[hit.query] << hit
+ t_count += 1
+ end
+ else
+ raise "need to run blast first"
+ end
+ end
+ [q_count, t_count]
+ end
+
+ # fills @reciprocals with strict reciprocal hits from the blast results
+ def find_reciprocals
+ if File.exist?("#{@working_dir}/reciprocal_hits.txt")
+ # puts "reciprocal output already exists"
+ else
+ @reciprocals = Hash.new
+ @missed = Hash.new
+ @evalues = []
+ @longest = 0
+ hits = 0
+ @query_results.each_pair do |query_id, list_of_hits|
+ list_of_hits.each_with_index do |target_hit, query_index|
+ if @target_results.has_key?(target_hit.target)
+ list_of_hits_2 = @target_results[target_hit.target]
+ list_of_hits_2.each_with_index do |query_hit2, target_index|
+ if query_index == 0 && target_index == 0 &&
+ query_id == query_hit2.target
+ e = target_hit.evalue.to_f
+ e = 1e-200 if e==0
+ e = -Math.log10(e)
+ @reciprocals[query_id] ||= []
+ @reciprocals[query_id] << target_hit
+ hits += 1
+ @longest = target_hit.alnlen if target_hit.alnlen > @longest
+ @evalues << {:e => e, :length => target_hit.alnlen}
+ elsif query_id == query_hit2.target
+ if !@missed.key?(query_id)
+ @missed[query_id] = []
+ end
+ @missed[query_id] << target_hit
+ end
+ end
+ end
+ end
+ end
+ end
+ return hits
+ end
+
+ # Learns the evalue cutoff based on the length of the sequence
+ # Finds hits that have a lower evalue than this cutoff
+ def find_secondaries
+
+ if File.exist?("#{@working_dir}/reciprocal_hits.txt")
+ # puts "reciprocal output already exists"
+ else
+ length_hash = Hash.new
+ fitting = Hash.new
+ @evalues.each do |h|
+ length_hash[h[:length]] = [] if !length_hash.key?(h[:length])
+ length_hash[h[:length]] << h
+ end
+
+ (10..@longest).each do |centre|
+ e = 0
+ count = 0
+ s = centre*0.1
+ s = s.to_i
+ s = 5 if s < 5
+ (-s..s).each do |side|
+ if length_hash.has_key?(centre+side)
+ length_hash[centre+side].each do |point|
+ e += point[:e]
+ count += 1
+ end
+ end
+ end
+ if count>0
+ mean = e/count
+ fitting[centre] = mean
+ end
+ end
+ hits = 0
+ @missed.each_pair do |id, list|
+ list.each do |hit|
+ l = hit.alnlen.to_i
+ e = hit.evalue
+ e = 1e-200 if e==0
+ e = -Math.log10(e)
+ if fitting.has_key?(l)
+ if e >= fitting[l]
+ if !@reciprocals.key?(id)
+ @reciprocals[id] = []
+ found = false
+ @reciprocals[id].each do |existing_hit|
+ if existing_hit.query == hit.query &&
+ existing_hit.target == hit.target
+ found = true
+ end
+ end
+ if !found
+ @reciprocals[id] << hit
+ hits += 1
+ end
+ end
+ end
+ end
+ end
+ end
+ end
+ return hits
+ end
+
+ def clear_memory
+ # running lots of jobs at the same time was keeping a lot of stuff in
+ # memory that you might not want so this empties out those big hashes.
+ @query_results = nil
+ @target_results = nil
+ end
+
+ # delete blast database files
+ def tidy_up
+ Dir["*nin"].each do |file|
+ File.delete(file)
+ end
+ Dir["*nhr"].each do |file|
+ File.delete(file)
+ end
+ Dir["*nsq"].each do |file|
+ File.delete(file)
+ end
+ Dir["*blast"].each do |file|
+ File.delete(file)
+ end
+ end
+
+ def run evalue=1e-5, threads=1, split=true
+ makedb
+ run_blast evalue, threads, split
+ load_outputs
+ find_reciprocals
+ find_secondaries
+ end
+
+ def size
+ hits=0
+ @reciprocals.each_pair do |key, list|
+ list.each do |hit|
+ hits += 1
+ end
+ end
+ hits
+ end
+
+ def write_output
+ s=""
+ unless @reciprocals.nil?
+ @reciprocals.each_pair do |query_id, hits|
+ hits.each do |hit|
+ s << "#{hit}\n"
+ end
+ end
+ File.open("#{@working_dir}/reciprocal_hits.txt", "w") {|f| f.write s }
+ end
+ end
+
+ def has_reciprocal? contig
+ return true if @reciprocals.has_key?(contig)
+ return false
+ end
+ end
+
+end
diff --git a/lib/crb-blast/hit.rb b/lib/crb-blast/hit.rb
new file mode 100644
index 0000000..9bde669
--- /dev/null
+++ b/lib/crb-blast/hit.rb
@@ -0,0 +1,36 @@
+module CRB_Blast
+
+ class Hit
+ # Fields: query id, subject id, % identity, alignment length, mismatches,
+ # gap opens, q. start, q. end, s. start, s. end, evalue, bit score
+ attr_accessor :query, :target, :id, :alnlen, :mismatches, :gaps, :qstart,
+ :qend, :tstart, :tend, :evalue, :bitscore, :qlen, :tlen, :qprot, :tprot
+
+ def initialize(list, qprot, tprot)
+ raise(RuntimeError, "unexpected number of columns") if list.length < 14
+ @query = list[0].split(/[\|\ ]/).first
+ @target = list[1].split(/[\|\ ]/).first
+ @id = list[2]
+ @alnlen = list[3].to_i
+ @mismatches = list[4].to_i
+ @gaps = list[5].to_i
+ @qstart = list[6].to_i
+ @qend = list[7].to_i
+ @tstart = list[8].to_i
+ @tend = list[9].to_i
+ @evalue = list[10].to_f
+ @bitscore = list[11].to_f
+ @qlen = list[12].to_f
+ @tlen = list[13].to_f
+ @qprot = qprot # bool
+ @tprot = tprot # bool
+ end
+
+ def to_s
+ s = "#{@query}\t#{@target}\t#{@id}\t#{@alnlen}\t#{@evalue}\t#{@bitscore}\t"
+ s << "#{@qstart}..#{@qend}\t#{@tstart}..#{@tend}\t#{@qlen}\t#{@tlen}"
+ return s
+ end
+ end
+
+end
\ No newline at end of file
diff --git a/lib/crb-blast/version.rb b/lib/crb-blast/version.rb
new file mode 100644
index 0000000..070cd27
--- /dev/null
+++ b/lib/crb-blast/version.rb
@@ -0,0 +1,12 @@
+module CRB_Blast
+
+ module VERSION
+ MAJOR = 0
+ MINOR = 6
+ PATCH = 4
+ BUILD = nil
+
+ STRING = [MAJOR, MINOR, PATCH, BUILD].compact.join('.')
+ end
+
+end
diff --git a/metadata.yml b/metadata.yml
new file mode 100644
index 0000000..b50249e
--- /dev/null
+++ b/metadata.yml
@@ -0,0 +1,254 @@
+--- !ruby/object:Gem::Specification
+name: crb-blast
+version: !ruby/object:Gem::Version
+ version: 0.6.4
+platform: ruby
+authors:
+- Chris Boursnell
+- Richard Smith-Unna
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2015-05-19 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+ name: trollop
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '2.0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '2.0'
+- !ruby/object:Gem::Dependency
+ name: bio
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.4'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.4.3
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.4'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.4.3
+- !ruby/object:Gem::Dependency
+ name: fixwhich
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.0'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.0.2
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.0'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.0.2
+- !ruby/object:Gem::Dependency
+ name: threach
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.2'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.2.0
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.2'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.2.0
+- !ruby/object:Gem::Dependency
+ name: bindeps
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.0'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.0.3
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.0'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.0.3
+- !ruby/object:Gem::Dependency
+ name: rake
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '10.3'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 10.3.2
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '10.3'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 10.3.2
+- !ruby/object:Gem::Dependency
+ name: turn
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.9'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.9.7
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.9'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.9.7
+- !ruby/object:Gem::Dependency
+ name: simplecov
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.8'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.8.2
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.8'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.8.2
+- !ruby/object:Gem::Dependency
+ name: shoulda-context
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.2'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.2.1
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.2'
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 1.2.1
+- !ruby/object:Gem::Dependency
+ name: coveralls
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.7'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.7'
+description: See summary
+email: cmb211 at cam.ac.uk
+executables:
+- crb-blast
+extensions: []
+extra_rdoc_files: []
+files:
+- ".gitignore"
+- Gemfile
+- README.md
+- Rakefile
+- bin/crb-blast
+- build
+- crb-blast.gemspec
+- deps/deps.yaml
+- lib/crb-blast.rb
+- lib/crb-blast/cmd.rb
+- lib/crb-blast/crb-blast.rb
+- lib/crb-blast/hit.rb
+- lib/crb-blast/version.rb
+- test/helper.rb
+- test/query.fasta
+- test/query2.fasta
+- test/target.fasta
+- test/target2.fasta
+- test/test_bin.rb
+- test/test_test.rb
+- test/test_test2.rb
+homepage: https://github.com/cboursnell/crb-blast
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.2.2
+signing_key:
+specification_version: 4
+summary: Run conditional reciprocal best blast
+test_files: []
diff --git a/test/helper.rb b/test/helper.rb
new file mode 100644
index 0000000..42b7dfa
--- /dev/null
+++ b/test/helper.rb
@@ -0,0 +1,16 @@
+require 'simplecov'
+require 'coveralls'
+
+SimpleCov.formatter = SimpleCov::Formatter::MultiFormatter[
+  SimpleCov::Formatter::HTMLFormatter,
+  Coveralls::SimpleCov::Formatter
+]
+SimpleCov.start
+
+require 'test/unit'
+begin; require 'turn/autorun'; rescue LoadError; end
+require 'shoulda-context'
+require 'crb-blast'
+
+Turn.config.format = :pretty if defined?(Turn) # turn is optional; skip when it failed to load
+Turn.config.trace = 5 if defined?(Turn)
diff --git a/test/query.fasta b/test/query.fasta
new file mode 100644
index 0000000..14979fe
--- /dev/null
+++ b/test/query.fasta
@@ -0,0 +1,22 @@
+>scaffold3
+CCACCCCACTTCAACTCCCAACAACATATATACATTACACACCCCCACAACACACTCACACACTCTCTCACACAACACCACCAGACAATGAACCACAACTACCGTTACAATGTTCTTCTTTTCTTCTTCCTCATCATCTTCTCACATACATCTGCTCATCGTTTCTTGCACACCAAATCAGGTGAAAGCATGGTGAAACCCGATGATNCCAGGTGAAAACATGATGAAACTCGATGATCAATCCATCAATTACACCGGTGAAAACCCTCTTGTCGAAGTAGAAATAGTCAACTCTGACTCCTTCAATGAGGTTATGGGGGTGGAGGACTGTGGGAGTGGAGATGAAGAGTGTTTGAAGAGAAGGGTGCTTGCAGATGCTCATTTGGACTATATTTACACTCAGCATCATAAGCATTGAAATAAGTTGTATTTCACCCTCTTTATTTATAGCTTTTTATGTCTTAATGTAACATAGGGTTAAATTCAAGTAAT [...]
+>scaffold4
+CCACCCCCCCACTTCAACTCCCAACAACATATATACATTACACACCCCCACAACACACTCACACACTCTCTCACACAACACCACCAGACAATGAACCACAACTACCGTTACAATGTTCTTCTTTTTTTCTTCTTCCTCATCATCTTCTCACATACATCTGCTCATCGTTTCTTGCACACCAAATCAGGTGAAAGCATGGTGAAACCCGATGATNCCAGGTGAAAACATGATGAAACTCGATGATCAATCCATCAATTACACCGGTGAAAACCCTCTTGTCGAAGTAGAAATAGTCAACTCTGACTCCTTCAATGAGGTTATGGGGGTGGAGGACTGTGGGAGTGGAGATGAAGAGTGTTTGAAGAGAAGGGTGCTTGCAGATGCTCATTTGGACTATATTTACACTCAGCATCATAAGCATTGAAATAAGTTGTATTTCACCCTCTTTATTTATAGCTTTTTATGTCTTAATGTAACATAGGGTTAAATTCA [...]
+>scaffold5
+CCACAAACAAGATTTGTTTTAAAGAAAGCTCTTGAATTCGGTCATGCAGTTGTTGTTGTTGTGAATAAGATTGATAGACCGTCTGCTCGCCCTGAATTCGTCATCAATTCTACTTTTGAACTCTTTATTGAACTCAATGCATCTGATGAACAATGTGATTTTCAAGCAGTGTATGCTAGTGGTATAAAGGGAATGGCTGGACTTTCTCCTGATAATTTAGCAGATGATCTTGGACCACTTTTTGAGACAATAATTAGATGCATACCCGGGCCACGTGTTAAGAAGGATGGCGCACTTCAAATGCTTGTCACGAGCACAGAATATGATGAGCACAAAGGAAGAATAGCGATTGGGAGGTTACATGCCGGTATTTTGAGTAGAGGCATGGATGTTCGTATATGCACACCAGATGATGCATGCAGATTTGGGAAAGTTAGTGAATTGTTTGTANGATTAGTTTCTGTTATGTAAAATGTGTAACAGATTGTGTTC [...]
+>scaffold19
+CTGTTGGCTACAAGTAGTGCTATATTTCACCTTTTAATTACATTGATAGCTGCCTTCAACCCAATTTATTCGTATGATCAATTCATGGACGAGTGTGCAATTGGCTTCTCTGGTGTTCTATTCTCTATGATTGTTATAGAAACAAGCTTAAGTGGAGTCCAATCTAGAAATGTATTTGGACTTTTTAATGTACCTGCTAAGTTGTATCCGTGGTTGTTATTGGTGCTGTTCCAGCTTCTTATGACAAACATCTCATTACTCGGACACTTATGTGGGATTCTATCTGGATTTGCATATACTTATGGATTATTCAACTTTATAATTCCGGGCAGCTCGTTCTTTTCAGGCATAGAATCAGCTTCTTGGCTTTCAACTTGTGTGCGTAGACCAAAATATATAATGTGCACAGGTGGTGATCCTTCTGGATACATCCCTACATATTCTACTCGTAACACAACATCTAGTGGAACTCTTTCCGGGAATATGTGGACA [...]
+>scaffold27
+TATTTTCCCAAAAGGTTACAAAAGTATTTGATAATAACATAAAAATTAACCATATATGTGGTGAAGATTAGCCCAATGGGTACACCAGTGGTGAAGATCAAAGGTTAAGTTGTAACCTCAAAACCCATCTCCCTATCCAAATCCTTTAAACCAGTGGTCAATATTTCACAGCCATGTGGCATCACAGGCAATCTCAATTGTGTGGTGGTTAACAATTCAAAAGCTAATCCTCTCTATATAAGTTATTATCTTCNCTTCACAGTGTACCTTCGTATCATTCCTAAATTCAACTTCAGTTCCACTGACACTCGCCTGAGATCAATGGCAACCGCCGTCTCTACCGTCGGAGCCGTCAGCCGTGTTCCGTTGAGCTTGAATGGAACCAGTGGTGGTGCTGCAGTCCCAAACTCAGCTTTCATGGGCAACAGCTTGAAGAAGACAGTGAACTCAAGATTGATAAACAACAAGCCAGTGTCCCTGAAAGTATTTGCA [...]
+>scaffold32
+GGAAAGACATTTAGAGAGCAAGTTCTTCGCTGGTTAACTTATCATAAGATTTCTCAAGTTGATTCTATTGTTTTGACCCATGAACATGCTGATGCAGTACTTGGTCTTGATGACATCCGTGCAGTGCAACCATATAGTCCAATAAATGAAATTGAACCTACACCTATTTTCCTGAATCAACATGCGATGGAAAGCTTAAAAGTGAAATTTCCGTACTTGGTTCAAAAGAAGCTCAAGGCTGGTCAAGAAGTTAGACGTGTGGCACAACTTGATTGGAAAATAATTGAGAATGATTGTAAGAGTCCATTTGTTGCATCCGGGCTAGAAATTATACCCTTGCCGGTAATGCATGGGGAGGATTACATCTGTTTAGGTTTTTCATTTGGGAAACATAGCAAAGTAGCTTATATATCTGATATTTCTCGTTTCATAGAAGAAACAGAACTCTATATTTCTAAAAGAGGAGGTAAGCAGCTGGATCTTCTTATATTG [...]
+>scaffold52
+ACTCAGTATCCAAACGCCCCCTATGACATGTTCATCGAACACCCGCCAAAACCTTCTCGTCTGATTCATCTTCTCCGTCCAAAATCAATCAAAAACCCTAAAATTCATCCCCAAAATGAAGATCACAAAACGCTTCAATCGAGTCGTTAAACCCCTAAAACCCATAAATTCATCCACAAGTGATTTCAAACCTAATACCACCAAGAACAAACCAAATCTGAAGCTGAAATCGAAAACAAAACCTACAACAATCAAGCAGAAACCCCCAAATAAACACGATATTCATTCAATAGAATCCAAACCCTCGTTCTCACTCGAGAAATCCCCAATTTTGGGCCCTGAGTNGCAAAACCTACAACAATCAAACAAAAAGCCCCAACTCAACCCACTATTCACTCAACAGAATCAAAACCCTCGTTCTTACTCGAGAAATCCCCAATTTTGGGCCCTGAATTTTACCAAATTGACGCCCTTGATCTTGCGCCTCGGTTG [...]
+>scaffold53
+AAAGCCTTCACAAACCCAACACACCACATGTCAAGAATTTACATGGCCAATAATACACAGACATGACTAATCTAATGTAGCATCTTGAATCTTTTGCACATTACTTCATATCTAAAAATGAAAATATAAATTTCATAACACACAGTGACAATAACGCCAAGGGCAGGATCAGTGATGAGGTACACCACTACACGGGACTCAACATTCTCATCCAAGAAAACAACTCAAACCTAGTGATAATTTAATAACTTTTAAAGTTGAGATAATCCTCCTAAAGACGTAAAAGCCTAGAATTTCTATTTCGATAAGAAGCACGGCATTGAGCATTGTGATTTCTCGCCTTGAGGTCCCAACTTAGCCCCATGACACCCCCCCCTCCTTAAACCCACAAAAAAACTCAGCTAAAAAAAACATAGAAAGATCATGGCTAAGGTTGGTGATGAGGTACAAGAACTCAACATTCCCATCCAAGAACACAACTTTGAAAATGAT [...]
+>scaffold90
+GAGCTCTCCTGTCAGGTTTTTATATATAATAAAATCTTTAAACGTTGTTTATGTACTTTAGATAATTCGCTAAGTCGATGCATCTATATGTTGCATTACATTGCAGATACACGAGGAGGCACGGAAGTTTTCATATCAGACAGGTGTCAAGGTGGTGGTTGTTTATGGAGGAACACCGATTAATCAACAGTTAAGNCTCTCCTGTCAGATACACGAGGAGGCACGGAAGTTTTCATATCAGACAGGTGTCAAAGTGGTGGTTGTTTATGGAGGAACACCGATTAATCAACAGTTAAGAGAGCTCGAGAGGGGAGTTGATATTCTTGTTGCCACACCAGGAAGGTTAGTTGATCTACTGGAGAGGGCAAAAGTGTCATTGCAAATGATAAGGTATCTAGCTCTTGATGAGGCGGATCGGATGCTGGATATGGGATTTGAGCCTCAAATTAGAAAAATTGTCGAGCAAACGGACATGCCTCCACCAGGTGAAAG [...]
+>scaffold103
+GTCAAACTCAAAGTGGTTGTGTTCCACGCAAAGATAAGGTGTGAAGCATAAACCCCAAACCATTCCAGGATATTCTTCCTCGTGTTTGTTTCTTTCTTATCGGAAAACAACTCTGCTTCGATGCAGATCGGAGTCCTGATTTCATGGACCGCATCATAACATCATACCCTTGTTCTTCTTCTTCTACTCTAACACTGTCCCTCTCAAGATTTTGTTCAAATATCAAGAAACTAGGAAGAAGAAATTCCACTGGCATCTCTCTCTCTTGTTCAAACCCTCAAAACCCTAAATCTTTATGGCAGAAATACACCTCATCCATCAAGAAGTTACCCATTTTCAGACACTACGCCATGGATCCACTCAGTCGCCCCAGTTGCCTGCTGAATGGAGGGCTCGGTATGGCTCTGTTGAGCGTAACTGCAACGGCAAAGGTTCGTATTAGCCCCTTTGTTGCCACACTAGCTGCGAATCCAACTTTTGTCTCTGGGTTTT [...]
+>scaffold109
+CCAGATGAAGACGACGATAAGGAGGAAAAAGCATCGAAATCATCACTACACAACAACAAGGTGAAAAACTACTCGTGTGTTGATAACTGCTGTTGGTTCATCGGTTGCGTGTGCTCCGTGTGGTGGTTGTTGCTGTTTCTGTACAATGCGATGCCGGCGTCGTTCCCGCAGTTCGTGACGGAGGCGATCTCCGGACCGTTTCCGGATCCTCCAGGTGTAAAGTGTTTGAAGGAGGGGCTGAAGGTGAAGCATCCGGTGGTGTTTGTGCCCGGGATTGTGACCGGTGGACTTGAGCTGTGGGAGGGGCACCAGTGTATGGATGGATTGTTCAGGAAGAGGCTTTGGGGCGGTACGTTTGGTGAAGTTTATAAGAGGCCTTCATGCTGGGTACAACATATGTCGCTGGACAACAAAACTGGGATGGATCCACCAGGTATACGGGTCAGACCCGTAAATGGACTTGTAGCTGCTGATTACTTCGCCCCAGGATAT [...]
diff --git a/test/query2.fasta b/test/query2.fasta
new file mode 100644
index 0000000..7c57e66
--- /dev/null
+++ b/test/query2.fasta
@@ -0,0 +1,30 @@
+>scaffold24621
+AAAAAGCACACTCAAAAGACATCAGCTTTTATTTCCCTGTCTGTCTGCCATTAATTCCCATAAATATAACACATCACATCTCTCTGCTCCTCCTTTCCCCTTAAAACCAGAGACCCTTTTCTCACCATAACATCTCTCTCTCTCTCTCTCTCTCTCTCTCATGGACTCTCATCATACCTTCATATAACCTTCATTTACCCCCAATTCCCCACCTGGGTTTTCTTCTTCACCATCTTCCTTCACCTGGGTCATCTTCCAATTTCACAATATGATCACTGGTTCCGACCTTTACCATGTCTTTACGGCCGTGGTTCCACTCTATGTCGCCATGATCCTAGCCTATGGCTCGGTGAAATGGTGGAAGATCTTCACCCCTGATCAATGCTCAGGGATCAACAGATTTGTAGCTCTATTTGCTGTCCCATTGCTCTCCTTTCACTTCATCTCAACTAACAACCCTTACACCATGAACTCTAGGTTCATCGCTGCTGA [...]
+>scaffold46687
+AACTTAAGCTGTACAAAAATGTTTGCACTTACAAGCATGCAAGAACATAAAATATGTCTTGGAGTGTAGAGATCAATTTGTTATGCCTGATTGCCTCAGACGAGATGAAGTGTGAGAAATTGTGTGCAACAAACTCATCTATAGACTATACACATGCAATTTGTAGGCGTCCAAATTTTTTATGCTAAGCACTCTTATAAGAGCAACTCCAACATAAACAAGCACAAAAAATCTTTCATGTTACATGTCCCAGCATTTATACATGTAACATCCAATACTAGCTAGTTAGATCTTGGGCTGCAAATATTTTTATTAATTGCAGTGCCACAATTCACCCTTGCATCACTCAGTTCTATGATATATTAATCTTTTTATCTTGTTATCATGCTGATATCATTAATTCACCTTTTAATTCAAGAGGAGCAGGAGATCCATCAAGAAGCACAAGGAACAGATGAAATAAGAGAAACNGCACAAGGAACAGATGAAATA [...]
+>scaffold47909
+TTTTTTTAAATGAAATTTAGAAAAACAACCAAAATGTTTTAAAAAAAATTGAAAAACTAAATAATAAATTCCTTACATTAGTTGATTATCATTTAACATCTTCTTTTGTAATTGATTTTGTTTCAATTCAAGTGTTTTCTAAATATATTAAAACATCATCATTCAATTGAATGTATGTCATGTACTGGTTGTAGGTAAAAAACACAAAAGATAACCTAACCATAAACATGCGTTTAATTAAGCATCTTTTTTACTTTTGAGACGTTTATAAAAGGTAAATAAAAAACATAAAGTTAATCACACTTATAAGTGTCAGTTGACCCTCTTTTTTTTTATGTTAACATGCTCACACTGAGATGTCCTTTCTGTTTGAAAAAATAAAGATATAAAGGTAGCTACCTTAACATACAAATTCTTCATTCATTGTCAAGCACCGCAGCATTTAATTTGAAAGCACTGCACCATGATCACTGTTTCAGATTTCTACCATGT [...]
+>C612630
+GTGCGCTTTTTTATAACCAAGAAATATATAATAATCAAAGTGCCTTTTCATGAAACATTAATACCTGGACAATTGCAATGCGTAAAAGAACGCCACGTAGGCCAATTGGATAAGAAACGATCGTCATTACAGCTGGTCCCACAAAGAATCTCACACCCATTGCCATTGTTGCCATGGAATTCCCGCATGCTATGATCCTTGGTTGTAATGCCATAAACAAA
+>contig-4577 283 0
+CGCTCGTTGGCGGCATTGACTTGACGCCTGTACCCGTTCCTTCAATATTTATGTTGTTATTTCCGTAACTCTCCAAGTACTCTTCTCCGTTACCTAAAGTACGTAAACGTAACATTTTATCGATCAGCTTAAATCTAAAAAGATAACGACAATGAAAATATTCTTAGATATATAAAAGTTGACCGGCCTTTTACGGGTGATATCGGAACTTTTAAATCTTTTGTGGTTGGTGGGTCGAGTGCTCCGTATTCACGAGCCGCGAATACATCGGAAACCGGAGAAG
+>contig-69208 224 0
+ATTTGCGCCACTTTCGACTCATTTTACCCGTTCCCTTTAAAGTTGCATATTTTTATTTGAGCCATTTAGACCGTATTACATAAAACACAACCCAAATAGACCCATTTATTCAAAACTGCCACCTCTATTTTCATTTATACCTAGACTAAACATAGCCATTCCTAGGCCTGCATCCGAGAGAATGGTAACTGAATTTTCGAGAATCTGTGGTTTCTTTATTCCCC
+>contig-115072 367 0
+GGTTGATACTGATCATGGTGTGGCGGAAACTTATCCGGAATCCAAACACATACTCGAGCCTCATTGGTCTAATATGGTCTCTTGTCTCGTTCAGGTAAATGGTTATGATATAAATATGTAGATTCAGGTAAATGATTATGATATAAATATGTAGATTTAAAAGTAGATTTAATACGTGTTCATATGTTAGAAGTTAAAAAAAACAATTATTAAAGGATAATGTTTTTAAATATATAGAATAAATATTTGTTTGTTAATAAATATATGTTTAATGGTGACAGATGGCATGTGGAAATGCCAATAATAATAGCAAAATCAATCTCAATACTTTCGGATGCTGGTCTTGGAATGGCTATGTTTAGCTTAG
+>contig-1453 255 0
+TAGCCATTCCTAGGCCTGCATCCGAGAGAATGGTAACTGAATTTTCGAGAATCTGTGGTTTCTTTATTCCCCACCTGCATATATCATGAAGAAAAGAAACAAAAGTTAAAACATTTTTGATCTTAATCTTTAATAGCATAAACCCATGTGAATCAAGTGGTAATCACCTGCAAGCAGCCAAAGCCCAGCTCAAACCAAGTATGCTTGCATATGAATTCGGGTTCTTCACTAGCTTAATGCATACCATTTTCAAGA
+>_47104
+CTAGTTAGATCTTGGGCTGCAAATATTTTATATTAATTGCAGTGCCACAATTCACCCTTGCATCACTCAGTTCTATGATATATTAATCTTTTTATCTTGTTATCATGCTGATATCATTAATTCACCTTTTAATTCAAGAGGAGCAGGAGATCCATCAAGAAGCACAAGGAACAGATGAAATAAGAGAAACAAACTCTAGTATTAAAAGAACAAGAATAATGCAGGTTTTATCAACTGTTGGGAAGAAGCTTCTAATTAATCCAAATTCTCATGCAACTGTAGCAGGCCTTTCATGGGCATTCATAAGCTTCGGGTAAGTGTGCATAAATACCGTCAGCACAAAGAATACAATTGCATAGATAATAATACAATAGTTAAATATAATGCAGGTGCAATGTAAAACTGCCAAAGGTATTTGAAGATTCAGTATCTATAATATCAGAAGGAGGACTTGGAATGGCAATGTTCAGCCTAGGTAACAACATAAATTTA [...]
+>_47472
+CAAGAGAATTACTAACATGAAATTAGAATATGCAACCTTTTTAAGTTAAAAGCATGTTGCTTTTGGTTTTACAAGTTTCTTTTAGCCCACTTACAAATGAGAATCAACTTGTCACATTCACACCATTAGGACACTTAAGTAGTCAATAACCCTTGAAAGCTTTACATCATATTTAATTAAAAACCTTAAATAATTACATCAAAAGAGATGATGCACATGCCACAAACAAACAAGAACCCAAATTAATTAAAAACATAGCTCTCCTTCCCATCAATCCTAATTGTTTTCCTAATATATCACTAAGATTTTTCTTTTTTTCCCCAAAAAAAGAAGAAAAAAAAAAGATTAAATTCCCTTCTTTTGCTCATGTTTGTCTCTCCCTCTTTTAAACTATATCCTTCTTCCTTGCTTTCTTCATTTCTTGAGAATCTACACATTAAAAGATACAACATTGGGAGATGGAGGTAGCACGATCCTTACATCAAGTCAAAA [...]
+>_77170
+AGCTCTGCCTCAAGGAATTGTCCCTTTTGTGTTTGCTAAAGAATATAATGTCCATCCAGATATACTTAGCACAGGGTGAGTTTTTACTGGCTTTACATATCTTATGTCTTACTTTTGCGCTTTTGTAACTTGCACATCATGCATATGAAAGTTGTACATTGTCATATGTAACAAAGCATCAAATGGTCATGTTTGAAGAGCAGAGATCCGTTGCAACTTTGCCATGCCTACTTTGTATCTTATGCATCAAAAGTTGTACATTGCTCCAAACATGCAATGCTTGTCAAAGAGTATAAGTAGCATCTTGATTTGAAAATTTACAAATCTTTAATACATGGTTTACATGTTTGTGTTGCAGGGTTATATTTGGAATGTTGATAGCACTACCAATAACACTGGTGTACTACATAGTGTTGGGGCTTTAAATGAGTGG
+>_88236
+TTTTTGTGCTAAAGAAGTAGACGCAACACTACTACTTTCTTTTCTTTTCATGGGAGAAGAATATACACAACACAACACAACACAAACAACAATTGGAGAGACAACAAAACAAACATTGATTTTACTTTACTTCCTGCCAATGTCTTCTTGTCAAATTACAATTCTCTCCCTCCCAACAATGACCCTCCCTCTCTAATGTCATCTCTTTTTACTCTCCACTAAAGTTAACAAAATGGTCATTTCACATGTTAGACAACCACAACCCAAAACAAACTTAAAACTTCAATTCTTCAAACAAGACAAGAGTCAAAGCCCACTCATTTAAAGCCCCAACACTATGTAGTACACCAGTGTTATTGGTAGTGCTATCAACATTCCAAATATAACCC
+>_93178
+TATAAAATAAAACATAAGATAAACTTAAGCTGTACAAAAATGTTTGCACTTACAAGCATGCAAGAACATAAAATATGTCTTGGAGTGTAGAGATCAATTTGTTATGCCTGATTGCCTCAGACGAGATGAAGTGTGAGAAATTGTGTGCAACAAACTCATCTATAGACTATACACATGCAATTTGTAGGCGTCCAAATTTTTTATGCTAAGCACTCTTATAAGAGCAACTCCAACATAAACAAGCACAAAAAATCTTTCATGTTACATGTCCCAGCATTTATACATGTAACATCCAATACTAGCTAGTTAGATCTTGGGCTGCAAATATTTTTATTAATTGCAGTGCCACAATTCACCCTTGCATCTCTCAGT
+>_123423
+CGAAACGATATTATTTGAGGAGTAATATAAAAATTTTTAATAATATAATTTAAAAAGTTGTATACATTAAAAATTGACATTTGCGCCACTTTCGACTCATTTTACCCGTTCCCTTTAAAGTTGCATATTTTTATTTGAGCCATTTAGACCGTATTACATAAAACACAACCCAAATAGACCCATTTATTCAAAACTGCCACCTCTATTTTCATTTATACCTAGACTAAACATAGCCATTCCTAGGCCTGCATCCGAGAGAATGGTAACTGAATTTTCGAGAATCTGTGGTTTC
+>_157256
+TGCTTCACGGTCCACTCCATTCGGGCCCGGTTGGTTCCCAAAGCTAAACTCATTCCGGCCAAAATCATCATAATCTTTAGTATGGGGCACCCCTCCGAGATCATTACCATACTCCCCACCTCTAAACACATGAATACCACCTTCTGATACAGGTGAAGCACTAGAACTCCACACAAACATATGAAGATCTTTACCACCACCACCACCACCACCACCGCCGCCGCCAAGATCGGGTTTT
diff --git a/test/target.fasta b/test/target.fasta
new file mode 100644
index 0000000..9df1a73
--- /dev/null
+++ b/test/target.fasta
@@ -0,0 +1,62 @@
+>AT2G39730.1|PACid:19639427
+MAAAVSTVGAINRAPLSLNGSGSGAVSAPASTFLGKKVVTVSRFAQSNKKSNGSFKVLAVKEDKQTDGDRWRGLAYDTSDDQQDITRGKGMVDSVFQAPMGTGTHHAVLSSYEYVSQGLRQYNLDNMMDGFYIAPAFMDKLVVHITKNFLTLPNIKVPLILGIWGGKGQGKSFQCELVMAKMGINPIMMSAGELESGNAGEPAKLIRQRYREAADLIKKGKMCCLFINDLDAGAGRMGGTTQYTVNNQMVNATLMNIADNPTNVQLPGMYNKEENARVPIICTGNDFSTLYAPLIRDGRMEKFYWAPTREDRIGVCKGIFRTDKIKDEDIVTLVDQFPGQSIDFFGALRARVYDDEVRKFVESLGVEKIGKRLVNSREGPPVFEQPEMTYEKLMEYGNMLVMEQENVKRVQLAETYLSQAALGDANADAIGRGTFYGKGAQQVNLPVPEGCTDPVAENFDPTARSDDGTCVYNF
+>AT2G24270.4|PACid:19639899
+MAGTGLFAEILDGEVYKYYADGEWKTSSSGKSVAIMNPATRKTQYKVQACTQEEVNAVMELAKSAQKSWAKTPLWKRAELLHKAAAILKDNKAPMAESLVKEIAKPAKDSVTEVVRSGDLISYCAEEGVRILGEGKFLLSDSFPGNDRTKYCLTSKIPLGVVLAIPPFNYPVNLAVSKIAPALIAGNSLVLKPPTQGAVSCLHMVHCFHLAGFPKGLISCITGKGSEIGDFLTMHPAVNCISFTGGDTGISISKKAGMIPLQMELGGKDACIVLDDADLDLVASNIIKGGFSYSGQRCTAVKVVLVMESVADELVEKVKAKVAKLTVGPPEENSDITAVVSESSANFIEGLVMDAKEKGATFCQEYKREGNLIWPLLLDNVRPDMRIAWEEPFGPVVPVLRINSVEEGINHCNASNFGLQGCVFTKDINKAILISDAMETGTVQINSAPARGPDHFPFQGLKDSGIGSQGVTNSINLMTKVKTTVINLPTPS [...]
+>AT2G44920.2|PACid:19640483
+MVILSNVSLFSCCNISQKPSLFSPSSRSSHCPIRCSQSQEGKEVVTSPLRSVVWSLGEEVSKRSLFALVSASLFFVDPALAFKGGGPYGQGVTRGQDLSGKDFSGQTLIRQDFKTSILRQANFKGAKLLGASFFDADLTGADLSEADLRGADFSLANVTKVNLTNANLEGATVTGNTSFKGSNITGADFTDVPLRDDQRVYLCKVADGVNATTGNATRDTLLCN
+>AT2G36370.1|PACid:19640825
+MSEILSYGSVKVRAHRTRLIQESSYFHGLLSGSFSESGLDHISVEWNLESFLNLLMCLYGYDIEITSSSFLPLFESALYFGVEKLLSICKNWLSVLASSNDNALPKVELSDLIQIWSFGLEHAGEFVPDLCVAYLAKNFMLVKSDKYFGNVPYELLMWCVKHPHLTVHSEMDLVDGLLIWLDAGGRLSDLPESSQDNTINLMEQVRFSLLPLWFIAGRSKSHGFSKFADQSIELVTKLMKMPSTCLVDSLTDGPPTDVRVRLTEYSEILDLSGCPQLNEASLLLSILPNSYFANLRWRKSLESFLKNPDDDERHQEQISHRTLPILSFESVKEIDISKCQRLDYKVVIKCFSKSFPSLRKLRAAYLLNIKVSTLLELLLNFRELTEVDLTVDVSPIIPVQASVFYSGQGHCLLSSITRLTLEGRSDICDMELRSISRVCESLCYLNIKGCALLSDACIASVIQRCKKLCSLIVCYTSFSENSILALCATISM [...]
+>AT2G42520.1|PACid:19641881
+MSASWADVADSENTGSGSSNQNSHPSRPAYVPPHLRNRPAASEPVAPLPANDRVGYGGPPSGSRWAPGGSGVGVGGGGGYRADAGRPGSGSGYGGRGGGGWNNRSGGWDRREREVNPFENDDSEPEPAFTEQDNTVINFDAYEDIPIETSGDNVPPPVNTFAEIDLGEALNLNIRRCKYVKPTPVQRHAIPILLEGRDLMACAQTGSGKTAAFCFPIISGIMKDQHVQRPRGSRTVYPLAVILSPTRELASQIHDEAKKFSYQTGVKVVVAYGGTPINQQLRELERGVDILVATPGRLNDLLERARVSMQMIRFLALDEADRMLDMGFEPQIRKIVEQMDMPPRGVRQTLLFSATFPREIQRLAADFLANYIFLAVGRVGSSTDLIVQRVEFVLDSDKRSHLMDLLHAQRENGIQGKQALTLVFVETKRGADSLENWLCINGFPATSIHGDRTQQEREVALKAFKSGRTPILVATDVAARGLDIPHVAHVVN [...]
+>AT2G24860.1|PACid:19642080
+MTICMCCDSHLVAKSILNPWNSPRITKLGFVPVVRRFPATTTVKASAVDSPESSSNFAKRMDQAWIISQQPSPVGCSSCNSKGHVECKWCAGTGFFILGDNMLCQVPSRNTSCVICSGQGSASCSDCKGTGFRAKWLEKPPVPT
+>AT2G47840.1|PACid:19642428
+MASLCLSLHQTLTNPLSAPRCRPLSLSFPGSSTFSIRPSSRRATALTTRASYTPTPATERVISIASYALPFFNSLQYGRFLFAQYPRLGLLFEPIFPILNLYRSVPYASFVAFFGLYLGVVRNTSFSRYVRFNAMQAVTLDVLLAVPVLLTRILDPGQGGGFGMKAMMWGHTGVFVFSFMCFVYGVVSSLLGKTPYIPFVADAAGRQL
+>AT2G26610.1|PACid:19642441
+MCSSSDCILPGPPSRSNLSAADLSPSGLLAFASGSSVSLVDSRSLQLISSVSLPSPISCAFSTVTSVRWAPVPVQRDLFSSDLLIAVGDHLGRIALVDFRLCSVRLWLEQSCDSASARGKSLGCGGVQDLCWVLARPDSYVLAAITGPSSLSLYTDSGQLFWKYDASPEYLSCIRCDPFDSRHFCVLGLKGFLLSLKLLGTTENDVPTKEFQIQTDCSDLQKLEREVVASSSHSTCPASAVFPLYSAKFSFSPHWKHILFATFPRELFVFDLKYEAALYVVALPRGYAKFVDVLPDPSQEFLYCLHLDGRLSIWRRKEGEQVHVLCAIEEFMPTIGNSVPSPSLLTLLISQLDSTLQNIRTIHSDALLDSSELEISFDFNNDAFLLFKTHFISISDDGKIWSWILTFNGDEDSNPQTNENLIESPTNGNQDLHPNISFEITLVGQLQLLSSAVTVLAIPTPSMTATLARGGNFPAVVVPLVALGTEAGTIDV [...]
+>AT4G23940.1|PACid:19646003
+MASIDNVFSLGTRFSIPENPKRSILKHATTSSFSARTQTRWRAPILRRSFTVLCELKTGSSSSGETNNSPAADDFVTRVLKENPSQVEPRYRVGDKLYNLKEREDLSKGTNAATGAFEFIKRKFDSKKKTETDKSEESVYLSDILREYKGKLYVPEQVFGPELSEEEEFEKNVKDLPKMSLEDFRKAMENDKVKLLTSKEVSGVSYTSGYRGFIVDLKEIPGVKSLQRTKWSMKLEVGEAQALLKEYTGPQYEIERHMTSWVGKVADFPNPVASSISSRVMVELGMVTAVIAAAAVVVGGFLASAVFAVTSFAFVTTVYVVWPIAKPFLKLFVGVFLGVLEKSWDYIVDVLADGGIFSRISDFYTFGGVASSLEMLKPILLVVMTMVLLVRFTLSRRPKNFRKWDLWQGIAFSQSKAEARVDGSTGVKFADVAGIDEAVDELQELVKYLKNPDLFDKMGIKPPHGVLLEGPPGCGKTLVAKAIAGEAGVPFY [...]
+>AT4G30870.1|PACid:19646942
+MDDERRVLCPENRGLAAYVLQKKQEYAEKPKGLSENLERTFVKGYRSVCDAKDPINTLKDLSQIKGFGKWMVKLMKGYFDTAVESSEQEDLPDNRAGKKANGKKRYIPQRNSVGYALLITLHRRTTNGKEFMRKQELIDAADANGLSHAPVGPEKGKGKAGLGHSKREWYSGWSCMTTLIQKGLVVKSSNPAKYMLTVEGREVANECILRSGLPDSVDILSVDEMDPTPQAKKTPNQNPTCSFTMREELPYVDPRCRAQSAIPSDILEKFTPFGYSKEQVVAAFREVSDGSGDKDPSTLWLSVMCHLRQAEVYNSCPDSRNSKKDSSGPFKSQIRQVDLEGSRAKKFRSCNDGSTLNPCSSGSSHAVKACSSSLASDGTKGITNIPRLPPLQFGETFEEAYDVILILDDREKFATKGSRSRNIVENICSEFNIKIEVRRLPVGDCIWIARHKYLETEYVLDFIAERKNVDDMRSSIRDNRYRDQKLRLQRSG [...]
+>AT4G35335.1|PACid:19646965
+MEYRKIKDEDDHDVASDIESVKGKSHTVASSNIAMATLGVGSSERINWKRKGVVTCALTILTSSQAILIVWSKRAGKYEYSVTTANFLVGTLKCALSLLALTRIWKNEGVTDDNRLSTTFDEVKVFPIPAALYLFKNLLQYYIFAYVDAPGYQILKNLNIISTGVLYRIILKRKLSEIQWAGFILLCCGCTTAQLNSNSDRVLQTSLPGWTMAIVMALLSGFAGVYTEAIIKKRPSRNINVQNFWLYVFGMAFNAVAIVIQDFDAVANKGFFHGYSFITLLMILNHALSGIAVSMVMKYADNIVKVYSTSVAMLLTAVVSVFLFNFHLSLAFFLGSTVVSVSVYLHSAGKLR
+>AT4G29380.1|PACid:19647069
+MGNKIARTTQVSATEYYLHDLPSSYNLVLKEVLGRGRFLKSIQCKHDEGLVVVKVYFKRGDSIDLREYERRLVKIKDVFLSLEHPHVWPFQFWQETDKAAYLVRQYFYSNLHDRLSTRPFLSLVEKKWLAFQLLLAVKQCHEKDICHGDIKCENVLLTSWNWLYLADFASFKPTYIPYDDPSDFSFFFDTRGQRLCYLAPERFYEHGGETQVAQDAPLKPSMDIFAVGCVIAELFLEGQPLFELAQLLAYRRGQHDPSQHLEKIPDPGIRKMILHMIQLEPEARLSAEDYLQNYVGVVFPNYFSPFLHTLYCCWNPLPSDMRVATCQGIFQEILKKMMENKSGDEIGVDSPVTSNPMNASTVQETFANHKLNSSKDLIRNTVNSKDEIFYSISDALKKNRHPFLKKITMDDLGTLMSLYDSRSDTYGTPFLPVEGNMRCEGMVLIASMLCSCIRNIKLPHLRREAILLLRSCSLYIDDDDRLQRVLPYVVAL [...]
+>AT4G12110.1|PACid:19648830
+MIPYATVEEASIALGRNLTRLETLWFDYSATKSDYYLYCHNILFLFLVFSLVPLPLVFVELARSASGLFNRYKIQPKVNYSLSDMFKCYKDVMTMFILVVGPLQLVSYPSIQMIEIRSGLPLPTITEMLSQLVVYFLIEDYTNYWVHRFFHSKWGYDKIHRVHHEYTAPIGYAAPYAHWAEVLLLGIPTFMGPAIAPGHMITFWLWIALRQMEAIETHSGYDFPWSPTKYIPFYGGAEYHDYHHYVGGQSQSNFASVFTYCDYIYGTDKGYRFQKKLLEQIKESSKKSNKHNGGIKSD
+>AT1G30090.1|PACid:19651723
+MQRVRVSSQRAVVHKLGDSQMTLSPKFRVAASIQSTLFDRSSELELSLRGEPLIPGLPDDVALNCLLRVPVQSHVSSKSVCKRWHLLFGTKETFFAKRKEFGFKDPWLFVVGFSRCTGKIQWKVLDLRNLTWHEIPAMPCRDKVCPHGFRSVSMPREGTMFVCGGMVSDSDCPLDLVLKYDMVKNHWTVTNKMITARSFFASGVIDGMIYAAGGNAADLYELDCAEVLNPLDGNWRPVSNMVAHMASYDTAVLNGKLLVTEGWLWPFFVSPRGQVYDPRTDQWETMSMGLREGWTGTSVVIYDRLFIVSELERMKMKVYDPVTDSWETINGPELPEQICRPFAVNCYGNRVYVVGRNLHLAVGNIWQSENKFAVRWEVVESPERYADITPSNSQILFA
+>AT1G01060.1|PACid:19652702
+MDTNTSGEELLAKARKPYTITKQRERWTEDEHERFLEALRLYGRAWQRIEEHIGTKTAVQIRSHAQKFFTKLEKEAEVKGIPVCQALDIEIPPPRPKRKPNTPYPRKPGNNGTSSSQVSSAKDAKLVSSASSSQLNQAFLDLEKMPFSEKTSTGKENQDENCSGVSTVNKYPLPTKQVSGDIETSKTSTVDNAVQDVPKKNKDKDGNDGTTVHSMQNYPWHFHADIVNGNIAKCPQNHPSGMVSQDFMFHPMREETHGHANLQATTASATTTASHQAFPACHSQDDYRSFLQISSTFSNLIMSTLLQNPAAHAAATFAASVWPYASVGNSGDSSTPMSSSPPSITAIAAATVAAATAWWASHGLLPVCAPAPITCVPFSTVAVPTPAMTEMDTVENTQPFEKQNTALQDQNLASKSPASSSDDSDETGVTKLNADSKTNDDKIEEVVVTAAVHDSNTAQKKNLVDRSSCGSNTPSGSDAETDALDKMEKDKE [...]
+>AT1G09690.1|PACid:19652733
+MPAGHGVRARTRDLFARPFRKKGYIPLSTYLRTFKVGDYVDVKVNGAIHKGMPHKFYHGRTGRIWNVTKRAVGVEVNKQIGNRIIRKRIHVRVEHVQQSRCAEEFKLRKKKNDELKAAAKANGETISTKRQPKGPKPGFMVEGMTLETVTPIPYDVVNDLKGGY
+>AT1G35660.1|PACid:19655505
+MEQSAAPQPPTMPLPYLPSSVEASRDDLQCIGTMVIVPPKPVGFLCGSIPVLADNSFPASFTSALLPSQETVVTAPRYQMLPMETDLNLPPLLTDFPDNVLPLAAVKSRITGDISKEANVITSNLSKKCEALAVSGLVEYGDEIDVIAPVDILKQIFKIPYSKARVSIAVQRVGQTLVLNPGPDVEEGEKLIRRHNNQPKCTKNVDESLFLNFAMHSVRMEACDIPPTHREHTEKRSSSSALPAGENSHDNAPDDRLDKPAGSSKQSKQDGFICEKKKSKKNKAGVEPVRKNSQISEKIKSSGDSEKHSRGGSNEFLRVLFWQFHNFRMLLGSDLLLFSNEKYVAVSLHLWDVSEKVTPLTWLEAWLDNVMASVPELAICYHENGIVQGYELLKTDDIFILKGISEDGTPAFHPHVVQQNGLAVLRFLQSNCKEDPGAYWLYKSAGEDELQLFDLSIISKNHSSSVHNDSASSPSLIHSGRSDSMFSLGNLL [...]
+>AT1G30300.1|PACid:19656499
+MMESSIPSENGSVTSDRDRSALIFLGTGCSSAVPNAMCLIQRSDNPCYVCSQSLSIPPEKNPNYRGNTSLLIDYCQSDGKHKYIQIDVGKTFREQVLRWFTLHKIPQVDSIILTHEHADAVLGLDDIRSVQPFSPTNDIDPTPIFVSQYAMESLAVKFPYLVQKKLKEGQEVRRVAQLDWRVIEEDCEKPFVASGLSFTPLPVMHGEDYVCLGFLFGEKSRVAYISDVSRFPPNTEYAISKSGGGQLDLLILDTLYKTGSHNTHLCFPQTLDTIKRLSPKRALLIGMTHEFDHHKDNEFLEEWSKREGISVKLAHDGLRVPIDL
+>AT1G76710.1|PACid:19657324
+MQFSCDPDQEGDELPQYEHIYQNDFSYRKHKKQKEEDISICECKFDFGDPDSACGERCLNVITNTECTPGYCPCGVYCKNQKFQKCEYAKTKLIKCEGRGWGLVALEEIKAGQFIMEYCGEVISWKEAKKRAQTYETHGVKDAYIISLNASEAIDATKKGSLARFINHSCRPNCETRKWNVLGEVRVGIFAKESISPRTELAYDYNFEWYGGAKVRCLCGAVACSGFLGAKSRGFQEDTYVWEDGDDRYSVDKIPVYDSAEDELTSEPSKNGESNTNEEKEKDISTENHLESTALNIQQQSDSTPTPMEEDVVTETVKTETSEDMKLLSQNSQEDSSPKTAIVSRVHGNISKIKSESLPKKRGRPFSGGKTKNVAQKHVDIANVVQLLATKEAQDEVLKYEEVKKEAAVRLSSLYDEIRPAIEEHERDSQDSVATSVAEKWIQASCNKLKAEFDLYSSVIKNIASTPIKPQDTKTKVAEAGNEDHIKLLEAK
+>AT3G44735.1|PACid:19659026
+MKQSLCLAVLFLILSTSSSAIRRGKEDQEINPLVSATSVEEDSVNKLMGMEYCGEGDEECLRRRMMTESHLDYIYTQHHKH
+>AT3G44735.2|PACid:19659027
+MKQSLCLAVLFLILSTSSSAIRRGKEDQEINPLVSATSVEEEHDSVNKLMGMEYCGEGDEECLRRRMMTESHLDYIYTQHHK
+>AT3G61770.1|PACid:19659581
+MESLTSNPSFACFSKTISHYPSRFSFVKLTSIQKLEASNNTLSLFCCKSSSNPKPDCNRRIKLNPLCVLRPIIRTIKGLVSSQSRQWMSRFRAYRDDTAAFSEDFAGDLKHNGGLGIALLSVTASAKIKISPFVATLSANPTFVSAVVAWFFAQSSKMVINFFIERKWDFRLLYASGGMPSSHSALCMALTTSVALCHGVADSLFPVCLGFSLIVMYDAIGVRRHAGMQAEVLNLIIRDLFEGHPISQRKLKELLGHTPSQVLAGALVGIVIACFCCQGYLVSV
+>AT3G58460.2|PACid:19660450
+MTYGISSSQKIIQSLRSREESFTVGEKRELFAGVQTRVGQWWNAIPFLTSSVVVVCGVIYLICLLTGYDTFYEVCFLPSAIISRFQVYRFYTAIIFHGSLLHVLFNMMALVPMGSELERIMGSVRLLYLTVLLATTNAVLHLLIASLAGYNPFYQYDHLMNECAIGFSGILFSMIVIETSLSGVTSRSVFGLFNVPAKLYPWILLIVFQLLMTNVSLLGHLCGILSGFSYSYGLFNFLMPGSSFFTTIESASWMSSFIRRPKFIMCTGGNPSSYIPTYSAQNTTSSGFSTGNAWRSLSSWLPQREASNQSSEDSRFPGRGRTLSTARDPTAPAGETDPNLHARLLEDSSSPDRLSDATVNTVADSRQAPIANAAVLPQSQGRVAASEEQIQKLVAMGFDRTQVEVALAAADDDLTVAVEILMSQQA
+>AT3G14750.1|PACid:19664721
+MSGRNRGPPPPSMKGGSYSGLQAPVHQPPFVRGLGGGPVPPPPHPSMIDDSREPQFRVDARGLPPQFSILEDRLAAQNQDVQGLLADNQRLAATHVALKQELEVAQHELQRIMHYIDSLRAEEEIMMREMYDKSMRSEMELREVDAMRAEIQKIRADIKEFTSGRQELTSQVHLMTQDLARLTADLQQIPTLTAEIENTKQELQRARAAIDYEKKGYAENYEHGKIMEHKLVAMARELEKLRAEIANSETSAYANGPVGNPGGVAYGGGYGNPEAGYPVNPYQPNYTMNPAQTGVVGYYPPPYGPQAAWAGGYDPQQQQQQQPPPQGQGHR
+>AT3G12040.1|PACid:19664981
+MKTPARRSKRVNQEESETNVTTRVVLRTRKTNCSKTRAARVRPDYPLTRTTSESEMKLMPPEFFQIDALDLAPRLLGKFMRRDNVVLRITEVEAYRPNDSACHGRFGVTPRTAPVFGPGGHAYVYLCYGLHMMLNIVADKEGVGAAVLIRSCSPVSGMETIQERRGLKTDKPVLLNGPGKVGQALGLSTEWSHHPLYSPGGLELLDGGEDVEKVMVGPRVGIDYALPEHVNALWRFAVADTPWISAPKNTLKPL
+>AT5G48030.1|PACid:19669587
+MVPSNGAKVLRLLSRRCLSSSLIQDLANQKLRGVCIGSYRRLNTSVGNHANVIGDYASKSGHDRKWINFGGFNTNFGSTRSFHGTGSSFMSAKDYYSVLGVSKNAQEGEIKKAYYGLAKKLHPDMNKDDPEAETKFQEVSKAYEILKDKEKRDLYDQVGHEAFEQNASGGFPNDQGFGGGGGGGFNPFDIFGSFNGDIFNMYRQDIGGQDVKVLLDLSFMEAVQGCSKTVTFQTEMACNTCGGQGVPPGTKREKCKACNGSGMTSLRRGMLSIQTTCQKCGGAGQTFSSICKSCRGARVVRGQKSVKVTIDPGVDNSDTLKVARVGGADPEGDQPGDLYVTLKVREDPVFRREGSDIHVDAVLSVTQAILGGTIQVPTLTGDVVVKVRPGTQPGHKVVLRNKGIRARKSTKFGDQYVHFNVSIPANITQRQRELLEEFSKAEQGEYEQRTATGSSQ
+>AT5G13650.2|PACid:19669692
+MELSLSTSSASPAVLRRQASPLLHKQQVLGVSFASALKPGGGALRFPSRRPLPRPITCSASPSTAEPASVEVKKKQLDRRDNVRNIAIVAHVDHGKTTLVDSMLRQAKVFRDNQVMQERIMDSNDLERERGITILSKNTSITYKNTKVNIIDTPGHSDFGGEVERVLNMVDGVLLVVDSVEGPMPQTRFVLKKALEFGHAVVVVVNKIDRPSARPEFVVNSTFELFIELNATDEQCDFQAIYASGIKGKAGLSPDDLAEDLGPLFEAIIRCVPGPNIEKDGALQMLATNIEYDEHKGRIAIGRLHAGVLRKGMDVRVCTSEDSCRFARVSELFVYEKFYRVPTDSVEAGDICAVCGIDNIQIGETIADKVHGKPLPTIKVEEPTVKMSFSVNTSPFSGREGKYVTSRNLRDRLNRELERNLAMKVEDGETADTFIVSGRGTLHITILIENMRREGYEFMVGPPKVINKRVNDKLLEPYEIATVEVPEAHMGP [...]
+>AT5G01030.1|PACid:19670273
+MAEKEGKASGTDNDNRVKTKRSSHSRRKECVNKSLEHNDELVKYMSKLPGYLQRIERGEESVHQSNVLNVGVLDWESLQRWKHGRAKGGEISGRSERKVSTIATTSTSGVVVPNDSANRCKIDDQVHTCSNLGKVKASRDLQYSLEPQLASRDSLNKQEIATCSYKSSGRDHKGVEPRKSRRTHSNRESTTGLSSEMGNSAGSLFRDKETQKRAGEIHAKEARERAKECVEKLDGDEKIIGDSEAGLTSEKQEFSNIFLLRSRKQSRSTLSGEPQISREVNRSLDFSDGINSSFGLRSQIPSSCPLSFDLERDSEDMMLPLGTDLSGKRGGKRHSKTTSRIFDREFPEDESRKERHPSPSKRFSFSFGRLSRNFSLKDISAGQPLSSSEDTIMSGSMRFDGSVCPSQSSNPENQNTHCRSRVSPLRRFLDPLLKPKASESVLPSKARSSSSNPKPITNSNVPLQDEKKQDASRTLAIFQLTIRNGIPLFQFV [...]
+>AT5G16780.1|PACid:19670576
+MEVEKSKSRHEIREERADYEGSPVREHRDGRRKEKDHRSKDKEKDYDREKIRDKDHRRDKEKERDRKRSRDEDTEKEISRGRDKEREKDKSRDRVKEKDKEKERNRHKDRENERDNEKEKDKDRARVKERASKKSHEDDDETHKAAERYEHSDNRGLNEGGDNVDAASSGKEASALDLQNRILKMREERKKKAEDASDALSWVARSRKIEEKRNAEKQRAQQLSRIFEEQDNLNQGENEDGEDGEHLSGVKVLHGLEKVVEGGAVILTLKDQSVLTDGDVNNEIDMLENVEIGEQKRRNEAYEAAKKKKGIYDDKFNDDPGAEKKMLPQYDEAATDEGIFLDAKGRFTGEAEKKLEELRKRIQGQTTHTFEDLNSSAKVSSDYFSQEEMLKFKKPKKKKQLRKKDKLDLSMLEAEAVASGLGAEDLGSRKDGRRQAMKEEKERIEYEKRSNAYQEAIAKADEASRLLRREQVQPFKRDEDESMVLADDAEDL [...]
+>AT5G13640.1|PACid:19671151
+MPLIHRKKPTEKPSTPPSEEVVHDEDSQKKPHESSKSHHKKSNGGGKWSCIDSCCWFIGCVCVTWWFLLFLYNAMPASFPQYVTERITGPLPDPPGVKLKKEGLKAKHPVVFIPGIVTGGLELWEGKQCADGLFRKRLWGGTFGEVYKRPLCWVEHMSLDNETGLDPAGIRVRAVSGLVAADYFAPGYFVWAVLIANLAHIGYEEKNMYMAAYDWRLSFQNTEVRDQTLSRMKSNIELMVSTNGGKKAVIVPHSMGVLYFLHFMKWVEAPAPLGGGGGPDWCAKYIKAVMNIGGPFLGVPKAVAGLFSAEAKDVAVARAIAPGFLDTDIFRLQTLQHVMRMTRTWDSTMSMLPKGGDTIWGGLDWSPEKGHTCCGKKQKNNETCGEAGENGVSKKSPVNYGRMISFGKEVAEAAPSEINNIDFRGAVKGQSIPNHTCRDVWTEYHDMGIAGIKAIAEYKVYTAGEAIDLLHYVAPKMMARGAAHFSYGIADD [...]
+>AT5G27660.1|PACid:19673236
+MMNFLRRAVSSSKRSELIRIISVATATSGILYASTNPDARTRVSLAIPESVRESLSLLPWQISPGLIHRPEQSLFGNFVFSSRVSPKSEAPINDEKGVSVEASDSSSKPSNGYLGRDTIANAAARIGPAVVNLSVPQGFHGISMGKSIGSGTIIDADGTILTCAHVVVDFQNIRHSSKGRVDVTLQDGRTFEGVVVNADLQSDIALVKIKSKTPLPTAKLGFSSKLRPGDWVIAVGCPLSLQNTVTAGIVSCVDRKSSDLGLGGKHREYLQTDCSINAGNSGGPLVNLDGEVIGVNIMKVLAADGLGFSVPIDSVSKIIEHFKKSGRVIRPWIGLKMVELNNLIVAQLKERDPMFPDVERGVLVPTVIPGSPADRAGFKPGDVVVRFDGKPVIEIMDDRVGKRMQTCETRRRLRQFCNGNFTIFNQMC
diff --git a/test/target2.fasta b/test/target2.fasta
new file mode 100644
index 0000000..31ae5f4
--- /dev/null
+++ b/test/target2.fasta
@@ -0,0 +1,76 @@
+>scaffold41641
+ATCCCGCGATTCTTAGTACTGCGTAAGTATTTGTACATATATACTGAATACTGATACAAACATGAAAATTTTTGTGTTGACGTATTTTCTTTGAATTTGATGATTATTTCAGGGTTATTTTCGGGATGTTGATAGCGTTACCGATAACTCTAGTCTACTACATCGTTCTTGGATTGTGAAGATCTAAAAAAGAAAAATTTATGAAGTGTCCAAGACAATTGGAATAAGGAATTTTTGACTTGATGCAAGGATCGTGCTACCTCCATCTCCCAA
+>scaffold41642
+ACCCTCCTCCATATCGCCATCGTGCAGGCCGCACTGCCTCAAGGGATTGTTCCTTTTGTTTTTGCAAAAGAGTACAATGTTCATCCCGCGATTCTCAGTACTGCGGTTATTTTTGGGATGTTGATAGCGTTACCGATAACTCTAGTCTACTACATCGTTCTTGGATTGTGAAGATCTAAAAAAGAAAAATTTATGAAGTGTCCAAGACAATTGGAATAAGGAATTTTTGACTTGATGCAAGGATCGTGCTACCTCCATCTCCCAA
+>scaffold78526
+TACATCCCTTTAAGCAAAGGGATTCCCATAACCAATGTGTTTGGTAAAGTTGATAGTGAAAAAAGTGTTATGGACCATTCTATACTGCCCTTTTTGGTGAGGTTGCTCCACGCGGCTAGGACCGCAAGGACAACGAGTTTCTGGAGCGTGTCGGCCGCGATGAACCGAAGGTTCATTTTGTAAGGGTTGTTGGAGGAAATGAAGTGAAAGGAGAGGAGGGGGACGNAAAGAGAGCTACAAATCTGTTGATCCCTGAGCATTGATCAGGGGTGAAGATCTTCCACCATTTCACCGAGCCGTAGGCTAGGATCATGGCGACATAGAGTGGAACGACGGCTGTGAAGACATGGTAAAGGTCCGAAGCCGTGATCATATTGAGAAATTGGA
+>scaffold87836
+GTGGTGGTGGTGGTGGTGGTAAAGATCTTCATATGTTTGTGTGGAGTTCTAGTGCTTCACCTGTATCAGAAGGTGGGATTCATGTGTTTAGAGGTGGGGAGTACGGTAATGATCATGGAGGGGTGCCCCATACTAAAGATTATGATGATTTTGGCCGGAACGAGTTTAGTTTTGGTAATCAACCAGGCCCAAATGGAGTGGACCGTGAAGCGTCGGTTCTATCTAAACTTGGGTCTAGCTCAACTGCAGAGCTCCACCCAAAGGCTAGCCCACATGGTGAAACCAAGCCCCCTTCCATGCCTCCTGCTAGTGTCATGACCAGACTCATTTTGATCATGGTGTGGCGCAAACTTATCAGAAATCCCAACACGTATTCGAGCTTAATTGGCCTCACATGGTCCTTGGTCTCCTTCAAGTGGAACGTTGAAATGCCTGCCATCGTTGCAAAATCCATAGCCATTTTATCCGATGCTGGTCTTGGCATGGCCATGT [...]
+>C2336898
+GCAGGGTACCCTGCACCCCCATGCCCATTCACCTTCACATTACTCCCACTTTCCTCATCATACCCTGCAAAATTCGACCCACGAGGACTCGCACCACTTTTCCCATTAACCATTGAGTAAAAATCAGTGTGATTGAAGCTCGAGCCCCTTGGTGTAGGGTTTCTTGACGACTGTAAAGAGTAAATTTCAGCATTTGTTAAGTTTGATGGCCGAGGGGTTATTGACACACCGGAGTTCATTCCGCCGTGTGACCGCCTTGAGAAGATCTCCGAATGAGAGCTTGCAGACTTCCGAACAGTCACATGAAGCTTCCCATCGTCCCCAACTTCAGCCTCTGTTTGCAATGGCTCTTTACCATCTAACGACAACACATCAGAATCAAC
+>contig-112109 405 0
+AAGTTATCGTTTATGTATTACGGGGACCGTAATGATTGCTGAATAGACTCCATTGCGGTCCCGTGGGGTGGGTACATCTTTTTAATGGAAAAAATCAACATCAATCTCATCAATTAGTTTGTTAAAAGTCGAGGTTCAATTTTTTATTTTGAAAATAAGAGAAATTGATTGAATAATTATAATTTTTCTAGTCAAAACAAGTTTAAAGTTTCAATACTTTGTGGAGACTAATGGCAAATAGAAATAGATTTTGTATCCGATCATAACATGAGTTTCACGAGTTAGGTTATAACTTGGTTGACCTCTGTTTTGTTTTGTTTTTTAATCTTTTGATATATTTATATTTTAAATTTACAATAAATAGTATAAATGTATAGAAAATAAGCACATCTAAATTATAAAATC
+>contig-233717 356 0
+TCCTGATGTTCTGAGCACTGCGGTGATATTTGGAATGATAATATCTTTGCCTATCACAATGCTGTATTACATCTTACTAGGTCTCTAAATGAAGCAAACGGAATTGGGAAACCAGAAAGAAGACTTGATCCATGGATATTCATATGCATATATAATTATACATATCAAGATAAAGGCATACGGTAACTACATATGTCAAACAAAGTTATTTAAGTAAGACTTCAATTTCACATTGATGTTTTGATAAGTTGATAATATGAGTGTTTGAGTAGCACAGTGGTTAGTAGTATTAAGTGTAGTTGTTCTACATTATACTTTCATAATATATATCTCTAGCATAGCTACGTCTTATTCTG
+>contig-96389 570 0
+AGGGAGTTTGATGGTGCAGATTGTGGTGCTTCAGTGCATCATTTGGTACACTTTGATGCTGTTTCTGTTTGAATACAGAGGTGCAAGATTGTTGATTGCTGAACAGTTTCCAGACACTGCTGGTTCGATTATCTCATTTAGGGTTGATTCTGATGTGTTGTCGTTAGATGGTAAAGAGCCATTGCAAACAGAGGCTGAAGTTGGGGACGATGGGAAGCTTCATGTGACTGTTCGGAAGTCTGCAAGTTCTCATTCGGAGATCTTCTCACGGCGGTCACACGGCGGAATGAACTCCGGTGTGTCATTAACCCCTCGTCCATCAAACTTAACAAATGCTGAAATTTACTCTTTACAGTCGTCAAGAAACCCTACACCAAGGGGCTCGAGCTTCAATCACACTGATTTTTACTCAATGGTTAATGGGAAAAGTGGTGCGAGTCCTCGTGGGTCGAATTTTGCAGGGTATGATGAGGAAAGTGGGAGTAATGTGAA [...]
+>contig-497662 553 0
+CAGAGACCCTTTTCTTCCCAAAACATCTCTCTCATGGACCATCAAACCTTCATTTAAACCCCAATTCCCACCTGGGTTTTCTTCTTCACCATCTTCCTTCACCTGGGTCCTCCTCCAATTTCTCAATATGATCACTGGTTCGGACCTTTACCATGTCTTCACAGCCGTGGTTCCACTCTATGTCGCCATGATCCTAGCCTACGGCTCGGTGAAATGGTGGAAGATCTTCACCCCTGATCAATGCTCAGGGATCAACAGATTTGTAGCTCTCTTTGCTGTCCCATTGCTCTCCTTTCATTTCATCTCAACAAACAACCCTTACACCATGAACACAAGGTTCATCGCCGCTGACACCCTTCAAAAACTCATCGTTCTTGTGGGTCTAGCCTTCTGGTCAAGATTAAGCGCTAGAGGCTCATTAGAATGGTCAATCACTCTGTTTTCTTTATCCACACTTCCCAACACACTTGTGATGGGAATCCCTCTTTTGAA [...]
+>contig-428810 267 0
+CTGCATTATTGTTGTTCTTTTAATACTAGAGTTTGTTTCTCTCATTGCATCTGTTCCTTCTGCTTCTTGATGGATCTCCTGCTCCTCTTGAATTAAAAGGTGAATTAATGATATCAGCATGATAACAAGATAAAAAGATTAATATATTATAGAACTGAGTGATGCAAGGGTGAATTGTGGCACTGCAATTAATAAAAATATTTGCAGCCCAAGATCTAACTAGCTAGTCTGGATGTTACATGTATAAATGCTGGGAAATGTAATACG
+>contig-219563 1305 0
+CGATATTTATGTTGTTATTTCCATAACTCTCCAAGTACTCTTCTCCGTTACCTTTTACGGGTGATATTGGAACTTTTAAATCTTTTGTAGTTGGTGGGTCGAGTGCTCCGTATTCACGAGCCGCGAATACATCAGAAACCGGGGAAGTGCTCGAGCTCCAAACAAACATGTGTAGATCGTTTGAACCTTCATCTGCTTTTTGCCCGTTAGCCTTTTTGGTTAACGGTATACTTGGTTTTCCAGTTGACGGAGAAAACATCCCAGGATTTGGTGCTGGGTAGTTATTACCACCCAGCACACCATGATGAAAACGAGACTTGTTAGAATTATTCCCACTGCCTTCCTCTTCATAGTTGGAATGACGGGGTGTGGGCCCTCTGGATACAGGAGGCCCATACACATCCGAAGCCCCGAAGTTCGAATTCCTTCCACCAACACCACCCGCCATAGAATAAAAATCATTATGGTTAAAACTCGACCCTCTTGGGGTAG [...]
+>contig-7932 510 0
+ATCTTTGGTTGCAACGCCATGAACAAACCGAGGCTAAACATGGCCATGCCAAGACCAGCATCGGATAAAATGGCTATGGATTTTGCAACAATGGCAGGCATTTCAACGTTCCACTTGAAGGAGACCAAGGACCATGTGAGGCCAATTAAGCTCGAATACGTGTTGGGATTTCTGATAAGTTTGCGCCACACCATGATCAAAATGAGTCTGGTCATGACACTAGCAGGAGGCATGGAAGTGGGCTTGGTTTCCCCATGTGGGCTAGCCTTTGGGTGGAGCTCTGCAGTTGAGCTAGACCCGAGTTTGGATAGAACCGATGCTTCACGGTCCACTCCATTCGGGCCTGGTTGGTTCCCAAAACTAAATTCGTTGCGGCCAAAATCATCATAATCTTTAGTATGGGGCACCCCTCCATGATCATTACCGTACTCCCCACCTCTAAACACATGAATCCCACCTTCTGATACAGGTGAAGCACTAGAACTCCACACA [...]
+>contig-245882 614 0
+ATCATCACCATACATGGATTTTAAAAGTGGGATTCCCATAACAAGTGTGTTTGGGAGTGTAGAAAGTGAAAACATGGTAATAGCCCAATCCAAGTCTCCTTTTTTCGACAAATTTGTCCATAAAAACAAGACAATTAGTGTCAAAGTCTTTGAAACACCATCGGCTGCTATGAACTTGAGGTTCATTTTGTAAGGATTGATTCTTGAAATGAACTCAAAGGACAACAAAGGGACGGCGAAAATGGCTACAAATCTGTTGATGCCTGCACACTGTTGTGGTGAGAAAATGTTCCATTTTACAGAGGCGTAAGCTACGAACATGGTAATGTAAAGAGGAACAACTGCAGTAAGTACACTATACAAGTCACCAAAGTTGATCATTTCTGCTGGCTTGTTAAGTATTTGGAGGGAAAATGATATGTTTTGAAGCAGTTAAGTGGGAGTTAGTTAAGGCTGTAGAAGCATGGATTTTGATGTCTTGATAGGGCTTAA [...]
+>contig-236838 594 0
+TGCACTGAAGCACCACAATCTGCACCATCAAACTCCCTGAAAAGTCACCATACATCCCTTTCAAAAGAGGGATTCCCATCACAAGTGTGTTGGGGAGTGTTGACAACGAAAACAGAGTAATCGACCATTCTAACGAGCCTCTGGCGCTCAATCTTGACCAGAAGGCTAGACCCACAAGAACGATGAGTTTTTGAAGGGTGTCAGCAGCGATGAACCTTGAGTTCATGGTGTAAGGATTGTTGGTTGAGATGAAGTGAAAGGAGAGCAATGGGACAGCAAAGAGAGCTACAAATCTGTTGATCCCAGAGCATTGATCAGGGGTGAAGATCTTCCACCATTTCACCGAGCCGTAGGCTAGGATCATGGCGACATAGAGTGGAACGACGGCTGTGAAGACATGGTAAAGGTCCGAAGCCGTGATCATATTGAGAAATTGGAAGATGACCCAGGTGAAAGAAGATGGTGAAGAAGAAAACCCAGGTGGGAATTGGG [...]
+>contig-430590 396 0
+GACAGTATTGATATCGACTTTGTTATGATTGCAGGCATATTAACATTCCATCTGAAACATACTAGAGACCAGACGAGACCGATTAAACTTGAATAAGTATTCGGGTTTCTAATCAGTTTCCTCCACACCATTATCAGTATCAGTCTGGTCATCACGCTCGTTGGTGGCATTGCATTGTTGCCAACAATGCCTGCGCCAGTCCCTTCAATGTTTATGTTGTTATTTCCGTAACCCTCCAAGTACTCTGCTCCGTTACCTTTTACGGGTGATATGGGAACTTTTAAATCTTTTGTAGTTGGCTGGTCGAGTGCTCCGTATTCACGAGCCGCGAATACATCAGAGACCGGGGAATTACTCGAGCTCCAAACAAACATGTGTAGATCATTTGAACCTTCG
+>contig-58203 379 0
+GTTCAAATGATCTACACATGTTTGTTTGGAGCTCGAGTAATTCCCCGGTCTCTGATGTATTCGCGGCTCGTGAATACGGAGCACTCGACCAGCAAACTACAAAAGATTTAAAAGGTAACGGAGAAGAGTACTTGGAGAGTTATGGAAATAACAACATAAACATCGAAGGGGCTGGCGCAGGCATTGTTGGCAACAATGTAATGCCTCCAACGAGTGTGATGACTAGACTGATACTGATAATGGTGTGGAGGAAACTGATTAGAAACCCGAATACTTATTCGAGTTTAATCGGTCTCGTCTGGTCTCTAGTATGTTTCAGATGGAATGTTAATATGCCTGCAATCATAACAAAGTCGATATCAATACTGTCTGATGCAGG
+>contig-104976 401 0
+TTTACTTTACTTCCTGCCAATGTCTTCTTGTCAAATTACAATTCTCTCCCTCCCAACAATGACCCACCCTCTCTAATGTCATCTCTTTTTACTCTCCACTAAAATTAACAAAATGGTCATTTCACATGTTGTTAGACATCCCCAACCCAAAACAAACTTAAACTTCACTTCTTCAAACAAGACAAGAGTCAAAGCCCACTCATTTAAAGCCCCAACACTATGTAGTACACCAGAGTAATTGGTAGAGCTATCAACATTCCAAATATAACCCCTGTGCTAAGTATATCTGGATGGACATTATACTCTTTAGCAAACACAAAAGGGACAATTCCTTGAGGCAGAGCTGCCTGAACTATGGCAACATGTAGCAATGTTCCATGTAGTCCAACAGCCAGTGATGC
+>contig-498074 549 0
+AAAAGATTCCGGCATTCGCCGGAGCAGGGTACCCTGCACCCCCATGACCCCCATTCACCTTCACATTACTCCCACTTTCCTCATCATACCCTGCAAAATTCGACCCACGAGGACTAGCACCACTTTTCCCATTAACCATTGAGTAAAAATCAGTGTGATTGAAGCTCGAGCCCCTTGGTGTTGGGTTTCTTGACGACTGTAAAGAGTAAATTTCAGCATTTGTTAAGTTTGATGGCCGAGGGGTTATTGACACACCGGAGTTCATTCCGCCGTGTGACCGCCTTGAGAAGATCTCCGAATGAGAGCTTGCAGACTTCCGAACAGTCACATGAAGCTTCCCATCGTCCCCAACTTCAGCCTCTGTCTGCAATGGCTCCTTACCATCTAACGACAACACATCAGAATCAACTCTAAAAGAAATAATCGAACCAGCAGTGTCTGGAAACTGCTCAGCGATCAACAATCTTGCACCTCTGTATTCAAACAGAAACA [...]
+>contig-474480 309 0
+GTTATTTTTGGGATGTTGATAGCGTTACCGATAACTCTAGTCTACTACATCGTTCTTGGATTGTGAAGATCTAAAAAATGAAAATTTTATGAAGTGTCCAAGACAATTGGAATAAGGAATTTTTGACTTGATGTAAGGATCGTGCTACCTCCATCTCCCAATGTTGTATCTTTTAATGTGTAGATTCTCAAGAAATGAGGAAAGCAAGAAAGAAGATATAGTTTAAAAGAGGGGGAGACAAACATGAGCAAAAGAAGGAAATTTAATCTTTTTTTTTCTTCTTTTTTGGGGGAAAAAAAGAAAAATCTT
+>contig-357246 874 0
+GATATTCGAAAACACCACCCTAAAAAAACACTTCATCTTCTTCACATCCTGAAAACCTGCAAATCTCCATTGTCATCTCACCATGATCACTCTTTCAGACTTCTACCACGTCATGACCGCCGTAGTCCCCCTCTACGTCGCCATGATCCTCGCCTACGGCTCCGTCAAATGGTGGAAGATCTTCACCCCCGATCAATGCTCCGGCATCAACCGCTTTGTCGCCCTCTTCGCCGTCCCCCTCCTCTCCTTCCACTTCATCTCCACCAACAATCCCTACACCATGAACCTCAAATTCATCACCGCTGACACCCTCCAGAAGCTCGTTGTCCTCGCTGCCCTTGCCGCCCTCGCCAACCTCACCAAACTCGTTAGCCTCGAGTGGTCGATCACACTCTTTTCACTTTCGACTTTGCCCAATACCCTTGTTATGGGCATCCCTTTGCTTAAAGGGATGTATGGGGGTGACTCTGGGAGTTTAATGGTTCAGATTGT [...]
+>contig-358147 1348 0
+GGAAAATACTACAAAATTACCATCCCCACCACCACCTTTGGAGCCTTAAAACACTTAACTCTTAAAACCTTCAAAGCACCAACATTCGCTAGTTCCCCTTGACGTACCATGATCAGTGTTTCAGATTTGTACCATGTGATGACGGCGGTCGTCCCGCTGTACGTGGCGATGATCTTAGCGTACGGGTCGGTCAAATGGTGGAAGATCTTCAGCCCTGATCAATGCTCAGGCATCAACCGGTTTGTCGCCCTTTTCGCCGTCCCCCTCCTCTCCTTTCACTTCATTTCCTCCAACAACCCTTACAAAATGAACCTTCGCTTCATCGCCGCCGACACCCTCCAAAAACTAGTCGTCCTCGCCGTCCTAGCCGCCTGGAGCAACCTTACTAAAAGGGGTAGTTTAGAATGGTCCATAACATTTTTTTCACTTTCAACTTTACCCAACACGTTGGTTATGGGGATCCCTTTGCTTAAAGGGATGTATGGTGGTGAG [...]
+>contig-492356 636 0
+GGTGATATTCGGCATGTTGATTGCGTTACCAATAACTCTAGTCTACTACATTCTTCTTGGTTTGTAAGGTGTATGATTAAGGTATAAAACATGGGAAAGAGGAAGAAATCTTTAAGACAGAATGTCGTTTGTTTGATTTGTTTAAGTCAAAAATTCGGTCATTGGTTTATACGTTAAGCGCGAACTGAATCAACACGAAGATGAAAATTTTGACTTGATTTGAAGCACTCGTCCTTGATCTAAGGTTCTTGCTACCCCTGAAACTACAACCCACAAATTAGATTCTATAGTTTTAAGAAAGCAATCAAGAGATTTATGGCTTTTGAAGAGGGAAACAAGGCAATATGACCATTTGATGGACTACAAGAAGAACAATTTAGTTGTATATTGGTTTGTTAGTGTTGTAACATATATGGTAATATTGTTTGTTTTTCTTTTAACATAATAATTACTTTAAAATTATTATGAGTAAAAGACGTGTATCGTAATATG [...]
+>contig-525045 392 0
+TACATAAACTACATTAACTACGACGAGTTTAACATTTTGTTGAATAATATTAACTTGACTAAACAATGTGAACCAAATTTCATTTAAGCAACCACCAACCATCCTTTTCTATTTCTAGCAACTAACAACCTGAAAACTACTTAACGAACTACTTTCTAGCATCCGTGTTGACGTATTTTCCGCCAAAGTATGACGATGTCAACCTTGTCTATTTTAACCCCAATAGAATGTAGTAGACAAGCGTGATAGGTAACGCTATTAGCATTCCAAATATAACACCAGTACTGAGGATATCAGGATGAACGCTGTACTCCTTAGCAAAGACAAAGGGGACAATACCCTGTGGTAAAGCAGCCTGGACAATTGCAATGCGTAAAAGGACGCCACGTAGG
+>_9982
+TACTTTTGTTCTGGTCTATCAATTTTCCTGCATAATTTCATCCCACAAACACACACAACAAAAAAAACACACAAAGGCAAAAGATATTAAAAAAACCCCATCCCTTTTTGAGTTTTTGTACACTGCAAAAAACACTTGATGATCTTCTTCTTCTTCTTCTTCACAATCACATACCTTTGAAAACCTGCAAATCTCCATTGTCATCTCACCATGATCACTCTCTCAGATTTCTACCACGTCATGACCGCCGTGGTCCCCCTGTACGTGGCCATGATCCTGGCCTACGCCTCCGTGAAATGGTGGAAGATCTTCTCCCCCGATCAATGCTCCGGCATCAACCGCTTCGTCGCCCTCTTCGCCGTCCCCCTCCTCTCCTTCCACTTCATCTCCACCAACAATCCTTACCAAATGAACCTCAGATTCATCGCCGCTGACACCCTACAGAAGCTGGTGGTTCTCGTCGCCCTCGCCCTCGTCGCCAACCTCACCAAA [...]
+>_25713
+TGTTCTTTTAATACTAGAGTTTGTTTCTCTCATTGCATCTGTTCCTTCTGCTTCTTGATGGATCTCCTGCTCCTCTTGAATTAAAAGGTGAATTAATGATATCAGCATGATAACAAGATAAAAAGATTAATATATTATAGAACTGAGTGATGCAAGGGTGAATTGTGGCACTGCAATTAATAAAAATATTTGCAGCCCAAGATCTAACTAGCTAGTCTGGATGTTACATGTATAAATGCTGGGAAATGTAATACGATAAGAATTTTTGTGCTTATTTATGTTGGAGTTGCTCTTATAAGAGTGCTTAGCATAAAAAATTTGGACGCTTACAAATTGCATGTGTATAGTCTATAGATGAGTTTGTCGCACACAATTTCTCACACTTCATCTCTTGTGAGGCAACGAGGCATAACAAATTGATCTCTACACTGCAAGACATATTTTATGTTCTTGCATGCTTGTAAGTGCAAACATTTTTGTACAGCTTATGTT [...]
+>_36964
+CACACAAAAAAAAACACAAAGACAAAAGATATTCGAAAACACCACCCTAAAAAAACACTTCATCTTCTTCACATCCTGAAAACCTGCAAATCTCCATTGTCATCTCACCATGATCACTCTTTCAGACTTCTACCACGTCATGACCGCCGTAGTCCCCCTCTACGTCGCCATGATCCTCGCCTACGGCTCCGTCAAATGGTGGAAGATCTTCACCCCCGATCAATGCTCCGGCATCAACCGCTTTGTCGCCCTCTTCGCCGTCCCCCTCCTCTCCTTCCACTTCATCTCCACCAACAATCCCTACACCATGAACCTCAAATTCATCACCGCTGACACCCTCCAGAAGCTCGTTGTCCTCGCTGCCCTTGCCGCCCTCGCCAACCTCACCAAACTCGTTAGCCTCGAGTGGTCGATCACACTCTTTTCACTTTCGACTTTGCCCAATACCCTTGTTATGGGCATCCCTTTGCTTAAAGGGATGTATGGGGGTGA [...]
+>_87167
+AGGTTTGTTCATGGCTTCACAACCAAAGCTTATAGCATGTGGGAAGAGATTAGCAGCATATGGGATGGTAGCAAGGTTTGTAGCAGGACCTGCTGTGATGGCGATTGCTTCTATAGCTGTAGGTCTAAGAGGCACAATCCTACAAGTTTCTATCGTACAAGCGTCTTTACCACAAGGGATTGTGCCATTTGTGTTTGCTCGGGAGTACAACCTCCATCCTGATGTTCTGAGCACTGCGGTGATATTTGGAATGATAATATCTTTGCCTATCACAATGCTGTATTACATCTTACTAGGTCTCTAAATGAAGCAAACGGAATTGGGAAACCAGAAAGAAGACTTGATCCATGGATATTCATATGCATATATAATTATACATATCAAGATAAAGGCATACGGTAACTACATATGTCAAACAAAGTTATTTAAGTAAGACTTCAATTTCACATTGATGTTTTGATAAGTTGATAATATGAGTGTTTGAGTAGCACA [...]
+>_90885
+TTGTGGTGCTTCAGTGCATCATTTGGTACACTTTGATGCTGTTTCTGTTTGAATACAGAGGTGCAAGATTGTTGATCGCTGAGCAGTTTCCAGACACTGCTGGTTCGATTATTTCTTTTAGAGTTGATTCTGATGTGTTGTCGTTAGATGGTAAGGAGCCATTGCAGACAGAGGCTGAAGTTGGGGACGATGGGAAGCTTCATGTGACTGTTCGGAAGTCTGCAAGCTCTCATTCGGAGATCTTCTCAAGGCGGTCACACGGCGGAATGAACTCCGGTGTGTCAATAACCCCTCGGCCATCAAACTTAACAAATGCTGAAATTTACTCTTTACAGTCGTCAAGAAACCCAACACCAAGGGGCTCGAGCTTCAATCACACTGATTTTTACTCAATGGTTAATGGGAAAAGTGGTGCTAGTCCTCGTGGGTCGAATTTTGCAGGGTATGATGAGGAAAGTGGGAGTAATGTGAAGGTGAATGGGGGTCATGGGG [...]
+>_143664
+CCAAAAAAAATAGTGAAAGAAGAAAATTAATCATTAGATTTTATAATTTAGATGTGCTTATTTTCTATACATTTATACTATTTATTGTAAATTTAAAATATAAATATATCAAAAGATTAAAAAACAAAACAAAACAGAGGTCAACCAAGTTATAACCTAACTCGTGAAACTCATGTTATGATCGGATACAAAATCTATTTCTATTTGCCATTAGTCTCCACAAAGTATTGAAACTTTAAACTTGTTTTGACTAGAAAAATTATAATTATTCAATCAATTTCTCTTATTTTCAAAATAAAAAATTGAACCTCGACTTTTAACAAACTAATTGATGAGATTGATGTTGATTTTTTCCATTAAAAAGATGTACCCACCCCACGGGACCGCAATGGAGTCTATTCAGCAATCATTACGGTCCCCGTAA
+>_159544
+TCCATCCAGATATACTTAGCACAGGGTGAGTTTTTAATGACTTATACATATCTTATGTCCGTGCATTAATGACTTGTACATATCTTACTTTCGTGCTTTTGTAACTTGCACATCATGCATATGAAAGTTGTACATTGTCAATGTAACAAGTAAATGTGGCATCCAATGGTCATGTTTGAAGAGCAGAGATCCATTGCAACTTTCCCATATCTACTTGTATCCTATGCAATTAAAGTTGTACAATGCTCCAAACATGCAATACTTGTCAAAGAGTACAAGTAGTATCTTGATTTGAAAATGTACAATCTTTAAAGCATGCTTTACTTGTTTGTGTTGCAGGGTTATATTTGGAATGTTGATAGCACTACCAATTACTCTGGTGTACTACATTGTGTTGG
+>_199159
+GTAATTGGTAGAGCTATCAACATTCCAAATATAACCCCGCAAAACAAACATGTAAACCATGTATTAAAGATTTGTACATTTTCAAATCAAGATACTACTACTTATACTCTTTGACAAACATTGCATGTTCGGAGCAATGTACAACTTTTGATGCATAAGATACAAGTAGACATGGCAAAGTTGCAATGGATCTATGCTCTTCAAACATGACCATTAGATGCCACATTAATTTGTTACATTTGACAATGTGCAACTTTCATATGCATTATGTGCAAGTTACTAATGCAGGGAAACAAAATATGTACAAATCGTTAATGCACAAAAGTATTTAAAACTCACCCTGTGC
+>_199413
+AAAAAAACACACACACACACACACAAAGGGGAAAATACTACAAAATTACCATCCCCACCACCACCTTTGGAGCCTTAAAACACTTAACTCTTAAAACCTTCAAAGCACCAACATTCGCTAGTTCCCCTTGACGTACCATGATCAGTGTTTCAGATTTGTACCATGTGATGACGGCGGTCGTCCCGCTGTACGTGGCGATGATCTTAGCGTACGGGTCGGTCAAATGGTGGAAGATCTTCAGCCCTGATCAATGCTCAGGCATCAACCGGTTTGTCGCCCTTTTCGCCGTCCCCCTCCTCTCCTTTCACTTCATTTCCTCCAACAACCCTTACAAAATGAACCTTC
+>_203694
+GAAGGTTCATTTTGTAAGGGTTGTTGGAGGAAATGAAGTGAAAGGAGAGGAGGGGGACGGCGAAAAGGGCGACAAAGCGGTTGATGCCTGAGCATTGATCAGGGGTGAAGATCTTCCACCATTTGACGGAGCCGTAGGCTAAGATCATGGCGACGTAGAGCGGGACGACCGCGGTCATGACATGGTAGAAGTCTGAAAGAGTGATCATGATGAAGGGGAGAATGAAGTAATTAAGAATGGTGGTTTGAAGGTTTGAAGAGATGAAGAGGGATGTTAAGGAAGTGAAGTGTTTTGAGGCTCCAAAGGTGGTGGGGGCTTGTTTTGTAGTATTTTCCACTTTG
+>_278972
+CTCTGCTCCTCCTTTCCCCTTAAAACCAGAGACCCTTTTCTTCCCAAAACATCTCTCTCATGGACCATCAAACCTTCATTTAAACCCCAATTCCCACCTGGGTTTTCTTCTTCACCATCTTCCTTCACCTGGGTCCTCCTCCAATTTCTCAATATGATCACTGGTTCGGACCTTTACCATGTCTTCACAGCCGTGGTTCCACTCTATGTCGCCATGATCCTAGCCTACGGCTCGGTGAAATGGTGGAAGATCTTCACCCCTGATCAATGCT
+>_288182
+CGAGTATAATGTTCACCCAGAAATCCTGAGCACCAGCTAACTAGAAGAGGATTCGTTCAAGCAAATGATAAAAAGCTAGAAGGCACCAAGATATTTCAAGCCCAGGAATTAGAACACATCAGCATGATATCTCGTCATTCGTTGTAGTGAAGGGTTAAAGATCAGTACAAGATTACATAAATCTTGTAAAGTTCATGTCCAAAAAACCACCATTCTACGAGTCTGTTGTTGTATTAACTTGTCTATTTATGAACCTAGTTTACA
+>_307804
+CCCTCTTCCCTTTATATACAACTTCCCCACCCCCCCCCCACTTTCTAATTTCTCTCGCCGGAACTTCCATCTCCGTCAACGGTAAAACCGCCACAAATCACCCACATTCCGGTTACAAAACCGTCACAAAAGATAACAAACTCCCACTCTCACTCTCACTTTCCCATTTTACCCCCACCACCACCACAATGATCACCGCCCACGACTTCTACACCGTCATGTCCGCCATGGTGCCGTTATACGTCGCCAT
+>_350457
+GAAACCACAGATTCTTGAAAATTCAGTTACCATTCTCTCAGATGCAGGCCTAGGAATGGCTATGTTTAGTCTAGGTAAATATGAAAATAGAGAAGGCATTTTTGAATAAACGGGTCTATTTGGGTTGTGTTTTATCTAATACAGTCTAAATGGCTCAAATCAAAATATGCAACTGTAAAGGGAACGGGTCAAATGAGTCGAAAATGGCGCAAAAATCAATTTTTA
+>_362402
+CGAAGTTTGACAACCTCCCACCCGGAAACCCCATCATGGAGTAAAAAATCCGAGTGGTTAAAATTCGAACCTCGCGGGGTCGGATTCCTCGACGAGCTCAAACTATAAATTTCCGCCCCCGTAAGGTTCGACAGCCGGGGAGTCATCTGACTCCCCGACCCAAACCCTACCGACTTCCTCGACACATTCGACCTCCTCACCGTCACATGTAACTTCCCA
diff --git a/test/test_bin.rb b/test/test_bin.rb
new file mode 100644
index 0000000..0f1edf3
--- /dev/null
+++ b/test/test_bin.rb
@@ -0,0 +1,17 @@
+#!/usr/bin/env ruby
+
+require 'helper'
+
+class TestBinary < Test::Unit::TestCase
+
+  context 'crb-blast' do
+
+    should 'run binary' do
+      cmd = "bundle exec bin/crb-blast --help"
+      runner = CRB_Blast::Cmd.new(cmd)
+      runner.run
+      assert runner.status.success?
+    end
+
+  end
+end
diff --git a/test/test_test.rb b/test/test_test.rb
new file mode 100644
index 0000000..81766a7
--- /dev/null
+++ b/test/test_test.rb
@@ -0,0 +1,99 @@
+#!/usr/bin/env ruby
+
+require 'helper'
+
+class TestCRBBlast < Test::Unit::TestCase
+
+  context 'crb-blast' do
+
+    setup do
+      query = File.join(File.dirname(__FILE__), 'query.fasta')
+      target = File.join(File.dirname(__FILE__), 'target.fasta')
+      @blaster = CRB_Blast::CRB_Blast.new(query, target)
+      @dbs = @blaster.makedb
+      @run = @blaster.run_blast(1e-5, 6, false)
+      @load = @blaster.load_outputs
+      @recips = @blaster.find_reciprocals
+      @secondaries = @blaster.find_secondaries
+    end
+
+    teardown do
+      extensions = ["blast", "nsq", "nin", "nhr", "psq", "pin", "phr"]
+      Dir["*"].each do |file|
+        extensions.each do |extension|
+          if file =~ /.*\.#{extension}$/
+            File.delete(file)
+          end
+        end
+      end
+    end
+
+    should 'raise error when files don\'t exist' do
+      query = File.join(File.dirname(__FILE__), 'not_query.fasta')
+      target = File.join(File.dirname(__FILE__), 'not_target.fasta')
+      assert_raise IOError do
+        blaster = CRB_Blast::CRB_Blast.new(query, target)
+      end
+    end
+
+    should 'setup should run ok' do
+      ans = @blaster != nil
+      assert_equal ans, true
+    end
+
+    should 'determine that the target is a protein sequence' do
+      prot = @blaster.target_is_prot
+      assert_equal @dbs, ['query', 'target']
+      assert_equal prot, true
+    end
+
+    should 'run blast' do
+      assert_equal @run, true
+    end
+
+    should 'load outputs' do
+      assert_equal @load, [15,15]
+    end
+
+    should 'find reciprocals' do
+      assert_equal @recips, 10
+    end
+
+    should 'check if contig has reciprocal hit' do
+      assert_equal @blaster.has_reciprocal?("scaffold3"), true
+    end
+
+    should 'not find fake scaffold name' do
+      assert_equal @blaster.has_reciprocal?("not_a_scaffold"), false
+    end
+
+    should 'get query results' do
+      count=0
+      @blaster.query_results.each_pair do |key, list|
+        list.each do |hit|
+          count+=1
+        end
+      end
+      cmd = "wc -l query_into_target.1.blast"
+      lines = `#{cmd}`.to_i
+      assert_equal count, lines
+    end
+
+    should 'output all reciprocal hits' do
+      a = @blaster.reciprocals
+      assert_equal a["scaffold3"][0].target, "AT3G44735.1"
+      assert_equal a["scaffold5"][0].target, "AT5G13650.2"
+    end
+
+    should 'run' do
+      blaster = CRB_Blast::CRB_Blast.new('test/query.fasta',
+                                         'test/target.fasta')
+      blaster.run 1, 1, false
+      assert blaster.reciprocals
+    end
+
+    should 'get number of reciprocals' do
+      assert_equal 11, @blaster.size
+    end
+  end
+end
diff --git a/test/test_test2.rb b/test/test_test2.rb
new file mode 100644
index 0000000..8bce3ef
--- /dev/null
+++ b/test/test_test2.rb
@@ -0,0 +1,90 @@
+#!/usr/bin/env ruby
+
+require 'helper'
+
+class Test2CRBBlast < Test::Unit::TestCase
+
+  context 'crb-blast' do
+
+    setup do
+      @blaster = CRB_Blast::CRB_Blast.new('test/query2.fasta', 'test/target2.fasta')
+      @dbs = @blaster.makedb
+      @run = @blaster.run_blast(1e-5, 6, false)
+      @load = @blaster.load_outputs
+      @recips = @blaster.find_reciprocals
+      @secondaries = @blaster.find_secondaries
+    end
+
+    teardown do
+      extensions = ["blast", "nsq", "nin", "nhr", "psq", "pin", "phr"]
+      Dir["*"].each do |file|
+        extensions.each do |extension|
+          if file =~ /.*\.#{extension}$/
+            File.delete(file)
+          end
+        end
+      end
+    end
+
+    should 'setup2 should run ok' do
+      ans = @blaster != nil
+      assert_equal ans, true
+    end
+
+    should 'break input file into pieces' do
+      files = @blaster.split_input('test/query2.fasta', 4)
+      assert_equal 4, files.size
+      files.each do |file|
+        assert File.exist?(file)
+      end
+      # little teardown
+      files.each do |file|
+        File.delete(file)
+      end
+    end
+
+    should 'break input file into no more pieces than there are sequences' do
+      files = @blaster.split_input('test/query.fasta', 16)
+      assert_equal 11, files.size
+      files.each do |file|
+        assert File.exist?(file)
+      end
+      # little teardown
+      files.each do |file|
+        File.delete(file)
+      end
+    end
+
+    should 'run blast should check if the databases exist yet' do
+      tmp = CRB_Blast::CRB_Blast.new('test/query2.fasta',
+                                     'test/target2.fasta')
+      assert_equal false, tmp.run_blast(10,1,false)
+    end
+
+    should 'load output should check if the databases exist' do
+      tmp = CRB_Blast::CRB_Blast.new('test/query2.fasta',
+                                     'test/target2.fasta')
+      assert_raise RuntimeError do
+        tmp.load_outputs
+      end
+    end
+
+    should 'find reciprocals' do
+      assert_equal 7, @recips
+    end
+
+    should 'add secondary hits' do
+      assert_equal 2, @secondaries
+    end
+
+    should 'get non reciprocal hits' do
+      count=0
+      @blaster.missed.each_pair do |key, value|
+        value.each do |i|
+          count+=1
+        end
+      end
+      assert_equal count,70
+    end
+  end
+end
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/pkg-ruby-extras/ruby-crb-blast.git