[Cdd-commits] r1004 - projects/med/trunk/debian-med/tasks

Wed Jul 23 12:48:58 UTC 2008

Author: tille
Date: Wed Jul 23 12:48:58 2008
New Revision: 1004

Modified:
   projects/med/trunk/debian-med/tasks/bio
Log:
Added hexamer


Modified: projects/med/trunk/debian-med/tasks/bio
==============================================================================

--- projects/med/trunk/debian-med/tasks/bio	(original)
+++ projects/med/trunk/debian-med/tasks/bio	Wed Jul 23 12:48:58 2008
@@ -1890,3 +1890,30 @@
  .
  Please note FINEX is no longer supported but is available for
  download.
+
+Depends: hexamer
+Homepage: http://www.sanger.ac.uk/Software/analysis/hexamer/
+License: GPL
+Pkg-Description: scan DNA sequences to look for likely coding regions
+ Hexamer is a program to scan DNA sequences to look for likely coding
+ regions. The principle is to use 6mers, but to avoid deriving any
+ information from base composition. Therefore, the frequencies of each
+ 6mer are normalized by dividing by the total frequency of all 6mers
+ with the same base composition.
+ .
+ There are two programs involved in this process:
+  * hextable
+    hextable makes files of statistics that hexamer uses to scan for
+    likely coding regions.
+    The input of hextable is a fasta file of coding sequences in
+    frame.  The -o file output is an ascii list of 4096 floating point
+    numbers giving log likelihood ratio scores in bits.  The output on
+    stdout is a summary of the information content of the table,
+    indicating how discriminative it is likely to be.
+  * hexamer
+    Uses the .hex file from hextable to scan a DNA sequence for likely
+    coding regions.
+    The input is a fasta DNA file (n.b. these programs assume all
+    bases are 'a', 'c', 'g' or 't'; any 'n' found in the sequence
+    files will be converted to 'c').
+    The output of hexamer is in General Feature Format (GFF).
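
The normalization described in the package description above can be sketched in Python. This is only an illustration, not the actual hextable code: the function names, the in-frame step of 3, and the choice to return plain normalized frequencies (rather than the log likelihood ratio scores in bits that hextable writes) are all assumptions made for this example. The 'n' to 'c' conversion follows the note in the description.

```python
from collections import Counter, defaultdict


def composition(kmer):
    """Return the base composition of a k-mer as (a, c, g, t) counts."""
    return tuple(kmer.count(base) for base in "acgt")


def normalized_hexamer_freqs(sequences, step=3):
    """Count 6mers and normalize each by its base-composition class.

    Hypothetical helper: `step=3` assumes the input sequences are
    coding sequences in frame, so only codon-aligned 6mers are counted.
    """
    counts = Counter()
    for seq in sequences:
        # The description says 'n' in the input is converted to 'c'.
        seq = seq.lower().replace("n", "c")
        for i in range(0, len(seq) - 5, step):
            counts[seq[i:i + 6]] += 1

    # Total count of all observed 6mers sharing the same base composition.
    class_totals = defaultdict(int)
    for kmer, n in counts.items():
        class_totals[composition(kmer)] += n

    # Dividing by the class total removes any signal that comes from
    # base composition alone; only the ordering of bases remains.
    return {kmer: n / class_totals[composition(kmer)]
            for kmer, n in counts.items()}


if __name__ == "__main__":
    for kmer, freq in sorted(normalized_hexamer_freqs(["atgaaagtaaaa"]).items()):
        print(kmer, round(freq, 3))
```

Because every 6mer is compared only with other 6mers of identical composition, a sequence that is merely rich in one base gains no advantage in this sketch.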