Introduction

Purpose

edittag is a suite of programs for designing, checking, and using sets of sequence tags that conform to the edit metric. Here, we define such a set as a group of sequence tags of a given length in which every tag is at least the selected edit (Levenshtein) distance [Levenshtein:1966], [Gusfield:1999] from every other tag in the set.
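To make the definition concrete, the following is a minimal sketch of an edit metric check written in plain Python. It is not edittag's own code: the function names `levenshtein`, `min_pairwise_distance`, and `conforms` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def levenshtein(a, b):
    """Return the Levenshtein (edit) distance between strings a and b."""
    # previous[j] is the distance between the prefix of a processed so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def min_pairwise_distance(tags):
    """Smallest edit distance between any two tags in the set."""
    return min(levenshtein(x, y) for x, y in combinations(tags, 2))


def conforms(tags, min_distance):
    """True if every pair of tags is at least min_distance apart."""
    return min_pairwise_distance(tags) >= min_distance


# Hypothetical four-tag set used only for this example.
example_tags = ["AAAA", "CCCC", "GGGG", "TTTT"]
print(conforms(example_tags, 3))
```

Checking every pair takes time quadratic in the number of tags, which is acceptable for a sketch but becomes costly for large sets.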

At a more general level, edittag contains tools that facilitate high levels of multiplexing (plexity) during massively parallel DNA sequencing runs.

Using edittag, we have generated several large sets of edit metric sequence tags from four to ten nucleotides in length, with edit distances ranging from three to nine. We provide these pre-generated tag sets under a Creative Commons Attribution 3.0 United States license in several forms:

Features

  • edittag correctly designs sequence tag sets conforming to the edit metric
  • edittag contains a method to test designed sets (from edittag or elsewhere) for conformance to the edit metric (and/or the Hamming distance [Hamming:1950] between sequence tags)
  • edittag provides several programs for integrating sequence tags into primers and sequencing adapters
  • edittag uses multiprocessing to speed computation, particularly when designing sequence tags of arbitrary length
  • edittag provides unit tests of critical code (edit distance computation) and output to ensure that the underlying methods return expected results
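The conformance test mentioned in the list above can be pictured with a small sketch. The code below is an illustration under assumptions, not edittag's implementation: `hamming` and `violations` are hypothetical names, and the tags are made up. Because the Hamming distance counts substitutions only, a set that passes a Hamming check can still fail an edit distance check once insertions and deletions are considered. For example, ACGT and CGTA differ at all four positions but are only two edits apart.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length tags")
    return sum(x != y for x, y in zip(a, b))


def violations(tags, min_distance, distance=hamming):
    """Return (tag, tag, distance) for every pair closer than min_distance.

    Any two-argument distance function can be passed in place of hamming,
    for instance an edit distance function.
    """
    failing = []
    for a, b in combinations(tags, 2):
        d = distance(a, b)
        if d < min_distance:
            failing.append((a, b, d))
    return failing


# Hypothetical tags: the first two differ at a single position.
for pair in violations(["ACGT", "ACGA", "TTTT"], min_distance=3):
    print(pair)
```

An empty result means the set conforms to the chosen distance at the chosen threshold.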

Availability

We provide several methods of installing edittag; see Installation for additional details:

Dependencies

Optional

Although it is optional, we strongly recommend installing py-levenshtein, a C extension module for Python that speeds computation of the Levenshtein distance between strings.
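One common way to treat such a dependency as optional is to try the C implementation first and fall back to pure Python when it is missing. The sketch below assumes the package is importable as `Levenshtein` and exposes a `distance` function; the fallback and the name `edit_distance` are hypothetical and are not taken from edittag's source.

```python
try:
    # Fast C implementation from the optional py-levenshtein package.
    from Levenshtein import distance as edit_distance
except ImportError:
    def edit_distance(a, b):
        """Pure-Python fallback using two-row dynamic programming."""
        row = list(range(len(b) + 1))
        for i, char_a in enumerate(a, start=1):
            new_row = [i]
            for j, char_b in enumerate(b, start=1):
                new_row.append(min(
                    row[j - 1] + (char_a != char_b),  # substitution or match
                    row[j] + 1,                       # deletion
                    new_row[j - 1] + 1,               # insertion
                ))
            row = new_row
        return row[-1]


# The caller sees the same function either way.
print(edit_distance("ACGT", "CGTA"))
```

Both paths return the same distances; only the speed differs, and the gap grows with the number of pairwise comparisons a tag set requires.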
