TnpPred - Documentation

Why TnpPred?

Transposases (Tnps) are enzymes that are encoded by insertion sequences (ISs) and participate in the movement of ISs within and between genomes. Tnps are among the most common and ubiquitous proteins found in nature. However, they are difficult to predict bioinformatically and, given the increasing availability of prokaryotic genomes and metagenomes, rapid, high-quality, automatic annotation of ISs is needed. Two approaches are currently used: Pfam HMM profiles, which are available for several, but not all, known Tnps; and ISsaga, which uses Blastp to detect Tnps and links this information to DNA sequence motifs to predict IS families. This prompted us to develop a program for Tnp prediction (TnpPred) based on HMMs derived from all known Tnps, which extends the number of predictions available in ISsaga and provides better sensitivity and specificity than the corresponding Pfam HMM profiles.

What else is available?

In addition to using the TnpPred web service, you can:

  1. Download the HMM profiles used to construct TnpPred.
  2. Download the ROC curves used to validate the HMM profiles (Coming soon).
  3. Download annotations for over 2000 prokaryotic genomes, based on current NCBI genome annotations and improved with transposase annotations generated with TnpPred.
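
If you download the HMM profiles, one way to use them locally is to search a protein set with HMMER, for example `hmmsearch --tblout hits.tbl profiles.hmm proteins.faa` (the three file names are placeholders). The sketch below is an illustration rather than part of TnpPred: it reads the resulting tabular output and keeps, for each protein, the best-scoring profile below an E-value cutoff. The cutoff of 1e-5, the "best score wins" rule, and the protein and profile names in the sample are assumptions made up for the example.

```python
from typing import Dict, NamedTuple


class Hit(NamedTuple):
    protein: str   # target sequence name (column 1 of --tblout)
    profile: str   # query HMM name (column 3 of --tblout)
    evalue: float  # full-sequence E-value (column 5)
    score: float   # full-sequence bit score (column 6)


def parse_tblout(text: str, max_evalue: float = 1e-5) -> Dict[str, Hit]:
    """Return the best-scoring hit per protein from hmmsearch --tblout text.

    The E-value cutoff is an illustrative assumption, not a TnpPred setting.
    """
    best: Dict[str, Hit] = {}
    for line in text.splitlines():
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        # 18 whitespace-separated columns, then a free-text description.
        fields = line.split(maxsplit=18)
        hit = Hit(fields[0], fields[2], float(fields[4]), float(fields[5]))
        if hit.evalue > max_evalue:
            continue
        previous = best.get(hit.protein)
        if previous is None or hit.score > previous.score:
            best[hit.protein] = hit
    return best


# Made-up sample in --tblout layout; names and numbers are hypothetical.
SAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias ...
protA - famX_demo - 1.2e-40 135.6 0.1 2.0e-40 134.9 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein
protA - famY_demo - 3.0e-08 30.2 0.0 4.1e-08 29.8 0.0 1.0 1 0 0 1 1 1 1 hypothetical protein
protB - famY_demo - 0.5 4.2 0.0 0.9 3.5 0.0 1.0 1 0 0 1 1 1 1 hypothetical protein
"""

if __name__ == "__main__":
    for protein, hit in parse_tblout(SAMPLE).items():
        print(protein, hit.profile, hit.evalue, hit.score)
```

In practice you would pass the contents of your own `--tblout` file to `parse_tblout` and choose a cutoff suited to your data.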

How were the HMM profiles built and validated?

The 47 HMM profiles used in the TnpPred web service were built with the HMMER program from protein sequences deposited in ISfinder (Classification of Prokaryotic Insertion Sequences). The profiles were validated by ROC curve analysis, which measures the sensitivity and specificity of each HMM profile.
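
To make the validation step concrete, here is a minimal sketch of how sensitivity and specificity can be computed at each score threshold to obtain the points of a ROC curve. It is not the code used to validate the TnpPred profiles; the function name `roc_points` and the toy scores and labels are assumptions chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def roc_points(
    scores: Sequence[float], labels: Sequence[bool]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A sequence is called positive when its score is >= the threshold.
    labels[i] is True when sequence i is a known member of the family.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
        sensitivity = tp / positives                # TP / (TP + FN)
        specificity = (negatives - fp) / negatives  # TN / (TN + FP)
        points.append((threshold, sensitivity, specificity))
    return points


# Toy bit scores for six sequences; True marks a true family member.
# These values are invented for the example.
toy_scores = [120.0, 95.5, 60.2, 40.0, 35.1, 10.3]
toy_labels = [True, True, False, True, False, False]

if __name__ == "__main__":
    for threshold, sens, spec in roc_points(toy_scores, toy_labels):
        print(f"score >= {threshold:6.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Plotting sensitivity against (1 - specificity) for these points gives the ROC curve; a profile that separates true members from non-members well stays close to the top-left corner.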