Debian Med Project
Help us to see Debian used by medical practitioners and biomedical researchers! Join us on the Alioth page.
Summary
Biology
Debian Med bioinformatics packages

This metapackage will install Debian packages for use in molecular biology, structural biology and other biological sciences.

Description

If you discover a project that looks like a good candidate for Debian Med, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Med mailing list.

Debian Med Biology packages

Official Debian packages with high relevance

Abacas
Algorithm Based Automatic Contiguation of Assembled Sequences
Versions of package abacas
Release  Version  Architectures
wheezy   1.3.1-1  all
jessie   1.3.1-2  all
stretch  1.3.1-3  all
sid      1.3.1-3  all
Debtags of package abacas:
role: program
Popcon: 5 users (51 upd.)*
License: DFSG free
Svn

ABACAS is intended to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence.

ABACAS uses MUMmer to find alignment positions and identify syntenies of assembled contigs against the reference. The output is then processed to generate a pseudomolecule, taking overlapping contigs and gaps into account. ABACAS generates a comparison file that can be used to visualize ordered and oriented contigs in ACT. Synteny is represented by red bars whose colour intensity decreases with lower values of percent identity between comparable blocks. Information on contigs such as the orientation, percent identity, coverage and overlap with other contigs can also be visualized by loading the output feature file in ACT.

Please cite: Samuel Assefa, Thomas M. Keane, Thomas D. Otto, Chris Newbold and Matthew Berriman: ABACAS: algorithm-based automatic contiguation of assembled sequences. (PubMed,eprint) Bioinformatics 25(15):1968-1969 (2009)
Acedb-other
retrieval of DNA or protein sequences
Versions of package acedb-other
Release  Version           Architectures
wheezy   4.9.39+dfsg.01-5  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   4.9.39+dfsg.01-5  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  4.9.39+dfsg.01-6  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      4.9.39+dfsg.01-6  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package acedb-other:
biology: nuceleic-acids
field: biology, biology:bioinformatics
role: program
scope: utility
Popcon: 6 users (49 upd.)*
License: DFSG free
Svn

This package gathers the small applications that acedb builds under the 'other' target of its Makefile.

efetch (presumably short for 'entry fetch') collects sequence information from common DNA and protein databases.

Please cite: L. D. Stein and J. Thierry-Mieg: AceDB: a genome database management system. Computing in Science and Engineering 1(3):44-52 (1999)
Acedb-other-belvu
multiple sequence alignment editor
Versions of package acedb-other-belvu
Release  Version           Architectures
wheezy   4.9.39+dfsg.01-5  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   4.9.39+dfsg.01-5  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  4.9.39+dfsg.01-6  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      4.9.39+dfsg.01-6  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package acedb-other-belvu:
role: program
uitoolkit: gtk, ncurses
Popcon: 4 users (49 upd.)*
License: DFSG free
Svn

For the analysis of biological sequences, a general principle is to compare corresponding regions between related proteins, RNA or DNA. When the sequences are written one above the other so that corresponding positions line up, the result is an alignment.

Belvu is best known for its faithful implementation of the Stockholm format for multiple sequence alignments, which upstream also maintains. The format is used, for instance, in the Pfam and Rfam databases.
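
As a rough illustration of what a Stockholm file contains, the sketch below parses a minimal alignment in Python. It assumes the usual layout (a "# STOCKHOLM 1.0" header, markup lines starting with "#", "name sequence" rows, and a closing "//"). The function name parse_stockholm and the sample sequences are invented for this example and are not part of Belvu.

```python
def parse_stockholm(text):
    """Return {sequence name: aligned sequence} from Stockholm-formatted text.

    Simplified sketch: markup lines (starting with '#') are skipped, and
    interleaved blocks are joined by concatenating rows with the same name.
    """
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("# STOCKHOLM"):
        raise ValueError("missing '# STOCKHOLM' header line")
    alignment = {}
    for line in lines[1:]:
        line = line.strip()
        if line == "//":  # end-of-alignment marker
            break
        if not line or line.startswith("#"):
            continue  # blank line or #=GF / #=GC / #=GS / #=GR markup
        name, seq = line.split(None, 1)
        alignment[name] = alignment.get(name, "") + seq.replace(" ", "")
    return alignment


# Hypothetical two-sequence alignment used only for demonstration.
example = """# STOCKHOLM 1.0
#=GF ID EXAMPLE
seq1  ACDEF-GH
seq2  ACDEFQGH
#=GC SS_cons  CCCHHHHC
//
"""

alignment = parse_stockholm(example)
for name, seq in alignment.items():
    print(name, seq)
```

The parser keeps only the sequence rows; a fuller reader would also retain the per-file and per-column annotations that make the format useful for databases such as Pfam and Rfam.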

Please cite: L. D. Stein and J. Thierry-Mieg: AceDB: a genome database management system. Computing in Science and Engineering 1(3):44-52 (1999)
Acedb-other-dotter
visualisation of sequence similarity
Versions of package acedb-other-dotter
Release  Version           Architectures
wheezy   4.9.39+dfsg.01-5  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   4.9.39+dfsg.01-5  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  4.9.39+dfsg.01-6  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      4.9.39+dfsg.01-6  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package acedb-other-dotter:
role: program
uitoolkit: gtk, ncurses
Popcon: 4 users (49 upd.)*
License: DFSG free
Svn

For the analysis of biological sequences, a general principle is to compare corresponding regions between related proteins, RNA or DNA.

Dotter is an interactive dotplot program with adjustable thresholds that graphically displays the similarity of a DNA or protein sequence to itself or to another sequence.
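
The idea behind a dotplot can be sketched in a few lines of Python. This is a simplified illustration, not Dotter's implementation: it counts identical positions in a sliding window rather than using a scoring matrix, and the names dotplot and render as well as the window and threshold parameters are assumptions made for the example.

```python
def dotplot(seq_a, seq_b, window=3, threshold=3):
    """Return the set of (i, j) where the windows starting at seq_a[i] and
    seq_b[j] share at least `threshold` identical positions."""
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= threshold:
                dots.add((i, j))
    return dots


def render(seq_a, seq_b, dots):
    """Draw the dotplot as text: rows follow seq_a, columns follow seq_b."""
    return "\n".join(
        "".join("*" if (i, j) in dots else "." for j in range(len(seq_b)))
        for i in range(len(seq_a))
    )


# Comparing a sequence with itself: the main diagonal is the trivial
# self-match, and off-diagonal lines reveal the internal repeat.
seq = "GATTACAGATTACA"
dots = dotplot(seq, seq, window=4, threshold=4)
print(render(seq, seq, dots))
```

Lowering the threshold relative to the window size admits weaker similarities and makes the plot noisier, which is the trade-off an interactive threshold control lets the user explore.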

Please cite: L. D. Stein and J. Thierry-Mieg: AceDB: a genome database management system. Computing in Science and Engineering 1(3):44-52 (1999)
Adun-core
Molecular Simulator
Versions of package adun-core
Release  Version  Architectures
stretch  0.81-9   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.81-9   amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 3 users (17 upd.)*
License: DFSG free
Svn

Adun is a biomolecular simulator that also includes data management and analysis capabilities. It was developed at the Computational Biophysics and Biochemistry Laboratory, a part of the Research Unit on Biomedical Informatics of the UPF.

This package contains the AdunCore program and the Adun server. If you want the graphical UI frontend, install the adun.app package.

Please cite: Michael A. Johnston, Ignacio Fdez. Galván and Jordi Villà-Freixa: Framework-based design of a new all-purpose molecular simulation application: The Adun simulator. (PubMed) J. Comp. Chem. 26(15):1647-1659 (2005)
Aegean
integrated genome analysis toolkit
Versions of package aegean
Release  Version        Architectures
stretch  0.15.2+dfsg-1  amd64,arm64,armel,armhf,i386,mipsel,powerpc,ppc64el,s390x
sid      0.15.2+dfsg-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mipsel,powerpc,ppc64el,s390x
Popcon: 2 users (46 upd.)*
License: DFSG free
Git

The AEGeAn Toolkit is designed for the Analysis and Evaluation of Genome Annotations. The toolkit includes a variety of analysis programs, e.g. for comparing distinct sets of gene structure annotations (ParsEval), computation of gene loci (LocusPocus) and more.

Please cite: Daniel S Standage and Volker P Brendel: ParsEval: parallel comparison and analysis of gene structure annotations. (PubMed,eprint) BMC Bioinformatics 13(1):187 (2012)
Aevol
digital genetics model to run Evolution Experiments in silico
Versions of package aevol
Release  Version  Architectures
jessie   4.4-1    amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  4.4-1    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      4.4-1    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 6 users (48 upd.)*
License: DFSG free
Svn

Aevol is a digital genetics model: populations of digital organisms are subjected to a process of selection and variation, which creates Darwinian dynamics.

By modifying the characteristics of selection (e.g. population size, type of environment, environmental variations) or variation (e.g. mutation rates, chromosomal rearrangement rates, types of rearrangements, horizontal transfer), one can study experimentally the impact of these parameters on the structure of the evolved organisms. In particular, since Aevol integrates a precise and realistic model of the genome, it allows for the study of structural variations of the genome (e.g. number of genes, synteny, proportion of coding sequences).

The simulation platform comes along with a set of tools for analysing phylogenies and measuring many characteristics of the organisms and populations along evolution.

Please cite: Dusan Misevic, Antoine Frenoy, David P. Parsons and Francois Taddei: Effects of public good properties on the evolution of cooperation. (eprint) :218-225 (2012)
Alien-hunter
Interpolated Variable Order Motifs to identify horizontally acquired DNA
Versions of package alien-hunter
Release  Version  Architectures
squeeze  1.7-1    all
wheezy   1.7-1    all
jessie   1.7-3    all
stretch  1.7-3    all
sid      1.7-3    all
Debtags of package alien-hunter:
field: biology, biology:structural
role: program
scope: utility
use: analysing
Popcon: 9 users (48 upd.)*
License: DFSG free
Svn

Alien_hunter is an application for the prediction of putative Horizontal Gene Transfer (HGT) events using Interpolated Variable Order Motifs (IVOMs). An IVOM approach exploits compositional biases using variable order motif distributions and captures the local composition of a sequence more reliably than fixed-order methods. Optionally, the predictions can be parsed into a 2-state 2nd order Hidden Markov Model (HMM), in a change-point detection framework, to optimize the localization of the boundaries of the predicted regions. The predictions (EMBL format) can be automatically loaded into the Artemis genome viewer, freely available at: http://www.sanger.ac.uk/Software/Artemis/.

Please cite: Georgios S. Vernikos and Julian Parkhill: Interpolated variable order motifs for identification of horizontally acquired DNA: revisiting the Salmonella pathogenicity islands. (PubMed,eprint) Bioinformatics 22(18):2196-2203 (2006)
Alter-sequence-alignment
genomic sequences ALignment Transformation EnviRonment
Versions of package alter-sequence-alignment
Release  Version       Architectures
stretch  1.3.3+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      1.3.3+dfsg-1  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
License: DFSG free
Git

ALTER (ALignment Transformation EnviRonment) is a tool to transform between multiple sequence alignment formats. ALTER focuses on the specifications of mainstream alignment and analysis programs rather than on the conversion among more or less specific formats.

Please cite: Daniel Glez-Peña, Daniel Gómez-Blanco, Miguel Reboiro-Jato, Florentino Fdez-Riverola and David Posada: ALTER: program-oriented conversion of DNA and protein alignments. (PubMed,eprint) Nucl. Acids Res. 38(suppl 2):W14-W18 (2010)
Altree
program to perform phylogeny-based association and localization analysis
Versions of package altree
Release  Version  Architectures
squeeze  1.0.1-3  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   1.2.1-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.3.1-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.3.1-4  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.3.1-4  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package altree:
field: biology, biology:bioinformatics
interface: commandline
role: program, shared-lib
scope: utility
use: analysing, comparing
works-with-format: plaintext
Popcon: 4 users (49 upd.)*
License: DFSG free
Svn

ALTree was designed to perform association detection and localization of susceptibility sites using haplotype phylogenetic trees: first, it allows the detection of an association between a candidate gene and a disease, and second, it makes it possible to form hypotheses about the susceptibility loci.

Please cite: Claire Bardel, Vincent Danjean and Emmanuelle Genin: ALTree: association detection and localization of susceptibility sites using haplotype phylogenetic trees. (PubMed,eprint) Bioinformatics 22(11):1402-1403 (2006)
Amap-align
Protein multiple alignment by sequence annealing
Versions of package amap-align
Release  Version  Architectures
squeeze  2.2-1    amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   2.2-3    amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.2-4    amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.2-5    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.2-5    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package amap-align:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing
works-with-format: plaintext
Popcon: 17 users (71 upd.)*
License: DFSG free
Svn

AMAP is a command line tool to perform multiple alignment of peptidic sequences. It utilizes posterior decoding, and a sequence-annealing alignment, instead of the traditional progressive alignment method. It is the only alignment program that allows one to control the sensitivity / specificity tradeoff. It is based on the ProbCons source code, but uses alignment metric accuracy and eliminates the consistency transformation.

The java visualisation tool of AMAP 2.2 is not yet packaged in Debian.

Please cite: Ariel S. Schwartz and Lior Pachter: Multiple alignment by sequence annealing. (eprint) Bioinformatics 23(2):e24-e29 (2007)
Remark of Debian Med team: Dead upstream

The homepage of this project vanished as well as the Download area. An old unmaintained version remained at code.google.com. Please drop the maintainer a note if you have any news of this project.

Ampliconnoise
removal of noise from 454 sequenced PCR amplicons
Versions of package ampliconnoise
Release  Version  Architectures
wheezy   1.25-1   amd64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,powerpc,sparc
jessie   1.29-2   amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el
stretch  1.29-5   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.29-5   amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package ampliconnoise:
role: program
Popcon: 4 users (48 upd.)*
License: DFSG free
Svn

AmpliconNoise is a package of applications to clean up high-throughput sequence data. It consists of three main parts:

 Pyronoise - does flowgram-based clustering to spot misreads
 SeqNoise  - removes PCR point mutations
 Perseus   - removes PCR chimeras without the need for a set of reference sequences

Previously there was a standalone "Pyronoise" by the same authors and this package includes an updated version. There is also a "Denoiser" in Qiime which is related but distinct.

Please cite: Christopher Quince, Anders Lanzen, Russell J Davenport and Peter J Turnbaugh: Removing Noise From Pyrosequenced Amplicons. (PubMed,eprint) BMC Bioinformatics 12:38 (2011)
Andi
Efficient Estimation of Evolutionary Distances
Versions of package andi
Release  Version  Architectures
stretch  0.10-2   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.10-2   amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
License: DFSG free
Git

This is the andi program for estimating the evolutionary distance between closely related genomes. These distances can be used to rapidly infer phylogenies for big sets of genomes. Because andi does not compute full alignments, it is so efficient that it scales even up to thousands of bacterial genomes.
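
To give a sense of what an evolutionary distance is, the Python sketch below applies the standard Jukes-Cantor correction to a mismatch proportion. It is an illustration only: how andi derives its mismatch estimate without computing full alignments is not shown here, and the function names and sample sequences are invented for the example.

```python
import math


def mismatch_proportion(a, b):
    """Fraction of differing positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor_distance(p):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4p/3).

    Converts an observed mismatch proportion p into an estimated number of
    substitutions per site, correcting for multiple hits at the same site.
    """
    if not 0 <= p < 0.75:
        raise ValueError("Jukes-Cantor is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical pre-aligned fragments with one mismatch in ten positions.
p = mismatch_proportion("ACGTACGTAC", "ACGTACGTAA")
print(round(jukes_cantor_distance(p), 4))
```

For closely related genomes p is small, so the corrected distance stays close to p itself. A matrix of such pairwise distances is the kind of input that distance-based tree building methods use to infer phylogenies.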

Please cite: Bernhard Haubold, Fabian Klötzl and Peter Pfaffelhuber: andi: Fast and accurate estimation of evolutionary distances between closely related genomes. (PubMed,eprint) Bioinformatics 31(8):1169-1175 (2015)
Anfo
Short Read Aligner/Mapper from MPG
Versions of package anfo
Release  Version  Architectures
jessie   0.98-4   amd64,armel,armhf,i386,mips,mipsel,powerpc,s390x
sid      0.98-4   amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 3 users (2 upd.)*
License: DFSG free
Svn

Anfo is a mapper in the spirit of Soap/Maq/Bowtie, but its implementation takes more after BLAST/BLAT. It's most useful for the alignment of sequencing reads where the DNA sequence is somehow modified (think ancient DNA or bisulphite treatment) and/or there is more divergence between sample and reference than what fast mappers will handle gracefully (say the reference genome is missing and a related species is used instead).

Aragorn
tRNA and tmRNA detection in nucleotide sequences
Versions of package aragorn
Release  Version   Architectures
jessie   1.2.36-4  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.2.37-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.2.37-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (65 upd.)*
License: DFSG free
Git

The program employs heuristic algorithms to predict tRNA secondary structure, based on homology with recognized tRNA consensus sequences and ability to form a base-paired cloverleaf. tmRNA genes are identified using a modified version of the BRUCE program.

Please cite: Dean Laslett and Bjorn Canback: ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. (PubMed,eprint) Nucleic Acids Research 32(1):11-16 (2004)
Arden
specificity control for read alignments using an artificial reference
Versions of package arden
Release  Version  Architectures
jessie   1.0-1    amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      1.0-1    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 2 users (2 upd.)*
License: DFSG free
Git

ARDEN (Artificial Reference Driven Estimation of false positives in NGS data) is a benchmark that estimates error rates based on real experimental reads and an additionally generated artificial reference genome. It allows the computation of error rates specifically for a dataset and the construction of a ROC curve. It can therefore be used to optimize parameters for read mappers, to select read mappers for a specific problem, or to filter alignments based on quality estimation.

Please cite: Sven H. Giese, Franziska Zickmann and Bernhard Y. Renard: Specificity control for read alignments using an artificial reference genome-guided false discovery rate. (PubMed,eprint) Bioinformatics 30(1):9-16 (2013)
Ariba
Antibiotic Resistance Identification By Assembly
Versions of package ariba
Release  Version  Architectures
stretch  1.0.1-1  all
sid      1.0.1-1  all
Popcon: 1 users (0 upd.)*
License: DFSG free
Git

ARIBA is a tool that identifies antibiotic resistance genes by running local assemblies. The input is a FASTA file of reference genes and paired sequencing reads. ARIBA reports which of the reference genes were found, plus detailed information on the quality of the assemblies and any variants between the sequencing reads and the reference genes.

Art-nextgen-simulation-tools
simulation tools to generate synthetic next-generation sequencing reads
Versions of package art-nextgen-simulation-tools
Release  Version          Architectures
stretch  20160605+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      20160605+dfsg-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (3 upd.)*
License: DFSG free
Git

ART is a set of simulation tools to generate synthetic next-generation sequencing reads. ART simulates sequencing reads by mimicking the real sequencing process with empirical error models or quality profiles summarized from large recalibrated sequencing data. ART can also simulate reads using a user's own read error model or quality profiles. ART supports simulation of single-end and paired-end/mate-pair reads of three major commercial next-generation sequencing platforms: Illumina's Solexa, Roche's 454 and Applied Biosystems' SOLiD. ART can be used to test or benchmark a variety of methods or tools for next-generation sequencing data analysis, including read alignment, de novo assembly, and SNP and structural variation discovery. ART was used as a primary tool for the simulation study of the 1000 Genomes Project. ART is implemented in C++ with optimized algorithms and is highly efficient in read simulation. ART outputs reads in the FASTQ format and alignments in the ALN format. ART can also generate alignments in the SAM alignment or UCSC BED file format. ART can be used together with genome variant simulators (e.g. VarSim) for evaluating variant calling tools or methods.

Please cite: Weichun Huang, Leping Li, Jason R. Myers and Gabor T. Marth: ART: a next-generation sequencing read simulator. (PubMed,eprint) Bioinformatics 28(4):593-594 (2012)
Artemis
genome browser and annotation tool
Versions of package artemis
Release  Version        Architectures
stretch  16.0.0+dfsg-5  all
sid      16.0.0+dfsg-5  all
Popcon: 1 users (0 upd.)*
License: DFSG free
Git

Artemis is a genome browser and annotation tool that allows visualisation of sequence features, next generation data and the results of analyses within the context of the sequence, and also its six-frame translation.

This package includes the Artemis genome browser, the Artemis Comparison Tool (ACT), and the DNAplotter and BamView utilities.

Please cite: Tim Carver, Simon R. Harris, Matthew Berriman, Julian Parkhill and Jacqueline A. McQuillan: Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. (PubMed,eprint) Bioinformatics 28(4):464-469 (2012)
Artfastqgenerator
outputs artificial FASTQ files derived from a reference genome
Versions of package artfastqgenerator
Release  Version         Architectures
stretch  0.0.20150519-1  all
sid      0.0.20150519-1  all
Popcon: 0 users (0 upd.)*
License: DFSG free
Git

ArtificialFastqGenerator takes the reference genome (in FASTA format) as input and outputs artificial FASTQ files in the Sanger format. It can accept Phred base quality scores from existing FASTQ files, and use them to simulate sequencing errors. Since the artificial FASTQs are derived from the reference genome, the reference genome provides a gold-standard for calling variants (Single Nucleotide Polymorphisms (SNPs) and insertions and deletions (indels)). This enables evaluation of a Next Generation Sequencing (NGS) analysis pipeline which aligns reads to the reference genome and then calls the variants.
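
The link between Phred quality scores and simulated sequencing errors can be sketched as follows. The Phred definition (Q = -10 * log10(P)) and the Sanger encoding (ASCII code minus 33) are standard; the substitution scheme in simulate_errors is an assumption made for illustration and is not taken from ArtificialFastqGenerator, and all function names are hypothetical.

```python
import random

BASES = "ACGT"


def decode_sanger_quality(quality_string):
    """Convert a Sanger-format FASTQ quality string to Phred scores."""
    return [ord(ch) - 33 for ch in quality_string]


def phred_to_error_probability(q):
    """Phred definition: Q = -10 * log10(P), hence P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def simulate_errors(read, quality_string, rng):
    """Replace each base with a different one with probability P(error).

    Assumed scheme for illustration: errors are substitutions only, and
    the replacement base is chosen uniformly from the three alternatives.
    """
    out = []
    for base, q in zip(read, decode_sanger_quality(quality_string)):
        if rng.random() < phred_to_error_probability(q):
            base = rng.choice([b for b in BASES if b != base])
        out.append(base)
    return "".join(out)


# 'I' encodes Q40 (error probability 0.0001); '!' encodes Q0 (probability 1).
rng = random.Random(42)
print(decode_sanger_quality("I!5"))
print(simulate_errors("ACGTACGT", "IIII!!II", rng))
```

Because the error-free read comes straight from the reference genome, any difference a pipeline later reports at a position that was not deliberately altered can be counted as a false call.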

Please cite: Matthew Frampton and Richard Houlston: Generation of Artificial FASTQ Files to Evaluate the Performance of Next-Generation Sequencing Pipelines. (PubMed,eprint) PLOSone 7(11):e49110 (2012)
Augustus
gene prediction in eukaryotic genomes
Versions of package augustus
Release  Version       Architectures
stretch  3.2.2+dfsg-1  amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
sid      3.2.2+dfsg-1  amd64,arm64,armel,i386,kfreebsd-amd64,kfreebsd-i386,mips64el,mipsel,ppc64el
Popcon: 1 users (0 upd.)*
License: DFSG free
Git

AUGUSTUS is software for gene prediction in eukaryotic genomic sequences that is based on a generalized hidden Markov model (HMM), a probabilistic model of a sequence and its gene structure. After learning gene structures from a reference annotation, AUGUSTUS uses the HMM to recognize genes in a new sequence and annotates it with the regions of identified genes. External hints, e.g. from RNA sequencing, EST or protein alignments, can be used to guide and improve the gene finding process. The result is the set of most likely gene structures that comply with all given user constraints, if such gene structures exist. AUGUSTUS already includes prebuilt HMMs for many species, as well as scripts to train custom models using annotated genomes.

Please cite: M. Stanke, O. Schöffmann, B. Morgenstern and S. Waack: Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. (PubMed,eprint) BMC Bioinformatics 7:62 (2006)
Autodock
analysis of ligand binding to protein structure
Versions of package autodock
Release  Version  Architectures
squeeze  4.2.3-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   4.2.3-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   4.2.6-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  4.2.6-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      4.2.6-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package autodock:
field: biology, biology:structural
interface: commandline
role: program
scope: utility
use: analysing
works-with: 3dmodel
Popcon: 12 users (47 upd.)*
License: DFSG free
Svn

AutoDock is a prime representative of the programs that simulate the docking of fairly small chemical ligands to rather big protein receptors. Earlier versions placed all flexibility in the ligand while the protein was kept rigid. Version 4 also allows flexibility of selected side chains of surface residues, i.e., it takes the rotamers into account.

The AutoDock program performs the docking of the ligand to a set of grids describing the target protein. AutoGrid pre-calculates these grids.

The package is enhanced by the following packages: autogrid
Please cite: Garrett M. Morris, Ruth Huey, William Lindstrom, Michel F. Sanner, Richard K. Belew, David S. Goodsell and Arthur J. Olson: AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. (PubMed) Journal of Computational Chemistry 30(16):2785-2791 (2009)
Autodock-vina
docking of small molecules to proteins
Versions of package autodock-vina
Release  Version  Architectures
wheezy   1.1.2-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.1.2-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.1.2-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.1.2-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 9 users (46 upd.)*
License: DFSG free
Svn

AutoDock Vina is a program to support drug discovery, molecular docking and virtual screening of compound libraries. It offers multi-core capability, high performance and enhanced accuracy and ease of use.

The same institute also developed autodock, which is widely used.

Please cite: Oleg Trott and Arthur J. Olson: AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. (eprint) Journal of Computational Chemistry 31(2):455-461 (2010)
Autogrid
pre-calculate binding of ligands to their receptor
Versions of package autogrid
Release  Version  Architectures
squeeze  4.2.3-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   4.2.3-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   4.2.6-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  4.2.6-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      4.2.6-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package autogrid:
field: biology, biology:structural
interface: commandline
role: program
scope: utility
use: analysing
works-with: 3dmodel
Popcon: 9 users (48 upd.)*
License: DFSG free
Svn

The AutoDock suite addresses the molecular analysis of the docking of smaller chemical compounds to receptors of known three-dimensional structure.

The AutoGrid program performs pre-calculations for the docking of a ligand to a set of grids that describe the effect that the protein has on point charges. The effect of these forces on the ligand is then analysed by the AutoDock program.

Please cite: Garrett M. Morris, Ruth Huey, William Lindstrom, Michel F. Sanner, Richard K. Belew, David S. Goodsell and Arthur J. Olson: AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. (PubMed) Journal of Computational Chemistry 30(16):2785-2791 (2009)
Axe-demultiplexer
Trie-based DNA sequencing read demultiplexer
Versions of package axe-demultiplexer
Release  Version  Architectures
stretch  0.3.1-4  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.3.1-4  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
License: DFSG free
Git

Axe very rapidly selects the optimal barcode present in a sequence read, even in the presence of sequencing errors. The algorithm is able to handle combinatorial barcoding, barcodes of differing length, and several mismatches per barcode.
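
A minimal Python sketch of trie-based barcode matching is shown below. It is not Axe's implementation: the nested-dict trie, the depth-first search with a mismatch budget, the tie-breaking rule, and the names build_trie and match_barcode are all assumptions chosen to illustrate how barcodes of differing length and a few mismatches can be handled in one structure.

```python
def build_trie(barcodes):
    """Build a nested-dict trie from {barcode: sample name}.

    A '$' key marks the end of a barcode and stores its sample name.
    """
    root = {}
    for barcode, sample in barcodes.items():
        node = root
        for base in barcode:
            node = node.setdefault(base, {})
        node["$"] = sample
    return root


def match_barcode(trie, read, max_mismatches=1):
    """Return (sample, barcode_length, mismatches) for the best barcode at
    the start of `read`, or None if no barcode is within the budget.

    Fewer mismatches wins; ties go to the longer barcode.
    """
    best = None
    stack = [(trie, 0, 0)]  # (node, position in read, mismatches so far)
    while stack:
        node, pos, mismatches = stack.pop()
        if "$" in node:
            candidate = (mismatches, -pos, node["$"])
            if best is None or candidate < best:
                best = candidate
        if pos >= len(read):
            continue
        for base, child in node.items():
            if base == "$":
                continue
            cost = mismatches + (base != read[pos])
            if cost <= max_mismatches:
                stack.append((child, pos + 1, cost))
    if best is None:
        return None
    mismatches, negative_length, sample = best
    return sample, -negative_length, mismatches


# Hypothetical barcodes of different lengths.
barcodes = {"ACGT": "sample1", "TTGCA": "sample2"}
trie = build_trie(barcodes)
print(match_barcode(trie, "ACGTGGGG"))
print(match_barcode(trie, "TTGGACCC"))
```

Sharing prefixes in the trie means a read is compared against all barcodes in a single walk, and branches are abandoned as soon as they exceed the mismatch budget.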

Ballview
free molecular modeling and molecular graphics tool
Versions of package ballview
Release   Version               Architectures
squeeze   1.3.2-2               amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    1.4.1+20111206-4      amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    1.4.2+20140406-1      amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid       1.4.2+20140406-1.1    arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,powerpc,ppc64el,s390x
sid       1.4.3~beta1-1         amd64,mips64el,mipsel
upstream  1.4.3-BETA1
Debtags of package ballview:
interface: x11
role: program
uitoolkit: qt
x11: application
Popcon: 18 users (8 upd.)*
Newer upstream!
License: DFSG free
Git

BALLView provides fast OpenGL-based visualization of molecular structures, molecular mechanics methods (minimization, MD simulation using the AMBER, CHARMM, and MMFF94 force fields), calculation and visualization of electrostatic properties (FDPB) and molecular editing features.

BALLView can be considered a graphical user interface on the basis of BALL (Biochemical Algorithms Library) with a focus on the most common demands of protein chemists and biophysicists in particular. It is developed in the groups of Hans-Peter Lenhof (Saarland University, Saarbruecken, Germany) and Oliver Kohlbacher (University of Tuebingen, Germany). BALL is an application framework in C++ that has been specifically designed for rapid software development in Molecular Modeling and Computational Molecular Biology. It provides an extensive set of data structures as well as classes for Molecular Mechanics, advanced solvation methods, comparison and analysis of protein structures, file import/export, and visualization.

Please cite: Andreas Moll, Andreas Hildebrandt, Hans-Peter Lenhof and Oliver Kohlbacher: BALLView: a tool for research and education in molecular modeling. (PubMed,eprint) Bioinformatics 22(3):365-366 (2006)
Bamtools
toolkit for manipulating BAM (genome alignment) files
Versions of package bamtools
ReleaseVersionArchitectures
jessie2.3.0+dfsg-2amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch2.4.0+dfsg-6amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2.4.0+dfsg-6amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 6 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

BamTools facilitates research analysis and data management using BAM files. It copes with the enormous amount of data produced by current sequencing technologies that is typically stored in compressed, binary formats that are not easily handled by the text-based parsers commonly used in bioinformatics research.

BamTools provides both a C++ API for BAM file support as well as a command-line toolkit.

This is the bamtools command-line toolkit.

Available bamtools commands:

 convert  Converts between BAM and a number of other formats
 count    Prints number of alignments in BAM file(s)
 coverage Prints coverage statistics from the input BAM file
 filter   Filters BAM file(s) by user-specified criteria
 header   Prints BAM header information
 index    Generates index for BAM file
 merge    Merge multiple BAM files into single file
 random   Select random alignments from existing BAM file(s), intended more
          as a testing tool.
 resolve  Resolves paired-end reads (marking the IsProperPair flag as needed)
 revert   Removes duplicate marks and restores original base qualities
 sort     Sorts the BAM file according to some criteria
 split    Splits a BAM file on user-specified property, creating a new BAM
          output file for each value found
 stats    Prints some basic statistics from input BAM file(s)
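
The resolve and revert commands above act on bits of the alignment FLAG field (the proper-pair and duplicate marks). As a rough illustration only, the sketch below decodes those bits using the values from the SAM specification. The helper names decode_flag and clear_duplicate are hypothetical and are not part of the bamtools API; the real revert command also restores base qualities, which is not shown here.

```python
# Bit values defined by the SAM specification for the FLAG field.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM/BAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def clear_duplicate(flag):
    """Clear the duplicate bit and leave every other bit untouched (hypothetical helper)."""
    return flag & ~0x400


if __name__ == "__main__":
    print(decode_flag(99))
    print(clear_duplicate(1123))
```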
Please cite: Derek W. Barnett, Erik K. Garrison, Aaron R. Quinlan, Michael P. Stromberg and Gabor T. Marth: BamTools: a C++ API and toolkit for analyzing and managing BAM files. (PubMed,eprint) Bioinformatics 27(12):1691-2 (2011)
Barrnap
rapid ribosomal RNA prediction
Versions of package barrnap
ReleaseVersionArchitectures
stretch0.7+dfsg-2all
sid0.7+dfsg-2all
Popcon: 2 users (45 upd.)*
Versions and Archs
License: DFSG free
Git

Barrnap (BAsic Rapid Ribosomal RNA Predictor) predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).

It takes FASTA DNA sequence as input, and writes GFF3 as output. It uses the NHMMER tool that comes with HMMER 3.1 for HMM searching in RNA:DNA style. Multithreading is supported and one can expect roughly linear speed-ups with more CPUs.
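
Since the output is GFF3, downstream scripts usually only need to split its nine tab-separated columns. The following is a minimal sketch of such a reader, assuming standard GFF3 layout; parse_gff3_line is a hypothetical helper, and the sample line is made up for illustration rather than taken from real Barrnap output.

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(cols))
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    # Column 9 holds semicolon-separated key=value pairs.
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key] = value
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),  # GFF3 coordinates are 1-based and inclusive
        "end": int(end),
        "score": score,
        "strand": strand,
        "phase": phase,
        "attributes": attributes,
    }


if __name__ == "__main__":
    # Made-up example line, not real Barrnap output.
    sample = "contig1\texample\trRNA\t100\t1600\t0\t+\t.\tName=16S_rRNA;product=16S ribosomal RNA"
    print(parse_gff3_line(sample))
```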

Bcftools
genomic variant calling and manipulation of VCF/BCF files
Versions of package bcftools
ReleaseVersionArchitectures
stretch1.3.1-1amd64,arm64,armel,mips64el,mipsel,ppc64el
sid1.3.1-1amd64,arm64,armel,kfreebsd-amd64,mips64el,mipsel,ppc64el
Popcon: 5 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
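
To illustrate what "transparently" can mean for the text-based case, here is a minimal sketch that reads a VCF whether or not it is compressed. It relies on BGZF being gzip-compatible, so the standard gzip module can read it. The functions open_vcf_text and count_records are hypothetical helpers, not BCFtools code, and binary BCF files are not handled.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip (and therefore BGZF) stream


def open_vcf_text(path):
    """Open a text VCF for reading, whether plain or gzip/BGZF-compressed."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_records(path):
    """Count data lines (non-empty lines not starting with '#') in a VCF file."""
    with open_vcf_text(path) as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))
```

Sniffing the magic bytes rather than trusting the file extension is a design choice of this sketch; it keeps the caller from having to know how the file was written.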

Beagle
Genotype calling, genotype phasing and imputation of ungenotyped markers
Versions of package beagle
ReleaseVersionArchitectures
stretch4.1~160616-7e4+dfsg-1all
sid4.1~160616-7e4+dfsg-1all
upstream23Jul16.fb0
Popcon: 8 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

Beagle performs genotype calling, genotype phasing, imputation of ungenotyped markers, and identity-by-descent segment detection. Genotypic imputation works on phased haplotypes using a Li and Stephens haplotype frequency model. Beagle also implements the Refined IBD algorithm for detecting homozygosity-by-descent (HBD) and identity-by-descent (IBD) segments.

The package is enhanced by the following packages: beagle-doc
Please cite: Sharon R. Browning and Brian L. Browning: Rapid and Accurate Haplotype Phasing and Missing-Data Inference for Whole-Genome Association Studies By Use of Localized Haplotype Clustering. (eprint) The American Journal of Human Genetics 81(5):1084-1097 (2007)
Beast-mcmc
Bayesian MCMC phylogenetic inference
Versions of package beast-mcmc
ReleaseVersionArchitectures
jessie1.8.0-1 (contrib)all
stretch1.8.4+dfsg.1-1all
sid1.8.4+dfsg.1-1all
Popcon: 0 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

BEAST is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted in proportion to its posterior probability. Included is a simple-to-use user-interface program for setting up standard analyses and a suite of programs for analysing the results.

The package is enhanced by the following packages: beast-mcmc-doc beast-mcmc-examples
Please cite: Alexei J Drummond and Andrew Rambaut: BEAST: Bayesian evolutionary analysis by sampling trees. (PubMed,eprint) BMC Evol Biol 8(7):214 (2007)
Beast2-mcmc
Bayesian MCMC phylogenetic inference
Versions of package beast2-mcmc
ReleaseVersionArchitectures
stretch2.4.2+dfsg-1all
sid2.4.2+dfsg-1all
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

BEAST is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted in proportion to its posterior probability. Included is a simple-to-use user-interface program for setting up standard analyses and a suite of programs for analysing the results.

This is not a new upstream version of beast-mcmc (1.x) but rather a rewritten version.

The package is enhanced by the following packages: beast2-mcmc-doc beast2-mcmc-examples
Please cite: Remco Bouckaert, Joseph Heled, Denise Kühnert, Tim Vaughan, Chieh-Hsi Wu, Dong Xie, Marc A. Suchard, Andrew Rambaut and Alexei J. Drummond: BEAST 2: A Software Platform for Bayesian Evolutionary Analysis. (PubMed,eprint) PLoS Comput Biol 10(4):e1003537 (2014)
Bedtools
suite of utilities for comparing genomic features
Versions of package bedtools
ReleaseVersionArchitectures
wheezy2.16.1-1amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie2.21.0-1amd64,arm64,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch2.25.0-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2.25.0-1armhf,hurd-i386,mips,powerpc,s390x
sid2.26.0+dfsg-1amd64,arm64,armel,i386,kfreebsd-amd64,kfreebsd-i386,mips64el,mipsel,ppc64el
Debtags of package bedtools:
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopesuite
useanalysing, comparing, converting, filtering
works-withbiological-sequence
Popcon: 31 users (68 upd.)*
Versions and Archs
License: DFSG free
Git

The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using BEDTools, one can develop sophisticated pipelines that answer complicated research questions by streaming several BEDTools together.
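
As a rough sketch of what "finding feature overlaps" involves, the code below computes the overlap between BED-style intervals, assuming the usual BED convention of 0-based, half-open coordinates. The functions overlap_length and intersect are hypothetical names for illustration; they are not how BEDTools is implemented, and the pairwise loop is far slower than a real interval index.

```python
def overlap_length(a, b):
    """Number of bases shared by two intervals given as (chrom, start, end) tuples."""
    if a[0] != b[0]:
        return 0
    # Half-open intervals: adjacent features (end == start) share no bases.
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def intersect(features_a, features_b):
    """Yield (a, b, overlap) for every overlapping pair; O(len(a) * len(b)) time."""
    for a in features_a:
        for b in features_b:
            shared = overlap_length(a, b)
            if shared > 0:
                yield a, b, shared


if __name__ == "__main__":
    genes = [("chr1", 100, 200), ("chr2", 50, 80)]
    peaks = [("chr1", 150, 250), ("chr1", 200, 300)]
    for a, b, shared in intersect(genes, peaks):
        print(a, b, shared)
```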

The groupBy utility is distributed in the filo package.

Please cite: Aaron R. Quinlan and Ira M. Hall: BEDTools: a flexible suite of utilities for comparing genomic features. (PubMed,eprint) Bioinformatics 26(6):841-842 (2010)
Berkeley-express
Streaming quantification for high-throughput sequencing
Versions of package berkeley-express
ReleaseVersionArchitectures
stretch1.5.1-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid1.5.1-2amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

eXpress is a streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences. Example applications include transcript-level RNA-Seq quantification, allele-specific/haplotype expression analysis (from RNA-Seq), transcription factor binding quantification in ChIP-Seq, and analysis of metagenomic data. It is based on an online-EM algorithm that results in space (memory) requirements proportional to the total size of the target sequences and time requirements that are proportional to the number of sampled fragments. Thus, in applications such as RNA-Seq, eXpress can accurately quantify much larger samples than other currently available tools, greatly reducing computing infrastructure requirements. eXpress can be used to build lightweight high-throughput sequencing processing pipelines when coupled with a streaming aligner (such as Bowtie), as output can be piped directly into eXpress, effectively eliminating the need to store read alignments in memory or on disk.

In an analysis of the performance of eXpress for RNA-Seq data, it was observed that this efficiency does not come at a cost of accuracy. eXpress is more accurate than other available tools, even when limited to smaller datasets that do not require such efficiency. Moreover, like the Cufflinks program, eXpress can be used to estimate transcript abundances in multi-isoform genes. eXpress is also able to resolve multi-mappings of reads across gene families, and does not require a reference genome so that it can be used in conjunction with de novo assemblers such as Trinity, Oases, or Trans-ABySS. The underlying model is based on previously described probabilistic models developed for RNA-Seq but is applicable to other settings where target sequences are sampled, and includes parameters for fragment length distributions, errors in reads, and sequence-specific fragment bias.

eXpress can be used to resolve ambiguous mappings in other high-throughput sequencing based applications. The only required inputs to eXpress are a set of target sequences and a set of sequenced fragments multiply-aligned to them. While these target sequences will often be gene isoforms, they need not be. Haplotypes can be used as the reference for allele-specific expression analysis, binding regions for ChIP-Seq, or target genomes in metagenomics experiments. eXpress is useful in any analysis where reads multi-map to sequences that differ in abundance.

Please cite: Adam Roberts and Lior Pachter: Streaming fragment assignment for real-time analysis of sequencing experiments. (PubMed) Nature Methods 10(1):71–73 (2013)
Bio-rainbow
clustering and assembling short reads for bioinformatics
Versions of package bio-rainbow
ReleaseVersionArchitectures
stretch2.0.4-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2.0.4-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Efficient tool for clustering and assembling short reads, especially for RAD.

Biomaj
biological data-bank updater
Versions of package biomaj
ReleaseVersionArchitectures
wheezy1.2.1-1all
jessie1.2.3-4all
stretch1.2.3-9all
sid1.2.3-9all
Debtags of package biomaj:
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
Popcon: 1 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

BioMAJ downloads remote data banks, checks their status and applies transformation workflows, with consistent state, to provide ready-to-use data for biologists and bioinformaticians. For example, it can transform original FASTA files into BLAST indexes. It is very flexible and its post-processing facilities can be extended very easily.

Please cite: Olivier Filangi, Yoann Beausse, Anthony Assi, Ludovic Legrand, Jean-Marc Larré, Véronique Martin, Olivier Collin, Christophe Caron, Hugues Leroy and David Allouche: BioMAJ: a flexible framework for databanks synchronization and processing. (PubMed,eprint) Oxford Journals Bioinformatics 24(16):1823-1825 (2008)
Blasr
mapping single-molecule sequencing reads
Versions of package blasr
ReleaseVersionArchitectures
stretch0~20151014+git8e668be-1amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
sid0~20151014+git8e668be-1amd64,arm64,armel,i386,kfreebsd-amd64,kfreebsd-i386,mips64el,mipsel,ppc64el
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Basic local alignment with successive refinement (BLASR) is a method for mapping single-molecule sequencing reads against a reference genome. Such reads are thousands of bases long, with divergence between them and the genome being dominated by insertion and deletion error.

Blast2
Basic Local Alignment Search Tool
Versions of package blast2
ReleaseVersionArchitectures
squeeze2.2.21.20090809-2amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy2.2.26.20120620-2amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie2.2.26.20120620-8amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch2.2.26.20120620-10amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2.2.26.20120620-10amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package blast2:
biologynuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
sciencecalculation
scopeutility
usesearching
works-withbiological-sequence
Popcon: 46 users (72 upd.)*
Versions and Archs
License: DFSG free
Git

The famous sequence alignment program. This is the "official" NCBI version, #2. The blastall executable allows you to give a nucleotide or protein sequence to the program. It is compared against databases and a summary of matches is returned to the user.

Note that databases are not included in Debian; they must be retrieved manually.

The package is enhanced by the following packages: mcl
Bowtie
Ultrafast memory-efficient short read aligner
Versions of package bowtie
ReleaseVersionArchitectures
wheezy0.12.7-3amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,powerpc,s390,s390x,sparc
jessie1.1.1-2amd64
stretch1.1.2-4amd64,arm64,ppc64el
sid1.1.2-4amd64,arm64,kfreebsd-amd64,ppc64el
Debtags of package bowtie:
biologynuceleic-acids
fieldbiology:bioinformatics
interfacecommandline
roleprogram
sciencecalculation
scopeutility
useanalysing, comparing
works-withbiological-sequence
Popcon: 25 users (54 upd.)*
Versions and Archs
License: DFSG free
Svn

This package addresses the problem of interpreting the results from the latest (2010) DNA sequencing technologies. These yield fairly short stretches of sequence that cannot be interpreted directly. The challenge for tools like Bowtie is to assign a chromosomal location to each short stretch of DNA sequenced per run.

Bowtie aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
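
To give a feel for how a Burrows-Wheeler index supports searching, here is a toy sketch of the transform and of exact-match counting by backward search. It is a textbook illustration under simplifying assumptions, not Bowtie's implementation: it builds the transform by sorting all rotations (impractical for a genome), handles exact matches only, and uses hypothetical function names.

```python
def bwt(text):
    """Burrows-Wheeler transform of text, using '$' as the end-of-text marker."""
    text += "$"  # '$' sorts before the letters A, C, G, T in ASCII
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_string, pattern):
    """Count exact occurrences of pattern using backward search on the BWT."""
    # first[c] = index of the first row (in sorted order) that starts with c
    first = {}
    for i, c in enumerate(sorted(bwt_string)):
        first.setdefault(c, i)
    lo, hi = 0, len(bwt_string)  # half-open range of rows matching the suffix so far
    for c in reversed(pattern):
        if c not in first:
            return 0
        # A real index stores sampled occurrence counts instead of rescanning.
        lo = first[c] + bwt_string[:lo].count(c)
        hi = first[c] + bwt_string[:hi].count(c)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = bwt("ACGTACGT")
    print(count_occurrences(index, "ACG"))
```

The memory saving mentioned above comes from the fact that only the transformed string and small auxiliary tables are needed to search, rather than a full suffix array or hash of the genome.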

The package is enhanced by the following packages: bowtie-examples
Please cite: Ben Langmead, Cole Trapnell, Mihai Pop and Steven L Salzberg: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. (eprint) Genome Biology 10:R25 (2009)
Bowtie2
ultrafast memory-efficient short read aligner
Versions of package bowtie2
ReleaseVersionArchitectures
wheezy2.0.0-beta6-3amd64,i386,kfreebsd-amd64,kfreebsd-i386
jessie2.2.4-1amd64
sid2.2.5-2kfreebsd-amd64
stretch2.2.9-3amd64
sid2.2.9-3amd64
Popcon: 14 users (63 upd.)*
Versions and Archs
License: DFSG free
Git

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes.

Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.

The package is enhanced by the following packages: bowtie2-examples
Please cite: Ben Langmead and Steven L Salzberg: Fast gapped-read alignment with Bowtie 2. (PubMed) Nature Methods 9:357–359 (2012)
Boxshade
Pretty-printing of multiple sequence alignments
Versions of package boxshade
ReleaseVersionArchitectures
squeeze3.3.1-4amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy3.3.1-7+wheezy1amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie3.3.1-8amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch3.3.1-9amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid3.3.1-9amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package boxshade:
biologyformat:aln, nuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
usetypesetting
works-with-formathtml, plaintext, postscript, tex
Popcon: 6 users (49 upd.)*
Versions and Archs
License: DFSG free
Svn

Boxshade is a program for creating good looking printouts from multiple-aligned protein or DNA sequences. The program does not perform the alignment by itself and requires as input a file that was created by a multiple alignment program or manually edited with respective tools.

Boxshade reads multiple-aligned sequences from PILEUP-MSF, CLUSTAL-ALN, MALIGNED-data and ESEE-save files (limited to a maximum of 150 sequences with up to 10000 elements each). Various kinds of shading can be applied to identical/similar residues. Output is written to screen or to a file in the following formats: ANSI/VT100, PS/EPS, RTF, HPGL, ReGIS, LJ250-printer, ASCII, xFIG, PICT, HTML.

Brig
BLAST Ring Image Generator
Versions of package brig
ReleaseVersionArchitectures
stretch0.95+dfsg-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid0.95+dfsg-1amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

BRIG can display circular comparisons between a large number of genomes, with a focus on handling genome assembly data.

  • Images show similarity between a central reference sequence and other sequences as concentric rings.
  • BRIG will perform all BLAST comparisons and file parsing automatically via a simple GUI.
  • Contig boundaries and read coverage can be displayed for draft genomes; customized graphs and annotations can be displayed.
  • Using a user-defined set of genes as input, BRIG can display gene presence, absence, truncation or sequence variation in a set of complete genomes, draft genomes or even raw, unassembled sequence data.
  • BRIG also accepts SAM-formatted read-mapping files, enabling genomic regions present in unassembled sequence data from multiple samples to be compared simultaneously.
Please cite: Nabil-Fareed Alikhan, Nicola K Petty, Nouri L Ben Zakour and Scott A Beatson: BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. (PubMed,eprint) BMC Genomics 12:402 (2011)
Bwa
Burrows-Wheeler Aligner
Versions of package bwa
ReleaseVersionArchitectures
squeeze0.5.8c-1amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy0.6.2-1amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie0.7.10-1amd64
stretch0.7.15-2amd64
sid0.7.15-2amd64,kfreebsd-amd64
Debtags of package bwa:
biologynuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline, text-mode
roleprogram
useanalysing, comparing
Popcon: 30 users (92 upd.)*
Versions and Archs
License: DFSG free
Git

BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.

Please cite: Heng Li and Richard Durbin: Fast and accurate short read alignment with Burrows-Wheeler transform. (PubMed,eprint) Bioinformatics 25(14):1754-1760 (2009)
Cassiopee
index and search tool in genomic sequences
Versions of package cassiopee
ReleaseVersionArchitectures
jessie1.0.1+dfsg-3amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el
stretch1.0.4+dfsg-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid1.0.4+dfsg-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (48 upd.)*
Versions and Archs
License: DFSG free
Svn

C implementation of the Cassiopee index and search library. It is a complete rewrite of the Ruby Cassiopee gem. It scans an input genomic sequence (DNA/RNA/protein) and searches for a subsequence, either as an exact match or allowing substitutions (Hamming distance) and/or insertions/deletions.
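
The substitution-tolerant search can be pictured with a naive sliding-window scan, sketched below. This is only an illustration of matching under a Hamming-distance limit; hamming_search is a hypothetical name, the scan does not use an index as Cassiopee does, and insertions/deletions are not covered.

```python
def hamming_search(sequence, pattern, max_substitutions=0):
    """Return start positions where pattern matches with at most max_substitutions mismatches."""
    hits = []
    width = len(pattern)
    for start in range(len(sequence) - width + 1):
        mismatches = 0
        for a, b in zip(sequence[start:start + width], pattern):
            if a != b:
                mismatches += 1
                if mismatches > max_substitutions:
                    break  # too many substitutions, give up on this window
        else:
            hits.append(start)  # loop finished without exceeding the limit
    return hits


if __name__ == "__main__":
    print(hamming_search("ACGTACGA", "ACGT", max_substitutions=1))
```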

This package contains the cassiopee and cassiopeeknife tools.

Cd-hit
suite of programs designed to quickly group sequences
Versions of package cd-hit
ReleaseVersionArchitectures
wheezy4.6-2012-04-25-1amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie4.6.1-2012-08-27-2amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch4.6.5-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid4.6.5-2amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream4.6.6
Popcon: 12 users (57 upd.)*
Newer upstream!
License: DFSG free
Svn

cd-hit contains a number of programs designed to quickly group sequences. cd-hit groups proteins into clusters that meet a user-defined similarity threshold. cd-hit-est is similar to cd-hit, but designed to group nucleotide sequences (without introns). cd-hit-est-2d is similar to cd-hit-2d but designed to compare two nucleotide datasets. A number of other related programs are also in this package. Please see the cd-hit user manual, also part of this package, for further information.
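
Grouping sequences against a similarity threshold can be sketched as a greedy incremental clustering, which is how CD-HIT is commonly described: sequences are processed longest first, and each one either joins the first cluster whose representative it resembles closely enough or founds a new cluster. The code below assumes that scheme. Its identity function is a deliberately crude, ungapped stand-in rather than CD-HIT's alignment-based measure, and all names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions over the shorter sequence (ungapped stand-in)."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / shorter


def greedy_cluster(sequences, threshold):
    """Cluster sequences greedily; return a list of (representative, members) pairs."""
    clusters = []
    # Longest sequences first, so representatives are the longest in their cluster.
    for seq in sorted(sequences, key=len, reverse=True):
        for representative, members in clusters:
            if identity(representative, seq) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))
    return clusters


if __name__ == "__main__":
    proteins = ["MKTAYIAKQR", "MKTAYIAKQK", "GGGGGGGGGG"]
    for representative, members in greedy_cluster(proteins, 0.8):
        print(representative, members)
```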

Cdbfasta
Constant DataBase indexing and retrieval tools for multi-FASTA files
Versions of package cdbfasta
ReleaseVersionArchitectures
jessie0.99-20100722-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch0.99-20100722-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid0.99-20100722-2amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (48 upd.)*
Versions and Archs
License: DFSG free
Svn

CDB (Constant DataBase) can be used for creating indices for quick retrieval of any particular sequences from large multi-FASTA files. It has the option to compress data records in order to save space.
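
The idea of indexing a multi-FASTA file for quick retrieval can be sketched as a map from record ID to byte offset, so that a record is fetched with a single seek instead of a full scan. The code below is a simplified illustration with hypothetical function names. It keeps the index in memory rather than in an on-disk constant database, and it does not compress records as cdbfasta can.

```python
def build_index(path):
    """Map each record ID (first word after '>') to the byte offset of its header line."""
    index = {}
    with open(path, "rb") as handle:
        offset = 0
        for line in handle:
            if line.startswith(b">"):
                fields = line[1:].split()
                if fields:
                    index[fields[0].decode()] = offset
            offset += len(line)
    return index


def fetch(path, index, record_id):
    """Return the full FASTA record (header plus sequence lines) for record_id."""
    lines = []
    with open(path, "rb") as handle:
        handle.seek(index[record_id])  # jump straight to the header
        lines.append(handle.readline())
        for line in handle:
            if line.startswith(b">"):
                break  # next record begins here
            lines.append(line)
    return b"".join(lines).decode()
```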

Cgview
Circular Genome Viewer
Versions of package cgview
ReleaseVersionArchitectures
stretch0.0.20100111-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid0.0.20100111-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

CGView is a Java package for generating high quality, zoomable maps of circular genomes. Its primary purpose is to serve as a component of sequence annotation pipelines, as a means of generating visual output suitable for the web. Feature information and rendering options are supplied to the program using an XML file, a tab delimited file, or an NCBI ptt file. CGView converts the input into a graphical map (PNG, JPG, or Scalable Vector Graphics format), complete with labels, a title, legends, and footnotes. In addition to the default full view map, the program can generate a series of hyperlinked maps showing expanded views. The linked maps can be explored using any web browser, allowing rapid genome browsing, and facilitating data sharing. The feature labels in maps can be hyperlinked to external resources, allowing CGView maps to be integrated with existing web site content or databases.

In addition to the CGView application, an API is available for generating maps from within other Java applications, using the cgview package.

CGView can be used for any of the following:

  • Bacterial genome visualization and browsing - CGView can be incorporated into bacterial genome annotation pipelines, as a means of generating web content for data visualization and navigation. The PNG and image map content does not require Java applets or special browser plugins.
  • Genome poster generation - CGView can generate poster-sized images of circular genomes in rasterized image formats or in Scalable Vector Graphics format.
  • Sequence analysis visualization - CGView can be used to display the output of sequence analysis programs in a circular context.

CGView features:

  • Images can be generated in PNG, JPG, or SVG format. See the CGView gallery.
  • Static or interactive maps can be generated. The interactive maps make use of standard PNG images and HTML image maps. Scalable Vector Graphics output is included in the interactive maps (see example).
  • The XML input allows complete control over the appearance of the map.
  • Tab delimited input files and NCBI ptt files can be used as an alternative to the XML format.
  • The CGView API can be used to incorporate CGView into Java applications.
  • The CGView applet can be used to incorporate zoomable maps into web pages (see example).
  • The CGView Server can be used to generate maps online.
Please cite: Paul Stothard and David S. Wishart: Circular genome visualization and exploration using CGView. (PubMed,eprint) Bioinformatics 21(4):537-539 (2004)
Chimeraslayer
detects likely chimeras in PCR amplified DNA
Versions of package chimeraslayer
ReleaseVersionArchitectures
jessie20101212+dfsg-1all
stretch20101212+dfsg1-1all
sid20101212+dfsg1-1all
Debtags of package chimeraslayer:
biologyformat:aln, nuceleic-acids
fieldbiology, biology:molecular
roleprogram
scopeutility
Popcon: 4 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

ChimeraSlayer is a chimeric sequence detection utility, compatible with near-full length Sanger sequences and shorter 454-FLX sequences (~500bp).

Chimera Slayer involves the following series of steps that operate to flag chimeric 16S rRNA sequences:

 1. the ends of a query sequence are searched against an included
    database of reference chimera-free 16S sequences to identify potential
    parents of a chimera
 2. candidate parents of a chimera are selected as those that form a
    branched best scoring alignment to the NAST-formatted query sequence
 3. the NAST alignment of the query sequence is improved in a
    ‘chimera-aware’ profile-based NAST realignment to the selected
    reference parent sequences
 4. an evolutionary framework is used to flag query sequences found to
    exhibit greater sequence homology to an in silico chimera formed
    between any two of the selected reference parent sequences.

To run Chimera Slayer, you need NAST-formatted sequences generated by the nast-ier utility.

ChimeraSlayer is part of the microbiomeutil suite.

The package is enhanced by the following packages: microbiomeutil-data
Please cite: Brian J. Haas, Dirk Gevers, Ashlee M. Earl, Mike Feldgarden, Doyle V. Ward, Georgia Giannoukos, Dawn Ciulla, Diana Tabbaa, Sarah K. Highlander, Erica Sodergren, Barbara Methé, Todd Z. DeSantis, The Human Microbiome Consortium, Joseph F. Petrosino, Rob Knight and Bruce W. Birren: Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. (PubMed,eprint) Genome Research 21(3):494-504 (2011)
Circlator
circularize genome assemblies
Versions of package circlator
ReleaseVersionArchitectures
stretch1.2.1-1all
sid1.2.1-1all
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Circlator is a tool to automate assembly circularization for bacterial and small eukaryotic genomes and produce accurate linear representations of circular sequences.

Please cite: Martin Hunt, Nishadi De Silva, Thomas D. Otto, Julian Parkhill, Jacqueline A. Keane and Simon R. Harris: Circlator: automated circularization of genome assemblies using long sequencing reads. (PubMed) Genome Biology 29(16):294 (2015)
Circos
plotter for visualizing data
Versions of package circos
ReleaseVersionArchitectures
wheezy0.61-3all
jessie0.66-1all
stretch0.69.2+dfsg-1all
sid0.69.2+dfsg-1all
upstream0.69.3
Debtags of package circos:
fieldbiology:bioinformatics
roleprogram
useviewing
Popcon: 15 users (52 upd.)*
Newer upstream!
License: DFSG free
Git

Circos visualizes data in a circular layout — ideal for exploring relationships between objects or positions, and creating highly informative publication-quality graphics.

This package provides the Circos plotting engine, which is command-line driven (like gnuplot) and fully scriptable.

Please cite: Martin I Krzywinski, Jacqueline E Schein, Inanc Birol, Joseph Connors, Randy Gascoyne, Doug Horsman, Steven J Jones and Marco A Marra: Circos: An information aesthetic for comparative genomics. (PubMed,eprint) Genome Research 19(9):1639-45 (2009)
Clearcut
extremely efficient phylogenetic tree reconstruction
Versions of package clearcut
ReleaseVersionArchitectures
jessie1.0.9-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch1.0.9-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid1.0.9-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 8 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

Clearcut is the reference implementation for the Relaxed Neighbor Joining (RNJ) algorithm by J. Evans, L. Sheneman, and J. Foster from the Initiative for Bioinformatics and Evolutionary Studies (IBEST) at the University of Idaho.

Please cite: Jason Evans, Luke Sheneman and James A. Foster: Relaxed Neighbor-Joining: A Fast Distance-Based Phylogenetic Tree Construction Method. (PubMed) J. Mol. Evol. 62(6):785-792 (2006)
Clonalframe
inference of bacterial microevolution using multilocus sequence data
Versions of package clonalframe
ReleaseVersionArchitectures
wheezy1.2-3amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie1.2-3amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid1.2-4hurd-i386
stretch1.2-5amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid1.2-5amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package clonalframe:
roleprogram
Popcon: 4 users (49 upd.)*
Versions and Archs
License: DFSG free
Svn

ClonalFrame identifies the clonal relationships between the members of a sample, while also estimating the chromosomal position of homologous recombination events that have disrupted the clonal inheritance.

ClonalFrame can be applied to any kind of sequence data, from a single fragment of DNA to whole genomes. It is well suited for the analysis of MLST data, where 7 gene fragments have been sequenced, but becomes progressively more powerful as the sequenced regions increase in length and number up to whole genomes. However, it requires the sequences to be aligned. If you have genomic data that is not aligned, it is recommended to use Mauve, which produces alignments of whole bacterial genomes in exactly the format required for analysis with ClonalFrame.

Please cite: Xavier Didelot and Daniel Falush: Inference of Bacterial Microevolution Using Multilocus Sequence Data. (PubMed,eprint) Genetics Advance 175:1251-1266 (2006)
Clustalo
General purpose multiple sequence alignment program for proteins
Versions of package clustalo
Release   Version  Architectures
wheezy    1.1.0-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    1.2.1-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   1.2.2-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       1.2.2-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  1.2.3
Popcon: 20 users (91 upd.)*
Newer upstream!
License: DFSG free
Git

Clustal-Omega is a general purpose multiple sequence alignment (MSA) program for DNA/RNA/proteins. It produces high quality MSAs and is capable of handling data sets of hundreds of thousands of sequences in reasonable time.

Please cite: Fabian Sievers, Andreas Wilm, David Dineen, Toby J Gibson, Kevin Karplus, Weizhong Li, Rodrigo Lopez, Hamish McWilliam, Michael Remmert, Johannes Söding, Julie D Thompson and Desmond G Higgins: Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. (PubMed) Molecular Systems Biology 7(539) (2011)
Clustalw
global multiple nucleotide or peptide sequence alignment
Versions of package clustalw
Release  Version              Architectures
squeeze  2.0.12-1 (non-free)  amd64,armel,i386,ia64,mips,mipsel,powerpc,s390,sparc
wheezy   2.1+lgpl-2           amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.1+lgpl-4           amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.1+lgpl-5           amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.1+lgpl-5           amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package clustalw:
biology: format:aln, nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: commandline, text-mode
role: program
scope: utility
use: comparing
works-with-format: plaintext
Popcon: 23 users (77 upd.)*
Versions and Archs
License: DFSG free
Git

This program performs an alignment of multiple nucleotide or amino acid sequences. It recognizes the format of input sequences and whether the sequences are nucleic acid (DNA/RNA) or amino acid (proteins). The output format may be selected from various formats for multiple alignments, such as Phylip or FASTA. Clustal W is very well accepted.

The output of Clustal W can be edited manually, but preferably with an alignment editor like SeaView or within its companion Clustal X. A model built from your alignment can be applied for improved database searches. The Debian package hmmer creates such a model in the form of an HMM.

The package is enhanced by the following packages: clustalw-mpi
Please cite: M. A. Larkin, G. Blackshields, N. P. Brown, R. Chenna, P. A. McGettigan, H. McWilliam, F. Valentin, I.M. Wallace, A. Wilm, R. Lopez, J. D. Thompson, T. J. Gibson and D. G. Higgins: Clustal W and Clustal X version 2.0. (PubMed,eprint) Bioinformatics 23(21):2947-2948 (2007)
Clustalx
Multiple alignment of nucleic acid and protein sequences (graphical interface)
Versions of package clustalx
Release  Version            Architectures
squeeze  1.83-4 (non-free)  amd64,armel,i386,ia64,mips,mipsel,powerpc,s390,sparc
wheezy   2.1+lgpl-2         amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.1+lgpl-3         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      2.1+lgpl-4         hurd-i386
stretch  2.1+lgpl-5         amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.1+lgpl-5         amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package clustalx:
biology: format:aln, nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: x11
role: program
scope: utility
uitoolkit: motif
use: analysing, comparing, viewing
works-with-format: plaintext
x11: application
Popcon: 72 users (38 upd.)*
Versions and Archs
License: DFSG free
Git

This package offers a graphical interface for the Clustal multiple sequence alignment program. It provides an integrated environment for performing multiple sequence and profile alignments and analysing the results. The sequence alignment is displayed in a window on the screen. A versatile coloring scheme has been incorporated to highlight conserved features in the alignment. For professional presentations, one should use the texshade LaTeX package or boxshade.

The pull-down menus at the top of the window allow you to select all the options required for traditional multiple sequence and profile alignment. You can cut-and-paste sequences to change the order of the alignment; you can select a subset of sequences to be aligned; you can select a sub-range of the alignment to be realigned and inserted back into the original alignment.

An alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted.

Please cite: M.A. Larkin, G. Blackshields, N.P. Brown, R. Chenna, P.A. McGettigan, H. McWilliam, F. Valentin, I.M. Wallace, A. Wilm, R. Lopez, J.D. Thompson, T.J. Gibson and D.G. Higgins: Clustal W and Clustal X version 2.0. (PubMed,eprint) Bioinformatics 23(21):2947-2948 (2007)
Codonw
Correspondence Analysis of Codon Usage
Versions of package codonw
Release  Version  Architectures
stretch  1.4.4-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.4.4-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

CodonW is a package for codon usage analysis. It was designed to simplify Multivariate Analysis (MVA) of codon usage. The MVA method employed in CodonW is correspondence analysis (COA) (the most popular MVA method for codon usage analysis). CodonW can generate a COA for codon usage, relative synonymous codon usage or amino acid usage. Additional analyses of codon usage include investigation of optimal codons, codon and dinucleotide bias, and/or base composition. CodonW analyses sequences encoded by genetic codes other than the universal code.
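
As a rough illustration of one quantity named above: relative synonymous codon usage (RSCU) is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is not CodonW's code. The function name `rscu` and the deliberately partial, two-family codon table are assumptions made to keep the example short.

```python
from collections import Counter

# Partial codon table for illustration only (standard genetic code):
# lysine and phenylalanine each have two synonymous codons.
SYNONYMOUS_FAMILIES = {
    "Lys": ["AAA", "AAG"],
    "Phe": ["TTT", "TTC"],
}


def rscu(sequence):
    """Return {codon: RSCU} for the codon families listed above."""
    sequence = sequence.upper()
    # Split into in-frame codons, ignoring a trailing partial codon.
    usable = len(sequence) - len(sequence) % 3
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    values = {}
    for family in SYNONYMOUS_FAMILIES.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined, so skip it
        expected = total / len(family)  # count under equal usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values
```

A value above 1 means the codon is used more often than equal usage would predict, and a value below 1 means it is used less often.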

Concavity
predictor of protein ligand binding sites from structure and conservation
Versions of package concavity
Release  Version       Architectures
jessie   0.1-2         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.1+dfsg.1-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.1+dfsg.1-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

ConCavity predicts protein ligand binding sites by combining evolutionary sequence conservation and 3D structure.

ConCavity takes as input a PDB format protein structure and optionally files that characterize the evolutionary sequence conservation of the chains in the structure file.

The following result files are produced by default:

  • Residue ligand binding predictions for each chain (*.scores).
  • Residue ligand binding predictions in a PDB format file (residue scores placed in the temp. factor field, *_residue.pdb).
  • Pocket prediction locations in a DX format file (*.dx).
  • PyMOL script to visualize the predictions (*.pml).
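
Because the residue scores are stored in the temperature factor field, they can be read back with a few lines of code. The following is a minimal sketch, assuming standard fixed-column PDB ATOM records (chain ID in column 22, residue number in columns 23-26, temperature factor in columns 61-66). The function name `residue_scores` is hypothetical and not part of ConCavity.

```python
def residue_scores(pdb_lines):
    """Map (chain, residue number) -> value of the temperature factor field."""
    scores = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, TER and other record types
        chain = line[21]              # column 22: chain identifier
        resnum = int(line[22:26])     # columns 23-26: residue sequence number
        bfactor = float(line[60:66])  # columns 61-66: temperature factor
        # Every atom of a residue is assumed to carry the same score,
        # so later atoms simply overwrite earlier ones.
        scores[(chain, resnum)] = bfactor
    return scores
```

For a file on disk, something like `residue_scores(open("example_residue.pdb"))` would work, where the file name is only a placeholder.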
The package is enhanced by the following packages: conservation-code
Please cite: John A. Capra, Roman A. Laskowski, Janet M. Thornton, Mona Singh and Thomas A. Funkhouser: Predicting Protein Ligand Binding Sites by Combining Evolutionary Sequence Conservation and 3D Structure. (PubMed) PLoS Computational Biology 5(12):e1000585 (2009)
Conservation-code
protein sequence conservation scoring tool
Versions of package conservation-code
Release  Version       Architectures
jessie   20110309.0-3  all
stretch  20110309.0-5  all
sid      20110309.0-5  all
Popcon: 5 users (64 upd.)*
Versions and Archs
License: DFSG free
Svn

This package provides score_conservation(1), a tool to score protein sequence conservation.

The following conservation scoring methods are implemented:

  • sum of pairs
  • weighted sum of pairs
  • Shannon entropy
  • Shannon entropy with property groupings (Mirny and Shakhnovich 1995, Valdar and Thornton 2001)
  • relative entropy with property groupings (Williamson 1995)
  • von Neumann entropy (Caffrey et al 2004)
  • relative entropy (Samudrala and Wang 2006)
  • Jensen-Shannon divergence (Capra and Singh 2007)

A window-based extension that incorporates the estimated conservation of sequentially adjacent residues into the score for each column is also given. This window approach can be applied to any of the conservation scoring methods.
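
To make the idea concrete, here is a minimal sketch of a Shannon entropy column score combined with a window-based smoothing step. It is an illustration under stated assumptions, not the code of score_conservation(1): gaps are simply ignored, the score is normalised so that 1.0 means fully conserved, and the names `shannon_conservation`, `window_scores`, `half_width` and `lam` are hypothetical.

```python
import math
from collections import Counter


def shannon_conservation(column):
    """Score one alignment column: 1.0 = fully conserved, 0.0 = maximally diverse."""
    residues = [r for r in column if r != "-"]  # gaps are ignored in this sketch
    if not residues:
        return 0.0
    counts = Counter(residues)
    n = len(residues)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    max_entropy = math.log2(min(n, 20))  # at most 20 distinct amino acids
    return 1.0 if max_entropy == 0 else 1.0 - entropy / max_entropy


def score_alignment(sequences):
    """Score every column of a list of equal-length aligned sequences."""
    return [shannon_conservation(col) for col in zip(*sequences)]


def window_scores(scores, half_width=1, lam=0.5):
    """Blend each column score with the mean score of its sequence neighbours."""
    smoothed = []
    for i, s in enumerate(scores):
        lo = max(0, i - half_width)
        hi = min(len(scores), i + half_width + 1)
        neighbours = scores[lo:i] + scores[i + 1:hi]
        if neighbours:
            s = (1 - lam) * s + lam * sum(neighbours) / len(neighbours)
        smoothed.append(s)
    return smoothed
```

Since `window_scores` only takes a list of per-column scores, the same smoothing could be applied to any of the scoring methods listed above.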

The program accepts alignments in the CLUSTAL and FASTA formats.

The sequence-specific output can be used as the conservation input for concavity.

Conservation is highly predictive in identifying catalytic sites and residues near bound ligands.

Please cite: John A. Capra and Mona Singh: Predicting functionally important residues from sequence conservation. (PubMed) Bioinformatics 23(15):1875-82 (2007)
Cutadapt
Clean biological sequences from high-throughput sequencing reads
Versions of package cutadapt
Release  Version  Architectures
stretch  1.10-2   all
sid      1.10-2   all
Popcon: 1 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

Cutadapt helps with biological sequence cleaning tasks by finding adapter or primer sequences in an error-tolerant way. It can also modify and filter reads in various ways. Adapter sequences can contain IUPAC wildcard characters. Paired-end reads and even colorspace data are also supported. If you want, you can also just demultiplex your input data, without removing adapter sequences at all.
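
The notion of error-tolerant adapter removal can be sketched as follows. This is a simplified illustration, not Cutadapt's algorithm: it counts mismatches only (no insertions or deletions), handles only a 3' adapter, and treats only `N` in the adapter as a wildcard. The function name and the parameters `max_error_rate` and `min_overlap` are assumptions chosen for this example.

```python
def trim_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Return the read with a 3' adapter (full, or partial at the read end) removed."""
    for start in range(len(read)):
        # Part of the adapter that fits between this position and the read end.
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # remaining overlaps are too short to be trusted
        mismatches = sum(
            1
            for r, a in zip(read[start:start + overlap], adapter)
            if a != "N" and r != a  # 'N' in the adapter matches any base
        )
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]  # keep only the sequence before the adapter
    return read
```

With the defaults, a 13-base adapter tolerates one mismatch, while a partial match shorter than 10 bases must be exact.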

This package contains the user interface.

Please cite: Marcel Martin: Cutadapt removes adapter sequences from high-throughput sequencing reads. (eprint) EMBnet.journal 17(1):10-12 (2011)
Daligner
local alignment discovery between long nucleotide sequencing reads
Versions of package daligner
Release  Version         Architectures
stretch  1.0+20151214-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.0+20151214-1  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

These tools permit one to find all significant local alignments between reads encoded in a Dazzler database. The assumption is that the reads are from a Pacific Biosciences RS II long read sequencer. That is, the reads are long and noisy, with error rates of up to 15% on average.

Please cite: Gene Myers: Efficient Local Alignment Discovery amongst Noisy Long Reads. 8701:52-67 (2014)
Dawg
program to simulate the evolution of recombinant DNA sequences
Versions of package dawg
Release  Version  Architectures
stretch  1.2-1    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.2-1    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

DNA Assembly with Gaps (Dawg) is an application designed to simulate the evolution of recombinant DNA sequences in continuous time based on the robust general time reversible model with gamma and invariant rate heterogeneity and a novel length-dependent model of gap formation. The application accepts phylogenies in Newick format and can return the sequence of any node, allowing for the exact evolutionary history to be recorded at the discretion of users. Dawg records the gap history of every lineage to produce the true alignment in the output. Many options are available to allow users to customize their simulations and results.

Dazzdb
manage nucleotide sequencing read data
Versions of package dazzdb
Release  Version  Architectures
stretch  1.0-1    amd64,arm64,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.0-1    amd64,arm64,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

To facilitate the multiple phases of the dazzler assembler, all the read data is organized into what is effectively a database of the reads and their meta-information. The design goals for this database are as follows:

  • The database stores the source PacBio read information in such a way that it can re-create the original input data, thus permitting a user to remove the (effectively redundant) source files. This avoids duplicating the same data, once in the source file and once in the database.
  • The database can be built up incrementally, that is, new sequence data can be added to the database over time.
  • The database flexibly allows one to store any meta-data desired for reads. This is accomplished with the concept of tracks that implementors can add as they need them.
  • The data is held in a compressed form equivalent to the .dexta and .dexqv files of the data extraction module. Both the .fasta and .quiva information for each read is held in the database and can be recreated from it. The .quiva information can be added separately and later on if desired.
  • To facilitate job-parallel, cluster operation of the phases of the assembler, the database has a concept of a current partitioning in which all the reads that are over a given length and optionally unique to a well are divided up into blocks containing roughly a given number of bases, except possibly the last block, which may have a short count. Often programs can be run on blocks or pairs of blocks, and each such job is reasonably well balanced as the blocks are all the same size. One must be careful about changing the partition during an assembly, as doing so can void the structural validity of any interim block-based results.
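
The partitioning idea in the last point can be sketched in a few lines. This is only an illustration of the concept, not the dazzdb implementation: the function name `partition_reads`, its parameters, and the choice to close a block as soon as it reaches the target number of bases are all assumptions. The optional "unique to a well" filter is left out.

```python
def partition_reads(read_lengths, block_size, min_length=0):
    """Group read indices into consecutive blocks of roughly block_size bases."""
    blocks, current, bases = [], [], 0
    for index, length in enumerate(read_lengths):
        if length < min_length:
            continue  # reads under the length cutoff are left out of the partition
        current.append(index)
        bases += length
        if bases >= block_size:
            blocks.append(current)  # block is full, start a new one
            current, bases = [], 0
    if current:
        blocks.append(current)  # the last block may hold fewer bases
    return blocks
```

Each block, or each pair of blocks, could then be handed to a separate cluster job. Changing `block_size` or `min_length` changes which reads land in which block, which is why interim block-based results become invalid after repartitioning.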
Dialign
Segment-based multiple sequence alignment
Versions of package dialign
Release  Version  Architectures
squeeze  2.2.1-3  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   2.2.1-5  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.2.1-7  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.2.1-8  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.2.1-8  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package dialign:
biology: format:aln, nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing
works-with-format: plaintext
Popcon: 20 users (87 upd.)*
Versions and Archs
License: DFSG free
Git

DIALIGN2 is a command line tool to perform multiple alignment of protein or DNA sequences. It constructs alignments from gap-free pairs of similar segments of the sequences. This scoring scheme for alignments is the basic difference between DIALIGN and other global or local alignment methods. Note that DIALIGN does not employ any kind of gap penalty.

Please cite: Burkhard Morgenstern: DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. (PubMed,eprint) Bioinformatics 15(3):211-218 (1999)
Dialign-tx
Segment-based multiple sequence alignment
Versions of package dialign-tx
Release  Version  Architectures
squeeze  1.0.2-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   1.0.2-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.0.2-7  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.0.2-8  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.0.2-8  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package dialign-tx:
field: biology, biology:bioinformatics
role: program
scope: utility
use: comparing
works-with-format: plaintext
Popcon: 16 users (55 upd.)*
Versions and Archs
License: DFSG free
Svn

DIALIGN-TX is a command line tool to perform multiple alignment of protein or DNA sequences. It is a complete reimplementation of the segment-based approach including several new improvements and heuristics that significantly enhance the quality of the output alignments compared to DIALIGN 2.2 and DIALIGN-T. For pairwise alignment, DIALIGN-TX uses a fragment-chaining algorithm that favours chains of low-scoring local alignments over isolated high-scoring fragments. For multiple alignment, DIALIGN-TX uses an improved greedy procedure that is less sensitive to spurious local sequence similarities.

The package is enhanced by the following packages: dialign-tx-data
Please cite: Amarendran R. Subramanian, Michael Kaufmann and Burkhard Morgenstern: DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. (PubMed) Algorithms for Molecular Biology 3(1):6 (2008)
Dindel
determines indel calls from short-read data
Versions of package dindel
Release  Version      Architectures
stretch  1.01+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.01+dfsg-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Dindel is a program for calling small indels from short-read sequence data ('next generation sequence data'). It currently is designed to handle only Illumina data.

Dindel requires a BAM file containing the read-alignments as input. It then extracts candidate indels from the BAM file, and realigns the reads to candidate haplotypes consisting of these candidate indels. If there is sufficient evidence for an alternative haplotype to the reference, it will call an indel.

It is possible to test indels discovered with other methods using Dindel, for instance longer indels obtained through assembly methods. Dindel will then realign both mapped and unmapped reads to see if the candidate indel is supported by the reads.

Dindel outputs genotype likelihoods and includes a script to convert these to a VCF file with indel and SNP calls.

There is basic support for outputting realigned BAM files for each realignment-window. These realigned BAM files can be used to call SNPs near (candidate) indels.

Please cite: Cornelis A. Albers, Gerton Lunter, Daniel G. MacArthur, Gilean McVean, Willem H. Ouwehand, Richard Durbin: ???. (PubMed,eprint) Genome Research 21(6):961-973 (2010)
Discosnp
discovering Single Nucleotide Polymorphism from raw set(s) of reads
Versions of package discosnp
Release  Version  Architectures
jessie   1.2.5-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.2.6-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.2.6-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (46 upd.)*
Versions and Archs
License: DFSG free
Svn

discoSnp is designed for discovering Single Nucleotide Polymorphisms (SNPs) from raw set(s) of reads obtained with Next Generation Sequencers (NGS).

The number of input read sets is not constrained: it can be one, two, or more. No other data, such as a reference genome or annotations, are needed.

The software is composed of two modules. The first module, kissnp2, detects SNPs from read sets. The second module, kissreads, enhances the kissnp2 results by computing, per read set and for each SNP found: 1) its mean read coverage and 2) the (phred) quality of the reads generating the polymorphism.

Disulfinder
cysteines disulfide bonding state and connectivity predictor
Versions of package disulfinder
Release  Version   Architectures
wheezy   1.2.11-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.2.11-4  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.2.11-6  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.2.11-6  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package disulfinder:
role: program
Popcon: 8 users (63 upd.)*
Versions and Archs
License: DFSG free
Svn

'disulfinder' is for predicting the disulfide bonding state of cysteines and their disulfide connectivity starting from sequence alone. Disulfide bridges play a major role in the stabilization of the folding process for several proteins. Prediction of disulfide bridges from sequence alone is therefore useful for the study of structural and functional properties of specific proteins. In addition, knowledge about the disulfide bonding state of cysteines may help the experimental structure determination process and may be useful in other genomic annotation tasks.

'disulfinder' predicts disulfide patterns in two computational stages: (1) the disulfide bonding state of each cysteine is predicted by a BRNN-SVM binary classifier; (2) cysteines that are known to participate in the formation of bridges are paired by a Recursive Neural Network to obtain a connectivity pattern.

Please cite: A. Ceroni, A. Passerini, A. Vullo and P. Frasconi: DISULFIND: a disulfide bonding state and cysteine connectivity prediction server. (PubMed) Nucleic Acids Res 34(Web Server issue):W177-81 (2006)
Dnaclust
tool for clustering millions of short DNA sequences
Versions of package dnaclust
Release  Version  Architectures
jessie   3-2      amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3-4      amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      3-4      amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 4 users (56 upd.)*
Versions and Archs
License: DFSG free
Git

dnaclust is a tool for clustering large numbers of short DNA sequences. The clusters are created in such a way that the "radius" of each cluster is no more than the specified threshold.

The input sequences to be clustered should be in Fasta format. The id of each sequence is taken from the first word of its Fasta header, that is, the prefix of the header up to the first occurrence of a white space character.
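
The id rule described above can be reproduced with a short sketch, which may be handy for checking that ids are unique before clustering. This is an illustration of the rule as stated, not code from dnaclust, and the function name `fasta_ids` is hypothetical.

```python
def fasta_ids(lines):
    """Yield the id of each Fasta record: the header text up to the first whitespace."""
    for line in lines:
        if line.startswith(">"):
            # Drop the leading '>' and split once on any run of whitespace.
            fields = line[1:].split(None, 1)
            yield fields[0] if fields else ""
```

Two records whose headers share the same first word would receive the same id under this rule, so it is worth checking the ids for duplicates.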

Please cite: Mohammadreza Ghodsi, Bo Liu and Mihai Pop: DNACLUST: accurate and efficient clustering of phylogenetic marker genes. (PubMed,eprint) BMC Bioinformatics 12:271 (2011)
Dssp
protein secondary structure assignment based on 3D structure
Versions of package dssp
Release  Version  Architectures
wheezy   2.0.4-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.2.1-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.2.1-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.2.1-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 9 users (53 upd.)*
Versions and Archs
License: DFSG free
Svn

DSSP is an application you use to assign the secondary structure of a protein based on its solved three dimensional (3D) structure.

This version (2) of DSSP is a rewrite that produces the same output as the original DSSP, but deals better with exceptions in PDB files and is much faster.

Dwgsim
short sequencing read simulator
Versions of package dwgsim
Release  Version   Architectures
stretch  0.1.11-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.1.11-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

DWGSIM simulates short sequencing reads from modern sequencing platforms. DWGSIM generates base error rates using a parametric model, allowing a more realistic error profile. It was originally developed for use in evaluating short read aligners.

Ea-utils
command-line tools for processing biological sequencing data
Versions of package ea-utils
Release  Version       Architectures
stretch  1.1.2+dfsg-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.1.2+dfsg-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Ea-utils provides a set of command-line tools for processing biological sequencing data, covering tasks such as barcode demultiplexing and adapter trimming.

It was primarily written to support an Illumina-based pipeline, but should work with any FASTQ files.

The main tools are:

  • fastq-mcf: scans a sequence file for adapters and, based on a log-scaled threshold, determines a set of clipping parameters and performs clipping. It also does skewing detection and quality filtering.

  • fastq-multx: demultiplexes a FASTQ file. It is capable of auto-determining barcode IDs based on a master set of fields, keeps multiple reads in sync during demultiplexing, and can verify that the reads are in sync, failing if they are not.

  • fastq-join: similar to audy's stitch program, but written in C, more efficient, and with support for some automatic benchmarking and tuning. It uses the same "squared distance for anchored alignment" as other tools.

  • varcall: takes a pileup and calculates variants in a more easily parameterized manner than some other tools.

Please cite: Erik Aronesty: Comparison of Sequencing Utility Programs. (eprint) The Open Bioinformatics Journal 7:1-8 (2013)
Ecopcr
estimate PCR barcode primers quality
Versions of package ecopcr
Release  Version  Architectures
stretch  0.5.0-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.5.0-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (46 upd.)*
Versions and Archs
License: DFSG free
Git

DNA barcoding is a tool for characterizing the species origin using a short sequence from a standard, agreed-upon position in the genome. To be used as a DNA barcode, a genome locus should vary only to a minor degree among individuals of the same species, and it should vary very rapidly among species. From a practical point of view, a barcode locus should be flanked by two conserved regions so that PCR primers can be designed. Several manually discovered barcode loci like COI, rbcL, 18S, 16S and 23S rDNA, or trnH-ps are routinely used today, but no objective function has been described to measure their quality in terms of universality (barcode coverage, Bc) or in terms of taxonomical discrimination capacity (barcode specificity, Bs).

ecoPCR is an electronic PCR software developed by LECA and Helix-Project. It helps to estimate the quality of barcode primers. In conjunction with OBITools, you can postprocess ecoPCR output to compute barcode coverage and barcode specificity. New barcode primers can be developed using the ecoPrimers software.

Edtsurf
triangulated mesh surfaces for protein structures
Versions of package edtsurf
Release  Version   Architectures
jessie   0.2009-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.2009-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.2009-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

EDTSurf is an open source program to construct triangulated surfaces for macromolecules. It generates three major macromolecular surfaces: van der Waals surface, solvent-accessible surface and molecular surface (solvent-excluded surface). EDTSurf also identifies cavities inside macromolecules.

Please cite: Dong Xu and Yang Zhang: Generating Triangulated Macromolecular Surfaces by Euclidean Distance Transform. (PubMed,eprint) PLoS ONE 4(12):e8140 (2009)
Eigensoft
reduction of population bias for genetic analyses
Versions of package eigensoft
Release  Version       Architectures
stretch  6.1.2+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      6.1.2+dfsg-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The EIGENSOFT package combines functionality from the group's population genetics methods (Patterson et al. 2006) and their EIGENSTRAT stratification method (Price et al. 2006). The EIGENSTRAT method uses principal components analysis to explicitly model ancestry differences between cases and controls along continuous axes of variation; the resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The EIGENSOFT package has a built-in plotting script and supports multiple file formats and quantitative phenotypes.

Embassy-domainatrix
Extra EMBOSS commands to handle domain classification file
Versions of package embassy-domainatrix
Release   Version           Architectures
wheezy    0.1.0+20110714-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    0.1.650-1         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   0.1.650-1         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid       0.1.650-1         amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  0.1.660
Debtags of package embassy-domainatrix:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, converting, editing, searching
works-with-format: plaintext
Popcon: 5 users (47 upd.)*
Newer upstream!
License: DFSG free
Svn

The DOMAINATRIX programs were developed by Jon Ison and colleagues at MRC HGMP for their protein domain research. They are included as an EMBASSY package as a work in progress.

Applications in the current domainatrix release are cathparse (generates DCF file from raw CATH files), domainnr (removes redundant domains from a DCF file), domainreso (removes low resolution domains from a DCF file), domainseqs (adds sequence records to a DCF file), domainsse (adds secondary structure records to a DCF file), scopparse (generates DCF file from raw SCOP files) and ssematch (searches a DCF file for secondary structure matches).

Embassy-domalign
Extra EMBOSS commands for protein domain alignment
Versions of package embassy-domalign
Release   Version           Architectures
wheezy    0.1.0+20110714-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    0.1.650-1         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   0.1.650-1         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid       0.1.650-1         amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  0.1.660
Debtags of package embassy-domalign:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing, editing
works-with-format: plaintext
Popcon: 5 users (47 upd.)*
Newer upstream!
License: DFSG free
Svn

The DOMALIGN programs were developed by Jon Ison and colleagues at MRC HGMP for their protein domain research. They are included as an EMBASSY package as a work in progress.

Applications in the current domalign release are allversusall (sequence similarity data from all-versus-all comparison), domainalign (generates alignments (DAF file) for nodes in a DCF file), domainrep (reorders DCF file to identify representative structures) and seqalign (extends alignments (DAF file) with sequences (DHF file)).

Embassy-domsearch
Extra EMBOSS commands to search for protein domains
Versions of package embassy-domsearch
Release   Version           Architectures
wheezy    0.1.0+20110714-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    0.1.650-1         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   0.1.650-1         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid       0.1.650-1         amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  0.1.660
Debtags of package embassy-domsearch:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing
Popcon: 4 users (47 upd.)*
Newer upstream!
License: DFSG free
Svn

The DOMSEARCH programs were developed by Jon Ison and colleagues at MRC HGMP for their protein domain research. They are included as an EMBASSY package as a work in progress.

Applications in this DOMSEARCH release are seqfraggle (removes fragment sequences from DHF files), seqnr (removes redundancy from DHF files), seqsearch (generates PSI-BLAST hits (DHF file) from a DAF file), seqsort (removes ambiguously classified sequences from DHF files) and seqwords (generates DHF files from a keyword search of UniProt).

Emboss
European molecular biology open software suite
Versions of package emboss
Release   Version       Architectures
squeeze   6.1.0-5       amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    6.4.0-2       amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    6.6.0+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   6.6.0+dfsg-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid       6.6.0+dfsg-3  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package emboss:
field: biology, biology:bioinformatics, biology:molecular
interface: commandline
role: program
scope: suite
use: analysing, comparing, converting, editing, organizing, searching, text-formatting, typesetting, viewing
works-with: db
works-with-format: plaintext
Popcon: 569 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole. EMBOSS breaks the historical trend towards commercial software packages.

The package is enhanced by the following packages: clustalw primer3
Please cite: Peter Rice, Ian Longden and Alan Bleasby: EMBOSS: The European Molecular Biology Open Software Suite. (PubMed) Trends in Genetics 16(6):276 - 277 (2000)
Exonerate
generic tool for pairwise sequence comparison
Versions of package exonerate
Release   Version  Architectures
squeeze   2.2.0-2  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    2.2.0-6  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    2.2.0-6  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   2.4.0-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       2.4.0-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package exonerate:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: searching
works-with-format: plaintext
Popcon: 21 users (79 upd.)*
Versions and Archs
License: DFSG free
Git

Exonerate allows you to align sequences using many alignment models, with either exhaustive dynamic programming or a variety of heuristics. Much of the functionality of the Wise dynamic programming suite was reimplemented in C for better efficiency. Exonerate is an intrinsic component of the building of the Ensembl genome databases, providing similarity scores between RNA and DNA sequences and thus determining splice variants and coding sequences in general.

An In-silico PCR Experiment Simulation System (see the ipcress man page) is packaged with exonerate.

This package also comes with a selection of utilities for quickly performing simple manipulations on FASTA files larger than 2 GB.

Please cite: Guy C. Slater and Ewan Birney: Automated generation of heuristics for biological sequence comparison. (PubMed,eprint) BMC Bioinformatics 6(1):31 (2005)
Falconkit
genome assembly toolkit
Versions of package falconkit
Release   Version  Architectures
stretch   0.7-1    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       0.7-1    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Falcon is a set of tools for fast alignment of long reads for consensus and assembly. It is a simple code collection for efficient assembly of haploid and diploid genomes.

Fastahack
utility for indexing and sequence extraction from FASTA files
Versions of package fastahack
Release   Version         Architectures
stretch   0.0+20160702-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el
sid       0.0+20160702-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el
Popcon: 1 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

fastahack is a small application for indexing and extracting sequences and subsequences from FASTA files. The included Fasta.cpp library provides a FASTA reader and indexer that can be embedded into applications which would benefit from directly reading subsequences from FASTA files. The library automatically handles index file generation and use.

Features:

  • FASTA index (.fai) generation for FASTA files
  • Sequence extraction
  • Subsequence extraction
  • Sequence statistics (currently only entropy is provided)

Sequence and subsequence extraction use fseek64 to provide fastest-possible extraction without RAM-intensive file loading operations. This makes fastahack a useful tool for bioinformaticists who need to quickly extract many subsequences from a reference FASTA sequence.
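The idea of indexed, seek-based extraction can be illustrated with a minimal Python sketch. It is not fastahack's own code: the function names build_index and fetch are hypothetical, the index columns (length, offset, bases per line, bytes per line) are modelled on the common .fai layout, and every sequence is assumed to use one fixed line width.

```python
import io


def build_index(fasta):
    """Scan a binary FASTA stream once and return
    {name: (length, offset, bases_per_line, bytes_per_line)}."""
    index = {}
    name = None
    while True:
        line = fasta.readline()
        if not line:
            break
        if line.startswith(b">"):
            name = line[1:].split()[0].decode()
            # Sequence data starts right after the header line.
            index[name] = [0, fasta.tell(), 0, 0]
        elif name is not None:
            bases = len(line.rstrip(b"\r\n"))
            entry = index[name]
            if entry[2] == 0:
                # The first sequence line fixes the line geometry.
                entry[2], entry[3] = bases, len(line)
            entry[0] += bases
    return {key: tuple(value) for key, value in index.items()}


def fetch(fasta, index, name, start, end):
    """Return bases [start, end) (0-based) of `name` by seeking into the file."""
    length, offset, line_bases, line_bytes = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Byte position of a base: full lines before it plus its column in the line.
    first = offset + (start // line_bases) * line_bytes + start % line_bases
    last = offset + ((end - 1) // line_bases) * line_bytes + (end - 1) % line_bases
    fasta.seek(first)
    chunk = fasta.read(last - first + 1)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


demo = io.BytesIO(b">chr1 test\nACGTACGTAC\nGGGTTTAAAC\nCC\n>chr2\nTTTT\n")
demo_index = build_index(demo)
print(fetch(demo, demo_index, "chr1", 8, 13))  # ACGGG
```

Only the bytes covering the requested range are read, so the cost of one extraction does not depend on the size of the whole file.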

Fastaq
FASTA and FASTQ file manipulation tools
Versions of package fastaq
Release   Version   Architectures
jessie    1.5.0-1   all
stretch   3.12.1-1  all
sid       3.12.1-1  all
Popcon: 7 users (48 upd.)*
Versions and Archs
License: DFSG free
Git

Fastaq represents a diverse collection of scripts that perform useful and common FASTA/FASTQ manipulation tasks, such as filtering, merging, splitting, sorting, trimming, search/replace, etc. Input and output files can be gzipped (format is automatically detected) and individual Fastaq commands can be piped together.

Fastdnaml
Tool for construction of phylogenetic trees of DNA sequences
Versions of package fastdnaml
Release   Version   Architectures
squeeze   1.2.2-9   amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    1.2.2-10  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    1.2.2-10  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   1.2.2-11  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       1.2.2-11  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package fastdnaml:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing
works-with-format: plaintext
Popcon: 7 users (48 upd.)*
Versions and Archs
License: DFSG free
Svn

fastDNAml is a program derived from Joseph Felsenstein's version 3.3 DNAML (part of his PHYLIP package). Users should consult the documentation for DNAML before using this program.

fastDNAml is an attempt to solve the same problem as DNAML, but to do so faster and using less memory, so that larger trees and/or more bootstrap replicates become tractable. Much of fastDNAml is merely a recoding of the PHYLIP 3.3 DNAML program from PASCAL to C.

Note that the homepage of this program is not available any more and so this program will probably not see any further updates.

Please cite: Gary J. Olsen, Hideo Matsuda, Ray Hagstrom and Ross Overbeek: fastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. (PubMed,eprint) Comput Appl Biosci 10(1):41-48 (1994)
Fastlink
faster version of pedigree programs of Linkage
Versions of package fastlink
Release   Version             Architectures
squeeze   4.1P-fix95-1        amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    4.1P-fix95-3        amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    4.1P-fix95-3        amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   4.1P-fix100+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       4.1P-fix100+dfsg-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package fastlink:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing
Popcon: 6 users (49 upd.)*
Versions and Archs
License: DFSG free
Git

Genetic linkage analysis is a statistical technique used to map genes and find the approximate location of disease genes. There was a standard software package for genetic linkage called LINKAGE. FASTLINK is a significantly modified and improved version of the main programs of LINKAGE that runs much faster sequentially, can run in parallel, allows the user to recover gracefully from a computer crash, and provides abundant new documentation. FASTLINK has been used in over 1000 published genetic linkage studies.

This package contains the following programs:

 ilink:    GEMINI optimization procedure to find a locally
           optimal value of the theta vector of recombination
           fractions
 linkmap:  calculates location scores of one locus against a
           fixed map of other loci
 lodscore: compares likelihoods at locally optimal theta
 mlink:    calculates lod scores and risk with two or more loci
 unknown:  identify possible genotypes for unknowns
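
As a rough illustration of what a lod score is, the sketch below uses the textbook two-point formula for phase-known meioses: the log10 ratio of the likelihood at recombination fraction theta to the likelihood at theta = 0.5 (no linkage). This is an assumption-laden simplification; the FASTLINK programs compute likelihoods over whole pedigrees, and the function names here are hypothetical.

```python
import math


def lod_score(theta, recombinants, meioses):
    """Two-point lod score for phase-known data: log10 of L(theta) / L(0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = recombinants, meioses
    log_likelihood = 0.0
    if k:
        if theta == 0.0:
            return float("-inf")  # recombinants are impossible at theta = 0
        log_likelihood += k * math.log10(theta)
    if n - k:
        log_likelihood += (n - k) * math.log10(1.0 - theta)
    return log_likelihood - n * math.log10(0.5)


def best_theta(recombinants, meioses, steps=500):
    """Grid search for the theta in [0, 0.5] that maximises the lod score."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: lod_score(t, recombinants, meioses))


print(best_theta(2, 10), lod_score(best_theta(2, 10), 2, 10))
```

A grid search is used here only for brevity; it stands in for the local optimisation over theta that the program descriptions above mention.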
Please cite: R. W. Cottingham Jr., R. M. Idury and A. A. Schaffer: Faster Sequential Genetic Linkage Computations. (PubMed,eprint) American Journal of Human Genetics 53(1):252-263 (1993)
Fastml
maximum likelihood ancestral amino-acid sequence reconstruction
Versions of package fastml
Release   Version  Architectures
stretch   3.1-3    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       3.1-3    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

FastML is a bioinformatics tool for the reconstruction of ancestral sequences based on the phylogenetic relations between homologous sequences. FastML runs several algorithms that reconstruct the ancestral sequences with emphasis on an accurate reconstruction of both indels and characters. For character reconstruction, the previously described FastML algorithms are used to efficiently infer the most likely ancestral sequences for each internal node of the tree. Both joint and marginal reconstructions are provided. For indel reconstruction, the sequences are first coded according to the indel events detected within the multiple sequence alignment (MSA), and then a state-of-the-art likelihood model is used to reconstruct ancestral indel states. The results are the most probable sequences, together with posterior probabilities for each character and indel at each sequence position for each internal node of the tree. FastML is generic and is applicable to any type of molecular sequence (nucleotide, protein, or codon sequences).

Please cite: Haim Ashkenazy, Osnat Penn, Adi Doron-Faigenboim, Ofir Cohen, Gina Cannarozzi, Oren Zomer and Tal Pupko: FastML: a web server for probabilistic reconstruction of ancestral sequences. (PubMed,eprint) Nucleic Acids Research 40(Web Server issue):W580-W584 (2012)
Fastqc
quality control for high throughput sequence data
Versions of package fastqc
Release   Version        Architectures
jessie    0.11.2+dfsg-3  all
stretch   0.11.5+dfsg-2  all
sid       0.11.5+dfsg-3  all
Popcon: 18 users (49 upd.)*
Versions and Archs
License: DFSG free
Git

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.

The main functions of FastQC are

  • Import of data from BAM, SAM or FastQ files (any variant)
  • Providing a quick overview to tell you in which areas there may be problems
  • Summary graphs and tables to quickly assess your data
  • Export of results to an HTML based permanent report
  • Offline operation to allow automated generation of reports without running the interactive application
Fastqtl
Quantitative Trait Loci (QTL) mapper in cis for molecular phenotypes
Versions of package fastqtl
Release   Version       Architectures
stretch   2.184+dfsg-3  amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
sid       2.184+dfsg-3  amd64,arm64,armel,i386,kfreebsd-amd64,kfreebsd-i386,mips64el,mipsel,ppc64el
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

The goal of FastQTL is to identify single-nucleotide polymorphisms (SNPs) which are significantly associated with various molecular phenotypes (e.g. expression of known genes or cytosine methylation levels). It performs scans for all possible phenotype-variant pairs in cis (i.e. variants located within a specific window around a phenotype). FastQTL implements a new permutation scheme (Beta approximation) to accurately and rapidly correct for multiple testing at both the genotype and phenotype levels.

The package is enhanced by the following packages: fastqtl-doc
Please cite: Halit Ongen, Alfonso Buil, Andrew Anand Brown, Emmanouil T. Dermitzakis and Olivier Delaneau: Fast and efficient QTL mapper for thousands of molecular phenotypes. (eprint) Bioinformatics (2015)
Fasttree
phylogenetic trees from alignments of nucleotide or protein sequences
Versions of package fasttree
Release   Version  Architectures
wheezy    2.1.4-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    2.1.7-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   2.1.9-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       2.1.9-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 10 users (57 upd.)*
Versions and Archs
License: DFSG free
Git

FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. It handles alignments with up to a million sequences in a reasonable amount of time and memory. For large alignments, FastTree is 100-1,000 times faster than PhyML 3.0 or RAxML 7.

FastTree is more accurate than PhyML 3 with default settings, and much more accurate than the distance-matrix methods that are traditionally used for large alignments. FastTree uses the Jukes-Cantor or generalized time-reversible (GTR) models of nucleotide evolution and the JTT (Jones-Taylor-Thornton 1992) model of amino acid evolution. To account for the varying rates of evolution across sites, FastTree uses a single rate for each site (the "CAT" approximation). To quickly estimate the reliability of each split in the tree, FastTree computes local support values with the Shimodaira-Hasegawa test (these are the same as PhyML 3's "SH-like local supports").
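
To give a feel for the simplest of the nucleotide models named above, the following sketch computes the standard Jukes-Cantor corrected distance, d = -3/4 ln(1 - 4p/3), from the fraction p of differing sites between two aligned sequences. It is a textbook illustration only, not FastTree's implementation; the function names are hypothetical and positions containing anything other than A, C, G or T are simply skipped.

```python
import math


def p_distance(a, b):
    """Fraction of differing sites, ignoring gaps and ambiguous characters."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor_distance(a, b):
    """Expected substitutions per site under the Jukes-Cantor model."""
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf  # saturation: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


print(jukes_cantor_distance("ACGTACGT", "ACGTACGA"))
```

The corrected distance is always at least as large as p, because it accounts for multiple substitutions at the same site.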

This package contains a single-threaded version (fasttree) and a parallel version which uses OpenMP (fasttreeMP).

Please cite: Morgan N. Price, Paramvir S. Dehal and Adam P. Arkin: FastTree 2 -- Approximately Maximum-Likelihood Trees for Large Alignments. (PubMed,eprint) PLoS ONE 5(3):e9490 (2010)
Fastx-toolkit
FASTQ/A short nucleotide reads pre-processing tools
Versions of package fastx-toolkit
Release   Version     Architectures
wheezy    0.0.13.2-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    0.0.14-1    amd64,arm64,armel,armhf,i386,mips,mipsel,ppc64el,s390x
stretch   0.0.14-1    amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid       0.0.14-1    amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package fastx-toolkit:
role: program
Popcon: 10 users (48 upd.)*
Versions and Archs
License: DFSG free
Git

The FASTX-Toolkit is a collection of command line tools for preprocessing short nucleotide reads in FASTA and FASTQ formats, usually produced by Next-Generation sequencing machines. The main processing of such FASTA/FASTQ files is mapping (aligning) the sequences to reference genomes or other databases using specialized programs like BWA, Bowtie and many others. However, it is sometimes more productive to preprocess the FASTA/FASTQ files before mapping the sequences to the genome—manipulating the sequences to produce better mapping results. The FASTX-Toolkit tools perform some of these preprocessing tasks.

Ffindex
simple index/database for huge amounts of small files
Versions of package ffindex
Release   Version    Architectures
wheezy    0.9.6.1-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    0.9.9.3-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   0.9.9.7-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       0.9.9.7-1  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 8 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

FFindex is a very simple index/database for huge amounts of small files. The files are stored concatenated in one big data file, separated by '\0'. A second file contains a plain text index, giving name, offset and length of the small files. The lookup is currently done with a binary search on an array made from the index file.
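
The layout described above can be mimicked in a few lines of Python. This is a sketch of the idea only, not the FFindex file format itself: the tab-separated index columns, the convention that the stored length excludes the '\0' separator, and the function names build, load_index and lookup are all assumptions made for illustration.

```python
from bisect import bisect_left


def build(files):
    """Pack {name: bytes} into (data, index_text); entries are sorted by name."""
    data = bytearray()
    lines = []
    for name in sorted(files):
        payload = files[name]
        lines.append(f"{name}\t{len(data)}\t{len(payload)}")
        data += payload + b"\0"  # entries are separated by a NUL byte
    return bytes(data), "\n".join(lines) + "\n"


def load_index(index_text):
    """Parse the plain-text index into a sorted list of (name, offset, length)."""
    entries = []
    for line in index_text.splitlines():
        name, offset, length = line.split("\t")
        entries.append((name, int(offset), int(length)))
    return entries


def lookup(data, entries, name):
    """Binary search the index array, then slice the entry out of the data."""
    i = bisect_left(entries, (name,))
    if i == len(entries) or entries[i][0] != name:
        return None
    _, offset, length = entries[i]
    return data[offset:offset + length]


data, index_text = build({"b.txt": b"beta", "a.txt": b"alpha"})
entries = load_index(index_text)
print(lookup(data, entries, "b.txt"))  # b'beta'
```

Keeping the index sorted by name is what makes the binary search possible; a lookup touches only the index array and one slice of the data file.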

This package provides the executables.

Figtree
graphical phylogenetic tree viewer
Versions of package figtree
Release   Version       Architectures
wheezy    1.3.1-1       all
jessie    1.4-2         all
stretch   1.4.2+dfsg-2  all
sid       1.4.2+dfsg-2  all
Popcon: 10 users (48 upd.)*
Versions and Archs
License: DFSG free
Git

FigTree is designed as a graphical viewer of phylogenetic trees and as a program for producing publication-ready figures. In particular it is designed to display summarized and annotated trees produced by BEAST.

Filo
FILe and stream Operations
Versions of package filo
Release   Version           Architectures
wheezy    1.1+2011020401.2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    1.1+2011123001.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   1.1.0-3           amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       1.1.0-3           amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 14 users (48 upd.)*
Versions and Archs
License: DFSG free
Git

The following tools are available as part of the filo package:

groupBy – mimics the “groupBy” clause in database systems.

shuffle – randomize the order of lines in a file.

stats – computes descriptive statistics on a given column of a tab-delimited file or stream.

Because their names are too generic, ‘shuffle’ and ‘stats’ are installed in /usr/lib/filo.
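
What a database-style grouping does to a tab-delimited stream can be sketched as follows. This is a hypothetical Python illustration, not the filo tool: the function name group_by, its 0-based column arguments and the choice of sum as the default summary are assumptions.

```python
from collections import defaultdict


def group_by(lines, key_col, val_col, op=sum):
    """Group tab-delimited lines on one column and summarise another."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        groups[fields[key_col]].append(float(fields[val_col]))
    return {key: op(values) for key, values in groups.items()}


rows = ["chr1\t10", "chr2\t5", "chr1\t7"]
print(group_by(rows, key_col=0, val_col=1))          # {'chr1': 17.0, 'chr2': 5.0}
print(group_by(rows, key_col=0, val_col=1, op=max))  # {'chr1': 10.0, 'chr2': 5.0}
```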

Fitgcp
fitting genome coverage distributions with mixture models
Versions of package fitgcp
Release   Version         Architectures
jessie    0.0.20130418-2  amd64,arm64,i386,ppc64el
stretch   0.0.20130418-2  amd64,arm64,ppc64el
sid       0.0.20130418-2  amd64,arm64,kfreebsd-amd64,ppc64el
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Genome coverage, the number of sequencing reads mapped to a position in a genome, is an insightful indicator of irregularities within sequencing experiments. While the average genome coverage is frequently used within algorithms in computational genomics, the complete information available in coverage profiles (i.e. histograms over all coverages) is currently not exploited to its full extent. Thus, biases such as fragmented or erroneous reference genomes often remain unaccounted for. Making this information accessible can improve the quality of sequencing experiments and quantitative analyses.

fitGCP is a framework for fitting mixtures of probability distributions to genome coverage profiles. Besides commonly used distributions, fitGCP uses distributions tailored to account for common artifacts. The mixture models are iteratively fitted based on the Expectation-Maximization algorithm.
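
The iterative fitting can be pictured with a small Expectation-Maximization loop. The sketch below is not fitGCP's code: it assumes a mixture of plain Poisson components only (fitGCP also offers other, tailored distributions), a fixed number of iterations instead of a convergence test, and hypothetical function names.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing coverage k, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def fit_poisson_mixture(coverages, lambdas, iterations=100):
    """Fit mixture weights and Poisson means to per-position coverage values."""
    n = len(lambdas)
    weights = [1.0 / n] * n
    lambdas = list(lambdas)
    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for k in coverages:
            p = [w * poisson_pmf(k, lam) for w, lam in zip(weights, lambdas)]
            total = sum(p) or 1e-300
            resp.append([x / total for x in p])
        # M-step: re-estimate the weight and mean of each component.
        for j in range(n):
            rj = sum(r[j] for r in resp)
            if rj > 0:
                weights[j] = rj / len(coverages)
                mean = sum(r[j] * k for r, k in zip(resp, coverages)) / rj
                lambdas[j] = max(mean, 1e-9)
    return weights, lambdas


toy = [2] * 50 + [20] * 50  # two artificial coverage levels
print(fit_poisson_mixture(toy, lambdas=[1.0, 10.0]))
```

On this toy input the two components settle near means 2 and 20 with roughly equal weights; real coverage profiles would need sensible starting values and a stopping criterion.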

Please cite: Martin S. Lindner, Maximilian Kollock, Franziska Zickmann and Bernhard Y. Renard: Analyzing genome coverage profiles with applications to quality control in metagenomics. (PubMed,eprint) Bioinformatics 29(10):1260-1267 (2013)
Flexbar
flexible barcode and adapter removal for sequencing platforms
Versions of package flexbar
Release   Version  Architectures
jessie    2.50-1   amd64,arm64,armhf,i386,powerpc,ppc64el
stretch   2.50-2   amd64,arm64,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el
sid       2.50-2   amd64,arm64,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

Flexbar preprocesses high-throughput sequencing data efficiently. It demultiplexes barcoded runs and removes adapter sequences. Moreover, trimming and filtering features are provided. Flexbar increases mapping rates and improves genome and transcriptome assemblies. It supports next-generation sequencing data in fasta/q and csfasta/q format from Illumina, Roche 454, and the SOLiD platform.

Parameter names have changed in Flexbar; please review your scripts. In recent months, default settings were optimised, several bugs were fixed and various improvements were made, e.g. a revamped command-line interface, new trimming modes, and lower time and memory requirements.

Please cite: Matthias Dodt, Johannes T. Roehr, Rina Ahmed and Christoph Dieterich: FLEXBAR — Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms. (eprint) Biology 1(3):895-905 (2012)
Freecontact
fast protein contact predictor
Versions of package freecontact
Release   Version   Architectures
jessie    1.0.21-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   1.0.21-4  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       1.0.21-4  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (48 upd.)*
Versions and Archs
License: DFSG free
Git

FreeContact is a protein residue contact predictor optimized for speed. Its input is a multiple sequence alignment. FreeContact can function as an accelerated drop-in replacement for the published contact predictors EVfold-mfDCA of D. S. Marks (2011) and PSICOV of D. Jones (2011).

FreeContact is accelerated by a combination of vector instructions, multiple threads, and faster implementation of key parts. Depending on the alignment, 8-fold or higher speedups are possible.

A sufficiently large alignment is required for meaningful results. As a minimum, an alignment with an effective (after-weighting) sequence count bigger than the length of the query sequence should be used. Alignments with tens of thousands of (effective) sequences are considered good input.
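
One common way to obtain an effective sequence count is to down-weight near-duplicate sequences, so that each sequence counts as one divided by the number of alignment members it closely resembles. The sketch below illustrates that idea under stated assumptions: the 0.9 identity threshold, the weighting scheme and the function names are illustrative and are not taken from FreeContact.

```python
def identity(a, b):
    """Fraction of identical columns between two aligned sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def effective_count(alignment, threshold=0.9):
    """Sum of sequence weights, each 1 / (number of near-identical members)."""
    total = 0.0
    for seq in alignment:
        # A sequence is always its own neighbour, so the divisor is >= 1.
        neighbours = sum(identity(seq, other) >= threshold for other in alignment)
        total += 1.0 / neighbours
    return total


def is_large_enough(alignment, threshold=0.9):
    """Rule of thumb from the text: effective count above the query length."""
    return effective_count(alignment, threshold) > len(alignment[0])


toy = ["ACDE", "ACDE", "ACDF", "WWYY"]
print(effective_count(toy), is_large_enough(toy))  # 3.0 False
```

The first sequence of the alignment is taken as the query here, which is also an assumption of this sketch.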

jackhmmer(1) from the hmmer package, or hhblits(1) from hhsuite can be used to generate the alignments, for example.

This package contains the command line tool freecontact(1).

Please cite: László Kaján, Thomas A. Hopf, Matúš Kalaš, Debora S. Marks and Burkhard Rost: FreeContact: ... BMC Bioinformatics (201?)
Fsa
Fast Statistical Alignment of protein, RNA or DNA sequences
Versions of package fsa
Release   Version        Architectures
stretch   1.15.9+dfsg-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid       1.15.9+dfsg-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Popcon: 4 users (82 upd.)*
Versions and Archs
License: DFSG free
Git

FSA is a probabilistic multiple sequence alignment algorithm which uses a "distance-based" approach to aligning homologous protein, RNA or DNA sequences. Much as distance-based phylogenetic reconstruction methods like Neighbor-Joining build a phylogeny using only pairwise divergence estimates, FSA builds a multiple alignment using only pairwise estimations of homology. This is made possible by the sequence annealing technique for constructing a multiple alignment from pairwise comparisons, developed by Ariel Schwartz.

FSA brings the high accuracies previously available only for small-scale analyses of proteins or RNAs to large-scale problems such as aligning thousands of sequences or megabase-long sequences. FSA introduces several novel methods for constructing better alignments:

  • FSA uses machine-learning techniques to estimate gap and substitution parameters on the fly for each set of input sequences. This "query-specific learning" alignment method makes FSA very robust: it can produce superior alignments of sets of homologous sequences which are subject to very different evolutionary constraints.
  • FSA is capable of aligning hundreds or even thousands of sequences using a randomized inference algorithm to reduce the computational cost of multiple alignment. This randomized inference can be over ten times faster than a direct approach with little loss of accuracy.
  • FSA can quickly align very long sequences using the "anchor annealing" technique for resolving anchors and projecting them with transitive anchoring. It then stitches together the alignment between the anchors using the methods described above.
  • The included GUI, MAD (Multiple Alignment Display), can display the intermediate alignments produced by FSA, where each character is colored according to the probability that it is correctly aligned
Please cite: Robert K. Bradley, Adam Roberts, Michael Smoot, Sudeep Juvekar, Jaeyoung Do, Colin Dewey, Ian Holmes and Lior Pachter: Fast Statistical Alignment. (PubMed,eprint) PLoS Comput Biol. 5(5):e1000392 (2009)
Remark of Debian Med team: Precondition for T-Coffee

see http://wiki.debian.org/DebianMed/TCoffee

The upstream address bounced when contacted about segfaults, so the project appears to be dead upstream and its code quality is not good.

Fsm-lite
frequency-based string mining (lite)
Versions of package fsm-lite
Release   Version  Architectures
stretch   1.0-2    amd64,arm64,mips64el,ppc64el,s390x
sid       1.0-2    amd64,arm64,kfreebsd-amd64,mips64el,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

A single-core implementation of frequency-based substring mining, used in bioinformatics to extract substrings that discriminate between two (or more) datasets in high-throughput sequencing data.
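
The notion of a discriminating substring can be shown with a deliberately naive sketch. It is not how fsm-lite works internally: it assumes fixed-length substrings (k-mers), plain dictionaries, and hypothetical frequency thresholds and function names.

```python
def support(sequences, k):
    """Map each k-mer to the number of sequences that contain it."""
    counts = {}
    for seq in sequences:
        for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
            counts[kmer] = counts.get(kmer, 0) + 1
    return counts


def discriminating(case, control, k, min_case=1.0, max_control=0.0):
    """K-mers frequent in `case` (>= min_case) and rare in `control` (<= max_control)."""
    in_case, in_control = support(case, k), support(control, k)
    return sorted(
        kmer for kmer, count in in_case.items()
        if count / len(case) >= min_case
        and in_control.get(kmer, 0) / len(control) <= max_control
    )


case = ["ACGTGA", "TTACGT"]
control = ["ACGAAA", "GGGTTT"]
print(discriminating(case, control, k=4))  # ['ACGT']
```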

Gamgi
General Atomistic Modelling Graphic Interface (GAMGI)
Versions of package gamgi
Release   Version   Architectures
squeeze   0.14.8-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    0.15.8-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    0.17.1-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   0.17.1-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid       0.17.1-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package gamgi:
role: program
uitoolkit: gtk
Popcon: 19 users (49 upd.)*
Versions and Archs
License: DFSG free
Svn

The General Atomistic Modelling Graphic Interface (GAMGI) provides a graphical interface to build, view and analyze atomic structures. The program is aimed at the scientific community and provides a graphical interface to study atomic structures and to prepare images for presentations, and for teaching the atomic structure of matter.

The package is enhanced by the following packages: gamgi-data gamgi-doc
Garli
phylogenetic analysis of molecular sequence data using maximum-likelihood
Versions of package garli
Release   Version  Architectures
stretch   2.1-1    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       2.1-1    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Svn

GARLI (Genetic Algorithm for Rapid Likelihood Inference) is a program for inferring phylogenetic trees. Using an approach similar to a classical genetic algorithm, it rapidly searches the space of evolutionary trees and model parameters to find the solution maximizing the likelihood score. It implements nucleotide, amino acid and codon-based models of sequence evolution, and runs on all platforms. The latest version adds support for partitioned models and morphology-like datatypes.

Garlic
A visualization program for biomolecules
Versions of package garlic
Release   Version  Architectures
squeeze   1.6-1    amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    1.6-1.1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    1.6-1.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   1.6-1.1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       1.6-1.1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package garlic:
field: biology, chemistry
interface: x11
role: program
scope: utility
uitoolkit: xlib
use: viewing
x11: application
Popcon: 29 users (50 upd.)*
Versions and Archs
License: DFSG free
Svn

Garlic is written for the investigation of membrane proteins. It may be used to visualize other proteins, as well as some geometric objects. This version of garlic recognizes PDB format version 2.1. Garlic may also be used to analyze protein sequences.

It depends only on the X libraries; no other libraries are needed.

Features include:

  • The slab position and thickness are visible in a small window.
  • Atomic bonds as well as atoms are treated as independent drawable objects.
  • The atomic and bond colors depend on position. Five mapping modes are available (as for slab).
  • Can display stereo images.
  • Can display other geometric objects, such as a membrane.
  • Atomic information is available for the atom under the mouse pointer. No click required, just move the mouse pointer over the structure!
  • Can load more than one structure.
  • Can draw Ramachandran plots, helical wheels, Venn diagrams, and averaged hydrophobicity and hydrophobic moment plots.
  • The command prompt is available at the bottom of the main window. It is able to display one error message and one command string.
Please cite: Damir Zucic and Davor Juretic: Precise Annotation of Transmembrane Segments with Garlic - a Free Molecular Visualization Program (eprint) Croatica Chemica Acta 77(1-2):397-401 (2004)
Gasic
genome abundance similarity correction
Versions of package gasic
Release   Version    Architectures
jessie    0.0.r18-2  amd64
stretch   0.0.r18-3  amd64
sid       0.0.r18-3  amd64,kfreebsd-amd64
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

One goal of sequencing based metagenomic analysis is the quantitative taxonomic assessment of microbial community compositions. However, the majority of approaches either quantify at low resolution (e.g. at phylum level) or have severe problems discerning highly similar species. Yet, accurate quantification on species level is desirable in applications such as metagenomic diagnostics or community comparison. GASiC is a method to correct read alignment results for the ambiguities imposed by similarities of genomes. It has superior performance over existing methods.

Please cite: Martin S. Lindner and Bernhard Y. Renard: Metagenomic abundance estimation and diagnostic testing on species level. (PubMed,eprint) Nucleic Acids Research 41(1):e10 (2013)
Gbrowse
GMOD Generic Genome Browser
Versions of package gbrowse
Release  Version      Architectures
wheezy   2.48~dfsg-1  all
jessie   2.54+dfsg-3  all
sid      2.54+dfsg-3  all
sid      2.54+dfsg-6  all
stretch  2.54+dfsg-7  all
sid      2.54+dfsg-7  all
Debtags of package gbrowse:
field: biology, biology:bioinformatics
interface: web
role: program
use: analysing, viewing
web: application, cgi
Popcon: 104 users (31 upd.)*
Versions and Archs
License: DFSG free
Git

Generic Genome Browser is a simple but highly configurable web-based genome browser. It is a component of the Generic Model Organism Systems Database project (GMOD). Some of its features:

  • Simultaneous bird's eye and detailed views of the genome;
  • Scroll, zoom, center;
  • Attach arbitrary URLs to any annotation;
  • Order and appearance of tracks are customizable by administrator and end-user;
  • Search by annotation ID, name, or comment;
  • Supports third party annotation using GFF formats;
  • Settings persist across sessions;
  • DNA and GFF dumps;
  • Connectivity to different databases, including BioSQL and Chado;
  • Multi-language support;
  • Third-party feature loading;
  • Customizable plug-in architecture (e.g. run BLAST, dump & import many formats, find oligonucleotides, design primers, create restriction maps, edit features).
The package is enhanced by the following packages: libbio-samtools-perl
Please cite: Maureen J. Donlin: Using the Generic Genome Browser (GBrowse). (eprint) Department of Biochemistry and Molecular Biology and Department of Molecular Microbiology and Immunology, Saint Louis University School of Medicine (2009)
Gdpc
visualiser of molecular dynamic simulations
Versions of package gdpc
Release  Version  Architectures
squeeze  2.2.5-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   2.2.5-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.2.5-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.2.5-6  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.2.5-6  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package gdpc:
field: biology, biology:structural, chemistry, physics
interface: x11
role: program
scope: application
uitoolkit: gtk
use: viewing
works-with: 3dmodel, image, video
works-with-format: jpg, png
x11: application
Popcon: 22 users (75 upd.)*
Versions and Archs
License: DFSG free
Git

gdpc is a graphical program for visualising output data from molecular dynamics simulations. It reads input in the standard xyz format, as well as other custom formats, and can output pictures of each frame in JPG or PNG format.
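For readers unfamiliar with the xyz format, the following minimal sketch parses a multi-frame xyz trajectory as it is commonly laid out: an atom count, a comment line, then one "element x y z" line per atom, repeated per frame. It is an illustration, not gdpc's reader; the function name is hypothetical and the custom formats are not covered.

```python
def parse_xyz_frames(text):
    """Return a list of frames: (comment, [(element, x, y, z), ...])."""
    lines = text.splitlines()
    frames = []
    pos = 0
    while pos < len(lines):
        if not lines[pos].strip():
            pos += 1  # tolerate blank lines between frames
            continue
        atom_count = int(lines[pos])
        comment = lines[pos + 1] if pos + 1 < len(lines) else ""
        atoms = []
        for line in lines[pos + 2:pos + 2 + atom_count]:
            # Extra columns (e.g. velocities) are ignored.
            element, x, y, z = line.split()[:4]
            atoms.append((element, float(x), float(y), float(z)))
        if len(atoms) != atom_count:
            raise ValueError("truncated frame starting at line %d" % (pos + 1))
        frames.append((comment, atoms))
        pos += 2 + atom_count
    return frames


sample = (
    "2\nframe 0\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    "2\nframe 1\nH 0.0 0.0 0.0\nH 0.0 0.0 0.80\n"
)
frames = parse_xyz_frames(sample)
```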

Other screenshots of package gdpc
Version  URL
2.2.5-1  https://screenshots.debian.net/screenshots/000/007/396/large.png
Genometools
versatile genome analysis toolkit
Versions of package genometools
Release  Version     Architectures
jessie   1.5.3-2     amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.5.9+ds-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.5.9+ds-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package genometools:
biology: nuceleic-acids
field: biology, biology:bioinformatics
interface: commandline
role: program
uitoolkit: ncurses
Popcon: 6 users (69 upd.)*
Versions and Archs
License: DFSG free
Git

GenomeTools is a collection of useful tools for biological sequence analysis and presentation, combined into a single binary.

The toolkit contains tools for sequence and annotation handling, sequence compression, index structure generation and access, annotation visualization, and much more.

Please cite: Gordon Gremme, Sascha Steinbiss and Stefan Kurtz: GenomeTools: a comprehensive software library for efficient processing of structured genome annotations. (PubMed) IEEE/ACM Transactions on Computational Biology and Bioinformatics 10(3):645-656 (2013)
Gentle
suite to plan genetic cloning
Versions of package gentle
Release  Version                                  Architectures
squeeze  1.9+cvs20100605+dfsg-2 (contrib)         amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   1.9+cvs20100605+dfsg1-1                  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.9+cvs20100605+dfsg1-3                  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.9+cvs20100605+dfsg1-4                  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      1.9+cvs20100605+dfsg1-4                  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package gentle:
biology: nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: x11
role: program
uitoolkit: wxwidgets
Popcon: 6 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

GENtle is software for DNA and amino acid editing, database management, plasmid maps, restriction and ligation, alignments, sequencer data import, calculators, gel image display, PCR, and much more.

Please cite: Magnus Manske: GENtle, a free multi-purpose molecular biology tool. (eprint) (2006)
Gff2aplot
pair-wise alignment-plots for genomic sequences in PostScript
Versions of package gff2aplot
Release  Version  Architectures
squeeze  2.0-5    amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   2.0-7    amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.0-7    amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.0-8    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.0-8    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package gff2aplot:
field: biology, biology:bioinformatics
interface: commandline, shell
role: program
scope: utility
use: converting, viewing
works-with: image:vector
works-with-format: plaintext, postscript
Popcon: 6 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

A program to visualize the alignment of two genomic sequences together with their annotations. From GFF-format input files it produces PostScript figures for that alignment. The following list covers many features of gff2aplot:

  • Comprehensive alignment plots for any GFF-feature. Attributes are defined separately, so you can modify just the attributes you need for a given file or share the same customization across different data sets.
  • All parameters are set by default within the program, but it can also be fully configured via gff2ps-like flexible customization files. The program can handle several such files, combining all the settings before producing the corresponding figure. Moreover, all customization parameters can be set via command-line switches, which lets users experiment with those parameters before adding any to a customization file.
  • Source order is taken from the input files; if you swap the file order, you can visualize the alignment and its annotation with the new input arrangement.
  • All alignment scores can be visualized in a PiP box below the gff2aplot area, using a grey-color scale, a user-defined color scale or score-dependent gradients.
  • Scalable fonts, which can also be chosen among the basic PostScript default fonts. Feature and group labels can be rotated to improve readability in both annotation axes.
  • The program is still defined as a Unix filter, so it can handle data from files, redirections and pipes, writing output to standard output and warnings to standard error.
  • gff2aplot is able to manage many physical page formats (from A0 to A10, and more; see the available page sizes in its manual), including user-defined ones. This allows, for instance, the generation of poster-size genomic maps, or the use of a continuous-paper plotting device, either in portrait or landscape.
  • You can draw different alignments on the same alignment plot and distinguish them by using different colors for each.
  • The shape dictionary has been expanded, so that further feature shapes are now available (see manual).
  • Annotation projections through alignment plots (so-called ribbons) emulate transparencies via complementary color fill patterns. This feature allows one to show color pseudo-blending when horizontal and vertical ribbons overlap.
Please cite: J. F. Abril, R. Guigó and T. Wiehe: gff2aplot: Plotting sequence comparisons. (PubMed,eprint) Bioinformatics 19(18):2477-2479 (2003)
Gff2ps
produces PostScript graphical output from GFF-files
Versions of package gff2ps
Release  Version  Architectures
squeeze  0.98d-3  all
wheezy   0.98d-4  all
jessie   0.98d-4  all
stretch  0.98d-5  all
sid      0.98d-5  all
Debtags of package gff2ps:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: converting, viewing
works-with: image:vector
works-with-format: postscript
Popcon: 6 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

gff2ps is a script program developed with the aim of converting GFF-formatted records into high-quality one-dimensional plots in PostScript. Such plots may be useful for comparing genomic structures and for visualizing outputs from genome annotation programs. It can be used in a very simple way, because it assumes that the GFF file itself carries enough formatting information, but it also allows a great degree of customization through a number of options and/or a configuration file.
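To show what a GFF-formatted record carries, here is a minimal sketch that splits tab-separated GFF lines into the eight mandatory columns plus the optional ninth group/attributes column. It is not part of gff2ps; the record type and function name are hypothetical, and the sample line is invented.

```python
from collections import namedtuple

GffRecord = namedtuple(
    "GffRecord",
    "seqname source feature start end score strand frame group",
)


def parse_gff(text):
    """Parse GFF text into records, skipping comments and blank lines."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            raise ValueError("expected at least 8 tab-separated fields")
        seqname, source, feature, start, end, score, strand, frame = fields[:8]
        group = fields[8] if len(fields) > 8 else ""
        records.append(GffRecord(
            seqname, source, feature,
            int(start), int(end),                 # 1-based, inclusive
            None if score == "." else float(score),
            strand, frame, group,
        ))
    return records


sample = "# invented example\nchr1\tgenscan\texon\t100\t200\t0.9\t+\t0\tgene1\n"
records = parse_gff(sample)
```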

Please cite: J. F. Abril and R. Guigó: gff2ps: visualizing genomic annotations. (PubMed,eprint) Bioinformatics 16(8):743-744 (2000)
Ghemical
GNOME molecular modelling environment
Versions of package ghemical
Release  Version   Architectures
squeeze  2.99.2-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   3.0.0-1   amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
sid      3.0.0-1   amd64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package ghemical:
field: chemistry
interface: 3d, x11
role: program
suite: gnome
uitoolkit: gtk
use: editing, learning, viewing
works-with: 3dmodel
x11: application
Popcon: 24 users (24 upd.)*
Versions and Archs
License: DFSG free
Svn

Ghemical is a computational chemistry software package written in C++. It has a graphical user interface and it supports both quantum-mechanics (semi-empirical) models and molecular mechanics models. Geometry optimization, molecular dynamics and a large set of visualization tools using OpenGL are currently available.

Ghemical relies on external code to provide the quantum-mechanical calculations. Semi-empirical methods MNDO, MINDO/3, AM1 and PM3 come from the MOPAC7 package (Public Domain), and are included in the package. The MPQC package is used to provide ab initio methods: the methods based on Hartree-Fock theory are currently supported with basis sets ranging from STO-3G to 6-31G**.

Giira
RNA-Seq driven gene finding incorporating ambiguous reads
Versions of package giira
Release  Version         Architectures
jessie   0.0.20140210-2  amd64
stretch  0.0.20140210-2  amd64
sid      0.0.20140210-2  amd64,kfreebsd-amd64
Popcon: 7 users (42 upd.)*
Versions and Archs
License: DFSG free
Git

GIIRA is a gene prediction method that identifies potential coding regions exclusively based on the mapping of reads from an RNA-Seq experiment. It was designed foremost for prokaryotic gene prediction and is able to resolve genes within the expressed region of an operon. However, it is also applicable to eukaryotes and predicts exon-intron structures as well as alternative isoforms.

Please cite: Franziska Zickmann, Martin S. Lindner and Bernhard Y. Renard: GIIRA—RNA-Seq driven gene finding incorporating ambiguous reads. (PubMed,eprint) Bioinformatics (2013)
Glam2
gapped protein motifs from unaligned sequences
Versions of package glam2
Release  Version  Architectures
squeeze  1064-1   amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   1064-1   amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1064-3   amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1064-3   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1064-3   amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package glam2:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing, searching
works-with-format: plaintext
Popcon: 7 users (46 upd.)*
Versions and Archs
License: DFSG free
Svn

GLAM2 is a software package for finding motifs in sequences, typically amino-acid or nucleotide sequences. A motif is a re-occurring sequence pattern: typical examples are the TATA box and the CAAX prenylation motif. The main innovation of GLAM2 is that it allows insertions and deletions in motifs.

This package includes programs for discovering motifs shared by a set of sequences and finding matches to these motifs in a sequence database, as well as utilities for converting glam2 motifs to standard alignment formats, masking glam2 motifs out of sequences so that weaker motifs can be found, and removing highly similar members of a set of sequences.

The package includes these programs:

 glam2:       discovering motifs shared by a set of sequences;
 glam2scan:   finding matches, in a sequence database, to a motif discovered
              by glam2;
 glam2format: converting glam2 motifs to standard alignment formats;
 glam2mask:   masking glam2 motifs out of sequences, so that weaker motifs
              can be found;
 glam2-purge: removing highly similar members of a set of sequences.

In this binary package, the fast Fourier transform (FFT) algorithm was enabled for the glam2 program.

Please cite: Martin C. Frith, Neil F. W. Saunders, Bostjan Kobe and Timothy L. Bailey: Discovering Sequence Motifs with Arbitrary Insertions and Deletions. (PubMed) PLoS Computational Biology 4(5):e1000071 (2008)
Graphlan
circular representations of taxonomic and phylogenetic trees
Versions of package graphlan
Release  Version  Architectures
stretch  1.1-1    all
sid      1.1-1    all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Svn

GraPhlAn is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees. It focuses on concise, integrative, informative, and publication-ready representations of phylogenetically- and taxonomically-driven investigation.

Grinder
Versatile omics shotgun and amplicon sequencing read simulator
Versions of package grinder
Release  Version  Architectures
wheezy   0.4.5-1  all
jessie   0.5.3-3  all
stretch  0.5.4-1  all
sid      0.5.4-1  all
Popcon: 8 users (46 upd.)*
Versions and Archs
License: DFSG free
Git

Grinder is a versatile program to create random shotgun and amplicon sequence libraries based on DNA, RNA or proteic reference sequences provided in a FASTA file.

Grinder can produce genomic, metagenomic, transcriptomic, metatranscriptomic, proteomic and metaproteomic shotgun and amplicon datasets from current sequencing technologies such as Sanger, 454 and Illumina. These simulated datasets can be used to test the accuracy of bioinformatic tools under specific hypotheses, e.g. with or without sequencing errors, or with low or high community diversity. Grinder may also be used to help decide between alternative sequencing methods for a sequence-based project, e.g. whether the library should be paired-end or not, or how many reads should be sequenced.
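The basic idea of shotgun read simulation can be sketched in a few lines. This is not Grinder's implementation or interface: the function names, the uniform substitution error model and the seeded random generator are assumptions made so that the example is small and reproducible.

```python
import random


def parse_fasta(text):
    """Return {sequence_id: sequence} from FASTA-formatted text."""
    sequences = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            sequences[current] = []
        elif current is not None:
            sequences[current].append(line.upper())
    return {name: "".join(parts) for name, parts in sequences.items()}


def simulate_reads(reference, read_length, n_reads, error_rate=0.0, seed=0):
    """Draw reads uniformly from the reference; returns [(start, read)].

    Each base is replaced by a different base with probability
    error_rate (a deliberately simple, assumed error model).
    """
    if read_length > len(reference):
        raise ValueError("read_length exceeds reference length")
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_length + 1)
        read = list(reference[start:start + read_length])
        for i, base in enumerate(read):
            if rng.random() < error_rate:
                read[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((start, "".join(read)))
    return reads


reference = parse_fasta(">ref1 toy genome\nACGTACGTTAGC\nGGATCCAATG\n")["ref1"]
clean_reads = simulate_reads(reference, read_length=8, n_reads=5)
noisy_reads = simulate_reads(reference, read_length=8, n_reads=5, error_rate=1.0)
```

Comparing a tool's output on `clean_reads` and `noisy_reads` is the kind of with/without sequencing errors test the description refers to.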

Please cite: Florent E. Angly, Dana Willner, Forest Rohwer, Philip Hugenholtz and Gene W. Tyson: Grinder: a versatile amplicon and shotgun sequence simulator. (PubMed,eprint) Nucleic Acids Research Epub ahead of print (2012)
Gromacs
Molecular dynamics simulator, with building and analysis tools
Versions of package gromacs
Release       Version     Architectures
squeeze       4.0.7-3     amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy        4.5.5-2     amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie        5.0.2-1     amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
experimental  5.1.2-3     armel
stretch       5.1.3-1     amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid           5.1.3-1     amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
experimental  2016~rc1-3  amd64,arm64,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream      2016~rc1
Debtags of package gromacs:
field: biology, biology:structural, chemistry
interface: commandline, x11
role: program
uitoolkit: xlib
x11: application
Popcon: 52 users (64 upd.)*
Newer upstream!
License: DFSG free
Svn

GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
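What "simulating the Newtonian equations of motion" means in practice can be shown with a minimal sketch: one particle on a harmonic spring, integrated with the velocity Verlet scheme. This is not GROMACS code; the integrator choice, the one-dimensional toy force and all names are assumptions for illustration.

```python
def velocity_verlet(position, velocity, force, mass, dt, steps):
    """Integrate m * x'' = F(x) for one particle in one dimension.

    Returns the trajectory as a list of (position, velocity) pairs,
    including the initial state.
    """
    trajectory = [(position, velocity)]
    acceleration = force(position) / mass
    for _ in range(steps):
        # Advance the position using the current velocity and acceleration.
        position += velocity * dt + 0.5 * acceleration * dt * dt
        # Recompute the force at the new position, then finish the velocity
        # update with the average of old and new accelerations.
        new_acceleration = force(position) / mass
        velocity += 0.5 * (acceleration + new_acceleration) * dt
        acceleration = new_acceleration
        trajectory.append((position, velocity))
    return trajectory


def harmonic_force(x, k=1.0):
    """Restoring force of a spring with stiffness k."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


trajectory = velocity_verlet(1.0, 0.0, harmonic_force, mass=1.0, dt=0.01, steps=1000)
energies = [total_energy(x, v) for x, v in trajectory]
```

The total energy should stay close to its initial value of 0.5 over the run, which is a common sanity check for a molecular dynamics integrator.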

It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.

Please cite: Berk Hess, Carsten Kutzner, David van der Spoel and Erik Lindahl: GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. (eprint) J. Chem. Theory Comput. 4(3):435-447 (2008)
Gubbins
phylogenetic analysis of genome sequences
Versions of package gubbins
Release   Version  Architectures
sid       1.4.3-1  kfreebsd-amd64
stretch   2.0.0-1  amd64,i386
sid       2.0.0-1  amd64,i386
upstream  2.1.0
Popcon: 2 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

Gubbins supports rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences.

Gubbins (Genealogies Unbiased By recomBinations In Nucleotide Sequences) is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistic models of short-term bacterial evolution, and can be run in only a few hours on alignments of hundreds of bacterial genome sequences.
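The notion of "loci containing elevated densities of base substitutions" can be illustrated with a simple sliding-window count. This sketch is not the Gubbins algorithm, which works per branch of a phylogeny with its own statistical test; the fixed windows, the flat fold threshold and the function names are assumptions.

```python
def substitution_density(positions, genome_length, window, step):
    """Count substitutions per window; returns [(start, end, count)]."""
    results = []
    for start in range(0, max(genome_length - window, 0) + 1, step):
        end = start + window
        count = sum(1 for p in positions if start <= p < end)
        results.append((start, end, count))
    return results


def elevated_windows(positions, genome_length, window, step, fold=3.0):
    """Windows whose count exceeds `fold` times the genome-wide expectation."""
    expected = len(positions) * window / genome_length
    return [
        w for w in substitution_density(positions, genome_length, window, step)
        if w[2] > fold * expected
    ]


# Five clustered substitutions (a candidate recombination) and one isolated.
positions = [100, 105, 110, 115, 120, 900]
hits = elevated_windows(positions, genome_length=1000, window=100, step=100)
```

In this toy input only the window covering positions 100 to 199 stands out; substitutions outside such windows are the ones that would inform the phylogeny.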

Please cite: Nicholas J. Croucher, Andrew J. Page, Thomas R. Connor, Aidan J. Delaney, Jacqueline A. Keane, Stephen D. Bentley, Julian Parkhill and Simon R. Harris: Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. (PubMed,eprint) Nucleic Acids Research 43(3):e15 (2014)
Gwama
Genome-Wide Association Meta Analysis
Versions of package gwama
Release  Version     Architectures
stretch  2.1+dfsg-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.1+dfsg-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (46 upd.)*
Versions and Archs
License: DFSG free
Git

GWAMA (Genome-Wide Association Meta Analysis) software performs meta-analysis of the results of GWA studies of binary or quantitative phenotypes. Fixed- and random-effect meta-analyses are performed for both directly genotyped and imputed SNPs using estimates of the allelic odds ratio and 95% confidence interval for binary traits, and estimates of the allelic effect size and standard error for quantitative phenotypes. GWAMA can be used for analysing the results of all different genetic models (multiplicative, additive, dominant, recessive). The software incorporates error trapping facilities to identify strand alignment errors and allele flipping, and performs tests of heterogeneity of effects between studies.
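The fixed-effect part of such a meta-analysis can be sketched with the textbook inverse-variance formulas. This is not GWAMA's code: the function names are hypothetical, the random-effect model is not shown, and the conversion assumes the reported interval is a symmetric 95% confidence interval on the log odds scale.

```python
import math


def log_odds_and_se(odds_ratio, ci_lower, ci_upper, z=1.959964):
    """Recover the log odds ratio and its standard error from OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z)
    return beta, se


def fixed_effect_meta(estimates):
    """Inverse-variance weighted pooling of per-study (beta, se) pairs.

    Returns (pooled_beta, pooled_se, cochran_q), where Cochran's Q is
    a basic heterogeneity statistic.
    """
    weights = [1.0 / (se * se) for _, se in estimates]
    total = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    cochran_q = sum(
        w * (b - pooled_beta) ** 2 for w, (b, _) in zip(weights, estimates)
    )
    return pooled_beta, pooled_se, cochran_q


# Two hypothetical studies of one SNP with equal precision.
pooled_beta, pooled_se, cochran_q = fixed_effect_meta([(0.2, 0.1), (0.4, 0.1)])
```

For a binary trait, each study's odds ratio and confidence interval would first be passed through `log_odds_and_se`; for a quantitative trait the effect size and standard error can be used directly.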

Please cite: Reedik Mägi and Andrew P. Morris: GWAMA: software for genome-wide association meta-analysis. (eprint) BMC Bioinformatics 11(May):288 (2010)
Harvest-tools
archiving and postprocessing for reference-compressed genomic multi-alignments
Versions of package harvest-tools
Release  Version  Architectures
stretch  1.2-2    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.2-2    amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

HarvestTools is a utility for creating and interfacing with Gingr files, which are efficient archives that the Harvest Suite uses to store reference-compressed multi-alignments, phylogenetic trees, filtered variants and annotations. Though designed for use with Parsnp and Gingr, HarvestTools can also be used for generic conversion between standard bioinformatics file formats.

Please cite: Todd J. Treangen, Brian D. Ondov, Sergey Koren and Adam M. Phillippy: Rapid Core-Genome Alignment and Visualization for Thousands of Intraspecific Microbial Genomes. (PubMed,eprint) bioRxiv 15(11):524 (2014)
Hhsuite
sensitive protein sequence searching based on HMM-HMM alignment
Versions of package hhsuite
Release  Version   Architectures
wheezy   2.0.15-1  amd64
jessie   2.0.16-5  amd64
stretch  2.0.16-6  amd64
sid      2.0.16-6  amd64
Popcon: 7 users (43 upd.)*
Versions and Archs
License: DFSG free
Svn

HH-suite is an open-source software package for sensitive protein sequence searching based on the pairwise alignment of hidden Markov models (HMMs).

This package contains HHsearch and HHblits among other programs and utilities.

HHsearch takes as input a multiple sequence alignment (MSA) or profile HMM and searches a database of HMMs (e.g. PDB, Pfam, or InterPro) for homologous proteins. HHsearch is often used for protein structure prediction to detect homologous templates and to build highly accurate query-template pairwise alignments for homology modeling.

HHblits can build high-quality MSAs starting from single sequences or from MSAs. It transforms these into a query HMM and, using an iterative search strategy, adds significantly similar sequences from the previous search to the updated query HMM for the next search iteration. Compared to PSI-BLAST, HHblits is faster, up to twice as sensitive and produces more accurate alignments.

Please cite: Michael Remmert, Andreas Biegert, Andreas Hauser and Johannes Söding: HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. (PubMed) Nat. Methods 9(2):173-175 (2011)
Hmmer
profile hidden Markov models for protein sequence analysis
Versions of package hmmer
Release  Version        Architectures
squeeze  2.3.2-5        amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   3.0-4          amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   3.1b1-3        amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.1b2-1        mips64el
sid      3.1b2-1        mips64el
stretch  3.1b2+dfsg-2   amd64,i386
sid      3.1b2+dfsg-2   amd64,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386
Debtags of package hmmer:
biology: format:aln, peptidic
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: searching
works-with: db
works-with-format: plaintext
Popcon: 28 users (98 upd.)*
Versions and Archs
License: DFSG free
Git

HMMER is an implementation of profile hidden Markov model methods for sensitive searches of biological sequence databases using multiple sequence alignments as queries.

Given a multiple sequence alignment as input, HMMER builds a statistical model called a "hidden Markov model" which can then be used as a query into a sequence database to find (and/or align) additional homologues of the sequence family.

Please cite: S. R. Eddy: Profile hidden Markov models. (PubMed,eprint) Bioinformatics 14(9):755-763 (1998)
Hmmer2
profile hidden Markov models for protein sequence analysis
Versions of package hmmer2
Release  Version   Architectures
jessie   2.3.2-8   amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.3.2-11  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.3.2-11  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 11 users (60 upd.)*
Versions and Archs
License: DFSG free
Git

HMMER is an implementation of profile hidden Markov model methods for sensitive searches of biological sequence databases using multiple sequence alignments as queries.

Given a multiple sequence alignment as input, HMMER builds a statistical model called a "hidden Markov model" which can then be used as a query into a sequence database to find (and/or align) additional homologues of the sequence family.

Please cite: S. R. Eddy: Profile hidden Markov models. Bioinformatics 14(9):755-763 (1998)
Remark of Debian Med team: This older version of HMMER is used in some applications

Although Debian has shipped HMMER 3 for some time, some users of HMMER 2 still want this old version to be available, so the package has been reintroduced.

Hyphy-mpi
Hypothesis testing using Phylogenies (MPI version)
Versions of package hyphy-mpi
Release  Version       Architectures
stretch  2.2.6+dfsg-4  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.2.6+dfsg-4  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (27 upd.)*
Versions and Archs
License: DFSG free
Git

HyPhy is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning. It features a complete graphical user interface (GUI) and a rich scripting language for limitless customization of analyses. Additionally, HyPhy features support for parallel computing environments (via message passing interface) and it can be compiled as a shared library and called from other programming environments such as Python or R. Continued development of HyPhy is currently supported in part by an NIGMS R01 award 1R01GM093939.

This package provides an executable using MPI to do multiprocessing.

Please cite: Sergei L. Kosakovsky Pond, Simon D. W. Frost and Spencer V. Muse: HyPhy: hypothesis testing using phylogenies. (PubMed,eprint) Bioinformatics 21(5):676-679 (2005)
Hyphygui
Hypothesis testing using Phylogenies (GTK+ gui)
Versions of package hyphygui
Release  Version       Architectures
stretch  2.2.6+dfsg-4  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.2.6+dfsg-4  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (27 upd.)*
Versions and Archs
License: DFSG free
Git

HyPhy is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning. It features a complete graphical user interface (GUI) and a rich scripting language for limitless customization of analyses. Additionally, HyPhy features support for parallel computing environments (via message passing interface) and it can be compiled as a shared library and called from other programming environments such as Python or R. Continued development of HyPhy is currently supported in part by an NIGMS R01 award 1R01GM093939.

This package contains the GTK+ gui.

Please cite: Sergei L. Kosakovsky Pond, Simon D. W. Frost and Spencer V. Muse: HyPhy: hypothesis testing using phylogenies. (PubMed,eprint) Bioinformatics 21(5):676-679 (2005)
Idba
iterative De Bruijn Graph De Novo short read assembler for transcriptome
Versions of package idba
Release   Version  Architectures
jessie    1.1.2-1  amd64,arm64,armel,armhf,i386,mips,mipsel,ppc64el,s390x
stretch   1.1.2-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid       1.1.2-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,ppc64el,s390x
upstream  1.1.3
Popcon: 5 users (64 upd.)*
Newer upstream!
License: DFSG free
Git

IDBA-Tran is an iterative De Bruijn Graph De Novo short read assembler for transcriptomes. It is a purely de novo assembler based only on RNA sequencing reads. IDBA-Tran uses local assembly to reconstruct missing k-mers in low-expressed transcripts and then employs a progressive cutoff on contigs to separate the graph into components. Each component corresponds to one gene in most cases and contains only a few transcripts. A heuristic algorithm based on paired-end reads is then used to find the isoforms.
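The data structure behind this description can be sketched briefly: a de Bruijn graph whose nodes are (k-1)-mers and whose edges are the k-mers observed in the reads. This is a generic illustration, not IDBA-Tran's implementation; the iteration over increasing k, the local assembly and the cutoff steps are not shown, and the function names are hypothetical.

```python
from collections import defaultdict


def kmers(read, k):
    """All overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to its successor (k-1)-mers with edge multiplicity.

    Every k-mer contributes one edge from its prefix to its suffix, so
    the multiplicity is a rough proxy for coverage.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]][kmer[1:]] += 1
    # Convert to plain dicts so lookups of missing nodes raise KeyError.
    return {node: dict(successors) for node, successors in graph.items()}


graph = build_de_bruijn(["ACGTAC", "CGTACG"], k=3)
```

A k-mer from a low-expressed transcript that no read happens to cover leaves a missing edge in this graph, which is the gap the local assembly step is meant to fill.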

Please cite: Yu Peng, Henry C. M. Leung, S. M. Yiu and Francis Y. L. Chin: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. (PubMed,eprint) Bioinformatics 28(11):1420-1428 (2012)
Indelible
powerful and flexible simulator of biological evolution
Versions of package indelible
Release  Version  Architectures
stretch  1.03-2   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.03-2   amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

INDELible is a new, portable, and flexible application for biological sequence simulation that combines many features in the same place for the first time. Using a length-dependent model of indel formation it can simulate evolution of multi-partitioned nucleotide, amino-acid, or codon data sets through the processes of insertion, deletion, and substitution in continuous time.

Nucleotide simulations may use the general unrestricted model or the general time reversible model and its derivatives, and amino-acid simulations can be conducted using fifteen different empirical rate matrices. Substitution rate heterogeneity can be modelled via the continuous and discrete gamma distributions, with or without a proportion of invariant sites. INDELible can also simulate under non-homogenous and non-stationary conditions where evolutionary models are permitted to change across a phylogeny.
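Continuous-time substitution along a branch can be illustrated with the Jukes-Cantor model, the simplest special case of the general time reversible family. The sketch below is not INDELible code: it handles substitutions only (no indels, rate heterogeneity or codon models), and the function names and the seeded random generator are assumptions.

```python
import math
import random


def jc69_probabilities(branch_length):
    """Return (p_same, p_each_other) after `branch_length` expected
    substitutions per site under the Jukes-Cantor model."""
    decay = math.exp(-4.0 * branch_length / 3.0)
    return 0.25 + 0.75 * decay, 0.25 - 0.25 * decay


def evolve(sequence, branch_length, seed=0):
    """Evolve each site independently along one branch."""
    rng = random.Random(seed)
    p_same, _ = jc69_probabilities(branch_length)
    evolved = []
    for base in sequence:
        if rng.random() < p_same:
            evolved.append(base)
        else:
            # The three other bases are equally likely under this model.
            evolved.append(rng.choice([b for b in "ACGT" if b != base]))
    return "".join(evolved)


ancestor = "ACGT" * 5
unchanged = evolve(ancestor, branch_length=0.0)
descendant = evolve(ancestor, branch_length=0.5)
```

As the branch length grows, the probability of observing the ancestral base approaches 1/4, meaning the site has been randomized.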

Unique among indel simulation programs, INDELible offers the ability to simulate using codon models that exhibit nonsynonymous/synonymous rate ratio heterogeneity among sites and/or lineages.

Please cite: William Fletcher and Ziheng Yang: INDELible: A Flexible Simulator of Biological Sequence Evolution. (eprint) Molecular Biology and Evolution 26(8):1879-1888 (2009)
Infernal
inference of RNA secondary structural alignments
Versions of package infernal
Release  Version  Architectures
squeeze  1.0.2-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   1.0.2-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.1.1-2  amd64,i386
stretch  1.1.1-5  amd64,i386
sid      1.1.1-5  amd64,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386
Debtags of package infernal:
biology: nuceleic-acids
field: biology, biology:bioinformatics
interface: commandline
role: program
use: analysing
Popcon: 17 users (68 upd.)*
Versions and Archs
License: DFSG free
Git

Infernal ("INFERence of RNA ALignment") searches DNA sequence databases for RNA structure and sequence similarities. It provides an implementation of a special variant of profile stochastic context-free grammars called covariance models (CMs). A CM is like a sequence profile, but it scores a combination of sequence consensus and RNA secondary structure consensus, so in many cases, it is more capable of identifying RNA homologs that conserve their secondary structure more than their primary sequence.

The tool is an integral component of the Rfam database.

Please cite: Eric P. Nawrocki, Diana L. Kolbe and Sean R. Eddy: Infernal 1.0: inference of RNA alignments. (PubMed,eprint) Bioinformatics 25(10):1335-1337 (2009)
Ipig
integrating PSMs into genome browser visualisations
Versions of package ipig
Release  Version   Architectures
jessie   0.0.r5-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.0.r5-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.0.r5-2  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 4 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

iPiG targets the integration of peptide spectrum matches (PSMs) from mass spectrometry (MS) peptide identifications into genomic visualisations provided by genome browsers such as the UCSC genome browser (http://genome.ucsc.edu/).

iPiG takes PSMs from the MS standard format mzIdentML (*.mzid) or in text format and provides results in genome track formats (BED and GFF3 files), which can be easily imported into genome browsers.

Please cite: Mathias Kuhring and Bernhard Y. Renard: iPiG: Integrating Peptide Spectrum Matches into Genome Browser Visualizations. (PubMed,eprint) PLoS ONE 7(12):e50246 (2012)
Iqtree
efficient phylogenetic software by maximum likelihood
Versions of package iqtree
Release   Version       Architectures
stretch   1.4.2+dfsg-1  amd64,i386
sid       1.4.2+dfsg-1  amd64,i386,kfreebsd-amd64,kfreebsd-i386
upstream  1.4.3
Popcon: 3 users (4 upd.)*
Newer upstream!
License: DFSG free
Git

IQ-TREE is very efficient maximum likelihood phylogenetic software with the following key features, among others:

  • A novel fast and effective stochastic algorithm to estimate maximum likelihood trees. IQ-TREE outperforms both RAxML and PhyML in terms of likelihood while requiring a similar amount of computing time (see Nguyen et al., 2015).
  • An ultrafast bootstrap approximation to assess branch supports (see Minh et al., 2013).
  • A wide range of substitution models for binary, DNA, protein, codon, and morphological alignments.
  • Ultrafast model selection for all data types, 10 to 100 times faster than jModelTest and ProtTest.
  • Finding the best partition scheme, like PartitionFinder.
  • Partitioned models with mixed data types for phylogenomic (multi-gene) alignments, allowing for separate, proportional, or joint branch lengths among genes.
  • Support for the phylogenetic likelihood library (PLL) (see Flouri et al., 2014).
Please cite: Lam Tung Nguyen, Heiko A. Schmidt, Arndt von Haeseler and Bui Quang Minh: IQ-TREE: A fast and effective stochastic algorithm for estimating maximum likelihood phylogenies. (PubMed,eprint) Mol. Biol. Evol. 32(1):268-274 (2015)
Iva
iterative virus sequence assembler
Versions of package iva
Release  Version     Architectures
stretch  1.0.6+ds-1  amd64,arm64,ppc64el
sid      1.0.6+ds-1  amd64,arm64,kfreebsd-amd64,mips64el,ppc64el
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Vcs

IVA is a de novo assembler designed to assemble virus genomes that have no repeat sequences, using Illumina read pairs sequenced from mixed populations at extremely high depth.

IVA's main algorithm works by iteratively extending contigs using aligned read pairs. Its input can be just read pairs, or additionally you can provide an existing set of contigs to be extended. Alternatively, it can take reads together with a reference sequence.

Please cite: M. Hunt, A. Gall, S. H. Ong, J. Brener, B. Ferns, P. Goulder, E. Nastouli, J. A. Keane, P. Kellam and T. D. Otto: IVA: accurate de novo assembly of RNA virus genomes. (PubMed) Bioinformatics 31(14):2374-2376 (2015)
Jaligner
Smith-Waterman algorithm with Gotoh's improvement
Versions of package jaligner
Release  Version     Architectures
stretch  1.0+dfsg-4  all
sid      1.0+dfsg-4  all
Popcon: 2 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

JAligner is an open source Java implementation of the Smith-Waterman algorithm with Gotoh's improvement for biological local pairwise sequence alignment with the affine gap penalty model.
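
As a rough illustration of the technique named above (this is not JAligner's own Java code), the following Python sketch computes a local alignment score with affine gap penalties using Gotoh's three-matrix recurrence. The function name and the scoring values (match, mismatch, gap_open, gap_extend) are assumptions chosen for the example only.

```python
def smith_waterman_gotoh(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Return the best local alignment score of a and b under affine gaps.

    A gap of length n costs gap_open + (n - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    # h: best score of an alignment ending at (i, j)
    # e: best score ending with a horizontal gap, f: ending with a vertical gap
    h_prev = [0] * cols
    f_prev = [neg_inf] * cols
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf
        for j in range(1, cols):
            # Either open a new gap or extend the one already running.
            e = max(h_cur[j - 1] + gap_open, e + gap_extend)
            f_cur[j] = max(h_prev[j] + gap_open, f_prev[j] + gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets a local alignment restart anywhere.
            h_cur[j] = max(0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


score = smith_waterman_gotoh("ACGTACGT", "ACGTTTACGT")
```

Only the score is computed here; recovering the alignment itself would additionally require a traceback through the three matrices.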

Jalview
multiple alignment editor
Versions of package jalview
Release          Version            Architectures
wheezy           2.7.dfsg-2         all
wheezy-security  2.7.dfsg-2+deb7u1  all
jessie           2.7.dfsg-4         all
stretch          2.7.dfsg-5         all
sid              2.7.dfsg-5         all
upstream         2.9
Popcon: 9 users (52 upd.)*
Newer upstream!
License: DFSG free
Git

JalView is a Java alignment editor that can work with sequence alignments produced by programs implementing alignment algorithms such as clustalw, kalign and t-coffee.

It has lots of features, is actively developed, and compares favorably with BioEdit, while being free as in free speech!

Jellyfish
count k-mers in DNA sequences
Versions of package jellyfish
Release   Version  Architectures
wheezy    1.1.5-1  amd64,kfreebsd-amd64
jessie    2.1.4-1  amd64
sid       2.1.4-1  kfreebsd-amd64
stretch   2.2.5-1  amd64
sid       2.2.5-1  amd64
upstream  2.2.6
Popcon: 9 users (45 upd.)*
Newer upstream!
License: DFSG free
Git

JELLYFISH is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. JELLYFISH can count k-mers using an order of magnitude less memory and an order of magnitude faster than other k-mer counting packages by using an efficient encoding of a hash table and by exploiting the "compare-and-swap" CPU instruction to increase parallelism.

JELLYFISH is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format using the "jellyfish dump" command.
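
To illustrate what k-mer counting means, here is a minimal Python sketch. It uses an ordinary in-memory counter and does not reproduce how JELLYFISH itself works (neither the compact hash table encoding nor the compare-and-swap parallelism); the function name count_kmers is made up for this example.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every length-k substring (k-mer) of a DNA sequence."""
    sequence = sequence.upper()
    # Slide a window of width k over the sequence, one base at a time.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


counts = count_kmers("ACGTACGT", 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```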

Please cite: Guillaume Marcais and Carl Kingsford: A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27(6):764-770 (2011)
Jmodeltest
HPC selection of models of nucleotide substitution
Versions of package jmodeltest
Release  Version        Architectures
stretch  2.1.10+dfsg-2  all
sid      2.1.10+dfsg-2  all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

jModelTest is a tool to carry out statistical selection of best-fit models of nucleotide substitution. It implements five different model selection strategies: hierarchical and dynamical likelihood ratio tests (hLRT and dLRT), Akaike and Bayesian information criteria (AIC and BIC), and a decision theory method (DT). It also provides estimates of model selection uncertainty, parameter importances and model-averaged parameter estimates, including model-averaged tree topologies. jModelTest 2 includes High Performance Computing (HPC) capabilities and additional features like new strategies for tree optimization, model-averaged phylogenetic trees (both topology and branch length), heuristic filtering and automatic logging of user activity.
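
Two of the criteria mentioned above, AIC and BIC, can be sketched in a few lines of Python using their standard textbook definitions. This is not jModelTest code: the log-likelihoods, parameter counts and sample size below are invented purely for illustration, and how parameters and sample size are counted in practice is left aside.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood, num_params, sample_size):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return num_params * math.log(sample_size) - 2 * log_likelihood


# Hypothetical fitted models: name -> (log-likelihood, free parameters).
models = {"JC": (-1050.0, 1), "HKY": (-1020.0, 5), "GTR": (-1018.5, 9)}
sample_size = 500  # hypothetical number of alignment sites

best_by_aic = min(models, key=lambda m: aic(*models[m]))
best_by_bic = min(models, key=lambda m: bic(*models[m], sample_size))
print(best_by_aic, best_by_bic)
```

Both criteria reward a higher likelihood and penalise extra parameters; BIC's penalty grows with the sample size, so it tends to favour simpler models than AIC.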

Please cite: Diego Darriba, Guillermo L Taboada, Ramón Doallo and David Posada: jModelTest 2: more models, new heuristics and parallel computing. (PubMed) Nature Methods 9(8):772 (2012)
Jmol
Molecular Viewer
Versions of package jmol
Release   Version          Architectures
wheezy    12.2.32+dfsg2-1  all
jessie    12.2.32+dfsg2-1  all
stretch   12.2.32+dfsg2-1  all
sid       12.2.32+dfsg2-1  all
upstream  14.0.13
Debtags of package jmol:
field: chemistry
role: program
scope: utility
use: viewing
Popcon: 47 users (51 upd.)*
Newer upstream!
License: DFSG free
Svn

Jmol is a Java molecular viewer for three-dimensional chemical structures. Features include reading a variety of file types and output from quantum chemistry programs, and animation of multi-frame files and computed normal modes from quantum programs. It includes features for chemicals, crystals, materials and biomolecules. Jmol might be useful for students, educators, and researchers in chemistry and biochemistry.

File formats read by Jmol include PDB, XYZ, CIF, CML, MDL Molfile, Gaussian, GAMESS, MOPAC, ABINIT, ACES-II, Dalton and VASP.

Kalign
Global and progressive multiple sequence alignment
Versions of package kalign
Release  Version          Architectures
wheezy   2.03+20110620-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.03+20110620-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.03+20110620-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.03+20110620-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
squeeze  2.04-2           amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
Debtags of package kalign:
biology: format:aln, nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: comparing
works-with-format: plaintext
Popcon: 18 users (53 upd.)*
Versions and Archs
License: DFSG free
Git

Kalign is a command line tool to perform multiple alignment of biological sequences. It employs the Muth-Manber string-matching algorithm to improve both the accuracy and speed of the alignment. It uses a global, progressive alignment approach, enriched by employing an approximate string-matching algorithm to calculate sequence distances and by incorporating local matches into the otherwise global alignment.

Please cite: Timo Lassmann, Oliver Frings and Erik L. L. Sonnhammer: Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features. (PubMed,eprint) Nucl. Acids Res. 37(3):858-865 (2009)
Khmer
in-memory DNA sequence kmer counting, filtering & graph traversal
Versions of package khmer
Release  Version     Architectures
stretch  2.0+dfsg-7  amd64,arm64,mips64el,ppc64el
sid      2.0+dfsg-7  amd64,arm64,mips64el,ppc64el
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

khmer is a library and suite of command line tools for working with DNA sequence. It is primarily aimed at short-read sequencing data such as that produced by the Illumina platform. khmer takes a k-mer-centric approach to sequence analysis, hence the name.

Please cite: Michael R. Crusoe, Hussien F. Alameldin, Sherine Awad, Elmar Bucher, Adam Caldwell, Reed Cartwright, Amanda Charbonneau, Bede Constantinides, Greg Edvenson, Scott Fay, Jacob Fenton, Thomas Fenzl, Jordan Fish, Leonor Garcia-Gutierrez, Phillip Garland, Jonathan Gluck, Iván González, Sarah Guermond, Jiarong Guo, Aditi Gupta, Joshua R. Herr, Adina Howe, Alex Hyer, Andreas Härpfer, Luiz Irber, Rhys Kidd, David Lin, Justin Lippi, Tamer Mansour, Pamela McA'Nulty, Eric McDonald, Jessica Mizzi, Kevin D. Murray, Joshua R. Nahum, Kaben Nanlohy, Alexander Johan Nederbragt, Humberto Ortiz-Zuazaga, Jeramia Ory, Jason Pell, Charles Pepe-Ranney, Zachary N Russ, Erich Schwarz, Camille Scott, Josiah Seaman, Scott Sievert, Jared Simpson, Connor T. Skennerton, James Spencer, Ramakrishnan Srinivasan, Daniel Standage, James A. Stapleton, Joe Stein, Susan R Steinman, Benjamin Taylor, Will Trimble, Heather L. Wiencko, Michael Wright, Brian Wyss, Qingpeng Zhang, en zyme and C. Titus Brown: The khmer software package: enabling efficient sequence analysis. (2015)
Kineticstools
detection of DNA modifications
Versions of package kineticstools
Release  Version       Architectures
stretch  0.5.2+dfsg-2  all
sid      0.5.2+dfsg-2  all
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Tools for detecting DNA modifications from single molecule, real-time (SMRT®) sequencing data. This tool implements the P_ModificationDetection module in SMRT® Portal, used by the RS_Modification_Detection and RS_Modifications_and_Motif_Detection protocol. Researchers interested in understanding or extending the modification detection algorithms can use these tools as a starting point.

This package is part of the SMRTAnalysis suite.

King-probe
Evaluate and visualize protein interatomic packing
Versions of package king-probe
Release  Version        Architectures
stretch  2.13.110909-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.13.110909-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

king-probe is a program that allows one to evaluate atomic packing, either within or between molecules. It generates "contact dots" where atoms are in close contact.

The program king-probe generates "contact dots" at points on the van der Waals surface of atoms which are in close proximity to other atoms. It reads atomic coordinates from Protein Data Bank (PDB) format files and writes color-coded dot lists (spikes where atoms clash) for inclusion in a kinemage.

Kissplice
Detection of various kinds of polymorphisms in RNA-seq data
Versions of package kissplice
Release   Version   Architectures
sid       2.1.0-1   hurd-i386
jessie    2.2.1-3   amd64,arm64,ppc64el
stretch   2.3.1-1   amd64,arm64,mips64el,ppc64el
sid       2.3.1-1   amd64,arm64,kfreebsd-amd64,mips64el,ppc64el
upstream  2.4.0-p1
Debtags of package kissplice:
biology: nuceleic-acids
field: biology, biology:bioinformatics
interface: commandline
role: program
use: analysing
works-with: biological-sequence
Popcon: 5 users (43 upd.)*
Newer upstream!
License: DFSG free
Svn

KisSplice is a piece of software that enables the analysis of RNA-seq data with or without a reference genome. It is an exact local transcriptome assembler that allows one to identify SNPs, indels and alternative splicing events. It can deal with an arbitrary number of biological conditions, and will quantify each variant in each condition. It has been tested on Illumina datasets of up to 1G reads. Its memory consumption is around 5Gb for 100M reads.

Please cite: Gustavo AT Sacomoto, Janice Kielbassa, Rayan Chikhi, Raluca Uricaru, Pavlos Antoniou, Marie-France Sagot, Pierre Peterlongo and Vincent Lacroix: KISSPLICE: de-novo calling alternative splicing events from RNA-seq data. (PubMed,eprint) BMC Bioinformatics 13(Suppl 6):S5 (2012)
Kmc
count kmers in genomic sequences
Versions of package kmc
Release  Version     Architectures
stretch  2.3+dfsg-5  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.3+dfsg-5  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The kmc software is designed for counting k-mers (sequences of consecutive k symbols) in a set of reads. K-mer counting is important for many bioinformatics applications, e.g. developing de Bruijn graph assemblers.

Building de Bruijn graphs is a commonly used approach for genome assembly with data from second-generation sequencing. Unfortunately, sequencing errors (frequent in practice) result in huge memory requirements for de Bruijn graphs, as well as long build time. One of the popular approaches to handle this problem is filtering the input reads in such a way that unique k-mers (very likely obtained as a result of an error) are discarded.

Thus, KMC scans the raw reads and produces a compact representation of all non-unique k-mers accompanied by the number of their occurrences. The algorithm implemented in KMC makes use mostly of disk space rather than RAM, which allows one to use KMC even on rather typical personal computers. When run on high-end servers (which KMC's competitors require) it outperforms them in both memory requirements and speed of computation. The disk space necessary for computation is on the order of the size of the input data (usually it is smaller).
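
The filtering idea described above (discard k-mers seen only once, since they are likely sequencing errors) can be sketched as follows. This is an in-memory toy in Python, whereas KMC itself is disk-based; the function name nonunique_kmers and the example reads are assumptions made for illustration.

```python
from collections import Counter


def nonunique_kmers(reads, k):
    """Count k-mers over all reads and keep only those seen more than once."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    # Singletons are treated as probable sequencing errors and dropped.
    return {kmer: n for kmer, n in counts.items() if n > 1}


# Two identical reads plus one read carrying a single-base difference.
reads = ["ACGTAC", "ACGTAC", "ACGAAC"]
kept = nonunique_kmers(reads, 4)
print(kept)
```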

Please cite: S. Deorowicz, M. Kokot, Sz. Grabowski and A. Debudaj-Grabysz: KMC 2: Fast and resource-frugal k-mer counting. (PubMed) Bioinformatics 31(10):1569-1576 (2015)
Kmer
suite of tools for DNA sequence analysis
Versions of package kmer
Release  Version             Architectures
stretch  0~20150903+r2013-1  all
sid      0~20150903+r2013-1  all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The kmer package is a suite of tools for DNA sequence analysis. It provides tools for searching (ESTs, mRNAs, sequencing reads); aligning (ESTs, mRNAs, whole genomes); and a variety of analyses based on kmers.

This is a metapackage depending on the executable components of the kmer suite.

Kraken
assigning taxonomic labels to short DNA sequences
Versions of package kraken
Release  Version        Architectures
stretch  0.10.5~beta-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.10.5~beta-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs. Kraken aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm.

In its fastest mode of operation, for a simulated metagenome of 100 bp reads, Kraken processed over 4 million reads per minute on a single core, over 900 times faster than Megablast and over 11 times faster than the abundance estimation program MetaPhlAn. Kraken's accuracy is comparable with Megablast, with slightly lower sensitivity and very high precision.
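
The general idea of classifying a read by exact k-mer lookups can be sketched as below. This is a deliberately simplified majority vote in Python and should not be read as Kraken's actual classification algorithm, which is not described in detail here; the lookup table, taxon labels and function name are all hypothetical.

```python
from collections import Counter


def classify_read(read, kmer_to_taxon, k):
    """Assign a read to the taxon hit by most of its k-mers, or None."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        # Exact lookup: a k-mer either is in the table or is ignored.
        taxon = kmer_to_taxon.get(read[i:i + k])
        if taxon is not None:
            votes[taxon] += 1
    if not votes:
        return None  # no k-mer of the read is known: unclassified
    return votes.most_common(1)[0][0]


# Hypothetical pre-built table mapping reference k-mers to taxon labels.
kmer_to_taxon = {"ACGT": "taxon_A", "CGTA": "taxon_A", "TTTT": "taxon_B"}
label = classify_read("ACGTAGG", kmer_to_taxon, 4)
print(label)
```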

Please cite: Derrick E Wood and Steven L Salzberg: Kraken: ultrafast metagenomic sequence classification using exact alignments. (PubMed,eprint) Genome Biol. 15(3):R46 (2014)
Last-align
genome-scale comparison of biological sequences
Versions of package last-align
Release   Version  Architectures
squeeze   128-1    amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    199-1    amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    490-1    amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   746-1    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       746-1    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  749
Debtags of package last-align:
biology: nuceleic-acids
field: biology, biology:bioinformatics
role: program
Popcon: 7 users (61 upd.)*
Newer upstream!
License: DFSG free
Git

LAST is software for comparing and aligning sequences, typically DNA or protein sequences. LAST is similar to BLAST, but it copes better with very large amounts of sequence data. Here are two things LAST is good at:

  • Comparing large (e.g. mammalian) genomes.
  • Mapping lots of sequence tags onto a genome.

The main technical innovation is that LAST finds initial matches based on their multiplicity, instead of using a fixed size (e.g. BLAST uses 10-mers). This allows one to map tags to genomes without repeat-masking, without becoming overwhelmed by repetitive hits. To find these variable-sized matches, it uses a suffix array (inspired by Vmatch). To achieve high sensitivity, it uses a discontiguous suffix array, analogous to spaced seeds.
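
The multiplicity-based seeding described above can be illustrated with a small Python sketch: a match starting at a given query position is lengthened until it occurs in the reference no more than a chosen number of times. The code uses a naive sorted list of suffixes rather than a real (discontiguous) suffix array, and all names and the max_multiplicity parameter are assumptions for this example, not LAST's interface.

```python
from bisect import bisect_left

SENTINEL = "\U0010ffff"  # sorts after every character used in the sequences


def build_suffixes(reference):
    """Naive stand-in for a suffix array: all suffixes, sorted."""
    return sorted(reference[i:] for i in range(len(reference)))


def count_occurrences(suffixes, pattern):
    """Number of suffixes that start with pattern (= its occurrences)."""
    lo = bisect_left(suffixes, pattern)
    hi = bisect_left(suffixes, pattern + SENTINEL)
    return hi - lo


def adaptive_seed(query, start, suffixes, max_multiplicity):
    """Shortest match at query[start:] occurring at most max_multiplicity times."""
    for end in range(start + 1, len(query) + 1):
        seed = query[start:end]
        n = count_occurrences(suffixes, seed)
        if n == 0:
            return None  # the match cannot be extended any further
        if n <= max_multiplicity:
            return seed  # rare enough to serve as a seed
    return None


suffixes = build_suffixes("ACGACGACGTTT")
seed = adaptive_seed("ACGT", 0, suffixes, 1)
print(seed)
```

In repetitive regions the seed grows longer until it becomes rare, which is why such a scheme does not get swamped by repeats the way fixed-length seeds can.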

Please cite: Martin C. Frith, Raymond Wan and Paul Horton: Incorporating sequence quality data into alignment improves DNA read mapping. (PubMed,eprint) Nucl. Acids Res. 38(7):e100 (2010)
Leaff
biological sequence library utilities and applications
Versions of package leaff
Release  Version             Architectures
stretch  0~20150903+r2013-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0~20150903+r2013-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 3 users (46 upd.)*
Versions and Archs
License: DFSG free
Git

LEAFF (Let's Extract Anything From Fasta) is a utility program for working with multi-fasta files. In addition to providing random access to the base level, it includes several analysis functions.

This package is part of the Kmer suite.

Librg-utils-perl
parsers and format conversion utilities used by (e.g.) profphd
Versions of package librg-utils-perl
Release  Version   Architectures
wheezy   1.0.43-1  all
jessie   1.0.43-2  all
stretch  1.0.43-4  all
sid      1.0.43-4  all
Debtags of package librg-utils-perl:
devel: lang:perl, library
Popcon: 6 users (61 upd.)*
Versions and Archs
License: DFSG free
Svn

This package contributes to the PredictProtein server for the automated structural annotation of protein sequences. It features a series of conversion tools like:

  • blast2saf.pl
  • blastpgp_to_saf.pl
  • conv_hssp2saf.pl
  • copf.pl
  • hssp_filter.pl
  • safFilterRed.pl

which are supported by the modules:

  • RG::Utils::Conv_hssp2saf
  • RG::Utils::Copf
  • RG::Utils::Hssp_filter
Logol-bin
Pattern matching tool using Logol language
Versions of package logol-bin
Release  Version  Architectures
jessie   1.7.0-2  amd64,arm64,armel,armhf,i386,powerpc,ppc64el
stretch  1.7.4-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.7.4-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package logol-bin:
uitoolkit: ncurses
Popcon: 0 users (48 upd.)*
Versions and Archs
License: DFSG free
Svn

This package contains the Prolog binaries used by Logol to parse the sequence and match the grammar.

Logol is a pattern matching tool using the Logol language. It uses a specific grammar to search for a pattern in small or large sequences (DNA, RNA, protein). The results describe in full how each match corresponds to the original grammar.

Loki
MCMC linkage analysis on general pedigrees
Versions of package loki
Release  Version    Architectures
squeeze  2.4.7.4-4  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   2.4.7.4-4  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.4.7.4-5  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.4.7.4-6  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.4.7.4-6  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package loki:
field: biology
interface: commandline
role: program
scope: utility
use: analysing
Popcon: 12 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

Performs Markov chain Monte Carlo multipoint linkage analysis on large, complex pedigrees. The current package supports analyses on quantitative traits only, although this restriction will be lifted in later versions. Joint estimation of QTL number, position and effects uses Reversible Jump MCMC. It is also possible to perform affected only IBD sharing analyses.

The package is enhanced by the following packages: loki-doc
Please cite: Simon C. Heath: Markov chain Monte Carlo segregation and linkage analysis for oligogenic models. (PubMed,eprint) American Journal of Human Genetics 61(3):748-60 (1997)
Ltrsift
postprocessing and classification of LTR retrotransposons
Versions of package ltrsift
Release  Version  Architectures
jessie   1.0.2-1  amd64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.0.2-7  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      1.0.2-7  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package ltrsift:
uitoolkit: gtk
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

LTRsift is a graphical desktop tool for semi-automatic postprocessing of de novo predicted LTR retrotransposon annotations, such as the ones generated by LTRharvest and LTRdigest. Its user-friendly interface displays LTR retrotransposon candidates, their putative families and their internal structure in a hierarchical fashion, allowing the user to "sift" through the sometimes large results of de novo prediction software. It also offers customizable filtering and classification functionality.

Macs
Model-based Analysis of ChIP-Seq on short reads sequencers
Versions of package macs
Release  Version           Architectures
jessie   2.0.9.1-1         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.1.1.20160309-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.1.1.20160309-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, is publicly available open source, and can be used for ChIP-Seq with or without control samples.
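
One way to picture the "dynamic Poisson" idea is sketched below in Python: the expected read count for a candidate region is taken as the largest of several background rate estimates, and the observed count is scored by a Poisson upper-tail probability. Taking the maximum over a genome-wide rate and a few local-window rates is an assumption of this sketch, as are the function names and the numbers; it is not the MACS source code.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), by summing the first k terms."""
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return 1.0 - cdf


def peak_p_value(observed, background_rate, local_rates):
    """Score a region against the most conservative (largest) expected rate."""
    lam = max([background_rate, *local_rates])
    return poisson_sf(observed, lam)


# Hypothetical region with 10 reads: genome-wide rate 2.0,
# local window estimates 1.0 and 5.0 reads.
p = peak_p_value(10, 2.0, [1.0, 5.0])
print(p)
```

A region sitting in a locally read-rich neighbourhood gets a larger expected rate and therefore a less significant p-value, which is how local biases are kept from producing false peaks in this sketch.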

Please cite: Yong Zhang, Tao Liu, Clifford A Meyer, Jérôme Eeckhoute, David S Johnson, Bradley E Bernstein, Chad Nussbaum, Richard M Myers, Myles Brown, Wei Li and X Shirley Liu: Model-based Analysis of ChIP-Seq (MACS). (PubMed) Genome Biol 9(9):R137 (2008)
Macsyfinder
detection of macromolecular systems in protein datasets
Versions of package macsyfinder
Release  Version  Architectures
stretch  1.0.2-2  all
sid      1.0.2-2  all
Popcon: 11 users (50 upd.)*
Versions and Archs
License: DFSG free
Svn

MacSyFinder is a program to model and detect macromolecular systems, genetic pathways, etc. in protein datasets. In prokaryotes, these systems often have evolutionarily conserved properties: they are made of conserved components, and are encoded in compact loci (conserved genetic architecture). The user models these systems with MacSyFinder to reflect these conserved features, and to allow their efficient detection.

Please cite: Sophie S. Abby, Bertrand Néron, Hervé Ménager, Marie Touchon and Eduardo P. C. Rocha: MacSyFinder: A Program to Mine Genomes for Molecular Systems with an Application to CRISPR-Cas System. (PubMed,eprint) PLOS ONE 9(10):e110726 (2014)
Maffilter
process genome alignment in the Multiple Alignment Format
Versions of package maffilter
Release  Version         Architectures
stretch  1.1.0-1+dfsg-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.1.0-1+dfsg-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

MafFilter applies a series of "filters" to a MAF file, in order to clean it, extract data and compute statistics while keeping track of the associated meta-data such as genome coordinates and quality scores.

  • It can process the alignment to remove low-quality / ambiguous / masked regions.
  • It can export data into a single or multiple alignment file in a format such as Fasta or Clustal.
  • It can read annotation data in GFF or GTF format, and extract the corresponding alignment.
  • It can perform sliding windows calculations.
  • It can reconstruct phylogeny/genealogy along the genome alignment.
  • It can compute population genetics statistics, such as site frequency spectrum, number of fixed/polymorphic sites, etc.
The package is enhanced by the following packages: maffilter-examples
Please cite: Julien Y Dutheil, Sylvain Gaillard and Eva H Stukenbrock: MafFilter: a highly flexible and extensible multiple genome alignment files processor. (PubMed,eprint) BMC Genomics 15:53 (2014)
Mafft
Multiple alignment program for amino acid or nucleotide sequences
Versions of package mafft
Release   Version  Architectures
squeeze   6.815-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy    6.864-1  amd64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    7.205-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch   7.294-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       7.294-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  7.299
Debtags of package mafft:
biology: format:aln, nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: comparing
works-with-format: plaintext
Popcon: 26 users (67 upd.)*
Newer upstream!
License: DFSG free
Svn

MAFFT is a multiple sequence alignment program which offers three accuracy-oriented methods:

  • L-INS-i (probably most accurate; recommended for <200 sequences; iterative refinement method incorporating local pairwise alignment information),
  • G-INS-i (suitable for sequences of similar lengths; recommended for <200 sequences; iterative refinement method incorporating global pairwise alignment information),
  • E-INS-i (suitable for sequences containing large unalignable regions; recommended for <200 sequences),

and five speed-oriented methods:

  • FFT-NS-i (iterative refinement method; two cycles only),
  • FFT-NS-i (iterative refinement method; max. 1000 iterations),
  • FFT-NS-2 (fast; progressive method),
  • FFT-NS-1 (very fast; recommended for >2000 sequences; progressive method with a rough guide tree),
  • NW-NS-PartTree-1 (recommended for ∼50,000 sequences; progressive method with the PartTree algorithm).
Please cite: Kazutaka Katoh and Hiroyuki Toh: Recent developments in the MAFFT multiple sequence alignment program. (PubMed) Brief Bioinform 9(4):286-298 (2008)
Mapsembler2
bioinformatics targeted assembly software
Versions of package mapsembler2
Release  Version       Architectures
jessie   2.1.6+dfsg-1  amd64,armel,armhf,i386,mips,mipsel,powerpc,s390x
sid      2.2.3+dfsg-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,ppc64el,s390x
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Svn

Mapsembler2 is a targeted assembly software. It takes as input a set of NGS raw reads (fasta or fastq, gzipped or not) and a set of input sequences (starters).

It first determines whether each starter is read-coherent, i.e. whether reads confirm the presence of each starter in the original sequence. Then, for each read-coherent starter, Mapsembler2 outputs its sequence neighborhood as a linear sequence or as a graph, depending on the user's choice.

Mapsembler2 may be used to (among other things):

  • Validate an assembled sequence (input as starter), e.g. from a de Bruijn graph assembly where read-coherence was not enforced.
  • Check whether a gene (input as starter) has a homolog in a set of reads.
  • Check whether a known enzyme is present in a metagenomic NGS read set.
  • Enrich unmappable reads by extending them, possibly making them mappable.
  • Check what happens at the extremities of a contig.
  • Remove contaminants or symbiont reads from a read set.
Maq
maps short fixed-length polymorphic DNA sequence reads to reference sequences
Versions of package maq
Release  Version  Architectures
squeeze  0.7.1-3  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   0.7.1-5  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   0.7.1-5  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.7.1-6  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.7.1-6  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package maq:
biology: nuceleic-acids
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing, searching
works-with-format: plaintext
Popcon: 15 users (52 upd.)*
Versions and Archs
License: DFSG free
Svn

Maq (short for Mapping and Assembly with Quality) builds mapping assemblies from short reads generated by next-generation sequencing machines. It was particularly designed for the Illumina-Solexa 1G Genetic Analyzer, and has preliminary functionality to handle ABI SOLiD data. Maq was previously known as mapass2.

Development of Maq stopped in 2008. Its successors are BWA and SAMtools.

Please cite: Heng Li, Jue Ruan and Richard Durbin: Mapping short DNA sequencing reads and calling variants using mapping quality scores. (PubMed,eprint) Genome Research 18(11):1851-1858 (2008)
Maqview
graphical read alignment viewer for short gene sequences
Versions of package maqview
Release  Version  Architectures
wheezy   0.2.5-4  amd64,armel,armhf,i386,ia64,mips,mipsel,powerpc,s390,s390x,sparc
jessie   0.2.5-6  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.2.5-7  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.2.5-7  amd64,arm64,armel,armhf,hurd-i386,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

Maqview is a graphical read alignment viewer. It is specifically designed for the Maq alignment file and allows you to see the mismatches, base qualities and mapping qualities. Maqview is nothing as fancy as Consed or GAP, but just a simple viewer for you to see what happens in a particular region.

In comparison to tgap-maq, the text-based read alignment viewer written by James Bonfield, Maqview is faster and takes up much less memory and disk space in indexing. This is possibly because tgap aims to be a general-purpose viewer, whereas Maqview makes full use of the fact that a Maq alignment file has already been sorted. Maqview is also efficient in viewing and provides a command-line tool to quickly retrieve any region in a Maq alignment file.

Please cite: Heng Li, Jue Ruan and Richard Durbin: Mapping short DNA sequencing reads and calling variants using mapping quality scores. (PubMed,eprint) Genome Research 18(11):1851-1858 (2008)
Mash
fast genome and metagenome distance estimation using MinHash
Versions of package mash
Release  Version  Architectures
stretch  1.1-3    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.1-3    amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Mash uses MinHash locality-sensitive hashing to reduce large biosequences to a representative sketch and rapidly estimate pairwise distances between genomes or metagenomes. Mash sketch databases effectively delineate known species boundaries, allow construction of approximate phylogenies, and can be searched in seconds using assembled genomes or raw sequencing runs from Illumina, Pacific Biosciences, and Oxford Nanopore. For metagenomics, Mash scales to thousands of samples and can replicate Human Microbiome Project and Global Ocean Survey results in a fraction of the time.
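
The sketching idea can be illustrated with a minimal Python sketch. It assumes a bottom-s MinHash sketch (keep the s smallest k-mer hashes) and the distance formula D = -ln(2j / (1 + j)) / k from the Mash publication cited below. The function names, the SHA-1 stand-in hash and the tiny default parameters are illustrative assumptions, and canonical k-mers are left out for brevity; this is not Mash's own code.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of k-mers of seq (forward strand only in this sketch)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Stand-in 64-bit hash taken from a SHA-1 digest (an assumption of this sketch)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(seq, k=5, size=50):
    """Bottom-s sketch: the `size` smallest distinct k-mer hashes, sorted."""
    return sorted({hash_kmer(km) for km in kmers(seq, k)})[:size]


def jaccard_estimate(sketch_a, sketch_b, size=50):
    """Estimate the Jaccard index from two bottom-s sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(jaccard, k):
    """Convert a Jaccard estimate into a distance; 1.0 when nothing is shared."""
    if jaccard == 0:
        return 1.0
    return -math.log(2 * jaccard / (1 + jaccard)) / k
```

Two sequences are compared by sketching each one once and then calling jaccard_estimate on the sketches, so the cost of a comparison depends on the sketch size rather than on the sequence length.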

Please cite: Brian D. Ondov, Todd J. Treangen, Páll Melsted, Adam B. Mallonee, Nicholas H. Bergman, Sergey Koren and Adam M. Phillippy: Mash: fast genome and metagenome distance estimation using MinHash. (PubMed,eprint) Genome Biology 17:132 (2016)
Massxpert
linear polymer mass spectrometry software
Versions of package massxpert
Release  Version          Architectures
squeeze  2.3.6-1squeeze1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   3.2.3-1          amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   3.4.1-1          amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.6.1-1          amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      3.6.1-1          amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package massxpert:
biology: nuceleic-acids, peptidic
field: biology, chemistry
interface: x11
role: program
uitoolkit: qt
use: analysing, simulating
works-with: biological-sequence
works-with-format: xml
x11: application
Popcon: 18 users (59 upd.)*
Versions and Archs
License: DFSG free
Git

massXpert is a program to simulate and analyse mass spectrometric data obtained on linear (bio-)polymers. It is the successor of GNU polyxmass.

Four modules allow:

  • making brand new polymer chemistry definitions;
  • using the definitions to perform easy calculations in a desktop calculator-like manner;
  • performing sophisticated polymer sequence editing and simulations;
  • performing m/z list comparisons.

Chemical simulations encompass cleavage (either chemical or enzymatic), gas-phase fragmentations, chemical modification of any monomer in the polymer sequence, cross-linking of monomers in the sequence, arbitrary mass searches, calculation of the isotopic pattern...
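
As an illustration of what an m/z list comparison involves, the following Python sketch pairs each observed m/z value with the closest theoretical value inside an absolute tolerance. The function name, the tolerance default and the choice of an absolute (rather than ppm) tolerance are assumptions made for this example; it does not reproduce massXpert's implementation.

```python
from bisect import bisect_left


def match_mz_lists(observed, theoretical, tolerance=0.01):
    """Pair each observed m/z with the nearest theoretical m/z within tolerance."""
    theoretical = sorted(theoretical)
    matches = []
    for mz in observed:
        i = bisect_left(theoretical, mz)
        # The nearest value is one of the two neighbours of the insertion point.
        candidates = theoretical[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda t: abs(t - mz))
        if abs(best - mz) <= tolerance:
            matches.append((mz, best))
    return matches
```

Sorting the theoretical list once and using a binary search keeps the comparison fast even when the simulated list is long.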

Mauve-aligner
multiple genome alignment
Versions of package mauve-aligner
Release  Version       Architectures
stretch  2.4.0+4734-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      2.4.0+4734-3  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 2 users (45 upd.)*
Versions and Archs
License: DFSG free
Git

Mauve is a system for efficiently constructing multiple genome alignments in the presence of large-scale evolutionary events such as rearrangement and inversion. Multiple genome alignment provides a basis for research into comparative genomics and the study of evolutionary dynamics. Aligning whole genomes is a fundamentally different problem than aligning short sequences.

Mauve has been developed with the idea that a multiple genome aligner should require only modest computational resources. It employs algorithmic techniques that scale well in the amount of sequence being aligned. For example, a pair of Y. pestis genomes can be aligned in under a minute, while a group of 9 divergent Enterobacterial genomes can be aligned in a few hours.

Mauve computes and interactively visualizes genome sequence comparisons. Using FastA or GenBank sequence data, Mauve constructs multiple genome alignments that identify large-scale rearrangement, gene gain, gene loss, indels, and nucleotide substitution.

Mauve is developed at the University of Wisconsin.

The package is enhanced by the following packages: progressivemauve
Please cite: Mauve: multiple alignment of conserved genomic sequence with rearrangements: Aaron C. E. Darling and Bob Mau and Frederick R. Blattner and Nicole T. Perna. (PubMed,eprint) Genome research 14(7):1394-1403 (2004)
Melting
compute the melting temperature of nucleic acid duplex
Versions of package melting
Release  Version       Architectures
squeeze  4.3c-1        amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   4.3c-2        amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   4.3.1+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  4.3.1+dfsg-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      4.3.1+dfsg-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package melting:
biology: nuceleic-acids
field: biology, biology:molecular
interface: commandline
role: program
scope: utility
suite: gnu
use: analysing
works-with-format: plaintext
Popcon: 6 users (48 upd.)*
Versions and Archs
License: DFSG free
Svn

This program computes, for a nucleic acid duplex, the enthalpy, the entropy and the melting temperature of the helix-coil transitions. Three types of hybridisation are possible: DNA/DNA, DNA/RNA, and RNA/RNA. The program first computes the hybridisation enthalpy and entropy from the elementary parameters of each Crick's pair by the nearest-neighbor method. Then the melting temperature is computed. The set of thermodynamic parameters can be easily changed, for instance following an experimental breakthrough.
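
The two steps described above (summing nearest-neighbor contributions, then deriving the melting temperature) can be sketched in Python as follows. The sketch assumes the common two-state expression Tm = ΔH / (ΔS + R ln(C/F)), with F = 4 for non-self-complementary strands at equal concentration and F = 1 for self-complementary ones. Initiation terms and salt corrections are omitted, and TOY_PARAMS holds placeholder numbers that are NOT published thermodynamic parameters. All names are illustrative and not taken from the MELTING program.

```python
import math

R = 1.987  # gas constant in cal/(K*mol)

# Placeholder table: every dinucleotide step gets the same made-up
# (enthalpy in kcal/mol, entropy in cal/(K*mol)). Replace with a real set.
TOY_PARAMS = {a + b: (-8.0, -22.0) for a in "ACGT" for b in "ACGT"}


def nearest_neighbor_sums(seq, params):
    """Sum enthalpy and entropy over all overlapping dinucleotide steps."""
    dh = ds = 0.0
    for i in range(len(seq) - 1):
        step_dh, step_ds = params[seq[i:i + 2]]
        dh += step_dh
        ds += step_ds
    return dh, ds


def melting_temperature(dh_kcal, ds_cal, strand_conc, self_complementary=False):
    """Two-state melting temperature in degrees Celsius."""
    factor = 1 if self_complementary else 4
    tm_kelvin = (dh_kcal * 1000.0) / (ds_cal + R * math.log(strand_conc / factor))
    return tm_kelvin - 273.15
```

Because the parameter table is passed in as an argument, swapping in a different thermodynamic parameter set only means supplying another dictionary.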

Please cite: Le Novère, Nicolas: MELTING, computing the melting temperature of nucleic acid duplex. (PubMed,eprint) Bioinformatics 17(12):1226-1227 (2001)
Meryl
in- and out-of-core kmer counting and utilities
Versions of package meryl
Release  Version             Architectures
stretch  0~20150903+r2013-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0~20150903+r2013-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (46 upd.)*
Versions and Archs
License: DFSG free
Git

meryl computes the kmer content of genomic sequences. Kmer content is represented as a list of kmers and the number of times each occurs in the input sequences. The kmer can be restricted to only the forward kmer, only the reverse kmer, or the canonical kmer (lexicographically smaller of the forward and reverse kmer at each location). Meryl can report the histogram of counts, the list of kmers and their counts, or can perform mathematical and set operations on the processed data files.
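
A minimal in-memory Python sketch of forward, reverse and canonical kmer counting is given here for illustration. The function names and the mode argument are assumptions of this example and do not correspond to meryl's command-line interface; meryl itself also handles data too large to fit in memory, which this sketch does not attempt.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_kmers(seq, k, mode="canonical"):
    """Count kmers of seq as forward, reverse or canonical kmers."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        forward = seq[i:i + k]
        reverse = reverse_complement(forward)
        if mode == "forward":
            counts[forward] += 1
        elif mode == "reverse":
            counts[reverse] += 1
        elif mode == "canonical":
            # Canonical kmer: the lexicographically smaller of the two.
            counts[min(forward, reverse)] += 1
        else:
            raise ValueError(f"unknown mode: {mode}")
    return counts


def histogram(counts):
    """Map each occurrence count to the number of distinct kmers having it."""
    return Counter(counts.values())
```

Counting canonical kmers makes the result independent of which strand was sequenced, which is why it is the usual choice for read data.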

This package is part of the Kmer suite.

Metastudent
predictor of Gene Ontology terms from protein sequence
Versions of package metastudent
Release  Version   Architectures
jessie   1.0.11-2  all
stretch  2.0.1-3   all
sid      2.0.1-3   all
Popcon: 15 users (50 upd.)*
Versions and Archs
License: DFSG free
Svn

Often, only the sequence of a protein is known, but not its functions. Metastudent will try to predict missing functional annotations through homology searches (BLAST).

All predicted functions correspond to Gene Ontology (GO) terms from the Molecular Function (MFO), the Biological Process (BPO) and the Cellular Component Ontology (CCO) and are associated with a reliability score.

Please cite: Tobias Hamp, Rebecca Kassner, Stefan Seemayer, Esmeralda Vicedo, Christian Schaefer, Dominik Achten, Florian Auer, Ariane Boehm, Tatjana Braun, Maximilian Hecht, Mark Heron, Peter Hönigschmid, Thomas A. Hopf, Stefanie Kaufmann, Michael Kiening, Denis Krompass, Cedric Landerer, Yannick Mahlich, Manfred Roos and Burkhard Rost: Homology-based inference sets the bar high for protein function prediction.. (PubMed) BMC Bioinformatics 14(Suppl 3):S7 (2013)
Mhap
locality-sensitive hashing to detect long-read overlaps
Versions of package mhap
Release  Version     Architectures
stretch  1.6+dfsg-1  all
sid      2.1+dfsg-1  all
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

MHAP, the MinHash Alignment Process (pronounced MAP), is a reference implementation of a probabilistic sequence overlapping algorithm designed to efficiently detect all overlaps between noisy long-read sequences. It estimates Jaccard similarity efficiently by compressing sequences to their representative fingerprints composed of min-mers (minimum k-mers).
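
The min-mer fingerprint can be illustrated with a short Python sketch. It assumes one salted hash function per fingerprint position: the k-mer with the smallest hash under each function is that position's min-mer, and the fraction of positions on which two fingerprints agree estimates the Jaccard similarity. The BLAKE2-based hash, the function names and the small defaults are assumptions of this example rather than MHAP's actual implementation.

```python
import hashlib


def kmer_set(seq, k):
    """Return the set of k-mers of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def salted_hash(kmer, seed):
    """One member of a family of hash functions, selected by seed."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def fingerprint(seq, k=4, num_hashes=64):
    """For each hash function, keep the k-mer with the minimum hash (the min-mer)."""
    kms = kmer_set(seq, k)
    return [min(kms, key=lambda km: salted_hash(km, seed)) for seed in range(num_hashes)]


def estimate_jaccard(fp_a, fp_b):
    """Fraction of fingerprint positions whose min-mers are identical."""
    return sum(a == b for a, b in zip(fp_a, fp_b)) / len(fp_a)
```

More hash functions give a tighter estimate at the cost of a longer fingerprint; sequences shorter than k have no k-mers and are not handled by this sketch.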

Please cite: Konstantin Berlin, Sergey Koren, Chen-Shan Chin, James P Drake, Jane M Landolin and Adam M Phillippy: Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. (PubMed) Nature Biotechnology 33(6):623–630 (2015)
Microbegps
explorative taxonomic profiling tool for metagenomic data
Versions of package microbegps
Release  Version  Architectures
stretch  1.0.0-2  all
sid      1.0.0-2  all
Popcon: 1 users (46 upd.)*
Versions and Archs
License: DFSG free
Git

MicrobeGPS is a bioinformatics tool for the analysis of metagenomic sequencing data. The goal is to profile the composition of metagenomic communities as accurately as possible and present the results to the user in a convenient manner. One main focus is reliability: the tool calculates quality metrics for the estimated candidates and allows the user to identify false candidates easily.

Please cite: Martin S. Lindner and Bernhard Y. Renard: Metagenomic Profiling of Known and Unknown Microbes with MicrobeGPS. (PubMed,eprint) PLoS One 10(2):e0117711 (2015)
Microbiomeutil
Microbiome Analysis Utilities
Versions of package microbiomeutil
Release  Version            Architectures
jessie   20101212+dfsg-1    all
stretch  20101212+dfsg1-1   all
sid      20101212+dfsg1-1   all
Popcon: 4 users (46 upd.)*
Versions and Archs
License: DFSG free
Git

The microbiomeutil package comes with the following utilities:

  • ChimeraSlayer: ChimeraSlayer for chimera detection.
  • NAST-iEr: NAST-based alignment tool.
  • WigeoN: A reimplementation of the Pintail 16S anomaly detection utility
  • RESOURCES: Reference 16S sequences and NAST-alignments that the tools above leverage.
Please cite: Brian J. Haas, Dirk Gevers, Ashlee M. Earl, Mike Feldgarden, Doyle V. Ward, Georgia Giannoukos, Dawn Ciulla, Diana Tabbaa, Sarah K. Highlander, Erica Sodergren, Barbara Methé, Todd Z. DeSantis, The Human Microbiome Consortium, Joseph F. Petrosino, Rob Knight and Bruce W. Birren: Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. (PubMed,eprint) Genome Research 21(3):494-504 (2011)
Minia
short-read biological sequence assembler
Versions of package minia
Release  Version   Architectures
jessie   1.6088-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.6906-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.6906-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

Short-read DNA sequence assembler based on a de Bruijn graph, capable of assembling a human genome on a desktop computer in a day.

The output of Minia is a set of contigs. Minia produces results of similar contiguity and accuracy to other de Bruijn assemblers (e.g. Velvet).
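
The cited publication describes a de Bruijn graph representation based on a Bloom filter. The Python sketch below shows the general idea only, under stated assumptions: k-mers from the reads are inserted into a small Bloom filter, and the successors of a node are found by querying its four possible one-base extensions. The filter size, hash family and names are illustrative, and the handling of Bloom filter false positives, which an exact representation needs, is deliberately left out.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter over strings (illustrative parameters)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted BLAKE2 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def build_filter(reads, k):
    """Insert every k-mer of every read into a Bloom filter."""
    bloom = BloomFilter()
    for read in reads:
        for i in range(len(read) - k + 1):
            bloom.add(read[i:i + k])
    return bloom


def successors(kmer, bloom):
    """Graph neighbours: one-base extensions that the filter reports as present."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]
```

Because the graph edges are implied by membership queries, only the filter's bit array has to be kept in memory, which is what makes this kind of representation compact.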

Please cite: Rayan Chikhi and Guillaume Rizk: Space-Efficient and Exact de Bruijn Graph Representation Based on a Bloom Filter.. (PubMed,eprint) Algorithms for Molecular Biology 8(1):22 (2013)
Miniasm
ultrafast de novo assembler for long noisy DNA sequencing reads
Versions of package miniasm
Release  Version     Architectures
stretch  0.2+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.2+dfsg-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Miniasm is an experimental very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by minimap) as input and outputs an assembly graph in the GFA format. Different from mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final unitig sequences. Thus the per-base error rate is similar to the raw input reads.

Minimap
tool for approximate mapping of long biosequences such as DNA reads
Versions of package minimap
Release  Version  Architectures
stretch  0.2-2    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.2-2    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Minimap is an experimental tool to efficiently find multiple approximate mapping positions between two sets of long biological sequences, such as between DNA reads and reference genomes, between genomes and between long noisy reads. Minimap does not generate alignments as of now and because of this, it is usually tens of times faster than mainstream aligners. It does not replace mainstream aligners, but it can be useful when you want to quickly identify long approximate matches at moderate divergence among a huge collection of sequences. For this task, it is much faster than most existing tools.

Mipe
Tools to store PCR-derived data
Versions of package mipe
Release  Version  Architectures
squeeze  1.1-3    all
wheezy   1.1-4    all
jessie   1.1-4    all
stretch  1.1-5    all
sid      1.1-5    all
Debtags of package mipe:
field: biology, biology:bioinformatics, biology:molecular
interface: commandline
role: documentation, program
scope: utility
use: organizing
works-with-format: xml
Popcon: 7 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

MIPE provides a standard format for the exchange and/or storage of all information associated with PCR experiments, using a flat text file. This will:

  • allow for exchange of PCR data between researchers/laboratories
  • enable traceability of the data
  • prevent problems when submitting data to dbSTS or dbSNP
  • enable the writing of standard scripts to extract data (e.g. a list of PCR primers, SNP positions or haplotypes for different animals)

Although this tool can be used for data storage, its primary focus should be data exchange. For larger repositories, relational databases are more appropriate for storage of these data. The MIPE format could then be used as a standard format to import into and/or export from these databases.

Please cite: Jan Aerts and T. Veenendaal: MIPE - a XML-format to facilitate the storage and exchange of PCR-related data. Online Journal of Bioinformatics 6(2):114-120 (2005)
Mira-assembler
Whole Genome Shotgun and EST Sequence Assembler
Versions of package mira-assembler
Release  Version    Architectures
wheezy   3.4.0.1-3  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   4.0.2-1    amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  4.9.6-1    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      4.9.6-1    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package mira-assembler:
role: program
Popcon: 8 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

The mira genome fragment assembler is a specialised assembler for sequencing projects classified as 'hard' due to a high number of similar repeats. For expressed sequence tag (EST) transcripts, miraEST specialises in reconstructing pristine mRNA transcripts while detecting and classifying single nucleotide polymorphisms (SNPs) occurring in different variations thereof.

The assembler is routinely used for such various tasks as mutation detection in different cell types, similarity analysis of transcripts between organisms, and pristine assembly of sequences from various sources for oligo design in clinical microarray experiments.

The package provides the following executables:

  • mira: assembles genome sequences.
  • miramem: estimates the memory needed to assemble projects.
  • mirabait: a "grep"-like tool to select reads with kmers up to 256 bases.
  • miraconvert: converts, extracts and sometimes recalculates all kinds of data related to sequence assembly files.
Please cite: Bastien Chevreux, Thomas Pfisterer, Bernd Drescher, Albert J. Driesel, Werner E. G. Müller, Thomas Wetter and Sándor Suhai: Using the miraEST Assembler for Reliable and Automated mRNA Transcript Assembly and SNP Detection in Sequenced ESTs. (PubMed,eprint) Genome Research 14(6):1147-1159 (2004)
Mlv-smile
Find statistically significant patterns in sequences
Versions of package mlv-smile
Release  Version  Architectures
wheezy   1.47-3   amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.47-3   amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.47-4   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.47-4   amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

Smile determines sequence motifs on the basis of a set of DNA, RNA or protein sequences.

  • No hard limit on the number of combinations of motifs to describe subsets of sequences.
  • The sequence alphabet may be specified.
  • The use of wildcards is supported.
  • Better determination of significance of motifs by simulation.
  • Introduction of a set of sequences with negative controls that should not match automatically determined motifs.
Mothur
sequence analysis suite for research on microbiota
Versions of package mothur
Release   Version        Architectures
wheezy    1.24.1-1       amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    1.33.3+dfsg-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el
stretch   1.37.6-1       amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       1.37.6-1       armel,armhf,kfreebsd-i386,mips,mips64el,mipsel,powerpc,s390x
sid       1.38.0-1       amd64,arm64,i386,kfreebsd-amd64,ppc64el
upstream  1.38.1
Debtags of package mothur:
role: program
Popcon: 7 users (55 upd.)*
Newer upstream!
License: DFSG free
Git

Mothur seeks to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community. It has incorporated the functionality of dotur, sons, treeclimber, s-libshuff, unifrac, and much more. In addition to improving the flexibility of these algorithms, a number of other features including calculators and visualization tools were added.

Please cite: Patrick D Schloss, Sarah L Westcott, Thomas Ryabin, Justine R Hall, Martin Hartmann, Emily B Hollister, Ryan A Lesniewski, Brian B Oakley, Donovan H Parks, Courtney J Robinson, Jason W Sahl, Blaz Stres, Gerhard G Thallinger, David J Van Horn and Carolyn F Weber: Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. (PubMed) Appl Environ Microbiol 75(23):7537-7541 (2009)
Mrbayes
Bayesian Inference of Phylogeny
Versions of package mrbayes
Release  Version       Architectures
wheezy   3.2.1+dfsg-1  amd64,armel,armhf,i386,ia64,mips,mipsel,powerpc,s390,s390x,sparc
jessie   3.2.3+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.2.6+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      3.2.6+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 9 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

Bayesian inference of phylogeny is based upon a quantity called the posterior probability distribution of trees, which is the probability of a tree conditioned on the observations. The conditioning is accomplished using Bayes's theorem. The posterior probability distribution of trees is impossible to calculate analytically; instead, MrBayes uses a simulation technique called Markov chain Monte Carlo (or MCMC) to approximate the posterior probabilities of trees.
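
To make the MCMC idea concrete, here is a toy Metropolis sampler in Python. It assumes just three candidate tree topologies with made-up unnormalized posterior weights (TOY_POSTERIOR) and a uniform, symmetric proposal; the fraction of steps spent on each tree then approximates its posterior probability. This is a didactic sketch with hypothetical names and numbers, not the sampling scheme or tree proposals used by MrBayes.

```python
import random
from collections import Counter

# Made-up unnormalized posterior weights for three topologies (assumption).
TOY_POSTERIOR = {"((A,B),C)": 6.0, "((A,C),B)": 3.0, "((B,C),A)": 1.0}


def metropolis(unnormalized, start, steps, rng):
    """Approximate posterior probabilities by the visit frequencies of a chain."""
    states = list(unnormalized)
    current = start
    visits = Counter()
    for _ in range(steps):
        proposal = rng.choice(states)  # symmetric proposal over all states
        ratio = unnormalized[proposal] / unnormalized[current]
        # Accept with probability min(1, ratio); otherwise stay put.
        if rng.random() < min(1.0, ratio):
            current = proposal
        visits[current] += 1
    return {state: visits[state] / steps for state in states}
```

Only ratios of posterior values are needed, so the normalizing constant that makes the analytical calculation intractable never has to be computed.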

The package is enhanced by the following packages: mrbayes-doc
Please cite: Fredrik Ronquist, Maxim Teslenko, Paul van der Mark, Daniel L. Ayres, Aaron Darling, Sebastian Höhna, Bret Larget, Liang Liu, Marc A. Suchard and John P. Huelsenbeck: MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice across a Large Model Space. (PubMed,eprint) Systematic Biology (2012)
Mummer
Efficient sequence alignment of full genomes
Versions of package mummer
Release  Version      Architectures
squeeze  3.22~dfsg-2  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   3.23~dfsg-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   3.23~dfsg-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.23~dfsg-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      3.23+dfsg-1  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package mummer:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: comparing
works-with-format: plaintext
Popcon: 18 users (75 upd.)*
Versions and Archs
License: DFSG free
Git

MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form. For example, MUMmer 3.0 can find all 20-basepair or longer exact matches between a pair of 5-megabase genomes in 13.7 seconds, using 78 MB of memory, on a 2.4 GHz Linux desktop computer. MUMmer can also align incomplete genomes; it handles the 100s or 1000s of contigs from a shotgun sequencing project with ease, and will align them to another set of contigs or a genome using the NUCmer program included with the system. If the species are too divergent for DNA sequence alignment to detect similarity, then the PROmer program can generate alignments based upon the six-frame translations of both input sequences.
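
The six-frame translation mentioned for PROmer can be pictured with the Python sketch below: three reading frames on the forward strand and three on the reverse complement, translated with the standard genetic code. The function names and the frame labels (+1 to +3, -1 to -3) are assumptions of this example; it only illustrates the concept and says nothing about how PROmer itself is implemented.

```python
from itertools import product

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq):
    """Return the six translations keyed by strand and frame, e.g. '+1' or '-3'."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames
```

Comparing protein translations rather than nucleotides lets similarity be detected between species whose DNA has diverged too far for direct alignment, since many nucleotide changes leave the encoded amino acid unchanged.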

Please cite: Stefan Kurtz, Adam Phillippy, Arthur L. Delcher, Michael Smoot, Martin Shumway, Corina Antonescu and Steven L. Salzberg: Versatile and open software for comparing large genomes. (PubMed) Genome Biology 5(2):R12 (2004)
Murasaki
homology detection tool across multiple large genomes
Versions of package murasaki
Release  Version   Architectures
stretch  1.68.6-6  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.68.6-6  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 2 users (56 upd.)*
Versions and Archs
License: DFSG free
Git

Murasaki is a scalable and fast, language theory-based homology detection tool across multiple large genomes. It enables whole-genome scale multiple genome global alignments. It supports unlimited-length gapped-seed patterns and unique TF-IDF-based filtering.
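
The notion of a gapped seed can be illustrated with a small Python sketch. It assumes a pattern written as a string of 1s and 0s, where 1 marks a position that must match and 0 a position that is ignored; two windows form a candidate anchor when their bases agree at every 1 position. The pattern notation, the function names and the plain dictionary index are assumptions of this example, not Murasaki's internals.

```python
from collections import defaultdict


def seed_keys(seq, pattern):
    """Yield (start, key) for each window, keeping only the '1' positions."""
    care = [i for i, flag in enumerate(pattern) if flag == "1"]
    for start in range(len(seq) - len(pattern) + 1):
        yield start, "".join(seq[start + i] for i in care)


def find_anchors(seq_a, seq_b, pattern):
    """Return (pos_a, pos_b) pairs whose windows agree at all '1' positions."""
    index = defaultdict(list)
    for pos, key in seed_keys(seq_a, pattern):
        index[key].append(pos)
    return [(pa, pb) for pb, key in seed_keys(seq_b, pattern) for pa in index.get(key, [])]
```

A gapped pattern tolerates substitutions at its 0 positions, so it can still seed an anchor where a contiguous seed of the same span would miss.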

Murasaki is an anchor alignment software, which is

  • extremely fast (17 CPU hours for whole Human x Mouse genome (with 40 nodes: 52 wall minutes))
  • scalable (Arbitrarily parallelizable across multiple nodes using MPI. Even a single node with 16GB of ram can handle over 1Gbp of sequence.)
  • unlimited pattern length
  • repeat tolerant
  • intelligent noise reduction
Please cite: Kris Popendorf, Hachiya Tsuyoshi, Yasunori Osana and Yasubumi Sakakibara: Murasaki: A Fast, Parallelizable Algorithm to Find Anchors from Multiple Genomes. (PubMed,eprint) PLOS one 5(9):e12651 (2010)
Muscle
Multiple alignment program of protein sequences
Versions of package muscle
Release  Version        Architectures
wheezy   3.8.31-1       amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   3.8.31-1       amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.8.31+dfsg-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      3.8.31+dfsg-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
squeeze  3.70+fix1-2    amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
Debtags of package muscle:
biology: format:aln, nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: comparing
works-with-format: plaintext
Popcon: 27 users (124 upd.)*
Versions and Archs
License: DFSG free
Git

MUSCLE is a multiple alignment program for protein sequences. MUSCLE stands for multiple sequence comparison by log-expectation. In the author's tests, MUSCLE achieved the highest scores of all tested programs on several alignment accuracy benchmarks, and it is also one of the fastest programs available.

Please cite: Robert C. Edgar: MUSCLE: multiple sequence alignment with high accuracy and high throughput. (PubMed,eprint) Nucleic Acids Research 32(5):1792-1797 (2004)
Mustang
multiple structural alignment of proteins
Versions of package mustang
Release  Version  Architectures
squeeze  3.2.1-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   3.2.1-3  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   3.2.2-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.2.2-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      3.2.2-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package mustang:
biology: peptidic
field: biology, biology:bioinformatics
interface: commandline
role: program
use: analysing, comparing
works-with-format: plaintext
Popcon: 15 users (54 upd.)*
Versions and Archs
License: DFSG free
Svn

Mustang is an algorithm to align multiple protein structures. Given a set of PDB files, the program uses the spatial information in the Calpha atoms of the set to produce a sequence alignment. Based on a progressive pairwise heuristic the algorithm then proceeds through a number of refinement passes. Mustang reports the multiple sequence alignment and the corresponding superposition of structures.

The package is enhanced by the following packages: mustang-testdata
Please cite: Arun S. Konagurthu, James C. Whisstock, Peter J. Stuckey and Arthur M. Lesk: MUSTANG: A multiple structural alignment algorithm. (PubMed) Proteins: Structure, Function, and Bioinformatics 64(3):559-574 (2006)
Nanopolish
consensus caller for nanopore sequencing data
Versions of package nanopolish
Release  Version  Architectures
stretch  0.4.0-1  amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
sid      0.4.0-1  amd64,arm64,armel,i386,kfreebsd-amd64,kfreebsd-i386,mips64el,mipsel,ppc64el
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Nanopolish uses a signal-level hidden Markov model for consensus calling of nanopore genome sequencing data.

Nast-ier
NAST-based DNA alignment tool
Versions of package nast-ier
Release  Version           Architectures
jessie   20101212+dfsg-1   amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  20101212+dfsg1-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      20101212+dfsg1-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (48 upd.)*
Versions and Archs
License: DFSG free
Git

The NAST-iEr alignment utility aligns a single raw nucleotide sequence against one or more NAST formatted sequences.

The alignment algorithm involves global dynamic programming profile alignment to fixed (NAST-formatted) multiply aligned template sequences without any end-gap penalty.

NAST-iEr is part of the microbiomeutil suite.

The package is enhanced by the following packages: microbiomeutil-data
Please cite: Brian J. Haas, Dirk Gevers, Ashlee M. Earl, Mike Feldgarden, Doyle V. Ward, Georgia Giannoukos, Dawn Ciulla, Diana Tabbaa, Sarah K. Highlander, Erica Sodergren, Barbara Methé, Todd Z. DeSantis, The Human Microbiome Consortium, Joseph F. Petrosino, Rob Knight and Bruce W. Birren: Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. (PubMed,eprint) Genome Research 21(3):494-504 (2011)
Ncbi-blast+
next generation suite of BLAST sequence search tools
Versions of package ncbi-blast+
Release   Version   Architectures
wheezy    2.2.26-3  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie    2.2.29-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid       2.2.29-3  hurd-i386
stretch   2.3.0-1   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid       2.3.0-1   amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  2.4.0
Debtags of package ncbi-blast+:
field: biology, biology:bioinformatics
interface: commandline
role: program
use: analysing
works-with: biological-sequence
Popcon: 55 users (71 upd.)*
Newer upstream!
License: DFSG free
Git

The Basic Local Alignment Search Tool (BLAST) is the most widely used sequence similarity tool. There are versions of BLAST that compare protein queries to protein databases, nucleotide queries to nucleotide databases, as well as versions that translate nucleotide queries or databases in all six frames and compare to protein databases or queries. PSI-BLAST produces a position-specific scoring matrix (PSSM) starting with a protein query, and then uses that PSSM to perform further searches. It is also possible to compare a protein or nucleotide query to a database of PSSMs. The NCBI supports a BLAST web page at blast.ncbi.nlm.nih.gov as well as a network service.

Ncbi-epcr
Tool to test a DNA sequence for the presence of sequence tagged sites
Versions of package ncbi-epcr
Release  Version     Architectures
squeeze  2.3.12-1    amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   2.3.12-1-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.3.12-1-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.3.12-1-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.3.12-1-3  amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package ncbi-epcr:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: checking, searching
works-with-format: plaintext
Popcon: 10 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

Electronic PCR (e-PCR) is a computational procedure used to identify sequence tagged sites (STSs) within DNA sequences. e-PCR looks for potential STSs in DNA sequences by searching for subsequences that closely match the PCR primers and have the correct order, orientation, and spacing to represent the PCR primers used to generate known STSs.

The new version of e-PCR implements a fuzzy matching strategy. To reduce the likelihood that a true STS will be missed due to mismatches, multiple discontiguous words may be used instead of a single exact word. Each of these words has groups of significant positions separated by 'wildcard' positions that are not required to match. It is also possible to allow gaps in the primer alignments.
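
The idea of a discontiguous word can be sketched in Python as follows. This is an illustration only, not e-PCR's code: the mask notation ('1' for a significant position, '0' for a wildcard) and the function names are assumptions, and gaps are not handled.

```python
def spaced_match(word, window, mask):
    """Compare word and window only at positions where mask is '1'.

    mask is a string of '1' (significant position, must match) and
    '0' (wildcard position, ignored), the same length as the word.
    """
    if not (len(word) == len(window) == len(mask)):
        raise ValueError("word, window and mask must have equal length")
    return all(m == "0" or w == x for w, x, m in zip(word, window, mask))


def find_hits(word, sequence, mask):
    """Return every start offset in sequence where the masked word matches."""
    n = len(word)
    return [
        i
        for i in range(len(sequence) - n + 1)
        if spaced_match(word, sequence[i:i + n], mask)
    ]


if __name__ == "__main__":
    # The third and sixth positions are wildcards, so the G/T mismatch
    # at the third position does not prevent a hit.
    print(find_hits("ACGTAC", "TTACTTACGG", "110110"))
    # An exact word (no wildcards) misses the same site.
    print(find_hits("ACGTAC", "TTACTTACGG", "111111"))
```

Using several such words with different wildcard layouts raises the chance that at least one of them survives a given mismatch.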

The main motivation for implementing reverse searching (called Reverse e-PCR) was to make it feasible to search the human genome sequence and other large genomes. The new version of e-PCR provides a search mode using a query sequence against a sequence database.

Please cite: Schuler, Gregory D.: Sequence Mapping by Electronic PCR. (PubMed,eprint) Genome Research 7(5):541-550 (1997)
Ncbi-seg
tool to mask segments of low compositional complexity in amino acid sequences
Versions of package ncbi-seg
ReleaseVersionArchitectures
jessie0.0.20000620-2amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch0.0.20000620-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid0.0.20000620-3amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 8 users (46 upd.)*
Versions and Archs
License: DFSG free
Svn

ncbi-seg (a.k.a. SEG) is a program for identifying and masking segments of low compositional complexity in amino acid sequences.

ncbi-seg divides sequences into contrasting segments of low complexity and high complexity. Low-complexity segments defined by the algorithm represent "simple sequences" or "compositionally-biased regions".
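
As a rough illustration of compositional complexity, the sketch below masks windows whose Shannon entropy falls under a threshold. It is a simplification, not the SEG algorithm itself; the window length, the threshold and the function names are assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq, window=12, threshold=2.2):
    """Replace residues in low-entropy windows with 'x'.

    Every window of the given length whose entropy is below the
    threshold is masked in full; overlapping windows merge naturally.
    Sequences shorter than the window are returned unchanged.
    """
    masked = list(seq)
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) < threshold:
            for j in range(i, i + window):
                masked[j] = "x"
    return "".join(masked)


if __name__ == "__main__":
    print(mask_low_complexity("MKVLAAGIVGLCAKQQQQQQQQQQQQQQQQ"))
```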

This program is inappropriate for masking nucleotide sequences and, in fact, may strip some nucleotide ambiguity codes from nt. sequences as they are being read.

Please cite: John C. Wootton and Scott Federhen: Statistics of local complexity in amino acid sequences and sequence databases. Computers & Chemistry 17:149-163 (1993)
Ncbi-tools-bin
NCBI libraries for biology applications (text-based utilities)
Versions of package ncbi-tools-bin
ReleaseVersionArchitectures
squeeze6.1.20090809-2amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy6.1.20120620-2amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie6.1.20120620-8amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch6.1.20120620-10amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid6.1.20120620-10amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package ncbi-tools-bin:
biologynuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline
networkclient
roleprogram
sciencecalculation
scopeutility
useanalysing, calculating, converting, searching
works-withbiological-sequence
works-with-formatplaintext, xml
Popcon: 16 users (51 upd.)*
Versions and Archs
License: DFSG free
Git

This package includes various utilities distributed with the NCBI C SDK, including the development tools asntool and errhdr (formerly of libncbi6-dev). None of the programs in this package require X; you can find the X-based utilities in the ncbi-tools-x11 package. BLAST and related tools are in a separate package (blast2).

The package is enhanced by the following packages: mcl
Ncbi-tools-x11
NCBI libraries for biology applications (X-based utilities)
Versions of package ncbi-tools-x11
ReleaseVersionArchitectures
squeeze6.1.20090809-2amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy6.1.20120620-2amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie6.1.20120620-8amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch6.1.20120620-10amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid6.1.20120620-10amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package ncbi-tools-x11:
biologynuceleic-acids, peptidic
fieldbiology, biology:bioinformatics, biology:structural
interface3d, x11
networkclient
roleprogram
sciencevisualisation
scopeutility
uitoolkitmotif
useanalysing, calculating, editing, searching, viewing
x11application
Popcon: 68 users (31 upd.)*
Versions and Archs
License: DFSG free
Git

This package includes some X-based utilities distributed with the NCBI C SDK: Cn3D, Network Entrez, Sequin, ddv, and udv. These programs are not part of ncbi-tools-bin because they depend on several additional library packages.

Ncl-tools
tools to deal with NEXUS files
Versions of package ncl-tools
ReleaseVersionArchitectures
stretch2.1.18+dfsg-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2.1.18+dfsg-2amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The NEXUS Class Library is a C++ library for parsing NEXUS files.

The NEXUS file format is widely used in bioinformatics. Several popular phylogenetic programs such as Paup, MrBayes, Mesquite, and MacClade use this format.

Ncoils
coiled coil secondary structure prediction
Versions of package ncoils
ReleaseVersionArchitectures
squeeze2002-1amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy2002-3amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie2002-4amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch2002-4amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2002-4amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 10 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

The program predicts coiled-coil secondary structure from protein sequences. The algorithm was published in Lupas, van Dyke & Stock, Predicting coiled coils from protein sequences, Science, 252, 1162-1164, 1991.

Please cite: Andrei Lupas, Marc Van Dyke and Jeff Stock: Predicting coiled coils from protein sequences. (PubMed) Science 252:1162-1164 (1991)
Neobio
computes alignments of amino acid and nucleotide sequences
Versions of package neobio
ReleaseVersionArchitectures
wheezy0.0.20030929-1all
wheezy-security0.0.20030929-1+deb7u2all
jessie0.0.20030929-1.1all
stretch0.0.20030929-2all
sid0.0.20030929-2all
Popcon: 5 users (46 upd.)*
Versions and Archs
License: DFSG free
Svn

Library and graphical user interface for pairwise sequence alignments. Implementation of the dynamic programming methods of Needleman & Wunsch (global alignment) and Smith & Waterman (local alignment).
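
The two dynamic programming methods differ mainly in how the score matrix is initialised and whether scores may drop below zero. The Python sketch below computes only the optimal score, with a linear gap penalty; it is not Neobio's code, and the scoring values and function name are assumptions.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Optimal pairwise alignment score by dynamic programming.

    local=False: global alignment in the style of Needleman & Wunsch.
    local=True:  local alignment in the style of Smith & Waterman.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for leading gaps in either sequence.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                # Local alignment may restart anywhere, so scores never go negative.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


if __name__ == "__main__":
    print(alignment_score("GATTACA", "GATTACA"))
    print(alignment_score("TTGATTACATT", "GATTACA", local=True))
```

Both variants fill a matrix of size (len(a)+1) by (len(b)+1), which is the quadratic cost that the cited sub-quadratic algorithm sets out to improve on.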

Please cite: Maxime Crochemore, Gad M. Landau and Michal Ziv-Ukelson: A sub-quadratic sequence alignment algorithm for unrestricted cost matrices. :679-688 (2002)
Njplot
phylogenetic tree drawing program
Versions of package njplot
ReleaseVersionArchitectures
squeeze2.3-2amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy2.4-1amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie2.4-2amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch2.4-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2.4-3amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package njplot:
fieldbiology, biology:bioinformatics
interfacex11
roleprogram
scopeutility
uitoolkitmotif
useanalysing, editing, organizing, printing, viewing
works-withbiological-sequence
works-with-formatplaintext
x11application
Popcon: 60 users (31 upd.)*
Versions and Archs
License: DFSG free
Svn

NJplot is able to draw any dendrogram expressed in the Newick standard phylogenetic tree format (e.g., the format used by the Phylip package). NJplot is especially convenient for rooting the unrooted trees obtained from parsimony, distance or maximum likelihood tree-building methods.

Please cite: G. Perrière and M. Gouy: WWW-query: An on-line retrieval system for biological sequence banks. (PubMed) Biochimie 78(5):364–369 (1996)
Norsnet
tool to identify unstructured loops in proteins
Versions of package norsnet
ReleaseVersionArchitectures
jessie1.0.17-1all
stretch1.0.17-2all
sid1.0.17-2all
Popcon: 8 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

NORSnet can distinguish between very long contiguous segments with non-regular secondary structure (NORS regions) and well-folded proteins.

NORSnet was trained on predicted information rather than on experimental data. This allows NORSnet to reach into regions in sequence space that are not covered by specialized disorder predictors. One disadvantage of this approach is that it is not optimal for the identification of the "average" disordered region.

NORSnet takes the following input, further described in norsnet(1):

  • a protein sequence in a FASTA file
  • secondary structure and solvent accessibility prediction by prof(1)
  • an HSSP file
  • flexible/rigid residues prediction by profbval(1)
Please cite: Avner Schlessinger, Jinfeng Liu and Burkhard Rost: Natively unstructured loops differ from other loops. (PubMed,eprint) PLoS Comput Biol. 3:e140 (2007)
Norsp
predictor of non-regular secondary structure
Versions of package norsp
ReleaseVersionArchitectures
jessie1.0.6-1all
stretch1.0.6-2all
sid1.0.6-2all
Popcon: 9 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

NORSp is a publicly available predictor for disordered regions in proteins. Specifically, it predicts long regions with no regular secondary structure. Upon submission of a protein sequence, NORSp analyses the protein for its secondary structure and for the presence of transmembrane helices and coiled coils. It then returns the presence and position of disordered regions.

NORSp can be useful for biologists in several ways. For example, crystallographers can check whether their proteins contain NORS regions and decide whether to proceed with the experiments, since NORS proteins may be difficult to crystallise, as demonstrated by their low occurrence in PDB. Biologists interested in protein structure-function relationships may also find it interesting to verify whether the protein-protein interaction sites coincide with NORS regions.

Please cite: Jinfeng Liu and Burkhard Rost: NORSp: Predictions of long regions without regular secondary structure. (PubMed,eprint) Nucleic Acids Res 31(13):3833-3835 (2003)
Openms
package for LC/MS data management and analysis
Versions of package openms
ReleaseVersionArchitectures
jessie1.11.1-5all
sid2.0.0-4all
upstream2.0.1-sources
Popcon: 0 users (0 upd.)*
Newer upstream!
License: DFSG free
Vcs

OpenMS is a package for LC/MS data management and analysis. OpenMS offers an infrastructure for the development of mass spectrometry-related software and powerful 2D and 3D visualization solutions.

TOPP (the OpenMS proteomic pipeline) is a pipeline for the analysis of HPLC/MS data. It consists of a set of numerous small applications that can be chained together to create analysis pipelines tailored for a specific problem.

This package is a metapackage that depends on both the libopenms library package (libOpenMS and libOpenMS_GUI) and the OpenMS Proteomic Pipeline (topp) package.

Please cite: Marc Sturm, Andreas Bertsch, Clemens Gröpl, Andreas Hildebrandt, Rene Hussong, Eva Lange, Nico Pfeifer, Ole Schulz-Trieglaff, Alexandra Zerck, Knut Reinert and Oliver Kohlbacher: OpenMS – an Open-Source Software Framework for Mass Spectrometry. (PubMed,eprint) BMC Bioinformatics 9(163) (2008)
Paraclu
Parametric clustering of genomic and transcriptomic features
Versions of package paraclu
ReleaseVersionArchitectures
jessie9-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch9-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid9-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 6 users (47 upd.)*
Versions and Archs
License: DFSG free
Git

Paraclu finds clusters in data attached to sequences. It was first applied to transcription start counts in genome sequences, but it could be applied to other things too.

Paraclu is intended to explore the data, imposing minimal prior assumptions, and letting the data speak for itself.

One consequence of this is that paraclu can find clusters within clusters. Real data sometimes exhibits clustering at multiple scales: there may be large, rarefied clusters; and within each large cluster there may be several small, dense clusters.
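
The multi-scale behaviour can be illustrated with a small Python sketch. It assumes, as a simplification of the published method, that a segment is scored as the sum of its data values minus a density parameter d times its length; the brute-force search and all names are hypothetical and are not paraclu's implementation.

```python
def best_segment(events, d):
    """Find the highest-scoring segment for a given density parameter d.

    events is a list of (position, value) pairs sorted by position.
    A segment from position p to position q scores
        sum(values inside) - d * (q - p + 1).
    Returns (score, start_position, end_position).
    """
    best = None
    for i in range(len(events)):
        total = 0
        for j in range(i, len(events)):
            total += events[j][1]
            size = events[j][0] - events[i][0] + 1
            score = total - d * size
            if best is None or score > best[0]:
                best = (score, events[i][0], events[j][0])
    return best


if __name__ == "__main__":
    # Hypothetical transcription start counts: (position, count).
    events = [(100, 5), (101, 6), (102, 5), (500, 1), (900, 4), (901, 5)]
    for d in (0.001, 1):
        print(d, best_segment(events, d))
```

With a small d the best segment spans all the data (one large, rarefied cluster), while a larger d favours the dense group at positions 100 to 102, which lies inside it.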

Please cite: Martin C. Frith, Eivind Valen, Anders Krogh, Yoshihide Hayashizaki, Piero Carninci and Albin Sandelin: A code for transcription initiation in mammalian genomes. (eprint) Genome Research 18(1):1-12 (2008)
Parsinsert
Parsimonious Insertion of unclassified sequences into phylogenetic trees
Versions of package parsinsert
ReleaseVersionArchitectures
jessie1.04-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch1.04-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid1.04-2amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 6 users (48 upd.)*
Versions and Archs
License: DFSG free
Svn

ParsInsert efficiently produces both a phylogenetic tree and a taxonomic classification of sequences for microbial community analysis. This is a C++ implementation of the Parsimonious Insertion algorithm.

Pbalign
map Pacific Biosciences reads to reference DNA sequences
Versions of package pbalign
ReleaseVersionArchitectures
stretch0.2.0-1all
sid0.2.0-1all
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

pbalign aligns PacBio reads to reference sequences, filters aligned reads according to user-specified filtering criteria, and converts the output to either the SAM format or the PacBio Compare HDF5 (e.g., .cmp.h5) format.

This package is part of the SMRTAnalysis suite.

Pbbarcode
annotate PacBio sequencing reads with barcode information
Versions of package pbbarcode
ReleaseVersionArchitectures
stretch0.8.0-2amd64,arm64,mips64el,ppc64el
sid0.8.0-2amd64,arm64,kfreebsd-amd64,mips64el,ppc64el
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The pbbarcode package provides tools for annotating PacBio sequencing reads with barcode information. Typically, pbbarcode is called in the context of a SMRTPipe workflow rather than directly on the command line. However, users are encouraged to use the command-line utility directly, as more options are available.

This package is part of the SMRTAnalysis suite.

Pbdagcon
sequence consensus using directed acyclic graphs
Versions of package pbdagcon
ReleaseVersionArchitectures
stretch0~20151114+git1d12e13+ds-2amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
sid0~20151114+git1d12e13+ds-2amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

pbdagcon is a tool that implements DAGCon (Directed Acyclic Graph Consensus), a sequence consensus algorithm that uses directed acyclic graphs to encode multiple sequence alignments.

It uses the alignment information from blasr to align sequence reads to a "backbone" sequence. Based on the underlying alignment directed acyclic graph (DAG), it will be able to use the new information from the reads to find the discrepancies between the reads and the "backbone" sequences. A dynamic programming process is then applied to the DAG to find the optimum sequence of bases as the consensus. The new consensus can be used as a new backbone sequence to iteratively improve the consensus quality.

While the code was developed for processing PacBio(TM) raw sequence data, the algorithm can be used for general consensus purposes. Currently, it only takes FASTA input. For shorter read sequences, one might need to adjust the blasr alignment parameters to obtain a proper alignment string.

The code and the underlying graphical data structure have been used for some algorithm development prototyping including phasing reads and pre-assembly.

Pbgenomicconsensus
Pacific Biosciences variant and consensus caller
Versions of package pbgenomicconsensus
ReleaseVersionArchitectures
stretch2.0.0+20160420-2all
sid2.0.0+20160420-2all
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

The GenomicConsensus package provides Quiver, Pacific Biosciences' flagship consensus and variant caller. Quiver is an algorithm that finds the maximum likelihood template sequence given PacBio reads of the template. These reads are modeled using a conditional random field approach that prescribes a probability to a read given a template sequence. In addition to the base sequence of each read, Quiver uses several additional quality value covariates that the base caller provides.

This package is part of the SMRTAnalysis suite.

Pbh5tools
tools for manipulating Pacific Biosciences HDF5 files
Versions of package pbh5tools
ReleaseVersionArchitectures
sid0.8.0+dfsg-4all
stretch0.8.0+dfsg-5all
sid0.8.0+dfsg-5all
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

This package provides functionality for manipulating and extracting data from cmp.h5 and bas.h5 files produced by the Pacific Biosciences sequencers. cmp.h5 files contain alignment information while bas.h5 files contain base-call information.

This package is part of the SMRTAnalysis suite.

Pbhoney
genomic structural variation discovery
Versions of package pbhoney
ReleaseVersionArchitectures
stretch15.8.24+dfsg-1all
sid15.8.24+dfsg-1all
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

PBHoney is an implementation of two variant-identification approaches designed to exploit the high mappability of long reads (i.e., greater than 10,000 bp). PBHoney considers both intra-read discordance and soft-clipped tails of long reads to identify structural variants.
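
Soft-clipped tails show up in SAM alignments as 'S' operations at either end of the CIGAR string. The Python sketch below extracts their lengths; it only illustrates the signal, it is not PBHoney's code, and the function name is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_tails(cigar):
    """Return (left, right) soft-clip lengths of a SAM CIGAR string.

    Hard clips ('H') may sit outside the soft clips, so they are
    skipped before looking at the first and last operations.
    An unavailable CIGAR ('*') yields (0, 0).
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return (0, 0)
    left = ops[0][0] if ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (left, right)


if __name__ == "__main__":
    print(soft_clipped_tails("1200S8000M50I300S"))
```

A long read with a large clipped tail on one side is a candidate for spanning a breakpoint, since the clipped part may align elsewhere in the reference.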

PBHoney is part of the PBSuite.

Pbjelly
genome assembly upgrading tool
Versions of package pbjelly
ReleaseVersionArchitectures
stretch15.8.24+dfsg-1all
sid15.8.24+dfsg-1all
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

PBJelly is a highly automated pipeline that aligns long sequencing reads (such as PacBio RS reads or long 454 reads in FASTA format) to high-confidence draft assemblies. PBJelly fills or reduces as many captured gaps as possible to produce upgraded draft genomes.

PBJelly is part of the PBSuite.

Pbsim
simulator for PacBio sequencing reads
Versions of package pbsim
ReleaseVersionArchitectures
stretch1.0.3-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid1.0.3-2amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

PacBio DNA sequencers produce two types of characteristic reads: CCS (short and low error rate) and CLR (long and high error rate), both of which could be useful for de novo assembly of genomes. PBSIM simulates those PacBio reads from a reference sequence by using either a model-based or sampling-based simulation. Simulated reads are useful, for example, when developing or evaluating sequence assemblers targeted at PacBio data.

Please cite: Yukiteru Ono, Kiyoshi Asai and Michiaki Hamada: PBSIM: PacBio reads simulator - toward accurate genome assembly. (PubMed,eprint) Bioinformatics 29(1):119-121 (2013)
Pbsuite
software for Pacific Biosciences sequencing data
Versions of package pbsuite
ReleaseVersionArchitectures
stretch15.8.24+dfsg-1all
sid15.8.24+dfsg-1all
Popcon: users ( upd.)*
Versions and Archs
License: DFSG free
Git

The PBSuite contains two projects created for analysis of Pacific Biosciences long-read sequencing data.

  • PBJelly - genome upgrading tool
  • PBHoney - structural variation discovery
Pdb2pqr
Preparation of protein structures for electrostatics calculations
Versions of package pdb2pqr
ReleaseVersionArchitectures
wheezy1.8-1amd64,armel,armhf,i386,ia64,mips,mipsel,powerpc,s390,s390x,sparc
jessie1.9.0+dfsg-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch2.1.1+dfsg-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2.1.1+dfsg-2amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 16 users (80 upd.)*
Versions and Archs
License: DFSG free
Git

PDB2PQR is a Python software package that automates many of the common tasks of preparing structures for continuum electrostatics calculations. It thus provides a platform-independent utility for converting protein files in PDB format to PQR format. These tasks include:

  • Adding a limited number of missing heavy atoms to biomolecular structures
  • Determining side-chain pKas
  • Placing missing hydrogens
  • Optimizing the protein for favorable hydrogen bonding
  • Assigning charge and radius parameters from a variety of force fields

This package also includes PropKa, a tool to modify the protonation state of protein structures in the Protein Data Bank (PDB) format to match a given pKa value. It can also be used to refine NMR structures, which often yield inaccurate pKa values for some residues.

Please cite: Todd J Dolinsky, Paul Czodrowski, Hui Li, Jens E Nielsen, Jan H Jensen, Gerhard Klebe and Nathan A Baker: PDB2PQR: Expanding and upgrading automated preparation of biomolecular structures for molecular simulations. (PubMed,eprint) Nucleic Acids Research 35:W522-5 (2007)
Perlprimer
Graphical design of primers for PCR
Versions of package perlprimer
ReleaseVersionArchitectures
squeeze1.1.19-1all
wheezy1.1.21-1all
jessie1.1.21-2all
stretch1.1.21-2all
sid1.1.21-2all
Debtags of package perlprimer:
biologyformat:aln, nuceleic-acids, peptidic
fieldbiology, biology:molecular
interfacex11
networkclient
roleprogram
scopeutility
uitoolkittk
useanalysing
works-with-formatplaintext
x11application
Popcon: 65 users (32 upd.)*
Versions and Archs
License: DFSG free
Git

PerlPrimer is a free, open-source GUI application written in Perl that designs primers for standard Polymerase Chain Reaction (PCR), bisulphite PCR, real-time PCR (QPCR) and sequencing. It aims to automate and simplify the process of primer design.

If operated online, the tool communicates with the Ensembl project to retrieve the gene structure, so that the locations of exons and introns can be taken into account when designing the primers. The sequences themselves can be retrieved, too.

Please cite: Marshall, Owen J.: PerlPrimer: cross-platform, graphical primer design for standard, bisulphite and real-time PCR. (PubMed,eprint) Bioinformatics 20(15):2471-2472 (2004)
Perm
efficient mapping of short reads with periodic spaced seeds
Versions of package perm
ReleaseVersionArchitectures
jessie0.4.0-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch0.4.0-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid0.4.0-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (46 upd.)*
Versions and Archs
License: DFSG free
Svn

PerM is a software package designed to perform highly efficient genome-scale alignments for hundreds of millions of short reads produced by the ABI SOLiD and Illumina sequencing platforms. Today PerM is capable of providing full sensitivity for alignments within 4 mismatches for 50bp SOLiD reads and 9 mismatches for 100bp Illumina reads.

Please cite: Yangho Chen, Tade Souaiaia and Ting Chen: PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds. (PubMed,eprint) Bioinformatics 25(19):2514-21 (2009)
Phipack
PHI test and other tests of recombination
Versions of package phipack
ReleaseVersionArchitectures
stretch0.0.20160614-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid0.0.20160614-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The PhiPack software package implements a few tests for recombination and can produce refined incompatibility matrices as well. Specifically, PhiPack implements the 'Pairwise Homoplasy Index', Maximum Chi2 and the 'Neighbour Similarity Score'. The program Phi can be run to produce a p-value of recombination within a data set, and the program profile can be run to determine the regions exhibiting the strongest evidence of mosaicism.

Please cite: Trevor C. Bruen, Hervé Philippe and David Bryant: A Simple and Robust Statistical Test for Detecting the Presence of Recombination. (PubMed,eprint) Genetics 172(4):2665-2681 (2006)
Phybin
binning/clustering newick trees by topology
Versions of package phybin
ReleaseVersionArchitectures
stretch0.3-1amd64,arm64,armel,armhf,i386,powerpc,ppc64el,s390x
sid0.3-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,powerpc,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

PhyBin is a simple command line tool that classifies a set of Newick tree files by their topology. Its purpose is to take a large set of tree files and browse through the most common tree topologies.

It can do simple binning of identical trees or more complex clustering based on an all-to-all Robinson-Foulds distance matrix.
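
The distance behind the clustering mode can be sketched in Python. The version below is the rooted (clade-based) variant of the Robinson-Foulds distance on trees written as nested tuples; it is an illustration under those assumptions, not PhyBin's implementation, and it does not parse Newick files.

```python
def clades(tree):
    """Return (leaf_set, clades) for a tree given as nested tuples.

    A leaf is a string; an internal node is a tuple of subtrees.
    Each clade is the frozenset of leaf names below an internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def rf_distance(tree_a, tree_b):
    """Count the clades present in exactly one of the two trees."""
    return len(clades(tree_a)[1] ^ clades(tree_b)[1])


if __name__ == "__main__":
    t1 = ((("A", "B"), "C"), "D")
    t2 = ((("A", "C"), "B"), "D")
    print(rf_distance(t1, t2))
```

A distance of zero means identical topology, which corresponds to the simple binning case; an all-to-all matrix of these distances is the input for the clustering case.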

phybin produces output files that characterize the size and contents of each bin or cluster (including generating GraphViz-based visual representations of the tree topologies).

Please cite: Ryan R. Newton and Irene L.G. Newton: PhyBin: binning trees by topology. (PubMed,eprint) PeerJ 1:e187 (2013)
Phylip
package of programs for inferring phylogenies
Versions of package phylip
ReleaseVersionArchitectures
squeeze3.69-1 (non-free)amd64,armel,i386,ia64,mips,mipsel,powerpc,s390,sparc
wheezy3.69-1 (non-free)amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie3.696+dfsg-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch3.696+dfsg-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid3.696+dfsg-3amd64,arm64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package phylip:
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
useanalysing, comparing
works-with-formatplaintext
Popcon: 17 users (65 upd.)*
Versions and Archs
License: DFSG free
Git

The PHYLogeny Inference Package is a package of programs for inferring phylogenies (evolutionary trees) from sequences. Methods that are available in the package include parsimony, distance matrix, and likelihood methods, including bootstrapping and consensus trees. Data types that can be handled include molecular sequences, gene frequencies, restriction sites, distance matrices, and 0/1 discrete characters.

Phyml
Phylogenetic estimation using Maximum Likelihood
Versions of package phyml
ReleaseVersionArchitectures
stretch3.2.0+dfsg-2amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid3.2.0+dfsg-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
squeeze20100123-1amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy20110919-1amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie20120412-2amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid20120412-2hurd-i386,kfreebsd-amd64,kfreebsd-i386
upstream3.2.20160530
Debtags of package phyml:
biologypeptidic
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
useanalysing, comparing
works-withbiological-sequence
Popcon: 17 users (58 upd.)*
Newer upstream!
License: DFSG free
Git

PhyML is software that estimates maximum likelihood phylogenies from alignments of nucleotide or amino acid sequences. It provides a wide range of options that were designed to facilitate standard phylogenetic analyses. The main strength of PhyML lies in the large number of substitution models coupled to various options to search the space of phylogenetic tree topologies, going from very fast and efficient methods to slower but generally more accurate approaches. It also implements two methods to evaluate branch supports in a sound statistical framework (the non-parametric bootstrap and the approximate likelihood ratio test).

PhyML was designed to process moderate to large data sets. In theory, alignments with up to 4,000 sequences, each 2,000,000 characters long, can be analyzed. In practice, however, the amount of memory required to process a data set is proportional to the product of the number of sequences and their length. Hence, a large number of sequences can only be processed provided that they are short. Likewise, PhyML can handle long sequences provided that they are not numerous. With most standard personal computers, the “comfort zone” for PhyML generally lies around 3 to 500 sequences of fewer than 2,000 characters each.

This package also includes PhyTime.

Please cite: Stéphane Guindon: Bayesian estimation of divergence times from large sequence alignments. (PubMed,eprint) Molecular Biology and Evolution 27(8):1768-81 (2010)
Phyutility
simple analyses or modifications on both phylogenetic trees and data matrices
Versions of package phyutility
ReleaseVersionArchitectures
jessie2.7.3-1amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch2.7.3-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid2.7.3-1amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (48 upd.)*
Versions and Archs
License: DFSG free
Git

Phyutility (fyoo-til-i-te) is a command line program that performs simple analyses or modifications on both trees and data matrices.

Currently it performs the following functions (to suggest another feature, submit an Issue and use the label Type-Enhancement):

Trees

  • rerooting
  • pruning
  • type conversion
  • consensus
  • leaf stability
  • lineage movement
  • tree support

Data Matrices

  • concatenate alignments
  • genbank parsing
  • trimming alignments
  • search NCBI
  • fetch NCBI
Please cite: Stephen A. Smith and Casey W. Dunn: Phyutility: a phyloinformatics utility for trees, alignments, and molecular data. (PubMed,eprint) Bioinformatics 24(5):715-716 (2008)
Picard-tools
Command line tools to manipulate SAM and BAM files
Versions of package picard-tools
ReleaseVersionArchitectures
squeeze1.27-1all
wheezy1.46-1all
jessie1.113-1all
stretch2.5.0-gradle+dfsg-1all
sid2.5.0-gradle+dfsg-1all
Popcon: 16 users (78 upd.)*
Versions and Archs
License: DFSG free
Git

SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments. Picard Tools includes these utilities to manipulate SAM and BAM files:

 AddCommentsToBam                  FifoBuffer
 AddOrReplaceReadGroups            FilterSamReads
 BaitDesigner                      FilterVcf
 BamIndexStats                     FixMateInformation
 BamToBfq                          GatherBamFiles
 BedToIntervalList                 GatherVcfs
 BuildBamIndex                     GenotypeConcordance
 CalculateHsMetrics                IlluminaBasecallsToFastq
 CalculateReadGroupChecksum        IlluminaBasecallsToSam
 CheckIlluminaDirectory            LiftOverIntervalList
 CheckTerminatorBlock              LiftoverVcf
 CleanSam                          MakeSitesOnlyVcf
 CollectAlignmentSummaryMetrics    MarkDuplicates
 CollectBaseDistributionByCycle    MarkDuplicatesWithMateCigar
 CollectGcBiasMetrics              MarkIlluminaAdapters
 CollectHiSeqXPfFailMetrics        MeanQualityByCycle
 CollectIlluminaBasecallingMetrics MergeBamAlignment
 CollectIlluminaLaneMetrics        MergeSamFiles
 CollectInsertSizeMetrics          MergeVcfs
 CollectJumpingLibraryMetrics      NormalizeFasta
 CollectMultipleMetrics            PositionBasedDownsampleSam
 CollectOxoGMetrics                QualityScoreDistribution
 CollectQualityYieldMetrics        RenameSampleInVcf
 CollectRawWgsMetrics              ReorderSam
 CollectRnaSeqMetrics              ReplaceSamHeader
 CollectRrbsMetrics                RevertOriginalBaseQualitiesAndAddMateCigar
 CollectSequencingArtifactMetrics  RevertSam
 CollectTargetedPcrMetrics         SamFormatConverter
 CollectVariantCallingMetrics      SamToFastq
 CollectWgsMetrics                 ScatterIntervalsByNs
 CompareMetrics                    SortSam
 CompareSAMs                       SortVcf
 ConvertSequencingArtifactToOxoG   SplitSamByLibrary
 CreateSequenceDictionary          SplitVcfs
 DownsampleSam                     UpdateVcfSequenceDictionary
 EstimateLibraryComplexity         ValidateSamFile
 ExtractIlluminaBarcodes           VcfFormatConverter
 ExtractSequences                  VcfToIntervalList
 FastqToSam                        ViewSam
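
To make the SAM format mentioned above more concrete, the following Python sketch splits one alignment line into the eleven mandatory tab-separated SAM fields and decodes two FLAG bits. It is independent of Picard (which is a Java toolkit); the `SamRecord` type and helper names are assumptions made for this example, and the sample read is invented.

```python
from collections import namedtuple

# The eleven mandatory SAM columns, followed by any optional tags.
SamRecord = namedtuple(
    "SamRecord",
    "qname flag rname pos mapq cigar rnext pnext tlen seq qual tags",
)


def parse_sam_line(line):
    """Parse one non-header SAM alignment line (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], fields[11:],
    )


def is_unmapped(rec):
    return bool(rec.flag & 0x4)    # FLAG bit 0x4: segment unmapped


def is_reverse(rec):
    return bool(rec.flag & 0x10)   # FLAG bit 0x10: reverse-complemented


line = "read1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0"
rec = parse_sam_line(line)
print(rec.rname, rec.pos, is_reverse(rec), is_unmapped(rec))
```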
Piler
genomic repeat analysis
Versions of package piler
Release  Version       Architectures
stretch  0~20140707-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0~20140707-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

PILER (Parsimonious Inference of a Library of Elementary Repeats) searches a genome sequence for repetitive elements. It implements search algorithms that identify characteristic patterns of local alignments induced by certain classes of repeats.

Please cite: Robert C. Edgar and Eugene W. Myers: PILER: identification and classification of genomic repeats. (PubMed,eprint) Bioinformatics 21(suppl 1):i152-i158 (2005)
Placnet
Plasmid Constellation Network project
Versions of package placnet
Release  Version  Architectures
stretch  1.03-2   all
sid      1.03-2   all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Placnet is a new tool for plasmid analysis in NGS projects. Placnet is optimized to work with Illumina sequences, but it also works with 454, Ion Torrent or any other current sequencing technology.

The input to Placnet is a set of contigs and one or more SAM files containing the mapping of the reads against the contigs. Placnet produces a set of files that can easily be opened in Cytoscape or other network tools.

Please cite: Val F. Lanza, María de Toro, M. Pilar Garcillán-Barcia, Azucena Mora, Jorge Blanco, Teresa M. Coque and Fernando de la Cruz: Plasmid Flux in Escherichia coli ST131 Sublineages, Analyzed by Plasmid Constellation Network (PLACNET), a New Method for Plasmid Reconstruction from Whole Genome Sequences. (PubMed,eprint) PLOS Genetics 10(12):e1004766 (2014)
Plasmidomics
draw plasmids and vector maps with PostScript graphics export
Versions of package plasmidomics
Release  Version  Architectures
squeeze  0.2.0-2  all
wheezy   0.2.0-2  all
jessie   0.2.0-3  all
stretch  0.2.0-4  all
sid      0.2.0-4  all
Debtags of package plasmidomics:
field: biology, biology:molecular
interface: x11
role: program
scope: utility
uitoolkit: tk
works-with: image:vector
works-with-format: postscript
x11: application
Popcon: 7 users (46 upd.)*
Versions and Archs
License: DFSG free
Svn

Plasmidomics makes it easy to draw plasmids and vector maps for use in theses, presentations or other publications. It natively supports PostScript as an output format.

Plast
Parallel Local Sequence Alignment Search Tool
Versions of package plast
Release  Version       Architectures
stretch  2.3.1+dfsg-4  amd64
sid      2.3.1+dfsg-4  amd64,kfreebsd-amd64
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

PLAST is a fast, accurate and NGS-scalable bank-to-bank sequence similarity search tool that significantly accelerates seed-based heuristic comparison methods, such as the Blast suite of algorithms.

Relying on a unique software architecture, PLAST takes full advantage of recent multi-core personal computers without requiring any additional hardware devices.

Please cite: Van Hoa Nguyen and Dominique Lavenier: PLAST: parallel local alignment search tool for database comparison. (PubMed,eprint) BMC Bioinformatics 10:329 (2009)
Plink
whole-genome association analysis toolset
Versions of package plink
Release  Version  Architectures
squeeze  1.07-1   amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mipsel,powerpc,s390,sparc
wheezy   1.07-3   amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.07-3   amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.07-6   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.07-6   amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package plink:
field: biology, biology:bioinformatics
interface: commandline
role: program
Popcon: 47 users (33 upd.)*
Versions and Archs
License: DFSG free
Svn

plink expects as input the data from SNP (single nucleotide polymorphism) chips of many individuals and their phenotypical description of a disease. It finds associations of single or pairs of DNA variations with a phenotype and can retrieve SNP annotation from an online source.

SNPs can be evaluated individually or as pairs for their association with the disease phenotypes. The joint investigation of copy number variations is supported. A variety of statistical tests have been implemented.
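
As a rough illustration of what testing a single SNP for association with a disease phenotype can look like, here is a minimal Python sketch of a basic allelic chi-square test on a 2x2 table of allele counts in cases and controls. It is a textbook test chosen for this example, not code or output taken from plink; the function name and the counts are invented.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    case_a/case_b: counts of allele A and allele B among cases
    ctrl_a/ctrl_b: counts of allele A and allele B among controls
    Returns (chi-square statistic, p-value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    denom = ((case_a + case_b) * (ctrl_a + ctrl_b)
             * (case_a + ctrl_a) * (case_b + ctrl_b))
    if denom == 0:
        raise ValueError("every row and column of the table needs a non-zero total")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: allele A is more frequent among cases than among controls.
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
print(round(chi2, 3), round(p_value, 5))
```

A genome-wide analysis repeats such a test for every SNP, so the resulting p-values need correction for multiple testing.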

Please note: The executable was renamed to plink1 because of a name clash. Please read more about this in /usr/share/doc/README.Debian.

Please cite: Shaun Purcell, Benjamin Neale, Kathe Todd-Brown, Lori Thomas, Manuel A. R. Ferreira, David Bender, Julian Maller, Pamela Sklar, Paul I. W. de Bakker, Mark J. Daly and Pak C. Sham: PLINK: a toolset for whole-genome association and population-based linkage analysis. (PubMed) American Journal of Human Genetics 81(3):559-75 (2007)
Plink1.9
whole-genome association analysis toolset
Versions of package plink1.9
Release   Version              Architectures
stretch   1.90~b3.36-160416-1  amd64,armel,armhf,i386,mipsel
sid       1.90~b3.36-160416-1  amd64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mipsel
upstream  160705
Popcon: 1 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

plink expects as input the data from SNP (single nucleotide polymorphism) chips of many individuals and their phenotypical description of a disease. It finds associations of single or pairs of DNA variations with a phenotype and can retrieve SNP annotation from an online source.

SNPs can be evaluated individually or as pairs for their association with the disease phenotypes. The joint investigation of copy number variations is supported. A variety of statistical tests have been implemented.

plink1.9 is a comprehensive update of plink with new algorithms and new methods. It is faster and uses less memory than the first plink.

Please note: The executable was renamed to plink1.9 because of a name clash. Please read more about this in /usr/share/doc/README.Debian.

Please cite: Christopher C. Chang, Carson C. Chow, Laurent C.A.M. Tellier, Shashaank Vattikuti, Shaun M. Purcell and James J. Lee: Second-generation PLINK: rising to the challenge of larger and richer datasets. (eprint) GigaScience 4(1):7 (2015)
Plip
fully automated protein-ligand interaction profiler
Versions of package plip
Release   Version       Architectures
stretch   1.3.1+dfsg-1  all
sid       1.3.1+dfsg-1  all
upstream  1.3.1a
Popcon: 1 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

The Protein-Ligand Interaction Profiler (PLIP) is a tool to analyze and visualize protein-ligand interactions in PDB files.

Features include:

  • Detection of eight different types of noncovalent interactions
  • Automatic detection of relevant ligands in a PDB file
  • Direct download of PDB structures from wwPDB server if valid PDB ID is given
  • Processing of custom PDB files containing protein-ligand complexes (e.g. from docking)
  • No need for special preparation of a PDB file, works out of the box
  • Atom-level interaction reports in rST and XML formats for easy parsing
  • Generation of PyMOL session files (.pse) for each pairing, enabling easy preparation of images for publications and talks
  • Rendering of preview image for each ligand and its interactions with the protein
Please cite: Sebastian Salentin, Sven Schreiber, V. Joachim Haupt, Melissa F. Adasme and Michael Schroeder: PLIP: fully automated protein–ligand interaction profiler. (eprint) Nucleic Acids Research (W1) (2015)
Poa
Partial Order Alignment for multiple sequence alignment
Versions of package poa
Release  Version         Architectures
squeeze  2.0+20060928-2  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   2.0+20060928-3  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.0+20060928-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.0+20060928-4  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.0+20060928-4  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package poa:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
works-with-format: plaintext
Popcon: 16 users (54 upd.)*
Versions and Archs
License: DFSG free
Svn

POA is Partial Order Alignment, a fast program for multiple sequence alignment (MSA) in bioinformatics. Its advantages are speed, scalability, sensitivity, and the superior ability to handle branching / indels in the alignment. Partial order alignment is an approach to MSA which can be combined with existing methods such as progressive alignment. POA optimally aligns a pair of MSAs and can therefore be applied directly to progressive alignment methods such as CLUSTAL. For large alignments, Progressive POA is 10-30 times faster than CLUSTALW.

Populations
population genetic software
Versions of package populations
Release  Version                 Architectures
wheezy   1.2.33+svn0120106-2.1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.2.33+svn0120106-2.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.2.33+svn0120106-2.1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      1.2.33+svn0120106-2.1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package populations:
role: program
uitoolkit: qt
Popcon: 5 users (47 upd.)*
Versions and Archs
License: DFSG free

Populations is population genetics software. It computes genetic distances between populations or individuals and builds phylogenetic trees (NJ or UPGMA) with bootstrap values.
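
To show the idea behind building a UPGMA tree from pairwise distances, here is a minimal Python sketch. It is not derived from the Populations source code; the function name, the input layout and the distance values are assumptions for this example. It returns only the tree topology in Newick notation, without branch lengths or bootstrap values.

```python
def upgma(names, distances):
    """Cluster taxa by UPGMA and return a Newick topology string.

    names: list of taxon labels
    distances: dict mapping (label, label) pairs to a distance
    """
    sizes = {name: 1 for name in names}   # cluster label -> number of leaves
    d = {frozenset(pair): value for pair, value in distances.items()}
    while len(sizes) > 1:
        # Merge the two clusters with the smallest distance.
        a, b = sorted(min(d, key=d.get))
        merged = "(%s,%s)" % (a, b)
        new_size = sizes[a] + sizes[b]
        # Distance to every other cluster is the size-weighted average.
        for other in sizes:
            if other in (a, b):
                continue
            d[frozenset((merged, other))] = (
                sizes[a] * d[frozenset((a, other))]
                + sizes[b] * d[frozenset((b, other))]
            ) / new_size
        # Drop all entries that still refer to the merged clusters.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        del sizes[a], sizes[b]
        sizes[merged] = new_size
    return next(iter(sizes)) + ";"


# Invented distances between four populations.
dist = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("B", "C"): 6.0,
    ("A", "D"): 10.0, ("B", "D"): 10.0, ("C", "D"): 10.0,
}
tree = upgma(["A", "B", "C", "D"], dist)
print(tree)
```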

Poretools
toolkit for nanopore nucleotide sequencing data
Versions of package poretools
Release  Version  Architectures
stretch  0.5.1-1  all
sid      0.5.1-1  all
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

poretools is a flexible toolkit for exploring datasets generated by nanopore sequencing devices such as the MinION, for the purposes of quality control and downstream analysis. Poretools operates directly on the native FAST5 file format (a variant of the HDF5 standard) produced by ONT (Oxford Nanopore Technologies) and provides a wealth of format conversion utilities and data exploration and visualization tools.

Please cite: Nicholas Loman and Aaron Quinlan: Poretools: a toolkit for analyzing nanopore sequence data. (PubMed,eprint) Bioinformatics 30(23):3399-3401 (2014)
Prank
Probabilistic Alignment Kit for DNA, codon and amino-acid sequences
Versions of package prank
Release  Version       Architectures
jessie   0.0.140110-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.0.150803-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.0.150803-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 15 users (65 upd.)*
Versions and Archs
License: DFSG free
Svn

PRANK is a probabilistic multiple alignment program for DNA, codon and amino-acid sequences. It's based on a novel algorithm that treats insertions correctly and avoids over-estimation of the number of deletion events. In addition, PRANK borrows ideas from maximum likelihood methods used in phylogenetics and correctly takes into account the evolutionary distances between sequences. Lastly, PRANK allows for defining a potential structure for sequences to be aligned and then, simultaneously with the alignment, predicts the locations of structural units in the sequences.

PRANK is a command-line program for UNIX-style environments but the same sequence alignment engine is implemented in the graphical program PRANKSTER. In addition to providing a user-friendly interface to those not familiar with Unix systems, PRANKSTER is an alignment browser for alignments saved in the HSAML format. The novel format allows for storing all the information generated by the aligner and the alignment browser is a convenient way to analyse and manipulate the data.

PRANK aims at an evolutionarily correct sequence alignment and often the result looks different from ones generated with other alignment methods. There are, however, cases where the different look is caused by violations of the method's assumptions. To understand why things may go wrong and how to avoid that, read this explanation of differences between PRANK and traditional progressive alignment methods.

Predictnls
prediction and analysis of protein nuclear localization signals
Versions of package predictnls
Release  Version   Architectures
jessie   1.0.20-1  all
stretch  1.0.20-3  all
sid      1.0.20-3  all
Popcon: 9 users (50 upd.)*
Versions and Archs
License: DFSG free
Svn

predictnls is a method for the prediction and analysis of protein nuclear localization signals (NLS). In addition to reporting the positions of NLSs found, predictnls also gives short statistics.

Please cite: Murat Cokol, Rajesh Nair and Burkhard Rost: Finding nuclear localization signals. (PubMed,eprint) EMBO reports 1(5):411-415 (2000)
Predictprotein
suite of protein sequence analysis tools
Versions of package predictprotein
Release  Version   Architectures
jessie   1.1.06-1  all
stretch  1.1.07-1  all
sid      1.1.07-1  all
Popcon: 7 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

PredictProtein is a sequence analysis suite providing prediction of protein structure and function.

PredictProtein takes a protein sequence as input and provides the following per-residue, or whole protein annotations:

  • secondary structure
  • solvent accessibility
  • multiple sequence alignments
  • PROSITE sequence motifs
  • low-complexity regions
  • nuclear localisation signals
  • regions lacking regular structure (NORS)
  • unstructured loops
  • transmembrane helices
  • transmembrane beta barrels
  • coiled-coil regions
  • disulfide-bonds
  • disordered regions
  • B-value flexibility
  • protein-protein interaction sites
  • Gene Ontology terms
Please cite: Burkhard Rost, Guy Yachdav and Jinfeng Liu: The PredictProtein server. (PubMed,eprint) Nucleic Acids Research 32(2):W321-W326 (2004)
Prime-phylo
Bayesian estimation of gene trees taking the species tree into account
Versions of package prime-phylo
Release  Version   Architectures
jessie   1.0.11-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el
sid      1.0.11-3  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 6 users (24 upd.)*
Versions and Archs
License: DFSG free
Git

PrIME (Probabilistic Integrated Models of Evolution) is a package supporting inference of evolutionary parameters in a Bayesian framework using Markov chain Monte Carlo simulation. A distinguishing feature of PrIME is that the species tree is taken into account when analyzing gene trees.

The input data to PrIME is a multiple sequence alignment in FASTA format and the output data contains trees in Newick format.
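
For readers unfamiliar with the input format, the following Python sketch reads a multiple sequence alignment in FASTA format and checks that all rows have the same length. It does not use or reproduce any PrIME code; the function name and the sample sequences are assumptions for this example.

```python
def read_fasta_alignment(text):
    """Parse FASTA text into {name: sequence} and verify equal lengths."""
    parts = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header without a sequence name")
            name = header[0]               # first word of the header line
            if name in parts:
                raise ValueError("duplicate sequence name: " + name)
            parts[name] = []
        elif name is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            parts[name].append(line)       # sequences may span several lines
    seqs = {n: "".join(chunks) for n, chunks in parts.items()}
    if len({len(s) for s in seqs.values()}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return seqs


example = """>geneA_human some description
AC-GT
AA
>geneA_mouse
ACGGTAA
"""
alignment = read_fasta_alignment(example)
print(alignment)
```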

Please cite: Ö. Åkerborg, B. Sennblad, L. Arvestad and J. Lagergren: Simultaneous Bayesian gene tree reconstruction and reconciliation analysis. (PubMed,eprint) Proceedings of the National Academy of Sciences 106(14):5714-5719 (2009)
Primer3
tool to design flanking oligonucleotides for DNA amplification
Versions of package primer3
Release  Version  Architectures
squeeze  1.1.4-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   2.2.3-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   2.3.6-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  2.3.7-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      2.3.7-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package primer3:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
works-with-format: plaintext
Popcon: 25 users (111 upd.)*
Versions and Archs
License: DFSG free
Git

Primer3 picks primers for Polymerase Chain Reactions (PCRs), considering as criteria oligonucleotide melting temperature, size, GC content and primer-dimer possibilities, PCR product size, positional constraints within the source sequence, and miscellaneous other constraints. All of these criteria are user-specifiable as constraints, and some are specifiable as terms in an objective function that characterizes an optimal primer pair.
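
Three of the criteria named above (oligonucleotide size, GC content and melting temperature) can be sketched in a few lines of Python. This is only an illustration: the melting temperature uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C), as a stand-in and is not claimed to match what Primer3 computes. The function names and the threshold defaults are hypothetical.

```python
def gc_content(oligo):
    """Fraction of G and C bases in an oligonucleotide."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string over A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(oligo):
    """Rough melting temperature in degrees Celsius (Wallace rule)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def passes_constraints(oligo, min_len=18, max_len=25,
                       min_gc=0.40, max_gc=0.60,
                       min_tm=52, max_tm=64):
    """Check a candidate primer against hypothetical user constraints."""
    return (min_len <= len(oligo) <= max_len
            and min_gc <= gc_content(oligo) <= max_gc
            and min_tm <= wallace_tm(oligo) <= max_tm)


candidate = "ACGTACGTACGTACGTACGT"
print(gc_content(candidate), wallace_tm(candidate), passes_constraints(candidate))
```

Hard limits like these correspond to the "constraints" in the description; an objective function would instead score how far each candidate lies from its preferred values.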

Please cite: Steve Rozen and Helen J. Skaletsky: Primer3 on the WWW for general users and for biologist programmers. (PubMed,eprint) Methods Mol Biol. 132(3):365-86 (2000)
Proalign
Probabilistic multiple alignment program
Versions of package proalign
Release  Version  Architectures
jessie   0.603-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.603-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      0.603-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 7 users (47 upd.)*
Versions and Archs
License: DFSG free
Svn

ProAlign performs probabilistic sequence alignments using hidden Markov models (HMM). It includes a graphical user interface (GUI) that lets users (i) align nucleotide or amino-acid sequences, (ii) view the quality of solutions, (iii) filter out unreliable alignment regions and (iv) export alignments to other software.

ProAlign uses a progressive method, such that multiple alignment is created stepwise by performing pairwise alignments in the nodes of a guide tree. Sequences are described with vectors of character probabilities, and each pairwise alignment reconstructs the ancestral (parent) sequence by computing the probabilities of different characters according to an evolutionary model.

Please cite: Ari Löytynoja and Michel C Milinkovitch: A hidden Markov model for progressive multiple alignment. (PubMed,