Debian Med Project
Help us to see Debian used by medical practitioners and biomedical researchers! Join us on the Salsa page.
Summary
Biology
Debian Med bioinformatics packages

This metapackage will install Debian packages for use in molecular biology, structural biology and other biological sciences.

Description

For a better overview of the project's availability as a Debian package, each head row has a color code according to this scheme:

If you discover a project that looks to you like a good candidate for Debian Med, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Med mailing list.

Debian Med Biology packages

Official Debian packages with high relevance

abacas
close gaps in genomic alignments from short reads
Versions of package abacas
Release | Version | Architectures
bullseye | 1.3.1-9 | all
jessie | 1.3.1-2 | all
stretch | 1.3.1-3 | all
sid | 1.3.1-9 | all
buster | 1.3.1-5 | all
bookworm | 1.3.1-9 | all
trixie | 1.3.1-9 | all
Debtags of package abacas:
role: program
Popcon: 3 users (0 upd.)*
License: DFSG free

ABACAS (Algorithm Based Automatic Contiguation of Assembled Sequences) intends to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence.

ABACAS uses MUMmer to find alignment positions and identify syntenies of assembled contigs against the reference. The output is then processed to generate a pseudomolecule, taking overlapping contigs and gaps into account. ABACAS generates a comparison file that can be used to visualize ordered and oriented contigs in ACT. Synteny is represented by red bars whose colour intensity decreases with lower values of percent identity between comparable blocks. Information on contigs such as the orientation, percent identity, coverage and overlap with other contigs can also be visualized by loading the output feature file into ACT.

The package is enhanced by the following packages: abacas-examples
Please cite: Samuel Assefa, Thomas M. Keane, Thomas D. Otto, Chris Newbold and Matthew Berriman: ABACAS: algorithm-based automatic contiguation of assembled sequences. (PubMed,eprint) Bioinformatics 25(15):1968-1969 (2009)
Topics: Probes and primers
abpoa
adaptive banded Partial Order Alignment
Versions of package abpoa
Release | Version | Architectures
sid | 1.5.2-2 | amd64,arm64,ppc64el
trixie | 1.5.2-2 | amd64,arm64,ppc64el
bookworm | 1.4.1-3 | amd64,arm64,ppc64el
Popcon: 1 users (2 upd.)*
License: DFSG free

abPOA is an extended version of Partial Order Alignment (POA) that performs adaptive banded dynamic programming (DP) with an SIMD implementation. abPOA can perform multiple sequence alignment (MSA) on a set of input sequences and generate a consensus sequence by applying the heaviest bundling algorithm to the final alignment graph.

abPOA can generate high-quality consensus sequences from error-prone long reads and offer significant speed improvement over existing tools.

abPOA supports three alignment modes (global, local, extension) and flexible scoring schemes that allow linear, affine and convex gap penalties. It currently supports SSE2/SSE4.1/AVX2 vectorization.

For more information, please refer to the paper published in Bioinformatics.

Please cite: Yan Gao, Yongzhuang Liu, Yanmei Ma, Bo Liu, Yadong Wang and Yi Xing: abPOA: an SIMD-based C library for fast partial order alignment using adaptive band. Bioinformatics 37(15):2209–2211 (2021)
abyss
de novo, parallel, sequence assembler for short reads
Versions of package abyss
Release | Version | Architectures
bullseye | 2.2.5+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm | 2.3.5+dfsg-2 | amd64,arm64,mips64el,ppc64el,s390x
buster | 2.1.5-7 | amd64,arm64,armhf,i386
stretch-backports | 2.1.5-7~bpo9+1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid | 2.3.9-1 | amd64,arm64,mips64el,ppc64el,riscv64
stretch | 2.0.2-2 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie | 1.5.2-1 (non-free) | amd64
trixie | 2.3.9-1 | amd64,arm64,mips64el,ppc64el,riscv64
Debtags of package abyss:
role: program
Popcon: 2 users (2 upd.)*
License: DFSG free

ABySS is a de novo, parallel, sequence assembler that is designed for short reads. It may be used to assemble genome or transcriptome sequence data. Parallelization is achieved using MPI, OpenMP and pthread.

Please cite: Shaun D. Jackman, Benjamin P. Vandervalk, Hamid Mohamadi, Justin Chu, Sarah Yeo, S. Austin Hammond, Golnaz Jahesh, Hamza Khan, Lauren Coombe, Rene L. Warren and İnanç Birol: "ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter". (PubMed,eprint) Genome Research 27(5):768-777 (2017)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequence assembly
acedb-other
retrieval of DNA or protein sequences
Versions of package acedb-other
Release | Version | Architectures
sid | 4.9.39+dfsg.02-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie | 4.9.39+dfsg.01-5 | amd64,armel,armhf,i386
trixie | 4.9.39+dfsg.02-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm | 4.9.39+dfsg.02-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye | 4.9.39+dfsg.02-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster | 4.9.39+dfsg.02-4 | amd64,arm64,armhf,i386
stretch | 4.9.39+dfsg.02-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Debtags of package acedb-other:
biology: nuceleic-acids
field: biology, biology:bioinformatics
role: program
scope: utility
Popcon: 3 users (0 upd.)*
License: DFSG free

This package collects the small applications that acedb builds under the 'other' target of its Makefile.

efetch: presumably short for 'entry fetch'; collects sequence information from common DNA and protein databases.

Please cite: L. D. Stein and J. Thierry-Mieg: AceDB: a genome database management system. Computing in Science and Engineering 1(3):44-52 (1999)
Registry entries: Bio.tools 
adapterremoval
rapid adapter trimming, identification, and read merging of gene sequences
Versions of package adapterremoval
Release | Version | Architectures
bookworm | 2.3.3-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie | 2.3.3-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye | 2.3.1-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid | 2.3.3-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster | 2.2.3-1 | amd64,arm64,armhf,i386
stretch | 2.2.0-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
upstream | 2.3.4 |
Popcon: 2 users (1 upd.)*
Newer upstream!
License: DFSG free

This program searches for and removes remnant adapter sequences from High-Throughput Sequencing (HTS) data and (optionally) trims low quality bases from the 3' end of reads following adapter removal. AdapterRemoval can analyze both single end and paired end data, and can be used to merge overlapping paired-ended reads into (longer) consensus sequences. Additionally, AdapterRemoval may be used to recover a consensus adapter sequence for paired-ended data for which this information is not available.
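
The idea of 3' adapter trimming can be illustrated with a minimal sketch. It assumes exact string matching only; AdapterRemoval itself is a far more capable tool, and the function name `trim_adapter` and the `min_overlap` parameter are hypothetical, chosen for this illustration rather than taken from the program.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter from a read using exact matching (illustrative only)."""
    # Case 1: the complete adapter occurs inside the read; cut from there.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adapter.
    # Try the longest possible overlap first, down to min_overlap bases.
    longest = min(len(adapter), len(read))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"  # example adapter sequence, assumed for the demo
    print(trim_adapter("ACGTACGTAGATCGGAAG", adapter))
    print(trim_adapter("ACGTAGATCGGAAGAGCTTT", adapter))
```

A real trimmer would additionally tolerate mismatches and take base qualities into account, which this sketch deliberately leaves out.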

The package is enhanced by the following packages: multiqc
Please cite: Mikkel Schubert, Stinus Lindgreen and Ludovic Orlando: AdapterRemoval v2: rapid adapter trimming, identification, and read merging. (PubMed,eprint) BMC Research Notes 9:88 (2016)
Registry entries: SciCrunch 
adun-core
Molecular Simulator
Versions of package adun-core
Release | Version | Architectures
trixie | 0.81-14 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
sid | 0.81-14 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm | 0.81-14 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster | 0.81-13 | amd64,arm64,armhf,i386
stretch | 0.81-9 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye | 0.81-14 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 9 users (4 upd.)*
License: DFSG free

Adun is a biomolecular simulator that also includes data management and analysis capabilities. It was developed at the Computational Biophysics and Biochemistry Laboratory, a part of the Research Unit on Biomedical Informatics of the UPF.

This package contains the AdunCore program and the Adun server. If you want the graphical UI frontend, install the adun.app package.

Please cite: Michael A. Johnston, Ignacio Fdez. Galván and Jordi Villà-Freixa: Framework-based design of a new all-purpose molecular simulation application: The Adun simulator. (PubMed) J. Comp. Chem. 26(15):1647-1659 (2005)
aegean
integrated genome analysis toolkit
Versions of package aegean
Release | Version | Architectures
stretch | 0.15.2+dfsg-1 | amd64,arm64,armel,armhf,i386,mipsel,ppc64el,s390x
bullseye | 0.16.0+dfsg-2 | amd64,arm64,armel,armhf,i386,mipsel,ppc64el,s390x
bookworm | 0.16.0+dfsg-2 | amd64,arm64,armel,armhf,i386,mipsel,ppc64el,s390x
sid | 0.16.0+dfsg-4 | amd64,arm64,armel,armhf,i386,ppc64el,s390x
trixie | 0.16.0+dfsg-4 | amd64,arm64,armel,armhf,i386,ppc64el,s390x
buster | 0.16.0+dfsg-1 | amd64,arm64,armhf,i386
Popcon: 2 users (0 upd.)*
License: DFSG free

The AEGeAn Toolkit is designed for the Analysis and Evaluation of Genome Annotations. The toolkit includes a variety of analysis programs, e.g. for comparing distinct sets of gene structure annotations (ParsEval), computation of gene loci (LocusPocus) and more.

Please cite: Daniel S Standage and Volker P Brendel: ParsEval: parallel comparison and analysis of gene structure annotations. (PubMed,eprint) BMC Bioinformatics 13(1):187 (2012)
Topics: Sequencing
aevol
digital genetics model to run Evolution Experiments in silico
Versions of package aevol
Release | Version | Architectures
buster | 5.0-2 | amd64,arm64,armhf,i386
bullseye | 5.0+ds-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm | 5.0+ds-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie | 5.0+ds-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid | 5.0+ds-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie | 4.4-1 | amd64,armel,armhf,i386
stretch | 4.4-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Popcon: 3 users (1 upd.)*
License: DFSG free

Aevol is a digital genetics model: populations of digital organisms are subjected to a process of selection and variation, which creates Darwinian dynamics.

By modifying the characteristics of selection (e.g. population size, type of environment, environmental variations) or variation (e.g. mutation rates, chromosomal rearrangement rates, types of rearrangements, horizontal transfer), one can study experimentally the impact of these parameters on the structure of the evolved organisms. In particular, since Aevol integrates a precise and realistic model of the genome, it allows for the study of structural variations of the genome (e.g. number of genes, synteny, proportion of coding sequences).

The simulation platform comes along with a set of tools for analysing phylogenies and measuring many characteristics of the organisms and populations along evolution.

Please cite: Dusan Misevic, Antoine Frenoy, David P. Parsons and Francois Taddei: Effects of public good properties on the evolution of cooperation. (eprint) :218-225 (2012)
alien-hunter
Interpolated Variable Order Motifs to identify horizontally acquired DNA
Versions of package alien-hunter
Release | Version | Architectures
sid | 1.7-10 | all
stretch | 1.7-5 | all
bullseye | 1.7-8 | all
buster | 1.7-7 | all
trixie | 1.7-10 | all
bookworm | 1.7-10 | all
jessie | 1.7-3 | all
Debtags of package alien-hunter:
field: biology, biology:structural
role: program
scope: utility
use: analysing
Popcon: 4 users (0 upd.)*
License: DFSG free

Alien_hunter is an application for the prediction of putative Horizontal Gene Transfer (HGT) events with the implementation of Interpolated Variable Order Motifs (IVOMs). An IVOM approach exploits compositional biases using variable order motif distributions and captures the local composition of a sequence more reliably than fixed-order methods. Optionally, the predictions can be parsed into a 2-state 2nd order Hidden Markov Model (HMM), in a change-point detection framework, to optimize the localization of the boundaries of the predicted regions. The predictions (EMBL format) can be automatically loaded into the Artemis genome viewer, freely available at: http://www.sanger.ac.uk/Software/Artemis/.

Please cite: Georgios S. Vernikos and Julian Parkhill: Interpolated variable order motifs for identification of horizontally acquired DNA: revisiting the Salmonella pathogenicity islands. (PubMed,eprint) Bioinformatics 22(18):2196-2203 (2006)
Registry entries: SciCrunch 
alter-sequence-alignment
genomic sequences ALignment Transformation EnviRonment
Versions of package alter-sequence-alignment
Release | Version | Architectures
bookworm | 1.3.4-6 | all
stretch | 1.3.3+dfsg-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie | 1.3.4-8 | all
sid | 1.3.4-8 | all
buster | 1.3.4-2 | all
bullseye | 1.3.4-4 | all
Popcon: 2 users (1 upd.)*
License: DFSG free

ALTER (ALignment Transformation EnviRonment) is a tool to transform between multiple sequence alignment formats. ALTER focuses on the specifications of mainstream alignment and analysis programs rather than on the conversion among more or less specific formats.

Please cite: Daniel Glez-Peña, Daniel Gómez-Blanco, Miguel Reboiro-Jato, Florentino Fdez-Riverola and David Posada: ALTER: program-oriented conversion of DNA and protein alignments. (PubMed,eprint) Nucl. Acids Res. 38(suppl 2):W14-W18 (2010)
Registry entries: Bio.tools  SciCrunch 
altree
program to perform phylogeny-based association and localization analysis
Versions of package altree
Release | Version | Architectures
jessie | 1.3.1-2 | amd64,armel,armhf,i386
bookworm | 1.3.2-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye | 1.3.1-10 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie | 1.3.2-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster | 1.3.1-7 | amd64,arm64,armhf,i386
stretch | 1.3.1-4 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid | 1.3.2-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Debtags of package altree:
field: biology, biology:bioinformatics
interface: commandline
role: program, shared-lib
scope: utility
use: analysing, comparing
works-with-format: plaintext
Popcon: 3 users (2 upd.)*
License: DFSG free

ALTree was designed to perform association detection and localization of susceptibility sites using haplotype phylogenetic trees: first, it allows the detection of an association between a candidate gene and a disease, and second, it makes it possible to form hypotheses about the susceptibility loci.

Please cite: Claire Bardel, Vincent Danjean and Emmanuelle Genin: ALTree: association detection and localization of susceptibility sites using haplotype phylogenetic trees. (PubMed,eprint) Bioinformatics 22(11):1402-1403 (2006)
Registry entries: SciCrunch 
amap-align
Protein multiple alignment by sequence annealing
Versions of package amap-align
Release | Version | Architectures
bookworm | 2.2+git20080214.600fc29+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster | 2.2+git20080214.600fc29+dfsg-1 | amd64,arm64,armhf,i386
stretch | 2.2-6 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid | 2.2+git20080214.600fc29+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie | 2.2-4 | amd64,armel,armhf,i386
trixie | 2.2+git20080214.600fc29+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye | 2.2+git20080214.600fc29+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package amap-align:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing
works-with-format: plaintext
Popcon: 13 users (10 upd.)*
License: DFSG free

AMAP is a command line tool to perform multiple alignment of peptidic sequences. It utilizes posterior decoding, and a sequence-annealing alignment, instead of the traditional progressive alignment method. It is the only alignment program that allows one to control the sensitivity / specificity tradeoff. It is based on the ProbCons source code, but uses alignment metric accuracy and eliminates the consistency transformation.

The Java visualisation tool of AMAP 2.2 is not yet packaged in Debian.

Please cite: Ariel S. Schwartz and Lior Pachter: Multiple alignment by sequence annealing. (eprint) Bioinformatics 23(2):e24-e29 (2007)
Registry entries: SciCrunch  Bioconda 
Remark of Debian Med team: Dead upstream

The homepage of this project vanished as well as the Download area. An old unmaintained version remained at code.google.com. Please drop the maintainer a note if you have any news of this project.

ampliconnoise
removal of noise from 454 sequenced PCR amplicons
Versions of package ampliconnoise
Release | Version | Architectures
jessie | 1.29-2 | amd64,armel,armhf,i386
sid | 1.29-13 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm | 1.29-10 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye | 1.29-9 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster | 1.29-8 | amd64,arm64,armhf,i386
stretch | 1.29-6 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Debtags of package ampliconnoise:
role: program
Popcon: 1 users (1 upd.)*
License: DFSG free

AmpliconNoise is a package of applications to clean up high-throughput sequence data. It consists of three main parts:

 1. Pyronoise - does flowgram-based clustering to spot misreads
 2. SeqNoise - removes PCR point mutations
 3. Perseus - removes PCR chimeras without the need for a set of reference sequences

Previously there was a standalone "Pyronoise" by the same authors and this package includes an updated version. There is also a "Denoiser" in Qiime which is related but distinct.

Please cite: Christopher Quince, Anders Lanzen, Russell J Davenport and Peter J Turnbaugh: Removing Noise From Pyrosequenced Amplicons. (PubMed,eprint) BMC Bioinformatics 12:38 (2011)
Registry entries: Bio.tools  SciCrunch 
Topics: Sequencing
andi
Efficient Estimation of Evolutionary Distances
Versions of package andi
Release | Version | Architectures
bullseye | 0.13-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm | 0.14-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster | 0.12-4 | amd64,arm64,armhf,i386
stretch | 0.10-2 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid | 0.14-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie | 0.14-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
License: DFSG free

This is the andi program for estimating the evolutionary distance between closely related genomes. These distances can be used to rapidly infer phylogenies for big sets of genomes. Because andi does not compute full alignments, it is so efficient that it scales even up to thousands of bacterial genomes.

Please cite: Bernhard Haubold, Fabian Klötzl and Peter Pfaffelhuber: andi: Fast and accurate estimation of evolutionary distances between closely related genomes. (PubMed,eprint) Bioinformatics 31(8):1169-1175 (2015)
Registry entries: Bio.tools  Bioconda 
Topics: Phylogenetics
anfo
Short Read Aligner/Mapper from MPG
Versions of package anfo
Release | Version | Architectures
stretch | 0.98-5 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid | 0.98-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
bullseye | 0.98-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie | 0.98-4 | amd64,armel,armhf,i386
buster | 0.98-7 | amd64,arm64,armhf,i386
Popcon: 1 users (0 upd.)*
License: DFSG free

Anfo is a mapper in the spirit of Soap/Maq/Bowtie, but its implementation takes more after BLAST/BLAT. It's most useful for the alignment of sequencing reads where the DNA sequence is somehow modified (think ancient DNA or bisulphite treatment) and/or there is more divergence between sample and reference than what fast mappers will handle gracefully (say the reference genome is missing and a related species is used instead).

Registry entries: SciCrunch 
Topics: Sequencing
any2fasta
convert various sequence formats to FASTA
Versions of package any2fasta
Release | Version | Architectures
bookworm | 0.4.2-2 | all
trixie | 0.4.2-2 | all
sid | 0.4.2-2 | all
Popcon: 1 users (1 upd.)*
License: DFSG free

Established tools like readseq and seqret from EMBOSS both create mangled IDs containing | or . characters, and there is no way to fix this behaviour. This results in inconsistencies between .gbk and .fna versions of files in pipelines.

This script uses only core Perl modules, has no other dependencies like Bioperl or Biopython, and runs very quickly.

It supports the following input formats:

 1. Genbank flat file, typically .gb, .gbk, .gbff (starts with LOCUS)
 2. EMBL flat file, typically .embl, (starts with ID)
 3. GFF with sequence, typically .gff, .gff3 (starts with ##gff)
 4. FASTA DNA, typically .fasta, .fa, .fna, .ffn (starts with >)
 5. FASTQ DNA, typically .fastq, .fq (starts with @)
 6. CLUSTAL alignments, typically .clw, .clu (starts with CLUSTAL or MUSCLE)
 7. STOCKHOLM alignments, typically .sth (starts with # STOCKHOLM)
 8. GFA assembly graph, typically .gfa (starts with ^[A-Z]\t)

Files may be compressed with:

 1. gzip, typically .gz
 2. bzip2, typically .bz2
 3. zip, typically .zip
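
The detection rules listed above (a format is recognised by how the file starts) can be sketched in a few lines of Python. This is an illustration only, not the any2fasta script, which is written in Perl; the function names `sniff_format` and `sniff_compression` are hypothetical. The start-of-file patterns are taken from the list above, while the compression magic numbers are the standard ones for gzip, bzip2 and zip and are not stated in the package description.

```python
import re

# (format name, pattern expected at the start of the first line),
# following the "starts with" hints in the list of input formats.
SIGNATURES = [
    ("genbank", re.compile(r"LOCUS")),
    ("embl", re.compile(r"ID")),
    ("gff", re.compile(r"##gff")),
    ("fasta", re.compile(r">")),
    ("fastq", re.compile(r"@")),
    ("clustal", re.compile(r"CLUSTAL|MUSCLE")),
    ("stockholm", re.compile(r"# STOCKHOLM")),
    ("gfa", re.compile(r"[A-Z]\t")),
]

# Standard magic numbers (an addition for this sketch, not from the source).
MAGIC = [
    ("gzip", b"\x1f\x8b"),
    ("bzip2", b"BZh"),
    ("zip", b"PK\x03\x04"),
]


def sniff_format(first_line: str):
    """Return the guessed format name for the first line, or None."""
    for name, pattern in SIGNATURES:
        if pattern.match(first_line):  # match() anchors at the line start
            return name
    return None


def sniff_compression(header: bytes):
    """Return the guessed compression type from the leading bytes, or None."""
    for name, magic in MAGIC:
        if header.startswith(magic):
            return name
    return None


if __name__ == "__main__":
    for line in ["LOCUS       AB000123", ">seq1 description", "@read1", "S\t1\tACGT"]:
        print(line.split()[0], "->", sniff_format(line))
```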
Registry entries: Bioconda 
aragorn
tRNA and tmRNA detection in nucleotide sequences
Versions of package aragorn
Release | Version | Architectures
trixie | 1.2.41-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid | 1.2.41-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm | 1.2.38-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye | 1.2.38-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster | 1.2.38-2 | amd64,arm64,armhf,i386
stretch | 1.2.38-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie | 1.2.36-4 | amd64,armel,armhf,i386
Popcon: 3 users (1 upd.)*
License: DFSG free

The program employs heuristic algorithms to predict tRNA secondary structure, based on homology with recognized tRNA consensus sequences and ability to form a base-paired cloverleaf. tmRNA genes are identified using a modified version of the BRUCE program.

Please cite: Dean Laslett and Bjorn Canback: ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. (PubMed,eprint) Nucleic Acids Research 32(1):11-16 (2004)
Registry entries: SciCrunch  Bioconda 
Topics: Functional, regulatory and non-coding RNA
arden
specificity control for read alignments using an artificial reference
Versions of package arden
Release | Version | Architectures
trixie | 1.0-5 | all
jessie | 1.0-1 | amd64,armel,armhf,i386
stretch | 1.0-3 | all
sid | 1.0-5 | all
bookworm | 1.0-5 | all
bullseye | 1.0-5 | all
buster | 1.0-4 | all
Popcon: 2 users (0 upd.)*
License: DFSG free

ARDEN (Artificial Reference Driven Estimation of false positives in NGS data) is a novel benchmark that estimates error rates based on real experimental reads and an additionally generated artificial reference genome. It allows the computation of error rates specifically for a dataset and the construction of a ROC-curve. Thereby, it can be used to optimize parameters for read mappers, to select read mappers for a specific problem or also to filter alignments based on quality estimation.

Please cite: Sven H. Giese, Franziska Zickmann and Bernhard Y. Renard: Specificity control for read alignments using an artificial reference genome-guided false discovery rate. (PubMed,eprint) Bioinformatics 30(1):9-16 (2013)
Registry entries: SciCrunch 
Topics: Sequencing
ariba
Antibiotic Resistance Identification By Assembly
Versions of package ariba
Release | Version | Architectures
buster | 2.13.3+ds-1 | amd64
stretch-backports | 2.13.3+ds-1~bpo9+1 | amd64
stretch | 2.6.1+ds-1 | amd64
sid | 2.14.7+ds-2 | amd64,arm64,mips64el,ppc64el,riscv64
trixie | 2.14.7+ds-2 | amd64,arm64,mips64el,ppc64el
bookworm | 2.14.6+ds-5 | amd64,arm64,mips64el,ppc64el
bullseye | 2.14.6+ds-1 | amd64,arm64,mips64el,ppc64el
Popcon: 2 users (0 upd.)*
License: DFSG free

ARIBA is a tool that identifies antibiotic resistance genes by running local assemblies. The input is a FASTA file of reference genes and paired sequencing reads. ARIBA reports which of the reference genes were found, plus detailed information on the quality of the assemblies and any variants between the sequencing reads and the reference genes.

Please cite: Martin Hunt, Alison E. Mather, Leonor Sanchez-Buso, Andrew J. Page, Julian Parkhill, Jacqueline A. Keane and Simon R. Harris: ARIBA: rapid antimicrobial resistance genotyping directly from sequencing reads. (PubMed,eprint) Microbial Genomics 3 (2017)
Registry entries: SciCrunch  Bioconda 
art-nextgen-simulation-tools
simulation tools to generate synthetic next-generation sequencing reads
Versions of package art-nextgen-simulation-tools
Release | Version | Architectures
sid | 20160605+dfsg-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch | 20160605+dfsg-2 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster | 20160605+dfsg-3 | amd64,arm64,armhf,i386
bullseye | 20160605+dfsg-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm | 20160605+dfsg-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie | 20160605+dfsg-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (2 upd.)*
License: DFSG free

ART is a set of simulation tools to generate synthetic next-generation sequencing reads. ART simulates sequencing reads by mimicking the real sequencing process with empirical error models or quality profiles summarized from large recalibrated sequencing data. ART can also simulate reads using the user's own read error model or quality profiles. ART supports simulation of single-end and paired-end/mate-pair reads of three major commercial next-generation sequencing platforms: Illumina's Solexa, Roche's 454 and Applied Biosystems' SOLiD. ART can be used to test or benchmark a variety of methods or tools for next-generation sequencing data analysis, including read alignment, de novo assembly, and SNP and structural variation discovery. ART was used as a primary tool for the simulation study of the 1000 Genomes Project. ART is implemented in C++ with optimized algorithms and is highly efficient in read simulation. ART outputs reads in the FASTQ format, and alignments in the ALN format. ART can also generate alignments in the SAM alignment or UCSC BED file format. ART can be used together with genome variant simulators (e.g. VarSim) for evaluating variant calling tools or methods.

Please cite: Weichun Huang, Leping Li, Jason R. Myers and Gabor T. Marth: ART: a next-generation sequencing read simulator. (PubMed,eprint) Bioinformatics 28(4):593-594 (2012)
Registry entries: SciCrunch  Bioconda 
artemis
genome browser and annotation tool
Versions of package artemis
Release | Version | Architectures
stretch | 16.0.17+dfsg-1 | all
sid | 18.2.0+dfsg-4 | all
trixie | 18.2.0+dfsg-4 | all
buster | 17.0.1+dfsg-2 | amd64
bullseye | 18.1.0+dfsg-3 | amd64
bookworm | 18.2.0+dfsg-3 | all
Popcon: 2 users (0 upd.)*
License: DFSG free

Artemis is a genome browser and annotation tool that allows visualisation of sequence features, next generation data and the results of analyses within the context of the sequence, and also its six-frame translation.

This package includes the Artemis genome browser, the Artemis Comparison Tool (ACT), and the DNAplotter and BamView utilities.
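
For readers unfamiliar with the six-frame translation mentioned in the description, the following sketch shows what the term means: a DNA sequence is translated in three reading frames on the forward strand and three on the reverse complement. This is an independent illustration, not code from Artemis (which is a Java application); the function names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq: str) -> dict:
    """Return translations for frames +1..+3 and -1..-3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frames("ATGGCCTAA").items():
        print(frame, protein)
```

Stop codons are shown as `*`, following the usual convention.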

Please cite: Tim Carver, Simon R. Harris, Matthew Berriman, Julian Parkhill and Jacqueline A. McQuillan: Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. (PubMed,eprint) Bioinformatics 28(4):464-469 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Genomics
artfastqgenerator
outputs artificial FASTQ files derived from a reference genome
Versions of package artfastqgenerator
Release | Version | Architectures
sid | 0.0.20150519-5 | all
stretch | 0.0.20150519-2 | all
buster | 0.0.20150519-3 | all
bullseye | 0.0.20150519-4 | all
bookworm | 0.0.20150519-4 | all
trixie | 0.0.20150519-5 | all
Popcon: 2 users (1 upd.)*
License: DFSG free

ArtificialFastqGenerator takes the reference genome (in FASTA format) as input and outputs artificial FASTQ files in the Sanger format. It can accept Phred base quality scores from existing FASTQ files, and use them to simulate sequencing errors. Since the artificial FASTQs are derived from the reference genome, the reference genome provides a gold-standard for calling variants (Single Nucleotide Polymorphisms (SNPs) and insertions and deletions (indels)). This enables evaluation of a Next Generation Sequencing (NGS) analysis pipeline which aligns reads to the reference genome and then calls the variants.
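
How Phred quality scores can drive simulated sequencing errors may be easier to see in code. The sketch below is not taken from ArtificialFastqGenerator (a Java program); it assumes the Sanger FASTQ encoding (ASCII offset 33) and the standard Phred relation P = 10^(-Q/10), and the function names are hypothetical.

```python
import random


def phred_to_error_prob(qual_char: str) -> float:
    """Convert one Sanger-encoded quality character to an error probability."""
    q = ord(qual_char) - 33  # Sanger FASTQ stores Phred + 33
    return 10 ** (-q / 10)


def simulate_errors(seq: str, quals: str, rng: random.Random) -> str:
    """Substitute each base with probability given by its quality score."""
    out = []
    for base, qual_char in zip(seq, quals):
        if rng.random() < phred_to_error_prob(qual_char):
            # Pick a different base uniformly at random (a simplifying assumption).
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    reference = "ACGTACGTACGTACGTACGT"
    qualities = "IIIIIIIIII++++++++++"  # Q40 for the first half, Q10 for the second
    print(simulate_errors(reference, qualities, rng))
```

Only substitutions are modelled here; insertions and deletions would need a separate mechanism.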

Please cite: Matthew Frampton and Richard Houlston: Generation of Artificial FASTQ Files to Evaluate the Performance of Next-Generation Sequencing Pipelines. (PubMed,eprint) PLOSone 7(11):e49110 (2012)
assembly-stats
get assembly statistics from FASTA and FASTQ files
Versions of package assembly-stats
Release | Version | Architectures
bullseye | 1.0.1+ds-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie | 1.0.1+ds-6 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid | 1.0.1+ds-6 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm | 1.0.1+ds-6 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (1 upd.)*
License: DFSG free

Get statistics from a list of files.

Detection of FASTA or FASTQ format of each file is automatic from the file contents, so file names and extensions are irrelevant.

The default output format is human readable. You can change the output format and ignore sequences shorter than a given length.
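
As an illustration of content-based format detection and of ignoring short sequences, here is a minimal Python sketch. It is not the assembly-stats implementation; the description above does not list which statistics the tool reports, so N50 is used purely as a common example of an assembly statistic. The function names and the `min_length` parameter are hypothetical, and FASTQ input is assumed to use plain four-line records.

```python
def read_lengths(lines):
    """Return sequence lengths from FASTA or FASTQ text, detected from its content."""
    lines = [line.strip() for line in lines if line.strip()]
    if not lines:
        return []
    if lines[0].startswith(">"):  # FASTA: header lines start with '>'
        lengths, current = [], None
        for line in lines:
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            else:
                current += len(line)
        lengths.append(current)
        return lengths
    if lines[0].startswith("@"):  # FASTQ: assumes four lines per record
        return [len(lines[i]) for i in range(1, len(lines), 4)]
    raise ValueError("input is neither FASTA nor FASTQ")


def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def summarize(lines, min_length=0):
    """Basic statistics, ignoring sequences shorter than min_length."""
    lengths = [n for n in read_lengths(lines) if n >= min_length]
    return {"count": len(lengths), "total": sum(lengths), "n50": n50(lengths)}


if __name__ == "__main__":
    fasta = [">a", "ACGTAC", ">b", "GGG", ">c", "TTTTTTTT"]
    print(summarize(fasta))
    print(summarize(fasta, min_length=5))
```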

Registry entries: Bioconda 
assemblytics
detect and analyze structural variants from a genome assembly
Versions of package assemblytics
Release | Version | Architectures
bookworm | 1.2.1+dfsg-1 | all
trixie | 1.2.1+dfsg-2 | all
sid | 1.2.1+dfsg-2 | all
bullseye | 1.0+ds-2 | all
buster | 1.0+ds-1 | all
Popcon: 2 users (0 upd.)*
License: DFSG free

Assemblytics incorporates a unique anchor filtering approach to increase robustness to repetitive elements, and identifies six classes of variants based on their distinct alignment signatures. Assemblytics can be applied both to compare aberrant genomes, such as human cancers, to a reference and to identify differences between related species.

Please cite: Maria Nattestad and Michael C. Schatz: Assemblytics: a web analytics tool for the detection of variants from an assembly. (PubMed) Bioinformatics 32(19):3021-3023 (2016)
atac
genome assembly-to-assembly comparison
Versions of package atac
Release | Version | Architectures
trixie | 0~20150903+r2013-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye | 0~20150903+r2013-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch | 0~20150903+r2013-3 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid | 0~20150903+r2013-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster | 0~20150903+r2013-6 | amd64,arm64,armhf,i386
bookworm | 0~20150903+r2013-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 4 users (1 upd.)*
License: DFSG free

atac computes a one-to-one pairwise alignment of large DNA sequences. It first finds the unique k-mers in each sequence, chains them to larger blocks, and fills in spaces between blocks. It was written primarily to transfer annotations between different assemblies of the human genome.
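
The first step described above, finding k-mers that are unique in each sequence, can be sketched as follows. This is a toy illustration and not the atac implementation: it works on short in-memory strings, ignores reverse-complement matches, and leaves out the chaining and gap-filling steps. The function names are hypothetical.

```python
def unique_kmer_positions(seq: str, k: int) -> dict:
    """Map each k-mer that occurs exactly once in seq to its start position."""
    positions = {}
    repeated = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in positions or kmer in repeated:
            # Seen before: it is not unique, so drop it for good.
            positions.pop(kmer, None)
            repeated.add(kmer)
        else:
            positions[kmer] = i
    return positions


def shared_unique_anchors(a: str, b: str, k: int) -> list:
    """Return sorted (pos_in_a, pos_in_b) pairs for k-mers unique in both sequences."""
    unique_a = unique_kmer_positions(a, k)
    unique_b = unique_kmer_positions(b, k)
    shared = unique_a.keys() & unique_b.keys()
    return sorted((unique_a[kmer], unique_b[kmer]) for kmer in shared)


if __name__ == "__main__":
    for anchor in shared_unique_anchors("ACGTTGCA", "TTACGTTG", 4):
        print(anchor)
```

Anchors that lie on the same diagonal (equal difference between the two positions) are the natural candidates for chaining into larger blocks.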

The output is a set of ungapped 'matches', and a set of gapped 'runs' formed from the matches. Each match or run associates one sequence with the other sequence. The association is 'unique', in that there are no other (sizeable) associations for either sequence. Thus, large repeats and duplications are not present in the output; they appear as unmapped regions.

Though the output is always pairwise, atac can cache intermediate results to speed up comparisons of multiple sequences.

This package is part of the Kmer suite.

The package is enhanced by the following packages: kmer-examples
Please cite: B. Walenz and L. Florea: Sim4db and leaff: Utilities for fast batched spliced alignment and sequence indexing. (PubMed) Bioinformatics 27(13):1869-1870 (2011)
Registry entries: Bio.tools  SciCrunch  Bioconda 
ataqv
ATAC-seq QC and visualization
Versions of package ataqv
Release   Version     Architectures
sid       1.3.1+ds-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bullseye  1.2.1+ds-1  amd64,arm64,armhf,i386,mips64el,mipsel,ppc64el
trixie    1.3.1+ds-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bookworm  1.3.0+ds-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
Popcon: 1 users (0 upd.)*
License: DFSG free

A toolkit for measuring and comparing ATAC-seq results, made in the Parker lab at the University of Michigan. They wrote it to help understand how well their ATAC-seq assays had worked, and to make it easier to spot differences that might be caused by library prep or sequencing.

Please cite: Peter Orchard, Yasuhiro Kyono, John Hensley, Jacob O. Kitzman and Stephen C.J. Parker: Quantification, Dynamic Visualization, and Validation of Bias in ATAC-Seq Data with ataqv. (eprint) Cell Systems 10(3):2405-4712 (2020)
atropos
NGS read trimming tool that is specific, sensitive, and speedy
Versions of package atropos
Release   Version        Architectures
trixie    1.1.32+dfsg-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bookworm  1.1.31+dfsg-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
bullseye  1.1.29+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
sid       1.1.32+dfsg-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
Popcon: 1 users (2 upd.)*
License: DFSG free

Atropos is a tool for specific, sensitive, and speedy trimming of NGS reads. It is a fork of the venerable Cutadapt read trimmer, with the primary improvements being:

  1. Multi-threading support, including an extremely fast "parallel
     write" mode.
  2. Implementation of a new insert alignment-based trimming algorithm
     for paired-end reads that is substantially more sensitive and
     specific than the original Cutadapt adapter alignment-based
     algorithm. This algorithm can also correct mismatches between the
     overlapping portions of the reads.
  3. Options for trimming specific types of data (miRNA, bisulfite-seq).
  4. A new command ('detect') that will detect adapter sequences and
     other potential contaminants.
  5. A new command ('error') that will estimate the sequencing error
     rate, which helps to select the appropriate adapter- and quality-
     trimming parameter values.
  6. A new command ('qc') that generates read statistics similar to
     FastQC. The trim command can also compute read statistics both
     before and after trimming (using the '--stats' option).
  7. Improved summary reports, including support for serialization
     formats (JSON, YAML, pickle), support for user-defined templates
     (via the optional Jinja2 dependency), and integration with MultiQC.
  8. The ability to merge overlapping reads (this is experimental and
     the functionality is limited).
  9. The ability to write the summary report and log messages to
     separate files.
 10. The ability to read SAM/BAM files and read/write interleaved
     FASTQ files.
 11. Direct trimming of reads from an SRA accession.
 12. A progress bar, and other minor usability enhancements.
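
The insert alignment idea from item 2 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Atropos's actual algorithm: the function names, the mismatch-rate threshold and the minimum overlap are hypothetical, and only the case where the insert is shorter than the reads is handled.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def trim_by_insert_match(read1, read2, min_overlap=6, max_mismatch_rate=0.1):
    """Trim adapter read-through from a read pair by locating the insert.

    If the insert has length L shorter than the reads, read1[:L] should equal
    the reverse complement of read2[:L]; everything after L is adapter.
    Returns the (possibly trimmed) pair.
    """
    longest = min(len(read1), len(read2))
    for length in range(longest, min_overlap - 1, -1):
        candidate = reverse_complement(read2[:length])
        mismatches = sum(x != y for x, y in zip(read1[:length], candidate))
        if mismatches <= int(max_mismatch_rate * length):
            return read1[:length], read2[:length]
    return read1, read2  # no plausible insert found, leave the pair alone


# A 10 bp insert sequenced with 14 bp reads: 4 bp of adapter on each read.
print(trim_by_insert_match("GATTACAGGCAGAT", "GCCTGTAATCCTGT"))
```

Because the insert is located from the two reads themselves, this approach does not need to know the adapter sequence in advance.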
Please cite: John P. Didion, Marcel Martin and Francis S. Collins: Atropos: specific, sensitive, and speedy trimming of sequencing reads. (PubMed,eprint) PeerJ 5:e3720 (2017)
Registry entries: Bio.tools  Bioconda 
augur
pipeline components for real-time virus analysis
Versions of package augur
Release           Version          Architectures
trixie            24.4.0-1         all
sid               24.4.0-1         all
bookworm          20.0.0-1         all
buster-backports  6.4.2-2~bpo10+1  all
bullseye          11.0.0-1         all
upstream          25.4.0
Popcon: 1 users (0 upd.)*
Newer upstream!
License: DFSG free

The nextstrain project is an attempt to make flexible informatic pipelines and visualization tools to track ongoing pathogen evolution as sequence data emerges. The nextstrain project derives from nextflu, which was specific to influenza evolution.

nextstrain consists of three components:

  • fauna: database and IO scripts for sequence and serological data
  • augur: informatic pipelines to conduct inferences from raw data
  • auspice: web app to visualize resulting inferences
augustus
gene prediction in eukaryotic genomes
Versions of package augustus
Release   Version        Architectures
bookworm  3.5.0+dfsg-2   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    3.3.2+dfsg-2   amd64,arm64,armhf
stretch   3.2.3+dfsg-1   amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
sid       3.5.0+dfsg-4   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    3.5.0+dfsg-4   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  3.4.0+dfsg2-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 3 users (1 upd.)*
License: DFSG free

AUGUSTUS is a program for gene prediction in eukaryotic genomic sequences that is based on a generalized hidden Markov model (HMM), a probabilistic model of a sequence and its gene structure. After learning gene structures from a reference annotation, AUGUSTUS uses the HMM to recognize genes in a new sequence and annotates it with the regions of identified genes. External hints, e.g. from RNA sequencing, EST or protein alignments, can be used to guide and improve the gene finding process. The result is the set of most likely gene structures that comply with all given user constraints, if such gene structures exist. AUGUSTUS already includes prebuilt HMMs for many species, as well as scripts to train custom models using annotated genomes.

Please cite: Stefanie König, Lars Romoth, Lizzy Gerischer and Mario Stanke: Simultaneous gene finding in multiple genomes. (PubMed,eprint) Bioinformatics 32(22):3388-3395 (2016)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Gene transcripts; Gene and protein families
autodock
analysis of ligand binding to protein structure
Versions of package autodock
Release   Version  Architectures
stretch   4.2.6-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid       4.2.6-9  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    4.2.6-9  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  4.2.6-9  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  4.2.6-8  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    4.2.6-6  amd64,arm64,armhf,i386
jessie    4.2.6-2  amd64,armel,armhf,i386
Debtags of package autodock:
field: biology, biology:structural
interface: commandline
role: program
scope: utility
use: analysing
works-with: 3dmodel
Popcon: 8 users (2 upd.)*
License: DFSG free

AutoDock is a prime representative of the programs addressing the simulation of the docking of fairly small chemical ligands to rather big protein receptors. Earlier versions had all flexibility in the ligands while the protein was kept rather rigid. Version 4 also allows for flexibility of selected sidechains of surface residues, i.e., it takes the rotamers into account.

The AutoDock program performs the docking of the ligand to a set of grids describing the target protein. AutoGrid pre-calculates these grids.

The package is enhanced by the following packages: autogrid
autodock-vina
docking of small molecules to proteins
Versions of package autodock-vina
Release   Version  Architectures
buster    1.1.2-5  amd64,arm64,armhf,i386
jessie    1.1.2-3  amd64,armel,armhf,i386
sid       1.2.5-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch   1.1.2-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm  1.2.3-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie    1.2.5-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  1.1.2-6  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 17 users (1 upd.)*
License: DFSG free

AutoDock Vina is a program to support drug discovery, molecular docking and virtual screening of compound libraries. It offers multi-core capability, high performance and enhanced accuracy and ease of use.

The same institute also developed autodock, which is widely used.

O. Trott, A. J. Olson, AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading, Journal of Computational Chemistry 31 (2010) 455-461

Please cite: Oleg Trott and Arthur J. Olson: AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. (eprint) Journal of Computational Chemistry 31(2):455-461 (2010)
Registry entries: Bio.tools  SciCrunch  Bioconda 
autogrid
pre-calculate binding of ligands to their receptor
Versions of package autogrid
Release   Version  Architectures
stretch   4.2.6-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster    4.2.6-6  amd64,arm64,armhf,i386
bullseye  4.2.6-8  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  4.2.6-9  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie    4.2.6-2  amd64,armel,armhf,i386
trixie    4.2.6-9  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       4.2.6-9  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Debtags of package autogrid:
field: biology, biology:structural
interface: commandline
role: program
scope: utility
use: analysing
works-with: 3dmodel
Popcon: 5 users (2 upd.)*
License: DFSG free

The AutoDockSuite addresses the molecular analysis of the docking of smaller chemical compounds to their receptors of known three-dimensional structure.

The AutoGrid program performs pre-calculations for the docking of a ligand to a set of grids that describe the effect that the protein has on point charges. The effect of these forces on the ligand is then analysed by the AutoDock program.

avogadro
Molecular Graphics and Modelling System
Versions of package avogadro
Release   Version         Architectures
jessie    1.0.3-10.1      amd64,armel,armhf,i386
bullseye  1.93.0-2        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.97.0-1        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie    1.99.0-1        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
sid       1.99.0-1        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    1.2.0-4         amd64,arm64,armhf,i386
stretch   1.2.0-1+deb9u1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Debtags of package avogadro:
field: chemistry
role: program
uitoolkit: qt
use: viewing
Popcon: 54 users (33 upd.)*
License: DFSG free

Avogadro is a molecular graphics and modelling system targeted at molecules and biomolecules. It can visualize properties like molecular orbitals or electrostatic potentials and features an intuitive molecular builder.

Features include:

  • Molecular modeller with automatic force-field based geometry optimization
  • Molecular Mechanics including constraints and conformer searches
  • Visualization of molecular orbitals and general isosurfaces
  • Visualization of vibrations and plotting of vibrational spectra
  • Support for crystallographic unit cells
  • Input generation for the Gaussian, GAMESS and MOLPRO quantum chemistry packages
  • Flexible plugin architecture and Python scripting

File formats Avogadro can read include PDB, XYZ, CML, CIF, Molden, as well as Gaussian, GAMESS and MOLPRO output.

Please cite: Marcus D Hanwell, Donald E Curtis, David C Lonie, Tim Vandermeersch, Eva Zurek and Geoffrey R Hutchison: Avogadro: An advanced semantic chemical editor, visualization, and analysis platform. (eprint) J. Cheminf. 4:17 (2012)
Registry entries: Bio.tools  SciCrunch 
axe-demultiplexer
Trie-based DNA sequencing read demultiplexer
Versions of package axe-demultiplexer
Release   Version        Architectures
buster    0.3.3+dfsg-1   amd64,arm64,armhf,i386
bullseye  0.3.3+dfsg-3   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie    0.3.3+dfsg-4   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch   0.3.2+dfsg1-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm  0.3.3+dfsg-3   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       0.3.3+dfsg-4   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (2 upd.)*
License: DFSG free

Axe very rapidly selects the optimal barcode present in a sequence read, even in the presence of sequencing errors. The algorithm is able to handle combinatorial barcoding, barcodes of differing length, and several mismatches per barcode.
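
As a rough illustration of trie-based barcode lookup with mismatch tolerance, here is a minimal Python sketch. It does not reproduce Axe's data structures: the nested-dict trie, the function names, the tie-breaking rule (fewest mismatches, then longest barcode) and the sample barcodes are all assumptions made for this example.

```python
def build_trie(barcodes):
    """Build a nested-dict trie; the key '$' marks a barcode end and holds its sample name."""
    root = {}
    for sample, barcode in barcodes.items():
        node = root
        for base in barcode:
            node = node.setdefault(base, {})
        node["$"] = sample
    return root


def best_barcode(trie, read, max_mismatches=1):
    """Return (sample, mismatches, barcode_length) for the best barcode at the read start, or None."""
    best = None
    stack = [(trie, 0, 0)]  # (trie node, position in read, mismatches so far)
    while stack:
        node, depth, mismatches = stack.pop()
        if "$" in node:
            # Prefer fewer mismatches, then longer barcodes.
            if best is None or (mismatches, -depth) < (best[1], -best[2]):
                best = (node["$"], mismatches, depth)
        if depth == len(read):
            continue
        for base, child in node.items():
            if base == "$":
                continue
            cost = mismatches + (base != read[depth])
            if cost <= max_mismatches:
                stack.append((child, depth + 1, cost))
    return best


trie = build_trie({"s1": "ACGT", "s2": "ACGTAC", "s3": "TTGCA"})
print(best_barcode(trie, "ACGTACGGGG"))
print(best_barcode(trie, "TTGGAGGG"))
```

Because barcodes share trie prefixes, barcodes of differing length are handled in a single traversal of the read start.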

Registry entries: SciCrunch 
baitfisher
software package for designing hybrid enrichment probes
Versions of package baitfisher
Release   Version                           Architectures
bullseye  1.2.7+git20190123.241d060+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       1.2.7+git20211020.de26d5c+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    1.2.7+git20180107.e92dbf2+dfsg-1  amd64,arm64,armhf,i386
bookworm  1.2.7+git20211020.de26d5c+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie    1.2.7+git20211020.de26d5c+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
License: DFSG free

The BaitFisher package consists of two programs: BaitFisher and BaitFilter.

BaitFisher was designed to construct hybrid enrichment baits from multiple sequence alignments (MSAs) or annotated features in MSAs. The main goal of BaitFisher is to avoid redundancy in the construction of baits by designing fewer baits in conserved regions of the MSAs and designing more baits in variable regions. This makes use of the fact that hybrid enrichment baits can differ to some extent from the target region, which they should capture in the enrichment procedure. By specifying the allowed distance between baits and the sequences in the MSAs the user can control the allowed bait-to-target distance and the degree of reduction in the number of baits that are designed. See the BaitFisher paper for details.

BaitFilter was designed (i) to determine whether baits bind unspecifically to a reference genome, (ii) to filter baits that only have partial length matches to a reference genome, (iii) to determine the optimal bait region in an MSA, and (iv) to convert baits to a format that can be uploaded at a bait constructing company. The optimal bait region can be the most conserved region in the MSA or the region with the highest number of sequences without gaps or ambiguous nucleotides.

Please cite: Christoph Mayer, Manuela Sann, Alexander Donath, Martin Meixner, Lars Podsiadlowski, Ralph S. Peters, Malte Petersen, Karen Meusemann, Karsten Liere, Johann-Wolfgang Wägele, Bernhard Misof, Christoph Bleidorn, Michael Ohl and Oliver Niehuis: BaitFisher: A Software Package for Multispecies Target DNA Enrichment Probe Design. (PubMed,eprint) Mol. Biol. Evol. 33(7):1875-1886 (2016)
Registry entries: SciCrunch  Bioconda 
bali-phy
Bayesian Inference of Alignment and Phylogeny
Versions of package bali-phy
Release       Version            Architectures
bookworm      3.6.1+dfsg-1       amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye      3.6.0+dfsg-1       amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster        3.4+dfsg-1         amd64,arm64,armhf,i386
experimental  4.0~beta13+dfsg-1  amd64,arm64,i386,ppc64el,s390x
sid           3.6.1+dfsg-3       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie        3.6.1+dfsg-3       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
experimental  4.0~beta2+dfsg-1   armhf,mips64el
Popcon: 2 users (2 upd.)*
License: DFSG free

BAli-Phy estimates multiple sequence alignments and evolutionary trees from unaligned DNA, amino acid, or codon sequences. BAli-Phy uses MCMC to estimate evolutionary trees, positive selection, and branch lengths while averaging over alternative alignments. BAli-Phy can display alignment ambiguity graphically in an alignment uncertainty (AU) plot.

BAli-Phy can also estimate phylogenies from a fixed alignment (like MrBayes and BEAST) using substitution models like GTR+gamma. BAli-Phy automatically estimates relative rates for each gene.

Please cite: Benjamin D. Redelings and Marc A. Suchard: Joint Bayesian Estimation of Alignment and Phylogeny. (PubMed,eprint) Systematic Biology 54(3):401-418 (2005)
Registry entries: Bio.tools 
ballview
free molecular modeling and molecular graphics tool
Versions of package ballview
Release   Version                       Architectures
stretch   1.4.3~beta1-3                 amd64,arm64,armel,armhf,i386,mips,ppc64el,s390x
bookworm  1.5.0+git20180813.37fc53c-11  amd64,arm64,i386,mips64el,mipsel,ppc64el,s390x
buster    1.5.0+git20180813.37fc53c-3   amd64,arm64,i386
sid       1.5.0+git20180813.37fc53c-11  amd64,arm64,i386,mips64el,ppc64el,riscv64,s390x
bullseye  1.5.0+git20180813.37fc53c-6   amd64,arm64,i386,mips64el,mipsel,ppc64el,s390x
jessie    1.4.2+20140406-1              amd64,armel,armhf,i386
upstream  1.5.0+git20220524.d85d2dd
Debtags of package ballview:
interface: x11
role: program
uitoolkit: qt
x11: application
Popcon: 8 users (15 upd.)*
Newer upstream!
License: DFSG free

BALLView provides fast OpenGL-based visualization of molecular structures, molecular mechanics methods (minimization, MD simulation using the AMBER, CHARMM, and MMFF94 force fields), calculation and visualization of electrostatic properties (FDPB) and molecular editing features.

BALLView can be considered a graphical user interface on the basis of BALL (Biochemical Algorithms Library) with a focus on the most common demands of protein chemists and biophysicists in particular. It is developed in the groups of Hans-Peter Lenhof (Saarland University, Saarbruecken, Germany) and Oliver Kohlbacher (University of Tuebingen, Germany). BALL is an application framework in C++ that has been specifically designed for rapid software development in Molecular Modeling and Computational Molecular Biology. It provides an extensive set of data structures as well as classes for Molecular Mechanics, advanced solvation methods, comparison and analysis of protein structures, file import/export, and visualization.

Please cite: Andreas Moll, Andreas Hildebrandt, Hans-Peter Lenhof and Oliver Kohlbacher: BALLView: a tool for research and education in molecular modeling. (PubMed,eprint) Bioinformatics 22(3):365-366 (2006)
Registry entries: SciCrunch 
bamclipper
Remove gene-specific primer sequences from SAM/BAM alignments
Versions of package bamclipper
Release   Version  Architectures
bookworm  1.0.0-3  all
trixie    1.0.0-3  all
sid       1.0.0-3  all
Popcon: 1 users (0 upd.)*
License: DFSG free

Remove gene-specific primer sequences from SAM/BAM alignments of PCR amplicons by soft-clipping.

bamclipper.sh soft-clips gene-specific primers from a BAM alignment file based on the genomic coordinates of primer pairs in BEDPE format.

Please cite: Chun Hang Au, Dona N Ho, Ava Kwong, Tsun Leung Chan and Edmond S K Ma: BAMClipper: removing primers from alignments to minimize false-negative mutations in amplicon next-generation sequencing. (PubMed,eprint) Scientific Reports 7(1):1567 (2017)
Registry entries: Bioconda 
bamkit
tools for common BAM file manipulations
Versions of package bamkit
Release   Version                      Architectures
trixie    0.0.1+git20170413.ccd079d-3  all
sid       0.0.1+git20170413.ccd079d-3  all
bookworm  0.0.1+git20170413.ccd079d-3  all
bullseye  0.0.1+git20170413.ccd079d-2  all
Popcon: 4 users (0 upd.)*
License: DFSG free

This package provides some Python3 tools for common BAM file manipulations.

bamtools
toolkit for manipulating BAM (genome alignment) files
Versions of package bamtools
Release   Version       Architectures
sid       2.5.2+dfsg-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    2.5.2+dfsg-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  2.5.2+dfsg-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie    2.3.0+dfsg-2  amd64,armel,armhf,i386
bullseye  2.5.1+dfsg-9  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    2.5.1+dfsg-3  amd64,arm64,armhf,i386
stretch   2.4.1+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Popcon: 6 users (2 upd.)*
License: DFSG free

BamTools facilitates research analysis and data management using BAM files. It copes with the enormous amount of data produced by current sequencing technologies that is typically stored in compressed, binary formats that are not easily handled by the text-based parsers commonly used in bioinformatics research.

BamTools provides both a C++ API for BAM file support as well as a command-line toolkit.

This is the bamtools command-line toolkit.

Available bamtools commands:

 convert  Converts between BAM and a number of other formats
 count    Prints number of alignments in BAM file(s)
 coverage Prints coverage statistics from the input BAM file
 filter   Filters BAM file(s) by user-specified criteria
 header   Prints BAM header information
 index    Generates index for BAM file
 merge    Merges multiple BAM files into a single file
 random   Selects random alignments from existing BAM file(s), intended more
          as a testing tool.
 resolve  Resolves paired-end reads (marking the IsProperPair flag as needed)
 revert   Removes duplicate marks and restores original base qualities
 sort     Sorts the BAM file according to some criteria
 split    Splits a BAM file on user-specified property, creating a new BAM
          output file for each value found
 stats    Prints some basic statistics from input BAM file(s)
The package is enhanced by the following packages: multiqc
Please cite: Derek W. Barnett, Erik K. Garrison, Aaron R. Quinlan, Michael P. Stromberg and Gabor T. Marth: BamTools: a C++ API and toolkit for analyzing and managing BAM files. (PubMed,eprint) Bioinformatics 27(12):1691-2 (2011)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bandage
Bioinformatics Application for Navigating De novo Assembly Graphs Easily
Versions of package bandage
Release   Version  Architectures
trixie    0.9.0-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
bookworm  0.9.0-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  0.8.1-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    0.8.1-1  amd64,arm64,armhf,i386
sid       0.9.0-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
License: DFSG free

Bandage is a GUI program that allows users to interact with the assembly graphs made by de novo assemblers such as Velvet, SPAdes, MEGAHIT and others.

De novo assembly graphs contain not only assembled contigs but also the connections between those contigs, which were previously not easily accessible. Bandage visualises assembly graphs, with connections, using graph layout algorithms. Nodes in the drawn graph, which represent contigs, can be automatically labelled with their ID, length or depth. Users can interact with the graph by moving, labelling and colouring nodes. Sequence information can also be extracted directly from the graph viewer. By displaying connections between contigs, Bandage opens up new possibilities for analysing and improving de novo assemblies that are not possible by looking at contigs alone.

More information and download links are on the Bandage website: rrwick.github.io/Bandage

The package is relevant to the field of genome assembly.

The package is enhanced by the following packages: bandage-examples
Please cite: Ryan R. Wick, Mark B. Schultz, Justin Zobel and Kathryn E. Holt: Bandage: interactive visualisation of de novo genome assemblies. (PubMed,eprint) Bioinformatics 31(20):3350-3352 (2015)
Registry entries: Bio.tools  Bioconda 
barrnap
rapid ribosomal RNA prediction
Versions of package barrnap
Release   Version     Architectures
trixie    0.9+dfsg-4  all
sid       0.9+dfsg-4  all
stretch   0.7+dfsg-2  all
buster    0.9+dfsg-1  all
bullseye  0.9+dfsg-2  all
bookworm  0.9+dfsg-3  all
Popcon: 3 users (0 upd.)*
License: DFSG free

Barrnap (BAsic Rapid Ribosomal RNA Predictor) predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).

It takes FASTA DNA sequence as input, and writes GFF3 as output. It uses the NHMMER tool that comes with HMMER 3.1 for HMM searching in RNA:DNA style. Multithreading is supported and one can expect roughly linear speed-ups with more CPUs.

Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Functional, regulatory and non-coding RNA
bbmap
BBTools genomic aligner and other tools for short sequences
Versions of package bbmap
Release           Version                Architectures
sid               39.08+dfsg-1           all
trixie            39.08+dfsg-1           all
bookworm          39.01+dfsg-2           all
bullseye          38.90+dfsg-1           all
buster-backports  38.63+dfsg-1~bpo10+1   all
upstream          39.09
Popcon: 1 users (1 upd.)*
Newer upstream!
License: DFSG free

The BBTools are a collection of small programs to solve recurrent tasks for the creative handling of short biological RNA/DNA sequences. This suite may be best known for its mapper, which is also the name of the project on sourceforge, but several tools have been added over time. All tools are multi-threaded, implemented platform-independently in Java:

BBMap: Short read aligner for DNA and RNA-seq data. Capable of handling arbitrarily large genomes with millions of scaffolds. Handles Illumina, PacBio, 454, and other reads; very high sensitivity and tolerant of errors and numerous large indels.

BBNorm: Kmer-based error-correction and normalization tool.

Dedupe: Simplifies assemblies by removing duplicate or contained subsequences that share a target percent identity.

Reformat: Reformats reads between fasta/fastq/scarf/fasta+qual/sam, interleaved/paired, and ASCII-33/64, at over 500 MB/s.

BBDuk: Filters, trims, or masks reads with kmer matches to an artifact/contaminant file.
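
The ASCII-33/64 conversion mentioned for Reformat refers to the two common offsets used to encode Phred quality scores in FASTQ files. The following Python sketch shows the arithmetic only; the function name is hypothetical and this is not how BBTools implements it.

```python
def phred64_to_phred33(quality):
    """Re-encode a FASTQ quality string from ASCII offset 64 to ASCII offset 33."""
    scores = [ord(ch) - 64 for ch in quality]
    if any(score < 0 for score in scores):
        # Characters below '@' cannot be Phred+64 scores.
        raise ValueError("quality string is not valid Phred+64")
    return "".join(chr(score + 33) for score in scores)


print(phred64_to_phred33("@hh"))
```

In offset 64, '@' encodes a score of 0 and 'h' a score of 40; in offset 33 the same scores are written as '!' and 'I'.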

The package is enhanced by the following packages: multiqc
Please cite: Brian Bushnell, Jonathan Rood and Esther Singer: BBMerge – Accurate paired shotgun read merging via overlap. (PubMed,eprint) PLOS One (2017)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bcalm
de Bruijn compaction in low memory
Versions of package bcalm
Release   Version  Architectures
trixie    2.2.3-5  amd64,arm64,mips64el,ppc64el,riscv64
bookworm  2.2.3-4  amd64,arm64,mips64el,ppc64el
sid       2.2.3-5  amd64,arm64,mips64el,ppc64el,riscv64
bullseye  2.2.3-1  amd64,arm64,i386,mips64el,ppc64el,s390x
Popcon: 2 users (0 upd.)*
License: DFSG free

A bioinformatics tool for constructing the compacted de Bruijn graph from sequencing data.

This is the parallel version of the BCALM software using gatb-core library.
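
To make the term "compacted de Bruijn graph" concrete, the sketch below builds a node-centric de Bruijn graph from reads and merges every non-branching path into a single unitig. It is a naive in-memory illustration, not BCALM's low-memory algorithm; the function name is hypothetical, and reverse complements and k-mer abundance filtering are ignored.

```python
def compact_de_bruijn(reads, k):
    """Return the unitigs (maximal non-branching paths) of the de Bruijn graph of the reads."""
    nodes = {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}
    succ = {u: [u[1:] + c for c in "ACGT" if u[1:] + c in nodes] for u in nodes}
    pred = {u: [c + u[:-1] for c in "ACGT" if c + u[:-1] in nodes] for u in nodes}

    def starts_unitig(u):
        # u continues a unitig only if it has one predecessor whose sole successor is u.
        return not (len(pred[u]) == 1 and len(succ[pred[u][0]]) == 1)

    def extend(start, seen):
        path, cur = start, start
        seen.add(start)
        while len(succ[cur]) == 1:
            nxt = succ[cur][0]
            if len(pred[nxt]) != 1 or nxt in seen:
                break
            path += nxt[-1]
            seen.add(nxt)
            cur = nxt
        return path

    seen, unitigs = set(), []
    for u in sorted(nodes):
        if u not in seen and starts_unitig(u):
            unitigs.append(extend(u, seen))
    for u in sorted(nodes):  # whatever is left lies on isolated cycles
        if u not in seen:
            unitigs.append(extend(u, seen))
    return sorted(unitigs)


print(compact_de_bruijn(["ACGTT", "CGTA"], 3))
```

In the example, the k-mers ACG and CGT collapse into the unitig ACGT, while the branch at CGT keeps GTA and GTT as separate unitigs.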

Please cite: Rayan Chikhi, Antoine Limasset and Paul Medvedev: Compacting de Bruijn graphs from sequencing data quickly and in low memory. (eprint) Bioinformatics 32(12):208 (2016)
bcftools
genomic variant calling and manipulation of VCF/BCF files
Versions of package bcftools
Release            Version       Architectures
trixie             1.20-2        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch-backports  1.8-1~bpo9+1  amd64,arm64,armel,armhf,mips64el,mipsel,ppc64el
buster             1.9-1         amd64,arm64,armhf
bullseye           1.11-1        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
stretch            1.3.1-1       amd64,arm64,armel,mips64el,mipsel,ppc64el
bookworm           1.16-1        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
sid                1.20-2        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream           1.21
Popcon: 28 users (72 upd.)*
Newer upstream!
License: DFSG free

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.

The package is enhanced by the following packages: multiqc
Please cite: Petr Danecek and Shane A. McCarthy: BCFtools/csq: Haplotype-aware variant consequences. (2016)
Registry entries: Bio.tools  SciCrunch  Bioconda 
beads
2-DE electrophoresis gel image spot detection
Versions of package beads
Release   Version        Architectures
bullseye  1.1.20-2       amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       1.1.22-1       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    1.1.22-1       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
buster    1.1.18+dfsg-3  amd64,arm64,armhf,i386
bookworm  1.1.22-1       amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (1 upd.)*
License: DFSG free

Beads is a program for spot detection on 2-D gel images. It is based on an analogy with beads flowing uphill on the surface of the gel image and on the analysis of their paths (Langella & Zivy, 2008).

Please cite: Olivier Langella and Michel Zivy: A method based on bead flows for spot detection on 2-D gel images. (PubMed) Proteomics 8(23-24):4914-8 (2008)
beagle
Genotype calling, genotype phasing and imputation of ungenotyped markers
Versions of package beagle
Release   Version                       Architectures
bookworm  220722-1                      all
trixie    220722-1                      all
buster    5.0-180928+dfsg-1+deb10u1     all
bullseye  5.1-200518+dfsg-1             all
sid       220722-1                      all
stretch   4.1~160727-86a+dfsg-1         all
upstream  240806
Popcon: 3 users (1 upd.)*
Newer upstream!
License: DFSG free

Beagle performs genotype calling, genotype phasing, imputation of ungenotyped markers, and identity-by-descent segment detection. Genotypic imputation works on phased haplotypes using a Li and Stephens haplotype frequency model. Beagle also implements the Refined IBD algorithm for detecting homozygosity-by-descent (HBD) and identity-by-descent (IBD) segments.

The package is enhanced by the following packages: beagle-doc
Please cite: Sharon R. Browning and Brian L. Browning: Rapid and Accurate Haplotype Phasing and Missing-Data Inference for Whole-Genome Association Studies By Use of Localized Haplotype Clustering. (eprint) The American Journal of Human Genetics 81(5):1084-1097 (2007)
Registry entries: Bio.tools  SciCrunch  Bioconda 
beast-mcmc
Bayesian MCMC phylogenetic inference
Versions of package beast-mcmc
Release   Version            Architectures
buster    1.10.4+dfsg-1      all
bullseye  1.10.4+dfsg-2      all
jessie    1.8.0-1 (contrib)  all
stretch   1.8.4+dfsg.1-1     all
sid       1.10.4+dfsg-5      all
trixie    1.10.4+dfsg-5      all
bookworm  1.10.4+dfsg-5      all
Popcon: 2 users (0 upd.)*
License: DFSG free

BEAST is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability. Included are a simple-to-use user-interface program for setting up standard analyses and a suite of programs for analysing the results.

The package is enhanced by the following packages: beast-mcmc-doc beast-mcmc-examples
Please cite: Alexei J Drummond and Andrew Rambaut: BEAST: Bayesian evolutionary analysis by sampling trees. (PubMed,eprint) BMC Evol Biol 8(7):214 (2007)
Registry entries: Bio.tools  SciCrunch  Bioconda 
beast2-mcmc
Bayesian MCMC phylogenetic inference
Versions of package beast2-mcmc
Release   Version       Architectures
bullseye  2.6.3+dfsg-2  all
sid       2.7.6+dfsg-1  all
trixie    2.7.6+dfsg-1  all
bookworm  2.7.3+dfsg-1  all
stretch   2.4.4+dfsg-1  all
buster    2.5.1+dfsg-2  all
upstream  2.7.7
Popcon: 2 users (0 upd.)*
Newer upstream!
License: DFSG free

BEAST is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability. Included are a simple-to-use user-interface program for setting up standard analyses and a suite of programs for analysing the results.

This is not a new upstream version of beast-mcmc (1.x) but rather a complete rewrite.

The package is enhanced by the following packages: beast2-mcmc-doc beast2-mcmc-examples
Please cite: Remco Bouckaert, Joseph Heled, Denise Kühnert, Tim Vaughan, Chieh-Hsi Wu, Dong Xie, Marc A. Suchard, Andrew Rambaut and Alexei J. Drummond: BEAST 2: A Software Platform for Bayesian Evolutionary Analysis. (PubMed,eprint) PLoS Comput Biol 10(4):e1003537 (2014)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bedops
high-performance genomic feature operations
Versions of package bedops
ReleaseVersionArchitectures
bullseye2.4.39+dfsg1-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm2.4.41+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.4.41+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.4.41+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch-backports2.4.35+dfsg-1~bpo9+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster2.4.35+dfsg-1amd64,arm64,armhf,i386
Popcon: 4 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

BEDOPS is a suite of tools to address common questions raised in genomic studies, mostly with regard to overlap and proximity relationships between data sets. It aims to be scalable and flexible, facilitating the efficient and accurate analysis and management of large-scale genomic data.

Please cite: Shane Neph, M. Scott Kuehn, Alex P. Reynolds, Eric Haugen, Robert E. Thurman, Audra K. Johnson, Eric Rynes, Matthew T. Maurano, Jeff Vierstra, Sean Thomas, Richard Sandstrom, Richard Humbert and John A. Stamatoyannopoulos: BEDOPS: high-performance genomic feature operations. (PubMed,eprint) Bioinformatics 28(14):1919-1920 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bedtools
suite of utilities for comparing genomic features
Versions of package bedtools
ReleaseVersionArchitectures
stretch2.26.0+dfsg-3amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
buster2.27.1+dfsg-4amd64,arm64,armhf
jessie2.21.0-1amd64,armhf,i386
sid2.31.1+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie2.31.1+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm2.30.0+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye2.30.0+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package bedtools:
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopesuite
useanalysing, comparing, converting, filtering
works-withbiological-sequence
Popcon: 34 users (6 upd.)*
Versions and Archs
License: DFSG free
Git

The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using BEDTools, one can develop sophisticated pipelines that answer complicated research questions by streaming several BEDTools together.
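
As a rough illustration of what "finding feature overlaps" means, the following Python sketch computes pairwise overlaps between two small feature lists. It is not BEDTools code: the names `overlap_length` and `intersect` are hypothetical, the quadratic loop is chosen for clarity rather than speed, and coordinates are assumed to follow the BED convention (0-based start, exclusive end).

```python
from typing import List, Tuple

# (chromosome, start, end) with a 0-based start and an exclusive end,
# as assumed for BED-style coordinates in this sketch.
Interval = Tuple[str, int, int]


def overlap_length(a: Interval, b: Interval) -> int:
    """Return the number of bases shared by two intervals (0 if none)."""
    if a[0] != b[0]:
        return 0  # features on different chromosomes never overlap
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def intersect(a_features: List[Interval],
              b_features: List[Interval]) -> List[Tuple[Interval, Interval, int]]:
    """Report every (a, b, shared_bases) pair that overlaps by at least 1 bp."""
    hits = []
    for a in a_features:
        for b in b_features:
            shared = overlap_length(a, b)
            if shared > 0:
                hits.append((a, b, shared))
    return hits


if __name__ == "__main__":
    genes = [("chr1", 100, 200), ("chr2", 50, 80)]
    peaks = [("chr1", 150, 250), ("chr1", 200, 300)]
    for a, b, shared in intersect(genes, peaks):
        print(a, b, shared)
```

Real tools avoid the all-against-all comparison by sorting or indexing the features first; the sketch only shows the overlap test itself.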

The groupBy utility is distributed in the filo package.

Please cite: Aaron R. Quinlan and Ira M. Hall: BEDTools: a flexible suite of utilities for comparing genomic features. (PubMed,eprint) Bioinformatics 26(6):841-842 (2010)
Registry entries: Bio.tools  SciCrunch  Bioconda 
belvu
multiple sequence alignment viewer and phylogenetic tool
Versions of package belvu
ReleaseVersionArchitectures
bullseye4.44.1+dfsg-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster4.44.1+dfsg-3amd64,arm64,armhf,i386
bookworm4.44.1+dfsg-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie4.44.1+dfsg-7.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
sid4.44.1+dfsg-7.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Belvu is a multiple sequence alignment viewer and phylogenetic tool with an extensive set of user-configurable modes to color residues.

  • View multiple sequence alignments.
  • Residues can be coloured by conservation, with user-configurable cutoffs and colours.
  • Residues can be coloured by residue type (user-configurable).
  • Colour schemes can be imported or exported.
  • Swissprot (or PIR) entries can be fetched by double clicking.
  • The position in the alignment can be easily tracked.
  • Manual deletion of rows and columns.
  • Automatic editing of rows and columns based on customisable criteria:
    • removal of all-gap columns;
    • removal of all gaps;
    • removal of redundant sequences;
    • removal of a column by a user-specified percentage of gaps;
    • filtering of sequences by percent identity;
    • removal of sequences by a user-specified percentage of gaps;
    • removal of partial sequences (those starting or ending with gaps); and
    • removal of columns by conservation (with user-specified upper/lower cutoffs).
  • The alignment can be saved in Stockholm, Selex, MSF or FASTA format.
  • Distance matrices between sequences can be generated using a variety of distance metrics.
  • Distance matrices can be imported or exported.
  • Phylogenetic trees can be constructed based on various distance-based tree reconstruction algorithms.
  • Trees can be saved in New Hampshire format.
  • Belvu can perform bootstrap phylogenetic reconstruction.
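
One of the automatic edits listed above, removal of all-gap columns, can be sketched in a few lines of Python. This is only an illustration of the operation, not Belvu's implementation; the function name `remove_all_gap_columns` is hypothetical, and treating both `-` and `.` as gap characters is an assumption.

```python
def remove_all_gap_columns(alignment, gap_chars="-."):
    """Drop every column in which all sequences carry a gap character.

    `alignment` is a list of equal-length strings, one per aligned sequence.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column as soon as one sequence has a residue in it.
    keep = [i for i in range(length)
            if any(row[i] not in gap_chars for row in alignment)]
    return ["".join(row[i] for i in keep) for row in alignment]


if __name__ == "__main__":
    print(remove_all_gap_columns(["A-C-", "G-T."]))
```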
Please cite: Gemma Barson and Ed Griffiths: SeqTools: visual tools for manual analysis of sequence alignments. (PubMed,eprint) BMC Research Notes 9:39 (2016)
Registry entries: Bio.tools  SciCrunch 
berkeley-express
Streaming quantification for high-throughput sequencing
Versions of package berkeley-express
ReleaseVersionArchitectures
buster1.5.2+dfsg-1amd64,arm64,armhf,i386
stretch1.5.1-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm1.5.3+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.5.3+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.5.3+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye1.5.3+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

eXpress is a streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences. Example applications include transcript-level RNA-Seq quantification, allele-specific/haplotype expression analysis (from RNA-Seq), transcription factor binding quantification in ChIP-Seq, and analysis of metagenomic data. It is based on an online-EM algorithm that results in space (memory) requirements proportional to the total size of the target sequences and time requirements that are proportional to the number of sampled fragments. Thus, in applications such as RNA-Seq, eXpress can accurately quantify much larger samples than other currently available tools, greatly reducing computing infrastructure requirements. eXpress can be used to build lightweight high-throughput sequencing processing pipelines when coupled with a streaming aligner (such as Bowtie), as output can be piped directly into eXpress, effectively eliminating the need to store read alignments in memory or on disk.

In an analysis of the performance of eXpress for RNA-Seq data, it was observed that this efficiency does not come at a cost of accuracy. eXpress is more accurate than other available tools, even when limited to smaller datasets that do not require such efficiency. Moreover, like the Cufflinks program, eXpress can be used to estimate transcript abundances in multi-isoform genes. eXpress is also able to resolve multi-mappings of reads across gene families, and does not require a reference genome so that it can be used in conjunction with de novo assemblers such as Trinity, Oases, or Trans-ABySS. The underlying model is based on previously described probabilistic models developed for RNA-Seq but is applicable to other settings where target sequences are sampled, and includes parameters for fragment length distributions, errors in reads, and sequence-specific fragment bias.

eXpress can be used to resolve ambiguous mappings in other high-throughput sequencing based applications. The only required inputs to eXpress are a set of target sequences and a set of sequenced fragments multiply-aligned to them. While these target sequences will often be gene isoforms, they need not be. Haplotypes can be used as the reference for allele-specific expression analysis, binding regions for ChIP-Seq, or target genomes in metagenomics experiments. eXpress is useful in any analysis where reads multi-map to sequences that differ in abundance.

Please cite: Adam Roberts and Lior Pachter: Streaming fragment assignment for real-time analysis of sequencing experiments. (PubMed) Nature Methods 10(1):71–73 (2013)
Registry entries: SciCrunch  Bioconda 
bifrost
parallel construction, indexing and querying of de Bruijn graphs
Versions of package bifrost
ReleaseVersionArchitectures
trixie1.3.1-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
sid1.3.1-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
upstream1.3.5
Popcon: 1 users (4 upd.)*
Newer upstream!
License: DFSG free
Git

Bifrost is a command-line tool for sequencing that features a broad range of functions, such as indexing, editing, and querying the graph, and includes a graph coloring method that maps each k-mer of the graph to the genomes it occurs in.

  • Build, index, color and query the compacted de Bruijn graph
  • No need to build the uncompacted de Bruijn graph
  • Reads or assembled genomes as input
  • Output graph in GFA (can be visualized with Bandage), FASTA or binary
  • Graph cleaning: short tip clipping, etc.
  • Multi-threaded
  • No parameters to estimate with other tools
  • Exact or approximate k-mer search of queries
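
The colouring idea described above, mapping each k-mer to the genomes it occurs in, can be illustrated with a small Python sketch. It is a toy model and not Bifrost's data structure: the function name `color_kmers` is hypothetical, no graph compaction is performed, and reverse complements are ignored for simplicity.

```python
from collections import defaultdict


def color_kmers(genomes, k):
    """Map every k-mer to the set of genome names ("colours") containing it.

    `genomes` is a dict of {name: sequence}. Assumption of this sketch:
    k-mers are taken from the forward strand only.
    """
    colors = defaultdict(set)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            colors[seq[i:i + k]].add(name)
    return dict(colors)


if __name__ == "__main__":
    index = color_kmers({"g1": "ACGT", "g2": "CGTA"}, k=3)
    for kmer in sorted(index):
        print(kmer, sorted(index[kmer]))
```

An exact k-mer query then reduces to a dictionary lookup that returns the set of genomes sharing that k-mer.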
Please cite: Guillaume Holley and Páll Melsted: Bifrost – Highly parallel construction and indexing of colored and compacted de Bruijn graphs. (PubMed,eprint) Genome Biology 21(1):249 (2020)
Registry entries: Bio.tools  Bioconda 
bio-eagle
Haplotype phasing within a genotyped cohort or using a phased reference panel
Versions of package bio-eagle
ReleaseVersionArchitectures
buster2.4.1-1amd64
stretch2.3-3amd64,i386
trixie2.4.1-3amd64,i386
bookworm2.4.1-3amd64,i386
bullseye2.4.1-3amd64,i386
sid2.4.1-3amd64,i386
Popcon: 13 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Eagle estimates haplotype phase either within a genotyped cohort or using a phased reference panel. The basic idea of the Eagle1 algorithm is to harness identity-by-descent among distant relatives—which is pervasive at very large sample sizes but rare among smaller numbers of samples—to rapidly call phase using a fast scoring approach. In contrast, the Eagle2 algorithm analyzes a full probabilistic model similar to the diploid Li-Stephens model used by previous HMM-based methods.

Please note: The executable was renamed to bio-eagle because of a name clash. Please read more about this in /usr/share/doc/bio-eagle/README.Debian.

The package is enhanced by the following packages: bio-eagle-examples
Please cite: Po-Ru Loh, Pier Francesco Palamara and Alkes L Price: Fast and accurate long-range phasing in a UK Biobank cohort. Nature Genetics (2016)
bio-rainbow
clustering and assembling short reads for bioinformatics
Versions of package bio-rainbow
ReleaseVersionArchitectures
bookworm2.0.4+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.0.4+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.0.4+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye2.0.4+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch2.0.4-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster2.0.4+dfsg-1amd64,arm64,armhf,i386
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Efficient tool for clustering and assembling short reads, especially for RAD.

Rainbow is developed to provide an ultra-fast and memory-efficient solution to clustering and assembling short reads produced by RAD-seq. First, Rainbow clusters reads using a spaced seed method. Then, Rainbow implements a heterozygote-calling-like strategy to divide potential groups into haplotypes in a top-down manner. Along a guided tree, it iteratively merges sibling leaves in a bottom-up manner if they are similar enough. Here, the similarity is defined by comparing the 2nd reads of a RAD segment. This approach tries to collapse heterozygotes while discriminating repetitive sequences. Finally, Rainbow uses a greedy algorithm to locally assemble merged reads into contigs. Rainbow outputs not only the optimal but also suboptimal assembly results. Based on simulations and real guppy RAD-seq data, it is shown that Rainbow is more competent than other tools in dealing with RAD-seq data.

Please cite: Zechen Chong, Jue Ruan and Chung-I. Wu: Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads. (PubMed) Bioinformatics 28(21):2732-2737 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bio-tradis
analyse the output from TraDIS analyses of genomic sequences
Versions of package bio-tradis
ReleaseVersionArchitectures
sid1.4.5+dfsg2-2all
buster1.4.1+dfsg-1all
stretch-backports1.3.3+dfsg-3~bpo9+1all
bullseye1.4.5+dfsg2-1all
bookworm1.4.5+dfsg2-1all
trixie1.4.5+dfsg2-2all
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Bio-Tradis contains a set of tools to analyse the output from TraDIS analyses.

The Bio-Tradis analysis pipeline is implemented as an extensible Perl library which can either be used as is, or as a basis for the development of more advanced analysis tools.

Please note: You need to manually install the Bioconductor package edgeR. Recent versions cannot be distributed by Debian because they use the non-distributable locfit code.

Please cite: Lars Barquist, Matthew Mayho, Carla Cummins, Amy K. Cain, Christine J. Boinett, Andrew J. Page, Gemma C. Langridge, Michael A. Quail, Jacqueline A. Keane and Julian Parkhill: The TraDIS toolkit: sequencing and analysis for dense transposon mutant libraries. (PubMed,eprint) Bioinformatics 32(7):1109-1111 (2016)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bio-vcf
domain specific language (DSL) for processing the VCF format
Versions of package bio-vcf
ReleaseVersionArchitectures
sid0.9.5-3all
bookworm0.9.5-3all
trixie0.9.5-3all
bullseye0.9.5-2all
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Bio-vcf provides a domain specific language (DSL) for processing the VCF format. Record named fields can be queried with regular expressions, e.g.

 sample.dp>20 and rec.filter !~ /LowQD/ and rec.tumor.bcount[rec.alt]>4

Bio-vcf is a new generation VCF parser, filter and converter. Bio-vcf is not only very fast for genome-wide (WGS) data, it also comes with a really nice filtering, evaluation and rewrite language and it can output any type of textual data, including VCF header and contents in RDF and JSON.
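
To make the logic of the filter expression above easier to follow, here is a plain-Python analogue. It does not use bio-vcf (which has its own DSL) and does not parse VCF; the dictionary layout of `rec` and `sample`, the single-string `alt` field, and the function name `passes` are all assumptions made for illustration.

```python
import re


def passes(rec, sample):
    """Mimic: sample.dp>20 and rec.filter !~ /LowQD/ and rec.tumor.bcount[rec.alt]>4

    `rec` and `sample` are plain dicts in a hypothetical layout, not bio-vcf objects.
    """
    return (
        sample["dp"] > 20                                   # sample read depth
        and re.search(r"LowQD", rec["filter"]) is None      # FILTER must not match LowQD
        and rec["tumor"]["bcount"][rec["alt"]] > 4          # ALT-supporting bases in tumor
    )


if __name__ == "__main__":
    record = {
        "filter": "PASS",
        "alt": "T",
        "tumor": {"dp": 35, "bcount": {"A": 0, "C": 28, "G": 0, "T": 7}},
    }
    print(passes(record, record["tumor"]))
```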

bioawk
extension of awk for biological sequence analysis
Versions of package bioawk
ReleaseVersionArchitectures
bookworm1.0-4+deb12u1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.0-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.0-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 3 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Bioawk is an extension to Brian Kernighan's awk, adding support for several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names. It also adds a few built-in functions and a command-line option to use TAB as the input/output delimiter. When the new functionality is not used, bioawk is intended to behave exactly the same as the original BWK awk.

Registry entries: Bioconda 
biobambam2
tools for early stage alignment file processing
Versions of package biobambam2
ReleaseVersionArchitectures
trixie2.0.185+ds-2amd64,i386,mips64el,ppc64el,riscv64
sid2.0.185+ds-2amd64,i386,mips64el,ppc64el,riscv64
bookworm2.0.185+ds-1amd64,arm64,i386,ppc64el
bullseye2.0.179+ds-1amd64,arm64,i386,ppc64el
upstream2.0.185-release-20221211202123
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

This package contains some tools for processing BAM files, including

  bamsormadup:  parallel sorting and duplicate marking
  bamcollate2:  reads BAM and writes BAM reordered such that alignments
                are collated by query name
  bammarkduplicates: reads BAM and writes BAM with duplicate alignments
                marked using the BAM flags field
  bammaskflags: reads BAM and writes BAM while masking (removing) bits
                from the flags column
  bamrecompress: reads BAM and writes BAM with a defined compression
                 setting. This tool is capable of multi-threading.
  bamsort:       reads BAM and writes BAM resorted by coordinates or
                 query name
  bamtofastq:    reads BAM and writes FastQ; output can be collated
                 or uncollated by query name
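
Several of the tools above operate on the BAM flags field, which is a bit field. The following Python sketch shows the underlying bit arithmetic for marking duplicates and masking bits. It is not biobambam2 code and does not read BAM files; the helper names are hypothetical, while the value 0x400 for the duplicate bit comes from the SAM specification.

```python
DUPLICATE = 0x400  # SAM flag bit: PCR or optical duplicate


def mark_duplicate(flag):
    """Set the duplicate bit, leaving all other bits untouched."""
    return flag | DUPLICATE


def is_duplicate(flag):
    """Return True if the duplicate bit is set."""
    return bool(flag & DUPLICATE)


def mask_flags(flag, mask):
    """Clear (remove) every bit of `mask` from `flag`."""
    return flag & ~mask


if __name__ == "__main__":
    flag = 83                      # paired, proper pair, reverse strand, first in pair
    marked = mark_duplicate(flag)  # 83 + 1024
    print(marked, is_duplicate(marked), mask_flags(marked, DUPLICATE))
```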
The package is enhanced by the following packages: multiqc
Please cite: German Tischler and Steven Leonard: biobambam: tools for read pair collation based algorithms on BAM files. (eprint) Source Code Biol Med. 9:13 (2014)
Registry entries: Bio.tools  SciCrunch  Bioconda 
biosyntax
Syntax Highlighting for Computational Biology (metapackage)
Versions of package biosyntax
ReleaseVersionArchitectures
buster1.0.0b-1all
bullseye1.0.0b-2all
bookworm1.0.0b-4all
trixie1.0.0b-6all
sid1.0.0b-6all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Syntax highlighting for computational biology to bring you intuitively close to your data. BioSyntax supports .sam, .flagstat, .vcf, .fasta, .fastq, .faidx, .clustal, .pdb, .gtf, .bed files & more.

This is a metapackage depending on all bioSyntax plugins.

Please cite: Artem Babaian, Anicet Ebou, Alyssa Fegen, Ho Yin Jeffrey Kam, German E Novakovsky, Jasper Wong, Dylan Aïssi and Li Yao: bioSyntax: syntax highlighting for computational biology. BMC Bioinformatics 19(303) (2018)
Registry entries: Bio.tools  SciCrunch 
bitseq
Bayesian Inference of Transcripts from Sequencing Data
Versions of package bitseq
ReleaseVersionArchitectures
bullseye0.7.5+dfsg-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm0.7.5+dfsg-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.7.5+dfsg-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster0.7.5+dfsg-4amd64,arm64,armhf,i386
trixie0.7.5+dfsg-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

BitSeq is an application for inferring expression levels of individual transcripts from sequencing (RNA-Seq) data and estimating differential expression (DE) between conditions. An advantage of this approach is the ability to account for both technical uncertainty and intrinsic biological variance in order to avoid false DE calls. The technical contribution to the uncertainty comes both from finite read-depth and the possibly ambiguous mapping of reads to multiple transcripts.

Please cite: James Hensman, Panagiotis Papastamoulis, Peter Glaus, Antti Honkela and Magnus Rattray: Fast and accurate approximate inference of transcript expression from RNA-seq data. (PubMed,eprint) Bioinformatics 31(24):3881-9 (2015)
Registry entries: Bio.tools  SciCrunch 
blasr
mapping single-molecule sequencing reads
Versions of package blasr
ReleaseVersionArchitectures
sid5.3.5+dfsg-6amd64,arm64,mips64el,ppc64el,riscv64
stretch5.3+0-1amd64,arm64,mips64el,ppc64el
bullseye5.3.3+dfsg-5amd64,arm64,mips64el,ppc64el
bookworm5.3.5+dfsg-6amd64,arm64,mips64el,ppc64el
buster5.3.2+dfsg-1.1amd64,arm64
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Basic local alignment with successive refinement (BLASR) is a method for mapping single-molecule sequencing reads against a reference genome. Such reads are thousands of bases long, with divergence between them and the genome being dominated by insertion and deletion error.

Registry entries: Bio.tools  SciCrunch  Bioconda 
blixem
interactive browser of sequence alignments
Versions of package blixem
ReleaseVersionArchitectures
trixie4.44.1+dfsg-7.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
bookworm4.44.1+dfsg-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid4.44.1+dfsg-7.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster4.44.1+dfsg-3amd64,arm64,armhf,i386
bullseye4.44.1+dfsg-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Blixem is an interactive browser of sequence alignments that have been stacked up in a "master-slave" multiple alignment; it is not a 'true' multiple alignment but a 'one-to-many' alignment.

  • Overview section showing the positions of genes and alignments around the alignment window
  • Detail section showing the actual alignment of protein or nucleotide sequences to the genomic DNA sequence.
  • View alignments against both strands of the reference sequence.
  • View sequences in nucleotide or protein mode; in protein mode, Blixem will display the three-frame translation of the reference sequence.
  • Residues are highlighted in different colours depending on whether they are an exact match, conserved substitution or mismatch.
  • Gapped alignments are supported, with insertions and deletions being highlighted in the match sequence.
  • Matches can be sorted and filtered.
  • SNPs and other variations can be highlighted in the reference sequence.
  • Poly(A) tails can be displayed and poly(A) signals highlighted in the reference sequence.
Please cite: Gemma Barson and Ed Griffiths: SeqTools: visual tools for manual analysis of sequence alignments. (PubMed,eprint) BMC Research Notes 9:39 (2016)
Registry entries: Bio.tools  SciCrunch 
bolt-lmm
Efficient large cohorts genome-wide Bayesian mixed-model association testing
Versions of package bolt-lmm
ReleaseVersionArchitectures
trixie2.4.1+dfsg-2amd64,i386,ppc64el
bookworm2.4.0+dfsg-1amd64,i386,ppc64el
bullseye2.3.4+dfsg-3amd64,i386,ppc64el
buster2.3.2+dfsg-3amd64
sid2.4.1+dfsg-2amd64,i386,ppc64el
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The BOLT-LMM software package currently consists of two main algorithms, the BOLT-LMM algorithm for mixed model association testing, and the BOLT-REML algorithm for variance components analysis (i.e., partitioning of SNP-heritability and estimation of genetic correlations).

The BOLT-LMM algorithm computes statistics for testing association between phenotype and genotypes using a linear mixed model. By default, BOLT-LMM assumes a Bayesian mixture-of-normals prior for the random effect attributed to SNPs other than the one being tested. This model generalizes the standard infinitesimal mixed model used by previous mixed model association methods, providing an opportunity for increased power to detect associations while controlling false positives. Additionally, BOLT-LMM applies algorithmic advances to compute mixed model association statistics much faster than eigendecomposition-based methods, both when using the Bayesian mixture model and when specialized to standard mixed model association.

The BOLT-REML algorithm estimates heritability explained by genotyped SNPs and genetic correlations among multiple traits measured on the same set of individuals. BOLT-REML applies variance components analysis to perform these tasks, supporting both multi-component modeling to partition SNP-heritability and multi-trait modeling to estimate correlations. BOLT-REML applies a Monte Carlo algorithm that is much faster than eigendecomposition-based methods for variance components analysis at large sample sizes.

The package is enhanced by the following packages: bolt-lmm-example
Please cite: Po-Ru Loh, George Tucker, Brendan K Bulik-Sullivan, Bjarni J Vilhjálmsson, Hilary K Finucane, Rany M Salem, Daniel I Chasman, Paul M Ridker, Benjamin M Neale, Bonnie Berger, Nick Patterson and Alkes L Price: Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nature Genetics (2015)
bowtie
Ultrafast memory-efficient short read aligner
Versions of package bowtie
ReleaseVersionArchitectures
trixie1.3.1-3amd64,arm64,mips64el,ppc64el,riscv64,s390x
stretch1.1.2-6amd64,arm64,mips64el,ppc64el,s390x
bullseye1.3.0+dfsg1-1amd64,arm64,mips64el,ppc64el,s390x
jessie1.1.1-2amd64
bookworm1.3.1-1amd64,arm64,mips64el,ppc64el,s390x
buster1.2.2+dfsg-4amd64,arm64
sid1.3.1-3amd64,arm64,mips64el,ppc64el,riscv64,s390x
Debtags of package bowtie:
biologynuceleic-acids
fieldbiology:bioinformatics
interfacecommandline
roleprogram
sciencecalculation
scopeutility
useanalysing, comparing
works-withbiological-sequence
Popcon: 19 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

This package addresses the problem of interpreting the results from the latest (2010) DNA sequencing technologies. These yield fairly short stretches of sequence that cannot be interpreted directly. The challenge for tools like Bowtie is to assign a chromosomal location to each of the short stretches of DNA sequenced per run.

Bowtie aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
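
The Burrows-Wheeler index mentioned above is built on the Burrows-Wheeler transform (BWT) of the reference. The Python sketch below computes the transform naively by sorting all rotations of a short string. It is meant only to show what the transform is: the function name `bwt` and the `$` end marker are choices of this sketch, and an aligner working on a whole genome needs far more memory-efficient construction methods.

```python
def bwt(text, terminator="$"):
    """Return the Burrows-Wheeler transform of `text`.

    Naive construction: sort every rotation of text + terminator and take
    the last character of each. The terminator must not occur in `text`
    and must sort before all other characters.
    """
    if terminator in text:
        raise ValueError("terminator must not occur in the text")
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


if __name__ == "__main__":
    print(bwt("banana"))
```

Because equal characters tend to cluster in the transformed string, it compresses well, which is one reason such an index can stay small.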

The package is enhanced by the following packages: bowtie-examples multiqc
Please cite: Ben Langmead, Cole Trapnell, Mihai Pop and Steven L Salzberg: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. (eprint) Genome Biology 10:R25 (2009)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Genomics
bowtie2
ultrafast memory-efficient short read aligner
Versions of package bowtie2
ReleaseVersionArchitectures
sid2.5.4-1amd64,arm64,mips64el,ppc64el,riscv64
bookworm2.5.0-3amd64,arm64,mips64el,ppc64el
buster2.3.4.3-1amd64
stretch2.3.0-2amd64
jessie2.2.4-1amd64
bullseye2.4.2-2amd64,arm64,mips64el,ppc64el
trixie2.5.4-1amd64,arm64,mips64el,ppc64el,riscv64
Popcon: 27 users (73 upd.)*
Versions and Archs
License: DFSG free
Git

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes.

Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
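
How an FM Index answers exact-match queries can be sketched with the classic backward search. The Python code below is a teaching example under simplifying assumptions, not Bowtie 2's implementation: the names `build_bwt` and `count_occurrences` are hypothetical, occurrence counts are recomputed by scanning instead of being stored in sampled tables, and only exact matching is shown (no gaps, mismatches or local alignment).

```python
from collections import Counter


def build_bwt(text):
    """Burrows-Wheeler transform of text + '$', built via a naive suffix array."""
    text += "$"
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The character preceding each sorted suffix (wraps around for suffix 0).
    return "".join(text[i - 1] for i in suffix_array)


def count_occurrences(bwt, pattern):
    """Count exact occurrences of `pattern` in the original text via backward search."""
    # first_row[c]: number of characters in the text that sort before c.
    counts = Counter(bwt)
    first_row, total = {}, 0
    for c in sorted(counts):
        first_row[c] = total
        total += counts[c]

    lo, hi = 0, len(bwt)  # current range of matching suffixes, half-open
    for c in reversed(pattern):
        if c not in first_row:
            return 0
        # bwt.count(c, 0, i) is the number of c's in bwt[:i] (the "occ" table).
        lo = first_row[c] + bwt.count(c, 0, lo)
        hi = first_row[c] + bwt.count(c, 0, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = build_bwt("banana")
    print(count_occurrences(index, "ana"))
```

The search consumes the pattern from its last character to its first, narrowing a range of sorted suffixes at each step, so the work per query depends on the pattern length rather than on the size of the reference.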

The package is enhanced by the following packages: bowtie2-examples multiqc
Please cite: Ben Langmead and Steven L Salzberg: Fast gapped-read alignment with Bowtie 2. (PubMed) Nature Methods 9:357–359 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Genomics
boxshade
Pretty-printing of multiple sequence alignments
Versions of package boxshade
ReleaseVersionArchitectures
stretch3.3.1-10amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie3.3.1-8amd64,armel,armhf,i386
sid3.3.1-14amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie3.3.1-14amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm3.3.1-14amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye3.3.1-14amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster3.3.1-12amd64,arm64,armhf,i386
Debtags of package boxshade:
biologyformat:aln, nuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
usetypesetting
works-with-formathtml, plaintext, postscript, tex
Popcon: 5 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Boxshade is a program for creating good looking printouts from multiple-aligned protein or DNA sequences. The program does not perform the alignment by itself and requires as input a file that was created by a multiple alignment program or manually edited with respective tools.

Boxshade reads multiple-aligned sequences from PILEUP-MSF, CLUSTAL-ALN, MALIGNED-data or ESEE-save files (limited to a maximum of 150 sequences with up to 10000 elements each). Various kinds of shading can be applied to identical/similar residues. Output is written to screen or to a file in the following formats: ANSI/VT100, PS/EPS, RTF, HPGL, ReGIS, LJ250-printer, ASCII, xFIG, PICT, HTML.

Registry entries: Bio.tools  SciCrunch 
bppphyview
Bio++ Phylogenetic Viewer
Versions of package bppphyview
ReleaseVersionArchitectures
sid0.6.1-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.6.1-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm0.6.1-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.6.1-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
stretch0.3.0-1.1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie0.3.0-1amd64,armel,armhf,i386
buster0.6.1-1amd64,arm64,armhf,i386
Debtags of package bppphyview:
roleprogram
uitoolkitqt
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

A phylogenetic tree editor developed using Bio++ and Qt. Phyview allows one to visualize, edit, print and output phylogenetic trees and associated data.

bppsuite
Bio++ program suite
Versions of package bppsuite
ReleaseVersionArchitectures
trixie2.4.1-7amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster2.4.1-1amd64,arm64,armhf,i386
stretch2.2.0-0.1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm2.4.1-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid2.4.1-7amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye2.4.1-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package bppsuite:
roleprogram
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The Bio++ Program Suite is a package of programs using the Bio++ libraries and dedicated to Phylogenetics and Molecular Evolution. All programs are independent, but can be combined to perform rather complex analyses. These programs use the interface helper tools of the libraries, and hence share the same syntax. They also have several options in common, which may also be shared by third-party software.

The following programs are included:

  • BppML for maximum likelihood analysis,
  • BppSeqGen for sequence simulation,
  • BppAncestor for ancestral states reconstruction,
  • BppDist for distance methods,
  • BppPars for parsimony analysis,
  • BppSeqMan for file conversion and sequence manipulation,
  • BppConsense for building consensus tree and computing bootstrap values,
  • BppReRoot for tree rerooting,
  • BppTreeDraw for tree drawing,
  • BppAlnScore for comparing alignments and computing alignment scores,
  • BppMixedLikelihoods for computing site per site likelihoods of components of mixture models,
  • BppPopGen for population genetics analyses.
The package is enhanced by the following packages: bppsuite-examples
brig
BLAST Ring Image Generator
Versions of package brig
ReleaseVersionArchitectures
bookworm0.95+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.95+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie0.95+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
bullseye0.95+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster0.95+dfsg-2amd64,arm64,armhf,i386
stretch0.95+dfsg-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

BRIG can display circular comparisons between a large number of genomes, with a focus on handling genome assembly data.

  • Images show similarity between a central reference sequence and other sequences as concentric rings.
  • BRIG will perform all BLAST comparisons and file parsing automatically via a simple GUI.
  • Contig boundaries and read coverage can be displayed for draft genomes; customized graphs and annotations can be displayed.
  • Using a user-defined set of genes as input, BRIG can display gene presence, absence, truncation or sequence variation in a set of complete genomes, draft genomes or even raw, unassembled sequence data.
  • BRIG also accepts SAM-formatted read-mapping files, enabling genomic regions present in unassembled sequence data from multiple samples to be compared simultaneously.
Please cite: Nabil-Fareed Alikhan, Nicola K Petty, Nouri L Ben Zakour and Scott A Beatson: BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. (PubMed,eprint) BMC Genomics 12:402 (2011)
Registry entries: Bio.tools  SciCrunch 
btllib-tools
Bioinformatics Technology Lab common code library tools
Versions of package btllib-tools
Release   Version        Architectures
sid       1.4.10+dfsg-1  amd64,arm64,mips64el,ppc64el,riscv64
trixie    1.4.10+dfsg-1  amd64,arm64,mips64el,ppc64el,riscv64
bookworm  1.4.10+dfsg-1  amd64,arm64,mips64el,ppc64el
upstream  1.7.3
Popcon: 2 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

Bioinformatics Technology Lab common code library in C++ with Python wrappers.

This package contains the tool indexlr.

Registry entries: Bioconda 
busco
benchmarking sets of universal single-copy orthologs
Versions of package busco
Release   Version  Architectures
bookworm  5.4.4-1  amd64,i386
sid       5.5.0-2  amd64,arm64,i386
bullseye  5.0.0-1  all
trixie    5.5.0-2  amd64,arm64,i386
Popcon: 6 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs (BUSCO).

  • Automated selection of lineages issued from https://www.orthodb.org/
  • Automated download of all necessary files and datasets to conduct a run
  • Use prodigal for non-eukaryotic genomes
The package is enhanced by the following packages: multiqc
Please cite: Mathieu Seppey, Mosè Manni and Evgeny M. Zdobnov: BUSCO: Assessing Genome Assembly and Annotation Completeness. (PubMed) Methods Mol Biol. 1962:227-245 (2019)
Registry entries: Bio.tools  Bioconda 
bustools
program for manipulating BUS files for single cell RNA-Seq datasets
Versions of package bustools
Release   Version        Architectures
sid       0.43.2+dfsg-1  amd64,arm64,mips64el,ppc64el,riscv64,s390x
trixie    0.43.2+dfsg-1  amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm  0.42.0+dfsg-1  amd64,arm64,mips64el,ppc64el,s390x
bullseye  0.40.0-4       amd64,arm64,mips64el,ppc64el,s390x
upstream  0.44.0
Popcon: 1 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

This package contains the BUStools program. It can be used to error-correct barcodes, collapse UMIs, and produce gene count or transcript compatibility count matrices.
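
As a rough illustration of what collapsing UMIs into a gene count matrix means, the sketch below counts distinct UMIs per (barcode, gene) pair. It is a simplification and not BUStools code: the tuple layout and the name `count_genes` are assumptions made for this example, and real BUS files are binary records rather than gene-labelled tuples.

```python
from collections import defaultdict


def count_genes(records):
    """Count distinct UMIs per (barcode, gene) pair.

    `records` is an iterable of (barcode, umi, gene) tuples; this layout is
    a hypothetical stand-in for parsed BUS records.
    """
    umis_seen = defaultdict(set)
    for barcode, umi, gene in records:
        # Duplicate reads of the same molecule share a UMI and collapse here.
        umis_seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


if __name__ == "__main__":
    demo = [
        ("AAAC", "TTG", "geneA"),
        ("AAAC", "TTG", "geneA"),  # PCR duplicate, counted once
        ("AAAC", "CCA", "geneA"),
        ("GGGT", "TTG", "geneB"),
    ]
    print(count_genes(demo))
```

Using a set per (barcode, gene) key keeps the sketch short; a production tool would also need barcode error correction before this step.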

Please cite: Páll Melsted, A. Sina Booeshaghi, Fan Gao, Eduardo Beltrame, Lambda Lu, Kristján Eldjárn Hjorleifsson, Jase Gehring and Lior Pachter: Modular and efficient pre-processing of single-cell RNA-seq. BioRxiv :673285 (2019)
Registry entries: Bio.tools  Bioconda 
bwa
Burrows-Wheeler Aligner
Versions of package bwa
Release            Version          Architectures
buster             0.7.17-3         amd64
jessie             0.7.10-1         amd64
stretch            0.7.15-2+deb9u1  amd64
stretch-backports  0.7.17-1~bpo9+1  amd64
sid                0.7.18-1         amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie             0.7.18-1         amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm           0.7.17-7         amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye           0.7.17-6         amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package bwa:
biology: nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: commandline, text-mode
role: program
use: analysing, comparing
Popcon: 17 users (22 upd.)*
Versions and Archs
License: DFSG free
Git

BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.

Please cite: Heng Li and Richard Durbin: Fast and accurate short read alignment with Burrows-Wheeler transform. (PubMed,eprint) Bioinformatics 25(14):1754-1760 (2009)
Registry entries: Bio.tools  SciCrunch  Bioconda 
canu
single molecule sequence assembler for genomes
Versions of package canu
Release            Version              Architectures
bookworm           2.0+dfsg-2           amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch-backports  1.7.1+dfsg-1~bpo9+1  amd64
sid                2.2+dfsg-5           amd64,arm64,mips64el,ppc64el,riscv64,s390x
buster             1.8+dfsg-2           amd64
bullseye           2.0+dfsg-1           amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing (such as the PacBio RS II or Oxford Nanopore MinION).

Canu is a hierarchical assembly pipeline which runs in four steps:

  • Detect overlaps in high-noise sequences using MHAP
  • Generate corrected sequence consensus
  • Trim corrected sequences
  • Assemble trimmed corrected sequences
Please cite: Sergey Koren, Brian P. Walenz, Konstantin Berlin, Jason R. Miller and Adam M. Phillippy: Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. (2017)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Remark of Debian Med team: Genome assembly and large-scale genome alignment (http://www.cbcb.umd.edu/software/)
cassiopee
index and search tool in genomic sequences
Versions of package cassiopee
Release   Version       Architectures
jessie    1.0.1+dfsg-3  amd64,armel,armhf,i386
bullseye  1.0.9-3       amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch   1.0.5-1       amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm  1.0.9-3       amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie    1.0.9-4       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       1.0.9-4       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    1.0.9-2       amd64,arm64,armhf,i386
Popcon: 3 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Cassiopee is a C implementation of an index and search library. It is a complete rewrite of the Ruby Cassiopee gem. It scans an input genomic sequence (DNA/RNA/protein) and searches for a subsequence with an exact match or allowing substitutions (Hamming distance) and/or insertions/deletions.

This package contains the cassiopee and cassiopeeknife tools.
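
To make the substitution-tolerant search described above concrete, here is a minimal sketch of matching under a Hamming distance bound. It is a naive linear scan written for illustration only: Cassiopee itself builds an index and also handles insertions/deletions, which this example does not. The function name `hamming_search` and its parameters are assumptions of this sketch, not part of the Cassiopee interface.

```python
def hamming_search(sequence, pattern, max_substitutions=0):
    """Return 0-based start positions where `pattern` matches `sequence`
    with at most `max_substitutions` mismatching characters."""
    hits = []
    m = len(pattern)
    if m == 0 or m > len(sequence):
        return hits
    for start in range(len(sequence) - m + 1):
        mismatches = 0
        for a, b in zip(sequence[start:start + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > max_substitutions:
                    break  # too many substitutions, abandon this window
        else:
            # Loop finished without break: window is within the bound.
            hits.append(start)
    return hits


if __name__ == "__main__":
    print(hamming_search("ACGTACGA", "ACGT"))     # exact matches only
    print(hamming_search("ACGTACGA", "ACGT", 1))  # allow one substitution
```

With `max_substitutions=0` the function reduces to exact matching; raising the bound admits windows that differ from the pattern at that many positions.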

Registry entries: SciCrunch  Bioconda 
cat-bat
taxonomic classification of contigs and metagenome-assembled genomes (MAGs)
Versions of package cat-bat
Release   Version  Architectures
trixie    5.3-2    amd64,arm64,ppc64el,riscv64,s390x
sid       5.3-2    amd64,arm64,ppc64el,riscv64,s390x
bullseye  5.2.2-1  amd64,arm64,ppc64el,s390x
bookworm  5.2.3-2  amd64,arm64,ppc64el,s390x
upstream  6.0.1
Popcon: 0 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome assembled genomes (MAGs/bins) of both known and (highly) unknown microorganisms, as generated by contemporary metagenomics studies. The core algorithm of both programs involves gene calling, mapping of predicted ORFs against the nr protein database, and voting-based classification of the entire contig / MAG based on classification of the individual ORFs. CAT and BAT can be run from intermediate steps if files are formatted appropriately.
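
The voting step mentioned above can be pictured with a small sketch. It uses a plain majority vote over per-ORF taxon labels, which is an assumption made for illustration; the actual weighting and thresholds used by CAT and BAT are not described here and may differ. The name `classify_contig` and the `min_support` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def classify_contig(orf_taxa: List[Optional[str]],
                    min_support: float = 0.5) -> Optional[str]:
    """Assign a contig to the taxon backed by more than `min_support`
    of its ORFs; ORFs without a hit are passed as None and count
    against every taxon."""
    votes = Counter(taxon for taxon in orf_taxa if taxon is not None)
    if not votes:
        return None
    taxon, count = votes.most_common(1)[0]
    # Support is measured against all ORFs, classified or not.
    return taxon if count / len(orf_taxa) > min_support else None


if __name__ == "__main__":
    orfs = ["Escherichia", "Escherichia", "Escherichia", "Salmonella", None]
    print(classify_contig(orfs))
```

Returning `None` when no taxon clears the threshold mirrors the idea that a contig from an unknown organism should stay unclassified rather than be forced into the best available label.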

Please cite: F. A. Bastiaan von Meijenfeldt, Ksenia Arkhipova, Diego D. Cambuy, Felipe H. Coutinho and Bas E. Dutilh: Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT. (PubMed,eprint) Genome Biology 20(1):217 (2019)
Registry entries: Bioconda 
cct
visually comparing bacterial, plasmid, chloroplast, or mitochondrial sequences
Versions of package cct
Release            Version                  Architectures
sid                1.0.3-1                  all
trixie             1.0.3-1                  all
stretch-backports  20170919+dfsg-1~bpo9+1   all
buster             20170919+dfsg-1          all
bullseye           1.0.0-1                  all
bookworm           1.0.3-1                  all
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The CGView Comparison Tool (CCT) is a package for visually comparing bacterial, plasmid, chloroplast, or mitochondrial sequences of interest to existing genomes or sequence collections. The comparisons are conducted using BLAST, and the BLAST results are presented in the form of graphical maps that can also show sequence features, gene and protein names, COG category assignments, and sequence composition characteristics. CCT can generate maps in a variety of sizes, including 400 Megapixel maps suitable for posters. Comparisons can be conducted within a particular species or genus, or all available genomes can be used. The entire map creation process, from downloading sequences to redrawing zoomed maps, can be completed easily using scripts included with the CCT. User-defined features or analysis results can be included on maps, and maps can be extensively customized. To simplify program setup, a CCT virtual machine that includes all dependencies preinstalled is available. Detailed tutorials illustrating the use of CCT are included with the CCT documentation.

Please cite: Jason R Grant, Adriano S Arantes and Paul Stothard: Comparing thousands of circular genomes using the CGView Comparison Tool. (PubMed,eprint) BMC Genomics 13:202 (2012)
cd-hit
suite of programs designed to quickly group sequences
Versions of package cd-hit
Release   Version               Architectures
bullseye  4.8.1-3               amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       4.8.1-4               amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    4.8.1-4               amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  4.8.1-4               amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie    4.6.1-2012-08-27-2    amd64,armel,armhf,i386
stretch   4.6.6-2               amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster    4.6.8-2               amd64,arm64,armhf,i386
Popcon: 7 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

cd-hit contains a number of programs designed to quickly group sequences. cd-hit groups proteins into clusters that meet a user-defined similarity threshold. cd-hit-est is similar to cd-hit, but designed to group nucleotide sequences (without introns). cd-hit-est-2d is similar to cd-hit-2d but designed to compare two nucleotide datasets. A number of other related programs are also in this package. Please see the cd-hit user manual, also part of this package, for further information.

Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Genomics
cdbfasta
Constant DataBase indexing and retrieval tools for multi-FASTA files
Versions of package cdbfasta
Release   Version                           Architectures
stretch   0.99-20100722-3                   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie    1.00+git20230710.da8f5ba+dfsg-1   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       1.00+git20230710.da8f5ba+dfsg-1   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    0.99-20100722-5                   amd64,arm64,armhf,i386
bookworm  1.00+git20181005.014498c+dfsg-4   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  1.00+git20181005.014498c+dfsg-2   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie    0.99-20100722-1                   amd64,armel,armhf,i386
Popcon: 3 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

CDB (Constant DataBase) can be used for creating indices for quick retrieval of any particular sequences from large multi-FASTA files. It has the option to compress data records in order to save space.
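
The idea of indexing a multi-FASTA file for quick retrieval can be sketched as a map from record ID to byte offset, so that one record is read with a single seek instead of a full scan. This is an in-memory illustration under that assumption, not the on-disk CDB format the package uses; `build_fasta_index` and `fetch_record` are names invented for the example.

```python
import io


def build_fasta_index(handle):
    """Map each record ID to (byte offset, byte length) in a FASTA
    stream opened in binary mode."""
    index = {}
    current_id = None
    record_start = 0
    offset = 0
    for line in handle:
        if line.startswith(b">"):
            if current_id is not None:
                index[current_id] = (record_start, offset - record_start)
            fields = line[1:].split()
            # The ID is the first word of the header; empty headers get "".
            current_id = fields[0].decode() if fields else ""
            record_start = offset
        offset += len(line)
    if current_id is not None:
        index[current_id] = (record_start, offset - record_start)
    return index


def fetch_record(handle, index, record_id):
    """Seek directly to one record and return it as text."""
    start, length = index[record_id]
    handle.seek(start)
    return handle.read(length).decode()


if __name__ == "__main__":
    data = io.BytesIO(b">seq1 first\nACGT\nAC\n>seq2\nGGG\n")
    idx = build_fasta_index(data)
    print(fetch_record(data, idx, "seq2"))
```

Opening the file in binary mode matters here, because offsets counted in characters would drift from byte positions on files with multi-byte characters or translated line endings.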

Registry entries: SciCrunch 
centrifuge
rapid and memory-efficient system for classification of DNA sequences
Versions of package centrifuge
Release   Version    Architectures
bullseye  1.0.3-8    amd64,arm64,armel,armhf,i386,mips64el,ppc64el
sid       1.0.4.1-1  amd64,arm64,mips64el,ppc64el,riscv64
buster    1.0.3-2    amd64
trixie    1.0.4.1-1  amd64,arm64,mips64el,ppc64el,riscv64
bookworm  1.0.3-11   amd64,arm64,armel,armhf,i386,mips64el,ppc64el
Popcon: 0 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

Centrifuge is a very rapid and memory-efficient system for the classification of DNA sequences from microbial samples, with better sensitivity than and comparable accuracy to other leading systems. The system uses a novel indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (e.g., 4.3 GB for ~4,100 bacterial genomes) yet provides very fast classification speed, allowing it to process a typical DNA sequencing run within an hour. Together these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers.
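
For readers unfamiliar with the two building blocks named above, the following sketch shows a textbook Burrows-Wheeler transform and the backward search that an FM index performs on it. It is deliberately naive (quadratic construction, linear-time rank queries) and says nothing about how Centrifuge compresses or organises its own index; the function names are assumptions of this example.

```python
def bwt(text):
    """Return the Burrows-Wheeler transform of `text` plus a '$' sentinel,
    together with the suffix array used to build it."""
    text += "$"  # sentinel sorts before A, C, G, T in ASCII
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character is the one preceding each sorted suffix.
    transformed = "".join(text[i - 1] for i in suffix_array)
    return transformed, suffix_array


def fm_count(bwt_string, pattern):
    """Count occurrences of `pattern` in the original text by backward search."""
    # first[c] = number of characters in the text strictly smaller than c
    first = {}
    total = 0
    for c in sorted(set(bwt_string)):
        first[c] = total
        total += bwt_string.count(c)
    lo, hi = 0, len(bwt_string)
    for c in reversed(pattern):
        if c not in first:
            return 0
        # Rank queries are done by slicing here; a real FM index
        # precomputes them so each step takes constant time.
        lo = first[c] + bwt_string[:lo].count(c)
        hi = first[c] + bwt_string[:hi].count(c)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    transformed, _ = bwt("banana")
    print(transformed, fm_count(transformed, "ana"))
```

The search narrows a range of sorted suffixes one pattern character at a time, from the last character to the first, which is why lookup cost depends on the pattern length rather than on the size of the indexed text.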

Please cite: Daehwan Kim, Li Song, Florian P. Breitwieser and Steven L. Salzberg: Centrifuge: rapid and sensitive classification of metagenomic sequences. (PubMed,eprint) Genome Research 26(12):1721-1729 (2016)
Registry entries: Bio.tools  Bioconda 
cgview
Circular Genome Viewer
Versions of package cgview
Release   Version         Architectures
bookworm  0.0.20100111-7  all
bullseye  0.0.20100111-7  all
stretch   0.0.20100111-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster    0.0.20100111-4  amd64,arm64,armhf,i386
sid       0.0.20100111-7  all
trixie    0.0.20100111-7  all
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

CGView is a Java package for generating high quality, zoomable maps of circular genomes. Its primary purpose is to serve as a component of sequence annotation pipelines, as a means of generating visual output suitable for the web. Feature information and rendering options are supplied to the program using an XML file, a tab delimited file, or an NCBI ptt file. CGView converts the input into a graphical map (PNG, JPG, or Scalable Vector Graphics format), complete with labels, a title, legends, and footnotes. In addition to the default full view map, the program can generate a series of hyperlinked maps showing expanded views. The linked maps can be explored using any web browser, allowing rapid genome browsing, and facilitating data sharing. The feature labels in maps can be hyperlinked to external resources, allowing CGView maps to be integrated with existing web site content or databases.

In addition to the CGView application, an API is available for generating maps from within other Java applications, using the cgview package.

CGView can be used for any of the following:

  • Bacterial genome visualization and browsing - CGView can be incorporated into bacterial genome annotation pipelines, as a means of generating web content for data visualization and navigation. The PNG and image map content does not require Java applets or special browser plugins.
  • Genome poster generation - CGView can generate poster-sized images of circular genomes in rasterized image formats or in Scalable Vector Graphics format.
  • Sequence analysis visualization - CGView can be used to display the output of sequence analysis programs in a circular context.

CGView features:

  • Images can be generated in PNG, JPG, or SVG format. See the CGView gallery.
  • Static or interactive maps can be generated. The interactive maps make use of standard PNG images and HTML image maps. Scalable Vector Graphics output is included in the interactive maps (see example).
  • The XML input allows complete control over the appearance of the map.
  • Tab delimited input files and NCBI ptt files can be used as an alternative to the XML format.
  • The CGView API can be used to incorporate CGView into Java applications.
  • The CGView applet can be used to incorporate zoomable maps into web pages (see example).
  • The CGView Server can be used to generate maps online.
Please cite: Paul Stothard and David S. Wishart: Circular genome visualization and exploration using CGView. (PubMed,eprint) Bioinformatics 21(4):537-539 (2004)
Registry entries: Bio.tools  SciCrunch  Bioconda 
changeo
Repertoire clonal assignment toolkit (Python 3)
Versions of package changeo
Release   Version  Architectures
trixie    1.3.0-1  all
bullseye  1.0.2-1  all
buster    0.4.5-1  all
sid       1.3.0-1  all
bookworm  1.3.0-1  all
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Change-O is a collection of tools for processing the output of V(D)J alignment tools, assigning clonal clusters to immunoglobulin (Ig) sequences, and reconstructing germline sequences.

Dramatic improvements in high-throughput sequencing technologies now enable large-scale characterization of Ig repertoires, defined as the collection of trans-membrane antigen-receptor proteins located on the surface of B cells and T cells. Change-O is a suite of utilities to facilitate advanced analysis of Ig and TCR sequences following germline segment assignment. Change-O handles output from IMGT/HighV-QUEST and IgBLAST, and provides a wide variety of clustering methods for assigning clonal groups to Ig sequences. Record sorting, grouping, and various database manipulation operations are also included.

This package installs the library for Python 3.

Please cite: Namita T. Gupta, Jason A. Vander Heiden, Mohamed Uduman, Daniel Gadala-Maria, Gur Yaari and Steven H. Kleinstein: Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. (PubMed,eprint) Bioinformatics 31(20):3356-3358 (2015)
Registry entries: Bioconda 
chimeraslayer
detects likely chimeras in PCR amplified DNA
Versions of package chimeraslayer
Release   Version           Architectures
bookworm  20101212+dfsg1-5  all
bullseye  20101212+dfsg1-4  all
buster    20101212+dfsg1-2  all
sid       20101212+dfsg1-6  all
stretch   20101212+dfsg1-1  all
trixie    20101212+dfsg1-6  all
jessie    20101212+dfsg-1   all
Debtags of package chimeraslayer:
biology: format:aln, nuceleic-acids
field: biology, biology:molecular
role: program
scope: utility
Popcon: 3 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

ChimeraSlayer is a chimeric sequence detection utility, compatible with near-full length Sanger sequences and shorter 454-FLX sequences (~500bp).

Chimera Slayer involves the following series of steps that operate to flag chimeric 16S rRNA sequences:

 1. the ends of a query sequence are searched against an included
    database of reference chimera-free 16S sequences to identify potential
    parents of a chimera
 2. candidate parents of a chimera are selected as those that form a
    branched best scoring alignment to the NAST-formatted query sequence
 3. the NAST alignment of the query sequence is improved in a
    ‘chimera-aware’ profile-based NAST realignment to the selected
    reference parent sequences
 4. an evolutionary framework is used to flag query sequences found to
    exhibit greater sequence homology to an in silico chimera formed
    between any two of the selected reference parent sequences.

To run Chimera Slayer, you need NAST-formatted sequences generated by the nast-ier utility.

ChimeraSlayer is part of the microbiomeutil suite.

The package is enhanced by the following packages: microbiomeutil-data
Please cite: Brian J. Haas, Dirk Gevers, Ashlee M. Earl, Mike Feldgarden, Doyle V. Ward, Georgia Giannoukos, Dawn Ciulla, Diana Tabbaa, Sarah K. Highlander, Erica Sodergren, Barbara Methé, Todd Z. DeSantis, The Human Microbiome Consortium, Joseph F. Petrosino, Rob Knight and Bruce W. Birren: Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. (PubMed,eprint) Genome Research 21(3):494-504 (2011)
Registry entries: SciCrunch 
chromhmm
Chromatin state discovery and characterization
Versions of package chromhmm
Release   Version      Architectures
bookworm  1.24+dfsg-1  all
trixie    1.25+dfsg-1  all
bullseye  1.21+dfsg-1  all
buster    1.18+dfsg-1  all
sid       1.25+dfsg-1  all
Popcon: 4 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

ChromHMM is software for learning and characterizing chromatin states. ChromHMM can integrate multiple chromatin datasets, such as ChIP-seq data of various histone modifications, to discover de novo the major recurring combinatorial and spatial patterns of marks. ChromHMM is based on a multivariate Hidden Markov Model that explicitly models the presence or absence of each chromatin mark. The resulting model can then be used to systematically annotate a genome in one or more cell types. By automatically computing state enrichments for large-scale functional and annotation datasets, ChromHMM facilitates the biological characterization of each state. ChromHMM also produces files with genome-wide maps of chromatin state annotations that can be directly visualized in a genome browser.
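
One way to read "explicitly models the presence or absence of each chromatin mark" is that each hidden state carries one probability per mark, and the probability of an observed presence/absence vector is the product of the matching per-mark terms. The sketch below computes that product under the assumption that marks are treated as independent given the state; the function name and the example numbers are invented for illustration and are not taken from ChromHMM.

```python
import math


def emission_probability(observed, mark_probs):
    """Probability of a binary mark vector under one chromatin state.

    observed   -- sequence of 0/1 flags, one per mark (1 = mark present)
    mark_probs -- per-mark probability of presence in this state
    """
    if len(observed) != len(mark_probs):
        raise ValueError("observed and mark_probs must have the same length")
    prob = 1.0
    for present, p in zip(observed, mark_probs):
        # Use p when the mark is present, 1 - p when it is absent.
        prob *= p if present else 1.0 - p
    return prob


if __name__ == "__main__":
    # Hypothetical state with three marks.
    state = [0.9, 0.2, 0.5]
    print(emission_probability([1, 0, 1], state))
    assert math.isclose(emission_probability([1, 0, 1], state), 0.36)
```

In a full HMM these emission terms would be combined with transition probabilities between states along the genome; only the emission part is shown here.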

The package is enhanced by the following packages: chromhmm-example
Please cite: Jason Ernst and Manolis Kellis: ChromHMM: automating chromatin-state discovery and characterization. (eprint) Nature Methods 9(3):215-216 (2012)
Registry entries: Bio.tools  Bioconda 
chromimpute
Large-scale systematic epigenome imputation
Versions of package chromimpute
Release   Version       Architectures
buster    1.0.3+dfsg-1  all
sid       1.0.3+dfsg-5  all
bullseye  1.0.3+dfsg-2  all
trixie    1.0.3+dfsg-5  all
bookworm  1.0.3+dfsg-4  all
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

ChromImpute takes an existing compendium of epigenomic data and uses it to predict signal tracks for mark-sample combinations not experimentally mapped or to generate a potentially more robust version of data sets that have been mapped experimentally. ChromImpute bases its predictions on features from signal tracks of other marks that have been mapped in the target sample and the target mark in other samples with these features combined using an ensemble of regression trees.

Please cite: Jason Ernst and Manolis Kellis: Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. (eprint) Nature Biotechnology 33(4):364-376 (2015)
cif-tools
Suite of tools to manipulate, validate and query mmCIF files
Versions of package cif-tools
Release   Version  Architectures
trixie    1.0.7-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       1.0.7-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.0.7-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  1.0.0-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream  1.0.12
Popcon: 3 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

This package contains a suite of tools for the manipulation of mmCIF files.

The structure of macromolecules is nowadays recorded in mmCIF files. Until recently, however, many programs still used the older PDB file format, which has long been deprecated.

This package provides two tools, pdb2cif and cif2pdb, that convert files from one format into the other, provided the data fits the target format.

Other tools are cif-validate, cif-grep, cif-diff, cif-merge and mmCQL. The latter can be used to manipulate an mmCIF file as if it were an SQL-like database, using SELECT, UPDATE, INSERT and DELETE commands.

This package depends on libcifpp.

circlator
circularize genome assemblies
Versions of package circlator
Release   Version   Architectures
sid       1.5.6-11  all
buster    1.5.5-3   amd64
stretch   1.4.1-1   all
bullseye  1.5.6-5   amd64
bookworm  1.5.6-7   amd64
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Circlator is a tool to automate assembly circularization for bacterial and small eukaryotic genomes and produce accurate linear representations of circular sequences.

Please cite: Martin Hunt, Nishadi De Silva, Thomas D. Otto, Julian Parkhill, Jacqueline A. Keane and Simon R. Harris: Circlator: automated circularization of genome assemblies using long sequencing reads. (PubMed) Genome Biology 16:294 (2015)
Registry entries: SciCrunch  Bioconda 
circos
plotter for visualizing data
Versions of package circos
Release   Version        Architectures
buster    0.69.6+dfsg-2  all
stretch   0.69.4+dfsg-1  all
jessie    0.66-1         all
bullseye  0.69.9+dfsg-2  all
bookworm  0.69.9+dfsg-2  all
sid       0.69.9+dfsg-2  all
trixie    0.69.9+dfsg-2  all
Debtags of package circos:
field: biology:bioinformatics
role: program
use: viewing
Popcon: 5 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Circos visualizes data in a circular layout, which is ideal for exploring relationships between objects or positions and for creating highly informative publication-quality graphics.

This package provides the Circos plotting engine, which is command-line driven (like gnuplot) and fully scriptable.

Please cite: Martin I Krzywinski, Jacqueline E Schein, Inanc Birol, Joseph Connors, Randy Gascoyne, Doug Horsman, Steven J Jones and Marco A Marra: Circos: An information aesthetic for comparative genomics. (PubMed,eprint) Genome Research 19(9):1639-45 (2009)
Registry entries: Bio.tools  SciCrunch  Bioconda 
clearcut
extremely efficient phylogenetic tree reconstruction
Versions of package clearcut
Release   Version                      Architectures
trixie    1.0.9+git20211013.b799afe-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    1.0.9-3                      amd64,arm64,armhf,i386
sid       1.0.9+git20211013.b799afe-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.0.9+git20211013.b799afe-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie    1.0.9-1                      amd64,armel,armhf,i386
stretch   1.0.9-1                      amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye  1.0.9-6                      amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 4 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Clearcut is the reference implementation for the Relaxed Neighbor Joining (RNJ) algorithm by J. Evans, L. Sheneman, and J. Foster from the Initiative for Bioinformatics and Evolutionary Studies (IBEST) at the University of Idaho.

Please cite: Jason Evans, Luke Sheneman and James A. Foster: Relaxed Neighbor-Joining: A Fast Distance-Based Phylogenetic Tree Construction Method. (PubMed) J. Mol. Evol. 62(6):785-792 (2006)
Registry entries: SciCrunch  Bioconda 
clonalframe
inference of bacterial microevolution using multilocus sequence data
Versions of package clonalframe
Release   Version  Architectures
bullseye  1.2-10   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie    1.2-3    amd64,armel,armhf,i386
trixie    1.2-11   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.2-11   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       1.2-11   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch   1.2-5    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster    1.2-9    amd64,arm64,armhf,i386
Debtags of package clonalframe:
role: program
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

ClonalFrame identifies the clonal relationships between the members of a sample, while also estimating the chromosomal position of homologous recombination events that have disrupted the clonal inheritance.

ClonalFrame can be applied to any kind of sequence data, from a single fragment of DNA to whole genomes. It is well suited for the analysis of MLST data, where 7 gene fragments have been sequenced, but becomes progressively more powerful as the sequenced regions increase in length and number up to whole genomes. However, it requires the sequences to be aligned. If you have genomic data that is not aligned, it is recommended to use Mauve, which produces alignments of whole bacterial genomes in exactly the format required for analysis with ClonalFrame.

Please cite: Xavier Didelot and Daniel Falush: Inference of Bacterial Microevolution Using Multilocus Sequence Data. (PubMed,eprint) Genetics Advance 175:1251-1266 (2006)
Registry entries: SciCrunch 
clonalframeml
Efficient Inference of Recombination in Whole Bacterial Genomes
Versions of package clonalframeml
Release   Version  Architectures
bookworm  1.12-3   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  1.12-1   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    1.11-3   amd64,arm64,armhf,i386
sid       1.13-1   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    1.13-1   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

ClonalFrameML is a software package that performs efficient inference of recombination in bacterial genomes. ClonalFrameML was created by Xavier Didelot and Daniel Wilson. ClonalFrameML can be applied to any type of aligned sequence data, but is especially aimed at analysis of whole genome sequences. It is able to compare hundreds of whole genomes in a matter of hours on a standard desktop computer. There are three main outputs from a run of ClonalFrameML: a phylogeny with branch lengths corrected to account for recombination, an estimation of the key parameters of the recombination process, and a genomic map of where recombination took place for each branch of the phylogeny.

ClonalFrameML is a maximum likelihood implementation of the Bayesian software ClonalFrame, which was previously described by Didelot and Falush (2007). The recombination model underpinning ClonalFrameML is exactly the same as for ClonalFrame, but this new implementation is a lot faster, is able to deal with much larger genomic datasets, and does not suffer from MCMC convergence issues.

Please cite: Xavier Didelot and Daniel J. Wilson: ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes. (PubMed,eprint) PLoS Comput Biology 11(2):e1004041 (2015)
Registry entries: Bioconda 
clonalorigin
inference of homologous recombination in bacteria using whole genome sequences
Versions of package clonalorigin
Release   Version  Architectures
sid       1.0-8    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  1.0-4    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    1.0-3    amd64,arm64,armhf,i386
bookworm  1.0-6    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie    1.0-8    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Bacteria, unlike us, can reproduce on their own. They do however have mechanisms that transfer DNA between organisms, a process more formally known as recombination. The mechanisms by which recombination takes place have been studied extensively in the laboratory but much remains to be understood concerning how, when and where recombination takes place within natural populations of bacteria and how it helps them to adapt to new environments. ClonalOrigin performs a comparative analysis of the sequences of a sample of bacterial genomes in order to reconstruct the recombination events that have taken place in their ancestry.

Please cite: Xavier Didelot, Daniel Lawson, Aaron Darling and Daniel Falush: Inference of Homologous Recombination in Bacteria Using Whole-Genome Sequences. (PubMed,eprint) Genetics 186(4):1435-1449 (2010)
Registry entries: Bio.tools  SciCrunch 
clustalo
General-purpose multiple sequence alignment program for proteins
Versions of package clustalo
Release   Version  Architectures
trixie    1.2.4-8  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie    1.2.1-1  amd64,armel,armhf,i386
stretch   1.2.4-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster    1.2.4-2  amd64,arm64,armhf,i386
bullseye  1.2.4-7  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.2.4-7  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       1.2.4-8  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 22 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

Clustal Omega is a general-purpose multiple sequence alignment (MSA) program, primarily for amino-acid sequences. It produces high quality MSAs and is capable of handling data sets of hundreds of thousands of sequences in reasonable time, using multiple processors where available.

Please cite: Fabian Sievers, Andreas Wilm, David Dineen, Toby J Gibson, Kevin Karplus, Weizhong Li, Rodrigo Lopez, Hamish McWilliam, Michael Remmert, Johannes Söding, Julie D Thompson and Desmond G Higgins: Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. (PubMed,eprint) Molecular Systems Biology 7:539 (2011)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequence analysis
clustalw
global multiple nucleotide or peptide sequence alignment
Versions of package clustalw
ReleaseVersionArchitectures
sid2.1+lgpl-7amd64,arm64,mips64el,ppc64el,riscv64,s390x
jessie2.1+lgpl-4amd64,armel,armhf,i386
buster2.1+lgpl-6amd64,arm64,armhf,i386
bullseye2.1+lgpl-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm2.1+lgpl-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.1+lgpl-7amd64,arm64,mips64el,ppc64el,riscv64,s390x
stretch2.1+lgpl-5amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Debtags of package clustalw:
biologyformat:aln, nuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline, text-mode
roleprogram
scopeutility
usecomparing
works-with-formatplaintext
Popcon: 20 users (10 upd.)*
Versions and Archs
License: DFSG free
Git

This program performs an alignment of multiple nucleotide or amino acid sequences. It recognizes the format of input sequences and whether the sequences are nucleic acid (DNA/RNA) or amino acid (proteins). The output format may be selected from various formats for multiple alignments such as Phylip or FASTA. Clustal W is very well accepted.

The output of Clustal W can be edited manually, but preferably with an alignment editor like SeaView or within its companion Clustal X. A model built from your alignment can be applied for improved database searches. The Debian package hmmer creates such a model in the form of an HMM.

The package is enhanced by the following packages: clustalw-mpi
Please cite: M. A. Larkin, G. Blackshields, N. P. Brown, R. Chenna, P. A. McGettigan, H. McWilliam, F. Valentin, I.M. Wallace, A. Wilm, R. Lopez, J. D. Thompson, T. J. Gibson and D. G. Higgins: Clustal W and Clustal X version 2.0. (PubMed,eprint) Bioinformatics 23(21):2947-2948 (2007)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequence analysis
clustalx
Multiple alignment of nucleic acid and protein sequences (graphical interface)
Versions of package clustalx
ReleaseVersionArchitectures
bookworm2.1+lgpl-9amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie2.1+lgpl-3amd64,armel,armhf,i386
sid2.1+lgpl-9amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie2.1+lgpl-9amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
bullseye2.1+lgpl-9amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster2.1+lgpl-8amd64,arm64,armhf,i386
stretch2.1+lgpl-5amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Debtags of package clustalx:
biologyformat:aln, nuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacex11
roleprogram
scopeutility
uitoolkitmotif
useanalysing, comparing, viewing
works-with-formatplaintext
x11application
Popcon: 8 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

This package offers a graphical interface for the Clustal multiple sequence alignment program. It provides an integrated environment for performing multiple sequence and profile alignments and analysing the results. The sequence alignment is displayed in a window on the screen. A versatile coloring scheme has been incorporated to highlight conserved features in the alignment. For professional presentations, one should use the texshade LaTeX package or boxshade.

The pull-down menus at the top of the window allow you to select all the options required for traditional multiple sequence and profile alignment. You can cut-and-paste sequences to change the order of the alignment; you can select a subset of sequences to be aligned; you can select a sub-range of the alignment to be realigned and inserted back into the original alignment.

An alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted.

Please cite: M.A. Larkin, G. Blackshields, N.P. Brown, R. Chenna, P.A. McGettigan, H. McWilliam, F. Valentin, I.M. Wallace, A. Wilm, R. Lopez, J.D. Thompson, T.J. Gibson and D.G. Higgins: Clustal W and Clustal X version 2.0. (PubMed,eprint) Bioinformatics 23(21):2947-2948 (2007)
Registry entries: Bio.tools  SciCrunch 
Topics: Sequence analysis
cnvkit
Copy number variant detection from targeted DNA sequencing
Versions of package cnvkit
ReleaseVersionArchitectures
buster0.9.5-3amd64
trixie0.9.10-2all
bullseye0.9.8-1amd64,arm64,ppc64el
bookworm0.9.9-2amd64,arm64,ppc64el
sid0.9.10-2all
experimental0.9.10-3~0exp0all
upstream0.9.11
Popcon: 3 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

A command-line toolkit and Python library for detecting copy number variants and alterations genome-wide from targeted DNA sequencing. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Please cite: Eric Talevich, A. Hunter Shain, Thomas Botton and Boris C. Bastian: CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing. (PubMed,eprint) PLOS Computational Biology 12(4):e1004873 (2016)
Registry entries: Bio.tools  Bioconda 
codonw
Correspondence Analysis of Codon Usage
Versions of package codonw
ReleaseVersionArchitectures
bullseye1.4.4-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.4.4-4amd64,arm64,armhf,i386
stretch1.4.4-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie1.4.4-8amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.4.4-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.4.4-8amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

CodonW is a package for codon usage analysis. It was designed to simplify Multivariate Analysis (MVA) of codon usage. The MVA method employed in CodonW is correspondence analysis (COA), the most popular MVA method for codon usage analysis. CodonW can generate a COA for codon usage, relative synonymous codon usage or amino acid usage. Additional analyses of codon usage include investigation of optimal codons, codon and dinucleotide bias, and base composition. CodonW can also analyse sequences encoded by genetic codes other than the universal code.
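
As a rough illustration of relative synonymous codon usage (RSCU), the sketch below divides each codon's observed count by the mean count over its synonymous family. It is not CodonW's code: the function name rscu and the deliberately partial SYNONYMOUS table are assumptions made for this example.

```python
from collections import Counter

# Hypothetical, partial table of synonymous codon families (standard code).
# A real analysis would cover every amino acid.
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
}


def rscu(sequence):
    """Return RSCU per codon: observed count / mean count in its family."""
    usable = len(sequence) - len(sequence) % 3  # drop a trailing partial codon
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined for this family
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values
```

A value above 1 means a codon is used more often than expected under equal usage of its synonyms, and a value below 1 means it is used less often.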

Please cite: Paul M. Sharp, Elizabeth Bailes, Russell J. Grocock, John F. Peden and R. Elizabeth Sockett: Variation in the strength of selected codon usage bias among bacteria. (PubMed,eprint) Nucleic Acids Research 33(4):1141-1153 (2005)
Registry entries: Bioconda 
Topics: Sequence composition, complexity and repeats
comet-ms
Tandem mass spectrometry (MS/MS) search engine
Versions of package comet-ms
ReleaseVersionArchitectures
trixie2019015+cleaned1-4.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2019015+cleaned1-4.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch2014022-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster2018012-1amd64,arm64,armhf,i386
bullseye2019015+cleaned1-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm2019015+cleaned1-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream2021010
Popcon: 2 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

Comet is an open source tandem mass spectrometry (MS/MS) sequence database search engine. It identifies peptides by searching MS/MS spectra against sequences present in protein sequence databases.

This package ships a binary that does MS/MS database searches. Supported input formats are mzXML, mzML, and ms2 files. Supported output formats are .out, SQT, and pepXML.

Please cite: Jimmy K. Eng, Tahmina A. Jahan and Michael R. Hoopmann: Comet: an open source tandem mass spectrometry sequence database search tool. (PubMed) Proteomics 13(1) (2012)
concavity
predictor of protein ligand binding sites from structure and conservation
Versions of package concavity
ReleaseVersionArchitectures
stretch0.1+dfsg.1-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie0.1-2amd64,armel,armhf,i386
sid0.1+dfsg.1-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie0.1+dfsg.1-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.1+dfsg.1-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye0.1+dfsg.1-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster0.1+dfsg.1-4amd64,arm64,armhf,i386
Popcon: 5 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

ConCavity predicts protein ligand binding sites by combining evolutionary sequence conservation and 3D structure.

ConCavity takes as input a PDB format protein structure and optionally files that characterize the evolutionary sequence conservation of the chains in the structure file.

The following result files are produced by default:

  • Residue ligand binding predictions for each chain (*.scores).
  • Residue ligand binding predictions in a PDB format file (residue scores placed in the temp. factor field, *_residue.pdb).
  • Pocket prediction locations in a DX format file (*.dx).
  • PyMOL script to visualize the predictions (*.pml).
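
Because the *_residue.pdb output stores residue scores in the temperature-factor field, they can be read back with a few lines of Python. This is a minimal sketch that assumes standard fixed-column PDB ATOM/HETATM records (temperature factor in columns 61-66); the function name residue_scores is made up for this example and is not part of ConCavity.

```python
def residue_scores(pdb_lines):
    """Map (chain, residue number) to the value in the temperature-factor column."""
    scores = {}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip headers, remarks and other record types
        chain = line[21]                  # column 22: chain identifier
        resseq = int(line[22:26])         # columns 23-26: residue sequence number
        temp_factor = float(line[60:66])  # columns 61-66: temperature factor
        # All atoms of a residue are assumed to carry the same score,
        # so the first atom seen is kept.
        scores.setdefault((chain, resseq), temp_factor)
    return scores
```

Usage would look like residue_scores(open("example_residue.pdb")), where the file name is a placeholder.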
The package is enhanced by the following packages: conservation-code
Please cite: John A. Capra, Roman A. Laskowski, Janet M. Thornton, Mona Singh and Thomas A. Funkhouser: Predicting Protein Ligand Binding Sites by Combining Evolutionary Sequence Conservation and 3D Structure. (PubMed) PLoS Computational Biology 5(12):e1000585 (2009)
Registry entries: SciCrunch 
conservation-code
protein sequence conservation scoring tool
Versions of package conservation-code
ReleaseVersionArchitectures
trixie20110309.0-8all
stretch20110309.0-5all
buster20110309.0-7all
bullseye20110309.0-8all
bookworm20110309.0-8all
sid20110309.0-8all
jessie20110309.0-3all
Popcon: 6 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

This package provides score_conservation(1), a tool to score protein sequence conservation.

The following conservation scoring methods are implemented:

  • sum of pairs
  • weighted sum of pairs
  • Shannon entropy
  • Shannon entropy with property groupings (Mirny and Shakhnovich 1995, Valdar and Thornton 2001)
  • relative entropy with property groupings (Williamson 1995)
  • von Neumann entropy (Caffrey et al 2004)
  • relative entropy (Samudrala and Wang 2006)
  • Jensen-Shannon divergence (Capra and Singh 2007)

A window-based extension that incorporates the estimated conservation of sequentially adjacent residues into the score for each column is also given. This window approach can be applied to any of the conservation scoring methods.
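
To make the column scoring and the window extension concrete, here is a simplified sketch of Shannon-entropy conservation followed by neighbour smoothing. It is not the code of score_conservation: the function names, the normalisation by the logarithm of a 20-letter alphabet, and the linear mixing weight are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_conservation(column, alphabet_size=20):
    """Score one alignment column as 1 - normalised Shannon entropy (gaps ignored)."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0  # an all-gap column carries no conservation signal
    n = len(residues)
    entropy = -sum((c / n) * math.log(c / n) for c in Counter(residues).values())
    return 1.0 - entropy / math.log(alphabet_size)


def score_alignment(sequences):
    """Score every column of equal-length aligned sequences."""
    return [shannon_conservation(column) for column in zip(*sequences)]


def window_smooth(scores, half_width=1, weight=0.5):
    """Mix each column score with the mean score of its sequence neighbours."""
    smoothed = []
    for i, score in enumerate(scores):
        lo = max(0, i - half_width)
        hi = min(len(scores), i + half_width + 1)
        neighbours = [scores[j] for j in range(lo, hi) if j != i]
        if neighbours:
            score = weight * score + (1 - weight) * sum(neighbours) / len(neighbours)
        smoothed.append(score)
    return smoothed
```

Since the smoothing step only sees a list of per-column scores, the same function could be applied to the output of any of the scoring methods listed above.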

The program accepts alignments in the CLUSTAL and FASTA formats.

The sequence-specific output can be used as the conservation input for concavity.

Conservation is highly predictive in identifying catalytic sites and residues near bound ligands.

Please cite: John A. Capra and Mona Singh: Predicting functionally important residues from sequence conservation. (PubMed) Bioinformatics 23(15):1875-82 (2007)
coot
model building program for macromolecular crystallography
Versions of package coot
ReleaseVersionArchitectures
sid1.1.09+dfsg-2amd64,arm64,armhf,ppc64el
upstream1.1.10
Popcon: 3 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

This is a program for constructing atomic models of macromolecules from X-ray diffraction data. Coot displays electron density maps and molecular models and allows model manipulations such as idealization, refinement, manual rotation/translation, rigid-body fitting, ligand search, solvation, mutations and rotamers. Validation tools such as Ramachandran and geometry plots are available to the user. This package provides a Coot build with embedded Python support.

Please cite: P. Emsley, B. Lohkamp, W. G. Scott and K. Cowtan: Features and development of Coot. (eprint) Acta Crystallographica Section D 66(4):486-501 (2010)
covtobed
convert the coverage track from a BAM file into a BED file
Versions of package covtobed
ReleaseVersionArchitectures
bullseye1.2.0+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.3.5+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.3.5+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.3.5+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Reads one (or more) alignment files (sorted BAM) and prints a BED with the coverage. It will join consecutive bases with the same coverage, and can be used to only print a BED file with the regions having a specific coverage range.
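
The joining of consecutive bases with equal coverage amounts to run-length encoding of a per-base depth track. The sketch below shows that idea on a plain list of depths; it does not read BAM files and is not covtobed's implementation. The function name, the parameter names and the inclusive handling of the coverage bounds are assumptions.

```python
def coverage_to_bed(chrom, depths, min_cov=0, max_cov=None):
    """Collapse per-base depths into (chrom, start, end, coverage) intervals.

    Coordinates follow the BED convention: 0-based start, exclusive end.
    """
    intervals = []
    start = 0
    for pos in range(1, len(depths) + 1):
        # Close the current run at the end of the track or when the depth changes.
        if pos == len(depths) or depths[pos] != depths[start]:
            cov = depths[start]
            if cov >= min_cov and (max_cov is None or cov <= max_cov):
                intervals.append((chrom, start, pos, cov))
            start = pos
    return intervals
```

Setting min_cov or max_cov keeps only the runs whose depth falls inside the requested range, which mirrors the coverage-range filtering described above.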

Please cite: Giovanni Birolo and Andrea Telatin: covtobed: a simple and fast tool to extract coverage tracks from BAM files. Journal of Open Source Software 5(47):2119 (2020)
Registry entries: Bioconda 
crac
integrated RNA-Seq read analysis
Versions of package crac
ReleaseVersionArchitectures
trixie2.5.2+dfsg-6amd64,arm64,mips64el,ppc64el,riscv64
bookworm2.5.2+dfsg-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el
stretch2.5.0+dfsg-1amd64
sid2.5.2+dfsg-6amd64,arm64,mips64el,ppc64el,riscv64
buster2.5.0+dfsg-3amd64,arm64
bullseye2.5.2+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

CRAC is a tool to analyze High Throughput Sequencing (HTS) data in comparison to a reference genome. It is intended for transcriptomic and genomic sequencing reads. More precisely, with transcriptomic reads as input, it predicts point mutations, indels, splice junctions, and chimeric RNAs (i.e., non-colinear splice junctions). CRAC can also output the positions and nature of the sequence errors that it detects in the reads.

CRAC uses a genome index, which must be computed before running the read analysis. To build it, run the command "crac-index" on your genome files. You can then process the reads using the command "crac". See the man page of CRAC by typing "man crac".

CRAC requires a large amount of main memory. For processing, say, 50 million reads of 100 nucleotides each against the human genome, CRAC requires about 40 gigabytes of main memory. Check that your computing server is equipped with a sufficient amount of memory before launching an analysis.

Please cite: Eliseos J. Mucaki, Natasha G. Caminsky, Ami M. Perri, Ruipeng Lu, Alain Laederach, Matthew Halvorsen, Joan H. M. Knoll and Peter K. Rogan: A unified analytic framework for prioritization of non-coding variants of uncertain significance in heritable breast and ovarian cancer. (PubMed) BMC Medical Genomics 9:19 (2016)
Registry entries: Bio.tools  SciCrunch 
csb
Computational Structural Biology Toolbox (CSB)
Versions of package csb
ReleaseVersionArchitectures
bullseye1.2.5+dfsg-5all
bookworm1.2.5+dfsg-8all
trixie1.2.5+dfsg-10all
sid1.2.5+dfsg-10all
buster1.2.5+dfsg-3all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Computational Structural Biology Toolbox (CSB) is a Python class library for reading, storing and analyzing biomolecular structures in a variety of formats with rich support for statistical analyses.

CSB is designed for reusability and extensibility and comes with a clean, well-documented API following good object-oriented engineering practice.

This package contains some user executable tools.

Registry entries: SciCrunch  Bioconda 
ctffind
fast and accurate defocus estimation from electron micrographs
Versions of package ctffind
ReleaseVersionArchitectures
sid4.1.14-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie4.1.14-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream5.0.2
Popcon: 1 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

This is a widely-used program for the estimation of objective lens defocus parameters from transmission electron micrographs. Defocus parameters are estimated by fitting a model of the microscope's contrast transfer function (CTF) to an image's amplitude spectrum.

Please cite: Alexis Rohou and Nikolaus Grigorieff: CTFFIND4: Fast and accurate defocus estimation from electron micrographs. (PubMed) Journal of Structural Biology 192(2):216-221 (2015)
Registry entries: SciCrunch 
cutadapt
Clean biological sequences from high-throughput sequencing reads
Versions of package cutadapt
ReleaseVersionArchitectures
buster1.18-1all
trixie4.7-2all
stretch1.12-2all
sid4.7-2all
bullseye3.2-2all
bookworm4.2-1all
upstream4.9
Popcon: 9 users (71 upd.)*
Newer upstream!
License: DFSG free
Git

Cutadapt helps with biological sequence cleaning tasks by finding adapter or primer sequences in an error-tolerant way. It can also modify and filter reads in various ways. Adapter sequences can contain IUPAC wildcard characters. Paired-end reads and even colorspace data are also supported. If you want, you can also just demultiplex your input data, without removing adapter sequences at all.
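
The idea of error-tolerant adapter removal can be sketched in a few lines. This is not Cutadapt's algorithm, which aligns the adapter to the read and also tolerates insertions and deletions. The simplified version below allows substitutions only, treats only N as a wildcard, and uses a hypothetical function name and parameters.

```python
def trim_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Remove a 3' adapter from a read, tolerating a few mismatches.

    The adapter may be only partially present at the end of the read.
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - start)
        mismatches = sum(
            1
            for a, b in zip(read[start:start + overlap], adapter[:overlap])
            if a != b and "N" not in (a, b)  # N matches any base
        )
        # Allowed errors scale with the length of the compared region.
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]
    return read
```

For example, trimming the adapter AGATCGGAAGAGC from ACGTACGTAGATCGGAAGAGC leaves ACGTACGT, and the same result is obtained if one adapter base in the read is miscalled.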

This package contains the user interface.

The package is enhanced by the following packages: multiqc
Please cite: Marcel Martin: Cutadapt removes adapter sequences from high-throughput sequencing reads. (eprint) EMBnet.journal 17(1):10-12 (2011)
Registry entries: Bio.tools  SciCrunch  Bioconda 
cutesv
comprehensive discovery of structural variations of genomic sequences
Versions of package cutesv
ReleaseVersionArchitectures
bookworm2.0.2-1all
trixie2.1.0-2all
sid2.1.1-1all
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Long-read sequencing enables the comprehensive discovery of structural variations (SVs). However, it is still non-trivial to achieve high sensitivity and performance simultaneously due to the complex SV characteristics implied by noisy long reads.

cuteSV is a sensitive, fast and scalable long-read-based SV detection approach. cuteSV uses tailored methods to collect the signatures of various types of SVs and employs a clustering-and-refinement method to analyze the signatures to implement sensitive SV detection. Benchmarks on real Pacific Biosciences (PacBio) and Oxford Nanopore Technology (ONT) datasets demonstrate that cuteSV has better yields and scalability than state-of-the-art tools.

Please cite: Tao Jiang, Yongzhuang Liu, Yue Jiang, Junyi Li, Yan Gao, Zhe Cui, Yadong Liu, Bo Liu and Yadong Wang: Long-read-based human genomic structural variation detection with cuteSV. (PubMed,eprint) Genome Biology 21(1):189 (2020)
Registry entries: Bio.tools  Bioconda 
daligner
local alignment discovery between long nucleotide sequencing reads
Versions of package daligner
ReleaseVersionArchitectures
buster1.0+git20180524.fd21879-1amd64,arm64,armhf,i386
stretch1.0+20161119-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye1.0+git20200727.ed40ce5-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.0+git20221215.bd26967-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.0+git20240119.335105d-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.0+git20240119.335105d-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

These tools permit one to find all significant local alignments between reads encoded in a Dazzler database. The assumption is that the reads are from a Pacific Biosciences RS II long read sequencer. That is, the reads are long and noisy, with error rates of up to 15% on average.

Please cite: Gene Myers: Efficient Local Alignment Discovery amongst Noisy Long Reads. 8701:52-67 (2014)
Registry entries: SciCrunch  Bioconda 
damapper
long read to reference genome mapping tool
Versions of package damapper
ReleaseVersionArchitectures
trixie0.0+git20240314.b025cf9-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.0+git20200322.b2c9d7f-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.0+git20240314.b025cf9-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.0+git20210330.ab45103-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Recognised as the Damapper Library, this is a long read to reference genome mapping command line tool.

For a given reference database 'X' and read block 'Y', damapper produces the single file 'Y.X.las'. Each output file is sorted in order of the A-reads, and if a match is a chain of local alignments, then the LA's in the chain occur in increasing order of A-coordinates.

HPC.damapper writes a UNIX shell script to the standard output that maps every read in a range of blocks, from a first to a last block, of a read database to a reference sequence. If the last block is omitted, then only the single first block is mapped, and if the first block is also omitted, then all blocks of the database are mapped.

This package contains the damapper and HPC.damapper binaries.

datamash
statistics tool for command-line interface
Versions of package datamash
ReleaseVersionArchitectures
sid1.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.7-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch1.0.7-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster1.4-1amd64,arm64,armhf,i386
bullseye1.7-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie1.0.6-2amd64,armel,armhf,i386
Popcon: 44 users (11 upd.)*
Versions and Archs
License: DFSG free
Git

GNU Datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files. It is designed to be portable and reliable, and aid researchers to easily automate analysis pipelines, without writing code or even short scripts.

Registry entries: SciCrunch 
dawg
simulate the evolution of recombinant DNA sequences
Versions of package dawg
ReleaseVersionArchitectures
buster1.2-2amd64,arm64,armhf,i386
stretch1.2-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm1.2-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.2-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye1.2-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.2-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

DNA Assembly with Gaps (Dawg) is an application designed to simulate the evolution of recombinant DNA sequences in continuous time based on the robust general time reversible model with gamma and invariant rate heterogeneity and a novel length-dependent model of gap formation. The application accepts phylogenies in Newick format and can return the sequence of any node, allowing for the exact evolutionary history to be recorded at the discretion of users. Dawg records the gap history of every lineage to produce the true alignment in the output. Many options are available to allow users to customize their simulations and results.

Please cite: Reed A. Cartwright: DNA assembly with gaps (Dawg): simulating sequence evolution. (PubMed,eprint) Bioinformatics 21(Suppl 3):iii31-iii38 (2005)
Registry entries: Bioconda 
dazzdb
manage nucleotide sequencing read data
Versions of package dazzdb
ReleaseVersionArchitectures
sid1.0+git20240115.be65e59-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.0+git20240115.be65e59-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.0+git20221215.aad3a46-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye1.0+git20201103.8d98c37-1+deb11u1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.0+git20180908.0bd5e07-1amd64,arm64,armhf,i386
stretch1.0+20161112-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

To facilitate the multiple phases of the dazzler assembler, all the read data is organized into what is effectively a database of the reads and their meta-information. The design goals for this database are as follows:

  • The database stores the source PacBio read information in such a way that it can re-create the original input data, thus permitting a user to remove the (effectively redundant) source files. This avoids duplicating the same data, once in the source file and once in the database.
  • The database can be built up incrementally, that is, new sequence data can be added to the database over time.
  • The database flexibly allows one to store any meta-data desired for reads. This is accomplished with the concept of tracks that implementors can add as they need them.
  • The data is held in a compressed form equivalent to the .dexta and .dexqv files of the data extraction module. Both the .fasta and .quiva information for each read is held in the database and can be recreated from it. The .quiva information can be added separately and later on if desired.
  • To facilitate job parallel, cluster operation of the phases of the assembler, the database has a concept of a current partitioning in which all the reads that are over a given length and optionally unique to a well, are divided up into blocks containing roughly a given number of bases, except possibly the last block which may have a short count. Often programs can be run on blocks or pairs of blocks and each such job is reasonably well balanced as the blocks are all the same size. One must be careful about changing the partition during an assembly as doing so can void the structural validity of any interim block-based results.
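
The partitioning idea in the last point can be sketched as follows. This is an illustration of the concept only, not the Dazzler database code: the function name, the parameters and the exact cutoff rule (reads shorter than min_length are skipped) are assumptions.

```python
def partition_reads(read_lengths, block_bases, min_length=0):
    """Group read indices into consecutive blocks of roughly block_bases bases."""
    blocks = []
    current, current_bases = [], 0
    for index, length in enumerate(read_lengths):
        if length < min_length:
            continue  # reads below the length cutoff stay out of the partition
        current.append(index)
        current_bases += length
        if current_bases >= block_bases:
            blocks.append(current)
            current, current_bases = [], 0
    if current:
        blocks.append(current)  # the final block may hold fewer bases
    return blocks
```

Because every block except possibly the last holds a similar number of bases, jobs that run on blocks or pairs of blocks do comparable amounts of work.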
Registry entries: Bioconda 
deblur
deconvolution for Illumina amplicon sequencing
Versions of package deblur
ReleaseVersionArchitectures
sid1.1.1-2all
Popcon: users ( upd.)*
Versions and Archs
License: DFSG free
Git

Deblur is a greedy deconvolution algorithm for amplicon sequencing based on Illumina Miseq/Hiseq error profiles. The authors recommend using Deblur via the QIIME2 plugin q2-deblur. Examples of its use can be found within the plugin itself. However, Deblur itself does not depend on QIIME2.

The input to Deblur workflow is a directory of FASTA or FASTQ files (1 per sample) or a single demultiplexed FASTA or FASTQ file. These files can be gzip'd. The output directory will contain three BIOM tables in which the observation IDs are the Deblurred sequences. The outputs are contingent on the reference databases used and a more focused discussion on them is in the subsequent README section titled "Positive and Negative Filtering." The output files are as follows:

  • reference-hit.biom : contains only Deblurred reads matching the positive filtering database. By default, a reference composed of 16S sequences is used, and the resulting table retains only those reads which recruit to it at a coarse level. Reads are also filtered against the negative reference, which by default will remove any read which appears to be PhiX or adapter.

  • reference-hit.seqs.fa : a fasta file containing all the sequences in reference-hit.biom

  • reference-non-hit.biom : contains only Deblurred reads that did not align to the positive filtering database. Negative filtering is also applied to this table, so by default, PhiX and adapter are removed.

  • reference-non-hit.seqs.fa : a fasta file containing all the sequences in reference-non-hit.biom

  • all.biom : contains all Deblurred reads. This file represents the union of the "reference-hit.biom" and "reference-non-hit.biom" tables.

  • all.seqs.fa : a fasta file containing all the sequences in all.biom

Deblur uses two types of filtering on the sequences:

  • Negative mode - removes known artifact sequences (i.e. sequences aligning to PhiX or Adapter with >=95% identity and coverage).

  • Positive mode - keeps only sequences similar to a reference database (by default known 16S sequences). SortMeRNA is used, and any sequence with an e-value <= 10 is retained. Deblur also outputs a BIOM table without this positive filtering step (named all.biom).

The FASTA files for both of these filtering steps can be supplied via the --neg-ref-fp and --pos-ref-fp options. By default, the negative database is composed of PhiX and adapter sequence and the positive database of known 16S sequences.

Deblur uses negative mode filtering to remove known artifacts (i.e. PhiX and adapter sequences) prior to denoising. The output of Deblur contains three files: all.biom, which includes all sOTUs; reference-hit.biom, which contains the output of positive filtering of the sOTUs (by default only sOTUs similar to 16S sequences); and reference-non-hit.biom, which contains only sOTUs failing the positive filtering (by default only non-16S sOTUs).

deepnano
alternative basecaller for MinION reads of genomic sequences
Versions of package deepnano
ReleaseVersionArchitectures
bullseye0.0+git20170813.e8a621e-3.1amd64,arm64,armhf,i386,ppc64el,s390x
buster0.0+git20170813.e8a621e-3amd64,arm64,i386
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

DeepNano is an alternative basecaller for Oxford Nanopore MinION reads based on deep recurrent neural networks.

Currently it works with SQK-MAP-006 and SQK-MAP-005 chemistry and as a postprocessor for Metrichor.

Please cite: Vladimír Boža, Broňa Brejová and Tomáš Vinař: DeepNano: Deep recurrent neural networks for base calling in MinION nanopore reads. PLOS one (2017)
Remark of Debian Med team: There is no intent to continue the existing packaging since the program nanocall seems to serve the intended purpose better.

delly
Structural variant discovery by read analysis
Versions of package delly
ReleaseVersionArchitectures
bullseye0.8.7-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.1.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster0.8.1-2amd64,arm64,armhf
bookworm1.1.6-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.1.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream1.2.6
Popcon: 2 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

Delly performs structural variant discovery by integrated paired-end and split-read analysis. It discovers, genotypes and visualizes deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends, split-reads and read-depth to sensitively and accurately delineate genomic rearrangements throughout the genome.

Please cite: Tobias Rausch, Thomas Zichner, Andreas Schlattl, Adrian M. Stuetz, Vladimir Benes and Jan O. Korbel: DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28:i333-i339 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
density-fitness
Calculates per-residue electron density scores
Versions of package density-fitness
ReleaseVersionArchitectures
bullseye1.0.0-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.0.8-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.0.8-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream1.0.11
Popcon: 2 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

The program density-fitness calculates electron density metrics, for main- (includes Cβ atom) and side-chain atoms of individual residues.

For this calculation, the program uses the structure model in either PDB or mmCIF format and the electron density from the 2mFo-DFc and mFo-DFc maps. If these maps are not readily available, the MTZ file and model can be used to calculate maps with clipper. Density-fitness supports both X-ray and electron diffraction data.

This program is essentially a reimplementation of edstats, a program available from the CCP4 suite. However, the output now contains only the RSR, SRSR and RSCC fields as in edstats with the addition of EDIAm and OPIA and no longer requires pre-calculated map coefficients.

Please cite: I. J. Tickle: Statistical quality indicators for electron-density maps. Acta Cryst. (D68):454-467 (2012)
dextractor
(d)extractor and compression command library
Versions of package dextractor
Release  Version  Architectures
bullseye  1.0-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.0-6  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie  1.0-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid  1.0-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Dextractor commands allow one to pull exactly and only the information needed for assembly and reconstruction from the source HDF5 files produced by the PacBio RS II sequencer, or from the source BAM files produced by the PacBio Sequel sequencer.

For each of the three extracted file types -- fasta, quiva, and arrow -- the library contains commands to compress the given file type, and to decompress it, which is a reversible process delivering the original uncompressed file. The compressed .fasta files, with the extension .dexta, consume 1/4 byte per base. The compressed .quiva files, with the extension .dexqv, consume 1.5 bytes per base on average, and the compressed .arrow files, with the extension .dexar, consume 1/4 byte per base.
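
As a rough illustration of the per-base figures above, the following sketch estimates compressed file sizes from a base count. The function name and the example base count are hypothetical, the .dexqv rate is only an average, and real files may carry additional header data that is not modelled here.

```python
from fractions import Fraction

# Bytes per base as quoted in the package description.
# The .dexqv rate is an average, so its estimate is only approximate.
BYTES_PER_BASE = {
    ".dexta": Fraction(1, 4),
    ".dexqv": Fraction(3, 2),
    ".dexar": Fraction(1, 4),
}


def estimate_compressed_bytes(num_bases: int, extension: str) -> int:
    """Return the estimated payload size in bytes, rounded up."""
    size = BYTES_PER_BASE[extension] * num_bases
    # Ceiling division: a partially used byte still occupies a whole byte.
    return -(-size.numerator // size.denominator)


if __name__ == "__main__":
    bases = 1_000_000  # hypothetical data set size
    for ext in BYTES_PER_BASE:
        print(ext, estimate_compressed_bytes(bases, ext))
```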

For more information, please view the available documentation at https://github.com/thegenemyers/DEXTRACTOR

Registry entries: Bioconda 
dialign
Segment-based multiple sequence alignment
Versions of package dialign
Release  Version  Architectures
jessie  2.2.1-7  amd64,armel,armhf,i386
stretch  2.2.1-8  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster  2.2.1-10  amd64,arm64,armhf,i386
bullseye  2.2.1-11  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  2.2.1-11  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie  2.2.1-13  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid  2.2.1-13  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Debtags of package dialign:
biology: format:aln, nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing
works-with-format: plaintext
Popcon: 13 users (14 upd.)*
Versions and Archs
License: DFSG free
Git

DIALIGN2 is a command line tool to perform multiple alignment of protein or DNA sequences. It constructs alignments from gapfree pairs of similar segments of the sequences. This scoring scheme for alignments is the basic difference between DIALIGN and other global or local alignment methods. Note that DIALIGN does not employ any kind of gap penalty.
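
To make the notion of a gap-free segment pair concrete, here is a deliberately simplified sketch that lists maximal runs of identical residues along each diagonal of two sequences. It is not DIALIGN's algorithm: DIALIGN scores similar (not only identical) segments and then selects a consistent set of them, whereas the function name, the identity criterion and the `min_len` parameter below are assumptions made for illustration.

```python
def gapfree_matches(a: str, b: str, min_len: int = 3):
    """Return (start_a, start_b, length) for maximal identical gap-free runs."""
    results = []
    # Each diagonal is identified by offset = i - j.
    for offset in range(-(len(b) - 1), len(a)):
        i = max(offset, 0)
        j = i - offset
        run = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                run += 1
            else:
                if run >= min_len:
                    results.append((i - run, j - run, run))
                run = 0
            i += 1
            j += 1
        # A run may extend to the end of the diagonal.
        if run >= min_len:
            results.append((i - run, j - run, run))
    return results


if __name__ == "__main__":
    print(gapfree_matches("ACGTTT", "GGACGT"))
```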

Please cite: Burkhard Morgenstern: DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. (PubMed,eprint) Bioinformatics 15(3):211-218 (1999)
Registry entries: Bio.tools  SciCrunch  Bioconda 
dialign-tx
Segment-based multiple sequence alignment
Versions of package dialign-tx
Release  Version  Architectures
buster  1.0.2-12  amd64,arm64,armhf,i386
stretch  1.0.2-9  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm  1.0.2-14  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie  1.0.2-15  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie  1.0.2-7  amd64,armel,armhf,i386
sid  1.0.2-15  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  1.0.2-13  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package dialign-tx:
field: biology, biology:bioinformatics
role: program
scope: utility
use: comparing
works-with-format: plaintext
Popcon: 11 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

DIALIGN-TX is a command line tool to perform multiple alignment of protein or DNA sequences. It is a complete reimplementation of the segment-based approach including several new improvements and heuristics that significantly enhance the quality of the output alignments compared to DIALIGN 2.2 and DIALIGN-T. For pairwise alignment, DIALIGN-TX uses a fragment-chaining algorithm that favours chains of low-scoring local alignments over isolated high-scoring fragments. For multiple alignment, DIALIGN-TX uses an improved greedy procedure that is less sensitive to spurious local sequence similarities.

The package is enhanced by the following packages: dialign-tx-data
Please cite: Amarendran R. Subramanian, Michael Kaufmann and Burkhard Morgenstern: DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. (PubMed) Algorithms for Molecular Biology 3(1):6 (2008)
Registry entries: Bio.tools  SciCrunch  Bioconda 
diamond-aligner
accelerated BLAST compatible local sequence aligner
Versions of package diamond-aligner
Release  Version  Architectures
bookworm  2.1.3-1  amd64,arm64,ppc64el,s390x
bullseye  2.0.7-1  amd64,arm64,ppc64el,s390x
sid  2.1.9-1  amd64,arm64,ppc64el,riscv64,s390x
trixie  2.1.9-1  amd64,arm64,ppc64el,riscv64,s390x
buster  0.9.24+dfsg-1  amd64
stretch-backports  0.9.22+dfsg-2~bpo9+1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

DIAMOND is a sequence aligner for protein and translated DNA searches and functions as a drop-in replacement for the NCBI BLAST software tools. It is suitable for protein-protein search as well as DNA-protein search on short reads and longer sequences including contigs and assemblies, providing a speedup over BLAST of up to 20,000x.

Please cite: Benjamin Buchfink, Chao Xie and Daniel H Huson: Fast and sensitive protein alignment using DIAMOND. (PubMed) Nature methods 12(1):59-60 (2015)
Registry entries: Bio.tools  SciCrunch  Bioconda 
discosnp
discovering Single Nucleotide Polymorphism from raw set(s) of reads
Versions of package discosnp
Release  Version  Architectures
buster  2.3.0-2  amd64,arm64,i386
sid  2.6.2-3  amd64,arm64,mips64el,ppc64el,riscv64
trixie  2.6.2-3  amd64,arm64,mips64el,ppc64el,riscv64
bookworm  2.6.2-2  amd64,arm64,mips64el,ppc64el
jessie  1.2.5-1  amd64,armel,armhf,i386
stretch  1.2.6-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye  4.4.4-1  amd64,arm64,i386,mips64el,ppc64el,s390x
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

discoSnp is designed for discovering Single Nucleotide Polymorphisms (SNPs) from raw set(s) of reads obtained with Next Generation Sequencers (NGS).

Note that the number of input read sets is not constrained: it can be one, two, or more. Note also that no other data, such as a reference genome or annotations, are needed.

The software is composed of two modules. The first module, kissnp2, detects SNPs from read sets. The second module, kissreads, enhances the kissnp2 results by computing, per read set and for each SNP found:

 1) its mean read coverage
 2) the (phred) quality of reads generating the polymorphism.

This program is superseded by DiscoSnp++.

Registry entries: Bio.tools  SciCrunch  Bioconda 
disulfinder
cysteines disulfide bonding state and connectivity predictor
Versions of package disulfinder
Release  Version  Architectures
stretch  1.2.11-6  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster  1.2.11-8  amd64,arm64,armhf,i386
bullseye  1.2.11-10  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.2.11-12  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie  1.2.11-12  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid  1.2.11-12  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie  1.2.11-4  amd64,armel,armhf,i386
Debtags of package disulfinder:
role: program
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

'disulfinder' is for predicting the disulfide bonding state of cysteines and their disulfide connectivity starting from sequence alone. Disulfide bridges play a major role in the stabilization of the folding process for several proteins. Prediction of disulfide bridges from sequence alone is therefore useful for the study of structural and functional properties of specific proteins. In addition, knowledge about the disulfide bonding state of cysteines may help the experimental structure determination process and may be useful in other genomic annotation tasks.

'disulfinder' predicts disulfide patterns in two computational stages: (1) the disulfide bonding state of each cysteine is predicted by a BRNN-SVM binary classifier; (2) cysteines that are known to participate in the formation of bridges are paired by a Recursive Neural Network to obtain a connectivity pattern.

Please cite: Alessio Ceroni, Andrea Passerini, Alessandro Vullo and Paolo Frasconi: DISULFIND: a disulfide bonding state and cysteine connectivity prediction server. (PubMed) Nucleic Acids Res 34(Web Server issue):W177-181 (2006)
Registry entries: Bio.tools  SciCrunch 
dnaclust
tool for clustering millions of short DNA sequences
Versions of package dnaclust
Release  Version  Architectures
bookworm  3-7  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid  3-7  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  3-7  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster  3-6  amd64,arm64,armhf,i386
stretch  3-4  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie  3-2  amd64,armel,armhf,i386
trixie  3-7  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

dnaclust is a tool for clustering large numbers of short DNA sequences. The clusters are created in such a way that the "radius" of each cluster is no more than the specified threshold.
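
The radius constraint can be illustrated with a small greedy sketch: each sequence joins the first cluster whose centre lies within the threshold, and otherwise starts a new cluster. This is an assumption-laden toy, not dnaclust's actual procedure; the use of plain edit distance, the absolute `radius` parameter and the function names are all chosen for illustration.

```python
def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution or match
        prev = cur
    return prev[-1]


def greedy_cluster(seqs, radius):
    """Return a list of (centre, members); every member is within radius of its centre."""
    clusters = []
    for seq in seqs:
        for centre, members in clusters:
            if edit_distance(centre, seq) <= radius:
                members.append(seq)
                break
        else:
            # No existing centre is close enough, so seq becomes a new centre.
            clusters.append((seq, [seq]))
    return clusters


if __name__ == "__main__":
    for centre, members in greedy_cluster(["ACGT", "ACGA", "TTTT", "TTTA"], 1):
        print(centre, members)
```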

The input sequences to be clustered should be in Fasta format. The id of each sequence is based on the first word of the sequence header in the Fasta format. The first word is the prefix of the header up to the first occurrence of a white space character.
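
The id rule described above can be sketched in a few lines. The function name is hypothetical, and the sketch assumes header lines start with ">" as is conventional for Fasta files.

```python
def fasta_ids(lines):
    """Return the id of every record: the header text up to the first whitespace."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            # split(None, 1) splits on any run of whitespace (spaces, tabs, newline).
            fields = line[1:].split(None, 1)
            ids.append(fields[0] if fields else "")
    return ids


if __name__ == "__main__":
    example = [">seq1 some description\n", "ACGT\n", ">seq2\tother text\n", "GG\n"]
    print(fasta_ids(example))
```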

Please cite: Mohammadreza Ghodsi, Bo Liu and Mihai Pop: DNACLUST: accurate and efficient clustering of phylogenetic marker genes. (PubMed,eprint) BMC Bioinformatics 12:271 (2011)
Registry entries: Bio.tools  SciCrunch 
dnarrange
Method to find rearrangements in long DNA reads relative to a genome seq
Versions of package dnarrange
Release  Version  Architectures
trixie  1.5.3-1  all
sid  1.5.3-1  all
bookworm  1.5.3-1  all
upstream  1.6.2
Popcon: 2 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

This package provides utilities to align the reads to the genome, find rearrangements and draw pictures of rearranged groups.

dotter
detailed comparison of two genomic sequences
Versions of package dotter
Release  Version  Architectures
trixie  4.44.1+dfsg-7.1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
buster  4.44.1+dfsg-3  amd64,arm64,armhf,i386
bullseye  4.44.1+dfsg-6  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  4.44.1+dfsg-7  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid  4.44.1+dfsg-7.1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Dotter is a graphical dot-matrix program for detailed comparison of two sequences.

  • Every residue in one sequence is compared to every residue in the other, and a matrix of scores is calculated.
  • One sequence is plotted on the x-axis and the other on the y-axis.
  • Noise is filtered out so that alignments appear as diagonal lines.
  • Pairwise scores are averaged over a sliding window to make the score matrix more intelligible.
  • The averaged score matrix forms a three-dimensional landscape, with the two sequences in two dimensions and the height of the peaks in the third. This landscape is projected onto two dimensions using a grey-scale image: the darker the grey of a peak, the higher the score.
  • The contrast and threshold of the grey-scale image can be adjusted interactively, without having to recalculate the score matrix.
  • An Alignment Tool is provided to examine the sequence alignment that the grey-scale image represents.
  • Known high-scoring pairs can be loaded from a GFF file and overlaid onto the plot.
  • Gene models can be loaded from GFF and displayed alongside the relevant axis.
  • Compare a sequence against itself to find internal repeats.
  • Find overlaps between multiple sequences by making a dot-plot of all sequences versus themselves.
  • Run Dotter in batch mode to create large, time-consuming dot-plots as a background process.
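
The first steps in the list above (score matrix, sliding-window averaging, thresholding) can be sketched as follows. This is a minimal illustration, not Dotter's implementation: the function names, the identity scoring, the choice to average along diagonals and the text rendering in place of a grey-scale image are assumptions made for the example.

```python
def dot_matrix(seq_x, seq_y, match=1.0, mismatch=0.0):
    """Score every residue of seq_y (rows) against every residue of seq_x (columns)."""
    return [[match if a == b else mismatch for a in seq_x] for b in seq_y]


def smooth_diagonals(matrix, window):
    """Average each cell with its neighbours along the diagonal (sliding window)."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    half = window // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for k in range(-half, half + 1):
                rr, cc = r + k, c + k
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += matrix[rr][cc]
                    count += 1
            out[r][c] = total / count  # count >= 1 because k = 0 is always valid
    return out


def render(matrix, threshold):
    """Draw cells at or above the threshold as '#', everything else as '.'."""
    return ["".join("#" if v >= threshold else "." for v in row) for row in matrix]


if __name__ == "__main__":
    smoothed = smooth_diagonals(dot_matrix("ACGT", "ACGT"), window=3)
    print("\n".join(render(smoothed, threshold=0.5)))
```

Changing `threshold` in `render` without recomputing `smoothed` mirrors, in a very small way, the interactive contrast adjustment described in the list.
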
Please cite: Gemma Barson and Ed Griffiths: SeqTools: visual tools for manual analysis of sequence alignments. (PubMed,eprint) BMC Research Notes 9:39 (2016)
Registry entries: Bio.tools  SciCrunch 
drop-seq-tools
analyzing Drop-seq data
Versions of package drop-seq-tools
Release  Version  Architectures
bookworm  2.5.2+dfsg-1  all
sid  3.0.0+dfsg-1  all
bullseye  2.4.0+dfsg-6  all
trixie  3.0.0+dfsg-1  all
upstream  3.0.2
Popcon: 1 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

This software provides the core computational analysis of Drop-seq data, showing how to transform raw sequence data into an expression measurement for each gene in each individual cell.

Registry entries: Bioconda 
dssp
protein secondary structure assignment based on 3D structure
Versions of package dssp
Release  Version  Architectures
buster  3.0.0-3  amd64,arm64,armhf,i386
bullseye  4.0.0-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  4.2.2-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie  2.2.1-2  amd64,armel,armhf,i386
stretch  2.2.1-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie  4.2.2-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid  4.2.2-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream  4.4.7
Popcon: 8 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

DSSP is an application for assigning the secondary structure of a protein based on its solved three-dimensional (3D) structure.

This version (4.2) of DSSP is a rewrite that writes annotated mmCIF files by default but can still produce the older dssp format. Support for PP helices is also new.

Registry entries: Bio.tools  SciCrunch 
dwgsim
short sequencing read simulator
Versions of package dwgsim
Release  Version  Architectures
stretch  0.1.11-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid  0.1.14-3  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  0.1.12-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster  0.1.12-2  amd64,arm64,armhf
bookworm  0.1.14-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie  0.1.14-3  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

DWGSIM simulates short sequencing reads from modern sequencing platforms. DWGSIM generates base error rates using a parametric model, allowing a more realistic error profile. It was originally developed for use in evaluating short read aligners.

Registry entries: SciCrunch  Bioconda 
e-mem
Efficient computation of Maximal Exact Matches for very large genomes
Versions of package e-mem
Release  Version  Architectures
buster  1.0.1-2  amd64,arm64,armhf,i386
bullseye  1.0.1-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.0.1-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie  1.0.1-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid  1.0.1-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

E-MEM enables efficient computation of Maximal Exact Matches (MEMs) without using full text indexes. The algorithm uses much less space and is highly amenable to parallelization. It can compute all MEMs of minimum length 100 between the whole human and mouse genomes on a 12 core machine in 10 min and 2 GB of memory; the required memory can be as low as 600 MB. It can run efficiently on genomes of any size. Extensive testing and comparison with the best current algorithms is provided.
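
For readers unfamiliar with the term, a maximal exact match is an exact match between two sequences that cannot be extended in either direction. The sketch below finds MEMs by brute force; it only illustrates the definition and is not the E-MEM algorithm, which is designed to avoid exactly this quadratic scan. The function name and the small example sequences are assumptions.

```python
def maximal_exact_matches(ref: str, query: str, min_len: int):
    """Return (ref_start, query_start, length) for every MEM of length >= min_len."""
    mems = []
    for i in range(len(ref)):
        for j in range(len(query)):
            if ref[i] != query[j]:
                continue
            # Skip positions that could be extended to the left: not left-maximal.
            if i > 0 and j > 0 and ref[i - 1] == query[j - 1]:
                continue
            # Extend to the right as far as the sequences agree.
            length = 0
            while (i + length < len(ref) and j + length < len(query)
                   and ref[i + length] == query[j + length]):
                length += 1
            if length >= min_len:
                mems.append((i, j, length))
    return mems


if __name__ == "__main__":
    print(maximal_exact_matches("GATTACA", "TTACAGATT", min_len=3))
```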

MUMmer has many different scripts in which one of the key programs is MEM computation. In all of these scripts, the MEM computation program can easily be replaced with e-mem for better performance.

Please cite: Nilesh Khiste and Lucian Ilie: E-MEM: efficient computation of maximal exact matches for very large genomes. (PubMed,eprint) Bioinformatics 31(4):509-514 (2015)
ea-utils
command-line tools for processing biological sequencing data
Versions of package ea-utils
Release  Version  Architectures
trixie  1.1.2+dfsg-9  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
stretch  1.1.2+dfsg-4  amd64,arm64,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster  1.1.2+dfsg-5  amd64,arm64,armhf,i386
bullseye  1.1.2+dfsg-6  amd64,arm64,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid  1.1.2+dfsg-9  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.1.2+dfsg-9  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 15 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Ea-utils provides a set of command-line tools for processing biological sequencing data, barcode demultiplexing, adapter trimming, etc.

Primarily written to support an Illumina-based pipeline, but should work with any FASTQ files.

Main Tools are:

  • fastq-mcf Scans a sequence file for adapters, and, based on a log-scaled threshold, determines a set of clipping parameters and performs clipping. Also does skewing detection and quality filtering.

  • fastq-multx Demultiplexes a fastq. Capable of auto-determining barcode IDs based on a master set of fields. Keeps multiple reads in-sync during demultiplexing. Can verify that the reads are in-sync as well, and fail if they're not.

  • fastq-join Similar to audy's stitch program, but in C, more efficient and supports some automatic benchmarking and tuning. It uses the same "squared distance for anchored alignment" as other tools.

  • varcall Takes a pileup and calculates variants in a more easily parameterized manner than some other tools.

Please cite: Erik Aronesty: Comparison of Sequencing Utility Programs. (eprint) The Open Bioinformatics Journal 7:1-8 (2013)
Registry entries: Bio.tools  SciCrunch 
ecopcr
estimate PCR barcode primers quality
Versions of package ecopcr
Release  Version  Architectures
trixie  1.0.1+dfsg-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch  0.5.0+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster  1.0.1+dfsg-1  amd64,arm64,armhf,i386
bullseye  1.0.1+dfsg-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.0.1+dfsg-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid  1.0.1+dfsg-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

DNA barcoding is a tool for characterizing the species origin using a short sequence from a standard, agreed-upon position in the genome. To be used as a DNA barcode, a genome locus should vary only to a minor degree among individuals of the same species and should vary very quickly among species. From a practical point of view, a barcode locus should be flanked by two conserved regions to design PCR primers. Several manually discovered barcode loci like COI, rbcL, 18S, 16S and 23S rDNA, or trnH-ps are routinely used today, but no objective function has been described to measure their quality in terms of universality (barcode coverage, Bc) or in terms of taxonomical discrimination capacity (barcode specificity, Bs).

ecoPCR is an electronic PCR software developed by LECA and Helix-Project. It helps to estimate the quality of barcode primers. In conjunction with OBITools you can postprocess ecoPCR output to compute barcode coverage and barcode specificity. New barcode primers can be developed using the ecoPrimers software.
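
The idea of an electronic PCR can be sketched as a search for a forward primer site followed downstream by the reverse complement of the reverse primer, returning the sequence in between. This is a toy under stated assumptions, not ecoPCR's implementation: the function names, the mismatch tolerance, the `max_len` limit, the single-strand search and the example sequence are all hypothetical, and only the letters A, C, G and T are handled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, primer: str, max_mismatches: int):
    """Return start positions where primer matches seq with few enough mismatches."""
    sites = []
    for i in range(len(seq) - len(primer) + 1):
        window = seq[i:i + len(primer)]
        if sum(a != b for a, b in zip(window, primer)) <= max_mismatches:
            sites.append(i)
    return sites


def in_silico_pcr(seq, forward, reverse, max_mismatches=0, max_len=1000):
    """Return the regions lying between a forward site and a downstream reverse site."""
    rc_reverse = reverse_complement(reverse)
    amplified = []
    for f in find_sites(seq, forward, max_mismatches):
        start = f + len(forward)
        for r in find_sites(seq, rc_reverse, max_mismatches):
            if r >= start and r - start <= max_len:
                amplified.append(seq[start:r])
    return amplified


if __name__ == "__main__":
    template = "TTACGTACGGGGCCCCTTGCATT"  # hypothetical template sequence
    print(in_silico_pcr(template, "ACGTAC", "ATGCAA"))
```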

Registry entries: Bioconda 
edtsurf
triangulated mesh surfaces for protein structures
Versions of package edtsurf
Release  Version  Architectures
trixie  0.2009-10  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid  0.2009-10  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch  0.2009-4  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster  0.2009-6  amd64,arm64,armhf,i386
jessie  0.2009-3  amd64,armel,armhf,i386
bullseye  0.2009-10  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  0.2009-10  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

EDTSurf is an open source program to construct triangulated surfaces for macromolecules. It generates three major macromolecular surfaces: van der Waals surface, solvent-accessible surface and molecular surface (solvent-excluded surface). EDTSurf also identifies cavities inside macromolecules.

Please cite: Dong Xu and Yang Zhang: Generating Triangulated Macromolecular Surfaces by Euclidean Distance Transform. (PubMed,eprint) PLoS ONE 4(12):e8140 (2009)
eigensoft
reduction of population bias for genetic analyses
Versions of package eigensoft
Release  Version  Architectures
trixie  8.0.0+dfsg-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  8.0.0+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid  8.0.0+dfsg-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster  7.2.1+dfsg-1  amd64,arm64,armhf,i386
bullseye  7.2.1+dfsg-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch  6.1.4+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The EIGENSOFT package combines functionality from the group's population genetics methods (Patterson et al. 2006) and their EIGENSTRAT stratification method (Price et al. 2006). The EIGENSTRAT method uses principal components analysis to explicitly model ancestry differences between cases and controls along continuous axes of variation; the resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The EIGENSOFT package has a built-in plotting script and supports multiple file formats and quantitative phenotypes.

Please cite: Alkes L. Price, Nick J. Patterson, Robert M. Plenge, Michael E. Weinblatt, Nancy A. Shadick and David Reich: Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics 38:904-909 (2006)
Registry entries: Bio.tools  SciCrunch  Bioconda 
elph
DNA/protein sequence motif finder
Versions of package elph
Release  Version  Architectures
bullseye  1.0.1-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.0.1-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie  1.0.1-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid  1.0.1-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster  1.0.1-2  amd64,arm64,armhf,i386
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

ELPH (Estimated Locations of Pattern Hits) is a general-purpose Gibbs sampler for finding motifs in a set of DNA or protein sequences. The program takes as input a set containing anywhere from a few dozen to thousands of sequences, and searches through them for the most common motif, assuming that each sequence contains one copy of the motif. ELPH was used to find patterns such as ribosome binding sites (RBSs) and exon splicing enhancers (ESEs).

embassy-domainatrix
Extra EMBOSS commands to handle domain classification file
Versions of package embassy-domainatrix
Release  Version  Architectures
trixie  0.1.660-5  amd64,arm64,mips64el,ppc64el,riscv64
sid  0.1.660-5  amd64,arm64,mips64el,ppc64el,riscv64
jessie  0.1.650-1  amd64,armel,armhf,i386
stretch  0.1.660-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye  0.1.660-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster  0.1.660-3  amd64,arm64,armhf,i386
bookworm  0.1.660-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package embassy-domainatrix:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, converting, editing, searching
works-with-format: plaintext
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The DOMAINATRIX programs were developed by Jon Ison and colleagues at MRC HGMP for their protein domain research. They are included as an EMBASSY package as a work in progress.

Applications in the current domainatrix release are cathparse (generates DCF file from raw CATH files), domainnr (removes redundant domains from a DCF file), domainreso (removes low resolution domains from a DCF file), domainseqs (adds sequence records to a DCF file), domainsse (adds secondary structure records to a DCF file), scopparse (generates DCF file from raw SCOP files) and ssematch (searches a DCF file for secondary structure matches).

embassy-domalign
Extra EMBOSS commands for protein domain alignment
Versions of package embassy-domalign
Release  Version  Architectures
bookworm  0.1.660-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster  0.1.660-3  amd64,arm64,armhf,i386
trixie  0.1.660-5  amd64,arm64,mips64el,ppc64el,riscv64
jessie  0.1.650-1  amd64,armel,armhf,i386
sid  0.1.660-5  amd64,arm64,mips64el,ppc64el,riscv64
bullseye  0.1.660-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch  0.1.660-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Debtags of package embassy-domalign:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing, editing
works-with-format: plaintext
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The DOMALIGN programs were developed by Jon Ison and colleagues at MRC HGMP for their protein domain research. They are included as an EMBASSY package as a work in progress.

Applications in the current domalign release are allversusall (sequence similarity data from all-versus-all comparison), domainalign (generates alignments (DAF file) for nodes in a DCF file), domainrep (reorders DCF file to identify representative structures) and seqalign (extend alignments (DAF file) with sequences (DHF file)).

embassy-domsearch
Extra EMBOSS commands to search for protein domains
Versions of package embassy-domsearch
Release  Version  Architectures
buster  0.1.660-3  amd64,arm64,armhf,i386
stretch  0.1.660-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie  0.1.650-1  amd64,armel,armhf,i386
trixie  0.1.660-4  amd64,arm64,mips64el,ppc64el,riscv64
sid  0.1.660-4  amd64,arm64,mips64el,ppc64el,riscv64
bookworm  0.1.660-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  0.1.660-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package embassy-domsearch:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

The DOMSEARCH programs were developed by Jon Ison and colleagues at MRC HGMP for their protein domain research. They are included as an EMBASSY package as a work in progress.

Applications in this DOMSEARCH release are seqfraggle (removes fragment sequences from DHF files), seqnr (removes redundancy from DHF files), seqsearch (generates PSI-BLAST hits (DHF file) from a DAF file), seqsort (removes ambiguously classified sequences from DHF files) and seqwords (generates DHF files from keyword search of UniProt).

emboss
European molecular biology open software suite
Versions of package emboss
Release  Version  Architectures
bullseye  6.6.0+dfsg-9  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie  6.6.0+dfsg-15  amd64,arm64,mips64el,ppc64el,riscv64
buster  6.6.0+dfsg-7  amd64,arm64,armhf,i386
jessie  6.6.0+dfsg-1  amd64,armel,armhf,i386
stretch  6.6.0+dfsg-6  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid  6.6.0+dfsg-15  amd64,arm64,mips64el,ppc64el,riscv64
bookworm  6.6.0+dfsg-12  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package emboss:
field: biology, biology:bioinformatics, biology:molecular
interface: commandline
role: program
scope: suite
use: analysing, comparing, converting, editing, organizing, searching, text-formatting, typesetting, viewing
works-with: db
works-with-format: plaintext
Popcon: 24 users (13 upd.)*
Versions and Archs
License: DFSG free
Git

EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole. EMBOSS breaks the historical trend towards commercial software packages.

The package is enhanced by the following packages: clustalw primer3
Please cite: Peter Rice, Ian Longden and Alan Bleasby: EMBOSS: The European Molecular Biology Open Software Suite. (PubMed) Trends in Genetics 16(6):276-277 (2000)
Registry entries: Bio.tools  SciCrunch  Bioconda 
emmax
genetic mapping considering population structure
Versions of package emmax
Release  Version  Architectures
bookworm  0~beta.20100307-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid  0~beta.20100307-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie  0~beta.20100307-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  0~beta.20100307-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

EMMAX is a statistical test for large-scale human or model organism association mapping that accounts for sample structure. In addition to the computational efficiency obtained by the EMMA algorithm, EMMAX takes advantage of the fact that each locus explains only a small fraction of complex traits, which allows one to avoid repetitive variance component estimation, resulting in a significant reduction in the computational time of association mapping using mixed models.

Please cite: Hyun Min Kang, Jae Hoon Sul, Susan K Service, Noah A Zaitlen, Sit-yee Kong, Nelson B Freimer, Chiara Sabatti and Eleazar Eskin: Variance component model to account for sample structure in genome-wide association studies. (PubMed) Nature Genetics 42(4):348-54 (2010)
Registry entries: Bio.tools 
estscan
ORF-independent detector of coding DNA sequences
Versions of package estscan
Release  Version  Architectures
sid  3.0.3-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster  3.0.3-3  amd64,arm64,armhf,i386
trixie  3.0.3-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  3.0.3-6  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  3.0.3-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

ESTScan is a program that can detect coding regions in DNA sequences, even if they are of low quality. ESTScan will also detect and correct sequencing errors that lead to frameshifts. ESTScan is not a gene prediction program, nor is it an open reading frame detector. In fact, its strength lies in the fact that it does not require an open reading frame to detect a coding region. As a result, the program may miss a few translated amino acids at either the N or the C terminus, but will detect coding regions with high selectivity and sensitivity.

ESTScan takes advantage of the bias in hexanucleotide usage found in coding regions relative to non-coding regions. This bias is formalized as an inhomogeneous 3-periodic fifth-order Hidden Markov Model (HMM). Additionally, the HMM of ESTScan has been extended to allow insertions and deletions when these improve the coding region statistics.
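
The hexanucleotide bias mentioned above can be illustrated with a much simpler device than ESTScan's HMM: count hexamers in coding and non-coding training sequences and score a query by summed log-odds. This sketch ignores the 3-periodicity and the insertion/deletion states of the real model; the function names, the pseudocount and the toy training data are assumptions.

```python
import math
from collections import Counter


def hexamer_counts(seqs):
    """Count every overlapping hexanucleotide in the given sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - 5):
            counts[s[i:i + 6]] += 1
    return counts


def log_odds_score(seq, coding, noncoding, pseudocount=1.0):
    """Sum log(P_coding / P_noncoding) over the hexamers of seq."""
    n_hexamers = 4 ** 6  # number of possible hexamers over A, C, G, T
    total_c = sum(coding.values()) + pseudocount * n_hexamers
    total_n = sum(noncoding.values()) + pseudocount * n_hexamers
    score = 0.0
    for i in range(len(seq) - 5):
        h = seq[i:i + 6]
        p_c = (coding[h] + pseudocount) / total_c
        p_n = (noncoding[h] + pseudocount) / total_n
        score += math.log(p_c / p_n)
    return score


if __name__ == "__main__":
    coding = hexamer_counts(["ATGGCC" * 20])     # toy "coding" training set
    noncoding = hexamer_counts(["TTTTTT" * 20])  # toy "non-coding" training set
    print(log_odds_score("ATGGCCATGGCC", coding, noncoding))
```

A positive score means the query's hexamers are more typical of the coding training set, a negative score the opposite.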

Please cite: C. Lottaz, C. Iseli, CV. Jongeneel and Philipp Bucher: Modeling sequencing errors by combining Hidden Markov models. Bioinformatics 19:103-112 (2003)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Remark of Debian Med team: This package ships with BioLinux http://envgen.nox.ac.uk/biolinux.html
examl
Exascale Maximum Likelihood (ExaML) code for phylogenetic inference
Versions of package examl
Release  Version  Architectures
bookworm  3.0.22-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid  3.0.22-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie  3.0.22-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
bullseye  3.0.22-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster  3.0.21-2  amd64,i386
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Exascale Maximum Likelihood (ExaML) is a code for phylogenetic inference using MPI. This code implements the popular RAxML search algorithm for maximum likelihood based inference of phylogenetic trees.

ExaML is a stripped-down light-weight version of RAxML for phylogenetic inference on huge datasets. It can only execute some very basic functions and is intended for computer-savvy users who can write small Perl scripts and have experience using queue submission scripts for clusters. ExaML only implements the CAT and GAMMA models of rate heterogeneity for binary, DNA, and protein data.

ExaML uses a radically new MPI parallelization approach that yields improved parallel efficiency, in particular on partitioned multi-gene or whole-genome datasets. It also implements a new load balancing algorithm that yields better parallel efficiency.

It is up to 4 times faster than its predecessor RAxML-Light and scales to a larger number of processors.

Please cite: Alexey M. Kozlov, Andre J. Aberer and Alexandros Stamatakis: ExaML version 3: a tool for phylogenomic analyses on supercomputers. (PubMed,eprint) Bioinformatics 31(15):2577-2579 (2015)
exonerate
generic tool for pairwise sequence comparison
Versions of package exonerate
ReleaseVersionArchitectures
bullseye2.4.0-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie2.2.0-6amd64,armel,armhf,i386
bookworm2.4.0-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.4.0-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.4.0-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch2.4.0-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster2.4.0-4amd64,arm64,armhf,i386
Debtags of package exonerate:
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
usesearching
works-with-formatplaintext
Popcon: 61 users (10 upd.)*
Versions and Archs
License: DFSG free
Git

Exonerate allows you to align sequences using many alignment models, using either exhaustive dynamic programming or a variety of heuristics. Much of the functionality of the Wise dynamic programming suite was reimplemented in C for better efficiency. Exonerate is an intrinsic component of the building of the Ensembl genome databases, providing similarity scores between RNA and DNA sequences and thus determining splice variants and coding sequences in general.

An In-silico PCR Experiment Simulation System (see the ipcress man page) is packaged with exonerate.

This package also comes with a selection of utilities for performing simple manipulations quickly on fasta files beyond 2Gb.

Please cite: Guy C. Slater and Ewan Birney: Automated generation of heuristics for biological sequence comparison. (PubMed,eprint) BMC Bioinformatics 6(1):31 (2005)
Registry entries: Bio.tools  SciCrunch  Bioconda 
fasta3
tools for searching collections of biological sequences
Versions of package fasta3
ReleaseVersionArchitectures
experimental36.3.8i.14-Nov-2020-2~0exp0simdeamd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid36.3.8i.14-Nov-2020-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster36.3.8g-1 (non-free)amd64
trixie36.3.8i.14-Nov-2020-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye36.3.8h.2020-02-11-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm36.3.8i.14-Nov-2020-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The FASTA programs find regions of local or global similarity between Protein or DNA sequences, either by searching Protein or DNA databases, or by identifying local duplications within a sequence. Other programs provide information on the statistical significance of an alignment. Like BLAST, FASTA can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.

  • Protein
      • Protein-protein FASTA
      • Protein-protein Smith-Waterman (ssearch)
      • Global Protein-protein (Needleman-Wunsch) (ggsearch)
      • Global/Local protein-protein (glsearch)
      • Protein-protein with unordered peptides (fasts)
      • Protein-protein with mixed peptide sequences (fastf)
  • Nucleotide
      • Nucleotide-Nucleotide (DNA/RNA fasta)
      • Ordered Nucleotides vs Nucleotide (fastm)
      • Un-ordered Nucleotides vs Nucleotide (fasts)
  • Translated
      • Translated DNA (with frameshifts, e.g. ESTs) vs Proteins (fastx/fasty)
      • Protein vs Translated DNA (with frameshifts) (tfastx/tfasty)
      • Peptides vs Translated DNA (tfasts)
  • Statistical Significance
      • Protein vs Protein shuffle (prss)
      • DNA vs DNA shuffle (prss)
      • Translated DNA vs Protein shuffle (prfx)
  • Local Duplications
      • Local Protein alignments (lalign)
      • Plot Protein alignment "dot-plot" (plalign)
      • Local DNA alignments (lalign)
      • Plot DNA alignment "dot-plot" (plalign)

This software is often used via a web service at the EBI with readily indexed reference databases at http://www.ebi.ac.uk/Tools/fasta/.

Please cite: William R. Pearson and D. J. Lipman: Improved tools for biological sequence comparison. (PubMed,eprint) Proc Natl Acad Sci U S A 85(8):2444-8 (1988)
Registry entries: Bioconda 
fastahack
utility for indexing and sequence extraction from FASTA files
Versions of package fastahack
ReleaseVersionArchitectures
stretch0.0+20160702-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el
bullseye1.0.0+dfsg-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.0.0+dfsg-11amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.0.0+dfsg-11amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.0.0+dfsg-10amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster0.0+git20160702.bbc645f+dfsg-6amd64,arm64,armhf,i386
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

fastahack is a small application for indexing and extracting sequences and subsequences from FASTA files. The included Fasta.cpp library provides a FASTA reader and indexer that can be embedded into applications which would benefit from directly reading subsequences from FASTA files. The library automatically handles index file generation and use.

Features:

  • FASTA index (.fai) generation for FASTA files
  • Sequence extraction
  • Subsequence extraction
  • Sequence statistics (currently only entropy is provided)

Sequence and subsequence extraction use fseek64 to provide fastest-possible extraction without RAM-intensive file loading operations. This makes fastahack a useful tool for bioinformaticists who need to quickly extract many subsequences from a reference FASTA sequence.
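
The idea of index-driven extraction can be sketched in a few lines. This is not fastahack's C++ code; it is a minimal Python illustration that assumes the five-column .fai layout used by samtools faidx (name, length, byte offset, bases per line, bytes per line) and FASTA records whose lines all have the same width except the last. The function names are hypothetical.

```python
import io


def build_fai(handle):
    """Scan a binary FASTA handle; return {name: (length, offset, linebases, linewidth)}."""
    index = {}
    name = None
    length = offset = linebases = linewidth = 0
    pos = 0  # byte position of the start of the current line
    for raw in handle:
        if raw.startswith(b">"):
            if name is not None:
                index[name] = (length, offset, linebases, linewidth)
            name = raw[1:].split()[0].decode()
            length, linebases, linewidth = 0, 0, 0
            offset = pos + len(raw)  # the sequence starts right after the header
        else:
            bases = len(raw.rstrip(b"\r\n"))
            if linebases == 0:
                linebases, linewidth = bases, len(raw)
            length += bases
        pos += len(raw)
    if name is not None:
        index[name] = (length, offset, linebases, linewidth)
    return index


def fetch(handle, index, name, start, end):
    """Return bases [start, end) (0-based) by seeking, without loading the file."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Translate base coordinates into byte positions, skipping line breaks.
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + ((end - 1) // linebases) * linewidth + (end - 1) % linebases
    handle.seek(first)
    chunk = handle.read(last - first + 1)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


# Toy in-memory FASTA so the example needs no files on disk.
demo = io.BytesIO(b">chr1 test\nACGTACGT\nTTGGCCAA\nGG\n>chr2\nAAAA\n")
demo_index = build_fai(demo)
```

With the index in hand, `fetch(demo, demo_index, "chr1", 6, 10)` reads only the five bytes that span the requested bases, which is why this approach stays cheap on large reference files.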

Registry entries: Bioconda 
fastani
Fast alignment-free computation of whole-genome Average Nucleotide Identity
Versions of package fastani
ReleaseVersionArchitectures
sid1.33-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.33-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.33-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

ANI is defined as mean nucleotide identity of orthologous gene pairs shared between two microbial genomes. FastANI supports pairwise comparison of both complete and draft genome assemblies.

fastaq
FASTA and FASTQ file manipulation tools
Versions of package fastaq
ReleaseVersionArchitectures
bookworm3.17.0-5all
trixie3.17.0-6all
buster3.17.0-2all
jessie1.5.0-1all
sid3.17.0-6all
stretch3.14.0-1all
bullseye3.17.0-3all
Popcon: 4 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Fastaq represents a diverse collection of scripts that perform useful and common FASTA/FASTQ manipulation tasks, such as filtering, merging, splitting, sorting, trimming, search/replace, etc. Input and output files can be gzipped (format is automatically detected) and individual Fastaq commands can be piped together.

Topics: Bioinformatics
fastdnaml
Tool for construction of phylogenetic trees of DNA sequences
Versions of package fastdnaml
ReleaseVersionArchitectures
stretch1.2.2-11amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid1.2.2-17amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.2.2-17amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.2.2-15amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye1.2.2-15amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.2.2-14amd64,arm64,armhf,i386
jessie1.2.2-10amd64,armel,armhf,i386
Debtags of package fastdnaml:
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
useanalysing, comparing
works-with-formatplaintext
Popcon: 4 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

fastDNAml is a program derived from Joseph Felsenstein's version 3.3 DNAML (part of his PHYLIP package). Users should consult the documentation for DNAML before using this program.

fastDNAml is an attempt to solve the same problem as DNAML, but to do so faster and using less memory, so that larger trees and/or more bootstrap replicates become tractable. Much of fastDNAml is merely a recoding of the PHYLIP 3.3 DNAML program from PASCAL to C.

Note that the homepage of this program is not available any more and so this program will probably not see any further updates.

Please cite: Gary J. Olsen, Hideo Matsuda, Ray Hagstrom and Ross Overbeek: fastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. (PubMed,eprint) Comput Appl Biosci 10(1):41-48 (1994)
fastlink
faster version of pedigree programs of Linkage
Versions of package fastlink
ReleaseVersionArchitectures
jessie4.1P-fix95-3amd64,armel,armhf,i386
bullseye4.1P-fix100+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster4.1P-fix100+dfsg-2amd64,arm64,armhf,i386
trixie4.1P-fix100+dfsg-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid4.1P-fix100+dfsg-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch4.1P-fix100+dfsg-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm4.1P-fix100+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package fastlink:
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
useanalysing, comparing
Popcon: 4 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Genetic linkage analysis is a statistical technique used to map genes and find the approximate location of disease genes. There was a standard software package for genetic linkage called LINKAGE. FASTLINK is a significantly modified and improved version of the main programs of LINKAGE that runs much faster sequentially, can run in parallel, allows the user to recover gracefully from a computer crash, and provides abundant new documentation. FASTLINK has been used in over 1000 published genetic linkage studies.

This package contains the following programs:

 ilink:    GEMINI optimization procedure to find a locally
           optimal value of the theta vector of recombination
           fractions
 linkmap:  calculates location scores of one locus against a
           fixed map of other loci
 lodscore: compares likelihoods at locally optimal theta
 mlink:    calculates lod scores and risk with two or more loci
 unknown:  identify possible genotypes for unknowns
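
As a reminder of what a lod score measures, the simplest textbook case can be written out directly. The sketch below is not FASTLINK's algorithm, which computes likelihoods on general pedigrees; it assumes phase-known, fully informative meioses, where the likelihood at recombination fraction theta is theta^r (1 - theta)^(n - r) and the lod score compares it with the likelihood at theta = 0.5 (no linkage). The function name is hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point lod score for phase-known meioses; requires 0 < theta < 1."""
    r, n = recombinants, meioses
    log_linked = r * math.log10(theta) + (n - r) * math.log10(1.0 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked
```

For one recombinant in ten meioses the score peaks at theta = 0.1, the observed recombination fraction, and is zero at theta = 0.5 by construction.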
Please cite: R. W. Cottingham Jr., R. M. Idury and A. A. Schaffer: Faster Sequential Genetic Linkage Computations. (PubMed,eprint) American Journal of Human Genetics 53(1):252-263 (1993)
Registry entries: SciCrunch 
fastml
maximum likelihood ancestral amino-acid sequence reconstruction
Versions of package fastml
ReleaseVersionArchitectures
bookworm3.11-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch3.1-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye3.11-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster3.1-4amd64,arm64,armhf,i386
trixie3.11-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid3.11-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

FastML is a bioinformatics tool for the reconstruction of ancestral sequences based on the phylogenetic relations between homologous sequences. FastML runs several algorithms that reconstruct the ancestral sequences with emphasis on an accurate reconstruction of both indels and characters. For character reconstruction the previously described FastML algorithms are used to efficiently infer the most likely ancestral sequences for each internal node of the tree. Both joint and marginal reconstructions are provided. For indel reconstruction the sequences are first coded according to the indel events detected within the multiple sequence alignment (MSA) and then a state-of-the-art likelihood model is used to reconstruct ancestral indel states. The results are the most probable sequences, together with posterior probabilities for each character and indel at each sequence position for each internal node of the tree. FastML is generic and is applicable to any type of molecular sequence (nucleotide, protein, or codon sequences).

Please cite: Haim Ashkenazy, Osnat Penn, Adi Doron-Faigenboim, Ofir Cohen, Gina Cannarozzi, Oren Zomer and Tal Pupko: FastML: a web server for probabilistic reconstruction of ancestral sequences. (PubMed,eprint) Nucleic Acids Research 40(Web Server issue):W580-W584 (2012)
Registry entries: Bio.tools 
fastp
Ultra-fast all-in-one FASTQ preprocessor
Versions of package fastp
ReleaseVersionArchitectures
sid0.23.4+dfsg-1amd64,arm64,armel,armhf,mips64el,ppc64el,riscv64,s390x
bookworm0.23.2+dfsg-2amd64,arm64,armel,armhf,mips64el,mipsel,ppc64el,s390x
trixie0.23.4+dfsg-1amd64,arm64,armel,armhf,mips64el,ppc64el,riscv64,s390x
buster0.19.6+dfsg-1amd64,arm64,armhf,i386
bullseye0.20.1+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

All-in-one FASTQ preprocessor, fastp provides functions including quality profiling, adapter trimming, read filtering and base correction. It supports both single-end and paired-end short read data and also provides basic support for long-read data.

The package is enhanced by the following packages: multiqc
Please cite: Shifu Chen, Yanqing Zhou, Yaru Chen and Jia Gu: fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34(17):i884-i890 (2018)
Registry entries: Bioconda 
fastq-pair
Rewrites paired end fastq so all reads have a mate to separate out singletons
Versions of package fastq-pair
ReleaseVersionArchitectures
sid1.0-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.0-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.0-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

This package rewrites the two fastq files provided on the command line so that the sequences appear in the same order in both files. Any single reads that have no mate are placed in two separate files, one for each original file.

This code is designed to be fast and memory efficient, and works with large fastq files. It does not store the whole file in memory, but rather just stores the locations of each of the indices in the first file provided in memory.
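
The memory-saving idea, keeping only file positions for the first file rather than the reads themselves, can be sketched as follows. This is an illustration and not the code of fastq-pair: it assumes strict four-line FASTQ records and mate names that differ only by a trailing /1 or /2, and it collects output in lists where a real tool would write files. All function names are hypothetical.

```python
import io


def read_record(handle):
    """Read one 4-line FASTQ record; return (read_id, raw_bytes) or None at EOF."""
    lines = [handle.readline() for _ in range(4)]
    if not lines[0]:
        return None
    read_id = lines[0][1:].split()[0]
    if read_id.endswith((b"/1", b"/2")):
        read_id = read_id[:-2]  # assumption: mates differ only by this suffix
    return read_id, b"".join(lines)


def pair_fastq(first, second):
    """Return (paired1, paired2, single1, single2) as lists of raw records."""
    # Pass 1: remember only the byte offset of every record in the first file.
    offsets = {}
    while True:
        pos = first.tell()
        record = read_record(first)
        if record is None:
            break
        offsets[record[0]] = pos

    # Pass 2: stream the second file and seek back into the first for mates.
    paired1, paired2, single2 = [], [], []
    while True:
        record = read_record(second)
        if record is None:
            break
        read_id, raw = record
        pos = offsets.pop(read_id, None)
        if pos is None:
            single2.append(raw)
        else:
            first.seek(pos)
            paired1.append(read_record(first)[1])
            paired2.append(raw)

    # Whatever is left in the offset table never found a mate.
    single1 = []
    for pos in sorted(offsets.values()):
        first.seek(pos)
        single1.append(read_record(first)[1])
    return paired1, paired2, single1, single2
```

Only the dictionary of read names and offsets lives in memory, so the footprint grows with the number of reads in the first file and not with the total size of the sequence data.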

fastqc
quality control for high throughput sequence data
Versions of package fastqc
ReleaseVersionArchitectures
stretch0.11.5+dfsg-6all
buster0.11.8+dfsg-2all
jessie0.11.2+dfsg-3all
trixie0.12.1+dfsg-4all
bullseye0.11.9+dfsg-4all
bookworm0.11.9+dfsg-6all
sid0.12.1+dfsg-4all
Popcon: 23 users (76 upd.)*
Versions and Archs
License: DFSG free
Git

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.

The main functions of FastQC are

  • Import of data from BAM, SAM or FastQ files (any variant)
  • Providing a quick overview to tell you in which areas there may be problems
  • Summary graphs and tables to quickly assess your data
  • Export of results to an HTML based permanent report
  • Offline operation to allow automated generation of reports without running the interactive application
The package is enhanced by the following packages: multiqc
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequencing
fastqtl
Quantitative Trait Loci (QTL) mapper in cis for molecular phenotypes
Versions of package fastqtl
ReleaseVersionArchitectures
trixie2.184+v7+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye2.184+dfsg-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch2.184+dfsg-5amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
bookworm2.184+v7+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid2.184+v7+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster2.184+dfsg-6amd64,arm64,armhf
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

The goal of FastQTL is to identify single-nucleotide polymorphisms (SNPs) which are significantly associated with various molecular phenotypes (i.e. expression of known genes, cytosine methylation levels, etc). It performs scans for all possible phenotype-variant pairs in cis (i.e. variants located within a specific window around a phenotype). FastQTL implements a new permutation scheme (Beta approximation) to accurately and rapidly correct for multiple-testing at both the genotype and phenotype levels.

The package is enhanced by the following packages: fastqtl-doc
Please cite: Halit Ongen, Alfonso Buil, Andrew Anand Brown, Emmanouil T. Dermitzakis and Olivier Delaneau: Fast and efficient QTL mapper for thousands of molecular phenotypes. (eprint) Bioinformatics (2015)
fasttree
phylogenetic trees from alignments of nucleotide or protein sequences
Versions of package fasttree
ReleaseVersionArchitectures
bookworm2.1.11-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye2.1.11-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster2.1.10-2amd64,arm64,armhf,i386
jessie2.1.7-2amd64,armel,armhf,i386
stretch2.1.9-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie2.1.11-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.1.11-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 8 users (73 upd.)*
Versions and Archs
License: DFSG free
Git

FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. It handles alignments with up to a million sequences in a reasonable amount of time and memory. For large alignments, FastTree is 100-1,000 times faster than PhyML 3.0 or RAxML 7.

FastTree is more accurate than PhyML 3 with default settings, and much more accurate than the distance-matrix methods that are traditionally used for large alignments. FastTree uses the Jukes-Cantor or generalized time-reversible (GTR) models of nucleotide evolution and the JTT (Jones-Taylor-Thornton 1992) model of amino acid evolution. To account for the varying rates of evolution across sites, FastTree uses a single rate for each site (the "CAT" approximation). To quickly estimate the reliability of each split in the tree, FastTree computes local support values with the Shimodaira-Hasegawa test (these are the same as PhyML 3's "SH-like local supports").
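
To give a feel for the simplest of the nucleotide models named above, the Jukes-Cantor model implies a closed-form distance between two aligned sequences: d = -3/4 ln(1 - 4p/3), where p is the fraction of differing sites. The sketch below only evaluates that formula; it is not how FastTree builds or scores trees, and the function name and gap handling are assumptions for illustration.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Expected substitutions per site under Jukes-Cantor for two aligned sequences."""
    # Ignore columns where either sequence has a gap.
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    p = sum(x != y for x, y in pairs) / len(pairs)
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The corrected distance is always at least as large as the raw mismatch fraction, because the model accounts for multiple substitutions at the same site.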

This package contains a single-threaded version (fasttree) and a parallel version which uses OpenMP (fasttreeMP).

Please cite: Morgan N. Price, Paramvir S. Dehal and Adam P. Arkin: FastTree 2 -- Approximately Maximum-Likelihood Trees for Large Alignments. (PubMed,eprint) PLoS ONE 5(3):e9490 (2010)
Registry entries: Bio.tools  SciCrunch  Bioconda 
ffindex
simple index/database for huge amounts of small files
Versions of package ffindex
ReleaseVersionArchitectures
trixie0.9.9.9-6.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
jessie0.9.9.3-2amd64,armel,armhf,i386
sid0.9.9.9-6.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.9.9.9-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye0.9.9.9-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster0.9.9.9-2amd64,arm64,armhf,i386
stretch0.9.9.7-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

FFindex is a very simple index/database for huge amounts of small files. The files are stored concatenated in one big data file, separated by '\0'. A second file contains a plain text index, giving name, offset and length of the small files. The lookup is currently done with a binary search on an array made from the index file.
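
The layout described above is small enough to sketch in memory. The following is an illustration of the idea and not FFindex's own code or on-disk format details: it assumes that the index is kept sorted by name and that the stored length includes the trailing '\0' separator. Function names are hypothetical.

```python
import bisect


def build(files):
    """files: {name: bytes}. Return (data_blob, index sorted by name)."""
    data = bytearray()
    index = []
    for name in sorted(files):
        payload = files[name] + b"\0"  # entries are separated by a NUL byte
        index.append((name, len(data), len(payload)))
        data += payload
    return bytes(data), index


def lookup(data, index, name):
    """Binary-search the sorted index and slice the entry out of the data blob."""
    # (name,) sorts just before any (name, offset, length) tuple with that name.
    i = bisect.bisect_left(index, (name,))
    if i == len(index) or index[i][0] != name:
        return None
    _, offset, length = index[i]
    return data[offset:offset + length - 1]  # drop the trailing NUL
```

Because a lookup needs only the index and one slice of the data file, millions of small entries can be served without creating millions of files on the file system.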

This package provides the executables.

figtree
graphical phylogenetic tree viewer
Versions of package figtree
ReleaseVersionArchitectures
sid1.4.4-6all
buster1.4.4-3all
bullseye1.4.4-5all
trixie1.4.4-6all
jessie1.4-2all
bookworm1.4.4-5all
stretch1.4.2+dfsg-2all
Popcon: 6 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

FigTree is designed as a graphical viewer of phylogenetic trees and as a program for producing publication-ready figures. In particular it is designed to display summarized and annotated trees produced by BEAST.

Registry entries: Bio.tools  SciCrunch  Bioconda 
filtlong
quality filtering tool for long reads of genome sequences
Versions of package filtlong
ReleaseVersionArchitectures
bullseye0.2.0-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.2.1-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.2.1-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.2.1-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Filtlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.

Registry entries: Bio.tools  Bioconda 
fitgcp
fitting genome coverage distributions with mixture models
Versions of package fitgcp
ReleaseVersionArchitectures
trixie0.0.20150429-5all
bookworm0.0.20150429-5all
bullseye0.0.20150429-4all
buster0.0.20150429-2amd64,arm64
stretch0.0.20150429-1amd64,arm64,mips64el,ppc64el
jessie0.0.20130418-2amd64,i386
sid0.0.20150429-5all
Popcon: 4 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Genome coverage, the number of sequencing reads mapped to a position in a genome, is an insightful indicator of irregularities within sequencing experiments. While the average genome coverage is frequently used within algorithms in computational genomics, the complete information available in coverage profiles (i.e. histograms over all coverages) is currently not exploited to its full extent. Thus, biases such as fragmented or erroneous reference genomes often remain unaccounted for. Making this information accessible can improve the quality of sequencing experiments and quantitative analyses.

fitGCP is a framework for fitting mixtures of probability distributions to genome coverage profiles. Besides commonly used distributions, fitGCP uses distributions tailored to account for common artifacts. The mixture models are iteratively fitted based on the Expectation-Maximization algorithm.
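
The iterative fitting step can be illustrated with the most basic case. The sketch below runs Expectation-Maximization on a mixture of Poisson distributions; the choice of Poisson components, the function names and the fixed iteration count are assumptions for this example, and fitGCP's tailored distributions are not reproduced here.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of count k with mean lam, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def em_poisson_mixture(coverages, lams, weights, iterations=50):
    """Fit component means and mixing weights to a list of coverage values."""
    lams, weights = list(lams), list(weights)
    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for k in coverages:
            joint = [w * poisson_pmf(k, lam) for w, lam in zip(weights, lams)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights and means from the responsibilities.
        for c in range(len(lams)):
            mass = sum(r[c] for r in resp)
            weights[c] = mass / len(coverages)
            lams[c] = sum(r[c] * k for r, k in zip(resp, coverages)) / mass
    return lams, weights
```

On a profile that mixes low-coverage and high-coverage positions, the two fitted means separate the populations, which is the kind of signal a single average coverage value hides.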

Please cite: Martin S. Lindner, Maximilian Kollock, Franziska Zickmann and Bernhard Y. Renard: Analyzing genome coverage profiles with applications to quality control in metagenomics. (PubMed,eprint) Bioinformatics 29(10):1260-1267 (2013)
Registry entries: SciCrunch 
flash
Fast Length Adjustment of SHort reads
Versions of package flash
ReleaseVersionArchitectures
bookworm1.2.11-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.2.11-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.2.11-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye1.2.11-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

FLASH (Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments. FLASH is designed to merge pairs of reads when the original DNA fragments are shorter than twice the length of reads. The resulting longer reads can significantly improve genome assemblies. They can also improve transcriptome assembly when FLASH is used to merge RNA-seq data.
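
When the fragment is shorter than twice the read length, the two reads of a pair overlap, and merging amounts to finding that overlap. The sketch below is a simplified illustration and not FLASH's scoring scheme: it reverse-complements the second read, tries every overlap length, and keeps the one with the lowest mismatch ratio. The parameter names and default thresholds are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10, max_mismatch_ratio=0.1):
    """Merge a read pair into one sequence, or return None if no overlap fits."""
    mate = reverse_complement(read2)  # bring read 2 onto the strand of read 1
    best = None
    for overlap in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        tail, head = read1[-overlap:], mate[:overlap]
        mismatches = sum(x != y for x, y in zip(tail, head))
        ratio = mismatches / overlap
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, overlap)
    if best is None:
        return None
    return read1 + mate[best[1]:]
```

For a 28-base fragment sequenced with 20-base reads, the reads share 12 bases and the merged result reconstructs the full fragment.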

The package is enhanced by the following packages: multiqc
Please cite: Tanja Magoč and Steven L Salzberg: FLASH: Fast Length Adjustment of Short Reads to Improve Genome Assemblies. (PubMed,eprint) Bioinformatics 27(21):2957-2963 (2011)
Registry entries: Bio.tools  Bioconda 
flexbar
flexible barcode and adapter removal for sequencing platforms
Versions of package flexbar
ReleaseVersionArchitectures
stretch2.50-2amd64,arm64,armhf,i386,mips,mips64el,mipsel,ppc64el
bookworm3.5.0-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye3.5.0-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie2.50-1amd64,armhf,i386
sid3.5.0-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster3.4.0-2amd64,arm64,armhf,i386
trixie3.5.0-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 4 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Flexbar preprocesses high-throughput sequencing data efficiently. It demultiplexes barcoded runs and removes adapter sequences. Moreover, trimming and filtering features are provided. Flexbar increases mapping rates and improves genome and transcriptome assemblies. It supports next-generation sequencing data in fasta/q and csfasta/q format from Illumina, Roche 454, and the SOLiD platform.

Parameter names have changed in Flexbar, so please review existing scripts. In recent months, default settings were optimised, several bugs were fixed and various improvements were made, e.g. a revamped command-line interface, new trimming modes, and lower time and memory requirements.

The package is enhanced by the following packages: multiqc
Please cite: Matthias Dodt, Johannes T. Roehr, Rina Ahmed and Christoph Dieterich: FLEXBAR — Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms. (eprint) Biology 1(3):895-905 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
flye
de novo assembler for single molecule sequencing reads using repeat graphs
Versions of package flye
ReleaseVersionArchitectures
trixie2.9.5+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
sid2.9.5+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm2.9.1+dfsg-1amd64,arm64,mips64el,ppc64el,s390x
Popcon: 1 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Flye is a de novo assembler for single molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. It is designed for a wide range of datasets, from small bacterial projects to large mammalian-scale assemblies. The package represents a complete pipeline: it takes raw PacBio / ONT reads as input and outputs polished contigs. Flye also has a special mode for metagenome assembly.

Please cite: Mikhail Kolmogorov, Jeffrey Yuan, Yu Lin and Pavel A. Pevzner: Assembly of long, error-prone reads using repeat graphs. (PubMed) Nature Biotechnology 37(5):540–546 (2019)
Registry entries: Bio.tools  Bioconda 
fml-asm
tool for assembling Illumina short reads in small regions
Versions of package fml-asm
ReleaseVersionArchitectures
sid0.1+git20190320.b499514-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
experimental0.1+git20190320.b499514-2~0expamd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie0.1+git20190320.b499514-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.1+git20190320.b499514-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye0.1+git20190320.b499514-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster0.1-5amd64
stretch-backports0.1-4~bpo9+1amd64
stretch0.1-2amd64
upstream0.1+git20221215.85f159e
Popcon: 2 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

Fml-asm is a command-line tool for assembling Illumina short reads in regions from 100bp to 10 million bp in size, based on the fermi-lite library. It is largely a light-weight in-memory version of fermikit without generating any intermediate files. It inherits the performance, the relatively small memory footprint and the features of fermikit. In particular, fermi-lite is able to retain heterozygous events and thus can be used to assemble diploid regions for the purpose of variant calling.

Registry entries: Bio.tools  SciCrunch  Bioconda 
freebayes
Bayesian haplotype-based polymorphism discovery and genotyping
Versions of package freebayes
ReleaseVersionArchitectures
bullseye1.3.5-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
experimental1.3.7-1~expamd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
sid1.3.7-1amd64,arm64,mips64el,ppc64el,riscv64
buster1.2.0-2amd64
stretch-backports1.2.0-1~bpo9+1amd64
bookworm1.3.6-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream1.3.8-pre3
Popcon: 0 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.

Please cite: Erik Garrison and Gabor Marth: Haplotype-based variant detection from short-read sequencing. (eprint) arXiv (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
freecontact
fast protein contact predictor
Versions of package freecontact
ReleaseVersionArchitectures
bullseye1.0.21-9amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.0.21-13amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.0.21-14amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.0.21-14amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie1.0.21-3amd64,armel,armhf,i386
stretch1.0.21-5amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster1.0.21-7amd64,arm64,armhf,i386
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

FreeContact is a protein residue contact predictor optimized for speed. Its input is a multiple sequence alignment. FreeContact can function as an accelerated drop-in for the published contact predictors EVfold-mfDCA of D. S. Marks (2011) and PSICOV of D. Jones (2011).

FreeContact is accelerated by a combination of vector instructions, multiple threads, and faster implementation of key parts. Depending on the alignment, 8-fold or higher speedups are possible.

A sufficiently large alignment is required for meaningful results. As a minimum, an alignment with an effective (after-weighting) sequence count bigger than the length of the query sequence should be used. Alignments with tens of thousands of (effective) sequences are considered good input.
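
The rule of thumb above can be checked before running a prediction. The sketch below uses a common weighting scheme in which each sequence is down-weighted by the number of sequences that are at least as similar to it as a threshold; the threshold value and the function names are assumptions for this illustration and are not taken from FreeContact.

```python
def effective_count(alignment, identity_threshold=0.8):
    """Sum of per-sequence weights 1 / (number of similar sequences, self included)."""
    total = 0.0
    for a in alignment:
        neighbours = sum(
            1
            for b in alignment
            if sum(x == y for x, y in zip(a, b)) / len(a) >= identity_threshold
        )
        total += 1.0 / neighbours
    return total


def is_sufficient(alignment, query):
    """True if the effective sequence count exceeds the query length."""
    return effective_count(alignment) > len(query)
```

Two identical sequences count as one, so an alignment padded with near-duplicates does not pass the check merely because it has many rows. The pairwise comparison is quadratic in the number of sequences, which is acceptable for a sketch but slow for alignments with tens of thousands of rows.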

jackhmmer(1) from the hmmer package, or hhblits(1) from hhsuite can be used to generate the alignments, for example.

This package contains the command line tool freecontact(1).

Please cite: László Kaján, Thomas A. Hopf, Matúš Kalaš, Debora S. Marks and Burkhard Rost: FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinformatics (2014)
Topics: Structure prediction; Sequence analysis
fsa
Fast Statistical Alignment of protein, RNA or DNA sequences
Versions of package fsa
Release   Version        Architectures
stretch   1.15.9+dfsg-3  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie    1.15.9+dfsg-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       1.15.9+dfsg-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.15.9+dfsg-6  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  1.15.9+dfsg-6  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    1.15.9+dfsg-4  amd64,arm64,armhf,i386
Popcon: 55 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

FSA is a probabilistic multiple sequence alignment algorithm which uses a "distance-based" approach to aligning homologous protein, RNA or DNA sequences. Much as distance-based phylogenetic reconstruction methods like Neighbor-Joining build a phylogeny using only pairwise divergence estimates, FSA builds a multiple alignment using only pairwise estimations of homology. This is made possible by the sequence annealing technique for constructing a multiple alignment from pairwise comparisons, developed by Ariel Schwartz.

FSA brings the high accuracies previously available only for small-scale analyses of proteins or RNAs to large-scale problems such as aligning thousands of sequences or megabase-long sequences. FSA introduces several novel methods for constructing better alignments:

  • FSA uses machine-learning techniques to estimate gap and substitution parameters on the fly for each set of input sequences. This "query-specific learning" alignment method makes FSA very robust: it can produce superior alignments of sets of homologous sequences which are subject to very different evolutionary constraints.
  • FSA is capable of aligning hundreds or even thousands of sequences using a randomized inference algorithm to reduce the computational cost of multiple alignment. This randomized inference can be over ten times faster than a direct approach with little loss of accuracy.
  • FSA can quickly align very long sequences using the "anchor annealing" technique for resolving anchors and projecting them with transitive anchoring. It then stitches together the alignment between the anchors using the methods described above.
  • The included GUI, MAD (Multiple Alignment Display), can display the intermediate alignments produced by FSA, where each character is colored according to the probability that it is correctly aligned.
Please cite: Robert K. Bradley, Adam Roberts, Michael Smoot, Sudeep Juvekar, Jaeyoung Do, Colin Dewey, Ian Holmes and Lior Pachter: Fast Statistical Alignment. (PubMed,eprint) PLoS Comput Biol. 5(5):e1000392 (2009)
Registry entries: Bioconda 
Remark of Debian Med team: Precondition for T-Coffee

see http://wiki.debian.org/DebianMed/TCoffee

The upstream address bounced when contacted about segfaults, so the project appears to be dead upstream and the code quality is questionable.

fsm-lite
frequency-based string mining (lite)
Versions of package fsm-lite
Release   Version  Architectures
buster    1.0-3    amd64,arm64
sid       1.0-8    amd64,arm64,mips64el,ppc64el,riscv64,s390x
trixie    1.0-8    amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm  1.0-8    amd64,arm64,mips64el,ppc64el,s390x
bullseye  1.0-5    amd64,arm64,mips64el,ppc64el,s390x
stretch   1.0-2    amd64,arm64,mips64el,ppc64el,s390x
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

A single-core implementation of frequency-based substring mining used in bioinformatics to extract substrings that discriminate two (or more) datasets inside high-throughput sequencing data.
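
To make the idea of frequency-based substring mining concrete, here is a naive Python sketch. It is not how fsm-lite is implemented (an efficient tool would rely on compressed index structures); the function names and thresholds are hypothetical. It reports substrings that occur in at least `min_support_a` sequences of one dataset and in at most `max_support_b` sequences of the other.

```python
from collections import Counter


def support_counts(sequences, min_len, max_len):
    """For every substring of the given lengths, count how many
    sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        seen = set()
        for length in range(min_len, max_len + 1):
            for start in range(len(seq) - length + 1):
                seen.add(seq[start:start + length])
        counts.update(seen)  # each sequence contributes at most 1 per substring
    return counts


def discriminating_substrings(group_a, group_b, min_len=3, max_len=5,
                              min_support_a=2, max_support_b=0):
    """Substrings frequent in group_a and rare in group_b, sorted."""
    counts_a = support_counts(group_a, min_len, max_len)
    counts_b = support_counts(group_b, min_len, max_len)
    return sorted(s for s, n in counts_a.items()
                  if n >= min_support_a and counts_b.get(s, 0) <= max_support_b)
```

This brute-force enumeration uses memory proportional to the number of distinct substrings, which is why it only suits toy inputs.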

Registry entries: SciCrunch  Bioconda 
gamgi
General Atomistic Modelling Graphic Interface (GAMGI)
Versions of package gamgi
Release   Version   Architectures
buster    0.17.3-2  amd64,arm64,armhf,i386
bullseye  0.17.3-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       0.17.5-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  0.17.5-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch   0.17.1-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie    0.17.1-1  amd64,armel,armhf,i386
Debtags of package gamgi:
role: program
uitoolkit: gtk
Popcon: 9 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The General Atomistic Modelling Graphic Interface (GAMGI) provides a graphical interface to build, view and analyze atomic structures. The program is aimed at the scientific community and provides a graphical interface to study atomic structures and to prepare images for presentations, and for teaching the atomic structure of matter.

The package is enhanced by the following packages: gamgi-data gamgi-doc
Registry entries: SciCrunch 
garli
phylogenetic analysis of molecular sequence data using maximum-likelihood
Versions of package garli
Release   Version  Architectures
bookworm  2.1-7    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch   2.1-1    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster    2.1-3    amd64,arm64,armhf,i386
bullseye  2.1-4    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       2.1-7    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

GARLI (Genetic Algorithm for Rapid Likelihood Inference) is a program for inferring phylogenetic trees. Using an approach similar to a classical genetic algorithm, it rapidly searches the space of evolutionary trees and model parameters to find the solution maximizing the likelihood score. It implements nucleotide, amino acid and codon-based models of sequence evolution, and runs on all platforms. The latest version adds support for partitioned models and morphology-like datatypes.

garlic
visualization program for biomolecules
Versions of package garlic
Release   Version  Architectures
trixie    1.6-3    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    1.6-3    amd64,arm64,armhf,i386
sid       1.6-3    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  1.6-3    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie    1.6-1.1  amd64,armel,armhf,i386
stretch   1.6-1.1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm  1.6-3    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package garlic:
field: biology, chemistry
interface: x11
role: program
scope: utility
uitoolkit: xlib
use: viewing
x11: application
Popcon: 18 users (12 upd.)*
Versions and Archs
License: DFSG free
Git

Garlic is written for the investigation of membrane proteins. It may be used to visualize other proteins, as well as some geometric objects. This version of garlic recognizes PDB format version 2.1. Garlic may also be used to analyze protein sequences.

It depends only on the X libraries; no other libraries are needed.

Features include:

  • The slab position and thickness are visible in a small window.
  • Atomic bonds as well as atoms are treated as independent drawable objects.
  • The atomic and bond colors depend on position. Five mapping modes are available (as for slab).
  • Can display stereo images.
  • Can display other geometric objects, such as a membrane.
  • Atomic information is available for the atom under the mouse pointer. No click required, just move the mouse pointer over the structure!
  • Can load more than one structure.
  • Can draw Ramachandran plots, helical wheels, Venn diagrams, and averaged hydrophobicity and hydrophobic moment plots.
  • The command prompt is available at the bottom of the main window. It is able to display one error message and one command string.
Please cite: Damir Zucic and Davor Juretic: Precise Annotation of Transmembrane Segments with Garlic - a Free Molecular Visualization Program (eprint) Croatica Chemica Acta 77(1-2):397-401 (2004)
Registry entries: Bio.tools 
gasic
genome abundance similarity correction
Versions of package gasic
Release   Version    Architectures
stretch   0.0.r19-1  amd64
jessie    0.0.r18-2  amd64
sid       0.0.r19-8  all
trixie    0.0.r19-8  all
bookworm  0.0.r19-8  all
bullseye  0.0.r19-7  all
buster    0.0.r19-4  amd64
Popcon: 4 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

One goal of sequencing based metagenomic analysis is the quantitative taxonomic assessment of microbial community compositions. However, the majority of approaches either quantify at low resolution (e.g. at phylum level) or have severe problems discerning highly similar species. Yet, accurate quantification on species level is desirable in applications such as metagenomic diagnostics or community comparison. GASiC is a method to correct read alignment results for the ambiguities imposed by similarities of genomes. It has superior performance over existing methods.

The package is enhanced by the following packages: gasic-examples
Please cite: Martin S. Lindner and Bernhard Y. Renard: Metagenomic abundance estimation and diagnostic testing on species level. (PubMed,eprint) Nucleic Acids Research 41(1):e10 (2013)
Registry entries: SciCrunch 
gatb-core
Genome Analysis Toolbox with de-Bruijn graph
Versions of package gatb-core
Release   Version                           Architectures
buster    1.4.1+git20181225.44d5a44+dfsg-3  amd64,arm64,i386
trixie    1.4.2+dfsg-13                     amd64,arm64,mips64el,ppc64el,riscv64
bookworm  1.4.2+dfsg-11                     amd64,arm64,mips64el,ppc64el
sid       1.4.2+dfsg-13                     amd64,arm64,mips64el,ppc64el,riscv64
bullseye  1.4.2+dfsg-6                      amd64,arm64,i386,mips64el,ppc64el,s390x
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

The GATB-CORE project provides a set of highly efficient algorithms to analyse NGS data sets. These methods enable the analysis of data sets of any size on multi-core desktop computers, including very large amounts of read data from any kind of organism, such as bacteria, plants, animals and even complex samples (e.g. metagenomes). Read more about GATB at https://gatb.inria.fr/. By itself GATB-CORE is not an NGS data analysis tool. However, it can be used to create such tools. A set of ready-to-use tools relying on the GATB-CORE library already exists: see https://gatb.inria.fr/software/
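
For readers unfamiliar with the de Bruijn graph named in the package summary, the following toy Python sketch shows the basic construction. It is a conceptual illustration under simplifying assumptions (no reverse complements, no error filtering, plain dictionaries) and does not use or imitate the GATB-CORE C++ API; `de_bruijn_graph` is a hypothetical name.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, and every k-mer
    found in a read adds an edge from its prefix to its suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


graph = de_bruijn_graph(["ACGT", "CGTA"], 3)
# Overlapping reads share the k-mer CGT, so the two reads join
# into the single path AC -> CG -> GT -> TA.
```

A library intended for very large read sets needs a far more compact representation than this dictionary of sets.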

Please cite: Erwan Drezen, Guillaume Rizk, Rayan Chikhi, Charles Deltel, Claire Lemaitre, Pierre Peterlongo and Dominique Lavenier: GATB: Genome Assembly & Analysis Tool Box. Bioinformatics 30(20):2959-2961 (2014)
Registry entries: Bioconda 
gbrowse
GMOD Generic Genome Browser
Versions of package gbrowse
Release   Version       Architectures
trixie    2.56+dfsg-12  all
jessie    2.54+dfsg-3   all
stretch   2.56+dfsg-2   all
buster    2.56+dfsg-4   all
bullseye  2.56+dfsg-8   all
bookworm  2.56+dfsg-11  all
sid       2.56+dfsg-12  all
Debtags of package gbrowse:
field: biology, biology:bioinformatics
interface: web
role: program
use: analysing, viewing
web: application, cgi
Popcon: 31 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Generic Genome Browser is a simple but highly configurable web-based genome browser. It is a component of the Generic Model Organism Systems Database project (GMOD). Some of its features:

  • Simultaneous bird's eye and detailed views of the genome;
  • Scroll, zoom, center;
  • Attach arbitrary URLs to any annotation;
  • Order and appearance of tracks are customizable by administrator and end-user;
  • Search by annotation ID, name, or comment;
  • Supports third party annotation using GFF formats;
  • Settings persist across sessions;
  • DNA and GFF dumps;
  • Connectivity to different databases, including BioSQL and Chado;
  • Multi-language support;
  • Third-party feature loading;
  • Customizable plug-in architecture (e.g. run BLAST, dump & import many formats, find oligonucleotides, design primers, create restriction maps, edit features).
The package is enhanced by the following packages: libbio-samtools-perl
Please cite: Maureen J. Donlin: Using the Generic Genome Browser (GBrowse). (eprint) Department of Biochemistry and Molecular Biology and Department of Molecular Microbiology and Immunology, Saint Louis University School of Medicine (2009)
Registry entries: Bio.tools  SciCrunch 
gdpc
visualiser of molecular dynamic simulations
Versions of package gdpc
Release   Version   Architectures
stretch   2.2.5-6   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie    2.2.5-3   amd64,armel,armhf,i386
bullseye  2.2.5-14  amd64,arm64,mips64el,ppc64el
bookworm  2.2.5-15  amd64,arm64,mips64el,ppc64el
trixie    2.2.5-16  amd64,arm64,mips64el,ppc64el
sid       2.2.5-16  amd64,arm64,mips64el,ppc64el,riscv64
buster    2.2.5-9   amd64,arm64,armhf,i386
Debtags of package gdpc:
field: biology, biology:structural, chemistry, physics
interface: x11
role: program
scope: application
uitoolkit: gtk
use: viewing
works-with: 3dmodel, image, video
works-with-format: jpg, png
x11: application
Popcon: 15 users (18 upd.)*
Versions and Archs
License: DFSG free
Git

gdpc is a graphical program for visualising output data from molecular dynamics simulations. It reads input in the standard xyz format, as well as other custom formats, and can output pictures of each frame in JPG or PNG format.
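
The xyz trajectory layout is simple enough to read by hand. The Python sketch below assumes the widely used xyz convention (an atom-count line, a comment line, then one "symbol x y z" line per atom, repeated for every frame); the variants that gdpc itself accepts may differ, and `read_xyz_frames` is a hypothetical helper name.

```python
def read_xyz_frames(lines):
    """Yield (comment, atoms) for each frame of an xyz trajectory.
    `atoms` is a list of (symbol, x, y, z) tuples.
    Truncated input raises an error rather than yielding a partial frame."""
    lines = iter(lines)
    for header in lines:
        if not header.strip():
            continue  # tolerate blank lines between frames
        atom_count = int(header.split()[0])
        comment = next(lines).rstrip("\n")
        atoms = []
        for _ in range(atom_count):
            symbol, x, y, z = next(lines).split()[:4]
            atoms.append((symbol, float(x), float(y), float(z)))
        yield comment, atoms
```

Passing an open file object works as well as a list of lines, for example `for comment, atoms in read_xyz_frames(open("run.xyz")): ...`, where `run.xyz` is a placeholder file name.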

gemma
Genome-wide Efficient Mixed Model Association
Versions of package gemma
Release   Version        Architectures
trixie    0.98.5+dfsg-2  amd64,arm64,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       0.98.5+dfsg-2  amd64,arm64,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  0.98.5+dfsg-2  amd64,arm64,armhf,i386,mips64el,ppc64el,s390x
bullseye  0.98.4+dfsg-4  amd64,arm64,armhf,i386,mips64el,ppc64el,s390x
buster    0.98.1+dfsg-1  amd64,arm64,armhf,i386
Popcon: 2 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

GEMMA is the software implementing the Genome-wide Efficient Mixed Model Association algorithm for a standard linear mixed model and some of its close relatives for genome-wide association studies (GWAS):

  • It fits a univariate linear mixed model (LMM) for marker association tests with a single phenotype to account for population stratification and sample structure, and for estimating the proportion of variance in phenotypes explained (PVE) by typed genotypes (i.e. "chip heritability").
  • It fits a multivariate linear mixed model (mvLMM) for testing marker associations with multiple phenotypes simultaneously while controlling for population stratification, and for estimating genetic correlations among complex phenotypes.
  • It fits a Bayesian sparse linear mixed model (BSLMM) using Markov chain Monte Carlo (MCMC) for estimating PVE by typed genotypes, predicting phenotypes, and identifying associated markers by jointly modeling all markers while controlling for population structure.
  • It estimates variance component/chip heritability, and partitions it by different SNP functional categories. In particular, it uses the HE regression or REML AI algorithm to estimate variance components when individual-level data are available. It uses MQS to estimate variance components when only summary statistics are available.

GEMMA is computationally efficient for large scale GWAS and uses freely available open-source numerical libraries.

Please cite: Xiang Zhou and Matthew Stephens: Genome-wide efficient mixed-model analysis for association studies Nature Genetics 44:821-824 (2012)
Registry entries: Bioconda 
genometester
toolkit for performing set operations on k-mer lists
Versions of package genometester
Release   Version                         Architectures
bookworm  4.0+git20200511.91cecb5+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    4.0+git20180508.a9c14a6+dfsg-1  amd64,arm64,armhf,i386
trixie    4.0+git20200511.91cecb5+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  4.0+git20200511.91cecb5+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       4.0+git20200511.91cecb5+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream  4.0+git20221122.71e6625
Popcon: 2 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

Toolkit for performing set operations - union, intersection and complement - on k-mer lists.

The GenomeTester4 toolkit contains a novel tool, GListCompare, for performing union, intersection and complement (difference) set operations on k-mer lists. It also contains examples of how these general operations can be combined to solve a variety of biological analysis tasks.
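
The three set operations can be illustrated with ordinary Python sets. This is a conceptual sketch only: it ignores k-mer counts and reverse complements, does not read or write the toolkit's list files, and `kmer_set` is a hypothetical helper.

```python
def kmer_set(sequence, k):
    """Return the set of all k-mers occurring in `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


list_a = kmer_set("ACGTA", 3)   # ACG, CGT, GTA
list_b = kmer_set("CGTAT", 3)   # CGT, GTA, TAT

union = list_a | list_b         # k-mers present in either list
intersection = list_a & list_b  # k-mers shared by both lists
difference = list_a - list_b    # k-mers present in A but absent from B
```

Combining these primitives gives, for example, k-mers unique to one sample (a difference) or common to a whole group of samples (repeated intersections).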

Please cite: Lauris Kaplinski, Maarja Lepamets and Maido Remm: GenomeTester4: a toolkit for performing basic set operations - union, intersection and complement on k-mer lists. (PubMed,eprint) GigaScience 4(1):58 (2015)
Registry entries: Bio.tools  Bioconda 
genomethreader
software tool to compute gene structure predictions
Versions of package genomethreader
Release   Version        Architectures
bullseye  1.7.3+dfsg-5   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       1.7.3+dfsg-10  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    1.7.3+dfsg-10  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.7.3+dfsg-7   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

GenomeThreader is a software tool to compute gene structure predictions. The gene structure predictions are calculated using a similarity-based approach where additional cDNA/EST and/or protein sequences are used to predict gene structures via spliced alignments. GenomeThreader was motivated by disabling limitations in GeneSeqer, a popular gene prediction program which is widely used for plant genome annotation.

Please cite: G. Gremme, V. Brendel, M.E. Sparks and S. Kurtz: Engineering a software tool for gene structure prediction in higher organisms. Information and Software Technology 47(15):965-978 (2005)
Registry entries: Bio.tools  Bioconda 
genometools
versatile genome analysis toolkit
Versions of package genometools
Release                    Version             Architectures
bullseye                   1.6.1+ds-3          amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm-backports         1.6.5+ds-2~bpo12+1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm                   1.6.2+ds-3          amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster-backports           1.6.1+ds-3~bpo10+1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye-backports-sloppy  1.6.5+ds-2~bpo11+1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch                    1.5.9+ds-4          amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid                        1.6.5+ds-2.2        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie                     1.6.5+ds-2.2        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster                     1.5.10+ds-3         amd64,arm64,armhf,i386
jessie                     1.5.3-2             amd64,armel,armhf,i386
Debtags of package genometools:
biology: nuceleic-acids
field: biology, biology:bioinformatics
interface: commandline
role: program
uitoolkit: ncurses
Popcon: 4 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

GenomeTools is a collection of useful tools for biological sequence analysis and presentation, combined into a single binary.

The toolkit contains binaries for sequence and annotation handling, sequence compression, index structure generation and access, annotation visualization, and much more.

Please cite: Gordon Gremme, Sascha Steinbiss and Stefan Kurtz: GenomeTools: a comprehensive software library for efficient processing of structured genome annotations. (PubMed) IEEE/ACM Transactions on Computational Biology and Bioinformatics 10(3):645-656 (2013)
Registry entries: Bio.tools 
genomicsdb-tools
sparse array storage library for genomics (tools)
Versions of package genomicsdb-tools
Release   Version  Architectures
sid       1.5.3-3  amd64,mips64el
bookworm  1.4.4-3  amd64,mips64el
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

GenomicsDB is built on top of a htslib fork and an internal array storage system for importing, querying and transforming variant data. Variant data is sparse by nature (sparse relative to the whole genome) and using sparse array data stores is a perfect fit for storing such data.

The GenomicsDB stores variant data in a 2D array where:

  • Each column corresponds to a genomic position (chromosome + position);
  • Each row corresponds to a sample in a VCF (or CallSet in the GA4GH terminology);
  • Each cell contains data for a given sample/CallSet at a given position; data is stored in the form of cell attributes;
  • Cells are stored in column major order - this makes accessing cells with the same column index (i.e. data for a given genomic position over all samples) fast.
  • Variant interval/gVCF interval data is stored in a cell at the start of the interval. The END is stored as a cell attribute. For variant intervals (such as deletions and gVCF REF blocks), an additional cell is stored at the END value of the variant interval. When queried for a given genomic position, the query library performs an efficient sweep to determine all intervals that intersect with the queried position.
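
The layout described in the list above can be sketched with a small in-memory structure. This Python toy is an assumption-laden illustration, not the GenomicsDB API: the class and method names are hypothetical, rows and columns are plain non-negative integers, and interval (END) handling is left out. Keeping cells sorted by (column, row) is what makes a whole-column lookup a single contiguous slice.

```python
import bisect


class SparseVariantArray:
    """Sparse 2D array kept in column-major order."""

    def __init__(self):
        self._keys = []    # sorted (column, row) pairs
        self._cells = []   # attribute dicts, parallel to _keys

    def put(self, row, column, attributes):
        """Store the attributes of one sample (row) at one position (column)."""
        key = (column, row)
        index = bisect.bisect_left(self._keys, key)
        if index < len(self._keys) and self._keys[index] == key:
            self._cells[index] = attributes       # overwrite existing cell
        else:
            self._keys.insert(index, key)
            self._cells.insert(index, attributes)

    def column(self, column):
        """Return [(row, attributes), ...] for every sample at one position."""
        lo = bisect.bisect_left(self._keys, (column, -1))
        hi = bisect.bisect_left(self._keys, (column + 1, -1))
        return [(self._keys[i][1], self._cells[i]) for i in range(lo, hi)]
```

Only populated cells consume memory, which matches the observation that variant data is sparse relative to the whole genome.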

This package contains some tools to be run as executable files.

gentle
suite to plan genetic cloning
Versions of package gentle
Release   Version                   Architectures
sid       1.9.5~alpha1+dfsg-1       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie    1.9+cvs20100605+dfsg1-3   amd64,armel,armhf,i386
trixie    1.9.5~alpha1+dfsg-1       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
bookworm  1.9+cvs20100605+dfsg1-10  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch   1.9+cvs20100605+dfsg1-5   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster    1.9+cvs20100605+dfsg1-7   amd64,arm64,armhf,i386
bullseye  1.9+cvs20100605+dfsg1-9   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream  1.9.5~alpha2
Debtags of package gentle:
biology: nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: x11
role: program
uitoolkit: wxwidgets
Popcon: 3 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

GENtle is software for DNA and amino acid editing, database management, plasmid maps, restriction and ligation, alignments, sequencer data import, calculators, gel image display, PCR, and much more.

Please cite: Magnus Manske: GENtle, a free multi-purpose molecular biology tool. (eprint) (2006)
Registry entries: Bio.tools  SciCrunch 
gff2aplot
pair-wise alignment-plots for genomic sequences in PostScript
Versions of package gff2aplot
Release   Version  Architectures
bookworm  2.0-14   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       2.0-15   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    2.0-11   amd64,arm64,armhf,i386
trixie    2.0-15   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie    2.0-7    amd64,armel,armhf,i386
stretch   2.0-8    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye  2.0-13   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package gff2aplot:
field: biology, biology:bioinformatics
interface: commandline, shell
role: program
scope: utility
use: converting, viewing
works-with: image:vector
works-with-format: plaintext, postscript
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

A program to visualize the alignment of two genomic sequences together with their annotations. From GFF-format input files it produces PostScript figures for that alignment. The following list covers many features of gff2aplot:

  • Comprehensive alignment plots for any GFF feature. Attributes are defined separately, so you can modify just the attributes you need for a given file or share the same customization across different data sets.
  • All parameters are set by default within the program, but it can also be fully configured via gff2ps-like flexible customization files. The program can handle several such files, summarizing all the settings before producing the corresponding figure. Moreover, all customization parameters can be set via command-line switches, which allows users to play with those parameters before adding any to a customization file.
  • Source order is taken from the input files; if you swap the file order, you can visualize the alignment and its annotation with the new input arrangement.
  • All alignment scores can be visualized in a PiP box below the gff2aplot area, using a grey-color scale, a user-defined color scale or score-dependent gradients.
  • Scalable fonts, which can also be chosen among the basic PostScript default fonts. Feature and group labels can be rotated to improve readability in both annotation axes.
  • The program is still defined as a Unix filter so it can handle data from files, redirections and pipes, writing output to standard output and warnings to standard error.
  • gff2aplot is able to manage many physical page formats (from A0 to A10, and more; see the available page sizes in its manual), including user-defined ones. This allows, for instance, the generation of poster-size genomic maps, or the use of a continuous-paper plotting device, either in portrait or landscape.
  • You can draw different alignments on the same alignment plot and distinguish them by using different colors for each.
  • Shape dictionary has been expanded, so that further feature shapes are now available (see manual).
  • Annotation projections through alignment plots (so called ribbons) emulate transparencies via complementary color fill patterns. This feature allows one to show color pseudo-blending when horizontal and vertical ribbons overlap.
Please cite: J. F. Abril, R. Guigó and T. Wiehe: gff2aplot: Plotting sequence comparisons. (PubMed,eprint) Bioinformatics 19(18):2477-2479 (2003)
Registry entries: Bio.tools 
gff2ps
produces PostScript graphical output from GFF-files
Versions of package gff2ps
Release   Version  Architectures
jessie    0.98d-4  all
sid       0.98l-6  all
trixie    0.98l-6  all
bookworm  0.98l-6  all
bullseye  0.98l-4  all
buster    0.98l-2  all
stretch   0.98d-5  all
Debtags of package gff2ps:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: converting, viewing
works-with: image:vector
works-with-format: postscript
Popcon: 4 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

gff2ps is a script program developed with the aim of converting GFF-formatted records into high-quality one-dimensional plots in PostScript. Such plots may be useful for comparing genomic structures and for visualizing outputs from genome annotation programs. It can be used in a very simple way, because it assumes that the GFF file itself carries enough formatting information, but it also allows a great degree of customization through a number of options and/or a configuration file.

Please cite: J. F. Abril and R. Guigó: gff2ps: visualizing genomic annotations. (PubMed,eprint) Bioinformatics 16(8):743-744 (2000)
Registry entries: Bio.tools  SciCrunch 
gffread
GFF/GTF format conversions, region filtering, FASTA sequence extraction
Versions of package gffread
Release   Version   Architectures
sid       0.12.7-8  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    0.12.7-8  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  0.12.7-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  0.12.1-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 4 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Gffread is a GFF/GTF parsing utility providing format conversions, region filtering, FASTA sequence extraction and more.
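
To show what parsing and region filtering of GFF records involves, here is a small Python sketch. It does not call gffread or reproduce its options; it only assumes the standard nine tab-separated GFF columns with 1-based inclusive coordinates, and the function names are hypothetical.

```python
def parse_gff_line(line):
    """Parse one GFF line into a dict, or return None for comments,
    directives and malformed lines."""
    if line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    seqid, source, ftype, start, end, score, strand, phase, attributes = fields
    return {"seqid": seqid, "source": source, "type": ftype,
            "start": int(start), "end": int(end),
            "strand": strand, "attributes": attributes}


def features_in_region(lines, seqid, region_start, region_end):
    """Yield features on `seqid` that overlap [region_start, region_end]."""
    for line in lines:
        feature = parse_gff_line(line)
        if (feature and feature["seqid"] == seqid
                and feature["start"] <= region_end
                and feature["end"] >= region_start):
            yield feature
```

The overlap test keeps any feature that touches the region; a stricter filter would require the feature to lie entirely inside it.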

Registry entries: Bio.tools  Bioconda 
ggd-utils
programs for use in ggd
Versions of package ggd-utils
Release   Version     Architectures
trixie    1.0.0+ds-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       1.0.0+ds-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  0.0.7+ds-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.0.0+ds-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Takes a genome file and (currently) a .vcf.gz or a .bed.gz and checks that:

* a .tbi is present
* the VCF has "##fileformat=VCF" as the first line
* the VCF has a #CHROM header
* the chromosomes are in the order specified by the genome file (and present)
* the positions are sorted
* the positions are <= the chromosome lengths defined in the genome file.

As a result, any new genome going into GGD will have a .genome file that will dictate the sort order and presence or absence of the 'chr' prefix for chromosomes.
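
The ordering and length checks listed above can be sketched in a few lines of Python. This is a hedged illustration, not the ggd-utils implementation: it assumes the genome file holds one "name length" pair per line, works on already-parsed (chromosome, position) records rather than compressed files, and uses hypothetical function names.

```python
def parse_genome_file(lines):
    """Map chromosome name to length, preserving the order of the file."""
    lengths = {}
    for line in lines:
        if line.strip():
            name, length = line.split()[:2]
            lengths[name] = int(length)
    return lengths


def check_records(records, lengths):
    """Return a list of problems found in (chromosome, position) records."""
    order = {name: index for index, name in enumerate(lengths)}
    problems = []
    previous = None
    for chrom, pos in records:
        if chrom not in order:
            problems.append(f"{chrom}: not in genome file")
            continue
        if pos > lengths[chrom]:
            problems.append(f"{chrom}:{pos}: beyond chromosome length {lengths[chrom]}")
        key = (order[chrom], pos)
        if previous is not None and key < previous:
            problems.append(f"{chrom}:{pos}: out of order")
        previous = key
    return problems
```

An empty result means the records follow the genome file's chromosome order, are sorted by position and stay within the declared lengths.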

ghemical
GNOME molecular modelling environment
Versions of package ghemical
Release   Version  Architectures
buster    3.0.0-4  amd64,arm64,armhf,i386
bookworm  3.0.0-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch   3.0.0-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye  3.0.0-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       3.0.0-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
Debtags of package ghemical:
field: chemistry
interface: 3d, x11
role: program
suite: gnome
uitoolkit: gtk
use: editing, learning, viewing
works-with: 3dmodel
x11: application
Popcon: 12 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Ghemical is a computational chemistry software package written in C++. It has a graphical user interface and it supports both quantum-mechanics (semi-empirical) models and molecular mechanics models. Geometry optimization, molecular dynamics and a large set of visualization tools using OpenGL are currently available.

Ghemical relies on external code to provide the quantum-mechanical calculations. Semi-empirical methods MNDO, MINDO/3, AM1 and PM3 come from the MOPAC7 package (Public Domain), and are included in the package. The MPQC package is used to provide ab initio methods: the methods based on Hartree-Fock theory are currently supported with basis sets ranging from STO-3G to 6-31G**.

Registry entries: Bio.tools  SciCrunch 
ghmm
General Hidden-Markov-Model library - tools
Versions of package ghmm
Release   Version    Architectures
bullseye  0.9~rc3-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    0.9~rc3-2  amd64,arm64,armhf,i386
sid       0.9~rc3-9  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    0.9~rc3-9  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  0.9~rc3-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

The General Hidden Markov Model Library (GHMM) is a C library with additional Python3 bindings implementing a wide range of types of Hidden Markov Models and algorithms: discrete, continuous emissions, basic training, HMM clustering, HMM mixtures.
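
As a reminder of what a discrete-emission HMM computes, the sketch below implements the textbook forward algorithm in plain Python. It deliberately does not use the GHMM C or Python API; the function name and the list-of-lists parameter layout are assumptions made for this illustration.

```python
def forward_likelihood(observations, initial, transition, emission):
    """Probability of an observation sequence under a discrete HMM.

    initial[s]        probability of starting in state s
    transition[p][s]  probability of moving from state p to state s
    emission[s][o]    probability that state s emits symbol o
    """
    states = range(len(initial))
    # alpha[s]: probability of the prefix seen so far, ending in state s
    alpha = [initial[s] * emission[s][observations[0]] for s in states]
    for symbol in observations[1:]:
        alpha = [sum(alpha[p] * transition[p][s] for p in states)
                 * emission[s][symbol]
                 for s in states]
    return sum(alpha)
```

For long sequences the probabilities underflow, so practical implementations rescale `alpha` at each step or work in log space.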

This package contains some tools using the library.

Registry entries: Bioconda 
glam2
gapped protein motifs from unaligned sequences
Versions of package glam2
Release   Version  Architectures
sid       1064-9   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    1064-5   amd64,arm64,armhf,i386
bookworm  1064-9   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  1064-9   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie    1064-3   amd64,armel,armhf,i386
stretch   1064-3   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie    1064-9   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Debtags of package glam2:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: utility
use: analysing, comparing, searching
works-with-format: plaintext
Popcon: 3 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

GLAM2 is a software package for finding motifs in sequences, typically amino-acid or nucleotide sequences. A motif is a recurring sequence pattern: typical examples are the TATA box and the CAAX prenylation motif. The main innovation of GLAM2 is that it allows insertions and deletions in motifs.

This package includes programs for discovering motifs shared by a set of sequences and finding matches to these motifs in a sequence database, as well as utilities for converting glam2 motifs to standard alignment formats, masking glam2 motifs out of sequences so that weaker motifs can be found, and removing highly similar members of a set of sequences.

The package includes these programs:

 glam2:       discovering motifs shared by a set of sequences;
 glam2scan:   finding matches, in a sequence database, to a motif discovered
              by glam2;
 glam2format: converting glam2 motifs to  standard alignment formats;
 glam2mask:   masking glam2 motifs out of sequences, so that weaker motifs
              can be found;
 glam2-purge: removing highly similar members of a set of sequences.

In this binary package, the fast Fourier transform (FFT) was enabled for the glam2 program.

Please cite: Martin C. Frith, Neil F. W. Saunders, Bostjan Kobe and Timothy L. Bailey: Discovering Sequence Motifs with Arbitrary Insertions and Deletions. (PubMed) PLoS Computational Biology 4(5):e1000071 (2008)
gmap
spliced and SNP-tolerant alignment for mRNA and short reads
Versions of package gmap
Release   Version                  Architectures
bullseye  2021-02-22+ds-1          amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
buster    2019-01-24-1 (non-free)  amd64
sid       2024-08-20+ds-1          amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
jessie    2014-10-22-1 (non-free)  amd64
bookworm  2021-12-17+ds-3          amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie    2024-08-20+ds-1          amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
stretch   2017-01-14-1 (non-free)  amd64
Debtags of package gmap:
field: biology, biology:bioinformatics, biology:structural
role: program
use: analysing
Popcon: 3 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

This package contains the programs GMAP and GSNAP as well as utilities to manage genome databases in GMAP/GSNAP format. GMAP (Genomic Mapping and Alignment Program) is a tool for aligning EST, mRNA and cDNA sequences. GSNAP (Genomic Short-read Nucleotide Alignment Program) is a tool for aligning single-end and paired-end transcriptome reads. Both tools can use a database of

  • known splice sites, while also identifying novel splice sites;
  • known single-nucleotide polymorphisms (SNPs).

GSNAP can also align bisulfite-treated DNA.

Please cite: Thomas D. Wu and Serban Nacu: Fast and SNP-tolerant detection of complex variants and splicing in short reads. (PubMed,eprint) Bioinformatics 26(7):873-81 (2010)
Registry entries: Bio.tools  SciCrunch  Bioconda 
grabix
wee tool for random access into BGZF files
Versions of package grabix
Release   Version  Architectures
bookworm  0.1.7-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    0.1.7-1  amd64,arm64,armhf
bullseye  0.1.7-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie    0.1.7-4  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       0.1.7-4  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

In biomedical research it is increasingly common practice to study the genetic basis of disease, which now frequently involves sequencing human DNA. The raw output of the sequencing machine is, however, redundant, and the real sequence is the one that best explains that redundancy. Data is exchanged only as compressed files, since it is too large and redundant to handle otherwise, and decompression should be avoided whenever possible.

grabix leverages the fantastic BGZF library of the samtools package to provide random access into text files that have been compressed with bgzip. grabix creates its own index (.gbi) of the bgzipped file. Once indexed, one can extract arbitrary lines from the file with the grab command, or choose random lines with the, well, random command.
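
The general idea of indexing a file once and then jumping straight to a requested range of lines can be sketched as follows. This is only an illustration on an uncompressed byte stream: it does not reproduce the .gbi index format or BGZF block offsets, and the function names build_line_index and grab_lines are hypothetical.

```python
import io


def build_line_index(handle):
    """Record the byte offset at which every line of the stream starts."""
    offsets = []
    handle.seek(0)
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line:
            break
        offsets.append(position)
    return offsets


def grab_lines(handle, offsets, first, last):
    """Return lines first..last (1-based, inclusive) without scanning from the top."""
    handle.seek(offsets[first - 1])
    return [handle.readline().rstrip(b"\n") for _ in range(last - first + 1)]


# Hypothetical four-line tab-separated file held in memory.
data = io.BytesIO(b"chr1\t10\nchr1\t20\nchr2\t5\nchr3\t7\n")
index = build_line_index(data)
selected = grab_lines(data, index, 2, 3)
```

Building the index costs one pass over the file; every later lookup is a single seek followed by reading only the requested lines.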

Registry entries: Bioconda 
graphlan
circular representations of taxonomic and phylogenetic trees
Versions of package graphlan
Release   Version  Architectures
bullseye  1.1.3-2  all
buster    1.1.3-1  all
bookworm  1.1.3-4  all
stretch   1.1-2    all
sid       1.1.3-6  all
trixie    1.1.3-6  all
Popcon: 4 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

GraPhlAn is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees. It focuses on concise, integrative, informative, and publication-ready representations of phylogenetically- and taxonomically-driven investigation.

Registry entries: Bioconda 
grinder
Versatile omics shotgun and amplicon sequencing read simulator
Versions of package grinder
Release   Version  Architectures
bookworm  0.5.4-6  all
trixie    0.5.4-6  all
sid       0.5.4-6  all
stretch   0.5.4-1  all
buster    0.5.4-5  all
bullseye  0.5.4-6  all
jessie    0.5.3-3  all
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Grinder is a versatile program to create random shotgun and amplicon sequence libraries based on DNA, RNA or protein reference sequences provided in a FASTA file.

Grinder can produce genomic, metagenomic, transcriptomic, metatranscriptomic, proteomic and metaproteomic shotgun and amplicon datasets from current sequencing technologies such as Sanger, 454 and Illumina. These simulated datasets can be used to test the accuracy of bioinformatic tools under specific hypotheses, e.g. with or without sequencing errors, or with low or high community diversity. Grinder may also be used to help decide between alternative sequencing methods for a sequence-based project, e.g. whether the library should be paired-end or not, or how many reads should be sequenced.
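
As a rough illustration of what shotgun read simulation involves, the sketch below draws fixed-length reads from uniformly random positions of a reference and optionally introduces substitution errors. It is not Grinder's algorithm or interface; the function simulate_shotgun_reads, its parameters and the toy reference are assumptions made for this example.

```python
import random


def simulate_shotgun_reads(reference, n_reads, read_length, error_rate=0.0, seed=0):
    """Draw reads uniformly along a DNA reference, with optional substitutions.

    Returns a list of (start, read) tuples; start is 0-based.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    alphabet = "ACGT"
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_length + 1)
        bases = list(reference[start:start + read_length])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Substitute with one of the three other nucleotides.
                bases[i] = rng.choice([b for b in alphabet if b != base])
        reads.append((start, "".join(bases)))
    return reads


reference = "ACGTTGCAAGGCTTAACCGGTATCGATCG"  # hypothetical reference sequence
clean_reads = simulate_shotgun_reads(reference, n_reads=5, read_length=8)
noisy_reads = simulate_shotgun_reads(reference, n_reads=5, read_length=8, error_rate=1.0)
```

Comparing a tool's output on the error-free reads with its output on the noisy reads is the kind of test the simulated datasets are meant to support.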

Please cite: Florent E. Angly, Dana Willner, Forest Rohwer, Philip Hugenholtz and Gene W. Tyson: Grinder: a versatile amplicon and shotgun sequence simulator. (PubMed,eprint) Nucleic Acids Research Epub ahead of print (2012)
Registry entries: SciCrunch 
gromacs
Molecular dynamics simulator, with building and analysis tools
Versions of package gromacs
Release   Version   Architectures
trixie    2024.2-1  amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm  2022.5-2  amd64,arm64,mips64el,ppc64el,s390x
bullseye  2020.6-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    2019.1-1  amd64,arm64,armhf,i386
sid       2024.2-1  riscv64
sid       2024.3-1  amd64,arm64,mips64el,ppc64el,s390x
stretch   2016.1-2  amd64,arm64,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie    5.0.2-1   amd64,armel,armhf,i386
Debtags of package gromacs:
field: biology, biology:structural, chemistry
interface: commandline, x11
role: program
uitoolkit: xlib
x11: application
Popcon: 23 users (25 upd.)*
Versions and Archs
License: DFSG free
Git

GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.

It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations), many groups are also using it for research on non-biological systems, e.g. polymers.

This package contains variants both for execution on a single machine, and using the MPI interface across multiple machines.

Please cite: Berk Hess, Carsten Kutzner, David van der Spoel and Erik Lindahl: GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. (eprint) J. Chem. Theory Comput. 4(3):435-447 (2008)
Registry entries: Bio.tools  SciCrunch  Bioconda 
gsort
sort genomic data
Versions of package gsort
Release   Version  Architectures
trixie    0.1.4-3  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  0.1.4-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  0.1.4-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       0.1.4-3  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

gsort is a tool to sort genomic files according to a genome file. For example, it can sort a VCF file into the order X, Y, 2, 1, 3, ... while keeping the header at the top.

As a more likely example, a file may need to be sorted to match GATK order (1 ... X, Y, MT), which is not possible with any other sorting tool. With gsort one can simply place MT as the last chrom in the ".genome" file.

It will also be useful for getting files ready for use in bedtools.
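
The sorting idea described above can be sketched in a few lines: chromosomes are ranked by their position in a caller-supplied order, records are sorted by that rank and then by position, and header lines stay at the top. This is a minimal illustration, not gsort's implementation; the function sort_by_genome and the sample records are assumptions made for this example.

```python
def sort_by_genome(lines, chrom_order):
    """Sort tab-delimited records by a caller-supplied chromosome order.

    Lines starting with '#' are treated as header lines and kept at the top.
    A chromosome missing from chrom_order raises KeyError.
    """
    rank = {chrom: i for i, chrom in enumerate(chrom_order)}
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if not line.startswith("#")]

    def sort_key(line):
        fields = line.split("\t")
        # Column 1 is the chromosome, column 2 the position (as in VCF).
        return (rank[fields[0]], int(fields[1]))

    return header + sorted(records, key=sort_key)


vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT",
    "MT\t100\t.\tA\tG",
    "X\t50\t.\tC\tT",
    "1\t900\t.\tG\tA",
    "1\t20\t.\tT\tC",
]
gatk_like_order = ["1", "X", "Y", "MT"]  # shortened order, for illustration only
sorted_lines = sort_by_genome(vcf_lines, gatk_like_order)
```

Placing MT last in the order list is all it takes to move mitochondrial records to the end, which mirrors the ".genome" file approach described above.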

Registry entries: Bioconda 
gubbins
phylogenetic analysis of genome sequences
Versions of package gubbins
Release   Version  Architectures
stretch   2.2.0-1  amd64,i386
buster    2.3.4-1  amd64,i386
bullseye  2.4.1-4  amd64,i386
bookworm  2.4.1-5  amd64,i386
trixie    3.3.5-1  amd64,i386
sid       3.3.5-1  amd64,i386
Popcon: 2 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Gubbins supports rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences.

Gubbins (Genealogies Unbiased By recomBinations In Nucleotide Sequences) is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistic models of short-term bacterial evolution, and can be run in only a few hours on alignments of hundreds of bacterial genome sequences.

Please cite: Nicholas J. Croucher, Andrew J. Page, Thomas R. Connor, Aidan J. Delaney, Jacqueline A. Keane, Stephen D. Bentley, Julian Parkhill and Simon R. Harris: Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. (PubMed,eprint) Nucleic Acids Research 43(3):e15 (2014)
Registry entries: Bioconda 
gwama
Genome-Wide Association Meta Analysis
Versions of package gwama
Release   Version       Architectures
buster    2.2.2+dfsg-2  amd64,arm64,armhf,i386
stretch   2.2.2+dfsg-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie    2.2.2+dfsg-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  2.2.2+dfsg-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       2.2.2+dfsg-5  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  2.2.2+dfsg-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

GWAMA (Genome-Wide Association Meta Analysis) software performs meta-analysis of the results of GWA studies of binary or quantitative phenotypes. Fixed- and random-effect meta-analyses are performed for both directly genotyped and imputed SNPs using estimates of the allelic odds ratio and 95% confidence interval for binary traits, and estimates of the allelic effect size and standard error for quantitative phenotypes. GWAMA can be used for analysing the results of all different genetic models (multiplicative, additive, dominant, recessive). The software incorporates error trapping facilities to identify strand alignment errors and allele flipping, and performs tests of heterogeneity of effects between studies.
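
For quantitative phenotypes, a fixed-effect meta-analysis of per-study effect sizes and standard errors is commonly computed with inverse-variance weighting, and heterogeneity is commonly assessed with Cochran's Q. The sketch below shows that textbook calculation for a single SNP; it is assumed here for illustration and is not taken from GWAMA's source, and the function name and input values are hypothetical.

```python
import math


def fixed_effect_meta(effects, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis for one SNP.

    Returns (pooled_effect, pooled_standard_error, cochran_q).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    return pooled, pooled_se, q


# Two hypothetical studies reporting an allelic effect size and its standard error.
pooled_effect, pooled_se, cochran_q = fixed_effect_meta([0.10, 0.20], [0.1, 0.1])
```

With equal standard errors the pooled effect is simply the mean of the two study effects, and the pooled standard error shrinks by a factor of the square root of two.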

Please cite: Reedik Mägi and Andrew P. Morris: GWAMA: software for genome-wide association meta-analysis. (eprint) BMC Bioinformatics 11(May):288 (2010)
Registry entries: Bio.tools  SciCrunch  Bioconda 
harvest-tools
archiving and postprocessing for reference-compressed genomic multi-alignments
Versions of package harvest-tools
Release   Version  Architectures
trixie    1.3-8    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.3-8    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch   1.3-1    amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye  1.3-6    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    1.3-4    amd64,arm64,armhf,i386
sid       1.3-8    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

HarvestTools is a utility for creating and interfacing with Gingr files, which are efficient archives that the Harvest Suite uses to store reference-compressed multi-alignments, phylogenetic trees, filtered variants and annotations. Though designed for use with Parsnp and Gingr, HarvestTools can also be used for generic conversion between standard bioinformatics file formats.

Please cite: Todd J. Treangen, Brian D. Ondov, Sergey Koren and Adam M. Phillippy: Rapid Core-Genome Alignment and Visualization for Thousands of Intraspecific Microbial Genomes. (PubMed,eprint) bioRxiv 15(11):524 (2014)
Registry entries: Bioconda 
hhsuite
sensitive protein sequence searching based on HMM-HMM alignment
Versions of package hhsuite
Release   Version           Architectures
stretch   3.0~beta2+dfsg-3  amd64
bullseye  3.3.0+ds-4        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
jessie    2.0.16-5          amd64
bookworm  3.3.0+ds-7        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie    3.3.0+ds-8        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
buster    3.0~beta3+dfsg-3  amd64
sid       3.3.0+ds-8        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
Popcon: 6 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

HH-suite is an open-source software package for sensitive protein sequence searching based on the pairwise alignment of hidden Markov models (HMMs).

This package contains HHsearch and HHblits among other programs and utilities.

HHsearch takes as input a multiple sequence alignment (MSA) or profile HMM and searches a database of HMMs (e.g. PDB, Pfam, or InterPro) for homologous proteins. HHsearch is often used for protein structure prediction to detect homologous templates and to build highly accurate query-template pairwise alignments for homology modeling.

HHblits can build high-quality MSAs starting from single sequences or from MSAs. It transforms these into a query HMM and, using an iterative search strategy, adds significantly similar sequences from the previous search to the updated query HMM for the next search iteration. Compared to PSI-BLAST, HHblits is faster, up to twice as sensitive and produces more accurate alignments.

Please cite: Michael Remmert, Andreas Biegert, Andreas Hauser and Johannes Söding: HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. (PubMed) Nat. Methods 9(2):173-175 (2011)
Registry entries: SciCrunch  Bioconda 
hilive
realtime alignment of Illumina reads
Versions of package hilive
Release   Version  Architectures
buster    1.1-2    amd64,arm64,armhf
stretch   0.3-2    amd64,arm64,armel,i386,mips64el,mipsel,ppc64el