Summary

Next Generation Sequencing

Debian Med bioinformatics applications usable in Next Generation Sequencing

It aims to collect packages which specialize in the processing or interpretation of data generated with next- (and later-) generation high-throughput sequencing technologies.

Description

For a better overview of the project's availability as a Debian package, each head row has a color code according to this scheme:

Official Debian packages with high relevance
Official Debian packages with lower relevance
Debian packages in contrib or non-free
Packaging has started and developers might try the packaging code in VCS
No known packages available

If you discover a project which looks like a good candidate for Debian Med to you, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Med mailing list.

Debian Med Next Generation Sequencing packages

Official Debian packages with high relevance

anfo

Short Read Aligner/Mapper from MPG

https://bioinf.eva.mpg.de/anfo/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Popcon: 1 users (0 upd.)^*

License: DFSG free

Official Debian package

Git

Anfo is a mapper in the spirit of Soap/Maq/Bowtie, but its implementation takes more after BLAST/BLAT. It's most useful for the alignment of sequencing reads where the DNA sequence is somehow modified (think ancient DNA or bisulphite treatment) and/or there is more divergence between sample and reference than what fast mappers will handle gracefully (say the reference genome is missing and a related species is used instead).

Registry entries: SciCrunch

Topics: Sequencing

arden

specificity control for read alignments using an artificial reference

http://sourceforge.net/projects/arden/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Popcon: 8 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

ARDEN (Artificial Reference Driven Estimation of false positives in NGS data) is a novel benchmark that estimates error rates based on real experimental reads and an additionally generated artificial reference genome. It allows the computation of error rates specifically for a dataset and the construction of a ROC-curve. Thereby, it can be used to optimize parameters for read mappers, to select read mappers for a specific problem or also to filter alignments based on quality estimation.
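As a rough illustration of the idea, the following Python sketch builds ROC-style points from two lists of alignment scores: one for hits against the real reference and one for hits against the artificial reference. The function name, the score lists and the simple thresholding scheme are assumptions made for this example; they are not taken from ARDEN itself, whose estimation procedure may differ.

```python
def roc_points(real_scores, artificial_scores):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    Assumption for this sketch: every alignment to the artificial reference
    counts as a false positive, every alignment to the real reference as a
    true positive.
    """
    thresholds = sorted(set(real_scores) | set(artificial_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        # Fraction of alignments kept when only scores >= t are accepted
        tpr = sum(s >= t for s in real_scores) / len(real_scores)
        fpr = sum(s >= t for s in artificial_scores) / len(artificial_scores)
        points.append((fpr, tpr))
    return points


real = [60, 60, 42, 37, 20]      # hypothetical scores on the real reference
artificial = [37, 20, 10, 3]     # hypothetical scores on the artificial reference
for fpr, tpr in roc_points(real, artificial):
    print(f"FPR={fpr:.2f} TPR={tpr:.2f}")
```

A threshold could then be picked from these points, for example the lowest score that keeps the false positive rate under a chosen bound.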

Please cite: Sven H. Giese, Franziska Zickmann and Bernhard Y. Renard: Specificity control for read alignments using an artificial reference genome-guided false discovery rate. (PubMed,eprint) Bioinformatics 30(1):9-16 (2013)

Registry entries: SciCrunch

Topics: Sequencing

art-nextgen-simulation-tools

simulation tools to generate synthetic next-generation sequencing reads

https://www.niehs.nih.gov/research/resources/software/biostatistics/art/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package art-nextgen-simulation-tools
Release	Version	Architectures
sid	20160605+dfsg-5	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	20160605+dfsg-5	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	20160605+dfsg-5	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	20160605+dfsg-4	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye	20160605+dfsg-4	amd64,arm64,armhf,i386

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

ART is a set of simulation tools to generate synthetic next-generation sequencing reads. ART simulates sequencing reads by mimicking the real sequencing process with empirical error models or quality profiles summarized from large recalibrated sequencing data. ART can also simulate reads using a user's own read error model or quality profiles. ART supports simulation of single-end and paired-end/mate-pair reads of three major commercial next-generation sequencing platforms: Illumina's Solexa, Roche's 454 and Applied Biosystems' SOLiD. ART can be used to test or benchmark a variety of methods or tools for next-generation sequencing data analysis, including read alignment, de novo assembly, and SNP and structural variation discovery. ART was used as a primary tool for the simulation study of the 1000 Genomes Project. ART is implemented in C++ with optimized algorithms and is highly efficient in read simulation. ART outputs reads in the FASTQ format, and alignments in the ALN format. ART can also generate alignments in the SAM alignment or UCSC BED file format. ART can be used together with genome variant simulators (e.g. VarSim) for evaluating variant calling tools or methods.

Please cite: Weichun Huang, Leping Li, Jason R. Myers and Gabor T. Marth: ART: a next-generation sequencing read simulator. (PubMed,eprint) Bioinformatics 28(4):593-594 (2012)

Registry entries: SciCrunch Bioconda

artfastqgenerator

outputs artificial FASTQ files derived from a reference genome

https://sourceforge.net/projects/artfastqgen/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package artfastqgenerator
Release	Version	Architectures
bullseye	0.0.20150519-4	all
trixie	0.0.20150519-5	all
forky	0.0.20150519-5	all
sid	0.0.20150519-5	all
bookworm	0.0.20150519-4	all

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

ArtificialFastqGenerator takes the reference genome (in FASTA format) as input and outputs artificial FASTQ files in the Sanger format. It can accept Phred base quality scores from existing FASTQ files, and use them to simulate sequencing errors. Since the artificial FASTQs are derived from the reference genome, the reference genome provides a gold-standard for calling variants (Single Nucleotide Polymorphisms (SNPs) and insertions and deletions (indels)). This enables evaluation of a Next Generation Sequencing (NGS) analysis pipeline which aligns reads to the reference genome and then calls the variants.
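To make the role of Phred scores concrete, here is a minimal Python sketch of how base qualities can drive simulated sequencing errors. It relies on two standard facts: a Phred score Q corresponds to an error probability of 10^(-Q/10), and Sanger-format FASTQ encodes Q as the ASCII character with code Q + 33. The function names and the uniform choice of the substituted base are assumptions for illustration, not ArtificialFastqGenerator's actual implementation.

```python
import random


def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_sanger_qualities(quality_string):
    """Decode a Sanger-format FASTQ quality string (ASCII offset 33)."""
    return [ord(ch) - 33 for ch in quality_string]


def simulate_errors(sequence, quality_string, rng):
    """Replace each base by a different one with its Phred error probability."""
    bases = "ACGT"
    out = []
    for base, q in zip(sequence, decode_sanger_qualities(quality_string)):
        if rng.random() < phred_to_error_prob(q):
            # Assumption: the wrong base is drawn uniformly from the other three
            out.append(rng.choice([b for b in bases if b != base]))
        else:
            out.append(base)
    return "".join(out)


rng = random.Random(0)  # fixed seed so that runs are repeatable
# 'I' encodes Q40 (error probability 0.0001), '!' encodes Q0 (always an error)
print(simulate_errors("ACGTACGT", "IIII!!!!", rng))
```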

Please cite: Matthew Frampton and Richard Houlston: Generation of Artificial FASTQ Files to Evaluate the Performance of Next-Generation Sequencing Pipelines. (PubMed,eprint) PLoS ONE 7(11):e49110 (2012)

bamtools

toolkit for manipulating BAM (genome alignment) files

https://github.com/pezmaster31/bamtools/wiki

Maintainer: Debian Med Packaging Team (Dominique Belhachemi)

Versions of package bamtools
Release	Version	Architectures
forky	2.5.3+dfsg-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bullseye	2.5.1+dfsg-9	amd64,arm64,armhf,i386
bookworm	2.5.2+dfsg-4	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	2.5.2+dfsg-6	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
sid	2.5.3+dfsg-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x

Popcon: 10 users (36 upd.)^*

License: DFSG free

Official Debian package

Git

BamTools facilitates research analysis and data management using BAM files. It copes with the enormous amount of data produced by current sequencing technologies that is typically stored in compressed, binary formats that are not easily handled by the text-based parsers commonly used in bioinformatics research.

BamTools provides both a C++ API for BAM file support as well as a command-line toolkit.

This is the bamtools command-line toolkit.

Available bamtools commands:

 convert  Converts between BAM and a number of other formats
 count    Prints number of alignments in BAM file(s)
 coverage Prints coverage statistics from the input BAM file
 filter   Filters BAM file(s) by user-specified criteria
 header   Prints BAM header information
 index    Generates index for BAM file
 merge    Merge multiple BAM files into single file
 random   Select random alignments from existing BAM file(s), intended more
          as a testing tool.
 resolve  Resolves paired-end reads (marking the IsProperPair flag as needed)
 revert   Removes duplicate marks and restores original base qualities
 sort     Sorts the BAM file according to some criteria
 split    Splits a BAM file on user-specified property, creating a new BAM
          output file for each value found
 stats    Prints some basic statistics from input BAM file(s)

The package is enhanced by the following packages: multiqc

Please cite: Derek W. Barnett, Erik K. Garrison, Aaron R. Quinlan, Michael P. Stromberg and Gabor T. Marth: BamTools: a C++ API and toolkit for analyzing and managing BAM files. (PubMed,eprint) Bioinformatics 27(12):1691-2 (2011)

Registry entries: Bio.tools SciCrunch Bioconda

bcftools

genomic variant calling and manipulation of VCF/BCF files

https://samtools.github.io/bcftools/

Maintainer: Debian Med Packaging Team (Steffen Möller)

Versions of package bcftools
Release	Version	Architectures
forky	1.22-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	1.22-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	1.11-1	amd64,arm64,armhf,i386
bookworm	1.16-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie	1.21-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
upstream	1.23

Popcon: 22 users (36 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.

The package is enhanced by the following packages: multiqc

Please cite: Petr Danecek and Shane A. McCarthy: BCFtools/csq: Haplotype-aware variant consequences. (2016)

Registry entries: Bio.tools SciCrunch Bioconda

bedtools

suite of utilities for comparing genomic features

https://github.com/arq5x/bedtools2

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package bedtools
Release	Version	Architectures
trixie	2.31.1+dfsg-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	2.31.1+dfsg-3	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	2.31.1+dfsg-3	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	2.30.0+dfsg-1	amd64,arm64,armhf,i386
bookworm	2.30.0+dfsg-3	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Debtags of package bedtools:
field	biology, biology:bioinformatics
interface	commandline
role	program
scope	suite
use	analysing, comparing, converting, filtering
works-with	biological-sequence

Popcon: 59 users (122 upd.)^*

License: DFSG free

Official Debian package

Git

The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using BEDTools, one can develop sophisticated pipelines that answer complicated research questions by streaming several BEDTools together.
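The two tasks named above, finding feature overlaps and computing coverage, can be sketched in a few lines of Python. This is a naive illustration of the concepts only, not how BEDTools is implemented; the function names are made up for this example. Features are (chrom, start, end) tuples using the BED convention of 0-based starts and exclusive ends.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) features share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect(features_a, features_b):
    """Return the shared segment of every overlapping pair (naive O(n*m))."""
    hits = []
    for a in features_a:
        for b in features_b:
            if overlaps(a, b):
                hits.append((a[0], max(a[1], b[1]), min(a[2], b[2])))
    return hits


def covered_bases(features):
    """Count bases covered by at least one feature, merging overlaps."""
    total = 0
    current = None  # (chrom, start, end) of the interval being merged
    for chrom, start, end in sorted(features):
        if current and current[0] == chrom and start <= current[2]:
            current = (chrom, current[1], max(current[2], end))
        else:
            if current:
                total += current[2] - current[1]
            current = (chrom, start, end)
    if current:
        total += current[2] - current[1]
    return total


genes = [("chr1", 100, 200), ("chr1", 300, 400)]   # hypothetical features
peaks = [("chr1", 150, 320), ("chr2", 100, 200)]
print(intersect(genes, peaks))
print(covered_bases(genes + peaks))
```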

The groupBy utility is distributed in the filo package.

Please cite: Aaron R. Quinlan and Ira M. Hall: BEDTools: a flexible suite of utilities for comparing genomic features. (PubMed,eprint) Bioinformatics 26(6):841-842 (2010)

Registry entries: Bio.tools SciCrunch Bioconda

berkeley-express

Streaming quantification for high-throughput sequencing

http://bio.math.berkeley.edu/eXpress/index.html

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package berkeley-express
Release	Version	Architectures
bookworm	1.5.3+dfsg-3	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye	1.5.3+dfsg-1	amd64,arm64,armhf,i386
sid	1.5.3+dfsg-6	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	1.5.3+dfsg-5	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	1.5.3+dfsg-3	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x

Popcon: 7 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

eXpress is a streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences. Example applications include transcript-level RNA-Seq quantification, allele-specific/haplotype expression analysis (from RNA-Seq), transcription factor binding quantification in ChIP-Seq, and analysis of metagenomic data. It is based on an online-EM algorithm that results in space (memory) requirements proportional to the total size of the target sequences and time requirements that are proportional to the number of sampled fragments. Thus, in applications such as RNA-Seq, eXpress can accurately quantify much larger samples than other currently available tools, greatly reducing computing infrastructure requirements. eXpress can be used to build lightweight high-throughput sequencing processing pipelines when coupled with a streaming aligner (such as Bowtie), as output can be piped directly into eXpress, effectively eliminating the need to store read alignments in memory or on disk.

In an analysis of the performance of eXpress for RNA-Seq data, it was observed that this efficiency does not come at a cost of accuracy. eXpress is more accurate than other available tools, even when limited to smaller datasets that do not require such efficiency. Moreover, like the Cufflinks program, eXpress can be used to estimate transcript abundances in multi-isoform genes. eXpress is also able to resolve multi-mappings of reads across gene families, and does not require a reference genome so that it can be used in conjunction with de novo assemblers such as Trinity, Oases, or Trans-ABySS. The underlying model is based on previously described probabilistic models developed for RNA-Seq but is applicable to other settings where target sequences are sampled, and includes parameters for fragment length distributions, errors in reads, and sequence-specific fragment bias.

eXpress can be used to resolve ambiguous mappings in other high-throughput sequencing based applications. The only required inputs to eXpress are a set of target sequences and a set of sequenced fragments multiply-aligned to them. While these target sequences will often be gene isoforms, they need not be. Haplotypes can be used as the reference for allele-specific expression analysis, binding regions for ChIP-Seq, or target genomes in metagenomics experiments. eXpress is useful in any analysis where reads multi-map to sequences that differ in abundance.

Please cite: Adam Roberts and Lior Pachter: Streaming fragment assignment for real-time analysis of sequencing experiments. (PubMed) Nature Methods 10(1):71–73 (2013)

Registry entries: SciCrunch Bioconda

bio-rainbow

clustering and assembling short reads for bioinformatics

http://sourceforge.net/projects/bio-rainbow/

Maintainer: Debian Med Packaging Team (Pranav Ballaney)

Versions of package bio-rainbow
Release	Version	Architectures
forky	2.0.4+dfsg-2	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	2.0.4+dfsg-2	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	2.0.4+dfsg-2	amd64,arm64,armhf,i386
bookworm	2.0.4+dfsg-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	2.0.4+dfsg-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

Efficient tool for clustering and assembling short reads, especially for RAD.

Rainbow is developed to provide an ultra-fast and memory-efficient solution to clustering and assembling short reads produced by RAD-seq. First, Rainbow clusters reads using a spaced seed method. Then, Rainbow implements a heterozygote-calling-like strategy to divide potential groups into haplotypes in a top-down manner. Along a guided tree, it iteratively merges sibling leaves in a bottom-up manner if they are similar enough. Here, the similarity is defined by comparing the 2nd reads of a RAD segment. This approach tries to collapse heterozygotes while discriminating repetitive sequences. Finally, Rainbow uses a greedy algorithm to locally assemble merged reads into contigs. Rainbow outputs not only the optimal but also suboptimal assembly results. Based on simulations and real guppy RAD-seq data, it is shown that Rainbow is more competent than the other tools in dealing with RAD-seq data.

Please cite: Zechen Chong, Jue Ruan and Chung-I. Wu: Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads. (PubMed) Bioinformatics 28(21):2732-2737 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

blasr

mapping single-molecule sequencing reads

https://github.com/PacificBiosciences/blasr

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package blasr
Release	Version	Architectures
bullseye	5.3.3+dfsg-5	amd64,arm64
bookworm	5.3.5+dfsg-6	amd64,arm64,mips64el,ppc64el
forky	5.3.5+dfsg-8	amd64,arm64,ppc64el,riscv64
trixie	5.3.5+dfsg-7	amd64,arm64,ppc64el,riscv64
sid	5.3.5+dfsg-8	amd64,arm64,loong64,ppc64el,riscv64

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

Basic local alignment with successive refinement (BLASR) is a method for mapping single-molecule sequencing reads against a reference genome. Such reads are thousands of bases long, with divergence between them and the genome being dominated by insertion and deletion error.

Please cite: Mark J Chaisson and Glenn Tesler: Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. (PubMed) BMC Bioinformatics 13(238) (2012)

Registry entries: Bio.tools SciCrunch Bioconda

bowtie

Ultrafast memory-efficient short read aligner

https://bowtie-bio.sourceforge.net/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package bowtie
Release	Version	Architectures
trixie	1.3.1-3	amd64,arm64,ppc64el,riscv64,s390x
bullseye	1.3.0+dfsg1-1	amd64,arm64
sid	1.3.1-3	amd64,arm64,loong64,ppc64el,riscv64,s390x
forky	1.3.1-3	amd64,arm64,ppc64el,riscv64,s390x
bookworm	1.3.1-1	amd64,arm64,mips64el,ppc64el,s390x

Debtags of package bowtie:
biology	nuceleic-acids
field	biology:bioinformatics
interface	commandline
role	program
science	calculation
scope	utility
use	analysing, comparing
works-with	biological-sequence

Popcon: 20 users (42 upd.)^*

License: DFSG free

Official Debian package

Git

This package addresses the problem of interpreting the results from the latest (2010) DNA sequencing technologies. These yield fairly short stretches of sequence that cannot be interpreted directly. The challenge for tools like Bowtie is to assign a chromosomal location to the short stretches of DNA sequenced per run.

Bowtie aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
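For readers unfamiliar with the transform behind a Burrows-Wheeler index, the following Python sketch computes it naively by sorting all rotations of the text, and inverts it again. It only illustrates the transform itself; the function names are chosen for this example, and a real aligner builds its index with far more efficient algorithms and additional data structures.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (naive, for small inputs)."""
    assert "$" not in text, "'$' is reserved as the end-of-text sentinel"
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    # The transform is the last column of the sorted rotation matrix
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed):
    """Recover the original text by repeatedly prepending and sorting."""
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        table = sorted(transformed[i] + table[i] for i in range(n))
    # The row ending in the sentinel is the original text
    original = next(row for row in table if row.endswith("$"))
    return original[:-1]


transformed = bwt("GATTACA")
print(transformed)
print(inverse_bwt(transformed))
```

The transform is a reversible permutation of the text, which is why an index built on it does not need to store the genome a second time.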

The package is enhanced by the following packages: bowtie-examples multiqc

Please cite: Ben Langmead, Cole Trapnell, Mihai Pop and Steven L Salzberg: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. (eprint) Genome Biology 10:R25 (2009)

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Genomics

bowtie2

ultrafast memory-efficient short read aligner

https://bowtie-bio.sourceforge.net/bowtie2

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package bowtie2
Release	Version	Architectures
forky	2.5.4-1	amd64,arm64,ppc64el,riscv64
bullseye	2.4.2-2	amd64,arm64
bookworm	2.5.0-3	amd64,arm64,mips64el,ppc64el
trixie	2.5.4-1	amd64,arm64,ppc64el,riscv64
sid	2.5.4-1	amd64,arm64,ppc64el,riscv64
upstream	2.5.5

Popcon: 22 users (33 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes.

Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
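The Python sketch below shows, on a toy scale, how an FM index answers "how often does this pattern occur?" with backward search. It is an illustration under simplifying assumptions: the occurrence table is stored in full (practical indexes typically keep only sampled checkpoints), only exact matches are counted, and all names are invented for this example rather than taken from Bowtie 2.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM index: C table, full occurrence table and text length."""
    text += "$"  # sentinel, assumed not to occur in the text itself
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    # c_table[c] = number of characters in the text strictly smaller than c
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in counts:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return c_table, occ, len(bwt)


def count_occurrences(pattern, index):
    """Count exact matches of pattern using backward search."""
    c_table, occ, n = index
    lo, hi = 0, n  # current suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("GATTACAGATTACA")
print(count_occurrences("TTA", index))
```

Each pattern character costs two table lookups, so the query time depends on the pattern length rather than on the size of the indexed text.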

The package is enhanced by the following packages: bowtie2-examples multiqc

Please cite: Ben Langmead and Steven L Salzberg: Fast gapped-read alignment with Bowtie 2. (PubMed) Nature Methods 9:357–359 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Genomics

bwa

Burrows-Wheeler Aligner

https://bio-bwa.sourceforge.net/

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package bwa
Release	Version	Architectures
trixie	0.7.18-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	0.7.19-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	0.7.19-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	0.7.17-6	amd64,arm64,armhf,i386
bookworm	0.7.17-7	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Debtags of package bwa:
biology	nuceleic-acids, peptidic
field	biology, biology:bioinformatics
interface	commandline, text-mode
role	program
use	analysing, comparing

Popcon: 21 users (37 upd.)^*

License: DFSG free

Official Debian package

Git

BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are designed for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.

Please cite: Heng Li and Richard Durbin: Fast and accurate short read alignment with Burrows-Wheeler transform. (PubMed,eprint) Bioinformatics 25(14):1754-1760 (2009)

Registry entries: Bio.tools SciCrunch Bioconda

canu

single molecule sequence assembler for genomes

https://canu.readthedocs.org/en/latest/

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package canu
Release	Version	Architectures
bullseye	2.0+dfsg-1	amd64,arm64,armhf,i386
bookworm	2.0+dfsg-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	2.2+dfsg-5	amd64,arm64,loong64,ppc64el,riscv64,s390x
upstream	2.3

Popcon: 0 users (0 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing (such as the PacBio RS II or Oxford Nanopore MinION).

Canu is a hierarchical assembly pipeline which runs in four steps:

Detect overlaps in high-noise sequences using MHAP
Generate corrected sequence consensus
Trim corrected sequences
Assemble trimmed corrected sequences

Please cite: Sergey Koren, Brian P. Walenz, Konstantin Berlin, Jason R. Miller and Adam M. Phillippy: Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. (2017)

Registry entries: Bio.tools SciCrunch Bioconda

Remark of Debian Med team: Genome assembly and large-scale genome alignment (http://www.cbcb.umd.edu/software/)

changeo

Repertoire clonal assignment toolkit (Python 3)

https://changeo.readthedocs.io

Maintainer: Debian Med Packaging Team (Alexandre Detiste)

Versions of package changeo
Release	Version	Architectures
bookworm	1.3.0-1	all
trixie	1.3.0-3	all
forky	1.3.0-4	all
sid	1.3.0-4	all
bullseye	1.0.2-1	all
upstream	1.3.4

Popcon: 6 users (32 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Change-O is a collection of tools for processing the output of V(D)J alignment tools, assigning clonal clusters to immunoglobulin (Ig) sequences, and reconstructing germline sequences.

Dramatic improvements in high-throughput sequencing technologies now enable large-scale characterization of Ig repertoires, defined as the collection of trans-membrane antigen-receptor proteins located on the surface of B cells and T cells. Change-O is a suite of utilities to facilitate advanced analysis of Ig and TCR sequences following germline segment assignment. Change-O handles output from IMGT/HighV-QUEST and IgBLAST, and provides a wide variety of clustering methods for assigning clonal groups to Ig sequences. Record sorting, grouping, and various database manipulation operations are also included.

This package installs the library for Python 3.

Please cite: Namita T. Gupta, Jason A. Vander Heiden, Mohamed Uduman, Daniel Gadala-Maria, Gur Yaari and Steven H. Kleinstein: Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. (PubMed,eprint) Bioinformatics 31(20):3356-3358 (2015)

Registry entries: Bioconda

crac

integrated RNA-Seq read analysis

https://crac.gforge.inria.fr/

Maintainer: Debian Med Packaging Team (Nilesh Patra)

Versions of package crac
Release	Version	Architectures
bookworm	2.5.2+dfsg-5	amd64,arm64,armel,armhf,i386,mips64el,ppc64el
bullseye	2.5.2+dfsg-4	amd64,arm64,armhf,i386
trixie	2.5.2+dfsg-6	amd64,arm64,ppc64el,riscv64
forky	2.5.2+dfsg-7	amd64,arm64,ppc64el,riscv64
sid	2.5.2+dfsg-7	amd64,arm64,loong64,ppc64el,riscv64

Popcon: 7 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

CRAC is a tool to analyze High Throughput Sequencing (HTS) data in comparison to a reference genome. It is intended for transcriptomic and genomic sequencing reads. More precisely, with transcriptomic reads as input, it predicts point mutations, indels, splice junctions, and chimeric RNAs (i.e., non-colinear splice junctions). CRAC can also output the positions and nature of the sequence errors that it detects in the reads. CRAC uses a genome index, which must be computed before running the read analysis. To do so, run the command "crac-index" on your genome files. You can then process the reads using the command "crac". See the man page of CRAC (help file) by typing "man crac". CRAC requires a large amount of main memory. For processing, say, 50 million reads of 100 nucleotides each against the human genome, CRAC requires about 40 gigabytes of main memory. Check that your computing server is equipped with a sufficient amount of memory before launching an analysis.

Please cite: Eliseos J. Mucaki, Natasha G. Caminsky, Ami M. Perri, Ruipeng Lu, Alain Laederach, Matthew Halvorsen, Joan H. M. Knoll and Peter K. Rogan: A unified analytic framework for prioritization of non-coding variants of uncertain significance in heritable breast and ovarian cancer. (PubMed) BMC Medical Genomics 9:19 (2016)

Registry entries: Bio.tools SciCrunch

cutadapt

Clean biological sequences from high-throughput sequencing reads

https://cutadapt.readthedocs.io/

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package cutadapt
Release	Version	Architectures
trixie	4.7-2	all
forky	4.7-2	all
sid	4.7-2	all
bookworm	4.2-1	all
bullseye	3.2-2	all
upstream	5.2

Popcon: 10 users (34 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Cutadapt helps with biological sequence cleaning tasks by finding adapter or primer sequences in an error-tolerant way. It can also modify and filter reads in various ways. Adapter sequences can contain IUPAC wildcard characters. Paired-end reads and even colorspace data are also supported. If you want, you can also just demultiplex your input data, without removing adapter sequences at all.
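What "error-tolerant" adapter finding means can be sketched in Python as follows. This is a deliberately simplified stand-in, not Cutadapt's algorithm: it only tolerates substitutions (no insertions or deletions), and the function name and the parameters max_error_rate and min_overlap are hypothetical names chosen for this example.

```python
def trim_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Remove a 3' adapter from a read, tolerating a few mismatches.

    Simplification: only substitutions are considered, indels are not.
    """
    for start in range(len(read) - min_overlap + 1):
        # The adapter may run off the end of the read (partial adapter)
        overlap = min(len(adapter), len(read) - start)
        mismatches = sum(
            r != a for r, a in zip(read[start:start + overlap], adapter)
        )
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]  # the leftmost acceptable match wins
    return read


adapter = "AGATCGGAAGAGC"                       # example adapter sequence
print(trim_3prime_adapter("ACGTACGT" + adapter, adapter))
print(trim_3prime_adapter("ACGTACGTAGATCGGTAGAGC", adapter))  # one mismatch
print(trim_3prime_adapter("ACGTACGTAGATC", adapter))          # partial adapter
```

Scaling the number of allowed mismatches with the overlap length keeps short, partial adapter matches strict while still accepting longer matches that contain a sequencing error.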

This package contains the user interface.

The package is enhanced by the following packages: multiqc

Please cite: Marcel Martin: Cutadapt removes adapter sequences from high-throughput sequencing reads. (eprint) EMBnet.journal 17(1):10-12 (2011)

Registry entries: Bio.tools SciCrunch Bioconda

daligner

local alignment discovery between long nucleotide sequencing reads

https://dazzlerblog.wordpress.com

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package daligner
Release	Version	Architectures
bookworm	1.0+git20221215.bd26967-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	1.0+git20240119.335105d-3	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	1.0+git20240119.335105d-3	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	1.0+git20240119.335105d-3	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	1.0+git20200727.ed40ce5-3	amd64,arm64,armhf,i386

Popcon: 9 users (31 upd.)^*

License: DFSG free

Official Debian package

Git

These tools permit one to find all significant local alignments between reads encoded in a Dazzler database. The assumption is that the reads are from a Pacific Biosciences RS II long read sequencer. That is, the reads are long and noisy, with error rates of up to 15% on average.

Please cite: Gene Myers: Efficient Local Alignment Discovery amongst Noisy Long Reads. 8701:52-67 (2014)

Registry entries: SciCrunch Bioconda

deepnano

alternative basecaller for MinION reads of genomic sequences

https://bitbucket.org/vboza/deepnano

Maintainer: Debian Med Packaging Team (Thomas Goirand)

Popcon: 0 users (0 upd.)^*

License: DFSG free

Official Debian package

Git

DeepNano is an alternative basecaller for Oxford Nanopore MinION reads based on deep recurrent neural networks.

Currently it works with SQK-MAP-006 and SQK-MAP-005 chemistry and as a postprocessor for Metrichor.

Please cite: Vladimír Boža, Broňa Brejová and Tomáš Vinař: DeepNano: Deep recurrent neural networks for base calling in MinION nanopore reads. PLOS one (2017)

discosnp

discovering Single Nucleotide Polymorphism from raw set(s) of reads

http://colibread.inria.fr/discosnp/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package discosnp
Release	Version	Architectures
bullseye	4.4.4-1	amd64,arm64,i386
bookworm	2.6.2-2	amd64,arm64,mips64el,ppc64el
trixie	2.6.2-4	amd64,arm64,ppc64el,riscv64
forky	2.6.2-5	amd64,arm64,ppc64el,riscv64
sid	2.6.2-5	amd64,arm64,loong64,ppc64el,riscv64

Popcon: 7 users (31 upd.)^*

License: DFSG free

Official Debian package

Git

Software discoSnp is designed for discovering Single Nucleotide Polymorphism (SNP) from raw set(s) of reads obtained with Next Generation Sequencers (NGS).

Note that the number of input read sets is not constrained: it can be one, two, or more. Note also that no other data, such as a reference genome or annotations, are needed.

The software is composed of two modules. The first module, kissnp2, detects SNPs from read sets. The second module, kissreads, enhances the kissnp2 results by computing per read set and for each found SNP:

 1) its mean read coverage
 2) the (phred) quality of reads generating the polymorphism.
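As an illustration of what a Phred quality means, the sketch below converts a Phred score to an error probability and averages the scores of one read. It assumes the common Phred+33 FASTQ encoding; the function names are hypothetical, and how kissreads itself aggregates quality is not described here.

```python
def phred_to_error_prob(q):
    """Error probability implied by a Phred score: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def mean_phred(quality_string, offset=33):
    """Mean Phred score of one FASTQ quality string (Phred+33 assumed)."""
    scores = [ord(char) - offset for char in quality_string]
    return sum(scores) / len(scores)


# 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
print(mean_phred("5I"))
print(phred_to_error_prob(20))
```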

This program is superseded by DiscoSnp++.

Registry entries: Bio.tools SciCrunch Bioconda

dnaclust

tool for clustering millions of short DNA sequences

https://dnaclust.sourceforge.net/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package dnaclust
Release	Version	Architectures
bullseye	3-7	amd64,arm64,armhf,i386
sid	3-8	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	3-8	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	3-7	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	3-7	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 8 users (32 upd.)^*

License: DFSG free

Official Debian package

dnaclust is a tool for clustering a large number of short DNA sequences. The clusters are created in such a way that the "radius" of each cluster is no more than the specified threshold.

The input sequences to be clustered should be in Fasta format. The id of each sequence is based on the first word of the sequence header in the Fasta format. The first word is the prefix of the header up to the first occurrence of white space characters in the header.
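The id rule can be sketched in a few lines of Python. This is only an illustration of "first word of the header"; the function name is hypothetical and it is not dnaclust's own parser.

```python
def fasta_ids(lines):
    """Yield the id of every record in an iterable of FASTA lines."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            # The id is the header prefix up to the first white space.
            yield header.split(None, 1)[0] if header else ""


example = [">seq1 16S rRNA gene", "ACGT", ">seq2\tsample B", "GGCC"]
print(list(fasta_ids(example)))
```

Two records whose headers share the same first word would therefore receive the same id.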

Please cite: Mohammadreza Ghodsi, Bo Liu and Mihai Pop: DNACLUST: accurate and efficient clustering of phylogenetic marker genes. (PubMed,eprint) BMC Bioinformatics 12:271 (2011)

Registry entries: Bio.tools SciCrunch

dwgsim

short sequencing read simulator

https://github.com/nh13/DWGSIM/

Maintainer: Debian Med Packaging Team (Nilesh Patra)

Versions of package dwgsim
Release	Version	Architectures
trixie	0.1.14-3	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
sid	0.1.14-4	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	0.1.14-4	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bookworm	0.1.14-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye	0.1.12-4	amd64,arm64,armhf,i386
upstream	0.1.16

Popcon: 6 users (32 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

DWGSIM simulates short sequencing reads from modern sequencing platforms. DWGSIM generates base error rates using a parametric model, allowing a more realistic error profile. It was originally developed for use in evaluating short read aligners.

Registry entries: SciCrunch Bioconda

ea-utils

command-line tools for processing biological sequencing data

https://expressionanalysis.github.io/ea-utils/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package ea-utils
Release	Version	Architectures
trixie	1.1.2+dfsg-9	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	1.1.2+dfsg-6	amd64,arm64,armhf,i386
bookworm	1.1.2+dfsg-9	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	1.1.2+dfsg-9	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	1.1.2+dfsg-9	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Popcon: 58 users (27 upd.)^*

License: DFSG free

Official Debian package

Ea-utils provides a set of command-line tools for processing biological sequencing data, barcode demultiplexing, adapter trimming, etc.

Primarily written to support an Illumina based pipeline - but should work with any FASTQs.

The main tools are:

fastq-mcf: Scans a sequence file for adapters, and, based on a log-scaled threshold, determines a set of clipping parameters and performs clipping. Also does skewing detection and quality filtering.
fastq-multx: Demultiplexes a FASTQ file. Capable of auto-determining barcode IDs based on a master set of fields. Keeps multiple reads in sync during demultiplexing. Can verify that the reads are in sync as well, and fail if they are not.
fastq-join: Similar to audy's stitch program, but in C, more efficient and supports some automatic benchmarking and tuning. It uses the same "squared distance for anchored alignment" as other tools.
varcall: Takes a pileup and calculates variants in a more easily parameterized manner than some other tools.
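To make the idea of barcode demultiplexing concrete, here is a minimal sketch that bins reads by an exact barcode prefix. It is an assumption-laden toy: the function name, the "unmatched" bin and exact prefix matching are hypothetical, and fastq-multx itself does considerably more (barcode auto-detection, keeping paired files in sync).

```python
def demultiplex(reads, barcodes):
    """Bin reads by exact barcode prefix and strip the barcode.

    `barcodes` maps a sample name to its barcode sequence.
    """
    bins = {name: [] for name in barcodes}
    bins["unmatched"] = []
    for read in reads:
        for name, code in barcodes.items():
            if read.startswith(code):
                bins[name].append(read[len(code):])
                break
        else:
            # No barcode matched the start of this read.
            bins["unmatched"].append(read)
    return bins


samples = {"A": "ACGT", "B": "TTAG"}
print(demultiplex(["ACGTGGGG", "TTAGCCCC", "GGGGGGGG"], samples))
```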

Please cite: Erik Aronesty: Comparison of Sequencing Utility Programs. (eprint) The Open Bioinformatics Journal 7:1-8 (2013)

Registry entries: Bio.tools SciCrunch

fastaq

FASTA and FASTQ file manipulation tools

https://github.com/sanger-pathogens/Fastaq

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package fastaq
Release	Version	Architectures
bookworm	3.17.0-5	all
forky	3.17.0-9	all
trixie	3.17.0-9	all
sid	3.17.0-9	all
bullseye	3.17.0-3	all

Popcon: 9 users (34 upd.)^*

License: DFSG free

Official Debian package

Fastaq represents a diverse collection of scripts that perform useful and common FASTA/FASTQ manipulation tasks, such as filtering, merging, splitting, sorting, trimming, search/replace, etc. Input and output files can be gzipped (format is automatically detected) and individual Fastaq commands can be piped together.
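Automatic detection of gzipped input can be done by looking at the first two bytes of a file, which are 0x1f 0x8b for gzip. The sketch below shows that general technique; the helper name is hypothetical and Fastaq's own detection logic is not described here.

```python
import gzip


def open_maybe_gzipped(path):
    """Open a text file for reading, transparently handling gzip."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    # Every gzip stream starts with the two magic bytes 0x1f 0x8b.
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "r")
```

A caller can then read `reads.fa` and `reads.fa.gz` through the same code path, regardless of the file extension.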

Topics: Bioinformatics

fastp

Ultra-fast all-in-one FASTQ preprocessor

https://github.com/OpenGene/fastp

Maintainer: Debian Med Packaging Team (Dylan Aïssi)

Versions of package fastp
Release	Version	Architectures
bullseye	0.20.1+dfsg-1	amd64,arm64,armhf,i386
bookworm	0.23.2+dfsg-2	amd64,arm64,armel,armhf,mips64el,mipsel,ppc64el,s390x
trixie	0.24.0+dfsg-1	amd64,arm64,ppc64el,riscv64,s390x
forky	1.0.1+dfsg-1	amd64,arm64,ppc64el,riscv64,s390x
sid	1.0.1+dfsg-1	amd64,arm64,loong64,ppc64el,riscv64,s390x
upstream	1.1.0

Popcon: 8 users (32 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

All-in-one FASTQ preprocessor, fastp provides functions including quality profiling, adapter trimming, read filtering and base correction. It supports both single-end and paired-end short read data and also provides basic support for long-read data.

The package is enhanced by the following packages: multiqc

Please cite: Shifu Chen, Yanqing Zhou, Yaru Chen and Jia Gu: fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34(17):i884-i890 (2018)

Registry entries: Bioconda

fastqc

quality control for high throughput sequence data

https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Maintainer: Debian Med Packaging Team (Pierre Gruet)

Versions of package fastqc
Release	Version	Architectures
sid	0.12.1+dfsg-4	all
trixie	0.12.1+dfsg-4	all
bookworm	0.11.9+dfsg-6	all
bullseye	0.11.9+dfsg-4	all
forky	0.12.1+dfsg-4	all

Popcon: 19 users (36 upd.)^*

License: DFSG free

Official Debian package

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.

The main functions of FastQC are

Import of data from BAM, SAM or FastQ files (any variant)
Providing a quick overview to tell you in which areas there may be problems
Summary graphs and tables to quickly assess your data
Export of results to an HTML based permanent report
Offline operation to allow automated generation of reports without running the interactive application

The package is enhanced by the following packages: multiqc

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Sequencing

flexbar

flexible barcode and adapter removal for sequencing platforms

https://github.com/seqan/flexbar

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package flexbar
Release	Version	Architectures
trixie	3.5.0-5	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	3.5.0-3	amd64,arm64,armhf,i386
sid	3.5.0-7	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bookworm	3.5.0-5	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

Flexbar preprocesses high-throughput sequencing data efficiently. It demultiplexes barcoded runs and removes adapter sequences. Moreover, trimming and filtering features are provided. Flexbar increases mapping rates and improves genome and transcriptome assemblies. It supports next-generation sequencing data in fasta/q and csfasta/q format from Illumina, Roche 454, and the SOLiD platform.

Parameter names changed in Flexbar. Please review scripts. In recent months, default settings were optimised, several bugs were fixed and various improvements were made, e.g. a revamped command-line interface, new trimming modes, and lower time and memory requirements.

The package is enhanced by the following packages: multiqc

Please cite: Matthias Dodt, Johannes T. Roehr, Rina Ahmed and Christoph Dieterich: FLEXBAR — Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms. (eprint) Biology 1(3):895-905 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

fml-asm

tool for assembling Illumina short reads in small regions

https://github.com/lh3/fermi-lite

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package fml-asm
Release	Version	Architectures
forky	0.1+git20221215.85f159e-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bullseye	0.1+git20190320.b499514-1	amd64,arm64,armhf,i386
bookworm	0.1+git20190320.b499514-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	0.1+git20221215.85f159e-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
sid	0.1+git20221215.85f159e-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Fml-asm is a command-line tool for assembling Illumina short reads in regions from 100bp to 10 million bp in size, based on the fermi-lite library. It is largely a light-weight in-memory version of fermikit without generating any intermediate files. It inherits the performance, the relatively small memory footprint and the features of fermikit. In particular, fermi-lite is able to retain heterozygous events and thus can be used to assemble diploid regions for the purpose of variant calling.

Registry entries: Bio.tools SciCrunch Bioconda

fsm-lite

frequency-based string mining (lite)

https://github.com/nvalimak/fsm-lite

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package fsm-lite
Release	Version	Architectures
forky	1.0-8	amd64,arm64,ppc64el,riscv64,s390x
sid	1.0-8	amd64,arm64,loong64,ppc64el,riscv64,s390x
bullseye	1.0-5	amd64,arm64
trixie	1.0-8	amd64,arm64,ppc64el,riscv64,s390x
bookworm	1.0-8	amd64,arm64,mips64el,ppc64el,s390x

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

A single-core implementation of frequency-based substring mining used in bioinformatics to extract substrings that discriminate two (or more) datasets inside high-throughput sequencing data.
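The idea of frequency-based mining can be sketched with fixed-length substrings: count in how many sequences of each dataset a substring occurs, then keep those that are frequent in one dataset and rare in the other. This is a naive illustration under stated assumptions (fixed length k, hypothetical function names and thresholds), not the index-based algorithm fsm-lite uses.

```python
from collections import Counter


def support(seqs, k):
    """Number of sequences in which each length-k substring occurs."""
    counts = Counter()
    for seq in seqs:
        # A set ensures each sequence counts a substring at most once.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def discriminating(dataset_a, dataset_b, k, min_a, max_b):
    """Substrings found in >= min_a sequences of A and <= max_b of B."""
    support_a = support(dataset_a, k)
    support_b = support(dataset_b, k)
    return sorted(
        sub for sub, count in support_a.items()
        if count >= min_a and support_b[sub] <= max_b
    )


print(discriminating(["ACGTA", "TACGT"], ["GGTAC", "TTTAC"], 3, 2, 0))
```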

Registry entries: SciCrunch Bioconda

grinder

Versatile omics shotgun and amplicon sequencing read simulator

https://sourceforge.net/projects/biogrinder/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Popcon: 8 users (32 upd.)^*

License: DFSG free

Official Debian package

Grinder is a versatile program to create random shotgun and amplicon sequence libraries based on DNA, RNA or proteic reference sequences provided in a FASTA file.

Grinder can produce genomic, metagenomic, transcriptomic, metatranscriptomic, proteomic, metaproteomic shotgun and amplicon datasets from current sequencing technologies such as Sanger, 454, Illumina. These simulated datasets can be used to test the accuracy of bioinformatic tools under specific hypotheses, e.g. with or without sequencing errors, or with low or high community diversity. Grinder may also be used to help decide between alternative sequencing methods for a sequence-based project, e.g. should the library be paired-end or not, how many reads should be sequenced.

Please cite: Florent E. Angly, Dana Willner, Forest Rohwer, Philip Hugenholtz and Gene W. Tyson: Grinder: a versatile amplicon and shotgun sequence simulator. (PubMed,eprint) Nucleic Acids Research Epub ahead of print (2012)

Registry entries: SciCrunch

hilive

real-time mapping of Illumina reads while sequencing

https://gitlab.com/rki_bioinformatics/HiLive2

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package hilive
Release	Version	Architectures
experimental	2.0a-5	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	2.0a-3	amd64,arm64,armhf,i386
bookworm	2.0a-3	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 0 users (0 upd.)^*

License: DFSG free

Official Debian package

Please cite: Martin S. Lindner, Benjamin Strauch, Jakob M. Schulze, Simon H. Tausch, Piotr W. Dabrowski, Andreas Nitsche and Bernhard Y. Renard: HiLive: real-time mapping of illumina reads while sequencing. (PubMed) Bioinformatics 33(6):917-919 (2017)

hinge

long read genome assembler based on hinging

https://github.com/HingeAssembler/HINGE

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package hinge
Release	Version	Architectures
bullseye	0.5.0-6	amd64,arm64,armhf,i386
bookworm	0.5.0-7	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie	0.5.0-7	amd64,arm64,armel,armhf,i386,ppc64el,riscv64
sid	0.5.0-8	amd64,arm64,armhf,i386,ppc64el,riscv64

Popcon: 5 users (31 upd.)^*

License: DFSG free

Official Debian package

HINGE is a genome assembler that seeks to achieve optimal repeat resolution by distinguishing repeats that can be resolved given the data from those that cannot. This is accomplished by adding “hinges” to reads for constructing an overlap graph where only unresolvable repeats are merged. As a result, HINGE combines the error resilience of overlap-based assemblers with repeat-resolution capabilities of de Bruijn graph assemblers.

Please cite: Govinda M Kamath, Ilan Shomorony, Fei Xia, Thomas Courtade and David N Tse: HINGE: Long-read assembly achieves optimal repeat resolution. (PubMed,eprint) Genome Research (2017)

hisat2

graph-based alignment of short nucleotide reads to many genomes

https://daehwankimlab.github.io/hisat2/

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package hisat2
Release	Version	Architectures
trixie	2.2.1-5	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	2.2.1-4	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	2.2.1-5	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bullseye	2.2.1-2	amd64,arm64,armhf,i386
forky	2.2.1-5	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
upstream	2.2.2

Popcon: 11 users (33 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as against a single reference genome). Based on an extension of BWT for graphs, a graph FM index (GFM) was designed and implemented. In addition to using one global GFM index that represents a population of human genomes, HISAT2 uses a large set of small GFM indexes that collectively cover the whole genome (each index representing a genomic region of 56 Kbp, with 55,000 indexes needed to cover the human population). These small indexes (called local indexes), combined with several alignment strategies, enable rapid and accurate alignment of sequencing reads. This new indexing scheme is called a Hierarchical Graph FM index (HGFM).
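The two figures quoted for the local indexes can be checked with simple arithmetic. The sketch below assumes 1 Kbp = 1,000 bp and treats the regions as non-overlapping windows, which is a simplification; the function name is hypothetical.

```python
def total_span_bp(index_count, span_bp):
    """Bases spanned by equally sized, non-overlapping windows."""
    return index_count * span_bp


# Figures quoted in the description above.
LOCAL_INDEX_COUNT = 55_000
LOCAL_INDEX_SPAN_BP = 56_000

print(total_span_bp(LOCAL_INDEX_COUNT, LOCAL_INDEX_SPAN_BP))  # 3080000000
```

The product is about 3.08 Gbp, which is on the order of the size of one human genome.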

The package is enhanced by the following packages: multiqc

Please cite: Daehwan Kim, Joseph M. Paggi, Chanhee Park, Christopher Bennett and Steven L. Salzberg: Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature Biotechnology 37(8):907-915 (2019)

Registry entries: Bio.tools SciCrunch Bioconda

idba

iterative De Bruijn Graph short read assemblers

https://github.com/loneknightpy/idba

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package idba
Release	Version	Architectures
bookworm	1.1.3-8	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	1.1.3-8	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	1.1.3-7	amd64,arm64,armhf,i386
sid	1.1.3-8	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	1.1.3-8	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

IDBA stands for iterative de Bruijn graph assembler. In computational sequence biology, an assembler solves the puzzle coming from large sequencing machines that feature many gigabytes of short reads from a large genome.

This package provides several flavours of the IDBA assembler, as they all share the same source tree but serve different purposes and evolved over time.

IDBA is the basic iterative de Bruijn graph assembler for second-generation sequencing reads. IDBA-UD, an extension of IDBA, is designed to utilize paired-end reads to assemble low-depth regions and use progressive depth on contigs to reduce errors in high-depth regions. It is a general-purpose assembler and especially good for single-cell and metagenomic sequencing data. IDBA-Hybrid is another updated version of IDBA-UD, which can make use of a similar reference genome to improve the assembly result. IDBA-Tran is an iterative de Bruijn graph assembler for RNA-Seq data.
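For readers unfamiliar with de Bruijn graphs, the sketch below builds one from reads for a given k: each k-mer contributes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. It only shows how the graph changes with k; the way IDBA carries contigs from one k to the next is not reproduced, and the function name is hypothetical.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


reads = ["ACGTC", "CGTCA"]
for k in (3, 4):
    # Larger k gives longer, more specific nodes but needs longer overlaps.
    print(k, dict(de_bruijn(reads, k)))
```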

Please cite: Yu Peng, Henry C. M. Leung, S. M. Yiu and Francis Y. L. Chin: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. (PubMed,eprint) Bioinformatics 28(11):1420-1428 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

igor

infers V(D)J recombination processes from sequencing data

https://github.com/qmarcou/IGoR/

Maintainer: Debian Med Packaging Team (Lance Lin)

Versions of package igor
Release	Version	Architectures
sid	1.4.0+dfsg-5	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	1.4.0+dfsg-5	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	1.4.0+dfsg-5	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	1.4.0+dfsg-2	amd64,arm64,armhf,i386
bookworm	1.4.0+dfsg-4	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

IGoR (Inference and Generation of Repertoires) is versatile software for analyzing and modeling immune receptor generation, selection, mutation and all other processes.

Please cite: Quentin Marcou, Thierry Mora and Aleksandra M. Walczak: High-throughput immune repertoire analysis with IGoR. (PubMed,eprint) Nature Communications 9(1):561 (2018)

Registry entries: Bioconda

igv

Integrative Genomics Viewer

https://www.broadinstitute.org/igv/

Maintainer: Debian Med Packaging Team (Santiago Vila)

Versions of package igv
Release	Version	Architectures
trixie	2.18.5+dfsg-1	all
bookworm	2.16.0+dfsg-1	all
bullseye	2.6.3+dfsg-3 (non-free)	all
forky	2.18.5+dfsg-4	all
sid	2.18.5+dfsg-4	all
upstream	2.19.7

Popcon: 9 users (37 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems.

Please cite: James T Robinson, Helga Thorvaldsdóttir, Wendy Winckler, Mitchell Guttman, Eric S Lander, Gad Getz and Jill P Mesirov: Integrative genomics viewer. (PubMed,eprint) Nature Biotechnology 29(1):24–26 (2011)

Registry entries: Bio.tools SciCrunch Bioconda

iva

iterative virus sequence assembler

https://github.com/sanger-pathogens/iva

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package iva
Release	Version	Architectures
forky	1.0.11+ds-6	amd64,arm64,armhf,i386,ppc64el,riscv64
sid	1.0.11+ds-6	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64
bullseye	1.0.9+ds-11	amd64,arm64,armhf,i386
bookworm	1.0.11+ds-3	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie	1.0.11+ds-6	amd64,arm64,armel,armhf,i386,ppc64el,riscv64

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

IVA is a de novo assembler designed to assemble virus genomes that have no repeat sequences, using Illumina read pairs sequenced from mixed populations at extremely high depth.

IVA's main algorithm works by iteratively extending contigs using aligned read pairs. Its input can be just read pairs, or additionally you can provide an existing set of contigs to be extended. Alternatively, it can take reads together with a reference sequence.

Please cite: M. Hunt, A. Gall, S. H. Ong, J. Brener, B. Ferns, P. Goulder, E. Nastouli, J. A. Keane, P. Kellam and T. D. Otto: IVA: accurate de novo assembly of RNA virus genomes. (PubMed) Bioinformatics 31(14):2374-2376 (2015)

Registry entries: Bio.tools Bioconda

khmer

software package for efficient sequence analysis

https://khmer.readthedocs.org

Maintainer: Debian Med Packaging Team (Nilesh Patra)

Popcon: 0 users (0 upd.)^*

License: DFSG free

Official Debian package

Please cite: Michael R. Crusoe, Hussien F. Alameldin, Sherine Awad, Elmar Bucher, Adam Caldwell, Reed Cartwright, Amanda Charbonneau, Bede Constantinides, Greg Edvenson, Scott Fay, Jacob Fenton, Thomas Fenzl, Jordan Fish, Leonor Garcia-Gutierrez, Phillip Garland, Jonathan Gluck, Iván González, Sarah Guermond, Jiarong Guo, Aditi Gupta, Joshua R. Herr, Adina Howe, Alex Hyer, Andreas Härpfer, Luiz Irber, Rhys Kidd, David Lin, Justin Lippi, Tamer Mansour, Pamela McA'Nulty, Eric McDonald, Jessica Mizzi, Kevin D. Murray, Joshua R. Nahum, Kaben Nanlohy, Alexander Johan Nederbragt, Humberto Ortiz-Zuazaga, Jeramia Ory, Jason Pell, Charles Pepe-Ranney, Zachary N Russ, Erich Schwarz, Camille Scott, Josiah Seaman, Scott Sievert, Jared Simpson, Connor T. Skennerton, James Spencer, Ramakrishnan Srinivasan, Daniel Standage, James A. Stapleton, Joe Stein, Susan R Steinman, Benjamin Taylor, Will Trimble, Heather L. Wiencko, Michael Wright, Brian Wyss, Qingpeng Zhang, en zyme and C. Titus Brown: The khmer software package: enabling efficient sequence analysis. (2015)

Registry entries: Bio.tools SciCrunch Bioconda

kissplice

Detection of various kinds of polymorphisms in RNA-seq data

https://kissplice.prabi.fr/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package kissplice
Release	Version	Architectures
bookworm	2.6.2-2	amd64,arm64,mips64el,ppc64el
forky	2.6.7-2	amd64,arm64,ppc64el,riscv64
sid	2.6.7-2	amd64,arm64,loong64,ppc64el,riscv64
bullseye	2.5.3-3	amd64,arm64
trixie	2.6.7-2	amd64,arm64,ppc64el,riscv64

Debtags of package kissplice:
biology	nuceleic-acids
field	biology, biology:bioinformatics
interface	commandline
role	program
use	analysing
works-with	biological-sequence

Popcon: 8 users (32 upd.)^*

License: DFSG free

Official Debian package

KisSplice is a piece of software that enables the analysis of RNA-seq data with or without a reference genome. It is an exact local transcriptome assembler that allows one to identify SNPs, indels and alternative splicing events. It can deal with an arbitrary number of biological conditions, and will quantify each variant in each condition. It has been tested on Illumina datasets of up to 1G reads. Its memory consumption is around 5 GB for 100M reads.

Please cite: Gustavo AT Sacomoto, Janice Kielbassa, Rayan Chikhi, Raluca Uricaru, Pavlos Antoniou, Marie-France Sagot, Pierre Peterlongo and Vincent Lacroix: KISSPLICE: de-novo calling alternative splicing events from RNA-seq data. (PubMed,eprint) BMC Bioinformatics 13(Suppl 6):S5 (2012)

Registry entries: SciCrunch Bioconda

Topics: RNA-seq; RNA splicing; Gene structure

kraken

assigning taxonomic labels to short DNA sequences

http://ccb.jhu.edu/software/kraken/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package kraken
Release	Version	Architectures
sid	1.1.1-4	amd64,arm64,loong64,ppc64el,riscv64
trixie	1.1.1-4	amd64,arm64,ppc64el,riscv64
bookworm	1.1.1-4	amd64,arm64,mips64el,ppc64el
bullseye	1.1.1-2	amd64,arm64,armhf,i386
forky	1.1.1-4	amd64,arm64,ppc64el,riscv64

Popcon: 5 users (35 upd.)^*

License: DFSG free

Official Debian package

Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs. Kraken aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm.

In its fastest mode of operation, for a simulated metagenome of 100 bp reads, Kraken processed over 4 million reads per minute on a single core, over 900 times faster than Megablast and over 11 times faster than the abundance estimation program MetaPhlAn. Kraken's accuracy is comparable with Megablast, with slightly lower sensitivity and very high precision.

The package is enhanced by the following packages: jellyfish1 multiqc

Please cite: Derrick E Wood and Steven L Salzberg: Kraken: ultrafast metagenomic sequence classification using exact alignments. (PubMed,eprint) Genome Biol. 15(3):R46 (2014)

Registry entries: Bio.tools Bioconda

kraken2

taxonomic classification system using exact k-mer matches

https://www.ccb.jhu.edu/software/kraken2/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package kraken2
Release	Version	Architectures
bookworm	2.1.2-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	2.1.3-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	2.1.3-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	2.1.3-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	2.1.1-1	amd64,arm64,armhf,i386
upstream	2.17.1

Popcon: 5 users (32 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Kraken 2 is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer. The k-mer assignments inform the classification algorithm. [see: Kraken 1's Webpage for more details].
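The lowest common ancestor of a set of taxa can be sketched on a toy taxonomy stored as a child-to-parent mapping. The mapping, the taxon names chosen and the function names are illustrative assumptions, not Kraken's database format or code.

```python
def lineage(taxon, parent):
    """Path from a taxon up to the root, starting with the taxon itself."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lca(taxa, parent):
    """Lowest common ancestor of one or more taxa."""
    taxa = list(taxa)
    common = lineage(taxa[0], parent)
    for taxon in taxa[1:]:
        seen = set(lineage(taxon, parent))
        # Keep the leaf-to-root order so the first survivor is the lowest.
        common = [node for node in common if node in seen]
    return common[0]


# Toy child -> parent taxonomy.
PARENT = {
    "Escherichia coli": "Escherichia",
    "Escherichia albertii": "Escherichia",
    "Salmonella enterica": "Salmonella",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "root",
}

print(lca(["Escherichia coli", "Escherichia albertii"], PARENT))
print(lca(["Escherichia coli", "Salmonella enterica"], PARENT))
```

A k-mer shared by both Escherichia species would thus be labeled with the genus, while one also found in Salmonella would only be informative at the family level.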

Kraken 2 provides significant improvements to Kraken 1, with faster database build times, smaller database sizes, and faster classification speeds. These improvements were achieved by the following updates to the Kraken classification program:

 1. Storage of Minimizers: Instead of storing/querying entire k-mers,
    Kraken 2 stores minimizers (l-mers) of each k-mer. The length of
    each l-mer must be ≤ the k-mer length. Each k-mer is treated by
    Kraken 2 as if its LCA is the same as its minimizer's LCA.
 2. Introduction of Spaced Seeds: Kraken 2 also uses spaced seeds to
    store and query minimizers to improve classification accuracy.
 3. Database Structure: While Kraken 1 saved an indexed and sorted list
    of k-mer/LCA pairs, Kraken 2 uses a compact hash table. This hash
    table is a probabilistic data structure that allows for faster
    queries and lower memory requirements. However, this data structure
    does have a <1% chance of returning the incorrect LCA or returning
    an LCA for a non-inserted minimizer. Users can compensate for this
    possibility by using Kraken's confidence scoring thresholds.
 4. Protein Databases: Kraken 2 allows for databases built from amino
    acid sequences. When queried, Kraken 2 performs a six-frame
    translated search of the query sequences against the database.
 5. 16S Databases: Kraken 2 also provides support for databases not
    based on NCBI's taxonomy. Currently, these include the 16S
    databases: Greengenes, SILVA, and RDP.
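Item 1 above can be illustrated with a minimal minimizer function: among all l-mers of a k-mer, pick the smallest one under some ordering. Plain lexicographic ordering is used here as an assumption; the ordering Kraken 2 actually applies is not described in this text, and the function name is hypothetical.

```python
def minimizer(kmer, l):
    """Smallest length-l substring of kmer (lexicographic order assumed)."""
    if not 0 < l <= len(kmer):
        raise ValueError("l must be between 1 and the k-mer length")
    return min(kmer[i:i + l] for i in range(len(kmer) - l + 1))


# The l-mers of GATTACA for l = 3 are GAT, ATT, TTA, TAC and ACA.
print(minimizer("GATTACA", 3))  # ACA
```

Because neighbouring k-mers often share the same minimizer, storing minimizers instead of full k-mers shrinks the table that has to be kept.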

Please cite: Derrick E Wood and Steven L Salzberg: Kraken: ultrafast metagenomic sequence classification using exact alignments. (PubMed,eprint) Genome Biol. 15(3):R46 (2014)

Registry entries: Bio.tools Bioconda

last-align

genome-scale comparison of biological sequences

https://gitlab.com/mcfrith/last

Maintainer: Debian Med Packaging Team (Charles Plessy)

Versions of package last-align
Release	Version	Architectures
bullseye	1179-1	amd64,arm64,armhf,i386
sid	1651-1	amd64,arm64,loong64,ppc64el,riscv64,s390x
bookworm	1447-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	1609-1	amd64,arm64,ppc64el,riscv64,s390x
forky	1651-1	amd64,arm64,ppc64el,riscv64,s390x

Popcon: 10 users (36 upd.)^*

License: DFSG free

Official Debian package

LAST is software for comparing and aligning sequences, typically DNA or protein sequences. LAST is similar to BLAST, but it copes better with very large amounts of sequence data. Here are two things LAST is good at:

Comparing large (e.g. mammalian) genomes.
Mapping lots of sequence tags onto a genome.

The main technical innovation is that LAST finds initial matches based on their multiplicity, instead of using a fixed size (e.g. BLAST uses 10-mers). This allows one to map tags to genomes without repeat-masking, without becoming overwhelmed by repetitive hits. To find these variable-sized matches, it uses a suffix array (inspired by Vmatch). To achieve high sensitivity, it uses a discontiguous suffix array, analogous to spaced seeds.
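Choosing initial matches by multiplicity rather than by fixed size can be sketched as follows: starting at a query position, the match is lengthened until it occurs no more than a given number of times in the reference. The sketch counts occurrences naively with string search instead of a suffix array, and the function names and the threshold parameter are hypothetical.

```python
def count_occurrences(text, pattern):
    """Number of (possibly overlapping) occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def adaptive_seed(reference, query, start, max_multiplicity):
    """Shortest match from query[start] that is rare enough in reference.

    Returns None if the query stops matching before becoming rare enough.
    """
    for end in range(start + 1, len(query) + 1):
        seed = query[start:end]
        hits = count_occurrences(reference, seed)
        if hits == 0:
            return None
        if hits <= max_multiplicity:
            return seed
    return None


reference = "ACACACGTAC"
print(adaptive_seed(reference, "ACGT", 0, 1))
```

In a repetitive region the seed simply grows longer, which is why repeat-masking is not needed to keep the number of hits manageable.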

Please cite: Martin C. Frith, Raymond Wan and Paul Horton: Incorporating sequence quality data into alignment improves DNA read mapping. (PubMed,eprint) Nucl. Acids Res. 38(7):e100 (2010)

Registry entries: Bio.tools SciCrunch Bioconda

libvcflib-tools

C++ library for parsing and manipulating VCF files (tools)

https://github.com/vcflib/vcflib

Maintainer: Debian Med Packaging Team (Santiago Vila)

Versions of package libvcflib-tools
Release	Version	Architectures
forky	1.0.12+dfsg-2	amd64,arm64,ppc64el,riscv64
bullseye	1.0.2+dfsg-2	amd64,arm64,armhf,i386
sid	1.0.12+dfsg-2	amd64,arm64,loong64,ppc64el,riscv64
trixie	1.0.12+dfsg-1	amd64,arm64,ppc64el,riscv64
bookworm	1.0.3+dfsg-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream	1.0.14

Popcon: 4 users (36 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

The Variant Call Format (VCF) is a flat-file, tab-delimited textual format intended to concisely describe reference-indexed variations between individuals. VCF provides a common interchange format for the description of variation in individuals and populations of samples, and has become the de facto standard reporting format for a wide array of genomic variant detectors.

vcflib provides methods to manipulate and interpret sequence variation as it can be described by VCF. It is both:

an API for parsing and operating on records of genomic variation as it can be described by the VCF format,
and a collection of command-line utilities for executing complex manipulations on VCF files.

This package contains several tools using the library.
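To make the tab-delimited layout concrete, here is a minimal Python sketch that splits one VCF data line into its eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). It does not use vcflib; the function name parse_vcf_line and the returned dictionary layout are assumptions made for illustration only.

```python
def parse_vcf_line(line):
    """Parse the eight fixed columns of a single VCF data line (no header lines)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]

    info_dict = {}
    if info != ".":
        for item in info.split(";"):
            key, sep, value = item.partition("=")
            # Keys without "=" are flags; store them as True.
            info_dict[key] = value if sep else True

    return {
        "chrom": chrom,
        "pos": int(pos),                      # 1-based position
        "id": vid,
        "ref": ref,
        "alt": alt.split(","),                # several alternate alleles are comma-separated
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": info_dict,
    }


record = parse_vcf_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;DB")
```

Any columns after INFO (FORMAT and per-sample genotypes) are ignored by this sketch.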

macs

Model-based Analysis of ChIP-Seq on short-read sequencers

https://github.com/taoliu/MACS/

Maintainer: Debian Med Packaging Team (Alexandre Detiste)

Versions of package macs
Release	Version	Architectures
bookworm	2.2.7.1-6	amd64,arm64,armel,armhf,i386,ppc64el,s390x
bullseye	2.2.7.1-3	amd64,arm64,armhf,i386
sid	3.0.2-2	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
forky	3.0.2-2	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	3.0.2-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
upstream	3.0.3

Popcon: 10 users (33 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, is publicly available open source, and can be used for ChIP-Seq with or without control samples.
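The idea of a "dynamic" Poisson rate can be sketched as follows: the expected tag count at a candidate site is taken as the largest of a genome-wide background rate and several local rates estimated from windows of the control sample, and the site is then scored with a Poisson upper-tail probability. This is a simplified sketch, not the MACS code; the function names, the window sizes and the example numbers are assumptions made for illustration.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)   # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i     # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def local_lambda(genome_rate, window_counts, peak_width):
    """Largest expected tag count among the background and all local windows.

    genome_rate   -- control tags per base pair, genome-wide
    window_counts -- {window size in bp: control tags in that window}
    peak_width    -- width of the candidate region in bp
    """
    candidates = [genome_rate * peak_width]
    for size, count in window_counts.items():
        candidates.append(count * peak_width / size)
    return max(candidates)


# Hypothetical numbers: a 200 bp candidate region with 8 ChIP tags.
lam = local_lambda(0.001, {1000: 5, 5000: 10, 10000: 12}, 200)
p_value = poisson_sf(8, lam)
```

Taking the maximum makes the test more conservative in regions where the control is locally enriched, which is how local biases are kept from producing spurious peaks.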

Please cite: Yong Zhang, Tao Liu, Clifford A Meyer, Jérôme Eeckhoute, David S. Johnson, Bradley E. Bernstein, Chad Nussbaum, Richard M. Myers, Myles Brown, Wei Li and X Shirley Liu: Model-based Analysis of ChIP-Seq (MACS). (PubMed,eprint) Genome Biol. 9(9):R137 (2008)

Registry entries: Bio.tools SciCrunch Bioconda

mapdamage

tracking and quantifying damage patterns in ancient DNA sequences

https://ginolhac.github.io/mapDamage/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package mapdamage
Release	Version	Architectures
sid	2.2.3+dfsg-2	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64
trixie	2.2.2+dfsg-1	all
forky	2.2.3+dfsg-2	amd64,arm64,armhf,i386,ppc64el,riscv64
bookworm	2.2.1+dfsg-3	all
bullseye	2.2.1+dfsg-1	all

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

MapDamage is a computational framework written in Python and R, which tracks and quantifies DNA damage patterns among ancient DNA sequencing reads generated by Next-Generation Sequencing platforms.

MapDamage is developed at the Centre for GeoGenetics by the Orlando Group.

Please cite: Hákon Jónsson, Aurélien Ginolhac, Mikkel Schubert, Philip Johnson and Ludovic Orlando: mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. (PubMed,eprint) Bioinformatics 29(13):1682-4 (2013)

Registry entries: SciCrunch Bioconda

mapsembler2

bioinformatics targeted assembly software

http://colibread.inria.fr/mapsembler2/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package mapsembler2
Release	Version	Architectures
trixie	2.2.4+dfsg1-4	amd64,arm64,ppc64el,s390x
bullseye	2.2.4+dfsg1-3	amd64,arm64
sid	2.2.4+dfsg1-5	amd64,arm64,ppc64el,s390x
bookworm	2.2.4+dfsg1-4	amd64,arm64,ppc64el,s390x

Popcon: 5 users (31 upd.)^*

License: DFSG free

Official Debian package

Mapsembler2 is targeted assembly software. It takes as input a set of NGS raw reads (fasta or fastq, gzipped or not) and a set of input sequences (starters).

It first determines whether each starter is read-coherent, i.e. whether reads confirm the presence of each starter in the original sequence. Then, for each read-coherent starter, Mapsembler2 outputs its sequence neighborhood as a linear sequence or as a graph, depending on the user's choice.

Mapsembler2 may be used to (among other things):

Validate an assembled sequence (input as starter), e.g. from a de Bruijn graph assembly where read-coherence was not enforced.
Check whether a gene (input as starter) has a homolog in a set of reads.
Check whether a known enzyme is present in a metagenomic NGS read set.
Enrich unmappable reads by extending them, possibly making them mappable.
Check what happens at the extremities of a contig.
Remove contaminant or symbiont reads from a read set.

Please cite: Pierre Peterlongo and Rayan Chikhi: Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer. (PubMed) BMC Bioinformatics 13:48 (2012)

Registry entries: Bio.tools Bioconda

maq

maps short fixed-length polymorphic DNA sequence reads to reference sequences

http://maq.sourceforge.net/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package maq
Release	Version	Architectures
bullseye	0.7.1-9	amd64,arm64,armhf,i386
sid	0.7.1-10	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bookworm	0.7.1-9	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	0.7.1-10	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	0.7.1-10	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Debtags of package maq:
biology	nuceleic-acids
field	biology, biology:bioinformatics
interface	commandline
role	program
scope	utility
use	analysing, comparing, searching
works-with-format	plaintext

Popcon: 13 users (35 upd.)^*

License: DFSG free

Official Debian package

Maq (short for Mapping and Assembly with Quality) builds mapping assemblies from short reads generated by next-generation sequencing machines. It was designed particularly for the Illumina-Solexa 1G Genetic Analyzer, and has preliminary functionality to handle ABI SOLiD data. Maq was previously known as mapass2.

Development of Maq stopped in 2008. Its successors are BWA and SAMtools.

Please cite: Heng Li, Jue Ruan and Richard Durbin: Mapping short DNA sequencing reads and calling variants using mapping quality scores. (PubMed,eprint) Genome Research 18(11):1851-1858 (2008)

Registry entries: Bio.tools SciCrunch

maqview

graphical read alignment viewer for short gene sequences

https://maq.sourceforge.net/maqview.shtml

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package maqview
Release	Version	Architectures
sid	0.2.5-12	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	0.2.5-12	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	0.2.5-12	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	0.2.5-10	amd64,arm64,armhf,i386
bookworm	0.2.5-11	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

Maqview is a graphical read alignment viewer. It is specifically designed for the Maq alignment file and allows you to see the mismatches, base qualities and mapping qualities. Maqview is nothing as fancy as Consed or GAP, just a simple viewer for you to see what happens in a particular region.

In comparison to tgap-maq, the text-based read alignment viewer written by James Bonfield, Maqview is faster and takes up much less memory and disk space in indexing. This is possibly because tgap aims to be a general-purpose viewer, whereas Maqview makes full use of the fact that a Maq alignment file has already been sorted. Maqview is also efficient in viewing and provides a command-line tool to quickly retrieve any region in a Maq alignment file.

Please cite: Heng Li, Jue Ruan and Richard Durbin: Mapping short DNA sequencing reads and calling variants using mapping quality scores. (PubMed,eprint) Genome Research 18(11):1851-1858 (2008)

Registry entries: Bio.tools SciCrunch

mhap

locality-sensitive hashing to detect long-read overlaps

http://mhap.readthedocs.org/en/stable/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package mhap
Release	Version	Architectures
trixie	2.1.3+dfsg-3	all
forky	2.1.3+dfsg-3	all
bullseye	2.1.3+dfsg-3	all
bookworm	2.1.3+dfsg-3	all
sid	2.1.3+dfsg-3	all

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

The MinHash Alignment Process (MHAP, pronounced MAP) is a reference implementation of a probabilistic sequence overlapping algorithm, designed to efficiently detect all overlaps between noisy long-read sequence data. It efficiently estimates Jaccard similarity by compressing sequences to their representative fingerprints composed of min-mers (minimum k-mers).
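The MinHash estimate of Jaccard similarity mentioned above can be illustrated with a short Python sketch. It is a generic MinHash, not MHAP's Java implementation: the function names, the choice of BLAKE2b as hash function and the parameters k and num_hashes are assumptions made for illustration.

```python
import hashlib


def kmers(seq, k):
    """Set of all k-mers of seq (seq must be at least k characters long)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """One member of a family of 64-bit hash functions, selected by seed."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def sketch(seq, k, num_hashes):
    """Fingerprint of seq: for each hash function, the minimum value over all k-mers."""
    ks = kmers(seq, k)
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of hash functions whose minimum coincides in both sketches."""
    matches = sum(x == y for x, y in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


def exact_jaccard(seq_a, seq_b, k):
    """Exact Jaccard similarity of the two k-mer sets, for comparison."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    return len(a & b) / len(a | b)
```

Comparing two fixed-size sketches costs time proportional to num_hashes regardless of read length, which is what makes all-against-all overlap detection tractable; a larger num_hashes gives a tighter estimate.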

Please cite: Konstantin Berlin, Sergey Koren, Chen-Shan Chin, James P Drake, Jane M Landolin and Adam M Phillippy: Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. (PubMed) Nature Biotechnology 33(6):623–630 (2015)

Registry entries: Bioconda

microbiomeutil

Microbiome Analysis Utilities

https://microbiomeutil.sourceforge.net/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package microbiomeutil
Release	Version	Architectures
sid	20101212+dfsg1-6	all
bookworm	20101212+dfsg1-5	all
trixie	20101212+dfsg1-6	all
forky	20101212+dfsg1-6	all
bullseye	20101212+dfsg1-4	all

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

The microbiomeutil package comes with the following utilities:

ChimeraSlayer: ChimeraSlayer for chimera detection.
NAST-iEr: NAST-based alignment tool.
WigeoN: A reimplementation of the Pintail 16S anomaly detection utility
RESOURCES: Reference 16S sequences and NAST-alignments that the tools above leverage.

Please cite: Brian J. Haas, Dirk Gevers, Ashlee M. Earl, Mike Feldgarden, Doyle V. Ward, Georgia Giannoukos, Dawn Ciulla, Diana Tabbaa, Sarah K. Highlander, Erica Sodergren, Barbara Methé, Todd Z. DeSantis, The Human Microbiome Consortium, Joseph F. Petrosino, Rob Knight and Bruce W. Birren: Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. (PubMed,eprint) Genome Research 21(3):494-504 (2011)

Registry entries: SciCrunch

mira-assembler

Whole Genome Shotgun and EST Sequence Assembler

https://sourceforge.net/p/mira-assembler/wiki/Home/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package mira-assembler
Release	Version	Architectures
trixie	4.9.6-11	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	4.9.6-5	amd64,arm64,armhf,i386
bookworm	4.9.6-7	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
forky	4.9.6-11	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	4.9.6-12	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x

Popcon: 5 users (34 upd.)^*

License: DFSG free

Official Debian package

The mira genome fragment assembler is a specialised assembler for sequencing projects classified as 'hard' due to a high number of similar repeats. For expressed sequence tag (EST) transcripts, miraEST specialises in reconstructing pristine mRNA transcripts while detecting and classifying single nucleotide polymorphisms (SNPs) occurring in different variations thereof.

The assembler is routinely used for such various tasks as mutation detection in different cell types, similarity analysis of transcripts between organisms, and pristine assembly of sequences from various sources for oligo design in clinical microarray experiments.

The package provides the following executables:

mira: assembles genome sequences.
miramem: estimates the memory needed to assemble projects.
mirabait: a "grep"-like tool to select reads with k-mers up to 256 bases.
miraconvert: converts, extracts and sometimes recalculates all kinds of data related to sequence assembly files.

Please cite: Bastien Chevreux, Thomas Pfisterer, Bernd Drescher, Albert J. Driesel, Werner E. G. Müller, Thomas Wetter and Sándor Suhai: Using the miraEST Assembler for Reliable and Automated mRNA Transcript Assembly and SNP Detection in Sequenced ESTs. (PubMed,eprint) Genome Research 14(6):1147-1159 (2004)

Registry entries: Bio.tools SciCrunch Bioconda

mothur

sequence analysis suite for research on microbiota

https://www.mothur.org

Maintainer: Debian Med Packaging Team (Tomasz Buchert)

Versions of package mothur
Release	Version	Architectures
bullseye	1.44.3-2	amd64,arm64,armhf,i386
forky	1.48.5-2	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	1.48.5-2	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
trixie	1.48.1-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	1.48.0-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 5 users (36 upd.)^*

License: DFSG free

Official Debian package

Mothur seeks to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community. It has incorporated the functionality of dotur, sons, treeclimber, s-libshuff, unifrac, and much more. In addition to improving the flexibility of these algorithms, a number of other features, including calculators and visualization tools, have been added.

Please cite: Patrick D Schloss, Sarah L Westcott, Thomas Ryabin, Justine R Hall, Martin Hartmann, Emily B Hollister, Ryan A Lesniewski, Brian B Oakley, Donovan H Parks, Courtney J Robinson, Jason W Sahl, Blaz Stres, Gerhard G Thallinger, David J Van Horn and Carolyn F Weber: Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. (PubMed) Appl Environ Microbiol 75(23):7537-7541 (2009)

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Microbial ecology

nanopolish

consensus caller for nanopore sequencing data

https://github.com/jts/nanopolish

Maintainer: Debian Med Packaging Team (Nilesh Patra)

Versions of package nanopolish
Release	Version	Architectures
bullseye	0.13.2-3	amd64,arm64,armhf,i386
sid	0.14.0-2	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64
forky	0.14.0-2	amd64,arm64,armhf,i386,ppc64el,riscv64
trixie	0.14.0-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64
bookworm	0.14.0-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Nanopolish uses a signal-level hidden Markov model for consensus calling of nanopore genome sequencing data. It can perform signal-level analysis of Oxford Nanopore sequencing data. Nanopolish can calculate an improved consensus sequence for a draft genome assembly, detect base modifications, call SNPs and indels with respect to a reference genome and more.

Registry entries: Bio.tools SciCrunch Bioconda

paleomix

pipelines and tools for the processing of ancient and modern HTS data

https://geogenetics.ku.dk/publications/paleomix

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package paleomix
Release	Version	Architectures
bullseye	1.3.2-1	amd64,arm64
sid	1.3.10-1	amd64,arm64
forky	1.3.10-1	amd64,arm64
trixie	1.3.8-2	amd64,arm64
bookworm	1.3.7-3	amd64,arm64

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

The PALEOMIX pipelines are a set of pipelines and tools designed to aid the rapid processing of High-Throughput Sequencing (HTS) data: The BAM pipeline processes de-multiplexed reads from one or more samples, through sequence processing and alignment, to generate BAM alignment files useful in downstream analyses; the Phylogenetic pipeline carries out genotyping and phylogenetic inference on BAM alignment files, either produced using the BAM pipeline or generated elsewhere; and the Zonkey pipeline carries out a suite of analyses on low coverage equine alignments, in order to detect the presence of F1-hybrids in archaeological assemblages. In addition, PALEOMIX aids in metagenomic analysis of the extracts.

The pipelines have been designed with ancient DNA (aDNA) in mind and include several features especially useful for the analysis of ancient samples, but all of them can also be used for the processing of modern samples, in order to ensure consistent data processing.

Please cite: Mikkel Schubert, Luca Ermini, Clio Der Sarkissian, Hákon Jónsson, Aurélien Ginolhac, Robert Schaefer, Michael D Martin, Ruth Fernández, Martin Kircher, Molly McCue, Eske Willerslev and Ludovic Orlando: Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX. (PubMed) Nature Protocols 9(5):1056-82 (2014)

Registry entries: Bio.tools SciCrunch

pbhoney

genomic structural variation discovery

http://sourceforge.net/projects/pb-jelly

Maintainer: Debian Med Packaging Team (Andreas Tille)

Popcon: 0 users (0 upd.)^*

License: DFSG free

Official Debian package

PBHoney is an implementation of two variant-identification approaches designed to exploit the high mappability of long reads (i.e., greater than 10,000 bp). PBHoney considers both intra-read discordance and soft-clipped tails of long reads to identify structural variants.

PBHoney is part of the PBSuite.

pbjelly

genome assembly upgrading tool

http://sourceforge.net/projects/pb-jelly

Maintainer: Debian Med Packaging Team (Andreas Tille)

Popcon: 0 users (0 upd.)^*

License: DFSG free

Official Debian package

PBJelly is a highly automated pipeline that aligns long sequencing reads (such as PacBio RS reads or long 454 reads in fasta format) to high-confidence draft assemblies. PBJelly fills or reduces as many captured gaps as possible to produce upgraded draft genomes.

PBJelly is part of the PBSuite.

pbsuite

software for Pacific Biosciences sequencing data

http://sourceforge.net/projects/pb-jelly

Maintainer: Debian Med Packaging Team (Andreas Tille)

Popcon: 0 users (0 upd.)^*

License: DFSG free

Official Debian package

The PBSuite contains two projects created for analysis of Pacific Biosciences long-read sequencing data.

PBJelly - genome upgrading tool
PBHoney - structural variation discovery

picard-tools

Command line tools to manipulate SAM and BAM files

https://broadinstitute.github.io/picard/

Maintainer: Debian Med Packaging Team (Pierre Gruet)

Versions of package picard-tools
Release	Version	Architectures
trixie	3.3.0+dfsg-2	all
forky	3.4.0+dfsg-1	all
bookworm	2.27.5+dfsg-2	all
sid	3.4.0+dfsg-1	all
bullseye	2.24.1+dfsg-1	all

Popcon: 11 users (37 upd.)^*

License: DFSG free

Official Debian package

SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments. Picard Tools includes these utilities to manipulate SAM and BAM files:

 AddCommentsToBam                  FifoBuffer
 AddOrReplaceReadGroups            FilterSamReads
 BaitDesigner                      FilterVcf
 BamIndexStats                     FixMateInformation
                                   GatherBamFiles
 BedToIntervalList                 GatherVcfs
 BuildBamIndex                     GenotypeConcordance
 CalculateHsMetrics                IlluminaBasecallsToFastq
 CalculateReadGroupChecksum        IlluminaBasecallsToSam
 CheckIlluminaDirectory            LiftOverIntervalList
 CheckTerminatorBlock              LiftoverVcf
 CleanSam                          MakeSitesOnlyVcf
 CollectAlignmentSummaryMetrics    MarkDuplicates
 CollectBaseDistributionByCycle    MarkDuplicatesWithMateCigar
 CollectGcBiasMetrics              MarkIlluminaAdapters
 CollectHiSeqXPfFailMetrics        MeanQualityByCycle
 CollectIlluminaBasecallingMetrics MergeBamAlignment
 CollectIlluminaLaneMetrics        MergeSamFiles
 CollectInsertSizeMetrics          MergeVcfs
 CollectJumpingLibraryMetrics      NormalizeFasta
 CollectMultipleMetrics            PositionBasedDownsampleSam
 CollectOxoGMetrics                QualityScoreDistribution
 CollectQualityYieldMetrics        RenameSampleInVcf
 CollectRawWgsMetrics              ReorderSam
 CollectRnaSeqMetrics              ReplaceSamHeader
 CollectRrbsMetrics                RevertOriginalBaseQualitiesAndAddMateCigar
 CollectSequencingArtifactMetrics  RevertSam
 CollectTargetedPcrMetrics         SamFormatConverter
 CollectVariantCallingMetrics      SamToFastq
 CollectWgsMetrics                 ScatterIntervalsByNs
 CompareMetrics                    SortSam
 CompareSAMs                       SortVcf
 ConvertSequencingArtifactToOxoG   SplitSamByLibrary
 CreateSequenceDictionary          SplitVcfs
 DownsampleSam                     UpdateVcfSequenceDictionary
 EstimateLibraryComplexity         ValidateSamFile
 ExtractIlluminaBarcodes           VcfFormatConverter
 ExtractSequences                  VcfToIntervalList
 FastqToSam                        ViewSam

The package is enhanced by the following packages: multiqc

Please cite: Broad Institute: Picard toolkit. Broad Institute, GitHub repository (2019)

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Sequencing; Document, record and content management

pirs

Profile based Illumina pair-end Reads Simulator

https://github.com/galaxy001/pirs

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package pirs
Release	Version	Architectures
forky	2.0.2+dfsg-12	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bookworm	2.0.2+dfsg-11	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye	2.0.2+dfsg-9	amd64,arm64,armhf,i386
sid	2.0.2+dfsg-12	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
trixie	2.0.2+dfsg-12	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

The program pIRS can be used for simulating Illumina PE reads, with a series of characteristics generated by the Illumina sequencing platform, such as insert size distribution, sequencing errors (substitution, insertion, deletion), quality scores and GC content-coverage bias.

The insert size follows a normal distribution, so users should set the mean value and standard deviation. Usually the standard deviation is set to 1/20 of the mean value. The normal distribution is simulated by the Box-Muller method.
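As an illustration of the insert-size model, the following Python sketch draws insert sizes with the Box-Muller transform, defaulting the standard deviation to 1/20 of the mean. It is not pIRS code; the function names, the fixed seed and the clamping of sizes to at least 1 are assumptions made for this example.

```python
import math
import random


def box_muller(rng):
    """One standard normal deviate from two uniform deviates (Box-Muller transform)."""
    u1 = 1.0 - rng.random()   # in (0, 1], so log(u1) is always defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def sample_insert_sizes(mean, n, sd=None, seed=0):
    """Draw n insert sizes from N(mean, sd^2), rounded to whole base pairs."""
    if sd is None:
        sd = mean / 20.0       # the rule of thumb quoted in the description
    rng = random.Random(seed)
    return [max(1, round(mean + sd * box_muller(rng))) for _ in range(n)]


sizes = sample_insert_sizes(500, 20000)
sample_mean = sum(sizes) / len(sizes)
sample_sd = math.sqrt(sum((s - sample_mean) ** 2 for s in sizes) / (len(sizes) - 1))
```

For a mean of 500 bp the default standard deviation is 25 bp, so nearly all simulated inserts fall between roughly 425 and 575 bp.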

The program simulates sequencing errors, quality scores and GC content-coverage bias according to empirical distribution profiles. Some default profiles, derived from large amounts of real sequencing data, are provided.

To simulate reads from a diploid genome, users should first simulate the diploid genome sequence by setting the ratio of heterozygous SNPs, heterozygous InDels and structural variation.

Please cite: Xuesong Hu, Jianying Yuan, Yujian Shi, Jianliang Lu, Binghang Liu, Zhenyu Li, Yanxiang Chen, Desheng Mu, Hao Zhang, Nan Li, Zhen Yue, Fan Bai, Heng Li and Wei Fan: pIRS: Profile-based Illumina pair-end reads simulator. (PubMed,eprint) Bioinformatics 28(11):1533-5 (2012)

Registry entries: Bioconda

pizzly

Identifies gene fusions in RNA sequencing data

https://github.com/pmelsted/pizzly

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package pizzly
Release	Version	Architectures
bookworm	0.37.3+ds-9	amd64,arm64,mips64el,ppc64el,s390x
sid	0.37.3+ds-11	amd64,arm64,ppc64el,riscv64,s390x
bullseye	0.37.3+ds-5	amd64,arm64,armhf,i386
trixie	0.37.3+ds-9	amd64,arm64,ppc64el,riscv64,s390x

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

For the interpretation of the transcriptome (the abundance and sequence of RNA) of tumour cells, one is particularly interested in transcripts that cannot be mapped to single genes but that appear to be fused from parts of two genes. Likely explanations are chromosomal translocations.

Pizzly can identify novel fusions of this kind, building on the interpretation of variable splicing by the tool kallisto. Both tools are elements of the bcbio workflow.

Registry entries: Bioconda

placnet

Plasmid Constellation Network project

http://sourceforge.net/projects/placnet/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

Placnet is a new tool for plasmid analysis in NGS projects. Placnet is optimized to work with Illumina sequences but it also works with 454, Ion Torrent or any other current sequencing technology.

The input of Placnet is a set of contigs and one or more SAM files with the mapping of the reads against the contigs. Placnet produces a set of files that are easily opened in Cytoscape or other network tools.

Please cite: Val F. Lanza, María de Toro, M. Pilar Garcillán-Barcia, Azucena Mora, Jorge Blanco, Teresa M. Coque and Fernando de la Cruz: Plasmid Flux in Escherichia coli ST131 Sublineages, Analyzed by Plasmid Constellation Network (PLACNET), a New Method for Plasmid Reconstruction from Whole Genome Sequences. (PubMed,eprint) PLOS Genetics 10(12):e1004766 (2014)

poretools

toolkit for nanopore nucleotide sequencing data

https://poretools.readthedocs.org

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package poretools
Release	Version	Architectures
bookworm	0.6.0+dfsg-6	all
trixie	0.6.0+dfsg-7	all
forky	0.6.0+dfsg-7	all
sid	0.6.0+dfsg-7	all
bullseye	0.6.0+dfsg-5	all

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

poretools is a flexible toolkit for exploring datasets generated by nanopore sequencing devices from MinION for the purposes of quality control and downstream analysis. Poretools operates directly on the native FAST5 (a variant of the HDF5 standard) file format produced by ONT and provides a wealth of format conversion utilities and data exploration and visualization tools.

Please cite: Nicholas Loman and Aaron Quinlan: Poretools: a toolkit for analyzing nanopore sequence data. (PubMed,eprint) Bioinformatics 30(23):3399-3401 (2014)

Registry entries: Bio.tools Bioconda

python3-airr

Data Representation Standard library for antibody and TCR sequences

https://docs.airr-community.org/en/latest/packages/airr-python/overview.html

Maintainer: Debian Python Team (Colin Watson)

Versions of package python3-airr
Release	Version	Architectures
bookworm	1.3.1-1	all
bullseye	1.3.1-1	all
trixie	1.5.1-1	all
forky	1.5.1-2	all
sid	1.5.1-2	all

Popcon: 7 users (33 upd.)^*

License: DFSG free

Official Debian package

This package provides a library by the AIRR community for describing, reporting, storing, and sharing adaptive immune receptor repertoire (AIRR) data, such as sequences of antibodies and T cell receptors (TCRs). Some specific efforts include:

The MiAIRR standard for describing minimal information about AIRR datasets, including sample collection and data processing information.
Data representations (file format) specifications for storing large amounts of annotated AIRR data.
APIs for exposing a common interface to repositories/databases containing AIRR data.
A community standard for software tools which will allow conforming tools to gain community recognition.

This package installs the library for Python 3.

python3-gffutils

Work with GFF and GTF files in a flexible database framework

https://daler.github.io/gffutils

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package python3-gffutils
Release	Version	Architectures
forky	0.13-4	all
sid	0.13-4	all
trixie	0.13-2	all
bookworm	0.11.1-3	all
bullseye	0.10.1-2	all

Popcon: 40 users (39 upd.)^*

License: DFSG free

Official Debian package

A Python package for working with and manipulating the GFF and GTF format files typically used for genomic annotations. Files are loaded into a sqlite3 database, allowing much more complex manipulation of hierarchical features (e.g., genes, transcripts, and exons) than is possible with plain-text methods alone.
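Why a database helps with hierarchical features can be shown with Python's standard sqlite3 module alone. The sketch below does not use gffutils and does not reproduce its schema: the table layout, the function names and the tiny GFF3 snippet are all hypothetical, chosen only to show a parent/child query that would be awkward with plain-text tools.

```python
import sqlite3

# Hypothetical GFF3 snippet: one gene, one transcript, two exons.
GFF = """\
chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1
chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=tx1;Parent=gene1
chr1\tdemo\texon\t100\t300\t.\t+\t.\tID=ex1;Parent=tx1
chr1\tdemo\texon\t600\t900\t.\t+\t.\tID=ex2;Parent=tx1
"""


def load(gff_text):
    """Load GFF3 lines into an in-memory database (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE features (id TEXT PRIMARY KEY, type TEXT, "
               "start_pos INTEGER, end_pos INTEGER)")
    db.execute("CREATE TABLE relations (parent TEXT, child TEXT)")
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        attrs = dict(item.split("=", 1) for item in cols[8].split(";"))
        db.execute("INSERT INTO features VALUES (?, ?, ?, ?)",
                   (attrs["ID"], cols[2], int(cols[3]), int(cols[4])))
        if "Parent" in attrs:
            db.execute("INSERT INTO relations VALUES (?, ?)",
                       (attrs["Parent"], attrs["ID"]))
    return db


def exons_of(db, feature_id):
    """IDs of all exons below feature_id at any depth, ordered by start."""
    rows = db.execute("""
        WITH RECURSIVE descendants(id) AS (
            SELECT child FROM relations WHERE parent = ?
            UNION
            SELECT r.child FROM relations r
            JOIN descendants d ON r.parent = d.id
        )
        SELECT f.id FROM features f
        JOIN descendants d ON f.id = d.id
        WHERE f.type = 'exon'
        ORDER BY f.start_pos
    """, (feature_id,))
    return [row[0] for row in rows]


db = load(GFF)
```

Here exons_of(db, "gene1") walks from the gene through its transcript to the exons in a single query, without re-scanning the annotation file.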

Registry entries: Bio.tools Bioconda

python3-presto

toolkit for processing B and T cell sequences (Python3 module)

https://presto.readthedocs.io

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package python3-presto
Release	Version	Architectures
sid	0.7.6-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	0.6.2-1	all
forky	0.7.6-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	0.7.2-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	0.7.1-1	all
upstream	0.7.8

Popcon: 7 users (32 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

pRESTO is a toolkit for processing raw reads from high-throughput sequencing of B cell and T cell repertoires.

Dramatic improvements in high-throughput sequencing technologies now enable large-scale characterization of lymphocyte repertoires, defined as the collection of trans-membrane antigen-receptor proteins located on the surface of B cells and T cells. The REpertoire Sequencing TOolkit (pRESTO) is composed of a suite of utilities to handle all stages of sequence processing prior to germline segment assignment. pRESTO is designed to handle either single reads or paired-end reads. It includes features for quality control, primer masking, annotation of reads with sequence embedded barcodes, generation of unique molecular identifier (UMI) consensus sequences, assembly of paired-end reads and identification of duplicate sequences. Numerous options for sequence sorting, sampling and conversion operations are also included.

This package provides the presto Python3 module.

Please cite: Jason A. Vander Heiden, Gur Yaari, Mohamed Uduman, Joel N.H. Stern, Kevin C. O’Connor, David A. Hafler, Francois Vigneault and Steven H. Kleinstein: pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. (PubMed,eprint) Bioinformatics 30(13):1930-1932 (2014)

Registry entries: Bio.tools SciCrunch Bioconda

python3-pybedtools

Python 3 wrapper around BEDTools for bioinformatics work

https://daler.github.io/pybedtools/

Maintainer: Debian Med Packaging Team (Santiago Vila)

Versions of package python3-pybedtools
Release	Version	Architectures
trixie	0.10.0-4	amd64,arm64,armel,armhf,i386,ppc64el,riscv64
sid	0.10.0-5	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64
bookworm	0.9.0-4	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
forky	0.10.0-5	amd64,arm64,armhf,i386,ppc64el,riscv64
bullseye	0.8.0-5	amd64,arm64,armhf,i386
upstream	0.12.0

Popcon: 34 users (41 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

The BEDTools suite of programs is widely used for genomic interval manipulation or “genome algebra”. pybedtools wraps and extends BEDTools and offers feature-level manipulations from within Python.

This is the Python 3 version.
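
As an illustration of the "genome algebra" mentioned above, the following minimal sketch intersects two sets of genomic intervals in plain Python. It does not use pybedtools or BEDTools; the `Interval` type and the `intersect` function are hypothetical names chosen for this example, and coordinates are assumed to be 0-based and half-open, as in the BED format.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def intersect(a, b):
    """Return the overlapping part of every overlapping pair from a and b."""
    result = []
    for x in a:
        for y in b:
            if x.chrom != y.chrom:
                continue
            start = max(x.start, y.start)
            end = min(x.end, y.end)
            if start < end:  # empty overlaps are dropped
                result.append(Interval(x.chrom, start, end))
    return sorted(result)


genes = [Interval("chr1", 100, 200), Interval("chr1", 500, 600)]
peaks = [Interval("chr1", 150, 550), Interval("chr2", 100, 200)]
overlaps = intersect(genes, peaks)
print(overlaps)
```

The quadratic loop is only suitable for small inputs; tools built for this purpose typically sort or index the intervals first.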

Please cite: R. K. Dale, B. S. Pedersen and A. R. Quinlan: Pybedtools: a flexible Python library for manipulating genomic datasets and annotations. Bioinformatics 27(24):3423-3424 (2011)

Registry entries: Bio.tools Bioconda

python3-sqt

SeQuencing Tools for biological DNA/RNA high-throughput data

https://bitbucket.org/marcelm/sqt

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package python3-sqt
Release	Version	Architectures
sid	0.8.0-9	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64
bullseye	0.8.0-4	amd64,arm64,armhf,i386
bookworm	0.8.0-6	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie	0.8.0-9	amd64,arm64,armel,armhf,i386,ppc64el,riscv64
forky	0.8.0-9	amd64,arm64,armhf,i386,ppc64el,riscv64

Popcon: 4 users (33 upd.)^*

License: DFSG free

Official Debian package

Git

sqt is a collection of command-line tools for working with high-throughput sequencing data. Conceptually not tied to any particular language, many sqt subcommands are currently implemented in Python. For them, a Python package is available with functions for reading and writing FASTA/FASTQ files, computing alignments, quality trimming, etc.

The following tools are offered:

sqt-coverage -- Compute per-reference statistics such as coverage and GC content
sqt-fastqmod -- FASTQ modifications: shorten, subset, reverse complement, quality trimming.
sqt-fastastats -- Compute N50, min/max length, GC content etc. of a FASTA file
sqt-qualityguess -- Guess quality encoding of one or more FASTQ files.
sqt-globalalign -- Compute a global or semiglobal alignment of two strings.
sqt-chars -- Count length of the first word given on the command line.
sqt-sam-cscq -- Add the CS and CQ tags to a SAM file with colorspace reads.
sqt-fastamutate -- Add substitutions and indels to sequences in a FASTA file.
sqt-fastaextract -- Efficiently extract one or more regions from an indexed FASTA file.
sqt-translate -- Replace characters in FASTA files (like the 'tr' command).
sqt-sam-fixn -- Replace all non-ACGT characters within reads in a SAM file.
sqt-sam-insertsize -- Mean and standard deviation of paired-end insert sizes.
sqt-sam-set-op -- Set operations (union, intersection, ...) on SAM/BAM files.
sqt-bam-eof -- Check for the End-Of-File marker in compressed BAM files.
sqt-checkfastqpe -- Check whether two FASTQ files contain correctly paired paired-end data.
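
To make the statistics named in the list above concrete, here is a minimal sketch of how N50, minimum and maximum length and GC content can be computed for FASTA-formatted text. It is written in plain Python and does not use the sqt package; `parse_fasta`, `n50` and `gc_content` are hypothetical helper names, not part of sqt.

```python
def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def n50(lengths):
    """Length L such that sequences of length >= L cover half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def gc_content(seq):
    """Fraction of G and C characters in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


example = ">contig1\nACGTAC\n>contig2\nGGCC\n>contig3\nAT\n>contig4\nTA\n"
records = dict(parse_fasta(example))
lengths = [len(s) for s in records.values()]
stats = {
    "n50": n50(lengths),
    "min": min(lengths),
    "max": max(lengths),
    "gc": gc_content("".join(records.values())),
}
print(stats)
```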

Registry entries: Bioconda

q2cli

Click-based command line interface for QIIME 2

https://qiime2.org/

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package q2cli
Release	Version	Architectures
bookworm	2022.11.1-2	all
sid	2024.5.0-2	all
bullseye	2020.11.1-1	all
upstream	2026.1.0

Popcon: 0 users (0 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results. Key features:

Integrated and automatic tracking of data provenance
Semantic type system
Plugin system for extending microbiome analysis functionality
Support for multiple types of user interfaces (e.g. API, command line, graphical)

QIIME 2 is a complete redesign and rewrite of the QIIME 1 microbiome analysis pipeline. QIIME 2 will address many of the limitations of QIIME 1, while retaining the features that make QIIME 1 a powerful and widely-used analysis pipeline.

QIIME 2 currently supports an initial end-to-end microbiome analysis pipeline. New functionality will regularly become available through QIIME 2 plugins. You can view a list of plugins that are currently available on the QIIME 2 plugin availability page. The future plugins page lists plugins that are being developed.

Please cite: Evan Bolyen, Jai Ram Rideout, Matthew R Dillon, Nicholas A Bokulich, Christian Abnet, Gabriel A Al-Ghalith, Harriet Alexander, Eric J Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E Bisanz, Kyle Bittinger, Asker Brejnrod, Colin J Brislawn, C Titus Brown, Benjamin J Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily Cope, Ricardo Da Silva, Pieter C Dorrestein, Gavin M Douglas, Daniel M Durall, Claire Duvallet, Christian F Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M Gauglitz, Deanna L Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin Huttley, Stefan Janssen, Alan K Jarmusch, Lingjing Jiang, Benjamin Kaehler, Kyo Bin Kang, Christopher R Keefe, Paul Keim, Scott T Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan GI Langille, Joslynn Lee, Ruth Ley, Yong-Xin Liu, Erikka Loftfield, Catherine Lozupone, Massoud Maher, Clarisse Marotz, Bryan D Martin, Daniel McDonald, Lauren J McIver, Alexey V Melnik, Jessica L Metcalf, Sydney C Morgan, Jamie Morton, Ahmad Turan Naimey, Jose A Navas-Molina, Louis Felix Nothias, Stephanie B Orchanian, Talima Pearson, Samuel L Peoples, Daniel Petras, Mary Lai Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam Rivers, Michael S Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R Spear, Austin D Swafford, Luke R Thompson, Pedro J Torres, Pauline Trinh, Anupriya Tripathi, Peter J Turnbaugh, Sabah Ul-Hasan, Justin JJ van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C Weber, Chase HD Williamson, Amy D Willis, Zhenjiang Zech Xu, Jesse R Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight and J Gregory Caporaso: Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. 
(eprint) Nature Biotechnology 37 (2019)

qcumber

quality control of genomic sequences

https://gitlab.com/RKIBioinformaticsPipelines/QCumber

Maintainer: Debian Med Packaging Team (Santiago Vila)

Popcon: 4 users (31 upd.)^*

License: DFSG free

Official Debian package

Git

QCPipeline is a tool for quality control. The workflow is as follows:

 1. Quality control with FastQC
 2. Trim Reads with Trimmomatic
 3. Quality control of trimmed reads with FastQC
 4. Map reads against reference using bowtie2
 5. Classify reads with Kraken

Registry entries: Bioconda

qiime

Quantitative Insights Into Microbial Ecology

https://qiime2.org

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package qiime
Release	Version	Architectures
sid	2024.5.0-1	all
bullseye	2020.11.1-1	all
bookworm	2022.11.1-2	all
upstream	2026.1.0

Popcon: 3 users (0 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Microbes surround us, as well as animals, plants and all their parasites, and strongly affect them and the environments they live in. Soil quality comes to mind, but so does the effect that bacteria have on each other. Humans influence the absolute and relative abundance of bacteria through antibiotics, food, fertilizers and more, and these changes in turn affect us.

Integrated and automatic tracking of data provenance
Semantic type system
Plugin system for extending microbiome analysis functionality
Support for multiple types of user interfaces (e.g. API, command line, graphical)

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Microbial ecology

quorum

QUality Optimized Reads of genomic sequences

https://github.com/gmarcais/Quorum

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package quorum
Release	Version	Architectures
sid	1.1.2-2	amd64,arm64,loong64,ppc64el,riscv64
bullseye	1.1.1-4	amd64,arm64
forky	1.1.2-2	amd64,arm64,ppc64el,riscv64
bookworm	1.1.1-7	amd64,arm64,mips64el,ppc64el
trixie	1.1.2-2	amd64,arm64,ppc64el,riscv64

Popcon: 7 users (31 upd.)^*

License: DFSG free

Official Debian package

Git

QuorUM enables users to obtain trimmed and error-corrected reads that result in assemblies with longer contigs and fewer errors. QuorUM performs best among published error correctors on several metrics. It is efficiently implemented, making use of current multi-core computing architectures, and is suitable for large data sets (1 billion bases checked and corrected per day per core). The third-party assembler SOAPdenovo benefits significantly from QuorUM error-corrected reads: for the data sets investigated, they improve N50 contig size by a factor of 1.1 to 4 compared with using the original reads.

Please cite: Guillaume Marçais, James A. Yorke and Aleksey Zimin: QuorUM: An Error Corrector for Illumina Reads. (PubMed,eprint) PLoS One 10(6):e0130821 (2015)

Registry entries: SciCrunch

r-bioc-deseq2

R package for RNA-Seq Differential Expression Analysis

https://bioconductor.org/packages/DESeq2/

Maintainer: Debian R Packages Maintainers (Michael R. Crusoe)

Versions of package r-bioc-deseq2
Release	Version	Architectures
bullseye	1.30.1+dfsg-1	amd64,arm64,armhf,i386
sid	1.46.0+dfsg-2	amd64,arm64,loong64,ppc64el,riscv64,s390x
bookworm	1.38.3+dfsg-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	1.46.0+dfsg-2	amd64,arm64,ppc64el,riscv64,s390x
upstream	1.50.2

Popcon: 16 users (34 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Differential gene expression analysis based on the negative binomial distribution. Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
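
The variance-mean dependence mentioned above can be illustrated with the usual mean and dispersion parameterisation of the negative binomial distribution, in which the variance is the mean plus the dispersion times the squared mean. The following sketch is plain Python rather than R and does not call DESeq2; `nb_variance` is a hypothetical helper, and the example values are made up for illustration.

```python
def nb_variance(mean, dispersion):
    """Variance of a negative binomial count with the given mean and dispersion.

    With dispersion 0 this reduces to the Poisson case (variance == mean).
    """
    return mean + dispersion * mean ** 2


# Made-up example: a gene with an expected count of 100 reads.
poisson_like = nb_variance(100, 0.0)
overdispersed = nb_variance(100, 0.1)
print(poisson_like, overdispersed)
```

The quadratic term is why highly expressed genes show far more count variability between replicates than a Poisson model would predict.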

Please cite: Michael I Love, Wolfgang Huber and Simon Anders: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. (eprint) Genome Biol 15(12) (2014)

Registry entries: Bio.tools SciCrunch Bioconda

r-bioc-edger

Empirical analysis of digital gene expression data in R

https://bioconductor.org/packages/edgeR/

Maintainer: Debian R Packages Maintainers (Charles Plessy)

Versions of package r-bioc-edger
Release	Version	Architectures
bookworm	3.40.2+dfsg-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	4.4.2+dfsg-1	amd64,arm64,loong64,ppc64el,riscv64,s390x
trixie	4.4.2+dfsg-1	amd64,arm64,ppc64el,riscv64,s390x
bullseye	3.32.1+dfsg-1	amd64,arm64,armhf,i386
upstream	4.8.2

Popcon: 21 users (33 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Bioconductor package for differential expression analysis of whole transcriptome sequencing (RNA-seq) and digital gene expression profiles with biological replication. It uses empirical Bayes estimation and exact tests based on the negative binomial distribution. It is also useful for differential signal analysis with other types of genome-scale count data.

Please cite: Mark D. Robinson, Davis J. McCarthy and Gordon K. Smyth: edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. (PubMed,eprint) Bioinformatics 26:139-140 (2010)

Registry entries: Bio.tools SciCrunch Bioconda

r-bioc-hilbertvis

GNU R package to visualise long vector data

https://bioconductor.org/packages/HilbertVis

Maintainer: Debian R Packages Maintainers (Michael R. Crusoe)

Versions of package r-bioc-hilbertvis
Release	Version	Architectures
bullseye	1.48.0-1	amd64,arm64,armhf,i386
bookworm	1.56.0-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	1.64.0-2	amd64,arm64,loong64,ppc64el,riscv64,s390x
trixie	1.64.0-2	amd64,arm64,ppc64el,riscv64,s390x
upstream	1.68.0

Popcon: 7 users (32 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

This tool allows one to display very long data vectors in a space-efficient manner by organising them along a 2D Hilbert curve. The user can then visually judge the large-scale structure and distribution of features simultaneously with the rough shape and intensity of individual features.

In bioinformatics, a typical use case is ChIP-Chip and ChIP-Seq data, or more generally any kind of genomic data that is conventionally displayed as a quantitative track ("wiggle data") in genome browsers such as those provided by Ensembl or UCSC.
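
The idea of laying a long vector out along a 2D Hilbert curve can be sketched as follows. This is plain Python, not the HilbertVis R API: `d2xy` is the standard textbook conversion from a position along the curve to grid coordinates, and `hilbert_image` is a hypothetical helper that places one value per cell (a real visualisation would usually bin the vector first).

```python
def d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n-by-n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate the current sub-square
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_image(values, order):
    """Place values along a Hilbert curve on a 2**order square grid."""
    n = 2 ** order
    grid = [[0] * n for _ in range(n)]
    for d, value in enumerate(values[: n * n]):
        x, y = d2xy(n, d)
        grid[y][x] = value
    return grid


image = hilbert_image(list(range(16)), order=2)
for row in image:
    print(row)
```

Because consecutive positions along the curve always land in neighbouring cells, features that are close together in the vector stay close together in the image.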

Please cite: Simon Anders: Visualization of genomic data with the Hilbert curve. (PubMed,eprint) Bioinformatics 25(10):1231-1235 (2009)

Registry entries: Bio.tools SciCrunch Bioconda

r-bioc-metagenomeseq

GNU R statistical analysis for sparse high-throughput sequencing

https://bioconductor.org/packages/metagenomeSeq/

Maintainer: Debian R Packages Maintainers (Michael R. Crusoe)

Versions of package r-bioc-metagenomeseq
Release	Version	Architectures
trixie	1.48.1-1	all
bookworm	1.40.0-1	all
bullseye	1.32.0-1	all
sid	1.48.1-1	all
upstream	1.52.0

Popcon: 5 users (32 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

MetagenomeSeq is designed to determine features (be it Operational Taxonomic Units (OTUs), species, etc.) that are differentially abundant between two or more groups of multiple samples. metagenomeSeq is designed to address the effects of both normalization and under-sampling of microbial communities on disease association detection and the testing of feature correlations.

Registry entries: Bio.tools Bioconda

r-bioc-rsubread

Subread Sequence Alignment and Counting for R

https://bioconductor.org/packages/Rsubread/

Maintainer: Debian R Packages Maintainers (Michael R. Crusoe)

Versions of package r-bioc-rsubread
Release	Version	Architectures
sid	2.20.0-2	amd64,arm64,loong64,ppc64el,riscv64,s390x
bullseye	2.4.2-1	amd64,arm64
bookworm	2.12.2-1	amd64,arm64,mips64el,ppc64el,s390x
trixie	2.20.0-2	amd64,arm64,ppc64el,riscv64,s390x
upstream	2.24.0

Popcon: 14 users (1 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Alignment, quantification and analysis of second and third generation sequencing data. Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery.

Can be applied to all major sequencing technologies and to both short and long sequence reads.

Please cite: Yang Liao, Gordon K Smyth and Wei Shi: The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. (eprint) Nucleic Acids Research 47(8):e47 (2019)

Registry entries: Bio.tools Bioconda

r-cran-alakazam

Immunoglobulin Clonal Lineage and Diversity Analysis

https://cran.r-project.org/package=alakazam

Maintainer: Debian R Packages Maintainers (Charles Plessy)

Versions of package r-cran-alakazam
Release	Version	Architectures
trixie	1.3.0-1	amd64,arm64,ppc64el,riscv64,s390x
sid	1.4.2-1	amd64,arm64,loong64,ppc64el,riscv64
experimental	1.3.0-2~0exp0	amd64,arm64,ppc64el,riscv64,s390x
sid	1.3.0-1	s390x
bookworm	1.2.1-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye	1.1.0-1	amd64,arm64,armhf,i386

Popcon: 6 users (34 upd.)^*

License: DFSG free

Official Debian package

Git

Alakazam is part of the Immcantation analysis framework for Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) and provides a set of tools to investigate lymphocyte receptor clonal lineages, diversity, gene usage, and other repertoire level properties, with a focus on high-throughput immunoglobulin (Ig) sequencing.

Alakazam serves five main purposes:

Providing core functionality for other R packages in the Immcantation framework. This includes common tasks such as file I/O, basic DNA sequence manipulation, and interacting with V(D)J segment and gene annotations.
Providing an R interface for interacting with the output of the pRESTO and Change-O tool suites.
Performing lineage reconstruction on clonal populations of Ig sequences and analyzing the topology of the resultant lineage trees.
Performing clonal abundance and diversity analysis on lymphocyte repertoires.
Performing physicochemical property analyses of lymphocyte receptor sequences.
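
As a rough illustration of what a clonal diversity analysis computes, the sketch below evaluates Hill numbers, a widely used family of diversity indices, from a list of clone sizes. It is plain Python and does not use Alakazam; the description above does not state which index the package uses, so the choice of Hill numbers, the `hill_diversity` name and the example clone sizes are all assumptions made for illustration.

```python
import math


def hill_diversity(clone_sizes, q):
    """Hill number of order q for a list of clone sizes (counts).

    q = 0 counts clones, q = 1 is the exponential of Shannon entropy,
    and q = 2 is the inverse Simpson index.
    """
    total = sum(clone_sizes)
    props = [c / total for c in clone_sizes if c > 0]
    if math.isclose(q, 1.0):
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


even_repertoire = [10, 10, 10, 10]    # four equally sized clones
skewed_repertoire = [97, 1, 1, 1]     # one dominant clone

for q in (0, 1, 2):
    print(q, hill_diversity(even_repertoire, q), hill_diversity(skewed_repertoire, q))
```

Higher orders of q give more weight to abundant clones, so a repertoire dominated by one expanded clone scores close to 1 at q = 2 even though it contains several clones.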

Please cite: Namita T. Gupta, Jason A. Vander Heiden, Mohamed Uduman, Daniel Gadala-Maria, Gur Yaari and Steven H. Kleinstein: Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. (eprint) Bioinformatics 31(20):3356–3358 (2015)

r-cran-shazam

Immunoglobulin Somatic Hypermutation Analysis

https://cran.r-project.org/package=shazam

Maintainer: Debian R Packages Maintainers (Charles Plessy)

Popcon: 5 users (33 upd.)^*

License: DFSG free

Official Debian package

Git

Provides a computational framework for Bayesian estimation of antigen-driven selection in immunoglobulin (Ig) sequences, providing an intuitive means of analyzing selection by quantifying the degree of selective pressure. Also provides tools to profile mutations in Ig sequences, build models of somatic hypermutation (SHM) in Ig sequences, and make model-dependent distance comparisons of Ig repertoires.

SHazaM is part of the Immcantation analysis framework for Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) and provides tools for advanced analysis of somatic hypermutation (SHM) in immunoglobulin (Ig) sequences. SHazaM focuses on the following analysis topics:

Quantification of mutational load: SHazaM includes methods for determining the rate of observed and expected mutations under various criteria. Mutational profiling criteria include rates under SHM targeting models, mutations specific to CDR and FWR regions, and physicochemical property dependent substitution rates.
Statistical models of SHM targeting patterns: Models of SHM may be divided into two independent components: 1) a mutability model that defines where mutations occur and 2) a nucleotide substitution model that defines the resulting mutation. Collectively these two components define an SHM targeting model. SHazaM provides empirically derived SHM 5-mer context mutation models for both humans and mice, as well as tools to build SHM targeting models from data.
Analysis of selection pressure using BASELINe: The Bayesian Estimation of Antigen-driven Selection in Ig Sequences (BASELINe) method is a novel method for quantifying antigen-driven selection in high-throughput Ig sequence data. BASELINe uses SHM targeting models to estimate the null distribution of expected mutation frequencies and provides measures of selection pressure informed by known AID targeting biases.
Model-dependent distance calculations: SHazaM provides methods to compute evolutionary distances between sequences or sets of sequences based on SHM targeting models. This information is particularly useful in understanding and defining clonal relationships.

Registry entries: Bioconda

r-cran-tcr

Advanced data analysis of T cell receptor repertoires

https://cran.r-project.org/package=tcR

Maintainer: Debian R Packages Maintainers (Steffen Moeller)

Versions of package r-cran-tcr
Release	Version	Architectures
trixie	2.3.2+ds-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	2.3.2+ds-1	amd64,arm64,armhf,i386
bookworm	2.3.2+ds-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

Please cite: Vadim I. Nazarov, Mikhail V. Pogorelyy, Ekaterina A. Komech, Ivan V. Zvyagin, Dmitry A. Bolotin, Mikhail Shugay, Dmitry M. Chudakov, Yury B. Lebedev and Ilgar Z. Mamedov: tcR: an R package for T cell receptor repertoire advanced data analysis. (eprint) BMC Bioinformatics 16:175 (2015)

Registry entries: Bio.tools Bioconda

r-cran-tigger

Infers new Immunoglobulin alleles from Rep-Seq Data

https://cran.r-project.org/package=tigger

Maintainer: Debian R Packages Maintainers (Charles Plessy)

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

Summary: Infers the V genotype of an individual from immunoglobulin (Ig) repertoire-sequencing (Rep-Seq) data, including detection of any novel alleles. This information is then used to correct existing V allele calls from among the sample sequences.

High-throughput sequencing of B cell immunoglobulin receptors is providing unprecedented insight into adaptive immunity. A key step in analyzing these data involves assignment of the germline V, D and J gene segment alleles that comprise each immunoglobulin sequence by matching them against a database of known V(D)J alleles. However, this process will fail for sequences that utilize previously undetected alleles, whose frequency in the population is unclear.

TIgGER is a computational method that significantly improves V(D)J allele assignments by first determining the complete set of gene segments carried by an individual (including novel alleles) from V(D)J-rearranged sequences. TIgGER can then infer a subject’s genotype from these sequences, and use this genotype to correct the initial V(D)J allele assignments.

The application of TIgGER continues to identify a surprisingly high frequency of novel alleles in humans, highlighting the critical need for this approach. TIgGER, however, can and has been used with data from other species.

Core Abilities:

Detecting novel alleles
Inferring a subject’s genotype
Correcting preliminary allele calls

Required Input

A table of sequences from a single individual, with columns containing the following:
V(D)J-rearranged nucleotide sequence (in IMGT-gapped format)
Preliminary V allele calls
Preliminary J allele calls
Length of the junction region
Germline Ig sequences in IMGT-gapped fasta format (e.g., as those downloaded from IMGT/GENE-DB)

The former can be created through the use of IMGT/HighV-QUEST and Change-O.

Registry entries: Bioconda

rna-star

ultrafast universal RNA-seq aligner

https://github.com/alexdobin/STAR/

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package rna-star
Release	Version	Architectures
bullseye	2.7.8a+dfsg-2	amd64,arm64
bookworm	2.7.10b+dfsg-2	amd64,arm64,mips64el,ppc64el
trixie	2.7.11b+dfsg-2	amd64,arm64,ppc64el,riscv64
forky	2.7.11b+dfsg-2	amd64,arm64,ppc64el,riscv64
sid	2.7.11b+dfsg-2	amd64,arm64,loong64,ppc64el,riscv64

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

Spliced Transcripts Alignment to a Reference (STAR) is software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays, followed by a seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, the authors experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy.
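
The seed search step can be illustrated with a toy example: find the longest prefix of a read that occurs in the genome using a suffix array, then repeat the search on the unmapped remainder of the read. The sketch below is plain Python on a made-up 12-base "genome"; it is not STAR's implementation, all function names are hypothetical, it only handles exact matches on one strand, and it omits the clustering and stitching steps entirely.

```python
from bisect import bisect_left


def build_index(genome):
    """Return (suffix_array, suffixes) for a small genome string.

    The suffixes are materialised only to keep the sketch short; this takes
    quadratic memory and is not how a production aligner stores its index.
    """
    suffix_array = sorted(range(len(genome)), key=lambda i: genome[i:])
    return suffix_array, [genome[i:] for i in suffix_array]


def common_prefix_len(a, b):
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def maximal_mappable_prefix(read, index):
    """Longest prefix of read found in the genome, as (length, position)."""
    suffix_array, suffixes = index
    k = bisect_left(suffixes, read)
    best = (0, -1)
    # The longest match is always at one of the two lexicographic neighbours.
    for j in (k - 1, k):
        if 0 <= j < len(suffixes):
            length = common_prefix_len(read, suffixes[j])
            if length > best[0]:
                best = (length, suffix_array[j])
    return best


def seed_search(read, genome):
    """Split a read into consecutive seeds (read_offset, length, genome_pos)."""
    index = build_index(genome)
    seeds, offset = [], 0
    while offset < len(read):
        length, pos = maximal_mappable_prefix(read[offset:], index)
        if length == 0:  # base does not occur in the genome; skip it
            offset += 1
            continue
        seeds.append((offset, length, pos))
        offset += length
    return seeds


genome = "ACGTACGGTACC"
seeds = seed_search("ACGGTTACC", genome)
print(seeds)  # [(0, 5, 4), (5, 4, 8)]
```

In this toy read the two seeds map to different places in the genome, which is the situation a spliced aligner has to resolve when a read spans a junction.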

The package is enhanced by the following packages: multiqc

Please cite: Alexander Dobin, Carrie A. Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha, Philippe Batut, Mark Chaisson and Thomas R. Gingeras: STAR: ultrafast universal RNA-seq aligner. (PubMed,eprint) Bioinformatics 29(1):15-21 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Sequence analysis

rtax

classification of sequence reads of the 16S ribosomal RNA gene

https://github.com/davidsoergel/rtax/

Maintainer: Debian Med Packaging Team (xiao sheng wen)

Popcon: 5 users (33 upd.)^*

License: DFSG free

Official Debian package

Git

Short-read technologies for microbial community profiling are increasingly popular, yet previous techniques for assigning taxonomy to paired-end reads perform poorly. RTAX provides rapid taxonomic assignments of paired-end reads using a consensus algorithm.

Please cite: David A. W. Soergel, Neelendu Dey, Rob Knight and Steven E. Brenner: Selection of primers for optimal taxonomic classification of environmental 16S rRNA gene sequences. (PubMed,eprint) The ISME Journal 6:1440–1444 (2012)

salmon

wicked-fast transcript quantification from RNA-seq data

https://github.com/COMBINE-lab/salmon

Maintainer: Debian Med Packaging Team (Yavor Doganov)

Versions of package salmon
Release	Version	Architectures
sid	1.10.3+ds1-1	amd64,arm64
bullseye	1.4.0+ds1-1	amd64,arm64
bookworm	1.10.1+ds1-1	amd64,arm64
trixie	1.10.2+ds1-1	amd64,arm64
forky	1.10.3+ds1-1	amd64,arm64

Popcon: 6 users (33 upd.)^*

License: DFSG free

Official Debian package

Git

Salmon is a wicked-fast program to produce highly accurate, transcript-level quantification estimates from RNA-seq data. Salmon achieves its accuracy and speed via a number of different innovations, including the use of lightweight alignments (accurate but fast-to-compute proxies for traditional read alignments) and massively-parallel stochastic collapsed variational inference. The result is a versatile tool that fits nicely into many different pipelines. For example, you can choose to make use of the lightweight alignments by providing Salmon with raw sequencing reads, or, if it is more convenient, you can provide Salmon with regular alignments (e.g. computed with your favorite aligner), and it will use the same wicked-fast, state-of-the-art inference algorithm to estimate transcript-level abundances for your experiment.

The package is enhanced by the following packages: multiqc

Please cite: Rob Patro, Geet Duggal, Michael I Love, Rafael A Irizarry and Carl Kingsford: Salmon provides fast and bias-aware quantification of transcript expression. (eprint) Nature Methods 14(4):417-419 (2017)

Registry entries: Bio.tools SciCrunch Bioconda

sambamba

tools for working with SAM/BAM data

https://github.com/lomereiter/sambamba

Maintainer: Debian Med Packaging Team (Santiago Vila)

Versions of package sambamba
Release	Version	Architectures
trixie	1.0.1+dfsg-2	amd64,arm64,riscv64
bullseye	0.8.0-1	amd64,arm64
sid	1.0.1+dfsg-3	amd64,arm64,riscv64
forky	1.0.1+dfsg-3	amd64,arm64,riscv64
bookworm	1.0+dfsg-1	amd64,arm64

Popcon: 10 users (36 upd.)^*

License: DFSG free

Official Debian package

Git

Sambamba positions itself as a performant alternative to samtools and provides tools for:

Powerful filtering with sambamba view --filter
Picard-like SAM header merging in the merge tool
Optional for operations on whole BAMs
Fast copying of a region to a new file with the slice tool
Duplicate marking/removal, using the Picard criteria

Please cite: Artem Tarasov, Albert J. Vilella, Edwin Cuppen, Isaac J. Nijman and Pjotr Prins: Sambamba: fast processing of NGS alignment formats. (PubMed,eprint) Bioinformatics 31(12):2032-2034 (2015)

Registry entries: Bio.tools Bioconda

samblaster

marks duplicates, extracts discordant/split reads

https://github.com/GregoryFaust/samblaster

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package samblaster
Release	Version	Architectures
sid	0.1.26-4	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	0.1.26-1	amd64,arm64,armhf,i386
bookworm	0.1.26-4	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	0.1.26-4	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	0.1.26-4	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

Current "next-generation" sequencing technologies cannot tell what exact sequence they will be reading. They take what is available. And if some sequences are read very often, then this needs some extra biomedical thinking. The genome could for instance be duplicated.

samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. When marking duplicates, samblaster will require approximately 20MB of memory per 1M read pairs.
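
The memory figure quoted above lends itself to a quick back-of-the-envelope estimate. The helper below simply scales the stated 20MB per 1M read pairs linearly; `estimated_memory_mb` is a hypothetical name, the 150 million read pairs are a made-up example, and actual usage will depend on the data.

```python
def estimated_memory_mb(read_pairs, mb_per_million_pairs=20.0):
    """Rough memory estimate for duplicate marking, scaled linearly."""
    return read_pairs / 1_000_000 * mb_per_million_pairs


# Made-up example: a run with 150 million read pairs.
needed = estimated_memory_mb(150_000_000)
print(f"about {needed:.0f} MB")
```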

The package is enhanced by the following packages: multiqc

Please cite: Gregory G. Faust and Ira M. Hall: SAMBLASTER: fast duplicate marking and structural variant read extraction. (PubMed,eprint) Bioinformatics 30(17):2503-2505 (2014)

Registry entries: Bio.tools SciCrunch Bioconda

samtools

processing sequence alignments in SAM, BAM and CRAM formats

https://www.htslib.org/

Maintainer: Debian Med Packaging Team (Steffen Moeller)

Versions of package samtools
Release	Version	Architectures
trixie	1.21-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
sid	1.22.1-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bookworm	1.16.1-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
forky	1.22.1-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bullseye	1.11-1	amd64,arm64,armhf,i386
upstream	1.23

Debtags of package samtools:
field	biology
interface	commandline
network	client
role	program
scope	utility
uitoolkit	ncurses
use	analysing, calculating, filtering
works-with	biological-sequence

Popcon: 68 users (52 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Samtools is a set of utilities that manipulate nucleotide sequence alignments in the binary BAM format. It imports from and exports to the ASCII SAM (Sequence Alignment/Map) format and the CRAM format, does sorting, merging and indexing, and allows one to retrieve reads in any region swiftly. It is designed to work on a stream, and is able to open a BAM or CRAM (not SAM) file on a remote FTP or HTTP server.
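
For readers unfamiliar with the text-based SAM format, the following sketch parses the eleven mandatory tab-separated fields of an alignment line and filters records by FLAG bits. It is plain Python and does not call samtools or any binding to it; `SamRecord`, `parse_sam_line` and `mapped_records` are hypothetical names, optional tags are ignored, and the three example reads are made up.

```python
from typing import NamedTuple

# FLAG bits as defined by the SAM format.
FLAG_UNMAPPED = 0x4
FLAG_REVERSE = 0x10
FLAG_DUPLICATE = 0x400


class SamRecord(NamedTuple):
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str


def parse_sam_line(line):
    """Parse the mandatory fields of one SAM alignment line."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]),
                     f[5], f[6], int(f[7]), int(f[8]), f[9], f[10])


def mapped_records(lines):
    """Yield records for mapped reads, skipping '@' header lines."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        record = parse_sam_line(line)
        if not record.flag & FLAG_UNMAPPED:
            yield record


sam_text = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII\n"
    "read3\t1040\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
)
mapped = list(mapped_records(sam_text.splitlines()))
duplicates = [r.qname for r in mapped if r.flag & FLAG_DUPLICATE]
print([r.qname for r in mapped], duplicates)
```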

The package is enhanced by the following packages: libbio-samtools-perl multiqc

Please cite: Heng Li, Bob Handsaker, Alec Wysoker, Tim Fennell, Jue Ruan, Nils Homer, Gabor Marth, Goncalo Abecasis, Richard Durbin and 1000 Genome Project Data Processing Subgroup: The Sequence Alignment/Map (SAM) Format and SAMtools. (PubMed,eprint) Bioinformatics 25(16):2078-2079 (2009)

Registry entries: Bio.tools SciCrunch Bioconda

scoary

pangenome-wide association studies

https://github.com/AdmiralenOla/Scoary

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package scoary
Release	Version	Architectures
forky	1.6.16-10	all
bookworm	1.6.16-5	all
bullseye	1.6.16-2	all
sid	1.6.16-10	all
trixie	1.6.16-10	all

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

Scoary is designed to take the gene_presence_absence.csv file from Roary as well as a traits file created by the user and calculate the associations between all genes in the accessory genome and the traits. It reports a list of genes sorted by strength of association per trait.
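The kind of gene-versus-trait scoring described above can be illustrated with a small sketch. It assumes that each gene is summarised as a 2x2 presence/trait table and scored with a two-sided Fisher's exact test; the description does not say which statistic Scoary applies, and the names `fisher_exact_two_sided` and `rank_genes` are hypothetical, not part of Scoary.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every possible table that is at most as likely as the observed one.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


def rank_genes(presence, trait):
    """Rank genes by association with a binary trait (smallest p-value first).

    presence: gene -> {isolate: 0 or 1}; trait: isolate -> 0 or 1.
    This is an illustrative data layout, not Scoary's file format.
    """
    ranked = []
    for gene, calls in presence.items():
        table = [[0, 0], [0, 0]]
        for isolate, has_trait in trait.items():
            row = 0 if calls[isolate] else 1
            col = 0 if has_trait else 1
            table[row][col] += 1
        (a, b), (c, d) = table
        ranked.append((gene, fisher_exact_two_sided(a, b, c, d)))
    return sorted(ranked, key=lambda item: item[1])
```

A gene that is present in exactly the isolates carrying the trait ends up at the top of the list, which mirrors the "sorted by strength of association" output described above.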

Please cite: Ola Brynildsrud, Jon Bohlin, Lonneke Scheffer and Vegard Eldholm: Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary. (PubMed,eprint) Genome Biology 17(238) (2016)

Registry entries: Bio.tools Bioconda

scythe

Bayesian adaptor trimmer for sequencing reads

https://github.com/vsbuffalo/scythe

Maintainer: Debian Med Packaging Team (Charles Plessy)

Versions of package scythe
Release	Version	Architectures
bookworm	0.994+git20141017.20d3cff-3	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	0.994+git20141017.20d3cff-5	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	0.994+git20141017.20d3cff-5	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	0.994+git20141017.20d3cff-5	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	0.994+git20141017.20d3cff-3	amd64,arm64,armhf,i386

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

Scythe uses a Naive Bayesian approach to classify contaminant substrings in sequence reads. It considers quality information, which can make it robust in picking out 3'-end adapters, which often include poor quality bases.

Registry entries: SciCrunch

seqprep

stripping adaptors and/or merging paired reads of DNA sequences with overlap

http://seqanswers.com/wiki/SeqPrep

Maintainer: Debian Med Packaging Team (Nilesh Patra)

Versions of package seqprep
Release	Version	Architectures
trixie	1.3.2-9	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
sid	1.3.2-10	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	1.3.2-5	amd64,arm64,armhf,i386
bookworm	1.3.2-8	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
forky	1.3.2-10	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

SeqPrep is a program to merge overlapping paired-end Illumina reads into a single longer read. It may also be used just for its adapter trimming feature without doing any paired-end overlap. When an adapter sequence is present, the two reads must overlap (in most cases), so they are forcefully merged. When reads do not have adapter sequence, they must be treated with care when merging, so a much more specific approach is taken. The default parameters were chosen with specificity in mind, so that they could be run on libraries where very few reads are expected to overlap. It is always safest, though, to reserve the overlapping procedure for libraries where you have some prior knowledge that a significant portion of the reads will overlap.
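The idea of merging an overlapping read pair can be sketched as follows. This is a simplified illustration under stated assumptions, not SeqPrep's actual scoring: it ignores base qualities and adapters, and the function names and the `min_overlap` and `max_mismatch_frac` parameters are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=5, max_mismatch_frac=0.1):
    """Merge a read pair into one sequence, or return None if no overlap is found.

    read2 is given as sequenced (opposite strand), so it is reverse
    complemented first. The longest suffix of read1 that matches a prefix
    of the reverse-complemented read2 within the mismatch budget wins.
    """
    mate = revcomp(read2)
    for overlap in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        tail, head = read1[-overlap:], mate[:overlap]
        mismatches = sum(x != y for x, y in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            return read1 + mate[overlap:]
    return None
```

Raising `min_overlap` or lowering `max_mismatch_frac` makes the merge more specific, which is the trade-off the description refers to for libraries where few reads are expected to overlap.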

Registry entries: Bio.tools SciCrunch Bioconda

seqtk

Fast and lightweight tool for processing sequences in the FASTA or FASTQ format

https://github.com/lh3/seqtk

Maintainer: Debian Med Packaging Team (Lance Lin)

Versions of package seqtk
Release	Version	Architectures
bullseye	1.3-2	amd64,arm64,armhf,i386
bookworm	1.3-4	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
forky	1.4-2	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	1.4-2	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
trixie	1.4-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
upstream	1.5

Popcon: 12 users (34 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Currently, seqtk supports quality based trimming with the phred algorithm, converting fastq to fasta, reverse complementing sequences, extracting or masking subsequences in regions given in a BED/name list file, and more. It contains a subsampling module to sample exactly n sequences or a fraction of sequences.

Seqtk supports both fasta and fastq input files, which can be optionally gzip compressed.
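Three of the operations listed above are simple enough to sketch in a few lines. The code below is an illustration only and does not reproduce seqtk's implementation or command-line interface; in particular, reservoir sampling is one standard way to draw exactly n records in a single pass, and it is an assumption here that this matches what the subsampling module does. All function names are hypothetical.

```python
import random

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse complement a DNA string, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def fastq_to_fasta(lines):
    """Yield FASTA lines from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        seq = next(it)
        next(it)  # '+' separator line
        next(it)  # quality line, dropped in FASTA
        yield ">" + header.rstrip("\n")[1:]
        yield seq.rstrip("\n")


def sample_exactly(records, n, seed=11):
    """Reservoir sampling: keep exactly n records (or all, if fewer exist)."""
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(records):
        if i < n:
            reservoir.append(record)
        else:
            # Replace a random slot with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = record
    return reservoir
```

Using a fixed seed makes the subsample reproducible, which matters when the same records must be drawn from both files of a paired-end run.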

Registry entries: Bio.tools Bioconda

sga

de novo genome assembler that uses string graphs

https://github.com/jts/sga

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package sga
Release	Version	Architectures
trixie	0.10.15-7	amd64,arm64,armel,armhf,i386,ppc64el,riscv64
sid	0.10.15-7	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64
bullseye	0.10.15-5	amd64,arm64,armhf,i386
bookworm	0.10.15-7	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el

Popcon: 7 users (30 upd.)^*

License: DFSG free

Official Debian package

The major goal of SGA is to be very memory efficient, which is achieved by using a compressed representation of DNA sequence reads.

SGA is a de novo assembler for DNA sequence reads. It is based on Gene Myers' string graph formulation of assembly and uses the FM-index/Burrows-Wheeler transform to efficiently find overlaps between sequence reads.
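As a rough illustration of the Burrows-Wheeler transform and the FM-index style backward search mentioned above, the sketch below builds the transform naively from sorted rotations and counts pattern occurrences. It is a teaching example under simplifying assumptions (quadratic construction, no compression, no sampled occurrence tables) and is not how SGA is implemented; the function names are hypothetical.

```python
def bwt(text):
    """Burrows-Wheeler transform of text, using '$' as the end marker."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_str, pattern):
    """Count occurrences of pattern in the original text via backward search."""
    # first[ch] = number of characters in the text that sort before ch.
    first, total = {}, 0
    for ch in sorted(set(bwt_str)):
        first[ch] = total
        total += bwt_str.count(ch)

    lo, hi = 0, len(bwt_str)  # current range of matching sorted rotations
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        # Naive rank queries; a real FM-index stores these in tables.
        lo = first[ch] + bwt_str[:lo].count(ch)
        hi = first[ch] + bwt_str[:hi].count(ch)
        if lo >= hi:
            return 0
    return hi - lo
```

Overlap detection between reads builds on the same backward search: the suffix of one read is extended character by character and the surviving range tells which other reads start with that string.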

Please cite: Jared T. Simpson and Richard Durbin: Efficient de novo assembly of large genomes using compressed data structures. (PubMed,eprint) Genome Res 22(3):549-555 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

sickle

windowed adaptive trimming tool for FASTQ files using quality

https://github.com/najoshi/sickle

Maintainer: Debian Med Packaging Team (Nilesh Patra)

Versions of package sickle
Release	Version	Architectures
forky	1.33+git20150314.f3d6ae3-2	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	1.33+git20150314.f3d6ae3-2	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
trixie	1.33+git20150314.f3d6ae3-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	1.33+git20150314.f3d6ae3-2	amd64,arm64,armhf,i386
bookworm	1.33+git20150314.f3d6ae3-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Most modern sequencing technologies produce reads whose quality deteriorates towards the 3'-end. Incorrectly called bases there negatively impact assemblies, mapping, and downstream bioinformatics analyses.

Sickle is a tool that uses sliding windows along with quality and length thresholds to determine when quality is sufficiently low to trim the 3'-end of reads. It will also discard reads based upon the length threshold. It takes the quality values and slides a window across them whose length is 0.1 times the length of the read. If this length is less than 1, then the window is set to be equal to the length of the read. Otherwise, the window slides along the quality values until the average quality in the window drops below the threshold. At that point the algorithm determines where in the window the drop occurs and cuts both the read and quality strings there. However, if the cut point is less than the minimum length threshold, then the read is discarded entirely.
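The windowing rule described above can be written down as a short sketch. It follows the description step by step, but it is not Sickle's source code: the function name, the default thresholds, and the choice to cut at the first base in the window that falls below the threshold are assumptions made for illustration.

```python
def trim_3prime(seq, quals, qual_threshold=20, min_length=20):
    """Trim the 3'-end of a read with a sliding quality window.

    seq is the base string and quals the per-base quality scores.
    Returns (trimmed_seq, trimmed_quals), or None if the read is discarded.
    """
    n = len(seq)
    if n == 0:
        return None

    # Window length is 0.1 times the read length, or the whole read if that is < 1.
    window = int(0.1 * n)
    if window < 1:
        window = n

    cut = n  # default: no trimming
    for start in range(n - window + 1):
        in_window = quals[start:start + window]
        if sum(in_window) / window < qual_threshold:
            # Locate the drop inside the window: first base below the threshold.
            for offset, quality in enumerate(in_window):
                if quality < qual_threshold:
                    cut = start + offset
                    break
            break

    # Discard reads whose remaining length is under the minimum.
    if cut < min_length:
        return None
    return seq[:cut], quals[:cut]
```

Because the average is taken over a window rather than a single base, one isolated low-quality call in an otherwise good region does not trigger trimming.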

Sickle supports four types of quality values: Illumina, Solexa, Phred, and Sanger. Note that the Solexa quality setting is an approximation (the actual conversion is a non-linear transformation). The end approximation is close.

Sickle also supports gzipped file inputs.

The package is enhanced by the following packages: multiqc

Registry entries: Bio.tools SciCrunch Bioconda

sideretro

https://github.com/galantelab/sideRETRO

Maintainer: Debian Med Packaging Team (Daniela Moreira Mombach)

Versions of package sideretro
Release	Version	Architectures
forky	1.1.6-2	amd64,arm64,armhf,i386,ppc64el,riscv64
sid	1.1.6-2	amd64,arm64,armhf,i386,ppc64el,riscv64
trixie	1.1.6-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64

License: DFSG free

Official Debian package

smalt

Sequence Mapping and Alignment Tool

https://www.sanger.ac.uk/science/tools/smalt-0

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package smalt
Release	Version	Architectures
trixie	0.7.6-13	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
sid	0.7.6-13	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	0.7.6-13	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bookworm	0.7.6-12	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye	0.7.6-9	amd64,arm64,armhf,i386

Popcon: 7 users (31 upd.)^*

License: DFSG free

Official Debian package

SMALT efficiently aligns DNA sequencing reads with a reference genome. Reads from a wide range of sequencing platforms, for example Illumina, Roche-454, Ion Torrent, PacBio or ABI-Sanger, can be processed including paired reads.

The software employs a perfect hash index of short words (< 20 nucleotides long), sampled at equidistant steps along the genomic reference sequences.

For each read, potentially matching segments in the reference are identified from seed matches in the index and subsequently aligned with the read using a banded Smith-Waterman algorithm.
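The indexing and seeding steps described above can be sketched as follows. This is a toy model under stated assumptions: a Python dictionary stands in for the perfect hash index, the word length and step are tiny illustrative values, and the banded Smith-Waterman stage that would follow is omitted. The function names are hypothetical and are not part of SMALT.

```python
from collections import defaultdict


def build_index(reference, word_len=4, step=2):
    """Index words of word_len sampled every `step` bases along the reference."""
    index = defaultdict(list)
    for pos in range(0, len(reference) - word_len + 1, step):
        index[reference[pos:pos + word_len]].append(pos)
    return index


def candidate_offsets(read, index, word_len=4):
    """Return (reference_offset, seed_hits) pairs, best supported first.

    Every word of the read is looked up; each hit votes for the reference
    position at which the read would have to start for that hit to line up.
    """
    votes = defaultdict(int)
    for i in range(len(read) - word_len + 1):
        for ref_pos in index.get(read[i:i + word_len], ()):
            votes[ref_pos - i] += 1
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))
```

A larger step shrinks the index and speeds up lookups but leaves fewer seeds per read, which is the sensitivity versus speed trade-off that the word length and spacing control.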

The best gapped alignment of each read is reported, including a score for the reliability of the best mapping. The user can adjust the trade-off between sensitivity and speed by tuning the length and spacing of the hashed words.

A mode for the detection of split (chimeric) reads is provided. Multi-threaded program execution is supported.

Registry entries: Bio.tools SciCrunch Bioconda

Remark of Debian Med team: This can be regarded as the successor of ssaha2

This program is from the same author as ssaha2 and, according to its author, is faster and more precise than ssaha2 (except for sequences > 2000bp).

smrtanalysis

software suite for single molecule, real-time sequencing

https://www.pacb.com/products-and-services/analytical-software/smrt-analysis/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Popcon: 0 users (0 upd.)^*

License: DFSG free

Official Debian package

SMRT® Analysis is a powerful, open-source bioinformatics software suite available for analysis of DNA sequencing data from Pacific Biosciences’ SMRT technology. Users can choose from a variety of analysis protocols that utilize PacBio® and third-party tools. Analysis protocols include de novo genome assembly, cDNA mapping, DNA base-modification detection, and long-amplicon analysis to determine phased consensus sequences.

This is a metapackage that depends on the components of SMRT Analysis.

Registry entries: Bio.tools SciCrunch

snap-aligner

Scalable Nucleotide Alignment Program

https://snap.cs.berkeley.edu/

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package snap-aligner
Release	Version	Architectures
trixie	2.0.3+dfsg-2	amd64,arm64,ppc64el,riscv64
forky	2.0.5+dfsg-1	amd64,arm64,ppc64el,riscv64
bullseye	1.0.0+dfsg-2	amd64,arm64
sid	2.0.5+dfsg-1	amd64,arm64,loong64,ppc64el,riscv64
bookworm	2.0.2+dfsg-1	amd64,arm64,mips64el,ppc64el

Popcon: 6 users (34 upd.)^*

License: DFSG free

Official Debian package

SNAP is a new sequence aligner that is 3-20x faster and just as accurate as existing tools like BWA-mem, Bowtie2 and Novoalign. It runs on commodity x86 processors, and supports a rich error model that lets it cheaply match reads with more differences from the reference than other tools. This gives SNAP up to 2x lower error rates than existing tools (in some cases) and lets it match larger mutations that they may miss. SNAP also natively reads BAM, FASTQ, or gzipped FASTQ, and natively writes SAM or BAM, with built-in sorting, duplicate marking, and BAM indexing.

Please cite: Matei Zaharia, William J. Bolosky, Kristal Curtis, Armando Fox, David Patterson, Scott Shenker, Ion Stoica, Richard M. Karp and Taylor Sittler: Faster and More Accurate Sequence Alignment with SNAP. (eprint) arXiv preprint arXiv:1111.5572 (2011)

Registry entries: SciCrunch

sniffles

structural variation caller using third-generation sequencing

https://github.com/fritzsedlazeck/Sniffles

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package sniffles
Release	Version	Architectures
trixie	2.6.0-1	all
bookworm	2.0.7-1	all
forky	2.6.0-1	all
sid	2.6.0-1	all
bullseye	1.0.12b+ds-1	amd64,arm64,armhf,i386
upstream	2.7.2

Popcon: 5 users (32 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Sniffles is a structural variation (SV) caller using third-generation sequencing data such as those from Pacific Biosciences or Oxford Nanopore platforms. It detects all types of SVs using evidence from split-read alignments, high-mismatch regions, and coverage analysis.

Please cite: Fritz J. Sedlazeck, Philipp Rescheneder, Moritz Smolka, Han Fang, Maria Nattestad, Arndt von Haeseler and Michael Schatz: Accurate detection of complex structural variations using single molecule sequencing. (eprint) bioRxiv (2017)

Registry entries: Bio.tools Bioconda

snp-sites

Binary code for the package snp-sites

https://github.com/sanger-pathogens/snp-sites

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package snp-sites
Release	Version	Architectures
bullseye	2.5.1-1	amd64,arm64,armhf,i386
sid	2.5.1-2	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bookworm	2.5.1-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	2.5.1-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	2.5.1-2	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

This program finds single nucleotide polymorphism (SNP) sites in multi-FASTA alignment input files (which may be compressed). Its output can be in various widely used formats (multi-FASTA alignment, VCF, PHYLIP).

The software has been developed at the Wellcome Trust Sanger Institute.

A single nucleotide polymorphism (SNP, pronounced snip; plural snips) is a DNA sequence variation occurring when a single nucleotide (A, T, C or G) in the genome (or other shared sequence) differs between members of a biological species or paired chromosomes. For example, two sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA, differ in a single nucleotide. In this case there are two alleles. Almost all common SNPs have only two alleles.
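Finding such sites in an alignment amounts to scanning it column by column and keeping the columns that contain more than one base. The sketch below shows this on the example fragments; it is a minimal illustration, not the snp-sites implementation, and the decision to ignore gaps and ambiguity codes as well as the function name `snp_sites` are assumptions.

```python
def snp_sites(alignment):
    """Return [(column, alleles)] for every variable column of an alignment.

    alignment maps sequence names to aligned sequences of equal length.
    Columns are 0-based; only A, C, G and T count as alleles, so gaps
    and ambiguity codes are ignored (an assumption of this sketch).
    """
    sequences = list(alignment.values())
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must all have the same length")

    sites = []
    for column in range(length):
        alleles = {seq[column].upper() for seq in sequences} & set("ACGT")
        if len(alleles) > 1:
            sites.append((column, sorted(alleles)))
    return sites
```

For the two fragments above, the only variable column is the fifth one, where the alleles are C and T.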

Please cite: Andrew J. Page, Ben Taylor, Aidan J. Delaney, Jorge Soares, Torsten Seemann, Jacqueline A. Keane and Simon R. Harris: SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments. (eprint) Microbial Genomics 2(4) (2016)

Topics: Genetic variation

snpomatic

fast, stringent short-read mapping software

https://github.com/magnusmanske/snpomatic

Maintainer: Debian Med Packaging Team (Sascha Steinbiss)

Versions of package snpomatic
Release	Version	Architectures
forky	1.0-7	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	1.0-7	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	1.0-5	amd64,arm64,armhf,i386
bookworm	1.0-6	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	1.0-7	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

High throughput sequencing technologies generate large amounts of short reads. Mapping these to a reference sequence consumes large amounts of processing time and memory, and read mapping errors can lead to noisy or incorrect alignments.

SNP-o-matic is a fast, stringent short-read mapping software. It supports a multitude of output types and formats, for uses in filtering reads, alignments, sequence-based genotyping calls, assisted reassembly of contigs etc.

Please cite: Heinrich Magnus Manske and Dominic P. Kwiatkowski: SNP-o-matic. (PubMed,eprint) Bioinformatics 25(18):2434-2435 (2009)

Registry entries: Bio.tools Bioconda

Topics: Genetic variation; Mapping

soapdenovo

short-read assembly method to build de novo draft assembly

http://soap.genomics.org.cn/soapdenovo.html

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package soapdenovo
Release	Version	Architectures
bookworm	1.05-6	amd64
bullseye	1.05-6	amd64
sid	1.05-7	amd64
forky	1.05-7	amd64
trixie	1.05-6	amd64

Popcon: 8 users (33 upd.)^*

License: DFSG free

Official Debian package

SOAPdenovo is a novel short-read assembly method that can build a de novo draft assembly for human-sized genomes. The program is specially designed to assemble Illumina GA short reads.

It creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.

This version is not maintained anymore, consider using soapdenovo2.

Please cite: Ruiqiang Li, Hongmei Zhu, Jue Ruan, Wubin Qian, Xiaodong Fang, Zhongbin Shi, Yingrui Li, Shengting Li, Gao Shan, Karsten Kristiansen, Songgang Li, Huanming Yang, Jian Wang and Jun Wang: De novo assembly of human genomes with massively parallel short read sequencing. (PubMed,eprint) Genome Research 20(2):265-72 (2009)

Registry entries: Bio.tools SciCrunch

soapdenovo2

short-read assembly method to build de novo draft assembly

http://soap.genomics.org.cn/soapdenovo.html

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package soapdenovo2
Release	Version	Architectures
sid	242+dfsg-5	amd64
bullseye	242+dfsg-1	amd64
bookworm	242+dfsg-3	amd64
trixie	242+dfsg-4	amd64
forky	242+dfsg-5	amd64

Popcon: 7 users (32 upd.)^*

License: DFSG free

Official Debian package

SOAPdenovo is a novel short-read assembly method that can build a de novo draft assembly for human-sized genomes. The program is specially designed to assemble Illumina GA short reads.

It creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.

Please cite: Ruibang Luo, Binghang Liu, Yinlong Xie, Zhenyu Li, Weihua Huang, Jianying Yuan, Guangzhu He, Yanxiang Chen, Qi Pan, Yunjie Liu, Jingbo Tang, Gengxiong Wu, Hao Zhang, Yujian Shi, Yong Liu, Chang Yu, Bo Wang, Yao Lu, Changlei Han, David W Cheung, Siu-Ming Yiu, Shaoliang Peng, Zhu Xiaoqian, Guangming Liu, Xiangke Liao, Yingrui Li, Huanming Yang, Jian Wang, Tak-Wah Lam and Jun Wang: SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Giga Science 1(1):18 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

sortmerna

tool for filtering, mapping and OTU-picking NGS reads

https://github.com/sortmerna/sortmerna

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package sortmerna
Release	Version	Architectures
bullseye	2.1-5	amd64,i386
bookworm	4.3.6-2	amd64,i386
trixie	4.3.7-2	amd64,i386
forky	4.3.7-3	amd64,i386
sid	4.3.7-3	amd64,i386

Popcon: 4 users (33 upd.)^*

License: DFSG free

Official Debian package

SortMeRNA is a biological sequence analysis tool for filtering, mapping and OTU-picking NGS reads. The core algorithm is based on approximate seeds and allows for fast and sensitive analyses of nucleotide sequences. The main application of SortMeRNA is filtering rRNA from metatranscriptomic data. Additional applications include OTU-picking and taxonomy assignation available through QIIME v1.9+ (http://qiime.org - v1.9.0-rc1). SortMeRNA takes as input a file of reads (fasta or fastq format) and one or multiple rRNA database file(s), and sorts apart rRNA and rejected reads into two files specified by the user. Optionally, it can provide high quality local alignments of rRNA reads against the rRNA database. SortMeRNA works with Illumina, 454, Ion Torrent and PacBio data, and can produce SAM and BLAST-like alignments.

The package is enhanced by the following packages: multiqc

Please cite: Evguenia Kopylova, Laurent Noé and Hélène Touzet: SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. (PubMed,eprint) Bioinformatics 28(24):3211-3217 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

spades

genome assembler for single-cell and isolates data sets

https://github.com/ablab/spades

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package spades
Release	Version	Architectures
bookworm	3.15.5+dfsg-2	amd64
sid	4.0.0+really3.15.5+dfsg-3	amd64
forky	4.0.0+really3.15.5+dfsg-3	amd64
trixie	4.0.0+really3.15.5+dfsg-1	amd64
bullseye	3.13.1+dfsg-2	amd64
upstream	4.2.0

Popcon: 5 users (35 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

The SPAdes – St. Petersburg genome assembler is intended for both standard isolates and single-cell MDA bacteria assemblies. It works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio and Sanger reads. You can also provide additional contigs that will be used as long reads.

This package provides the following additional pipelines:

metaSPAdes – a pipeline for metagenomic data sets
plasmidSPAdes – a pipeline for extracting and assembling plasmids from WGS data sets
metaplasmidSPAdes – a pipeline for extracting and assembling plasmids from metagenomic data sets
rnaSPAdes – a de novo transcriptome assembler from RNA-Seq data
truSPAdes – a module for TruSeq barcode assembly
biosyntheticSPAdes – a module for biosynthetic gene cluster assembly with paired-end reads

SPAdes provides several stand-alone binaries with a relatively simple command-line interface: k-mer counting (spades-kmercounter), assembly graph construction (spades-gbuilder) and a long-read-to-graph aligner (spades-gmapper).

Please cite: Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey A. Gurevich, Mikhail Dvorkin, Alexander S. Kulikov, Valery M. Lesin, Sergey I. Nikolenko, Son Pham, Andrey D. Prjibelski, Alexey V. Pyshkin, Alexander V. Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A. Alekseyev and Pavel A. Pevzner: SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. (PubMed,eprint) Journal of Computational Biology 19(5):455-477 (2012)

Registry entries: Bio.tools SciCrunch Bioconda

sprai

single-pass sequencing read accuracy improver

https://web.archive.org/web/20180316202959/http://zombie.cb.k.u-tokyo.ac.jp/sprai/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package sprai
Release	Version	Architectures
sid	0.9.9.23+dfsg1-3	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	0.9.9.23+dfsg1-2	amd64,arm64,armhf,i386
forky	0.9.9.23+dfsg1-3	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	0.9.9.23+dfsg1-3	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	0.9.9.23+dfsg1-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Sprai is a tool to correct sequencing errors in single-pass reads for de novo assembly. It was originally designed for correcting sequencing errors in single-molecule DNA sequencing reads, especially in Continuous Long Reads (CLRs) generated by PacBio RS sequencers. The goal of Sprai is not to maximize the accuracy of error-corrected reads. Instead, Sprai aims to maximize the continuity (i.e., N50 contig length) of assembled contigs after error correction.

sra-toolkit

utilities for the NCBI Sequence Read Archive

https://github.com/ncbi/sra-tools/

Maintainer: Debian Med Packaging Team (Aaron M. Ucko)

Versions of package sra-toolkit
Release	Version	Architectures
bullseye	2.10.9+dfsg-2	amd64
bookworm	3.0.3+dfsg-6~deb12u1	amd64,arm64
trixie	3.2.1+dfsg-4	amd64,arm64
sid	3.2.1+dfsg-5	amd64,arm64
forky	3.2.1+dfsg-5	amd64,arm64
upstream	3.3.0

Popcon: 15 users (33 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Tools for reading the SRA archive, generally by converting individual runs into some commonly used format such as fastq.

The textual dumpers "sra-dump" and "vdb-dump" are provided in this release as an aid in visual inspection. It is likely that their actual output formatting will be changed in the near future to a stricter, more formalized representation[s]. PLEASE DO NOT RELY UPON THE OUTPUT FORMAT SEEN IN THIS RELEASE.

Other tools distributed in this package are:

 abi-dump, abi-load
 align-info
 bam-load
 cache-mgr
 cg-load
 copycat
 fasterq-dump
 fastq-dump, fastq-load
 helicos-load
 illumina-dump, illumina-load
 kar
 kdbmeta
 latf-load
 pacbio-load
 prefetch
 rcexplain
 remote-fuser
 sff-dump, sff-load
 sra-pileup, sra-sort, sra-stat, srapath
 srf-load
 test-sra
 vdb-config, vdb-copy, vdb-decrypt, vdb-encrypt, vdb-get, vdb-lock,
 vdb-passwd, vdb-unlock, vdb-validate

The "help" information will be improved in near future releases, and the tool options will become standardized across the set. More documentation will also be provided on the NCBI web site.

Tool options may change in the next release. Version 1 tool options will remain supported wherever possible in order to preserve operation of any existing scripts.

Please cite: Rasko Leinonen, Ruth Akhtar, Ewan Birney, James Bonfield, Lawrence Bower, Matt Corbett, Ying Cheng, Fehmi Demiralp, Nadeem Faruque, Neil Goodgame, Richard Gibson, Gemma Hoad, Christopher Hunter, Mikyung Jang, Steven Leonard, Quan Lin, Rodrigo Lopez, Michael Maguire, Hamish McWilliam, Sheila Plaister, Rajesh Radhakrishnan, Siamak Sobhany, Guy Slater, Petra Ten Hoopen, Franck Valentin, Robert Vaughan, Vadim Zalunin, Daniel Zerbino and Guy Cochrane: Improvements to services at the European Nucleotide Archive. (PubMed,eprint) Nucleic Acids Research 38(Database issue):D39-45 (2010)

Registry entries: Bio.tools Bioconda

srst2

Short Read Sequence Typing for Bacterial Pathogens

https://katholt.github.io/srst2/

Maintainer: Debian Med Packaging Team (Alexandre Detiste)

Versions of package srst2
Release	Version	Architectures
bookworm	0.2.0-9	amd64,arm64,mips64el,ppc64el
trixie	0.2.0-13	amd64,arm64,ppc64el,riscv64
forky	0.2.0-14	amd64,arm64,ppc64el,riscv64
sid	0.2.0-14	amd64,arm64,ppc64el,riscv64
bullseye	0.2.0-8	amd64,arm64

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

This program is designed to take Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc.) and report the presence of STs and/or reference genes.

Please cite: Michael Inouye, Harriet Dashnow, Lesley-Ann Raven, Mark B Schultz, Bernard J Pope, Takehiro Tomita, Justin Zobel and Kathryn E Holt: SRST2: Rapid genomic surveillance for public health and hospital microbiology labs. (PubMed,eprint) Genome Medicine 6(11):90 (2014)

Registry entries: Bioconda

ssake

genomics application for assembling millions of very short DNA sequences

https://www.bcgsc.ca/platform/bioinfo/software/ssake

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

The Short Sequence Assembly by K-mer search and 3′ read Extension (SSAKE) is a genomics application for aggressively assembling millions of short nucleotide sequences by progressively searching for perfect 3′-most k-mers using a DNA prefix tree. SSAKE is designed to help leverage the information from short sequence reads by stringently clustering them into contigs that can be used to characterize novel sequencing targets.

Please cite: Rene L. Warren, Granger G. Sutton, Steven J. M. Jones and Robert A. Holt: Assembling millions of short DNA sequences using SSAKE. (PubMed,eprint) Bioinformatics 23(4):500-501 (2007)

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Sequence assembly

stacks

pipeline for building loci from short-read DNA sequences

https://creskolab.uoregon.edu/stacks/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package stacks
Release	Version	Architectures
bookworm	2.62+dfsg-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
forky	2.68+dfsg-2	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bullseye	2.55+dfsg-1	amd64,arm64,armhf,i386
sid	2.68+dfsg-2	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
trixie	2.68+dfsg-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.

Note that this package installs Stacks such that all commands must be run as: $ stacks

The package is enhanced by the following packages: multiqc

Please cite: Julian Catchen, Paul A. Hohenlohe, Susan Bassham, Angel Amores and William A. Cresko: Stacks: an analysis tool set for population genomics. (PubMed) Molecular Ecology 22(11):3124-40 (2013)

Registry entries: Bio.tools SciCrunch Bioconda

stringtie

assemble short RNAseq reads to transcripts

https://ccb.jhu.edu/software/stringtie/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package stringtie
Release	Version	Architectures
sid	3.0.3+ds-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
trixie	2.2.1+ds-3	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	2.1.4+ds-4	amd64,arm64,armhf,i386
bookworm	2.2.1+ds-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
forky	3.0.3+ds-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

The abundance of transcripts in a human tissue sample can be determined by RNA sequencing. The exact sequence sampled may be random, depending on the technology used, and it may be short, i.e. shorter than the transcript. At some point, many shorter reads need to be assembled to model the complete transcripts.

StringTie assembles RNA-Seq reads into potential transcripts without the need of a reference genome and also provides a quantification of the splice variants.

Please cite: Mihaela Pertea, Geo M. Pertea, Corina M. Antonescu, Tsung-Cheng Chang, Joshua T. Mendell and Steven L. Salzberg: StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nature Biotechnology 33:290–295 (2015)

Registry entries: Bio.tools SciCrunch Bioconda

subread

toolkit for processing next-gen sequencing data

http://sourceforge.net/projects/subread/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package subread
Release	Version	Architectures
forky	2.0.8+dfsg-1	amd64,arm64,armhf,i386,ppc64el,riscv64
sid	2.0.8+dfsg-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64
trixie	2.0.8+dfsg-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64
bullseye	2.0.1+dfsg-1	amd64,arm64,armhf,i386
bookworm	2.0.3+dfsg-1	amd64,arm64,armel,armhf,i386,ppc64el
upstream	2.1.1

Popcon: 11 users (30 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

The Subread aligner can be used to align both gDNA-seq and RNA-seq reads. The Subjunc aligner was specifically designed for the detection of exon-exon junctions. For the mapping of RNA-seq reads, Subread performs local alignments and Subjunc performs global alignments.

Please cite: Yang Liao, Gordon K. Smyth and Wei Shi: The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. (PubMed) Nucleic Acids Research 47(8):e47 (2019)

Registry entries: Bio.tools SciCrunch Bioconda

sumaclust

fast and exact clustering of genomic sequences

http://metabarcoding.org/sumaclust

Maintainer: Debian Med Packaging Team (Pierre Gruet)

Versions of package sumaclust
Release	Version	Architectures
trixie	1.0.36+ds-4	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	1.0.36+ds-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye	1.0.36+ds-1	amd64,arm64,armhf,i386
forky	1.0.36+ds-4	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	1.0.36+ds-4	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x

Popcon: 5 users (31 upd.)^*

License: DFSG free

Official Debian package

Git

With the development of next-generation sequencing, efficient tools are needed to handle millions of sequences in reasonable amounts of time. Sumaclust is a program developed by the LECA. Sumaclust aims to cluster sequences in a way that is fast and exact at the same time. This tool has been developed to be adapted to the type of data generated by DNA metabarcoding, i.e. entirely sequenced, short markers. Sumaclust clusters sequences using the same clustering algorithm as UCLUST and CD-HIT. This algorithm is mainly useful to detect the 'erroneous' sequences created during amplification and sequencing protocols, deriving from 'true' sequences.

Registry entries: Bioconda

sumatra

fast and exact comparison and clustering of sequences

http://metabarcoding.org/sumatra

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package sumatra
Release	Version	Architectures
bookworm	1.0.36+ds-2	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	1.0.36+ds-2	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	1.0.36+ds-1	amd64,arm64,armhf,i386
forky	1.0.36+ds-3	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
sid	1.0.36+ds-3	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x

Popcon: 8 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

With the development of next-generation sequencing, efficient tools are needed to handle millions of sequences in reasonable amounts of time. Sumatra is a program developed by the LECA. Sumatra aims to compare sequences in a way that is fast and exact at the same time. This tool has been developed to be adapted to the type of data generated by DNA metabarcoding, i.e. entirely sequenced, short markers. Sumatra computes the pairwise alignment scores from one dataset or between two datasets, with the possibility to specify a similarity threshold below which pairs of sequences are not reported. The output can then go through a classification process with programs such as MCL or MOTHUR.
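As a rough illustration of reporting only the pairs that reach a similarity threshold, the following Python sketch scores every pair of sequences within one dataset. It is not Sumatra's algorithm: the ratio from difflib's SequenceMatcher stands in for a real alignment score, and the function name similar_pairs and the sample sequences are made up for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(sequences, threshold):
    """Yield (id_a, id_b, score) for pairs whose similarity reaches the threshold."""
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(sequences.items()), 2):
        # SequenceMatcher.ratio() is only a stand-in for an alignment score.
        score = SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()
        if score >= threshold:
            yield id_a, id_b, score


# Hypothetical short markers keyed by sequence identifier.
sequences = {
    "s1": "ACGTACGT",
    "s2": "ACGTACGA",
    "s3": "TTTTTTTT",
}
for id_a, id_b, score in similar_pairs(sequences, threshold=0.8):
    print(id_a, id_b, round(score, 3))
```

Pairs below the threshold are never emitted, which keeps the output small when most sequences are unrelated.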

Registry entries: SciCrunch

tabix

generic indexer for TAB-delimited genome position files

https://github.com/samtools/htslib

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package tabix
Release	Version	Architectures
bookworm	1.16+ds-3	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	1.22.1+ds2-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	1.11-4	amd64,arm64,armhf,i386
forky	1.22.1+ds2-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	1.21+ds-1	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
upstream	1.23

Popcon: 22 users (49 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Tabix indexes files in which some columns indicate sequence coordinates: name (usually a chromosome), start and end. The input data must be sorted by position and compressed with bgzip (provided in this package), which has a gzip-like interface. After indexing, tabix can quickly retrieve data lines by chromosomal coordinates. Fast data retrieval also works over the network if a URI is given as the file name.

This package has been built from the HTSlib source and provides the bgzip, htsfile and tabix tools.
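To show why position-sorted input makes coordinate lookups cheap, here is a minimal in-memory Python sketch. It does not reproduce the tabix index format or bgzip compression. The function names build_index and query and the sample records are assumptions made for this illustration only.

```python
import bisect
from collections import defaultdict


def build_index(records):
    """Group (chrom, start, end, payload) records by chromosome, sorted by start."""
    index = defaultdict(list)
    for chrom, start, end, payload in records:
        index[chrom].append((start, end, payload))
    for rows in index.values():
        rows.sort()
    return index


def query(index, chrom, q_start, q_end):
    """Return payloads of records overlapping the closed interval [q_start, q_end]."""
    rows = index.get(chrom, [])
    starts = [row[0] for row in rows]
    # Rows are sorted by start, so everything from `stop` on begins after q_end.
    stop = bisect.bisect_right(starts, q_end)
    return [payload for start, end, payload in rows[:stop] if end >= q_start]


# Hypothetical records: (chromosome, start, end, payload).
records = [
    ("chr1", 100, 200, "feature_a"),
    ("chr1", 400, 500, "feature_c"),
    ("chr1", 150, 300, "feature_b"),
    ("chr2", 50, 60, "feature_d"),
]
index = build_index(records)
print(query(index, "chr1", 180, 250))
```

The sketch still scans all rows that start before the query end. A real index narrows the search further, but the sorted order is what makes the early cut-off possible.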

Please cite: Heng Li: Tabix: fast retrieval of sequence features from generic TAB-delimited files. (PubMed,eprint) Bioinformatics 27(5):718-719 (2011)

Registry entries: Bio.tools Bioconda

transrate-tools

helper for transrate

https://hibberdlab.com/transrate/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package transrate-tools
Release	Version	Architectures
sid	1.0.0-7	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	1.0.0-3	amd64,arm64,armhf,i386
bookworm	1.0.0-5	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie	1.0.0-5	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	1.0.0-7	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

Transrate is a library and command-line tool for quality assessment of de-novo transcriptome assemblies.

This package provides command line tools used by transrate to process BAM files.

Please cite: Richard Smith-Unna, Chris Boursnell, Rob Patro, Julian M. Hibberd and Steven Kelly: TransRate: reference-free quality assessment of de novo transcriptome assemblies. (PubMed,eprint) Genome Research 26(8):1134-1144 (2016)

Registry entries: Bioconda

trimmomatic

flexible read trimming tool for Illumina NGS data

http://www.usadellab.org/cms/index.php?page=trimmomatic

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package trimmomatic
Release	Version	Architectures
forky	0.39+dfsg-2	all
bullseye	0.39+dfsg-2	all
bookworm	0.39+dfsg-2	all
trixie	0.39+dfsg-2	all
sid	0.39+dfsg-2	all

Popcon: 10 users (34 upd.)^*

License: DFSG free

Official Debian package

Git

Trimmomatic performs a variety of useful trimming tasks for Illumina paired-end and single-ended data. The selection of trimming steps and their associated parameters are supplied on the command line.

The current trimming steps are:

ILLUMINACLIP: Cut adapter and other illumina-specific sequences from the read.
SLIDINGWINDOW: Perform a sliding window trimming, cutting once the average quality within the window falls below a threshold.
LEADING: Cut bases off the start of a read, if below a threshold quality
TRAILING: Cut bases off the end of a read, if below a threshold quality
CROP: Cut the read to a specified length
HEADCROP: Cut the specified number of bases from the start of the read
MINLENGTH: Drop the read if it is below a specified length
TOPHRED33: Convert quality scores to Phred-33
TOPHRED64: Convert quality scores to Phred-64

It works with FASTQ (using phred + 33 or phred + 64 quality scores, depending on the Illumina pipeline used), either uncompressed or gzipped. Use of gzip format is determined based on the .gz extension.
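The sliding window step can be pictured with a short Python sketch. This is a simplified illustration, not Trimmomatic's actual implementation. The function names sliding_window_trim and phred33_to_scores, the window size and the threshold are assumptions chosen for the example.

```python
def phred33_to_scores(quality_string):
    """Decode a Phred+33 FASTQ quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(seq, quals, window=4, min_avg=15):
    """Cut the read at the first window whose mean quality drops below min_avg."""
    n = len(seq)
    if n < window:
        # Too short for a full window: judge the whole read at once.
        keep = n if n and sum(quals) / n >= min_avg else 0
        return seq[:keep], quals[:keep]
    for start in range(n - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            return seq[:start], quals[:start]
    return seq, quals


# Hypothetical read: six high-quality bases followed by four poor ones.
read = "ACGTACGTAC"
scores = phred33_to_scores("IIIIII####")
trimmed_seq, trimmed_scores = sliding_window_trim(read, scores, window=4, min_avg=15)
print(trimmed_seq)
```

Cutting at the start of the first failing window is the simplest policy. It may discard a few good bases that sit at the front of that window.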

The package is enhanced by the following packages: multiqc

Please cite: A.M. Bolger, M. Lohse and B. Usadel: Trimmomatic: a flexible trimmer for Illumina sequence data. (PubMed,eprint) Bioinformatics 30(15):2114-2120 (2014)

Registry entries: Bio.tools SciCrunch Bioconda

Topics: Sequencing

trinityrnaseq

RNA-Seq De novo Assembly

https://github.com/trinityrnaseq/trinityrnaseq

Maintainer: Debian Med Packaging Team (Santiago Vila)

Versions of package trinityrnaseq
Release	Version	Architectures
sid	2.15.2+dfsg-3	amd64,arm64,loong64,ppc64el,riscv64
bullseye	2.11.0+dfsg-6	amd64,arm64
trixie	2.15.2+dfsg-1	amd64,arm64,ppc64el,riscv64

Popcon: 9 users (31 upd.)^*

License: DFSG free

Official Debian package

Git

Trinity represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes.
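For readers unfamiliar with de Bruijn graphs, the following Python sketch builds one from a handful of reads. It only illustrates the data structure: Trinity's partitioning into per-locus graphs, its coverage handling and its isoform extraction are not modelled, and the function name de_bruijn_graph and the reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge: its prefix points to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Two hypothetical overlapping reads.
reads = ["ACGTC", "CGTCA"]
graph = de_bruijn_graph(reads, k=3)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Overlapping reads share k-mers, so they merge into a single path through the graph without any pairwise alignment.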

Please cite: Manfred G Grabherr, Brian J Haas, Moran Yassour, Joshua Z Levin, Dawn A Thompson, Ido Amit, Xian Adiconis, Lin Fan, Raktima Raychowdhury, Qiandong Zeng, Zehua Chen, Evan Mauceli, Nir Hacohen, Andreas Gnirke, Nicholas Rhind, Federica di Palma, Bruce W Birren, Chad Nusbaum, Kerstin Lindblad-Toh, Nir Friedman and Aviv Regev: Full-length transcriptome assembly from RNA-Seq data without a reference genome. (PubMed) Nature Biotechnology 29(7):644-652 (2011)

Registry entries: Bio.tools SciCrunch Bioconda

uc-echo

error correction algorithm designed for short-reads from NGS

https://uc-echo.sourceforge.net/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package uc-echo
Release	Version	Architectures
sid	1.12-19	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
trixie	1.12-19	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
forky	1.12-19	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bullseye	1.12-15	amd64,arm64,armhf,i386
bookworm	1.12-18	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

ECHO is an error correction algorithm designed for short reads from next-generation sequencing platforms such as Illumina's Genome Analyzer II. The algorithm uses a Bayesian framework to improve the quality of the reads in a given data set by employing maximum a posteriori estimation.

Please cite: W.-C. Kao, A.H. Chan and Y.S. Song: ECHO: A reference-free short-read error correction algorithm. (PubMed,eprint) Genome Research 21:1181-1192 (2011)

Registry entries: Bio.tools SciCrunch

Topics: Data management; Sequencing

vcftools

Collection of tools to work with VCF files

https://vcftools.github.io/

Maintainer: Debian Med Packaging Team (Dylan Aïssi)

Versions of package vcftools
Release	Version	Architectures
bullseye	0.1.16-2	amd64,arm64,armhf,i386
sid	0.1.17-1	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	0.1.17-1	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	0.1.16-3	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	0.1.16-3	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 19 users (37 upd.)^*

License: DFSG free

Official Debian package

Git

VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide methods for working with VCF files: validating, merging, comparing and calculating some basic population genetic statistics.
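As an example of the kind of basic statistic meant here, the Python sketch below computes the non-reference allele frequency for one VCF data line. It is not VCFtools code and handles only the simple case shown. The function name alt_allele_frequency and the sample line are assumptions for illustration.

```python
def alt_allele_frequency(vcf_line):
    """Return the frequency of non-reference alleles among called genotypes."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; sample columns follow it.
    gt_index = fields[8].split(":").index("GT")
    alt_count = 0
    called = 0
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index]
        for allele in genotype.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing call, not counted
            called += 1
            if allele != "0":
                alt_count += 1
    return alt_count / called if called else None


# Hypothetical VCF data line with four samples, one of them uncalled.
line = "1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t1|1:9\t./.:0\t0/0:7"
print(alt_allele_frequency(line))
```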

The package is enhanced by the following packages: multiqc

Please cite: Petr Danecek, Adam Auton, Goncalo Abecasis, Cornelis A. Albers, Eric Banks, Mark A. DePristo, Robert E. Handsaker, Gerton Lunter, Gabor T. Marth, Stephen T. Sherry, Gilean McVean and Richard Durbin: The variant call format and VCFtools. (PubMed,eprint) Bioinformatics 27(15):2156-8 (2011)

Registry entries: Bio.tools SciCrunch Bioconda

velvet

Nucleic acid sequence assembler for very short reads

https://github.com/dzerbino/velvet

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package velvet
Release	Version	Architectures
forky	1.2.10+dfsg1-10	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
bookworm	1.2.10+dfsg1-8	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	1.2.10+dfsg1-10	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	1.2.10+dfsg1-7	amd64,arm64,armhf,i386
trixie	1.2.10+dfsg1-9	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x

Popcon: 9 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454, developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI), near Cambridge, in the United Kingdom.

Velvet currently takes in short read sequences, removes errors then produces high quality unique contigs. It then uses paired read information, if available, to retrieve the repeated areas between contigs.

Please cite: Daniel R. Zerbino and Ewan Birney: Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. (PubMed,eprint) Genome Research 18(5):821-829 (2008)

Registry entries: Bio.tools SciCrunch Bioconda

velvet-long

Nucleic acid sequence assembler for very short reads, long version

https://github.com/dzerbino/velvet

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package velvet-long
Release	Version	Architectures
bullseye	1.2.10+dfsg1-7	amd64,arm64,armhf,i386
sid	1.2.10+dfsg1-10	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	1.2.10+dfsg1-10	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	1.2.10+dfsg1-9	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	1.2.10+dfsg1-8	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 0 users (1 upd.)^*

License: DFSG free

Official Debian package

Git

This package installs special long-mode versions of Velvet, as recommended in the Velvet tutorials.

Please cite: Daniel R. Zerbino and Ewan Birney: Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. (PubMed,eprint) Genome Research 18(5):821-829 (2008)

Registry entries: Bio.tools SciCrunch Bioconda

velvetoptimiser

automatically optimise Velvet de novo assembly parameters

https://github.com/tseemann/VelvetOptimiser/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package velvetoptimiser
Release	Version	Architectures
bullseye	2.2.6-3	all
forky	2.2.6-5	all
trixie	2.2.6-5	all
bookworm	2.2.6-5	all
sid	2.2.6-5	all

Popcon: 6 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

VelvetOptimiser is a multi-threaded Perl script for automatically optimising the three primary parameter options (K, -exp_cov, -cov_cutoff) for the Velvet de novo sequence assembler.

Registry entries: Bio.tools Bioconda

vsearch

tool for processing metagenomic sequences

https://github.com/torognes/vsearch/

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package vsearch
Release	Version	Architectures
sid	2.30.4-1	amd64,arm64,loong64,ppc64el,riscv64
bookworm	2.22.1-1	amd64,arm64,ppc64el
bullseye	2.15.2-3	amd64,arm64
trixie	2.30.0-1	amd64,arm64,ppc64el,riscv64
forky	2.30.4-1	amd64,arm64,ppc64el,riscv64

Popcon: 5 users (31 upd.)^*

License: DFSG free

Official Debian package

Git

Versatile 64-bit multithreaded tool for processing metagenomic sequences, including searching, clustering, chimera detection, dereplication, sorting, masking and shuffling.

The aim of this project is to create an alternative to the USEARCH tool developed by Robert C. Edgar (2010). The new tool should:

have a 64-bit design that handles very large databases and much more than 4GB of memory
be as accurate or more accurate than usearch
be as fast or faster than usearch

The package is enhanced by the following packages: vsearch-examples

Please cite: Torbjørn Rognes, Tomáš Flouri, Ben Nichols, Christopher Quince and Frédéric Mahé: VSEARCH: a versatile open source tool for metagenomics. (eprint) PeerJ 4:e2584

Registry entries: Bio.tools Bioconda

wham-align

Wisconsin's High-Throughput Alignment Method

http://research.cs.wisc.edu/wham

Maintainer: Debian Med Packaging Team (Nilesh Patra)

Versions of package wham-align
Release	Version	Architectures
sid	0.1.5-8	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
bullseye	0.1.5-8	amd64,arm64,armhf,i386
forky	0.1.5-8	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	0.1.5-8	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bookworm	0.1.5-8	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

This package provides functionality analogous to BWA or bowtie in aligning reads from next-generation DNA sequencing machines against a reference genome.

Please cite: Yinan Li, Allie Terrell and Jignesh M. Patel: WHAM: A High-throughput Sequence Alignment Method (eprint) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece (2011)

Registry entries: Bio.tools SciCrunch Bioconda

wigeon

reimplementation of the Pintail 16S DNA anomaly detection utility

https://microbiomeutil.sourceforge.net/

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package wigeon
Release	Version	Architectures
bullseye	20101212+dfsg1-4	all
forky	20101212+dfsg1-6	all
trixie	20101212+dfsg1-6	all
bookworm	20101212+dfsg1-5	all
sid	20101212+dfsg1-6	all

Popcon: 5 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

WigeoN examines the sequence conservation between a query and a trusted reference sequence, both in NAST alignment format. Based on the sequence identity between the query and the reference sequence, there is an expected amount of variation among the alignment. If the observed variation is greater than the 95% quantile of the distribution of variation observed between non-anomalous sequences, then it is flagged as an anomaly.
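The flagging rule described above can be sketched in a few lines of Python. How WigeoN actually measures variation and derives its reference distribution is not covered here; the function name is_anomalous and the reference values are hypothetical.

```python
import statistics


def is_anomalous(observed_variation, reference_variations, quantile_percent=95):
    """Flag a query whose variation exceeds the given percentile of the reference set."""
    # With n=100, statistics.quantiles returns the 99 percentile cut points.
    cut_points = statistics.quantiles(reference_variations, n=100, method="inclusive")
    threshold = cut_points[quantile_percent - 1]
    return observed_variation > threshold


# Hypothetical variation values observed between non-anomalous sequences.
reference = list(range(1, 101))
print(is_anomalous(96, reference))
print(is_anomalous(95, reference))
```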

WigeoN is a flexible command-line based reimplementation of the Pintail algorithm (Appl Environ Microbiol. 2005 Dec;71(12):7724-36).

WigeoN is useful for flagging chimeras and anomalies only in near full-length 16S rRNA sequences. WigeoN lacks sensitivity with sequences less than 1000 bp.

To run WigeoN, you need NAST-formatted sequences generated by the nast-ier utility.

WigeoN is part of the microbiomeutil suite.

The package is enhanced by the following packages: microbiomeutil-data

Registry entries: SciCrunch

Official Debian packages with lower relevance

nanolyse

remove lambda phage reads from a fastq file

https://github.com/wdecoster/nanolyse

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package nanolyse
Release	Version	Architectures
bullseye	1.2.0-1	amd64,arm64,armhf,i386
bookworm	1.2.0-4	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	1.2.0-4	amd64,arm64,armhf,i386,loong64,ppc64el,riscv64,s390x
forky	1.2.0-4	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x
trixie	1.2.0-4	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x

Popcon: 4 users (32 upd.)^*

License: DFSG free

Official Debian package

Git

NanoLyse is a tool for rapid removal of contaminant DNA, using the Minimap2 aligner through the mappy Python binding. A typical application would be the removal of the lambda phage control DNA fragment supplied by ONT, for which the reference sequence is included in the package. However, this approach may lead to unwanted loss of reads from regions highly homologous to the lambda phage genome.

Please cite: Wouter De Coster, Svenn D’Hert, Darrin T Schultz, Marc Cruts and Christine Van Broeckhoven: NanoPack: visualizing and processing long-read sequencing data. (PubMed,eprint) Bioinformatics 34(15):2666-2669 (2018)

Registry entries: Bioconda

python3-anndata

annotated gene by sample numpy matrix

https://github.com/theislab/anndata

Maintainer: Debian Med Packaging Team (Michael R. Crusoe)

Versions of package python3-anndata
Release	Version	Architectures
sid	0.12.6-1	all
bullseye	0.7.5+ds-3	all
bookworm	0.8.0-4	all
upstream	0.12.10

Popcon: 2 users (1 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

AnnData provides a scalable way of keeping track of data together with learned annotations. It is used within Scanpy, for which it was initially developed. Both packages have been introduced in Genome Biology (2018).

Please cite: F. Alexander Wolf, Philipp Angerer and Fabian J. Theis: SCANPY: large-scale single-cell gene expression data analysis. (PubMed) Genome Biol. 19:15 (2018)

Registry entries: Bioconda

r-bioc-isoformswitchanalyzer

Identify, Annotate and Visualize Alternative Splicing and

https://bioconductor.org/packages/IsoformSwitchAnalyzeR/

Maintainer: Debian R Packages Maintainers (Michael R. Crusoe)

Versions of package r-bioc-isoformswitchanalyzer
Release	Version	Architectures
trixie	2.6.0+ds-2	amd64,arm64,ppc64el,riscv64,s390x
sid	2.6.0+ds-2	amd64,arm64,loong64,ppc64el,riscv64,s390x
bookworm	1.20.0+ds-1	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream	2.10.0

Popcon: 1 users (1 upd.)^*

Newer upstream!

License: DFSG free

Official Debian package

Git

Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data. Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains etc.) from quantification of all types of RNASeq by tools such as Kallisto, Salmon, StringTie, Cufflinks/Cuffdiff etc.

Registry entries: Bio.tools Bioconda

r-bioc-mofa2

https://bioconductor.org/packages/MOFA2/

Maintainer: Debian R Packages Maintainers (Michael R. Crusoe)

Newer upstream!

License: DFSG free

Official Debian package

Git

Registry entries: Bio.tools Bioconda

Debian packages in contrib or non-free

bcbio

toolkit for analysing high-throughput sequencing data

https://github.com/bcbio/bcbio-nextgen

Maintainer: Debian Med Packaging Team (Alexandre Detiste)

Popcon: 0 users (0 upd.)^*

License: DFSG free, but needs non-free components

Debian package in contrib/non-free

Git

This package installs the command line tools of the bcbio-nextgen toolkit implementing best-practice pipelines for fully automated high throughput sequencing analysis.

A high-level configuration file specifies inputs and analysis parameters to drive a parallel pipeline that handles distributed execution, idempotent processing restarts and safe transactional steps. The project contributes a shared community resource that handles the data processing component of sequencing analysis, providing researchers with more time to focus on the downstream biology.

This package builds, and having it in Debian unstable helps the Debian developers to synchronize their efforts. But unless a series of external dependencies is installed manually, the functionality of bcbio in Debian is only a shadow of itself. Please use the official distribution of bcbio for the time being, which means "use conda". The TODO file in the Debian directory should give an overview of the progress of the Debian packaging.

Registry entries: Bio.tools Bioconda

cufflinks

Transcript assembly, differential expression and regulation for RNA-Seq

https://cufflinks.cbcb.umd.edu

Maintainer: Debian Med Packaging Team (Étienne Mollier)

Versions of package cufflinks
Release	Version	Architectures
trixie	2.2.1+dfsg.1-10 (non-free)	amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
bullseye	2.2.1+dfsg.1-8 (non-free)	amd64,arm64,armhf,i386
bookworm	2.2.1+dfsg.1-9 (non-free)	amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid	2.2.1+dfsg.1-10 (non-free)	amd64,arm64,armhf,i386,ppc64el,riscv64,s390x

Popcon: 1 users (1 upd.)^*

License: non-free

Debian package in contrib/non-free

Git

Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one.

This package provides the binary of cufflinks and associated tools, i.e. compress_gtf, cuffcompare, cuffdiff, cuffmerge, cuffnorm, cuffquant and gtf_to_sam.

Please cite: Cole Trapnell, Brian A Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Marijke J van Baren, Steven L Salzberg, Barbara J Wold and Lior Pachter: Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. (PubMed) Nature Biotechnology 28(5):511-515 (2010)

Registry entries: Bio.tools SciCrunch Bioconda

vdjtools

framework for post-analysis of B/T cell repertoires

https://github.com/mikessh/vdjtools

Maintainer: Debian Med Packaging Team (Andreas Tille)

Versions of package vdjtools
Release	Version	Architectures
bullseye	1.2.1+git20190311-5 (non-free)	all
sid	1.2.1+git20190311+repack-2 (non-free)	all
trixie	1.2.1+git20190311+repack-2 (non-free)	all
bookworm	1.2.1+git20190311+repack-1 (non-free)	all
forky	1.2.1+git20190311+repack-2 (non-free)	all

Popcon: 0 users (0 upd.)^*

License: non-free

Debian package in contrib/non-free

Git

VDJtools is an open-source Java/Groovy-based framework designed to facilitate analysis of immune repertoire sequencing (RepSeq) data. VDJtools computes a wide set of statistics and is able to perform various forms of cross-sample analysis. Both comprehensive tabular output and publication-ready plots are provided.

The main aims of the VDJtools Project are:

To ensure consistency between post-analysis methods and results
To save the time of bioinformaticians analyzing RepSeq data
To create an API framework facilitating development of new RepSeq analysis applications
To provide a simple enough command line tool so it could be used by immunologists and biologists with little computational background

Please cite: M. Shugay, D.V. Bagaev, M.A. Turchaninova, D.A. Bolotin, O.V. Britanova, E.V. Putintseva, M.V. Pogorelyy, V.I. Nazarov, I.V. Zvyagin, V.I. Kirgizova, K.I. Kirgizov, E.V. Skorobogatova and D.M. Chudakov: VDJtools: Unifying Post-analysis of T Cell Receptor Repertoires. (PubMed,eprint) PLoS Comput Biol. 11(11):e1004503 (2015)

Packaging has started and developers might try the packaging code in VCS

giira

RNA-Seq driven gene finding incorporating ambiguous reads

http://sourceforge.net/projects/giira/

Responsible: Debian Med Packaging Team (Andreas Tille)

License: GPL-3

Debian package not available

Git

Version: 0.0.20140625-2

GIIRA is a gene prediction method that identifies potential coding regions exclusively based on the mapping of reads from an RNA-Seq experiment. It was foremost designed for prokaryotic gene prediction and is able to resolve genes within the expressed region of an operon. However, it is also applicable to eukaryotes and predicts exon intron structures as well as alternative isoforms.

Please cite: Franziska Zickmann, Martin S. Lindner and Bernhard Y. Renard: GIIRA—RNA-Seq driven gene finding incorporating ambiguous reads. (PubMed,eprint) Bioinformatics (2013)

Registry entries: Bio.tools SciCrunch

graphmap2

highly sensitive and accurate mapper for long, error-prone reads

https://github.com/lbcb-sci/graphmap2

Responsible: Debian Med Packaging Team (Andreas Tille)

License: MIT

Debian package not available

Git

Version: 0.6.4-1

GraphMap2 is a highly sensitive and accurate mapper for long, error-prone reads. The mapping algorithm is designed to analyse nanopore sequencing reads: it progressively refines candidate alignments to robustly handle potentially high error rates and uses a fast graph traversal to align long reads with speed and high precision (>95%). Evaluation on MinION sequencing data sets against short- and long-read mappers indicates that GraphMap increases mapping sensitivity by 10–80% and maps >95% of bases. GraphMap alignments enabled single-nucleotide variant calling on the human genome with increased sensitivity (15%) over the next best mapper, precise detection of structural variants from length 100 bp to 4 kbp, and species and strain-specific identification of pathogens using MinION reads.

Please cite: Ivan Sović, Mile Šikić, Andreas Wilm, Shannon Nicole Fenlon, Swaine Chen and Niranjan Nagarajan: Fast and sensitive mapping of nanopore sequencing reads with GraphMap. (PubMed,eprint) Nature Communications 7(11307) (2016)

Registry entries: Bioconda

mosaik-aligner

reference-guided aligner for next-generation sequencing

https://github.com/wanpinglee/MOSAIK

Responsible: Debian Med Packaging Team (Andreas Tille)

License: MIT

Debian package not available

Git

Version: 2.2.30+20140627-1

MosaikBuild converts various sequence formats into Mosaik’s native read format. MosaikAligner pairwise aligns each read to a specified series of reference sequences. MosaikSort resolves paired-end reads and sorts the alignments by the reference sequence coordinates. Finally, MosaikText converts alignments to different text-based formats.

At this time, the workflow consists of supplying sequences in FASTA, FASTQ, Illumina Bustard & Gerald, or SRF file formats and producing results in the BLAT axt, the BAM/SAM, the UCSC Genome Browser bed, or the Illumina ELAND formats.

nanoplot

plotting scripts for long read sequencing data

https://github.com/wdecoster/NanoPlot

Responsible: Debian Med Packaging Team (Andreas Tille)

License: MIT

Debian package not available

Git

Version: 1.36.2-1

NanoPlot provides plotting scripts for long read sequencing data.

These scripts perform data extraction from Oxford Nanopore sequencing data in the following formats:

fastq files (optionally compressed)
fastq files generated by albacore, guppy or MinKNOW containing additional information (optionally compressed)
sorted bam files
sequencing_summary.txt output table generated by albacore, guppy or MinKNOW basecalling (optionally compressed)
fasta files (optionally compressed)
multiple files of the same type can be offered simultaneously

Please cite: Wouter De Coster, Svenn D'Hert, Darrin T Schultz, Marc Cruts and Christine Van Broeckhoven: NanoPack: visualizing and processing long-read sequencing data. (PubMed,eprint) Bioinformatics 34(15):2666-2669 (2018)

Registry entries: Bioconda

umap

quantify genome and methylome mappability

https://bitbucket.org/hoffmanlab/umap

Responsible: Debian Med Packaging Team (Afif Elghraoui)

License: GPL-3.0

Debian package not available

Git

Version: 1.0.0-1

Umap identifies uniquely mappable regions of any genome. Its Bismap extension identifies mappability of the bisulfite converted genome (methylome).

Please cite: Mehran Karimzadeh, Carl Ernst, Anshul Kundaje and Michael M. Hoffman: Umap and Bismap: quantifying genome and methylome mappability. (PubMed,eprint) Nucleic Acids Res. 46(20):e120 (2018)

Registry entries: Bioconda

No known packages available

annovar

annotate genetic variants detected from diverse genomes

http://www.openbioinformatics.org/annovar/

License: Open Source for non-profit

Debian package not available

ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome hg18, hg19, as well as mouse, worm, fly, yeast and many others). Given a list of variants with chromosome, start position, end position, reference nucleotide and observed nucleotides, ANNOVAR can perform:

 1. Gene-based annotation: identify whether SNPs or CNVs cause protein coding
    changes and the amino acids that are affected. Users can flexibly use RefSeq
    genes, UCSC genes, ENSEMBL genes, GENCODE genes, or many other gene definition
    systems.
 2. Region-based annotations: identify variants in specific genomic regions,
    for example, conserved regions among 44 species, predicted transcription
    factor binding sites, segmental duplication regions, GWAS hits, database
    of genomic variants, DNAse I hypersensitivity sites, ENCODE
    H3K4Me1/H3K4Me3/H3K27Ac/CTCF sites, ChIP-Seq peaks, RNA-Seq peaks, or many
    other annotations on genomic intervals.
 3. Filter-based annotation: identify variants that are reported in dbSNP,
    or identify the subset of common SNPs (MAF>1%) in the 1000 Genomes Project,
    or identify the subset of non-synonymous SNPs with SIFT score>0.05, or many
    other annotations on specific mutations.
 4. Other functionalities: Retrieve the nucleotide sequence at any
    user-specified genomic positions in batch, identify a candidate gene list
    for Mendelian diseases from exome data, identify a list of SNPs from
    1000 Genomes that are in strong LD with a GWAS hit, and many other
    creative utilities.
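To make the region-based case concrete, here is a minimal Python sketch of annotating variants by interval overlap. It is not ANNOVAR's code or file format: the `Variant` and `Region` types, the function `annotate_by_region`, and the region labels are hypothetical, and coordinates are assumed to be closed intervals on the same reference.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    start: int
    end: int
    ref: str
    obs: str


class Region(NamedTuple):
    chrom: str
    start: int
    end: int
    label: str  # hypothetical annotation name, e.g. "conserved"


def annotate_by_region(
    variants: List[Variant], regions: List[Region]
) -> Dict[Variant, List[str]]:
    """Map each variant to the labels of all regions it overlaps."""
    annotations = {}
    for variant in variants:
        annotations[variant] = [
            region.label
            for region in regions
            if region.chrom == variant.chrom
            and region.start <= variant.end
            and variant.start <= region.end
        ]
    return annotations
```

The linear scan over all regions keeps the sketch short; a real tool handling millions of variants would need an indexed lookup instead.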

On a modern desktop computer (3 GHz Intel Xeon CPU, 8 GB memory), for 4.7 million variants, ANNOVAR requires ~4 minutes to perform gene-based functional annotation, or ~15 minutes to perform a stepwise "variants reduction" procedure, making it practical to handle hundreds of human genomes in a day.
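The throughput behind these figures can be checked with a few lines of Python. The sketch assumes that one run over 4.7 million variants corresponds to one human genome and that runs execute back to back on a single machine; neither assumption is stated in the description above.

```python
MINUTES_PER_DAY = 24 * 60


def runs_per_day(minutes_per_run: float) -> float:
    """Number of sequential runs that fit into 24 hours."""
    return MINUTES_PER_DAY / minutes_per_run


gene_based = runs_per_day(4)    # gene-based annotation, ~4 minutes per run
reduction = runs_per_day(15)    # stepwise variants reduction, ~15 minutes per run

print(gene_based)  # 360.0
print(reduction)   # 96.0
```

Under these assumptions, gene-based annotation reaches about 360 runs per day and the stepwise procedure about 96.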

forge

genome assembler for mixed read types

http://combiol.org/forge/

License: Apache 2.0

Debian package not available

Forge Genome Assembler is a parallel, MPI-based genome assembler for mixed read types. Forge is a classic "overlap-layout-consensus" genome assembler written by Darren Platt and Dirk Evers. Implemented in C++ and using the parallel MPI library, it runs on one or more machines in a network and can scale to very large numbers of reads, provided there is enough collective memory on the machines used. It generates a full consensus alignment of all reads and can handle mixtures of Sanger, 454 and Illumina reads. There is some support for SOLiD color space, and it includes built-in tools for vector trimming and contamination screening. Forge was originally developed at Exelixis, who have kindly agreed to place the software, which underwent much subsequent development outside Exelixis, into the public domain. Forge works with most of the common MPI implementations.

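The first stage of an overlap-layout-consensus assembler, finding overlaps between reads, can be sketched in Python as follows. This is a generic textbook illustration, not Forge's implementation: the function names and the `min_length` parameter are hypothetical, only exact suffix-prefix matches are considered, and the layout and consensus stages are not shown.

```python
from itertools import permutations
from typing import Dict, List, Tuple


def overlap_length(a: str, b: str, min_length: int = 3) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when no overlap of at least `min_length` exists.
    """
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def overlap_graph(
    reads: List[str], min_length: int = 3
) -> Dict[Tuple[str, str], int]:
    """Build a directed graph whose edges are read pairs with an exact overlap."""
    graph = {}
    for a, b in permutations(reads, 2):
        length = overlap_length(a, b, min_length)
        if length:
            graph[(a, b)] = length
    return graph


print(overlap_graph(["ACGTAC", "TACGGA"]))  # {('ACGTAC', 'TACGGA'): 3}
```

Comparing every pair of reads is quadratic in the number of reads, which is one reason assemblers for large read sets distribute the work across machines.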
Remark of Debian Med team: Competitor to MIRA2 and wgs-assembler

This package was requested by William Spooner <whs@eaglegenomics.com> as a competitor to MIRA2 and wgs-assembler.