Debian Science Project
Summary
Data Management
Debian Science Data Management packages

This metapackage will install packages to assist with data management tasks, such as obtaining data from remote resources, keeping data under version control, etc.

Description

If you discover a project that looks like a good candidate for Debian Science, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Science mailing list.

Official Debian packages with high relevance

datalad
Versions of package datalad
Release   Version   Architectures
sid       1.1.3-2   all
bullseye  0.14.0-1  all
buster    0.11.2-2  all
stretch   0.4.1-1   all
trixie    1.1.3-2   all
bookworm  0.18.1-2  all
Popcon: 49 users (2 upd.)*
License: DFSG free
datalad-container
DataLad extension for working with containerized environments
Maintainer: Yaroslav Halchenko
Versions of package datalad-container
Release   Version  Architectures
bookworm  1.1.9-1  all
trixie    1.2.5-1  all
sid       1.2.5-1  all
buster    0.2.2-2  all
bullseye  1.1.2-1  all
Popcon: 5 users (1 upd.)*
License: DFSG free

This extension enhances DataLad (http://datalad.org) for working with computational containers.

git-annex
manage files with git, without checking their contents into git
Versions of package git-annex
Release             Version                   Architectures
bookworm-backports  10.20240430-1~bpo12+1     amd64,arm64,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm            10.20230126-3             amd64,arm64,i386,mips64el,mipsel,ppc64el,s390x
bullseye            8.20210223-2              amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster-backports    8.20200330-1~bpo10+1      amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster              7.20190129-3              amd64,arm64,armhf,i386
stretch-backports   7.20190129-2~bpo9+1       amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
stretch-backports   7.20181211-2~bpo9+1       mips
stretch-backports   6.20180913-1~bpo9+1       mipsel
stretch             6.20170101-1+deb9u2       amd64,arm64,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie              10.20240927-1             amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid                 10.20240927-1             amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch-security    6.20170101-1+deb9u1       amd64,i386
jessie-security     5.20141125+oops-1+deb8u2  amd64,armel,armhf,i386
jessie              5.20141125+deb8u1         amd64,armel,armhf,i386
bookworm-backports  10.20240129-1~bpo12+1     armel
Debtags of package git-annex:
devel: rcs
role: program
works-with: file
Popcon: 435 users (88 upd.)*
License: DFSG free

git-annex allows managing large files with git without storing their contents in git. It can sync, back up, and archive data, online or offline. Checksums and encryption keep the data safe. Bring the power and distributed nature of git to bear on large files with git-annex.

It can store large files in many places, from local hard drives to a large number of online storage services, including S3, WebDAV, and rsync, with dozens of further online storage providers usable through plugins. Files can be stored encrypted with gpg so that the storage provider cannot see the data. git-annex keeps track of where each file is stored, so it knows how many copies are available, and it has many features for ensuring that data is preserved.

git-annex can also be used to keep a folder in sync between several computers, noticing when files change and automatically committing them to git for transfer to the other computers. The git-annex webapp makes it easy to set up and use git-annex this way.
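
As a rough illustration of the workflow described above, the following sketch lists the commands one might run to put a single large file under git-annex and copy its content to another location. The file name `scan.h5`, the repository description `laptop`, and the remote name `backupdisk` are placeholders invented for this example, and the remote is assumed to have been configured already. The helper functions are hypothetical wrappers, not part of git-annex.

```python
import subprocess


def annex_workflow(path, remote):
    """Return the git and git-annex commands for tracking one large file.

    Hypothetical helper for illustration; `path` and `remote` are placeholders.
    """
    return [
        ["git", "annex", "init", "laptop"],              # enable git-annex in this repository
        ["git", "annex", "add", path],                   # track the file without storing its content in git
        ["git", "commit", "-m", f"Add {path}"],          # record the change in git history
        ["git", "annex", "copy", "--to", remote, path],  # send the content to another location
        ["git", "annex", "whereis", path],               # list where copies are stored
    ]


def run_all(commands, cwd="."):
    """Execute each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_all() inside an existing git repository to execute them.
    for command in annex_workflow("scan.h5", "backupdisk"):
        print(" ".join(command))
```

Printing the commands first makes it easy to review them before running anything against a real repository.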

The package is enhanced by the following packages: elpa-git-annex elpa-magit-annex
hdf5-filter-plugin
external filters for HDF5: LZ4, BZip2, Bitshuffle
Versions of package hdf5-filter-plugin
Release   Version                     Architectures
trixie    0.0~git20221111.49e3b65-4   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  0.0~git20221111.49e3b65-4   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       0.0~git20221111.49e3b65-4   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (0 upd.)*
License: DFSG free

The external filter mechanism introduced with HDF5 1.8.12 lets applications use custom filters that are not provided by the core HDF5 library, without recompiling the application. This package provides external HDF5 filters for:

 – the LZ4 compression algorithm;
 – BZip2 compression.
hdf5-filter-plugin-blosc-serial
blocking, shuffling and lossless compression library
Versions of package hdf5-filter-plugin-blosc-serial
Release   Version                     Architectures
bookworm  0.0~git20220616.9683f7d-5   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie    0.0~git20220616.9683f7d-5   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       0.0~git20220616.9683f7d-5   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream  0.0~git20240808.b108ad1
Popcon: 6 users (4 upd.)*
Newer upstream!
License: DFSG free

This package provides an HDF5 filter that uses the Blosc compressor. With this filter installed, HDF5 files whose datasets are compressed with Blosc can be read and written.

hdf5-filter-plugin-zfp-serial
Compression plugin for the HDF5 library using ZFP compression
Versions of package hdf5-filter-plugin-zfp-serial
Release   Version              Architectures
sid       1.1.1-2              amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
trixie    1.1.1-2              amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bookworm  1.1.0+git20221021-4  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
Popcon: 2 users (0 upd.)*
License: DFSG free

H5Z-ZFP is a compression filter for HDF5 using the ZFP compression library, supporting lossy and lossless compression of floating point and integer data to meet bitrate, accuracy, and/or precision targets.

nexus-tools
NeXus scientific data file format - applications
Versions of package nexus-tools
Release   Version          Architectures
jessie    4.3.2-svn1921-2  amd64,armel,armhf,i386
bookworm  4.4.3-5          amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  4.4.3-5          amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       4.4.3-6          amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    4.4.3-6          amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (1 upd.)*
License: DFSG free

NeXus is a common data format for X-ray, neutron, and muon science. It was developed as an international standard by scientists and programmers from major scientific facilities in Europe, Asia, Australia, and North America in order to improve cooperation in the analysis and visualization of neutron, X-ray, and muon data.

This package provides some applications for reading and writing NeXus files.

plfit
fitting power-law distributions to empirical data -- interfaces
Versions of package plfit
Release   Version     Architectures
bookworm  0.9.4+ds-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
sid       0.9.6+ds-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    0.9.6+ds-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (2 upd.)*
License: DFSG free

The plfit software fits power-law distributions to empirical (discrete or continuous) data, according to the method of Clauset, Shalizi and Newman [SIAM Review 51, 661-703 (2009)].

This package provides two command line utilities, plfit and plgen.
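
To give a feel for the kind of estimate involved, the sketch below computes the maximum-likelihood exponent for continuous data above a fixed lower cutoff, alpha = 1 + n / sum(ln(x_i / xmin)), which is the continuous estimator given by Clauset, Shalizi and Newman. It is not the plfit implementation: the function name `fit_alpha` and the hand-picked `xmin` are assumptions made for this example, and the full method also covers discrete data and the choice of the cutoff.

```python
import math


def fit_alpha(samples, xmin):
    """Return the continuous power-law MLE exponent for samples >= xmin.

    Hypothetical helper for illustration; not part of plfit.
    """
    # Only the tail at or above the cutoff takes part in the fit.
    tail = [x for x in samples if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if not tail or log_sum == 0.0:
        raise ValueError("need at least one sample strictly above xmin")
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    data = [1.2, 1.5, 2.0, 3.7, 8.1, 0.4]  # made-up values; 0.4 falls below the cutoff
    print(fit_alpha(data, xmin=1.0))
```

Samples below `xmin` are ignored, which is why the choice of cutoff matters so much for the resulting exponent.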

The package is enhanced by the following packages: plfit-doc
python3-jdata
JData encoder/decoder for python 3
Versions of package python3-jdata
Release   Version  Architectures
bullseye  0.3.6-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       0.3.6-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  0.3.6-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 0 users (2 upd.)*
License: DFSG free

The JData Specification (https://github.com/fangq/jdata/) defines a lightweight, language-independent data annotation interface for storing and sharing complex data structures across programming languages such as MATLAB, JavaScript, and Python. Using JData formats, a complex Python data structure can be encoded as a dict object that is easily serialized as a JSON or binary JSON file and shared between programs written in different languages.
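
The idea of encoding a structure as a plain, JSON-serializable dict can be sketched with the standard library alone. The example below wraps a nested list in the `_ArrayType_`, `_ArraySize_`, and `_ArrayData_` annotation keys used for arrays in the JData specification. The helper names are hypothetical and this is not the python3-jdata API. The sketch also assumes rectangular nested lists and row-major ordering of the flattened data.

```python
import json


def shape_of(nested):
    """Return the dimensions of a rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0] if nested else None
    return dims


def flatten(nested):
    """Yield the scalars of a nested list in row-major order."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


def annotate_array(nested, dtype="double"):
    """Wrap a nested list in JData-style array annotation keys (illustrative helper)."""
    return {
        "_ArrayType_": dtype,
        "_ArraySize_": shape_of(nested),
        "_ArrayData_": list(flatten(nested)),
    }


# The annotated record is an ordinary dict, so json can serialize it directly.
record = {"name": "trial-1", "grid": annotate_array([[1, 2, 3], [4, 5, 6]], "int32")}
text = json.dumps(record)
```

Because the result contains only dicts, lists, strings, and numbers, any language with a JSON parser can read it back and rebuild the array from its size and flattened data.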

python3-mdp
Modular toolkit for Data Processing
Versions of package python3-mdp
Release   Version  Architectures
jessie    3.3-2    all
stretch   3.5-1    all
bullseye  3.6-1.1  all
bookworm  3.6-2    amd64,arm64,mips64el,ppc64el
trixie    3.6-8    all
sid       3.6-8    all
Popcon: 16 users (12 upd.)*
License: DFSG free

This is a Python data processing framework for building complex data processing software by combining widely used machine learning algorithms into pipelines and networks. Implemented algorithms include Principal Component Analysis (PCA), Independent Component Analysis (ICA), Slow Feature Analysis (SFA), Independent Slow Feature Analysis (ISFA), Growing Neural Gas (GNG), Factor Analysis, Fisher Discriminant Analysis (FDA), and Gaussian Classifiers.

The package is enhanced by the following packages: python3-sklearn
python3-nxs
NeXus scientific data file format - Python 3 bindings
Versions of package python3-nxs
Release   Version  Architectures
trixie    4.4.1-5  all
bookworm  4.4.1-4  all
bullseye  4.4.1-3  all
sid       4.4.1-5  all
Popcon: 2 users (1 upd.)*
License: DFSG free

NeXus is a common data format for X-ray, neutron, and muon science. It was developed as an international standard by scientists and programmers from major scientific facilities in Europe, Asia, Australia, and North America in order to improve cooperation in the analysis and visualization of neutron, X-ray, and muon data.

This package provides the Python 3 bindings.

python3-pyzoltan
Wrapper for the Zoltan data management library
Versions of package python3-pyzoltan
Release   Version           Architectures
bookworm  1.0.1-5+deb12u1   amd64,arm64,ppc64el,s390x
sid       1.0.1-12          amd64,arm64,mips64el,ppc64el,riscv64,s390x
trixie    1.0.1-12          amd64,arm64,mips64el,ppc64el,riscv64,s390x
bullseye  1.0.1-2+deb11u1   amd64,arm64,ppc64el,s390x
Popcon: 2 users (12 upd.)*
License: DFSG free

PyZoltan, as the name suggests, is a Python wrapper for the Zoltan data management library.

In PyZoltan, only specific routines and objects are wrapped. The following features of Zoltan are currently supported:

  • Dynamic load balancing using geometric algorithms
  • Unstructured point-to-point communication
  • Distributed data directories
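
As an illustration of what a geometric partitioning algorithm does, here is a minimal pure-Python sketch of recursive coordinate bisection, one of the geometric methods Zoltan offers. It does not use PyZoltan or reproduce its API. The function name `rcb` is hypothetical, and the sketch assumes that the number of parts is a power of two and that there are at least as many points as parts.

```python
def rcb(points, parts):
    """Split points into `parts` groups by recursive coordinate bisection.

    Illustrative helper, not PyZoltan API. Assumes `parts` is a power of two
    and len(points) >= parts.
    """
    if parts == 1:
        return [points]
    dims = range(len(points[0]))
    # Cut along the axis with the largest extent so the groups stay compact.
    axis = max(dims, key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2  # a median split gives both halves near-equal load
    return rcb(ordered[:mid], parts // 2) + rcb(ordered[mid:], parts // 2)


if __name__ == "__main__":
    cloud = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0), (12, 0), (13, 0)]
    for rank, group in enumerate(rcb(cloud, 4)):
        print(rank, group)
```

Each resulting group would be assigned to one process. In a dynamic setting the split is recomputed as the points move.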
virtuoso-opensource
high-performance database
Versions of package virtuoso-opensource
Release   Version             Architectures
trixie    7.2.12+dfsg-1       all
stretch   6.1.6+dfsg2-4       all
buster    6.1.6+dfsg2-4       all
sid       7.2.12+dfsg-1       all
jessie    6.1.6+dfsg2-2       all
bullseye  7.2.5.1+dfsg1-0.1   all
bookworm  7.2.5.1+dfsg1-0.3   all
upstream  7.2.13
Debtags of package virtuoso-opensource:
role: metapackage, program
works-with: db
Popcon: 0 users (0 upd.)*
Newer upstream!
License: DFSG free

OpenLink Virtuoso is a high-performance object-relational SQL database. It provides transactions, a smart SQL compiler, hot backup, SQL:1999 support, a powerful stored-procedure language supporting server-side Java or .NET, and more. It supports all major data-access interfaces, including ODBC, JDBC, ADO.NET, and OLE/DB.

Virtuoso supports SPARQL embedded into SQL for querying RDF data stored in its database. SPARQL benefits from low-level support in the engine itself, such as SPARQL-aware type-casting rules and a dedicated IRI data type.

Install this metapackage for the full suite of packages that make up Virtuoso OSE ("Open-Source Edition").

visidata
rapidly explore columnar data in the terminal
Versions of package visidata
Release   Version  Architectures
bookworm  2.11-1   all
bullseye  2.2.1-1  all
trixie    3.0.2-1  all
sid       3.0.2-1  all
buster    1.5.2-1  all
upstream  3.1.1
Popcon: 43 users (13 upd.)*
Newer upstream!
License: DFSG free

VisiData is a multipurpose terminal utility for exploring, cleaning, restructuring, and analysing tabular data. Currently supported sources are TSV, CSV, fixed-width text, JSON, SQLite, HTTP, HTML, .xls, and .xlsx (Microsoft Excel).

Official Debian packages with lower relevance

libnexus-dev
NeXus scientific data file format - development libraries
Versions of package libnexus-dev
Release   Version  Architectures
bookworm  4.4.3-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       4.4.3-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    4.4.3-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye  4.4.3-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (0 upd.)*
License: DFSG free

NeXus is a common data format for neutron, X-ray, and muon science. It is being developed as an international standard by scientists and programmers representing major scientific facilities in Europe, Asia, Australia, and North America in order to facilitate greater cooperation in the analysis and visualization of neutron, X-ray, and muon data.

This is the package containing the development libraries.

libnexus-java
NeXus scientific data file format - java libraries
Versions of package libnexus-java
Release   Version  Architectures
sid       4.4.3-6  all
bullseye  4.4.3-5  all
bookworm  4.4.3-5  all
trixie    4.4.3-6  all
Popcon: 0 users (0 upd.)*
License: DFSG free

NeXus is a common data format for neutron, X-ray, and muon science. It is being developed as an international standard by scientists and programmers representing major scientific facilities in Europe, Asia, Australia, and North America in order to facilitate greater cooperation in the analysis and visualization of neutron, X-ray, and muon data.

This is the package containing the java libraries.

libplfit-dev
fitting power-law distributions to empirical data -- development
Versions of package libplfit-dev
Release   Version     Architectures
bookworm  0.9.4+ds-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie    0.9.6+ds-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       0.9.6+ds-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (0 upd.)*
License: DFSG free

The plfit software fits power-law distributions to empirical (discrete or continuous) data, according to the method of Clauset, Shalizi and Newman [SIAM Review 51, 661-703 (2009)].

This package contains the header files, static libraries and symbolic links that developers using the plfit library will need.

The package is enhanced by the following packages: plfit-doc
python3-openpyxl
Python 3 module to read/write OpenXML xlsx/xlsm files
Versions of package python3-openpyxl
Release   Version       Architectures
sid       3.1.5+dfsg-1  all
stretch   2.3.0-3       all
buster    2.4.9-1       all
bookworm  3.0.9-1       all
trixie    3.1.5+dfsg-1  all
bullseye  3.0.3-1       all
Popcon: 233 users (340 upd.)*
License: DFSG free

Openpyxl is a pure Python 3 module to read/write Excel 2007 (OpenXML) xlsx/xlsm files.

This package contains the module itself.

python3-opentsne
t-Distributed Stochastic Neighbor Embedding algorithm
Versions of package python3-opentsne
Release   Version  Architectures
trixie    1.0.2-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid       1.0.2-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (0 upd.)*
License: DFSG free

A modular Python implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE), a popular dimensionality-reduction algorithm for visualizing high-dimensional data sets. openTSNE incorporates the latest improvements to the t-SNE algorithm, including the ability to add new data points to existing embeddings, massive speed improvements that enable t-SNE to scale to millions of data points, and various tricks to improve the global alignment of the resulting visualizations.

python3-plfit
fitting power-law distributions to empirical data -- Python
Versions of package python3-plfit
Release   Version     Architectures
bookworm  0.9.4+ds-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
sid       0.9.6+ds-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    0.9.6+ds-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (0 upd.)*
License: DFSG free

The plfit software fits power-law distributions to empirical (discrete or continuous) data, according to the method of Clauset, Shalizi and Newman [SIAM Review 51, 661-703 (2009)].

This package provides a Python module.

The package is enhanced by the following packages: plfit-doc
* Popularity contest results: number of people who use this package regularly (number of people who upgraded this package recently), out of 245494.