Debian Science Project
Data management
Debian Science Data Management packages

This metapackage will install packages to assist with data management tasks, such as obtaining data from remote resources, keeping data under version control, etc.

Description

If you discover a project that looks to you like a good candidate for Debian Science, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Science mailing list.

Debian Science Data management packages

Official Debian packages with high relevance

Datalad
data files management and distribution platform
Versions of package datalad
Release    Version     Architectures
buster     0.11.2-2    all
sid        0.13.3-1    all
stretch    0.4.1-1     all
upstream   0.13.4
Popcon: 34 users (7 upd.)*
Newer upstream!
License: DFSG free
Git

DataLad is a data management and distribution platform providing access to a wide range of data resources already available online. Using git-annex as its backend for data logistics, it provides the following facilities, either built in or available through additional extensions:

  • command line and Python interfaces for manipulation of collections of datasets (install, uninstall, update, publish, save, etc.) and separate files/directories (add, get)

  • extract, aggregate, and search through various sources of metadata (XMP, EXIF, etc.; install datalad-neuroimaging for DICOM, BIDS, NIfTI support)

  • crawl web sites to automatically prepare and update git-annex repositories with content from online websites, S3, etc. (install datalad-crawler)
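
As a rough illustration of the command line interface listed above, the following Python sketch assembles "datalad install" and "datalad get" calls as argument lists and only prints them unless asked to execute. The dataset URL, directory, and file names are placeholders, and the helper names (datalad_cmd, fetch_file_plan, run_plan) are invented for this example; they are not part of DataLad.

```python
import shlex
import subprocess
from typing import List


def datalad_cmd(subcommand: str, *args: str) -> List[str]:
    """Build the argument list for one DataLad command line call."""
    return ["datalad", subcommand, *args]


def fetch_file_plan(dataset_url: str, target_dir: str, relpath: str) -> List[List[str]]:
    """Commands to install a dataset and then fetch the content of one file.

    All three arguments are placeholders supplied by the caller.
    """
    return [
        # 'install' obtains the dataset into target_dir
        datalad_cmd("install", "--source", dataset_url, target_dir),
        # 'get' retrieves the content of a single file inside that dataset
        datalad_cmd("get", f"{target_dir}/{relpath}"),
    ]


def run_plan(plan: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical dataset location, used only to show the shape of the calls.
    run_plan(fetch_file_plan("https://example.org/dataset.git", "dataset", "data/sample.dat"))
```

Keeping the plan as plain lists makes it easy to review the commands before running them with dry_run=False on a machine where the datalad package is installed.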
Datalad-container
DataLad extension for working with containerized environments
Maintainer: Yaroslav Halchenko
Versions of package datalad-container
Release    Version    Architectures
sid        1.0.1-1    all
buster     0.2.2-2    all
Popcon: 20 users (5 upd.)*
License: DFSG free

This extension enhances DataLad (http://datalad.org) for working with computational containers.

Git-annex
manage files with git, without checking their contents into git
Versions of package git-annex
Release            Version                       Architectures
stretch-security   6.20170101-1+deb9u1           amd64,i386
stretch            6.20170101-1+deb9u2           amd64,arm64,i386,mips,mips64el,mipsel,ppc64el,s390x
buster             7.20190129-3                  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye           8.20200908-1                  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid                8.20200908-1                  ppc64el
sid                8.20201007-1                  amd64,arm64,armel,armhf,i386,mips64el,mipsel,s390x
wheezy             3.20120629                    amd64,armel,armhf,i386,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
wheezy-security    3.20120629+deb7u1             amd64,armel,armhf,i386
jessie             5.20141125+deb8u1             amd64,armel,armhf,i386
jessie-security    5.20141125+oops-1+deb8u2      amd64,armel,armhf,i386
Debtags of package git-annex:
devel: rcs
role: program
works-with: file
Popcon: 492 users (58 upd.)*
License: DFSG free
Git

git-annex allows managing files with git, without checking the file contents into git. While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space.

It can store large files in many places, from local hard drives to a large number of cloud storage services, including S3, WebDAV, and rsync, with a dozen cloud storage providers usable via plugins. Files can be stored encrypted with gpg, so that the cloud storage provider cannot see your data. git-annex keeps track of where each file is stored, so it knows how many copies are available, and has many facilities to ensure your data is preserved.
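
The location tracking idea described above can be pictured with a small toy model. The Python sketch below is not git-annex's implementation: the class name LocationLog, its methods, and the use of a bare SHA-256 digest as the content key are assumptions made purely for illustration. It records which repositories hold a piece of content and refuses to drop a copy when too few other copies would remain.

```python
import hashlib
from typing import Dict, Set


class LocationLog:
    """Toy model: remembers which repositories hold each piece of content."""

    def __init__(self, numcopies: int = 1) -> None:
        # Minimum number of OTHER copies that must exist before a drop is allowed.
        self.numcopies = numcopies
        self._where: Dict[str, Set[str]] = {}

    @staticmethod
    def key_for(content: bytes) -> str:
        """Derive a content key (here simply the SHA-256 hex digest)."""
        return hashlib.sha256(content).hexdigest()

    def record(self, key: str, repo: str) -> None:
        """Note that 'repo' now holds the content identified by 'key'."""
        self._where.setdefault(key, set()).add(repo)

    def copies(self, key: str) -> int:
        """Number of repositories known to hold the content."""
        return len(self._where.get(key, set()))

    def drop(self, key: str, repo: str) -> bool:
        """Remove the copy in 'repo' only if enough other copies remain."""
        holders = self._where.get(key, set())
        others = holders - {repo}
        if repo not in holders or len(others) < self.numcopies:
            return False  # refuse: the data would no longer be safely preserved
        holders.discard(repo)
        return True


if __name__ == "__main__":
    log = LocationLog(numcopies=1)
    key = LocationLog.key_for(b"large file contents")
    log.record(key, "laptop")
    print(log.drop(key, "laptop"))  # refused, this is the only known copy
    log.record(key, "s3")
    print(log.drop(key, "laptop"))  # allowed, a copy remains on "s3"
```

The repository names "laptop" and "s3" are arbitrary labels; the point is only that knowing where copies live is what makes it possible to refuse an unsafe removal.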

git-annex can also be used to keep a folder in sync between computers, noticing when files are changed, and automatically committing them to git and transferring them to other computers. The git-annex webapp makes it easy to set up and use git-annex this way.
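
For readers who prefer the command line to the webapp, the following Python sketch lists one plausible sequence of commands for putting a folder under git-annex and syncing it. It only builds and prints the commands. The helper names (git_annex, sync_plan, render) and the repository description are hypothetical, and the sketch assumes a remote has already been configured for the sync step to exchange anything.

```python
import shlex
from typing import List


def git_annex(*args: str) -> List[str]:
    """Build the argument list for one 'git annex' call."""
    return ["git", "annex", *args]


def sync_plan(repo_description: str, with_content: bool = True) -> List[List[str]]:
    """Commands to set up a folder and synchronise it with its remotes."""
    # 'sync' commits local changes and exchanges them with remotes;
    # '--content' additionally transfers the file contents themselves.
    sync = git_annex("sync", "--content") if with_content else git_annex("sync")
    return [
        ["git", "init"],                      # create the git repository
        git_annex("init", repo_description),  # enable git-annex in it
        git_annex("add", "."),                # hand the files to git-annex
        sync,
    ]


def render(plan: List[List[str]]) -> str:
    """Format the plan as shell-quoted lines for review."""
    return "\n".join(shlex.join(cmd) for cmd in plan)


if __name__ == "__main__":
    print(render(sync_plan("my laptop")))
```

Printing the plan first leaves the decision to run each command, and in which directory, with the reader.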

The package is enhanced by the following packages: elpa-git-annex, elpa-magit-annex, keysafe
*Popularity contest results: number of people who use this package regularly (number of people who upgraded this package recently) out of 199837