Debian Science Project
Summary
Linguistics
Debian Science Linguistics packages

This metapackage is part of the Debian Pure Blend "Debian Science" and installs packages related to Linguistics.

Description

If you discover a project which looks like a good candidate for Debian Science to you, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Science mailing list.

Official Debian packages with high relevance

Apertium
Shallow-transfer machine translation engine
Versions of package apertium
Release  Version  Architectures
squeeze  3.1.0-1.2  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  3.1.0-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  3.1.0-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.4.0~r61013-5  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid  3.4.2~r68466-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package apertium:
field: linguistics
role: program
Popcon: 26 users (41 upd.)
License: DFSG free
VCS: Git

An open-source shallow-transfer machine translation engine, Apertium is initially aimed at related-language pairs.

It uses finite-state transducers for lexical processing, hidden Markov models for part-of-speech tagging, and finite-state based chunking for structural transfer.
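
As a rough illustration of the hidden-Markov-model tagging step mentioned above, the sketch below runs Viterbi decoding over a toy model. The tag set, all probabilities, and the function name `viterbi` are invented for this example; none of it is taken from Apertium's code or data.

```python
import math

# Toy model: every number here is made up for illustration only.
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.4, "walk": 0.3, "park": 0.3},
    "VERB": {"walk": 0.7, "dog": 0.1, "park": 0.2},
}
SMOOTH = 1e-6  # probability assigned to words a tag has never emitted


def viterbi(words):
    """Return the most probable tag sequence for words under the toy HMM."""
    if not words:
        return []

    def emit(tag, word):
        return math.log(EMIT[tag].get(word, SMOOTH))

    # best[i][tag] = (log probability of the best path ending in tag, previous tag)
    best = [{t: (math.log(START[t]) + emit(t, words[0]), None) for t in TAGS}]
    for word in words[1:]:
        column = {}
        for t in TAGS:
            column[t] = max(
                (best[-1][p][0] + math.log(TRANS[p][t]) + emit(t, word), p)
                for p in TAGS
            )
        best.append(column)

    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]
```

In this toy model the ambiguous word "walk" is tagged as a noun after a determiner and as a verb after a noun, which is the kind of decision a part-of-speech tagger has to make.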

The system is largely based upon systems already developed by the Transducens group at the Universitat d'Alacant, such as interNOSTRUM (Spanish-Catalan, http://www.internostrum.com/welcome.php) and Traductor Universia (Spanish-Portuguese, http://traductor.universia.net).

It will be possible to use Apertium to build machine translation systems for a variety of related-language pairs simply by providing the linguistic data needed in the right format.

Apertium-lex-tools
Constraint-based lexical selection module
Versions of package apertium-lex-tools
Release  Version  Architectures
stretch  0.1.1~r66150-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  0.1.1~r66150-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 4 users (35 upd.)
License: DFSG free
VCS: Git

This package provides a module for compiling lexical selection rules and processing them in the pipeline.

Artha
Handy off-line thesaurus based on WordNet
Versions of package artha
Release  Version  Architectures
squeeze  0.9.1-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  1.0.2-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  1.0.3-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.0.3-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  1.0.3-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package artha:
field: linguistics
interface: x11
role: program
uitoolkit: gtk
use: learning
x11: application
Popcon: 61 users (42 upd.)
License: DFSG free
VCS: Svn

Artha is an off-line English thesaurus with distinct features like:

  • hot-key press word look-up (select text on any window and press a preset hot-key for look-up)
  • regular expressions based search (broaden search using wild-cards like *, ?, etc.)
  • passive desktop notifications (of word definitions for uninterrupted work-flow)
  • spelling suggestions (when the exact spelling is vague/not known)

Once launched, it monitors for a preset hot-key combination. When some text is selected in any window and the hot-key is pressed, it pops up with the word looked up. Users who prefer passive notifications can enable the notifications option.

When the term being looked up is vague or not known, the search can be broadened by using regular expressions (*, ?, etc.) in the search string. When a term is spelled incorrectly, Artha offers spelling suggestions.

For regular expressions based search to work, the wordnet-sense-index package is required.
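
The wild-card search described above can be pictured with a few lines of Python. This is only a sketch: the word list and the function name `wildcard_lookup` are hypothetical, and it uses Python's standard `fnmatch` module rather than anything from Artha, which reads its data from WordNet.

```python
import fnmatch

# Hypothetical stand-in for a word index.
WORDS = ["cat", "cot", "cut", "coat", "court", "dog"]


def wildcard_lookup(pattern, words=WORDS):
    """Return the words matching a shell-style pattern (* and ?)."""
    return [w for w in words if fnmatch.fnmatchcase(w, pattern)]
```

Here `?` stands for exactly one character and `*` for any run of characters, so `c?t` finds three-letter words while `co*t` also finds longer ones.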

Cg3
Tools for using the 3rd edition of Constraint Grammar (CG-3)
Versions of package cg3
Release  Version  Architectures
stretch  0.9.9~r11624-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  0.9.9~r11624-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 5 users (65 upd.)
License: DFSG free
VCS: Git

Constraint Grammar compiler and applicator for the 3rd edition of CG that is developed and maintained by VISL SDU and GrammarSoft ApS.

CG-3 can be used for disambiguation of morphology, syntax, semantics, etc.; dependency markup; target language lemma choice for MT; QA systems; and much more. The core idea is that you choose what to do based on the whole available context, as opposed to n-grams.
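
To make the idea of context-based disambiguation concrete, here is a toy sketch in Python. It is not CG-3 rule syntax and not how the cg3 tools are implemented; the data layout, the function `remove_reading`, and both condition functions are assumptions made for this illustration. Each word carries a set of candidate readings, and a rule removes one reading when a condition over the whole sentence holds, never deleting the last remaining reading.

```python
def remove_reading(cohorts, target, condition):
    """Remove reading `target` wherever `condition(cohorts, i)` is true.

    cohorts is a list of (word, set_of_readings). The condition sees the
    whole sentence, not just a fixed window of neighbours.
    """
    out = [(word, set(readings)) for word, readings in cohorts]
    for i, (_, readings) in enumerate(out):
        if target in readings and len(readings) > 1 and condition(out, i):
            readings.discard(target)
    return out


def after_determiner(cohorts, i):
    """True when the previous word is unambiguously a determiner."""
    return i > 0 and cohorts[i - 1][1] == {"DET"}


def finite_verb_elsewhere(cohorts, i):
    """True when some other word in the sentence is unambiguously a verb."""
    return any(r == {"VERB"} for j, (_, r) in enumerate(cohorts) if j != i)
```

The second condition looks anywhere in the sentence, which is the kind of check a fixed n-gram window cannot express.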

See http://visl.sdu.dk/cg3.html for more documentation.

Collatinus
lemmatisation of Latin text
Maintainer: Georges Khaznadar
Versions of package collatinus
Release  Version  Architectures
squeeze  9.3-4  all
wheezy  10.0-3  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  10.2-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  10.2-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  10.2-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package collatinus:
field: linguistics
interface: x11
role: dummy, program
scope: application
uitoolkit: gtk
use: learning
x11: application
Popcon: 7 users (31 upd.)
License: DFSG free

Collatinus can be used to lemmatise Latin texts, i.e. extract words and make a lexicon which indicates for each word its canonical form, and how the form actually found in the text was derived from it, for instance by declining it. Example: rosam gives rosa-rosae -- acc. sing. Collatinus provides a nice graphic front-end to each operation.
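
The rosam example can be reproduced with a deliberately tiny sketch. The ending table below covers only part of the Latin first declension, the one-entry lexicon is hypothetical, and the function `lemmatise` is an illustration of the idea rather than Collatinus's actual algorithm or data.

```python
# Some first-declension endings with their analyses (illustrative subset).
ENDINGS = [
    ("a", "nom. sing."),
    ("am", "acc. sing."),
    ("ae", "gen. sing."),
    ("ae", "dat. sing."),
    ("ae", "nom. plur."),
    ("as", "acc. plur."),
    ("arum", "gen. plur."),
]

# Hypothetical lexicon mapping a stem to its dictionary entry.
LEXICON = {"ros": "rosa-rosae"}


def lemmatise(form):
    """Return every (lemma, analysis) pair compatible with the given form."""
    results = []
    for ending, analysis in ENDINGS:
        if form.endswith(ending):
            stem = form[: -len(ending)]
            if stem in LEXICON:
                results.append((LEXICON[stem], analysis))
    return results
```

A form such as rosae is ambiguous, so the function returns several analyses for it instead of picking one.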

Collatinus-nouus (stands for Collatinus, new generation) replaces every previous version of Collatinus.

This package provides documentation in HTML format.

Dimbl
Distributed Memory Based Learner
Versions of package dimbl
Release  Version  Architectures
wheezy  0.11-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.12-2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.12-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  0.12-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  0.15
Debtags of package dimbl:
role: program
Popcon: 5 users (36 upd.)
Newer upstream!
License: DFSG free
VCS: Svn

Dimbl is a wrapper around the k-nearest neighbor classifier in TiMBL, offering parallel classification on multi-CPU machines. Dimbl splits the original training set, builds separate TiMBL classifiers per training subset, and merges their nearest-neighbor sets per classified instance.

Dimbl's features are:

  • Wraps neatly around TiMBL, retaining all command line options;
  • Knows what to do with your multiple, duo, or quad cores;
  • Makes use of the OpenMP specification for parallel programming;
  • Can attain superlinear speed gains compared to standard TiMBL.
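
The split-and-merge scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Dimbl's code: the function names are hypothetical, a simple overlap distance stands in for TiMBL's metrics, and the loop over shards runs sequentially where Dimbl would use OpenMP threads.

```python
import heapq
from collections import Counter


def overlap_distance(a, b):
    """Number of feature positions in which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest(shard, instance, k):
    """The k nearest (distance, label) pairs within one training subset."""
    return heapq.nsmallest(
        k, ((overlap_distance(features, instance), label) for features, label in shard)
    )


def classify_sharded(training, instance, k=3, n_shards=2):
    """Classify by merging per-shard nearest-neighbour sets, then voting."""
    shards = [training[i::n_shards] for i in range(n_shards)]
    candidates = []
    for shard in shards:  # sequential stand-in for parallel workers
        candidates.extend(nearest(shard, instance, k))
    merged = heapq.nsmallest(k, candidates)
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]
```

Because each shard contributes its own k best candidates, the merged top k has the same distances as a search over the undivided training set.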

Dimbl is a product of the ILK Research Group (Tilburg University, The Netherlands).

If you do scientific research in Natural Language Processing using the Memory-Based Learning technique, Dimbl will likely be of use to you.

Frog
tagger and parser for natural languages (runtime)
Versions of package frog
Release  Version  Architectures
wheezy  0.12.15-3  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.12.17-7.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid  0.12.20-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  0.13.0
Popcon: 3 users (1 upd.)
Newer upstream!
License: DFSG free
VCS: Git

Memory-Based Learning (MBL) is a machine-learning method applicable to a wide range of tasks in Natural Language Processing (NLP).

Frog is a modular system integrating a morphosyntactic tagger, lemmatizer, morphological analyzer, and dependency parser for natural languages. It is based upon its predecessor TADPOLE (TAgger, Dependency Parser, and mOrphoLogical analyzEr). Using Memory-Based Learning techniques, Frog tokenizes, tags, lemmatizes, and morphologically segments word tokens in incoming UTF-8 text files, and assigns a dependency graph to each sentence. Frog is particularly targeted at the increasing need for fast, automatic NLP systems applicable to very large (multi-million to billion word) document collections that are becoming available due to the progressive digitization of both new and old textual data. Up to now, Frog has only been tested and used with corpora of Dutch natural language (see the frogdata package for samples).

Frog is a product of the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

If you do scientific research in NLP, Frog will likely be of use to you.

Libcld2-dev
Compact Language Detector 2, development package
Versions of package libcld2-dev
Release  Version  Architectures
stretch  0.0.0-git20150806-5  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
sid  0.0.0-git20150806-5  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips64el,mipsel,ppc64el
Popcon: 1 users (44 upd.)
License: DFSG free
VCS: Git

Detects over 80 languages in UTF-8 text, based largely on groups of four letters. Tables for a version covering 160+ languages are also included.
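
Detection based on groups of four letters can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the functions `quadgrams`, `train` and `detect`, the simple overlap score, and the idea of building profiles from sample text. It does not reflect CLD2's actual tables, scoring, or API.

```python
from collections import Counter


def quadgrams(text):
    """Count overlapping four-character sequences in normalised text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + 4] for i in range(len(text) - 3))


def train(samples):
    """Build one quadgram profile per language from sample text."""
    return {lang: quadgrams(text) for lang, text in samples.items()}


def detect(text, profiles):
    """Return the language whose profile shares the most quadgrams with text."""
    grams = quadgrams(text)

    def score(profile):
        return sum(count for gram, count in grams.items() if gram in profile)

    return max(profiles, key=lambda lang: score(profiles[lang]))
```

A real detector needs far larger profiles and a probabilistic score, but the principle of matching short character sequences against per-language tables is the same.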

This is the development package.

Link-grammar
Carnegie Mellon University's link grammar parser
Versions of package link-grammar
Release  Version  Architectures
squeeze  4.6.7-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  4.7.4-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  4.7.4-2  amd64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  5.3.8-2  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  5.3.8-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  5.3.9
Debtags of package link-grammar:
field: linguistics
interface: commandline
role: program
use: checking
works-with: dictionary
Popcon: 13 users (54 upd.)
Newer upstream!
License: DFSG free
VCS: Git

In Sleator, D. and Temperley, D. "Parsing English with a Link Grammar" (1991), the authors defined a new formal grammatical system called a "link grammar". A sequence of words is in the language of a link grammar if there is a way to draw "links" between words in such a way that the local requirements of each word are satisfied, the links do not cross, and the words form a connected graph. The authors encoded English grammar into such a system, and wrote this program to parse English using this grammar.
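
Two of the three conditions above, that links do not cross and that the words form a connected graph, are easy to check mechanically. The Python sketch below does only that; it ignores the per-word linking requirements, and the function names and the representation of links as pairs of word positions are assumptions for this illustration, not the link-grammar library's API.

```python
def links_cross(a, b):
    """True when two links drawn above the sentence would intersect."""
    (a1, a2), (b1, b2) = sorted(a), sorted(b)
    return a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2


def is_valid_linkage(n_words, links):
    """Check planarity and connectivity of links over word positions 0..n_words-1."""
    links = [tuple(sorted(link)) for link in links]
    for x in range(len(links)):
        for y in range(x + 1, len(links)):
            if links_cross(links[x], links[y]):
                return False
    if n_words == 0:
        return True
    adjacency = {i: set() for i in range(n_words)}
    for i, j in links:
        adjacency[i].add(j)
        adjacency[j].add(i)
    # Depth-first search from the first word must reach every word.
    seen, stack = {0}, [0]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return len(seen) == n_words
```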

link-grammar can be used for linguistic parsing for information retrieval or extraction from natural language documents. It can also be used as a grammar checker.

This package contains the user-executable binary.

Mbt
memory-based tagger-generator and tagger
Versions of package mbt
Release  Version  Architectures
wheezy  3.2.8-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  3.2.10-4  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.2.16-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  3.2.16-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package mbt:
field: linguistics
role: program
Popcon: 4 users (54 upd.)
License: DFSG free
VCS: Git

MBT is a memory-based tagger-generator and tagger in one. The tagger-generator part can generate a sequence tagger on the basis of a training set of tagged sequences; the tagger part can tag new sequences. MBT can, for instance, be used to generate part-of-speech taggers or chunkers for natural language processing. Features:

  • Tagger generation: tagged text in, tagger out,
  • Optional feedback loop: feed previous tag decision back to input of next decision,
  • Easily customizable feature representation; can incorporate user-provided features,
  • Automatic generation of separate sub-taggers for known words and unknown words,
  • Can make use of full algorithmic parameters of TiMBL.

MBT is a product of the Centre for Language and Speech Technology (Radboud University Nijmegen, The Netherlands), the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

If you do scientific research in natural language processing, MBT will likely be of use to you.

Mbtserver
Server extensions for the MBT tagger
Versions of package mbtserver
Release  Version  Architectures
wheezy  0.5-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.7-3  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.11-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  0.11-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 3 users (54 upd.)
License: DFSG free
VCS: Git

MbtServer extends Mbt with a server layer, running as a TCP server. Mbt is a memory-based tagger-generator and tagger for natural language processing. MbtServer provides the possibility to access a trained tagger from multiple sessions. It also allows one to run and access different taggers in parallel.

MbtServer is a product of the Centre for Language and Speech Technology (Radboud University Nijmegen, The Netherlands), the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

If you do scientific research in natural language processing, MbtServer will likely be of use to you.

Timbl
Tilburg Memory Based Learner
Versions of package timbl
Release  Version  Architectures
wheezy  6.4.2-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  6.4.4-4  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  6.4.8-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  6.4.8-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package timbl:
role: program
Popcon: 3 users (78 upd.)
License: DFSG free
VCS: Git

Memory-Based Learning (MBL) is a machine-learning method applicable to a wide range of tasks in Natural Language Processing (NLP).

The Tilburg Memory Based Learner, TiMBL, is a tool for NLP research, and for many other domains where classification tasks are learned from examples. It is an efficient implementation of the k-nearest neighbor classifier.

TiMBL's features are:

  • Fast, decision-tree-based implementation of k-nearest neighbor classification;
  • Implementations of IB1 and IB2, IGTree, TRIBL, and TRIBL2 algorithms;
  • Similarity metrics: Overlap, MVDM, Jeffrey Divergence, Dot product, Cosine;
  • Feature weighting metrics: information gain, gain ratio, chi squared, shared variance;
  • Distance weighting metrics: inverse, inverse linear, exponential decay;
  • Extensive verbosity options to inspect nearest neighbor sets;
  • Server functionality and extensive API;
  • Fast leave-one-out testing and internal cross-validation;
  • Handles user-defined example weighting.
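
As an illustration of two items from this list, the overlap similarity metric and information-gain feature weighting, here is a minimal Python sketch of a nearest-neighbour classifier over symbolic features. The function names are hypothetical and the code is a plain list scan; it does not use TiMBL's decision-tree index, its API, or its exact treatment of k.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(training, feature_index):
    """Reduction in class entropy obtained by knowing one feature's value."""
    by_value = defaultdict(list)
    for features, label in training:
        by_value[features[feature_index]].append(label)
    remainder = sum(
        len(subset) / len(training) * entropy(subset) for subset in by_value.values()
    )
    return entropy([label for _, label in training]) - remainder


def classify(training, instance, k=1):
    """Vote among the k examples closest under weighted overlap distance."""
    weights = [information_gain(training, i) for i in range(len(instance))]

    def distance(features):
        # A mismatching feature costs its information-gain weight.
        return sum(w for w, a, b in zip(weights, features, instance) if a != b)

    neighbours = sorted(training, key=lambda example: distance(example[0]))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]
```

Weighting means that a mismatch on an uninformative feature costs nothing, so an unseen value in such a feature does not disturb the classification.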

TiMBL is a product of the Centre for Language and Speech Technology (Radboud University Nijmegen, The Netherlands), the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

If you do scientific research in NLP, TiMBL will likely be of use to you.

Timblserver
Server extensions for Timbl
Versions of package timblserver
Release  Version  Architectures
wheezy  1.4-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  1.7-4  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.11-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  1.11-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package timblserver:
role: program
Popcon: 2 users (54 upd.)
License: DFSG free
VCS: Git

timblserver is a TiMBL wrapper; it adds server functionality to TiMBL. It allows TiMBL to run multiple experiments as a TCP server, optionally via HTTP.

The Tilburg Memory Based Learner, TiMBL, is a tool for Natural Language Processing research, and for many other domains where classification tasks are learned from examples.

TimblServer is a product of the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

If you do scientific research in NLP, TimblServer will likely be of use to you.

Ucto
Unicode Tokenizer
Versions of package ucto
Release  Version  Architectures
wheezy  0.5.2-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.5.3-3.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid  0.8.0-2  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
upstream  0.8.2
Debtags of package ucto:
role: program
Popcon: 7 users (0 upd.)
Newer upstream!
License: DFSG free
VCS: Git

Ucto can tokenize UTF-8 encoded text files (i.e. separate words from punctuation, split sentences, generate n-grams), and offers several other basic preprocessing steps (change case, count words/characters and reverse lines) that make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation.
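
The basic steps named above (separating words from punctuation, splitting sentences, generating n-grams) can be approximated with a short Python sketch. The regular expression and the three function names are assumptions for this example; Ucto itself is driven by language-specific rule files and handles many more cases.

```python
import re

# A word (optionally with inner hyphens or apostrophes) or one punctuation mark.
# In Python 3, \w matches Unicode letters and digits in str patterns.
TOKEN = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")
SENTENCE_END = {".", "!", "?"}


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN.findall(text)


def split_sentences(tokens):
    """Group tokens into sentences, ending each at ., ! or ?."""
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        if token in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

This naive sentence splitter would break on abbreviations such as "e.g.", which is one reason a dedicated tokenizer is worth having.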

Ucto was written by Maarten van Gompel and Ko van der Sloot. Work on Ucto was funded by NWO, the Netherlands Organisation for Scientific Research, under the Implicit Linguistics project and the CLARIN-NL program.

If you are interested in machine parsing of UTF-8 encoded text files, e.g. to do scientific research in natural language processing, Ucto will likely be of use to you.

Wordnet
electronic lexical database of English language
Versions of package wordnet
Release  Version  Architectures
squeeze  3.0-24  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  3.0-29  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  3.0-33  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  3.0-33  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid  3.0-33  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package wordnet:
field: linguistics
interface: x11
role: program
scope: application
uitoolkit: tk
use: checking
works-with: dictionary
x11: application
Popcon: 140 users (93 upd.)
License: DFSG free
VCS: Svn

WordNet(C) is an on-line lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets.
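
The organisation into synonym sets linked by relations can be pictured as a small data structure. The Python sketch below is purely illustrative: the three synsets, their identifiers, and the helper functions are invented here and do not reflect WordNet's database format or its tools.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """One lexical concept: its synonymous words and links to broader concepts."""
    name: str
    lemmas: list
    hypernyms: list = field(default_factory=list)  # names of more general synsets


# Invented toy data.
SYNSETS = {
    "dog.n": Synset("dog.n", ["dog", "domestic dog"], ["canine.n"]),
    "canine.n": Synset("canine.n", ["canine", "canid"], ["animal.n"]),
    "animal.n": Synset("animal.n", ["animal", "beast"]),
}


def synonyms(word):
    """All other words sharing a synset with the given word."""
    return sorted(
        {l for s in SYNSETS.values() if word in s.lemmas for l in s.lemmas if l != word}
    )


def hypernym_chain(name):
    """Follow the first hypernym link upwards until a top concept is reached."""
    chain = []
    while SYNSETS[name].hypernyms:
        name = SYNSETS[name].hypernyms[0]
        chain.append(name)
    return chain
```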

WordNet was developed by the Cognitive Science Laboratory at Princeton University under the direction of Professor George A. Miller (Principal Investigator).

WordNet is considered to be the most important resource available to researchers in computational linguistics, text analysis, and many related areas.

This package contains the WordNet binary and its manpages, as well as general manpages.

Please cite: George A. Miller: WordNet: A Lexical Database for English. Communications of the ACM 38(11):39-41 (1995)

Official Debian packages with lower relevance

Apertium-af-nl
Apertium translation data for the Afrikaans-Dutch pair
Versions of package apertium-af-nl
Release  Version  Architectures
stretch  0.2.0~r58256-1  all
sid  0.2.0~r58256-1  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Afrikaans and Dutch languages.

Apertium-apy
Apertium APY service
Versions of package apertium-apy
Release  Version  Architectures
stretch  0.9.1~r343-1  all
sid  0.9.1~r343-1  all
Popcon: 4 users (0 upd.)
License: DFSG free
VCS: Git

This package contains Apertium APY, a simple Apertium API written in Python 3 and meant as a drop-in replacement for ScaleMT.

Apertium-br-fr
Apertium linguistic data to translate between Breton and French
Versions of package apertium-br-fr
Release  Version  Architectures
stretch  0.5.0~r61325-2  all
sid  0.5.0~r61325-2  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

This is a linguistic package for the Apertium shallow-transfer machine translation system. The package can be used to translate between Breton and French.

Apertium-cat
Apertium single language data for Catalan
Versions of package apertium-cat
Release  Version  Architectures
stretch  1.0.0~r65787-1  all
sid  1.0.0~r65787-1  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for Catalan.

Apertium-cy-en
Apertium translation data for the Welsh-English pair
Versions of package apertium-cy-en
Release  Version  Architectures
stretch  0.1.1~r57554-3  all
sid  0.1.1~r57554-3  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Welsh and English languages.

Apertium-dan
Apertium single language data for Danish
Versions of package apertium-dan
Release  Version  Architectures
stretch  0.5.0~r67099-1  all
sid  0.5.0~r67099-1  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for Danish.

Apertium-dan-nor
Apertium translation data for the Danish-Norwegian pair
Versions of package apertium-dan-nor
Release  Version  Architectures
stretch  1.3.0~r67099-1  all
sid  1.3.0~r67099-1  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Danish and Norwegian languages.

Apertium-en-ca
Apertium translation data for the English-Catalan pair
Versions of package apertium-en-ca
Release  Version  Architectures
squeeze  0.8.9-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  0.8.9-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.8.9-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.9.3~r61328-1  all
sid  0.9.3~r61328-1  all
Debtags of package apertium-en-ca:
culture: catalan
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the English and Catalan languages.

Apertium-en-es
Apertium translation data for the English-Spanish pair
Versions of package apertium-en-es
Release  Version  Architectures
squeeze  0.6.0-1.1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  0.6.0-1.1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.6.0-1.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.8.0~r57502-2  all
sid  0.8.0~r57502-2  all
Debtags of package apertium-en-es:
culture: spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the English and Spanish languages.

Apertium-eo-ca
Apertium translation data for the Esperanto-Catalan pair
Versions of package apertium-eo-ca
Release  Version  Architectures
squeeze  0.9.0-1.1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  0.9.0-1.1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.9.0-1.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.9.1~r60655-1  all
sid  0.9.1~r60655-1  all
Debtags of package apertium-eo-ca:
culture: catalan, esperanto
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Esperanto and Catalan languages.

Apertium-eo-es
Apertium translation data for the Esperanto-Spanish pair
Versions of package apertium-eo-es
Release  Version  Architectures
squeeze  0.9.0-1.1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  0.9.0-1.1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.9.0-1.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.9.1~r60655-1  all
sid  0.9.1~r60655-1  all
Debtags of package apertium-eo-es:
culture: esperanto, spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Esperanto and Spanish languages.

Apertium-es-ca
Apertium translation data for the Spanish-Catalan pair
Versions of package apertium-es-ca
Release  Version  Architectures
squeeze  1.1.0-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  1.1.0-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  1.1.0-1.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.2.1+svn~57448-4  all
sid  1.2.1+svn~57448-4  all
Debtags of package apertium-es-ca:
culture: catalan, spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Spanish and Catalan languages.

Apertium-es-gl
Apertium translation data for the Spanish-Galician pair
Versions of package apertium-es-gl
Release  Version  Architectures
squeeze  1.0.7-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  1.0.7-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  1.0.7-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.0.8~r57542-2  all
sid  1.0.8~r57542-2  all
Debtags of package apertium-es-gl:
culture: galician, spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Spanish and Galician languages.

Apertium-es-pt
Apertium translation data for the Spanish-Portuguese pair
Versions of package apertium-es-pt
Release  Version  Architectures
squeeze  1.0.3-2.1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  1.0.3-2.1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  1.0.3-2.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.1.5+svn~57507-3  all
sid  1.1.5+svn~57507-3  all
Debtags of package apertium-es-pt:
culture: esperanto, portuguese, spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Spanish and Portuguese languages.

Apertium-es-ro
Apertium translation data for the Spanish-Romanian pair
Versions of package apertium-es-ro
Release  Version  Architectures
squeeze  0.7.1-2.1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  0.7.1-2.1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.7.1-2.1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.7.3~r57551-2  all
sid  0.7.3~r57551-2  all
Debtags of package apertium-es-ro:
culture: romanian, spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Spanish and Romanian languages.

Apertium-eu-es
Apertium translation data for the Basque-Spanish pair
Versions of package apertium-eu-es
Release  Version  Architectures
squeeze  0.3.1-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  0.3.1-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.3.1-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.3.3~r56159-2  all
sid  0.3.3~r56159-2  all
Debtags of package apertium-eu-es:
culture: basque, spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the Basque and Spanish languages.

Apertium-fr-ca
Transitional dummy package for apertium-fra-cat
Versions of package apertium-fr-ca
Release  Version  Architectures
squeeze  1.0.2-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  1.0.2-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  1.0.2-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.1.0~r64309-1  all
sid  1.1.0~r64309-1  all
Debtags of package apertium-fr-ca:
culture: catalan, french
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

This is a transitional dummy package. It can safely be removed.

Apertium-fr-es
Apertium translation data for the French-Spanish pair
Versions of package apertium-fr-es
Release  Version  Architectures
squeeze  0.9.0-1  amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy  0.9.0-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie  0.9.0-1  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.9.2~r61322-2  all
sid  0.9.2~r61322-2  all
Debtags of package apertium-fr-es:
culture: french, spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the French and Spanish languages.

Apertium-fra
Apertium single language data for French
Versions of package apertium-fra
Release  Version  Architectures
stretch  1.0.0~r65786-1  all
sid  1.0.0~r65786-1  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for French.

Apertium-fra-cat
Apertium translation data for the French-Catalan pair
Versions of package apertium-fra-cat
Release  Version  Architectures
stretch  1.1.0~r64309-1  all
sid  1.1.0~r64309-1  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for translating between the French and Catalan languages.

Apertium-nno
Apertium single language data for Norwegian Nynorsk
Versions of package apertium-nno
Release  Version  Architectures
stretch  0.9.0~r69513-1  all
sid  0.9.0~r69513-1  all
Popcon: 0 users (0 upd.)
License: DFSG free
VCS: Git

Data package providing Apertium language resources for Norwegian Nynorsk.

Apertium-nno-nob
Apertium translation data for the Norwegian Nynorsk-Norwegian Bokmål pair
Versions of package apertium-nno-nob
Release  Version         Architectures
stretch  1.1.0~r66076-1  all
sid      1.1.0~r66076-1  all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Data package providing Apertium language resources for translating between the Norwegian Nynorsk and Norwegian Bokmål languages.

Apertium-nob
Apertium single language data for Norwegian Bokmål
Versions of package apertium-nob
Release  Version         Architectures
stretch  0.9.0~r69513-1  all
sid      0.9.0~r69513-1  all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Data package providing Apertium language resources for Norwegian Bokmål.

Apertium-oc-ca
Apertium translation data for the Occitan-Catalan pair
Versions of package apertium-oc-ca
Release  Version         Architectures
squeeze  1.0.5-1.1       amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   1.0.5-1.1       amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.0.5-1.1       amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.0.6~r57551-2  all
sid      1.0.6~r57551-2  all
Debtags of package apertium-oc-ca:
culture: catalan
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Data package providing Apertium language resources for translating between the Occitan and Catalan languages.

Apertium-oc-es
Apertium translation data for the Occitan-Spanish pair
Versions of package apertium-oc-es
Release  Version         Architectures
squeeze  1.0.5-1.1       amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   1.0.5-1.1       amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   1.0.5-1.1       amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  1.0.6~r57551-2  all
sid      1.0.6~r57551-2  all
Debtags of package apertium-oc-es:
culture: spanish
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Data package providing Apertium language resources for translating between the Occitan and Spanish languages.

Apertium-pt-ca
Apertium translation data for the Portuguese-Catalan pair
Versions of package apertium-pt-ca
Release  Version            Architectures
squeeze  0.8.1-1            amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   0.8.1-1            amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   0.8.1-1            amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.8.2+svn~57507-3  all
sid      0.8.2+svn~57507-3  all
Debtags of package apertium-pt-ca:
culture: catalan, portuguese
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Data package providing Apertium language resources for translating between the Portuguese and Catalan languages.

Apertium-pt-gl
Apertium translation data for the Portuguese-Galician pair
Versions of package apertium-pt-gl
Release  Version         Architectures
squeeze  0.9.1-1         amd64,armel,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,sparc
wheezy   0.9.1-1         amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
jessie   0.9.1-1         amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.9.2~r57551-2  all
sid      0.9.2~r57551-2  all
Debtags of package apertium-pt-gl:
culture: galician, portuguese
field: linguistics
role: app-data
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Data package providing Apertium language resources for translating between the Portuguese and Galician languages.

Apertium-swe
Apertium single language data for Swedish
Versions of package apertium-swe
Release  Version         Architectures
stretch  0.7.0~r69513-1  all
sid      0.7.0~r69513-1  all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Data package providing Apertium language resources for Swedish.

Apertium-swe-dan
Apertium translation data for the Swedish-Danish pair
Versions of package apertium-swe-dan
Release  Version         Architectures
stretch  0.7.0~r66063-1  all
sid      0.7.0~r66063-1  all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Data package providing Apertium language resources for translating between the Swedish and Danish languages.

Frogdata
Data files for Frog
Versions of package frogdata
Release   Version  Architectures
wheezy    0.3-2    all
jessie    0.4-1    all
stretch   0.4-1    all
sid       0.4-1    all
upstream  latest
Popcon: 0 users (0 upd.)*
Newer upstream!
License: DFSG free
Svn

Frog is a modular system integrating a morphosyntactic tagger, lemmatizer, morphological analyzer, and dependency parser for the Dutch language.

This package provides the data files necessary for running Frog.

Frog is a product of the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

Libcg3-dev
Headers and shared files to develop using the CG-3 library
Versions of package libcg3-dev
Release  Version         Architectures
stretch  0.9.9~r11624-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.9.9~r11624-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Development files to use the CG-3 API.

It is recommended to instrument the CLI tools instead of using this API.

See http://visl.sdu.dk/cg3.html for more documentation.

Libfolia-dev
implementation of the FoLiA document format
Versions of package libfolia-dev
Release  Version   Architectures
jessie   0.10-4.2  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
sid      0.13-2    amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package libfolia-dev:
devel: library
role: devel-lib
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

FoLiA is an XML-based format for Linguistic Annotation suitable for representing written language resources such as corpora. Its goal is to unify a variety of linguistic annotations in one single rich format, without committing to any particular standard annotation set. Instead, it seeks to accommodate any desired system or tagset, and so offer maximum flexibility. This makes FoLiA language independent. See http://ilk.uvt.nl/folia/ for more information.

libfolia is a product of the ILK Research Group, Tilburg University (The Netherlands). Work on libfolia is funded by NWO, the Netherlands Organisation for Scientific Research.

This package provides the FoLiA header files required to compile C++ programs that use libfolia.

Libmbt0-dev
memory-based tagger-generator and tagger - development
Versions of package libmbt0-dev
Release  Version  Architectures
wheezy   3.2.8-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
Debtags of package libmbt0-dev:
devel: library
role: devel-lib
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Svn

MBT is a memory-based tagger-generator and tagger in one. The tagger-generator part can generate a sequence tagger on the basis of a training set of tagged sequences; the tagger part can tag new sequences. MBT can, for instance, be used to generate part-of-speech taggers or chunkers for natural language processing.

MBT is a product of the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

If you do scientific research in natural language processing, MBT will likely be of use to you.

This package provides the header files required to compile C++ programs that use libmbt.
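
The two roles described above (generating a tagger from tagged sequences, then tagging new sequences) can be illustrated with a small Python sketch. This is not MBT's API or algorithm: the function names, the toy training data, and the simple back-off from a (previous tag, word) context to the word alone are assumptions chosen only to show the idea of tagging from stored examples.

```python
from collections import Counter, defaultdict

def train_tagger(tagged_sentences):
    """Store tag counts per (previous tag, word) context and per word alone."""
    by_context = defaultdict(Counter)
    by_word = defaultdict(Counter)
    all_tags = Counter()
    for sentence in tagged_sentences:
        prev = "<s>"  # hypothetical start-of-sentence marker
        for word, tag in sentence:
            by_context[(prev, word)][tag] += 1
            by_word[word][tag] += 1
            all_tags[tag] += 1
            prev = tag
    return by_context, by_word, all_tags

def tag(model, words):
    """Tag a new sequence, backing off when a context was never stored."""
    by_context, by_word, all_tags = model
    prev, result = "<s>", []
    for word in words:
        if (prev, word) in by_context:
            counts = by_context[(prev, word)]
        elif word in by_word:
            counts = by_word[word]
        else:
            counts = all_tags  # unknown word: fall back to overall tag counts
        prev = counts.most_common(1)[0][0]
        result.append((word, prev))
    return result

# Hypothetical training set of tagged sequences.
training = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train_tagger(training)
print(tag(model, ["a", "cat", "barks"]))
```

The sentence "a cat barks" never occurs in the training data, but each of its (previous tag, word) contexts does, so the sketch can tag it from memory.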

Libticcutils2-dev
library for TiCC software - development files
Versions of package libticcutils2-dev
Release  Version   Architectures
jessie   0.4-5.1   amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  0.13.1-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      0.13.1-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package libticcutils2-dev:
devel: library
role: devel-lib
Popcon: 0 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

The TiCC utils C++ library contains useful functions and other goodies for general use in TiMBL and other parts of the TiCC software stack and beyond.

TiCC utils is a product of the Tilburg centre for Cognition and Communication (Tilburg University, The Netherlands). If you do scientific research in Natural Language Processing, TiCC software will likely be of use to you.

This package provides the header files required to compile C++ programs that use libticcutils.

Libtimbl3-dev
Tilburg Memory Based Learner - development
Versions of package libtimbl3-dev
Release  Version  Architectures
wheezy   6.4.2-1  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
Debtags of package libtimbl3-dev:
devel: library
role: devel-lib
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Svn

The Tilburg Memory Based Learner, TiMBL, is a tool for Natural Language Processing research, and for many other domains where classification tasks are learned from examples. It is an efficient implementation of the k-nearest neighbor classifier.

TiMBL is a product of the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

This package provides the TiMBL header files required to compile C++ programs that use TiMBL.

Libtimbl4-dev
Tilburg Memory Based Learner - development
Versions of package libtimbl4-dev
Release  Version  Architectures
jessie   6.4.4-4  amd64,arm64,armel,armhf,i386,mips,mipsel,powerpc,ppc64el,s390x
stretch  6.4.8-1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
sid      6.4.8-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Debtags of package libtimbl4-dev:
devel: library
role: devel-lib
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The Tilburg Memory Based Learner, TiMBL, is a tool for Natural Language Processing research, and for many other domains where classification tasks are learned from examples. It is an efficient implementation of the k-nearest neighbor classifier.

TiMBL is a product of the Centre of Language and Speech Technology (Radboud University, Nijmegen, The Netherlands), the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

This package provides the TiMBL header files required to compile C++ programs that use TiMBL.
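
As a rough illustration of memory-based classification with a k-nearest neighbor rule, here is a minimal Python sketch. It does not use TiMBL's API and makes no claim about TiMBL's internals: the function names, the toy feature vectors, and the plain overlap distance (counting mismatching feature values) are assumptions made for the example.

```python
from collections import Counter

def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def knn_classify(memory, instance, k=1):
    """Return the majority label among the k stored examples nearest to instance."""
    # sorted() is stable, so examples at equal distance keep their stored order.
    nearest = sorted(memory, key=lambda ex: overlap_distance(ex[0], instance))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical stored examples: (previous word, word suffix, capitalized?) -> tag
memory = [
    (("the", "og", "no"), "NOUN"),
    (("the", "at", "no"), "NOUN"),
    (("to", "un", "no"), "VERB"),
    (("to", "at", "no"), "VERB"),
    (("a", "ed", "no"), "ADJ"),
]

print(knn_classify(memory, ("the", "un", "no"), k=3))
```

Learning here consists only of storing examples; all the work happens at classification time, when the new instance is compared against the stored ones.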

Libtimblserver2-dev
Server extensions for Timbl - development
Versions of package libtimblserver2-dev
Release  Version  Architectures
wheezy   1.4-2    amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
Debtags of package libtimblserver2-dev:
devel: library
role: devel-lib
Popcon: users ( upd.)*
Versions and Archs
License: DFSG free
Svn

timblserver is a TiMBL wrapper; it adds server functionality to TiMBL. It allows TiMBL to run multiple experiments as a TCP server, optionally via HTTP.

The Tilburg Memory Based Learner, TiMBL, is a tool for Natural Language Processing research, and for many other domains where classification tasks are learned from examples.

TimblServer is a product of the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).

This package provides the header files required to compile C++ programs that use timblserver.

Libucto1-dev
Unicode Tokenizer - development
Versions of package libucto1-dev
Release   Version  Architectures
wheezy    0.5.2-2  amd64,armel,armhf,i386,ia64,kfreebsd-amd64,kfreebsd-i386,mips,mipsel,powerpc,s390,s390x,sparc
upstream  0.8.2
Debtags of package libucto1-dev:
devel: library
role: devel-lib
Popcon: 0 users (0 upd.)*
Newer upstream!
License: DFSG free
Svn

Ucto can tokenize UTF-8 encoded text files (i.e. separate words from punctuation, split sentences, generate n-grams), and offers several other basic preprocessing steps (change case, count words/characters and reverse lines) that make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation.

Ucto is a product of the ILK Research Group, Tilburg University (The Netherlands).

This package provides the ucto header files required to compile C++ programs that use ucto.
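
To make the preprocessing steps named above concrete (separating words from punctuation, splitting sentences, generating n-grams), here is a small Python sketch using only the standard library. It is not Ucto's implementation or API: the regular expressions and function names are assumptions for illustration, and the sentence rule is deliberately naive (it would, for example, split after abbreviations).

```python
import re

# A token is either a run of word characters or a single punctuation mark.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")
# Naive rule: a sentence ends at whitespace that follows '.', '!' or '?'.
SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+")

def split_sentences(text):
    """Split text into sentences at sentence-final punctuation."""
    return [s for s in SENTENCE_END_RE.split(text.strip()) if s]

def tokenize(sentence):
    """Separate words from punctuation marks."""
    return TOKEN_RE.findall(sentence)

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

text = "Ucto splits text. It separates words, too!"
for sentence in split_sentences(text):
    tokens = tokenize(sentence)
    print(tokens)
    print(ngrams(tokens, 2))
```

Because Python's `\w` matches Unicode word characters in `str` patterns, the same sketch also handles accented and non-Latin text once the UTF-8 input has been decoded.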

Python-nltk
Python libraries for natural language processing
Versions of package python-nltk
Release  Version  Architectures
jessie   3.0.0-1  all
stretch  3.2.1-2  all
sid      3.2.1-2  all
Popcon: 14 users (17 upd.)*
Versions and Archs
License: DFSG free
Git

The Natural Language Toolkit (NLTK) is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

This package contains the modules for Python 2.

Python3-nltk
Python3 libraries for natural language processing
Versions of package python3-nltk
Release  Version  Architectures
jessie   3.0.0-1  all
stretch  3.2.1-2  all
sid      3.2.1-2  all
Popcon: 4 users (8 upd.)*
Versions and Archs
License: DFSG free
Git

The Natural Language Toolkit (NLTK) is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

This package contains the modules for Python 3.

Debian packages in experimental

Sequitur-g2p
Grapheme to Phoneme conversion tool
Maintainer: Giulio Paci
Versions of package sequitur-g2p
Release       Version       Architectures
experimental  0+r1668.r3-1  amd64,arm64,armel,armhf,hurd-i386,i386,kfreebsd-amd64,kfreebsd-i386,mips,mips64el,mipsel,powerpc,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Sequitur G2P is a data-driven grapheme-to-phoneme converter. It can be applied to any monotonous sequence translation problem, provided the source and target alphabets are small (fewer than 255 symbols). Data-driven means that you need to train it with example pronunciations. Training takes a pronunciation dictionary and creates a model file. The model file can then be used to transcribe words that were not in the dictionary.

Packaging has started and developers might try the packaging code in VCS

Travatar
tree based machine translation toolkit
Versions of package travatar
Release  Version              Architectures
VCS      0.1.0+git20131221-1  all
Versions and Archs
License: LGPL-3.0+
Debian package not available
Git
Version: 0.1.0+git20131221-1

Travatar is a tree-based statistical machine translation system that includes Tree-to-String (T2S) and Forest-to-String (F2S) translation.

Tree-based translation uses syntax trees of natural language and is particularly effective for language pairs that require a large amount of reordering, such as English-Japanese translation.

No known packages available

Wnsqlbuilder
SQL version of WordNet 3.0
License: GPL
Debian package not available

WordNet SQL Builder is a Java utility that generates an SQL database from the standard WordNet database as released by the WordNet Project (Princeton University).

Features

  • Support for MySQL and PostgreSQL.
  • Complete port (however, orphaned morphological forms are dropped, and so are VerbNet/XWordNet data that cannot be linked to WordNet entries).
  • Incremental build support.
  • Retains the synset index as primary key, allowing easy reference to the original WordNet database
  • Includes support for WordNet 3.0
  • Includes support for WordNet 2.0 to 2.1, 2.1 to 3.0, 2.0 to 3.0 sense maps
  • Includes support for VerbNet 2.3
  • Includes support for XWordNet 2.0-1.1
  • Ready-to-use database (see wnsqldatabase package in download section) including:
      • WordNet 3.0
      • WordNet 2.0 to 2.1, 2.1 to 3.0, 2.0 to 3.0 sense maps
      • VerbNet 2.3
      • XWordNet 2.0-1.1
      • British National Corpus statistical data (for commonly used words)
*Popularity contest results: number of people who use this package regularly (number of people who upgraded this package recently) out of 184299