xtas

the eXtensible Text Analysis Suite

What xtas can do for you

  • easy access to numerous text processing and analysis tools
  • full support for Dutch and English
  • can use Elasticsearch for document storage
  • can be run as a service

xtas is a collection of natural language processing and text mining tools, brought together in a single software package with built-in distributed computing and support for the Elasticsearch document store.

xtas functionality consists partly of wrappers for existing packages, which install the required software and data automatically, and partly of custom-built modules that came out of research. It currently offers several parsers for Dutch and English (Alpino, CoreNLP, Frog, Semafor), named entity recognizers (Frog, Stanford and custom-built ones), a temporal expression tagger (HeidelTime) and a sentiment tagger based on SentiWords.

A basic installation of xtas works like a Python module. Built-in package management and a simple, uniform interface take away the hassle of installing, configuring and using many existing NLP tools.

xtas’s open architecture makes it possible to include custom code, run it in a distributed fashion and have it communicate with Elasticsearch for document storage and retrieval.
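
The idea of a uniform interface that also accepts custom code can be illustrated with a minimal sketch. This is not the xtas API: the names `TASKS`, `task`, `run`, `tokenize` and `count_tokens` are hypothetical, the tokenizer is a naive stand-in for a wrapped external tool, and distributed execution and Elasticsearch are left out entirely.

```python
# Hypothetical sketch: none of these names come from xtas itself.
from typing import Callable, Dict, List

# Registry mapping a task name to a function that takes text and returns a result.
TASKS: Dict[str, Callable[[str], object]] = {}


def task(name: str):
    """Register a function under a task name so all tools share one call style."""
    def decorator(func):
        TASKS[name] = func
        return func
    return decorator


@task("tokenize")
def tokenize(text: str) -> List[str]:
    # Stand-in for a wrapped external tool: naive whitespace tokenization.
    return text.split()


@task("count_tokens")
def count_tokens(text: str) -> int:
    # Example of custom code plugged in next to the wrapped tools.
    return len(tokenize(text))


def run(name: str, text: str):
    """Uniform entry point: look up the task by name and apply it to the text."""
    if name not in TASKS:
        raise KeyError(f"unknown task: {name}")
    return TASKS[name](text)
```

In this sketch, wrapped tools and custom modules are called the same way, for example `run("tokenize", text)` or `run("count_tokens", text)`, which is the kind of uniformity the description above refers to.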

Programming languages
  • Python 94%
  • Shell 4%
  • Java 1%
License
  • Apache-2.0

Participating organisations

Social Sciences & Humanities
Netherlands eScience Center
University of Amsterdam

Mentions

  • Xtas 3, the eXtensible Text Analysis Suite (published in 2016)

Contributors

Contact person

Lourens Veen, Netherlands eScience Center

  • Lars Buitinck, Netherlands eScience Center
  • Lourens Veen, Netherlands eScience Center

Related projects

  • SPuDisc: Searching public discourse (finished)