No commit activity in last 3 years
No release in over 3 years
Parser to translate KAF to NAF and back.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
 Dependencies

Development

>= 0
>= 0
~> 3.0

Runtime

 Project Readme

KafNafParserPy

Description

Parser for KAF or NAF files in python. The documentation for all methods and API of this parser can be found at:

  1. PDF API: http://kyoto.let.vu.nl/~izquierdo/documentation/KafNafParserPy/api.pdf
  2. Maintainer guide: http://kyoto.let.vu.nl/~izquierdo/documentation/KafNafParserPy/kafnafparserpy-maintaining.pdf

You can also take a look at this presentation on slideshare about this library.

Quick Installation

The KafNafParserPy is from Feb 10th available in the Python Package Index, so you can easily install it (and its dependencies), by running:

pip install KafNafParserPy

Installation

Clone the repository from github

git clone https://github.com/cltl/KafNafParserPy.git

You will need to have installed the lxml library for python (http://lxml.de/). Usually just by runningpip install --user lxml should be enough for getting lxml installed. In some cases there can be problems with the libraries libxml and libxslt. In this case (considering you have no root access for the machine), you can try to do the following:

wget http://xmlsoft.org/sources/libxml2-sources-2.7.7.tar.gz
gzip -dc libxml2-sources-2.7.7.tar.gz | tar xvf -
cd libxml2-2.7.7
./configure --prefix=/home/ruben/lib
make
make install
wget http://xmlsoft.org/sources/libxslt-1.1.26.tar.gz
gzip -dc libxslt-1.1.26.tar.gz | tar xvf -
cd libxslt-1.1.26
./configure --prefix=/home/ruben/lib --with-libxml-prefix=/home/ruben/lib
make
make install
PATH=$PATH:/home/ruben/lib/bin/
pip install --user lxml

Of course replace /home/ruben/lib by the folder where you want to install the libraries, and check the corresponding websites for newer versions of the libraries.

Usage

This library is a python module, that reads a KAF or NAF file and parses it. It basically parses one KAF/NAF file and allows to access to all the layers through different methods and functions. This is one example of usage:

python
>>> from KafNafParserPy import KafNafParser
>>> my_parser = KafNafParser('myfile.kaf')
>>> for token_obj in my_parser.get_tokens():
>>>     print 'Token id',token.get_id()
>>>     print 'Token text',token.get_text()
>>>
>>> for term_obj in my_parser.get_terms():
>>>    print 'Lemma',term_obj.get_lemma()
>>>    print 'Ids:',term_obj.get_span().get_span_ids()
>>>
>>> for prop in my_paser.get_properties():
>>>    print 'Id',prop.get_id()
>>>    for reference in prop.get_references():
>>>        for span_obj in reference: ##Iterator over Creference object
>>>            print 'span ids',span_obj.get_span_ids()

You can find some examples of usage of this parser in the subfolder examples.

Documentation

The documentation can be generated automatically by running:

epydoc --config documentation.cfg

This will call to the external program epydoc (http://epydoc.sourceforge.net/) with the provided configuration file, and will create the HTML documents for the API in the folder apidocs. As said before the already generated documentation can be seen at http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy

Contact

License

Sofware distributed under GPL.v3, see LICENSE file for details.