Personal document management using Formal Concept Analysis
The tool is able to index local hard drives and everything mounted
into the local file system, such as Windows or Unix network
drives.
It scans for a number of different document formats and
creates a database recording which words occur in which
documents.
This allows very fast lookup of keywords and other
information such as author, title, or location.
The keywords are
generated from the bodies of the documents, so no manual
annotation is required.
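The word-to-document database described above is essentially an inverted index. The following is a minimal sketch of that idea, not Docco's actual implementation: the function names (`tokenize`, `build_index`, `lookup`) and the sample documents are hypothetical, and a real indexer would also extract text from each file format and store metadata such as author and title.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names whose body contains it."""
    index = defaultdict(set)
    for name, body in documents.items():
        for word in tokenize(body):
            index[word].add(name)
    return index


def lookup(index, *keywords):
    """Return the documents that contain every given keyword."""
    if not keywords:
        return set()
    postings = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*postings)


# Hypothetical sample data standing in for files found on disk.
documents = {
    "notes.txt": "Lattice theory and formal concept analysis",
    "report.html": "Concept lattices for document retrieval",
    "todo.txt": "Buy milk",
}
index = build_index(documents)
```

Because keywords come straight from the document bodies, `lookup(index, "concept")` finds both `notes.txt` and `report.html` without any manual tagging. This sketch does no stemming, so "lattice" and "lattices" are treated as different words.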
Docco supports the following formats:
* plain text
* HTML
* XML
* OpenOffice/StarOffice 6.0 documents
* Word (with POI plugin)
* Excel (with POI plugin)
* PDF (with PDFbox or Multivalent plugin)
* UNIX man pages (with Multivalent plugin)
Once an index is created, the query interface allows asking for any
documents containing certain keywords and shows how these
combine.
Once a set of interesting documents is found, they can be
selected and will be displayed as a tree view, from which they can be
opened in the default application.
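To illustrate how query keywords can "combine" in the sense of Formal Concept Analysis, the sketch below enumerates the formal concepts for a small set of keywords. Each concept pairs a set of documents with the query keywords they all share. This is an illustration of the general technique under assumed data, not Docco's code; the `concepts` function and the sample `context` are hypothetical.

```python
from itertools import combinations


def concepts(context, keywords):
    """Enumerate formal concepts restricted to the given query keywords.

    context maps each document name to the set of keywords it contains.
    Returns a dict mapping each extent (frozenset of documents) to its
    intent (frozenset of query keywords shared by all those documents).
    """
    found = {}
    for size in range(len(keywords) + 1):
        for subset in combinations(keywords, size):
            # Extent: documents containing every keyword in the subset.
            extent = frozenset(
                doc for doc, words in context.items() if set(subset) <= words
            )
            # Intent: query keywords common to all documents in the extent.
            intent = frozenset(
                k for k in keywords if all(k in context[doc] for doc in extent)
            )
            found[extent] = intent
    return found


# Hypothetical context: which keywords occur in which documents.
context = {
    "a.txt": {"lattice", "concept"},
    "b.txt": {"concept", "retrieval"},
    "c.txt": {"retrieval"},
}
lattice = concepts(context, ["concept", "retrieval"])
```

For this sample, the result contains four concepts: all documents (no shared keyword), the documents with "concept", the documents with "retrieval", and the single document with both. Trying every keyword subset is exponential in the number of query keywords, which is acceptable for the handful of terms in a typical query but would not scale to a full vocabulary.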
Last updated Mon Jan 15 08:47:40 CST 2007.