openDIAS - Specification.

This document is open for community scrutiny. If you have any feedback, please send it here.

INTRODUCTION
------------
The open Document Imaging Archive System (openDIAS) is an application that 
provides an office document workflow to the home user.
The application will accept documents and/or images from a file or from a 
scanning interface. These documents will then be saved seamlessly into a 
database, using 'tags' as index markers. Optionally, when the input is an image, 
the source can be OCRed to store the basic text of the document (basic text 
can also be extracted from other document formats). This text will be linked 
to and stored along with the original document. Later, documents can be browsed, 
updated, printed or deleted. A second application will provide a user interface 
to filter by the stored tags and to search the OCRed text or the document 
body.

MODULES
-------
1. Document collection
Collect documents either from an ODF file or by scanning a document.
	Allow the user to select:
		the location of the ODF document;
		whether to extract basic text;
	or
		the number of pages to scan;
		the proposed resolution (always B&W);
		whether to OCR the document (OCR will not be done in this module).
Scanning uses openDIAS's own interface to the SANE API (similar to XSane but tailored for SDA use).
After document collection [possible loop for multiple docs], move to module 2.
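
The choices that module 1 offers the user could be captured in a single request
structure, as in the C sketch below. Every name in it (AcquireRequest,
AcquireSource, acquire_request_valid and the field names) is hypothetical and not
part of openDIAS; the sketch only assumes that a request is either a file import
or a scan, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout: an illustration, not openDIAS code. */

typedef enum {
    SOURCE_ODF_FILE,   /* import an existing ODF document */
    SOURCE_SCANNER     /* acquire pages through the scanning interface */
} AcquireSource;

typedef struct {
    AcquireSource source;

    /* Used when source == SOURCE_ODF_FILE */
    const char *odf_path;     /* location of the ODF document */
    bool extract_text;        /* whether to extract basic text */

    /* Used when source == SOURCE_SCANNER */
    int pages;                /* number of pages to scan */
    int resolution_dpi;       /* proposed resolution (always B&W) */
    bool request_ocr;         /* only recorded here; module 2 runs the OCR */
} AcquireRequest;

/* Returns true when the fields relevant to the chosen source are usable. */
bool acquire_request_valid(const AcquireRequest *req)
{
    assert(req != NULL);

    switch (req->source) {
    case SOURCE_ODF_FILE:
        return req->odf_path != NULL && req->odf_path[0] != '\0';
    case SOURCE_SCANNER:
        return req->pages > 0 && req->resolution_dpi > 0;
    }
    return false;   /* unknown source value */
}
```

Keeping both sets of options in one structure means the loop over multiple
documents can hand each validated request to module 2 in the same way, whichever
source it came from.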

2. Saving interface
Shows the scanned document and allows the user to operate on the image:
	Discard the image and move to the next scan;
	Open the image/document for editing [keep version control];
	OCR (if not already done and the image is at the required resolution) [OCR done here];
	Save with specific tags.
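
The condition attached to the OCR action can be written as a small predicate.
This is a minimal sketch: PendingImage and ocr_is_offered are hypothetical names,
and reading "required resolution" as "at least a given minimum" is an assumption,
since the specification does not say whether an exact match is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for an image waiting in the saving interface. */
typedef struct {
    int resolution_dpi;   /* resolution the page was scanned at */
    bool ocr_done;        /* true once text has already been extracted */
} PendingImage;

/* Assumption: "required resolution" means at least min_dpi. */
bool ocr_is_offered(const PendingImage *img, int min_dpi)
{
    assert(img != NULL);
    return !img->ocr_done && img->resolution_dpi >= min_dpi;
}
```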

3. Retrieve & control
Provides the main interface to the openDIAS application.
	Filter and search documents, then allow interaction:
	Filter by tag
	Search by OCR text or document body
	Browse resultant documents
	Change attributes [opens module 2 for specific file(s)]
	Delete, email, print, export [PDF, etc.]
	Start an acquisition process [opens module 1]
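
The tag filter and text search listed above can be illustrated with an in-memory
sketch in C. It is not the intended storage design: the real system would
presumably run these as queries against the sqlite3 database. The Document
layout, MAX_TAGS and find_documents are hypothetical names, and the text match
is a plain case-sensitive substring search chosen only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_TAGS 8   /* hypothetical limit, for illustration only */

typedef struct {
    int id;
    const char *tags[MAX_TAGS];   /* unused slots are NULL */
    const char *text;             /* OCRed text or document body, may be NULL */
} Document;

static bool document_has_tag(const Document *doc, const char *tag)
{
    for (size_t i = 0; i < MAX_TAGS && doc->tags[i] != NULL; i++) {
        if (strcmp(doc->tags[i], tag) == 0)
            return true;
    }
    return false;
}

/*
 * Writes the ids of matching documents into out_ids (at most out_cap of
 * them) and returns how many were written.
 * tag == NULL disables the tag filter; needle == NULL disables the search.
 */
size_t find_documents(const Document *docs, size_t n_docs,
                      const char *tag, const char *needle,
                      int *out_ids, size_t out_cap)
{
    assert(docs != NULL || n_docs == 0);

    size_t found = 0;
    for (size_t i = 0; i < n_docs && found < out_cap; i++) {
        const Document *doc = &docs[i];

        if (tag != NULL && !document_has_tag(doc, tag))
            continue;
        if (needle != NULL &&
            (doc->text == NULL || strstr(doc->text, needle) == NULL))
            continue;

        out_ids[found++] = doc->id;
    }
    return found;
}
```

Applying the tag filter before the text search mirrors the order of the list
above: tags narrow the set cheaply, and only the remaining documents have their
text scanned.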


STRUCTURES
----------
The project will be released under the GPL (version 2 or 3).
The system will be written in C, using the glibc and GTK interfaces.
The system will be optimised for the GNOME platform, but should be portable.
The system should be fully localisable.


PREREQUISITES
-------------
Scanning API [sane]
OCR interface [tesseract]
PDF creator [unknown]
Database [sqlite3]