Maarch Framework 3/Introduction

Understanding Maarch Framework 3

Maarch Framework 3 is a production DMS infrastructure that covers, out of the box, most of an organization's operational content management needs. The vast majority of the Framework's components are available under the GPLv3 license, i.e. as open source, so the cost of implementation makes the solution affordable for any type of organization (public, private, parastatal, NGO).

Maarch was designed by two consultants with more than 20 years of combined expertise in electronic archiving systems and publishing, so despite its low cost the product offers the guarantees of robustness, integrity and performance that you should expect from this type of product. Particular attention was paid to the architecture to allow maximum performance on standard hardware.

Maarch is developed entirely in object-oriented PHP 5. It is compatible with four database engines: MySQL, PostgreSQL, SQL Server and, soon, Oracle.

Maarch is completely modular: all features are grouped into modules whose specific services can be enabled or disabled depending on the user's profile. An experienced engineer can add a module or replace an existing one without touching the core of the system.
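To make the idea of per-profile services concrete, here is a minimal sketch in Python (not Maarch's PHP code). The class names, module name and service names are all hypothetical; the sketch only assumes that a service is usable when the module offers it and the user's profile has it enabled.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Module:
    """A module groups related services (hypothetical model, not Maarch's API)."""
    name: str
    services: set[str]


@dataclass
class Profile:
    """A user profile maps module names to the services enabled for it."""
    name: str
    enabled: dict[str, set[str]] = field(default_factory=dict)

    def can_use(self, module: Module, service: str) -> bool:
        # Usable only if the module offers the service
        # and the profile has it switched on.
        allowed = self.enabled.get(module.name, set())
        return service in module.services and service in allowed


# Hypothetical module and profile, for illustration only.
folders = Module("folders", {"view_folder", "create_folder"})
clerk = Profile("clerk", {"folders": {"view_folder"}})

print(clerk.can_use(folders, "view_folder"))    # True
print(clerk.can_use(folders, "create_folder"))  # False
```

Because the check is keyed on the module name, replacing a module with another implementation of the same services would leave the profiles untouched in this sketch.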

Maarch provides a global scheme and all the tools needed to acquire, manage, preserve and retrieve high volumes of production documents.

Let's look at what Maarch can do for you:



Maarch can prepare the loading of the archive with the Physical Archive module. The module prints barcode separator sheets that carry the identity of the file to be scanned, the document type, and the box in which the physical folder will be archived after scanning. Scan batches are pre-created in the software for optimum traceability.

The barcode separators are positioned on the stack of paper to identify and to separate documents during mass scanning.
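The separation step can be sketched as follows. This is an illustrative Python model, not the Physical Archive module itself: it assumes each scanned page arrives as a hypothetical `(image_name, barcode)` pair, where `barcode` is `None` for an ordinary page and a document identifier for a separator sheet.

```python
from __future__ import annotations

from typing import Iterable, Optional


def split_on_separators(
    pages: Iterable[tuple[str, Optional[str]]],
) -> dict[str, list[str]]:
    """Group a scanned stack into documents using separator sheets."""
    documents: dict[str, list[str]] = {}
    current: Optional[str] = None
    for image_name, barcode in pages:
        if barcode is not None:
            # A separator opens a new document; the sheet itself is not kept.
            current = barcode
            documents.setdefault(current, [])
        elif current is None:
            # Pages before the first separator cannot be identified.
            raise ValueError(f"page {image_name!r} precedes any separator")
        else:
            documents[current].append(image_name)
    return documents


# Hypothetical batch: two separators, three content pages.
batch = [
    ("sep1.tif", "DOC-1"),
    ("p1.tif", None),
    ("p2.tif", None),
    ("sep2.tif", "DOC-2"),
    ("p3.tif", None),
]
print(split_on_separators(batch))
# {'DOC-1': ['p1.tif', 'p2.tif'], 'DOC-2': ['p3.tif']}
```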

The system also manages several types of batches: one can choose the input screen and print the label for the type of document to be scanned.

The module is integrated in the sample application: credit files.

To manage physical storage, we will soon publish an Advanced Physical Archiving module:

  • Packaging units management (boxes, cartons, volume occupied)
  • Multi-warehouse storage locations management (creation and allocation of space)
  • Loans and returns of archives, etc.

This module is produced with the assistance of Marc Créhange, consultant in physical archives, and Anne-Marie Bruleaux, who is responsible for archival studies at the University of Mulhouse, France.


Maarch has several channels for acquiring incoming documents:

  • By manual uploading of an electronic document
  • By converting to PDF using a virtual printer, so as to build an electronic image of the document to archive
  • By scanning directly on the workstation using a low volume scanner

The Maarch virtual printer and direct scanning are very simple and intuitive ways to capture and upload: the document is shown as a PDF on the right side of the screen, while the qualifiers to enter are listed on the left.

More info: Maarch ScanSnap Connector

Finally, for high-volume scanning projects, Maarch has an efficient and innovative "SAI" module that allows unlimited scanning of documents. Among other things, this module provides:

  • Image transfer via secure internet protocols from the scan site to the archiving site. On transfer, the documents are divided into packets of 1024 bytes, each complemented by a CRC. A transfer protocol is established between the client module and the server, ensuring the integrity of the document whatever the line quality
  • Barcode recognition and document separation, using a reliable proprietary library
  • PDF conversion of TIFF images
  • Maarch AutoImport preparation, with all the service information related to scanning, for compliance purposes
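The packet-plus-CRC transfer described in the first item can be illustrated with a short Python sketch. It is not the SAI protocol: the text does not say which CRC variant or wire format Maarch SAI uses, so CRC-32 from Python's standard `zlib` module and the function names below are assumptions made for illustration.

```python
import zlib

PACKET_SIZE = 1024  # packet size in bytes, as stated in the text


def make_packets(data: bytes) -> list:
    """Split data into 1024-byte packets, each paired with its CRC-32."""
    packets = []
    for start in range(0, len(data), PACKET_SIZE):
        chunk = data[start:start + PACKET_SIZE]
        packets.append((chunk, zlib.crc32(chunk)))  # CRC-32 is an assumption
    return packets


def reassemble(packets: list) -> bytes:
    """Rebuild the data on the receiving side, verifying every packet."""
    out = bytearray()
    for index, (chunk, crc) in enumerate(packets):
        if zlib.crc32(chunk) != crc:
            # A real protocol would ask the sender to retransmit this packet.
            raise ValueError(f"packet {index} failed its CRC check")
        out += chunk
    return bytes(out)


document = bytes(range(256)) * 10      # 2560 bytes of sample data
packets = make_packets(document)       # 1024 + 1024 + 512 bytes
assert reassemble(packets) == document
```

Checking each packet separately means a bad line only costs the retransmission of the damaged packets rather than the whole document, which is presumably why the integrity guarantee holds "whatever the line quality".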

A constant communication link is maintained between the server and the client module, so it is possible to add custom controls on the server side and to warn the operator of any problem related to the quality of the scan (page size, mismatched codes, sequence breaks, etc.).

With Maarch SAI, heavy and expensive scanning software such as Ascent Capture or Captiva is no longer needed: just scan with the free software bundled with scanners (Kodak Capture Software, Fujitsu ScandAll, ...). There is also no limit on the number of pages.

Maarch SAI cannot be offered for free download because of the libraries it uses. However, the source code is available from Maarch integrators for implementing custom controls or specific processing.


The modules described above offer interactive capture and upload within the same process.

For mass processing, Maarch AutoImport is the module that rapidly uploads batches of electronic resources into Maarch, whether they come from mass scanning or from mass printing.

Maarch AutoImport also provides all information relating to the process.

The module was designed to handle tens of thousands of documents per hour.

More info: Maarch AutoImport


The documents in Maarch are immediately available for consultation, but there are ways to organize them to make them easier to view and manipulate.

Maarch OCR either converts a PDF file into an image + text PDF (proprietary license) or extracts the text of a PDF and saves it as an attachment to the document (free).

The conversion of a TIFF image or an image-only PDF is handled by a Maarch OCR module that integrates the ABBYY engine, whose reputation is well established. The result is an image + text PDF with outstanding recognition quality.

The free OCR simply extracts the text from the image. It is based on Tesseract, an open source project sponsored by Google.

Maarch Fulltext builds a full-text index of image + text PDFs and plain text files. This module is based on the well-known open source project Lucene, in its PHP port by Zend.

The combined use of Maarch OCR and Maarch Fulltext allows fuzzy searches into the scanned document content.
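Fuzzy matching matters here because OCR output often contains misread characters. The sketch below is a plain Python illustration of matching by edit distance (Levenshtein distance), the general technique behind fuzzy term queries in Lucene-style engines. It is not the Maarch Fulltext code, and the function names and the distance threshold are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_find(query: str, terms: list, max_distance: int = 1) -> list:
    """Return the indexed terms within max_distance edits of the query."""
    query = query.lower()
    return [t for t in terms if edit_distance(query, t.lower()) <= max_distance]


# "invo1ce" stands for a typical OCR misread of "invoice".
indexed_terms = ["invo1ce", "contract", "Invoice"]
print(fuzzy_find("invoice", indexed_terms))  # ['invo1ce', 'Invoice']
```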

Maarch Autofoldering automatically categorizes documents based on their index values. After Autofoldering, the user accesses the archive through categorization trees.
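As a rough illustration of how index values can yield a categorization tree, here is a minimal Python sketch. The field names (`customer`, `year`) and the function `build_tree` are hypothetical and do not reflect how Maarch Autofoldering is configured; the sketch only assumes that each tree level corresponds to one index field.

```python
def build_tree(documents: list, levels: list) -> dict:
    """Nest document ids under one tree level per index field."""
    tree = {}
    for doc in documents:
        node = tree
        # Walk or create the intermediate folders.
        for level in levels[:-1]:
            node = node.setdefault(doc[level], {})
        # The last level holds the list of document ids.
        node.setdefault(doc[levels[-1]], []).append(doc["id"])
    return tree


# Hypothetical indexed documents.
docs = [
    {"id": 1, "customer": "Dupont", "year": "2009"},
    {"id": 2, "customer": "Dupont", "year": "2010"},
    {"id": 3, "customer": "Martin", "year": "2009"},
]
print(build_tree(docs, ["customer", "year"]))
# {'Dupont': {'2009': [1], '2010': [2]}, 'Martin': {'2009': [3]}}
```

Swapping the order of `levels` gives a different tree over the same documents, which is the sense in which several categorization trees can coexist.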


Maarch ensures secure content preservation, and also handles document flow through functionally rich modules. Document search is implemented in the "Indexing & Searching" module.

By enabling the Folders module, you can store documents in structured folders and run searches on folder qualifiers.

Circulation of documents for validation or storage is done with baskets (Baskets module) or a procedural workflow (Workflow module).

Finally, Maarch CD is a tool to burn a standalone CD, including documents, index, and the interface for searching and viewing.

Maarch CD requires application-specific integration: since it is not reasonable to put the whole framework on removable media, the recipient gets a simplified consultation interface tailored to specific needs. Please contact Maarch to find out how to use this module.

To go further...

For an overview of Maarch's capabilities on your own machine, download and install the framework with the sample application.

The Quick start guide describes Maarch features in light of the sample, then guides you through its use.

The English Setup and configuration guide describes how to install and set up Maarch, and explains the internal mechanics of the Core, the application, and the modules.

Modules are described in detail in the Developer handbook.

The data model is also fully documented in English.