Maarch AutoImport/Installation and Exploitation guide

Presentation

Introduction

Lightweight and simple to use, Maarch AutoImport enables fast bulk import of documents into Maarch DMS. AutoImport version 3.2 works with Maarch Enterprise 1.2 and Maarch Letterbox 2.9.

Working Process

Maarch AutoImport scans every document that needs to be imported. For a document to be indexed, it must be accompanied by an XML file with the same name. This XML file contains the metadata needed to reference the document (description, status, creation date, document collection, etc.).
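
The pairing rule described above can be sketched in a few lines of Python. This is only an illustration, not the code of the PHP script: the function name pair_documents is hypothetical, and "same name" is assumed to mean the same base name with an .xml extension (for example invoice.pdf and invoice.xml).

```python
from pathlib import Path


def pair_documents(incoming):
    """Return (pairs, orphans) for every non-XML file found under `incoming`."""
    pairs, orphans = [], []
    # rglob("*") also walks subfolders of the incoming directory.
    for doc in sorted(Path(incoming).rglob("*")):
        if not doc.is_file() or doc.suffix.lower() == ".xml":
            continue
        # Assumption: the index file has the same base name, with .xml.
        sidecar = doc.with_suffix(".xml")
        if sidecar.is_file():
            pairs.append((doc, sidecar))
        else:
            orphans.append(doc)
    return pairs, orphans
```

Calling pair_documents on the incoming directory returns the documents that are ready for indexing, together with the documents for which no XML file was found.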

AutoImport001.jpg

New in version 3

  • Subfolders are now allowed in the incoming directory
  • Constant values for Maarch fields can be set in the mapping file. This is useful for filling in values that are not present in the document index files
  • Bug fixes

Installation and Configuration

Pre-requisites

Maarch AutoImport requires the PHP command-line interpreter, version 5.3 or later, to execute the script.

The Maarch application must already be configured. The DocServer must also exist and be linked to the database.

Configuration of the Maarch Database

The size limit of the DocServer in the database must be adapted to the quantity of documents to import. If the size limit is reached, importation will not be authorized.
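
As a rough illustration of this rule, the check amounts to comparing the DocServer size after the import with its limit. The function and parameter names below are hypothetical, and whether the real check treats the limit as inclusive is an assumption.

```python
def has_room(size_limit, actual_size, batch_file_sizes):
    """True if the batch still fits under the DocServer size limit (bytes)."""
    # Assumption: reaching the limit exactly is still accepted.
    return actual_size + sum(batch_file_sizes) <= size_limit
```

Summing the sizes of the files waiting in the incoming directory before a large import gives an idea of the limit to set in the database.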

When Maarch AutoImport is used for the first time, work_batch_autoimport_id must be set to 0 in the parameters table.

AutoImport002.png

Installation of Maarch AutoImport

Installation of Maarch AutoImport Files

No installation procedure is needed for this application. The script can be run from any folder, but the different configuration files must be set up accordingly.

AutoImport004.png

Configuration of the Application

To configure Maarch AutoImport, edit the file config.xml. It contains the information needed to connect to the database and to import files into the DocServer. The mapping file (mapping_file.xml) links the tags of the XML index files to the database columns. Each of these files is described in detail below.

Configuration of config.xml

config.xml is the configuration file for Maarch AutoImport. It holds the settings used to connect to the database and to send files to the Maarch DocServer.

Each tag is a configuration entry. The values can be changed.

Description of config.xml tags

  • <CONFIG_NAME>: name of the configuration.
  • <MAPPING_FILE>: absolute path to the mapping file.
  • <SCAN_IMPORT_DIRECTORY>: absolute path to the folder where scanned documents are saved. This folder will be monitored (e.g. E:\IMPORTATION\).
  • <LOCATION>: IP address of the database server.
  • <DATABASE_PORT>: port of the database server.
  • <DATABASE>: name of the Maarch database.
  • <DATABASETYPE>: type of the database server (postgresql, mysql, oracle, mssql).
  • <DATABASEWORKSPACE>: workspace of the database server.
  • <USER_NAME>: login used to connect to the database.
  • <PASSWORD>: password associated with this login.
  • <TABLE_NAME>: name of the database table in which the files must be referenced.
  • <INSERT_MODE>: set to true to use SQL insert mode.
  • <DOCSERVER_NAME>: name of the DocServer.
  • <DATE_TIME_FORMAT>: date format of the database server.
  • <AUTO_IMPORT_DIRECTORY>: folder that contains the Maarch AutoImport scripts (e.g. E:\Maarch_auto_import_php\).
  • <WITHOUT_XML>: set to true if the documents have no XML file.
  • <BACKUP_BATCH>: set to true to back up the documents.
  • <EXCLUDE_EXISTING_DOCS>: set to true to exclude documents that already exist (fingerprint control).
  • <EXCLUDE_EXISTING_DOCS_FOLDER>: name of the folder for excluded documents.
  • <CONTROL_COMPLETE_FILES>: set to true to check the integrity of the batch.
  • <CREATE_FOLDERS>: set to true to create folders.
  • <FOLDERS_TABLE_NAME>: name of the folders table.
  • <FOLDERS_MAPPING_FILE>: absolute path to the mapping file for folders.
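
To check a configuration before launching an import, the tags listed above can be read with a short Python sketch. The enclosing <ROOT> and <CONFIG> elements and all sample values are assumptions made for this illustration; only the tag names come from the list above. The helper names load_config and is_enabled are hypothetical, and reading "true" case-insensitively is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened config.xml; the nesting is an assumption.
SAMPLE_CONFIG = r"""<ROOT>
  <CONFIG>
    <CONFIG_NAME>demo</CONFIG_NAME>
    <SCAN_IMPORT_DIRECTORY>E:\IMPORTATION\</SCAN_IMPORT_DIRECTORY>
    <LOCATION>127.0.0.1</LOCATION>
    <DATABASE_PORT>5432</DATABASE_PORT>
    <DATABASETYPE>postgresql</DATABASETYPE>
    <WITHOUT_XML>false</WITHOUT_XML>
    <BACKUP_BATCH>true</BACKUP_BATCH>
  </CONFIG>
</ROOT>"""


def load_config(xml_text):
    """Collect every leaf tag into a dict, whatever the nesting is."""
    root = ET.fromstring(xml_text)
    return {el.tag: (el.text or "").strip() for el in root.iter() if len(el) == 0}


def is_enabled(config, tag):
    """Treat 'true' as on; any other value, or a missing tag, as off."""
    return config.get(tag, "").lower() == "true"


config = load_config(SAMPLE_CONFIG)
# Tags that this sample leaves empty or absent.
missing = [t for t in ("MAPPING_FILE", "DOCSERVER_NAME") if not config.get(t)]
```

Because the sketch collects leaf tags wherever they appear, it does not depend on the exact nesting of the real file.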

Configuration of mapping_file.xml

mapping_file.xml defines which database field receives each indexed property. This file must be configured for the indexing to work.

Every field to index must be described in an <ELEMENT> tag. It contains two child tags, <COLUMN> and <VALUE>, and optionally a <FORMAT> tag:

  • <COLUMN>: Maarch column in the RES_X table (or another table, depending on the settings).
  • <VALUE>: index file tag, or a constant value between single quotes.
  • <FORMAT>: 'DATE' for a date value, nothing for the others. The date format is taken from the config.xml file.


 Example of mapping_file.xml
 
 <ROOT>
   <MAPPING_FILE>
     <ELEMENT>
       <COLUMN>DOC_DATE</COLUMN>
       <VALUE>DATE_OF_DOCUMENT</VALUE>
       <FORMAT>DATE</FORMAT>
     </ELEMENT>
     <ELEMENT>
       <COLUMN>DESCRIPTION</COLUMN>
       <VALUE>SUM</VALUE>
     </ELEMENT>                                                        
     <ELEMENT>
       <COLUMN>TYPE_ID</COLUMN>
       <VALUE>TYPE_ID</VALUE>
     </ELEMENT>
     <ELEMENT>
       <COLUMN>TITLE</COLUMN>
       <VALUE>INVOICE</VALUE>
     </ELEMENT>
     <ELEMENT>
       <COLUMN>IDENTIFIER</COLUMN>
       <VALUE>CLIENT</VALUE>
     </ELEMENT>
     <ELEMENT>
       <COLUMN>CUSTOM_T4</COLUMN>
       <VALUE>ADRESS</VALUE>
     </ELEMENT>         
     <ELEMENT>
       <COLUMN>STATUS</COLUMN>
       <VALUE>'NEW'</VALUE>
     </ELEMENT>         
   </MAPPING_FILE>
 </ROOT>
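
The way such a mapping can be read is sketched below in Python, on a shortened copy of the example. This is not the code of the PHP script: the function name parse_mapping and the keys of the returned dictionaries are hypothetical. A <VALUE> enclosed in single quotes is treated as a constant; anything else is treated as the name of a tag in the index file.

```python
import xml.etree.ElementTree as ET

MAPPING_XML = """<ROOT>
  <MAPPING_FILE>
    <ELEMENT>
      <COLUMN>DOC_DATE</COLUMN>
      <VALUE>DATE_OF_DOCUMENT</VALUE>
      <FORMAT>DATE</FORMAT>
    </ELEMENT>
    <ELEMENT>
      <COLUMN>TITLE</COLUMN>
      <VALUE>INVOICE</VALUE>
    </ELEMENT>
    <ELEMENT>
      <COLUMN>STATUS</COLUMN>
      <VALUE>'NEW'</VALUE>
    </ELEMENT>
  </MAPPING_FILE>
</ROOT>"""


def parse_mapping(xml_text):
    """Return one rule per <ELEMENT>: column, source tag or constant, date flag."""
    rules = []
    for element in ET.fromstring(xml_text).iter("ELEMENT"):
        column = element.findtext("COLUMN", default="").strip()
        value = element.findtext("VALUE", default="").strip()
        fmt = element.findtext("FORMAT", default="").strip()
        # A value between single quotes is a constant, not an index tag.
        is_constant = len(value) >= 2 and value[0] == value[-1] == "'"
        rules.append({
            "column": column,
            "constant": value[1:-1] if is_constant else None,
            "source_tag": None if is_constant else value,
            "is_date": fmt == "DATE",
        })
    return rules
```

Values are stripped of surrounding whitespace so that a stray space inside a tag does not prevent the match with the index file.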

Description of the XML files containing information for the documents

To index a document in Maarch, the indexing information must be written in a separate file, so that the original file is left untouched. This information is stored in an XML file with the same name as the original file.

The values to index in the chosen fields are placed between tags whose names match the content of the <VALUE> tags in mapping_file.xml.

 Example of an XML property file:
 
 <?xml version='1.0' encoding='iso-8859-1'?>
 
 <ROOT>
   <DATE_OF_DOCUMENT>2007-01-11 00:00:00</DATE_OF_DOCUMENT>
   <TYPE_ID>Customer Invoice</TYPE_ID>
   <INVOICE>ACME F-003</INVOICE>
   <CLIENT>ACCOVA RADIATEURS</CLIENT>
   <ADRESS>7 rue Jean Mermoz</ADRESS>
   <CP>91080</CP>
   <CITY>COURCOURONNES</CITY>
   <SUM>803,71</SUM>
 </ROOT>
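
How the mapping and a property file combine can be sketched in Python as follows. The names resolve_row and build_insert are hypothetical, the rules are a hand-written extract of the mapping example, and the statement uses "?" placeholders rather than reproducing the SQL text that AutoImport actually generates. No date conversion is attempted here.

```python
import xml.etree.ElementTree as ET

INDEX_XML = """<ROOT>
  <DATE_OF_DOCUMENT>2007-01-11 00:00:00</DATE_OF_DOCUMENT>
  <INVOICE>ACME F-003</INVOICE>
  <CLIENT>ACCOVA RADIATEURS</CLIENT>
  <SUM>803,71</SUM>
</ROOT>"""

# (column, value) pairs as they would be read from mapping_file.xml.
RULES = [
    ("DOC_DATE", "DATE_OF_DOCUMENT"),
    ("DESCRIPTION", "SUM"),
    ("TITLE", "INVOICE"),
    ("IDENTIFIER", "CLIENT"),
    ("STATUS", "'NEW'"),
]


def resolve_row(index_xml, rules):
    """Build {column: value} from the index file and the mapping rules."""
    root = ET.fromstring(index_xml)
    row = {}
    for column, value in rules:
        if len(value) >= 2 and value[0] == value[-1] == "'":
            row[column] = value[1:-1]  # constant set in the mapping file
        else:
            # Tag looked up in the index file; empty string if absent.
            row[column] = (root.findtext(value) or "").strip()
    return row


def build_insert(table, row):
    """Return a parameterized INSERT statement and its parameter list."""
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    return f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", list(row.values())
```

Tags of the property file that no rule refers to, such as <CP> and <CITY> in the example above, are simply not read by this sketch.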

Execution of the application

Once config.xml and mapping_file.xml are correctly configured, Maarch AutoImport can be executed to import the documents into Maarch.

Maarch AutoImport must be run from a command-line interpreter.

AutoImport006.png

On Microsoft Windows: in the "Start Menu", select "Run". In the Run window, enter "cmd" and click OK.

First, change to the folder that contains the PHP interpreter.

DOS command, with the default location of PHP:

  cd "C:\xampp\php"

From the folder of the PHP interpreter, run it with the path of the Maarch AutoImport script and the path of the config file:

 php [application location] [configuration file]
 

Example of a command that launches the bulk import with a standard config file:

  C:\xampp\php\php.exe C:\autoimport\maarch_autoimport\maarch_auto_import.php  C:\autoimport\maarch_autoimport\config.xml

Then press "Enter": the PHP interpreter is now executing the script.
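
When the import has to be launched from another program rather than typed by hand, the same command can be assembled as an argument list. The Python sketch below is only an illustration: build_command and run_import are hypothetical names, and the meaning of the exit code returned by the script is not documented here, so the result should be checked against log.txt.

```python
import subprocess


def build_command(php_exe, script, config):
    """Argument list: PHP interpreter, AutoImport script, configuration file."""
    return [php_exe, script, config]


def run_import(php_exe, script, config):
    """Run the import and return the finished process with its output."""
    # Passing a list avoids quoting problems with spaces in paths.
    return subprocess.run(
        build_command(php_exe, script, config),
        capture_output=True,
        text=True,
    )
```

With the paths of the example above, build_command receives C:\xampp\php\php.exe, the path of maarch_auto_import.php and the path of config.xml, in that order.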

The actions performed during the import process are stored in an event log. It is located in the same folder as Maarch AutoImport and is called log.txt. This file is overwritten on every new execution of the script.

If an error occurs during the execution of the script, execution stops immediately and no indexing is performed.

An SQL file is created in the Maarch AutoImport folder. It contains the SQL queries to execute in the database to index the documents.

Once the documents are correctly inserted into Maarch, the import script copies them to the /backup/ folder. If an error occurs during execution, the documents are moved to the /failed/ folder.


Locking of the application

While it is running, the application creates a lock file named AutoImport.lck. This file is deleted when the execution is over.

The lock file prevents documents from being mishandled: only a single instance of Maarch AutoImport can run at a time.
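
The general lock-file technique can be sketched in Python as follows. This shows a common pattern, not necessarily the way the PHP script implements it: the name single_instance is hypothetical, and writing the process id into the file is an addition made for this example.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Hold a lock file for the duration of the block; fail if it exists."""
    # O_EXCL makes the creation fail if the file already exists, so the
    # existence check and the creation happen in a single step.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # released when the run is over
```

A second process entering single_instance with the same path gets a FileExistsError instead of starting a concurrent import. If a run is interrupted before the file is removed, the lock has to be deleted by hand.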

Log event file

In the Maarch AutoImport folder, you will find a file named log.txt.

It records the successive steps of the script during its execution, which makes it possible to follow what AutoImport does. This file is overwritten every time the script is executed.

When an error occurs, it is reported in the log together with its error number, which helps in resolving the problem.

Error codes

The most frequent error codes are:

  • 2: the free space available in the DocServer is not sufficient to continue. The whole batch being loaded is moved to the failed directory and the application stops.
  • 3: the copy of a file failed during its processing. The file is moved to the failed directory.
  • 4: Maarch AutoImport cannot execute a query on the database. The whole batch being loaded is moved to the failed directory and the application stops.
  • 5: the image of an XML file was not found. The XML file is moved to the failed directory.
  • 6: the XML of an image file was not found. The XML file is moved to the failed directory.
  • 7: moving the imported files to the backup failed. The files in the incoming directory are not deleted and the application stops.
  • 8: unable to move a file in the CreateLockFile function. The whole batch being loaded is moved to the failed directory and the application stops.
  • 9: DATE_FORMAT tag error, wrong description of the date in the mapping file. The whole batch being loaded is moved to the failed directory and the application stops.
  • 10: XML format error (DATE_FORMAT tag), check your XML transfer file. The whole batch being loaded is moved to the failed directory and the application stops.
  • 11: the work batch already exists. The whole batch being loaded is moved to the failed directory and the application stops.
  • 12: the path of the DocServer does not exist. The whole batch being loaded is moved to the failed directory and the application stops.

For known errors #2, #4, #7, #8, #9, #10, #11 and #12, the whole batch being loaded is moved to the failed directory.
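
For monitoring scripts that read the log, the list above can be turned into a small lookup table. The Python sketch below only restates this documentation: the names ERRORS, BATCH_WIDE and describe are hypothetical, and the set of batch-wide codes is taken from the summary sentence that closes the list.

```python
# Short messages paraphrased from the list of known errors.
ERRORS = {
    2: "not enough free space in the DocServer",
    3: "copy of a file failed",
    4: "cannot execute query on the database",
    5: "image of an XML file not found",
    6: "XML of an image file not found",
    7: "moving imported files to backup failed",
    8: "unable to move file in CreateLockFile",
    9: "wrong date description in the mapping file",
    10: "XML format error (DATE_FORMAT tag)",
    11: "work batch already exists",
    12: "DocServer path does not exist",
}

# Codes for which the whole batch goes to the failed directory.
BATCH_WIDE = frozenset({2, 4, 7, 8, 9, 10, 11, 12})


def describe(code):
    """Return a one-line description with the scope of the failure."""
    if code not in ERRORS:
        return f"error {code}: not in the documented list"
    scope = "whole batch" if code in BATCH_WIDE else "single file"
    return f"error {code}: {ERRORS[code]} ({scope})"
```

The scope tells an operator whether only one file needs attention or whether the complete batch has to be submitted again.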

Organization of the files in the DocServer

Only the documents are sent to the DocServer. The XML files are deleted once their content has been inserted into the database.

In the DocServer folder, documents are sorted first by year and then by month.

The third level is the AutoImport batch number. This number can be modified in the database through the parameters table, but doing so while documents are being indexed is strongly discouraged.

To organize the documents in the DocServer, they are renamed and filed into folders. Every thousand documents, a new folder is created to receive the next thousand files.
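
The resulting layout can be illustrated with a short Python sketch. The actual folder names are not given in this guide, so several points are assumptions made for the example: the zero-padded names, the numbering of the thousand-document folders from 1, a document index starting at 0, and the use of the import date for the year and month levels. The function name docserver_subpath is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def docserver_subpath(import_date, batch_number, doc_index, per_folder=1000):
    """Relative folder for a document: year / month / batch / group of 1000."""
    # Assumption: doc_index starts at 0 and folders are numbered from 1.
    folder = doc_index // per_folder + 1
    return PurePosixPath(
        f"{import_date.year:04d}",
        f"{import_date.month:02d}",
        str(batch_number),
        f"{folder:04d}",
    )
```

Under these assumptions, the first thousand documents of a batch share one folder and document number 1000 opens the next one.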

AutoImport011.jpg

Descriptions of the main functions

  • Scan: Browses the files in the import folder and checks the number of documents being processed.
  • GetWorkBatchAutoImport: Retrieves the working number of the latest Maarch AutoImport batch.
  • MoveFiles: Moves the files to their final folder, depending on how the process ended.
  • UpdateWorkingBatchAutoImport: Updates the working number of the Maarch AutoImport batch.
  • SearchImageOfXml: Checks that a file linked to the XML exists.
  • ExtractFileExt: Retrieves the extension of the file being processed.
  • ExtractFileName: Retrieves only the name of the file (without the extension).
  • DocServerSize_init: Retrieves the current size of the DocServer.
  • DocServerSize_update: Updates the size of the DocServer.
  • Checking_available_space: Checks whether the space required for the import is available in the DocServer.
  • BuildDirectory: Indexes the desired files within the DocServer.
  • Function_log: Creates a file that logs the events occurring during the execution of the script.
  • WriteInSqlFile: Writes to a file the information from the XML files to import into the database.
  • BuildSqlQuery: Parses the XML files and builds the SQL query. The query is then written to a file by WriteInSqlFile.