Sponsored Development: Replication

From ADempiere
Revision as of 11:26, 20 January 2008 by Trifonnt (Talk) (Q.8 How are those events being sent to the queue?)


Plan

Implemented in stages.

  • 2 weeks. Research and Proof of Concept.

License

GPL2 - the same as Adempiere.

Project Team

Coordinator

Victor Perez

Functional Specs

Developers

Testers

Sponsors

e-evolution $2400 USD


Requirements

Finish off the beta Replication functionality inherited from Compiere.

  • Asynchronous master-to-master replication is needed.
    • Must work with Postgres!
    • 38 stores working with local servers must synchronize with one central server.
    • Connection between remote servers is 128 kbps.
    • Records created on the Central server: Product, Price, Term Credit, Business Partner.
    • Records created on Remote servers: Sales Order, Shipment, Invoice (Customer), Payment, Cash Journal, Business Partner, Inventory.

Scenario

The Company has 38 stores, each with a local Adempiere and PostgreSQL server and a 128 kbps Internet connection. These stores must be able to operate even when they are not online. When they are online, the stores will automatically replicate to the central server.

Design

What we need for Replication

  • A. A way to export information from Adempiere.
    • I think that there must be a mechanism inside Adempiere to define the Export message.

This is what I managed to develop with the 'Export Format' window. The user can define an XML format. The XML format can have a tree structure.

  • B. A trigger which starts the Export process.
    • See 'When to trigger export event'
  • C. A place where all Export messages are stored while Adempiere is disconnected from the Internet.
    • I think that this must be some external Server with proven capabilities.
      • Like JMS. JMS can guarantee delivery of messages. So the task of Adempiere is just to deliver the message to the JMS Server.
  • D. A place where all Incoming messages are stored until Adempiere processes them.
    • I think that this must be some external Server with proven capabilities.
      • Like JMS. JMS can guarantee delivery of messages. So the task of Adempiere is just to read the Incoming message from the JMS Server.
  • E. A process which listens for new incoming messages.
    • I think this must also be a process inside Adempiere which connects to the Incoming Message storage and processes messages (calls the Import functionality).
  • F. A way to import information into Adempiere.
    • I think that this must also be functionality inside Adempiere.

This is what I managed to develop with ImportProcess.
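The store-and-forward behaviour asked for in points C and D can be illustrated with a small in-memory stand-in. This is only a sketch of the idea, not the JMS setup itself: the class name `StoreAndForwardQueue` and all of its methods are hypothetical, and a real deployment would rely on a JMS server for persistence and delivery guarantees.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical in-memory stand-in for the local message store (not a JMS API). */
final class StoreAndForwardQueue {
    private final Deque<String> pending = new ArrayDeque<>();
    // Stand-in for the remote side; a real setup would forward over the network.
    private final List<String> delivered = new ArrayList<>();
    private boolean online = false;

    /** Accepts an export message; it is held locally until the link is up. */
    void send(String xmlMessage) {
        pending.addLast(xmlMessage);
        flush();
    }

    /** Signals that the Internet connection went up or down. */
    void setOnline(boolean online) {
        this.online = online;
        flush();
    }

    /** Forwards pending messages in arrival order while the link is up. */
    private void flush() {
        while (online && !pending.isEmpty()) {
            delivered.add(pending.removeFirst());
        }
    }

    int pendingCount() {
        return pending.size();
    }

    List<String> delivered() {
        return delivered;
    }
}
```

Messages sent while offline simply accumulate, and they are forwarded in their original order once `setOnline(true)` is called.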

When to trigger export event

When record is saved in ModelValidator

  • Low Hengsin's opinion:
I don't think using ModelValidator to generate the xml export file is a good approach. 
Replication should be a background process that has little performance impact on normal transactions 
and the frequency of replication should also be user-configurable.
I think we should have a background process that reads the changelog and replication configuration 
to generate the replication message ( xml or otherwise ) and send that out.
  • My opinion:
I agree. But ModelValidator is one possible option. 
At the same time Replication MUST be guaranteed. Which means that a record should not be saved if the Change Message 
can't be sent to the Message Storage.
JMS guarantees delivery once a message is stored in a Queue or Topic.
My idea is to install one JMS Server on each local Adempiere instance, which will collect all messages from local clients.
It will be the task of the JMS Server to transfer messages to other ADempiere Instances once the Internet connection is online.
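The rule "a record should not be saved if the Change Message can't be sent" can be sketched as follows. This is a minimal illustration under stated assumptions: `GuardedSave` and `MessageStore` are hypothetical names, the list stands in for the database, and real code would perform both steps inside one database transaction so that a failed save does not leave an orphan message behind.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: refuse the save when the change message cannot be stored. */
final class GuardedSave {
    /** Hypothetical message sink; a real one would wrap a JMS producer. */
    interface MessageStore {
        void store(String xml) throws Exception;
    }

    private final MessageStore store;
    // Stand-in for the database table holding the saved records.
    private final List<String> savedRecords = new ArrayList<>();

    GuardedSave(MessageStore store) {
        this.store = store;
    }

    /** Returns true only if the message was stored and the record was saved. */
    boolean save(String recordKey, String xml) {
        try {
            store.store(xml);        // hand over the change message first
        } catch (Exception e) {
            return false;            // message not stored: the record is not saved
        }
        savedRecords.add(recordKey); // message is safe, so keep the record
        return true;
    }

    List<String> savedRecords() {
        return savedRecords;
    }
}
```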

At some predefined schedule & AD_ChangeLog

  • Extend the changelog (AD_ChangeLog) feature in Adempiere and use that to generate the replication message.

Also, it might be good to add versioning support for all tables, which will help in implementing conflict resolution.

For the cycle issue, if the changelog mechanism is used, we can add a way to turn the change log management on or off during the save operation.

  • My opinion
I think that it can be done!
In my view it requires some additional processing. We must take care to collect all Change Messages not yet sent to the JMS Server,
which duplicates the work of the JMS server. 
  • There are cases in which AD_ChangeLog does not work.
    • Create new Business Partner. AD_ChangeLog does not contain information that a new record was created.

I think that this applies to all other tables.

    • Delete existing Business Partner. AD_ChangeLog does not contain information that the record was deleted.

Use Workflow functionality and add new Node which will be responsible for sending XML Message

This will allow an Approval step and a user-configurable process, but will make the process slow.


Tasks


Existing functionality

  • Post from Carlos - here

There is a seed from JJ currently in windows "Replication Strategy" and "Replication"

    • Configuration: AD_ReplicationStrategy + AD_ReplicationTable + AD_Table
    • Execution: AD_Replication + AD_ReplicationRun + AD_ReplicationLog
    • There is a hidden field "Replication Type" in AD_Table.
    • There is code to manage different sequences in different installations.
    • There is code to replicate in ReplicationLocal.java -> looks like it manages the replication set based on the Updated column of replicated tables.
    • There is some code (looks unused) in ReplicationRemote.java - I suppose this code is intended to be used in the remote installations.

I'm not saying this is a complete solution; it must be enhanced, i.e. the Updated column must be guaranteed, or the code changed to use AD_ChangeLog instead. AD_ChangeLog is just another seed that needs to be enhanced. There are known flaw points.

Known Issue

  • Circular links in some tables (i.e. C_Invoice - C_CashLine_ID & C_CashLine - C_Invoice_ID) --Armen 01:38, 10 July 2007 (EDT)
  • Many places use Update sql (instead of PO), resulting in an unreliable modified date --Armen 01:38, 10 July 2007 (EDT)

Tables which do not have IsIdentifier set

AD_Package_Imp_Detail
AD_Package_Imp_Proc
AD_UserBPAccess
CM_Container_URL
CM_MediaDeploy
CM_NewsItem
CM_WebAccessLog
C_TaxDeclarationAcct
C_TaxDeclarationLine
M_AttributeSetExclude
M_CostDetail
M_CostQueue
M_LotCtlExclude
M_SerNoCtlExclude

Issue during importing process

Example xml file which must be imported.

<?xml version="1.0" encoding="UTF-8"?>
<C_BPartner AD_Client_Value="GardenWorld">
    <AD_Client>
	<AD_Client_Value>GardenWorld</AD_Client_Value>
    </AD_Client>
    <Value>GardenAdmin-7</Value>
    <Name>GardenAdmin BP</Name>
</C_BPartner>
How to determine which column(s) is/are Unique for a given table?
  • Could add a sub tab in Export Format and add all columns of the given table which make records uniquely identifiable.
  • Unfortunately the AD (Application Dictionary) does not store any information which can be used.
    • For example: C_BPartner table. In AD we have the Name column set as IsIdentifier, but the Unique columns at DB level are (AD_Client_ID + Value)
  • Could read Meta information from the DB, but in this case Oracle-specific and PostgreSQL-specific handling will be needed.
  • Export Format Line could contain an additional field: IsPartUniqueIndex. All Columns which have this flag will form the Unique Key of the Record.
  • Implemented Option 4 (the IsPartUniqueIndex flag).
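To show how the IsPartUniqueIndex flag could be used on import, here is a hedged sketch that builds the lookup query for an existing record from the flagged columns. The `FormatLine` class and `buildLookupSql` method are hypothetical, not the actual ADempiere model classes, and the sketch assumes the key column is named after the table with an `_ID` suffix.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: derive a record lookup from columns flagged IsPartUniqueIndex. */
final class UniqueKeyLookup {
    /** Hypothetical view of one Export Format Line. */
    static final class FormatLine {
        final String columnName;
        final boolean isPartUniqueIndex;

        FormatLine(String columnName, boolean isPartUniqueIndex) {
            this.columnName = columnName;
            this.isPartUniqueIndex = isPartUniqueIndex;
        }
    }

    /** Builds a parameterized query over the flagged columns only. */
    static String buildLookupSql(String tableName, List<FormatLine> lines) {
        List<String> conditions = new ArrayList<>();
        for (FormatLine line : lines) {
            if (line.isPartUniqueIndex) {
                conditions.add(line.columnName + "=?");
            }
        }
        if (conditions.isEmpty()) {
            throw new IllegalArgumentException("No unique key columns for " + tableName);
        }
        // Assumption: the key column is <TableName>_ID.
        return "SELECT " + tableName + "_ID FROM " + tableName
                + " WHERE " + String.join(" AND ", conditions);
    }
}
```

For C_BPartner with AD_Client_ID and Value flagged, and Name not flagged, this yields a query filtered on AD_Client_ID and Value, matching the DB-level unique columns mentioned above.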
How import process could FIND proper Export Format

The importer has an XML document as input. From the XML document the importer must FIND the proper Export Format, which means that all the necessary information must be kept inside the XML file.

  • 1. Upon save of the XML file, ExportModelValidator can add 2 XML attributes. Together, both attributes make the Export Format unique.
    • AD_Client_Value
    • EXP_Format_Value
  • 2. Upon save of the XML file, ExportModelValidator can add only 1 XML attribute. The root XML node name and the AD_Client_Value attribute together make the Export Format unique.
    • AD_Client_Value
  • Implemented option 2, as option 1 leads to duplication of information.
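Under option 2, the importer only needs the root element name and the AD_Client_Value attribute to select an Export Format. A minimal sketch of that extraction with the standard JDK DOM parser follows; the class and method names are hypothetical, and the actual lookup of the Export Format record is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

/** Hypothetical sketch: read the two values that identify an Export Format. */
final class FormatKeyReader {
    /** Returns {root element name, AD_Client_Value attribute}. */
    static String[] readFormatKey(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            // getAttribute returns an empty string when the attribute is missing.
            return new String[] { root.getNodeName(), root.getAttribute("AD_Client_Value") };
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable XML message", e);
        }
    }
}
```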


Order of messages

One of the biggest problems in replication is that many times you CANNOT simply send the transactions grouped by table, but you must send the transactions in the same order as they happen (to avoid referential integrity problems in the target system). So, you have to make sure you can replicate in the same order (AD_ChangeLog.Created?)
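The ordering requirement can be sketched as a sort over the collected changes. This is an illustration only: the `Change` class is hypothetical, and the extra `sequence` field is an assumption added here to break ties between changes that share the same creation timestamp, which a timestamp alone cannot order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: replay changes in the order they happened, not grouped by table. */
final class ChangeOrdering {
    /** Hypothetical change entry; 'sequence' is an assumed tie-breaker. */
    static final class Change {
        final String tableName;
        final long createdMillis;
        final long sequence;

        Change(String tableName, long createdMillis, long sequence) {
            this.tableName = tableName;
            this.createdMillis = createdMillis;
            this.sequence = sequence;
        }
    }

    /** Returns a new list sorted by creation time, then by sequence. */
    static List<Change> inReplayOrder(List<Change> changes) {
        List<Change> sorted = new ArrayList<>(changes);
        sorted.sort(Comparator
                .comparingLong((Change c) -> c.createdMillis)
                .thenComparingLong(c -> c.sequence));
        return sorted;
    }
}
```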

Questions and Answers

Q.1 What is the Role of Export Processor Type?

Export Processor Type defines the Java class which is responsible for sending the Export Message.

In the case of JMS:

The 'Local JMS' server stores received messages and transfers them to the 'Remote JMS' server when the network connection is online. If the 'Local JMS' server is down, the 'Local Adempiere' instance will not be able to work, because sending JMS messages from the 'Local Adempiere' instance to the 'Local JMS' server will fail.


Two Export Processor Types are defined in default examples:

  • org.adempiere.server.rpl.exp.HDDExportProcessor
    • Uses shared file system to store exported messages.
  • org.adempiere.server.rpl.exp.TopicExportProcessor
    • Uses JMS server to send exported messages.
    • TopicExportProcessor is a JMS client that sends messages to Local JMS Server.
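The idea of a pluggable processor class can be sketched with a small interface and a file-based implementation. The names `ExportSink` and `DirectoryExportSink` are hypothetical and are not the real interface behind HDDExportProcessor or TopicExportProcessor, whose signatures are not shown on this page. The sketch only illustrates writing each exported message to a shared directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical contract for anything that can take an exported message. */
interface ExportSink {
    void send(String messageName, String xml);
}

/** Hypothetical file-based sink: one XML file per exported message. */
final class DirectoryExportSink implements ExportSink {
    private final Path directory;

    DirectoryExportSink(Path directory) {
        this.directory = directory;
    }

    @Override
    public void send(String messageName, String xml) {
        try {
            Files.createDirectories(directory);
            Files.write(directory.resolve(messageName + ".xml"),
                    xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Surface the failure so the caller can refuse the save.
            throw new UncheckedIOException(e);
        }
    }
}
```

A JMS-based sink would implement the same interface and publish the XML to a topic instead of writing a file.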

Q.2 Should I define an Export Processor for each store?

Yes.

Each store will be defined as a new organization in Adempiere. That's why we need a new Export Processor for each Adempiere Organization/Store.


Q.3 Do I need an Import Processor?

Yes.

Each Adempiere instance must have an Import Processor defined. We need to define a new Import Processor for each Adempiere Organization/Store. This Import Processor imports messages from the 'Local JMS Server'.


Q.4 What happens when a record can't be replicated?

A record can't be replicated when the 'Local JMS Server' is not working. In this case Adempiere will show an error message to the user.

The answer to this question is the same as the answer to the question: 'What happens when my Database server stops working?'


Q.5 What happens if JMS server is down?

See answer of Question 4.


Q.6 How is a DB record marked as replicated?

It is not necessary to mark a record as exported. Marking a record as exported can be done, but it is a redundant step. Once the JMS message is sent to the 'Local JMS Server' we are guaranteed that the record will be transferred to the 'Remote JMS Server'. The JMS protocol is responsible for handling this.

Of course we can create functionality which sends a confirmation from the 'Remote Server' to the 'Local Server' when the record is saved, but this will require additional development effort.


Q.6 Is the queue constructed with records or transactions?

Export can be configured per record, per Document, or mixed. The queue is inside the JMS Server, and JMS messages are stored in that queue.


Q.7 What example transactions are provided?

- Create a Business Partner Group......... -- DONE.

- Create a Business Partner............... -- DONE.

- Create an Order......................... -- DONE.

- Create an OrderLine..................... -- DONE.

- Update the Order (complete) ............ -- DONE.

- Create an Invoice (based on the Order).. -- TO BE DONE.

- Create corresponding Invoice Lines...... -- TO BE DONE.

- Update the Invoice (complete)........... -- TO BE DONE.

Q.8 How are those events being sent to the queue?

All cases are possible; it depends on how the export format is defined.

>a) like single records?
>Insert BPGroup XML
>Insert BP XML
>Insert Order XML
>Insert OrderLines XML (one for each record?)
>Update Order XML
>Insert Invoice XML
>Insert InvoiceLines XML
>Update Invoice XML
>Update BPartner? -> maybe to update the openbalance because of the invoice completed?
>
>b) like transactions? 
>transaction inserting BPGroup 
>transaction inserting BPartner 
Export format must be defined to export only BPGroup/BPartner.

>transaction when completing the order sending the order + lines in one single XML?
Export format must be defined to export Order and all lines.

>transaction when completing the invoice sending the invoice + lines + BP in one single >XML?
Export format must be defined to export invoice + lines + BP in one single XML.


Q.9 What happens if a record fails to be replicated on slave?

For example, in the previous example, what will happen if the BPGroup can't be inserted (e.g. because of a primary key violation)?

The record will stay on the JMS Server. A notification mechanism must be created in order to notify the administrator so that appropriate action can be taken.

>This question is important because: 
>a) if the process continues then the rest of the records can fail (because of foreign key >issues) 
>b) if the process stops then it needs special attention to failures - because one failure >stop all the replication process 
>--> I suppose is better and safer a) 

At the moment dependent transactions will fail.
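The behaviour described in this answer (failed messages stay queued, processing continues, dependent messages fail as well) can be sketched as a single pass over the queue. `ImportLoop` and its `drain` method are hypothetical; the `Predicate` stands in for the real import call, and the returned list is what a notification mechanism would report to the administrator.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: one import pass that keeps failed messages on the queue. */
final class ImportLoop {
    /** Tries every queued message once; returns the ones that failed and stay queued. */
    static List<String> drain(Deque<String> queue, Predicate<String> importer) {
        List<String> failed = new ArrayList<>();
        int attempts = queue.size();
        for (int i = 0; i < attempts; i++) {
            String message = queue.removeFirst();
            if (!importer.test(message)) {
                queue.addLast(message); // keep it on the queue for a later retry
                failed.add(message);    // and collect it for the administrator
            }
        }
        return failed;
    }
}
```

Because failed messages are re-queued in the order they were tried, a later pass attempts the BPGroup before the records that depend on it.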


Q.10 How is the ID's issue on bidirectional replication resolved?

I mean if you can for example import BPGroups on master and slave at the same time, you'll have conflict with ID's.

A problem can arise when Value columns are duplicated. Replication does not transfer IDs. IDs are internal to the DB and I do not Export/Import them.

A conflict resolution process must be created.

Sorry for asking too much, I didn't find design details in wiki or requirements.txt. Obviously you're free to answer or not (I know I could simply download and review your code). These questions are trying to figure also the answer for the next one:

4 - What's the status and plan of this development? a) what's the status? alpha? beta? release-candidate? b) is Victor planning to include it in adempiere350? c) are there plans to be included in trunk before 3.4 - we're on a freeze with possible votings for new functionalities d) are there plans to document the steps needed to set up replication? e) are there plans to document the scope of replication?

Example XML documents created by Export process

  • XML Documents are created by Export process(ExportModelValidator class).
  • Examples 1, 2, 3 export the same record but in different XML format.


First Example

<C_BPartner AD_Client_Value="GardenWorld" Version="3.2.0">
    <AD_Client>
	<AD_Client_Value>GardenWorld</AD_Client_Value>
    </AD_Client>
    <AD_Org>
	<Value>HQ</Value>
	<AD_Client_Value>
	    <AD_Client_Value>GardenWorld</AD_Client_Value>
	</AD_Client_Value>
    </AD_Org>
    <Value>Test-Replication-BP-3</Value>
    <Name>Test-Replication-BP-3</Name>
    <DUNS>Duns-3     </DUNS>
    <Created>2007-08-05 21:21:37.0</Created>
    <CreatedBy>
	<Name>SuperUser</Name>
	<AD_Client_Value>
	    <AD_Client_Value>SYSTEM</AD_Client_Value>
	</AD_Client_Value>
    </CreatedBy>
    <Updated>2007-08-05 21:21:37.0</Updated>
    <UpdatedBy>
	<Name>SuperUser</Name>
	<AD_Client_Value>
	    <AD_Client_Value>SYSTEM</AD_Client_Value>
	</AD_Client_Value>
    </UpdatedBy>
</C_BPartner>


Second example

The same document but different name of XML Elements.

<?xml version="1.0" encoding="UTF-8"?>
<C_BPartner AD_Client_Value="GardenWorld" Version="3.2.0">
    <AD_Client_ID>
		<AD_Client_Value>GardenWorld</AD_Client_Value>
    </AD_Client_ID>
    <AD_Org_ID>
		<Value>0</Value>
		<AD_Client_ID>
	    		<AD_Client_Value>SYSTEM</AD_Client_Value>
		</AD_Client_ID>
    </AD_Org_ID>
    <Value>GardenAdmin-17</Value>
    <Name>GardenAdmin BP-17</Name>
    <DUNS>Duns-----17</DUNS>
    <Created>2003-03-27 15:44:25.0</Created>
    <CreatedBy>
		<Name>System</Name>
		<AD_Client_ID>
	    		<AD_Client_Value>SYSTEM</AD_Client_Value>
		</AD_Client_ID>
    </CreatedBy>
    <Updated>2007-08-06 00:30:31.0</Updated>
    <UpdatedBy>
		<Name>SuperUser</Name>
		<AD_Client_ID>
	    		<AD_Client_Value>SYSTEM</AD_Client_Value>
		</AD_Client_ID>
    </UpdatedBy>
</C_BPartner>


Third example

The same document but stores IDs instead of record unique key.

<?xml version="1.0" encoding="UTF-8"?>
<C_BPartner AD_Client_Value="GardenWorld" Version="3.2.0">
    <AD_Client_ID>
		1000000
    </AD_Client_ID>
    <AD_Org_ID>
		100
    </AD_Org_ID>
    <Value>GardenAdmin-17</Value>
    <Name>GardenAdmin BP-17</Name>
    <DUNS>Duns-----17</DUNS>
    <Created>2003-03-27 15:44:25.0</Created>
    <CreatedBy>
		101
    </CreatedBy>
    <Updated>2007-08-06 00:30:31.0</Updated>
    <UpdatedBy>
		101
    </UpdatedBy>
</C_BPartner>

Links

sf.net forum posts


External links

[PostgreSQL Info]

Screen shots

Export Format

1-ExportFormat.jpg


Export Format Line

2-ExportFormatLine.jpg


Export Format - Grid Mode

3-ExportFormat-GridMode.jpg


Example: Org_Value - Export Format window

4-ExportFormat-Org Value.jpg


Example: Org_Value - Export Format Line window

5-ExportFormatLine-GridMode-Org Value.jpg