Yang / porter

forked from sxfad / porter 
architecture_EN.md 3.41 KB
MurasakiSeiFu committed on 2018-08-01 21:50: Submit the admin backend module

Architecture Model

Applicable scenarios

  • Eventual consistency
  • One-way synchronization

Implementation basis

  • Messages are consumed from the MQ in sequence.
  • Messages in an MQ message group are consumed at most once.
  • Only DML is supported; DDL and DCL statements must be executed manually.
  • Each table must have a primary key and a last-update-time column.

System architecture

[Figure: architecture_design]

Node Memory model

[Figure: node-model]

TaskController 1---* TaskWorker
TaskWorker 1---* TaskWork
TaskWork 1---1 *Job
Put simply, TaskController corresponds to the Node process, and there is exactly one per process. TaskWorker corresponds to a task; each task has one Worker.
Each task has multiple pipelines, named TaskWork, and each TaskWork corresponds to an MQ topic. Each TaskWork runs multiple staged jobs.
Taken as a whole, this is a pipes-and-filters architectural pattern.
SelectJob: a single thread consumes data from the data source.
ExtractJob: a single thread reads data from the Select queue, and multiple threads extract the data.
TransformJob: a single thread reads data from the Extract in-memory collection, and multiple threads perform the mapping conversion.
LoadJob: a single thread loads data into the target database in the order in which SelectJob consumed it.
AlertJob: a single thread synchronizes the database check time point with Zookeeper and compares the differences in record counts between the source database and the target database within the specified time period.
Alarms are sent according to the alarm mode set in the configuration file.
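
The stage layout described above can be sketched in Java roughly as follows. This is a minimal, batch-style illustration under assumptions: the class `OrderedPipelineSketch`, the helpers `stage` and `run`, and the trim and upper-case steps standing in for extraction and transformation are all hypothetical and are not porter's actual classes. It shows one way a single dispatching thread can fan work out to a thread pool while the results stay in the order in which they were selected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch; names and steps are illustrative, not porter's real classes. */
final class OrderedPipelineSketch {

    /**
     * One pipeline stage: the calling thread reads items in order and hands each
     * one to the pool. Futures are drained in submission order, so the output
     * keeps the input order even though the work itself runs in parallel.
     */
    static <T, R> List<R> stage(List<T> input, Function<T, R> work, ExecutorService pool) {
        List<Future<R>> pending = new ArrayList<>();
        for (T item : input) {
            Callable<R> task = () -> work.apply(item);
            pending.add(pool.submit(task));
        }
        List<R> output = new ArrayList<>();
        for (Future<R> future : pending) {
            try {
                output.add(future.get()); // blocks until this item is finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        return output;
    }

    /** Select -> Extract -> Transform -> Load over an in-memory batch. */
    static List<String> run(List<String> selected, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Extract and Transform: parallel work, ordered output.
            List<String> extracted = stage(selected, String::trim, pool);
            List<String> transformed = stage(extracted, String::toUpperCase, pool);
            // Load: a single thread writes rows in the original selection order.
            List<String> target = new ArrayList<>();
            for (String row : transformed) {
                target.add(row);
            }
            return target;
        } finally {
            pool.shutdown();
        }
    }
}
```

A real node would stream between stages through queues rather than pass whole batches, but the ordering idea is the same: parallelize the work inside a stage and serialize the hand-off between stages.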

Problems & Solutions (phenomenon description and compensation)

  • MQ distribution problem
    • Different source databases publish to different MQ clusters.
    • Messages are stored by a broker cluster; each topic is allowed only a single partition.
    • Each table corresponds to its own topic.
    • The cluster of message consumption nodes consumes the topics.
    • A message consumption node only consumes messages and carries no complex business logic.
    • The progress of message production is controlled manually.
  • Data consistency problem
    • Eventual consistency
    • Message loss
      • The target record is inconsistent with the source record.
        • The last insert or update message is lost.
          • The target database lags behind the source database.
        • The last delete message is lost.
          • The target database keeps redundant dirty data.
      • The message's condition does not match the target database.
        • The insert message is missing; only update and delete messages arrive.
          • For an update, insert the row using the latest values.
          • For a delete, no action is needed.
        • The primary key changed and the condition matches.
          • Update to the latest values based on the primary key.
        • The primary key changed and the condition does not match.
          • The target database inserts the row under the new primary key; the row under the old primary key remains as redundant dirty data.
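
The mismatch cases listed above can be illustrated with a small Java sketch. It is only an assumption-laden model: the target table is an in-memory `Map` keyed by primary key, and `TargetApplySketch`, `Change`, and `Op` are hypothetical names, not porter's types. An update whose row is absent falls back to an insert of the latest value, a delete whose row is absent does nothing, and a primary key change moves the row when the old key matches.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of applying change messages to a target table. */
final class TargetApplySketch {

    enum Op { INSERT, UPDATE, DELETE }

    /** A change message; oldKey differs from newKey when the primary key changed. */
    static final class Change {
        final Op op;
        final String oldKey;
        final String newKey;
        final String value;

        Change(Op op, String oldKey, String newKey, String value) {
            this.op = op;
            this.oldKey = oldKey;
            this.newKey = newKey;
            this.value = value;
        }
    }

    /** Applies one message to the target, tolerating rows that are missing. */
    static void apply(Map<String, String> target, Change change) {
        switch (change.op) {
            case INSERT:
                target.put(change.newKey, change.value);
                break;
            case UPDATE:
                // If the old key matches, the row is replaced (and moved when the
                // key changed). If nothing matches, this degrades to inserting
                // the latest value under the new key.
                target.remove(change.oldKey);
                target.put(change.newKey, change.value);
                break;
            case DELETE:
                // Removing an absent row is a no-op, so no extra action is needed.
                target.remove(change.oldKey);
                break;
            default:
                throw new IllegalArgumentException("unknown op: " + change.op);
        }
    }

    /** Convenience factory for an empty target table. */
    static Map<String, String> emptyTarget() {
        return new HashMap<>();
    }
}
```

This sketch cannot repair the last case in the list: when the old key does not match, any stale row stored under some other key stays in the target until a later comparison finds it.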
    • Message loss solution
      • Select data records that have not changed recently.
        • Because the data set changes over time, tasks cannot be segmented by time, and the results of past task runs cannot be accumulated.
      • Select data records that changed within a specified time range.
        • Record the initial synchronization time.
        • Using a reasonable time span, compare the target and source database records for a specified time span before the current time.
        • Accumulate the time range for which data comparison has completed.
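
The time-window approach above might look like the following Java sketch. Everything named here is an assumption for illustration: `WindowCheckSketch`, `Window`, the `lag` parameter (how far behind the current time the newest window must end), and the two counting functions that stand in for queries against the source and target databases on the last-update-time column.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

/** Hypothetical sketch of windowed source/target comparison. */
final class WindowCheckSketch {

    /** Half-open window [from, to) over the last-update-time column. */
    static final class Window {
        final Instant from;
        final Instant to;

        Window(Instant from, Instant to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Splits [checkedUpTo, now - lag) into whole windows of the given span. */
    static List<Window> pendingWindows(Instant checkedUpTo, Instant now,
                                       Duration lag, Duration span) {
        List<Window> windows = new ArrayList<>();
        Instant limit = now.minus(lag);
        Instant from = checkedUpTo;
        while (!from.plus(span).isAfter(limit)) {
            Instant to = from.plus(span);
            windows.add(new Window(from, to));
            from = to;
        }
        return windows;
    }

    /**
     * Compares record counts per window, collects mismatching windows, and
     * returns the new "checked up to" time so the next run resumes from there.
     */
    static Instant check(Instant checkedUpTo, Instant now, Duration lag, Duration span,
                         ToLongFunction<Window> sourceCount,
                         ToLongFunction<Window> targetCount,
                         List<Window> mismatches) {
        Instant done = checkedUpTo;
        for (Window window : pendingWindows(checkedUpTo, now, lag, span)) {
            if (sourceCount.applyAsLong(window) != targetCount.applyAsLong(window)) {
                mismatches.add(window);
            }
            done = window.to; // accumulate the completed comparison time
        }
        return done;
    }
}
```

On the first run, `checkedUpTo` would be the recorded initial synchronization time; each later run passes in the value returned by the previous one, so completed windows are not compared again.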