# btc

**Repository Path**: zijiwork/btc

## Basic Information

- **Project Name**: btc
- **Description**: BTC (big table cleaner), an online cleanup tool for large MySQL tables, based on gh-ost and code from Ctrip's DBA team. A tool like this is likely to be useful to many people, so prebuilt executables are published in this repository.
- **License**: MIT
- **Default Branch**: master

## README

# Usage notes and test cases for the gh-ost-based big-table slimming tool

The big-table slimming tool is based on gh-ost and code from Ctrip's DBAs, and is intended only for slimming down large MySQL tables and running DDL against them. This document covers the tests.

## 1. Running a data cleanup, with rate comparisons

### 1.1. Usage

The binary is `go\cmd\gh-ost\btc`; it runs on Linux:

```shell
./btc \
--assume-rbr \
--initially-drop-ghost-table \
--initially-drop-socket-file \
--host="<database host>" \
--port=3306 \
--user="root" \
--password="<database password>" \
--database="<database name>" \
--table="<table name>" \
--verbose \
--alter="noop" \
--where-reserve-clause="created_at between '2021-09-09 15:55:00' and '2021-09-09 15:56:20'" \
--panic-flag-file=/tmp/ghost.panic.flag \
--chunk-size 10000 \
--dml-batch-size 50 \
--max-lag-millis 15000 \
--default-retries 1000 \
--allow-on-master \
--throttle-flag-file=/tmp/ghost.throttle.flag \
--exact-rowcount \
--ok-to-drop-table \
--replica-server-id=$RANDOM \
--execute >> ./user_details.out &
```

- `where-reserve-clause` is the WHERE condition for the rows to **keep**; it is concatenated onto ``explain select * from `peppa`.`user_details` where 1=1 and ``. You can first run `SELECT * FROM user_details WHERE created_at between '2021-09-09 15:55:00' and '2021-09-09 15:56:20'` to preview which rows it selects (see the count sketch at the end of §1.2).
- Setting `alter` to `noop` performs no DDL on the table.
- `assume-rbr` prevents the tool from running operations such as `stop slave` on a replica.

### 1.2. Rate comparison

| Total rows | Rows copied | Copy ratio (%) | Time (s) | Rate (rows/s) | WHERE clause | Result |
| ---------- | ----------- | -------------- | -------- | ------------- | ------------ | ------ |
| 2500000 | 2456632 | 98.3 | 48 | 51179 | created_at BETWEEN '2021-09-09 17:58:45' AND '2021-09-09 18:04:45' | Pass |
| 2456632 | 408421 | 16.6 | 17 | 24025 | created_at BETWEEN '2021-09-09 17:59:00' AND '2021-09-09 18:00:00' | Pass |
| 5520000 | 5383031 | 97.5 | 104 | 51760 | created_at BETWEEN '2021-09-10 11:16:00' AND '2021-09-10 11:30:00' | Pass |
| 5383031 | 783073 | 14.5 | 29 | 27003 | created_at BETWEEN '2021-09-10 11:20:00' AND '2021-09-10 11:22:00' | Pass |
| 10000000 | 8438122 | 84.4 | 164 | 51452 | created_at BETWEEN '2021-09-10 16:17:58' AND '2021-09-10 16:40:00' | Pass |
| 8438122 | 390552 | 4.6 | 28 | 13948 | created_at BETWEEN '2021-09-10 16:20:00' AND '2021-09-10 16:21:00' | Pass |
| 10823055 | 10703051 | 99.3 | 295 | 36282 | creation_date BETWEEN '2020-05-10 17:45:00' AND '2021-06-10 17:45:00' | Pass |
| 10703051 | 72074 | 1.5 | 21 | 7735 | creation_date BETWEEN '2020-05-10 10:00:00' AND '2020-05-30 20:00:00' | Pass |
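Before launching a run at either end of this spectrum, it helps to confirm how many rows a candidate reserve clause keeps, since everything outside it will be deleted. A minimal sketch using the stock `mysql` client; host, credentials, schema, and clause are the placeholders from §1.1:

```shell
# Count the rows the reserve clause would KEEP; all other rows will be dropped.
mysql -h 127.0.0.1 -P 3306 -u root -p peppa \
  -e "SELECT COUNT(*) FROM user_details WHERE created_at BETWEEN '2021-09-09 15:55:00' AND '2021-09-09 15:56:20'"
```

If the count is far from what you expect, adjust the clause before handing it to `--where-reserve-clause`.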
## 2. Test cases

### 2.1. 2.5-million-row cleanup test

Self-hosted MySQL, table `user_details`, **2.5 million** rows, inserted with the following program:

```go
package main

import (
	"database/sql"
	"fmt"
	"runtime"
	"strconv"
	"sync/atomic"
	"time"

	"gorm.io/driver/mysql"
	"gorm.io/gorm"
	"gorm.io/gorm/logger"
)

// UserDetail is the user-detail table.
type UserDetail struct {
	gorm.Model
	Id       int       `json:"id" gorm:"type:varchar(255);unique_index;not null;comment:ID"`
	Name     string    `json:"name" gorm:"type:varchar(300);not null;comment:name"`
	Age      int       `json:"age" gorm:"type:varchar(10);not null;comment:age"`
	Birthday time.Time `json:"birthday" gorm:"type:date;not null;comment:date of birth"`
	Gender   string    `json:"gender" gorm:"type:tinyint(1);not null;comment:gender"`
	Mobile   string    `json:"mobile" gorm:"type:varchar(20);not null;comment:mobile number"`
	Email    string    `json:"email" gorm:"type:varchar(255);not null;comment:email"`
	Attr1    string    `json:"attr1" gorm:"type:varchar(255);comment:custom field"`
	Attr2    string    `json:"attr2" gorm:"type:varchar(255);comment:custom field"`
	Attr3    string    `json:"attr3" gorm:"type:varchar(255);comment:custom field"`
	Attr4    string    `json:"attr4" gorm:"type:varchar(255);comment:custom field"`
	Attr5    string    `json:"attr5" gorm:"type:varchar(255);comment:custom field"`
	Attr6    string    `json:"attr6" gorm:"type:varchar(255);comment:custom field"`
	Attr7    string    `json:"attr7" gorm:"type:varchar(255);comment:custom field"`
	Attr8    string    `json:"attr8" gorm:"type:varchar(255);comment:custom field"`
	Attr9    string    `json:"attr9" gorm:"type:varchar(255);comment:custom field"`
	Attr10   string    `json:"attr10" gorm:"type:varchar(255);comment:custom field"`
	Attr11   string    `json:"attr11" gorm:"type:varchar(255);comment:custom field"`
	Attr12   string    `json:"attr12" gorm:"type:varchar(255);comment:custom field"`
	Attr13   string    `json:"attr13" gorm:"type:varchar(255);comment:custom field"`
	Attr14   string    `json:"attr14" gorm:"type:varchar(255);comment:custom field"`
	Attr15   string    `json:"attr15" gorm:"type:varchar(255);comment:custom field"`
	Attr16   string    `json:"attr16" gorm:"type:varchar(255);comment:custom field"`
	Attr17   string    `json:"attr17" gorm:"type:varchar(255);comment:custom field"`
	Attr18   string    `json:"attr18" gorm:"type:varchar(255);comment:custom field"`
	Attr19   string    `json:"attr19" gorm:"type:varchar(255);comment:custom field"`
	Attr20   string    `json:"attr20" gorm:"type:varchar(255);comment:custom field"`
	Attr21   string    `json:"attr21" gorm:"type:varchar(255);comment:custom field"`
	Attr22   string    `json:"attr22" gorm:"type:varchar(255);comment:custom field"`
	Attr23   string    `json:"attr23" gorm:"type:varchar(255);comment:custom field"`
	Attr24   string    `json:"attr24" gorm:"type:varchar(255);comment:custom field"`
	Attr25   string    `json:"attr25" gorm:"type:varchar(255);comment:custom field"`
	Attr26   string    `json:"attr26" gorm:"type:varchar(255);comment:custom field"`
	Attr27   string    `json:"attr27" gorm:"type:varchar(255);comment:custom field"`
	Attr28   string    `json:"attr28" gorm:"type:varchar(255);comment:custom field"`
	Attr29   string    `json:"attr29" gorm:"type:varchar(255);comment:custom field"`
}

var db *gorm.DB
var sqlDB *sql.DB
var err error

func init() {
	// See https://github.com/go-sql-driver/mysql#dsn-data-source-name for DSN details.
	db, err = gorm.Open(mysql.New(mysql.Config{
		DSN:                       "root:<password>@tcp(192.168.x.xx:3306)/peppa?charset=utf8&parseTime=True&loc=Local", // DSN (data source name)
		DefaultStringSize:         256,   // default length for string fields
		DisableDatetimePrecision:  true,  // disable datetime precision; not supported before MySQL 5.6
		DontSupportRenameIndex:    true,  // rename indexes by drop-and-recreate; rename is not supported before MySQL 5.7 or by MariaDB
		DontSupportRenameColumn:   true,  // rename columns with `change`; rename is not supported before MySQL 8 or by MariaDB
		SkipInitializeWithVersion: false, // auto-configure according to the current MySQL version
	}), &gorm.Config{
		Logger: logger.Default.LogMode(logger.Silent),
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	sqlDB, err = db.DB()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// SetMaxIdleConns sets the maximum number of idle connections in the pool.
	sqlDB.SetMaxIdleConns(10)
	// SetMaxOpenConns sets the maximum number of open connections to the database.
	sqlDB.SetMaxOpenConns(200)
	// SetConnMaxLifetime sets the maximum amount of time a connection may be reused.
	sqlDB.SetConnMaxLifetime(time.Hour)
}

// batchInsertUserDetail bulk-inserts rows. rowLen is the target total row count,
// coroutineNum the number of concurrent goroutines, grainSize the rows per INSERT.
func batchInsertUserDetail(rowLen, coroutineNum, grainSize int) {
	start := time.Now()
	fmt.Println("Started writing:", start.Format("2006-01-02 15:04:05.000"))
	// Progress ticker.
	go func() {
		for range time.Tick(30 * time.Second) {
			cue(start)
		}
	}()
	// Paged insert logic. The counter is shared by all goroutines, so update it atomically.
	var totalInsertNumber int64
	for i := 0; i < coroutineNum; i++ {
		go func(i int) {
			sqlExec := "INSERT INTO `peppa`.`user_details` (`id`, `created_at`, `updated_at`, `name`, `age`, `birthday`, `gender`, `mobile`, `email`, `attr1`, `attr2`, `attr3`, `attr4`, `attr5`, `attr6`, `attr7`, `attr8`, `attr9`, `attr10`, `attr11`, `attr12`, `attr13`, `attr14`, `attr15`, `attr17`, `attr19`, `attr22`, `attr23`, `attr25`, `attr26`, `attr28`) VALUES "
			for k := i; k < grainSize/coroutineNum; k = k + coroutineNum {
				sqlData := " "
				for j := k * grainSize; j < k*grainSize+grainSize; j++ {
					id := strconv.Itoa(j)
					now := time.Now().Format("2006-01-02 15:04:05")
					sqlData += "('" + id + "', '" + now + "', '" + now + "', 'test" + id + "', '" + id + "', '" + now + "', '1', '15003456789', '" + id + "@outlook.com', '1', '1', '1', '1', '1', '23', '4', '5', '6', '7', '8', '9', '0', '-', '7', '435', '5', '4', '2', '25', '67', '687')"
					if j < k*grainSize+grainSize-1 {
						sqlData += ", " // comma-separate all but the last row of the batch
					}
				}
				if _, err := sqlDB.Exec(sqlExec + sqlData); err != nil { // run the batched INSERT
					fmt.Println("error:", err)
					break
				}
				fmt.Println("Rows written:", atomic.AddInt64(&totalInsertNumber, int64(grainSize)))
			}
		}(i)
	}
	// Stop condition: keep waiting until the target row count is reached.
	for atomic.LoadInt64(&totalInsertNumber) < int64(rowLen) {
		time.Sleep(100 * time.Millisecond)
	}
	fmt.Println("Finished writing:", time.Since(start))
}

// cue prints the elapsed running time.
func cue(t time.Time) {
	fmt.Println("Elapsed time:", time.Since(t))
}

func main() {
	runtime.GOMAXPROCS(runtime.NumCPU())
	db.AutoMigrate(&UserDetail{})              // migrate the schema
	batchInsertUserDetail(15500000, 10, 10000) // generate test data
}
```

Insertion log:

```
Started writing: 2021-09-09 17:58:45.619
Rows written: 10000
...
Rows written: 130000
Elapsed time: 30.0108469s
Rows written: 140000
...
Rows written: 2500000
Finished writing: 6m16.0577954s
```

2,500,000 rows in roughly 376 s works out to a write rate of about 6,600 rows/s.

Row count check:

```sql
SELECT COUNT(*) FROM user_details;
-- 2500000
```
#### 2.1.1. Keep most rows (copy many, delete few)

Keep the **2456632** matching rows, **98.3%** of the total. For the first run we worked out a WHERE clause that selects the bulk of the rows:

```sql
SELECT COUNT(*) FROM user_details WHERE created_at BETWEEN '2021-09-09 17:58:45' AND '2021-09-09 18:04:45';
-- 2456632
```

Test directly with the packaged **btc** executable, passing the WHERE clause above via the `where-reserve-clause` parameter. The command:

```shell
./btc \
--initially-drop-ghost-table \
--initially-drop-socket-file \
--host="127.0.0.1" \
--port=3306 \
--user="root" \
--password="<database password>" \
--database="peppa" \
--table="user_details" \
--verbose \
--alter="noop" \
--where-reserve-clause="created_at BETWEEN '2021-09-09 17:58:45' AND '2021-09-09 18:04:45'" \
--panic-flag-file=/tmp/ghost.panic.flag \
--chunk-size 10000 \
--dml-batch-size 50 \
--max-lag-millis 15000 \
--default-retries 1000 \
--allow-on-master \
--throttle-flag-file=/tmp/ghost.throttle.flag \
--exact-rowcount \
--ok-to-drop-table \
--replica-server-id=$RANDOM \
--execute >> ./user_details.out &
```

Execution log:

```shell
[2021/09/10 10:07:27] [info] binlogsyncer.go:133 create BinlogSyncer with config {30568 mysql 127.0.0.1 3306 root false false false UTC true 0 0s 0s 0 false}
[2021/09/10 10:07:27] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000006, 90356324)
[2021/09/10 10:07:27] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2021/09/10 10:07:27] [info] binlogsyncer.go:723 rotate to (mysql-bin.000006, 90356324)
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 10:07:27 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 0/2472755 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000006:90358436; Lag: 0.00s, HeartbeatLag: 0.01s, State: migrating; ETA: N/A
...
Copy: 2456632/2456632 100.0%; Applied: 0; Backlog: 1/1000; Time: 47s(total), 46s(copy); streamer: mysql-bin.000006:468817980; Lag: 0.10s, HeartbeatLag: 0.10s, State: migrating; ETA: due
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 10:07:27 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 2456632/2456632 100.0%; Applied: 0; Backlog: 0/1000; Time: 47s(total), 46s(copy); streamer: mysql-bin.000006:468819700; Lag: 0.10s, HeartbeatLag: 0.02s, State: migrating; ETA: due
[2021/09/10 10:08:14] [info] binlogsyncer.go:164 syncer is closing...
[2021/09/10 10:08:15] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
[2021/09/10 10:08:15] [info] binlogsyncer.go:179 syncer is closed
# Done
```

**Conclusion**

Rechecking the table shows the expected **2456632 rows**; elapsed time **48 s**, rate **51179 rows/s**. Test passed.

**Usage note**

Once the log file in the same directory shows the run has finished, i.e. `# Done` appears, you can press Enter to dismiss the backgrounded command.
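Rather than eyeballing the log, the wait for `# Done` can be scripted. A small sketch, assuming the log path used above:

```shell
# Poll the log every few seconds until the run prints "# Done".
until grep -q '# Done' ./user_details.out; do
  sleep 5
done
echo "cleanup finished"
```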
#### 2.1.2. Keep few rows (copy little, delete much)

Keep only **408421** rows, **16.6%** of the current total:

```sql
SELECT COUNT(*) FROM user_details WHERE created_at BETWEEN '2021-09-09 17:59:00' AND '2021-09-09 18:00:00';
```

The command:

```shell
./btc \
--initially-drop-ghost-table \
--initially-drop-socket-file \
--host="127.0.0.1" \
--port=3306 \
--user="root" \
--password="<database password>" \
--database="peppa" \
--table="user_details" \
--verbose \
--alter="noop" \
--where-reserve-clause="created_at BETWEEN '2021-09-09 17:59:00' AND '2021-09-09 18:00:00'" \
--panic-flag-file=/tmp/ghost.panic.flag \
--chunk-size 10000 \
--dml-batch-size 50 \
--max-lag-millis 15000 \
--default-retries 1000 \
--allow-on-master \
--throttle-flag-file=/tmp/ghost.throttle.flag \
--exact-rowcount \
--ok-to-drop-table \
--replica-server-id=$RANDOM \
--execute >> ./user_details.out &
```

Log:

```shell
[2021/09/10 10:21:50] [info] binlogsyncer.go:133 create BinlogSyncer with config {22012 mysql 127.0.0.1 3306 root false false false UTC true 0 0s 0s 0 false}
[2021/09/10 10:21:50] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000006, 468821746)
[2021/09/10 10:21:50] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2021/09/10 10:21:50] [info] binlogsyncer.go:723 rotate to (mysql-bin.000006, 468821746)
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 10:21:50 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 0/2160679 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000006:468823858; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: N/A
...
Copy: 408421/408421 100.0%; Applied: 0; Backlog: 1/1000; Time: 15s(total), 14s(copy); streamer: mysql-bin.000006:530927655; Lag: 0.10s, HeartbeatLag: 0.00s, State: migrating; ETA: due
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 10:21:50 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 408421/408421 100.0%; Applied: 0; Backlog: 0/1000; Time: 15s(total), 14s(copy); streamer: mysql-bin.000006:530928167; Lag: 0.10s, HeartbeatLag: 0.07s, State: migrating; ETA: due
Copy: 408421/408421 100.0%; Applied: 0; Backlog: 0/1000; Time: 16s(total), 14s(copy); streamer: mysql-bin.000006:530932527; Lag: 0.01s, HeartbeatLag: 0.11s, State: migrating; ETA: due
[2021/09/10 10:22:06] [info] binlogsyncer.go:164 syncer is closing...
[2021/09/10 10:22:07] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
[2021/09/10 10:22:07] [info] binlogsyncer.go:179 syncer is closed
# Done
```

**Conclusion**

Querying confirms that only the matching **408421 rows** were kept; elapsed time **17 s**, rate **24025 rows/s**. Test passed.
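Every command in this document passes `--throttle-flag-file` and `--panic-flag-file`, so a run in progress can be paused, resumed, or aborted from another shell. This sketch relies on gh-ost's standard flag-file semantics, which are assumed unchanged in btc:

```shell
touch /tmp/ghost.throttle.flag   # pause row copying while this file exists
rm /tmp/ghost.throttle.flag      # remove it to resume copying
touch /tmp/ghost.panic.flag      # abort the migration immediately
```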
### 2.2. 5.5-million-row cleanup test

Self-hosted MySQL, table `user_details`, **5.52 million** rows (the actual inserted count differs slightly, which does not affect the test).

Insertion log:

```
Started writing: 2021-09-10 11:15:57.827
Rows written: 10000
...
Rows written: 5500000
Finished writing: 14m25.8192891s
```

Write rate: **6382 rows/s**.

#### 2.2.1. Keep most rows (copy many, delete few)

A WHERE clause that, out of **5520000** total rows, keeps **5383031** (**97.5%**):

```sql
SELECT COUNT(*) FROM user_details WHERE created_at BETWEEN '2021-09-10 11:16:00' AND '2021-09-10 11:30:00';
```

The command:

```shell
./btc \
--assume-rbr \
--initially-drop-ghost-table \
--initially-drop-socket-file \
--host="127.0.0.1" \
--port=3306 \
--user="root" \
--password="<database password>" \
--database="peppa" \
--table="user_details" \
--verbose \
--alter="noop" \
--where-reserve-clause="created_at BETWEEN '2021-09-10 11:16:00' AND '2021-09-10 11:30:00'" \
--panic-flag-file=/tmp/ghost.panic.flag \
--chunk-size 10000 \
--dml-batch-size 50 \
--max-lag-millis 15000 \
--default-retries 1000 \
--allow-on-master \
--throttle-flag-file=/tmp/ghost.throttle.flag \
--exact-rowcount \
--ok-to-drop-table \
--replica-server-id=$RANDOM \
--execute >> ./user_details.out &
```

Execution log:

```shell
[2021/09/10 13:47:00] [info] binlogsyncer.go:133 create BinlogSyncer with config {12646 mysql 127.0.0.1 3306 root false false false UTC true 0 0s 0s 0 false}
[2021/09/10 13:47:00] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000007, 373887754)
[2021/09/10 13:47:00] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2021/09/10 13:47:00] [info] binlogsyncer.go:723 rotate to (mysql-bin.000007, 373887754)
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 13:47:00 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 0/4499542 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000007:373889525; Lag: 0.00s, HeartbeatLag: 9223372036.85s, State: migrating; ETA: N/A
...
Copy: 5383031/5383031 100.0%; Applied: 0; Backlog: 0/1000; Time: 1m41s(total), 1m41s(copy); streamer: mysql-bin.000008:132252044; Lag: 0.10s, HeartbeatLag: 0.05s, State: migrating; ETA: due
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 13:47:00 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 5383031/5383031 100.0%; Applied: 0; Backlog: 0/1000; Time: 1m42s(total), 1m41s(copy); streamer: mysql-bin.000008:133752892; Lag: 0.10s, HeartbeatLag: 0.05s, State: migrating; ETA: due
[2021/09/10 13:48:44] [info] binlogsyncer.go:164 syncer is closing...
[2021/09/10 13:48:44] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
[2021/09/10 13:48:44] [info] binlogsyncer.go:179 syncer is closed
# Done
```

**Conclusion**

Copied the matching **5383031 rows**; elapsed time **104 s**, rate **51760 rows/s**. Test passed.
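The logs above show the tool serving a Unix socket (`/tmp/btc.peppa.user_details.sock`). Assuming btc keeps gh-ost's interactive-command interface, the socket can be queried or tuned mid-run, for example:

```shell
# One-off progress report from the running migration
echo status | nc -U /tmp/btc.peppa.user_details.sock

# Raise the copy chunk size on the fly
echo "chunk-size=20000" | nc -U /tmp/btc.peppa.user_details.sock
```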
#### 2.2.2. Keep few rows (copy little, delete much)

A WHERE clause that, out of **5383031** total rows, keeps **783073** (**14.5%**):

```sql
SELECT COUNT(*) FROM user_details WHERE created_at BETWEEN '2021-09-10 11:20:00' AND '2021-09-10 11:22:00';
```

The command:

```shell
./btc \
--assume-rbr \
--initially-drop-ghost-table \
--initially-drop-socket-file \
--host="127.0.0.1" \
--port=3306 \
--user="root" \
--password="<database password>" \
--database="peppa" \
--table="user_details" \
--verbose \
--alter="noop" \
--where-reserve-clause="created_at BETWEEN '2021-09-10 11:20:00' AND '2021-09-10 11:22:00'" \
--panic-flag-file=/tmp/ghost.panic.flag \
--chunk-size 10000 \
--dml-batch-size 50 \
--max-lag-millis 15000 \
--default-retries 1000 \
--allow-on-master \
--throttle-flag-file=/tmp/ghost.throttle.flag \
--exact-rowcount \
--ok-to-drop-table \
--replica-server-id=$RANDOM \
--execute >> ./user_details.out &
```

Execution log:

```shell
[2021/09/10 14:04:01] [info] binlogsyncer.go:133 create BinlogSyncer with config {23684 mysql 127.0.0.1 3306 root false false false UTC true 0 0s 0s 0 false}
[2021/09/10 14:04:01] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000008, 133758726)
[2021/09/10 14:04:01] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2021/09/10 14:04:01] [info] binlogsyncer.go:723 rotate to (mysql-bin.000008, 133758726)
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 14:04:01 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 0/5324875 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000008:133760837; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: N/A
...
Copy: 783073/783073 100.0%; Applied: 0; Backlog: 1/1000; Time: 27s(total), 26s(copy); streamer: mysql-bin.000008:255883167; Lag: 0.10s, HeartbeatLag: 0.10s, State: migrating; ETA: due
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 14:04:01 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 783073/783073 100.0%; Applied: 0; Backlog: 0/1000; Time: 27s(total), 26s(copy); streamer: mysql-bin.000008:255886323; Lag: 0.10s, HeartbeatLag: 0.06s, State: migrating; ETA: due
Copy: 783073/783073 100.0%; Applied: 0; Backlog: 0/1000; Time: 28s(total), 26s(copy); streamer: mysql-bin.000008:255888440; Lag: 0.00s, HeartbeatLag: 0.10s, State: migrating; ETA: due
[2021/09/10 14:04:30] [info] binlogsyncer.go:164 syncer is closing...
[2021/09/10 14:04:30] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
[2021/09/10 14:04:30] [info] binlogsyncer.go:179 syncer is closed
# Done
```

**Conclusion**

Copied the matching **783073 rows**; elapsed time **29 s**, rate **27003 rows/s**. Test passed.

### 2.3. 10-million-row cleanup test

Insertion log:

```
Started writing: 2021-09-10 16:17:58.006
Rows written: 10000
...
Rows written: 10000000
Elapsed time: 26m0.021172s
```

#### 2.3.1. Keep most rows (copy many, delete few)

A WHERE clause that, out of **10000000** total rows, keeps **8438122** (**84.4%**):

```sql
SELECT COUNT(*) FROM user_details WHERE created_at BETWEEN '2021-09-10 16:17:58' AND '2021-09-10 16:40:00';
```

The command:

```shell
./btc \
--assume-rbr \
--initially-drop-ghost-table \
--initially-drop-socket-file \
--host="127.0.0.1" \
--port=3306 \
--user="root" \
--password="<database password>" \
--database="peppa" \
--table="user_details" \
--verbose \
--alter="noop" \
--where-reserve-clause="created_at BETWEEN '2021-09-10 16:17:58' AND '2021-09-10 16:40:00'" \
--panic-flag-file=/tmp/ghost.panic.flag \
--chunk-size 10000 \
--dml-batch-size 50 \
--max-lag-millis 15000 \
--default-retries 1000 \
--allow-on-master \
--throttle-flag-file=/tmp/ghost.throttle.flag \
--exact-rowcount \
--ok-to-drop-table \
--replica-server-id=$RANDOM \
--execute >> ./user_details.out &
```

Execution log:

```shell
[2021/09/10 16:56:05] [info] binlogsyncer.go:133 create BinlogSyncer with config {29328 mysql 127.0.0.1 3306 root false false false UTC true 0 0s 0s 0 false}
[2021/09/10 16:56:05] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000013, 405045054)
[2021/09/10 16:56:05] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2021/09/10 16:56:05] [info] binlogsyncer.go:723 rotate to (mysql-bin.000013, 405045054)
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 16:56:05 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 0/9908527 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000013:405047166; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: N/A
...
Copy: 8438122/8438122 100.0%; Applied: 0; Backlog: 0/1000; Time: 2m42s(total), 2m42s(copy); streamer: mysql-bin.000014:640866187; Lag: 0.10s, HeartbeatLag: 0.08s, State: migrating; ETA: due
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 16:56:05 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 8438122/8438122 100.0%; Applied: 0; Backlog: 0/1000; Time: 2m43s(total), 2m42s(copy); streamer: mysql-bin.000014:640998453; Lag: 0.10s, HeartbeatLag: 0.08s, State: migrating; ETA: due
[2021/09/10 16:58:49] [info] binlogsyncer.go:164 syncer is closing...
[2021/09/10 16:58:49] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
[2021/09/10 16:58:49] [info] binlogsyncer.go:179 syncer is closed
# Done
```

**Conclusion**

Copied the matching **8438122 rows**; elapsed time **164 s**, rate **51452 rows/s**. Test passed.
#### 2.3.2. Keep few rows (copy little, delete much)

A WHERE clause that, out of **8438122** total rows, keeps **390552** (**4.6%**):

```sql
SELECT COUNT(*) FROM user_details WHERE created_at BETWEEN '2021-09-10 16:20:00' AND '2021-09-10 16:21:00';
```

The command:

```shell
./btc \
--assume-rbr \
--initially-drop-ghost-table \
--initially-drop-socket-file \
--host="127.0.0.1" \
--port=3306 \
--user="root" \
--password="<database password>" \
--database="peppa" \
--table="user_details" \
--verbose \
--alter="noop" \
--where-reserve-clause="created_at BETWEEN '2021-09-10 16:20:00' AND '2021-09-10 16:21:00'" \
--panic-flag-file=/tmp/ghost.panic.flag \
--chunk-size 10000 \
--dml-batch-size 50 \
--max-lag-millis 15000 \
--default-retries 1000 \
--allow-on-master \
--throttle-flag-file=/tmp/ghost.throttle.flag \
--exact-rowcount \
--ok-to-drop-table \
--replica-server-id=$RANDOM \
--execute >> ./user_details.out &
```

Execution log:

```shell
[2021/09/10 17:05:07] [info] binlogsyncer.go:133 create BinlogSyncer with config {25857 mysql 127.0.0.1 3306 root false false false UTC true 0 0s 0s 0 false}
[2021/09/10 17:05:07] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000014, 641000670)
[2021/09/10 17:05:07] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2021/09/10 17:05:07] [info] binlogsyncer.go:723 rotate to (mysql-bin.000014, 641000670)
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 17:05:07 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 0/7906326 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000014:641002781; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: N/A
...
Copy: 390552/390552 100.0%; Applied: 0; Backlog: 1/1000; Time: 27s(total), 26s(copy); streamer: mysql-bin.000014:701467941; Lag: 0.10s, HeartbeatLag: 0.10s, State: migrating; ETA: due
# Migrating `peppa`.`user_details`; Ghost table is `peppa`.`_user_details_gho`
# Migrating test-openvpn-node2:3306; inspecting test-openvpn-node2:3306; executing on test-openvpn-node2
# Migration started at Fri Sep 10 17:05:07 +0800 2021
# chunk-size: 10000; max-lag-millis: 15000ms; dml-batch-size: 50; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag
# throttle-additional-flag-file: /tmp/btc.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/btc.peppa.user_details.sock
Copy: 390552/390552 100.0%; Applied: 0; Backlog: 0/1000; Time: 28s(total), 26s(copy); streamer: mysql-bin.000014:701472699; Lag: 0.10s, HeartbeatLag: 0.04s, State: migrating; ETA: due
[2021/09/10 17:05:35] [info] binlogsyncer.go:164 syncer is closing...
[2021/09/10 17:05:35] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
[2021/09/10 17:05:35] [info] binlogsyncer.go:179 syncer is closed
Copy: 390552/390552 100.0%; Applied: 0; Backlog: 0/1000; Time: 28s(total), 26s(copy); streamer: mysql-bin.000014:701473693; Lag: 0.10s, HeartbeatLag: 0.11s, State: migrating; ETA: due
# Done
```

**Conclusion**

Copied the matching **390552 rows**; elapsed time **28 s**, rate **13948 rows/s**. Test passed.
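Putting the pieces together, a cleanup can be scripted end to end: pre-count the rows to keep, launch btc, wait for `# Done`, and re-verify the row count. A sketch reusing only flags and paths from the test cases above; the `DB_PASS` environment variable is an assumption, so substitute your own credential handling:

```shell
#!/bin/bash
set -euo pipefail

CLAUSE="created_at BETWEEN '2021-09-10 16:20:00' AND '2021-09-10 16:21:00'"

# 1. Count the rows the clause keeps.
mysql -h 127.0.0.1 -P 3306 -u root -p"$DB_PASS" peppa \
  -e "SELECT COUNT(*) FROM user_details WHERE $CLAUSE"

# 2. Launch the cleanup in the background, with the flags used throughout this document.
./btc --assume-rbr --initially-drop-ghost-table --initially-drop-socket-file \
  --host="127.0.0.1" --port=3306 --user="root" --password="$DB_PASS" \
  --database="peppa" --table="user_details" --alter="noop" \
  --where-reserve-clause="$CLAUSE" \
  --panic-flag-file=/tmp/ghost.panic.flag \
  --throttle-flag-file=/tmp/ghost.throttle.flag \
  --chunk-size 10000 --dml-batch-size 50 --max-lag-millis 15000 \
  --default-retries 1000 --allow-on-master --exact-rowcount \
  --ok-to-drop-table --replica-server-id=$RANDOM \
  --execute >> ./user_details.out &

# 3. Wait for completion, then re-verify the surviving row count.
until grep -q '# Done' ./user_details.out; do sleep 5; done
mysql -h 127.0.0.1 -P 3306 -u root -p"$DB_PASS" peppa \
  -e "SELECT COUNT(*) FROM user_details"
```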
## 3. References

- [The evolution of Ctrip's database release system (携程数据库发布系统演进之路)](https://www.infoq.cn/article/jp49ob2escfp5szqp9lg)
- [Dev-where branch: Add a feature to filter the data you need (#718)](https://github.com/github/gh-ost/pull/718)