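Chapter 13: Backup and Restore
This chapter covers Elasticsearch snapshot backup and restore: repository configuration, snapshot creation, restore operations, and data migration.
Last updated: 2024-01-15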
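13.1 Backup Overview
13.1.1 Backup Methods
| Method | Description | Use case |
|---|---|---|
| Snapshot API | Built into Elasticsearch | Recommended approach |
| File system backup | Copy of the data directory | Offline only; not a supported backup method, since Elasticsearch cannot reliably restore from copied data directories |
| Cloud snapshot | Cloud storage integration | Cloud deployments |
13.1.2 Backup Architecture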
┌──────────────────────────────────────────────────────────┐
│                  Elasticsearch Cluster                   │
│  ┌────────────────────────────────────────────────────┐  │
│  │                      Indices                       │  │
│  │  ┌───────┐  ┌───────┐  ┌───────┐                   │  │
│  │  │ Shard │  │ Shard │  │ Shard │                   │  │
│  │  └───────┘  └───────┘  └───────┘                   │  │
│  └────────────────────────────────────────────────────┘  │
│                         │ Snapshot                       │
└─────────────────────────┼────────────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────────────┐
│                   Snapshot Repository                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐       │
│  │ File System │  │  S3/MinIO   │  │    HDFS     │       │
│  └─────────────┘  └─────────────┘  └─────────────┘       │
└──────────────────────────────────────────────────────────┘
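13.2 Repository Configuration
13.2.1 Creating a Repository
# File system repository
# The location must lie under a directory listed in path.repo in elasticsearch.yml on every master and data node
PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/backup/elasticsearch",
    "compress": true,
    "max_restore_bytes_per_sec": "100mb",
    "max_snapshot_bytes_per_sec": "100mb",
    "chunk_size": "1gb"
  }
}
# Verify the repository
POST /_snapshot/my_backup/_verify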
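A file system repository is only accepted when its `location` lies under a directory listed in `path.repo`. The minimal Python sketch below checks that condition before the repository is registered. The function name `is_under_path_repo` and the sample paths are illustrative assumptions and not part of Elasticsearch.

```python
from pathlib import PurePosixPath
from typing import Sequence


def is_under_path_repo(location: str, path_repo: Sequence[str]) -> bool:
    """Return True if `location` equals or sits below one of the path.repo roots."""
    loc = PurePosixPath(location)
    for root in path_repo:
        root_path = PurePosixPath(root)
        # `parents` lists every ancestor directory of `loc`.
        if loc == root_path or root_path in loc.parents:
            return True
    return False


# Hypothetical path.repo values, for illustration only.
print(is_under_path_repo("/backup/elasticsearch", ["/backup"]))
print(is_under_path_repo("/backups/es", ["/backup"]))
```

The comparison works on whole path components, so `/backups/es` is not treated as a child of `/backup`.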
13.2.2 S3 Repository
# Install the S3 plugin (Elasticsearch 7.x and earlier only; from 8.0 on, repository-s3 ships with Elasticsearch)
./bin/elasticsearch-plugin install repository-s3
# Configure the S3 repository
PUT /_snapshot/s3_backup
{
  "type": "s3",
  "settings": {
    "bucket": "my-es-backups",
    "region": "us-east-1",
    "base_path": "backups",
    "compress": true,
    "storage_class": "standard"
  }
}
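13.2.3 HDFS Repository
# Install the HDFS plugin
./bin/elasticsearch-plugin install repository-hdfs
# Configure the HDFS repository
# Hadoop client options are normally passed through load_defaults and conf.* keys; check that conf_location is supported by your plugin version
PUT /_snapshot/hdfs_backup
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020",
    "path": "/user/elasticsearch/backups",
    "conf_location": "/etc/hadoop/core-site.xml",
    "compress": true
  }
}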
13.3 Snapshot Operations
13.3.1 Creating a Snapshot
# Snapshot all indices
PUT /_snapshot/my_backup/snapshot_1
# Snapshot specific indices
PUT /_snapshot/my_backup/snapshot_2
{
  "indices": ["index1", "index2"],
  "ignore_unavailable": true,
  "include_global_state": false,
  "metadata": {
    "taken_by": "admin",
    "taken_date": "2024-01-15",
    "description": "Daily backup"
  }
}
# Asynchronous creation (the default; use wait_for_completion=true to block until the snapshot finishes)
PUT /_snapshot/my_backup/snapshot_3?wait_for_completion=false
# Response
{
  "accepted": true
}
13.3.2 Viewing Snapshots
# List all snapshots
GET /_snapshot/my_backup/_all
# View a specific snapshot
GET /_snapshot/my_backup/snapshot_1
# Response
{
  "snapshots": [
    {
      "snapshot": "snapshot_1",
      "uuid": "abc123",
      "state": "SUCCESS",
      "start_time": "2024-01-15T10:00:00Z",
      "end_time": "2024-01-15T10:15:00Z",
      "duration_in_millis": 900000,
      "indices": ["products", "orders"],
      "total_shards": 10,
      "successful_shards": 10,
      "failed_shards": 0,
      "version": "8.12.0"
    }
  ]
}
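A monitoring job usually reduces a response like this to a yes/no answer. The Python sketch below reads an abridged copy of the sample response and checks the state and shard counters; the helper names `snapshot_is_healthy` and `duration_minutes` are illustrative assumptions.

```python
import json

# Abridged copy of the sample response shown in this section.
RESPONSE = json.loads("""
{
  "snapshots": [
    {
      "snapshot": "snapshot_1",
      "state": "SUCCESS",
      "duration_in_millis": 900000,
      "total_shards": 10,
      "successful_shards": 10,
      "failed_shards": 0
    }
  ]
}
""")


def snapshot_is_healthy(info: dict) -> bool:
    """True when the snapshot succeeded and every shard was captured."""
    return (
        info["state"] == "SUCCESS"
        and info["failed_shards"] == 0
        and info["successful_shards"] == info["total_shards"]
    )


def duration_minutes(info: dict) -> float:
    """Convert duration_in_millis to minutes."""
    return info["duration_in_millis"] / 60000


info = RESPONSE["snapshots"][0]
print(info["snapshot"], snapshot_is_healthy(info), duration_minutes(info))
```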
13.3.3 Snapshot Status
# List snapshots currently running in the repository
GET /_snapshot/my_backup/_current
# Detailed status
GET /_snapshot/my_backup/snapshot_1/_status
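The detailed status can be turned into a progress figure. The sketch below assumes the status response carries a `shards_stats` object with `done` and `total` counters; the sample payload is hypothetical and heavily abridged, so check the field names against a real response from your cluster.

```python
def shard_progress(status: dict) -> float:
    """Fraction of shards already snapshotted, between 0.0 and 1.0."""
    stats = status["snapshots"][0]["shards_stats"]
    total = stats["total"]
    # A snapshot without shards has nothing left to do.
    return 1.0 if total == 0 else stats["done"] / total


# Hypothetical, abridged _status payload.
sample_status = {
    "snapshots": [
        {
            "snapshot": "snapshot_1",
            "state": "STARTED",
            "shards_stats": {"done": 7, "failed": 0, "total": 10},
        }
    ]
}

print(f"{shard_progress(sample_status):.0%} of shards done")
```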
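13.4 Restore Operations
13.4.1 Basic Restore
# Restore all indices (an open index with the same name must be closed or deleted first, or renamed on restore)
POST /_snapshot/my_backup/snapshot_1/_restore
# Restore specific indices under new names
POST /_snapshot/my_backup/snapshot_2/_restore
{
  "indices": ["index1"],
  "rename_pattern": "index(.+)",
  "rename_replacement": "restored_index_$1"
}
# Check restore progress
GET /_cat/recovery?v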
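It helps to preview what `rename_pattern` and `rename_replacement` will produce before running a restore. The Python sketch below approximates the substitution, assuming it behaves like a regex replace-all with `$1`-style group references. Java and Python regex dialects differ in edge cases, so treat the result as a preview only; `preview_rename` is a hypothetical helper.

```python
import re


def preview_rename(index: str, pattern: str, replacement: str) -> str:
    """Approximate the restored index name for a rename_pattern/rename_replacement pair."""
    # Translate Java-style $1 group references into Python's \g<1> form.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return re.sub(pattern, py_replacement, index)


print(preview_rename("index1", "index(.+)", "restored_index_$1"))
print(preview_rename("orders", "index(.+)", "restored_index_$1"))
```

An index name that does not match the pattern is returned unchanged, which is why such indices are restored under their original names.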
13.4.2 Restore Options
# Restore with options (replicas and refresh are disabled here to speed up the restore; reset both settings afterwards)
POST /_snapshot/my_backup/snapshot_1/_restore
{
  "indices": ["products"],
  "index_settings": {
    "index.number_of_replicas": 0,
    "index.refresh_interval": "-1"
  },
  "ignore_index_settings": [
    "index.mapper.dynamic"
  ],
  "include_aliases": false
}
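13.4.3 Partial Restore
# Allow restoring an index whose snapshot is missing some shards; shards that were not snapshotted are recreated empty
POST /_snapshot/my_backup/snapshot_1/_restore
{
  "indices": ["products"],
  "partial": true
}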
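13.5 Data Migration
13.5.1 Cross-Cluster Restore
# Allow the repository URL on the target cluster
# repositories.url.allowed_urls is a static node setting, so it belongs in elasticsearch.yml
# on every target node and cannot be changed through the cluster settings API:
#   repositories.url.allowed_urls: ["http://backup-host/elasticsearch/*"]
# Register a read-only URL repository on the target cluster
# The URL must serve the repository files themselves (for example the contents of
# /backup/elasticsearch exposed by a web server), not the source cluster's REST API;
# backup-host is a placeholder
PUT /_snapshot/remote_backup
{
  "type": "url",
  "settings": {
    "url": "http://backup-host/elasticsearch/"
  }
}
# Restore from the remote repository
POST /_snapshot/remote_backup/snapshot_1/_restore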
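13.5.2 Migration with Reindex
# Reindex from a remote cluster
# The source host must be listed in reindex.remote.whitelist in elasticsearch.yml on the destination cluster
POST /_reindex
{
  "source": {
    "remote": {
      "host": "http://source-cluster:9200",
      "username": "elastic",
      "password": "password"
    },
    "index": "source_index",
    "size": 10000,
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "dest_index"
  }
}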
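# Reindex with a transformation (the null check keeps documents without a category from failing the script)
POST /_reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  },
  "script": {
    "source": "if (ctx._source.category != null) { ctx._source.category = ctx._source.category.toUpperCase() }",
    "lang": "painless"
  }
}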
13.5.3 Batch Reindex
# Remote reindex that renames a field in batches of 5000 documents
# Reindex from a remote cluster does not support slicing, so this request runs as a single task
POST /_reindex
{
  "source": {
    "remote": {
      "host": "http://source-cluster:9200"
    },
    "index": "large_index",
    "size": 5000
  },
  "dest": {
    "index": "new_index"
  },
  "script": {
    "source": """
      ctx._source.timestamp = ctx._source['@timestamp'];
      ctx._source.remove('@timestamp');
    """,
    "lang": "painless"
  },
  "conflicts": "proceed"
}
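To make the effect of the Painless script concrete, the Python sketch below applies the same field rename to a plain dictionary. It is only an illustration of the transformation, not code that runs inside Elasticsearch, and the sample document is made up.

```python
def rename_timestamp(source: dict) -> dict:
    """Copy the document, moving '@timestamp' to 'timestamp' as the script does."""
    doc = dict(source)
    # A missing '@timestamp' becomes None, mirroring the null the script would assign.
    doc["timestamp"] = doc.pop("@timestamp", None)
    return doc


# Hypothetical source document.
print(rename_timestamp({"@timestamp": "2024-01-15T10:00:00Z", "message": "ok"}))
```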
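13.6 Automated Backups
13.6.1 Snapshot Lifecycle Management
# Create an SLM policy
# The schedule uses Elasticsearch cron syntax, which starts with a seconds field: "0 0 2 * * ?" runs daily at 02:00
# Date math in the snapshot name must be wrapped in angle brackets
PUT /_slm/policy/daily-snapshot
{
  "schedule": "0 0 2 * * ?",
  "name": "<daily-snapshot-{now/d}>",
  "repository": "my_backup",
  "config": {
    "indices": ["*"],
    "ignore_unavailable": true,
    "include_global_state": false
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
# View SLM statistics
GET /_slm/stats
# Run the SLM policy manually
POST /_slm/policy/daily-snapshot/_execute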
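The three retention values interact: `expire_after` marks old snapshots for deletion, `min_count` keeps a floor of snapshots even when they have expired, and `max_count` caps the total even when nothing has expired. The Python sketch below is a simplified reading of those rules with hypothetical names; real SLM retention also considers which policy took a snapshot and whether it succeeded.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Sequence, Tuple


def select_for_deletion(
    snapshots: Sequence[Tuple[str, datetime]],
    now: datetime,
    expire_after: timedelta = timedelta(days=30),
    min_count: int = 5,
    max_count: int = 50,
) -> List[str]:
    """Return snapshot names to delete, oldest first, under the simplified rules."""
    ordered = sorted(snapshots, key=lambda s: s[1])  # oldest first
    remaining = len(ordered)
    doomed = []
    for name, started in ordered:
        expired = now - started > expire_after
        over_max = remaining > max_count
        # Delete when over the cap, or when expired and still above the floor.
        if over_max or (expired and remaining > min_count):
            doomed.append(name)
            remaining -= 1
    return doomed


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
snaps = [(f"s{d}", now - timedelta(days=d)) for d in (40, 39, 38, 37, 36, 35, 1)]
print(select_for_deletion(snaps, now))
```

In this example six snapshots are older than 30 days, but only the two oldest are selected, because deleting more would drop the total below `min_count`.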
13.6.2 Automated Backup Script
#!/bin/bash
# backup.sh
set -euo pipefail
ES_URL="localhost:9200"
REPO="my_backup"   # name of the registered repository, not a file system path
RETENTION_DAYS=30
# Create the snapshot and wait until it finishes
SNAPSHOT_NAME="backup-$(date +%Y%m%d-%H%M%S)"
curl -sS --fail -X PUT "${ES_URL}/_snapshot/${REPO}/${SNAPSHOT_NAME}?wait_for_completion=true" \
  -u elastic:password \
  -H 'Content-Type: application/json' \
  -d '{
    "indices": ["*"],
    "ignore_unavailable": true,
    "include_global_state": false
  }'
# Delete snapshots that finished before the cutoff date (date -d requires GNU date)
CUTOFF="$(date -d "-${RETENTION_DAYS} days" -I)"
EXPIRED=$(curl -sS --fail -u elastic:password "${ES_URL}/_snapshot/${REPO}/_all" | \
  jq -r --arg cutoff "$CUTOFF" '.snapshots[] | select(.end_time != null and .end_time < $cutoff) | .snapshot')
for snapshot in $EXPIRED; do
  echo "Deleting expired snapshot: $snapshot"
  curl -sS --fail -X DELETE "${ES_URL}/_snapshot/${REPO}/${snapshot}" \
    -u elastic:password
done
echo "Backup completed: $SNAPSHOT_NAME"
13.7 Restore Verification
13.7.1 Verifying a Restore
# Check the restored indices
GET /_cat/indices/restored_*
# Verify the document count
GET /restored_index/_count
# Spot-check a sample of the data
GET /restored_index/_search
{
  "size": 10,
  "query": {
    "match_all": {}
  }
}
13.7.2 Data Comparison
# Compare the source index with the restored index
GET /source_index/_count
GET /restored_index/_count
# Check a specific document
GET /source_index/_doc/123
GET /restored_index/_doc/123
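When many indices are restored at once, comparing counts by hand gets tedious. The Python sketch below takes two dictionaries of document counts, as you might collect from the `_count` requests, and reports the indices that differ. The function name and the sample numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple


def count_mismatches(
    source_counts: Dict[str, int], restored_counts: Dict[str, int]
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Map each differing index to (expected, actual); actual is None if it is missing."""
    mismatches = {}
    for index, expected in source_counts.items():
        actual = restored_counts.get(index)
        if actual != expected:
            mismatches[index] = (expected, actual)
    return mismatches


# Hypothetical counts for illustration.
source = {"products": 1200, "orders": 5300}
restored = {"products": 1200, "orders": 5299}
print(count_mismatches(source, restored))
```

Matching counts do not prove the documents are identical, so the spot checks on individual documents remain useful.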
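13.8 Common Problems
13.8.1 Handling Restore Failures
| Problem | Cause | Solution |
|---|---|---|
| Shards unassigned | Insufficient disk space | Free up disk space or add capacity |
| Index already exists | An index with the same name is present | Use the rename options or delete the old index |
| Repository unavailable | Network or storage access problem | Check connectivity and the repository configuration, then verify the repository again |
| Version incompatibility | Snapshot taken on a newer or unsupported Elasticsearch version | Restore into a cluster running the same or a compatible newer version |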
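13.8.2 Locked Repository
# Remove stale lock files (make sure no snapshot operation is running first)
rm -f /backup/elasticsearch/*.lock
# Unregister the damaged repository (this removes only the registration; the files in the repository stay in place)
DELETE /_snapshot/corrupted_backup
# Register the repository again (the request needs a body, as in 13.2.1)
PUT /_snapshot/backup
{
  "type": "fs",
  "settings": {
    "location": "/backup/elasticsearch"
  }
}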
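13.9 Best Practices
13.9.1 Backup Strategy
□ Daily snapshots, kept for 30 days
□ Weekly full backup, kept for 90 days
□ Replicate critical data across regions or clusters
□ Test the restore procedure regularly
□ Monitor snapshot size and retention
□ Back up configuration files and mapping definitions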
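13.9.2 Restore Plan
# Restore checklist
1. Confirm that the snapshot state is SUCCESS
2. Check the disk space on the target cluster
3. Confirm that index names do not conflict
4. Verify data integrity
5. Update aliases to point to the new indices
6. Clean up temporary restored indices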
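The first three checklist items can be automated as a preflight check. The Python sketch below assumes the caller has already fetched the snapshot info, the list of existing indices, and the disk figures; `restore_preflight` and its parameters are hypothetical names, and the byte values are made up.

```python
from typing import Iterable, List


def restore_preflight(
    snapshot: dict,
    existing_indices: Iterable[str],
    free_bytes: int,
    required_bytes: int,
) -> List[str]:
    """Return a list of problems; an empty list means the restore can proceed."""
    problems = []
    state = snapshot.get("state")
    if state != "SUCCESS":
        problems.append(f"snapshot state is {state}, expected SUCCESS")
    if free_bytes < required_bytes:
        problems.append("insufficient disk space on the target cluster")
    conflicts = sorted(set(snapshot.get("indices", [])) & set(existing_indices))
    if conflicts:
        problems.append("index name conflict: " + ", ".join(conflicts))
    return problems


snapshot_info = {"state": "SUCCESS", "indices": ["products", "orders"]}
print(restore_preflight(snapshot_info, ["orders", "logs"], 50 * 1024**3, 20 * 1024**3))
```

The remaining items (integrity checks, alias updates, cleanup) depend on the restored data and are left to the operator.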
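13.10 Summary
This chapter covered Elasticsearch backup and restore: configuring snapshot repositories, creating and restoring snapshots, and migrating data. A sound backup strategy is an essential safeguard for your data.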