How to increase the primary shard count in Elasticsearch

2026/5/4 3:50:04

Author: Kofi Bartlett, Elastic

Explore methods for increasing the number of primary shards in Elasticsearch.

Further reading:

Elasticsearch: Split index API - split a large index into more shards
Elasticsearch: shrinking an Elasticsearch index by reducing the shard count with the shrink API
Elasticsearch: Reindex API

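The primary shard count of an existing index cannot be increased in place, so raising it means recreating the index. Two approaches are commonly used for this: the _reindex API and the _split API.

The _split API is usually faster than the _reindex API. Writes to the index must be stopped before either operation; otherwise the document counts of source_index and target_index will not match.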

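Method 1 – Using the split API

The split API creates a new index with the desired number of primary shards by copying the settings and mappings of an existing index. The desired primary shard count is set when the new index is created. Check the following requirements before using the split API: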

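The source index must be read-only, which means writes to it have to be stopped.
The primary shard count of the target index must be a multiple of the source index's primary shard count. For example, if the source index has 5 primary shards, the target can be set to 10, 15, 20, and so on. The target count must also be a factor of the source index's index.number_of_routing_shards setting, so not every multiple is accepted for every index.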

Note: if the only change you need is the primary shard count, the split API is recommended because it is much faster than the Reindex API.
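
The shard-count requirement can be checked ahead of time with a small helper. This is a minimal sketch, not part of the original article: the function name valid_split_targets and the limit parameter are hypothetical, and the rule it encodes (target is a multiple of the source count and a factor of index.number_of_routing_shards, with a single-shard source splittable into any larger count) is an assumption based on the Elasticsearch split documentation that you should confirm for your version.

```python
from typing import List


def valid_split_targets(source_shards: int, routing_shards: int, limit: int = 1024) -> List[int]:
    """List target primary shard counts that a split could use.

    Assumed rule: the target must be a multiple of the source shard count
    and a factor of index.number_of_routing_shards. A single-shard source
    is treated as splittable into any larger count.
    """
    if source_shards < 1:
        raise ValueError("source_shards must be at least 1")
    targets = []
    # Walk the multiples of the source count, starting at twice the source.
    for target in range(source_shards * 2, limit + 1, source_shards):
        if source_shards == 1 or routing_shards % target == 0:
            targets.append(target)
    return targets


# routing_shards=30 is a hypothetical setting value (5 x 2 x 3).
print(valid_split_targets(5, routing_shards=30))  # prints [10, 15, 30]
```

With these inputs, 20 and 25 are multiples of 5 but are rejected because 30 is not divisible by them, which is why checking the routing shard setting before choosing a target matters.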

Implementing the split API

Create a test index:

POST test_split_source/_doc
{
  "test": "test"
}

The following command shows the settings of this index:

GET test_split_source/_settings

{
  "test_split_source": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "test_split_source",
        "creation_date": "1744934104333",
        "number_of_replicas": "1",
        "uuid": "Ixn7Y6gdTaOnuW9x9AbEjg",
        "version": {
          "created": "9009000"
        }
      }
    }
  }
}

Here number_of_shards is 1.

The source index must be read-only before it can be split:

PUT test_split_source/_settings
{
  "index.blocks.write": true
}

Run the split. Settings and mappings are copied from the source index automatically:

POST /test_split_source/_split/test_split_target
{
  "settings": {
    "index.number_of_shards": 3
  }
}

In the request above, number_of_shards is 3, an integer multiple of the original 1.

You can check progress with the following command:

GET _cat/recovery/test_split_target?v&h=index,shard,time,stage,files_percent,files_total

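Because the settings and mappings are copied from the source index, the target index is read-only as well. Re-enable writes on the target index: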

PUT test_split_target/_settings
{
    "index.blocks.write": null
}

Before deleting the original index, check docs.count for the source and target indices:

GET _cat/indices/test_split*?v&h=index,pri,rep,docs.count

An index and an alias cannot share the same name, so delete the source index first and then add its name as an alias on the target index:

DELETE test_split_source
PUT /test_split_target/_alias/test_split_source
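
Between the DELETE and the PUT, the old name resolves to nothing. As a hedged alternative, the Elasticsearch aliases API accepts a remove_index action, which should let both steps run in one atomic request. The sketch below only builds the body for POST _aliases; the helper name build_alias_swap is hypothetical and is not from the original article.

```python
import json


def build_alias_swap(source_index: str, target_index: str) -> dict:
    """Build a POST _aliases body that adds the old index name as an alias
    on the new index and removes the old index in the same request."""
    return {
        "actions": [
            {"add": {"index": target_index, "alias": source_index}},
            {"remove_index": {"index": source_index}},
        ]
    }


# Print the JSON body to paste after "POST _aliases" in the console.
print(json.dumps(build_alias_swap("test_split_source", "test_split_target"), indent=2))
```

Verify docs.count on both indices before sending this request, since remove_index deletes the source index just like DELETE does.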

After adding the test_split_source alias to the test_split_target index, test it with the following commands:

GET test_split_source
POST test_split_source/_doc
{
  "test": "test"
}

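Method 2 – Using the reindex API

With the Reindex API, the new index can have any number of primary shards. After you create the new index with the desired primary shard count, all data in the source index can be reindexed into it.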

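Beyond what the split API offers, the reindex API can transform data while copying it. An ingest pipeline (the pipeline parameter under dest) can process each document, a query under source restricts which documents are copied, and a _source list restricts which fields are copied. A painless script can modify document content, and several source indices can be merged into one target index.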
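
To make those options concrete, here is a minimal sketch that assembles a _reindex request body. The helper name build_reindex_body and the example values (other_source, my_pipeline, the script text) are hypothetical illustrations and do not appear in the original article; the body keys source.query, source._source, dest.pipeline, and script are the ones the reindex API is documented to accept.

```python
from typing import List, Optional


def build_reindex_body(
    source_indices: List[str],
    dest_index: str,
    query: Optional[dict] = None,
    fields: Optional[List[str]] = None,
    pipeline: Optional[str] = None,
    script_source: Optional[str] = None,
) -> dict:
    """Assemble a _reindex request body; optional parts are added only when given."""
    source: dict = {"index": source_indices}  # several indices merge into one target
    if query is not None:
        source["query"] = query  # copy only documents matching this query
    if fields is not None:
        source["_source"] = fields  # copy only these fields
    dest: dict = {"index": dest_index}
    if pipeline is not None:
        dest["pipeline"] = pipeline  # name of an existing ingest pipeline
    body: dict = {"source": source, "dest": dest}
    if script_source is not None:
        body["script"] = {"lang": "painless", "source": script_source}
    return body


# Hypothetical example values, for illustration only.
example = build_reindex_body(
    ["test_reindex_source", "other_source"],
    "test_reindex_target",
    query={"term": {"test": "test"}},
    fields=["test"],
    pipeline="my_pipeline",
    script_source="ctx._source.test = ctx._source.test.toUpperCase()",
)
```

Calling the helper with only the index names yields the same plain source/dest body used in the walkthrough, so the extra options can be layered on one at a time.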

Implementing the reindex API

Create a test index:

POST test_reindex_source/_doc
{
    "test": "test"
}

Retrieve the settings and mappings of the source index so they can be copied:

GET test_reindex_source

Create the target index with those settings and mappings and the desired number of shards:

PUT test_reindex_target
{
  "mappings" : {},
  "settings": {
    "number_of_shards": 10,
    "number_of_replicas": 0,
    "refresh_interval": -1
  }
}

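*Note: setting number_of_replicas: 0 and refresh_interval: -1 speeds up the reindex.

Start the reindex process. Setting requests_per_second=-1 disables throttling, and slices=auto lets Elasticsearch choose how many slices to run in parallel; both affect reindex speed.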

POST _reindex?requests_per_second=-1&slices=auto&wait_for_completion=false
{
  "source": {
    "index": "test_reindex_source"
  },
  "dest": {
    "index": "test_reindex_target"
  }
}

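Because the request sets wait_for_completion=false, the reindex API responds with a task ID (the task field of the response). Copy that ID and check progress with the _tasks API: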

GET _tasks/<task_id>
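
The _tasks response can be turned into a rough progress figure. This is a sketch under assumptions: the helper name reindex_progress is hypothetical, and it relies on the response carrying a top-level completed flag plus a task.status object whose created, updated, and deleted counters add up toward total, which is how the reindex documentation describes estimating progress.

```python
def reindex_progress(task_response: dict) -> float:
    """Estimate the fraction of reindex operations finished (0.0 to 1.0)."""
    if task_response.get("completed"):
        return 1.0
    status = task_response.get("task", {}).get("status", {})
    total = status.get("total", 0)
    if total == 0:
        # Nothing reported yet, or an empty source index.
        return 0.0
    done = status.get("created", 0) + status.get("updated", 0) + status.get("deleted", 0)
    return min(done / total, 1.0)
```

Pass it the parsed JSON of GET _tasks/<task_id>; the result is only an estimate while the task is still running.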

After the reindex completes, update the settings:

PUT test_reindex_target/_settings
{
  "number_of_replicas": 1,
  "refresh_interval": "1s"
}

Before deleting the original index, check docs.count for the source and target indices; the two values should be the same:

GET _cat/indices/test_reindex_*?v&h=index,pri,rep,docs.count
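
The comparison can also be scripted instead of read by eye. The following sketch parses the text that the _cat/indices request returns with the v flag and the index and docs.count columns; the helper names parse_doc_counts and counts_match are hypothetical, and the sketch assumes every listed index reports a numeric docs.count.

```python
from typing import Dict


def parse_doc_counts(cat_output: str) -> Dict[str, int]:
    """Map index name to docs.count from `_cat/indices?v` text output."""
    lines = [line.split() for line in cat_output.strip().splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    # Locate columns by header name so the column order does not matter.
    name_col = header.index("index")
    count_col = header.index("docs.count")
    return {row[name_col]: int(row[count_col]) for row in rows}


def counts_match(cat_output: str, source: str, target: str) -> bool:
    """Return True when source and target report the same document count."""
    counts = parse_doc_counts(cat_output)
    return counts[source] == counts[target]
```

Only delete the source index once counts_match returns True for test_reindex_source and test_reindex_target.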

An index and an alias cannot share the same name. Delete the source index and add its name as an alias on the target index:

DELETE test_reindex_source
PUT /test_reindex_target/_alias/test_reindex_source

After adding the test_reindex_source alias to the test_reindex_target index, test it with the following command:

GET test_reindex_source

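Summary

To increase the primary shard count of an existing index, you need to recreate its settings and mappings in a new index. There are two main ways to do this: the reindex API and the split API. Indexing into the source index must be stopped before using either method.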


Original article: How to increase primary shard count in Elasticsearch - Elasticsearch Labs
