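Note: this article draws on the following authors, books, and websites:
1. Reference book: 《PostgreSQL数据库内核分析》 (PostgreSQL Database Kernel Analysis)
2. Reference book: 《数据库事务处理的艺术:事务管理与并发控制》 (The Art of Database Transaction Processing: Transaction Management and Concurrency Control)
3. The PostgreSQL source repository
4. The website of Hironobu Suzuki, a well-known Japanese PostgreSQL expert
5. Reference book: 《PostgreSQL指南:内幕探索》 (The Internals of PostgreSQL)
6. Reference book: 《事务处理 概念与技术》 (Transaction Processing: Concepts and Techniques)
7. The official pgtune git repository
8. The official pgtune online tool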
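1. Everything in this article comes from the open-source community, GitHub, and the contributions of the authors listed above. The article itself is also free and open (it may contain mistakes, and corrections in the comments are welcome).
2. Purpose of this article: open sharing, starting a discussion, and learning together.
3. This article provides no resources, involves no transactions, and is not affiliated with any organization or institution.
4. Feel free to copy and paste as needed or use this for other personal purposes, but reposting and commercial use are not permitted (writing takes effort, thank you for understanding 💖).
5. The content of this article is based on the PostgreSQL master source code.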
How the online tuning tool pgtune works: implementation and source code walkthrough
- Quick overview
- Background and usage
- Source code walkthrough
 
 
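Quick overview
Learning goals:
After working on database kernel development for a while, it is easy to fall into the illusion of early success and youthful arrogance. Looking more closely, though, I am not especially good and am still far from where I want to be. Perhaps a polished surface hides a hollow core; whenever I think of this, I feel as if I were standing at the edge of an abyss at midnight, treading on thin ice. To sleep more soundly, starting today I am pausing other PostgreSQL compatibility feature development and will spend the near future studying and sharing PostgreSQL fundamentals and internals.
Learning content: (see the table of contents)
1. How the online tuning tool pgtune works, with a source code walkthrough
Study time:
November 24, 2024, 13:46:59
Learning output:
1. One review of PostgreSQL fundamentals
2. One CSDN technical blog post
3. Deeper study of the PostgreSQL kernel
Note: all of the work below uses CentOS 8 + PostgreSQL master + Oracle 19c + MySQL 8.0.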
postgres=# select version();
                                                  version                                                   
------------------------------------------------------------------------------------------------------------
 PostgreSQL 18devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-21), 64-bit
(1 row)
postgres=#
#-----------------------------------------------------------------------------#
SQL> select * from v$version;          
BANNER        Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production	
BANNER_FULL	  Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production Version 19.17.0.0.0	
BANNER_LEGACY Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production	
CON_ID 0
#-----------------------------------------------------------------------------#
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.27    |
+-----------+
1 row in set (0.06 sec)
mysql>
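Background and usage

Purpose: tune the PostgreSQL configuration to the hardware.
Advantages:
- Nothing to download or install
- Also works offline
- Works as a mobile app
- Open source
Trying it out, here is a recommended configuration (for a typical TPC-C workload):

Usage is straightforward, so it is not covered further here; the rest of this article looks at the implementation. If you run into problems while using it, you can open an issue in the official repository.
Note 1: because pgtune produces a general-purpose configuration, it covers only a small set of parameters (listed below). Other related parameters are left for you to analyze for your own deployment: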
    const configData = [
      ['max_connections', maxConnectionsVal],
      ['shared_buffers', formatValue(sharedBuffersVal)],
      ['effective_cache_size', formatValue(effectiveCacheSizeVal)],
      ['maintenance_work_mem', formatValue(maintenanceWorkMemVal)],
      ['checkpoint_completion_target', checkpointCompletionTargetVal],
      ['wal_buffers', formatValue(walBuffersVal)],
      ['default_statistics_target', defaultStatisticsTargetVal],
      ['random_page_cost', randomPageCostVal],
      ['effective_io_concurrency', effectiveIoConcurrencyVal],
      ['work_mem', formatValue(workMemVal)],
      ['huge_pages', hugePagesVal]
    ]
Note 2: these values are not a definitive answer. Treat them as reference suggestions and adjust them to your actual situation.
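The `formatValue` helper used in the list above is not shown in this article. The following is only a minimal sketch of what such a formatter could look like, assuming sizes are held in kB and a larger unit is used only when it divides the value evenly. The name `formatKbValue` and the constants are hypothetical and are not taken from the pgtune source.

```javascript
// Hypothetical helper (not pgtune's actual code): render a size held in kB
// using the largest unit that divides it without a remainder.
const KB_PER_MB = 1024
const KB_PER_GB = 1024 * 1024

function formatKbValue(valueKb) {
  if (valueKb % KB_PER_GB === 0) return `${valueKb / KB_PER_GB}GB`
  if (valueKb % KB_PER_MB === 0) return `${valueKb / KB_PER_MB}MB`
  return `${valueKb}kB`
}

console.log(formatKbValue(4 * KB_PER_GB)) // 4GB
console.log(formatKbValue(1536 * 1024))   // 1536MB
console.log(formatKbValue(6553))          // 6553kB
```

Falling back to a smaller unit keeps the displayed value exact, which matters because PostgreSQL size settings only accept integers.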
The available application scenarios are:
const dbTypeOptions = () => [
  {
    label: 'Web application',
    value: DB_TYPE_WEB
  },
  {
    label: 'Online transaction processing system',
    value: DB_TYPE_OLTP
  },
  {
    label: 'Data warehouse',
    value: DB_TYPE_DW
  },
  {
    label: 'Desktop application',
    value: DB_TYPE_DESKTOP
  },
  {
    label: 'Mixed type of application',
    value: DB_TYPE_MIXED
  }
]
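So make sure you understand your own workload well.
Source code walkthrough
Below we go through the parameters in order and explain how pgtune derives each value.
1. max_connections: if a value is supplied, it is used as-is; otherwise the following defaults apply:
Reference: http://postgres.cn/docs/current/runtime-config-connection.html#GUC-MAX-CONNECTIONS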
export const selectMaxConnections = createSelector(
  [selectConnectionNum, selectDBType],
  (connectionNum, dbType) =>
    connectionNum
      ? connectionNum // a user-supplied value takes precedence over the defaults below
      : {
          [DB_TYPE_WEB]: 200,
          [DB_TYPE_OLTP]: 300,
          [DB_TYPE_DW]: 40,
          [DB_TYPE_DESKTOP]: 20,
          [DB_TYPE_MIXED]: 100
        }[dbType]
)
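As a small illustration of the selector above, the sketch below reproduces the same lookup as a plain function. The string keys (`'web'`, `'oltp'`, and so on) and the function name are assumptions made for the example; the actual values of the `DB_TYPE_*` constants are not shown in this article.

```javascript
// Assumed scenario keys; the defaults are the ones listed in the selector above.
const DEFAULT_MAX_CONNECTIONS = { web: 200, oltp: 300, dw: 40, desktop: 20, mixed: 100 }

function pickMaxConnections(connectionNum, dbType) {
  // Any falsy input (undefined, null, 0) falls back to the per-scenario default.
  return connectionNum ? connectionNum : DEFAULT_MAX_CONNECTIONS[dbType]
}

console.log(pickMaxConnections(500, 'web'))      // 500
console.log(pickMaxConnections(undefined, 'dw')) // 40
console.log(pickMaxConnections(0, 'oltp'))       // 300
```

Note that because the check is a truthiness test, an explicit 0 is treated the same as "not set".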
2. shared_buffers
Reference: http://postgres.cn/docs/current/runtime-config-resource.html#GUC-SHARED-BUFFERS
export const selectSharedBuffers = createSelector(
  [selectTotalMemoryInKb, selectDBType, selectOSType, selectDBVersion],
  (totalMemoryKb, dbType, osType, dbVersion) => {
    let sharedBuffersValue = {
      [DB_TYPE_WEB]: Math.floor(totalMemoryKb / 4),
      [DB_TYPE_OLTP]: Math.floor(totalMemoryKb / 4),
      [DB_TYPE_DW]: Math.floor(totalMemoryKb / 4),
      [DB_TYPE_DESKTOP]: Math.floor(totalMemoryKb / 16),
      [DB_TYPE_MIXED]: Math.floor(totalMemoryKb / 4)
    }[dbType]
    if (dbVersion < 10 && OS_WINDOWS === osType) {
      // Limit shared_buffers to 512MB on Windows
      const winMemoryLimit = (512 * SIZE_UNIT_MAP['MB']) / SIZE_UNIT_MAP['KB']
      if (sharedBuffersValue > winMemoryLimit) {
        sharedBuffersValue = winMemoryLimit
      }
    }
    return sharedBuffersValue
  }
)
- This parameter is usually set to 1/4 of totalMemoryKb, or 1/16 for desktop applications
- On Windows with PostgreSQL versions below 10, it is capped at 512MB
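To make the two rules above concrete, here is a worked sketch for a 16GB machine. It assumes all sizes are in kB; the function name and the `'desktop'` / `'windows'` strings are illustrative placeholders rather than pgtune's real constants.

```javascript
const KB_PER_MB = 1024
const KB_PER_GB = 1024 * 1024

// Illustrative re-implementation of the rule described above (sizes in kB).
function suggestSharedBuffersKb(totalMemoryKb, dbType, osType, dbVersion) {
  const divisor = dbType === 'desktop' ? 16 : 4
  let valueKb = Math.floor(totalMemoryKb / divisor)
  // Before PostgreSQL 10 on Windows, the value is capped at 512MB.
  const winLimitKb = 512 * KB_PER_MB
  if (dbVersion < 10 && osType === 'windows' && valueKb > winLimitKb) {
    valueKb = winLimitKb
  }
  return valueKb
}

console.log(suggestSharedBuffersKb(16 * KB_PER_GB, 'oltp', 'linux', 16))    // 4194304 (4GB)
console.log(suggestSharedBuffersKb(16 * KB_PER_GB, 'desktop', 'linux', 16)) // 1048576 (1GB)
console.log(suggestSharedBuffersKb(16 * KB_PER_GB, 'oltp', 'windows', 9.6)) // 524288 (512MB)
```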
3. effective_cache_size: usually set to 3/4 of totalMemoryKb (1/4 for desktop applications)
Reference: http://postgres.cn/docs/current/runtime-config-query.html#GUC-EFFECTIVE-CACHE-SIZE
export const selectEffectiveCacheSize = createSelector(
  [selectTotalMemoryInKb, selectDBType],
  (totalMemoryKb, dbType) =>
    ({
      [DB_TYPE_WEB]: Math.floor((totalMemoryKb * 3) / 4),
      [DB_TYPE_OLTP]: Math.floor((totalMemoryKb * 3) / 4),
      [DB_TYPE_DW]: Math.floor((totalMemoryKb * 3) / 4),
      [DB_TYPE_DESKTOP]: Math.floor(totalMemoryKb / 4),
      [DB_TYPE_MIXED]: Math.floor((totalMemoryKb * 3) / 4)
    })[dbType]
)
4. maintenance_work_mem
Reference: http://postgres.cn/docs/current/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM
export const selectMaintenanceWorkMem = createSelector(
  [selectTotalMemoryInKb, selectDBType, selectOSType],
  (totalMemoryKb, dbType, osType) => {
    let maintenanceWorkMemValue = {
      [DB_TYPE_WEB]: Math.floor(totalMemoryKb / 16),
      [DB_TYPE_OLTP]: Math.floor(totalMemoryKb / 16),
      [DB_TYPE_DW]: Math.floor(totalMemoryKb / 8),
      [DB_TYPE_DESKTOP]: Math.floor(totalMemoryKb / 16),
      [DB_TYPE_MIXED]: Math.floor(totalMemoryKb / 16)
    }[dbType]
    // Cap maintenance RAM at 2GB on servers with lots of memory
    const memoryLimit = (2 * SIZE_UNIT_MAP['GB']) / SIZE_UNIT_MAP['KB']
    if (maintenanceWorkMemValue >= memoryLimit) {
      if (OS_WINDOWS === osType) {
        // 2048MB (2 GB) will raise error at Windows, so we need remove 1 MB from it
        maintenanceWorkMemValue = memoryLimit - (1 * SIZE_UNIT_MAP['MB']) / SIZE_UNIT_MAP['KB']
      } else {
        maintenanceWorkMemValue = memoryLimit
      }
    }
    return maintenanceWorkMemValue
  }
)
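- This parameter is usually 1/16 of totalMemoryKb, or 1/8 for data warehouses
- If the result reaches 2GB, it is capped at 2GB (2GB minus 1MB on Windows, where 2048MB raises an error)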
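The cap is easiest to see with numbers. The sketch below assumes sizes in kB and uses placeholder strings (`'dw'`, `'windows'`) and a hypothetical function name; it mirrors the logic of the selector above but is not the pgtune code itself.

```javascript
const KB_PER_MB = 1024
const KB_PER_GB = 1024 * 1024

// Illustrative version of the maintenance_work_mem rule (sizes in kB).
function suggestMaintenanceWorkMemKb(totalMemoryKb, dbType, osType) {
  const divisor = dbType === 'dw' ? 8 : 16
  let valueKb = Math.floor(totalMemoryKb / divisor)
  const limitKb = 2 * KB_PER_GB
  if (valueKb >= limitKb) {
    // On Windows the cap is 1MB lower, i.e. 2047MB.
    valueKb = osType === 'windows' ? limitKb - KB_PER_MB : limitKb
  }
  return valueKb
}

console.log(suggestMaintenanceWorkMemKb(8 * KB_PER_GB, 'web', 'linux'))    // 524288 (512MB)
console.log(suggestMaintenanceWorkMemKb(64 * KB_PER_GB, 'dw', 'linux'))    // 2097152 (2GB)
console.log(suggestMaintenanceWorkMemKb(64 * KB_PER_GB, 'dw', 'windows'))  // 2096128 (2047MB)
```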
5. checkpoint_completion_target: set to 0.9
Reference: http://postgres.cn/docs/current/runtime-config-wal.html#GUC-CHECKPOINT-COMPLETION-TARGET
export const selectCheckpointCompletionTarget = createSelector(
  [],
  () => 0.9 // based on https://github.com/postgres/postgres/commit/bbcc4eb2
)
6. wal_buffers
Reference: http://postgres.cn/docs/current/runtime-config-wal.html#GUC-WAL-BUFFERS
export const selectWalBuffers = createSelector([selectSharedBuffers], (sharedBuffersValue) => {
  // Follow auto-tuning guideline for wal_buffers added in 9.1, where it's
  // set to 3% of shared_buffers up to a maximum of 16MB.
  let walBuffersValue = Math.floor((3 * sharedBuffersValue) / 100)
  const maxWalBuffer = (16 * SIZE_UNIT_MAP['MB']) / SIZE_UNIT_MAP['KB']
  if (walBuffersValue > maxWalBuffer) {
    walBuffersValue = maxWalBuffer
  }
  // It's nice if wal_buffers is an even 16MB when it's near that number. Since
  // that is a common case on Windows, where shared_buffers is clipped to 512MB,
  // round upwards in that situation
  const walBufferNearValue = (14 * SIZE_UNIT_MAP['MB']) / SIZE_UNIT_MAP['KB']
  if (walBuffersValue > walBufferNearValue && walBuffersValue < maxWalBuffer) {
    walBuffersValue = maxWalBuffer
  }
  // if less than 32 kB, set it to the minimum
  if (walBuffersValue < 32) {
    walBuffersValue = 32
  }
  return walBuffersValue
})
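- Follows the wal_buffers auto-tuning guideline introduced in 9.1: 3% of shared_buffers, up to a maximum of 16MB.
- If the result is close to 16MB (above 14MB), it is rounded up to exactly 16MB. This is a common case on Windows, where shared_buffers is clipped to 512MB.
- If the result is below 32kB, it is raised to that minimum.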
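The three rules above interact, so a few sample inputs help. This is a sketch under the assumption that both input and output are in kB; the function name is hypothetical.

```javascript
const KB_PER_MB = 1024

// Illustrative version of the wal_buffers rule (sizes in kB).
function suggestWalBuffersKb(sharedBuffersKb) {
  const maxKb = 16 * KB_PER_MB   // hard ceiling: 16MB
  const nearKb = 14 * KB_PER_MB  // anything above 14MB is rounded up to 16MB
  let valueKb = Math.floor((3 * sharedBuffersKb) / 100)
  if (valueKb > maxKb) valueKb = maxKb
  if (valueKb > nearKb && valueKb < maxKb) valueKb = maxKb
  if (valueKb < 32) valueKb = 32 // floor: 32kB
  return valueKb
}

console.log(suggestWalBuffersKb(4 * 1024 * 1024)) // 16384: 3% of 4GB exceeds the 16MB ceiling
console.log(suggestWalBuffersKb(512 * 1024))      // 16384: 3% of 512MB is 15728kB, rounded up
console.log(suggestWalBuffersKb(128 * 1024))      // 3932: plain 3% of 128MB
console.log(suggestWalBuffersKb(1000))            // 32: raised to the minimum
```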
7. default_statistics_target: uses the following per-scenario defaults
Reference: http://postgres.cn/docs/current/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET
export const selectDefaultStatisticsTarget = createSelector(
  [selectDBType],
  (dbType) =>
    ({
      [DB_TYPE_WEB]: 100,
      [DB_TYPE_OLTP]: 100,
      [DB_TYPE_DW]: 500,
      [DB_TYPE_DESKTOP]: 100,
      [DB_TYPE_MIXED]: 100
    })[dbType]
)
8. random_page_cost: chosen by storage type as follows
Reference: http://postgres.cn/docs/current/runtime-config-query.html#GUC-RANDOM-PAGE-COST
export const selectRandomPageCost = createSelector([selectHDType], (hdType) => {
  return {
    [HARD_DRIVE_HDD]: 4,
    [HARD_DRIVE_SSD]: 1.1,
    [HARD_DRIVE_SAN]: 1.1
  }[hdType]
})
9. effective_io_concurrency: set only on Linux, with the following defaults by storage type (on other operating systems the selector returns null)
Reference: http://postgres.cn/docs/current/runtime-config-resource.html#GUC-EFFECTIVE-IO-CONCURRENCY
export const selectEffectiveIoConcurrency = createSelector(
  [selectOSType, selectHDType],
  (osType, hdType) => {
    if (osType !== OS_LINUX) {
      return null
    }
    return {
      [HARD_DRIVE_HDD]: 2,
      [HARD_DRIVE_SSD]: 200,
      [HARD_DRIVE_SAN]: 300
    }[hdType]
  }
)
10. Parallel settings
Before analyzing work_mem, let us first look at the parallel settings:
// default settings
const DEFAULT_DB_SETTINGS = {
  default: {
    ['max_worker_processes']: 8,
    ['max_parallel_workers_per_gather']: 2,
    ['max_parallel_workers']: 8
  }
}
As shown below, if cpuNum is not specified or is less than 4, no parallel settings are emitted and the defaults above apply:
export const selectParallelSettings = createSelector(
  [selectDBVersion, selectDBType, selectCPUNum],
  (dbVersion, dbType, cpuNum) => {
    if (!cpuNum || cpuNum < 4) {
      return []
    }
    let workersPerGather = Math.ceil(cpuNum / 2)
    if (dbType !== DB_TYPE_DW && workersPerGather > 4) {
      // no clear evidence that each new worker will provide a big benefit for each new core
      workersPerGather = 4
    }
    let config = [
      {
        key: 'max_worker_processes',
        value: cpuNum
      },
      {
        key: 'max_parallel_workers_per_gather',
        value: workersPerGather
      }
    ]
    if (dbVersion >= 10) {
      config.push({
        key: 'max_parallel_workers',
        value: cpuNum
      })
    }
    if (dbVersion >= 11) {
      let parallelMaintenanceWorkers = Math.ceil(cpuNum / 2)
      if (parallelMaintenanceWorkers > 4) {
        parallelMaintenanceWorkers = 4 // no clear evidence that each new worker will provide a big benefit for each new core
      }
      config.push({
        key: 'max_parallel_maintenance_workers',
        value: parallelMaintenanceWorkers
      })
    }
    return config
  }
)
- max_worker_processes is set to cpuNum
- max_parallel_workers_per_gather: for data warehouses, workersPerGather = ceil(cpuNum / 2); for all other scenarios, the same value capped at 4
- On PostgreSQL 10 and above, max_parallel_workers is set to cpuNum
- On PostgreSQL 11 and above, max_parallel_maintenance_workers is ceil(cpuNum / 2), capped at 4
References:
- http://postgres.cn/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
- http://postgres.cn/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-WORKERS-PER-GATHER
- http://postgres.cn/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-WORKERS
- http://postgres.cn/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS
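The parallel rules summarized above can be condensed into one small function. This sketch returns a plain object instead of the key/value array used by pgtune, and the `'dw'` string and function name are assumptions for the example.

```javascript
// Illustrative version of the parallel settings rules.
function suggestParallelSettings(cpuNum, dbType, dbVersion) {
  // Too few (or unknown) CPUs: emit nothing, so the database defaults apply.
  if (!cpuNum || cpuNum < 4) return {}
  const half = Math.ceil(cpuNum / 2)
  const settings = {
    max_worker_processes: cpuNum,
    // Only data warehouses are allowed more than 4 workers per gather.
    max_parallel_workers_per_gather: dbType === 'dw' ? half : Math.min(half, 4)
  }
  if (dbVersion >= 10) settings.max_parallel_workers = cpuNum
  if (dbVersion >= 11) settings.max_parallel_maintenance_workers = Math.min(half, 4)
  return settings
}

console.log(suggestParallelSettings(16, 'oltp', 16))
// { max_worker_processes: 16, max_parallel_workers_per_gather: 4,
//   max_parallel_workers: 16, max_parallel_maintenance_workers: 4 }
console.log(suggestParallelSettings(16, 'dw', 16).max_parallel_workers_per_gather) // 8
console.log(suggestParallelSettings(2, 'oltp', 16)) // {}
```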
11. work_mem
Reference: http://postgres.cn/docs/current/runtime-config-resource.html#GUC-WORK-MEM
export const selectWorkMem = createSelector(
  [
    selectTotalMemoryInKb,
    selectSharedBuffers,
    selectMaxConnections,
    selectParallelSettings,
    selectDbDefaultValues,
    selectDBType
  ],
  (
    totalMemoryKb,
    sharedBuffersValue,
    maxConnectionsValue,
    parallelSettingsValue,
    dbDefaultValues,
    dbType
  ) => {
    const parallelForWorkMem = (() => {
      if (parallelSettingsValue.length) {
        const maxParallelWorkersPerGather = parallelSettingsValue.find(
          (param) => param['key'] === 'max_parallel_workers_per_gather'
        )
        if (
          maxParallelWorkersPerGather &&
          maxParallelWorkersPerGather['value'] &&
          maxParallelWorkersPerGather['value'] > 0
        ) {
          return maxParallelWorkersPerGather['value']
        }
      }
      if (
        dbDefaultValues['max_parallel_workers_per_gather'] &&
        dbDefaultValues['max_parallel_workers_per_gather'] > 0
      ) {
        return dbDefaultValues['max_parallel_workers_per_gather']
      }
      return 1
    })()
    // work_mem is assigned any time a query calls for a sort, or a hash, or any other structure that needs a space allocation, which can happen multiple times per query. So you're better off assuming max_connections * 2 or max_connections * 3 is the amount of RAM that will actually use in reality. At the very least, you need to subtract shared_buffers from the amount you're distributing to connections in work_mem.
    // The other thing to consider is that there's no reason to run on the edge of available memory. If you do that, there's a very high risk the out-of-memory killer will come along and start killing PostgreSQL backends. Always leave a buffer of some kind in case of spikes in memory usage. So your maximum amount of memory available in work_mem should be ((RAM - shared_buffers) / (max_connections * 3) / max_parallel_workers_per_gather).
    const workMemValue =
      (totalMemoryKb - sharedBuffersValue) / (maxConnectionsValue * 3) / parallelForWorkMem
    let workMemResult = {
      [DB_TYPE_WEB]: Math.floor(workMemValue),
      [DB_TYPE_OLTP]: Math.floor(workMemValue),
      [DB_TYPE_DW]: Math.floor(workMemValue / 2),
      [DB_TYPE_DESKTOP]: Math.floor(workMemValue / 6),
      [DB_TYPE_MIXED]: Math.floor(workMemValue / 2)
    }[dbType]
    // if less than 64 kB, set it to the minimum
    if (workMemResult < 64) {
      workMemResult = 64
    }
    return workMemResult
  }
)
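A brief analysis:
- parallelForWorkMem comes from max_parallel_workers_per_gather in the parallel settings above, falling back to the database default and then to 1
- Computing work_mem:
  - work_mem is allocated every time a query needs a sort, a hash, or any other structure that requires memory, which can happen several times per query. It is therefore safer to assume that max_connections * 2 or max_connections * 3 allocations will actually be in use. At the very least, shared_buffers must be subtracted from the memory being distributed across connections as work_mem
  - There is also no reason to run at the edge of available memory. Doing so carries a very high risk that the out-of-memory killer will come along and start killing PostgreSQL backends. Always leave some headroom for spikes in memory usage. The maximum memory available for work_mem should therefore be ((RAM - shared_buffers) / (max_connections * 3) / max_parallel_workers_per_gather)
  - The final suggested value is then derived per scenario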
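Plugging numbers into the work_mem formula shows how quickly the per-operation budget shrinks. The sketch below assumes sizes in kB, takes the already-resolved workers-per-gather value as a parameter, and uses placeholder scenario keys; the function name is hypothetical.

```javascript
// Illustrative version of the work_mem rule (sizes in kB).
function suggestWorkMemKb(totalMemoryKb, sharedBuffersKb, maxConnections, workersPerGather, dbType) {
  // (RAM - shared_buffers) / (max_connections * 3) / max_parallel_workers_per_gather
  const baseKb = (totalMemoryKb - sharedBuffersKb) / (maxConnections * 3) / workersPerGather
  // Per-scenario divisor, as in the selector above.
  const divisor = { web: 1, oltp: 1, dw: 2, desktop: 6, mixed: 2 }[dbType]
  // Never go below the 64kB minimum.
  return Math.max(Math.floor(baseKb / divisor), 64)
}

const totalKb = 16 * 1024 * 1024 // 16GB
const sharedKb = totalKb / 4     // 4GB of shared_buffers
console.log(suggestWorkMemKb(totalKb, sharedKb, 300, 4, 'oltp')) // 3495
console.log(suggestWorkMemKb(totalKb, sharedKb, 40, 4, 'dw'))    // 13107
```

With 16GB of RAM, an OLTP profile with 300 connections ends up at roughly 3.4MB per operation, while a data warehouse with 40 connections gets about 12.8MB.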
12. huge_pages
Reference: http://postgres.cn/docs/current/runtime-config-resource.html#GUC-HUGE-PAGES
export const selectHugePages = createSelector(
  [selectTotalMemoryInKb],
  // more than 32GB: better to also have huge pages
  (totalMemoryKBytes) => (totalMemoryKBytes >= 33554432 ? 'try' : 'off')
)
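The literal 33554432 in the selector above is 32GB expressed in kB (32 * 1024 * 1024). A minimal sketch with a named constant, using a hypothetical function name:

```javascript
// 32GB expressed in kB: 32 * 1024 * 1024 = 33554432
const HUGE_PAGES_THRESHOLD_KB = 32 * 1024 * 1024

function suggestHugePages(totalMemoryKb) {
  // 'try' lets PostgreSQL fall back to normal pages if huge pages are unavailable.
  return totalMemoryKb >= HUGE_PAGES_THRESHOLD_KB ? 'try' : 'off'
}

console.log(HUGE_PAGES_THRESHOLD_KB)               // 33554432
console.log(suggestHugePages(32 * 1024 * 1024))    // try
console.log(suggestHugePages(16 * 1024 * 1024))    // off
```

Because the comparison is `>=`, a machine with exactly 32GB already gets `try`.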


















