What to do when a data file is corrupted on an Oracle database in NOARCHIVELOG mode

2025/5/11 11:53:31

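Last night a colleague on the night shift at the site called to say that a reporting database could not be reached. I got up, connected over VPN, found that the instance had crashed, and ran startup right away to bring it back.

1. Check the error messages

Environment: Red Hat 6.9, Oracle 11.2.0.4, No Archive Mode

The key errors in the alert log are as follows: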

Thread 1 advanced to log sequence 4231012 (LGWR switch)
  Current log# 2 seq# 4231012 mem# 0: /oradata/rtp/redo02.log
Thu May 08 23:22:56 2025
KCF: read, write or open error, block=0x240ab online=1
        file=118 '/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf'
        error=27072 txt: 'Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1'
Errors in file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_dbw0_3300.trc:
Errors in file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_dbw0_3300.trc:
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 118 (block # 147627)
ORA-01110: data file 118: '/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf'
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1
DBW0 (ospid: 3300): terminating the instance due to error 63999
Thu May 08 23:22:57 2025
System state dump requested by (instance=1, osid=3300 (DBW0)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_diag_3292_20250508232257.trc
Instance terminated by DBW0, pid = 3300
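
The KCF line reports the failing block in hexadecimal (block=0x240ab), while ORA-01114 reports it in decimal (block # 147627). As a quick check that both refer to the same block, here is a minimal bash sketch; the helper name hex_to_dec is only an illustration and not part of any Oracle tool.

```shell
#!/bin/bash
# Convert a hexadecimal block number, as printed in the KCF line,
# to the decimal form used in the ORA-01114 message.
# hex_to_dec is a hypothetical helper name.
hex_to_dec() {
    printf '%d\n' "$1"
}

hex_to_dec 0x240ab   # prints 147627
```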

Next step: look at the trace file named in the error.

Trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_dbw0_3300.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name:    Linux
Node name:      rtpdb
Release:        2.6.32-696.el6.x86_64
Version:        #1 SMP Tue Feb 21 00:53:17 EST 2017
Machine:        x86_64
VM name:        VMWare Version: 6
Instance name: rtp
Redo thread mounted by this instance: 1
Oracle process number: 10
Unix process pid: 3300, image: oracle@rtpdb (DBW0)

*** 2025-05-08 23:22:56.680
*** SESSION ID:(1521.1) 2025-05-08 23:22:56.680
*** CLIENT ID:() 2025-05-08 23:22:56.680
*** SERVICE NAME:(SYS$BACKGROUND) 2025-05-08 23:22:56.680
*** MODULE NAME:() 2025-05-08 23:22:56.680
*** ACTION NAME:() 2025-05-08 23:22:56.680

KCF: read, write or open error, block=0x240ab online=1
file=118 '/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf'
error=27072 txt: 'Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1'
Encountered write error
DDE rules only execution for: ORA 1110
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
Successfully dispatched
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (SUCCESS, 0 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 0 csec) -----
error 63999 detected in background process
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 118 (block # 147627)
ORA-01110: data file 118: '/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf'
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+465<-kjzdssdmp()+267<-kjzduptcctx()+232<-kjzdicrshnfy()+63<-ksuitm()+5570<-ksbrdp()+3507<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
----- End of Abridged Call Stack Trace -----

*** 2025-05-08 23:22:56.750
DBW0 (ospid: 3300): terminating the instance due to error 63999
ksuitm: waiting up to [5] seconds before killing DIAG(3292)
[oracle@rtpdb ~]$

2. Check for I/O errors at the OS level

2.1 Check whether the /oradata2 mount point is healthy. The kernel log shows a large number of I/O errors.

[oracle@rtpdb ~]$ dmesg | grep -i error
end_request: I/O error, dev sdc, sector 716711160
Buffer I/O error on device dm-0, logical block 89588639
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588640
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588641
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588642
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588643
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588644
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588645
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588646
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588647
lost page write due to I/O error on dm-0
JBD2: Detected IO errors while flushing file data on dm-0-8
[root@rtpdb ~]# tail -f /var/log/messages
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588645
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588646
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588647
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: JBD2: Detected IO errors while flushing file data on dm-0-8
May  9 03:40:04 rtpdb rhsmd: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.
May  9 12:38:21 rtpdb kernel: NET: Unregistered protocol family 36
May  9 12:38:21 rtpdb kernel: NET: Registered protocol family 36
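
When the kernel log is this noisy, it can help to tally the "Buffer I/O error" lines per device to see which device is affected. The following is a minimal sketch under that assumption; count_io_errors is a hypothetical helper that reads log text on standard input, so it works with either dmesg output or /var/log/messages.

```shell
#!/bin/bash
# Count "Buffer I/O error on device <dev>" lines per device.
# Reads kernel log text on stdin and prints "<device> <count>" lines.
# count_io_errors is a hypothetical helper name.
count_io_errors() {
    grep -o 'Buffer I/O error on device [^,]*' |
        awk '{ n[$NF]++ } END { for (d in n) print d, n[d] }' |
        sort
}

# Typical use (needs permission to read the kernel log):
#   dmesg | count_io_errors
#   count_io_errors < /var/log/messages
```

With the dmesg excerpt above, every buffer error is on dm-0, which matches the end_request error reported for the underlying device sdc.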

3. Use RMAN to check the affected file for corrupt blocks

RMAN> VALIDATE DATAFILE 118;

Starting validate at 09-MAY-25
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=647 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00118 name=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:15
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
118  FAILED 0              85421        268800          74401038924
  File Name: /oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              99872           
  Index      0              82856           
  Other      49             651             

validate found one or more corrupt blocks
See trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc for details
Finished validate at 09-MAY-25

RMAN> list backup summary;

specification does not match any backup in the repository

RMAN> 
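
As a sanity check on how to read the VALIDATE table above: in this output, the Empty Blocks count plus the Blocks Processed counts for Data, Index and Other add up to Blocks Examined, so the 49 failing blocks are counted within the 651 blocks of type Other. A small sketch of the arithmetic, with the numbers copied from the output:

```shell
#!/bin/bash
# Numbers copied from the RMAN VALIDATE output for datafile 118.
empty=85421
data=99872
index=82856
other=651

total=$(( empty + data + index + other ))
echo "$total"   # prints 268800, the Blocks Examined value
```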

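Look in /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc to see exactly which blocks are corrupt. It turns out there are quite a lot of them.

On top of that, there is no physical backup (the database runs in NOARCHIVELOG mode). So what can be done?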

[oracle@rtpdb ~]$ cat /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc | grep -i "Corrupt"
Corrupt block relative dba: 0x1d82e5cf (file 118, block 189903)
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Corrupt block relative dba: 0x1d82e5ff (file 118, block 189951)
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data

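The trace file shows a contiguous run from block 189903 to block 189918, so at least 16 blocks are corrupt (RMAN's VALIDATE reported 49 failing blocks in total).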
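
Two small bash helpers can make the trace easier to read. The first decodes a relative DBA such as 0x1d82e5cf, assuming the usual smallfile tablespace layout in which the upper 10 bits hold the relative file number and the lower 22 bits hold the block number. The second lists the distinct corrupt block numbers from the trace and groups them into contiguous runs. Both function names are hypothetical, and this is only a sketch of one way to do it.

```shell
#!/bin/bash
# Decode a relative DBA (assumes a smallfile tablespace:
# upper 10 bits = relative file number, lower 22 bits = block number).
decode_rdba() {
    local rdba=$(( $1 ))
    echo "file $(( rdba >> 22 )) block $(( rdba & 0x3FFFFF ))"
}

# Read trace text on stdin, pick the block number out of every
# "Corrupt block relative dba" line, and print one line per
# contiguous run in the form "<first>-<last> <count>".
summarize_corrupt_blocks() {
    sed -n 's/^Corrupt block relative dba: .*(file [0-9]*, block \([0-9]*\)).*/\1/p' |
        sort -n -u |
        awk 'NR == 1     { s = e = $1; next }
             $1 == e + 1 { e = $1; next }
                         { print s "-" e, e - s + 1; s = e = $1 }
             END         { if (NR) print s "-" e, e - s + 1 }'
}

decode_rdba 0x1d82e5cf   # prints: file 118 block 189903
decode_rdba 0x1d82e5ff   # prints: file 118 block 189951

# Typical use against the trace file from the VALIDATE run:
#   summarize_corrupt_blocks < /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc
```

The resulting ranges can then be plugged into the dba_extents lookup in the next step.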

4. Find the object that owns the corrupt blocks

Check which object these corrupt blocks belong to. It turns out to be a table.

SELECT tablespace_name, segment_type, owner, segment_name 
FROM dba_extents 
WHERE file_id = 118 AND 
      (block_id BETWEEN 189903 AND 189918 OR
       (block_id + blocks - 1) BETWEEN 189903 AND 189918 OR
       block_id < 189903 AND (block_id + blocks - 1) > 189918);
TABLESPACE_NAME                SEGMENT_TYPE       OWNER                          SEGMENT_NAME
------------------------------ ------------------ ------------------------------ ---------------------------------------------------------------------------------
TBS_ODS                        TABLE              ODS                            LOT_MATERIAL_MASTER

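5. Mark the corrupt blocks to be skipped so that operations do not fail

Mark these corrupt blocks so that they are skipped. This is not the standard recovery procedure, but this is a reporting database whose data is all pulled from another database, so I enabled skipping here and asked the reporting team to rebuild the table.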

BEGIN
  DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
    schema_name   => 'ODS',
    object_name   => 'LOT_MATERIAL_MASTER',
    object_type   => DBMS_REPAIR.TABLE_OBJECT,
    flags         => DBMS_REPAIR.SKIP_FLAG);
END;
/
PL/SQL procedure successfully completed.

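5.1 About the DBMS_REPAIR.SKIP_CORRUPT_BLOCKS procedure

DBMS_REPAIR.SKIP_CORRUPT_BLOCKS tells the database to skip known corrupt blocks when scanning a specific table and its indexes, so that access errors do not interrupt the operation.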


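✅ Main purpose

  • Enables skipping of corrupt blocks for the specified object: when an application or query reaches such a block, Oracle skips it instead of raising an error.

  • Applies to:

    • Tables (TABLE_OBJECT); skipping then also applies to the indexes on the table

    • Clusters (CLUSTER_OBJECT); the object_type parameter does not take INDEX_OBJECT here

  • Commonly used after data file corruption, disk failure, or incomplete backup files, to work around the bad blocks temporarily so that business can continue or data can be exported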


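🧠 Example scenarios:

  • Some data blocks in a table are corrupt, so full table scans fail.

  • The undamaged data needs to be exported temporarily for migration or recovery.

  • After DBMS_REPAIR.CHECK_OBJECT has detected the corrupt blocks, business logic needs to keep running.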


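📌 How it works

Once enabled, queries and other operations on the object behave as follows:

  • When a corrupt block is encountered, Oracle skips it instead of reading it

  • This preserves access to as much intact data as possible, for querying or export

  • The actual contents of the blocks are not changed (corrupt blocks are only skipped, not repaired)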


 Typical call

BEGIN
  DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
    schema_name => 'SCOTT',
    object_name => 'EMP',
    object_type => DBMS_REPAIR.TABLE_OBJECT,
    flags       => DBMS_REPAIR.SKIP_FLAG);  -- enable skipping
END;
/

To turn skipping off:

BEGIN
  DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
    schema_name => 'SCOTT',
    object_name => 'EMP',
    object_type => DBMS_REPAIR.TABLE_OBJECT,
    flags       => DBMS_REPAIR.NOSKIP_FLAG);  -- disable skipping
END;
/


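⚠️ Notes

  1. This operation does not repair corrupt blocks; it only ignores them.

  2. Use it together with DBMS_REPAIR.CHECK_OBJECT, which identifies the corrupt blocks first.

  3. It is an emergency measure and should not be relied on long term.

  4. Afterwards, recover the data or rebuild the table as soon as possible.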
     

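Summary

This database runs in NOARCHIVELOG mode and has no physical backup, which makes block corruption really painful to deal with. For any important database, enable archiving and take RMAN backups.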

SQL> archive log list;
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            /u01/app/oracle/product/11.2.0/db_1/dbs/arch
Oldest online log sequence     4231143
Current log sequence           4231147

For reference, here is the expdp script used to back up a subset of tables:

[oracle@rtpdb ~]$ cat $HOME/jobs/expback.sh
#!/bin/bash
#backup table on noarchive db
#create by norton.fan 20220729
PATH=$PATH:$HOME/bin
ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
ORACLE_SID=rtp
PATH=$ORACLE_HOME/bin:$PATH
export ORACLE_BASE ORACLE_HOME ORACLE_SID
export PATH
NLS_LANG=AMERICAN_AMERICA.UTF8
export NLS_LANG
#export DELTIME=`date -d "15 days ago" +%Y%m%d`
export BACKUPTIME=`date +%Y%m%d%H%M%S`
expdp ods/ods dumpfile=ods$BACKUPTIME.dmp logfile=ods$BACKUPTIME.log parfile=/home/oracle/jobs/exp.par
#echo "Delete backup cycle before 15 days"
find /oradata/backup/ -mtime +1 -name '*.dmp' -exec rm -f {} ';'
find /oradata/backup/ -mtime +7 -name '*.log' -exec rm -f {} ';'
[oracle@rtpdb ~]$ 
[oracle@rtpdb ~]$ cat /home/oracle/jobs/exp.par
DIRECTORY = dmpdir
SCHEMAS = ods
# Put the names of the tables to back up into the exptab table
INCLUDE = TABLE:"IN (select table_name from exptab)"
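
The retention step of the script can be sketched as a small function. Two details matter: the file name pattern has to be quoted so that the shell does not expand it before find sees it, and -mtime +N ignores fractional days, so +1 only matches files that are at least two full days old. The function name prune_old_files and its arguments are illustrative assumptions, not part of the original script.

```shell
#!/bin/bash
# Delete regular files under a directory that match a name pattern
# and are older than the given number of days.
#   $1 = directory, $2 = file name pattern (quote it), $3 = age in days
# prune_old_files is a hypothetical helper name.
prune_old_files() {
    find "$1" -type f -name "$2" -mtime +"$3" -exec rm -f {} +
}

# Typical use, mirroring the retention in expback.sh:
#   prune_old_files /oradata/backup '*.dmp' 1
#   prune_old_files /oradata/backup '*.log' 7
```

Replacing `-exec rm -f {} +` with `-print` first is an easy way to preview what would be removed.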
