1. Set up a Python development environment in VS Code
2. Install the pdf2docx library
pip install pdf2docx
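
To confirm that the installation worked, the installed version can be queried from Python. This is a minimal sketch that uses only the standard library's importlib.metadata; the helper name installed_version is an illustrative assumption and is not part of pdf2docx.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


found = installed_version("pdf2docx")
if found is None:
    print("pdf2docx is not installed; run: pip install pdf2docx")
else:
    print(f"pdf2docx {found} is installed")
```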
 

3. Write the code
3.1 Convert a PDF to DOCX with parse
Create pdf2docxParse.py:
from pdf2docx import parse
# Input and output file names
pdf_file = 'demo-image-overlap.pdf'
docx_file = 'demo-image-overlap.docx'
# Convert the PDF to DOCX
parse(pdf_file, docx_file)
 
Run pdf2docxParse.py:
python pdf2docxParse.py 
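
parse can also be limited to a range of pages through its start and end arguments. The sketch below converts only the first few pages. The helper names docx_name_for and convert_first_pages are hypothetical, and the code assumes that page indices are zero-based and that end is exclusive, which is how the pdf2docx documentation describes them; check this against the installed version.

```python
from pathlib import Path


def docx_name_for(pdf_file):
    """Derive the output name by swapping the .pdf suffix for .docx."""
    return str(Path(pdf_file).with_suffix(".docx"))


def convert_first_pages(pdf_file, page_count):
    """Convert only the first `page_count` pages of a PDF and return the output name."""
    # Imported lazily so docx_name_for stays usable without pdf2docx installed.
    from pdf2docx import parse

    docx_file = docx_name_for(pdf_file)
    # Assumption: start is zero-based and end is exclusive.
    parse(pdf_file, docx_file, start=0, end=page_count)
    return docx_file
```

For example, convert_first_pages('demo-image-overlap.pdf', 2) would write the first two pages to demo-image-overlap.docx.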
 

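3.2 Convert a PDF to DOCX with Converter
3.2.1 Create pdf2docxConvert.py
from pdf2docx import Converter
# Input and output file names
pdf_file = 'demo-image-overlap.pdf'
docx_file = 'demo-image-overlap.docx'
# Open the PDF, convert all pages (start=0, end=None), then release the file
cv = Converter(pdf_file)
cv.convert(docx_file, start=0, end=None)
cv.close()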
 
3.2.2 Run pdf2docxConvert.py
python pdf2docxConvert.py
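
Because Converter holds the PDF open until close() is called, it can help to wrap the conversion in try/finally, especially when several files are processed in a loop. The following is a sketch under that assumption; find_pdfs and convert_folder are hypothetical helper names, not pdf2docx functions.

```python
from pathlib import Path


def find_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by name."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".pdf")


def convert_folder(folder):
    """Convert every PDF in `folder` to a .docx file placed next to it."""
    # Imported lazily so find_pdfs stays usable without pdf2docx installed.
    from pdf2docx import Converter

    outputs = []
    for pdf_path in find_pdfs(folder):
        docx_path = pdf_path.with_suffix(".docx")
        cv = Converter(str(pdf_path))
        try:
            cv.convert(str(docx_path), start=0, end=None)
        finally:
            # Close the converter even if the conversion raises.
            cv.close()
        outputs.append(docx_path)
    return outputs
```

Calling convert_folder('.') would convert every PDF in the current directory and return the list of generated DOCX paths.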
 

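3.3 Convert a PDF passed on the command line
3.3.1 Create SMQHPdf2Docx.py
'''
  @Description: Convert a PDF passed on the command line to DOCX
  @Author: 少莫千华
  @Time:  2023-06-11
'''
# import logging
import argparse
from pdf2docx import Converter


def main(pdf_file, docx_file):
    # Convert every page of pdf_file and write the result to docx_file
    cv = Converter(pdf_file)
    cv.convert(docx_file, start=0, end=None)
    cv.close()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # --pdf is required; without it args.pdf would be None and the concatenation below would fail
    parser.add_argument("--pdf", type=str, required=True)
    args = parser.parse_args()
    # logging.debug(args.pdf)
    # The output name is the input name plus '.docx', e.g. demo-image-overlap.pdf.docx
    main(args.pdf, args.pdf + '.docx')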
 
3.3.2 Run SMQHPdf2Docx.py
python SMQHPdf2Docx.py --pdf demo-image-overlap.pdf
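
The command-line script can be extended with an optional output path and a page range. This is one possible sketch, not the author's script: the --docx, --start and --end options and the helpers build_parser and output_path are assumptions introduced for illustration. When --docx is omitted, the output name is derived by replacing the .pdf suffix with .docx.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build the argument parser; all options except --pdf are illustrative additions."""
    parser = argparse.ArgumentParser(description="Convert a PDF file to DOCX.")
    parser.add_argument("--pdf", required=True, help="path of the input PDF")
    parser.add_argument("--docx", default=None, help="path of the output DOCX (optional)")
    parser.add_argument("--start", type=int, default=0, help="first page, zero-based")
    parser.add_argument("--end", type=int, default=None, help="page to stop before")
    return parser


def output_path(args):
    """Use --docx when given, otherwise swap the .pdf suffix for .docx."""
    if args.docx:
        return args.docx
    return str(Path(args.pdf).with_suffix(".docx"))


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported lazily so the parser helpers work without pdf2docx installed.
    from pdf2docx import Converter

    cv = Converter(args.pdf)
    try:
        cv.convert(output_path(args), start=args.start, end=args.end)
    finally:
        # Release the PDF even if the conversion fails.
        cv.close()
```

To use this as a script, call main() under an if __name__ == "__main__": guard and run it, for example, with --pdf demo-image-overlap.pdf --start 0 --end 2.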
 

3.4 Conversion result

DOCX

For details, see the pdf2docx documentation.


















