Chrome image
Selenium provides an official image (image source: GitHub), but it bundles quite a lot:
- supervisord
- Java
- Chrome
- WebDriver
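Problems encountered in practice
- When Chrome performs memory- and CPU-intensive operations, the Java process is sometimes killed, but supervisord does not restart it automatically. In the ps output, webdriver and chrome then show up as orphan processes.
- In some cases webdriver does not exit cleanly; starting webdriver again launches another browser, so two browsers end up running.
Adding more memory is of course the simplest fix, but for better control I packaged Chrome myself, bundling Chrome together with my own business code, so that if either situation above occurs the code can run shell commands to deal with it.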
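The "run shell commands from the business code" idea might look like the minimal sketch below. The process names and the choice of pkill with SIGKILL are assumptions, not something the original setup specifies; check ps inside your own image for the real names.

```python
import subprocess

# Assumed process names; adjust to what `ps` shows in your image.
LEFTOVER_PROCESS_NAMES = ("chromedriver", "chrome", "chromium")


def build_cleanup_commands(names=LEFTOVER_PROCESS_NAMES):
    """Return one pkill argv per name (SIGKILL, matched against the process name)."""
    return [["pkill", "-9", name] for name in names]


def kill_leftovers(names=LEFTOVER_PROCESS_NAMES):
    """Run the cleanup commands and return the pkill exit code for each name.

    pkill exits with 1 when no process matched, which is not an error here.
    """
    results = {}
    for argv in build_cleanup_commands(names):
        completed = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        results[argv[-1]] = completed.returncode
    return results
```

Calling kill_leftovers() before starting a new webdriver session is one way to avoid ending up with two browsers. The pattern is matched against the process name rather than the full command line, so the calling Python process is not affected.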
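Two images are provided
- Based on Ubuntu
- Based on Debian
The overall process is as follows
- Download the installation packages for the target platform (download them manually and install locally, or let apt-get find and download them).
- Download the matching version of webDriver.
- Install fonts with apt-get.
- Install python and pip, then use pip to install the selenium library. (Use whatever you are familiar with here. I provide Python code, but you can also install a JDK, package your code as a jar and run it with java -jar, or install JDK 17 and drive the browser interactively from the command line.)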
Based on Ubuntu
Download the installation packages for the platform
- Manual download and install. Two packages are needed:
  - chromium-codecs-ffmpeg-extra 90.0.4430.72-0ubuntu0.16.04.1 (amd64 binary) in ubuntu xenial
    - Page: https://launchpad.net/ubuntu/xenial/amd64/chromium-codecs-ffmpeg-extra/90.0.4430.72-0ubuntu0.16.04.1
    - Download link: http://launchpadlibrarian.net/534151129/chromium-codecs-ffmpeg-extra_90.0.4430.72-0ubuntu0.16.04.1_amd64.deb
  - chromium-browser 90.0.4430.72-0ubuntu0.16.04.1
    - Page: https://www.ubuntuupdates.org/package_metas?cx=005406051221663887916%3Aaw7ejs-ceqo&cof=FORID%3A11&ie=UTF-8&q=google-chrome-stable&commit=Package+Search
    - Download link: https://www.ubuntuupdates.org/package/core/xenial/universe/updates/chromium-browser
- Online install with apt-get
This method also works on Debian, so it is not repeated in the Debian section. It is taken from the Dockerfile provided by Selenium: https://github.com/SeleniumHQ/docker-selenium/blob/trunk/NodeChrome/Dockerfile
ARG CHROME_VERSION="google-chrome-stable"
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
  && echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list \
  && apt-get update -qqy \
  && apt-get -qqy install \
    ${CHROME_VERSION:-google-chrome-stable} \
  && rm /etc/apt/sources.list.d/google-chrome.list \
  && rm -rf /var/lib/apt/lists/* /var/cache/apt/*

#============================================
# Chrome webdriver
#============================================
# can specify versions by CHROME_DRIVER_VERSION
# Latest released version will be used by default
#============================================
ARG CHROME_DRIVER_VERSION
RUN if [ -z "$CHROME_DRIVER_VERSION" ]; \
  then CHROME_MAJOR_VERSION=$(google-chrome --version | sed -E "s/.* ([0-9]+)(\.[0-9]+){3}.*/\1/") \
    && NO_SUCH_KEY=$(curl -ls https://chromedriver.storage.googleapis.com/LATEST_RELEASE_${CHROME_MAJOR_VERSION} | head -n 1 | grep -oe NoSuchKey) ; \
    if [ -n "$NO_SUCH_KEY" ]; then \
      echo "No Chromedriver for version $CHROME_MAJOR_VERSION. Use previous major version instead" \
      && CHROME_MAJOR_VERSION=$(expr $CHROME_MAJOR_VERSION - 1); \
    fi ; \
    CHROME_DRIVER_VERSION=$(wget --no-verbose -O - "https://chromedriver.storage.googleapis.com/LATEST_RELEASE_${CHROME_MAJOR_VERSION}"); \
  fi \
  && echo "Using chromedriver version: "$CHROME_DRIVER_VERSION \
  && wget --no-verbose -O /tmp/chromedriver_linux64.zip https://chromedriver.storage.googleapis.com/$CHROME_DRIVER_VERSION/chromedriver_linux64.zip \
  && rm -rf /opt/selenium/chromedriver \
  && unzip /tmp/chromedriver_linux64.zip -d /opt/selenium \
  && rm /tmp/chromedriver_linux64.zip \
  && mv /opt/selenium/chromedriver /opt/selenium/chromedriver-$CHROME_DRIVER_VERSION \
  && chmod 755 /opt/selenium/chromedriver-$CHROME_DRIVER_VERSION \
  && sudo ln -fs /opt/selenium/chromedriver-$CHROME_DRIVER_VERSION /usr/bin/chromedriver
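The version lookup in the RUN step above (extract the Chrome major version, then fall back to the previous major if no driver exists for it) can be read more easily as a small Python sketch. The function names are hypothetical; the URL pattern is the one used in the Dockerfile above, and no network request is made here.

```python
import re

# URL pattern taken from the Dockerfile above.
LATEST_RELEASE_URL = (
    "https://chromedriver.storage.googleapis.com/LATEST_RELEASE_{major}"
)


def chrome_major_version(version_output):
    """Extract the major version from text such as 'Google Chrome 90.0.4430.72'."""
    match = re.search(r"(\d+)(?:\.\d+){3}", version_output)
    if match is None:
        raise ValueError("no four-part version found in: %r" % version_output)
    return int(match.group(1))


def candidate_release_urls(version_output):
    """Return the lookup URL for the detected major, then for the previous major."""
    major = chrome_major_version(version_output)
    return [LATEST_RELEASE_URL.format(major=m) for m in (major, major - 1)]
```

A caller would try the first URL and move on to the second only if the first one reports NoSuchKey, mirroring the shell logic.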
Install fonts
#================
# Font libraries
#================
# libfontconfig            ~1 MB
# libfreetype6             ~1 MB
# xfonts-cyrillic          ~2 MB
# xfonts-scalable          ~2 MB
# fonts-liberation         ~3 MB
# fonts-ipafont-gothic     ~13 MB
# fonts-wqy-zenhei         ~17 MB
# fonts-tlwg-loma-otf      ~300 KB
# ttf-ubuntu-font-family   ~5 MB
#   Ubuntu Font Family, sans-serif typeface hinted for clarity (also usable on Debian)
# Removed packages:
# xfonts-100dpi            ~6 MB
# xfonts-75dpi             ~6 MB
# fonts-noto-color-emoji   ~10 MB
# Regarding fonts-liberation see:
#  https://github.com/SeleniumHQ/docker-selenium/issues/383#issuecomment-278367069
# Layer size: small: 50.3 MB (with --no-install-recommends)
# Layer size: small: 50.3 MB
RUN apt-get -qqy update \
  && apt-get -qqy --no-install-recommends install \
    libfontconfig \
    libfreetype6 \
    xfonts-cyrillic \
    xfonts-scalable \
    fonts-liberation \
    fonts-ipafont-gothic \
    fonts-wqy-zenhei \
    fonts-tlwg-loma-otf \
    ttf-ubuntu-font-family \
    fonts-noto-color-emoji \
  && rm -rf /var/lib/apt/lists/* \
  && apt-get -qyy clean
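Download the webdriver
Download the webDriver that matches the browser version; here that is version 90.
http://chromedriver.storage.googleapis.com/index.html
Dockerfile
The preparation is now done. Note that I download the packages locally first and install from the local files; for an online install, replace the local-install steps with the apt-get method.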
FROM ubuntu:18.04
RUN mkdir /google
#==============================
# copy chrome and driver
#==============================
COPY chromium-browser_90.0.4430.72-0ubuntu0.16.04.1_amd64.deb \
    chromedriver \
    chromium-codecs-ffmpeg-extra_90.0.4430.72-0ubuntu0.16.04.1_amd64.deb \
    /google/
#==============================
# Locale and encoding settings, install fonts and chrome, python3.6 pip3
#==============================
ENV LANG_WHICH en
ENV LANG_WHERE US
ENV ENCODING UTF-8
ENV LANGUAGE ${LANG_WHICH}_${LANG_WHERE}.${ENCODING}
ENV LANG ${LANGUAGE}
# Layer size: small: ~9 MB
# Layer size: small: ~9 MB (with --no-install-recommends)
RUN apt-get -y update \
  && apt-get -y --no-install-recommends install \
    language-pack-en \
    tzdata \
    locales \
  && locale-gen ${LANGUAGE} \
  && dpkg-reconfigure --frontend noninteractive locales \
  && apt-get -y autoremove \
  && apt-get -y --no-install-recommends install \
    libfontconfig \
    libfreetype6 \
    xfonts-cyrillic \
    xfonts-scalable \
    fonts-liberation \
    fonts-ipafont-gothic \
    fonts-wqy-zenhei \
    fonts-tlwg-loma-otf \
    ttf-ubuntu-font-family \
    fonts-noto-color-emoji \
    python3.6  python3-pip \
  && apt-get install -y --no-install-recommends  \
      /google/chromium-codecs-ffmpeg-extra_90.0.4430.72-0ubuntu0.16.04.1_amd64.deb  \
     /google/chromium-browser_90.0.4430.72-0ubuntu0.16.04.1_amd64.deb \
  && rm -rf /var/lib/apt/lists/* \
  && apt-get -qyy clean
#==============================
# pip3 install selenium
#==============================
RUN pip3 install selenium
Directory structure

Build command
Run the build command in the current directory:
docker build . -t test/chrome:90
Enter the container to check
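One way to do this check without an interactive shell is sketched below. It assumes the image tag test/chrome:90 from the build command and that the Ubuntu package installs the browser as chromium-browser; the helper name is hypothetical.

```python
import shlex


def build_check_command(image="test/chrome:90", browser_binary="chromium-browser"):
    """Build a docker run argv that prints the browser and selenium versions."""
    inner = (
        "{0} --version && "
        "python3 -c 'import selenium; print(selenium.__version__)'"
    ).format(browser_binary)
    return ["docker", "run", "--rm", image, "sh", "-c", inner]


if __name__ == "__main__":
    # Print a copy-pasteable command line instead of running it.
    print(" ".join(shlex.quote(part) for part in build_check_command()))
```

For an interactive session, docker run -it --rm test/chrome:90 bash gives a shell inside the container.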

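Based on Debian
Compared with the Ubuntu image, the main difference is the installation packages; everything else is the same. Chrome still has to be downloaded in the matching version.
The Chrome version installed here is 108.
Download the installation packages for the platform
Two packages are needed:
- chromium (108.0.5359.94-1~deb11u1)
  - Page: https://packages.debian.org/bullseye/chromium
  - Download link: https://packages.debian.org/bullseye/amd64/chromium/download
- chromium-common (108.0.5359.94-1~deb11u1)
  - Page: https://packages.debian.org/bullseye/chromium-common
  - Download link: https://packages.debian.org/bullseye/amd64/chromium-common/download
Note that the webdriver must also be version 108.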
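A quick way to catch a browser/driver mismatch is to compare the major versions reported by the two binaries. The sketch below assumes both print a four-part version with --version; the binary names in the usage note are assumptions for this image.

```python
import re
import subprocess


def major_version(version_output):
    """Return the major number of the first four-part version in the text."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError("no four-part version found in: %r" % version_output)
    return int(match.group(1))


def versions_match(browser_output, driver_output):
    """True when browser and driver report the same major version."""
    return major_version(browser_output) == major_version(driver_output)


def read_version(binary):
    """Run '<binary> --version' and return its output as text."""
    return subprocess.check_output([binary, "--version"]).decode("utf-8", "replace")
```

Inside the container, versions_match(read_version("chromium"), read_version("/google/chromedriver")) would be the check to run before starting a session.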
Dockerfile
FROM ubuntu:18.04
RUN mkdir /google
#==============================
# copy chrome and driver
#==============================
COPY chromium-common_108.0.5359.94-1~deb11u1_amd64.deb \
    chromedriver \
    chrome/chromium_108.0.5359.94-1~deb11u1_amd64.deb \
    /google/
#==============================
# Locale and encoding settings, install fonts and chrome, python3.6 pip3
#==============================
ENV LANG_WHICH en
ENV LANG_WHERE US
ENV ENCODING UTF-8
ENV LANGUAGE ${LANG_WHICH}_${LANG_WHERE}.${ENCODING}
ENV LANG ${LANGUAGE}
# Layer size: small: ~9 MB
# Layer size: small: ~9 MB (with --no-install-recommends)
RUN apt-get -y update \
  && apt-get -y --no-install-recommends install \
    language-pack-en \
    tzdata \
    locales \
  && locale-gen ${LANGUAGE} \
  && dpkg-reconfigure --frontend noninteractive locales \
  && apt-get -y autoremove \
  && apt-get -y --no-install-recommends install \
    libfontconfig \
    libfreetype6 \
    xfonts-cyrillic \
    xfonts-scalable \
    fonts-liberation \
    fonts-ipafont-gothic \
    fonts-wqy-zenhei \
    fonts-tlwg-loma-otf \
    ttf-ubuntu-font-family \
    fonts-noto-color-emoji \
    python3.6  python3-pip \
  && apt-get install -y --no-install-recommends  \
      /google/chromium-common_108.0.5359.94-1~deb11u1_amd64.deb  \
     /google/chromium_108.0.5359.94-1~deb11u1_amd64.deb \
  && rm -rf /var/lib/apt/lists/* \
  && apt-get -qyy clean
#==============================
# pip3 install selenium
#==============================
RUN pip3 install selenium
P.S. Note that I did not delete the .deb packages after a successful install; they can be removed once the steps above have finished.
Python code
This example exports a web page to PDF.
import base64
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait
import time
option = webdriver.ChromeOptions()
option.add_argument("--headless")
option.add_argument("--no-sandbox")
driver = webdriver.Chrome(executable_path="/usr/src/app/chromedriver", port = 5055,options=option)
# Register the command endpoints used below
driver.command_executor._commands["SendCommand"] = ("POST","/session/$sessionId/chromium/send_command")
driver.command_executor._commands["executeCdpCommand"] = ("POST","/session/$sessionId/goog/cdp/execute")
driver.get("https://www.baidu.com/")
time.sleep(10)
driver.set_page_load_timeout(300) # set page load time out for wait pdf render
param = {"paperWidth": 8.27,"paperHeight": 11.69,"printBackground": True}
page_res = driver.execute("executeCdpCommand", {"cmd": "Page.printToPDF", "params": param})["value"]
img_byte_arr = base64.b64decode(page_res["data"])
with open("/test1.pdf","wb") as f:
    f.write(img_byte_arr)
The file is saved inside the container; copy it out with the docker cp command to view it.
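A minimal sketch of that copy step, driven from Python on the host. The container name chrome-pdf is hypothetical; /test1.pdf is the path written by the script above.

```python
import subprocess


def build_copy_command(container, src="/test1.pdf", dest="./test1.pdf"):
    """Build the docker cp argv that copies a file out of a container."""
    return ["docker", "cp", "{0}:{1}".format(container, src), dest]


def copy_pdf_from_container(container, src="/test1.pdf", dest="./test1.pdf"):
    """Run docker cp and raise CalledProcessError if it fails."""
    subprocess.run(build_copy_command(container, src, dest), check=True)
    return dest


if __name__ == "__main__":
    # "chrome-pdf" is a placeholder; use the name or ID shown by `docker ps`.
    print(" ".join(build_copy_command("chrome-pdf")))
```

Mounting a host directory with -v and writing the PDF there would avoid the copy step altogether.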
Related reading:
https://stackoverflow.com/questions/49614217/selenium-clear-chrome-cache
https://chromedevtools.github.io/devtools-protocol/tot/Storage/
https://peter.sh/experiments/chromium-command-line-switches/#hide-scrollbars
https://stackoverflow.com/questions/53902507/unknown-error-session-deleted-because-of-page-crash-from-unknown-error-cannot


















