This one annoyed me a little bit because the information available through Google is so scattered these days, and I really struggled to find answers to my troubles.
A certain person has decided that this isn’t headless. F-you. Now I have to do it properly :(
You need to install python and python-pip. Then you need to use pip to install selenium.
You can install a web browser without any kind of DE installed, but there will be some dependencies. Personally, I tested with firefox-esr on Debian 11.
Next you will need something called xvfb, which is a virtual display server. It performs all graphical operations in memory without showing any screen output; perfect for what we need.
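You’ll notice that this doesn’t require geckodriver to be immediately available. No clue why, but I’m guessing it’s because you’re technically just running firefox-esr in a virtual display. (Another possibility: Selenium 4.6 and later bundle Selenium Manager, which fetches a matching driver when none is found on the PATH.)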
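If you'd rather not guess, a quick look at what is actually on the PATH helps. This is a minimal sketch using only the standard library; the helper name `locate` is made up for illustration, and the binary names are simply the ones mentioned in this post.

```python
import shutil


def locate(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Names taken from this post; adjust for your distro.
for name, path in locate(["firefox-esr", "firefox", "geckodriver", "Xvfb"]).items():
    print(f"{name}: {path or 'not found'}")
```

If geckodriver shows up as "not found" and the script still works, something else is supplying the driver.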
Putting this all together:
FROM debian:11
RUN apt -y update && apt -y upgrade
RUN apt -y install python3 python3-pip firefox-esr xvfb
RUN pip install selenium
Python code:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("https://j7b.net/jsload")
print(driver.page_source)
driver.quit()  # ends the session and stops geckodriver; close() only closes the window
To run this, start Xvfb on display :99 first, then point the script at it:
Xvfb :99 &
DISPLAY=:99 python3 test.py
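Running with DISPLAY=:99 only works if an Xvfb server is already listening on that display. Here is a rough sketch of doing the whole thing from Python instead. The helper names, the screen geometry, and the one-second startup wait are my assumptions, not something the setup above requires.

```python
import os
import subprocess
import time


def xvfb_command(display=":99", geometry="1280x1024x24"):
    """Build the argv for an Xvfb server with a single screen 0."""
    return ["Xvfb", display, "-screen", "0", geometry]


def env_with_display(display=":99", base=None):
    """Return a copy of the environment with DISPLAY set to the virtual display."""
    env = dict(os.environ if base is None else base)
    env["DISPLAY"] = display
    return env


def run_under_xvfb(script, display=":99"):
    """Start Xvfb, run the script against it, then stop Xvfb again."""
    server = subprocess.Popen(xvfb_command(display))
    try:
        time.sleep(1)  # crude wait for the server to come up
        result = subprocess.run(["python3", script], env=env_with_display(display))
        return result.returncode
    finally:
        server.terminate()
        server.wait()
```

Calling `run_under_xvfb("test.py")` would then replace the two manual steps. The xvfb package on Debian also ships an xvfb-run wrapper that does roughly the same job: xvfb-run python3 test.py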
It’s worth noting that this will be quickly detected by WAFs. Specifically, I noticed that SMH would time out my connection. So this works for things within your control; however, if you’re trying to circumvent a WAF, you’ll run into bad times.
test.py
Note that executable_path and binary_location are specified so that I don’t ever need to Google for them again. Ever. Please, never again.
from selenium import webdriver
from selenium.webdriver import FirefoxOptions
from selenium.webdriver.firefox.service import Service
# geckodriver location
geckodriver_path = "/usr/bin/geckodriver"
# firefox location
firefox_path = "/usr/bin/firefox"
# Set options: run headless and point Selenium at the Firefox binary
options = FirefoxOptions()
options.add_argument("--headless")
options.binary_location = firefox_path
# Selenium 4 takes the driver path via Service; the old executable_path and
# firefox_binary keyword arguments to webdriver.Firefox were deprecated and later removed
service = Service(executable_path=geckodriver_path)
browser = webdriver.Firefox(options=options, service=service)
browser.get("https://j7b.net")
print(browser.page_source)
# Shut down the browser and geckodriver
browser.quit()
Dockerfile
FROM debian:11
RUN apt -y update && apt -y upgrade
RUN apt -y install wget unzip tar
RUN apt -y install python3 python3-pip firefox-esr
RUN pip install selenium
RUN wget -qO- https://github.com/mozilla/geckodriver/releases/download/v0.32.2/geckodriver-v0.32.2-linux64.tar.gz | tar zxvsf - -C /usr/bin
ENTRYPOINT ["/usr/bin/python3"]
CMD ["/app/test.py"]
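The geckodriver version is baked into that wget URL. If you want to bump it, the URL seems to follow a simple pattern, sketched below. The function name and the `platform` parameter are mine. Only the v0.32.2 linux64 URL is confirmed by the Dockerfile above, so check the geckodriver releases page before trusting other combinations.

```python
def geckodriver_url(version="0.32.2", platform="linux64"):
    """Build the GitHub release URL for a geckodriver tarball."""
    base = "https://github.com/mozilla/geckodriver/releases/download"
    return f"{base}/v{version}/geckodriver-v{version}-{platform}.tar.gz"


# With the defaults this reproduces the URL used in the Dockerfile.
print(geckodriver_url())
```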
Running
docker build -t somename:sometag /path/to/Dockerfile/parent/dir
docker run -v "/path/to/project:/app" somename:sometag