Checking for dead links locally in a static website (using wget?)
So I think you are heading in the right direction. I would use wget and python, as they are two readily available options on many systems, and they get the job done. What you want now is to wait for the text Serving HTTP on 0.0.0.0 to appear on the stdout of that process.
So I would start the process with something like this:
python3 -u -m http.server > ./myserver.log &
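Note the -u flag, which I have used here for unbuffered output. This is really important: without it, Python buffers stdout when it is redirected to a file, and the startup message may not reach the log in time.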
The next step is to wait for this text to appear in myserver.log:
timeout 10 awk '/Serving HTTP on 0.0.0.0/{print; exit}' <(tail -f ./myserver.log)
Here 10 seconds is your maximum wait time, and the rest is self-explanatory. As for your kill $pid, I don't think it is a problem, but if you want it to behave more like what a user would do, I would change it to:
kill -s SIGINT $pid
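This is equivalent to pressing CTRL+C after launching the program. One caveat: when the server is started with & from a non-interactive script, bash launches it with SIGINT ignored, so this signal may have no effect there, and the plain kill $pid (SIGTERM) is the safer choice.
I would also handle SIGINT in my bash script, using something like the code below. It goes at the top of the bash script and handles the script being killed with CTRL+C or by an external kill signal: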
#!/bin/bash
exit_script() {
    echo "Printing something special!"
    echo "Maybe executing other commands!"
    trap - SIGINT SIGTERM # clear the trap
    kill -- -$$ # Sends SIGTERM to child/sub processes
}
trap exit_script SIGINT SIGTERM
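For readers who would rather not depend on awk and tail, the same sequence (start the server unbuffered, wait up to 10 seconds for the startup line, then interrupt it) can be sketched in Python. This is only an illustration under a few assumptions: the helper names wait_for_line and run_demo are made up for the example, port 0 is passed so the OS picks a free port, and the match is on "Serving HTTP on" alone because the address printed can differ between Python versions. It assumes a POSIX system.

```python
import signal
import subprocess
import sys
import threading

def wait_for_line(stream, needle, timeout):
    """Return the first line containing `needle`, or None if `timeout` expires."""
    found = []

    def reader():
        for line in stream:
            if needle in line:
                found.append(line.rstrip())
                return

    # Read in a helper thread so the wait can be bounded, like `timeout 10`.
    t = threading.Thread(target=reader, daemon=True)
    t.start()
    t.join(timeout)
    return found[0] if found else None

def run_demo():
    # -u keeps the child's stdout unbuffered so the startup line arrives at once.
    server = subprocess.Popen(
        [sys.executable, "-u", "-m", "http.server", "0"],
        stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True,
    )
    try:
        banner = wait_for_line(server.stdout, "Serving HTTP on", timeout=10)
        # ... the link checker (e.g. wget) would run here ...
    finally:
        if server.poll() is None:
            server.send_signal(signal.SIGINT)  # same effect as CTRL+C
            try:
                server.wait(timeout=5)
            except subprocess.TimeoutExpired:
                server.kill()  # fallback if SIGINT is being ignored
                server.wait()
    return banner

if __name__ == "__main__":
    print(run_demo())
```

The fallback to kill() is there because a child that inherited an ignored SIGINT would otherwise keep the script waiting forever.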
Tarun Lalwani's answer is correct, and by following the advice given there one can write a clean and short shell script (relying on Python and awk). Another solution is to write the script completely in Python, which gives a slightly more verbose but arguably cleaner script. The server is launched in a thread, then the command that checks the website is executed, and finally the server is shut down. We no longer need to parse the textual output or send a signal to an external process. The key parts of the script are therefore:
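import subprocess
import sys
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler

def start_server(port, server_class=HTTPServer, handler_class=SimpleHTTPRequestHandler):
    server_address = ('', port)
    # The socket is bound and listening as soon as the server object exists.
    httpd = server_class(server_address, handler_class)
    thread = threading.Thread(target=httpd.serve_forever)
    thread.start()
    return httpd

def main(cmd, port):
    httpd = start_server(port)
    status = subprocess.call(cmd)  # run the link checker against the server
    httpd.shutdown()               # stops serve_forever, so the thread ends
    sys.exit(status)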
I wrote a slightly more advanced script (with a bit of command-line option parsing on top of this) and published it as: https://gitlab.com/moy/check-links
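As a rough, self-contained illustration of the threaded approach (this is not the published script), the sketch below serves a directory on a port chosen by the OS and fetches one page while the server runs. The names QuietHandler and fetch_status, the use of port 0, and the binding to 127.0.0.1 are assumptions made for the example; in a real run, the link checker command would take the place of the single HTTP request. It relies on the directory argument of SimpleHTTPRequestHandler, available since Python 3.7.

```python
import http.client
import threading
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler

class QuietHandler(SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress the per-request log lines on stderr

def start_server(directory, port=0):
    # Port 0 asks the OS for any free port; the real one is read back later.
    handler = partial(QuietHandler, directory=directory)
    httpd = HTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=httpd.serve_forever, daemon=True)
    thread.start()
    return httpd, thread

def fetch_status(directory, path="/"):
    """Serve `directory`, request `path` once, and return the HTTP status."""
    httpd, thread = start_server(directory)
    try:
        port = httpd.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        try:
            conn.request("GET", path)
            return conn.getresponse().status
        finally:
            conn.close()
    finally:
        httpd.shutdown()      # ask serve_forever to return
        httpd.server_close()  # release the listening socket
        thread.join()
```

No readiness check is needed here because the listening socket already exists when the HTTPServer constructor returns, so a request sent before serve_forever starts simply waits in the connection backlog.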