TryHackMe – Advent of Cyber 2023 – Day 2

Day 2 of the Advent of Cyber 2023 challenge involves several great topics, including Python, Data Science, Jupyter Notebooks, Pandas, and Matplotlib.

This might seem daunting, but the challenges are straightforward and the provided VM comes with everything we need, so it is easy to tinker around and enjoy the learning process. Day 2’s activities are also a great introduction to basic programming concepts like printing to the console, data types, variables, and working with data. We also have to write a few lines of code ourselves, which is always the best way to learn.

This room can be found at: https://tryhackme.com/room/adventofcyber2023

Day 2 – Log Analysis – O Data, All Ye Faithful

Question 1

Open the notebook “Workbook” located in the directory “4_Capstone” on the VM. Use what you have learned today to analyse the packet capture.

To complete the tasks for Day 2, we need to use the VM (virtual machine) provided by THM. You should see a green ‘Start Machine’ button at the top of the task.

Important: For your code to run, you first need to click on each of the code cells above it and press ‘shift + enter’. This runs the code in each cell, including the cells that load the libraries needed for the tasks. So when you open the Capstone workbook, run every existing cell from the top before executing your own code.
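To illustrate why the earlier cells matter, here is a minimal sketch of what a setup cell usually does. The data below is made up and loaded from an in-memory string, which is an assumption for the sake of a self-contained example; the real notebook loads its own capture data. Only the column names PacketNumber, Source, and Protocol come from the questions in this task.

```python
import io
import pandas as pd

# Hypothetical stand-in for the capture data (not the real file from the VM).
csv_text = """PacketNumber,Source,Protocol
1,10.0.0.1,TCP
2,10.0.0.2,TCP
3,10.0.0.1,DNS
"""

# A setup cell typically imports pandas and builds the DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Later cells can only refer to df once the setup cell has been run.
print(df.head())
```

If a cell like this is skipped, any later cell that refers to df or pd fails with a NameError, which is the usual symptom of not having run the cells above.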

Answer:

No answer needed

Question 2

How many packets were captured (looking at the PacketNumber)?

We can use the count() method to determine how many packets were captured. According to the official pandas documentation for DataFrame.count, it takes two parameters, axis and numeric_only:

DataFrame.count(axis=0, numeric_only=False)

Both of these parameters are optional; calling count() without any arguments is equivalent to ‘count(axis=0, numeric_only=False)’.

So we can use:

df.count()

count() reports the number of non-null values in each column, so the value shown for PacketNumber is the number of packets captured.
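As a small sketch of how count() behaves, the example below uses a made-up DataFrame (not the capture from the room) with one missing value, so the difference between counting per column, counting per row, and taking the number of rows is visible.

```python
import pandas as pd

# Made-up data for illustration; one Source value is missing on purpose.
df = pd.DataFrame({
    "PacketNumber": [1, 2, 3, 4],
    "Source": ["10.0.0.1", "10.0.0.2", None, "10.0.0.1"],
})

per_column = df.count()      # non-null values per column (axis=0, the default)
per_row = df.count(axis=1)   # non-null values per row
total_rows = len(df)         # number of rows, regardless of missing values

print(per_column)
print(total_rows)
```

Here PacketNumber counts 4 while Source counts 3, because missing values are skipped. When a column has no missing values, its count matches the number of rows.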

Answer (Highlight Below):

100

Question 3

What IP address sent the most amount of traffic during the packet capture?

This question is asking us to determine who sent the most traffic, which we can get by looking at the source IP addresses for the packets.

In other words, we need to group using the ‘Source’ column and then use the size() function to determine the counts for each IP address:

df.groupby(['Source']).size()

Which IP address has sent the most packets?
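Instead of reading the largest count off the output by eye, the result of groupby() can be sorted or queried directly. The sketch below uses invented addresses, not the ones from the capture; sort_values() and idxmax() are standard pandas Series methods.

```python
import pandas as pd

# Invented source addresses for illustration only.
df = pd.DataFrame({
    "Source": ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1"],
})

per_source = df.groupby(["Source"]).size()         # packets per source address
ranked = per_source.sort_values(ascending=False)   # busiest sender first
top_sender = per_source.idxmax()                   # label with the highest count

print(ranked)
print(top_sender)
```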

Answer (Highlight Below):

10.10.1.4

Question 4

What was the most frequent protocol?

We are instructed to use the “value.counts” function to complete this task, which requires a bit of research. The pandas documentation lists the function as value_counts(), with an underscore rather than a dot.

At the end of the documentation page, an example shows how value_counts() can generate counts for a single column, sorted from most to least frequent. That example counts the values in the ‘first_name’ column.

Applying this to the task at hand, we know that we should be targeting the ‘Protocol’ column, i.e.:

df.value_counts("Protocol")

All we have to do is identify the protocol with the highest number of counts.
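A minimal sketch of this on invented data (the protocol mix below is not the one from the capture) shows both the DataFrame form used above and the equivalent call on a single column.

```python
import pandas as pd

# Invented protocol values for illustration only.
df = pd.DataFrame({
    "Protocol": ["TCP", "DNS", "TCP", "HTTP", "TCP", "DNS"],
})

counts = df.value_counts("Protocol")            # DataFrame form, as used above
series_counts = df["Protocol"].value_counts()   # same idea on a single column

# value_counts() sorts in descending order, so the first entry is the most frequent.
most_frequent = series_counts.index[0]

print(counts)
print(most_frequent)
```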

Alternatively, we can use the groupby() function that we used to answer the last question. The only real difference is that the result is ordered by protocol name rather than by count:

df.groupby(['Protocol']).size()
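If the groupby() route is preferred, the counts can still be put in descending order with sort_values(). The following is a sketch on invented data, not the capture from the room.

```python
import pandas as pd

# Invented protocol values for illustration only.
df = pd.DataFrame({
    "Protocol": ["TCP", "DNS", "TCP", "HTTP", "TCP", "DNS"],
})

by_protocol = df.groupby(["Protocol"]).size()             # ordered by protocol name
sorted_counts = by_protocol.sort_values(ascending=False)  # ordered by count

print(by_protocol)
print(sorted_counts)
```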

Answer (Highlight Below):

ICMP

Conclusion

Day 2 was really interesting, covering a fairly wide range of topics including Python, Data Science, Jupyter Notebooks, Pandas, and Matplotlib. We only scratch the surface here, and you could probably spend years mastering each of these individually. What I really liked about Day 2 is that it also simulates the type of learning that hackers need to get comfortable with, which often means gaining a working knowledge of an unfamiliar technology in a very short period of time.