Posts

Showing posts from January, 2023

Behind the Booking: Deconstructing a High-Context Hospitality Phishing Campaign

A highly targeted phishing campaign has been hitting hotel guests across Luxembourg. Originally flagged by the Computer Incident Response Center Luxembourg (CIRCL), this campaign stands out not because of advanced malware, but because of its impeccable contextual credibility. Threat actors aren't guessing targets; they are hitting actual hotel guests on WhatsApp with exact, legitimate booking details to steal credit card data. As part of a technical review into the infrastructure, we analyzed a recent Indicator of Compromise (IoC) linked to this campaign: [https://stay-hotel607923.com](https://stay-hotel607923.com). Here is the deep dive into how this attack works, the infrastructure behind it, and how to track it.

The Attack Workflow: Smishing with Context

Most phishing campaigns rely on volume, hoping a small fraction of a massive email list bites. This campaign relies on precision.

The Data Exposure: CIRCL assesses that the campaign's source data may originate from servi...

Evaluating the Performance — Part 4

We need to evaluate the performance of the detector we built to ensure that it achieves a higher true positive rate than false positive rate. As we increase the types of features built and used, we will also need to monitor their performance.

ROC Curve

To evaluate the performance of the detector, we are going to use the Receiver Operating Characteristic (ROC) curve. We plot the false positive rate against the true positive rate at various thresholds. This helps determine how to configure the detector to get the optimal settings. Detectors are not perfect and there will be false positives, but we can use this method to reduce the false positive rate and increase the true positive rate.

The process and its possibilities can seem like a never-ending story, but we should look at it as evolving our detector. As we implement our function to evaluate the detector performance, we will delve further into the requirements of the ROC curve and ...
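To make the ROC idea concrete, here is a minimal sketch in plain Python. It is not the evaluation function from the post: the names `roc_points` and `auc`, and the toy labels and scores, are assumptions made up for illustration. Each distinct score is treated as a threshold, and a sample is flagged as malicious when its score is at or above that threshold.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs.

    labels: 1 for malicious, 0 for benign (hypothetical toy data).
    scores: detector scores, higher means "more likely malicious".
    Thresholds are swept from the highest score down to the lowest.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one malicious and one benign sample")

    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data: three malicious (1) and three benign (0) samples.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

points = roc_points(labels, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
print(f"AUC={auc(points):.3f}")
```

Picking an operating threshold then comes down to choosing the point on this curve whose false positive rate you can tolerate. In practice a library routine such as scikit-learn's `roc_curve` would likely replace the hand-rolled loop.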

Applying Data Science to Malware — Part 3

Now we will build a machine learning detector. To do that, we need to extract a substantial number of features from software binaries, not just malware, because the point of the detector is to determine whether a given binary is malicious or benign. For now I am only using the strings feature; I plan to add more features in the future.

Strings feature

    def get_string_features(path, hasher):
        chars = r" -~"
        min_length = 5
        string_regexp = '[%s]{%d,}' % (chars, min_length)
        file_object = open(path)
        data = file_object.read()
        pattern = re.compile(string_regexp)
        strings = pattern.findall(data)
        string_features = {}
        for string in strings:
            string_features[string] = 1
        hashed_features = hasher.transform([string_features])
        hashed_features = hashed_features.todense()
        hashed_features = numpy.asarray(hashed_features)
        hashed_features = hashed_features[0]
        print "Extracted...
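The excerpt above is cut off and is written for Python 2, so the following is a separate, self-contained Python 3 sketch of the same idea rather than the author's code. It reads the file as bytes, pulls out printable ASCII runs of at least five characters, and maps them into a fixed-length vector. The hashing step here is a simple standard-library stand-in for the `hasher` object used above, not a reimplementation of it; `extract_strings`, `hash_features`, the `n_features` default, and the sample bytes are all assumptions for illustration.

```python
import hashlib
import re

# Printable ASCII runs (space through tilde) of at least 5 bytes,
# mirroring the " -~" range and min_length = 5 used above.
STRING_PATTERN = re.compile(rb"[ -~]{5,}")


def extract_strings(data: bytes) -> set:
    """Return the set of printable strings found in raw binary data."""
    return set(STRING_PATTERN.findall(data))


def hash_features(strings, n_features=1024):
    """Hashing trick: map each string to one slot of a fixed-size vector.

    A slot is set to 1 if at least one string hashes to it, so different
    strings can collide. This is a simplified stand-in for a real feature
    hasher, chosen so the example needs only the standard library.
    """
    vector = [0] * n_features
    for s in strings:
        digest = hashlib.sha256(s).digest()
        index = int.from_bytes(digest[:8], "big") % n_features
        vector[index] = 1
    return vector


def get_string_features(path, n_features=1024):
    """Read a binary from disk and return its hashed string features."""
    with open(path, "rb") as f:  # binary mode: executables are not text
        data = f.read()
    return hash_features(extract_strings(data), n_features)


# Hypothetical in-memory sample instead of a real binary on disk.
sample = b"\x00\x01MZ\x90\x00kernel32.dll\x00abc\x00GetProcAddress\xff"
strings = extract_strings(sample)
vector = hash_features(strings, n_features=64)
print(sorted(strings))
print(sum(vector), "slots set out of", len(vector))
```

Opening the file in binary mode matters in Python 3, where reading an executable as text would raise decoding errors or silently mangle bytes.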

Applying Data Science to Malware — Part 2

Shared code analysis

In the last section, I wrote about building networks and producing a visual graph that shows the connections between Malware samples. In this section, I will go through the script that builds a system showing the links between Malware samples based on shared code analysis.

Terminology

Before we start to build the system, we first need to understand the following:

1. Jaccard index
2. Minhashes

Jaccard index

The Jaccard index is quite simple: it is worked out by dividing the number of attributes shared between two malware samples by the total number of distinct attributes across both. For example: Jaccard index = 0.5 when shared attributes (5) / total attributes (10). This is useful for small data sets, but when we want to compare large data sets we turn to "minhashes".

Minhashes

Minhashes are not so simple. A minhash is a technique used to estimate the similarity of two sets. Our minhash is a malware sample's feature (in our below system the features will be the results from "strings") and...
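The two terms above can be sketched in a few lines of Python. This is an illustrative sketch, not the script from the post: the function names, the use of seeded SHA-256 as the hash family, the choice of 256 hashes, and the toy string sets are all assumptions. The toy sets are built to share 5 of 10 distinct attributes, matching the 0.5 example.

```python
import hashlib


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard index: shared attributes / total distinct attributes."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def _hash(seed: int, item: str) -> int:
    """One member of a family of hash functions, selected by seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash_signature(items: set, num_hashes: int = 256) -> list:
    """For each hash function, keep the minimum hash over all items."""
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash(seed, item) for item in items)
            for seed in range(num_hashes)]


def estimate_similarity(sig_a: list, sig_b: list) -> float:
    """Fraction of matching signature positions approximates the Jaccard index."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Hypothetical "strings" output for two samples: 5 shared, 10 distinct in total.
sample_a = {f"string_{i}" for i in range(0, 8)}
sample_b = {f"string_{i}" for i in range(3, 10)}

exact = jaccard(sample_a, sample_b)
estimate = estimate_similarity(minhash_signature(sample_a),
                               minhash_signature(sample_b))
print(f"exact Jaccard index: {exact}")
print(f"minhash estimate:    {estimate:.3f}")
```

The point of the signature is that it has a fixed length regardless of how many strings a sample contains, so comparing two large samples costs the same as comparing two small ones. The estimate gets closer to the exact value as `num_hashes` grows.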

Applying Data Science to Malware —Part 1

With Malware exploding in numbers, I decided to learn Data Science and apply it to Malware. First I needed a number of Malware samples, which I obtained from https://github.com/fabrimagic72/malware-samples

The following techniques can work on any set of Malware. If you're a business or organization being targeted, or you've been following a certain group of Malware authors and want to see how their Malware is connected (whether they use the same resources, hosts, code, etc.), then that would yield some interesting data and start to paint a picture. Unfortunately, I don't have access to those sets of Malware, but that doesn't mean we can't apply the techniques to Malware collected from honeypots.

Ransomware samples

From the Malware samples, the Ransomware folder looks to have a number of samples we could apply the techniques to.

Step one: unzip all the Malware within that dir:

    find . -name "*.zip" | while read -r filename; do 7z x "$filename" -pinfected -aou; done

Step two: st...