Added code
fredtux committed Jun 15, 2024
0 parents commit 8aeee0d
Showing 49 changed files with 9,002 additions and 0 deletions.
2,017 changes: 2,017 additions & 0 deletions NgramAll_AE.ipynb

Large diffs are not rendered by default.

1,092 changes: 1,092 additions & 0 deletions NgramAll_LR.ipynb

Large diffs are not rendered by default.

1,950 changes: 1,950 additions & 0 deletions NgramLower_AE.ipynb

Large diffs are not rendered by default.

1,160 changes: 1,160 additions & 0 deletions NgramLower_LR.ipynb

Large diffs are not rendered by default.

15 changes: 15 additions & 0 deletions README.md
@@ -0,0 +1,15 @@
# Anomaly Detection in DNS Traffic
## Bachelor's Thesis, FMI 2024 - Florin Silviu Dinu


### Dataset
The **data** directory contains subdirectories with **README.md** files that explain how to fetch the dataset from the official sources. The dataset cannot be uploaded here as-is because I do not hold the rights to it, but it is cited with the appropriate references in the thesis.

Datasets used:
* Get the CICBellEXFDNS2021 dataset from: [https://www.unb.ca/cic/datasets/dns-exf-2021.html](https://www.unb.ca/cic/datasets/dns-exf-2021.html)
* Get the top 1,000,000 domains from: [https://www.crawlson.com/domains](https://www.crawlson.com/domains)
* Get the list of SLDs from: [https://github.com/gavingmiller/second-level-domains](https://github.com/gavingmiller/second-level-domains)


For the CICBellEXFDNS2021 dataset, the following paper must also be cited:
* Samaneh Mahdavifar, Amgad Hanafy Salem, Princy Victor, Miguel Garzon, Amir H. Razavi, Natasha Hellberg, Arash Habibi Lashkari, “Lightweight Hybrid Detection of Data Exfiltration using DNS based on Machine Learning”, The 11th IEEE International Conference on Communication and Network Security (ICCNS), Dec. 3-5, 2021, Beijing Jiaotong University, Weihai, China.
390 changes: 390 additions & 0 deletions Statistics.ipynb

Large diffs are not rendered by default.

1,028 changes: 1,028 additions & 0 deletions TimeSeries_All.ipynb

Large diffs are not rendered by default.

1,034 changes: 1,034 additions & 0 deletions TimeSeries_lower.ipynb

Large diffs are not rendered by default.

Binary file added ae_all_cm.png
Binary file added ae_lower_cm.png
20 changes: 20 additions & 0 deletions autoencoder_all.json
@@ -0,0 +1,20 @@
{
  "roc_auc": 1.0,
  "accuracy": 1.0,
  "benign": {
    "precision": 1.0,
    "recall": 1.0,
    "f1": 1.0
  },
  "attack": {
    "precision": 1.0,
    "recall": 1.0,
    "f1": 1.0
  },
  "tp": 28355,
  "tn": 10065,
  "fp": 0,
  "fn": 0,
  "false_alerts": 0,
  "attack_passed": 0.0
}
Binary file added autoencoder_all.pkl
Binary file not shown.
20 changes: 20 additions & 0 deletions autoencoder_lower.json
@@ -0,0 +1,20 @@
{
  "roc_auc": 0.7224043715846995,
  "accuracy": 0.8545549193128579,
  "benign": {
    "precision": 0.8353710632531007,
    "recall": 1.0,
    "f1": 0.9103020963754855
  },
  "attack": {
    "precision": 1.0,
    "recall": 0.4448087431693989,
    "f1": 0.615733736762481
  },
  "tp": 28355,
  "tn": 4477,
  "fp": 5588,
  "fn": 0,
  "false_alerts": 0,
  "attack_passed": 55.51912568306011
}
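
The figures in `autoencoder_lower.json` appear to be consistent with benign traffic being treated as the positive class, so that `tp` counts benign queries kept, `tn` counts attacks caught, and `fp` counts attacks that slipped through. This reading is inferred from the numbers alone and is not stated in the commit. A minimal sketch under that assumption (the function name `metrics_from_counts` is hypothetical, not part of the repository):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp, tn, fp, fn):
    """Derive the summary metrics from confusion counts.

    Assumption (inferred, not confirmed by the commit): benign is the
    positive class and attack is the negative class.
    """
    benign_precision = tp / (tp + fp)
    benign_recall = tp / (tp + fn)
    attack_precision = tn / (tn + fn)
    attack_recall = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "benign": {
            "precision": benign_precision,
            "recall": benign_recall,
            "f1": f1_score(benign_precision, benign_recall),
        },
        "attack": {
            "precision": attack_precision,
            "recall": attack_recall,
            "f1": f1_score(attack_precision, attack_recall),
        },
        # Benign queries flagged as attacks.
        "false_alerts": fn,
        # Percentage of attack queries classified as benign.
        "attack_passed": 100 * fp / (tn + fp),
        # Mean of the two recalls; it happens to match the reported
        # "roc_auc" for this file, which is an observation only.
        "mean_recall": (benign_recall + attack_recall) / 2,
    }


m = metrics_from_counts(tp=28355, tn=4477, fp=5588, fn=0)
print(m["accuracy"], m["attack"]["recall"], m["attack_passed"])
```

The sketch divides by the class totals without guarding against empty classes, which is fine for the counts in this commit but would need a check in general.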
Binary file added autoencoder_lower.pkl
Binary file not shown.
44 changes: 44 additions & 0 deletions common/bigram_processing.py
@@ -0,0 +1,44 @@
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler

def bigram_split(text):
    text = '$' + text + '$'  # pad so the first and last characters also form bigrams
    return [text[i:i+2] for i in range(len(text)-1)]

def bigram_list(text_list):
    bigrams = []
    for text in text_list:
        for level in text.split('.'):  # process each dot-separated label on its own
            bigrams += bigram_split(level)

    return bigrams

def bigram_freq2(bigrams_list):
    bigram_freq = {}
    for bigram in bigrams_list:
        if bigram in bigram_freq:
            bigram_freq[bigram] += 1
        else:
            bigram_freq[bigram] = 1
    for bigram in bigram_freq:
        bigram_freq[bigram] /= len(bigrams_list)  # convert counts to relative frequencies
    return bigram_freq


def bigram_freq(bigrams_list):
    bigram_freq = {}  # raw occurrence counts per bigram
    for bigram in bigrams_list:
        if bigram in bigram_freq:
            bigram_freq[bigram] += 1
        else:
            bigram_freq[bigram] = 1
    return bigram_freq

def rank_bigrams_freq(bigram_freq):
    sorted_bigrams = sorted(bigram_freq.items(), key=lambda x: x[1], reverse=True)

    ranked_bigrams = {bg: freq for i, (bg, freq) in enumerate(sorted_bigrams)}  # ordered by frequency; values stay frequencies, index i is unused

    return ranked_bigrams
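
A short usage sketch for the bigram helpers in `common/bigram_processing.py`. The two functions are re-declared so the snippet runs on its own, the sample name `ab.c` is made up, and `collections.Counter` is used here as a stand-in for the counting loops rather than as the repository's method:

```python
from collections import Counter


def bigram_split(text):
    # Pad with '$' so the first and last characters also form bigrams.
    text = '$' + text + '$'
    return [text[i:i + 2] for i in range(len(text) - 1)]


def bigram_list(text_list):
    # Split each name into dot-separated labels and collect their bigrams.
    bigrams = []
    for text in text_list:
        for level in text.split('.'):
            bigrams += bigram_split(level)
    return bigrams


print(bigram_split("abc"))      # ['$a', 'ab', 'bc', 'c$']

bigrams = bigram_list(["ab.c"])
print(bigrams)                  # ['$a', 'ab', 'b$', '$c', 'c$']

# Counter gives the same raw counts as a manual counting loop;
# dividing by the total gives relative frequencies.
counts = Counter(bigrams)
relative = {bg: n / len(bigrams) for bg, n in counts.items()}
print(relative)
```

One detail that follows from the code: an empty label, such as the one produced by a trailing dot in a fully qualified name, yields the single bigram `$$`.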

27 changes: 27 additions & 0 deletions common/data_loading.py
@@ -0,0 +1,27 @@
import numpy as np
import scapy.all as scapy

def load_ngram(path):
    result = []  # queried names of all DNS questions in the capture
    for pkt in scapy.rdpcap(path):
        if pkt.haslayer(scapy.DNSQR):
            query = pkt[scapy.DNSQR].qname.decode()
            result.append(query)
    return result

def load_ts(path):
    result = []  # one record per DNS question, for the time-series models
    for pkt in scapy.rdpcap(path):
        if pkt.haslayer(scapy.DNSQR):
            # query = pkt[scapy.DNSQR].qname.decode()
            # # query = query.split('.')
            # # query = '.'.join(query[:len(query) - 1])
            # data_attack.append(query)
            # Get src from ethernet
            result.append({
                'time': float(pkt.time),
                'query': pkt[scapy.DNSQR].qname.decode(),
                'src': pkt[scapy.Ether].src,
                'len': pkt[scapy.UDP].len,
            })
    return result
13 changes: 13 additions & 0 deletions common/data_paths.py
@@ -0,0 +1,13 @@
# File containing paths to data
DS_PATH = 'data/CICBellEXFDNS2021/'
CSV_PATH = DS_PATH + 'CSV/'
PCAP_PATH = DS_PATH + 'PCAP/'

ATTACK_HEAVY_PATH = 'Attack_heavy_Benign/'
ATTACK_LIGHT_PATH = 'Attack_Light_Benign/'
BENIGN_PATH = 'Benign/'
ATTACK_PATH = 'Attacks/'

DATA_PATH = 'data/'
DOMAINS_PATH = DATA_PATH + 'domains/'
MISC_PATH = DATA_PATH + 'misc/'
16 changes: 16 additions & 0 deletions common/data_processing.py
@@ -0,0 +1,16 @@
import re

def not_exfil(d):  # note: [A-z] spans ASCII 65-122, so it also matches [ \ ] ^ _ and `
    return not re.fullmatch(r'^[0-9]+\.[A-z\-\_0-9]+\.cicresearch\.ca\.*|init\.[A-z\-\_0-9]+\.base64\.cicresearc\.*|init\.[A-z\-\_0-9]+\.base64\.cicrese\.|init\.[A-z\-\_0-9]+\.base64\.cicre\.|init\.[A-z\-\_0-9]+\.base64\.+|init\.[A-z\-\_0-9]+\.|^\d+\.[A-z0-9\-\_]+\.b|init\.[A-z\-\_0-9]+\.b\.*|^\d+\.[A-z0-9\-\_]+\.', d)


def get_domain_name(text, slds):
    text = text.split('.')
    if len(text) > 2 and text[-2] in slds:  # e.g. 'co' in 'example.co.uk'
        return text[-3].lower()
    else:
        return text[-2].lower()

def same_domain(d1, set_domains, slds=()):  # slds added: get_domain_name requires it and the original call omitted it
    d1 = get_domain_name(d1, slds)
    return d1 in set_domains
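
A brief sketch of how `get_domain_name` behaves, with the function re-declared so the snippet is self-contained. The `slds` set and the sample names are hypothetical; how the notebooks actually build `slds` from the second-level-domains list is not visible in this diff, and the inputs here are assumed to have no trailing dot:

```python
def get_domain_name(text, slds):
    # Return the registrable label: the one left of the TLD, or left of
    # the second-level domain when that label is in `slds`.
    labels = text.split('.')
    if len(labels) > 2 and labels[-2] in slds:
        return labels[-3].lower()
    return labels[-2].lower()


slds = {"co"}  # hypothetical set of second-level domain labels

print(get_domain_name("www.Example.co.uk", slds))  # example
print(get_domain_name("mail.google.com", slds))    # google
```

A name with a single label (no dot) would raise `IndexError` here, so callers may want to filter such inputs first.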
20 changes: 20 additions & 0 deletions common/globals.py
@@ -0,0 +1,20 @@
from sklearn.preprocessing import StandardScaler
import torch
import numpy as np
import random

input_dim = 64
grams_num = 2
unfound_value = 0
scaler = StandardScaler()

# Random seed 42
## On CPU
random.seed(42) # For random
np.random.seed(42) # For numpy
torch.manual_seed(42) # For torch CPU

## On GPU
torch.cuda.manual_seed(42)
torch.cuda.manual_seed_all(42)
torch.backends.cudnn.deterministic = True
1 change: 1 addition & 0 deletions data/CICBellEXFDNS2021/README.md
@@ -0,0 +1 @@
Get dataset from: [https://www.unb.ca/cic/datasets/dns-exf-2021.html](https://www.unb.ca/cic/datasets/dns-exf-2021.html)
1 change: 1 addition & 0 deletions data/domains/README.md
@@ -0,0 +1 @@
Get dataset from: [https://www.crawlson.com/domains](https://www.crawlson.com/domains)
1 change: 1 addition & 0 deletions data/misc/README.md
@@ -0,0 +1 @@
Get dataset from: [https://github.com/gavingmiller/second-level-domains](https://github.com/gavingmiller/second-level-domains)
20 changes: 20 additions & 0 deletions logisticregression_all.json
@@ -0,0 +1,20 @@
{
  "roc_auc": 0.9997019374068554,
  "accuracy": 0.9998438313378448,
  "benign": {
    "precision": 0.999788441874405,
    "recall": 1.0,
    "f1": 0.9998942097468086
  },
  "attack": {
    "precision": 1.0,
    "recall": 0.9994038748137108,
    "f1": 0.9997018485390579
  },
  "tp": 28355,
  "tn": 10059,
  "fp": 6,
  "fn": 0,
  "false_alerts": 0,
  "attack_passed": 0.05961251862891207
}
Binary file added logisticregression_all.pkl
Binary file not shown.
20 changes: 20 additions & 0 deletions logisticregression_lower.json
@@ -0,0 +1,20 @@
{
  "roc_auc": 0.9746646795827123,
  "accuracy": 0.9867256637168141,
  "benign": {
    "precision": 0.9823315433916507,
    "recall": 1.0,
    "f1": 0.9910870325061167
  },
  "attack": {
    "precision": 1.0,
    "recall": 0.9493293591654247,
    "f1": 0.9740061162079511
  },
  "tp": 28355,
  "tn": 9555,
  "fp": 510,
  "fn": 0,
  "false_alerts": 0,
  "attack_passed": 5.067064083457526
}
Binary file added logisticregression_lower.pkl
Binary file not shown.
Binary file added logisticregression_ts_all.pkl
Binary file not shown.
Binary file added logisticregression_ts_lower.pkl
Binary file not shown.
Binary file added lr_all_cm.png
Binary file added lr_lower_cm.png
Binary file added prophet_ts_all.pkl
Binary file not shown.
Binary file added prophet_ts_lower.pkl
Binary file not shown.
73 changes: 73 additions & 0 deletions requirements.txt
@@ -0,0 +1,73 @@
asttokens==2.4.1
cmdstanpy==1.2.2
comm==0.2.2
contourpy==1.2.1
cycler==0.12.1
debugpy==1.8.1
decorator==5.1.1
exceptiongroup==1.2.1
executing==2.0.1
filelock==3.14.0
fonttools==4.51.0
fsspec==2024.5.0
holidays==0.49
importlib_resources==6.4.0
ipykernel==6.29.4
ipython==8.24.0
jedi==0.19.1
Jinja2==3.1.4
joblib==1.4.2
jupyter_client==8.6.2
jupyter_core==5.7.2
kiwisolver==1.4.5
MarkupSafe==2.1.5
matplotlib==3.9.0
matplotlib-inline==0.1.7
mpmath==1.3.0
nest-asyncio==1.6.0
networkx==3.3
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.1.105
packaging==24.0
pandas==2.2.2
parso==0.8.4
pexpect==4.9.0
pillow==10.3.0
platformdirs==4.2.2
prompt-toolkit==3.0.43
prophet==1.1.5
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
Pygments==2.18.0
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
pyzmq==26.0.3
scapy==2.5.0
scikit-learn==1.4.2
scipy==1.13.0
six==1.16.0
stack-data==0.6.3
stanio==0.5.0
sympy==1.12
threadpoolctl==3.5.0
torch==2.3.0
tornado==6.4
tqdm==4.66.4
traitlets==5.14.3
triton==2.3.0
typing_extensions==4.11.0
tzdata==2024.1
wcwidth==0.2.13
Binary file added roc_auc_lr_all_1.png
Binary file added roc_auc_lr_all_2.png
Binary file added roc_auc_lr_all_3.png
Binary file added roc_auc_lr_lower_1.png
Binary file added roc_auc_lr_lower_2.png
Binary file added roc_auc_lr_lower_3.png
Binary file added roc_auc_lr_ts_all_1.png
Binary file added roc_auc_lr_ts_all_2.png
Binary file added roc_auc_lr_ts_all_3.png
Binary file added roc_auc_lr_ts_lower_1.png
Binary file added roc_auc_lr_ts_lower_2.png
Binary file added roc_auc_lr_ts_lower_3.png
20 changes: 20 additions & 0 deletions timeseries_all.json
@@ -0,0 +1,20 @@
{
  "roc_auc": 0.9996714093360534,
  "accuracy": 0.9996095783446122,
  "benign": {
    "precision": 0.9999294383290996,
    "recall": 0.9995415270675366,
    "f1": 0.999735445069578
  },
  "attack": {
    "precision": 0.9987098054783644,
    "recall": 0.9998012916045703,
    "f1": 0.9992552504840871
  },
  "tp": 28342,
  "tn": 10063,
  "fp": 2,
  "fn": 13,
  "false_alerts": 13,
  "attack_passed": 0.01987083954297069
}
20 changes: 20 additions & 0 deletions timeseries_lower.json
@@ -0,0 +1,20 @@
{
  "roc_auc": 0.9987083954297069,
  "accuracy": 0.9993232691306612,
  "benign": {
    "precision": 0.99908389415454,
    "recall": 1.0,
    "f1": 0.9995417371686408
  },
  "attack": {
    "precision": 1.0,
    "recall": 0.9974167908594138,
    "f1": 0.9987067250298448
  },
  "tp": 28355,
  "tn": 10039,
  "fp": 26,
  "fn": 0,
  "false_alerts": 0,
  "attack_passed": 0.25832091405861896
}
Binary file added ts_all_cm.png
Binary file added ts_lower_cm.png
