This project demonstrates the Randomized PCA trainer for detecting anomalies such as unusual login times. The trainer requires normalized input values; caching is not necessary, and no additional NuGet package is required to use it.
The input is a fixed-size vector of Float values. The output comprises two properties: Score and PredictedLabel. Score is of the Float type, non-negative and unbounded. PredictedLabel indicates whether the input is an anomaly based on the configured threshold: true indicates an anomaly, false indicates otherwise. The default threshold in ML.NET is 0.5. Scores higher than the threshold return true; lower scores return false.
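The requirements above (a fixed-size Float feature vector, normalization, and the Score/PredictedLabel output) can be sketched as a small ML.NET pipeline. This is a minimal sketch, not the project's actual code: the `LoginRow`, `LoginPrediction`, and `TrainingSketch` types are hypothetical, the Float types of the four feature columns are assumed, and `rank: 2` is an arbitrary choice for a four-element vector.

```csharp
using Microsoft.ML;

// Hypothetical row type: the property names come from this README,
// but their Float types are an assumption.
public class LoginRow
{
    public float CorporateNetwork { get; set; }
    public float HomeNetwork { get; set; }
    public float WithinWorkHours { get; set; }
    public float WorkDay { get; set; }
}

// Hypothetical output type holding the two columns the trainer produces.
public class LoginPrediction
{
    public float Score { get; set; }          // non-negative anomaly score
    public bool PredictedLabel { get; set; }  // true when Score exceeds the threshold
}

public static class TrainingSketch
{
    public static ITransformer Train(MLContext mlContext, IDataView trainingData)
    {
        var pipeline = mlContext.Transforms
            // Combine the four columns into one fixed-size Float vector.
            .Concatenate("Features",
                nameof(LoginRow.CorporateNetwork),
                nameof(LoginRow.HomeNetwork),
                nameof(LoginRow.WithinWorkHours),
                nameof(LoginRow.WorkDay))
            // Randomized PCA expects normalized features.
            .Append(mlContext.Transforms.NormalizeMeanVariance("Features"))
            // rank: 2 is an assumed value, chosen to stay below the vector size of 4.
            .Append(mlContext.AnomalyDetection.Trainers.RandomizedPca(
                featureColumnName: "Features", rank: 2));

        return pipeline.Fit(trainingData);
    }
}
```

The fitted transformer can then be applied to new rows with `Transform`, and the resulting Score and PredictedLabel columns map onto the two properties of `LoginPrediction`.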
The sampledata.csv file in the Common folder contains 100 rows of login data. Feel free to adjust the data to fit your own observations or to tune the trained model. Here is a snippet of the data:
Each row contains the values for the properties of the LoginHistoryClass. These correspond to UserID, CorporateNetwork, HomeNetwork, WithinWorkHours, WorkDay, and Label.
In addition, the testdata.csv file in the Common folder contains further data points for testing and evaluating the newly trained model. Here is a snippet of the data:
Run the console application with command-line arguments:
- Train and test-evaluate the model using sampledata.csv and testdata.csv
"D:\Machine Learning Projects\LoginAnomalyDetector\bin\Debug\net8.0\LoginAnomalyDetector.exe" train "D:\Machine Learning Projects\LoginAnomalyDetector\Data\sampledata.csv" "D:\Machine Learning Projects\LoginAnomalyDetector\Data\testdata.csv"
- After training the model, build a sample JSON file and save it as input.json as follows:
- To run the model against input.json, pass the file name to the built application; the predicted output will appear:
"D:\Machine Learning Projects\LoginAnomalyDetector\bin\Debug\net8.0\LoginAnomalyDetector.exe" predict "D:\Machine Learning Projects\LoginAnomalyDetector\Data\input.json"
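For the input.json file mentioned in the steps above, a possible shape is sketched here. This is an illustration only: the property names follow the LoginHistoryClass fields listed earlier, while the numeric encoding, the sample values, and the omission of Label are assumptions, so check the class definition in the project before relying on it.

```json
{
  "UserID": 1,
  "CorporateNetwork": 1,
  "HomeNetwork": 0,
  "WithinWorkHours": 0,
  "WorkDay": 1
}
```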