Caution: Please be suspicious of this project and code #65
Comments
Yes, I agree with your points. Moreover, leveraging the anomaly labels from the testing dataset when computing the anomaly threshold is simply wrong, regardless of whether some previous works did it the same way; precedent cannot justify a method that is obviously faulty. Besides, in the provided data_loader.py, the validation set is simply set equal to the test set (except for the SMD data), so no actual validation set is used, contrary to how it is described in the paper. Unfortunately, this problem is propagating through the community: I found a few other works that adopt this evaluation method for the anomaly detection task, and exactly the same issues appear in them as well. I believe the authors of those works have noticed the incorrectness of the method. Be aware of the works (I found two, listed below) that follow the same evaluation method.
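For concreteness, here is a minimal sketch of the evaluation pattern being criticized; it is not the repository's exact code, and the function name and arguments are illustrative. The detection threshold is taken as a percentile of anomaly scores that already include the test data, with the percentile tied to an anomaly ratio that is only known from the test labels.

```python
import numpy as np

def choose_threshold(train_scores, test_scores, anomaly_ratio):
    """Hypothetical illustration of the criticized protocol: `anomaly_ratio`
    is effectively derived from the labeled test set, so the threshold is
    tuned with test-set information."""
    combined = np.concatenate([train_scores, test_scores])
    return np.percentile(combined, 100 - anomaly_ratio)
```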
I agree with your point of view; the arguments in the references cited by the author are not sufficient. For the sake of scientific rigor, I believe this method should not be continued. The original comment reads as follows:
"This method achieves excellent performance primarily due to the utilization of 'detection adjustment' and 'softmax'. In 'solver.py', the author employs 'softmax' to compute the 'metric' in lines 319 and 280. This results in a high value close to 1 for each window, thereby causing each window to contain at least one timestamp with a notably large anomaly score compared to others within the same window. Consequently, within the 'pred' output, each window is likely to be flagged as containing at least one anomaly. Subsequently, when 'detection adjustment' is applied, the entire continuous anomaly sequence is labeled as anomalous. However, 'softmax' is not suitable in this context because it cannot effectively model the relationship between different timestamps within a window. Removing 'softmax' from lines 319 and 280 would lead to a significant decrease in performance."
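As a rough illustration of that point (a sketch with made-up scores, not the repository's code): applying softmax along the window dimension renormalizes the scores within each window to sum to 1, so every window ends up with one or a few timestamps whose scores dominate the rest, and a single global percentile threshold then tends to flag at least one point per window.

```python
import torch

torch.manual_seed(0)
raw = torch.randn(8, 100)            # hypothetical raw anomaly scores: 8 windows of length 100
soft = torch.softmax(raw, dim=-1)    # renormalize within each window; each row now sums to 1

# After softmax, each window's maximum dwarfs its median, so a global percentile
# threshold tends to select at least one timestamp in nearly every window.
print(soft.max(dim=-1).values / soft.median(dim=-1).values)
```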
This problem is very obvious and has already been clearly pointed out by other commenters as well.
#4
In the linked issue you can easily see the large difference in results before and after adding the suspicious code called "detection adjustment"; a sketch of that step is given below.
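For readers unfamiliar with it, here is a minimal sketch of how this "detection adjustment" (often called point adjustment) is typically implemented; the function name and array conventions are illustrative, not the repository's exact code. If even a single timestamp inside a contiguous ground-truth anomaly segment is predicted as anomalous, the whole segment is counted as detected, which is why a method that flags at least one point per window scores so highly.

```python
import numpy as np

def point_adjust(gt, pred):
    """Illustrative point adjustment: gt and pred are aligned 0/1 arrays."""
    gt = np.asarray(gt, dtype=bool)
    adjusted = np.asarray(pred, dtype=bool).copy()
    i, n = 0, len(gt)
    while i < n:
        if gt[i]:
            j = i
            while j < n and gt[j]:      # find the end of this ground-truth anomaly segment
                j += 1
            if adjusted[i:j].any():     # one detected point anywhere in the segment ...
                adjusted[i:j] = True    # ... marks the entire segment as detected
            i = j
        else:
            i += 1
    return adjusted
```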
The author gives the following unclear and hard-to-follow answer to this:
-> The two papers you link to as evidence are not published in official journals, and both papers share an author.
If so, then please provide peer-reviewed papers published in reputable journals that support your claims.
Even if you are right, if tuning that inflates a model's low performance to results this high is accepted practice in academia, then that practice should be eliminated.
-> Remember that real-time industrial data has no labels!! Especially when it comes to anomaly detection!