Towards a more sustainable anomaly detection: new methods and practical applications
- Meira, Pablo Alexandre Novo
- Amparo Alonso Betanzos Director
- Maria Goreti Carvalho Marreiros Co-director
- Verónica Bolón-Canedo Co-director
Defence university: Universidade da Coruña
Defence date: 07 June 2023
- María Jesús Taboada Iglesias Chair
- Bertha Guijarro-Berdiñas Secretary
- João Gama Committee member
Type: Thesis
Abstract
Anomaly detection is a critical problem in many fields, with applications ranging from intrusion detection to fault diagnosis and predictive maintenance. Unsupervised methods have gained widespread popularity due to their ability to learn from data without requiring labeled examples. This doctoral thesis presents a comprehensive overview of anomaly detection methods, with a particular focus on unsupervised techniques, and their applications in a wide variety of domains.
The thesis also emphasizes sustainability by presenting methods that are designed to be scalable, efficient, and able to handle large and complex datasets. The automatic hyperparameter tuning mechanisms, combined with the distributed properties of some of the methods, enable efficient processing and minimize the need for manual tuning, which can be time-consuming and resource-intensive. This results in a more sustainable and efficient approach to anomaly detection, reducing the risk of overloading systems and minimizing the carbon footprint of the processing involved.
These approaches are applied to various datasets and domains, including an IoT intrusion detection dataset, a railway system data stream, and tourist preferences based on the TripAdvisor reviews dataset. The performance of the methods is evaluated using a range of metrics, such as classification accuracy, precision, recall, area under the ROC curve, and processing time, together with statistical tests such as the Nemenyi post hoc test, showing state-of-the-art results.
The research presented in this dissertation makes a significant contribution to the field of anomaly detection by introducing new methods that are more efficient for dealing with large and complex datasets. Moreover, the methods are scalable and sustainable, which are important factors for their deployment in real-world applications.
Overall, the work in this thesis provides a detailed and up-to-date overview of anomaly detection methods, with a focus on unsupervised techniques and their practical applications, especially in light of the new trends towards a greener AI.