Unsupervised Anomaly Detection in Deep Learning: From Autoencoders to Transformers | IJET – Volume 12 Issue 1 | IJET-V12I1P3


International Journal of Engineering and Techniques (IJET)

Open Access • Peer Reviewed • High Citation & Impact Factor • ISSN: 2395-1303

Volume 12, Issue 1  |  Published: January 2026

Authors: Virendra Tank, Dr. Swati Agarwal, Shivangi Sharma

DOI: https://doi.org/{{doi}}

Abstract

Anomaly detection, the task of identifying rare and unusual patterns, is a significant problem in applications ranging from cyber security to manufacturing quality control. Deep learning has moved the field beyond traditional methods by learning fine-grained representations and pattern structure in complex, high-dimensional data. In this work we thoroughly review unsupervised deep learning methods for anomaly detection, from classical architectures to the latest transformer-based models. The contributions of our paper are as follows: 1) we conduct a comprehensive survey of state-of-the-art unsupervised deep models with attention mechanisms; 2) we investigate and summarize their applicability and possible variants, covering their pros and cons. We study the subfamilies of autoencoders, such as variational and adversarial autoencoders, discuss GAN-based detection techniques, and examine recently introduced transformer architectures targeted at outlier detection. By analyzing applications in cyber security, manufacturing defect detection, and fraud detection, we explain how these techniques address practical issues, and we provide researchers with directions for this fast-evolving domain.

Keywords

Anomaly detection, Unsupervised learning, Autoencoders, GANs, Transformers, Deep learning, Outlier detection

Conclusion

Unsupervised anomaly detection has undergone dramatic changes, evolving from classical statistical and machine learning techniques to complex deep learning architectures. Autoencoders, including variational and adversarial forms, enable effective reconstruction-based detection by learning a model of normality itself. GAN-based approaches use generative modeling to capture the complexity of the data distribution and flag samples that deviate from it. Attention mechanisms and the long-range dependency modeling afforded by transformers allow anomalies to be discovered in context and at scale. Real-world scenarios in cybersecurity, manufacturing, e-commerce, financial services, and healthcare highlight the effectiveness and practical use cases of these methods. At the same time, they present open challenges: class imbalance, interpretability, computational efficiency, and adaptation across domains, particularly in healthcare. The field continues to advance through innovations in self-supervised learning, few-shot detection, explainability, and continual learning. As anomaly detection becomes increasingly important for self-driving cars, critical-infrastructure protection, and other safety-critical applications, methods must be robust, efficient, and interpretable. The convergence of deep learning paradigms, combined with domain-specific expertise and rigorous evaluation guidelines, ensures that this remains a critical area of advancement in machine learning.
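The reconstruction-based detection idea above can be sketched in a few lines. This is a minimal illustration, not any method from the surveyed papers: it substitutes a linear "autoencoder" (an SVD projection, which a linear autoencoder with a 1-D bottleneck converges to) for a trained deep model, scores samples by reconstruction error, and sets the threshold from the errors observed on normal data. All names and data here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" training data: points near a 1-D subspace embedded in 5-D,
# plus a little noise -- a toy stand-in for a real dataset.
normal = rng.normal(size=(500, 1)) @ rng.normal(size=(1, 5))
normal += 0.05 * rng.normal(size=normal.shape)

# Stand-in for a trained autoencoder: project onto the top principal
# component and decode back, obtained here via SVD.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
decoder = vt[:1]  # latent dimension = 1

def reconstruct(x):
    """Encode to the 1-D latent space and decode back."""
    return mean + (x - mean) @ decoder.T @ decoder

def anomaly_score(x):
    """Per-sample reconstruction error (MSE) as the anomaly score."""
    return np.mean((x - reconstruct(x)) ** 2, axis=1)

# Detection threshold: a high percentile of errors on normal data.
threshold = np.percentile(anomaly_score(normal), 99)

# A point far from the learned manifold reconstructs poorly.
anomaly = 3.0 * rng.normal(size=(1, 5))
is_anomaly = anomaly_score(anomaly)[0] > threshold
```

A deep autoencoder replaces the SVD step with learned encoder and decoder networks, but the scoring and thresholding logic is the same.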


Cite this article

APA
Virendra Tank, Dr. Swati Agarwal, Shivangi Sharma (January 2026). Unsupervised Anomaly Detection in Deep Learning: From Autoencoders to Transformers. International Journal of Engineering and Techniques (IJET), 12(1). https://doi.org/{{doi}}
IEEE
Virendra Tank, Dr. Swati Agarwal, Shivangi Sharma, “Unsupervised Anomaly Detection in Deep Learning: From Autoencoders to Transformers,” International Journal of Engineering and Techniques (IJET), vol. 12, no. 1, January 2026, doi: {{doi}}.