Listed below are some of our contributions to the area of AIOps:
2025
AmocRCA: At Most One Change Segmentation and Relative Correlation Ranking for Root Cause Analysis Anton Altenbernd, Odej Kao, Zhiyuan Wu Causal Methods in Software Engineering, 33rd ACM Symposium on the Foundations of Software Engineering (FSE ’25), accepted for publication
2024
LogRCA: Log-Based Root Cause Analysis for Distributed Services Thorsten Wittkopp, Philipp Wiesner, Odej Kao 30th International European Conference on Parallel and Distributed Computing (Euro-Par), 362–376. 2024.
Investigating Memory Failure Prediction Across CPU Architectures Qiao Yu, Wengui Zhang, Min Zhou, Jialiang Yu, Zhenli Sheng, Jasmin Bogatinovski, Jorge Cardoso, Odej Kao 54th IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S). 2024.
Unveiling DRAM Failures across Different CPU Architectures in Large-Scale Datacenters Qiao Yu, Jorge Cardoso, Odej Kao 44th IEEE International Conference on Distributed Computing Systems (ICDCS), 1-2. 2024.
Baiji: Domain planning for CDNs under the 95th percentile billing model Juan Vanerio, Huiran Liu, Qi Zhang, and Stefan Schmid. 2024 IFIP Networking Conference (IFIP Networking), pages 1–9, 2024.
Tero: Offloading CDN traffic to massively distributed devices Juan Vanerio, Lily Hügerich, and Stefan Schmid 25th International Conference on Distributed Computing and Networking, ICDCN ’24, pages 186–198, New York, NY, USA, 2024.
2023
Progressing from Anomaly Detection to Automated Log Labeling and Pioneering Root Cause Analysis Thorsten Wittkopp, Alexander Acker, Odej Kao 2023 IEEE International Conference on Data Mining Workshops (ICDMW), 1231–1239. 2023.
Exploring Error Bits for Memory Failure Prediction: An In-Depth Correlative Study Qiao Yu, Wengui Zhang, Jorge Cardoso, Odej Kao 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). 2023.
Auto-Logging: AI-centered Logging Instrumentation
Jasmin Bogatinovski, Odej Kao
45th International Conference on Software Engineering, 2023.
PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning Thorsten Wittkopp, Dominik Scheinert, Philipp Wiesner, Alexander Acker, Odej Kao 56th Hawaii International Conference on System Sciences (HICSS), 1376–1385. 2023.
HiMFP: Hierarchical Intelligent Memory Failure Prediction for Cloud Service Reliability Qiao Yu, Wengui Zhang, Soroush Haeri, Paolo Notaro, Jorge Cardoso, Odej Kao 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). 2023.
Towards a Peer-to-Peer Data Distribution Layer for Efficient and Collaborative Resource Optimization of Distributed Dataflow Applications Dominik Scheinert, Soeren Becker, Jonathan Will, Luis Englaender, Lauritz Thamsen IEEE International Conference on Big Data (BigData), 2339–2345. 2023.
2022
Failure Identification from Unstable Log Data using Deep Learning
Bogatinovski Jasmin, Nedelkoski Sasho, Wu Li, Cardoso Jorge, Kao Odej
22nd International Symposium on Cluster, Cloud and Internet Computing IEEE Press New York
Leveraging Log instructions for Log-based Anomaly Detection
Bogatinovski Jasmin, Madjarov Gjorgij, Nedelkoski Sasho, Cardoso Jorge, Kao Odej
2022 IEEE International Conference on Services Computing IEEE Press New York
First CE Matters: On the Importance of Long Term Properties on Memory Failure Prediction
Bogatinovski Jasmin, Qiao Yu, Cardoso Jorge, Kao Odej
2022 IEEE International Conference on Big Data IEEE Press New York
QuLog: Data-Driven Approach for Log Instruction Quality Assessment
Bogatinovski Jasmin, Nedelkoski Sasho, Acker Alexander, Cardoso Jorge, Kao Odej
30th International Conference on Program Comprehension ACM Press (Association for Computing Machinery) New York
A Taxonomy of Anomalies in Log Data
Wittkopp Thorsten, Wiesner Philipp, Scheinert Dominik, Kao Odej
2021 International Conference on Service Oriented Computing Springer Nature Dubai
LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision
Wittkopp Thorsten, Acker Alexander, Wiesner Philipp, Scheinert Dominik
2021 International Conference on Service Oriented Computing Springer Nature Dubai 2022
R-MPLS: Recursive protection for highly dependable MPLS networks Stefan Schmid, Morten Konggaard Schou, Jiří Srba, and Juan Vanerio. The 18th International Conference on Emerging Networking EXperiments and Technologies, CoNEXT ’22, page 276–292, New York, NY, USA, 2022
MPLS-Kit: An MPLS data plane toolkit Juan Vanerio, Stefan Schmid, Morten Konggaard Schou, and Jiří Srba. The 18th International Conference on Emerging Networking EXperiments and Technologies, CoNEXT ’22, page 276–292, New York, NY, USA, 2022
Improved throughput for all-or-nothing multicommodity flows with arbitrary demands Anya Chaturvedi, Chandra Chekuri, Andréa W. Richa, Matthias Rost, Stefan Schmid and Jamison Weber. SIGMETRICS Perform. Eval. Rev., 49(3):22–27, March 2022
2021
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models
Harald Ott, Bogatinovski Jasmin, Alexander Acker, Nedelkoski Sasho, and Odej Kao
In Proceedings of the 43rd International Conference on Software Engineering (ICSE 2021), Workshop on Cloud Intelligence 2021. To appear.
arXiv
Learning dependencies in distributed cloud applications to identify and localize anomalies
Scheinert D., Acker A., Thamsen L., Geldenhuys M. K., Kao O.
In 43rd International Conference on Software Engineering. To appear. ACM, 2021.
arXiv
Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper
Bogatinovski J., Nedelkoski S., Acker A., Schmidt F., Wittkopp T., Becker S., Cardoso J., Kao O.
arXiv
2020
Self-Supervised Anomaly Detection from Distributed Traces
Bogatinovski J., Nedelkoski S., Cardoso J., Kao O. 2020
IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), Leicester, UK, 2020, pp. 342-347.
arXiv
Self-Supervised Log Parsing
Bogatinovski J., Nedelkoski S., Acker A., Cardoso J., Kao O.
In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2020, pages 1–742, 2020
arXiv
Self-attentive classification-based anomaly detection in unstructured logs
Nedelkoski S., Bogatinovski J., Acker A., Cardoso J., Kao O.
In ICDM 2020: 20th IEEE International Conference on Data Mining, pages 1196–1201
arXiv
Multi-source distributed system data for AI-powered analytics
Nedelkoski S., Bogatinovski J., Mandapati AK., Becker S., Cardoso J., Kao O.
In ESOCC 2020: European Conference On Service-Oriented And Cloud Computing, pages 161–176. Springer International Publishing
Zenodo