Listed below are some of our contributions to the area of AIOps:

2023

Auto-Logging: AI-centered Logging Instrumentation
Bogatinovski Jasmin, Kao Odej
45th International Conference on Software Engineering, 2023.

PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning
Thorsten Wittkopp, Dominik Scheinert, Philipp Wiesner, Alexander Acker, Odej Kao
56th Hawaii International Conference on System Sciences (HICSS), 1376–1385. 2023.

HiMFP: Hierarchical Intelligent Memory Failure Prediction for Cloud Service Reliability
Qiao Yu, Zhang Wengui, Haeri Soroush, Notaro Paolo, Jorge Cardoso, Odej Kao
53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). 2023.

Karasu: A Collaborative Approach to Efficient Cluster Configuration for Big Data Analytics
Dominik Scheinert, Philipp Wiesner, Thorsten Wittkopp, Lauritz Thamsen, Jonathan Will, Odej Kao
42nd International Performance Computing and Communications Conference (IPCCC), 403–412. 2023.

Software-in-the-Loop Simulation for Developing and Testing Carbon-Aware Applications
Philipp Wiesner, Marvin Steinke, Henrik Nickel, Yazan Kitana, Odej Kao
Software: Practice and Experience, 1–15. 2023.

Towards Benchmarking Power-Performance Characteristics of Federated Learning Clients
Pratik Agrawal, Philipp Wiesner, Odej Kao
2nd Workshop on Machine Learning Networking (MaLeNe) at NetSys ’23. 2023.

Selecting Efficient Cluster Resources for Data Analytics: When and How to Allocate for In-Memory Processing?
Jonathan Will, Lauritz Thamsen, Dominik Scheinert, Odej Kao
International Conference on Scientific and Statistical Database Management (SSDBM), 1–4. 2023.

Offloading Real-Time Tasks in IIoT Environments under Consideration of Networking Uncertainties
Ilja Behnke, Philipp Wiesner, Paul Voelker, Odej Kao
2nd International Workshop on Middleware for the Edge (MiddleWEdge) at Middleware ’23. 2023.

Towards a Peer-to-Peer Data Distribution Layer for Efficient and Collaborative Resource Optimization of Distributed Dataflow Applications
Dominik Scheinert, Soeren Becker, Jonathan Will, Luis Englaender, Lauritz Thamsen
IEEE International Conference on Big Data (BigData), 2339–2345. 2023.

Predicting Dynamic Memory Requirements for Scientific Workflow Tasks
Jonathan Bader, Nils Diedrich, Lauritz Thamsen, Odej Kao
2023 IEEE International Conference on Big Data (Big Data). 2023.

2022

Failure Identification from Unstable Log Data using Deep Learning
Bogatinovski Jasmin, Nedelkoski Sasho, Wu Li, Cardoso Jorge, Kao Odej
22nd International Symposium on Cluster, Cloud and Internet Computing IEEE Press New York

Leveraging Log instructions for Log-based Anomaly Detection
Bogatinovski Jasmin, Madjarov Gjorgij, Nedelkoski Sasho, Cardoso Jorge, Kao Odej
2022 IEEE International Conference on Services Computing IEEE Press New York

First CE Matters: On the Importance of Long Term Properties on Memory Failure Prediction
Bogatinovski Jasmin, Qiao Yu, Cardoso Jorge, Kao Odej
2022 IEEE International Conference on Big Data IEEE Press New York

QuLog: Data-Driven Approach for Log Instruction Quality Assessment
Bogatinovski Jasmin, Nedelkoski Sasho, Acker Alexander, Cardoso Jorge, Kao Odej
30th International Conference on Program Comprehension ACM Press (Association for Computing Machinery) New York

A Taxonomy of Anomalies in Log Data
Wittkopp Thorsten, Wiesner Philipp, Scheinert Dominik, Kao Odej
2021 International Conference on Service Oriented Computing Springer Nature Dubai

LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision
Wittkopp Thorsten, Acker Alexander, Wiesner Philipp, Scheinert Dominik
2021 International Conference on Service Oriented Computing Springer Nature Dubai 2022

A2Log: Attentive Augmented Log Anomaly Detection
Wittkopp Thorsten, Acker Alexander, Bogatinovski Jasmin, Nedelkoski Sasho, Scheinert Dominik, Fan Wu, Kao Odej
55th Hawaii International Conference on System Sciences University of Hawaii at Manoa Hamilton Library USA 2022

Leveraging Reinforcement Learning for Task Resource Allocation in Scientific Workflows
Bader Jonathan, Zunker Nicolas, Becker Sören, Kao Odej
2022 IEEE International Conference on Big Data IEEE Press New York

Towards Advanced Monitoring for Scientific Workflow
Bader Jonathan, Witzke Joel, Becker Sören, Lößer Ansgar, Lehmann Fabian, Döhler Leon, Anhduc Vu
2022 IEEE International Conference on Big Data IEEE Press New York

Lotaru: Locally Estimating Runtimes of Scientific Workflow Tasks in Heterogeneous Clusters
Bader Jonathan, Lehmann Fabian, Thamsen Lauritz, Will Jonathan, Leser Ulf, Kao Odej
34th International Conference on Scientific and Statistical Database Management (SSDBM 2022) ACM Press (Association for Computing Machinery) New York

Efficient Runtime Profiling for Black-box Machine Learning Services on Sensor Streams
Becker Sören, Scheinert Dominik, Schmidt Florian, Kao Odej
6th IEEE International Conference on Fog and Edge Computing 2022 IEEE Press New York

Network Emulation in Large-Scale Virtual Edge Testbeds: A Note of Caution and the Way Forward
Becker Sören, Scheinert Dominik, Schmidt Florian, Kao Odej
6th IEEE International Conference on Fog and Edge Computing 2022 IEEE Press New York

Probabilistic Time Series Forecasting for Adaptive Monitoring in Edge Computing Environments
Scheinert Dominik, Aghdam Babak, Becker Sören, Thamsen Lauritz, Kao Odej
2022 IEEE International Conference on Big Data IEEE Press New York

Let’s Wait Awhile: How Temporal Workload Shifting Can Reduce Carbon Emissions in the Cloud
Wiesner Philipp, Behnke Ilja, Scheinert Dominik, Gontarska Kordian, Thamsen Lauritz
22nd International ACM/IFIP Middleware Conference ACM Press (Association for Computing Machinery) Quebec

Cucumber: Renewable-Aware Admission Control for Delay-Tolerant Cloud and Edge Workloads
Wiesner Philipp, Scheinert Dominik, Wittkopp Thorsten, Thamsen Lauritz, Kao Odej
28th International European Conference on Parallel and Distributed Computing (Euro-Par) Springer Nature Glasgow 2022

2021

Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models
Harald Ott, Bogatinovski Jasmin, Alexander Acker, Nedelkoski Sasho, Odej Kao
In Proceedings of the 43rd International Conference on Software Engineering (ICSE 2021) Workshop on Cloud Intelligence 2021. To appear.
arXiv

MicroDiag: Fine-grained Performance Diagnosis for Microservice Systems
Wu L., Tordsson J., Bogatinovski J., Elmroth E., Kao O.
In Proceedings of the 43rd International Conference on Software Engineering (ICSE 2021) Workshop on Cloud Intelligence 2021
HAL

Learning dependencies in distributed cloud applications to identify and localize anomalies
Scheinert D., Acker A., Thamsen L., Geldenghuys M. K., Kao O.
In 43rd International Conference on Software Engineering, To appear. ACM, 2021.
arXiv

Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper
Bogatinovski J., Nedelkoski S., Acker A., Schmidt F., Wittkopp T., Becker S., Cardoso J., Kao O.
arXiv

2020

Self-Supervised Anomaly Detection from Distributed Traces
Bogatinovski J., Nedelkoski S., Cardoso J., Kao O.
IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), Leicester, UK, 2020, pp. 342-347.
arXiv

Multi-Source Anomaly Detection in Distributed IT Systems
Bogatinovski J., Nedelkoski S.
In 18th International Conference on Service-Oriented Computing, To appear, Dubai, United Arab Emirates, December 2020
arXiv

Self-Supervised Log Parsing
Bogatinovski J., Nedelkoski S., Acker A., Cardoso J., Kao O.
In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2020, pages 1–742, 2020
arXiv

Self-attentive classification-based anomaly detection in unstructured logs
Nedelkoski S., Bogatinovski J., Acker A., Cardoso J., Kao O.
In ICDM 2020: 20th IEEE International Conference on Data Mining, pages 1196–1201
arXiv

Multi-source distributed system data for AI-powered analytics
Nedelkoski S., Bogatinovski J., Mandapati AK., Becker S., Cardoso J., Kao O.
In ESOCC 2020: European Conference On Service-Oriented And Cloud Computing, pages 161–176. Springer International Publishing
Zenodo

Superiority of Simplicity: A Lightweight Model for Network Device Workload Prediction
Acker A., Wittkopp T., Nedelkoski S., Bogatinovski J., Kao O.
In 15th Conference on Computer Science and Information Systems, pages 7–10. IEEE, 2020.
arXiv

Optimizing convergence for iterative learning of ARIMA for stationary time series
Styp-Rekowski K., Schmidt F., Kao O.
In 2020 IEEE International Conference on Big Data, To appear. IEEE, 2020.
arXiv

Learning more expressive joint distributions in multimodal variational methods
Nedelkoski S., Bogojevski M., Kao O.
In 2020 International Conference on Machine Learning, Optimization, and Data Science, LOD 2020, pages 137–149, 2020.
arXiv

Performance diagnosis in cloud microservices using deep learning
Wu L., Bogatinovski J., Nedelkoski S., Tordsson J., and Kao O.
In 18th International Conference on Service-Oriented Computing, To appear, Dubai, United Arab Emirates, December 2020. Springer.
arXiv

MicroRAS: Automatic recovery in the absence of historical failure data for microservice systems
Wu L., Tordsson J., Acker A., Kao O.
In 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), pages 227–236. IEEE, 2020
arXiv

MicroRCA: Root cause localization of performance issues in microservices
Wu L., Tordsson J., Elmroth E., Kao O.
In NOMS 2020 IEEE/IFIP Network Operations and Management Symposium, pages 1–9. IEEE, 2020.
arXiv

Towards AIOps in edge computing environments
Becker S., Schmidt F., Gulenko A., Acker A., Kao O.
In 2020 IEEE International Conference on Big Data, pages 3470–3475. IEEE, 2020
arXiv

AI-governance and levels of automation for AIOps-supported system administration
Gulenko A., Acker A., Kao O., Liu F.
In The 29th International Conference on Computer Communications and Networks, pages 1–6. IEEE, 2020
link

Bitflow: An in situ stream processing framework
Gulenko A., Acker A., Schmidt F., Becker S., Kao O.
In International Conference on Autonomic Computing and Self-Organizing Systems, pages 182–187. IEEE, 2020.
link

Telesto: A graph neural network model for anomaly classification in cloud services
Scheinert D., Acker A.
In 18th International Conference on Service-Oriented Computing, To appear, Dubai, United Arab Emirates, December 2020
arXiv

Decentralized federated learning preserves model and data privacy
Wittkopp T., Acker A.
In 18th International Conference on Service-Oriented Computing, To appear, Dubai, United Arab Emirates, December 2020
arXiv

Sensor artificial intelligence and its application to space systems - a whitepaper
Bearner A., Hübers H. M., Kao O., Schmidt F., Becker S., Denzler J., Matolin D., Haber D., Lucia S., Samek W., et al.
arXiv

Autoencoder-based condition monitoring and anomaly detection method for rotating machines
Ahmad S., Styp-Rekowski K., Nedelkoski S., Kao O.
In 2020 IEEE International Conference on Big Data, To appear. IEEE, 2020.
arXiv

Mary, Hugo, and Hugo*: Learning to schedule distributed data-parallel processing jobs on shared clusters
Tran V. T., Nedelkoski S., Thamsen L., Beilharz J., Kao O.
Concurrency and Computation: Practice and Experience, page e5823, 2020.
link