Artificial intelligence in the service of system administrators
Abstract
The LHCb online system relies on a large, heterogeneous IT infrastructure made up of thousands of servers running many different applications. Together they handle a wide variety of tasks, from critical ones such as data taking to secondary ones such as web servers. Administering such a system and making sure it works properly represents a very heavy workload for the small team of expert operators. Research into automating (some) system administration tasks dates back to 2001, when IBM defined the so-called "self" objectives intended to lead to autonomic computing. In this context, we present a framework that uses artificial intelligence and machine learning to monitor and diagnose Linux-based systems and their interaction with software, at a low level and in a non-intrusive way. Moreover, the multi-agent approach we use, coupled with an object-oriented architecture, should considerably increase our learning speed and highlight relations between problems.