Navin Sabharwal
New Delhi, India
Gaurav Bhardwaj
New Delhi, India
ISBN 978-1-4842-8266-3 e-ISBN 978-1-4842-8267-0
https://doi.org/10.1007/978-1-4842-8267-0
© Navin Sabharwal and Gaurav Bhardwaj 2022
Apress Standard
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
This Apress imprint is published by the registered company APress Media, LLC, part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
Preface
Before starting our AIOps journey, let's briefly discuss automation and how it has evolved over the last decade.
Automation in the technology domain is defined as a system where a process or task can be performed with minimal human supervision and action. Humans have been automating tasks forever, and eventually mechanical machines were invented that reduced human effort and increased efficiency. Today areas such as manufacturing, which were partly automated earlier, are moving to 3D printing, which completely automates the process of manufacturing; however, designing what to manufacture is still in the human domain.
Thus, automation reduces human effort and uses machines or software to complete definable, repeatable tasks.
In the IT domain, humans perform a variety of tasks, including envisioning a new product or application, developing the software that turns those requirements into a working system, deploying infrastructure and applications, and keeping them updated through their lifecycle.
IT teams have used automation extensively in every area, from software development to operations; however, this has largely been siloed and done without a formal system or approach to automation. IT teams have used scripts, runbook automation tools, job scheduling systems, and robotic process automation (RPA) systems to automate their tasks. These tools have increased efficiency and reduced the human effort required to operate IT environments.
With the increased adoption of cloud computing and DevOps principles, the provisioning of infrastructure-as-a-service and platform-as-a-service environments has also been automated, as has the deployment of applications. This has not only automated tasks and increased efficiency but also brought agility and speed, which help businesses pivot and adapt to changing market needs by quickly changing functionality and features based on customer and market feedback.
IT runs on three pillars: process, people, and technology. To automate, one needs to be aware of the interrelationships between these pillars. People use defined processes to work on technology, and with automation we are essentially automating the current processes that people use to operate an environment. However, with the changing technology landscape and the increased adoption and maturity of AI and machine learning capabilities, we can now look at the current processes and formulate new ones that leverage the transformational capabilities these technologies provide. The current processes were set up around the technology that was state of the art at the time, and they defined how humans should operate to execute tasks and accomplish a goal; with a drastic change in the technology landscape, the processes need to change and adapt. As an example, with the cloud becoming prevalent, sequential, nonautomated IT processes need to change to cater to new capabilities such as autoprovisioning. In 2013, I talked about how capacity management would drastically change in the cloud computing world and that new procedures for cloud cost management would be required. Some of these concepts have since been expanded to cover the entire financial operations piece under the umbrella of cloud FinOps.
Similarly, IT operations automation has existed in silos all this while. People used scripts, monitoring tools, runbook automation, configuration and deployment automation, and RPA tools, and they automated service management processes using ITSM tools. However, all of this happened in separate domains, connected only by point-to-point integrations that were themselves siloed.
On the technology front, AI and machine learning technologies became mainstream and were being used heavily in all aspects of business-facing and customer-facing applications, from websites to search engines to collaboration and communications tools. AI quickly took over the applications that IT delivered; however, IT was late to adopt these technologies for itself. While IT teams were using these technologies to create new applications with AI capabilities for customers, their own internal systems were still using older technologies and worked on processes and systems that had largely remained untouched for the past decade.
Through the DevOps and Agile movements, a transformational change had already reshaped the way applications were being built, tested, and deployed: most of the tasks in the development value stream were automated and integrated, resulting in organizations moving up to continuous delivery and continuous deployment. Similar to the transformation seen with DevOps, with AIOps gaining traction we are going to witness a change that will drastically alter the way IT operations have been run in the past. Old processes, systems, tools, and ways of working will give way to the AIOps way of operating and help realize the vision of NoOps, where operations work seamlessly and without disruption, in an automated way, without human supervision or intervention.
Enterprises today are at various levels of maturity when it comes to automation. Most are yet to achieve a high level of maturity in automation, partly because changing processes, breaking down the walls of organization structures, and deploying new technology are complex, time-consuming tasks. In some organizations, where digitization and cloud computing programs are part of a large-scale transformation, AIOps and the associated technology and process changes are enabling a complete digital transformation. One key barrier to adopting AIOps has been a lack of comprehensive process and technology guidance in this domain. There is limited guidance available, and most of it is focused on products that vendors are trying to sell as a one-stop solution that will leapfrog an organization to the next level. With this publication we aim to provide hands-on, pragmatic guidance on how an organization can adopt these changes and which pitfalls to avoid. Implementing AIOps is not about deploying an event correlation tool; it is about infusing AI and algorithms and automating all aspects of operations.