The Rise and Future of AIOps: How ML Can Resolve Multicloud Complexity

“CIOs today are caught in between a rock and a hard place. The rock is
maintaining high availability while doing that as cheaply as possible. The
hard place is enabling digital transformation and increasing business and
developer velocity,” said Assaf Resnick, CEO and Co-Founder of
BigPanda, at the opening session of RESOLVE ’22.

Every CIO feels this paradox and tradeoff daily. One long major incident can
disrupt a CIO’s transformation agenda, particularly when many business
leaders believe that apps running in the cloud should deliver 100 percent
reliability. Incident management is a challenge for many IT organizations,
and in StarCIO’s recent research, 70 percent of respondents said it
typically takes over three hours to resolve high-priority (P1) issues. Those
issues get more complex, not less, when an organization runs hybrid or
multicloud environments. So CIOs are looking for answers.

“The rise of AIOps” was the theme at RESOLVE ’22, a community event for IT
Ops, NOC, DevOps, and SRE professionals. I attended the conference as I’ve
been writing and reviewing AIOps for several years, and I looked forward to
hearing how IT Ops professionals were transforming.

Sanjay Poonen, former COO at VMware and President at SAP, offered advice on
resolving this tension between operations and transformation. He said, “The
best CIOs understand how
software is being developed and model their organization after not just a
startup, but the largest company they can find that’s operating like a
startup.”

Many CIOs understand this charter, and their digital transformations aim to
modernize their business with improved customer experiences, real-time
analytics capabilities, and efficiencies through workflow automation.

For IT Operations, operating as an enterprise startup means teams must:

  • Provide higher reliability and performance that exceeds end-user
    expectations while operating in more complex multicloud and edge
    environments
  • Improve the mean time to recover (MTTR) from incidents and perform
    accurate root cause analysis despite the “astronomical data volumes,” as
    one session panelist remarked
  • Focus IT Ops on higher-value work by creating ITSM automations that can
    also reduce errors and eliminate shoulder tapping as an acceptable form of
    communication
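
To make the MTTR goal concrete, here is a minimal Python sketch of how the metric could be computed from incident open and resolve times. The incident timestamps and the function name are hypothetical and chosen only for illustration; real ITSM tools define and report MTTR in their own ways.

```python
from datetime import datetime, timedelta


def mean_time_to_recover(incidents):
    """Average time from open to resolve across (opened, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual recovery durations, starting from a zero timedelta.
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical P1 incidents as (opened, resolved) timestamps.
incidents = [
    (datetime(2022, 11, 1, 9, 0), datetime(2022, 11, 1, 12, 30)),   # 3.5 hours
    (datetime(2022, 11, 3, 14, 0), datetime(2022, 11, 3, 16, 0)),   # 2 hours
    (datetime(2022, 11, 7, 22, 15), datetime(2022, 11, 8, 2, 45)),  # 4.5 hours
]

mttr = mean_time_to_recover(incidents)
print(mttr)  # 3:20:00
```

In this made-up sample the average lands just over the three-hour mark cited in the research above, even though one of the three incidents was resolved in two hours.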

The Rise of AIOps

Panelists in the “AIOps 2022-2025: Expert predictions” keynote panel shared
insights on why IT Ops needs AIOps. James Maguire, Editor in Chief at eWeek,
put today’s reality in simple terms, stating, “Today’s IT systems are too
complex for humans to run them.”

It’s not just about going from the data center to the cloud, multicloud, and
edge computing. Over the last decade, IT shifted from three-tiered to
microservices architectures, virtual servers to serverless computing, and
behind-the-firewall software to ecosystems of integrated SaaS, low-code, and
customer-facing applications.

Now, ask IT Ops which service is the root cause of an outage when dozens to
hundreds of observable systems are pumping out telemetry data. The answer
is rarely easy to find, and while teams search for it, the incident
persists.

Carlos Casanova, Principal Analyst at Forrester, explained the pressure
everyone in IT feels daily. He said, “If that system goes down, someone’s on
social media complaining about it within a half a second. The public
relations fallout can have a greater financial impact than the system being
down.”

IT can’t solve these issues today by hiring more people. As Eric Noeth,
Partner at Advent, explains, “There’s an incredible tailwind around
complexity, and it’s also against a market backdrop of the real scarcity of
talent.”

AIOps is a Strategy that Transforms IT Operations

The rise of AIOps closes this gap by centralizing IT data, providing machine
learning capabilities to improve ITSM and IT Ops, and enabling automations
across IT’s ecosystem of tools and technologies.

I was on the panel “How AIOps works in the real world” with Sean McDermott,
CEO of Windward Consulting, and he shared one of his secrets on implementing
successful AIOps programs. He
said, “We see AIOps as a strategy, not a tool. AI and machine learning
entering into the operations domain is transformational. The use cases we’re
focusing on now around event management, correlation, and root cause
analysis are just the beginning of the journey.”
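
For readers unfamiliar with event correlation, the sketch below shows the basic idea in Python: alerts that arrive close together in time are grouped into one candidate incident, so responders see a few incidents instead of a stream of individual alerts. This is a deliberately simple time-window heuristic, not a description of how any panelist's or vendor's product works; the `Alert` fields, the sample alerts, and the 120-second window are all assumptions made for illustration. Production AIOps platforms typically also draw on topology, change data, and machine learning.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point (hypothetical)
    service: str
    message: str


def correlate(alerts, window_seconds=120.0):
    """Group alerts whose gap to the previous alert is within window_seconds."""
    groups = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        # Extend the current group if this alert is close enough to its last one.
        if groups and alert.timestamp - groups[-1][-1].timestamp <= window_seconds:
            groups[-1].append(alert)
        else:
            groups.append([alert])
    return groups


# Hypothetical alerts from four services.
alerts = [
    Alert(0, "db", "connection pool exhausted"),
    Alert(30, "api", "elevated 5xx rate"),
    Alert(75, "web", "checkout latency high"),
    Alert(900, "batch", "nightly job failed"),
]

groups = correlate(alerts)
for group in groups:
    print([a.service for a in group])
```

Here the first three alerts fall into one group and the batch failure into another. The earliest alert in a group is often a reasonable first place to look for a root cause, though it is only a starting hint.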

Many organizations start AIOps with a POC in a single domain and address
several primary problems, such as reducing MTTR in a mission-critical
business process. We described how setting up centers of excellence and
building product management disciplines can help IT leaders expand AIOps
from a single domain and use case to a platform leveraged across the
enterprise.

The Future of AIOps

Since we’re only at the beginning of the AIOps journey, it’s worth asking
what capabilities might become available in the near future.
Panelists on the “AIOps 2022-2025: Expert predictions” keynote panel had
their answers.

Michael Yamnitsky, Managing Director at Insight Partners, believes AIOps
platforms will “shift to domain-specific insights” and move beyond incident
management, root cause analysis, and automation capabilities. He said, “We
will see AIOps tools deliver more insights to the DevOps and engineering
teams that are actually responsible for fixing the problems.”

Maguire added, “We need to be able to predict the problems before they
happen. We can’t just respond to them anymore and need predictive
analytics.”

These are promising capabilities and possibly the closest thing to an “easy
button” for IT Ops facing greater business pressures and technology
complexities. The rise of AIOps has started in many organizations such as
Sony PlayStation, Cardinal Health, Wells Fargo, and Honeywell, and listening
to their stories, you can see how AIOps is truly transforming IT operations.

BigPanda sponsored this conference and content.

The views and opinions expressed herein are those of the author and do
not necessarily represent the views and opinions of BigPanda.

The original article can be found at: Star CIO