
The day IT operations got its mojo back

The advent of site reliability engineering and observability gives new skills and techniques to the operations side of DevOps, says Andi Mann, chief technology advocate for machine data aggregator and analysis vendor, Splunk.

Mann is an Australian living in Boulder, Colorado, with a global audience. In his role, he is charged with learning and researching what is important to Splunk's customers, understanding what leading-edge customers are doing, identifying what Splunk should adopt into its product roadmap and what it can do to make customers' lives more successful, and advocating to customers about the technologies and Splunk capabilities they can use to be better at what they do.

Mann is currently in Sydney to speak at Splunk Live, introducing customer stories about their innovative use of Splunk. "I always love doing Splunk Live. Any day with a customer is a good day," he says.

Mann took time from his busy schedule to speak to iTWire about what's currently caught his interest in all these discussions and research.

IT Operations

"I'm seeing a resurgence of IT Ops," he said. "So many businesses and vendors and analysts stop talking about DevOps when it comes to app release."

Two years ago, a group of Google engineers wrote a book, "Site Reliability Engineering", published by O'Reilly Media, and its concepts have gained traction. Part of this includes observability, Mann explained.

Observability, he says, is a term that comes from industrial manufacturing, where you have systems you cannot see into. A water treatment plant, for example, has pipes all over the place, and operators can't see what's happening inside them. Is the water dirty or clean? Which direction does it flow? Is the pipe full or not? To answer these questions, engineers installed purity sensors, which allow them to observe from the outside, using telemetry to see what's going on inside the pipe.

Google has brought this concept into IT to describe how new operations models can get visibility into applications – and with this, get better data and better metrics, and thus get ahead of problems.

DevOps brought a lot of goodness in collaboration across the entire software development lifecycle, Mann says. It's been good for developers, giving them access to IT operations capabilities like automated software release, troubleshooting and triage.

However, "observability and software reliability engineering gives IT Ops their mojo back," Mann says.

VictorOps

Another current topic is Splunk's announcement of its agreement to acquire VictorOps.

"VictorOps has great talent, which is a significant part of why we wanted to bring that team aboard," Andi Mann says. "They have forward-looking tech which brings together teams to not just review problems but collaborate on triaging, troubleshooting and launching automation to fix problems."

"The team is fantastic," he reiterates. The acquisition also establishes an official Splunk presence in Mann's hometown of Boulder, Colorado, advancing the "Silicon Mountain" moniker.

"Google has recently built a complex for 1500 people in Boulder. This is the calibre of town Boulder is for technical talent and I'm excited we (Splunk) have an opportunity to attract and retain talent."

VictorOps is a beautiful fit for Splunk, Mann says.

The OODA loop is the decision cycle of observe, orient, decide and act, Mann explains, developed by military strategist and United States Air Force Colonel John Boyd.

The first two phases, observe and orient, are where Splunk "lives and breathes": do we have a problem at all, and what is the problem? The conventional Splunk monitoring and analytics tools, augmented by machine learning-driven event analytics, do this out of the box, allowing teams to see when things are going wrong and identify the notable cause of the problem.

VictorOps comes in at phase three (how do we work together to solve the problem?), providing a modern, cloud-based system that incorporates ideas around triaging and troubleshooting together. Using VictorOps, multiple people can be geographically distributed but work in the one chatroom, pulling in Splunk dashboards and getting the right people together at the right time.

Then, when the resolution is agreed upon, automations can be kicked off from right within the VictorOps chatroom, whether they are Splunk actions or other third-party integrations.

Splunk previously acquired Phantom Cyber Corporation as an orchestration solution, to execute a workflow to implement known processes. Phantom gives the opportunity to execute recovery actions, working on how all the pieces are seamlessly integrated.

Thus, with the combination of Splunk and VictorOps, Mann says, IT teams can go all the way from "Aha, I have a problem. What is it? Let's work together to get the right people to make a decision after triaging and troubleshooting, then let's use Phantom, or maybe Puppet, Chef, or something else, to go and resolve that problem."
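For readers who think in code, the observe, orient, decide and act cycle Mann describes can be sketched roughly as below. This is only an illustrative toy in Python, not Splunk, VictorOps or Phantom code: every name in it (Incident, observe, orient, decide, act, ooda, the on_call and runbooks tables, the latency_ms metric) is a hypothetical stand-in, and the simple deviation check is an assumption standing in for real event analytics.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class Incident:
    """Hypothetical record passed between the four phases."""
    metric: str
    value: float
    baseline: float
    responders: list = field(default_factory=list)
    action: str = ""
    resolved: bool = False


def observe(readings, threshold=3.0):
    """Observe: do we have a problem at all?

    Compares the latest reading with the earlier ones and returns
    (baseline, latest) when it deviates by more than `threshold`
    standard deviations, otherwise None.
    """
    history, latest = readings[:-1], readings[-1]
    baseline, spread = mean(history), pstdev(history)
    if spread == 0:
        return None  # no variation to compare against in this toy check
    if abs(latest - baseline) / spread > threshold:
        return baseline, latest
    return None


def orient(metric, anomaly):
    """Orient: what is the problem? Wrap the anomaly in an Incident."""
    baseline, latest = anomaly
    return Incident(metric=metric, value=latest, baseline=baseline)


def decide(incident, on_call, runbooks):
    """Decide: gather the right people and agree on a resolution."""
    incident.responders = list(on_call.get(incident.metric, []))
    incident.action = runbooks.get(incident.metric, "")
    return incident


def act(incident):
    """Act: stand-in for kicking off an automation.

    Marks the incident resolved only if a resolution was agreed.
    """
    incident.resolved = bool(incident.action)
    return incident


def ooda(metric, readings, on_call, runbooks):
    """Run one pass of the loop; returns None when nothing is wrong."""
    anomaly = observe(readings)
    if anomaly is None:
        return None
    return act(decide(orient(metric, anomaly), on_call, runbooks))


# Hypothetical lookup tables for the example.
on_call = {"latency_ms": ["alice", "bob"]}
runbooks = {"latency_ms": "restart_service"}

incident = ooda("latency_ms", [100, 102, 98, 101, 99, 250], on_call, runbooks)
quiet = ooda("latency_ms", [100, 102, 98, 101, 99, 100], on_call, runbooks)
```

In this sketch the first call produces an incident with responders and an agreed action attached, while the second, with a steady final reading, returns nothing because the observe phase never raises a problem.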

This is why Splunk speaks about a "platform for engagement", Mann says. "It's not just a monitor in a corner nobody looks at, and it's not just spitting out metrics. It enables IT pros to make decisions and act on them and return service to normal, all the while engaging with different teams – it's a platform for engagement."

Splunk has been working with VictorOps for a while, Mann says. He personally facilitated some early integrations which were literally customer-led. "Customers were asking us to work together so we released a two-way integration last year, with the ability to send alerts directly out of Splunk IT Service Intelligence (ITSI) to isolate a notable event using ML and integrate it in the GUI to send an alert to VictorOps.

"Customers said that's great, we know what the problem is when it happens but we need to work together in Splunk to fix the problem. So we continued to work on that integration to literally drop Splunk dashboards into a VictorOps chatroom and see the same information and speak the same language – so this integration has been around for a year or so.

"VictorOps doesn't just solve a problem and make Splunk a better platform for engagement. It's something our customers have proven for us works in a production environment, and it's a great acquisition because we've been doing that for a year or more."


David M Williams

David has been computing since 1984, when he instantly gravitated to the family Commodore 64. He completed a Bachelor of Computer Science degree from 1990 to 1992, commencing full-time employment as a systems analyst at the end of that year. David subsequently worked as a UNIX Systems Manager, Asia-Pacific technical specialist for an international software company, Business Analyst, IT Manager, and other roles. David has been the Chief Information Officer for national public companies since 2007, delivering IT knowledge and business acumen, seeking to transform the industries within which he works. David is also involved in the user group community, the Australian Computer Society technical advisory boards, and education.

 
