The day IT operations got its mojo back

The advent of site reliability engineering and observability gives new skills and techniques to the operations side of DevOps, says Andi Mann, chief technology advocate for machine data aggregator and analysis vendor, Splunk.

Mann is an Australian living in Boulder, Colorado, with a global audience. In his role, he is charged with learning and researching what is important to Splunk's customers, understanding what leading-edge customers are doing, identifying what Splunk should adopt into its product roadmap and what it can do to make customers more successful, and advocating to customers about the technologies and Splunk capabilities they can utilise to be better at what they do.

Mann is currently in Sydney to speak at Splunk Live, introducing customer stories about their innovative use of Splunk. "I always love doing Splunk Live. Any day with a customer is a good day," he says.

Mann took time from his busy schedule to speak to iTWire about what's currently caught his interest in all these discussions and research.

IT Operations

"I'm seeing a resurgence of IT Ops," he said. "So many businesses and vendors and analysts stop talking about DevOps when it comes to app release."

Two years ago, a group of Google engineers wrote a book, "Site Reliability Engineering", published by O'Reilly Media, and its concepts have gained traction. One of those concepts is observability, Mann explained.

Observability, he says, is a term that comes from industrial manufacturing, where you have systems you cannot see into. For example, a water treatment plant has pipes all over the place, and operators can't see what's happening inside them. Is the water dirty or clean? Which direction does it flow? Is the pipe full or not? To answer these questions, engineers installed purity sensors that allow them to observe from the outside, using telemetry to see what's going on inside the pipe.

Google has brought this concept into IT to describe how new operations models can get visibility into applications – and with this, get better data and better metrics, and thus get ahead of problems.

DevOps brought a lot of goodness in collaboration across the entire software development lifecycle, Mann says. It's been good for developers, giving them access to IT operations capabilities like automated software release, troubleshooting and triage.

However, "observability and software reliability engineering gives IT Ops their mojo back," Mann says.

VictorOps

Another current topic is Splunk's announcement of its agreement to acquire VictorOps.

"VictorOps has great talent, which is a significant part of why we wanted to bring that team aboard," Andi Mann says. "They have forward-looking tech which brings together teams to not just review problems but collaborate on triaging, troubleshooting and launching automation to fix problems."

"The team is fantastic," he reiterates. It also establishes an official Splunk presence in Mann's hometown of Boulder, Colorado, advancing the "Silicon Mountain" moniker.

"Google has recently built a complex for 1500 people in Boulder. This is the calibre of town Boulder is for technical talent and I'm excited we (Splunk) have an opportunity to attract and retain talent."

VictorOps is a beautiful fit for Splunk, Mann says.

The OODA loop is the decision cycle of observe, orient, decide and act, Mann explains, developed by military strategist and United States Air Force Colonel John Boyd.

The first two phases, observe and orient, are where Splunk "lives and breathes": do we have a problem at all, and what is the problem? The conventional Splunk monitoring and analytics tools, augmented by machine learning-driven event analytics, do this out of the box, allowing teams to see when things are going wrong and to identify the cause of the problem.

VictorOps comes in at phase three, decide: how do we work together to solve the problem? It provides a modern, cloud-based system that incorporates ideas around triaging and troubleshooting together. Using VictorOps, multiple people can be geographically distributed yet work in the one chatroom, pulling in Splunk dashboards and getting the right people together at the right time.

Then, when a resolution is agreed upon, automations can be kicked off from right within the VictorOps chatroom, whether they are Splunk actions or third-party integrations.

Splunk previously acquired Phantom Cyber Corporation as an orchestration solution that executes workflows implementing known processes. Phantom provides a way to execute recovery actions, and Splunk is working on how all the pieces integrate seamlessly.

Thus, with the combination of Splunk and VictorOps, Mann says, IT teams can go all the way from "Aha, I have a problem. What is it? Let's work together to get the right people to make a decision after triaging and troubleshooting, then let's use Phantom, or maybe Puppet, Chef, or something else, to go and resolve that problem."
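For readers who think in code, the four OODA phases Mann describes can be sketched as a small pipeline. This is a minimal illustration only, not Splunk, VictorOps or Phantom code: every name in it (`Incident`, `observe`, `orient`, `decide`, `act`, `run_ooda`, the 5% error-rate threshold and the runbook names) is a hypothetical stand-in, and the "decide" step, which in practice is people collaborating in a chatroom, is reduced here to a single rule.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Incident:
    """Hypothetical record carried through the four phases."""
    service: str
    error_rate: float                      # fraction of failed requests, 0.0 to 1.0
    probable_cause: Optional[str] = None
    chosen_action: Optional[str] = None
    log: List[str] = field(default_factory=list)


def observe(metrics: Dict[str, float], threshold: float = 0.05) -> Optional[Incident]:
    """Observe: do we have a problem at all? (threshold is an assumed value)"""
    for service, error_rate in metrics.items():
        if error_rate > threshold:
            return Incident(service=service, error_rate=error_rate)
    return None


def orient(incident: Incident, recent_changes: Dict[str, str]) -> Incident:
    """Orient: what is the problem? Naively blame the latest known change."""
    incident.probable_cause = recent_changes.get(incident.service, "unknown")
    incident.log.append(f"probable cause: {incident.probable_cause}")
    return incident


def decide(incident: Incident) -> Incident:
    """Decide: stand-in for a team agreeing on a resolution together."""
    if incident.probable_cause == "unknown":
        incident.chosen_action = "page_on_call"
    else:
        incident.chosen_action = "rollback"
    incident.log.append(f"decision: {incident.chosen_action}")
    return incident


def act(incident: Incident, runbooks: Dict[str, Callable[[Incident], str]]) -> Incident:
    """Act: kick off the automation that matches the agreed decision."""
    result = runbooks[incident.chosen_action](incident)
    incident.log.append(result)
    return incident


def run_ooda(metrics: Dict[str, float],
             recent_changes: Dict[str, str],
             runbooks: Dict[str, Callable[[Incident], str]]) -> Optional[Incident]:
    """Run one pass of the loop; returns None when nothing is wrong."""
    incident = observe(metrics)
    if incident is None:
        return None
    return act(decide(orient(incident, recent_changes)), runbooks)
```

Calling `run_ooda` with a metrics dictionary, a map of recent changes per service, and a dictionary of runbook callables returns the incident with a log of what each phase concluded, or `None` if no service crossed the threshold. In a real deployment each function would be backed by a different tool and, for the decision step, by people.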

This is why Splunk speaks about a "platform for engagement", Mann says. "It's not just a monitor in a corner nobody looks at, and it's not just spitting out metrics. It enables IT pros to make decisions and act on them and return service to normal, all the while engaging with different teams – it's a platform for engagement."

Splunk has been working with VictorOps for a while, Mann says. He personally facilitated some early integrations which were literally customer-led. "Customers were asking us to work together so we released a two-way integration last year, with the ability to send alerts directly out of Splunk IT Service Intelligence (ITSI) to isolate a notable event using ML and integrate it in the GUI to send an alert to VictorOps.

"Customers said that's great, we know what the problem is when it happens but we need to work together in Splunk to fix the problem. So we continued to work on that integration to literally drop Splunk dashboards into a VictorOps chatroom and see the same information and speak the same language – so this integration has been around for a year or so.

"VictorOps doesn't just solve a problem and make Splunk a better platform for engagement. It's something our customers have proven for us works in a production environment, and it's a great acquisition because we've been doing that for a year or more."


David M Williams

David has been computing since 1984, when he instantly gravitated to the family Commodore 64. He completed a Bachelor of Computer Science degree from 1990 to 1992, commencing full-time employment as a systems analyst at the end of that year. David subsequently worked as a UNIX Systems Manager, Asia-Pacific technical specialist for an international software company, Business Analyst, IT Manager, and other roles. David has been the Chief Information Officer for national public companies since 2007, delivering IT knowledge and business acumen, seeking to transform the industries within which he works. David is also involved in the user group community, the Australian Computer Society technical advisory boards, and education.

 
