Tuesday, 18 December 2018 11:40

Pure Storage: ML leads to high NPS

Pure Storage international CTO Alex McMullan

Pure Storage uses machine learning to help its customers' systems run better, and some of its customers use Pure Storage arrays to make their machine learning systems run better.

Each Pure Storage array generates between 600MB and 1GB of telemetry data per day, including behavioural data concerning workload characteristics, Pure Storage international chief technology officer Alex McMullan told iTWire.

Different types of data are directed to different streams. So temperature alerts and information about network issues flow to the help desk for immediate attention. Some issues can be fixed remotely, often before an actual fault occurs; others are brought to the customer's attention.

This type of service has led to Pure achieving an NPS (net promoter score) in the mid-80s, he said. For comparison, Macquarie Telecom claims "Australia's best" customer experience based on an NPS of 76, and the average NPS of the Australian retail industry is 15, according to the Perceptive Group.

If you think an NPS of 15 sounds low, keep in mind that retail achieved the second-highest industry-wide NPS in Australia for 2018, trailing only the charity sector which had an NPS of 27. US retail managed 54, according to Forbes.
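For reference, NPS is derived from a 0–10 "how likely are you to recommend us?" survey: the percentage of promoters (scores of 9–10) minus the percentage of detractors (0–6), giving a figure between −100 and 100. A minimal sketch of the calculation, using made-up survey responses:

```python
def nps(scores):
    """Net promoter score: % promoters (9-10) minus % detractors (0-6),
    on a -100..100 scale. Passives (7-8) count only toward the total."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round(100 * (promoters - detractors) / len(scores))

# 10 responses: 6 promoters, 1 passive, 3 detractors -> NPS of 30
print(nps([10, 9, 9, 10, 9, 9, 7, 5, 6, 3]))  # prints 30
```

A score in the mid-80s therefore means nearly all respondents are promoters and almost none are detractors.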

But back to Pure's telemetry. Applying machine learning to the data also allows the company to identify issues caused by hardware or software provided by other vendors. In addition, customers can use it to predict the effect of making changes to their arrays, such as upgrading a controller.

The company is very aware that there are significant differences between workloads. Pure Storage arrays were originally used largely in conjunction with VMware, but now software such as MongoDB and Cassandra is commonplace, and these workloads have very different characteristics in terms of storage use. So the models used to analyse the telemetry data keep changing; Pure's "data science team never stops", said McMullan.

To process all this data, Pure augments its on-premises infrastructure with AWS, which McMullan describes as "a great force multiplier."

Pure has more than 10PB of data stored on AWS, but "much more" is stored on premises. The company is moving even more data on-premises in order to take advantage of its own FlashBlade hardware to improve analytics performance.

Looking at AI more generally, McMullan sees it as "an undisciplined, unregulated space." What regulations there are vary significantly in different jurisdictions, there's no agreement on how accurate a model needs to be (see, for example, recent concerns over the accuracy of face recognition used by the police in the UK), and the 'black box' nature of most models leaves people wondering whether any conscious or unconscious bias has gone into their development.

McMullan suggests that if the international community can agree on air traffic lanes, it should be able to come up with overarching guidelines for AI.

He's not suggesting that all applications should be regarded in the same way. But there will be a high level of reliance on some AIs (eg, autonomous vehicles), so lots of ongoing checks are reasonable, especially when a given set of inputs does not necessarily lead to the same output.

It's important to realise that the computer isn't always right, he suggested.

Another issue that needs attention is data ownership (do healthcare and vehicle data belong to the individual or owner, or to the manufacturer or a third-party provider?), he said.

That raises some interesting issues. Should a hospital be allowed to train an AI using patients' data without their explicit consent? Is that consent meaningful if it was granted as part of 'take it or leave it' terms and conditions, eg where no consent means no treatment? Should future patients only benefit from their predecessors' contribution to the development of AI-assisted diagnosis and treatment if they in turn allow their data to be used in that tool's ongoing development and training?




Stephen Withers


Stephen Withers is one of Australia's most experienced IT journalists, having begun his career in the days of 8-bit 'microcomputers'. He covers the gamut from gadgets to enterprise systems. In previous lives he has been an academic, a systems programmer, an IT support manager, and an online services manager. Stephen holds an honours degree in Management Sciences and a PhD in Industrial and Business Studies.


