Monday, 12 November 2018 23:56

Cloud move means sky's the limit for Australian genome research


The human genome consists of about three billion DNA base pairs and it takes 100GB to represent the unique sequence for a person. Australian National University researchers have turned to the cloud to enable clinical applications.

The Human Genome Project was a vast, long-running, internationally collaborative effort to determine the sequence of nucleotide base pairs that make up human DNA, and to identify and map all the genes of the human genome, both physically and functionally. It remains the world’s largest collaborative biological project.

The genome itself, however, is massive. Who’d have thought humans are so complex? With a genome of about three billion DNA base pairs (roughly six billion counting both inherited copies), it takes 100GB to store the unique genetic sequence for any individual human being as a string of text using the letters A, T, C and G, which stand for the bases adenine, thymine, cytosine and guanine.
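As a rough illustration of what storing a sequence "as a string of text" costs, here is a minimal Python sketch. It is not anything ANU or BizData describe using; the function names, the `coverage` parameter and the 2-bit packing scheme are assumptions chosen for illustration.

```python
# Illustrative only: compares plain-text storage with a 2-bit packing.
# Names and the A/C/G/T -> 0..3 mapping are assumptions, not a standard format.

_CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def text_bytes(num_bases: int, coverage: int = 1) -> int:
    """Bytes needed to store bases as text, one ASCII letter per base.

    `coverage` models how many times each position is read by a sequencer.
    """
    return num_bases * coverage


def packed_bytes(num_bases: int) -> int:
    """Bytes needed when each base is packed into 2 bits (4 bases per byte)."""
    return (num_bases + 3) // 4


def pack(seq: str) -> bytes:
    """Pack an A/C/G/T string into 2 bits per base (raises KeyError otherwise)."""
    out = bytearray(packed_bytes(len(seq)))
    for i, base in enumerate(seq):
        out[i // 4] |= _CODES[base] << (2 * (i % 4))
    return bytes(out)
```

Sequencers read each position many times, so raw data is a multiple of the genome length; that multiplier is what `coverage` stands in for here.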

Researchers have already dealt with the issues of collecting and storing genome data, but actually analysing it, to understand and identify disease markers and explore the difference between healthy and cancerous cells, has previously been a slow and complex affair. Specialised high-performance computing (HPC) hardware has been employed, but because it is expensive to buy, researchers have had to queue to use the few machines available.

The Australian National University turned to the cloud, working with Microsoft partner BizData, and found it’s taken away the administrative overhead of managing HPC devices, and more importantly, it’s also accelerating research with more computing power at lower costs.

ANU’s Department of Genome Science had access to 30-40 local servers and a number of shared HPC environments. The researchers had workstations armed with 16 cores, 10TB storage and 128GB RAM. Originally, the cloud was viewed as a way of giving temporary boosts in power during peak demand, charged by usage, instead of having to buy another HPC server.

However, while the cost reduction was expected (cloud costs came to a quarter of those of managing their own hardware), ANU also found they received four times the computational power they had on-premises for that cost.
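Taken together, the two figures imply a larger gain in compute per dollar than either suggests alone. The short Python sketch below works through that arithmetic; treating the two ratios as independent multipliers is an assumption about how to read the figures, and the function name is hypothetical.

```python
def price_performance_gain(cost_ratio: float, power_ratio: float) -> float:
    """Relative compute-per-dollar versus the on-premises baseline.

    cost_ratio:  cloud cost divided by on-premises cost (0.25 = a quarter).
    power_ratio: cloud compute divided by on-premises compute (4.0 = four times).
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return power_ratio / cost_ratio


# A quarter of the cost and four times the power, as reported in the article.
gain = price_performance_gain(cost_ratio=0.25, power_ratio=4.0)
print(f"{gain:.0f}x compute per dollar")
```

On that reading, the reported figures amount to roughly sixteen times the computing power per dollar spent.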

Biological data science research fellow, Dr Sebastian Kurscheid, of ANU’s John Curtin School of Medical Research, explained that cloud computing promised to accelerate the focus on the health aspects of genome research. “There are questions about how medical genomics has a more increasing relevance to clinical practice. It’s already important in the field of rare diseases but it’s becoming more and more relevant also in more common diseases,” he said.

Dr Kurscheid says he has spent about a third of his three-and-a-half-year research program getting technical elements in place so he could run HPC analytics over many large data sets. Based on his work within Azure so far, he says that moving earlier to a cloud-based solution would have saved nine months, freeing him to focus on research.

“Our focus at BizData has been to deliver a seamless experience for researchers using the Microsoft Cloud. For example, today we enable a researcher to take an existing pipeline (for example in Snakemake or Galaxy) that they have already built and allow them to run secondary analysis in the cloud with as much computing power as needed, without changing a line of code. We also make it easy to analyse and collaborate on the research outputs, without having to wait for large volumes of data to download again.”

Dr Kurscheid notes, “The general infrastructure is available for going from raw data, as primary as it gets, to a highly analysed and visualised result and that would probably be used for some work that we are currently finalising that’s actually looking at the 3D structure of the genome in cancer cells. I’m envisaging that if we conduct all this analysis using Azure then also doing some really nice visualisation and exploratory analysis using the platform.”

Making genome analytics more accessible and affordable would open new clinical applications, he said.

“Part of the long-term vision is that in the medical field genomics becomes more widely available; it’s already important in rare diseases. As it becomes more common, smaller hospitals or pathology services might see demand for this.

“I think that making these workflows and tools and analysis pipelines publicly available in a manner that is adaptable for others would support the broader uptake of genomics in the medical field.”



David M Williams

David has been computing since 1984 where he instantly gravitated to the family Commodore 64. He completed a Bachelor of Computer Science degree from 1990 to 1992, commencing full-time employment as a systems analyst at the end of that year. David subsequently worked as a UNIX Systems Manager, Asia-Pacific technical specialist for an international software company, Business Analyst, IT Manager, and other roles. David has been the Chief Information Officer for national public companies since 2007, delivering IT knowledge and business acumen, seeking to transform the industries within which he works. David is also involved in the user group community, the Australian Computer Society technical advisory boards, and education.


