Monday, 12 November 2018 23:56

Cloud move means sky's the limit for Australian genome research


The human genome consists of about six billion DNA base pairs, and it takes 100GB to represent the unique sequence for a single person. Australian National University researchers have turned to the cloud to enable clinical applications.

The Human Genome Project was a vast, long-running, internationally collaborative effort to determine the sequence of nucleotide base pairs that make up human DNA, and to identify and map all the genes of the human genome, both physically and functionally. It remains the world’s largest collaborative biological project.

The genome itself, however, is massive. Who’d have thought humans are so complex? With a genome of about six billion DNA base pairs, it takes 100GB to store the unique genetic sequence of any individual human being as a string of text using the letters A, T, C and G, which stand for the bases adenine, thymine, cytosine and guanine.
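The storage figure can be sanity-checked with some back-of-envelope arithmetic. The sketch below is illustrative only: the genome length of about 3.1 billion base pairs per copy, the two-copies-per-person multiplier and the 30-fold read depth are assumptions, not figures from the ANU project. One possible reading of the 100GB figure is that it reflects raw sequencing output, in which every position is read many times, rather than a single finished sequence; the article does not say which.

```python
# Back-of-envelope genome storage estimate.
# Every constant here is an illustrative assumption, not project data.

HAPLOID_BASES = 3_100_000_000  # assumed approximate length of one genome copy
BYTES_PER_BASE_TEXT = 1        # one ASCII letter (A, T, C or G) per base
BITS_PER_BASE_PACKED = 2       # four possible letters fit in two bits


def text_size_gb(bases: int, copies: int = 1) -> float:
    """Size in GB (10**9 bytes) with one ASCII letter stored per base."""
    return bases * copies * BYTES_PER_BASE_TEXT / 1e9


def packed_size_gb(bases: int, copies: int = 1) -> float:
    """Size in GB when each base is packed into two bits."""
    return bases * copies * BITS_PER_BASE_PACKED / 8 / 1e9


def raw_reads_size_gb(bases: int, coverage: int) -> float:
    """Size in GB of the base letters alone when each position is read
    `coverage` times (quality scores and read names are ignored)."""
    return bases * coverage * BYTES_PER_BASE_TEXT / 1e9


diploid_text = text_size_gb(HAPLOID_BASES, copies=2)
diploid_packed = packed_size_gb(HAPLOID_BASES, copies=2)
reads_30x = raw_reads_size_gb(HAPLOID_BASES, coverage=30)  # 30x is an assumed depth

print(f"Both copies as plain text: {diploid_text:.1f} GB")
print(f"Both copies, 2 bits/base:  {diploid_packed:.2f} GB")
print(f"Raw reads at 30x depth:    {reads_30x:.1f} GB")
```

Under these assumptions a single finished sequence needs only a few gigabytes as text, while the letters in a deeply sequenced raw data set approach the 100GB mark, which is the scale at which analysis starts to demand HPC resources.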

Researchers have already dealt with the issues of collecting and storing genome data. Actually analysing it, to identify disease markers and explore the difference between healthy and cancerous cells, has previously been a slow and complex affair. Specialised high-performance computing (HPC) hardware has been employed, but it is expensive to buy, so researchers queue up to use the few machines they have.

The Australian National University (ANU) turned to the cloud, working with Microsoft partner BizData. It found the move has taken away the administrative overhead of managing HPC hardware and, more importantly, is accelerating research by providing more computing power at lower cost.

ANU’s Department of Genome Science had access to 30-40 local servers and a number of shared HPC environments. The researchers had workstations armed with 16 cores, 10TB storage and 128GB RAM. Originally, the cloud was viewed as a way of giving temporary boosts in power during peak demand, charged by usage, instead of having to buy another HPC server.

However, while the cost reduction (to a quarter of the cost of managing its own hardware) was expected, ANU also found it received four times the computational power it had on premises for that cost.
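The two figures can be combined into a single price-performance number. The sketch below is a rough illustration with a hypothetical helper function; it assumes the two ratios are the only inputs and shows both readings of the claim, since the article does not break the costs down further.

```python
# Rough price-performance comparison; the ratios come from the article,
# the helper function and the two interpretations are assumptions.


def price_performance_gain(cost_ratio: float, compute_ratio: float) -> float:
    """Return how much more compute each dollar buys, relative to on-premises.

    cost_ratio:    cloud cost divided by on-premises cost
    compute_ratio: cloud compute divided by on-premises compute
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return compute_ratio / cost_ratio


# Reading 1: four times the compute at a quarter of the cost.
gain_reduced_cost = price_performance_gain(cost_ratio=0.25, compute_ratio=4.0)

# Reading 2: four times the compute for the same spend as before.
gain_same_cost = price_performance_gain(cost_ratio=1.0, compute_ratio=4.0)

print(f"Reading 1: {gain_reduced_cost:.0f}x compute per dollar")
print(f"Reading 2: {gain_same_cost:.0f}x compute per dollar")
```

On the first reading each dollar buys sixteen times the compute; on the more conservative second reading it buys four times as much.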

Biological data science research fellow, Dr Sebastian Kurscheid, of ANU’s John Curtin School of Medical Research, explained that cloud computing promised to accelerate the focus on the health aspects of genome research. “There are questions about how medical genomics has a more increasing relevance to clinical practice. It’s already important in the field of rare diseases but it’s becoming more and more relevant also in more common diseases,” he said.

Dr Kurscheid says he has spent about a third of his three-and-a-half-year research program getting technical elements in place so he could run HPC analytics over many large data sets. Based on his work within Azure so far, he says that moving earlier to a cloud-based solution would have saved nine months, freeing him to focus on research.

“Our focus at BizData has been to deliver a seamless experience for researchers using the Microsoft Cloud. For example, today we enable a researcher to take an existing pipeline (for example in Snakemake or Galaxy) that they have already built and allow them to run secondary analysis in the cloud with as much computing power as needed, without changing a line of code. We also make it easy to analyse and collaborate on the research outputs, without having to wait for large volumes of data to download again.”

Dr Kurscheid notes: “The general infrastructure is available for going from raw data, as primary as it gets, to a highly analysed and visualised result and that would probably be used for some work that we are currently finalising that’s actually looking at the 3D structure of the genome in cancer cells. I’m envisaging that if we conduct all this analysis using Azure then also doing some really nice visualisation and exploratory analysis using the platform.”

Making genome analytics more accessible and affordable would open new clinical applications, he said.

“Part of the long-term vision is that in the medical field genomics becomes more widely available – it’s already important in rare diseases. As it becomes more common smaller hospitals or pathology services might see demand for this.

“I think that making these workflows and tools and analysis pipelines publicly available in a manner that is adaptable for others would support the broader uptake of genomics in the medical field.”




David M Williams

David has been computing since 1984, when he instantly gravitated to the family Commodore 64. He completed a Bachelor of Computer Science degree from 1990 to 1992, commencing full-time employment as a systems analyst at the end of that year. David subsequently worked as a UNIX Systems Manager, Asia-Pacific technical specialist for an international software company, Business Analyst, IT Manager, and other roles. David has been the Chief Information Officer for national public companies since 2007, delivering IT knowledge and business acumen, seeking to transform the industries within which he works. David is also involved in the user group community, the Australian Computer Society technical advisory boards, and education.


