Stephen Withers
Wednesday, 13 July 2011 11:55
Researchers at Melbourne's Swinburne University of Technology have been examining the tradeoff between processing and storage costs in cloud environments.
One of the big attractions of cloud computing is that you only pay for what you use. The downside is that there's no upper limit, so it isn't difficult to end up with a much bigger bill than you expected.
Another feature of cloud computing is that providers typically charge separately for storage and processing. That has led the Swinburne researchers to explore how raw and intermediate data should be managed.
Is it better to keep the raw data and recreate intermediate datasets as required, or should you keep both? "The trade-off is going to be between storage cost and computation cost," said John Grundy, who works in the University's Center for Computing and Engineering Software Systems (SUCCESS). "Finding this balance is complex, and there are currently no decision-making tools to advise on whether to store or delete intermediate datasets, and if to store, which ones."
Funded by the Australian Research Council, Prof Grundy, Yun Yang and Jinjun Chen (who is now with the University of Technology, Sydney) have developed a mathematical model that takes into account the size of the original dataset, the amount of intermediate data stored, and the rates charged by service providers.
What adds to the complexity is that intermediate datasets are not necessarily generated directly from the original data; they may be derived from other intermediate results. So the team also developed an intermediate data dependency graph (IDG) to help users decide whether they are better off spending money on storage or computation for intermediate datasets.
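To make the trade-off concrete, here is a minimal sketch of the kind of comparison involved. It is not the Swinburne team's model: the dataset names, sizes, usage frequencies and prices are all hypothetical, and the brute-force search over every store/delete combination is only practical for a handful of datasets. It assumes each intermediate dataset has a single parent, that the raw data is always kept, and that a deleted dataset must be regenerated from its nearest stored ancestor each time it is needed.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass
class Dataset:
    name: str
    size_gb: float           # size of the intermediate dataset
    compute_hours: float     # hours to generate it from its direct parent
    uses_per_month: float    # how often it is expected to be needed
    parent: Optional[str]    # dataset it is derived from (None = raw data)


def regeneration_cost(name: str, datasets: Dict[str, Dataset],
                      stored: FrozenSet[str], compute_rate: float) -> float:
    """Cost of rebuilding `name` once, walking back to the nearest stored ancestor."""
    cost = 0.0
    current: Optional[str] = name
    while current is not None:
        d = datasets[current]
        cost += d.compute_hours * compute_rate
        # Stop when the parent is available: raw data (None) or a stored dataset.
        if d.parent is None or d.parent in stored:
            break
        current = d.parent
    return cost


def monthly_cost(datasets: Dict[str, Dataset], stored: FrozenSet[str],
                 storage_rate: float, compute_rate: float) -> float:
    """Total monthly bill: storage for kept datasets, recomputation for deleted ones."""
    total = 0.0
    for name, d in datasets.items():
        if name in stored:
            total += d.size_gb * storage_rate
        else:
            total += d.uses_per_month * regeneration_cost(
                name, datasets, stored, compute_rate)
    return total


def best_strategy(datasets: Dict[str, Dataset], storage_rate: float,
                  compute_rate: float) -> Tuple[float, FrozenSet[str]]:
    """Try every subset of datasets to store and return the cheapest one."""
    names = list(datasets)
    best: Optional[Tuple[float, FrozenSet[str]]] = None
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            stored = frozenset(subset)
            cost = monthly_cost(datasets, stored, storage_rate, compute_rate)
            if best is None or cost < best[0]:
                best = (cost, stored)
    assert best is not None
    return best


# Hypothetical chain: raw data -> A -> B
datasets = {
    "A": Dataset("A", size_gb=100, compute_hours=10, uses_per_month=1, parent=None),
    "B": Dataset("B", size_gb=5, compute_hours=2, uses_per_month=4, parent="A"),
}
# Hypothetical prices: $0.10 per GB-month of storage, $0.50 per compute hour.
cost, stored = best_strategy(datasets, storage_rate=0.10, compute_rate=0.50)
print(f"Cheapest option: store {sorted(stored)} for ${cost:.2f} per month")
```

With these made-up numbers, the small but frequently used dataset B is worth storing, while the large and rarely used dataset A is cheaper to regenerate on demand. Changing the prices or usage frequencies can flip that decision, which is why the dependencies between datasets matter.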