Reduce Genomics & Data Analysis Costs by 90% on AWS with Spot Instances

2 minute read
Lifebit

On paper, cloud computing is a relatively straightforward model that provides access to compute and storage resources. In reality, cloud pricing can be quite complex. Have you ever looked at the long list of instances and the pricing associated with each? Could you come up with the price of a cloud instance given only a CPU and memory requirement? Probably not.


Let’s be realistic: to come up with a price, you would have to go through an extensive list of instances and filter for the ones that meet your requirements. You might find several instances in this filtered list that suit your needs, but you will most probably select the cheapest one and go on to make scientific magic happen.
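To make that filtering step concrete, here is a minimal sketch in Python. The catalogue, the instance names and the prices are all hypothetical placeholders rather than real quotes, and `cheapest_match` is a helper invented for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_usd: float  # placeholder price, not a live quote


# Hypothetical catalogue: names and prices are made up for illustration.
CATALOGUE = [
    InstanceType("small-a", 4, 16, 0.20),
    InstanceType("medium-a", 16, 64, 0.80),
    InstanceType("medium-b", 16, 128, 1.00),
    InstanceType("large-a", 64, 976, 16.00),
]


def cheapest_match(
    catalogue: List[InstanceType], min_vcpus: int, min_memory_gib: float
) -> Optional[InstanceType]:
    """Return the cheapest instance meeting both requirements, or None."""
    candidates = [
        inst
        for inst in catalogue
        if inst.vcpus >= min_vcpus and inst.memory_gib >= min_memory_gib
    ]
    # default=None covers the case where nothing in the list is big enough.
    return min(candidates, key=lambda inst: inst.hourly_usd, default=None)


if __name__ == "__main__":
    print(cheapest_match(CATALOGUE, min_vcpus=16, min_memory_gib=100))
```

With a real price list the logic stays the same; only the catalogue grows.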

If you decide to go for some seriously big data workloads, you might end up having to select big instances with more CPUs and more memory. The problem is that prices go up considerably: big machines such as x1.16xlarge or x1.32xlarge cost around $16/hr and $32/hr, respectively. And if you’re planning long-running analyses, these prices become prohibitive. To go easy on your wallet, you may want to consider using reduced-cost instances, such as spot instances.
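To put rough numbers on "prohibitive", the sketch below multiplies the hourly prices quoted above by a run length. The 72-hour duration and the 75% discount are assumptions chosen purely for illustration, and `run_cost` is a hypothetical helper; actual spot discounts vary over time.

```python
def run_cost(hourly_usd: float, hours: float, discount: float = 0.0) -> float:
    """Total cost of a run; discount is a fraction between 0 and 1."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return hourly_usd * hours * (1.0 - discount)


HOURS = 72  # assumed three-day analysis
ASSUMED_DISCOUNT = 0.75  # illustrative only, not a guaranteed rate

if __name__ == "__main__":
    for name, price in [("x1.16xlarge", 16.0), ("x1.32xlarge", 32.0)]:
        full = run_cost(price, HOURS)
        reduced = run_cost(price, HOURS, ASSUMED_DISCOUNT)
        print(f"{name}: ${full:,.2f} at full price, ${reduced:,.2f} discounted")
```

Under these assumptions, a three-day run on the larger machine drops from $2,304 to $576.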

If you thought cloud pricing was complex, spot instance pricing is even more convoluted. This service offers cloud users excess capacity at large discounts. For instance, in the example below, for an instance type of x1.32xlarge, we see a price reduction of 75% over time. Although spot instance discounts can reach up to 80%, managing the instances can be quite tricky.

[Figure: spot price over time for an x1.32xlarge instance]

The main issue with discounted instances is that they are subject to price fluctuations that depend on both market availability and current bids in the market. In the end, you are not the only bidder in the market, which can affect your chances of getting the exact machine you want.

Even in the best-case scenario, where you do manage to get hold of an instance you desperately need, you can lose it at any time. This happens if market availability changes, or if your bid is outbid as prices increase. If you do lose an instance, you will receive a notification letting you know that your resources are going down.
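On AWS, one place this notification surfaces is the instance metadata service, where the `spot/instance-action` path returns a small JSON document shortly before the instance is reclaimed and a 404 otherwise. The sketch below shows one way to check for it from inside the instance, assuming the token-based metadata flow (IMDSv2). The function names are made up for this example, and the code only works when run on an EC2 instance.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional, Tuple

METADATA = "http://169.254.169.254/latest"


def parse_notice(body: str) -> Tuple[str, datetime]:
    """Parse a body like {"action": "terminate", "time": "2017-09-18T08:22:00Z"}."""
    data = json.loads(body)
    when = datetime.strptime(data["time"], "%Y-%m-%dT%H:%M:%SZ")
    return data["action"], when.replace(tzinfo=timezone.utc)


def fetch_notice(timeout: float = 2.0) -> Optional[Tuple[str, datetime]]:
    """Return (action, time) if an interruption is scheduled, else None."""
    token_request = urllib.request.Request(
        f"{METADATA}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    try:
        with urllib.request.urlopen(token_request, timeout=timeout) as resp:
            token = resp.read().decode()
        notice_request = urllib.request.Request(
            f"{METADATA}/meta-data/spot/instance-action",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(notice_request, timeout=timeout) as resp:
            return parse_notice(resp.read().decode())
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption currently scheduled
            return None
        raise
```

A long-running job could call `fetch_notice()` every few seconds and save its state as soon as a notice appears.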

And if that wasn’t enough, cloud providers do not give you enough information to predict availability. Currently, you are provided with dates, instance types, availability zones and market prices over intervals that span 90 days. With that information, it is up to the user to investigate and draw their own conclusions about patterns to optimise deployments…and not whittle away while doing so.
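As a small example of that kind of investigation, the sketch below summarises price history per availability zone and picks the zone whose price moved the least. The records are invented for illustration, with field names chosen for this example; on AWS, comparable data can be pulled with the `describe-spot-price-history` API. "Smallest spread" is just one possible heuristic, not a recommendation.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List

# Hypothetical records: timestamps, zones and prices are made up.
HISTORY = [
    {"timestamp": "2018-01-01T00:00:00Z", "instance_type": "x1.32xlarge", "zone": "zone-a", "price": 4.0},
    {"timestamp": "2018-01-02T00:00:00Z", "instance_type": "x1.32xlarge", "zone": "zone-a", "price": 5.0},
    {"timestamp": "2018-01-03T00:00:00Z", "instance_type": "x1.32xlarge", "zone": "zone-a", "price": 6.0},
    {"timestamp": "2018-01-01T00:00:00Z", "instance_type": "x1.32xlarge", "zone": "zone-b", "price": 4.5},
    {"timestamp": "2018-01-02T00:00:00Z", "instance_type": "x1.32xlarge", "zone": "zone-b", "price": 5.0},
    {"timestamp": "2018-01-03T00:00:00Z", "instance_type": "x1.32xlarge", "zone": "zone-b", "price": 5.5},
    {"timestamp": "2018-01-01T00:00:00Z", "instance_type": "x1.16xlarge", "zone": "zone-a", "price": 2.0},
]


def summarise_by_zone(records: List[dict], instance_type: str) -> Dict[str, dict]:
    """Group prices by zone for one instance type and compute min/mean/max."""
    prices = defaultdict(list)
    for record in records:
        if record["instance_type"] == instance_type:
            prices[record["zone"]].append(record["price"])
    return {
        zone: {"min": min(values), "mean": mean(values), "max": max(values)}
        for zone, values in prices.items()
    }


def steadiest_zone(summary: Dict[str, dict]) -> str:
    """Pick the zone with the smallest gap between its highest and lowest price."""
    return min(summary, key=lambda zone: summary[zone]["max"] - summary[zone]["min"])


if __name__ == "__main__":
    summary = summarise_by_zone(HISTORY, "x1.32xlarge")
    print(summary)
    print("steadiest zone:", steadiest_zone(summary))
```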

Although spot instances can be a hassle to manage, especially in nail-biting deployment scenarios where you either need to get your job done fast or need a highly available service, they are worth the trouble and headaches because they save you a considerable amount of money. Ultimately, cloud users looking to benefit from the spot instance market have to adapt all their applications to take the market's complexity into account.

At Lifebit, we have always been conscious of how much big data workloads cost. To address these issues, we adapted our system to work in sync with reduced-cost instances. Besides adapting our internal tooling, we also built a system that handles all possible disaster scenarios and allows our team to relax and forget about rushed deployment, monitoring and scaling.


Once we had solved our own issues with spot instances, we decided to be considerate and offer these functionalities to users :). Through CloudOS, users can throw their big data pipelines into our system and it will take care of all the provisioning, maintenance, and scaling while they sit back and enjoy their coffee. The user only needs to select a group of reduced-cost instances that match their resource needs, and the pipeline is ready to go without any fuss.


Hopefully, navigating through spot instance territory will no longer be frightening with our new CloudOS functionalities.

You’re welcome. Drop mic


We would like to know what you think! Please fill out the following form or contact us at hello@lifebit.ai. We welcome your comments and suggestions!
