Currently, when you work with EC2 Spot Instances, you are only allowed to reboot or terminate an instance. This is not an issue for stateless applications, as they are designed to scale horizontally with ease. If you run a stateful application, or an application such as a database cluster that is designed to withstand node failure, you may have decided in the past that Spot Instances were not a good fit for you. Today we are excited to announce the launch of our new Stateful Spot service. This new service will finally allow you to use Spot Instances and save up to 80% on your stateful EC2 environment.
How Stateful configuration works
- Provision a new Stateful Spot instance from the Spotinst Console or API. This will be just like provisioning a new instance from the EC2 console. Spotinst will also take regular backup snapshots of your instances over time.
- If a Spot Interruption occurs, the instance will be shut down and terminated.
- The EBS volumes associated with the instance will become detached.
- The original root volume and data volumes will become available and a final EBS snapshot will be taken. A new AMI will be created from this snapshot.
- A new recovery (clone) instance will be launched to replace the previous instance.
- The same data, private IP, security groups, load balancers, and other metadata will be available to you.
- Recovery can take as little as three minutes.
- New EBS volumes will be created from the newly created AMI and attached.
- The new instance is launched and becomes healthy. From the standpoint of this single server, it will appear as if it were shut down for a period of time.
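The sequence above can be sketched as a toy model. This is only an illustration of the order of events and of what the clone keeps; it is not the Spotinst or AWS API, and every name in it (`Instance`, `recover`, the ID formats) is hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Instance:
    """Hypothetical record of the state that matters during a recovery."""
    instance_id: str
    private_ip: str
    security_groups: Tuple[str, ...]
    volumes: Tuple[str, ...]  # EBS volume IDs, root volume first
    image_id: str


def recover(old: Instance, new_instance_id: str) -> Tuple[Instance, List[str]]:
    """Return the clone and the ordered steps that produced it."""
    steps = [
        f"terminate {old.instance_id} after Spot interruption",
        f"wait for volumes {', '.join(old.volumes)} to become available",
    ]
    # The final snapshot is taken only after the old volumes are available.
    snapshot_id = f"final-snap-of-{old.instance_id}"
    steps.append(f"take final snapshot {snapshot_id}")
    image_id = f"image-from-{snapshot_id}"
    steps.append(f"create image {image_id}")
    # New volumes are built from the image; network identity is unchanged.
    new_volumes = tuple(f"{v}-restored" for v in old.volumes)
    clone = replace(
        old, instance_id=new_instance_id, image_id=image_id, volumes=new_volumes
    )
    steps.append(f"launch {clone.instance_id} with private IP {clone.private_ip}")
    return clone, steps


node = Instance("i-old", "10.0.1.15", ("sg-db",), ("vol-root", "vol-data"), "image-original")
clone, steps = recover(node, "i-new")
```

In this model only the instance ID, image, and volume IDs change; the private IP and security groups carry over, which is why the rest of a cluster sees the node as having been down for a while.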
Things to note
Keep the machine’s root volume – The same data (OS, configuration, etc.) will be maintained for your instance. To further increase the reliability of your instances, we also create periodic snapshots of your data volumes while your instance is running. The new instance will be created from a “final” snapshot that is taken only after the original instance is terminated and the EBS volumes change to an “available” state. For this to function properly, it is necessary to turn off the “delete on termination” flag when provisioning a new instance.
Keep the machine’s private IP – New instances will be provisioned with the same configuration as the old one. The instance will be an exact clone of the old one, with the same private and public IPs (an Elastic IP is required to keep the public IP). For example, for a Cassandra node, it is necessary to use the same private IP as the replaced instance for the cluster to recognize the newly created clone.
Keep the machine’s data volumes – All data volumes that were attached at the time of the previous instance termination will be automatically re-attached using the same BlockDeviceMapping configuration upon instance replacement.
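As a minimal sketch of the “delete on termination” requirement, the snippet below builds block device mappings in the shape used by the EC2 `BlockDeviceMappings` launch parameter, with the flag turned off for every volume, and checks an existing mapping list for volumes that would not be preserved. The helper names, device names, and sizes are assumptions made for illustration; the snippet only builds dictionaries and does not call AWS or Spotinst.

```python
from typing import Dict, List, Tuple


def stateful_block_device_mappings(devices: List[Tuple[str, int]]) -> List[Dict]:
    """Build mappings for (device_name, size_gib) pairs that survive termination."""
    return [
        {
            "DeviceName": name,
            "Ebs": {"VolumeSize": size_gib, "DeleteOnTermination": False},
        }
        for name, size_gib in devices
    ]


def volumes_not_preserved(mappings: List[Dict]) -> List[str]:
    """List devices whose flag is not explicitly False (unset counts as unsafe here)."""
    return [
        m["DeviceName"]
        for m in mappings
        if m.get("Ebs", {}).get("DeleteOnTermination") is not False
    ]


# Hypothetical layout: one root volume and one data volume.
mappings = stateful_block_device_mappings([("/dev/xvda", 30), ("/dev/xvdf", 500)])
```

Treating an unset flag as unsafe is a deliberately conservative choice for this sketch, so that a mapping has to opt in to preservation explicitly.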
Cassandra – If your Cassandra node is replaced, we’ll clone the instance and bring it back. Your Cassandra cluster will behave as if the instance was down for some time. Bringing up a clone of the previous instance ensures that cluster IOPS are not wasted on bringing a new instance up.
Elastic.co – Elasticsearch node recovery will take a fraction of the time required to provision a brand new instance. From the standpoint of your Elasticsearch cluster, the instance was only down for a period of time (depending on the size of the data volumes attached). No changes to your cluster are necessary to support this, as long as you have enough instances for quorum.
Single Server Database – If you have non-production environments, it is very likely that you do not have a requirement for 100% uptime for your database instances. You can also create an RDB cluster with Spot Instances and use the Stateful Spot feature to ensure that you do not lose application availability.
Monolithic or COTS (commercial off-the-shelf) software – Any monolithic or off-the-shelf Windows application can be used with Stateful Elastigroup. Keep in mind that if a replacement is necessary, your instance will be down for a few minutes while the recovery process takes place.
Development instances – You can run non-production nodes on Spot Instances with occasional downtime. If an interruption occurs on your instance, it will be brought back automatically within a few minutes.
Hadoop cluster – Support for “Stateful Spot” instances in Spotinst Elastigroups allows you to provision Spot Instances and automatically recover the full state of the instance, including the private IP. When a recovery occurs, we will automatically create a clone of the previous instance, and it will appear as if the instance was brought down for a restart. For instructions, please see: Hadoop use case
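For clustered use cases such as the Elasticsearch one above, the “enough instances for quorum” condition can be checked with simple arithmetic. The sketch below assumes a plain majority quorum over the nodes that vote (more than half must stay up); the function names are hypothetical, and your cluster's actual quorum rule may differ.

```python
def quorum_size(voting_nodes: int) -> int:
    """Smallest majority of the voting nodes."""
    return voting_nodes // 2 + 1


def keeps_quorum(voting_nodes: int, nodes_down: int = 1) -> bool:
    """True if the cluster still has a majority while nodes are being recovered."""
    return voting_nodes - nodes_down >= quorum_size(voting_nodes)
```

Under this assumption a three-node cluster keeps quorum while one node is being recovered, whereas a two-node cluster does not.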
Please note: Stateful replacement will only happen if the instance is replaced by AWS (during Spot recovery, or if you manually terminate the instance from AWS). It will not take place if you use scheduling or detach the instance from the Spotinst console.