Data is the lifeblood of any business. Data is used to make decisions, drive innovation, and serve customers. But data can also be expensive to store at scale in the cloud. That's where storage lifecycle configurations come in.
An Amazon S3 lifecycle configuration is a set of rules that define actions that Amazon S3 applies to a group of objects. With lifecycle configurations, you can automatically move old data to a lower-cost storage tier with Transition actions or delete it with Expiration actions. This can save you a significant amount of money on your AWS bill over time.
This article will start by listing the available transition storage tiers in S3 and then explain scenarios where S3 lifecycle configurations can help optimize storage costs. After that, you will learn about the components of an S3 lifecycle configuration and how to create one. Finally, this article will share considerations to remember when using S3 lifecycle configurations.
S3 storage classes you can move data to
The following are some of the storage classes that you can move objects to from S3 Standard (default storage class for Amazon S3) with Amazon S3 Lifecycle Configurations:
- S3 Standard-IA (Infrequent Access): This is a lower-cost storage class than S3 Standard. It is a good choice for objects that are accessed less frequently. It has a 30-day minimum storage duration.
- S3 Intelligent-Tiering: This is an option for data with changing or unknown access patterns. It has no minimum storage duration.
- S3 One Zone-IA: This is for when you want to store infrequently accessed data in a single Availability Zone. It has a 30-day minimum storage duration.
- S3 Glacier Instant Retrieval: You transition data to this storage class if you will only access the data about once a quarter (three months). It has a 90-day minimum storage duration.
- S3 Glacier Flexible Retrieval: You use this storage class to store data that is accessed about once a year and can tolerate a retrieval time of minutes to hours. It has a 90-day minimum storage duration.
- S3 Glacier Deep Archive: This is the lowest-cost storage class. It is a good choice for objects that are accessed about once a year and can tolerate a retrieval time of hours. It has a 180-day minimum storage duration.
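When you write lifecycle rules outside the console, each storage class is referred to by an API identifier rather than its display name. The lookup table below is a small sketch of that mapping together with the minimum storage durations; the identifiers and durations reflect the AWS documentation as best understood at the time of writing, and the `STORAGE_CLASSES` table and `api_name` helper are illustrative names, not part of any AWS library.

```python
# Display name -> (lifecycle API identifier, minimum storage duration in days).
# Values are assumptions taken from AWS documentation; verify before relying on them.
STORAGE_CLASSES = {
    "S3 Standard-IA": ("STANDARD_IA", 30),
    "S3 Intelligent-Tiering": ("INTELLIGENT_TIERING", 0),
    "S3 One Zone-IA": ("ONEZONE_IA", 30),
    "S3 Glacier Instant Retrieval": ("GLACIER_IR", 90),
    "S3 Glacier Flexible Retrieval": ("GLACIER", 90),
    "S3 Glacier Deep Archive": ("DEEP_ARCHIVE", 180),
}


def api_name(display_name: str) -> str:
    """Return the identifier used in a lifecycle rule's StorageClass field."""
    return STORAGE_CLASSES[display_name][0]


def minimum_days(display_name: str) -> int:
    """Return the minimum storage duration, in days, for a storage class."""
    return STORAGE_CLASSES[display_name][1]


if __name__ == "__main__":
    for name in STORAGE_CLASSES:
        print(f"{name}: {api_name(name)}, minimum {minimum_days(name)} days")
```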
Use cases of Amazon S3 lifecycle configurations
You can use Amazon S3 lifecycle configurations in many scenarios to help manage your storage costs and optimize your usage. Some prominent use cases include:
- To automatically move old objects to a lower-cost storage class: For example, you could create a lifecycle rule that moves objects to S3 Standard-IA 30 days after they are created. (Lifecycle rules act on an object's age, not on when it was last accessed.) This would save you money on storage costs while still allowing you to access the objects if you need them in the future.
- To automatically delete old objects: If you have an S3 bucket that contains temporary files your users create, you could create a lifecycle rule that deletes objects that are older than 7 days. This would help you avoid storing unnecessary data in S3 and incurring unnecessary storage costs.
- To abort incomplete multipart uploads: When you upload a large object to Amazon S3, you can use multipart upload to break the object up into smaller parts and upload them separately. This can be helpful for uploading large objects over a slow network connection. However, if you interrupt a multipart upload, the parts will remain in your S3 bucket, and you will continue to be charged for them. To avoid this, you can use a lifecycle configuration to abort incomplete multipart uploads after a specified number of days. This frees up storage space in your bucket and ensures that abandoned uploads do not keep accruing charges.
- To enforce your data retention policies: You can use lifecycle configurations to ensure that your data is retained for the required amount of time in accordance with your organization's data retention and compliance policies. For example, you could create a lifecycle rule that keeps all logs for 7 years and then deletes them.
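As a rough illustration of the temporary-file and multipart-upload use cases above, the sketch below builds a lifecycle configuration as a plain Python dictionary in the shape accepted by the S3 API (for example, by boto3's `put_bucket_lifecycle_configuration`). The `tmp/` prefix, the rule IDs, and the `temp_file_cleanup_rules` helper are hypothetical names chosen for this example.

```python
def temp_file_cleanup_rules(prefix: str = "tmp/", days: int = 7) -> dict:
    """Build a lifecycle configuration that cleans up short-lived data.

    The prefix and rule IDs are illustrative assumptions, not values
    taken from a real bucket.
    """
    return {
        "Rules": [
            {
                # Delete temporary files once they are `days` days old.
                "ID": "expire-temp-files",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            },
            {
                # Remove the leftover parts of interrupted multipart uploads.
                # An empty prefix applies the rule to the whole bucket.
                "ID": "abort-stale-multipart-uploads",
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            },
        ]
    }


if __name__ == "__main__":
    import json

    print(json.dumps(temp_file_cleanup_rules(), indent=2))
```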
Components of an S3 lifecycle configuration
A lifecycle configuration contains a set of rules. A rule is made up of four components: ID, Filters, Status, and Actions.
The ID is a unique identifier of the lifecycle rule. This is crucial since one lifecycle configuration can have up to 1,000 rules, and the ID will make it easier for you to remember what each rule does.
The filters component defines WHICH objects in your bucket you’d like to take action on. You can decide whether to apply actions to every object in a bucket or only some of them. If you select a subset of objects, you can filter based on prefix, object tag, or object size. Or, if you want to be more specific, you can filter using a combination of these characteristics.
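To make the filter options concrete, here is a minimal sketch of how a filter might be assembled in the dictionary shape the S3 lifecycle API expects: a single condition stands on its own, while a combination is wrapped in an `And` element. The `build_filter` helper and its parameter names are assumptions made for this example.

```python
def build_filter(prefix=None, tags=None, larger_than=None) -> dict:
    """Build the Filter element of a lifecycle rule.

    prefix      -- key prefix such as "logs/" (optional)
    tags        -- dict of object tags that must all match (optional)
    larger_than -- minimum object size in bytes (optional)
    """
    tag_list = [{"Key": k, "Value": v} for k, v in (tags or {}).items()]

    conditions = {}
    if prefix:
        conditions["Prefix"] = prefix
    if tag_list:
        conditions["Tags"] = tag_list
    if larger_than is not None:
        conditions["ObjectSizeGreaterThan"] = larger_than

    if not conditions:
        # No conditions: an empty prefix matches every object in the bucket.
        return {"Prefix": ""}
    if len(conditions) == 1 and len(tag_list) <= 1:
        # A single condition does not need the And wrapper.
        if tag_list:
            return {"Tag": tag_list[0]}
        return conditions
    # Several conditions (or several tags) must be combined with And.
    return {"And": conditions}


if __name__ == "__main__":
    print(build_filter(prefix="logs/"))
    print(build_filter(tags={"retention": "short"}))
    print(build_filter(prefix="logs/", larger_than=128 * 1024))
```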
The status component enables or disables each lifecycle rule. This is useful when you are evaluating lifecycle settings and determining the most effective rules for your workload.
The actions component is arguably the most important of them all. It is where you define WHAT you want to happen to your objects: are you transitioning them to a lower-cost storage class, or are you deleting them?
There are six main actions you can use: Transition, Expiration, NoncurrentVersionExpiration, NoncurrentVersionTransition, ExpiredObjectDeleteMarker, and AbortIncompleteMultipartUpload. With Transition actions, you can automatically move old data to a lower-cost storage tier, and Expiration actions automate the deletion of your objects in S3.
If versioning is enabled for your bucket, Transition and Expiration actions apply only to the current version of an object. To transition noncurrent versions of your objects, use the NoncurrentVersionTransition action. Similarly, use the NoncurrentVersionExpiration action to remove noncurrent versions.
Use the AbortIncompleteMultipartUpload action if you need to clean up incomplete multipart uploads. With this action, you set the maximum number of days that a multipart upload can remain in progress.
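For a versioned bucket, the version-specific actions described above could be combined in one rule roughly as follows. This is a sketch in the dictionary shape used by the S3 lifecycle API; the rule ID, the day counts, and the `versioned_bucket_rule` helper are hypothetical choices for illustration.

```python
def versioned_bucket_rule(enabled: bool = True) -> dict:
    """Build one lifecycle rule for a bucket with versioning enabled.

    The ID and day counts are illustrative assumptions.
    """
    return {
        "ID": "manage-noncurrent-versions",
        "Filter": {"Prefix": ""},  # apply to every object in the bucket
        # The status component switches the rule on or off.
        "Status": "Enabled" if enabled else "Disabled",
        # Move versions to Standard-IA 30 days after they become noncurrent.
        "NoncurrentVersionTransitions": [
            {"NoncurrentDays": 30, "StorageClass": "STANDARD_IA"}
        ],
        # Permanently delete versions 365 days after they become noncurrent.
        "NoncurrentVersionExpiration": {"NoncurrentDays": 365},
        # Remove delete markers that no longer have any versions behind them.
        "Expiration": {"ExpiredObjectDeleteMarker": True},
        # Abort multipart uploads still in progress after 7 days.
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
    }


if __name__ == "__main__":
    import json

    print(json.dumps(versioned_bucket_rule(), indent=2))
```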
To learn more about each of the above components, check out the Amazon S3 lifecycle rules documentation.
Step-by-Step Guide: How to Create a Lifecycle Configuration on an S3 Bucket (DEMO)
As with creating an S3 bucket, there are several ways to create an S3 lifecycle configuration: using the AWS Console, CLI, SDKs, or REST API. In this demo, you will create a lifecycle configuration for an S3 bucket using the AWS Console.
The S3 lifecycle configuration you will create in this demo transitions new log data from S3 Standard to S3 Glacier Deep Archive after 30 days, where it is stored for compliance purposes, and deletes it after 7 years.
To follow along in this demo, create a demo S3 bucket and then download the following sample log data:
Upload the above three sample datasets into your demo S3 bucket as in the image below.
To create the lifecycle configuration, at the top right corner, click on the “Management” tab, as highlighted in the above image. In the “Management” tab, click “Create lifecycle rule,” as in the image below.
On the next page, name your lifecycle rule, select the “Apply to all objects in the bucket” rule scope option, and choose the rule actions as in the image below.
In the image above:
- Selecting “Apply to all objects in the bucket” means that the lifecycle rule applies to all the log data in the bucket.
- If you select the other option, you will be able to filter objects by prefix, object tags, object size, or whatever combination suits your use case. Learn more about filters here.
- For the “Lifecycle rule actions,” since there are no other versions of the logs, this demo selects the “Move current versions of objects between storage classes” and “Expire current versions of objects” options.
Next, select the storage class and the number of days (30) after which objects transition, as in the image below, and then the number of days (2557, about 7 years) after which objects expire once the retention period is over.
Note: You can ignore the warning. It pops up because the objects set to transition to Glacier Deep Archive in this demo are small relative to the objects you would transition in a real-world scenario.
The next step is to review the rule's transition and expiration actions and then create the rule, as in the image below.
After you create the rule, you should see it listed, as in the image below. From this page, you can edit the rule, enable or disable it, and create more rules.
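For readers who prefer the SDK route, the rule built in the console above could be expressed roughly like this with boto3. This is a sketch rather than part of the demo: the rule ID and function names are hypothetical, and `apply_to_bucket` assumes boto3 is installed and AWS credentials are configured. Note that `put_bucket_lifecycle_configuration` replaces any lifecycle configuration already on the bucket.

```python
def demo_lifecycle_configuration() -> dict:
    """Mirror the console demo: archive after 30 days, expire after 2557."""
    return {
        "Rules": [
            {
                "ID": "archive-then-expire-logs",  # hypothetical rule name
                "Filter": {"Prefix": ""},  # apply to all objects in the bucket
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "DEEP_ARCHIVE"}],
                "Expiration": {"Days": 2557},  # about 7 years
            }
        ]
    }


def apply_to_bucket(bucket_name: str) -> None:
    """Attach the configuration to a bucket (requires boto3 and credentials)."""
    import boto3  # third-party; imported here so the sketch loads without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=demo_lifecycle_configuration(),
    )


if __name__ == "__main__":
    import json

    # Print the configuration only; call apply_to_bucket("<your-bucket>") to apply it.
    print(json.dumps(demo_lifecycle_configuration(), indent=2))
```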
Ensure you delete the demo lifecycle rule and S3 bucket to avoid incurring unnecessary AWS charges.
Key Considerations for Creating S3 Lifecycle Configurations
While a lifecycle configuration can be powerful, there are several considerations you should keep in mind when creating one:
Moving between storage classes
Think about the S3 storage classes as a staircase. S3 Standard is at the top of the staircase, while S3 Glacier Deep Archive is at the bottom. All of the other storage classes are in between.
With lifecycle configurations, this staircase only goes one way: down. Once you transition data down the staircase to a lower-cost storage class, you can't move objects back up. For example, let's say you move your data to S3 One Zone-IA. Once your data transitions to that storage class, you can't use a lifecycle configuration to move your data back to S3 Standard or S3 Intelligent-Tiering.
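The one-way staircase can be sketched as a simple ordering check. This is a simplified model, not a full validator: the order of the middle steps is an assumption based on the AWS documentation, and AWS documents the exact set of supported transitions, which may exclude some downward steps as well. Treat it as a first-pass sanity check before submitting a rule.

```python
# Storage classes from the top of the staircase to the bottom.
# The ordering of the middle steps is an assumption based on AWS documentation.
STAIRCASE = [
    "STANDARD",
    "STANDARD_IA",
    "INTELLIGENT_TIERING",
    "ONEZONE_IA",
    "GLACIER_IR",
    "GLACIER",
    "DEEP_ARCHIVE",
]


def is_downward(source: str, target: str) -> bool:
    """Return True if moving from source to target goes down the staircase."""
    return STAIRCASE.index(target) > STAIRCASE.index(source)


if __name__ == "__main__":
    print(is_downward("STANDARD", "ONEZONE_IA"))             # downward move
    print(is_downward("ONEZONE_IA", "STANDARD"))             # upward move
    print(is_downward("ONEZONE_IA", "INTELLIGENT_TIERING"))  # upward move
```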
Lifecycle configuration costs
Costs follow a similar staircase model. They fall into two categories: storage transition costs and minimum storage duration fees. Both increase as you move down the staircase.
For storage transition costs, you will be charged $0.01 for every 1,000 lifecycle transition requests when objects are moved from S3 Standard to the S3 Standard-IA storage class. As you go all the way down to S3 Glacier Deep Archive, this cost increases and can be up to $0.05 for every 1,000 transition requests.
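To see how these request charges add up, the arithmetic can be sketched as below. The prices are the two figures quoted above and vary by region, so treat them as assumptions; the `transition_cost` helper is a hypothetical name for this example. `Decimal` is used to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

# Price in USD per 1,000 lifecycle transition requests, as quoted in this
# article. Actual prices depend on the region; check current AWS pricing.
PRICE_PER_1000_REQUESTS = {
    "STANDARD_IA": Decimal("0.01"),
    "DEEP_ARCHIVE": Decimal("0.05"),
}


def transition_cost(object_count: int, storage_class: str) -> Decimal:
    """Estimate the one-time request cost of transitioning objects.

    Each object transitioned counts as one lifecycle transition request.
    """
    price = PRICE_PER_1000_REQUESTS[storage_class]
    return Decimal(object_count) / Decimal(1000) * price


if __name__ == "__main__":
    for target in PRICE_PER_1000_REQUESTS:
        cost = transition_cost(1_000_000, target)
        print(f"1,000,000 objects to {target}: ${cost}")
```

Under these assumed prices, transitioning one million objects costs about $10 to S3 Standard-IA and about $50 to S3 Glacier Deep Archive, which is why many small objects can make a transition more expensive than the storage savings it brings.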
For minimum storage duration fees, most storage classes have a minimum storage duration. If you delete, overwrite, or transition an object before that period ends, you are charged for the remaining days. The minimum storage duration periods increase as you go down the staircase as well.
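The effect of a minimum storage duration on billing can be sketched with a couple of small helpers. This is a simplified model that only counts days and ignores prices; the function names are hypothetical, and the minimum is passed in as a parameter rather than looked up.

```python
def billable_days(days_stored: int, minimum_days: int) -> int:
    """Days of storage you pay for when an object leaves a storage class.

    If the object is removed before the minimum storage duration,
    the remaining days are still charged.
    """
    return max(days_stored, minimum_days)


def early_removal_days(days_stored: int, minimum_days: int) -> int:
    """Extra days charged because the object left before the minimum."""
    return max(0, minimum_days - days_stored)


if __name__ == "__main__":
    # An object deleted 60 days after moving to a class with a 180-day minimum.
    print(billable_days(60, 180), early_removal_days(60, 180))
    # An object kept past a 30-day minimum incurs no extra charge.
    print(billable_days(45, 30), early_removal_days(45, 30))
```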
To learn more about the considerations when creating S3 lifecycle configurations, check out the Amazon S3 documentation on the topic.
In this article, you learned what you need to get started creating Amazon S3 lifecycle configurations, from use cases to key considerations. There is much more to learn about Amazon S3. To do so, explore the S3 documentation.