How to automatically maintain EBS snapshots in AWS.
This solution was developed before AWS released the EBS lifecycle feature (Amazon Data Lifecycle Manager), and it relies on an EC2 instance. Because it is fairly lightweight, it can run on a free-tier instance type. I also have a Python version of this script that runs as a Lambda function, which I will post another time.
In our environment, the highest frequency at which a snapshot was required was every 30 minutes, and the lowest was weekly. So the required snapshot frequencies look like this:
- M: Half-Hourly: bottom of the hour
- H: Hourly: top of the hour
- D: Daily: at 0100
- W: Weekly: at 0100 on Sunday morning
Using the 4 periods above, we create a new tag called SnapRate to be applied to any volume that we want to manage using this method. Its value has the form:
- M/H/D/W, for example 12/24/7/6
Each number is the total number of snapshots we want to maintain for that period. In the above example, we would have 12 half-hourly (for the past 6 hours), 24 hourly (for the past 24 hours), 7 daily (for the past week), and 6 weekly (for the past 6 weeks). You can picture the snapshots like this:
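The retention implied by a SnapRate value is simply the count multiplied by the length of the period. Here is a minimal sketch of that arithmetic, assuming period lengths of 30, 60, 1440 and 10080 minutes for M, H, D and W. The helper name retention_minutes is hypothetical and is not part of the scripts in this post.

```bash
#!/bin/bash
# Period lengths in minutes for M, H, D, W (an assumption for this sketch).
PERIOD_MINUTES=(30 60 1440 10080)

# retention_minutes SNAPRATE INDEX
# Prints how many minutes of history the period at INDEX keeps.
retention_minutes() {
  local snaprate="$1" index="$2"
  local counts
  # Split "12/24/7/6" into an array: (12 24 7 6)
  IFS='/' read -r -a counts <<< "$snaprate"
  # A missing field is treated as 0 (keep nothing for that period).
  echo $(( ${counts[index]:-0} * ${PERIOD_MINUTES[index]} ))
}

for i in 0 1 2 3; do
  echo "period $i keeps $(retention_minutes "12/24/7/6" "$i") minutes"
done
```

For 12/24/7/6 this works out to 360 minutes (6 hours), 1440 minutes (24 hours), 10080 minutes (1 week) and 60480 minutes (6 weeks).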
Support Function:
This is get_cycle.bsh, which returns the index (0 to 3) of the current period within the M/H/D/W SnapRate tag defined above.
#!/bin/bash
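# Evaluate the schedule in one fixed time zone.
DOM=$(TZ='America/New_York' date +%-d)
# %u is the ISO day of the week: 1 = Monday ... 7 = Sunday.
DOW=$(TZ='America/New_York' date +%u)
HOUR=$(TZ='America/New_York' date +%-H)
MINU=$(TZ='America/New_York' date +%-M)
if [ "$DOW" -eq 7 ] && [ "$HOUR" -eq 1 ] && [ "$MINU" -lt 30 ]
then
# Weekly: 0100 on Sunday
CYCLE="3"
elif [ "$HOUR" -eq 1 ] && [ "$MINU" -lt 30 ]
then
# Daily: 0100 on any other day
CYCLE="2"
elif [ "$MINU" -lt 30 ]
then
# Hourly: top of the hour
CYCLE="1"
else
# Half-hourly: bottom of the hour
CYCLE="0"
fi
echo "$CYCLE"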
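Because get_cycle.bsh reads the clock directly, it is awkward to check how it behaves at a particular time. The sketch below expresses the same decision as a function of explicit arguments so it can be tried with any day and time. It assumes the weekly slot is Sunday, which is 7 in the date +%u numbering, and the function name cycle_for is hypothetical.

```bash
#!/bin/bash
# cycle_for DOW HOUR MINUTE
# DOW follows date +%u (1 = Monday ... 7 = Sunday); HOUR and MINUTE are unpadded.
# Prints the index into M/H/D/W: 0 = half-hourly, 1 = hourly, 2 = daily, 3 = weekly.
cycle_for() {
  local dow="$1" hour="$2" minute="$3"
  if [ "$minute" -ge 30 ]; then
    echo 0    # bottom of the hour: half-hourly
  elif [ "$hour" -ne 1 ]; then
    echo 1    # top of the hour, not 0100: hourly
  elif [ "$dow" -ne 7 ]; then
    echo 2    # 0100 on Monday to Saturday: daily
  else
    echo 3    # 0100 on Sunday: weekly
  fi
}

cycle_for 7 1 0     # Sunday 01:00
cycle_for 3 1 0     # Wednesday 01:00
cycle_for 3 14 0    # Wednesday 14:00
cycle_for 3 14 30   # Wednesday 14:30
```

The four calls print 3, 2, 1 and 0 in that order. Note that each run falls into exactly one period, so the 01:00 run on Sunday is a weekly run only, not also a daily or hourly one.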
Main Functions:
Use this script, aws_create_snapshots.bsh, in your crontab to automatically create snapshots. Set the crontab entry to run every 30 minutes.
#!/bin/bash
###########################################
#
# aws_create_snapshots.bsh
# Description: Ability to create snapshots in the account for all volumes that fits tag criteria
#
# Last edit: 11/21/2018
#
# Prereq:
# aws cli
# jq
###########################################
source /etc/profile
# In my case aws cli was installed under /usr/local/bin
PATH=$PATH:/usr/local/bin
echo "Path is set to $PATH"
echo "AWS_CA_BUNDLE is at $AWS_CA_BUNDLE"
LOG_DATE=`TZ='America/New_York' date +%Y%m%d`
LOGFILE="${BASH_SOURCE%/*}/logs/aws_create_snapshots_$LOG_DATE.log"
echo "Logs are written to $LOGFILE"
FORMAT_DATE=`TZ='America/New_York' date +%Y%m%d.%H%M%S`
echo "=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=" >> $LOGFILE
echo "Script Start" >> $LOGFILE
echo "Current Time: $FORMAT_DATE" >> $LOGFILE
CYCLE=`${BASH_SOURCE%/*}/get_cycle.bsh`
CYCLE_WORD_ARRAY=("Half-Hourly" "Hourly" "Daily" "Weekly")
CYCLE_LETT_ARRAY=("M" "H" "D" "W")
CYCLE_TIME_ARRAY=(30 60 1440 10080)
echo "Current Cycle: ${CYCLE_WORD_ARRAY[CYCLE]}" >> $LOGFILE
EC2_AZ=`curl http://169.254.169.254/latest/meta-data/placement/availability-zone`
REGIONID="`echo \"$EC2_AZ\" | sed 's/[a-z]$//'`"
VOLUMES=`aws ec2 describe-volumes \
--filter "Name=tag-key,Values='SnapRate'" \
"Name=tag-key,Values='Name'" \
"Name=tag-key,Values='Application'" \
"Name=tag:Mode,Values='Auto'" \
"Name=tag:Keep,Values='Yes'" \
"Name='attachment.status',Values='attached'" \
--query "Volumes[*].{VolumeID:VolumeId, \
Name:Tags[?Key==\\\`Name\\\`].Value, \
Function:Tags[?Key==\\\`Function\\\`].Value, \
Application:Tags[?Key==\\\`Application\\\`].Value, \
SnapRate:Tags[?Key==\\\`SnapRate\\\`].Value}" \
--region $REGIONID`
echo "=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=" >> $LOGFILE
for VOL in $(echo "${VOLUMES}" | jq -c '.[]'); do
VOLID=`echo ${VOL} | jq -r '.VolumeID'`
echo "Volume ID: $VOLID" >> $LOGFILE
APP=`echo ${VOL} | jq '.Application' | jq -r .[0]`
echo "Application: $APP" >> $LOGFILE
FUNCTION=`echo ${VOL} | jq '.Function' | jq -r .[0]`
echo "Function: $FUNCTION" >> $LOGFILE
NAME_PRE=`echo ${VOL} | jq '.Name' | jq -r .[0]`
SEP="_"
NAME=$NAME_PRE$SEP$FORMAT_DATE
echo "New Name: $NAME" >> $LOGFILE
SNAPRATE=`echo ${VOL} | jq '.SnapRate' | jq -r .[0]`
echo "SnapRate: $SNAPRATE" >> $LOGFILE
THIS_CYCLE=$CYCLE
NEW_CYCLE_VALUE=0
IFS='/' read -r -a SNAPARRAY <<< "$SNAPRATE"
NEW_CYCLE_VALUE=${SNAPARRAY[THIS_CYCLE]}
while [ $NEW_CYCLE_VALUE -eq 0 ] && [ $THIS_CYCLE -gt 0 ]; do
let THIS_CYCLE-=1
NEW_CYCLE_VALUE=${SNAPARRAY[THIS_CYCLE]}
done
if [ $NEW_CYCLE_VALUE -gt 0 ]
then
THIS_CYCLE_LETT=${CYCLE_LETT_ARRAY[THIS_CYCLE]}
TIME_SEED=${CYCLE_TIME_ARRAY[THIS_CYCLE]}
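THIS_CYCLE_TIME=$((TIME_SEED * NEW_CYCLE_VALUE))
# Use the same time zone as FORMAT_DATE so the snapshot name and the expiration tag agree
EXPIREDATE=`TZ='America/New_York' date -d "+$THIS_CYCLE_TIME minutes" "+%Y%m%d.%H%M%S"`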
echo "VOL CYCLE: $THIS_CYCLE_LETT" >> $LOGFILE
echo "EXPIRE DATE: $EXPIREDATE" >> $LOGFILE
OUTPUT=`aws ec2 create-snapshot --volume-id "$VOLID" --description "$NAME" --region "$REGIONID"`
echo $OUTPUT >> $LOGFILE
SNAPID=`echo ${OUTPUT} | jq '.SnapshotId' -r`
if [ -n "$SNAPID" ] && [ "$SNAPID" != "null" ]
then
# Tags in JSON, built with jq so that values are escaped and the result is a single valid JSON string
TAGS=$(jq -n -c \
--arg name "$NAME" --arg cycle "$THIS_CYCLE_LETT" --arg source "$VOLID" \
--arg app "$APP" --arg func "$FUNCTION" --arg expire "$EXPIREDATE" \
'[{Key:"Name",Value:$name},
{Key:"Cycle",Value:$cycle},
{Key:"Keep",Value:"Yes"},
{Key:"Source",Value:$source},
{Key:"Application",Value:$app},
{Key:"Function",Value:$func},
{Key:"Mode",Value:"Auto"},
{Key:"ExpirationDate",Value:$expire}]')
echo "New SnapID: $SNAPID" >> $LOGFILE
OUTPUT2=`aws ec2 create-tags --resources "$SNAPID" --tags "$TAGS" --region "$REGIONID"`
echo $OUTPUT2 >> $LOGFILE
fi
else
echo "No snapshot requested in SnapRate" >> $LOGFILE
fi
echo "---------" >> $LOGFILE
done
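The part of the create script that is easiest to miss is the fallback: when the SnapRate count for the current period is 0, the script steps down to the next shorter period until it finds a non-zero count, and the snapshot is tagged with that period instead. Here is a minimal sketch of just that selection, separated from the AWS calls. The function name effective_cycle and the -1 return value are my own conventions for this example.

```bash
#!/bin/bash
# effective_cycle SNAPRATE CYCLE
# Prints the index of the period the snapshot would be tagged with,
# or -1 if no period at or below CYCLE has a non-zero count.
effective_cycle() {
  local snaprate="$1" cycle="$2"
  local counts
  IFS='/' read -r -a counts <<< "$snaprate"
  while [ "$cycle" -ge 0 ]; do
    if [ "${counts[cycle]:-0}" -gt 0 ]; then
      echo "$cycle"
      return 0
    fi
    cycle=$((cycle - 1))   # step down to the next shorter period
  done
  echo -1
}

effective_cycle "12/24/7/6" 3   # weekly slot, weekly count is 6
effective_cycle "12/24/0/0" 3   # weekly slot, falls back to hourly
effective_cycle "0/0/7/0" 1     # hourly slot, nothing requested
```

The three calls print 3, 1 and -1. In the second case a volume tagged 12/24/0/0 still gets a snapshot at 0100 on Sunday, but it is tagged H and ages out with the other hourly snapshots.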
Use this script, aws_cleanup_snapshots.bsh, in your crontab to automatically delete snapshots that are beyond the requested SnapRate value. Set this one to run every 30 minutes as well.
#!/bin/bash
################################################################
#
# aws_cleanup_snapshots.bsh
# Description: Clean up snapshots
#
# Last edit: 12/7/2018
#
# Prereq:
# aws cli
# jq
################################################################
source /etc/profile
PATH=$PATH:/usr/local/bin
echo "Path is set to $PATH"
echo "AWS_CA_BUNDLE is at $AWS_CA_BUNDLE"
LOG_DATE=`TZ='America/New_York' date +%Y%m%d`
LOGFILE="${BASH_SOURCE%/*}/logs/aws_cleanup_snapshots_$LOG_DATE.log"
echo "Logs are written to $LOGFILE"
FORMAT_DATE=`TZ='America/New_York' date +%Y%m%d.%H%M%S`
echo "=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=" >> $LOGFILE
echo "Script Start" >> $LOGFILE
echo "Current Time: $FORMAT_DATE" >> $LOGFILE
CYCLE=`${BASH_SOURCE%/*}/get_cycle.bsh`
CYCLE_WORD_ARRAY=("Half-Hourly" "Hourly" "Daily" "Weekly")
CYCLE_LETT_ARRAY=("M" "H" "D" "W")
CYCLE_TIME_ARRAY=(30 60 1440 10080)
CURRENT_CYCLE=${CYCLE_LETT_ARRAY[CYCLE]}
echo "Current Cycle: ${CYCLE_WORD_ARRAY[CYCLE]}" >> $LOGFILE
EC2_AZ=`curl http://169.254.169.254/latest/meta-data/placement/availability-zone`
REGIONID="`echo \"$EC2_AZ\" | sed 's/[a-z]$//'`"
VOLUMES=`aws ec2 describe-volumes \
--filter "Name=tag-key,Values='SnapRate'" \
"Name=tag-key,Values='Name'" \
"Name=tag-key,Values='Application'" \
"Name=tag:Mode,Values='Auto'" \
"Name=tag:Keep,Values='Yes'" \
--query "Volumes[*].{VolumeID:VolumeId, \
Name:Tags[?Key==\\\`Name\\\`].Value, \
SnapRate:Tags[?Key==\\\`SnapRate\\\`].Value}" \
--region $REGIONID`
echo "=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=" >> $LOGFILE
for VOL in $(echo "${VOLUMES}" | jq -c '.[]'); do
VOLID=`echo ${VOL} | jq -r '.VolumeID'`
echo "Volume ID: $VOLID" >> $LOGFILE
SNAPRATE=`echo ${VOL} | jq '.SnapRate' | jq -r .[0]`
echo "SnapRate: $SNAPRATE" >> $LOGFILE
IFS='/' read -r -a SNAPARRAY <<< "$SNAPRATE"
CYCLE_VALUE=${SNAPARRAY[CYCLE]}
if [ $CYCLE -lt ${#CYCLE_TIME_ARRAY[@]} ]
then
timeaway=$((${CYCLE_TIME_ARRAY[$CYCLE]} * $CYCLE_VALUE * -1))
else
timeaway=-10080
fi
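# timeaway is zero or negative, so this yields a point in the past.
# Only the instant matters: it is compared with the snapshot StartTime (which is in Zulu) as epoch seconds.
THRESHOLD=`date -d "$timeaway minutes"`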
SNAPS=`aws ec2 describe-snapshots --owner-ids self \
--filters "Name=tag:Mode,Values='Auto'" \
"Name=volume-id,Values=$VOLID" \
"Name=tag:Cycle,Values=$CURRENT_CYCLE" \
"Name=status,Values=completed" \
--query "Snapshots[*].{SnapshotId:SnapshotId, \
Description:Description, \
StartTime:StartTime, \
Cycle:Tags[?Key==\\\`Cycle\\\`].Value}" \
--region $REGIONID`
for SNAP in $(echo "${SNAPS}" | jq -c '.[]'); do
SNAPID=`echo ${SNAP} | jq -r '.SnapshotId'`
START_TIME_STRING=`echo ${SNAP} | jq -r '.StartTime'`
# Compare epoch seconds numerically; < inside [[ ]] would compare them as strings
START_TIME=`date -d "$START_TIME_STRING" +%s`
if [ "$START_TIME" -lt "$(date -d "$THRESHOLD" +%s)" ]
then
echo "Delete SnapshotID: $SNAPID" >> $LOGFILE
OUTPUT=`aws ec2 delete-snapshot --snapshot-id $SNAPID --region $REGIONID`
echo $OUTPUT >> $LOGFILE
fi
done
echo "--------------------------" >> $LOGFILE
done
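The deletion decision in the cleanup script comes down to one comparison: a snapshot is removed when its start time is older than now minus the retention window for its period. Here is a minimal sketch of that test working purely on epoch seconds, so it can be checked without calling AWS. The function name is_expired and the fixed timestamp are illustrative assumptions.

```bash
#!/bin/bash
# is_expired START_EPOCH NOW_EPOCH RETENTION_MINUTES
# Succeeds (exit status 0) when the snapshot started before the retention window.
is_expired() {
  local start="$1" now="$2" retention_minutes="$3"
  local threshold=$(( now - retention_minutes * 60 ))
  # -lt compares integers; a string comparison would be wrong
  # whenever the two numbers have a different number of digits.
  [ "$start" -lt "$threshold" ]
}

now=1544144400   # an arbitrary fixed instant, in epoch seconds
# Half-hourly snapshots with a count of 12 are kept for 12 * 30 = 360 minutes.
if is_expired $(( now - 7 * 3600 )) "$now" 360; then echo "7h old: delete"; fi
if ! is_expired $(( now - 5 * 3600 )) "$now" 360; then echo "5h old: keep"; fi
```

With GNU date, the StartTime string returned by describe-snapshots can be turned into epoch seconds with date -d "$START_TIME_STRING" +%s before it is passed to a function like this.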