Posts Tagged ‘retention’

AWS Cloud Protection Manager Part 3: Backup and Restore

August 21, 2017

Backup

Backups are created according to the schedule specified in the backup policy. We discussed how to configure backup policies in the previous blog post of the series. The list of backups you see on the Backup Monitor tab are your restore points. Backups that are older then the specified retention policy will be purged from the list and you will not see them there, unless you move them to “Freezer”.

It is important to understand that apart from volume snapshots, for each backed up instance CPM also creates an AMI. Those who has hands-on experience with AWS may already know, that AMIs is the only way to create clones of Windows EC2 instances in AWS. If you go to AWS console and try to find a clone action under the instance Action menu, you won’t find any. You will have “Create Image” instead. It creates an AMI, from which you can then spin up a clone of an instance the image was created from.

CPM does exactly that. For each backup policy the instance is under, it creates one AMI. In our example we have four backup policies, that will result in four AMIs for each of the instances. Every AMI has to have at least one storage volume. So CPM will include the root volume of each instance into AMI, just because it has to. But AMIs are required only to restore EC2 instance configuration. Data is restored from volume snapshots, that can be used to create new volumes from them and then attach them to the instance. You can click on the View button under Snapshots to find the corresponding snapshot and AMI IDs.

There is a backup log for each job run as well that is helpful for issue troubleshooting.

Restore

To perform a restore click on the Recover button next to the backup job and you will get the list of the instances you can recover. CPM offers you three options: instance recovery, volume recovery and file recovery. Let’s go back to front.

File recovery is probably the most used recovery option. As it lets you restore individual files. When you click on the “Explore” button, CPM creates new volumes from the snapshots you are restoring from and mount them to the CPM instance. You are then presented with a simple file system browser where you can find the file and click on the green down arrow icon in Download column to save the file to your computer.

If you click on “Volume Only”, you can restore particular volumes. Restored volumes are not attached to any instance, unless you specify it under “Attach to Instance” column. You can then select under “Attach Behaviour” what CPM should do if such volume is already attached to the instance or if you want to automatically detach the original volume, but the instance is running (you can do it only if instance is stopped).

And the last option is “Instance”. It will create a clone of the original instance using the pre-generated AMI and volume snapshots, as we discussed in the Backup section of this blog post. You can specify many options under Advanced Options section, including recovery to another VPC or different availability zone. If anything, make sure you specify a new IP address for the instance, otherwise you’ll have a conflict and your restore will fail. Ideally you should also shut down the original EC2 instance before spinning up a restore clone.

Advanced Features

There are quite a few worth mentioning. So far we have looked at simple EC2 instance restore. But you don’t have to backup whole instances, you can also backup individual volumes. On top of that, CPM supports RDS database, Aurora and Redshift cluster backups.

If you run MS Exchange, Sharepoint or SQL on your EC2 instances, you can install CPM backup agent on them to ensure you have application-consistent backups via VSS, as opposed to crash-consistent backups you get if agent is not used. If you install the agent, you can also run a script on the instance before and after the backup is taken.

Last but not least is DR. Restoring to another availability zone within the region is already supported on instance recovery level. You can choose availability zone you want to restore to. It is not possible to recover to another region, though. Because AWS snapshots and AMIs are local to the region they are created in. If you want to be able to recover to another region, you can configure DR in CPM, which will utilise AWS AMI and snapshot copy functionality to copy backups to another region at configured frequency.

Conclusion

Overall, I found Cloud Protection Manager very easy to install, configure and use. If you come from infrastructure background, at first glance CPM may look to you like a very basic tool, compared to such feature-rich solutions like Veeam or Commvault. But that feeling is misleading. CPM is simple, because AWS simple. All infrastructure complexity is hidden under the covers. As a result, all AWS backup tools need to do is create snapshots and CPM does it well.

Advertisements

AWS Cloud Protection Manager Part 1: Intro to Snapshots

August 7, 2017

Cloud computing conceptOverview

We have all been dealing with on-premises infrastructure for years before public cloud became a thing. Depending on complexity and size of your environment, retention and other requirements you can choose from plethora of products, such as Veeam, Commvault, NetWorker, you name it.

When public cloud started getting more traction we realised that the paradigm has shifted. Public cloud providers are pushing us to treat instances of application servers as cattle versus pets and build fault tolerance into code and not rely on underlying infrastructure reliability.

If application is no longer monolithic and is distributed among multiple instances of the same server, loss of any one instance has little overall impact on the application and does not require a restore. We can simply spin up a new instance instead.

However, majority of applications in today’s IT environments are still monolithic and need to be backed up. Data needs to be protected as well, if it gets corrupted or if for any other reason we need to roll back our application to a certain point in time, backups are necessary.

AWS Snapshots

When I started researching the topic of backup in the cloud I realised that it is still a wild west. Speaking of AWS specifically, if you are using one of the native AWS services, such as RDS, you would usually have the backup feature built-in. Otherwise, if you simply want to backup some application or service running inside an AWS instance, all you have is snapshots.

We have all heard the mantra, that snapshot is not a backup, because usually it is kept in the same place as the original data it was created from. This is true for most of the traditional storage arrays. AWS is slightly different.

Typically when you start an EC2 instance, its disk volumes are kept on Elastic Block Store (EBS), which is an equivalent of block storage in traditional infrastructure world. But when you create a snapshot of an EBS volume, the snapshot is transferred to S3 object storage.

When the first snapshot is created, the whole volume needs to be copied to S3. It may take some time to complete, depending on the volume size, but the copy process happens in the background and doesn’t impact the original EC2 instance. All subsequent snapshots will be significantly quicker, unless you overwrite large amount of data between snapshots.

CloudWatch Event Rules

As we can see, unlike traditional storage arrays, AWS snapshots are kept separately from the original data. They can even be restored to another availability zone or copied to another region to protect from region-wide failures, which makes AWS snapshots a decent backup solution. But devil is in the details.

Creating a volume snapshot manually is simple, you pick “Create Snapshot” from the Actions menu and the rest is handled by AWS. To automate the process, AWS offers CloudWatch event rules for scheduled snapshot creation. By going to CloudWatch > Events > Create Rule, setting up a schedule, selecting “EC2 CreateSnapshot API call” and the volume ID in the Target section you can implement a simple backup in AWS.

cloudwatch_snapshot1.jpg

However, there are two immediate problems with such approach. The first is scalability of it. Snapshots in AWS are created on volumes, not instances. This has interesting implications, such as it is not possible to create a consistent snapshot across multiple volumes. But more importantly, if you have more than one volume on every instance, you may end up having dozens of CloudWatch rules, that are hard to manage and keep track of.

cloudwatch_snapshot2.jpg

Second is retention. CloudWatch events let you create snapshots, but you cannot specify how long you want to keep them for. You will need to write your own scripts using AWS APIs to delete snapshots, which makes the whole solution questionable, as you can then use APIs to create snapshots as well.

Conclusion

As you can see, AWS does not offer a simple out of the box solution for EC2 backup. You either need to write your own scripts using AWS APIs or lean on backup solutions built specifically to solve the problem of backups in AWS. One of such solutions Cloud Protection Manager we will discuss in the next blog post of the series.

GFS backup scheme in Symantec Backup Exec

March 23, 2012

Grandfather-Father-Son is an industry standard backup scheme, where you have 5 daily backups, 5 weekly backups and as many monthly as you need. Symantec Backup Exec has prebuilt policy for GFS, but before going into configuring backup scheme itself, lets talk a little bit about general backup job configuration in Backup Exec.

Basic Terminology

Inside user interface you see Jobs, Policies, Selection Lists and Media Sets. First of all you need to create Selection List, which describes what you want to backup. There you select files and folders from your Windows, Unix or NDMP servers. Then you create Media Set, which is a collection of tapes with particular append and retention periods. Append period specifies how long data can be added to the same tape and retention period tells for how long data cannot be overwritten. Retention period starts form the time of last append to the tape. Then you create Policy. Policy, by means of templates, defines when backup jobs are run, where backups are stored and what is the type of backup – incremental, differential or full. One policy can consist of several templates. In template you specify backup date and time, as well as target tape library.

GFS Implementation

Backup Exec has a template for GFS backup rotation scheme. Click “New policy using wizard”, choose GFS scheme and then select schedule, target backup device and media sets for daily, weekly and monthly backups. By default Backup Exec suggests the following configuration.

Three tape media sets:

  • Daily Media Set – 1 week overwrite, 1 week append
  • Weekly Media Set – 5 weeks overwrite, 5 weeks append
  • Monthly Media Set – 1 year overwrite, 1 year append

Policy with three templates:

  • Daily Backup – Monday to Friday, Incremental
  • Weekly Backup – every Friday, Full
  • Monthly Backup – first Saturday of each month, Full

Also Backup Exec automatically creates rules to resolve conflicts. For example when both Daily and Weekly backups try to run on Friday, jobs do not conflict, because weekly backups always supersede daily. Same for monthly.

I personally prefer another schedule. First of all, if you run your jobs after midnight, you will need to shift your schedules from Mon – Fri to Tue – Sat. Additionally, I run monthly backup on the first Saturday of the month. Backup Exec by default (taking into consideration my one day shift) would suggest first Sunday for the monthly backup. However, it doesn’t make much sense to have weekly on Saturday and then monthly next day on Sunday. You would just consume more space without any benefit. Also, you can schedule monthly on the last Saturday of the month, but if the last day is Thursday, for example, then you will loose four business days from your monthly backup.

After the policy is created, you need to create backup jobs using this policy by clicking on New jobs using policy. All three jobs will be created automatically according to Selection List, as well as Policy Schedule, Target, and Backup Type parameters.

I’d also recommend everyone to configure notifications. There are general Alerts properties as well as inside each job.

Why I wouldn’t recommend Microsoft Data Protection Manager as a backup solution

February 22, 2012

When taking into account different backup solutions which are in the market, MS DPM 2010 was rather attractive for us. It’s a solution from leading software vendor and it’s cheap. However, DPM has a number of limitations which forced us to abandon it and rebuild our backup procedures from ground up. Here I’d like to describe major points as impartial as I can.

The first thing about DPM is that it uses VSS snapshots as one and only backup method. The major consequence is that you are very limited in flexibility and cannot do incremental, differential or full backups, implement GFS backup strategy or Progressive Paradigm. The only option you have is to exclude weekends or any other particular days from backups. That means ineffective storage utilization and inability to have longer data retention as you could have with flexible backup policy as GFS for instance, not to mention additional spendings on storage.

Another problem of VSS is that it supports only 64 snapshots. Basically, that means if you exclude weekends from your backup policy, you will be able to have backups for 89 day period. It’s clearly not enough if, for example, you work in a financial institution where you have strict policies of long data retention. DPM assumes that you will use tapes for prolonged data retention. If you already have tapes then you are good to go, if not then once again it’s additional expenditures.

Interestingly enough in DPM you cannot have different retention policies for data which resides on the same volume. Say I want to keep database backups for 3 months and transaction logs for the last week. If database backups and transaction logs reside on the same volume then you will have to create the second volume and separate them.

Limitation which I personally find very inconvenient is space reservation. Each time you create a Protection Group you reserve space for it. Say 500GB. And you cannot change it. In case one folder from ten, which you backup from the server, moves to another place and 250GB become free, the only option you have is to destroy Protection Group, loose all backups and recreate it. DPM helps you in situation when you don’t know how your data will grow and you can specify smaller storage size initially and it will automatically grow as needed. However, it can extend only 32 times. If you hit that limit, then you are in the same situation as before.

Another major issue arise when you change protected server name. If server name is changed the only option you have is destroying protection group, loosing all backups and recreating it.

The next limitation I also find inflexible is target storage. In DPM you can only use blocked devices as target storage to keep your backups. So it’s either DAS, FC or iSCSI storage. NAS is not supported.

If you work in SMB then you would probably have issues with installation and support of legacy systems. DPM works only on 64-bit Windows Server 2008 platforms (Windows Server 2003 is not supported). DPM doesn’t support Bare Metal Recovery (BMR) of Windows Server 2003.

And lastly, DPM keeps all data on a raw volume. Raw volumes are more efficient in terms of disk I/O performance. But when it’s helpful for DBMS it doesn’t seem to make any difference for backup software. The downside of it is higher risk of loosing all data in case of volume damage or DPM bug. It’s arguable, so I will leave it as my personal opinion.

In conclusion, it’s rather disappointing to see how software with 7 years history (DPM 2006 was released in 2005) has more limitations than any backup software solution I can think of. Even if you don’t have enough money for Symantec Backup Exec, ARCServe Backup, HP Data Protector or any other software, I would recommend to make more effort and search for some other solution. Otherwise you can fall into the same trap as we did.