Business Continuity Concepts | CompTIA IT Fundamentals FC0-U61 | 6.7

In this video you will learn about business continuity concepts such as fault tolerance and disaster recovery.

Fault Tolerance

Fault tolerance is the way in which an operating system (OS) responds to a hardware or software failure.  The keys to fault tolerance in IT include replication and redundancy.

Replication

Replication is the continuous copying of data changes from one device to another device.  To make replication possible, many other factors must be taken into account, including hardware and data redundancy, backups, backup storage, and contingency planning.

Redundancy

A redundant component is a part of a computer or network system that guards the primary system against failure by acting as a backup. Redundant components can include hardware, software, and networks that are designed to switch over automatically to the secondary components when a failure occurs. Here are some examples of how redundancy can be achieved:

Data

Data redundancy is the existence of data that is additional to the actual data and permits correction of errors in stored or transmitted data. The additional data can simply be a complete copy of the actual data, or only select pieces of data that allow detection of errors and reconstruction of lost or damaged data up to a certain level. Data redundancy can be achieved by using high availability databases, RAID arrays for storage, and backups.  Some database apps include high availability (the ability to recover from a failure quickly) as a configuration option.  For example, some editions of Microsoft SQL Server and Oracle support high availability options.
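The idea that a small amount of additional data can reconstruct lost data can be illustrated with XOR parity, the technique used by parity-based RAID levels. This is a minimal sketch, not a real storage implementation; the function name `xor_parity` and the sample block contents are assumptions made for illustration.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    # zip(*blocks) groups the first byte of every block, then the second, and so on.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks of equal length, plus one parity block.
data_blocks = [b"ACCT", b"CUST", b"PLAN"]
parity = xor_parity(data_blocks)

# Simulate losing the second block, then rebuild it from the survivors and the parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_parity(survivors)
```

XOR-ing the surviving blocks with the parity block cancels out everything except the missing block, so `rebuilt` holds the lost data. Only one missing block can be recovered this way, which matches the "up to a certain level" limit described above.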

RAID Arrays

Redundant Array of Inexpensive Disks (RAID), also expanded as Redundant Array of Independent Disks, is a data storage virtualization technology that combines multiple physical disk drive components into one or more logical units for the purposes of data redundancy, performance improvement, or both. Despite the name, RAID 0, also known as striping, is not redundant storage. RAID 0 treats two or more drives as a single logical unit by striping data across all of them. This improves read/write performance, but if any drive fails, all data is lost. Actual redundancy is available with all other RAID levels. The most common RAID levels are 1, 5, and 10 (also known as 1+0). Some desktop computers and most servers have built-in RAID host adapters. RAID adapter cards can be added when necessary.

RAID Levels

Software RAID is supported by most operating systems and can be used on computers that do not have RAID-compatible host adapters. Software RAID supports the same types of RAID arrays as hardware RAID. Real-time data redundancy for any type of data can be achieved by using RAID 1, RAID 5, or RAID 10 storage arrays. However, RAID is not a substitute for backups. Depending on how backups are created, backup files can be used immediately or might need to be restored before use.
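The trade-off between redundancy and usable space at the common RAID levels can be sketched as a small calculation. This is an illustrative sketch under stated assumptions: all drives are the same size, RAID 1 is treated as a mirror that yields the capacity of one drive, and the function name `usable_capacity` is hypothetical.

```python
def usable_capacity(level: int, drives: int, size_tb: float) -> float:
    """Return usable capacity in TB for an array of identical drives."""
    if level == 0:
        # Striping only: all space is usable, but there is no redundancy.
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * size_tb
    if level == 1:
        # Mirroring: every drive holds the same data.
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_tb
    if level == 5:
        # Striping with parity: one drive's worth of space holds parity.
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_tb
    if level == 10:
        # Striped mirrors: half of the drives hold mirror copies.
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drives / 2 * size_tb
    raise ValueError(f"unsupported RAID level: {level}")
```

For example, four 4 TB drives give 12 TB usable in RAID 5 and 8 TB usable in RAID 10, while the same drives in RAID 0 give 16 TB with no protection against a drive failure.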

Network

Network redundancy is the practice of installing additional or alternate network devices, equipment, and communication media within the network infrastructure. It keeps the network available when a device or path fails or becomes unavailable. Network redundancy is designed to help businesses achieve 99.999% (five-nines) network availability, meaning minimal downtime. Five-nines is also the goal of high availability databases.
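To see what a five-nines target means in practice, the percentage can be converted into allowed downtime per year. This is a simple arithmetic sketch; the function name `allowed_downtime_minutes` is hypothetical, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


five_nines = allowed_downtime_minutes(99.999)
three_nines = allowed_downtime_minutes(99.9)
```

At 99.999% availability the budget is a little over five minutes of downtime per year, compared with almost nine hours at 99.9%.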

Power

A redundant power supply is a configuration in which a single piece of computer equipment operates using two or more physical power supplies. Each power supply has the capacity to run the device on its own, so the device keeps operating even if one supply fails. Most redundant power supplies are designed for rack-mounted use. To provide replacement power for a particular device in the event of AC power failure, connect it to a battery backup unit (also known as an uninterruptible power supply, or UPS). To provide redundant power protection against a failure of the electrical service to a location, use a backup standby generator.

Backup Considerations

When hardware fails, you can purchase replacements, but when storage devices fail, the data is lost unless you have backup copies. A backup is a copy of information stored on a computing device (laptop, desktop, server, or mobile device) that can be restored if the original is lost or corrupted. There are many backup methods designed for different requirements. The following sections discuss backup methods and when to use them.

Types of Data Backups

There are three different backup types:

  • Full backup: backs up all files whether they were backed up previously or not. When a file is backed up, a file attribute known as the archive bit is changed to indicate that the file has been backed up.
  • Differential backup: backs up all files changed since the last full backup.
  • Incremental backup: backs up only the files changed since the last full or incremental backup.

A differential backup and incremental backup are two different methods of backing up only changed files. If incremental backups are used between full backups, in the event that a full restoration is needed, the last full backup and all incremental backups must be restored. However, if differential backups are used between full backups, only the last full backup and the last differential backup must be restored.
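The restore rules above can be sketched as a function that picks which backup sets are needed for a full restoration. This is a minimal sketch, assuming the history is sorted oldest to newest and that a schedule uses either differential or incremental backups between full backups, not both. The `Backup` class and `restore_chain` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    day: int
    kind: str  # "full", "differential", or "incremental"


def restore_chain(history: list[Backup]) -> list[Backup]:
    """Return the backups needed for a full restoration, in restore order."""
    full_positions = [i for i, b in enumerate(history) if b.kind == "full"]
    if not full_positions:
        raise ValueError("history contains no full backup")
    start = full_positions[-1]  # position of the most recent full backup
    chain = [history[start]]
    later = history[start + 1:]
    differentials = [b for b in later if b.kind == "differential"]
    if differentials:
        # Only the newest differential is needed on top of the full backup.
        chain.append(differentials[-1])
    else:
        # Every incremental since the full backup is needed, in order.
        chain.extend(b for b in later if b.kind == "incremental")
    return chain


incremental_week = [Backup(1, "full"), Backup(2, "incremental"), Backup(3, "incremental")]
differential_week = [Backup(1, "full"), Backup(2, "differential"), Backup(3, "differential")]
```

With the incremental schedule, all three backup sets are returned. With the differential schedule, only the full backup from day 1 and the differential from day 3 are returned.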

Data

The most fundamental backup concerns are what to back up and where to store the backup. The following sections discuss the question of what to back up.

File Backups

A file backup is the practice of protecting important data by storing duplicate files in a different location on the same drive, on different drives, on different computers, or at different sites. Files can be backed up using several methods:

  • File Synchronization:  Files are copied from the original location to a matching folder on another local or network storage device by an app that tracks additions, changes, and deletions.
  • File Copying:  Files are copied from the original location to another location by the operating system’s built-in copy commands.
  • File History:  Files are copied to another location in such a way that different versions of the same file can be restored when desired.
  • File Backup with Compression:  Files are compressed into archives that are created on another location by a backup utility that might be provided with the operating system or a third-party provider.  Files must be retrieved by the backup utility. Depending on the utility, the files might need to be restored to their original location or another location before they can be used.  The operating system tracks which files have been backed up so they are not backed up again.
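The last method in the list, file backup with compression, can be sketched with Python's standard `zipfile` module. This is an illustrative sketch rather than a full backup utility: the function names `backup_folder` and `restore_folder`, the folder names, and the sample file are all hypothetical, and the demonstration runs inside a temporary directory.

```python
import tempfile
import zipfile
from pathlib import Path


def backup_folder(source: Path, archive: Path) -> list[str]:
    """Compress every file under source into a ZIP archive and return the stored names."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the source folder so they restore cleanly.
                zf.write(path, arcname=path.relative_to(source).as_posix())
        return zf.namelist()


def restore_folder(archive: Path, destination: Path) -> None:
    """Extract the archive into destination, which may differ from the original location."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "documents"
    source.mkdir()
    (source / "invoice.txt").write_text("Invoice 1001")

    stored = backup_folder(source, root / "backup.zip")
    restore_folder(root / "backup.zip", root / "restored")
    restored_text = (root / "restored" / "invoice.txt").read_text()
```

As the list notes, the files inside the archive cannot be opened directly by most apps. They have to be extracted first, here to a separate `restored` folder.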

Critical Data

Critical data is the data a specific business area depends on to succeed, or the data required to get the job done. It needs to be backed up and available more quickly than old or stale data. You can make access to your latest data easier in case of loss with these steps:

  • Move outdated information to a separate drive.
  • Use file synchronization and versioning on current folders only.
  • Run periodic backups on current folders only.

Database

A database backup is the process of backing up the operational state, architecture and stored data of database software. It enables the creation of a duplicate instance or copy of a database in case the primary database crashes, is corrupted or lost.

OS Backups

OS (operating system) backups enable a crashed system to be returned to use quickly or migrated to new hardware.  An operating system backup also includes installed apps, is usually created as an image backup, and often uses different backup software than file-oriented backups.  Operating system backups are often known as disaster recovery backups.

Location

The following sections compare and contrast backup storage locations.

Stored Locally

Backups that are stored locally can be restored immediately to local systems in the event of file corruption, limited data loss, or widespread data loss.  However, if your organization uses mobile systems, locally stored backups might not be restorable until the mobile systems are back in the home office.

Cloud Storage

Cloud-based backup can run continuously whenever a device is connected to the internet, and devices that suffer data loss can restore a backup over the same connection. Some backup vendors also offer the option of receiving backup files on a portable hard drive for faster restoration.

On-Site vs. Off-Site

On-site backups can be accessed immediately for restoration to on-site systems. However, in the event of a manmade or natural disaster, on-site backup files can be lost. Off-site backup files are stored away from the point of need and must be delivered to the device location for restoration. By using a combination of on-site and off-site backup storage with local and cloud components, backups can provide both quick access and off-site security.

Contingency Plans

A contingency plan for keeping IT running during any type of interruption should include the following elements:

  • Quick access to backup information
  • Availability of replacement systems that can be used to continue business
  • Email, website, and social media accounts that can be used to update current and potential customers of any changes in phone numbers or physical addresses during a crisis
  • Rapid deployment of replacement IT hardware and software systems

Disaster recovery sites are sites where IT functions can be set up when a disaster prevents the use of the original location. These fall into three categories:

  • Cold Site:  A cold site has power, HVAC, and network connections, but would need equipment and data before it could be used for IT functions.  This is the least expensive to maintain before a disaster but takes the longest time to set up during a disaster.
  • Warm Site:  A site that has power, HVAC, network, and hardware suitable for IT functions is a warm site.  Systems at the warm site might need to have operating systems, apps, and data restored, or operating systems and apps could be already installed to save time.  A warm site costs more than a cold site, and would require ongoing maintenance of hardware and possibly software, but can be made ready in hours, rather than days, compared to a cold site.
  • Hot Site:  A hot site is, in IT terms, a duplicate of your primary IT functions, with hardware, apps, and data ready to run in minutes or less in the event of a disaster. This is the most expensive of the three types of disaster recovery site, but for an organization that can afford no downtime, it might be the only one that is worth considering.

Disaster Recovery

Disaster recovery is a set of policies and procedures which focus on protecting an organization from any significant effects in case of a negative event due to manmade or natural disasters. From an IT standpoint, there are three parts to disaster recovery:

  • Data restoration
  • Prioritization
  • Restoring access

As you develop a disaster recovery and business continuity plan, make sure you test and evaluate the policies and procedures you create. Look for weaknesses and resource gaps and fix any problems you discover. Train personnel, making sure to clearly define each person’s roles and responsibilities to help improve performance, communication, and coordination and to avoid panic. Make sure your plans and procedures meet legal and regulatory requirements.

Data Restoration

Data restoration is the process of salvaging inaccessible, lost, corrupted, damaged or formatted data from secondary storage, removable media or files, when the data stored in them cannot be accessed in a normal way.  What can be done in advance to make sure that data can be restored as quickly as possible?

  • Use the fastest local or network connections available for restoring backups.
  • Migrate backups on slower media to faster media.
  • Restore only the data needed for current operations.

The fastest local connection in general use for external drives is USB 3.2 Gen 1 (also called USB 3.0 or USB 3.1 Gen 1), which runs at 5Gbps. If any backups needed for immediate use are stored on older USB 2.0 drives (480Mbps, or roughly ten times slower than USB 3.2 Gen 1), that data should be migrated to USB 3.2 Gen 1 drives before a disaster. The fastest LAN connections in general use are Gigabit Ethernet (1000Mbps) and Wireless AC (433-1669Mbps). If data will be restored via LAN/WAN connections, USB 3.2 Gen 1 network adapters that support these standards should be used to replace slower built-in network adapters.
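The effect of connection speed on restore time can be estimated with simple arithmetic. This sketch computes a best-case figure from the rated link speed only; real transfers are slower because of protocol overhead and drive speed. The function name `transfer_hours` and the 500 GB backup size are assumptions chosen for illustration.

```python
def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Return the best-case hours needed to move size_gb gigabytes over a link."""
    # Decimal units: 1 GB = 1000 MB, and 1 byte = 8 bits.
    size_megabits = size_gb * 1000 * 8
    seconds = size_megabits / link_mbps
    return seconds / 3600


backup_size_gb = 500  # hypothetical backup size

usb2_hours = transfer_hours(backup_size_gb, 480)      # USB 2.0
gigabit_hours = transfer_hours(backup_size_gb, 1000)  # Gigabit Ethernet
usb5g_hours = transfer_hours(backup_size_gb, 5000)    # 5Gbps USB
```

For this hypothetical 500 GB backup, the best case is about 2.3 hours over USB 2.0, about 1.1 hours over Gigabit Ethernet, and about 13 minutes over a 5Gbps USB connection.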

Prioritization

All data is important, but current data is more important than old data. Here are some useful rules to help prioritize data within your organization (keep in mind that rules may vary by industry):

  • Define your key assets and restore them first.
  • Make sure you restore information that enables you to do current business before you restore historical information.
  • Make sure the information you need first can be restored as quickly as possible.

Define your key assets. Key assets include:

  • Customer or client information
  • Accounting information
  • Products and marketing plans
  • Line-of-business information
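The prioritization rules above can be sketched as a sort order for restore jobs. This is only an illustration under assumed rules: key assets are restored before everything else, and within each group current data is restored before historical data. The `DataSet` class, the `restore_order` function, and the sample data sets are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    key_asset: bool  # True for items such as customer or accounting information
    current: bool    # True if the data is needed for current business


def restore_order(data_sets: list[DataSet]) -> list[DataSet]:
    """Sort key assets first, then current data before historical data."""
    # False sorts before True, so negate the flags to put priority items first.
    return sorted(data_sets, key=lambda d: (not d.key_asset, not d.current))


pending = [
    DataSet("Old project files", key_asset=False, current=False),
    DataSet("Archived accounting records", key_asset=True, current=False),
    DataSet("Office newsletter drafts", key_asset=False, current=True),
    DataSet("Customer records", key_asset=True, current=True),
]
ordered_names = [d.name for d in restore_order(pending)]
```

Because Python's sort is stable, data sets with the same priority keep the order in which they were listed.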

Restoring Access

A disaster recovery plan needs to be in place to restore vital services not only for your company, but also for your customers and potential customers. Services such as telephone, internet, email, and social media connections need to be restored as soon as possible.