CASE STUDY - 0001
- Disaster Recovery using a BDR and Virtualization
Your main server goes down and your business production comes to a halt. Do you have a plan of action? Is your business prepared for such an event?
Problem: Monday morning and the server does not start.
Technologies used: itAMPM BDR system, MXLogic email caching utility
Managed Service Contract: Gold Level
Senior Technician in Charge: Raymundo Manriquez
Downtime: 3 Hrs.
Data Loss: None
Disaster Recovery Cost (Labor): $0
When confronted with this type of disaster, itAMPM is there to keep your business moving forward by protecting vital business data and keeping your network healthy and up to date. In this case, a Domain Controller with an Exchange Server role shut down unexpectedly, and the customer had all the necessary services and equipment to minimize business downtime from a server crash. The server had two redundant power supplies, and both of them went bad. It was later found that the building had scheduled electrical work during the weekend, but the office managers did not inform itAMPM so that the servers could be shut down remotely beforehand. With a fully managed GOLD service plan, a BDR unit, and MXLogic's email caching service, the client was able to continue working on a virtualized server until the damaged server was repaired and put back online.
Senior Technician in charge: Ray Manriquez.
Monday 6:30 am
With managed service support agents monitoring the network, we were immediately notified that repeated attempts to contact the server from itAMPM's data center were failing.
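The "repeated attempts" in that alert can be illustrated with a small sketch. It assumes the monitor raises an alert only after several consecutive failed checks, so that one dropped check does not page anyone. The function name and the threshold of three are assumptions for illustration, not details of itAMPM's monitoring system.

```python
from typing import Iterable, Optional


def first_alert_index(results: Iterable[bool], threshold: int = 3) -> Optional[int]:
    """Return the index of the check that triggers an alert, or None.

    `results` is a series of reachability checks (True = server answered).
    An alert fires once `threshold` checks in a row have failed.
    """
    consecutive_failures = 0
    for index, reachable in enumerate(results):
        if reachable:
            consecutive_failures = 0  # any success resets the streak
        else:
            consecutive_failures += 1
            if consecutive_failures >= threshold:
                return index
    return None


# Hypothetical history: one transient failure, a recovery, then a real outage.
history = [True, False, True, False, False, False, False]
alert_at = first_alert_index(history, threshold=3)
```

With this history, the single early failure is ignored and the alert fires on the third failure in a row.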
Monday 8:30 am
On receiving these alerts, help desk technicians try to contact the
server manually from inside and outside the network, to no avail.
The help desk calls the client’s office, where the server is
located, and asks the office manager to check whether the server can
be turned on manually. The office manager tries to start the server
by hand, but nothing happens. A technician is dispatched to the
site.
Monday 9:30 am
On arrival, the technician cannot get the server to start. Symptoms
like these are usually caused by malfunctioning hardware. The
hot-swappable redundant power supplies are checked by pulling out
each one in turn. Still no response. The server is built with Intel
hardware. Having worked on Intel products for many years, the
technician knows that Intel Product Support reserves the right not
to authorize a return until the hardware has been troubleshot with
one of its technicians. He calls Intel Product Support and opens a
support case with the Intel Channel Partner Call Center. The
technician troubleshoots the hardware with the Intel Technical
Support agent on the phone. They work together to find which device
is preventing the server from booting, removing hardware one piece
at a time and attempting a reboot after each piece is removed. Based
on their findings, both technicians conclude that the problem is
with the power supply cage. The Intel Technical Support agent
quickly creates an RMA.
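The one-piece-at-a-time procedure described above can be sketched as a simple elimination loop. This is a minimal illustration, not Intel's documented procedure: the component names and the simulated boot test are hypothetical stand-ins for a person pulling parts and pressing the power button.

```python
from typing import Callable, Optional, Sequence, Set


def find_blocking_component(
    removable: Sequence[str],
    boots_without: Callable[[Set[str]], bool],
) -> Optional[str]:
    """Remove components one at a time and retry the boot after each removal.

    Returns the component whose removal first lets the server boot, or None
    if the server still fails with every removable part taken out.
    """
    removed: Set[str] = set()
    for component in removable:
        removed.add(component)
        if boots_without(removed):
            return component
    return None


# Hypothetical parts list and fault, for illustration only.
def simulated_boot(removed: Set[str]) -> bool:
    return "faulty_card" in removed


order = ["add_in_card", "faulty_card", "extra_memory"]
culprit = find_blocking_component(order, simulated_boot)
no_culprit = find_blocking_component(order, lambda removed: False)
```

A result of None means the fault lies in something that could not be removed, which narrows the search to the remaining core hardware.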
Monday 10:30 am
After about an hour on the phone troubleshooting the server, the
technician knows the physical server will not be usable that day. He
starts virtualizing the server, a service itAMPM provides in
partnership with the BDR (Backup and Disaster Recovery) system
provider, working online with the BDR provider's technical support.
With BDR backups of the server state and server data taken every 15
minutes, the server can be virtualized and brought online in about
an hour.
Monday 12:30 pm
Once the virtual server was up and running, everything seemed to be
back to normal. Office employees then noticed that they had not
received any email while the server was down. Some of them were
waiting for important messages. How could they know whether the
email they were waiting for had already been sent? Fortunately, the
site was serviced by itAMPM and its partner, MXLogic. The itAMPM
technician explained that whenever the Exchange Server is
unreachable by MXLogic's monitoring systems, all incoming email is
cached offsite on MXLogic servers. Once the site's Exchange Server
is reachable again from MXLogic, all held messages are released.
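The hold-and-release behavior the technician described can be sketched as a small queue. This is only an illustration of the idea, not MXLogic's implementation: the class name, the `server_reachable` flag, and the message labels are all assumptions.

```python
from collections import deque
from typing import Callable, Deque, List


class HoldAndReleaseQueue:
    """Hold mail while the destination is down, release it in order later."""

    def __init__(self, deliver: Callable[[str], None]) -> None:
        self._deliver = deliver
        self._held: Deque[str] = deque()

    def receive(self, message: str, server_reachable: bool) -> None:
        if server_reachable:
            self.release()  # flush the backlog first to preserve order
            self._deliver(message)
        else:
            self._held.append(message)

    def release(self) -> None:
        while self._held:
            self._deliver(self._held.popleft())

    @property
    def held_count(self) -> int:
        return len(self._held)


# Two messages arrive during the outage, a third after the server returns.
delivered: List[str] = []
queue = HoldAndReleaseQueue(delivered.append)
queue.receive("msg-1", server_reachable=False)
queue.receive("msg-2", server_reachable=False)
held_during_outage = queue.held_count
queue.receive("msg-3", server_reachable=True)
```

Nothing is lost during the outage, and the held messages reach the mail server in their original order once it is reachable again.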
Monday 1:30 pm
After about an hour, performance was back up to par. The technician
returned to headquarters with the physical server.
Tuesday 10:15 am
The new power cage arrives and work begins by replacing the old cage
with the new one. After the new unit is installed, the server still
does not boot.
Tuesday 10:45 am
Another call is placed to Intel Technical Support and the previous
case is reopened. This time, the Intel Support technician realizes
that even though the fans on the power supplies turn on when a power
cord is connected, the power supplies themselves are malfunctioning.
Testing with another power supply of similar power rating confirms
this: the server boots up without any problems. An RMA is created
for the original unresponsive power supplies.
Tuesday 11:30 am
Upon boot-up, the technician notices that the cage fan fails to
throttle back down from its initial startup speed. The fan stays at
high speed and is extremely loud. According to Intel tech support, a
BIOS upgrade is needed to control the fan speed. itAMPM technicians
upgrade the BIOS. RAID 1 is then reconfigured in the BIOS on the
four physical disks to create two virtual disks. The physical server
is now ready to be restored from the virtual server.
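As a rough sketch of that disk layout: RAID 1 mirrors disks in pairs, so four physical disks yield two virtual disks, each with the capacity of one member disk. The disk labels, the 500 GB size, and the pairing of adjacent disks are assumptions for illustration; the case study does not state them.

```python
from typing import List, Sequence, Tuple


def mirror_pairs(disks: Sequence[str]) -> List[Tuple[str, str]]:
    """Group disks into RAID 1 mirrors, two disks per virtual disk."""
    if len(disks) % 2 != 0:
        raise ValueError("RAID 1 mirrors need an even number of disks")
    return [(disks[i], disks[i + 1]) for i in range(0, len(disks), 2)]


# Hypothetical disk labels and size; the case study gives neither.
DISK_SIZE_GB = 500
physical_disks = ["disk0", "disk1", "disk2", "disk3"]

virtual_disks = mirror_pairs(physical_disks)
raw_capacity_gb = DISK_SIZE_GB * len(physical_disks)
# Each mirror stores one copy of the data on each of its two disks,
# so a virtual disk offers the capacity of a single member disk.
usable_capacity_gb = DISK_SIZE_GB * len(virtual_disks)
```

Half of the raw capacity goes to redundancy, which is what allows either disk in a pair to fail without data loss.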
Tuesday 2:00 pm
Before the server is taken to the client’s site for the
virtual-to-physical transfer of the OS and data, the restore
procedure is tested as far as it can go without being at the client
site. The physical server is booted from a Bare Metal Restore (BMR)
CD, which contains the ShadowProtect software used to restore backup
images. The RAID volumes are not detected, but the CD has options to
load drivers for such devices. The assumption is that the driver to
install should be the one for the operating system the server runs,
Windows Server 2003. After extensive and exhausting efforts to
install that driver, the CD still cannot see the RAID volumes.
Wednesday 9:00 am
The RMA replacement power supplies arrive from Intel and are placed
into the server. The server boots up and power cycles fine. The
client now has an extra power supply, just in case. The process of
installing the Windows Server 2003 driver is started again. The boot
CD starts the system with Windows Vista operating system files.
Since the CD itself runs on Vista operating system files, the
technician decides that the right driver for the RAID volumes is
probably the Windows Vista one. The Vista drivers are loaded and the
two RAID volumes pop up on the screen. Success. The server is ready
to be taken to the client site to start the restore from the last
backup of the virtual server.
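The lesson from the driver problem is that the restore CD loads the storage driver into its own running environment, so the driver has to match the CD's operating system files rather than the operating system inside the backup image. A minimal sketch of that selection rule follows; the function and the driver package names are hypothetical and do not correspond to any ShadowProtect or Intel API.

```python
from typing import Mapping


def pick_storage_driver(drivers_by_os: Mapping[str, str], boot_environment_os: str) -> str:
    """Choose the driver build that matches the OS the restore CD is running.

    The restore environment loads the driver into its own kernel, so the OS
    stored inside the backup image plays no part in the choice.
    """
    try:
        return drivers_by_os[boot_environment_os]
    except KeyError:
        raise LookupError(f"no driver build for {boot_environment_os!r}") from None


# Hypothetical driver package names, for illustration only.
drivers = {
    "Windows Server 2003": "raid-driver-2003",
    "Windows Vista": "raid-driver-vista",
}
chosen = pick_storage_driver(drivers, boot_environment_os="Windows Vista")
```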
Wednesday 5:00 pm
The technician arrives at the client site after business hours and performs one last backup of the virtual server. He shuts the virtual server down and transfers the backup images to an external hard drive, which he will use to copy the data back onto the physical server with the BMR CD. The transfer takes one and a half hours. The itAMPM technician then connects the external drive, which holds the last backup image of the virtual server, to the physical server and copies the images back to their respective RAID volumes. This transfer takes another hour and a half. Once the transfer is complete, the physical server is rebooted and starts up normally, as if it had never had a problem. The backup to a virtual machine and the restore to physical hardware have been successful.
- Cost to customer: $0 (the customer was on a Gold Level managed service contract)
(A charge of $170 applied for a spare power supply, in case of a future power supply failure)