This is a guest post by Graham French
Putting the power into PowerCLI
Working in the world of virtualization, you meet some tough challenges. Some you’ve seen before, but most you will encounter for the first and last time. This particular challenge was new to me: how to shut down 550 virtual machines in one go, in an orderly manner, and then power them all back up, again in an orderly fashion. I was looking to shut down the core network for some major remedial work over a weekend and had to have it up and running for the client’s 5,000 users on Monday morning.
This task was part of the programme of works that I was undertaking for a local government client, whose under-resourced staff were struggling with their day jobs of keeping the lights on. It was a task that naturally lent itself to a PowerCLI script. However, I quickly found out that the members of the team reporting to me were not familiar with PowerCLI or PowerShell scripts, so this one was going to be down to me.
As well as creating the scripts, I had to ensure that the strategy for the power down and power up was sound, well researched and implemented with as little human interaction as possible. As the staff could well be in the office late on Sunday night or in the early hours of Monday morning, I needed to ensure that the scripts were easy to follow and interpret. The admin guys, likely tired by then, needed to do as little thinking as possible: just run the scripts.
It’s rare that an organisation shuts down its entire server infrastructure in one go, so I had to be mindful of the fact that my client had never had to do something like this previously and would be unfamiliar with the process. You can’t just power down servers in any old fashion; you have to ensure that certain servers are powered off before others.
As it is impossible for the main VMware administrator to intimately know all of the applications running on the infrastructure, I had to ensure that the applications manager was on board and up to speed with the requirements of this part of the project. His team was responsible for the range of applications that the 5,000 end users accessed on a daily basis and would have a good overview of what the shutdown and startup process would be for each individual application suite. The applications team was also responsible for creating and using the test scripts to ensure that all of the applications were functioning correctly.
From the data returned by the applications team, it was decided to break the power-down scripts into five separate phases: four phases for the applications and a fifth for the IT infrastructure itself (Active Directory, DNS, DHCP, etc.).
At a high level, an example of the shutdown priorities was as follows:
1. Web servers, load balancers, print servers
2. Application servers, other servers in a distributed architecture application
3. Database servers, clustered servers
4. NAS heads, file servers
5. Active Directory, DNS, DHCP, virtual appliances
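The scripts used on the day were PowerCLI and are not reproduced here. Purely to illustrate the ordering idea above, here is a minimal Python sketch that models the five phases as an ordered list and sorts an inventory into shutdown order. The role labels and server names are hypothetical and were chosen only for this example.

```python
# Hypothetical role labels, one inner list per shutdown phase (illustration only).
SHUTDOWN_PHASES = [
    ["web", "load-balancer", "print"],                 # phase 1
    ["application"],                                   # phase 2
    ["database", "cluster"],                           # phase 3
    ["nas", "file"],                                   # phase 4
    ["active-directory", "dns", "dhcp", "appliance"],  # phase 5: core infrastructure
]


def phase_of(role):
    """Return the 1-based shutdown phase for a role, or raise if it is unknown."""
    for number, roles in enumerate(SHUTDOWN_PHASES, start=1):
        if role in roles:
            return number
    raise ValueError(f"no shutdown phase defined for role {role!r}")


def shutdown_order(inventory):
    """Sort (server, role) pairs by phase; sorted() is stable, so the
    original order is preserved within each phase."""
    return sorted(inventory, key=lambda item: phase_of(item[1]))


if __name__ == "__main__":
    # Hypothetical inventory, deliberately listed out of order.
    inventory = [
        ("dc01", "dns"),
        ("sql01", "database"),
        ("web01", "web"),
        ("app01", "application"),
    ]
    for server, role in shutdown_order(inventory):
        print(phase_of(role), server)
```

Raising an error for an unknown role is a deliberate choice in this sketch: a server with no assigned phase is better caught during planning than discovered halfway through a shutdown.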
As I’d done a small amount in the past with PowerShell, I decided to look for some sample scripts that I could use as a framework to customise for my particular requirements. I found a shutdown/startup script on Mike Preston’s website at http://www.discoposse.com/ and then got to work.
As I wanted to make this as painless as possible, I decided to have the script read in the server names from a series of CSV files, one file for each tier of the shutdown process, covering both virtual and physical servers. Also, as the startup procedure may be different from the power down, I created a separate script and set of CSV files for that. Both scripts also include logging, so everything, including errors and all outcomes, is captured in a time-stamped text file for later scrutiny.
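To make that flow concrete, here is a small Python sketch of the same pattern: one CSV file per tier, processed in order, with every outcome written to a time-stamped log. It is not the PowerCLI script itself. The `Name` column, the file layout and the `power_action` callable (a stand-in for whatever actually shuts down or starts a server) are all assumptions made for the example.

```python
import csv
from datetime import datetime
from pathlib import Path


def read_server_names(csv_path):
    """Read server names from a CSV file with a 'Name' column (assumed layout)."""
    names = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            name = (row.get("Name") or "").strip()
            if name:
                names.append(name)
    return names


def run_phases(csv_paths, power_action, log_path):
    """Process each tier file in the order given, logging every outcome.

    power_action is a hypothetical callable taking a server name; it should
    raise an exception on failure. Returns the list of servers that failed.
    """
    failures = []
    with open(log_path, "a") as log:

        def write(message):
            # Prefix every log line with a timestamp.
            log.write(f"{datetime.now():%Y-%m-%d %H:%M:%S} {message}\n")

        for phase, csv_path in enumerate(csv_paths, start=1):
            write(f"Phase {phase} started ({Path(csv_path).name})")
            for name in read_server_names(csv_path):
                try:
                    power_action(name)
                    write(f"OK {name}")
                except Exception as error:
                    # Record the error and carry on with the next server.
                    failures.append(name)
                    write(f"ERROR {name}: {error}")
            write(f"Phase {phase} finished")
    return failures
```

In this sketch a failure is logged and the run continues with the next server, which is one possible design choice. Stopping at the first error would be the more cautious alternative where a later tier must not be touched until the earlier one is fully down.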
The applications team ran their tests before the shutdown process, for two reasons. The first was to ensure that their test scripts were correct and valid. The second, and more important, was to ensure that the applications were working correctly before we powered down the VMs, so that any existing issues could be either fixed there and then or documented and remedied at a later date.
The shutdown scripts were then run successfully, and the infrastructure shut down in a controlled and orderly manner. The network guys then got to work and had completed their activities before lunchtime on Sunday.
All that was left to do was to run the startup scripts for the Virtual Machines and restart the physical servers and then ask the application team to re-run their tests.
By mid-afternoon on Sunday, everything was back up and running and fully tested. The reconfigured network was working exactly as planned and there had been no mishaps along the way.
If I hadn’t had the opportunity to use PowerCLI to automate the bulk of the work, it would have been difficult to complete the work over a single weekend and almost impossible to remove the potential for human error. The pre-work undertaken in the weeks before the weekend and the testing of the scripts certainly paid off on the day!
You can download the full set of scripts and CSV files for your own use here
Happy scripting! 🙂
Graham F French is an IT veteran of almost two decades and specialises in virtualisation and cloud infrastructure technologies. He can be found on Twitter at @NakedCloudGuy. He holds VCP5-DCV and MCSE qualifications and is currently studying for the VCAP-DCD.