
Unification: A Key Role in Web Hosting Platform Management

Technology is a funny thing. There is a perpetual push to create systems that are better, stronger and faster. This constant drive has led to innovations unimaginable just a few years ago, but being on the leading edge leaves a lot of room for error. It’s a paradox, really: in order to succeed you must experience failure. The web hosting industry certainly isn’t exempt from this catch-22. This post explores the early days of our web hosting service, some of the failures we experienced on our journey, and how we ultimately succeeded by creating a unified web hosting platform.


Our Web Hosting Startup Phase

Starting our web hosting company was great fun. We were a web development company at first and we had a strong client base that wanted to host websites with us. Building our first server, turning it on, and seeing it rock gave us a lot of satisfaction. In no time, we were off and running.

As our company grew, so did our infrastructure. We added a few more servers and we created the Cloud Control Panel™ (CCP), a management portal where clients could access server settings and manage their own applications. Managing two or three servers was fairly easy, but each server was inherently different. After reaching a certain number of clients, we were getting hosting support requests that required different solutions depending on which server the site was hosted on. Some servers were running CentOS, some Debian, and we even had some FreeBSD machines.

To make matters more complex, clients on different servers needed a slightly different version of the CCP, and things got a little messy. We quickly came to the realization that it simply wasn’t realistic to provide individual hosting solutions for each customer. In short, we had a mess to clean.

Our First Attempt at Building a Better Place

After researching platform automation tools, we started to incorporate solutions like Puppet and Ansible. We spent hours upon hours learning Ruby and Puppet’s declarative language. We got to the point where we were able to write automation that could control multiple configuration files and even install new software.

But we still needed to adjust Puppet configs for different server types, and the worst thing that could happen did happen. When we ran an update for a major platform component, a few servers went down, causing downtime for clients. We were on top of it and identified the problem quickly: a package installer caused the issue, which we remedied by adding a small exception for a certain server group. As we added exceptions, we found that over time our automation procedures contained more exceptions than actual procedures.

It was a vicious cycle: we performed an update, experienced downtime, upset clients, and created exceptions to fix the immediate issue. It wasn’t a fun experience because providing the most uptime possible is our goal. This was the turning point, a critical moment for us. We took a big step back to look at our entire platform so we could fix the underlying problem.

Doing the Right Thing - Server Unification

Platform unification can be difficult and we had to make some big choices. We selected a single software platform that best suited our business model and the needs of our clients. We decided to use a brand new, shiny virtual machine that contained all of the features our clients needed.

When creating our hosting stack, we identified all machine-specific variables and reviewed all config files. Whenever we located and resolved a potential issue, we followed one general rule: update one, update all, no exceptions. If one server was adjusted, the adjustment had to be generic enough to apply to all servers. We created a top-of-the-line server stack generic enough for all machines, and we exported all machine-specific settings into a shared file, resulting in one distinct config instead of several.
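The "one generic config plus a shared file of machine-specific settings" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual CloudAccess.net stack: the template text, the server names, the setting names (`hostname`, `ip_address`, `max_clients`) and the `render_config` helper are all hypothetical.

```python
from string import Template

# Hypothetical generic template shared by every server. Only the
# $placeholders differ from machine to machine.
VHOST_TEMPLATE = Template(
    "ServerName $hostname\n"
    "Listen $ip_address:80\n"
    "MaxClients $max_clients\n"
)

# Hypothetical contents of the shared settings file: one entry per machine.
MACHINE_SETTINGS = {
    "web01": {"hostname": "web01.example.com", "ip_address": "192.0.2.10", "max_clients": 150},
    "web02": {"hostname": "web02.example.com", "ip_address": "192.0.2.11", "max_clients": 300},
}


def render_config(server: str) -> str:
    """Render the single generic template with one server's settings."""
    # substitute() raises KeyError when a setting is missing, so an
    # incomplete entry fails loudly instead of producing a partial config.
    return VHOST_TEMPLATE.substitute(MACHINE_SETTINGS[server])


if __name__ == "__main__":
    for name in sorted(MACHINE_SETTINGS):
        print(f"--- {name} ---")
        print(render_config(name))
```

With this shape, a change to the template reaches every server on the next render, which is the "update one, update all" rule in miniature. Per-machine differences live only in the settings table, never in the template.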

The next step was to build our first bare metal servers. We used VM imaging to replicate the server. VM management made it easier to create and store system images. If something went wrong with a new version, we could easily take a step back and revert to a previous image.

Once the new servers were running smoothly, it was time to migrate our customers to their new homes. This process took time and patience, but ultimately each server was running the same operating system image. We had a good number of servers in the platform, and it became too time-intensive to update each server individually. It was time to look at automation tools again, and we started with Pdsh.

Pdsh is not a full automation tool. It’s a shell that runs commands in parallel on multiple servers. Because we unified the platform, applying software updates or config file changes is a lot easier now, and we know the results of Pdsh actions will be the same across the whole platform. Pdsh is accompanied by two useful programs: pdcp and dshbak. Pdcp copies files to many servers in parallel, and dshbak tidies up pdsh output by grouping it per host instead of leaving lines from different servers interleaved.
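To show why identical results across servers matter, here is a small Python sketch of the kind of grouping that `dshbak -c` performs on pdsh output. It is an approximation, not a reimplementation of dshbak. It assumes input in the `host: line` shape that pdsh prints, and the sample hosts, sample output and helper names (`group_by_host`, `coalesce`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_host(raw_output: str) -> Dict[str, List[str]]:
    """Collect 'host: line' formatted output into per-host line lists."""
    per_host: Dict[str, List[str]] = defaultdict(list)
    for line in raw_output.splitlines():
        host, sep, text = line.partition(": ")
        if sep:  # skip lines that lack the 'host: ' prefix
            per_host[host].append(text)
    return dict(per_host)


def coalesce(per_host: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each distinct output to the sorted list of hosts that produced it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for host in sorted(per_host):
        groups["\n".join(per_host[host])].append(host)
    return dict(groups)


# Hypothetical output from a status check run on three servers.
SAMPLE = (
    "web01: nginx is running\n"
    "web03: nginx is stopped\n"
    "web02: nginx is running\n"
)

if __name__ == "__main__":
    for output, hosts in coalesce(group_by_host(SAMPLE)).items():
        print(",".join(hosts))
        print(output)
```

On a unified platform, a healthy run collapses into a single group. Any server that lands in a second group is the one worth investigating.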

Pdsh is useful, but it does have limitations. It doesn’t keep track of configuration changes, and complex tasks can be hard to perform with it. Once you gain confidence with mass server management, consider Chef or the tools mentioned earlier in this post, Puppet and Ansible. These tools can perform a variety of more complex tasks.

Tips from CloudAccess.net System Administrators

You might say that we had a bumpy road, but all’s well that ends well. Identifying platform issues forced us to take a really close look and make hard decisions. We had to throw away everything we built during the early days in order to build a unified platform.

We completely rebuilt our hosting infrastructure, making it a better place for our customers, and that’s the name of the game: offering a quality product. All of our servers run the same flavor of Linux and the same hosting stack, and each of our major services depends on very few config files. At this point, new server provisioning takes less time, and we can trust that every server was built from the same VM image.

The new environment we created boosted the development of our Cloud Control Panel™ (CCP), making our platform even more attractive. CCP developers experience no issues when implementing new features because we know exactly what needs to be done server-side.

Whether you own a company or you’re a system administrator, don’t be afraid to think big about small things, especially when you’re just getting started. When building two servers to host 100 customers, think as if you’re building a datacenter for 20,000 users. You’ll save time in the long run. Trust me.
