DevOps in a nutshell
The term DevOps is a blend of development (software developers) and operations (the teams that deploy and manage software in production). Many IT organizations have started to adopt the concept, but the question is how and why. Is it a job? Is it a process, or a part of ITIL best practices?
DevOps is a compound of development and operations that defines a methodology of software development. It describes practices that streamline the software delivery process. That is not all; in fact, it is more about raising the level of communication and integration between developers, operators (including administrators), and quality assurance. The essence of the DevOps movement lies in the benefits of collaboration: different disciplines can relate to DevOps and bring their experiences and skills together under its label to build a culture of shared values.
So, we agree that this is a methodology that puts several disciplines on the same wavelength, as shown in the following figure:
This movement is intended to resolve the conflict between developers and operators: delivering a new release affects the production systems, which puts the two teams in a change conflict. DevOps fills the gap and optimizes the focus of each side.
Note
DevOps is neither a toolkit nor a job title; it is a culture of synergy.
Let's see how DevOps can incubate a cloud project.
DevOps and cloud – everyone is coding
Let's put the cloud architecture's layers under the microscope and see what we have. At the top, we have Software as a Service (SaaS), which operates across the whole IT stack. Then comes Platform as a Service (PaaS), where databases and application containers are delivered on demand. At the bottom, we find Infrastructure as a Service (IaaS), delivering on-demand resources such as virtual machines, networks, and storage. Together, these layers form the basic stack of the cloud. You should think about how each layer of the cloud should be developed and implemented.
Obviously, this layered dependency relies on the ability to create full stacks and deliver them on request in a few simple steps. Remember that we are talking about a large, scalable environment! The great shift in today's bigger environments is to simplify everything as much as possible, even as system architecture and software design become more and more complicated. Every new release of software brings new functions and new configurations. Then you are asked to integrate the new software into a particular platform where, somehow, sufficient documentation about requirements or troubleshooting is missing! You may ask yourself questions such as: Did something change? What kind of change? To whom should we assign a ticket to fix it? What if it just does not work? According to your operational team, the software needs to be updated often in order to get the new functions. The update might happen every time you receive that e-mail asking you to update the software. You may start to wonder whether your operational team will be happy about the announcement, in contrast to the software provider, who sent the e-mail with the header: "We are happy to announce the release of the new version of our software; please push the button."
Let's take a serious example that crystallizes this situation. A company's operational team was extremely happy about purchasing a new storage appliance that worked well on redundancy. During the first few months, everyone was happy; nothing was broken! It worked like a charm!
Then came the day when the charm turned into a true headache: the storage system failed to fail over, both nodes stopped working, and no one could access any data. In spite of the existence of a backup somewhere else, the operational team did not like the "was that highly available?" part. After a long night of investigation, the cause of the HA failure was found in the log files: an appliance system update! The system had somehow been automatically updated, which broke the HA function on the active node, and the update was then propagated to the passive one. When the active node tried to fail over to the passive node, it could not, as if there was a bug somewhere in the code of the previous release!
What if you are running similar solutions for other systems? Everything runs software to keep it running! In this case, it is wise to stop for a while and ask yourself: What is missing? Shall I hire more people for system maintenance and troubleshooting? If you take another look at the previous example, you will probably notice that the owner of the hardware does not really own it!
The simple reason is that being dependent on external parties will affect your infrastructure's efficiency. You may then ask a pertinent question: Shall I rewrite the software appliance myself? Let's reformulate the question: Shall I write the code? The answer, almost always, is yes! That sounds like an ambiguous answer, right? Let's keep using examples to clear out this fogginess. We talked about DevOps, the synergetic point between developers and operations people, where everything is communicated thanks to the magic of DevOps. Remember that our goal is to simplify things as much as possible! Administrating and deploying a large infrastructure would not be possible without adopting a new philosophy: infrastructure as code. At this point, we bring in another aspect of the DevOps style: we see our machines as pieces of code! This, in fact, is one of the main tasks of DevOps.
When everything is seen as code, it becomes possible to model a given infrastructure as modules of code. What you need to do is abstract, design, implement, and deploy the infrastructure.
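To make the "infrastructure as code" idea concrete, here is a minimal sketch in Python: the desired state of a machine is described as plain data, and a pure function computes what still has to change. All names here (`Machine`, `missing_packages`, the package names) are hypothetical illustrations, not part of any real tool.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Desired state of a machine, expressed as data rather than manual steps."""
    name: str
    packages: list = field(default_factory=list)


def missing_packages(machine, installed):
    """Return the packages still to install. Pure: same inputs, same output."""
    return [p for p in machine.packages if p not in installed]


web = Machine("web01", packages=["nginx", "python3"])
print(missing_packages(web, installed={"python3"}))  # ['nginx']
```

Because the description is data, it can be versioned, reviewed, and tested exactly like application code, which is the point of the paradigm.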
Furthermore, in such a paradigm, an infrastructure developer must adhere to the same discipline as a software developer.
Without doubt, these terms are quite misty at first glance. For this reason, you should ask a question related to our main topic, OpenStack: if infrastructure as code is so important for a well-organized infrastructure deployment, what is the case with OpenStack? The answer is relatively simple: developers, network and compute engineers, and operators work alongside each other to develop the OpenStack cloud that will run our next-generation data center. This is the DevOps spirit.
DevOpsing OpenStack
OpenStack is an open source project, and its code is extended, modified, and fixed in every release. Of course, it is not your primary mission to check the code and dive into its different modules and functions. That is not our goal! What can we do with DevOps, then? Eventually, we will "DevOps" the code that makes the code run! As you might have noticed, a key measure of success in a DevOps story is automation: everything in a given infrastructure must be automated!
Breaking down the OpenStack pieces
Let's gather what we covered previously and signal a few steps towards our first OpenStack deployment:
- Break down the OpenStack infrastructure into independent and reusable services
- Integrate the services in such a way that you can provide the expected functionalities in the OpenStack environment
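The two steps above can be sketched in code. Assuming a hypothetical dependency map in which each OpenStack service lists the services it needs first (the service names are real OpenStack projects, but the dependency data and resolver are our illustration), a simple depth-first topological sort yields a valid installation order:

```python
# Hypothetical dependency map: each service lists what it needs installed first.
deps = {
    "keystone": [],
    "glance": ["keystone"],
    "nova": ["keystone", "glance"],
    "neutron": ["keystone"],
}


def install_order(deps):
    """Resolve a valid installation order via depth-first traversal."""
    order, seen = [], set()

    def visit(svc):
        if svc in seen:
            return
        seen.add(svc)
        for dep in deps[svc]:
            visit(dep)  # install dependencies before the service itself
        order.append(svc)

    for svc in deps:
        visit(svc)
    return order


print(install_order(deps))  # ['keystone', 'glance', 'nova', 'neutron']
```

Treating each service as an independent, reusable unit with explicit dependencies is what makes the integration step automatable.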
It is obvious that OpenStack includes many services, as discussed in Chapter 1, Designing OpenStack Cloud Architecture. What we need to do is see these services as packages of code in our "infrastructure as code" experience. The next step will investigate how to integrate the services and deploy them via automation.
Starting to deploy the services that are seen as code is similar to writing a web application or some software. Here are important points you should not ignore during the entire deployment process:
- Simplify and modularize the OpenStack services
- Integrate OpenStack services so that they can consume one another's functionality
- Compose OpenStack services as building blocks by accomplishing a complete integration between systems
- Facilitate the modification and improvement of services when demanded
- Use the right tool to build the services
- Be sure that the services provide the same results with the same inputs
- Switch your service vision from how to do it to what we want to do
- Details come later; focus on the function of the service first
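One point in the list above, "the same results with the same inputs," is worth a small illustration: an idempotent apply step changes the system only when it diverges from the desired state, so running it repeatedly is safe. The function and configuration keys below are invented for the sketch.

```python
def apply_config(state, desired):
    """Apply only the settings that differ; repeated runs change nothing."""
    changed = {k: v for k, v in desired.items() if state.get(k) != v}
    state.update(changed)
    return changed  # what this run actually modified


state = {}
desired = {"listen_port": 8080, "workers": 4}

first = apply_config(state, desired)   # both settings applied
second = apply_config(state, desired)  # {} -- already converged, nothing to do
print(first, second)
```

This is the property that configuration-management tools such as Puppet, Chef, and Ansible are built around: describe the "what," and let the tool converge on it as many times as needed.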
As an infrastructure developer, you will build and run, on a system-management platform, the entire infrastructure on which all systems operate, whether under test or in production.
In fact, many system-management tools are used intensively nowadays because of how efficiently they deploy. In other words, there is a need for automation!
You have probably used some automation tools, such as Chef, Puppet, Salt, Ansible, and many more. Before we go through them, we need a succinct, professional code-management step.
Making the infrastructure deployment professional
Ultimately, the code that abstracts, models, and builds the OpenStack infrastructure is committed to source code management. We thus reach a point where we shift our OpenStack infrastructure from a simple code base to a redeployable one by following the latest software development best practices.
At this stage, you should be aware of the quality of your OpenStack infrastructure deployment, which roughly depends on the quality of the code that describes it.
Maintaining the code demands attention so that the environment is bug-free when it is delivered as a final release. In an infrastructure development context, we will consider a "bug" to be anything harmful to the functioning of the system.
It is important to highlight a critical point that you should keep in mind during all deployment stages: automated systems cannot understand human error; they will faithfully propagate it to every piece of your infrastructure. This is essential, and there is no way around it. The same applies to the traditional software development discipline: you will have to go through an ensemble of phases and cycles, using agile methodologies, to end up with a final release, which is normally a bug-free software version in production.
Remember the example given previously? Surprises do happen! An error that occurs in a small corner of a specific, independent system can be fixed within that system; it is not the same when automation propagates the error across a large infrastructure.
On the other hand, since mistakes cannot be totally eradicated at the first stage, you should think about introducing more flexibility into your systems by allowing wise changes, without exaggeration. The code's life cycle management is shown in the following figure:
Changes can be scary—very scary indeed! To handle changes, it is recommended that you:
- Keep track and monitor the changes at every stage
- Flex the code and make it easy to change
- Refactor when it becomes difficult to change
- Test, test, and test
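The "test, test, and test" point deserves a sketch of its own: unit-style checks can be run against a rendered configuration before it ever reaches production. The `render_config` function and its keys are hypothetical stand-ins for whatever your deployment code produces; the second test guards against exactly the kind of surprise update seen in the storage-appliance story.

```python
import unittest


def render_config(overrides=None):
    """Produce the configuration our (hypothetical) deployment code would push."""
    config = {"ha_enabled": True, "nodes": 2, "auto_update": False}
    config.update(overrides or {})
    return config


class TestStorageConfig(unittest.TestCase):
    def test_ha_requires_two_nodes(self):
        cfg = render_config()
        self.assertTrue(cfg["ha_enabled"])
        self.assertGreaterEqual(cfg["nodes"], 2)

    def test_auto_update_stays_disabled(self):
        # Guards against the surprise appliance update from the earlier story.
        self.assertFalse(render_config()["auto_update"])


if __name__ == "__main__":
    unittest.main()
```

Running such checks on every change, before deployment, is what turns "test, test, and test" from advice into a gate the infrastructure code must pass.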
Keep checking every point described previously until you become more confident that your OpenStack infrastructure is driven by code that won't break.
Bringing OpenStack to the chain
To keep the OpenStack environment working with a minimum rate of surprises, ensure that its infrastructure code delivers the functionalities that are required.
Beyond these considerations, we will put the OpenStack deployment into a toolchain that shows how we will conduct the infrastructure development from the test stage to the production stage. The purpose of your testing endeavors must underpin every tool selection; it will also help you ensure that you build the right thing.
Continuous integration and delivery
Let's see how continuous integration can be applied to OpenStack. Whatever we use for system management tools or automation code will be kept as a standard and basic topology, as shown in the next model, where the following requirements are met:
- SMTA can be any System Management Tool Artifact, such as a Chef cookbook, a Puppet manifest, an Ansible playbook, or a Juju charm.
- VCS, or Version Control System, stores the previous artifacts, which are built continuously by a continuous integration server. Git is a good fit for our VCS. Alternatively, you can use other systems, such as CVS, Subversion, Bazaar, or any other system that you are most familiar with.
- Jenkins is a perfect tool to listen for changes in version control and automate continuous integration testing in production clones used for test purposes.
Take a look at the model:
The proposed topology for infrastructure as code consists of infrastructure configuration files (Chef cookbooks, Puppet artifacts, and Vagrant files) that are recorded in a version control system and built continuously by means of a continuous integration (CI) server (Jenkins, in our case). The infrastructure configuration files can be used to set up a unit test environment (a virtual environment using Vagrant, for example) and to provision the infrastructure with any system-management tool (Chef, Puppet, and so on). The CI server keeps listening for changes in version control, automatically propagates any new version to be tested, and then deploys it to the target environments in production.
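The CI loop just described can be reduced to a few lines of pseudologic: notice a new commit, run the test job against a clone of production, and promote only the builds that pass. Everything below (the commit names, the `'!'` convention marking a broken commit) is a toy stand-in for what Jenkins and the VCS actually do.

```python
def run_tests(commit):
    # Stand-in for the CI server's test job; commits ending in '!' are "broken".
    return not commit.endswith("!")


def ci_cycle(commits, tested, production):
    """Promote to production only the commits whose tests pass."""
    for commit in commits:
        if commit in tested:
            continue          # already seen; nothing new to build
        tested.add(commit)
        if run_tests(commit):
            production.append(commit)  # promote only green builds


production, tested = [], set()
ci_cycle(["c1", "c2!", "c3"], tested, production)
print(production)  # ['c1', 'c3'] -- the broken commit is never promoted
```

The key property is the gate in the middle: a change that fails in the cloned test environment simply never reaches the production stage.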
Note
Vagrant allows you to build a virtual environment very easily; it is based on Oracle VirtualBox (https://www.virtualbox.org/) to run virtual machines, so you will need both installed before moving on with the installation in your test environment.
Using such a design model makes our development and integration code infrastructure more valuable. Obviously, the previous OpenStack toolchain exercises the test environment before moving on to production, which is normal! However, you should give a lot of importance to, and care a good deal about, the testing stage, even though it might be a very time-consuming task.
Especially in our case, with infrastructure code within OpenStack, things can be difficult for complicated and dependent systems. This makes it imperative to ensure an automated and consistent testing of the infrastructure code.
The best way to do this is to keep testing thoroughly in a repeated way till you gain confidence about your code. When you do, introducing changes to your code when it's needed shouldn't be an issue.
Let's keep on going, get the perfect tool running, and push the button.