Brief Summary Of The DevOps Handbook - WordPress

Brief Summary of The DevOps Handbook
How to create world class agility, reliability and security in technology organizations
Gene Kim, Jez Humble, Patrick Debois, John Willis

Part I - Introduction
1. Focus on the Principles of Flow (which accelerate the delivery of work from Development to Operations to customers), the Principles of Feedback (which enable us to create ever safer systems of work), and the Principles of Continual Learning and Experimentation (which foster a high trust culture and a scientific approach to organizational improvement).
2. DevOps is the outcome of applying the most trusted principles from the domain of physical manufacturing and leadership to the IT value stream.
3. DevOps relies on bodies of knowledge from Lean, the Theory of Constraints, the Toyota Production System, resilience engineering, learning organizations, safety culture, human factors and many others, such as high trust management cultures, servant leadership and organizational change management.

Chapter 1 – Agile, Continuous Delivery and the Three Ways
1. One of the fundamental concepts in Lean is the value stream: “the sequence of activities an organization undertakes to deliver upon a customer request”.
2. In DevOps, a technology value stream is the process required to convert a business hypothesis into a technology enabled service that delivers value to the customer.
3. The value stream is often easy to see and observe in manufacturing operations.
4. In IT, the value stream begins when an engineer checks a change into version control and ends when that change is running successfully in production, providing value to the customer.
5. The goal is to have testing and operations happening simultaneously with design/development, enabling fast flow and high quality.
6. The three key metrics in a technology value stream are Lead time, Process time (Cycle time) and Percentage Complete and Accurate (%C/A). %C/A reflects the quality of the output of each step in our value stream.
7. The Three Ways, the principles underpinning DevOps, are:
   a. The First Way enables fast left to right flow of work from Dev to Ops to the customer. This is done by making work visible, reducing batch sizes, building in quality by preventing defects, and constantly optimizing for global goals.
   b. The Second Way enables fast and constant flow of feedback from right to left at all stages of the value stream. By seeing problems as they occur and swarming them until effective countermeasures are in place, we shorten and amplify feedback loops.

Srinath Ramakrishnan, Page 1 of 22
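The three value stream metrics named in Chapter 1 (lead time, process time and %C/A) can be illustrated with a small calculation. This is a minimal sketch, not something taken from the book: the `Step` class, its field names and the sample numbers are all hypothetical, and %C/A is assumed here to mean the share of a step's output that the next step could use without rework.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One step of a value stream (hypothetical structure for illustration)."""
    name: str
    lead_time_hours: float     # elapsed time from request to completion, waiting included
    process_time_hours: float  # time actually spent working on the item
    received: int              # items handed to the next step
    usable_as_is: int          # items the next step could use without rework


def percent_complete_accurate(step: Step) -> float:
    """%C/A for one step: share of its output usable downstream as-is."""
    return 100.0 * step.usable_as_is / step.received


def summarize(steps):
    """Return total lead time, total process time and per-step %C/A."""
    total_lead = sum(s.lead_time_hours for s in steps)
    total_process = sum(s.process_time_hours for s in steps)
    ca_by_step = {s.name: percent_complete_accurate(s) for s in steps}
    return total_lead, total_process, ca_by_step


# Made-up sample data, not measurements from any real value stream.
steps = [
    Step("design", 40.0, 8.0, 20, 18),
    Step("develop", 80.0, 24.0, 20, 15),
    Step("test", 60.0, 6.0, 20, 19),
]
total_lead, total_process, ca_by_step = summarize(steps)
print(total_lead, total_process, ca_by_step)
```

In this made-up data most of the lead time is waiting rather than work, and the develop step has the lowest %C/A, which is the kind of signal these metrics are meant to surface.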

   c. The Third Way enables the creation of a generative, high trust culture that supports a dynamic, disciplined and scientific approach to experimentation and risk taking.

Chapter 2 – The First Way: The Principles of Flow
1. We increase flow by making work visible, by reducing batch sizes and intervals of work, and by building quality in, preventing defects from being passed to downstream work centers.
2. The goal is to decrease the amount of time required for changes to be deployed into production and to increase the reliability and quality of those services.
3. A significant difference between technology and manufacturing value streams is that our work is invisible. To see where work is flowing well and where it is queued or stalled, we need to make our work as visible as possible.
4. By putting all work for each work center in queues and making it visible, all stakeholders can more easily prioritize work in the context of global goals.
5. Controlling queue size (limiting WIP) is an extremely powerful management tool, as it is one of the few leading indicators of lead time. Limiting WIP also makes it easier to see problems that prevent the completion of work.
6. Another key component of fast and smooth flow is performing work in small batch sizes.
7. Large batch sizes result in high levels of WIP and high levels of variability in flow that cascade through the entire process, resulting in long lead times and poor quality.
8. One of the factors in longer lead times is the large number of handoffs we often see in a value stream. We must strive to reduce the number of handoffs by automating significant portions of the work or by reorganizing teams so that they can deliver value to customers themselves.
9. To reduce lead times and increase throughput, we need to continually identify our system’s constraints and improve its work capacity.
10. Goldratt’s five focusing steps for addressing constraints are:
   a. Identify the system’s constraint
   b. Decide how to exploit the system’s constraint
   c. Subordinate everything else to the above decision
   d. Elevate the system’s constraint
   e. If a constraint has been broken in the previous steps, go back to step one.
11. In typical DevOps transformations, the constraint is likely to be found in:
   a. Environment creation – have environments that can be created on demand and are completely self-serviced, so that they are available when we need them
   b. Code deployment – automate deployments as much as possible, with the goal of making them completely automated
   c. Test setup and run – automate tests so that we can deploy safely, and parallelize them so that the test rate keeps up with the development rate.
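Goldratt's first focusing step, identifying the constraint, and the final "go back to step one" can be sketched in a few lines. This is only an illustration under stated assumptions: the stage names echo the areas listed above, the capacity figures (changes per day) are invented, and the constraint is modeled simply as the stage with the lowest capacity.

```python
def find_constraint(capacity_per_day):
    """Return the stage with the lowest capacity, i.e. the assumed constraint."""
    return min(capacity_per_day, key=capacity_per_day.get)


def system_throughput(capacity_per_day):
    """A serial flow can move no faster than its slowest stage."""
    return min(capacity_per_day.values())


# Hypothetical capacities in changes per day; not data from the book.
stages = {
    "environment creation": 2.0,
    "code deployment": 5.0,
    "test setup and run": 3.0,
}

first_constraint = find_constraint(stages)
first_throughput = system_throughput(stages)

# Elevate the constraint (for example with self-service environments),
# then return to step one: the constraint has moved to another stage.
elevated = {**stages, "environment creation": 10.0}
next_constraint = find_constraint(elevated)
next_throughput = system_throughput(elevated)

print(first_constraint, first_throughput)
print(next_constraint, next_throughput)
```

Improving any stage other than the current constraint would leave the overall throughput unchanged in this model, which is why the steps insist on finding the constraint first.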

   d. Overly tight architecture – create a loosely coupled architecture so that changes can be made safely and with more autonomy, increasing developer productivity.
12. Eliminating waste in software development:
   a. Partially done work – becomes obsolete and loses value as time progresses
   b. Extra processes – add effort and increase lead times
   c. Extra features – add complexity and effort to testing and managing functionality
   d. Task switching – leads to additional time and effort
   e. Waiting – increases cycle time and prevents the customer from getting value
   f. Motion – handoffs create motion waste and often require additional communication to resolve ambiguities
   g. Defects – the longer the time between defect creation and defect detection, the more difficult it is to resolve the defect
   h. Nonstandard or manual work – reliance on nonstandard or manual work from others, such as using non-rebuilding servers, test environments and configurations, causes issues
   i. Heroics – heroic deeds such as regularly working late hours or deploying at odd hours sap the energy and enthusiasm of the team.

Chapter 3 – Principles of Feedback
1. We make our system of work safer by creating fast, frequent, high quality information flow throughout the value stream and our organization, which includes feedback and feedforward loops.
2. Complex systems typically have a high degree of interconnectedness among tightly coupled components and in their system level behavior. Failure is inherent and inevitable in such systems, hence the need to design a safe system of work.
3. It is not sufficient merely to detect issues when the unexpected happens. We must also swarm them, mobilizing whoever is required to solve the problem.
4. Swarming is required to prevent the problem from moving downstream and to prevent the work center from starting new work.
5. To enable fast feedback in the technology value stream, we must create the equivalent of an Andon cord and the related swarming response.
6. Gary Gruver: “It is impossible for a developer to learn anything when someone yells at them for something they broke six months ago; that is why we need to provide feedback to everyone as quickly as possible, in minutes, not months.”

Chapter 4 – Principles of Continual Learning and Experimentation
1. High performing manufacturing operations promote learning: the work is not very rigidly defined, the system of work is dynamic, and they conduct experiments to generate new improvements.

2. Ron Westrum, one of the first to observe the importance of organizational culture for safety and performance, defined three types of culture:
   a. Pathological organizations – characterized by large amounts of fear and threat. People often hoard information, withhold it or distort it for their own good. Failure is hidden.
   b. Bureaucratic organizations – characterized by rules and processes, which often help individual departments hold on to their “turf”. Failure is processed through a system of judgement, resulting in either punishment or justice and mercy.
   c. Generative organizations – characterized by actively seeking and sharing information to better enable the organization to achieve its mission. Responsibilities are shared, and failure results in reflection and genuine inquiry.
3. We improve daily work by explicitly reserving time to pay down technical debt, fix defects, and refactor and improve problematic areas of our code and environments.
4. Example of Alcoa, which improved its safety record significantly over 2 years and now has one of the most enviable safety records in the industry.
5. When new learnings are discovered locally, there must also be a mechanism that lets the rest of the organization use and benefit from that knowledge: convert tacit knowledge into explicit, codified knowledge, which becomes someone else’s expertise through practice.
6. Lower performing organizations buffer themselves from disruptions in many ways: they bulk up or add flab (a larger inventory buffer, hiring more people than required), often leading to increased costs. High performing organizations achieve the same results by improving daily operations, introducing tension to elevate performance, and engineering more resilience into their system.
7. Leaders reinforce a learning culture not by making all the right decisions, but by creating the conditions in which their teams can discover greatness in their daily work.
8. The leader helps coach the person conducting experiments with questions such as:
   - What was the last step, and what happened?
   - What did you learn?
   - What is your next target condition?
   - What is your next step?
   - What obstacle are you working on now?
   - What is the expected outcome?

Part II – Where to start

Chapter 5 – Selecting which value stream to start with
1. Software products and services can often be categorized as Greenfield or Brownfield.
2. In technology, a greenfield project is a new software project or initiative in the early stages of planning or implementation, where we build applications and infrastructure anew with few constraints. Such projects are typically used as pilots to demonstrate the feasibility of something new.

3. Brownfield projects are existing products or services that are already serving customers and have been in operation for years. These projects often come with significant amounts of technical debt, have no test automation in place and run on unsupported platforms.
4. Though it is usually believed that DevOps is ideally suited to Greenfield projects, we find that it has been successful for Brownfield projects as well.
5. Similarly, it is important to consider both Systems of Record and Systems of Engagement when implementing DevOps.
6. Systems of Record (such as ERP systems), where correctness of the transactions and data is paramount, typically have a slower pace of change due to regulatory and compliance requirements. The focus is on “Doing it right”.
7. Systems of Engagement are customer facing systems. They have a higher pace of change to support the rapid feedback loops that let them experiment with how best to meet customer needs. The focus is on “Doing it fast”.
8. When we improve brownfield systems, we should not only strive to reduce their complexity and improve their reliability and stability, we should also make them faster, safer and easier to change.
9. We expand our DevOps operations across the organization in small incremental steps. Some of the steps to broaden the base of support for DevOps include:
   a. Find innovators and early adopters – identify those who are respected, have a high degree of influence and can give credibility to the initiative
   b. Build critical mass and silent majority – work with teams who are receptive to new ideas and expand the coalition, generating more successes
   c. Identify the holdouts – these are the high profile, influential detractors who are likely to resist our efforts. Tackle them after having made substantial gains in the organization.

Chapter 6 – Understanding the work in the value stream, making it visible and expanding it in the organization
1. It is important to gain a sufficient understanding of how value is delivered to the customer, what work is performed and by whom, and what steps we can take to improve the flow.
2. Next, identify the members of the value stream who are responsible for working together to create value for the customer. Product Owner, Development, QA, Operations, InfoSec, Release Managers, Technology Executives and others form part of this group.
3. Once the “actors” are identified, the next step is to gain an understanding of how work is performed, documented in the form of a “Value Stream Map”. The goal is not to document every step and its associated details, but to understand the areas of the value stream that are jeopardizing the goals of fast flow, short lead times and reliable customer outcomes.
4. Then create a dedicated transformation team and hold it accountable for achieving a clearly defined, measurable, system level result. For this, we would need to:

   a. Assign members of the dedicated team to be solely allocated to the DevOps transformation efforts
   b. Select team members who are generalists, with skills across a wide variety of domains
   c. Select team members who have long standing and mutually respectful relations with the rest of the organization
   d. Create a separate physical space for the dedicated team, if possible, to maximize communication flow within the team.
5. Agree on a shared goal – defining a measurable goal with a clearly defined deadline, agreeable to all stakeholders, is important. Examples could be:
   a. Reduce percentage of budget spent on production support and unplanned work by 50%
   b. Ensure lead time from
