Problem Management 10pt - IT Infrastructure Library (ITIL .


Problem Management

Contents
PrM 1 Topic introduction – Aim and objectives of this topic
PrM 2 Overview – An introduction to the process
PrM 3 Implementation guide – How to implement the process
PrM 4 Operations guide – The ongoing operation of the process
PrM 5 Roles and responsibilities
PrM 6 Review – Summary and checklist
Appendices
Glossary

Key
Glossary term: Glossary term
Cross reference: Cross reference

Framework for ICT Technical Support

Problem Management

© Becta 2004

You may reproduce this material free of charge in any format or medium without specific permission, provided you are not reproducing it for profit, material or financial gain. You must reproduce the material accurately and not use it in a misleading context. If you are republishing the material or issuing it to others, you must acknowledge its source, copyright status and date of publication.

Publication date March 2004
Originally published online in September 2003 as part of the Becta website http://www.becta.org.uk/tsas

While every care has been taken in the compilation of this information to ensure that it is accurate at the time of publication, Becta cannot be held responsible for any loss, damage or inconvenience caused as a result of any error or inaccuracy within these pages. Although all references to external sources (including any sites linked to the Becta site) are checked both at the time of compilation and on a regular basis, Becta does not accept any responsibility for or otherwise endorse any product or information contained in these pages, including any sources.

British Educational Communications and Technology Agency, Millburn Hill Road, Science Park, Coventry CV4 7JJ

Problem Management

PrM 1 Introduction to Problem Management

Do you have the same problem occurring time after time with your ICT, and it never gets fixed properly? You need the FITS Problem Management process.

PrM 1.1 Aim

The aim of this section is to introduce the topic of Problem Management and to help you implement the process in your school with a minimum of preparation and training.

PrM 1.2 Objectives

The objectives of this section are to enable you to:
- understand the difference between incidents and problems
- understand when a quick fix is not enough to resolve a problem permanently
- decide whether you need to implement problem control
- understand how to implement the Problem Management process
- understand how to achieve workarounds and solutions
- decide which Problem Management reports your school requires and how to produce them.

PrM 2 Overview

PrM 2.1 What is Problem Management?

The goal of Problem Management is to minimise both the number and severity of incidents and problems in your school. It should aim to reduce the adverse impact of incidents and problems that are caused by errors in the ICT infrastructure, and to prevent recurrence of incidents related to these errors.

You should address problems in priority order, paying attention to the resolution of problems that can cause serious disruption.

The degree of management and planning required is greater than that needed for incident control, where the objective is restoration of normal service as quickly as possible.

The function of Problem Management is to ensure that incident information is documented in such a way that it is readily available to all technical support staff.

Problem Management has reactive and proactive aspects:
- reactive – problem solving when one or more incidents occur
- proactive – identifying and solving problems and known errors before incidents occur in the first place.

Problem Management includes:
- problem control, which includes advice on the best workaround available for that problem
- error control.

PrM 2.1.1 Differences between incident management and problem management

- The aim of incident management is to restore the service to the customer as quickly as possible, often through a workaround, rather than through trying to find a permanent solution.
- Problem management differs from incident management in that its main goal is the detection of the underlying causes of an incident and the best resolution and prevention.
- In many situations the goals of problem management can be in direct conflict with the goals of incident management.
- Deciding which approach to take requires careful consideration. A sensible approach would be to restore the service as quickly as possible (incident management), while ensuring that all details are recorded. This will enable problem management to continue once a workaround has been implemented.
- Discipline is required, as the idea that the incident is fixed is likely to prevail. However, the incident may well appear again if the resolution to the problem is not found.

PrM 2.1.2 Incident vs problem

An incident is where an error occurs: something doesn’t work the way it is expected to. This is often referred to as:
- a fault
- error
- it doesn’t work!
- a problem
but the term used with FITS is ‘incident’.

A problem can be:
- the occurrence of the same incident many times
- an incident that affects many users
- the result of network diagnostics revealing that some systems are not operating in the expected way.

Therefore a problem can exist without having immediate impact on the user, whereas incidents are usually more visible and the impact on the user is more immediate.

PrM 2.1.3 Error control

Error control covers the processes involved in the successful correction of known errors. The objective is to remove equipment with known errors that affect the IT infrastructure in order to prevent the recurrence of incidents. Error control activities can be both reactive and proactive.

Reactive activities include:
- identifying known errors through incident management
- implementing a workaround.

Proactive activities include:
- finding a solution to a recurring problem
- creating a solution
- including the solution in the database of known errors.

PrM 2.1.4 Examples of problems

Technical problems can exist without impact on the user. However, if they are not spotted and dealt with before an incident occurs, they can have a major impact on the availability of the computer service.

User-experienced problems
- The printer will not form-feed paper, so users have to advance the paper by using the form-feed button.
- Each time new users log onto a computer, they have to reinstall the printer driver.
- Windows applications crash intermittently without an error message. The computer will restart and work properly afterwards.

Technical problems
- Disk space usage is erratic: sometimes there appears to be plenty of disk space, but at other times not much is available. There is no obvious reason and no impact on the users – yet!
- A network card is creating a high level of unnecessary traffic on the network. This could eventually reduce the bandwidth available, which would lead to a slow response to network requests.

PrM 2.2 Who uses Problem Management?

Problem Management is used mainly by technicians. At this stage, reference to previous incidents, a knowledge base or quick fixes will not be effective as the problem has not previously occurred. This is where the technician calls upon all their problem-solving skills and analysis techniques to decide how to approach the problem, how much time to allocate and what to do if the problem cannot be resolved.

If your school has numerous incidents that cannot be resolved readily and you are implementing lots of quick fixes, you should decide to tackle the cause of the incidents using problem management. If you get the same incident occurring repeatedly, you should implement the FITS Problem Management process.

All schools should have a process to deal with major incidents – for example, a server crash, a virus attack or an unexplained slow network. If you would like to manage your approach to major incidents, you should also consider introducing Problem Management at your school.

Most organisations, including schools, need to keep records of how well their ICT systems are functioning, what is failing and how long systems are unavailable. The information you will gain from problem management should enable you to report to the school on the technical issues that create incidents and problems. To provide your school with an effective approach to its technical support, you should always implement problem management alongside incident management.
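As an illustration of spotting "the same incident occurring repeatedly", the sketch below scans a call log and flags incidents that look like problems under the definition in PrM 2.1.2 (many repeats, or many users affected). It is a minimal sketch, not part of FITS: the `Incident` fields, the function name `problem_candidates` and both thresholds are assumptions chosen for the example, and a school keeping its log in a spreadsheet would apply the same idea with a filter or pivot table.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """One call-log entry. All field names are illustrative assumptions."""
    incident_id: int
    asset: str               # e.g. "printer-lab1"
    symptom: str             # short, consistently worded description
    users_affected: int = 1


def problem_candidates(incidents, repeat_threshold=3, user_threshold=10):
    """Return (asset, symptom) pairs that look like problems, not one-off incidents.

    A pair is flagged when the same incident recurs at least
    `repeat_threshold` times, or when any single occurrence affects at
    least `user_threshold` users. Both thresholds are hypothetical defaults.
    """
    counts = Counter((i.asset, i.symptom) for i in incidents)
    flagged = {key for key, n in counts.items() if n >= repeat_threshold}
    flagged |= {(i.asset, i.symptom) for i in incidents
                if i.users_affected >= user_threshold}
    # Most frequent first; ties are ordered alphabetically for stable output.
    return sorted(flagged, key=lambda key: (-counts[key], key))
```

Sorting by frequency means the head of the returned list doubles as a simple "most repeated incidents" report for the period covered by the log.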

PrM 2.3 Why use Problem Management?

The benefits of taking a formal approach to problem management include the following.
- There is a standard way to approach every problem – this saves time.
- The number of incidents will reduce.
- The solutions will be permanent. There will be a gradual reduction in the number and impact of problems and known errors, as those that are resolved will stay resolved.
- You learn from your mistakes. The process provides the historical data from which to identify trends, and the means of minimising failures and reducing the impact of failures.
- You will obtain a better first-time fix rate of incidents because you will have a knowledge database available to the service desk and technicians when a call is first logged.

PrM 2.3.1 What happens if Problem Management is not used?

Without problem management, you may observe that your school:
- faces up to problems only after the service to users has already been disrupted
- loses faith in the quality of its technical support, with high costs and low motivation for both users and technicians, since similar incidents have to be resolved repeatedly without anyone being able to provide permanent solutions.

PrM 2.3.2 Objectives of recording problem management information

One function of problem management is to ensure the documenting of incident information in such a way that it is readily available to service desk staff and technicians. The information should be recorded so that it is easily referenced by simple and detectable triggers from new incidents.

Regular inspection of your problem management records can ensure the continued relevance of documentation in the light of changes in:
- technology
- available external solutions
- school practices and requirements
- in-house skills
- frequency and impact of recurring incidents
- interpretation of internal best practice.

It is important that you review your process for recording incidents and problems to enable you to make continuous improvements to the way information from previous incident resolutions is used. You may like to consider these suggestions.
- Staff using the information should be trained to understand the depth and power of the information available, how to access and interpret it, and their role in providing feedback on its relevance and ease of use.
- You should maintain a suitable spreadsheet or database for recording the information.
- Develop an integrated service management tool (see Service Desk) that can capture the information at the logging or analysis stage of the incident handling process.
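One way to read "simple and detectable triggers" is as keywords attached to each recorded known error, so that a new incident description can be matched against earlier resolutions. The sketch below shows that idea only; the `KnownError` fields, the function `match_known_errors` and the rule that every trigger word must appear are assumptions made for the example, not something FITS prescribes.

```python
import re
from dataclasses import dataclass


@dataclass
class KnownError:
    """A documented error with the words that should bring it to mind."""
    title: str
    triggers: set          # lowercase keywords expected in incident reports
    workaround: str
    solution: str = ""     # stays empty until a permanent fix is documented


def match_known_errors(description, known_errors):
    """Return the known errors whose trigger words all occur in the description."""
    # Split the free-text description into lowercase words, ignoring punctuation.
    words = set(re.findall(r"[a-z0-9]+", description.lower()))
    return [ke for ke in known_errors if ke.triggers and ke.triggers <= words]
```

For example, an entry with the triggers `{"printer", "driver"}` would be returned for a call logged as "New user must reinstall the printer driver at logon", giving the service desk the recorded workaround at the point the call is first logged.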

PrM 2.3.3 Benefits of problem management

The benefits of taking a formal approach to problem management include the following.
- Improved quality of the ICT service
  High quality reliable service is good for school leaders, teachers and students. It is also good for the productivity and morale of the technical support staff.
- Reduction in the volume of incidents
  Problem management helps to reduce the number of incidents that can interrupt the school day.
- Permanent solutions
  There will be a gradual reduction in the number and impact of problems and known errors, as those that are resolved stay resolved.
- Improved technical support knowledge
  The FITS Problem Management process is based on the concept of learning from experience. The process provides the historical data to identify trends, and the means of minimising failures and of reducing the impact of failures, resulting in improved productivity.
- A more effective service desk
  Eventually the service desk will be able to resolve a number of incidents. There will be a better first-time fix rate at the service desk as problem management enables the service desk staff to know how to deal with problems and incidents that have previously been resolved and documented.

PrM 2.3.4 What weakens the benefits of problem management?

The following can weaken the benefits of problem management.
- Poor incident control
  The absence of a good incident control process means that you will not have detailed historical data on incidents, which is necessary for the correct identification of problems.
- Absence of co-ordination with incident management
  Failure to link incident records with problem/error records means a failure to gain many of the potential benefits. This is a key feature in moving from reactive support to a more planned and proactive support approach.
- Lack of management or leadership commitment
  The result of lack of commitment at the top is likely to be that support staff (who are usually also involved in reactive incident control) cannot allocate enough time to structured problem-solving activities.
- Undermining the service desk role
  All incident reports must come through the service desk and not direct to the technician. Difficulties will arise if the service desk is dealing with multiple reports of incidents and the technician is not fully aware of the extent of the problem.
- Not maintaining call log or incident sheets
  Any failure to set aside time to build and update the call log or incident sheets will restrict the benefit of understanding the bigger picture on the network and looking at trends that may point towards an underlying problem.
- Ignorance of the impact of incidents and problems
  If you are unable to determine accurately the impact on the school of incidents and problems, you will not be in a position to give critical incidents and problems the correct priority.

PrM 2.4 How problem management works

Problem management works by using analysis techniques to identify the cause of the problem. Incident management is not usually concerned with the cause, only the cure. Problem management therefore takes longer and should be done once you have dealt with the urgent stage of the incident: for example, removing a faulty computer and replacing it with a working computer. This takes the urgency away and leaves the faulty computer ready for diagnostics.

Problem management can take time. It is important to set a limit on how much time should be spent on the problem, or resolution can become expensive.

To achieve the goal, problem management aims to:
- identify the root cause
- initiate actions to improve and correct the situation.

PrM 2.4.1 Summary of the Problem Management process

PrM 2.4.1.1 Inputs to Problem Management

Inputs to the Problem Management process are:
- incident details from the Incident Management process
- configuration details from the configuration management database
- details about changes made to the part of the network with the problem
- any defined workarounds (from incident management).

PrM 2.4.1.2 Outputs from Problem Management

Outputs from the Problem Management process are:
- known errors
- requests for change (through change management)
- an updated problem record (including a solution and/or any available workarounds)
- for a resolved problem, a closed problem record
- knowledge base content to use in incident management
- management information through reports.

PrM 2.4.1.3 Activities of Problem Management

The major activities of Problem Management are:
- problem control
- error control
- the proactive prevention of problems
- identifying trends
- obtaining management information from problem management data
- the completion of major incident or problem reviews.

PrM 2.4.1.4 Roles and responsibilities in Problem Management

For an outline of the roles and responsibilities, see PrM 5.
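The inputs and outputs listed in PrM 2.4.1 suggest what a single problem record might hold. The sketch below is one possible layout, not a FITS-defined schema: every field name, the default four-hour time limit and the rule used by `is_known_error` (root cause recorded, at least one workaround, not yet closed) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProblemRecord:
    summary: str
    # Inputs: linked incidents, configuration details, recent changes, workarounds
    incident_ids: List[int] = field(default_factory=list)
    configuration_items: List[str] = field(default_factory=list)
    recent_changes: List[str] = field(default_factory=list)
    workarounds: List[str] = field(default_factory=list)
    # Outputs: root cause, solution, requests for change, closure
    root_cause: Optional[str] = None
    solution: Optional[str] = None
    change_requests: List[str] = field(default_factory=list)
    closed: bool = False
    # Time limit for investigation (hypothetical default, in hours)
    time_limit_hours: float = 4.0
    hours_spent: float = 0.0

    def log_time(self, hours: float) -> bool:
        """Add investigation time; return False once the limit is exceeded."""
        self.hours_spent += hours
        return self.hours_spent <= self.time_limit_hours

    @property
    def is_known_error(self) -> bool:
        """Assumed rule: cause identified, workaround available, still open."""
        return bool(self.root_cause) and bool(self.workarounds) and not self.closed

    def close(self, solution: str) -> None:
        """Record the permanent solution and close the problem record."""
        if not solution:
            raise ValueError("a solution is required to close a problem")
        self.solution = solution
        self.closed = True
```

Keeping the time spent next to the limit makes it easy to see when an investigation has run past the time set aside for it and needs a decision on whether to continue.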

PrM 2.4.2 Problem Management life cycle

[Figure: Problem management life cycle. Legible labels: resolution (05), Results of resolution (06), Closure (07), ‘iterative’, ‘Complete call log and file all diagnostics sheets’.]

PrM 2.5 What does Problem Management cost?

- Initially it costs someone’s time and effort to look at problems and document an approach to resolving them in the future.
- The technician should be able to put time aside each week to look at problems. This time should be protected.

As a proactive process, problem management:
- will save time, as fewer incidents are logged
- will save budget, as the technician's salary is not used on resolving the same incident many times
- will increase the availability of equipment since it will fail less often
- will increase the confidence of the users – both teaching staff and students – as the systems become more reliable.
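The seven stages of the life cycle (named in PrM 3.1.1) can be modelled as a small state machine, which is useful if you track problem status in a spreadsheet macro or a script. This is a hedged sketch: the `Stage` names paraphrase the FITS steps, and the loop from ‘Results of resolution’ back to ‘Problem analysis’ is an assumption about what the ‘iterative’ label in the diagram refers to.

```python
from enum import IntEnum


class Stage(IntEnum):
    NOTIFICATION = 1          # Notification of problem
    REQUEST_FOR_SUPPORT = 2   # Request for technical support
    ANALYSIS = 3              # Problem analysis
    THEORY = 4                # Production of theory
    RESOLUTION = 5            # Production of resolution
    RESULTS = 6               # Results of resolution
    CLOSURE = 7               # Closure


def next_stage(current: Stage, resolution_worked: bool = True) -> Stage:
    """Return the stage that follows `current`.

    Assumption: if the resolution did not work, the process loops back
    to problem analysis instead of moving on to closure.
    """
    if current == Stage.CLOSURE:
        raise ValueError("problem is already closed")
    if current == Stage.RESULTS and not resolution_worked:
        return Stage.ANALYSIS
    return Stage(current + 1)
```

Calling `next_stage(Stage.RESULTS, resolution_worked=False)` returns `Stage.ANALYSIS`, so a failed fix sends the problem back for further analysis rather than closing it.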

PrM 3 Implementation guide

PrM 3.1 Define what needs to be done to implement Problem Management

Problem Management should be implemented with Incident Management or shortly afterwards.
- Ensure that you are recording your calls and can track their progress.
- Understand the difference between problems and incidents.
- Have a procedure to separate incidents from problems.
- Decide how much time each week to devote to problem management.
- Choose which areas to improve and which of your current processes to remove.
- You need to sell the idea to other staff, so make sure you’re happy with it first.

PrM 3.1.1 The Problem Management process

1. Notification of problem
2. Request for technical support
3. Problem analysis
4. Production of theory
5. Production of resolution
6. Results of resolution
7. Closure

See also the following.
- PrM 2.4.2 Problem Management life cycle
- PrM 2.1 What is Problem Management?
- PrM 2.1.1 Differences between incident management and problem management
- PrM 2.1.2 Incident vs problem

PrM 3.2 Prepare to implement Problem Management

- Good problem management relies to a great extent on a well-run incident management process. So it is sensible to implement Problem Management either in parallel with or after Incident Management.
- If resources are scarce, it is advisable to concentrate on the implementation of problem and error control (reactive problem management). When these activities reach maturity, resources can be directed to proactive problem management, which depends largely on the successful implementation of Network Monitoring and Preventative Maintenance.
- Smaller schools can introduce reactive problem management by focusing daily on the ‘top 10’ incidents of the previous week. This can prove to be effective, since experience shows that 20% of problems cause 80% of service degradation!

PrM 3.2.1 Risks to the implementation of Problem Management

The following can weaken the benefits of Problem Management.
- The absence of a good incident control process means that you will not have detailed historical data on incidents, which is necessary for the correct identification of problems.
- The result of lack of commitment at the top is likely to be that support staff (who are usually also involved in reactive incident control) cannot allocate enough time to structured problem-solving activities.
- In order not to undermine the service desk role, all incident reports must come through the service desk and not direct to the technician. Difficulties will arise if the service desk is dealing with multiple reports of incidents and the technician is not full
