The Total Economic Impact Of Automating Systems Management

Transcription

NetIQ User Conferentie 2010
The Total Economic Impact of Automating Systems Management
Travis Greene, Chief Service Management Strategist

Agenda
- The Challenges of Non-Automated Systems Management
- Steps to Apply Automation Effectively
- Examples of Systems Management Automation
- The Total Economic Impact Results

The Challenges of Non-Automated Systems Management

Systems Management Challenge #1
Reduce management and administration costs to meet budget realities and shift resources to new and innovative services.
[Chart: spending in billions of USD on power and cooling, management and administration, and hardware, shown with servers installed in millions, from 1996 onward. Source: IDC]

Systems Management Challenge #2
Balance customer demand for quality with cost.
[Diagram: customer demands (Rapid Response, Service Levels Met) weighed against IT organization challenges (Workload, Backlog, Human Error).]

Steps to Apply Automation Effectively

Step 1: Reduce Inefficiencies
With current management tools using automation
- Each group has tools, often with feature overlap
- A small percent of features are actually used
- ITPA (IT process automation), through adapters, can leverage more features, reducing human error and manual labor while improving knowledge sharing and consistency
[Diagram: IT functions (Service Desk, Application Management) mapped to best-of-breed management tools and the managed technologies.]

Step 2: Integrate Tools
Across domains and disciplines
- Begin with integration within operations and security domains separately
- As maturity increases, integrate across domains
[Diagram: Security and Operations domains with functions such as Performance & Availability, Change Authority, Change Control, Ticketing, User Monitoring, Problem Escalation, Response Time Monitoring, Service Level Reporting and Event Prioritization.]

Step 3: Integrate with the Business
Involve them with IT Management Processes
- Gain just-in-time approvals from stakeholders and satisfy self-service requests
  - Provision virtual machines
  - Approve policy exceptions
  - New user provisioning
- Provide automated reporting and charge-back
  - ROI, SLA, Process Improvement
  - Charge-back for resource usage (e.g. virtual machines) or process execution (e.g. change mgmt)
[Diagram: ITPA connecting the LOB requestor or stakeholder, billing system, ticketing, helpdesk, systems management, ROI and SLA reports, and other sources (RFCs, CMDB, change monitoring, etc.).]

Examples of Systems Management Automation

Perform Routine Maintenance
Such as Rebooting Servers
1. NetIQ Aegis initiates the server reboot process based on a schedule and suppresses reboot-related events
2. NetIQ Aegis commands the load balancer to block new sessions to the first server
3. NetIQ Aegis commands NetIQ AppManager to monitor for the server to reach zero active sessions
4. NetIQ Aegis commands NetIQ AppManager to reboot the server and wait for completion
5. NetIQ Aegis commands NetIQ AppManager to validate server health
6. NetIQ Aegis commands the load balancer to enable new sessions
7. NetIQ Aegis commands NetIQ AppManager to verify service performance
8. NetIQ Aegis sends a progress notification email to the administrator
9. NetIQ Aegis repeats steps 2-8 for each additional server in the group
Time saved (per-step figures in the order they appear on the slide): 1, 15, 3, 1, 1, 10x, 5 and 15 minutes
Total Time Saved: 410 Minutes
[Diagram: NetIQ Aegis, Administrator, NetIQ AppManager, NetIQ AppManager ResponseTime, Active Sessions, Load Balancer, 4 Web Servers.]
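The rolling reboot above can be sketched as a loop over the server group. This is a minimal illustration only: the `FakeLoadBalancer` class and the `reboot` and `healthy` callables are hypothetical stand-ins, not NetIQ Aegis or AppManager APIs, and waiting for active sessions to drain (step 3) is left out.

```python
from typing import Callable, Dict, List


class FakeLoadBalancer:
    """Hypothetical in-memory load balancer, used only for illustration."""

    def __init__(self, servers: List[str]) -> None:
        self.accepting: Dict[str, bool] = {name: True for name in servers}

    def block(self, server: str) -> None:
        self.accepting[server] = False

    def enable(self, server: str) -> None:
        self.accepting[server] = True


def rolling_reboot(
    servers: List[str],
    lb: FakeLoadBalancer,
    reboot: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> List[str]:
    """Reboot servers one at a time so the rest keep serving traffic."""
    log: List[str] = []
    for server in servers:
        lb.block(server)              # step 2: stop new sessions
        reboot(server)                # step 4: reboot and wait
        if not healthy(server):       # step 5: validate server health
            log.append(f"{server}: failed health check, left out of rotation")
            break                     # do not touch the remaining servers
        lb.enable(server)             # step 6: accept new sessions again
        log.append(f"{server}: rebooted")  # step 8: progress notification
    return log


if __name__ == "__main__":
    names = ["web1", "web2", "web3", "web4"]
    balancer = FakeLoadBalancer(names)
    for line in rolling_reboot(names, balancer, lambda s: None, lambda s: True):
        print(line)
```

Stopping at the first failed health check is a design choice of this sketch, not something the slide states; it keeps a bad reboot from taking down more than one server in the group.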

Recover from Common Events
Such as Low Disk Space Conditions
1. Available disk space falls below threshold
2. NetIQ AppManager generates an event, triggering a process in NetIQ Aegis
3. NetIQ Aegis requests disk usage analysis from NetIQ AppManager (Saved: 15 minutes)
4. NetIQ Aegis sends email to admin requesting approval to clean up (Saved: 5 minutes)
5. If no response is received within a defined time, NetIQ Aegis escalates to a higher level of management (Saved: 5 minutes)
6. Administrator approves partial cleanup through NetIQ Aegis (Saved: 4 minutes)
7. NetIQ Aegis commands NetIQ AppManager to perform cleanup (Saved: 15 minutes)
8. NetIQ Aegis sends confirmation email to the administrator (Saved: 4 minutes)
Total Time Saved: 48 Minutes
[Diagram: NetIQ Aegis, Administrator, Management, NetIQ AppManager, NetIQ AppManager Agent; file types *.dmp and *.log with a Delete? or Archive? choice leading to Archive or Trash.]
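The approval, escalation and partial cleanup steps above can be sketched as follows. Everything here is an assumption made for illustration: the function names, the `ask` callback (which returns `None` when the waiting time expires) and the file names are hypothetical and do not come from NetIQ Aegis.

```python
from typing import Callable, Dict, List, Optional


def request_approval(
    approvers: List[str],
    ask: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Ask each approver in turn, moving up the chain when one does not answer.

    `ask` returns "approve", "reject", or None if the wait time expired.
    Returns the first decision received, or None if nobody answered.
    """
    for person in approvers:
        answer = ask(person)
        if answer is not None:
            return answer
    return None


def files_to_clean(files: Dict[str, int], approved_suffixes: List[str]) -> List[str]:
    """Select only the files whose type the approver agreed to clean up."""
    return sorted(
        name
        for name in files
        if any(name.endswith(suffix) for suffix in approved_suffixes)
    )


if __name__ == "__main__":
    replies = {"admin": None, "manager": "approve"}  # admin never answered
    print(request_approval(["admin", "manager"], replies.get))
    sizes_mb = {"core.dmp": 500, "app.log": 20, "data.db": 900}
    print(files_to_clean(sizes_mb, [".dmp"]))  # partial cleanup: dumps only
```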

Update the CMDB
With Reconciled CIs from Multiple Management Tools
1. A new NetIQ Aegis adapter is implemented, providing connectivity to a monitoring tool such as NetIQ Secure Configuration Manager
2. NetIQ Aegis reconciles the configuration information from the new tool with what is known from other tools by synchronizing computers and groups using NetIQ IQRM (Saved: 60 minutes)
3. NetIQ Aegis updates the CMDB using a specific adapter or the NetIQ Aegis Adapter for Databases (Saved: 30 minutes)
4. NetIQ Aegis continues to reconcile new configuration information as it is received via multiple adapters (Saved: 5 minutes)
5. NetIQ Aegis continues to update the CMDB on schedule (Saved: 15 minutes)
6. If a conflict is found between the CMDB and the configuration information that NetIQ Aegis has, an event is raised requesting manual reconciliation
Total Time Saved: 110 Minutes
[Diagram: NetIQ Aegis, Admin, NetIQ Secure Configuration Manager, NetIQ AppManager, NetIQ Security Manager, BMC Remedy CMDB.]
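A rough sketch of the reconciliation idea in steps 2 to 6: merge what a monitoring tool reports into the known records, and flag any attribute where the two disagree for manual review. The record layout and the `reconcile` function are assumptions for illustration and do not reflect how NetIQ Aegis or a real CMDB stores configuration items.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]  # attribute name -> value for one configuration item


def reconcile(
    cmdb: Dict[str, Record],
    discovered: Dict[str, Record],
) -> Tuple[Dict[str, Record], List[str]]:
    """Merge discovered attributes into a copy of the CMDB records.

    New items and new attributes are added. Where both sides hold a different
    value, the CMDB value is kept and the difference is reported as a conflict.
    """
    merged = {name: dict(attrs) for name, attrs in cmdb.items()}
    conflicts: List[str] = []
    for name, attrs in discovered.items():
        current = merged.setdefault(name, {})
        for key, value in attrs.items():
            if key in current and current[key] != value:
                conflicts.append(
                    f"{name}.{key}: CMDB={current[key]!r} tool={value!r}"
                )
            else:
                current[key] = value
    return merged, conflicts


if __name__ == "__main__":
    known = {"srv1": {"os": "Windows 2003"}}
    seen = {"srv1": {"os": "Windows 2008", "owner": "ops"}, "srv2": {"os": "Linux"}}
    merged, conflicts = reconcile(known, seen)
    print(merged)
    print(conflicts)
```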

Identify Change-Induced Incidents
For Faster Service Restoration
1. An end user raises an incident with the help desk, describing unavailability of a service, and the help desk logs a ticket
2. NetIQ Aegis collects information from change monitoring tools such as NetIQ Change Guardian, or simply collects all recent changes (Saved: 30 minutes)
3. NetIQ Aegis populates the ticket with information from change monitoring (Saved: 10 minutes)
4. NetIQ Aegis monitors the change monitoring tools for additional information and updates the ticket (Saved: 10 minutes)
5. If the help desk cannot resolve with the information provided, administrators are contacted to resolve the incident
6. NetIQ Aegis monitors the ticket for a resolution code and looks for unintended consequences of the resolution from other monitoring tools, updating the ticket as necessary (Saved: 15 minutes)
7. NetIQ Aegis closes the ticket if no additional events are detected within a specified amount of time (Saved: 1 minute)
Total Time Saved: 66 Minutes
[Diagram: NetIQ Aegis, Business Service User, Helpdesk, Ticketing System, Administrators, Other Sources (RFCs, CMDB, NetIQ Change Guardian, etc.).]
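Steps 2 and 3 amount to finding recent changes that touched the affected service and attaching them to the ticket. A minimal sketch, assuming a hypothetical `Change` record and a 24-hour look-back window that the slide does not specify:

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple, Set


class Change(NamedTuple):
    """Hypothetical change record as a change monitoring tool might report it."""

    when: datetime
    target: str
    summary: str


def recent_changes(
    changes: Iterable[Change],
    service_hosts: Set[str],
    reported_at: datetime,
    window: timedelta = timedelta(hours=24),  # assumed look-back period
) -> List[Change]:
    """Return changes to the service's hosts inside the window, newest first."""
    start = reported_at - window
    hits = [
        c
        for c in changes
        if c.target in service_hosts and start <= c.when <= reported_at
    ]
    return sorted(hits, key=lambda c: c.when, reverse=True)


if __name__ == "__main__":
    now = datetime(2010, 5, 1, 12, 0)
    history = [
        Change(now - timedelta(hours=2), "db1", "patch applied"),
        Change(now - timedelta(days=3), "db1", "index rebuilt"),
        Change(now - timedelta(minutes=30), "web1", "config edited"),
    ]
    for change in recent_changes(history, {"db1", "web1"}, now):
        print(change.when, change.target, change.summary)
```

Listing the newest change first reflects the common troubleshooting heuristic that the most recent change is the most likely cause.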

Run Business Jobs
And Replace Costly Job Scheduling Tools
1. NetIQ Aegis initiates the “Data Replication” process based on a daily schedule (Saved: 1 minute)
2. NetIQ Aegis transfers 3000 files from the customer download server to six load balanced application servers (Saved: 60 minutes)
3. NetIQ Aegis confirms successful transfer of all files after a designated time period based on file size and transfer rates (Saved: 20 minutes)
4. If there are any failures, NetIQ Aegis collects information and notifies an administrator via email and re-initiates the transfer after approval or after a designated amount of time
5. NetIQ Aegis continues to retry the transfer and contact the admin a designated number of times (Saved: 5 minutes)
6. Once file transfer is completed, NetIQ Aegis initiates the processing of data on each application server and waits for completion (Saved: 5 minutes)
7. NetIQ Aegis sends a completion email to the designated administrator or a failure email if not completed on time (Saved: 4 minutes)
Total Time Saved: 95 Minutes
[Diagram: NetIQ Aegis, Admin, Customer Download Server, Application Servers.]
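The retry behaviour in steps 4 and 5 can be illustrated with a bounded retry loop that only resends the files that failed. The `send` callback and the default of three attempts are assumptions; notifying the administrator and waiting between attempts are omitted to keep the sketch short.

```python
from typing import Callable, List, Tuple


def transfer_with_retry(
    files: List[str],
    send: Callable[[str], bool],
    max_attempts: int = 3,  # assumed limit, "a designated number of times"
) -> Tuple[int, List[str]]:
    """Send files, retrying only the failures, up to `max_attempts` passes.

    `send` returns True when a file was transferred successfully.
    Returns (passes used, files that still failed).
    """
    pending = list(files)
    for attempt in range(1, max_attempts + 1):
        pending = [name for name in pending if not send(name)]
        if not pending:
            return attempt, []
    return max_attempts, pending


if __name__ == "__main__":
    seen = {}

    def flaky_send(name: str) -> bool:
        """Pretend the file 'b.dat' fails on its first attempt only."""
        seen[name] = seen.get(name, 0) + 1
        return name != "b.dat" or seen[name] > 1

    print(transfer_with_retry(["a.dat", "b.dat"], flaky_send))
```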

Centralize Monitoring
And Resolve Custom Business Application Events
1. Business application performance begins to degrade and the application writes an event to a MS SQL database
2. NetIQ Aegis detects the new row that has been added to the database using the Database Adapter and reads the details (Saved: 1 minute)
3. NetIQ Aegis forwards the event into NetIQ AppManager, populates the event details with affected user names and event log info, and reconciles the application name with the associated server name and object (Saved: 10 minutes)
4. NetIQ Aegis sends an email to the administrator with designated options for event recovery
5. Administrator replies to NetIQ Aegis, which commands NetIQ AppManager to resolve the known error using established procedures (Saved: 20 minutes)
6. NetIQ Aegis updates the database, closes the event in NetIQ AppManager and emails the application administrator (Saved: 15 minutes)
Total Time Saved: 46 Minutes
[Diagram: NetIQ Aegis, Admin, Business Service, NetIQ AppManager, Database Server; resolution selection with Option 1, Option 2, Option 3.]
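Step 2, detecting a newly written row, is commonly done by polling for rows with an id higher than the last one seen. The sketch below uses Python's built-in `sqlite3` as a stand-in for the MS SQL database on the slide; the `app_events` table and its columns are hypothetical, and this is not how the NetIQ Aegis Database Adapter is implemented.

```python
import sqlite3
from typing import List, Tuple

Row = Tuple[int, str, str]


def fetch_new_events(conn: sqlite3.Connection, last_seen_id: int) -> Tuple[List[Row], int]:
    """Return rows added since `last_seen_id` and the new high-water mark."""
    rows = conn.execute(
        "SELECT id, app, message FROM app_events WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    new_last = rows[-1][0] if rows else last_seen_id
    return rows, new_last


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE app_events (id INTEGER PRIMARY KEY, app TEXT, message TEXT)"
    )
    conn.execute(
        "INSERT INTO app_events (app, message) VALUES (?, ?)",
        ("billing", "response time degraded"),
    )
    events, marker = fetch_new_events(conn, 0)
    print(events, marker)
    # A second poll with the updated marker finds nothing new.
    print(fetch_new_events(conn, marker))
```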

Prioritize and Resolve Events
Based on the Impact to End Users
1. NetIQ AppManager detects multiple MS SQL database events, including high lock utilization, a high number of master DB locks and lock wait time high
2. NetIQ AppManager ResponseTime for Web detects degradation in performance for a business application
3. These events are correlated in NetIQ Aegis based on a business service that has been defined in the Resource Management Database (Saved: 10 minutes)
4. NetIQ Aegis closes the symptomatic events in NetIQ AppManager and opens a new reprioritized event that indicates high SQL lock utilization (Saved: 10 minutes)
5. NetIQ Aegis alerts the database administrator, describing the situation along with a recommendation for resolution in an email (Saved: 5 minutes)
6. The database administrator approves termination of the SQL PID associated with the user consuming the most locks via reply to the email to resolve (Saved: 8 minutes)
7. NetIQ Aegis commands NetIQ AppManager to terminate the SQL PID and replies to the administrator with the results (Saved: 15 minutes)
Total Time Saved: 48 Minutes
[Diagram: NetIQ Aegis, Administrator, Business Service, NetIQ AppManager, Web Server, Application Server, Database Server; resolution prompt "Kill SPID" with Yes or No.]
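The correlation in step 3 can be pictured as grouping events by the business service their host belongs to, so that several symptoms on one service are handled as a single issue. A minimal sketch, assuming a hypothetical host-to-service map in place of the Resource Management Database:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def correlate(
    events: Iterable[Tuple[str, str]],
    service_map: Dict[str, str],
) -> Dict[str, List[str]]:
    """Group (host, message) events by business service.

    Only services with more than one event are returned, since a single
    event has nothing to be correlated with.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for host, message in events:
        service = service_map.get(host)
        if service is not None:
            grouped[service].append(f"{host}: {message}")
    return {
        service: messages
        for service, messages in grouped.items()
        if len(messages) > 1
    }


if __name__ == "__main__":
    incoming = [
        ("db1", "high lock utilization"),
        ("web1", "response time degraded"),
        ("mail1", "queue length high"),
    ]
    hosts = {"db1": "Order Entry", "web1": "Order Entry", "mail1": "Email"}
    print(correlate(incoming, hosts))
```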

The Total Economic Impact Results

FORRESTER
* Determined using the Aegis ROI calculator developed by Forrester Consulting, based on a representative customer with 1,000 servers.
[Figure legend: Required, Optional]

ROI Analysis Available
Independently developed by an analyst firm

