ExtraHop IT Operational Intelligence White Paper


WHITE PAPER

The ExtraHop IT Operational Intelligence Platform: How It Works and Real-World Results

By Tyson Supasatit
Technical Marketing Manager

Abstract

ExtraHop accelerates IT transformation with real-time IT operations analytics. The ExtraHop platform equips all IT teams with correlated, cross-tier visibility so they can answer the question, “What is happening in my environment right now?” With this operational intelligence, organizations in all industries have built a sustainable competitive advantage by running their IT more efficiently and with greater agility. This white paper explains the technology that powers the ExtraHop platform and how IT organizations use ExtraHop to accomplish critical IT tasks and add significant value to the business.

Table of Contents

Introduction
Wire Data Can Transform IT Operations
Wire Data: Unlocking the Potential of Data on the Wire
How ExtraHop Works
Simple, Non-Invasive Deployment
Full-Stream Reassembly and Full-Content Analysis
Streaming Datastore and Intelligent Alerting Engine
Open, Extensible, and Shareable Platform
ExtraHop in Action
End-User Intelligence
Proactive Remediation
Infrastructure Optimization
Application Optimization
IT Decision Management
IT and Business Intelligence
Security and Compliance
Conclusion

Introduction

ExtraHop enables companies to achieve a sustainable competitive advantage through more proactive and agile IT Operations. Organizations that have adopted the ExtraHop operational intelligence platform have transformed their IT Operations so that they are making informed decisions regarding IT infrastructure, answering questions that impact millions of dollars in revenue, and preventing problems instead of reacting to them. In short, the ExtraHop platform is helping these IT organizations become a strategic asset to the business.

Wire Data Can Transform IT Operations

Technology is not a panacea, but the right set of solutions is essential to help IT Operations respond faster to changing business needs. Most IT organizations purchase monitoring tools to meet narrow departmental requirements, not according to a strategic, overarching plan. This behavior results in an ad hoc accumulation of niche products that exist in silos, not the cohesive IT operational intelligence framework that will equip these organizations to accelerate IT maturity. ExtraHop is part of a set of next-generation technologies that together equip IT teams with holistic operational intelligence.

Fundamentally, there are only four key sources of data available for IT operations management. Each data source is necessary, although the role and importance of each is evolving.

• Machine data, including logging provided by vendors, SNMP, and WMI. This information about system internals helps IT teams identify overburdened machines, plan capacity, and perform forensic analysis of past events. New distributed log-file analysis solutions enable IT organizations to address a broader set of tasks, including answering business-related questions.
• Agent data from byte-code instrumentation, call-stack sampling, and custom logging. Code diagnostic tools have traditionally been the purview of Development and QA teams, helping to identify hotspots or errors in the software code. New SaaS vendors have dramatically simplified the deployment of agent-based products.
• External data from synthetic transactions and service checks. This data enables IT teams to test common transactions from locations around the globe.
• Wire data, which has traditionally included NetFlow, HTTP traffic analysis, and packet capture. The information available off the wire historically has been used for measuring, mapping, and forensics. ExtraHop unlocks the tremendous potential of real-time wire data, opening up vastly greater opportunities and serving as the lynchpin of IT operational intelligence.

Wire Data: Unlocking the Potential of Data on the Wire

The information needed for operational intelligence has always existed on the wire, but previously was not available in real time or in a way that was meaningful to all IT teams. The ExtraHop platform introduces revolutionary new high-speed packet processing capabilities that make it possible, for the first time, to fully analyze the wealth of data passing over the wire in real time and present it in a way that makes sense for network engineers, security professionals, DBAs, storage administrators, application architects, application owners, and others. ExtraHop extracts this real-time wire data without the use of agents.

Figure: ExtraHop provides value to all IT teams, equipping them with real-time operational intelligence needed to answer the question, “What’s happening in my IT environment right now?”

The traditional approach to obtaining visibility across all the tiers of an IT environment would be to pull as many discrete metrics as possible from each tier and then try to make sense of the collected data with analysis and reporting servers. This bottom-up approach provides information that is often hours old, uncorrelated, and frequently unreliable because of poor integration between various tools. Worse still, these legacy tools become more expensive and require more effort to manage as the environment grows in complexity, leaving IT organizations paying more and getting less in return.

ExtraHop takes a radically different approach, using wire data as the source for cross-tier insight. The network is the common element that ties all components of the application delivery chain together, even as those components become more numerous and distributed. Each component communicates with others using transport and application protocols. These protocols definitively describe what is happening in the IT environment. The networking adage, “packets don’t lie,” applies here. Moreover, these protocols seldom change, making the network the ideal instrumentation point in increasingly heterogeneous and fluid environments.

How ExtraHop Works

The ExtraHop platform performs full-stream reassembly and full-content analysis of network traffic to extract IT and business insights. ExtraHop analyzes application transactions continuously and in real time, at speeds up to a sustained 20Gbps. An open and extensible platform, ExtraHop enables IT teams to define and implement new metrics within minutes, and integrates seamlessly with manager of managers (MOM) systems and other next-generation monitoring products such as Keynote, New Relic, SevOne, and Splunk.

Simple, Non-Invasive Deployment

The ExtraHop platform is a completely passive, out-of-line network appliance that is easy to deploy and manage. Deployed using a network tap, SPAN, or other data-access technology, ExtraHop analyzes every application transaction, not just a sample portion of network traffic as with synthetic transactions. Where a tap or SPAN is not available, ExtraHop offers a high-speed packet forwarder that can be packaged in automated configuration utilities such as Chef.

Figure: The ExtraHop Context and Correlation Engine is built for massively scalable transaction analysis, up to a sustained 20Gbps.

As soon as traffic is detected by the platform, ExtraHop’s Context and Correlation Engine automatically discovers and classifies devices, both physical and virtual, and determines relationships between devices based on MAC addresses, IP addresses, naming protocols, and other heuristic elements. As the IT environment changes (with new software builds and upgraded infrastructure components, for example), ExtraHop automatically detects and adjusts to those changes.

For distributed environments, the ExtraHop Central Manager delivers a consolidated view of wire data from multiple ExtraHop appliances, enabling organizations to gain visibility into the communications of hundreds of thousands of devices across datacenters and branch offices. IT administrators can easily update the platform firmware remotely, making the ExtraHop platform an ideal choice for deployment within physically isolated, or lights-out, datacenters.

Full-Stream Reassembly and Full-Content Analysis

Other products inspect only L4 headers; only the ExtraHop Context and Correlation Engine performs full-stream reassembly continuously in real time. This advanced approach reassembles multiple packets into a stream and reconstructs transactions, flows, and sessions, which is a prerequisite for true application fluency. ExtraHop is purpose-built for production on-premises and cloud environments, supporting real-world traffic patterns such as IP fragments, out-of-order segments, and microbursts. When packet loss occurs on the monitoring link, ExtraHop resynchronizes and recovers. Because it was built to take full advantage of multicore processing, the ExtraHop Context and Correlation Engine is able to perform full-stream reassembly at a sustained 20Gbps.

Through full-stream reassembly, the ExtraHop Context and Correlation Engine can analyze the full content of transaction payloads (not to be confused with packet payloads) and extract crucial details such as the specific URI included in an HTTP 500 error, slow stored procedures in a database, or the location of a corrupt file in network-attached storage. ExtraHop offers protocol modules for web applications, NoSQL and relational databases, network-attached storage (NAS) and storage-area networks (SANs), directory services, and industry-specific protocols for financial and telecommunications verticals.
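To make the idea of stream reassembly concrete, here is a minimal Python sketch of how out-of-order and retransmitted segments can be stitched back into one byte stream before a transaction is parsed. It illustrates the general technique only, not ExtraHop's engine; the `Segment` type, the zero-based `seq` offsets, and the sample request are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # Hypothetical simplified segment: `seq` is the byte offset of the
    # first payload byte, relative to the start of the stream.
    seq: int
    payload: bytes


def reassemble(segments):
    """Return the contiguous byte stream starting at offset 0.

    Segments may arrive out of order or overlap (retransmissions).
    Reassembly stops at the first gap, where a real monitor would have
    to wait for the missing data or resynchronize.
    """
    stream = bytearray()
    for seg in sorted(segments, key=lambda s: s.seq):
        if seg.seq > len(stream):
            break  # gap: a segment is missing
        overlap = len(stream) - seg.seq
        if overlap < len(seg.payload):
            # Append only the bytes not already present in the stream.
            stream.extend(seg.payload[overlap:])
    return bytes(stream)


segments = [
    Segment(seq=17, payload=b"HTTP/1.1\r\n"),  # arrives first, belongs last
    Segment(seq=0, payload=b"GET /orders"),
    Segment(seq=11, payload=b"?id=7 "),
    Segment(seq=11, payload=b"?id=7 "),        # retransmitted duplicate
]
request = reassemble(segments)
print(request)
```

Only after the bytes are back in order can the request line (and therefore the URI) be read, which is why reassembly comes before any application-level analysis.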

Streaming Datastore and Intelligent Alerting Engine

The ExtraHop Context and Correlation Engine includes a high-speed, streaming datastore that records and retrieves performance and health metrics in real time. Optimized for time-sequenced telemetry, the datastore writes to and reads from underlying block devices directly, translating into reliably superior recording and retrieval speeds without the tuning and management required by a relational database.

The streaming datastore powers an intelligent alerting engine that helps IT teams prevent small issues from growing into larger problems. IT teams can configure the default alerts and create new alerts for behaviors and events such as network activity, webserver and database errors, payload length, slow transactions, and expiring SSL certificates.

“ExtraHop gives us the intelligence we need to continually increase efficiency and sustain a competitive advantage.”
Drew Garner, Director of Cloud Architecture, Concur

Open, Extensible, and Shareable Platform

ExtraHop is a platform for IT Operations innovation, equipping IT organizations to quickly meet new requirements for visibility and insight. ExtraHop offers generous options for integration with existing IT management toolsets, including policy-based logging of events that are only available through analysis of wire data. Best of all, the innovative extensions for the ExtraHop platform can be easily bundled, shared, and improved upon through the ExtraHop community.

• Open – ExtraHop works with other management and monitoring solutions using both push and pull integration. For push integration, syslog export enables IT teams to send policy-based, event-driven metrics to any IT management console, custom Big Data analysis store, SIEM product, or third-party management tool such as Keynote, SevOne, or Splunk. For pull integration, IT teams can use SDK documentation to access the same API that is used by the ExtraHop web interface. This API provides immediate access to any metric in the ExtraHop datastore.
• Extensible – ExtraHop provides a programmatic interface to its Context and Correlation Engine that IT teams can use to define and implement new custom metrics in minutes. Application Inspection Triggers (AI Triggers) make it possible to rapidly answer questions such as “How many duplicate orders are occurring and whom do they affect?” “Which client types are affected by this new update?” “What users are accessing this sensitive storage file?” and “What are the front-end web requests that are associated with these slow SQL queries?”
• Shareable – What makes ExtraHop a true platform is the ability to package and share extensions. IT teams can package together dashboards, alerts, geomaps, dynamic groups, and AI Triggers and then share them within the organization or with the wider ExtraHop user community. These solution bundles can be downloaded and extended to meet particular IT management tasks or application monitoring requirements. In this way, IT teams benefit from community-driven enhancements by quickly implementing and building on the innovation of others.
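As a rough illustration of the push integration described above, the following Python sketch builds a simplified syslog-style line from a set of event fields and shows how it could be sent to a collector over UDP. This is not ExtraHop's export format: the `wire-event` tag, the field names, and the use of the local0 facility are assumptions chosen for the example, and the line omits the timestamp and hostname that a full syslog message would carry.

```python
import socket

# Standard syslog numbering: facility local0 = 16, severity informational = 6.
FACILITY_LOCAL0 = 16
SEVERITY_INFO = 6


def format_event(fields, facility=FACILITY_LOCAL0, severity=SEVERITY_INFO,
                 tag="wire-event"):
    """Build a simplified syslog line: '<PRI>tag: key=value key=value ...'."""
    pri = facility * 8 + severity  # syslog priority value
    body = " ".join(f"{key}={value}" for key, value in fields.items())
    return f"<{pri}>{tag}: {body}"


def send_event(line, host, port=514):
    """Send one line to a syslog collector over UDP (not called in this sketch)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


# Hypothetical event fields extracted from a transaction payload.
line = format_event({"order_id": "A1001", "merchant_id": "42", "duplicate": "true"})
print(line)
```

Key-value pairs are used here because most log search tools can index them without a custom parser; a real deployment would follow whatever format the receiving system expects.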

ExtraHop in Action

Companies from a wide range of industries are using ExtraHop to transform how they run IT. The following examples provide a glimpse into what is possible with the IT and business insights delivered through the ExtraHop platform.

End-User Intelligence

Unlike monitoring products that only show what users are doing and experiencing on the frontend, ExtraHop can correlate user activity and experience to performance in the backend IT infrastructure. In other words, ExtraHop does not just show what users are experiencing, it also explains why.

One telecommunications service provider used ExtraHop to identify the specific users whose devices were adversely affected by a firmware update. Traditionally, service providers would rely on tools that show which systems are communicating and when. Only ExtraHop enables IT teams to see what is actually being said between systems. In the case of the service provider, ExtraHop reconstructed and analyzed the contents of all Diameter transactions, including attribute-value pairs (AVPs) such as customer IDs and handset type. With this information, the service provider could easily isolate which subscribers were affected by the firmware update and work with the handset manufacturer to develop a fix.

A large research hospital had spent weeks trying to isolate the cause of extremely slow Citrix logins every morning around 8:30 a.m. With ExtraHop, the hospital identified severe contention at the storage tier: a single doctor was pulling down 2GB of photos stored in his My Pictures folder every time he logged in. By deleting the My Pictures folder from user profiles, the IT team at the research hospital solved the problem, helping to earn goodwill from users and paving the way for an expansion of the hospital’s VDI deployment.

“ExtraHop has proven itself to be very valuable to Alaska Airlines, and no other solution in our environment has been able to analyze Informix the way that ExtraHop does. It has enabled us to quickly and accurately diagnose several issues that would have been impractical or impossible to pin down previously.”
Kris Kutchera, VP of Information Technology, Alaska Air Group

Proactive Remediation

In an ideal world, everything is tested and works perfectly when deployed to production. Reality works much differently, requiring IT Operations teams to maintain real-time visibility into the performance of production applications. ExtraHop provides trend-based early-warning alerts for the entire production environment so that IT teams can proactively identify and fix problems fast.

Prior to deploying ExtraHop, Alaska Airlines’ IT team had no way of monitoring the real-time performance of their Informix database. This database underlies Alaska Airlines’ weights and balances application, which must calculate weight distribution on planes before they are cleared for takeoff. The IT team could not continuously run profilers on the database in production because of the high overhead required. With ExtraHop, Alaska Airlines monitors the performance of its critical Informix database continuously with zero overhead. By reconstructing and analyzing all transactions, ExtraHop provides the IT team with real-time database performance metrics, including details such as errors, methods, and users.

Figure: With ExtraHop, IT organizations can monitor the performance of databases in production, including details such as methods, without running any database profilers, which can add onerous system overhead.

Infrastructure Optimization

Oftentimes, IT Operations teams do not root out inefficiency from their infrastructure because no one is complaining and there are other urgent projects waiting. ExtraHop makes it easier to identify inefficient activity as well as poor performance that users quietly tolerate. Detailed metrics from ExtraHop also help IT teams to determine the optimal settings for application delivery controllers (ADCs), WAN optimizers, and network-attached storage given the unique requirements of their applications.

Figure: By assembling the TCP state machines for every endpoint, ExtraHop can monitor sophisticated TCP metrics such as PAWS-dropped SYNs, receive-window throttles, retransmission timeouts, and Nagle delays.

At one company, an Operations team member was using ExtraHop to find SQL queries that were good candidates for caching. In the course of his investigation, he saw that CIFS traffic comprised 70 percent of network bandwidth. This seemed odd, so he drilled into the CIFS transaction details and found some familiar file names in the list: files associated with the company’s homegrown logging system! A bug in the log archive script was causing five million files to be copied across the network unnecessarily. The network team was unfamiliar with the logging system and had assumed that this traffic growth was organic. In fact, they were preparing a forklift upgrade of the network infrastructure to handle the increase. However, with the archive script fixed, network utilization dropped by an astounding 70 percent, which helped the company defer hundreds of thousands of dollars in capital expense. Legacy network monitoring tools would not have helped in this case. Only ExtraHop, with its ability to analyze L7 application-level details, is able to distinguish CIFS traffic and list the filenames for each transaction.

Healthcare services provider MedSolutions used ExtraHop to identify a misconfiguration in their F5 BIG-IP that was adding network latency for users in the corporate office. ExtraHop showed a high number of retransmission timeouts (RTOs) on LAN segments behind the corporate load balancers. This behavior was obvious in ExtraHop's TCP analysis, but would otherwise have required careful investigation with a packet sniffer to reveal. The IT team found the F5 BIG-IP was misconfigured with a TCP profile for a wide-area network instead of a local-area network. In addition to RTOs, ExtraHop tracks sophisticated TCP metrics such as Nagle delays and tinygrams, which help network teams and system administrators to determine which congestion control algorithms to turn on.

“With ExtraHop, MedSolutions has a real-time, holistic view of all of our applications and infrastructure. This operational intelligence enables us to quickly answer questions and take action to improve performance and efficiency.”
Satish Dave, CIO, MedSolutions

Application Optimization

ExtraHop supports the entire application management lifecycle, providing architects, developers, testers, and operations teams with a way to measure how updates and configurations affect performance. With consistent and trusted data from ExtraHop, stakeholders can work together more effectively to ensure fast and smooth rollouts. ExtraHop also provides operational intelligence for packaged applications, enabling IT teams to monitor performance across all tiers of the application delivery chain.

A large outdoor equipment retailer rolled out mobile point-of-sale (POS) devices in preparation for the holiday shopping season. The company estimated that by reducing lines at checkout counters, they would recoup nearly one million dollars in lost sales. However, store managers complained that performance for these mobile devices was so slow that they were useless, with product scans taking from 30 seconds to one minute. Using SSL analysis in ExtraHop, the IT Operations team discovered the third-party mobile POS software was performing 15 SSL handshakes per transaction. The vendor provided a fix so that the application used recognized SSL tokens, reducing transaction times to less than one second, faster even than traditional POS terminals.

IT Decision Management

ExtraHop provides IT organizations with the insight they need to make decisions about capacity planning, application migrations, decommissioning legacy systems, and infrastructure changes.

Practice Fusion, a provider of web-based electronic medical record (EMR) solutions, used ExtraHop to migrate a portion of their web application from physical to virtual infrastructure. This particular workload was customized to run on a particular HP server platform, and previous attempts to virtualize the workload had failed because the software encountered race conditions and similar problems. With ExtraHop, the IT team at Practice Fusion measured baseline performance for the application on the dedicated HP servers and on a parallel virtual infrastructure. ExtraHop showed that performance was slightly better on the virtual infrastructure, helping Practice Fusion to avoid spending $75,000 purchasing new hardware and revalidating the software for the new platform.

“We set up ExtraHop in our staging environment so that the engineering team can see the impact of new code against our baseline performance. With visibility across all tiers of the environment, we can see whether a performance problem is due to infrastructure, misconfiguration, or possibly a code-level issue.”
John Hluboky, VP of Technical Operations, Practice Fusion

Concur relies on more than 1,000 database instances to power its SaaS expense reporting solution. So when the R&D Operations team wanted to dramatically expand the cache in front of the database, finding the best SQL workloads to migrate to the cache would have been next to impossible using database profilers. Instead, Concur used ExtraHop to analyze database transactions for the entire infrastructure and determine the total weight for each SQL query by multiplying the number of times that query was run by the time required to return the data. This information helped Concur to justify expanding its cache from 13,000 hits per day to more than 500 million hits per day, which in turn resulted in a 20 percent improvement in application performance.

IT and Business Intelligence

The wire data that flows through IT environments contains a wealth of information that is valuable to the business. ExtraHop enables IT organizations to tap this valuable business data to help drive additional revenue and analyze customer behavior and pricing trends.

A large financial services firm knew its system was duplicating orders, but could not find the source of the problem or discover which accounts were affected and how frequently. The IT team used ExtraHop to analyze the XML payload and extract details specified by the Orbital payment protocol, including user, merchant ID, account number, and order ID (see below). Through syslog export, the IT team set up ExtraHop to automatically forward this specific information to their Splunk deployment for search and analysis.

Figure: ExtraHop enables IT teams to easily mine the full transaction payload and extract metrics that are relevant to the business, such as account numbers and order IDs.
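The query weighting in the Concur example above can be sketched in a few lines of Python. This is one plausible reading of "total weight" (run count multiplied by the time to return the data, which equals the sum of per-run times), not Concur's or ExtraHop's actual calculation; the function name, the input shape, and the sample queries are hypothetical.

```python
from collections import defaultdict


def rank_queries_by_weight(transactions):
    """Rank SQL statements by total weight, heaviest first.

    `transactions` is an iterable of (sql_text, seconds) pairs, one per
    observed execution. Total weight is the summed response time, which
    is the run count multiplied by the mean time per run.
    Returns a list of (sql_text, run_count, total_seconds) tuples.
    """
    count = defaultdict(int)
    total_seconds = defaultdict(float)
    for sql, seconds in transactions:
        count[sql] += 1
        total_seconds[sql] += seconds
    ranked = sorted(total_seconds.items(), key=lambda item: item[1], reverse=True)
    return [(sql, count[sql], weight) for sql, weight in ranked]


# Hypothetical observations: a fast query run very often and a slow one run rarely.
observed = [("SELECT rate FROM fx", 0.002)] * 5000 + [("SELECT * FROM report", 1.5)] * 2
ranked = rank_queries_by_weight(observed)
for sql, runs, weight in ranked:
    print(f"{weight:8.2f}s total  {runs:6d} runs  {sql}")
```

In this made-up data the fast but frequent query outweighs the slow but rare one, which is the kind of caching candidate that is easy to miss when looking only at per-query latency.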

A major online advertising platform used ExtraHop to identify customers who had exhausted their prepaid keyword accounts. Before using ExtraHop, the IT Operations team had no visibility into the cause for HTTP 500 errors returned by their own API. ExtraHop enabled the team to examine the HTTP payload for these transactions and see what was causing the errors. In many cases, the application was returning an HTTP 500 message when the prepaid limit was reached, not because of an actual server problem. By proactively identifying which large clients needed to replenish their account balances, the online advertising platform is able to collect revenue that would otherwise be lost.

Security and Compliance

Because ExtraHop provides detailed metrics for every transaction passing over the wire, it provides security teams with valuable information about who is accessing which systems and how they are doing so. For example, IT teams can use ExtraHop to see which clients are accessing the database using root or system administrator accounts. ExtraHop also facilitates compliance audits by providing audit teams with detailed reports showing database and storage activity that is in violation of policy, including unauthorized access to specific directories and files.

The IT Operations team at an online retailer was trying to stop an attacker who was extracting data from the database through SQL injection. Using ExtraHop, the IT team isolated the web requests that resulted in HTTP 500 errors and database responses in excess of 5MB. The IT team then used ExtraHop to analyze the web requests and find both the IP address of the attacker and the database vulnerability they were trying to exploit. This information enabled the IT team to quickly block connections from the attacker and patch the database.

Many IT organizations use ExtraHop to defeat repeated brute-force FTP hacking attempts from overseas IP addresses. These IT teams set an alert in ExtraHop that fires when a specific client fails three FTP login attempts within 30 seconds and triggers a Fail2Ban action for that particular client IP address.

Figure: With ExtraHop, IT teams can easily track all SSL certificate expirations and RSA key sizes.
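The FTP alert rule described above (three failed logins from one client within 30 seconds) is a sliding-window count per client. The Python sketch below shows that logic in isolation, under the assumption that failed-login events arrive as (client IP, timestamp in seconds) pairs; the class name and sample addresses are hypothetical, and the ban action itself is left out.

```python
from collections import defaultdict, deque

MAX_FAILURES = 3      # failures that trigger the alert
WINDOW_SECONDS = 30   # length of the sliding window


class FailedLoginDetector:
    """Track failed logins per client over a sliding time window."""

    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # client IP -> recent timestamps

    def record_failure(self, client_ip, timestamp):
        """Record a failed login; return True if the alert should fire."""
        recent = self._failures[client_ip]
        recent.append(timestamp)
        # Drop failures that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.max_failures


detector = FailedLoginDetector()
# Hypothetical failed-login events: (client IP, seconds since start).
events = [
    ("203.0.113.9", 0),
    ("203.0.113.9", 10),
    ("198.51.100.4", 12),
    ("203.0.113.9", 25),
]
alerts = [ip for ip, ts in events if detector.record_failure(ip, ts)]
print(alerts)
```

A deque per client keeps memory bounded to the failures inside the window, so the check stays cheap even when many clients are being tracked.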

Conclusion

IT Operations teams stand at the intersection of IT and the business. Increasingly, business success will depend on how quickly and how well these IT Operations teams respond to new demands. ExtraHop delivers the greatest results in companies that believe how they run IT matters. Organizations across a wide variety of industries (including telecommunications, financial services, retail, healthcare, and government) use ExtraHop to build sustainable advantages over their competition. By running IT operations better, these companies can respond faster to new requirements, roll out innovative new services faster, provide superior user experiences, and quickly gather business insights.

Using ExtraHop’s visibility and insight, Operations, Development, Security, and other teams are working together to continually improve security, performance, and availability. At the same time, these IT teams are cutting costs through a more elegant, scalable, and flexible framework for IT operational intelligence.

ExtraHop provides the real-time operational intelligence required to make IT more agile and proactive. The world’s best-run IT organizations, including Adobe, Alaska Airlines, Concur, Expedia, and Microsoft, use ExtraHop to manage more than half a million devices and monitor over a trillion transactions daily.

ExtraHop Networks, Inc.
520 Pike Street, Suite 1700
Seattle, WA 98101
T 877-333-9872
F 206-274-6393

Customer Support
support@extrahop.com
877-333-9872 (US)
+44 (0)845 5199150 (EMEA)

© 2013 ExtraHop Networks, Inc. All rights reserved.
