The SOAR Buyer's Guide - Splunk


BUYER’S GUIDE

The SOAR Buyer’s Guide
The who, what, where, when and why of buying an analytics-driven security solution

Table of Contents
1. Introduction
   a. Defining Security Orchestration, Automation and Response
   b. Identifying Security Use Cases
2. Evaluation Criteria
   a. Core Capabilities
   b. Platform Attributes
   c. Business Considerations
3. Conclusion
4. Evaluation Checklist

1. Introduction

Investing in a Security Orchestration, Automation and Response (SOAR) platform is a wise and highly strategic decision. After all, choosing the platform to build your security operations center (SOC) on is arguably more important than choosing any point security product. The SOAR platform you choose will become a central part of your security infrastructure, effectively acting as the operating system for your security investments.

This guide aims to outline the important criteria you should consider when evaluating SOAR platforms.

Defining Security Orchestration, Automation and Response

Although automation has been popular in other software segments for years, including salesforce automation, marketing automation, HR automation and IT automation, security teams are just beginning to realize the benefits of automation and orchestration. This realization, in turn, is driving buyer interest in SOAR platforms. As a result, many security vendors are pivoting toward the SOAR category to capture mindshare, using existing offerings that originated in adjacent market segments. The unfortunate effect is that as hype around the SOAR segment builds, an ever-increasing number of vendors with differing perspectives blur market definitions and make it difficult to compare offerings.

To add clarity, we propose the following definitions:

Security Orchestration
Security orchestration is the machine-based coordination of a series of interdependent security actions across a complex infrastructure.

Security Automation
Security automation is the machine-based execution of security actions.

Security Response
Security response is the policy-based coordination of human and machine-based activities for event, case and incident workflows.

We will use these definitions throughout this guide to examine which capabilities, attributes and considerations work together to form a best-in-class SOAR platform.

Identifying Security Use Cases

Teams commonly identify security use cases to implement with SOAR platforms. The use cases are modeled after existing manual workflows and usually represent the team's greatest operational pain points. The workflows commonly contain many manual tasks and require working across multiple products to complete.

In addition to pain points, it’s important to map out all potential use cases before beginning your evaluation. This effort should involve key stakeholders across your security operations team.
Identifying comprehensive use cases, even if they are not immediately implemented, is important to help ensure that the platform you choose today will also support your needs in the future.

Below is a selection of security use cases spanning investigation, enrichment, containment and remediation categories:

Alert Triage
The objective with alert triage is to validate and prioritize incoming alerts. Use cases that focus on triaging inbound alerts involve enriching events with additional context. They may also include logic to eliminate high-confidence false positive alerts from further processing.

Incident Response
Incident response use cases can vary greatly depending on the type of incident. For example, responding to a phishing attempt is quite different from responding to a successful ransomware attack.

Indicator of Compromise (IOC) Hunting
By automating IOC hunting, teams can fully leverage the threat intelligence they receive instead of limiting the IOCs they hunt for due to resource constraints. They might also implement intelligence scoring to help decide which threat intelligence sources to use.

Vulnerability Management
Automating the cycle of identifying, classifying, remediating and mitigating vulnerabilities yields not only greater team efficiency, but also more consistent results by ensuring that the process is performed the same way every time.

Network Access Control (NAC)
SOAR platforms can augment dynamic access control strategies. One example is integrating a detection system that previously was not part of the NAC decision-making logic.
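
Returning to the alert triage use case described above, the following is a minimal sketch of the enrich, filter and prioritize steps. It is not taken from any SOAR product: the `Alert` class, the `enrich` and `triage` functions, and the hard-coded reputation sets are all hypothetical stand-ins for whatever enrichment services a real playbook would call.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical reputation data; a real playbook would query enrichment services.
KNOWN_BENIGN_IPS = {"10.0.0.5"}
KNOWN_BAD_IPS = {"203.0.113.7"}


@dataclass
class Alert:
    alert_id: str
    source_ip: str
    severity: int  # 1 (low) to 5 (high), as reported by the detection source
    context: dict = field(default_factory=dict)


def enrich(alert: Alert) -> Alert:
    """Attach extra context to the alert (here, a simple reputation verdict)."""
    if alert.source_ip in KNOWN_BAD_IPS:
        alert.context["reputation"] = "malicious"
    elif alert.source_ip in KNOWN_BENIGN_IPS:
        alert.context["reputation"] = "benign"
    else:
        alert.context["reputation"] = "unknown"
    return alert


def triage(alerts: list[Alert]) -> list[Alert]:
    """Enrich alerts, drop high-confidence false positives, sort by priority."""
    enriched = [enrich(a) for a in alerts]
    kept = [a for a in enriched if a.context["reputation"] != "benign"]
    # A malicious verdict outranks raw severity; severity breaks ties.
    return sorted(
        kept,
        key=lambda a: (a.context["reputation"] == "malicious", a.severity),
        reverse=True,
    )


queue = triage([
    Alert("A-1", "10.0.0.5", 4),      # known benign source: filtered out
    Alert("A-2", "198.51.100.9", 3),  # unknown source: kept
    Alert("A-3", "203.0.113.7", 2),   # known bad source: moved to the front
])
print([a.alert_id for a in queue])
```

The ordering rule here (reputation first, then severity) is only one possible policy; the point is that the criteria are codified once and applied the same way to every inbound alert.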

User Management
Ensuring that users are enabled and disabled accurately, rapidly and systematically can reduce the chance that a user account is used maliciously by a threat actor.

Penetration Testing
Activities like asset discovery, classification and target prioritization can be automated, thus increasing the productivity of the pen testing team.

Intelligence Sharing
Organizations that have intelligence sharing initiatives can greatly benefit from an automation-assisted playbook. Automation can also increase an analyst’s productivity and provide time-sensitive information back to a community faster than manual processes can.

Other Use Cases
Other automation use case candidates come from well-known scenarios where security operations teams can codify the criteria used to automatically make decisions and take appropriate actions.

2. Evaluation Criteria

We suggest segmenting SOAR evaluation criteria into at least three sections: core capabilities, platform attributes and business considerations. Core capabilities are typically functional in nature and easily identified in a platform. Platform attributes, such as architectural characteristics, are more subtle but still influence platform selection. Business considerations round out the product offering and include value-add services that a company offers to augment its core technology, like training and support.

Core Capabilities

Core capabilities can be thought of as the basic parts of a SOAR platform. We will enumerate each capability, or component, and offer considerations to assist with evaluation and selection.

Orchestrator
The orchestrator should direct and oversee all activities relating to a given security scenario from beginning to end. In all situations, it is critical that the orchestrator deliver consistently predictable results and optimal utilization of available resources.

Data Ingestion
Security automation and orchestration begins with security data ingestion. An orchestrator should be able to ingest security data from any data source and in any format. It should be able to receive data that is pushed to the platform, and it must be able to poll data sources and pull data into the platform. If unstructured data is ingested, the orchestrator should allow a user to supply a data handler to interpret the data and make it usable by the SOAR platform. The orchestrator should also be capable of ingesting data from multiple sources and have the option to keep the ingested data logically separated.

Decision Making
Users should be able to select the automation playbooks that are applied to a data source. For example, an email phishing playbook might be applied to an email-based ingestion source while a malware investigation playbook might be applied to a SIEM alert ingestion source.
This decision-making step is closely related to alert management capabilities, described later.

Task Execution
It is typically the role of the orchestrator to dispatch automation tasks from its queue at the appropriate and optimal time, passing them to the automation engine for execution.

Human Supervision
An orchestrator should effectively balance machine-based automation with necessary human supervision. There are three common scenarios where a human is required: when approval by an asset owner is needed to execute a security action on a target, when review by an analyst is required to ensure that security is balanced with business continuity, and when an analyst needs to augment codified decision-making logic (for example, when an error occurs).

Data Management
An orchestrator should also ensure that the output data from one action is properly parsed, normalized and structured so that future actions can make use of it. The orchestrator should also support caching relevant data when required to avoid taxing other resources.

Fault Tolerance
A SOAR platform regularly interacts with many discrete products and services to execute automation playbooks. An orchestrator must expect that the availability of products and services is not always guaranteed. Access to external services can be interrupted or broken. In these situations, an orchestrator should perform predictably, recovering and resuming operation gracefully as configured.

Automation Engine
The automation engine is the workhorse of most SOAR platforms, receiving actions, or tasks, from the orchestrator and reliably executing them. Because automation tasks run independently and largely without human interaction, attributes such as platform scalability and extensibility are important criteria to consider.

Scalability
It is important to understand how the automation engine will scale both vertically and horizontally. It is expected that a user will be automating more use cases over time.
With each additional use case, there will be additional processing load on the automation engine. The automation engine should be designed in a way that allows for vertical scaling (for example, increasing CPU and RAM resources) and horizontal scaling (for example, increasing server instances) to increase performance and protect the automation return on investment (ROI).
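
To make the orchestrator's Decision Making and Fault Tolerance considerations above more concrete, here is a small sketch under stated assumptions. The source-to-playbook mapping, the `ServiceUnavailable` exception and both function names are hypothetical, and exponential backoff is just one common way to "recover and resume gracefully"; an actual platform may behave differently.

```python
import time

# Hypothetical mapping from ingestion source to playbook name.
PLAYBOOKS_BY_SOURCE = {
    "email": "phishing_investigation",
    "siem": "malware_investigation",
}


class ServiceUnavailable(Exception):
    """Raised by an action when an external product cannot be reached."""


def select_playbook(source: str) -> str:
    """Pick the playbook configured for a data source, else fall back to manual review."""
    return PLAYBOOKS_BY_SOURCE.get(source, "manual_review")


def run_with_retry(action, attempts: int = 3, base_delay: float = 0.01):
    """Run an action, retrying with exponential backoff if the service is down.

    Returns ("ok", result) on success, or ("needs_human", error message) once
    retries are exhausted, so that an analyst can be brought into the loop.
    """
    last_error = ""
    for attempt in range(attempts):
        try:
            return "ok", action()
        except ServiceUnavailable as exc:
            last_error = str(exc)
            time.sleep(base_delay * (2 ** attempt))
    return "needs_human", last_error


calls = {"count": 0}


def flaky_reputation_lookup():
    # Simulated service that fails twice, then recovers.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ServiceUnavailable("reputation service timed out")
    return {"verdict": "malicious"}


def always_down():
    raise ServiceUnavailable("sandbox unreachable")


print(select_playbook("email"))
print(run_with_retry(flaky_reputation_lookup))
print(run_with_retry(always_down))
```

Returning a `needs_human` status rather than raising is a design choice made for this sketch: it mirrors the idea that a failed dependency should hand control to an analyst instead of silently stopping the playbook.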

Extensibility
Security evolves quickly, so the automation engine should support new functions without major re-engineering. The automation engine should also be able to adapt to the unique capabilities of its environment.

Alert Management
Just after data ingestion, discussed earlier, an alert management capability in a SOAR platform should queue and prioritize inbound alerts to help analysts perform triage more efficiently. Alert investigations might be performed using manual or automated execution of actions to yield the highest levels of triage productivity and accuracy. The interface of an alert management capability should be built in a way that enables all aspects of a security alert to be rapidly consumed and efficiently acted upon. The interface should also surface the right information at the right time, so that an analyst does not have to search extensively or switch between contexts.

Alert Details
The technical attributes of a security alert should be organized in a way that allows an analyst to quickly digest them to understand the security scenario. This includes an organized view of data such as IP addresses, domain names, file hashes, user names, email addresses and all other relevant data fields. Use of a standard format such as Common Event Format (CEF) or an equivalent is highly beneficial for data exchange.

Issuing Actions
When investigating an alert, a security analyst should be able to issue manual actions to the platform that make use of alert data. This includes investigative, containment, corrective or generic actions. The interface should allow a user to execute an action by selecting the data to operate on.
This behavior is sometimes called contextual action execution and enables pivoting analysis around newly discovered information.

As with manual action execution, an analyst should also be able to issue a collection of actions against an alert. This collection of actions is commonly referred to as a playbook.

Action Results
When manual or automated actions are taken against an alert, the results should not only be viewable and make sense to an analyst, but also make sense to the SOAR platform, which might use action results to make an automated decision. Action results should be available in a summary format (for example, a table view) as well as in a more comprehensive format (for example, JSON).

Activity Log
The platform should provide a comprehensive activity log that displays a record of all actions that have executed against an alert, whether they were initiated manually or via an automation playbook. Each action should display its results, including an indicator of action success or failure, making it clear whether the action fully executed.

Alert Status, Severity and Sensitivity
Every alert managed by the platform should include a status indicator (for example, new, open or closed), a severity indicator and a sensitivity indicator (for example, Traffic Light Protocol or TLP designations). Each indicator should be modifiable within the alert management interface, as well as from within a playbook.

Alert Collaboration
The interface should provide an area where analysts can collaborate, comment and provide miscellaneous information about an alert. It’s ideal for the record of this collaboration to be captured and organized along with all other alert data.

Case Management
Once alerts or events are confirmed and escalated, a case management component should drive a broader, cross-functional lifecycle from creation to resolution. This component should accommodate additional attributes of a case that differentiate it from an alert. Multiple alerts may have been confirmed, aggregated and escalated as a single case.
Alert management is usually technical, while case management commonly incorporates technical and non-technical steps into the process. Finally, cases tend to be lower in volume than alerts: many organizations receive hundreds or thousands of alerts per day, while cases tend to number in single digits per day.
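
The alert attributes discussed earlier (status, severity, sensitivity and an activity log with per-action success or failure) can be pictured as a simple record. The sketch below is illustrative only: the class and field names are hypothetical, the `TLP` enum lists only a subset of Traffic Light Protocol labels, and the `summary` method stands in for the table-style view of action results.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Status(Enum):
    NEW = "new"
    OPEN = "open"
    CLOSED = "closed"


class TLP(Enum):
    # Subset of Traffic Light Protocol labels, used as the sensitivity indicator.
    RED = "TLP:RED"
    AMBER = "TLP:AMBER"
    GREEN = "TLP:GREEN"


@dataclass
class ActivityEntry:
    timestamp: datetime
    action: str
    initiated_by: str  # "manual" or the name of a playbook
    success: bool
    result: dict       # full action result, e.g. parsed JSON


@dataclass
class Alert:
    alert_id: str
    status: Status = Status.NEW
    severity: str = "medium"
    sensitivity: TLP = TLP.AMBER
    activity_log: list = field(default_factory=list)

    def record_action(self, action, initiated_by, success, result):
        """Append one executed action, with its outcome, to the activity log."""
        self.activity_log.append(
            ActivityEntry(datetime.now(timezone.utc), action, initiated_by, success, result)
        )

    def summary(self):
        """Table-style view: one (action, initiator, outcome) row per action."""
        return [
            (e.action, e.initiated_by, "success" if e.success else "failed")
            for e in self.activity_log
        ]


alert = Alert("ALERT-42", severity="high", sensitivity=TLP.RED)
alert.status = Status.OPEN  # indicators stay modifiable by an analyst or a playbook
alert.record_action("file reputation", "manual", True, {"verdict": "unknown"})
alert.record_action("block ip", "phishing_playbook", False, {"error": "firewall unreachable"})
print(alert.summary())
```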

Case Data Organization
All data relating to a case should be aggregated by the case management component. Displaying the information in a single location enables users to efficiently consume it and avoids context switching.

Activity Auditing
Information additions or modifications, as well as state changes, are important details of a case. Any changes to a case should be logged in an audit trail and be exportable. Changes to a case might include:
- Adding data
- Modifying data
- Modifying a stage or task
- Adding files or notes
- Modifying files or notes
- Completing a task
- Any other activity or modification to the case

Adding Data to a Case
The case management interface should support attaching relevant technical data, such as the alert’s source data and action results, to the case. The interface should also support attaching relevant non-technical data such as notes, memos, emails, screenshots, recordings or any other arbitrary file with relevance to the case. Automated attachment of information to a case should also be possible from within a playbook.

Linking Cases to Alerts
During a case investigation, it is common to identify a piece of data that requires additional investigation or a scenario that requires issuing an immediate containment action. Therefore, if an analyst determines an action should be taken, the case management interface should seamlessly link the analyst to the alert management interface for the respective alert. From the alert management interface, additional actions can be executed, and changes to relevant data should be reflected in the case management interface.

Mapping to Existing Processes
Many organizations have developed standard operating procedures (SOPs) for incident response, emergency, disaster and other critical situations. The case management functionality should provide a user with the ability to define stages according to their process and save them as a template. A user should have the ability to break the SOP into multiple stages where each stage has one or more tasks, and each task can be assigned an owner. Additional contextual information associated with a task can be incorporated into the task description. Much like task management applications, tasks should be marked as closed when completed by the assignee. The interface should provide an indicator of progress for the case as well as the case status.

Playbook Management
Playbook management assists with the maintenance of SOPs. Ideally, this component should provide revision control and the ability to manage syndication of SOPs, in the form of playbooks, within an organization and potentially across a community.

Playbook Organization
Playbook management should allow for proper organization and grouping of playbooks. Users should be able to define their own grouping based on what works best for their organization. For example, you may choose to organize and group playbooks based on themes, sensitivity, organizational segment or asset types.

Custom Functions
Beyond the actions that are available to users out of the box, playbook creation should allow for customization and scalability in order to automate security processes effectively. Users should be able to write custom code blocks, or custom functions, within a playbook. These custom functions should be easily shareable across multiple playbooks while providing centralized code management and version control. This greatly accelerates playbook creation and execution, regardless of whether a user can code in Python.

Bulk Edits to Playbooks
The inner workings of each playbook are likely to be unique. Many playbooks do have commonalities at the administrative level, however.

A playbook management system should allow for the bulk editing of playbook settings such as:
- Ingestion sources
- Enabling/disabling automatic execution
- Enabling/disabling safe mode operation
- Enabling/disabling enhanced logging
- Setting playbook category grouping

Revision Control and Distribution
Integration with a version control system (VCS), such as Git, is strongly recommended for successful playbook management at scale. At the deployment level, leveraging a VCS enables the systematic distribution of playbooks across multiple systems. This is useful for syncing playbooks between a development system and a production system, or syncing across multiple production systems spanning multiple sites. At the development level, a VCS is important for tracking revision changes and having the option to roll back changes if necessary. A secondary benefit is that a developer can edit playbooks in the editor of their choice and easily synchronize the modified playbooks back into the platform.

Automation Editor
The automation editor is where an analyst or manager codifies their processes into automation playbooks. The predecessor to a visual automation editor is the basic source code editor. Editing automated playbooks exclusively in a source code editor made constructing playbooks a tedious and difficult process that was only achievable by a relatively small group of programmers. A visual automation editor allows security experts who may not be able to write playbooks at the source code level to construct comprehensive and sophisticated playbooks. The visual editor should adhere to the Business Process Model and Notation (BPMN) standard, a graphical notation for specifying business processes. BPMN supports intuitive symbols for business users, while providing technical users with an ability to represent highly complex processes.

User Interface Elements
The user interface elements should start with a canvas where visual playbooks can be constructed.
This part of the interface should provide an area where a desired action can be specified (for example, block ip or file reputation). Once an action is selected, there will likely be parameters required to properly configure the action. The interface should provide the ability to either manually enter the parameter or select the parameter from a list. Alert data and/or action result data might also be used as parameters.

The interface should also have a location where testing and debugging can take place, allowing the transition from edit mode to test mode to be seamless. Finally, a source code view should be accessible in the event a user wants to see the source code for the automated playbook.

Block-Based Representation of Code
Using blocks to represent meaningful steps in the automation platform allows users to write comprehensive, complex playbooks without touching the underlying source code. Blocks should be connected in a one-to-one, one-to-many or many-to-one fashion to dictate an order of execution. Visually, a user should be able to build a playbook that includes action executions, platform API calls, conditional statements (if/then) and branching statements that connect one playbook to another.

Inserting Humans Into the Decision Process
Supervised automation, in which a human is inserted into an automation sequence to approve, review or augment continued playbook execution, is a common requirement. The automation editor should support this human supervision step by inserting approval point(s) in a playbook adjacent to one or many security actions. A playbook author should have the ability to specify which individual(s) will be inserted in the automation loop, along with the type of notification or approval desired.
The playbook editor and underlying platform should also be able to define error handling logic that brings a human into the automation loop, such as when one or more reputation services are not available to support decision making.

Information Exchange of Action Results
The automation editor interface should allow new information resulting from preceding action executions to be available as inputs, or parameters, to downstream actions or decision blocks. The results of preceding actions should be accessible visually and selectable from a drop-down list when populating the parameters of a downstream action.
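
The ideas above (blocks with an order of execution, conditional logic, an approval point, and results from preceding actions feeding downstream blocks) can be sketched in a few lines. This is an assumption-laden illustration, not how any particular editor generates code: the `Block` class, the `run_playbook` function and the two example actions are hypothetical, and the playbook is a simple linear sequence rather than a full graph.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Block:
    name: str
    run: Callable[[dict, dict], dict]                   # (alert, upstream results) -> result
    condition: Optional[Callable[[dict], bool]] = None  # decision on upstream results
    needs_approval: bool = False                        # human approval point


def run_playbook(blocks, alert, approve):
    """Execute blocks in order, exposing earlier results to later blocks."""
    results = {}
    for block in blocks:
        if block.condition is not None and not block.condition(results):
            results[block.name] = {"status": "skipped"}
        elif block.needs_approval and not approve(block.name, alert):
            results[block.name] = {"status": "denied"}
        else:
            results[block.name] = {"status": "done", **block.run(alert, results)}
    return results


def file_reputation(alert, results):
    # Hypothetical lookup; a real action would call a reputation service.
    return {"score": 90 if alert["file_hash"] == "bad-hash" else 5}


def quarantine_host(alert, results):
    return {"host": alert["host"]}


playbook = [
    Block("file_reputation", file_reputation),
    Block(
        "quarantine_host",
        quarantine_host,
        # Downstream decision uses the result of the preceding action.
        condition=lambda r: r["file_reputation"]["score"] >= 80,
        needs_approval=True,
    ),
]

alert = {"host": "laptop-17", "file_hash": "bad-hash"}
approved = run_playbook(playbook, alert, approve=lambda name, a: True)
denied = run_playbook(playbook, alert, approve=lambda name, a: False)
benign = run_playbook(
    playbook, {"host": "laptop-18", "file_hash": "ok"}, approve=lambda name, a: True
)
print(approved["quarantine_host"])
print(denied["quarantine_host"])
print(benign["quarantine_host"])
```

The `approve` callback is where a notification to a named individual would go; here it is a plain function so the sketch stays self-contained.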

Access to Playbook Source Code
While constructing the playbook in a visual editor, the resulting playbook source code should be generated in real time and be accessible to the author. Some users may prefer to construct all or part of the playbook via a traditional source code method. The interface should allow the user to collapse the visual editor and replace it with a source code editor. Switching between visual and source code modes should be seamless and effortless.

Simultaneous Visual and Non-Visual Playbook Construction
When working with a playbook’s source code, the automation editor should allow the author to modify the playbook at the source code level and retain the ability to modify the playbook at the visual block level. There are times when the author needs to modify individual blocks (for example, actions or decision blocks) at the source code level for customizations beyond the scope of the visual editor. When these modifications are done, a user should still have the freedom to modify the playbook visually.

Built-In Testing and Debugging and Runtime Logging
It is standard for integrated development environments (IDEs) to provide execution and debug capabilities. In the case of an automation platform, a user should be able to execute the playbook against a security alert and observe the execution activity and results. Logging and error codes should be displayed in a debug window that can be shown alongside the visual block editor, or alongside the source code editor if the author prefers source code. The objective is to enable the author to quickly edit, test and debug playbooks within one interface.

Safe Mode
An automation editor should also provide a safe mode for new playbooks that need pre-production testing. This mode simulates the execution on automation targets without effecting change on them.
Once an author or other platform user has gained sufficient confidence in the playbook’s logic, safe mode can be disabled and the playbook can begin to function in the normal manner.

App Framework
The app framework provides an extensible interface for new integrations that connect the platform to any of the thousands of point products available in the security market today.

Open Ecosystem
A SOAR platform can lose its value over time without integrations to new market offerings. To ensure a predictable roadmap for app integrations, a platform should adopt an open ecosystem that allows anyone to develop integrations. This affords users autonomy and allows them to avoid vendor lock-in. Technologies can transition in and out without negatively impacting automated playbooks. New technologies must be quickly integrated into the platform without requiring any modification to the core platform. Lastly, users must have the control to pull in support for additional products without relying on the SOAR vendor for additional development.

App Development
App development is a key component of an open ecosystem, as it allows users to integrate with multiple different technologies to make their playbooks function. It’s critical that a SOAR solution streamline app development within the product itself so that users can view, test, extend and edit existing apps, as well as create entirely new apps, all from the SOAR platform UI. This greatly increases productivity, allows for customization to specific use cases, and ensures visibility into how an app functions.

Metrics & Reporting
Metrics and reporting are important to every automation platform, and SOAR platforms are no exception. Automation promises increased productivity and increased quality.
Metrics are critical to understanding the effectiveness of the automation platform and identifying where improvements can be made to increase ROI.

Flexible Dashboards
Metrics are specific to organizations and individuals. Because of this, users need the ability to organize their metrics in a way that makes the most sense for their organization. The SOAR platform should let users organize metric information in a highly customized way. This includes configuring the order in which information is presented on the dashboard, and specifying which metrics are displayed and which time window(s) they cover.

Performance Reporting
Automation is deployed to increase operations efficiency. It is critical to understand the quantitative performance gain and resource savings that automation provides and to have this information readily available via a dashboard.
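
As one hedged illustration of quantifying such gains, the sketch below computes a mean time to resolve (MTTR) from alert timestamps and a money-saved figure as FTE cost multiplied by FTEs gained. The function names and sample figures are hypothetical, and deriving FTEs gained from analyst hours saved divided by working hours per FTE is an assumption of this sketch, not a definition from the guide.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(alerts):
    """Average of (closed - opened) across closed alerts, or None if none are closed."""
    durations = [a["closed"] - a["opened"] for a in alerts if a.get("closed")]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


def money_saved(analyst_hours_saved, hours_per_fte, fte_cost):
    """Money saved = FTE cost x FTEs gained.

    Assumption: FTEs gained = analyst hours saved / working hours per FTE,
    with both measured over the same reporting period as the FTE cost.
    """
    ftes_gained = analyst_hours_saved / hours_per_fte
    return fte_cost * ftes_gained


# Made-up sample data for illustration only.
alerts = [
    {"opened": datetime(2024, 1, 1, 9, 0), "closed": datetime(2024, 1, 1, 10, 0)},
    {"opened": datetime(2024, 1, 1, 9, 0), "closed": datetime(2024, 1, 1, 12, 0)},
    {"opened": datetime(2024, 1, 1, 9, 0), "closed": None},  # still open, excluded
]
print(mean_time_to_resolve(alerts))      # 2:00:00
print(money_saved(520, 2080, 100_000))   # 25000.0
```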

Examples of key performance metrics that should be available on the platform:
- Mean time to resolve (MTTR)
- Mean dwell time (MDT), which is defined here as the period of time between a compromise (by a threat actor) and taking an appropriate response

To identify gaps in automation, as well as the effectiveness of tool integrations, the following example metrics should be provided by the automation platform:
- Alerts closed through automation (per hour, day, week, month or other time window)
- Analyst hours saved through automated execution
- Number of full-time equivalents (FTEs) gained through automated execution
- Average time saved per playbook run
- Money saved (FTE cost x FTEs gained)
- Most active app integrations
- Most active actions (manual and automated)
- Playbook execution time
- Action execution time

Security Effectiveness Reporting
Automation is also deployed to increase the security effectiveness and posture of the organization. Understanding the total number of security alerts managed, along with the pace at which they are being managed, is critical to understanding the effectiveness that automation provides.

Examples of key security effectiveness metrics that the platform should provide:
- MTTR and MDT (introduced above)
- Total number of open alerts
- Alerts opened per day (hour, week or month also appropriate)
- Alerts closed per day (hour, week or month also appropriate)
- Performance against service level agreements (SLAs)

App Integration and Playbook Performance
Understanding the most frequently invoked playbooks can help shed light on where further automation investments can be made. Ideally, playbook design should strive for the automated closure of false positive or high-confidence true positive alerts. In cases where automation is not closing the alert triage gap, playbook revisions may be necessary.
- Most active automated playbooks

Human Workload
While automation is intended to close the human resource gap, there are still cases where humans need to be involved in the day-to-day activity of a SOAR platform. These cases include where manual triage and other actions are required on an alert, or when human approvals are inserted into the playbook to achieve “supervised automation.” Understanding human workload can also help identify areas where further automation and tuning may be needed. The following example metrics should be provided by the automation platform to understand the human workload involved in the automation process:
- Alerts assigned to an individual
- Alerts closed by an individual
- Average approval time
- Number of out
