Build a Pro-Level Detection Engineering Strategy | SOC Success Pt. 2


    Summary

    In the second video of the SOC Success series by SANS Institute, the focus is on building a robust detection engineering strategy for a security operations center (SOC). The video emphasizes the importance of having accurate detection mechanisms in place through proper telemetry, threat intelligence, and rule management. It details the lifecycle management of detection rules and highlights the necessity of aligning these with frameworks like MITRE ATT&CK. By ensuring all proposed rules are meticulously reviewed and aligned with organizational threat intelligence, a SOC can maximize its detection capabilities and improve response times.

      Highlights

      • Effective SOC detection depends on balancing detection engineering and accurate rule sets.
      • A robust idea capture system helps prioritize rules based on threat intelligence.
      • Managing the rule lifecycle involves constant modification and alignment with changing network environments.
      • Rules must be thoroughly tested to reduce false positives and improve accuracy.
      • Aligning detection capabilities with MITRE ATT&CK assists in understanding tactical effectiveness.

      Key Takeaways

      • Detection engineering is crucial for a top-notch SOC setup.
      • Balancing false positives and negatives is key to effective rule management.
      • Lifecycle management of rules is essential for adaptation to changing environments.
      • Integration with frameworks like MITRE ATT&CK enhances detection alignment.
      • Proper training for engineers ensures accurate rule writing across various data types.

      Overview

      The video by SANS Institute provides insights into constructing a premier-level SOC with a focus on detection engineering. It begins by stressing the essential role of managing telemetry effectively to ensure precise detection. Collecting and analyzing the right data feeds is crucial in identifying and anticipating security threats within a network.

        A significant portion of the discussion revolves around rule lifecycle management: how SOC teams should prioritize and manage detection rules efficiently. By capturing ideas for new rules and refining existing ones, security teams can enhance their detection strategies and address vulnerabilities proactively. The guidance is to keep fine-tuning rules as networks and threats change.
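
As an illustration of the "rules to be created" backlog described above, here is a minimal sketch in Python. The video does not prescribe a tool or a scoring scheme; the `RuleIdea` class, its fields, and the 1 to 5 priority scale are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class RuleIdea:
    """One entry in a hypothetical 'rules to be created' backlog."""
    title: str
    submitted_by: str
    intel_priority: int  # assumed scale: 1 (low) to 5 (high), set from threat intel


def prioritize(backlog):
    """Return ideas ordered from highest to lowest threat-intel priority.

    sorted() is stable, so ideas with equal priority keep their submission order.
    The input list itself is left unchanged.
    """
    return sorted(backlog, key=lambda idea: idea.intel_priority, reverse=True)


backlog = [
    RuleIdea("New local admin account created", "analyst_a", 3),
    RuleIdea("DNS tunneling over TXT records", "analyst_b", 5),
    RuleIdea("Suspicious scheduled task creation", "engineer_c", 4),
]

for idea in prioritize(backlog):
    print(idea.intel_priority, idea.title)
```

The same ordering could just as well live in a spreadsheet or ticketing system; the point is only that the next rule to write is chosen by threat-intel priority rather than by whoever happens to be writing.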

          The video also highlights the importance of frameworks like MITRE ATT&CK, which act as a benchmark for aligning detection capabilities. It encourages integrating these frameworks to track and tackle various threats effectively. Overall, it calls for a disciplined approach to SOC management, ensuring that all aspects, from rule creation to alert triage, operate seamlessly.
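
The ATT&CK alignment described above can be made programmatic if every rule carries its technique IDs in a machine-readable field. The sketch below assumes rules have already been parsed into dictionaries with a Sigma-style `tags` list (for example `attack.t1053.005`); the sample rules, the `required` set, and both function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches Sigma-style technique tags such as "attack.t1053.005" or "attack.t1059".
TECHNIQUE_TAG = re.compile(r"^attack\.(t\d{4}(?:\.\d{3})?)$", re.IGNORECASE)


def coverage_by_technique(rules):
    """Map each ATT&CK technique ID to the titles of the rules tagged with it."""
    coverage = defaultdict(list)
    for rule in rules:
        for tag in rule.get("tags", []):
            match = TECHNIQUE_TAG.match(tag)
            if match:
                coverage[match.group(1).upper()].append(rule["title"])
    return dict(coverage)


def coverage_gaps(rules, required_techniques):
    """Return the required technique IDs that no rule is tagged with."""
    covered = coverage_by_technique(rules)
    return sorted(t for t in required_techniques if t not in covered)


# Hypothetical rule set; tactic tags like "attack.persistence" are ignored here.
rules = [
    {"title": "Scheduled task created via schtasks",
     "tags": ["attack.persistence", "attack.t1053.005"]},
    {"title": "Registry run key modified",
     "tags": ["attack.persistence", "attack.t1547.001"]},
    {"title": "Untagged legacy rule"},
]

# Techniques that threat intelligence says matter (an assumption for the example).
required = {"T1053.005", "T1547.001", "T1059.001"}

print(coverage_gaps(rules, required))
```

A tag only records that a rule claims to cover a technique; it says nothing about how well the rule detects it, so a report like this is a starting point for the coverage question, not a final answer.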

            Chapters

            • 00:00 - 00:30: Introduction The introduction highlights the continuation of a video series on building a high-quality Security Operations Center (SOC). It mentions that the previous video covered the first SOC function, data collection. The current video discusses the next function, detection, and encourages viewers to subscribe to the SANS YouTube channel for more content.
            • 00:30 - 01:00: Understanding the Detection Stage The 'Understanding the Detection Stage' chapter focuses on the key processes involved in the detection stage of cybersecurity. The main input to this stage is telemetry data from various sources within the environment, such as log files and packet captures. The objective is to accurately identify and document any potentially malicious events. It's crucial to balance the sensitivity of detection systems to avoid both false positives (incorrect alerts) and false negatives (missed threats). This requires careful calibration to ensure that the right data is captured and analyzed appropriately.
            • 01:00 - 01:30: Accurate Detection Engineering In this chapter, the focus is on 'Accurate Detection Engineering' which involves crafting a rule set that is balanced – neither too aggressive nor too lenient – to detect malicious activities effectively. Achieving this balance is termed as accurate detection engineering, which is a combination of generating precise rules and utilizing threat intelligence. The chapter underscores the complexity of this task, emphasizing the importance of getting it right to ensure optimal detection without over-alerting or missing threats.
            • 01:30 - 02:30: Managing Rule Lifecycles The chapter titled 'Managing Rule Lifecycles' discusses the process of managing rules throughout their lifecycle. It breaks down the management into several stages, starting with determining which rules should be created from all potential rules. This involves creating a backlog and prioritizing ideas. The rule creation stage follows, where roles such as detection engineers or SOC analysts are responsible for developing the rules.
            • 02:30 - 04:00: Rule Idea Capture and Prioritization The chapter discusses the process of rule creation and testing to ensure accurate identification without false positives. It emphasizes the need for managing these rules as environments change over time, such as variations in logs, network configurations, subnets, and protocols. The chapter highlights the importance of adapting rules to maintain their efficacy despite the evolving nature of IT environments.
            • 04:00 - 05:30: Rule Creation Process The chapter discusses the importance of managing the lifecycle of rule creation and modification. It emphasizes the need for tracking who made changes, when they happened, and what alterations were made to prevent issues with unidentified broken rules. Furthermore, the chapter highlights the necessity of having metrics for rule tracking and traceability aligned with frameworks like MITRE ATT&CK to ensure effective rule management. It introduces the concept of idea capture as a starting point for this process.
            • 05:30 - 06:30: Rule Tracking and Framework Alignment The chapter discusses the process of rule tracking and ensuring alignment with a framework for writing rules. It emphasizes the importance of having a system where team members can collect and store their ideas for new rules in a 'rules to be created' backlog. The chapter underscores the need to prioritize rule creation based on alignment with threats and organizational goals, acknowledging limitations in capacity to write every rule simultaneously.
            • 06:30 - 08:00: Questions for SOC Improvement This chapter discusses strategies for prioritizing tasks and ideas in a Security Operations Center (SOC) environment. It emphasizes the importance of formalized processes for rule building and idea management, suggesting the use of tools like a use case database, Excel sheets, OneNote, Confluence, or ticketing systems to maintain an organized approach. The key goal is to consistently address and mitigate the highest risks by keeping everyone involved constantly informed.
            • 08:00 - 10:00: Conclusion and Next Steps In the chapter titled 'Conclusion and Next Steps', the focus is on detection engineering and the strategic approach to creating rules. It discusses the importance of rule creation in detection systems and suggests having a robust process around this creation. This includes thorough testing of new ideas and ensuring rules undergo peer review to minimize false positives. The chapter underscores the need for quality assurance before deploying rules into production, setting the stage for effective detection engineering strategies moving forward.
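
The testing step mentioned in the chapters (checking a rule against both activity it should catch and activity it should ignore before it reaches production) can be sketched as follows. The rule, the event fields `image` and `command_line`, and the sample events are all hypothetical; a real rule would be written in the query language of the detection platform in use.

```python
def rule_encoded_powershell(event):
    """Hypothetical rule: flag PowerShell launched with an encoded command."""
    image = event.get("image", "").lower()
    command_line = event.get("command_line", "").lower()
    return image.endswith("powershell.exe") and "-encodedcommand" in command_line


def evaluate(rule, should_fire, should_not_fire):
    """Run a rule against labeled samples and collect both kinds of error."""
    return {
        "false_negatives": [e for e in should_fire if not rule(e)],
        "false_positives": [e for e in should_not_fire if rule(e)],
    }


POWERSHELL = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"

# Events the rule is expected to alert on (true positive tests).
malicious_samples = [
    {"image": POWERSHELL, "command_line": "powershell.exe -EncodedCommand SQBFAFgA"},
    {"image": POWERSHELL, "command_line": "powershell.exe -enc SQBFAFgA"},
]

# Events the rule is expected to stay silent on (false positive tests).
benign_samples = [
    {"image": POWERSHELL, "command_line": "powershell.exe -File backup.ps1"},
    {"image": r"C:\Windows\System32\cmd.exe", "command_line": "cmd.exe /c dir"},
]

report = evaluate(rule_encoded_powershell, malicious_samples, benign_samples)
print(len(report["false_negatives"]), "missed,",
      len(report["false_positives"]), "false alarms")
```

In this sketch the abbreviated `-enc` form slips past the rule, which is the kind of non-obvious miss that a second reviewer or a broader sample set is meant to surface before the rule goes live.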

            Build a Pro-Level Detection Engineering Strategy | SOC Success Pt. 2 Transcription

            • 00:00 - 00:30 Hello, we are back again for the second video in this series on building a top-tier security operations center. If you didn't get to see the first video, go back and check that out, but in the last video we talked about the first function of a SOC, and that is the ability to collect data and get it to your SOC. In this video we're going to be talking about the detection function, the next thing in the line, so let's get started. Remember, if you haven't subscribed to the SANS YouTube channel yet, make sure you do that so you see everything in this video series and all the other awesome stuff we post from our other instructors and Summit speakers and
            • 00:30 - 01:00 everything else. OK, so how does this detection stage work? The input to this stage is all the telemetry you have coming in from the environment, and that's going to be all your log files, packet captures, and the like. What we would like to do is be able to have evidence anytime something malicious occurs, so we need to have the right data coming in. Given that, we want to be able to uniquely call out "this is an event that might represent something bad," and we wouldn't want to overstep and call out false positives, things that shouldn't generate alerts, and we also don't want to miss anything, false
            • 01:00 - 01:30 negatives. So we have to walk the line between having a rule set that is just well tuned enough that it can detect the bad things but isn't going to be overzealous or under-alerting either. It's a difficult thing to do. We get this right by performing accurate detection engineering: generating the right rules that you need to detect all this malicious activity. But how do we get that done? Well, accurate detection engineering is really a combination of two different things. The first thing we talked about in the first video in the series, and that is just having threat intel to know what you should be looking
            • 01:30 - 02:00 for in the first place. The other thing is being able to manage those rules throughout their life cycle. What does management of those rules mean? Well, we can break that down into a number of different stages. The first thing we have to consider is, of all of the rules in the world we might write, which ones are we going to write? We need to create some kind of backlog: when everyone has a good idea, where are we putting it, and how are we prioritizing which rules are being created in the first place? Beyond that, we have to have the rule creation stage, where we're going to have a detection engineer or a SOC analyst, depending on
            • 02:00 - 02:30 how you do it, actually write those rules and test them and make sure they're going to identify the right thing, not identify the wrong thing, not generate false positives and all sorts of junk that makes our life more difficult. Then, after those rules go into production, we have to manage how they are modified over time. Your environment's going to be changing: your logs are going to change, and your network might look different over time, with different subnets and different protocols being used. So although you wrote a rule that worked great at one point, it may
            • 02:30 - 03:00 need to be tweaked over time, and when those tweaks happen you need to know who made those changes, when they made those changes, and what those changes were, because you do not want to have a rule that is broken and no one knew it was broken. So that's management of the life cycle, not only of the rule's creation but also as it changes over time. On top of that, you need to have some kind of metrics around your rule tracking and traceability up to frameworks like MITRE ATT&CK. Those are the other things that are going to be really, really important for getting this right. So let's start with idea capture. At any point in time, someone on your
            • 03:00 - 03:30 team might have a really good idea for a rule that needs to be written, and maybe they are or maybe they are not the person that writes that rule, but either way we don't want to let those ideas get away. So the first thing you want to think about is: do you have a place where everyone can collect all of their good ideas, one that acts sort of as a "rules to be created" backlog? And then within that backlog, how can you start to prioritize what is the most important rule to be writing at any given time? You probably can't constantly write every single rule in the list, so you want to think about how these things align with our threat intel
            • 03:30 - 04:00 and which one we should prioritize at any given moment to be developed, to make sure that we are constantly crossing off the things that present the highest risk to the SOC. One of the ways of doing that is having some kind of formalized process for building rules, whether that's a tool like a use case database that has a way to queue up a list of good ideas or any other system you want to come up with. It can be as simple as an Excel sheet or OneNote, or something much more advanced like Confluence or some kind of ticketing system that does this for you. No matter what it is, you always want to make sure everyone is constantly aware of what is
            • 04:00 - 04:30 the best thing I can be doing right now, and in the realm of detection engineering that's what rule I should be writing next. The next is rule creation. Do you have some kind of process wrapped around rule creation such that a new idea that is going to be written is going to be thoroughly tested? Do you need to have people doing some kind of peer review or something like that to make sure the rules that you're generating are not going to be setting off false positives? And then once you have a rule that conceivably is high quality enough to put into production, how are you going to introduce
            • 04:30 - 05:00 that into the rest of the rule set and then take a snapshot and say "this is where the rule set was before, this is where the rule set is now after this introduction of a new rule," in case you need to revert back? We may think everything was going to be good, but sometimes it's not, so the ability to take those snapshots can be a really great way of doing that. How might we go about that? If you're managing your rules as a set of text files, that can be as simple as GitHub, or maybe you have a tool that allows you to do this within the UI. No matter how you get it done, it's something that's absolutely critical for every SOC team to be able
            • 05:00 - 05:30 to do. So what about rule tracking and traceability up to frameworks like MITRE ATT&CK? A lot of teams now are aligning capabilities of the SOC with the things that are listed in MITRE ATT&CK, the various tactics and techniques. But can you press a button and simply get an answer to "of all the tactics and techniques we need to be able to detect, which ones can we detect?" Well, how do we make that happen? Because ideally you do want to be able to do that. The short answer to that is you should be able to track, with all of the rules that you have put in place, which specific tactics
            • 05:30 - 06:00 and techniques they align with. If you can do that in a way that can be programmatically accessed across the entire rule set, and then you can pull all that information out, then it should be fairly easy to extract that kind of information. And again, if you use some kind of formalized format for your rule set, let's say Sigma rules for example, there is a field there that allows you to type in the alignment to the MITRE ATT&CK framework and say this is a rule that catches a specific persistence mechanism, and then when you go to pull it, you simply pull it out of that
            • 06:00 - 06:30 field and then note it down. If you have a way of introducing new rules from a backlog and prioritizing which ones should be created, doing thorough testing when they're introduced to make sure they're not going to make a slew of false positives or miss things in a non-obvious way, and then if you can track the changes to those rules, when those changes were made, and who made those changes over time, you're in a great place. The rest of it is really just managing that list and seeing: are we catching, in totality, the group of things we need to be able to catch? Those are the main questions we want to be
            • 06:30 - 07:00 able to answer. So what we're looking for in a top-tier SOC, in a group that is trying to maximize their capability in this one detection function, is a controlled way of capturing all the good ideas for new rules, introducing those new rules to the production data set, and then managing those rules throughout their life cycle as they change: who changed them, when did they change, how did they change, all those sorts of things. And then, in any given moment, being able to look at the whole set of rules and answer: do you have the things that you need to have based on what your
            • 07:00 - 07:30 threat intelligence tells you is going to be a priority? If you can get all of those things done and you have some process and metrics wrapped around that, you're in a great position to always be capturing and catching the attacks that you should be catching. So to give you a list of questions to consider, specifically aligned with the contents in this video, here are some things that you can take back to your SOC and make sure you have a solid answer for. Number one: how does your SOC capture the collective list of ideas that your detection engineers or your SOC analysts may want to write? How do you prioritize that list for development if
            • 07:30 - 08:00 there are more things than you can handle at any given moment? Is it tied to threat intelligence, or is it just the whim of whoever happens to be writing the rule? Do your engineers have the right sets of data from your collection function to be able to even write the rules in the first place? Are you getting the right kind of data to be able to write a rule that relates to any kind of telemetry that you may receive? In other words, are you getting all the logs from all your on-prem systems and all the stuff happening in the cloud? Are you getting metadata or actual packet captures from all portions of your network, on-prem and
            • 08:00 - 08:30 in the cloud? Can you look at individual files if you're trying to write YARA rules or something that is specifically looking for the contents, byte strings, things like that, in a file? What about email? Do you have the ability to write targeted detections for email? Do your engineers have the proper training to write rules for all of these different environments? Rules based on logs are going to be very different than rules based on network traffic, which are going to be very different from rules based on, let's say, YARA looking for specific contents of files, and your detection engineers or SOC analysts or whoever is
            • 08:30 - 09:00 writing these rules needs to be specifically trained in all of those languages if you want to be effective at catching something in that sort of content. What is your process for testing a rule? Are you doing both true positive and false positive testing, and ideally having another set of eyes look at that new rule before it goes into production to make sure there aren't any simple mistakes in it? If not, you might want to introduce a process around that. And then finally, how much context is available with each alert as it fires? If you know the analysts are going to have to dig up additional data to verify whether
            • 09:00 - 09:30 something is a true or a false positive, can you automatically produce that information, attach it to the alert, and present it to the analyst right away so that they don't have to go dig and spend extra time to get it? If so, the triage step, which is our next function in the SOC and which will be covered in our next video, is going to be much, much easier. These are the kinds of questions we need to be thinking about when we're considering detection engineering. Now, I know this was a very short video, but hopefully we've given you some things to think about here. This is incredibly important: if you do not detect that the
            • 09:30 - 10:00 attack is happening as early as possible in the kill chain, you're going to have an attacker that is continuing to run through your environment, making incident response that much harder. So collection of the right data leads to, hopefully, accurate detection for your SOC, and the output of the detection function is going to be alerts that are going to go to the triage queue. In the next video we're going to talk about how to make sure the triage function goes as smoothly as possible, so an analyst can quickly say "there's a true positive" and jump into action with very little thought and very little pause. So stick
            • 10:00 - 10:30 with us for the next video in this series, and hope to see you then. Thank you for watching.
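
The last question in the transcript, whether context can be attached to an alert automatically so the analyst does not have to dig for it, might look like the sketch below. The alert fields, the `asset_inventory` lookup table, and the `recent_logons` list are hypothetical stand-ins for whatever data sources a given SOC actually has.

```python
def enrich_alert(alert, asset_inventory, recent_logons):
    """Return a copy of the alert with host context attached for triage.

    The original alert dictionary is not modified.
    """
    host = alert["host"]
    enriched = dict(alert)
    enriched["context"] = {
        # Who owns the machine and how critical it is, if known.
        "asset": asset_inventory.get(
            host, {"owner": "unknown", "criticality": "unknown"}
        ),
        # Logons seen on the same host, so the analyst can see who was active.
        "recent_logons": [l for l in recent_logons if l["host"] == host],
    }
    return enriched


# Hypothetical lookup data.
asset_inventory = {
    "fin-ws-042": {"owner": "finance", "criticality": "high"},
}
recent_logons = [
    {"host": "fin-ws-042", "user": "jdoe", "time": "2024-05-01T09:12:00Z"},
    {"host": "hr-ws-007", "user": "asmith", "time": "2024-05-01T09:15:00Z"},
]

alert = {"rule": "Encoded PowerShell command", "host": "fin-ws-042"}
enriched = enrich_alert(alert, asset_inventory, recent_logons)
print(enriched["context"]["asset"]["criticality"])
```

Which context is worth attaching depends on what analysts actually look up during triage; the fields here are only examples.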