Meet Scruff, Security's New AI Teammate (Custom Agent)

TL;DR

Scruff triages alerts by triggering when a new alert page is created in Notion’s alerts database.


Briefing

Notion’s security team built Scruff, a custom AI “teammate” designed to speed up alert triage without ever making the final call on whether an incident is a true positive or false positive. The core idea is to turn hand-written runbooks and security-tool context into evidence-driven investigation support—so responders can move faster between systems and ask better follow-up questions.

Scruff triggers when a new alert page is created in Notion’s alerts database. From there, it pulls the alert’s existing context—severity, event details, actor and source information, plus a linked runbook written by the detection and response team—and runs an investigation aligned to the runbook’s steps. Crucially, Scruff’s job is not to decide “this is real” or “this is noise.” Instead, it frames what evidence points toward concern versus legitimacy and surfaces targeted questions that help a human responder explore the next best leads.
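The flow described above (read the alert, walk the runbook, sort evidence, leave the verdict to a human) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Alert` and `TriageReport` classes, the `triage` function, the runbook step names, and the lowercase actor handle are all hypothetical and are not taken from Scruff's actual configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    # Hypothetical shape; the source only says an alert carries context
    # such as severity, actor, source, and a linked runbook.
    title: str
    actor: str
    runbook_steps: list[str]


@dataclass
class TriageReport:
    # No verdict field on purpose: the true/false-positive call stays human.
    concern: list[str] = field(default_factory=list)
    legitimacy: list[str] = field(default_factory=list)
    questions: list[str] = field(default_factory=list)


def triage(alert: Alert, findings: dict[str, tuple[str, str]]) -> TriageReport:
    """Walk the runbook in order and file each finding under its leaning.

    `findings` maps a runbook step to (leaning, note), where leaning is
    "concern" or "legitimacy". Steps without a finding become questions.
    """
    report = TriageReport()
    for step in alert.runbook_steps:
        if step not in findings:
            report.questions.append(f"Unresolved runbook step: {step}")
            continue
        leaning, note = findings[step]
        if leaning == "concern":
            report.concern.append(note)
        else:
            report.legitimacy.append(note)
    return report


# Illustrative input loosely modeled on the example alert; step names are made up.
alert = Alert(
    title="AWS IAM Identity Center manual modifications",
    actor="zach",
    runbook_steps=[
        "check for a pre-announced change",
        "check the actor's role and history",
        "confirm intent with the actor",
    ],
)
findings = {
    "check for a pre-announced change": (
        "concern",
        "No pre-announced change documentation found in Slack search",
    ),
    "check the actor's role and history": (
        "legitimacy",
        "Actor is a confirmed security team member with prior admin work",
    ),
}
report = triage(alert, findings)
```

Leaving a verdict field out of the report type is one simple way to encode the "no final call" rule in the data structure itself, so it does not depend on the prompt alone.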

A concrete example is the alert titled “AWS IAM Identity Center manual modifications.” In Scruff’s investigation output, responders see an event timeline and supporting findings drawn from connected systems. One highlighted result: Slack search turned up no pre-announced change documentation. The workflow also supports collaboration: because the alert lives as a page in Notion, team members can add comments and context directly on that page. In the example, a security teammate named Zach is shown as a confirmed security team member with a history of legitimate AWS IAM Identity Center administrative work, and Zach comments that the alert is a compliment, context that can shift how a responder interprets the event.

Under the hood, Scruff is configured through an instructions page that defines its identity, strengths, and operating principles. The setup emphasizes critical thinking and evidence focus: gather comprehensive context, question assumptions, and avoid making determinations about true/false positives. Scruff is also wired to multiple external security tools via MCP servers, enabling cross-system searching during triage. The transcript lists connections to Slack, CrowdStrike, Wiz, and Scanner, reflecting the practical need to correlate signals across disparate platforms.
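To make the cross-system searching concrete, here is a minimal sketch of fanning one query out to several tools. It does not use a real MCP client: each connector is a plain Python function standing in for an MCP-backed search, and the names `search_all`, `slack_stub`, `scanner_stub`, and `wiz_stub`, along with the returned strings, are placeholders invented for illustration.

```python
from typing import Callable

# A connector takes a query string and returns matching snippets.
# In a real setup this would wrap an MCP tool call; here it is a stub.
Connector = Callable[[str], list[str]]


def search_all(
    query: str, connectors: dict[str, Connector]
) -> tuple[dict[str, list[str]], dict[str, str]]:
    """Run the same query against every connector.

    Returns (hits, errors). Empty hit lists are kept rather than dropped,
    because "nothing found" can itself be evidence, as with the Slack
    search that found no pre-announced change.
    """
    hits: dict[str, list[str]] = {}
    errors: dict[str, str] = {}
    for name, search in connectors.items():
        try:
            hits[name] = search(query)
        except Exception as exc:  # one failing tool should not stop triage
            errors[name] = str(exc)
    return hits, errors


def slack_stub(query: str) -> list[str]:
    return []  # placeholder: no matching announcement


def scanner_stub(query: str) -> list[str]:
    return [f"placeholder log line matching '{query}'"]


def wiz_stub(query: str) -> list[str]:
    raise TimeoutError("stub timeout")  # placeholder failure


hits, errors = search_all(
    "identity center change",
    {"slack": slack_stub, "scanner": scanner_stub, "wiz": wiz_stub},
)
empty_sources = sorted(name for name, found in hits.items() if not found)
```

Keeping errors separate from empty results matters for evidence framing: a tool that timed out says nothing about the alert, while a tool that searched and found nothing does.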

Scruff also maintains a “memories” page that acts like a running journal of what worked and what didn’t. Each time Scruff runs, it records tuning recommendations—such as adding a suppression for a specific behavior, deduplicating an alert type, or switching to a different MCP tool next time. Over repeated runs, these memory entries reduce friction and improve future investigations, effectively learning from earlier missteps in a straightforward, operationally useful way.
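The memories mechanism can be pictured as an append-only journal that is consulted before the next run on the same alert type. The sketch below assumes that read-before-run behavior, which the source implies but does not spell out. The `MemoryEntry` and `Memories` names, the `kind` labels, and the example notes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:
    alert_type: str
    kind: str  # assumed labels: "suppression", "dedup", or "tool_change"
    note: str


class Memories:
    """Append-only journal of tuning recommendations."""

    def __init__(self) -> None:
        self._entries: list[MemoryEntry] = []

    def record(self, entry: MemoryEntry) -> None:
        # Called at the end of a run; entries are never edited in place.
        self._entries.append(entry)

    def recall(self, alert_type: str) -> list[MemoryEntry]:
        # Called at the start of a run to surface earlier lessons.
        return [e for e in self._entries if e.alert_type == alert_type]


memories = Memories()
memories.record(
    MemoryEntry("identity-center-change", "tool_change",
                "Try a different MCP tool for change announcements next time")
)
memories.record(
    MemoryEntry("other-alert-type", "dedup",
                "Deduplicate repeated alerts of this type")
)
recalled = memories.recall("identity-center-change")
```

Because these entries are recommendations rather than automatic changes, a person can still review a suggested suppression or deduplication rule before it is applied.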

For builders, the advice is to start small: don’t aim to automate everything. If the system can reliably handle one narrow triage improvement—backed by clear instructions, runbooks, and MCP servers that provide high-quality data—it can meaningfully reduce responder workload while keeping final judgment in human hands.

Cornell Notes

Scruff is Notion’s custom AI agent for security alert triage that accelerates investigation while keeping humans in control of the final verdict. When a new alert page is created in the alerts database, Scruff reads the linked runbook and produces evidence-focused findings and follow-up questions, but it never decides whether an alert is a true or false positive. Its setup connects to multiple security tools through MCP servers (including Slack, CrowdStrike, Wiz, and Scanner) so it can search across systems during triage. Scruff also writes “memories” after each run, recording tuning recommendations like suppressions, deduplication, and tool-selection changes to improve future performance. The approach is practical: start small, with clear runbooks and reliable data sources.

How does Scruff handle an incoming alert from start to finish, and what does it produce for responders?

Scruff triggers when a new page is created in the alerts database. It then reads the alert’s summary context (event name, severity, who did it, where it came from) and follows the linked runbook step-by-step. Instead of making a true-positive/false-positive decision, Scruff focuses on evidence: what signals point toward concern versus legitimacy, and what questions a responder should pursue next. In the “AWS IAM Identity Center manual modifications” example, Scruff outputs an event timeline and findings such as “no pre-announced change documentation found in Slack search,” which helps guide human investigation.

Why does Scruff avoid deciding true positives vs. false positives, and what replaces that decision?

The detection and response team keeps final judgment with a security engineer. Scruff’s role is to support evidence gathering and reasoning, not to certify outcomes. It replaces the verdict with structured investigation help: evidence framing (concern vs. legitimacy) and prompts for follow-up questions that stimulate creative next steps. This design keeps automation from overreaching while still speeding up triage.

What does Scruff’s tool connectivity enable during triage?

Scruff is connected to multiple external systems via MCP servers, allowing it to search and correlate across the security stack during an investigation. The transcript lists connections to Slack, CrowdStrike, Wiz, and Scanner. That cross-tool access matters because responders often need to bounce between systems to confirm whether an activity is expected, authorized, or suspicious.

What is the “memories” mechanism, and how does it improve Scruff over time?

After each run, Scruff writes a journal-like entry into a memories page. These entries are generated by Scruff and include tuning recommendations. Examples include adding a suppression for a specific behavior, deduplicating an alert type, or recommending a different MCP tool to use next time. Over repeated executions, these memory entries reduce repeated mistakes and make future triage smoother.

How is Scruff configured to behave like an evidence-focused teammate?

Scruff’s behavior is driven by an instructions page in Notion that defines its identity, what it’s good at, and operating principles. Those principles stress evidence focus and critical thinking: question assumptions, gather comprehensive context, and avoid making true-positive/false-positive determinations. This instruction layer is what aligns Scruff’s outputs with the team’s triage workflow.

What practical guidance does the team give for someone building a similar agent?

The advice is to start small. The system doesn’t need to solve every security triage problem; it only needs to improve one narrow piece of the workflow. Success depends on having clear instructions, simple runbooks for resolving alerts, and MCP servers that connect to high-quality data sources.

Review Questions

What specific outputs does Scruff generate during triage that help a responder without making the final verdict?
How do MCP-connected tools like Slack, CrowdStrike, Wiz, and Scanner change the quality of an investigation?
What kinds of tuning recommendations appear in Scruff’s memories, and how do they affect future runs?

Key Points

1. Scruff triages alerts by triggering when a new alert page is created in Notion’s alerts database.
2. Scruff follows linked, hand-written runbooks but never makes true-positive or false-positive determinations.
3. The agent’s value comes from evidence framing (concern vs. legitimacy) and follow-up questions that guide human investigation.
4. Cross-tool correlation is enabled through MCP server connections to systems such as Slack, CrowdStrike, Wiz, and Scanner.
5. Scruff maintains a memories page that records tuning recommendations like suppressions, deduplication, and tool-selection changes.
6. The system is designed to improve over time by learning from earlier investigation friction in a simple, operational way.
7. Builders are encouraged to start small with clear instructions, runbooks, and reliable MCP data sources.

Highlights

Scruff accelerates triage by turning runbooks and alert context into evidence-focused findings—while keeping the final true/false decision with a security engineer.

A key example (“AWS IAM Identity Center manual modifications”) shows Slack search results, such as missing pre-announced change documentation, used to steer investigation.

Scruff’s memories page functions as a practical tuning log, recommending suppressions, deduplication, and different MCP tool usage for future runs.

Topics

Security Alert Triage
Custom AI Agent
MCP Integrations
Runbooks
Evidence-Based Investigation