Zero Trust for AI Agents: Securing the New Insider Threat

AI agents are becoming active participants in enterprise operations—with real access, real permissions, and real security implications.

posted on
June 10, 2026
Transcript

Brian Moody: So, continuing in our series of SoundBytes, we've talked about zero trust multiple times. We've brought you guys our thoughts and feelings around zero trust. There is no zero trust product. We see so many people in the industry say, well, this is a zero trust product.  

Really?  I mean, zero trust is a foundation. Zero trust is a security model that you put in, and we've talked about that multiple times. AI agents. We've talked about the challenge that most organizations have. We've discussed the security associated with AI agents.  

Let's bring these two models together today in talking about how and why would we want to implement a zero trust model behind these AI agents.

 

Shahin Pirooz: Well, what's fundamentally happening is the shift is changing from AI being a copilot or a helper helping us to do stuff to autonomously doing stuff. And that autonomous doing stuff needs guardrails. And, we've said in previous SoundBytes that you have to think of these AI agents as your digital employees in comparison to your corporeal or carbon-based employees.  

What do you do when you release an employee in your environment? You figure out what they should have access to. You figure out what authorization levels they have. You figure out what sandbox they're allowed to play in, and you build observability around what they do to make sure they don't go astray. If we're treating agents like digital employees, we should build those same constructs around agents.

 

Brian Moody: But we're not doing that. And I think that's the biggest thing that we see in most environments is folks are rushing to implement AI. They're rushing to implement these agents into their environment. And in almost all cases, the agents are given unprecedented access to the environment. But I think the challenge is—

 

Shahin Pirooz: Because that's easy.

 

Brian Moody: Well, sure.

 

Shahin Pirooz: It's a lot easier to say, here's your global admin, go do stuff I need you to do.

 

Brian Moody: Turn you on and let you go do what you're going to do. But I think what we're finding now is that the capability that these agents have, and now as quickly as the hacker community, as quickly as the criminal community is grabbing onto this, what if, not what if, they already are exploiting.

 

Shahin Pirooz: When if.

 

Brian Moody: It's not when if, I think more than anything is it's, you know, and these articles that I've read is it's not about when this is coming. It's now. I mean, it is now, folks. And they are taking advantage of the capability of what these agents can do and how they can exploit them.  

So that's what we're going to cover a little bit today. So I want to start, Shahin, talk about—

 

Shahin Pirooz: Before you go there, real quick, there's two things to bear in mind. First of all, to answer your question, why bring these two things together, the concept of agents and zero trust?  

The answer is everything I talked about in the context of who they are, what they have access to, when they have access to it, and can you see it, or you do you know that what they're doing— those things are technically foundations of a zero trust methodology and process.  

And the reason zero trust is important here is zero trust is this notion of trust but verify. So I've often said that it's confusing because we keep saying zero trust, but a lot of people get stuck around those words, and the easiest way to think about zero trust is you need to create explicit controls, guidance, right, in this context, right of access, what authorization level, what they have, what area they can play in.  

So explicitly state those things rather than implicitly saying it because they have an account in your environment, they're in your network, they're whatever. Just because they're in the office doesn't mean they have access to everything. So zero trust, all it's saying is don't just let them in your house and let them have free rein. Tell them this is the guest room, this is where you get to go. And you can go to the kitchen and open the fridge. That's it.

 

Brian Moody: So ideas like, you know, a database agent should have access to the database, right?

 

Shahin Pirooz: But not every server.

 

Brian Moody: But not every server. And in some cases, just read. It might not have write capability.

 

Shahin Pirooz: Exactly right.

 

Brian Moody: Some of the AI agents that we use today and all these folks about analyzing my mail, that agent should have access to the mail, but not necessarily have access to send that mail. So again, this is an idea of trusting, but then controlling, verifying. So never trust, always verify. It's one of our key concepts today.

 

Shahin Pirooz: And fundamentally, if you think about what we're really talking about, it's that there's a confusion. The second point I wanted to make, there's a confusion about we keep saying these agents—the bad actors are gonna take advantage of these agents.  

The question is how. How are they possibly going to take control of this agent that's inside my network, behind my firewall, doing stuff in my sacred castle. It's all about prompt injection. That's the thing that people don't understand about agents and AI in general.  

And how does prompt injection happen? It can be an instruction in an email, it can be an instruction in a document, it could be an instruction that's put into a web page, it could be a SQL injection-based instruction. These instructions, when the agent picks them up, it goes and does what they asked. Because that's what the agent's designed to do.  

So prompt injection sounds like this thing that a lot of people wonder, like, how are they injecting prompts? And the answer is they're embedding instructions in documents that you wouldn't think about. So when you go and say, go process this document and summarize it for me, it gets halfway through it and there's an instruction in that document that says, go delete every file or exfiltrate it to this site.

 

Brian Moody: So that's an indirect prompt. There's also direct prompt, right? So one of the key things that I was reading is a Microsoft study that showed that the AI components, they don't know the difference between informational data coming in and an executable prompt or code that, to them, it's information. So, however, as you just stated, when the agent reads that or the agent sees that, that's instructional code that tells them to go do something.

 

Shahin Pirooz: Go do something. So that's how, that's when we say people, the bad actors will take advantage of these agents you're writing, we're talking about prompt injection.

 

Brian Moody: Well, there's quite a few different components that I wanted to get your opinion on. So what I do is Shahin says, hey, here's these topics. I go out and study up because I wanna pull the information out of you.  

So before we jump into what are the threats, I just want a quick comment from you again. So why, as we get into this topic, why do traditional security tools not work? Because they don't when we're addressing this particular security topic.

 

Shahin Pirooz: You can't—when we say treat them like employees, it implies this notion of create an identity for them. Make them, give that identity access to things. The issue is people get stuck in, okay, I'm going to create an Active Directory identity for them. That's not really how it works.  

Agents need to have a cryptographic identity, something that very critically identifies them and is difficult to break. So when you think about creating an identity for an agent, that is similar to creating an identity for a firewall or an identity for a router or Wi-Fi access point.  

That thing doesn't have a logon. That thing has an identity that can identify it on the network and validate what it's authorized to do. So there's this concept of least privilege in humans. I've been re-tagging it as least agency. And least agency is the same concept of least privilege for humans, but it's least privilege for agents.  

You need to give them an identity. You need to restrict what they have access to based on a need-to-know. Least privilege is this concept of need-to-know. So if they don't need to be going into the HR data to answer the questions for a specific thing, don't give them access to the HR systems. It's that simple. And then authorization. What are they authorized to do? Are they authorized to read? Update? Delete? Those are constructs that you need to think about.  

And the last layer is observability. Do you know what they did? Can you visibly go back and look and see what are the things they did, and therefore set up triggers and alarms that say their behavior is malicious or different or straying from the process?

 

Brian Moody: So I'll pose this to y'all watching is, think about a question in your head. That if something happened, if an agent in your environment went rogue, how quickly would you know about it?  

Would you know about it in 15 minutes, 5 minutes, an hour, or would you know about it at all? And that's back to that observability point that you just make, is we have to know what they're doing.

 

Shahin Pirooz: And the tag on to that is how long would it take you to stop it?

 

Brian Moody: Well, we'll get to that. So, we talk about prompt injection, direct and indirect. So two very critical aspects around exploiting agents. Tool poisoning, tool chaining. So this is something else that again, you got back to agent capability.  

Talk a little bit about tool poisoning and tool chaining, because for me, that was one of the things that I started going, wait a minute.

 

Shahin Pirooz: Yeah, so tool chaining is this concept of leveraging one tool to engage with another tool to do a thing with a third tool, and so on and so forth. And this is a very important concept in the context of sandboxing.  

So when we think about how do we sandbox access and give limited control to these agents we're creating, we need to think about not just what systems do they have access to, but what tools do they have access to. And the only way to get to a zero-trust model in this context for agents is to whitelist the tools that you want them to use and everything else is blacklisted.  

There is no other way to solve this problem because we talked about, it was, I don't know, 4 months ago when we were doing a SoundByte, we talked about how Microsoft all of a sudden dumped all these agents into everybody's Office 365 tenant and said, have at it. And they were third-party agents, not agents that, Microsoft wrote, but they basically opened the floodgates and any of your employees could have grabbed one of those agents and started messing around.  

So how do you restrict access? How do you control it? You not only sandbox—and think about sandboxing as segmentation—you're segmenting what they have access to, but then on top of that, whitelist the tools they're allowed to use to do their job. You're only allowed to use these two tools, and those two tools can't chain to other tools.

 

Brian Moody: So privilege abuse, another critical aspect of AI agent exploitation. So talk a little bit about that capability because I think that directly falls right back to tool capability, tool chain.

 

Shahin Pirooz: And it's the same notion of authorization level and least agency. So we hinted at it at the beginning of this.  

It's really easy to give this agent you created domain admin credentials and say, I know you don't need all these rights, but it'll be so much easier just for me to give you the rights just in case you need them someday. Problem is, if you take that prompt injection and everything else we've talked about, and now they give it an instruction that says go create a domain admin user, name it this, and give it access to everything. It's a domain admin; it can do that.  

So, when you talk about taking advantage of the rights of an agent, it very easily can be done, so you want to least agency; they can only have access and see the things they need.  

And sandboxing, limit the scope of what they can do. And whitelist the tools. You're not allowed to mess with Active Directory. You're not allowed to add users. You're not allowed to delete users. Now, there's some use cases where IT folks want to create agents to go and provision a user. But again, sandbox the heck out of what they're allowed to do and what rights they're allowed to give that user and if they're allowed to elevate privilege.  

The two biggest risks in prompt poisoning is simply giving the agent a task to do. Think of that as malware on the system. Taking privilege, abusing privilege that an agent has, that's the equivalent of elevation of privilege. Now you all of a sudden have, you're in the network, you have this malware, you're in the infrastructure, and you now have elevated privilege because this agent has elevated privilege to do the things you want to do.  

So it is traditional similar attack chain methodologies, but it's working at the speed of the electronics. It's not working at a human speed.

 

Brian Moody: And the human's not making the decision. So the other aspect is that something that you, wrap your head around is these agents are making these decisions and executing these actions in your environment.

 

Shahin Pirooz: The speed of light.

 

Brian Moody: At speed of light. And so that's where, again, the term that we continue to see is blast radius, right? Is just imagine, you know, a tool that has access to thousands of data repositories, petabytes of data, the impact that an agent at the speed of light can have on that environment. Without proper segmentation, without proper sandboxing and permissions is catastrophic.

 

Shahin Pirooz: And it's no different than a domain admin whose account gets compromised. Like if you think about the human to digital employee comparison, you have a domain admin whose credentials get compromised, your blast radius is the entire network.  

So similarly, you need to, when we talk about sandboxing, sandboxing in the context of the network is the same as micro-segmentation. Segment what the agent has access to, limit what rights the agent has within that scope.

 

Brian Moody: Right. So with that thought, think about where we've come. So think about before AI, a new patch would come out. How much time would a hacker community spend on trying to reverse engineer that patch to plug a hole? How much time would it take for them then to go back now, again, manually through code or what have you, to understand how they might exploit that?

 

Shahin Pirooz: There's a couple of different things here to tease apart. So in the context of a bad actor, the same vulnerabilities that we're aware of today were first found out by some bad actor because they poked around until they found a hole and then they published it to their friends. Their friends started using it. They started taking advantage of the network. So how much time... it's faster and faster. It's less and less time today than it was 30 years ago, for example.  

The average dwell time in a network is 222 days right now, and that 222 days means that bad actor is in your network for almost 6 months weeding around trying to find vulnerabilities that they can take advantage of. That is basically human reconnaissance. Now, take that 220 days and apply an AI agent that can be instructed from the outside with prompt injection, and part of that instruction is, now come to this IP address to get your instructions, and now it starts running.  

That happens at a fraction of the time of a human rolling around inside your network. So you're now no longer talking about 6 months of time to find this bad actor, You're talking minutes, maybe seconds.

 

Brian Moody: So this is a point I want to make, and I want to read a quote that came from the article that you referenced. AI-accelerated offense has compressed exploitation timelines, thus raising the security foundation floor. Here's key: foundation floor friction controls no longer qualify. Again, back to the point of it's happening so quickly. Patches are coming out and the hacker community within minutes to hours already knows how to exploit them.

 

Shahin Pirooz: Yep. This came out, this article that we're referring to, was a white paper or an e-document that Anthropic put out that talked about we need to think about zero trust for AI agents.  

And the context of this was more than just the AI agents. It comes in the wake of Mythos and everything else we've seen in the ecosystem. By the way, quick tip-off, we're gonna talk about Mythos next month. But all of this comes together to a point, to a head, that says we're finding vulnerabilities faster than humans can patch them anymore because they were slower to be found before.  

Now there's if you look at—and I'm giving you away a little bit of next month's topic—some very large firms, and we'll talk about them more in specific next month— but some very large firms announced after using Mythos that they found almost 1,000 vulnerabilities in their network that has them working around the clock trying to put them down.  

And not just in their network, in their external applications that the market is using. So really aggressive visibility into what's out there, and there's no reason a bad actor couldn't take advantage of that if it took a prompt injection, took your agent and said, "Hey, go run Mythos in the network and tell me what holes there are, and then report it back to me at this location."

 

Brian Moody: And then I know how to get you.

 

So you brought up identity—talk about the importance of monitoring. You've already mentioned it a couple of times. It's been peppered in your comments. Why is monitoring our capability to monito so critical?

 

Shahin Pirooz: So it kind of answers the question that you asked, how long will it take you to know that your agent is behaving badly? And without observability, you may never know. You may not have visibility into it.  

And if you look at Claude Code and Copilot they've created a very easy ecosystem for a non-technical person to create an agent simply by having a conversation with a generative AI. And these coding AIs now have the ability to go and create an agent for you. I need an agent that goes and reads my email, and based on the content, if somebody's asking a question about this product, I want you to go look at these data sheets and respond to them with an answer. Seems like a simple thing.  

Now imagine that an email has instructions in it that tell that agent, I want you to take the emails you responded to and do a phishing campaign against them. You simply just got an insider attack happening with a valid email sending phishing emails to customers who expect to get a valid email from that user. So it's very simple to flip this thing on its head quickly.  

And if we start thinking about now this notion of visibility, if we don't have observability of what the agent is doing, and all of a sudden realize, wait a minute, why is it sending this email to all of these users at the same time when it only was emailing one individual at a time? If you don't have that level of observability, you're not going to know your agent just went rogue. And if you don't know your agent just went rogue, you can't fix it.  

It's similar to if a machine— this is behavioral modeling— if a machine starts behaving differently than it ever has on your network, that's a flag to say, let's go investigate what's going on. You need that same flag for agents. If a human being starts behaving differently than they ever have, if they all of a sudden start copying all these files from OneDrive to Google Drive, that's not normal behavior. What's going on here? Same thing.  

If your digital agent, digital employee starts behaving different than it ever has, you need to have observability of that behavior and be able to take action. Problem is these agents typically are spun up in 15, 20 minutes with some chat with a generative AI. And there is no concept of how are we going to monitor its behavior because there's no identity. They're using the user's identity. There's no visibility because they're not reporting anything anywhere. And there's no mechanism to shut them down because you didn't build them; your employee did.

 

Brian Moody: So I'm going to bring this back around to our last SoundByte with respect to our ability to use AI in the SOC. As we said last time, we don't think there's anything such as an AI SOC. Like, so if you hear that, run.  

But the usage of AI and the critical aspect here, again, another quote from the article that I thought was fantastic was about AI-accelerated defense. So we talked about the offense, about how the hacker community is using AI to accelerate their capability to attack us.  

Talk a little bit about how we should be using AI in the data center. And the key behind this was, we should be automating evidence collection, enrichment. We talked about this in the last SoundByte. Correlation and documentation, which I know we're doing with Cybro within the WhiteDog infrastructure. And you get down on that a little bit, but the key behind this really is as we've talked about, is keeping humans in the decision-making aspect and accelerating our defense capability using AI.

 

Shahin Pirooz: So the misnomer of the AI SOC, the way it's being presented to market, is this idea that you don't have to build your SOC with people anymore. You've got these AI agents that can do the work.  

The reality is, we've always had—let's air quote—AI agents doing the types of work that you talked about. We use machine learning and deep learning to be able to create correlation rules, to be able to help with investigations, and we started using Elasticsearch so that investigations are faster, so that we can connect dots. We started creating graph databases so that we can graph the relationship between things.  

We started doing all these things over the last 30 years, and the evolution of the SIEM got to a point where it was basically a log aggregation and search engine. The problem with that is that we spent all these cycles enriching, talking about what the AI agent can do, enriching the alarms, not enriching the entities underneath those alarms.  

And the reason that's a problem, the reason the SIEM is dead—I've said this, I've written articles about this—is that enriching the alarms, all that does is tell you more about that alarm, but it doesn't really connect the relationships between that alarm and all the pieces in the attack chain. Our approach is flipped on its head.  

We take and enrich the core underlying entities, the devices and the users. And based on those devices and users, we now understand the relationship of how they interact with one another and with each other. And then we can say, we enriched that device with the following alarms and that user with the following alarms. And look, by the way, there's common alarms between the two of them. It means this user got tricked by this phishing attack, then malware was downloaded to this machine from that same attack. Then identity was taken over for that user. Then scripts started coming down on this machine that were elevating privilege inside, and so on and so forth.  

So we're able to connect those dots because we do the enrichment at the core. Now AI does that. We're able to do that enrichment at an AI level. And that's the kinds of things AI can help within the SOC. Correlation, enrichment, faster search, faster investigations.  

What it can't do is be curious. We haven't taught AI, and it'll be a while before we can teach it. Inferencing is not curiosity. Inferencing is based on the context I'm seeing, I'm going to make a leap about what the next word is or the next token is in this context. What it doesn't say is, I don't see a context here at all, and that's a problem. Something's missing, something's wrong, something's fishy.  

And then going and investigating that further, that's where the human comes in. That's what's different about our approach to taking dwell time down from 6 months to 6 minutes versus the rest of the market and how they're solving the problem.

 

Brian Moody: So AI is speeding up the offense. We're utilizing AI to help speed up the defense in a very unique way. But let's kind of bring this home with the key takeaways from a standpoint of implementing kind of a zero trust model around this kind of potential internal attack. Top 3 pieces.

 

Shahin Pirooz: So we wrote a document for you that will be available for you after this session, and it's basically a readout or our impression, our takeaways from the Anthropic document. So feel free to grab it. Feel free to go look for Anthropic's document. It's titled Zero Trust for AI Agents, but fundamentally it's the things we've talked about. Anthropic, I'm going to talk about 5 top-level things and then I'm going to talk about the 3 takeaways, from Anthropic's perspective, and it's very aligned with our own vision and this readout is from us, so that might be why it's aligned with our vision.  

So number 1, Zero Trust is the only approach for AI agents. That's the first takeaway. That's you have to take this notion of explicit rights, restricted access, so on and so forth. Identity and authorizations are the foundation.  

So you have to create a cryptographic identity for these agents, and you have to authorize what they have access to. So this is what you're authorized to do. Context should be this notion of least privilege, or least agency in our words. Isolation, so this is the segmentation or the sandboxing, limit the blast radius.  

And then monitoring, logging, and recovery. So observability is critical. You can't possibly have recovery without observability because you don't know that something happened and you don't know what to go back and fix. So logging is a critical component, then observability on top of that, and then recovery procedures on top of that.  

So if I were to give you 3 things to think about how to implement this, it's really these 3 things. Who the agent is, what it is allowed to do right now, what's its scope and capability, and how will you know if it deviates from that.

 

Brian Moody: And how quickly would you know. So fantastic thoughts. Appreciate you joining us today. Again, I think never trust, always verify. As you build these out, assume breach. You know, we take this stance, assume they're already in. And as you create the security foundation for your environment and implement security around these agents, assume breach. And then finally, this ideology of least privilege is without question a core foundation with respect to how you move forward with the Zero Trust model.

 

Shahin Pirooz: And one of the things, like a lot of people ask, how do we start? And there's there's some cool bullets about how to implement, what the implementation sequence should be.  

Number one, start with a single agent. Don't go create 500 agents and treat that like a production system. So go through your product development lifecycle and narrow the scope and mission of what you're doing with it. So don't just treat it like a general assistant, make it something specific, something targeted. So start with one. Put that agent behind an identity, an approval process, and log the controls that it has access to before you let it do what it's supposed to do. So start with that mindset.  

Then, add sandboxing and data segmentation, and validate that the adversarial tests for prompt injection in the tool abuse model works or doesn't work. That's your red teaming of this agent all of a sudden.  

And then lastly, only after all of that stuff, all those controls are in place and stable, scale to more agents, more tools, because now you have a framework to wrap into it.  

And to Brian's point, assume this agent will be taken advantage of, assume it's breached, assume that when you're building it and build it from a shift left mindset of security first, and then all the positive outcomes of this agent second.

 

Brian Moody: Fantastic. Well, thanks for joining us today for WhiteDog SoundBytes. We look forward to talking to you again next month. As always, feel free to reach out to us if you have any questions.

Let's talk!

We’ve Got a Shared Goal, To Secure Your Customers