How to secure data access for AI Agents

With the growing adoption of AI agents, we are seeing more applications being built around AI-driven capabilities. These applications can range from small demos that highlight what AI agents can do, to full-scale enterprise integrations that enhance existing software products. For example, an AI agent could be embedded within a data governance platform for tasks like data profiling, or entire AI-first applications can be built from scratch.

Regardless of their size, these AI-driven applications may serve either a single user or many users across different teams within an enterprise. Naturally, these users often have varying levels of access to data and AI functionalities. This leads to important questions:

Does the data consumer have the right to use information generated by AI agents during data processing?
Should a user be able to view AI agent computations if they do not have direct access to the original data?
How do we manage access control for AI computations and data outputs across users with different privilege levels?

This article explores these challenges and demonstrates different implementation approaches for managing AI agent data access securely.

The full source code is available on GitHub if you want to try it out.

Overview

An AI agent-based application typically includes these components:

Human Users: People interacting with AI agents and systems.
Systems: Software components that interact with APIs, databases, and other tools.
AI Agents: Entities that perform tasks, process data, and communicate with users and systems.
Data Sources: Structured and unstructured repositories, such as databases, that AI agents access.
Large Language Models (LLMs): The underlying models powering the intelligence of the agents.

In such an environment:

Human users interact with AI agents through interfaces or systems.
AI agents access data through tools and APIs.
Tools fetch data via API calls, database queries, or access to structured/unstructured storage.

Access control must be enforced at several levels:

Controlling who can use the AI application.
Controlling which AI agents a user can interact with.
Controlling which data agents can retrieve for computation.
Controlling what parts of computed outputs a user can see.
Enforcing database-level access (table-level, row-level, column-level).

The key risk is that AI agent orchestration bypasses these controls and inadvertently exposes a user to sensitive or restricted data.

To avoid this, checkpoints must be in place at each of these levels to guarantee that users only see the data they are authorized to access, even if they are unaware of all the sources the AI agents use.

Base Scenario

In our setup, we define two users: User A and User B.

User A has full access to all resources, tools, and data.
User B has restricted access, depending on the resource type.

The application is designed with two AI agents:

Data Collector Agent: Gathers financial data from various sources.
Data Presenter Agent: Formats and presents the collected data.

The implementation uses Azure AI Agent Service. See the full source code on GitHub.

The Data Collector agent has access to five tools:

Internet search using Serper API
Full database query (Neon Postgres)
Limited columns query
Limited rows query (row-level restricted)
API call to Alpha Vantage

Example Tool Definitions:
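The snippet below is not the project's actual code. It is a minimal sketch, assuming a plain Python registry, of how the five tools listed above could be described so that later access checks can reason about them. The ToolSpec class, the tool names, and the category labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Describes one tool the Data Collector agent may call (hypothetical shape)."""
    name: str         # illustrative identifier, not taken from the project
    category: str     # "web", "database", or "api"
    description: str


# The five tools from the list above, registered under assumed names.
COLLECTOR_TOOLS = [
    ToolSpec("search_internet", "web", "Internet search using Serper API"),
    ToolSpec("query_database_full", "database", "Full database query (Neon Postgres)"),
    ToolSpec("query_database_limited_columns", "database", "Limited columns query"),
    ToolSpec("query_database_limited_rows", "database", "Limited rows query (row-level restricted)"),
    ToolSpec("call_stock_api", "api", "API call to Alpha Vantage"),
]


def tools_in_category(category: str) -> list[ToolSpec]:
    """Return every registered tool that belongs to the given category."""
    return [tool for tool in COLLECTOR_TOOLS if tool.category == category]
```

Tagging each tool with a category makes it straightforward to switch off a whole class of access, such as every database tool, for a restricted user.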

User Roles and Identity Management

Instead of manually assigning permissions inside the application code, we manage user roles externally through a file like user_roles.yaml or, ideally, via an identity management service like Azure Entra ID.

Example user_roles.yaml:

The AI application reads these roles at runtime and dynamically adjusts agent behavior and available tools accordingly.
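As a rough illustration of that runtime lookup, the sketch below resolves the tools a user may call from a role mapping. The dictionary stands in for the parsed contents of a roles file; its keys, the role names, and the tool names are assumptions made for this example, not the project's actual schema.

```python
# Hypothetical structure mirroring what a parsed user_roles.yaml could contain.
USER_ROLES = {
    "user_a": {
        "role": "full_access",
        "allowed_tools": [
            "search_internet",
            "query_database_full",
            "query_database_limited_columns",
            "query_database_limited_rows",
            "call_stock_api",
        ],
    },
    "user_b": {
        "role": "restricted",
        "allowed_tools": ["search_internet", "query_database_limited_columns"],
    },
}


def allowed_tools_for(user_id: str, roles: dict = USER_ROLES) -> frozenset:
    """Return the tool names a user may call; unknown users get nothing."""
    entry = roles.get(user_id)
    if entry is None:
        # Default-deny: a user missing from the roles file has no tools.
        return frozenset()
    return frozenset(entry["allowed_tools"])
```

Denying by default keeps a missing or misspelled user entry from silently granting access.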

Implementation and Scenarios

Scenario 1: Unrestricted Access

In this case, User A can use all tools and see all data without any limits. We assume there are no limits on what data can be collected or processed for users with full access. User B, on the other hand, has some restrictions, which the following scenarios cover.

Both agents operate normally, and all information is accessible to User A.

Scenario 2: Block Database Access for User B

In this situation, User B should not be allowed to access the database directly. This setup makes sure that users without the right permissions cannot pull data from the database and can only use other available sources instead.

In many business applications, there are different ways to control who can access the database:

Service Account Model: The whole application connects to the database using one service account that has access to everything. This makes things easier to manage, but every user ends up having the same level of access, without any user-specific restrictions.
User Privilege Validation: The system checks each user’s permissions before allowing them to run database queries. This can be done in two ways:
- Role-Based Access Control (RBAC): Users are given roles that define which tables, rows, or queries they can access.
- Middleware Enforcement: A layer between the app and the database checks what users are allowed to do before sending their queries to the database.
Row-Level Security: The database itself is set up to automatically show different rows of data depending on who the user is and what they are allowed to see. Discover how to enable row-level security using Neon RLS.
Proxy Database Access: Instead of letting users connect directly to the database, they go through an API layer that applies all the necessary security checks based on their role.

We enforce this by removing database query tasks for restricted users. See the full source code implementation on GitHub.
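The removal step might look like the sketch below, which also adds a call-time guard in the spirit of the middleware option described above. The task dictionaries, the uses_database flag, and both function names are hypothetical and only illustrate the idea.

```python
class DatabaseAccessDenied(PermissionError):
    """Raised when a user without database rights reaches a database tool."""


def build_task_list(all_tasks, can_use_database):
    """Drop database tasks up front so the agent is never handed them."""
    return [
        task for task in all_tasks
        if can_use_database or not task["uses_database"]
    ]


def run_database_query(can_use_database, run_query):
    """Second check at call time, in case a task slips through orchestration."""
    if not can_use_database:
        raise DatabaseAccessDenied("database access is not permitted for this user")
    return run_query()
```

Filtering the task list keeps the agent from planning around a tool it cannot use, while the guard protects against orchestration paths that skip the filter.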

Scenario 3: Restrict API Access

In this case, User B should not be allowed to access certain external APIs. There are several ways to control API access:

API Gateway Policies: API gateways can be set up to check user roles and only allow users with the right permissions to make API requests.
Middleware Layer: The application can use a middleware that checks a user’s role before sending any API requests, blocking those who do not have access.
Token-Based Access Control: API authentication tokens can carry information about what a user is allowed to do, limiting which APIs or endpoints they are allowed to call.

The implementation simulates API-level access control, which in production is typically enforced via API gateways, middleware, or token scopes.
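A token-scope check can be sketched in a few lines. The endpoint names and scope strings below are invented for illustration; a real deployment would read the scopes from a verified token rather than from a plain set.

```python
# Hypothetical mapping from an API endpoint to the scope a token must carry.
REQUIRED_SCOPES = {
    "stock_quotes": "market_data:read",
    "web_search": "search:read",
}


def is_api_call_allowed(endpoint: str, token_scopes: set) -> bool:
    """Allow the call only if the token carries the scope the endpoint requires."""
    required = REQUIRED_SCOPES.get(endpoint)
    # Endpoints without a registered scope are denied rather than allowed.
    return required is not None and required in token_scopes
```

For example, a token holding only search:read would pass the check for web_search and fail it for stock_quotes.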

Scenario 4: Limit Access to Specific Database Columns

In this case, User B should not be able to access certain columns in a database table and can only view a limited set, such as company and stock_price. Column-level access control can be handled in a few ways:

Database Permissions: The database can be set up to control which tables and columns a user can see or query based on their assigned role. Read more about how to manage database access.
Query Proxy Layer: A middleware layer can sit between the user and the database, blocking queries that touch restricted tables or columns before they are even sent to the database.

Instead of querying full tables, agents only retrieve selected columns.
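The following sketch shows one way to build such a query from a per-role column allow-list. It uses an in-memory SQLite database as a stand-in for Neon Postgres so that it runs on its own; the financials table, the revenue and net_income columns, and the role names are assumptions, while company and stock_price come from the scenario above.

```python
import sqlite3

# Hypothetical allow-list: which columns each role may select.
ALLOWED_COLUMNS = {
    "full_access": ("company", "stock_price", "revenue", "net_income"),
    "restricted": ("company", "stock_price"),
}


def fetch_visible_columns(conn, role):
    """Select only the columns the role is allowed to see."""
    columns = ALLOWED_COLUMNS.get(role, ())
    if not columns:
        raise PermissionError(f"role {role!r} may not query this table")
    # Column names come from the fixed allow-list above, never from user
    # input, so interpolating them into the statement is safe here.
    query = f"SELECT {', '.join(columns)} FROM financials ORDER BY company"
    return [dict(zip(columns, row)) for row in conn.execute(query).fetchall()]


def demo_connection():
    """Create a small in-memory table with made-up figures for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE financials "
        "(company TEXT, stock_price REAL, revenue REAL, net_income REAL)"
    )
    conn.executemany(
        "INSERT INTO financials VALUES (?, ?, ?, ?)",
        [("Acme", 12.5, 1000.0, 150.0), ("Globex", 48.0, 5000.0, 900.0)],
    )
    return conn
```

Because the restricted columns never leave the database, the agent cannot leak them into a summary even by accident.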

Scenario 5: Restrict Access to Specific Rows

In this scenario, User B should not be able to view certain rows in a database table. Row-level security can be enforced in two main ways:

Database Row-Level Policies: The database applies rules that control which rows each user is allowed to see, based on their access rights.
Application-Level Filtering: The application itself checks and filters the query results before showing the data to the user, making sure they only see what they are permitted to access.

Row-level security ensures that restricted users cannot access data outside their assigned scope.
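Application-level filtering can be sketched as a query whose WHERE clause is derived from the user's scope. SQLite again stands in for Neon Postgres, and the sector column, the scope mapping, and the user identifiers are hypothetical. With database row-level policies, the same restriction would live in the database rather than in this function.

```python
import sqlite3

# Hypothetical row scope: None means every row, a tuple lists permitted sectors.
ROW_SCOPE = {
    "user_a": None,
    "user_b": ("Technology",),
}


def fetch_rows_for(conn, user_id):
    """Return only the rows that fall inside the user's assigned scope."""
    if user_id not in ROW_SCOPE:
        return []  # unknown users see nothing
    sectors = ROW_SCOPE[user_id]
    if sectors is None:
        sql = "SELECT company, sector FROM financials ORDER BY company"
        return conn.execute(sql).fetchall()
    if not sectors:
        return []  # an empty scope matches no rows
    # Values are bound as parameters; only the placeholder count is interpolated.
    placeholders = ", ".join("?" for _ in sectors)
    sql = (
        "SELECT company, sector FROM financials "
        f"WHERE sector IN ({placeholders}) ORDER BY company"
    )
    return conn.execute(sql, sectors).fetchall()
```

Pushing the filter into the query means out-of-scope rows are never loaded into the agent's context.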

Scenario 6: Mask Sensitive Output Fields

In this scenario, User B should not be able to see certain sensitive information in the results generated by the AI. This can be handled in two ways:

Data Masking: Sensitive parts of the results are hidden or replaced before being shown to the user.
Post-Processing Filters: The AI agent reviews the output and removes or hides any restricted information before presenting it.

The Data Presenter agent applies masking before summarizing results.
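A minimal masking step, assuming the collected results arrive as dictionaries, could look like this. The field names and the mask string are illustrative and not taken from the project.

```python
# Hypothetical set of fields that restricted users must not see.
SENSITIVE_FIELDS = frozenset({"net_income", "analyst_notes"})
MASK = "***"


def mask_record(record, can_view_sensitive):
    """Return a copy of the record with sensitive values replaced by a mask."""
    if can_view_sensitive:
        return dict(record)
    return {
        key: (MASK if key in SENSITIVE_FIELDS else value)
        for key, value in record.items()
    }
```

Masking before the summarization step matters: once a sensitive value reaches the model's prompt, it can resurface in the generated text.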

Conclusion

AI agents open up new challenges in managing data access, especially in multi-user environments where different privilege levels exist. By combining proper authentication, API security, database-level filtering, and AI output moderation, it is possible to build AI-driven applications that are both powerful and secure.

This project demonstrates how role-aware agents can dynamically adjust their behavior based on user permissions while responsibly accessing multiple sources like Neon databases, financial APIs, and the open web.

Following these best practices ensures that AI-powered systems comply with data governance policies and protect sensitive information from unauthorized access.
