Nexi Lab
The unified context layer for the age of AI agents
TL;DR
Coding agents work because code has structure—filesystems, version control, standard formats. Everything else—emails, CRM, documents—is fragmented across hundreds of systems with no unified access. Nexus brings that same structure to all context: a unified layer where data, memory, and capabilities are accessible through one interface. Two principles: everything is a file, and just-in-time retrieval. Open source and available today.
AI models have reached a capability overhang—they can do far more than we're asking of them. The bottleneck isn't intelligence. It's context.
Agents fail not because models can't reason, but because they lack reliable access to the right information at the right time. Context is scattered across systems that don't talk to each other. Memory is trapped inside individual frameworks. Permissions are enforced inconsistently, or not at all.
This paper introduces Nexus, the unified context layer for AI agents. Built on two principles—everything is a file and just-in-time retrieval—Nexus connects all forms of context through a single interface, enabling retrieval that is accurate, fast, cheap, and secure.
"We have this AI overhang where the models are incredibly capable, and we just haven't caught up yet."
— Kevin Scott, CTO of Microsoft
If the models are ready, what's holding us back?
Context. Agents can't reliably access the right information at the right time. They lack memory across sessions. They can't discover which tools are available. They don't have permission-aware access to enterprise data. Each application, each agent, starts from scratch.
Over the past 18 months, a category of AI agents has dramatically outpaced the others: coding agents. Cursor, Claude Code, Windsurf, and others have achieved remarkable results—not only because of better models, but because code already lives in a well-organized structure.
Code has:

- A filesystem that organizes everything into a predictable hierarchy
- Version control that records every change and its history
- Standard formats and conventions that any tool can parse
The software industry spent decades building infrastructure to organize code. AI agents benefit from that investment.
For everything else, no such structure exists. Consider a simple request: "Summarize my last ten customer calls and cross-reference with their CRM records to identify churn risks."
To answer this, an agent needs to access call recordings or transcripts (Gong? Zoom? a local folder?), find the corresponding customer records (Salesforce? HubSpot? a spreadsheet?), match them by customer ID or name, and synthesize the results. Each system has its own API, its own authentication, its own data model. There's no unified path, no consistent permission model, no way to search across them.
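To make the fragmentation concrete, here is a rough sketch of the glue code that single request implies today. The record shapes, fetchers, and join key are hypothetical stand-ins, not any vendor's actual SDK, and the vendor calls are left as stubs.

```python
from dataclasses import dataclass

# Hypothetical record shapes; each real system has its own schema and auth.
@dataclass
class Call:
    customer_name: str
    transcript: str

@dataclass
class Account:
    name: str
    renewal_date: str

def fetch_last_calls(n: int) -> list[Call]:
    ...  # stub: wraps one vendor's API (Gong? Zoom?), with its own auth and pagination

def fetch_accounts() -> list[Account]:
    ...  # stub: wraps another vendor's API (Salesforce? HubSpot?), with different auth

def churn_context(n: int = 10) -> list[tuple[Call, Account | None]]:
    calls, accounts = fetch_last_calls(n), fetch_accounts()
    by_name = {a.name.lower(): a for a in accounts}   # ad hoc join on customer name
    return [(c, by_name.get(c.customer_name.lower())) for c in calls]
```

Every stub hides a different API, a different credential, and a different data model, and none of the work transfers to the next question.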
The agent fails—not because the model can't reason about churn, but because the context is fragmented.
The industry's initial response to the context problem was brute force: bigger context windows.
Gemini offers 1 million tokens. Claude offers 200,000. The assumption was that if we could just fit everything into the context window, the model would figure it out.
It's not working.
Research from Stanford confirms what practitioners already suspected: model accuracy follows a U-shaped curve based on where information appears in context. Accuracy exceeds 80% when relevant facts are at the beginning or end, but drops significantly when they're buried in the middle—even for models explicitly designed for long contexts.
And even if models could perfectly attend to long contexts, the enterprise data problem remains. The average enterprise runs nearly 900 applications; only a third are integrated. 72% of organizations report that their data exists in disconnected silos. Employees lose 30% of their working hours just chasing down information across systems.
Larger context windows make existing problems worse: more tokens mean higher cost and latency, and the relevant facts are more likely to land in the middle of the prompt, exactly where accuracy drops.
Structured context beats brute-force context.
Our benchmarks confirm this. In tests with 65 tools, dynamic discovery achieved a 78% reduction in context tokens (62K vs 276K) while maintaining 100% task success. Anthropic's own tool search, by comparison, degraded to 52% success at the same scale. The pattern is clear: structured retrieval scales; brute force doesn't.
This realization has given rise to a new discipline—context engineering—focused on curating and managing the information that goes into an AI system's context window. But discipline alone isn't enough. We need infrastructure.
Our Thesis
Context is all you need. The models are ready. Give them the right context, and they deliver.
If this thesis is true, then the most important infrastructure for AI agents isn't a better orchestration framework or a fancier prompt router. It's a context layer—a unified way to store, search, and retrieve the information agents need.
Nexus is a unified context layer for AI agents. It connects all forms of context—data, memory, and capabilities—through a single interface, enabling retrieval that is accurate, fast, cheap, and secure.
Nexus spans three categories of context:
**Data.** Files, emails, SaaS applications (Salesforce, HubSpot, Notion, Slack), and databases (BigQuery, Snowflake, Postgres). Any source of information an agent might need.

**Memory.** Long-term knowledge that persists across sessions and short-term working state within a task. Memory that belongs to users, not trapped inside individual agent frameworks.

**Capabilities.** Skills, agents, tools, and APIs. The things an agent can do, not just the things it can know. Unified discovery so agents can find the right capability for the task. (The example paths below show how all three categories can sit side by side.)
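The paths in this sketch are hypothetical, not Nexus's actual layout; they only illustrate what one namespace for data, memory, and capabilities could look like.

```python
# Hypothetical example paths; the real layout in Nexus may differ.
EXAMPLE_PATHS = [
    "/data/salesforce/accounts/acme-corp",       # data: a CRM record
    "/data/slack/channels/sales/2024-06-02",     # data: a conversation
    "/memory/users/alice/preferences",           # memory: long-term knowledge
    "/memory/sessions/task-42/scratchpad",       # memory: working state within a task
    "/capabilities/tools/send_email",            # capability: a tool
    "/capabilities/agents/research-assistant",   # capability: an agent
]
```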
Nexus is open source and available today at github.com/nexi-lab/nexus.
Nexus is being used today in early deployments that tackle the same context problem coding agents already solved: give the AI structured access to information instead of dumping everything into the context window.
In Unix, everything is a file. Devices, sockets, pipes, processes—they all expose the same interface: open, read, write, close. This simple abstraction enabled decades of composability. A program written to read files could automatically read from network sockets, hardware devices, or other processes—because they all looked like files.
We apply the same principle to AI context.
In Nexus, everything is a file: a Salesforce record, a memory fragment, an MCP server, a Slack conversation, a BigQuery table. They all expose the same interface: they have a path, metadata, content, and permissions. They can all be searched, retrieved, and composed.
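A minimal sketch of that shared interface, assuming Python-style types; the field names and method signatures are illustrative, not Nexus's actual schema.

```python
from dataclasses import dataclass, field
from typing import Protocol

@dataclass
class ContextFile:
    """One unit of context: a CRM record, a memory fragment, a tool, a table."""
    path: str                                           # location in the unified namespace
    metadata: dict = field(default_factory=dict)        # source, type, timestamps
    permissions: set[str] = field(default_factory=set)  # principals allowed to read

class ContextSource(Protocol):
    """Every backend (SaaS connector, memory store, MCP server) exposes the same verbs."""
    def open(self, path: str) -> ContextFile: ...
    def read(self, file: ContextFile) -> str: ...       # content is fetched on demand
    def search(self, query: str) -> list[ContextFile]: ...
```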
This abstraction unlocks several benefits:
**Unified search.** One query can search across all context types. "Find everything relevant to Project Aurora" returns memory, documents, CRM records, and available tools, without the developer specifying where to look.

**Consistent permissions.** If a user can't access a Salesforce record, they can't access it through Nexus. If an agent shouldn't see financial data, that's enforced at the context layer. One model, everywhere.

**Extensibility.** Because everything shares the same interface, new context types can be added without changing the retrieval logic. Add a new SaaS integration, and it's immediately searchable alongside everything else. The sketch below shows all three benefits in a few lines.
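Continuing the interface sketch above, a permission-aware search across every registered source might look like this; `unified_search` and its arguments are assumptions for illustration, not Nexus's API.

```python
# Builds on the ContextFile/ContextSource sketch above. One query fans out to
# every registered source; one permission check applies everywhere; supporting
# a new integration means registering one more source, nothing else changes.
def unified_search(sources: list[ContextSource], query: str, principal: str) -> list[ContextFile]:
    results: list[ContextFile] = []
    for source in sources:
        for f in source.search(query):
            if principal in f.permissions:   # consistent enforcement at the context layer
                results.append(f)
    return results

# unified_search(all_sources, "Find everything relevant to Project Aurora",
#                principal="alice@example.com")
```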
Traditional RAG systems pre-index everything. Documents are chunked, embedded, and stored in a vector database. At query time, the system retrieves the most similar chunks.
This works for static document collections but breaks down for dynamic, heterogeneous context: indexes go stale as live data changes, structured sources like a BigQuery table don't chunk cleanly into embeddings, and pre-processing every source up front is slow and expensive.
Nexus takes a different approach: just-in-time retrieval.
Instead of pre-indexing everything, Nexus retrieves exactly what's needed, when it's needed. The retrieval strategy adapts to the query.
When an agent asks "What's our Q4 revenue?", Nexus doesn't search a pre-built vector index. It identifies the query type (structured data), routes to the appropriate source (BigQuery), executes a live query, and returns a formatted result. When the query is "What do we know about this customer?", it searches memory, CRM, and recent communications in parallel, then merges and ranks the results.
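A simplified sketch of that routing step. The classifier is a toy and the fetchers are hypothetical placeholders for live backends; Nexus's real routing is more involved.

```python
# Just-in-time retrieval in miniature: classify the query, then fetch only what
# is needed from the live source instead of a pre-built vector index.
def classify(query: str) -> str:
    q = query.lower()  # toy rules; a real router could use a small model
    if any(w in q for w in ("revenue", "count", "total", "q4")):
        return "structured"
    if "customer" in q:
        return "customer"
    return "general"

def run_warehouse_query(query: str) -> str:
    ...  # placeholder: live SQL against BigQuery/Snowflake/Postgres

def search_memory(query: str) -> str: ...    # placeholder backends, fanned out in parallel
def search_crm(query: str) -> str: ...
def search_messages(query: str) -> str: ...
def search_namespace(query: str) -> str: ... # placeholder: unified search over all files

def retrieve(query: str) -> str:
    kind = classify(query)
    if kind == "structured":
        return run_warehouse_query(query)    # e.g. "What's our Q4 revenue?"
    if kind == "customer":
        parts = [search_memory(query), search_crm(query), search_messages(query)]
        return "\n\n".join(p for p in parts if p)   # merge; rank before merging in practice
    return search_namespace(query)           # default: targeted search over the namespace
```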
This mirrors how coding agents work. They don't pre-embed the entire codebase. They use grep, file system traversal, and targeted reads. The filesystem provides enough structure that brute-force indexing isn't necessary.
By treating all context as files in a structured namespace, Nexus enables the same pattern for non-code context: intelligent, targeted retrieval instead of exhaustive pre-processing.
These design principles—everything is a file, just-in-time retrieval—combine to make context retrieval:
| Property | How Nexus achieves it |
|---|---|
| Accurate | Unified search across all context types; retrieval adapts to the query |
| Fast | Single optimized path; no redundant indexing or retrieval |
| Cheap | Fetch only what's needed; fewer tokens in context window |
| Secure | Consistent permissions across all context types |
Early deployments are validating this approach. We're actively measuring improvements in retrieval accuracy, latency, token efficiency, and permission consistency across pilot customers. We'll publish detailed benchmarks as we scale.
We're entering an era of multi-agent systems. Swarms of specialized agents will collaborate on complex tasks, handing off work, sharing findings, and building on each other's progress.
This future requires shared context infrastructure: shared memory, consistent permissions, and retrieval that is fast and accurate.
If every agent maintains its own memory silo, collaboration breaks. If permissions aren't consistent, enterprise deployment stalls. If retrieval is slow or inaccurate, agents fail.
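As a toy illustration of that hand-off, the sketch below uses an in-memory dict to stand in for the shared layer; the paths and helpers are hypothetical.

```python
# Toy stand-in for shared context: agent B builds on agent A's findings because
# both read and write the same namespace under the same permission model,
# rather than each keeping a private memory silo.
shared: dict[str, str] = {}

def write(path: str, content: str) -> None:
    shared[path] = content

def read_prefix(prefix: str) -> dict[str, str]:
    return {p: c for p, c in shared.items() if p.startswith(prefix)}

# Agent A records a finding where any authorized agent can discover it.
write("/memory/projects/aurora/findings/market-size", "Draft market-size estimate")

# Agent B picks up where A left off, with no agent-to-agent plumbing.
prior_work = read_prefix("/memory/projects/aurora/")
```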
Nexus is the context layer for this future. One namespace. One permission model. One way to give AI agents the context they need.
Context is all you need.