Nexi Lab
The unified context layer for the age of AI agents
TL;DR
Coding agents work because code has structure—filesystems, version control, standard formats. Everything else—emails, CRM, documents—is fragmented across hundreds of systems with no unified access. Nexus brings that same structure to all context: a unified layer where data, memory, and capabilities are accessible through one interface. Two principles: everything is a file, and just-in-time retrieval. Open source and available today.
AI models have reached a capability overhang—they can do far more than we're asking of them. The bottleneck isn't intelligence. It's context.
Agents fail not because models can't reason, but because they lack reliable access to the right information at the right time. Context is scattered across systems that don't talk to each other. Memory is trapped inside individual frameworks. Permissions are enforced inconsistently, or not at all.
This paper introduces Nexus, the unified context layer for AI agents. Built on two principles—everything is a file and just-in-time retrieval—Nexus connects all forms of context through a single interface, enabling retrieval that is accurate, fast, cheap, and secure.
"We have this AI overhang where the models are incredibly capable, and we just haven't caught up yet."
— Kevin Scott, CTO of Microsoft
If the models are ready, what's holding us back?
Context. Agents can't reliably access the right information at the right time. They lack memory across sessions. They can't discover which tools are available. They don't have permission-aware access to enterprise data. Each application, each agent, starts from scratch.
Over the past 18 months, a category of AI agents has dramatically outpaced the others: coding agents. Cursor, Claude Code, Windsurf, and others have achieved remarkable results—not only because of better models, but because code already lives in a well-organized structure.
Code has:

- A filesystem that organizes everything into a predictable hierarchy
- Version control that records every change and its history
- Standard formats and conventions that any tool can parse
The software industry spent decades building infrastructure to organize code. AI agents benefit from that investment.
For everything else, no such structure exists. Consider a simple request: "Summarize my last ten customer calls and cross-reference with their CRM records to identify churn risks."
To answer this, an agent needs to access call recordings or transcripts (Gong? Zoom? a local folder?), find the corresponding customer records (Salesforce? HubSpot? a spreadsheet?), match them by customer ID or name, and synthesize the results. Each system has its own API, its own authentication, its own data model. There's no unified path, no consistent permission model, no way to search across them.
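To make the fragmentation concrete, here is a rough sketch of the glue code that single request implies today. The record shapes, fetchers, and join key are hypothetical stand-ins, not any vendor's actual SDK, and the vendor calls are left as stubs.

```python
from dataclasses import dataclass

# Hypothetical record shapes; each real system has its own schema and auth.
@dataclass
class Call:
    customer_name: str
    transcript: str

@dataclass
class Account:
    name: str
    renewal_date: str

def fetch_last_calls(n: int) -> list[Call]:
    ...  # stub: wraps one vendor's API (Gong? Zoom?), with its own auth and pagination

def fetch_accounts() -> list[Account]:
    ...  # stub: wraps another vendor's API (Salesforce? HubSpot?), with different auth

def churn_context(n: int = 10) -> list[tuple[Call, Account | None]]:
    calls, accounts = fetch_last_calls(n), fetch_accounts()
    by_name = {a.name.lower(): a for a in accounts}   # ad hoc join on customer name
    return [(c, by_name.get(c.customer_name.lower())) for c in calls]
```

Every stub hides a different API, a different credential, and a different data model, and none of the work transfers to the next question.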
The agent fails—not because the model can't reason about churn, but because the context is fragmented.
The industry's initial response to the context problem was brute force: bigger context windows.
Gemini offers 1 million tokens. Claude offers 200,000. The assumption was that if we could just fit everything into the context window, the model would figure it out.
It's not working.
Research from Stanford confirms what practitioners already suspected: model accuracy follows a U-shaped curve based on where information appears in context. Accuracy exceeds 80% when relevant facts are at the beginning or end, but drops significantly when they're buried in the middle—even for models explicitly designed for long contexts.
And even if models could perfectly attend to long contexts, the enterprise data problem remains. The average enterprise runs nearly 900 applications; only a third are integrated. 72% of organizations report that their data exists in disconnected silos. Employees lose 30% of their working hours just chasing down information across systems.
Larger context windows make existing problems worse: more tokens mean higher cost and latency, and the relevant facts are more likely to land in the middle of the prompt, exactly where accuracy drops.
Structured context beats brute-force context.
Our benchmarks confirm this. In tests with 65 tools, dynamic discovery achieved a 78% reduction in context tokens (62K vs 276K) while maintaining 100% task success. Anthropic's own tool search, by comparison, degraded to 52% success at the same scale. The pattern is clear: structured retrieval scales; brute force doesn't.
This realization has given rise to a new discipline—context engineering—focused on curating and managing the information that goes into an AI system's context window. But discipline alone isn't enough. We need infrastructure.
Our Thesis
Context is all you need. The models are ready. Give them the right context, and they deliver.
If this thesis is true, then the most important infrastructure for AI agents isn't a better orchestration framework or a fancier prompt router. It's a context layer—a unified way to store, search, and retrieve the information agents need.
Nexus is a unified context layer for AI agents. It connects all forms of context—data, memory, and capabilities—through a single interface, enabling retrieval that is accurate, fast, cheap, and secure.
Nexus spans three categories of context:
**Data.** Files, emails, SaaS applications (Salesforce, HubSpot, Notion, Slack), and databases (BigQuery, Snowflake, Postgres). Any source of information an agent might need.

**Memory.** Long-term knowledge that persists across sessions and short-term working state within a task. Memory that belongs to users, not trapped inside individual agent frameworks.

**Capabilities.** Skills, agents, tools, and APIs. The things an agent can do, not just the things it can know. Unified discovery so agents can find the right capability for the task. (The example paths below show how all three categories can sit side by side.)
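The paths in this sketch are hypothetical, not Nexus's actual layout; they only illustrate what one namespace for data, memory, and capabilities could look like.

```python
# Hypothetical example paths; the real layout in Nexus may differ.
EXAMPLE_PATHS = [
    "/data/salesforce/accounts/acme-corp",       # data: a CRM record
    "/data/slack/channels/sales/2024-06-02",     # data: a conversation
    "/memory/users/alice/preferences",           # memory: long-term knowledge
    "/memory/sessions/task-42/scratchpad",       # memory: working state within a task
    "/capabilities/tools/send_email",            # capability: a tool
    "/capabilities/agents/research-assistant",   # capability: an agent
]
```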
Nexus is open source and available today at github.com/nexi-lab/nexus.
Nexus is being used today in early deployments that tackle the same context problem coding agents already solved: give the AI structured access to information instead of dumping everything into the context window.
In Unix, everything is a file. Devices, sockets, pipes, processes—they all expose the same interface: open, read, write, close. This simple abstraction enabled decades of composability. A program written to read files could automatically read from network sockets, hardware devices, or other processes—because they all looked like files.
We apply the same principle to AI context.
In Nexus, everything is a file: a Salesforce record, a memory fragment, an MCP server, a Slack conversation, a BigQuery table. They all expose the same interface: they have a path, metadata, content, and permissions. They can all be searched, retrieved, and composed.
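A minimal sketch of that shared interface, assuming Python-style types; the field names and method signatures are illustrative, not Nexus's actual schema.

```python
from dataclasses import dataclass, field
from typing import Protocol

@dataclass
class ContextFile:
    """One unit of context: a CRM record, a memory fragment, a tool, a table."""
    path: str                                           # location in the unified namespace
    metadata: dict = field(default_factory=dict)        # source, type, timestamps
    permissions: set[str] = field(default_factory=set)  # principals allowed to read

class ContextSource(Protocol):
    """Every backend (SaaS connector, memory store, MCP server) exposes the same verbs."""
    def open(self, path: str) -> ContextFile: ...
    def read(self, file: ContextFile) -> str: ...       # content is fetched on demand
    def search(self, query: str) -> list[ContextFile]: ...
```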
This abstraction unlocks several benefits:
**Unified search.** One query can search across all context types. "Find everything relevant to Project Aurora" returns memory, documents, CRM records, and available tools, without the developer specifying where to look.

**Consistent permissions.** If a user can't access a Salesforce record, they can't access it through Nexus. If an agent shouldn't see financial data, that's enforced at the context layer. One model, everywhere.

**Extensibility.** Because everything shares the same interface, new context types can be added without changing the retrieval logic. Add a new SaaS integration, and it's immediately searchable alongside everything else. The sketch below shows all three benefits in a few lines.
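Continuing the interface sketch above, a permission-aware search across every registered source might look like this; `unified_search` and its arguments are assumptions for illustration, not Nexus's API.

```python
# Builds on the ContextFile/ContextSource sketch above. One query fans out to
# every registered source; one permission check applies everywhere; supporting
# a new integration means registering one more source, nothing else changes.
def unified_search(sources: list[ContextSource], query: str, principal: str) -> list[ContextFile]:
    results: list[ContextFile] = []
    for source in sources:
        for f in source.search(query):
            if principal in f.permissions:   # consistent enforcement at the context layer
                results.append(f)
    return results

# unified_search(all_sources, "Find everything relevant to Project Aurora",
#                principal="alice@example.com")
```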
Traditional RAG systems pre-index everything. Documents are chunked, embedded, and stored in a vector database. At query time, the system retrieves the most similar chunks.
This works for static document collections but breaks down for dynamic, heterogeneous context: indexes go stale as live data changes, structured sources like a BigQuery table don't chunk cleanly into embeddings, and pre-processing every source up front is slow and expensive.
Nexus takes a different approach: just-in-time retrieval.
Instead of pre-indexing everything, Nexus retrieves exactly what's needed, when it's needed. The retrieval strategy adapts to the query.
When an agent asks "What's our Q4 revenue?", Nexus doesn't search a pre-built vector index. It identifies the query type (structured data), routes to the appropriate source (BigQuery), executes a live query, and returns a formatted result. When the query is "What do we know about this customer?", it searches memory, CRM, and recent communications in parallel, then merges and ranks the results.
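A simplified sketch of that routing step. The classifier is a toy and the fetchers are hypothetical placeholders for live backends; Nexus's real routing is more involved.

```python
# Just-in-time retrieval in miniature: classify the query, then fetch only what
# is needed from the live source instead of a pre-built vector index.
def classify(query: str) -> str:
    q = query.lower()  # toy rules; a real router could use a small model
    if any(w in q for w in ("revenue", "count", "total", "q4")):
        return "structured"
    if "customer" in q:
        return "customer"
    return "general"

def run_warehouse_query(query: str) -> str:
    ...  # placeholder: live SQL against BigQuery/Snowflake/Postgres

def search_memory(query: str) -> str: ...    # placeholder backends, fanned out in parallel
def search_crm(query: str) -> str: ...
def search_messages(query: str) -> str: ...
def search_namespace(query: str) -> str: ... # placeholder: unified search over all files

def retrieve(query: str) -> str:
    kind = classify(query)
    if kind == "structured":
        return run_warehouse_query(query)    # e.g. "What's our Q4 revenue?"
    if kind == "customer":
        parts = [search_memory(query), search_crm(query), search_messages(query)]
        return "\n\n".join(p for p in parts if p)   # merge; rank before merging in practice
    return search_namespace(query)           # default: targeted search over the namespace
```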
This mirrors how coding agents work. They don't pre-embed the entire codebase. They use grep, file system traversal, and targeted reads. The filesystem provides enough structure that brute-force indexing isn't necessary.
By treating all context as files in a structured namespace, Nexus enables the same pattern for non-code context: intelligent, targeted retrieval instead of exhaustive pre-processing.
These design principles—everything is a file, just-in-time retrieval—combine to make context retrieval:
| Property | How Nexus achieves it |
|---|---|
| Accurate | Unified search across all context types; retrieval adapts to the query |
| Fast | Single optimized path; no redundant indexing or retrieval |
| Cheap | Fetch only what's needed; fewer tokens in context window |
| Secure | Consistent permissions across all context types |
Early deployments are validating this approach. We're actively measuring improvements in retrieval accuracy, latency, token efficiency, and permission consistency across pilot customers. We'll publish detailed benchmarks as we scale.
We're entering an era of multi-agent systems. Swarms of specialized agents will collaborate on complex tasks, handing off work, sharing findings, and building on each other's progress.
This future requires shared context infrastructure: shared memory, consistent permissions, and retrieval that is fast and accurate.
If every agent maintains its own memory silo, collaboration breaks. If permissions aren't consistent, enterprise deployment stalls. If retrieval is slow or inaccurate, agents fail.
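As a toy illustration of that hand-off, the sketch below uses an in-memory dict to stand in for the shared layer; the paths and helpers are hypothetical.

```python
# Toy stand-in for shared context: agent B builds on agent A's findings because
# both read and write the same namespace under the same permission model,
# rather than each keeping a private memory silo.
shared: dict[str, str] = {}

def write(path: str, content: str) -> None:
    shared[path] = content

def read_prefix(prefix: str) -> dict[str, str]:
    return {p: c for p, c in shared.items() if p.startswith(prefix)}

# Agent A records a finding where any authorized agent can discover it.
write("/memory/projects/aurora/findings/market-size", "Draft market-size estimate")

# Agent B picks up where A left off, with no agent-to-agent plumbing.
prior_work = read_prefix("/memory/projects/aurora/")
```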
Nexus is the context layer for this future. One namespace. One permission model. One way to give AI agents the context they need.
Context is all you need.