Runbear Feature Request: Agent-to-Agent Invocation & KB Search Subagents
Title: Enable Subagent Creation for Knowledge Base Search Delegation
Priority: High
Use Case: Token Budget Management & Scalability
Submitted by: Minted Analytics Team (Ask_Sam Agent)
***
Problem Statement
Our Ask_Sam agent (ff1379fe-51d1-4bd4-a870-e5ef6dc11d88) frequently hits the 1M token context limit due to:
1. 144 active tools consuming ~2,400 tokens per turn just for tool definitions
2. runbear_file_search results returning 5,000-15,000 tokens per query
3. Long-term memory (LTM) accumulating ~30,000 tokens of historical learnings
4. Multi-turn conversations requiring full context retention
Current Impact:
• Conversations terminated prematurely with "token limit exceeded" errors
• User experience disrupted mid-analysis
• Complex requests cannot be completed
***
Requested Feature: Agent-to-Agent Invocation
Core Capability:
runbear_invoke_agent(
    agent_id: str,                # Specialized subagent ID
    task: str,                    # Scoped instruction
    max_context_return: int,      # Token limit for summary response
    pass_context: bool = False    # Whether to share parent context
)
Returns:
• Compressed summary (user-defined token limit)
• Full results remain in subagent's context
• Parent agent receives only the synthesized response
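Since `runbear_invoke_agent` does not exist yet, here is a minimal runnable sketch of the contract we have in mind; the stub body and the agent ID are illustrative only:

```python
# Stub of the proposed runbear_invoke_agent -- NOT a real Runbear API.
# It illustrates the intended contract: the subagent's raw work stays out
# of the parent's context, and the parent receives only a bounded summary.

def runbear_invoke_agent(agent_id: str, task: str,
                         max_context_return: int,
                         pass_context: bool = False) -> str:
    """A real implementation would run the subagent in an isolated
    context; here we fake the result and enforce the token budget."""
    raw_result = f"[subagent {agent_id} completed task: {task}]"
    tokens = raw_result.split()                   # crude whitespace tokenizer
    return " ".join(tokens[:max_context_return])  # enforce summary budget

summary = runbear_invoke_agent(
    agent_id="kb_search_specialist",  # hypothetical template ID
    task="Find KB references to bi_customers.mm_status",
    max_context_return=300,
)
```

The key property is that `summary` is all the parent ever sees, regardless of how many tokens the subagent consumed internally.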
***
Proposed Architecture
Pattern 1: Knowledge Base Search Specialist
Ask_Sam (Parent Agent)
↓ Invokes
KB_Search_Specialist (Subagent)
- Has runbear_file_search access
- Executes 3-5 searches
- Receives 15,000 tokens of raw results
- Summarizes to 300 tokens
- Returns: "Summary: [key findings]"
↓ Returns to
Ask_Sam (receives 300 tokens, not 15,000)
Pattern 2: Code Analysis Specialist
Ask_Sam
↓ Invokes
GitLab_Code_Agent (Subagent)
- Has GitLab MCP access
- Searches 5 repositories
- Analyzes data lineage
- Returns: "Field X defined in file Y:Z, depends on tables A, B"
↓ Returns concise answer
Ask_Sam
***
Implementation Options
Option A: Dedicated Subagent Templates (Recommended)
Runbear provides pre-built specialist agents:
• kb_search_specialist - KB search + summarization
• code_analysis_specialist - GitLab/GitHub code inspection
• data_query_specialist - SQL/Snowflake/Hex analysis
Option B: Generic Agent Invocation
Allow any Runbear agent to invoke any other agent in the same workspace, with:
• Configurable result compression
• Context isolation (subagent context doesn't pollute parent)
• Timeout controls
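The "timeout controls" bullet could be layered on with standard-library tooling even before a native feature ships; a sketch, where the subagent callable is a placeholder:

```python
# Sketch of a timeout guard around a (hypothetical) subagent invocation,
# using only the Python standard library.
import concurrent.futures

def invoke_with_timeout(fn, *args, timeout_s=30.0, **kwargs):
    """Run fn in a worker thread and give up after timeout_s seconds,
    leaving the parent agent's context untouched on failure."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn, *args, **kwargs)
        try:
            return future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            return "[subagent timed out; returning nothing to parent]"

# Placeholder subagent: a real one would be a Runbear agent invocation.
result = invoke_with_timeout(lambda q: f"Summary for {q}", "mm_status",
                             timeout_s=5.0)
```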
Option C: Built-in Search Compression
Enhance runbear_file_search itself:
runbear_file_search(
    query: List[str],
    max_num_results: int = 5,
    compress_results: bool = True,   # NEW
    compression_prompt: str = None   # NEW: "Summarize in <200 words"
)
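A simulation of how the two proposed parameters would change behavior; both flags are hypothetical and the function body here is a placeholder, not the real `runbear_file_search`:

```python
# Simulated semantics of the proposed flags. The real runbear_file_search
# has neither parameter today; the bodies below are placeholders.
from typing import List, Optional

def runbear_file_search(query: List[str], max_num_results: int = 5,
                        compress_results: bool = False,
                        compression_prompt: Optional[str] = None) -> List[str]:
    raw = [f"passage {i} for {q}" for q in query for i in range(max_num_results)]
    if not compress_results:
        return raw  # current behavior: raw passages, thousands of tokens
    # A real implementation would apply compression_prompt via an LLM.
    return [f"Compressed ({compression_prompt}): {len(raw)} passages matched"]

out = runbear_file_search(["mm_status"], compress_results=True,
                          compression_prompt="Summarize in <200 words")
```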
***
Expected Benefits
Token Savings:
• Current: 15,000 tokens per KB search
• With subagent: 300 tokens per delegated search
• Savings: ~14,700 tokens per search (~98% reduction)
Scalability:
• Parent agent can handle longer conversations (50+ turns vs 15-20 currently)
• Complex multi-step analysis becomes feasible
• Parallel specialist invocations possible
User Experience:
• No more "token limit exceeded" mid-conversation
• More sophisticated analysis without manual conversation splitting
• Better separation of concerns (parent = orchestration, subagents = execution)
***
Similar Patterns in Industry
• Hex Threads Agent: Uses create_thread / get_thread for analysis delegation
• Cursor Agent: Uses CURSOR_LAUNCH_AGENT for code tasks
• OpenAI Assistants API: Supports agent-to-agent tool calling
• LangChain: Multi-agent orchestration with context isolation
***
Proposed Pilot
Test Case: Minted Analytics Ask_Sam
Scenario: "Explain data lineage for bi_customers.mm_status"
Current Flow (45,000 tokens):
1. Search KB for "mm_status" → 8,000 tokens
2. Search GitLab refs → 10,000 tokens
3. Search Slack discussions → 9,000 tokens
4. Synthesize answer → 3,000 tokens
5. Tool defs (144 tools) → 2,400 tokens
6. LTM → 12,000 tokens
Total: ~45,000 tokens
With Subagents (~8,000 tokens):
1. Invoke KB_Search_Specialist("mm_status") → 300 tokens returned
2. Invoke GitLab_Code_Specialist("mm_status") → 400 tokens returned
3. Synthesize answer → 3,000 tokens
4. Tool defs (60 tools, reduced) → 1,000 tokens
5. LTM → 3,000 tokens
Total: ~8,000 tokens (82% reduction)
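The accounting above, as a quick arithmetic check; figures are the estimates from this request, not measurements:

```python
# Token estimates from the pilot scenario.
current = {"kb_search": 8_000, "gitlab": 10_000, "slack": 9_000,
           "synthesis": 3_000, "tool_defs": 2_400, "ltm": 12_000}
with_subagents = {"kb_specialist": 300, "gitlab_specialist": 400,
                  "synthesis": 3_000, "tool_defs": 1_000, "ltm": 3_000}

before = sum(current.values())        # 44,400 (~45,000 in the text)
after = sum(with_subagents.values())  # 7,700 (~8,000 in the text)
reduction = 1 - after / before        # ~0.83 on the exact sums
```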
***
Alternative Workarounds (If Feature Delayed)
1. Disable unused tools (144 → 80) - saves ~1,000 tokens/turn
2. Aggressive LTM pruning - archive entries older than 90 days
3. External MCP server - custom KB search compression service
4. Manual conversation splits - user restarts every 15 turns (poor UX)
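Workaround 1's savings estimate is consistent with the ~2,400-token tool-definition figure from the problem statement:

```python
# Per-tool definition cost implied by the numbers above.
tokens_per_tool = 2_400 / 144                      # ~16.7 tokens per tool
savings_per_turn = (144 - 80) * tokens_per_tool    # 64 tools removed
```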
***
Contact for Follow-up
Organization: Minted.com
Primary Contact: Patrick Codrington (patrick.codrington@minted.com)
Agent ID: ff1379fe-51d1-4bd4-a870-e5ef6dc11d88
Slack Workspace: minted.slack.com (proj_ant_nothing_to_see_here)
Willing to participate in beta testing: Yes
In Review · 💡 Feature Request · 7 days ago · Patrick Codrington