
Agent Factory Deep Dive: Gemini 3, AI Studio, Antigravity, Nano Banana

Cracking the code of multi-agent systems with battle-tested tools and brutal honesty.

June 4, 2026


Alright, let's cut the crap. You've heard the buzz about AI agents. Everyone's building them, everyone's got an opinion, and frankly, 90% of it is rehashed boilerplate wrapped in a shiny new framework. We've been in the trenches, building something we affectionately call the "Agent Factory." This isn't just about chaining a few LLM calls; it's about a robust, scalable, and opinionated system where specialized agents cooperate to tackle complex problems.

We didn't just throw tools at the wall. We deliberately picked a stack, wrestled with its limitations, and built around its strengths. This is our honest, no-punches-pulled recap of using Gemini 3, Google AI Studio, our custom Antigravity orchestration layer, and the lightweight Nano Banana frontend library.

The Dream: Why an Agent Factory?

The promise of AI agents isn't just a smarter chatbot. It's about offloading complex, multi-step tasks that require planning, tool-use, and dynamic adaptation. Imagine a virtual assistant that doesn't just answer questions but does things: researches, plans, executes code, interacts with APIs, and synthesizes results. That's the factory floor, churning out solutions.

But the reality? It's messy. Large Language Models (LLMs) are phenomenal reasoners, but they're also prone to hallucination, context window limitations, and incredibly expensive if you're not careful. Building an "agent" means wrapping these models in enough guardrails, tools, and orchestration logic to make them reliable and economically viable.

Gemini 3: The Engine Room – Raw Power, Rawer Edges

When Gemini 3 (let's assume this is the latest, greatest iteration, pushing the envelope past Gemini 1.5 Pro) landed, we were cautiously optimistic. Google's narrative emphasized multimodal capabilities, massive context windows, and raw reasoning power. Our experience?

It's a powerhouse, no doubt. For specific tasks involving complex reasoning, code generation, or understanding nuanced, lengthy inputs (thanks to that context window), it often outperforms previous generations. We've seen it chew through multi-page technical documents and extract relevant data with remarkable accuracy, a task where other models would either truncate, drift, or simply choke.

The Good, The Bad, The Ugly:

  • Multimodal Magic: Seriously, the ability to process images and text seamlessly in the same prompt chain is a game-changer. For agents that need to interpret diagrams, UI screenshots, or visual data, it’s not just a nice-to-have; it's essential.
  • Context Window: A godsend. For long-running agentic conversations or tasks requiring extensive document analysis, this is where Gemini 3 truly shines. You can feed it entire codebases or research papers, giving your agents a memory and understanding that's hard to beat.
  • Raw Reasoning: Often incredibly good at complex logic puzzles, planning, and understanding intricate instructions. There are moments when you feel like you're talking to something truly intelligent.
  • Prompt Engineering is STILL Hell: Despite all the advances, coaxing consistent behavior out of these models is an art, not a science. Tweak temperature, top_k, and top_p all you want; a slight change in wording can still derail your agent. It’s an iterative nightmare, and you'll spend more time here than you think.
  • Cost vs. Performance: While pricing is often competitive, high-volume, high-context usage can quickly spiral into astronomical costs. Optimize your prompts, trim token counts aggressively, and ALWAYS stream: streaming won't shrink the bill, but it hides latency while long responses generate.
  • API Stability & Latency: We've seen periods of inconsistent latency and occasional API quirks. This is the bleeding edge; expect some bumps. Your backend needs to be robust enough to handle retries and graceful degradation.
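For the retry side of that last point, here is a minimal sketch of exponential backoff with jitter. The names (`withRetry`, `RetryOptions`) are ours for illustration and are not part of any Gemini SDK; treat the delays and attempt counts as placeholders to tune.

```typescript
// Hypothetical helper: retries an async call with exponential backoff and jitter.
// None of these names come from a Google SDK; they are illustrative only.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // backoff ceiling before the second attempt
  maxDelayMs: number; // upper bound on any single delay
  isRetryable?: (error: unknown) => boolean; // defaults to "retry everything"
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  fn: (attempt: number) => Promise<T>,
  opts: RetryOptions
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      const retryable = opts.isRetryable ? opts.isRetryable(error) : true;
      if (!retryable || attempt === opts.maxAttempts) break;
      // Ceiling doubles each attempt and is capped; the actual wait is a random
      // fraction of it ("full jitter") so clients do not retry in lockstep.
      const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** (attempt - 1));
      await sleep(Math.random() * ceiling);
    }
  }
  throw lastError;
}
```

In practice you would pass an `isRetryable` that only retries rate limits and transient server errors, and let validation errors fail fast.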
![Coding on a laptop](https://images.unsplash.com/photo-1555066931-4365d14bab8c?w=800&auto=format&fit=crop&q=80)

Google AI Studio: The Prompt Sandbox – Fun for a Bit, Then You Grow Up

Google AI Studio is like a fancy playground. It's fantastic for rapidly iterating on prompts, experimenting with different model configurations, and seeing immediate results. For a solo dev or a small team getting started, it's invaluable.

Where it Shines:

  • Quick Prototyping: Get a prompt up and running in minutes. Test variations, compare outputs, and save different versions.
  • Tool Configuration: Defining custom tools and seeing how the model interprets them is straightforward. This is crucial for agentic workflows.
  • Visual Debugging: The UI gives you a clear view of the input, output, and even tool calls, which helps debug agent behavior.

The Hard Truths:

  • Production Deployment? Nope. AI Studio is not your production environment. Exporting prompts into code is clunky, and managing prompt versions across a team or CI/CD pipeline is a manual nightmare.
  • Version Control: Seriously, how are you managing prompt changes? Screenshots? Copy-pasting into text files? This is a fundamental flaw when moving beyond a toy project. We quickly moved to embedding prompts directly in our codebase (more on Antigravity) and versioning them like any other code.
  • Lack of Advanced Features: You quickly hit its ceiling when you need dynamic prompt generation, complex conditional logic, or integration with external data sources beyond simple key-value pairs.

My advice? Use AI Studio to discover effective prompts, then immediately port them into a version-controlled, code-driven system. It's a lab, not a factory.
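What "prompts as code" can look like, as a rough sketch: the `PromptTemplate` shape, the `summarize-doc` prompt, and the `promptTag` helper below are hypothetical names for illustration, not an AI Studio export format.

```typescript
// Hypothetical shape for a prompt kept in source control.
interface PromptTemplate<Vars extends Record<string, string>> {
  id: string;
  version: string; // bump on every wording change, like any other release
  render: (vars: Vars) => string;
}

// Example prompt: the wording lives in the repo, so every change shows up in a diff.
const summarizeDocPrompt: PromptTemplate<{ audience: string; doc: string }> = {
  id: 'summarize-doc',
  version: '1.2.0',
  render: ({ audience, doc }) =>
    [
      `Summarize the document below for ${audience}.`,
      'Return at most five bullet points.',
      '--- DOCUMENT ---',
      doc.trim(),
    ].join('\n'),
};

// Tag each request with id@version so logs can be tied to a specific prompt revision.
function promptTag(p: { id: string; version: string }): string {
  return `${p.id}@${p.version}`;
}
```

With this in place, a regression in agent behavior can be traced to a specific prompt revision through normal code review and `git blame`.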

Antigravity: The Orchestration Brain – Where Real Engineering Happens

This is the beating heart of our Agent Factory. "Antigravity" is our custom backend service, responsible for orchestrating multiple agents, managing their state, handling tool execution, and ensuring a robust, scalable workflow. We built this ourselves because existing frameworks (looking at you, LangChain and LlamaIndex) often carry too much abstraction overhead, introduce performance bottlenecks, or simply don't offer the granular control needed for complex, high-performance agentic systems.

We chose TypeScript/Node.js for its async capabilities and developer familiarity, paired with Fastify for a lean, performant API, and Redis for state management and task queuing.

Core Responsibilities of Antigravity:

  1. Agent Definition & Routing: Each agent has a specific role (e.g., ResearchAgent, CodeGenerationAgent, APIExecutionAgent). Antigravity routes user requests or internal agent messages to the appropriate specialized agent.
  2. State Management: This is critical. Agents need memory. Antigravity maintains conversation history, context windows, and tool outputs for each interaction, allowing agents to pick up where they left off or understand the broader session context.
  3. Tool Execution Gateway: When an LLM decides to use a tool (e.g., make an API call, run a database query, execute some code), Antigravity acts as the secure, controlled gateway. It validates tool calls, executes them, and feeds the results back to the LLM.
  4. Error Handling & Retries: Agents will fail. LLMs will hallucinate tool calls, APIs will return errors. Antigravity implements sophisticated retry mechanisms, fallback strategies, and clear error reporting to the frontend.
  5. Streaming & Real-time Updates: To provide a responsive user experience, Antigravity streams intermediate agent thoughts, tool outputs, and final responses back to the client.
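To make the tool gateway idea (item 3) concrete, here is a minimal sketch under our own assumptions: `ToolGateway`, `ToolDefinition`, and `ToolResult` are illustrative names, and argument validation is a hand-written function rather than any particular schema library.

```typescript
// Hypothetical tool gateway: validates arguments, enforces a timeout, and never throws.
type ToolResult = { ok: true; value: unknown } | { ok: false; error: string };

interface ToolDefinition<Args> {
  name: string;
  timeoutMs: number;
  // Returns parsed args, or throws with a message describing what is wrong.
  parseArgs: (raw: unknown) => Args;
  run: (args: Args) => Promise<unknown>;
}

class ToolGateway {
  private tools = new Map<string, ToolDefinition<any>>();

  register<Args>(tool: ToolDefinition<Args>): void {
    this.tools.set(tool.name, tool);
  }

  async execute(name: string, rawArgs: unknown): Promise<ToolResult> {
    const tool = this.tools.get(name);
    if (!tool) return { ok: false, error: `Unknown tool: ${name}` };

    let timer: ReturnType<typeof setTimeout> | undefined;
    try {
      // Reject hallucinated or malformed arguments before anything runs.
      const args = tool.parseArgs(rawArgs);
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(
          () => reject(new Error(`Timed out after ${tool.timeoutMs}ms`)),
          tool.timeoutMs
        );
      });
      const value = await Promise.race([tool.run(args), timeout]);
      return { ok: true, value };
    } catch (e) {
      // Errors become data the LLM can read, instead of crashing the flow.
      return { ok: false, error: e instanceof Error ? e.message : String(e) };
    } finally {
      if (timer !== undefined) clearTimeout(timer);
    }
  }
}
```

One caveat with this design: `Promise.race` stops waiting for a slow tool but does not cancel it, so long-running tools would still need their own cancellation (for example an `AbortSignal`).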

Antigravity's API Endpoint (Simplified):

```typescript
// src/routes/agentFactory.ts
import { randomUUID } from 'node:crypto';
import { FastifyInstance, FastifyRequest, FastifyReply } from 'fastify';
import { z } from 'zod';
import { GeminiClient } from '../services/geminiClient'; // Our wrapper around Google's API
import { AgentOrchestrator } from '../orchestrator/agentOrchestrator';
import { logger } from '../utils/logger';

const messageSchema = z.object({
  role: z.enum(['user', 'assistant', 'system', 'tool']),
  content: z.string(),
});

const agentRequestSchema = z.object({
  sessionId: z.string().optional(),
  prompt: z.string(),
  history: z.array(messageSchema).optional().default([]),
});

export default async function agentFactoryRoutes(fastify: FastifyInstance) {
  fastify.post('/agent-factory', async (request: FastifyRequest, reply: FastifyReply) => {
    // Fastify's `schema.body` expects JSON Schema unless a custom validator compiler
    // (for example a Zod type provider) is registered, so we validate with Zod directly.
    const parsed = agentRequestSchema.safeParse(request.body);
    if (!parsed.success) {
      return reply.code(400).send({ error: 'Invalid request body', issues: parsed.error.issues });
    }
    const { sessionId: rawSessionId, prompt, history } = parsed.data;
    const sessionId = rawSessionId || `session_${randomUUID()}`;

    // We write to the raw response ourselves, so tell Fastify not to send its own reply.
    reply.hijack();

    // Set up streaming headers
    reply.raw.setHeader('Content-Type', 'text/event-stream');
    reply.raw.setHeader('Cache-Control', 'no-cache');
    reply.raw.setHeader('Connection', 'keep-alive');
    reply.raw.flushHeaders();

    // Every event goes out as one SSE frame: `data: {json}\n\n`
    const send = (event: object) => reply.raw.write(`data: ${JSON.stringify(event)}\n\n`);

    const geminiClient = new GeminiClient(); // Or retrieve from DI container
    const orchestrator = new AgentOrchestrator(geminiClient, sessionId);

    try {
      await orchestrator.executeAgentFlow(prompt, history, (chunk) => {
        // Stream intermediate chunks or final response back to client
        send({ type: 'chunk', content: chunk });
      });
      send({ type: 'end' });
    } catch (error) {
      logger.error(`Agent Factory error for session ${sessionId}:`, error);
      send({ type: 'error', content: (error as Error).message });
    } finally {
      reply.raw.end();
    }
  });
}
```
```typescript
// src/orchestrator/agentOrchestrator.ts
import { GeminiClient, GeminiMessage } from '../services/geminiClient';
import { ToolRegistry } from '../tools/toolRegistry'; // Our collection of defined tools
import { logger } from '../utils/logger';

interface AgentMessage {
  role: 'user' | 'assistant' | 'system' | 'tool';
  content: string;
}

// Upper bound on model/tool round trips so a confused agent cannot loop forever.
const MAX_TURNS = 5;

export class AgentOrchestrator {
  private geminiClient: GeminiClient;
  private toolRegistry: ToolRegistry;
  private sessionId: string;

  constructor(geminiClient: GeminiClient, sessionId: string) {
    this.geminiClient = geminiClient;
    this.toolRegistry = new ToolRegistry();
    this.sessionId = sessionId;
  }

  // This is a simplified example. A real orchestrator would have more complex state management,
  // multiple agents, conditional logic, etc.
  async executeAgentFlow(
    userPrompt: string,
    history: AgentMessage[],
    onChunk: (chunk: string) => void
  ): Promise<void> {
    // Our GeminiClient wrapper maps these roles onto whatever the underlying API expects.
    const messages: GeminiMessage[] = [
      {
        role: 'system',
        parts: [{ text: 'You are a helpful assistant specialized in executing tasks using provided tools.' }],
      },
      ...history.map(msg => ({ role: msg.role, parts: [{ text: msg.content }] })),
      { role: 'user', parts: [{ text: userPrompt }] },
    ];

    const tools = this.toolRegistry.getTools(); // Get available tools for Gemini

    try {
      for (let turn = 0; turn < MAX_TURNS; turn++) {
        const stream = this.geminiClient.streamContent(messages, tools);
        let sawToolCalls = false;

        for await (const chunk of stream) {
          if (chunk.text) {
            onChunk(chunk.text); // Stream text back to client
          }

          if (chunk.tool_calls && chunk.tool_calls.length > 0) {
            sawToolCalls = true;
            logger.info(`Agent called tools for session ${this.sessionId}:`, chunk.tool_calls);
            // Execute tools in parallel; failures are returned to the LLM as data, not thrown
            const toolResults = await Promise.all(
              chunk.tool_calls.map(async (toolCall) => {
                const name = toolCall.function.name;
                const toolFunc = this.toolRegistry.getToolFunction(name);
                if (!toolFunc) {
                  logger.error(`Tool '${name}' not found.`);
                  return { tool_call_id: toolCall.id, function_response: { name, response: { error: `Tool not found: ${name}` } } };
                }
                try {
                  const result = await toolFunc.func(toolCall.function.args);
                  return { tool_call_id: toolCall.id, function_response: { name, response: result } };
                } catch (e) {
                  logger.error(`Error executing tool '${name}':`, e);
                  return { tool_call_id: toolCall.id, function_response: { name, response: { error: `Tool execution failed: ${(e as Error).message}` } } };
                }
              })
            );

            // Add tool calls and results to the history to be sent back to the LLM
            messages.push({ role: 'assistant', parts: [{ toolCalls: chunk.tool_calls }] });
            messages.push({ role: 'tool', parts: toolResults });
          }
        }

        // No tool calls this turn means the model has produced its final answer.
        if (!sawToolCalls) return;
        logger.info(`Tool results sent back to LLM for session ${this.sessionId}.`);
      }

      throw new Error(`Agent did not finish within ${MAX_TURNS} turns.`);
    } catch (error) {
      logger.error(`Error in agent flow for session ${this.sessionId}:`, error);
      throw error;
    }
  }
}
```

This is where the rubber meets the road. No fancy UI, no click-and-drag. Just raw code that dictates behavior, state transitions, and error recovery. This is where you bake in your business logic and make agents genuinely useful, not just chatty.
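As one example of behavior living in plain code, here is a minimal sketch of the routing responsibility described earlier. `AgentRouter`, the `Intent` union, and the idea that an upstream classifier step has already picked the intent are assumptions for illustration; Antigravity's real router is more involved.

```typescript
// Hypothetical router: each agent declares which intents it handles.
// How the intent is chosen (rules, or a cheap LLM classification call) is out of scope here.
type Intent = 'research' | 'codegen' | 'api_call';

interface Agent {
  name: string;
  handles: Intent[];
  run: (input: string) => Promise<string>;
}

class AgentRouter {
  private byIntent = new Map<Intent, Agent>();

  register(agent: Agent): void {
    for (const intent of agent.handles) {
      const existing = this.byIntent.get(intent);
      if (existing) {
        // Fail loudly at startup rather than routing ambiguously at runtime.
        throw new Error(`Intent '${intent}' is already handled by ${existing.name}`);
      }
      this.byIntent.set(intent, agent);
    }
  }

  async dispatch(intent: Intent, input: string): Promise<string> {
    const agent = this.byIntent.get(intent);
    if (!agent) throw new Error(`No agent registered for intent '${intent}'`);
    return agent.run(input);
  }
}
```

Keeping the table explicit like this means "which agent handles what" is reviewable in a pull request instead of being buried in a prompt.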

Nano Banana: The User's Window – Lightweight & Responsive

On the frontend, we needed something snappy, extensible, and utterly focused on the user experience of interacting with a complex agent system. "Nano Banana" is our custom React hook and component library, designed specifically to integrate with Antigravity's streaming API and manage the rich state of agent interactions.

Why not a full-blown chat UI library? Because we needed fine-grained control over how agent messages, tool calls, loading states, and user feedback were displayed. Off-the-shelf solutions are often too opinionated or too generic. Nano Banana is minimal by design.

Key Components/Concepts:

  • useAgentFactory Hook: Manages local chat state, sends requests to Antigravity, and processes streaming responses.
  • Message Renderer: A component responsible for dynamically rendering different types of messages (user input, agent text, tool call indicators, error messages).
  • Real-time Updates: Displays agent "thinking" states, partial responses, and tool executions as they happen, preventing the user from staring at a blank screen.
```tsx
// src/hooks/useAgentFactory.ts - Nano Banana in action
import { useState, useCallback, useRef } from 'react';

// Using Antigravity's internal message structure for consistency
interface AgentMessage {
  id: string;
  role: 'user' | 'assistant' | 'system' | 'tool';
  content: string;
  timestamp: Date;
  isStreaming?: boolean;
  toolCalls?: any[]; // Simplified for example, actual structure is more complex
}

interface StreamEvent {
  type: 'chunk' | 'end' | 'error' | 'tool_call'; // Added tool_call if Antigravity sends these
  content?: string;
  toolCalls?: any[];
}

export function useAgentFactory(sessionId?: string) {
  const [messages, setMessages] = useState<AgentMessage[]>([]);
  const [isLoading, setIsLoading] = useState(false);
  const abortControllerRef = useRef<AbortController | null>(null);
  const currentSessionId = useRef(sessionId || crypto.randomUUID());

  const sendMessage = useCallback(async (text: string) => {
    if (!text.trim()) return;

    const newUserMessage: AgentMessage = {
      id: crypto.randomUUID(),
      role: 'user',
      content: text,
      timestamp: new Date(),
    };
    setMessages((prev) => [...prev, newUserMessage]);
    setIsLoading(true);

    const currentController = new AbortController();
    abortControllerRef.current = currentController;

    let assistantMessageId: string | null = null;

    try {
      const response = await fetch('/api/agent-factory', { // Your Antigravity endpoint
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          sessionId: currentSessionId.current,
          // Send only real conversation turns; local error and tool notices stay client-side.
          history: messages
            .filter((m) => m.role === 'user' || m.role === 'assistant')
            .map((m) => ({ role: m.role, content: m.content })),
          prompt: text,
        }),
        signal: currentController.signal,
      });

      if (!response.ok || !response.body) {
        throw new Error(`HTTP error! status: ${response.status}`);
      }

      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      let receivedContent = '';
      let buffer = ''; // Holds a partial event when a network chunk ends mid-frame

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        buffer += decoder.decode(value, { stream: true });
        // EventSource-like parsing: data: {json}\n\n
        const frames = buffer.split('\n\n');
        buffer = frames.pop() ?? ''; // Last piece may be incomplete; keep it for the next read

        for (const line of frames) {
          if (!line.startsWith('data: ')) continue;
          const data = JSON.parse(line.substring(6)) as StreamEvent;

          if (data.type === 'chunk') {
            receivedContent += data.content || '';
            // Snapshot values now: React may run the updater after these variables change.
            const content = receivedContent;
            if (!assistantMessageId) {
              const id = crypto.randomUUID();
              assistantMessageId = id;
              setMessages((prev) => [
                ...prev,
                { id, role: 'assistant', content, timestamp: new Date(), isStreaming: true },
              ]);
            } else {
              const id = assistantMessageId;
              setMessages((prev) =>
                prev.map((msg) => (msg.id === id ? { ...msg, content } : msg))
              );
            }
          } else if (data.type === 'tool_call') {
            // If Antigravity streams tool calls separately
            setMessages((prev) => [
              ...prev,
              {
                id: crypto.randomUUID(),
                role: 'tool',
                content: `Executing tool: ${JSON.stringify(data.toolCalls)}`, // Display tool call
                timestamp: new Date(),
              },
            ]);
          } else if (data.type === 'error') {
            setMessages((prev) => [
              ...prev,
              {
                id: crypto.randomUUID(),
                role: 'system',
                content: `Backend Error: ${data.content}`,
                timestamp: new Date(),
              },
            ]);
          } else if (data.type === 'end' && assistantMessageId) {
            const id = assistantMessageId;
            setMessages((prev) =>
              prev.map((msg) => (msg.id === id ? { ...msg, isStreaming: false } : msg))
            );
            assistantMessageId = null; // Reset for next agent turn
            receivedContent = '';
          }
        }
      }
    } catch (error) {
      if (error instanceof DOMException && error.name === 'AbortError') {
        console.warn('Request aborted by user.');
      } else {
        console.error('Agent factory error:', error);
        setMessages((prev) => [
          ...prev,
          {
            id: crypto.randomUUID(),
            role: 'system',
            content: `Error: ${(error as Error).message}`,
            timestamp: new Date(),
          },
        ]);
      }
    } finally {
      // Clear the streaming cursor even if the stream was aborted or failed mid-message.
      setMessages((prev) =>
        prev.map((msg) => (msg.isStreaming ? { ...msg, isStreaming: false } : msg))
      );
      setIsLoading(false);
      abortControllerRef.current = null;
    }
  }, [messages]); // `messages` in dependency array is intentional here to send full history

  const abortRequest = useCallback(() => {
    if (abortControllerRef.current) {
      abortControllerRef.current.abort();
      setIsLoading(false);
    }
  }, []);

  return { messages, isLoading, sendMessage, abortRequest };
}
```
```tsx
// src/components/ChatInterface.tsx - Example Nano Banana Usage
import React, { useState } from 'react';
import { useAgentFactory } from '../hooks/useAgentFactory';

const ChatInterface: React.FC = () => {
  // We could pass a prop for sessionId here if it comes from props
  const { messages, isLoading, sendMessage, abortRequest } = useAgentFactory();
  const [input, setInput] = useState('');

  // Show the "Thinking..." bubble only until the first streamed token arrives.
  const isWaitingForFirstToken = isLoading && !messages.some((msg) => msg.isStreaming);

  const handleSubmit = (e: React.FormEvent) => {
    e.preventDefault();
    if (!input.trim()) return;
    sendMessage(input);
    setInput('');
  };

  return (
    <div className="flex flex-col h-screen bg-gray-900 text-white p-4">
      <h1 className="text-3xl font-bold mb-4 text-center text-blue-400">Agent Factory Console</h1>
      <div className="flex-grow overflow-y-auto space-y-4 pr-2">
        {messages.map((msg) => (
          <div key={msg.id} className={`flex ${msg.role === 'user' ? 'justify-end' : 'justify-start'}`}>
            <div
              className={`max-w-[75%] p-3 rounded-lg shadow-md ${
                msg.role === 'user'
                  ? 'bg-blue-700 text-white'
                  : msg.role === 'assistant'
                  ? 'bg-gray-700 text-gray-100'
                  : 'bg-red-800 text-white text-sm'
              }`}
            >
              <strong className="block text-xs uppercase opacity-75 mb-1">
                {msg.role === 'user' ? 'You' : msg.role === 'assistant' ? 'Agent' : 'System'}
              </strong>
              {msg.content}
              {msg.isStreaming && <span className="animate-pulse ml-2">_</span>}
            </div>
          </div>
        ))}
        {isWaitingForFirstToken && (
          <div className="flex justify-start">
            <div className="max-w-[75%] p-3 rounded-lg shadow-md bg-gray-700 text-gray-100 animate-pulse">
              <strong className="block text-xs uppercase opacity-75 mb-1">Agent</strong>
              Thinking...
            </div>
          </div>
        )}
      </div>
      <form onSubmit={handleSubmit} className="flex mt-6 gap-2">
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          className="flex-grow p-3 border border-gray-600 rounded-lg bg-gray-800 text-white focus:ring-2 focus:ring-blue-500 focus:border-transparent"
          placeholder="Command the agents..."
          disabled={isLoading}
        />
        <button
          type="submit"
          className="px-6 py-3 bg-green-600 text-white rounded-lg hover:bg-green-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors duration-200"
          disabled={isLoading}
        >
          Send
        </button>
        {isLoading && (
          <button
            type="button"
            onClick={abortRequest}
            className="px-6 py-3 bg-red-600 text-white rounded-lg hover:bg-red-700 transition-colors duration-200"
          >
            Abort
          </button>
        )}
      </form>
    </div>
  );
};

export default ChatInterface;
```

Nano Banana is all about componentizing the interaction. You don't want your core application state tied up with the transient nature of LLM responses. It's a pragmatic solution for building responsive, complex UIs around agentic workflows.
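One piece worth isolating is the stream parsing itself, because a network chunk can end in the middle of an event. The sketch below is a hypothetical standalone parser (`SseJsonParser` is our name for illustration) for the `data: {json}\n\n` frames Antigravity emits; it assumes one JSON payload per `data:` line, which is narrower than the full Server-Sent Events format.

```typescript
// Hypothetical incremental parser for `data: {json}\n\n` frames.
// It buffers partial frames so an event split across network chunks is not lost.
class SseJsonParser<T> {
  private buffer = '';

  // Feed decoded text; returns every event completed by this call.
  push(text: string): T[] {
    this.buffer += text;
    const frames = this.buffer.split('\n\n');
    // The last piece is either empty or an incomplete frame; keep it for the next push.
    this.buffer = frames.pop() ?? '';

    const events: T[] = [];
    for (const frame of frames) {
      for (const line of frame.split('\n')) {
        if (line.startsWith('data: ')) {
          events.push(JSON.parse(line.slice(6)) as T);
        }
      }
    }
    return events;
  }
}
```

Because the parser has no React or fetch dependency, it can be unit-tested with hand-cut chunks, which is far easier than reproducing a badly timed network split in the browser.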

![Abstract AI illustration](https://images.unsplash.com/photo-1618005182384-a83a8bd57fbe?w=800&auto=format&fit=crop&q=80)

Pain Points & Hard-Won Lessons

This wasn't a smooth ride. Anyone telling you agent building is easy is either lying or building a glorified chatbot.

  1. Observability is paramount: When an agent goes rogue, or an LLM call fails, you need to know why. Antigravity is instrumented throughout with structured logging, tracing IDs, and metrics. Without them, debugging multi-turn agent interactions is a nightmare.
  2. Cost Management: Seriously, monitor your token usage like a hawk. Every prompt engineering decision, every extra turn in an agentic loop, impacts your bill. Aggressive caching of LLM responses (where context allows) and careful prompt compression are non-negotiable.
  3. Latency is a UX Killer: LLMs are not instant. Streaming responses from Antigravity to Nano Banana isn't just a nicety; it's a necessity. Even then, complex agent chains can take seconds or even minutes. Manage user expectations ruthlessly.
  4. Tool Reliability: Your agents are only as good as their tools. If an API is flaky, your agent will be flaky. Build robust, well-tested tools within Antigravity, complete with timeouts, retries, and detailed error handling.
  5. Agentic Drifting: Agents, left to their own devices, can go off-topic. Strong system prompts, clear instruction-following mechanisms, and periodic re-evaluation of the agent's goal within Antigravity are crucial. Sometimes, you need to explicitly tell the agent, "No, stop, you're doing it wrong."
  6. Context Window Management: While Gemini 3 has a huge context window, it's not infinite. Smart retrieval-augmented generation (RAG), summarization of past turns, and selective inclusion of relevant information are still essential for long-running conversations. Don't just dump everything into the prompt.
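For the context window point, selective inclusion can start as simply as a token budget on history. This is a sketch under loud assumptions: `trimHistory` and `estimateTokens` are hypothetical helpers, and the 4-characters-per-token estimate is a rough heuristic, not the model's real tokenizer.

```typescript
// Hypothetical history trimmer with an approximate token budget.
interface ChatMessage {
  role: 'user' | 'assistant' | 'system' | 'tool';
  content: string;
}

// Assumption: ~4 characters per token. Swap in a real token counter for production use.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Keeps every system message, then the newest other messages that fit in the budget.
function trimHistory(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === 'system');
  let used = system.reduce((sum, m) => sum + estimateTokens(m.content), 0);

  const rest = messages.filter((m) => m.role !== 'system');
  const kept: ChatMessage[] = [];
  // Walk backwards from the newest message and stop at the first one that does not fit.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > maxTokens) break;
    used += cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

A fuller version would summarize the dropped turns instead of discarding them, and would avoid cutting between a tool call and its result.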

Conclusion: Build, Break, Learn

Building the Agent Factory with Gemini 3, AI Studio, Antigravity, and Nano Banana has been a brutal, enlightening journey. Gemini 3 offers unparalleled raw power, especially with its multimodal and context capabilities. AI Studio is a great sandbox, but keep it there.

The real magic, and the real pain, lies in Antigravity – your custom orchestration layer. This is where you tame the LLMs, enforce your business logic, and build a system that's resilient and scalable. And Nano Banana ensures your users aren't left in the dark, providing a smooth interface to a complex backend.

Don't buy into the "one-click agent" hype. Real-world agentic systems demand real engineering. You'll spend countless hours on prompt engineering, refining tool definitions, architecting robust backend services, and obsessing over every millisecond of latency. It's hard, it's messy, but when it works, it's genuinely transformative.

Now go forth, build something useful, and prepare to get your hands dirty. There are no shortcuts here, just hard-won expertise.

Rakib Hasan Sohag


MERN Stack / Full Stack Developer
