How RapidNative's Two-Step AI Pipeline and Browser Bundler Power Instant React Native Code Generation

A technical deep-dive into RapidNative's two-step AI pipeline: context gathering, code generation, browser bundler, and SSE streaming — prompt to live app in seconds.


By Parth

30th Mar 2026


There is a specific, frustrating failure mode that happens when you ask a general-purpose AI assistant to write React Native code. The output looks reasonable. It might even pass a linter. Then you try to run it and get a blank screen, a cryptic Metro error, or a layout that's completely broken on device.

The root cause is almost always the same: the AI generated code without understanding its context. It didn't know your project's file structure. It didn't know you're using NativeWind with Expo SDK 52. It didn't know that space-x-4 and CSS grid classes don't work in the native renderer. It made confident, plausible-sounding guesses — and those guesses were wrong.

This is the exact engineering problem that RapidNative, the AI-native mobile app builder, was designed to solve. Rather than sending a prompt directly to a language model and hoping for the best, RapidNative runs a structured two-step pipeline — first understanding your project's context, then generating code informed by that context. Combined with a browser-based React Native bundler that compiles and previews code without a server, it creates a feedback loop that turns a plain-text prompt into a running app screen in seconds.

This post breaks down exactly how that works — from the first token of context gathering to the final rendered frame in the live preview.

The challenge: making AI-generated code work in React Native's strict native rendering environment — Photo by Luca Bravo on Unsplash

Why React Native Code Generation Is Harder Than It Looks

Before diving into the architecture, it's worth understanding why React Native presents a harder target than web code generation.

React Native code runs through a native rendering engine, not a browser DOM. This means many common CSS patterns that any AI model has seen millions of times simply don't work. NativeWind — the utility-first CSS library for React Native — explicitly forbids certain Tailwind classes including space-x, space-y, grid utilities, and others that have no native equivalent. An AI model trained on web code will reach for these patterns reflexively.

Beyond styling, the file structure of a React Native / Expo project is template-dependent. A standard Expo project uses app/ for routes and components/ for shared components. But a fullstack Expo monorepo might use expo/app/ and expo/components/ as path prefixes. When the AI generates an import like import { Button } from '../components/Button', it needs to know which prefix applies — otherwise the import silently fails and the component never renders.
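The prefix problem described above can be sketched as a small lookup. The template names and prefix map below are illustrative assumptions for this post, not RapidNative's actual identifiers:

```typescript
// Hypothetical sketch: resolving template-dependent path prefixes.
type Template = "standard" | "fullstack";

const PATH_PREFIX: Record<Template, string> = {
  standard: "",        // app/ and components/ live at the repo root
  fullstack: "expo/",  // expo/app/ and expo/components/ in a monorepo
};

// Rewrites a logical project path into the template's real path,
// so generated imports point at files that actually exist.
function resolveProjectPath(template: Template, logicalPath: string): string {
  return `${PATH_PREFIX[template]}${logicalPath}`;
}
```

If the generator is told the wrong template, every import resolves against the wrong root — which is exactly the silent failure mode this context step exists to prevent.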

There are also Expo-specific patterns: SafeAreaView needs edges props for correct insets, dynamic routing isn't supported in certain configurations, certain navigation patterns require specific wrappers. These are the kinds of constraints that appear in RapidNative's system prompt under specific enforcement sections like MOBILE_NATIVE_SECTION and UNSUPPORTED_TAILWIND_SECTION.

The fundamental challenge is that correct React Native generation requires deep context: project-specific file paths, component patterns in the existing codebase, template constraints, and NativeWind rules. Getting that context reliably is what the two-step pipeline is built around.

The Two-Step Pipeline: Context First, Code Second

RapidNative's AI generation endpoint (/api/user/ai/generate-v2) structures every code generation request as two discrete stages, run sequentially.

Stage 1: Context Gathering. A faster, cheaper language model is given the user's prompt and a set of codebase tools. Its job is not to write code — it's to read the project and gather the information the main model will need. It can read specific files, list directories, search file contents, and even fetch images by keyword. After up to four tool-calling steps, its context-gathering work is complete.

Stage 2: Code Generation. The gathered context is injected into the system prompt for a more capable main model. This model receives the full context — layout files, theme definitions, current file content, template rules — and generates the actual code. Tool calling is disabled for this stage. Its only job is to produce output, which streams back to the client via Server-Sent Events.

This separation is a deliberate architectural choice that solves two problems at once. First, it keeps token costs down: reading files and searching codebases is done with a cheaper model, while the expensive tokens are reserved for the actual generation step. Second, it produces better code: the main model gets a clean, structured context injection rather than needing to decide what to read and write simultaneously.

The data flow looks like this:

User Prompt
    ↓
Stage 1: Context Gathering Model
  - list_dir (project file tree)
  - get_files_content (layout.md, theme.md, target file)
  - glob (find files by pattern)
  - batch_grep (search across file contents)
  - get_images_by_keywords (for UI assets)
    ↓
Structured context output
    ↓
Stage 2: Code Generation Model
  - Receives context-enriched system prompt
  - No tool calls — pure generation
  - Streams response via SSE
    ↓
Generated code (streamed)
    ↓
Editor state update → Bundler → Preview
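The flow above can be sketched in code. The function names (gatherContext, buildSystemPrompt, codeGenModel) and the context shape are assumptions for illustration; in the real endpoint each stub would be a model call through the provider layer:

```typescript
// Minimal sketch of the two-stage sequencing, with model calls stubbed.
interface GatheredContext {
  files: Record<string, string>; // path -> content from Stage 1 tool calls
  layout: string;                // layout.md summary
  theme: string;                 // theme.md summary
}

async function gatherContext(prompt: string): Promise<GatheredContext> {
  // Stage 1: a cheaper model runs list_dir / get_files_content here.
  return { files: {}, layout: "", theme: "" };
}

function buildSystemPrompt(ctx: GatheredContext): string {
  // Stage 2 system prompt = static rule sections + injected context.
  return [
    "ROLE_SECTION",
    "MOBILE_NATIVE_SECTION",
    `LAYOUT:\n${ctx.layout}`,
    `THEME:\n${ctx.theme}`,
  ].join("\n\n");
}

async function codeGenModel(system: string, prompt: string): Promise<string> {
  // Stage 2: the main model streams code; stubbed as one string here.
  return `// generated for: ${prompt}`;
}

async function generate(prompt: string): Promise<string> {
  const ctx = await gatherContext(prompt); // Stage 1: tools, no code
  const system = buildSystemPrompt(ctx);   // inject the gathered context
  return codeGenModel(system, prompt);     // Stage 2: code, no tools
}
```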

Stage 1: Context Gathering with Tool Calls

The context gathering stage uses a model configured for the CONTEXT_GATHERING purpose — selected from a database table (ai_model_configs) that maps purposes to specific model providers and IDs. This makes the model selection runtime-configurable: the engineering team can switch models for any purpose without a code deployment.

The context model is given five tools:

  • get_files_content — reads specific project files, with optional line range support to avoid pulling in entire large files
  • list_dir — lists the files in the project's virtual file system
  • glob — finds files matching a pattern (e.g., components/**/*.tsx)
  • batch_grep — searches file contents for a pattern across multiple files simultaneously
  • get_images_by_keywords — fetches image asset URLs for the project by keyword, enabling the AI to use actual project assets when generating UI

The stage is capped at maxSteps: 4 to prevent runaway tool-calling loops. The model typically uses two to three steps: one to list the directory and identify what's relevant, one to read specific files (the target screen, layout definitions, theme config), and sometimes a third to search for component patterns.

The output from this stage is parsed and deduplicated. Tool results are organized by file path and combined with the project's layout.md and theme.md metadata files, which contain structured descriptions of the project's screen architecture and color system respectively. This combined context becomes the injection payload for Stage 2.
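That post-processing step can be sketched as a simple group-and-dedupe pass. The result shape is an assumption based on the description above:

```typescript
// Sketch of Stage 1 post-processing: tool results are organized by file
// path and deduplicated before becoming the Stage 2 injection payload.
interface ToolResult {
  path: string;
  content: string;
}

function dedupeByPath(results: ToolResult[]): Map<string, string> {
  const byPath = new Map<string, string>();
  for (const r of results) {
    // Later reads win, so the freshest content for each path is kept
    // and the same file is never injected twice.
    byPath.set(r.path, r.content);
  }
  return byPath;
}
```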

The two-stage pipeline separates context discovery from code generation, improving both cost efficiency and output quality — Photo by Google DeepMind on Unsplash

Stage 2: Code Generation with Injected Context

The main model — configured for the MAIN_GENERATION purpose — receives a system prompt that is assembled from multiple composable sections:

  • ROLE_SECTION — establishes the model as an expert mobile developer
  • RESPONSIBILITIES_SECTION — encodes strict constraints (no dynamic routing, static path references only)
  • MOBILE_NATIVE_SECTION — React Native and NativeWind-specific rules (how to handle SafeAreaView, flex-wrap behavior, platform-specific patterns)
  • UNSUPPORTED_TAILWIND_SECTION — an explicit list of Tailwind classes that are forbidden in native rendering
  • Template-specific section — file path prefixes and conventions for the project's specific Expo template
  • Context injection — the structured output from Stage 1, including file contents, layout description, and theme definition

The template-specific section is critical. RapidNative supports multiple project templates — including nativewind, nativewind-themed, and fullstack — each with different directory conventions. The system prompt includes a FILE PATH CONFIGURATION section that tells the model exactly what path prefixes to use. This is why generated imports work on the first try rather than requiring manual path corrections.
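A hedged sketch of that composable assembly, using the section names from the list above. The ordering (static rule sections first, per-request context last) is an assumption, though it is also the ordering that suits prompt caching:

```typescript
// Sketch: the system prompt is assembled from fixed rule sections plus
// two per-request sections (template config and Stage 1 context).
const STATIC_SECTIONS = [
  "ROLE_SECTION",
  "RESPONSIBILITIES_SECTION",
  "MOBILE_NATIVE_SECTION",
  "UNSUPPORTED_TAILWIND_SECTION",
];

function assembleSystemPrompt(
  templateSection: string,   // e.g. the FILE PATH CONFIGURATION block
  contextInjection: string,  // structured output from Stage 1
): string {
  return [...STATIC_SECTIONS, templateSection, contextInjection].join("\n\n");
}
```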

Tool calling is disabled for this stage (maxSteps: 1, no tools). The model's only task is to generate code, which it streams back via Server-Sent Events. The conversation history sent to the model is also intentionally limited to the last four messages (two user/assistant exchanges), which keeps token usage within budget while preserving enough context for iterative editing.
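The history window is a one-liner worth making explicit, since it is the mechanism that bounds token usage on long editing sessions (the message shape here is an assumption):

```typescript
// Sketch: only the last four messages (two user/assistant exchanges)
// are forwarded to the main model on each request.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

function windowHistory(history: ChatMessage[], limit = 4): ChatMessage[] {
  // slice(-limit) keeps the most recent `limit` messages.
  return history.slice(-limit);
}
```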

For projects where the main model is Anthropic's Claude, the system prompt sections are marked with ephemeral cache hints. This allows Anthropic's caching mechanism to skip reprocessing the static sections of the system prompt on repeated requests, reducing both latency and cost on sequential prompts in the same project session.

The Browser Bundler: Live Preview Without a Server

Once generated code reaches the frontend, it needs to be compiled and rendered in a live preview. This is where RapidNative's most architecturally unique component comes in: a full React Native bundler that runs entirely in the browser.

The bundler is almostmetro — an incremental React Native bundler that runs in a Web Worker. It takes the project's file tree, transforms TypeScript/TSX to JavaScript using Babel, instruments code for React Fast Refresh (HMR), and produces a bundle that can be injected into an iframe for live rendering. The entire compilation pipeline runs client-side — there is no server roundtrip for compilation.

The underlying data source for the bundler is a VirtualFileSystem (VFS) — an in-memory representation of the project's file tree backed by Supabase. When the AI generates a new version of a file, the frontend updates the VFS, and the bundler worker picks up the change for an incremental rebuild.

The bundler communicates with the main thread via a message protocol:

| Direction | Message Type | Description |
| --- | --- | --- |
| Main → Worker | watch-start | Initial full build with complete file tree |
| Main → Worker | watch-update | Incremental update with changed file(s) |
| Main → Worker | watch-stop | Stop the bundler |
| Worker → Main | watch-ready | Bundle ready for first render |
| Worker → Main | watch-rebuild | Incremental rebuild complete |
| Worker → Main | hmr-update | Hot module replacement event |
| Worker → Main | error | Build error details |
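The protocol above maps naturally onto a discriminated union, which is how a TypeScript main thread would typically consume it. The payload fields here are illustrative assumptions, not almostmetro's actual message shapes:

```typescript
// Sketch of the main-thread <-> worker protocol as discriminated unions.
type MainToWorker =
  | { type: "watch-start"; files: Record<string, string> }
  | { type: "watch-update"; changed: Record<string, string> }
  | { type: "watch-stop" };

type WorkerToMain =
  | { type: "watch-ready"; bundle: string }
  | { type: "watch-rebuild"; bundle: string }
  | { type: "hmr-update"; modules: string[] }
  | { type: "error"; message: string };

// Narrowing helper for an onmessage handler: both "ready" and "rebuild"
// carry a bundle that should be forwarded to the preview iframe.
function isBuildOutput(
  msg: WorkerToMain,
): msg is Extract<WorkerToMain, { bundle: string }> {
  return msg.type === "watch-ready" || msg.type === "watch-rebuild";
}
```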

The bundler is fault-tolerant by design. If a single file has a syntax error — which can happen with partial AI-generated code during streaming — that file is replaced with a BrokenComponentStub: a visible error card that shows on screen in place of the broken component. Other screens in the app continue to function normally. This is a significant UX advantage: a syntax error in one screen doesn't kill the entire preview.
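Conceptually, the stub substitution is a try/catch around each file's transform. The sketch below is renderer-agnostic and the stub shape is an assumption; in the real bundler the stub is a React Native component that renders the error card:

```typescript
// Sketch of per-file fault tolerance: a failing transform is swapped
// for a stub instead of failing the whole bundle.
type ModuleFactory = () => unknown;

function safeTransform(
  path: string,
  source: string,
  transform: (src: string) => ModuleFactory,
): ModuleFactory {
  try {
    return transform(source);
  } catch (err) {
    // BrokenComponentStub stand-in: other screens keep working, and this
    // one surfaces a visible error instead of crashing the preview.
    return () => ({ stub: true, path, error: String(err) });
  }
}
```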

The bundle output is injected into an iframe, which communicates back to the main editor via postMessage. When a user taps a component in the preview, the iframe reports the event. When the editor wants to apply a code update, it posts a message to the iframe with the new bundle content and waits for a ready confirmation (with a 100ms debounce) before applying.

Live preview renders on a virtual device — or scan a QR code to see it on a real phone — Photo by Paul Hanaoka on Unsplash

Streaming Architecture: From Token to Rendered Screen

The full event chain from AI generation to rendered preview involves several systems working in sequence. Understanding the streaming architecture helps explain why the preview feels instantaneous rather than batch-processed.

The API route emits Server-Sent Events with these headers:

Content-Type: text/event-stream
Cache-Control: no-cache, no-transform
Connection: keep-alive
X-Accel-Buffering: no

Each SSE event has a typed structure: event: <type>\ndata: <JSON>\n\n. The event types form a lifecycle:

  • start — request accepted, pipeline beginning
  • text — a content chunk from Stage 2 generation (includes status messages like *Analyzing your project...* before real code begins streaming)
  • tool-call and tool-result — Stage 1 context gathering events, visible in the UI as progress indicators
  • usage — combined token metrics from both stages (input + output from both models)
  • done — generation complete, with finish reason
  • error — stream error with details
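A minimal client-side parser for this framing looks like the sketch below. The payload shapes inside data are assumptions; only the event: <type>\ndata: <JSON>\n\n framing comes from the description above:

```typescript
// Sketch: parse SSE frames ("event: <type>\ndata: <JSON>\n\n") into
// typed events the frontend can route to handlers.
interface SseEvent {
  event: string;
  data: unknown;
}

function parseSse(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Frames are separated by a blank line.
  for (const frame of chunk.split("\n\n")) {
    const eventLine = frame.match(/^event: (.+)$/m);
    const dataLine = frame.match(/^data: (.+)$/m);
    if (eventLine && dataLine) {
      events.push({ event: eventLine[1], data: JSON.parse(dataLine[1]) });
    }
  }
  return events;
}
```

A production client would also buffer partial frames across network chunks; this sketch assumes each chunk ends on a frame boundary.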

On the frontend, the useSendMessage hook subscribes to this SSE stream, dispatches optimistic UI updates to the Redux store, and routes each event type to the appropriate handler. As text events arrive and the complete code block is assembled, the file is written to the VFS, triggering an incremental bundler rebuild, which produces an HMR update that the iframe applies — all within the duration of the streaming response itself.

An AbortController connects the client connection to the in-flight AI request: if the user navigates away or cancels generation, the signal propagates to the AI SDK and the LLM call is terminated immediately, preventing wasted tokens.
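The wiring is simple but worth seeing: one controller's signal is shared between the client connection and the model call, so cancelling one tears down the other. llmCall below is a hypothetical stand-in for the AI SDK invocation:

```typescript
// Sketch of abort propagation from client disconnect to the LLM call.
function startGeneration(
  llmCall: (signal: AbortSignal) => void,
): AbortController {
  const controller = new AbortController();
  // The same signal would be passed to the streaming model call and
  // tied to the SSE response lifecycle; aborting stops both.
  llmCall(controller.signal);
  return controller;
}
```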

Multi-Provider LLM Architecture

One of RapidNative's more operationally valuable architectural decisions is that LLM providers are not hardcoded into the application. Model selection is backed by a database table (ai_model_configs) that maps generation purposes (MAIN_GENERATION, CONTEXT_GATHERING, VISION) to specific provider and model combinations.

The AIModelConfigService reads from this table with a 5-minute cache (using Next.js unstable_cache with revalidateTag support). This means the engineering team can switch the model used for any purpose — without a code change or deployment — just by updating a database record.

Supported providers include OpenRouter, AWS Bedrock, Google Vertex AI, Azure (both OpenAI and Anthropic models), and direct Anthropic API access. The underlying transport uses the Vercel AI SDK (ai v4.3.19), which abstracts provider differences behind a unified streaming interface.

This architecture makes RapidNative resilient to model provider outages and allows rapid experimentation with new models as they become available — a significant advantage given how quickly frontier model capabilities are evolving.

The Credit System That Balances Cost and Access

Running two-stage AI generation with frontier models on every request is expensive. RapidNative manages this via a four-bucket credit system that tracks usage across different entitlement types:

| Credit Bucket | Description |
| --- | --- |
| Free daily credits | 5 credits/day, resets each day |
| Free monthly credits | 20 credits/month, resets each month |
| Subscription credits | Monthly allocation based on plan tier |
| Non-expiring credits | Purchased bonus credits that don't expire |

Before each generation request, the CreditService validates that sufficient credits exist across these buckets in priority order. The deduction happens post-request, after the full response is received and token usage is known. One edge case is handled explicitly: "Fix with AI" prompts — automatically triggered when the bundler detects a build error — skip credit deduction entirely, since the error being fixed was generated by the system.

This four-bucket design gives the team fine-grained control over the economics of AI access: free users get a meaningful daily allowance, subscribers get a larger monthly pool, and power users can top up with non-expiring credits.
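Priority-ordered deduction across the four buckets can be sketched as follows. The bucket names and the ordering (daily, then monthly, then subscription, then non-expiring) are assumptions inferred from the description, chosen so expiring credits are always spent first:

```typescript
// Sketch: spend credits in priority order across the four buckets.
type Buckets = {
  daily: number;
  monthly: number;
  subscription: number;
  nonExpiring: number;
};

const PRIORITY: (keyof Buckets)[] = ["daily", "monthly", "subscription", "nonExpiring"];

function deduct(buckets: Buckets, cost: number): Buckets | null {
  const next = { ...buckets };
  let remaining = cost;
  for (const bucket of PRIORITY) {
    const used = Math.min(next[bucket], remaining);
    next[bucket] -= used;
    remaining -= used;
  }
  // null signals insufficient credits across all buckets combined.
  return remaining === 0 ? next : null;
}
```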

Putting It Together: The Full Request Lifecycle

When a user types a prompt in RapidNative and hits send, here's the complete lifecycle:

  1. useSendMessage dispatches a Redux action, sets optimistic UI state, and compresses any attached images
  2. POST request hits /api/user/ai/generate-v2 with the prompt, project ID, target file path, and conversation history
  3. Server validates session, team membership, and credit balance
  4. Stage 1 begins: context gathering model reads the project's file tree, pulls relevant files, searches for patterns
  5. Context is structured and injected into the Stage 2 system prompt
  6. Stage 2 begins: main model generates code, streaming chunks via SSE
  7. Frontend assembles streamed chunks, updates the Redux store and VFS as the response arrives
  8. Bundler worker receives a watch-update message, performs an incremental compile
  9. Compiled bundle is posted to the iframe
  10. Iframe applies HMR update — the preview updates in place, with no full reload
  11. Token usage is reported via the usage event; credits are deducted
  12. done event marks completion; the UI transitions back to the ready state

The entire cycle — from send to rendered preview — typically completes in a few seconds, depending on the complexity of the generated screen. More importantly, the result is code that actually works: correct import paths, NativeWind-compatible styles, template-appropriate file structure, and patterns that match the rest of the project.

What This Architecture Makes Possible

The two-step pipeline and browser bundler unlock a development experience that's fundamentally different from prompting a general AI assistant. You're not getting generic React Native code that needs to be corrected and integrated — you're getting code that already knows your project's conventions, runs through a real compiler, and shows up on screen within seconds.

For developers, it means iterating on screens without switching contexts. For non-developers, it means building production-ready React Native interfaces without writing a line of code. And for teams, the architecture supports real-time collaboration, version-tracked file changes, and export to a complete Expo project ready for App Store or Google Play submission.

The engineering decisions described here — multi-provider model flexibility, browser-side bundling, streaming with fault-tolerant HMR — are what separate a reliable AI mobile app builder from a demo that works once and breaks under real conditions.

If you're building mobile apps and want to see the pipeline in action, try RapidNative free — no credit card required.


