How RapidNative's Two-Step AI Pipeline and Browser Bundler Power Instant React Native Code Generation
A technical deep-dive into RapidNative's two-step AI pipeline: context gathering, code generation, browser bundler, and SSE streaming — prompt to live app in seconds.
By Parth
30th Mar 2026
There is a specific, frustrating failure mode that happens when you ask a general-purpose AI assistant to write React Native code. The output looks reasonable. It might even pass a linter. Then you try to run it and get a blank screen, a cryptic Metro error, or a layout that's completely broken on device.
The root cause is almost always the same: the AI generated code without understanding its context. It didn't know your project's file structure. It didn't know you're using NativeWind with Expo SDK 52. It didn't know that space-x-4 and CSS grid classes don't work in the native renderer. It made confident, plausible-sounding guesses — and those guesses were wrong.
This is the exact engineering problem that RapidNative, the AI-native mobile app builder, was designed to solve. Rather than sending a prompt directly to a language model and hoping for the best, RapidNative runs a structured two-step pipeline — first understanding your project's context, then generating code informed by that context. Combined with a browser-based React Native bundler that compiles and previews code without a server, it creates a feedback loop that turns a plain-text prompt into a running app screen in seconds.
This post breaks down exactly how that works — from the first token of context gathering to the final rendered frame in the live preview.
The challenge: making AI-generated code work in React Native's strict native rendering environment — Photo by Luca Bravo on Unsplash
Why React Native Code Generation Is Harder Than It Looks
Before diving into the architecture, it's worth understanding why React Native presents a harder target than web code generation.
React Native code runs through a native rendering engine, not a browser DOM. This means many common CSS patterns that any AI model has seen millions of times simply don't work. NativeWind — the utility-first CSS library for React Native — explicitly forbids certain Tailwind classes including space-x, space-y, grid utilities, and others that have no native equivalent. An AI model trained on web code will reach for these patterns reflexively.
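To make the failure mode concrete, here is a minimal sketch (not RapidNative's actual code) of how a generator could pre-screen a className string for Tailwind utilities that have no native equivalent. The forbidden list is a partial example for illustration, not NativeWind's full set:

```typescript
// Illustrative lint helper: flag Tailwind classes that have no equivalent in
// the native renderer before code ever reaches the bundler.
// Partial example list — NativeWind's real unsupported set is larger.
const UNSUPPORTED_PREFIXES = ["space-x-", "space-y-", "grid", "col-span-", "row-span-"];

function findUnsupportedClasses(classNameAttr: string): string[] {
  return classNameAttr
    .split(/\s+/)
    .filter((cls) =>
      UNSUPPORTED_PREFIXES.some((p) => cls === p || cls.startsWith(p))
    );
}
```

A check like this is cheap insurance: `findUnsupportedClasses("flex-row space-x-4 p-2")` flags `space-x-4`, exactly the kind of class a web-trained model reaches for reflexively.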
Beyond styling, the file structure of a React Native / Expo project is template-dependent. A standard Expo project uses app/ for routes and components/ for shared components. But a fullstack Expo monorepo might use expo/app/ and expo/components/ as path prefixes. When the AI generates an import like import { Button } from '../components/Button', it needs to know which prefix applies — otherwise the import silently fails and the component never renders.
There are also Expo-specific patterns: SafeAreaView needs edges props for correct insets, dynamic routing isn't supported in certain configurations, certain navigation patterns require specific wrappers. These are the kinds of constraints that appear in RapidNative's system prompt under specific enforcement sections like MOBILE_NATIVE_SECTION and UNSUPPORTED_TAILWIND_SECTION.
The fundamental challenge is that correct React Native generation requires deep context: project-specific file paths, component patterns in the existing codebase, template constraints, and NativeWind rules. Getting that context reliably is what the two-step pipeline is built around.
The Two-Step Pipeline: Context First, Code Second
RapidNative's AI generation endpoint (/api/user/ai/generate-v2) structures every code generation request as two discrete stages, run sequentially.
Stage 1: Context Gathering. A faster, cheaper language model is given the user's prompt and a set of codebase tools. Its job is not to write code — it's to read the project and gather the information the main model will need. It can read specific files, list directories, search file contents, and even fetch images by keyword. After up to four tool-calling steps, its context-gathering work is complete.
Stage 2: Code Generation. The gathered context is injected into the system prompt for a more capable main model. This model receives the full context — layout files, theme definitions, current file content, template rules — and generates the actual code. Tool calling is disabled for this stage. Its only job is to produce output, which streams back to the client via Server-Sent Events.
This separation is a deliberate architectural choice that solves two problems at once. First, it keeps token costs down: reading files and searching codebases is done with a cheaper model, while the expensive tokens are reserved for the actual generation step. Second, it produces better code: the main model gets a clean, structured context injection rather than needing to decide what to read and write simultaneously.
The data flow looks like this:
User Prompt
↓
Stage 1: Context Gathering Model
- list_dir (project file tree)
- get_files_content (layout.md, theme.md, target file)
- glob (find files by pattern)
- batch_grep (search across file contents)
- get_images_by_keywords (for UI assets)
↓
Structured context output
↓
Stage 2: Code Generation Model
- Receives context-enriched system prompt
- No tool calls — pure generation
- Streams response via SSE
↓
Generated code (streamed)
↓
Editor state update → Bundler → Preview
Stage 1: Context Gathering with Tool Calls
The context gathering stage uses a model configured for the CONTEXT_GATHERING purpose — selected from a database table (ai_model_configs) that maps purposes to specific model providers and IDs. This makes the model selection runtime-configurable: the engineering team can switch models for any purpose without a code deployment.
The context model is given five tools:
- `get_files_content` — reads specific project files, with optional line-range support to avoid pulling in entire large files
- `list_dir` — lists the files in the project's virtual file system
- `glob` — finds files matching a pattern (e.g., `components/**/*.tsx`)
- `batch_grep` — searches file contents for a pattern across multiple files simultaneously
- `get_images_by_keywords` — fetches image asset URLs for the project by keyword, enabling the AI to use actual project assets when generating UI
The stage is capped at maxSteps: 4 to prevent runaway tool-calling loops. The model typically uses two to three steps: one to list the directory and identify what's relevant, one to read specific files (the target screen, layout definitions, theme config), and sometimes a third to search for component patterns.
The output from this stage is parsed and deduplicated. Tool results are organized by file path and combined with the project's layout.md and theme.md metadata files, which contain structured descriptions of the project's screen architecture and color system respectively. This combined context becomes the injection payload for Stage 2.
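The dedup-and-assemble step can be sketched as a pure function. This is an illustrative reconstruction, not RapidNative's implementation — the heading format and function names are assumptions:

```typescript
// Sketch: merge Stage 1 tool results (keeping the last read per file path)
// and fold in layout.md / theme.md metadata to form the Stage 2 injection.
interface ToolFileResult { path: string; content: string }

function buildContextInjection(
  results: ToolFileResult[],
  layoutMd: string,
  themeMd: string
): string {
  const byPath = new Map<string, string>();
  for (const r of results) byPath.set(r.path, r.content); // later reads win

  const files = [...byPath.entries()]
    .map(([path, content]) => `### ${path}\n${content}`)
    .join("\n\n");

  return `## Layout\n${layoutMd}\n\n## Theme\n${themeMd}\n\n## Files\n${files}`;
}
```

The key property is idempotence: if the context model reads the same file twice across its tool-calling steps, only one copy reaches the main model's prompt.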
The two-stage pipeline separates context discovery from code generation, improving both cost efficiency and output quality — Photo by Google DeepMind on Unsplash
Stage 2: Code Generation with Injected Context
The main model — configured for the MAIN_GENERATION purpose — receives a system prompt that is assembled from multiple composable sections:
- `ROLE_SECTION` — establishes the model as an expert mobile developer
- `RESPONSIBILITIES_SECTION` — encodes strict constraints (no dynamic routing, static path references only)
- `MOBILE_NATIVE_SECTION` — React Native and NativeWind-specific rules (how to handle SafeAreaView, flex-wrap behavior, platform-specific patterns)
- `UNSUPPORTED_TAILWIND_SECTION` — an explicit list of Tailwind classes that are forbidden in native rendering
- Template-specific section — file path prefixes and conventions for the project's specific Expo template
- Context injection — the structured output from Stage 1, including file contents, layout description, and theme definition
The template-specific section is critical. RapidNative supports multiple project templates — including nativewind, nativewind-themed, and fullstack — each with different directory conventions. The system prompt includes a FILE PATH CONFIGURATION section that tells the model exactly what path prefixes to use. This is why generated imports work on the first try rather than requiring manual path corrections.
Tool calling is disabled for this stage (maxSteps: 1, no tools). The model's only task is to generate code, which it streams back via Server-Sent Events. The conversation history sent to the model is also intentionally limited to the last four messages (two user/assistant exchanges), which keeps token usage within budget while preserving enough context for iterative editing.
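The history-trimming rule is simple enough to state as code. A minimal sketch, assuming a flat message array (names illustrative):

```typescript
// Sketch: bound token usage by keeping only the last two user/assistant
// exchanges (four messages) when building the Stage 2 request.
type Role = "user" | "assistant";
interface ChatMessage { role: Role; content: string }

function trimHistory(messages: ChatMessage[], keep = 4): ChatMessage[] {
  return messages.slice(-keep);
}
```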
For projects where the main model is Anthropic's Claude, the system prompt sections are marked with ephemeral cache hints. This allows Anthropic's caching mechanism to skip reprocessing the static sections of the system prompt on repeated requests, reducing both latency and cost on sequential prompts in the same project session.
The Browser Bundler: Live Preview Without a Server
Once generated code reaches the frontend, it needs to be compiled and rendered in a live preview. This is where RapidNative's most architecturally unique component comes in: a full React Native bundler that runs entirely in the browser.
The bundler is almostmetro — an incremental React Native bundler that runs in a Web Worker. It takes the project's file tree, transforms TypeScript/TSX to JavaScript using Babel, instruments code for React Fast Refresh (HMR), and produces a bundle that can be injected into an iframe for live rendering. The entire compilation pipeline runs client-side — there is no server roundtrip for compilation.
The underlying data source for the bundler is a VirtualFileSystem (VFS) — an in-memory representation of the project's file tree backed by Supabase. When the AI generates a new version of a file, the frontend updates the VFS, and the bundler worker picks up the change for an incremental rebuild.
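The VFS-to-bundler handoff can be sketched as a dirty-set pattern: writes record which paths changed, and the changed set becomes the payload of the next incremental-update message. This is an illustrative sketch, not almostmetro's actual API:

```typescript
// Sketch: in-memory file tree that tracks changed paths so the bundler worker
// can be sent only the delta for an incremental rebuild.
class VirtualFileSystem {
  private files = new Map<string, string>();
  private dirty = new Set<string>();

  write(path: string, content: string): void {
    this.files.set(path, content);
    this.dirty.add(path);
  }

  read(path: string): string | undefined {
    return this.files.get(path);
  }

  // Drain the changed-path set, e.g. to build a watch-update payload.
  flushDirty(): string[] {
    const changed = [...this.dirty];
    this.dirty.clear();
    return changed;
  }
}
```

When the AI rewrites one screen, the dirty set contains exactly one path — which is what makes the incremental rebuild cheap.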
The bundler communicates with the main thread via a message protocol:
| Direction | Message Type | Description |
|---|---|---|
| Main → Worker | watch-start | Initial full build with complete file tree |
| Main → Worker | watch-update | Incremental update with changed file(s) |
| Main → Worker | watch-stop | Stop the bundler |
| Worker → Main | watch-ready | Bundle ready for first render |
| Worker → Main | watch-rebuild | Incremental rebuild complete |
| Worker → Main | hmr-update | Hot module replacement event |
| Worker → Main | error | Build error details |
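The protocol in the table above maps naturally onto a discriminated union, which gives exhaustive handling on both sides of the worker boundary. The payload shapes here are assumptions for illustration:

```typescript
// Typed sketch of the bundler worker protocol (payload fields are hypothetical).
type MainToWorker =
  | { type: "watch-start"; files: Record<string, string> }   // full tree
  | { type: "watch-update"; changed: Record<string, string> } // delta only
  | { type: "watch-stop" };

type WorkerToMain =
  | { type: "watch-ready"; bundle: string }
  | { type: "watch-rebuild"; bundle: string }
  | { type: "hmr-update"; modules: string[] }
  | { type: "error"; message: string };

function describeWorkerEvent(msg: WorkerToMain): string {
  switch (msg.type) {
    case "watch-ready":   return "initial bundle ready";
    case "watch-rebuild": return "incremental rebuild complete";
    case "hmr-update":    return `hot update for ${msg.modules.length} module(s)`;
    case "error":         return `build error: ${msg.message}`;
  }
}
```

Because the switch is exhaustive over the union, adding a new message type becomes a compile error at every call site that forgets to handle it — a useful property for a protocol shared between the main thread and a worker.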
The bundler is fault-tolerant by design. If a single file has a syntax error — which can happen with partial AI-generated code during streaming — that file is replaced with a BrokenComponentStub: a visible error card that shows on screen in place of the broken component. Other screens in the app continue to function normally. This is a significant UX advantage: a syntax error in one screen doesn't kill the entire preview.
The bundle output is injected into an iframe, which communicates back to the main editor via postMessage. When a user taps a component in the preview, the iframe reports the event. When the editor wants to apply a code update, it posts a message to the iframe with the new bundle content and waits for a ready confirmation (with a 100ms debounce) before applying.
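The apply-after-ready handshake can be sketched as a small queue: updates posted before the iframe confirms readiness are held and flushed in order. Names are hypothetical, and the real implementation adds the ~100ms debounce on the ready confirmation, which this sketch omits:

```typescript
// Sketch: hold bundle updates until the iframe reports ready, then flush
// queued updates in arrival order.
class PreviewBridge {
  private ready = false;
  private pending: string[] = [];

  constructor(private post: (bundle: string) => void) {}

  onIframeReady(): void {
    this.ready = true;
    for (const bundle of this.pending.splice(0)) this.post(bundle);
  }

  applyUpdate(bundle: string): void {
    if (this.ready) this.post(bundle);
    else this.pending.push(bundle); // queue until the iframe confirms ready
  }
}
```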
Live preview renders on a virtual device — or scan a QR code to see it on a real phone — Photo by Paul Hanaoka on Unsplash
Streaming Architecture: From Token to Rendered Screen
The full event chain from AI generation to rendered preview involves several systems working in sequence. Understanding the streaming architecture helps explain why the preview feels instantaneous rather than batch-processed.
The API route emits Server-Sent Events with these headers:
Content-Type: text/event-stream
Cache-Control: no-cache, no-transform
Connection: keep-alive
X-Accel-Buffering: no
Each SSE event has a typed structure: event: <type>\ndata: <JSON>\n\n. The event types form a lifecycle:
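A single frame of that wire format can be parsed in a few lines. This is an illustrative sketch — a production client would use `EventSource` or a proper streaming parser that handles partial frames and multi-line `data:` fields:

```typescript
// Sketch: parse one complete SSE frame of the form
//   event: <type>\ndata: <JSON>\n\n
interface SseEvent { event: string; data: unknown }

function parseSseFrame(frame: string): SseEvent {
  const lines = frame.trim().split("\n");
  const event = lines[0].replace(/^event:\s*/, "");
  const data = JSON.parse(lines[1].replace(/^data:\s*/, ""));
  return { event, data };
}
```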
- `start` — request accepted, pipeline beginning
- `text` — a content chunk from Stage 2 generation (includes status messages like *Analyzing your project...* before real code begins streaming)
- `tool-call` and `tool-result` — Stage 1 context gathering events, visible in the UI as progress indicators
- `usage` — combined token metrics from both stages (input + output from both models)
- `done` — generation complete, with finish reason
- `error` — stream error with details
On the frontend, the useSendMessage hook subscribes to this SSE stream, dispatches optimistic UI updates to the Redux store, and routes each event type to the appropriate handler. As text events arrive and the complete code block is assembled, the file is written to the VFS, triggering an incremental bundler rebuild, which produces an HMR update that the iframe applies — all within the duration of the streaming response itself.
An AbortController connects the client connection to the in-flight AI request: if the user navigates away or cancels generation, the signal propagates to the AI SDK and the LLM call is terminated immediately, preventing wasted tokens.
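The abort wiring boils down to sharing one `AbortSignal` between the client connection and the model call. A minimal sketch, where `callModel` stands in for any signal-aware API call:

```typescript
// Sketch: one AbortController ties the client connection to the in-flight
// model call, so a disconnect or manual cancel stops generation immediately.
function streamWithCancel(
  callModel: (signal: AbortSignal) => Promise<string>
): { result: Promise<string | null>; cancel: () => void } {
  const controller = new AbortController();
  const result = callModel(controller.signal).catch((err) => {
    if (controller.signal.aborted) return null; // user cancelled: not an error
    throw err; // a real failure still surfaces
  });
  return { result, cancel: () => controller.abort() };
}
```

In the actual route, `cancel` would be invoked when the HTTP connection closes; the distinction between "cancelled" and "failed" matters because only the latter should be reported as an error.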
Multi-Provider LLM Architecture
One of RapidNative's more operationally valuable architectural decisions is that LLM providers are not hardcoded into the application. Model selection is backed by a database table (ai_model_configs) that maps generation purposes (MAIN_GENERATION, CONTEXT_GATHERING, VISION) to specific provider and model combinations.
The AIModelConfigService reads from this table with a 5-minute cache (using Next.js unstable_cache with revalidateTag support). This means the engineering team can switch the model used for any purpose — without a code change or deployment — just by updating a database record.
Supported providers include OpenRouter, AWS Bedrock, Google Vertex AI, Azure (both OpenAI and Anthropic models), and direct Anthropic API access. The underlying transport uses the Vercel AI SDK (ai v4.3.19), which abstracts provider differences behind a unified streaming interface.
This architecture makes RapidNative resilient to model provider outages and allows rapid experimentation with new models as they become available — a significant advantage given how quickly frontier model capabilities are evolving.
The Credit System That Balances Cost and Access
Running two-stage AI generation with frontier models on every request is expensive. RapidNative manages this via a four-bucket credit system that tracks usage across different entitlement types:
| Credit Bucket | Description |
|---|---|
| Free daily credits | 5 credits/day, resets each day |
| Free monthly credits | 20 credits/month, resets each month |
| Subscription credits | Monthly allocation based on plan tier |
| Non-expiring credits | Purchased bonus credits that don't expire |
Before each generation request, the CreditService validates that sufficient credits exist across these buckets in priority order. The deduction happens post-request, after the full response is received and token usage is known. One edge case is handled explicitly: "Fix with AI" prompts — automatically triggered when the bundler detects a build error — skip credit deduction entirely, since the error being fixed was generated by the system.
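Priority-ordered deduction across the four buckets can be sketched as a pure function. The bucket order follows the table above; the field names and the exact order are assumptions for illustration:

```typescript
// Sketch: spend credits from the four buckets in priority order, or reject
// the request if the combined balance is insufficient.
interface CreditBuckets {
  freeDaily: number;
  freeMonthly: number;
  subscription: number;
  nonExpiring: number;
}

const PRIORITY: (keyof CreditBuckets)[] = [
  "freeDaily", "freeMonthly", "subscription", "nonExpiring",
];

// Returns updated buckets, or null if the total balance is insufficient.
function deduct(buckets: CreditBuckets, amount: number): CreditBuckets | null {
  const total = PRIORITY.reduce((sum, k) => sum + buckets[k], 0);
  if (total < amount) return null;

  const next = { ...buckets };
  let remaining = amount;
  for (const key of PRIORITY) {
    const take = Math.min(next[key], remaining);
    next[key] -= take;
    remaining -= take;
    if (remaining === 0) break;
  }
  return next;
}
```

Spending expiring credits first is the sensible default: daily and monthly allowances vanish anyway, so purchased non-expiring credits should be the last bucket touched.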
This four-bucket design gives the team fine-grained control over the economics of AI access: free users get a meaningful daily allowance, subscribers get a larger monthly pool, and power users can top up with non-expiring credits.
Putting It Together: The Full Request Lifecycle
When a user types a prompt in RapidNative and hits send, here's the complete lifecycle:
1. `useSendMessage` dispatches a Redux action, sets optimistic UI state, and compresses any attached images
2. POST request hits `/api/user/ai/generate-v2` with the prompt, project ID, target file path, and conversation history
3. Server validates session, team membership, and credit balance
4. Stage 1 begins: context gathering model reads the project's file tree, pulls relevant files, searches for patterns
5. Context is structured and injected into the Stage 2 system prompt
6. Stage 2 begins: main model generates code, streaming chunks via SSE
7. Frontend assembles streamed chunks, updates the Redux store and VFS as the response arrives
8. Bundler worker receives a `watch-update` message, performs an incremental compile
9. Compiled bundle is posted to the iframe
10. Iframe applies HMR update — the preview updates in place, with no full reload
11. Token usage is reported via the `usage` event; credits are deducted
12. `done` event marks completion; the UI transitions back to the ready state
The entire cycle — from send to rendered preview — typically completes in a few seconds, depending on the complexity of the generated screen. More importantly, the result is code that actually works: correct import paths, NativeWind-compatible styles, template-appropriate file structure, and patterns that match the rest of the project.
What This Architecture Makes Possible
The two-step pipeline and browser bundler unlock a development experience that's fundamentally different from prompting a general AI assistant. You're not getting generic React Native code that needs to be corrected and integrated — you're getting code that already knows your project's conventions, runs through a real compiler, and shows up on screen within seconds.
For developers, it means iterating on screens without switching contexts. For non-developers, it means building production-ready React Native interfaces without writing a line of code. And for teams, the architecture supports real-time collaboration, version-tracked file changes, and export to a complete Expo project ready for App Store or Google Play submission.
The engineering decisions described here — multi-provider model flexibility, browser-side bundling, streaming with fault-tolerant HMR — are what separate a reliable AI mobile app builder from a demo that works once and breaks under real conditions.
If you're building mobile apps and want to see the pipeline in action, try RapidNative free — no credit card required.
Related reading:
- Prompt to App: How to Write Prompts That Generate Great Mobile Screens
- AI React Native Generator: A Practical Guide
- Vibe Coding: The Complete Guide to AI-Assisted App Development
- Build React Native App with AI
Ready to Build Your App?
Turn your idea into a production-ready React Native app in minutes.