Aside is an iOS chat app that runs on Apple Intelligence. No API key field. No backend. No "sign in to continue." You open the app and you can talk to a language model on your phone. The conversation history is in SwiftData on the device. The inference happens on the Neural Engine. The whole thing works on a plane.
The interesting question isn't "can you build a chat UI on top of LanguageModelSession" — yes, in about 30 lines. The interesting question is everything else: how do you persist conversations, how do you handle image attachments when the API is text-only, how do you integrate the other Apple Intelligence surfaces (Image Playground, Writing Tools, Translate, Summarize, Smart Reply) into a single coherent chat, and how do you ship it 13 builds deep without the architecture turning into spaghetti.
This is the writeup. Aside is $9.99 on the App Store; the full Xcode source is $19 on Polar.
The FoundationModels wrapper
The whole "talk to a model" surface is one Swift file (AIService.swift), and the core of it is one LanguageModelSession instance:
    import FoundationModels
    import Observation

    @Observable
    @MainActor
    final class AIService {
        /// The live session. Replaced wholesale whenever the instructions change.
        private var session: LanguageModelSession
        /// Instructions the current session was created with (nil means no system prompt).
        private var currentInstructions: String?

        init(systemPrompt: String? = nil) {
            self.currentInstructions = systemPrompt
            if let systemPrompt {
                self.session = LanguageModelSession(instructions: systemPrompt)
            } else {
                self.session = LanguageModelSession()
            }
        }

        /// Recreate the underlying session with new instructions if they differ.
        /// We don't reuse sessions across instruction changes because Apple's
        /// FoundationModels binds instructions at session creation.
        func updateInstructions(_ newInstructions: String) {
            guard newInstructions != currentInstructions else { return }
            currentInstructions = newInstructions
            session = LanguageModelSession(instructions: newInstructions)
        }

        // ...
    }
The instruction-rebinding gotcha is real. LanguageModelSession takes its instructions at construction time and exposes no way to change them afterward. If the user changes their persona (e.g., from "general assistant" to "Swift code reviewer"), you have to throw the existing session away and create a new one. The wrapper above handles that transparently; callers just call updateInstructions(...) and move on.
Image attachments to a text-only API
The Prompt API is text-only. Users send images. Bridge: Vision extracts a textual description (OCR'd text + scene classification) from the image, and that description gets appended to the user's prompt as inline context.
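A minimal sketch of the Vision side of that bridge, assuming a CGImage is already in hand. The helper name describeImage and the 0.3 confidence cutoff are illustrative assumptions, not taken from Aside's source; only the Vision request types are Apple's.

```swift
#if canImport(Vision)
import Vision
import CoreGraphics

/// Hypothetical helper: returns OCR'd lines and classification tags for an image.
/// `perform` is synchronous, so call this off the main thread.
func describeImage(_ image: CGImage) throws -> (lines: [String], tags: [String]) {
    let textRequest = VNRecognizeTextRequest()
    textRequest.recognitionLevel = .accurate

    let classifyRequest = VNClassifyImageRequest()

    // One handler can run both requests over the same image.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([textRequest, classifyRequest])

    // Take the top candidate string for each recognized text region.
    let lines = (textRequest.results ?? []).compactMap {
        $0.topCandidates(1).first?.string
    }

    // Keep only reasonably confident labels; 0.3 is an arbitrary cutoff.
    let tags = (classifyRequest.results ?? [])
        .filter { $0.confidence > 0.3 }
        .map { $0.identifier }

    return (lines, tags)
}
#endif
```

Running both requests through one VNImageRequestHandler avoids decoding the image twice.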
Example: user attaches a screenshot of a Python stack trace and asks "what's wrong with this?". The pipeline becomes:
- Vision OCRs the screenshot → multi-line text of the stack trace.
- Vision classifies the image → tags like "text, document, screenshot."
- Composed prompt: "[Image context: OCR'd stack trace + 'screenshot' tag] User question: what's wrong with this?"
- LanguageModelSession streams the answer.
It's not as good as a true multimodal model that sees pixels, but it's surprisingly capable for text-heavy images (code, documents, receipts, error messages) — which are the vast majority of what users send to a chat app on their phone.
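The prompt-composition step can be sketched as a pure function. The name composePrompt and the exact bracket layout are assumptions for illustration; the article only specifies that the extracted text and tags are added to the user's prompt as inline context.

```swift
/// Hypothetical helper: folds Vision output into the text prompt.
func composePrompt(question: String, ocrLines: [String], tags: [String]) -> String {
    // Nothing extracted: send the question unchanged.
    guard !ocrLines.isEmpty || !tags.isEmpty else { return question }

    var context: [String] = []
    if !tags.isEmpty {
        context.append("Tags: " + tags.joined(separator: ", "))
    }
    if !ocrLines.isEmpty {
        context.append("Text in image:\n" + ocrLines.joined(separator: "\n"))
    }

    // Delimit the image context so the model can tell it apart from the question.
    return "[Image context]\n"
        + context.joined(separator: "\n")
        + "\n[End image context]\nUser question: "
        + question
}

let prompt = composePrompt(
    question: "what's wrong with this?",
    ocrLines: ["Traceback (most recent call last):", "KeyError: 'id'"],
    tags: ["screenshot"]
)
```

Keeping this step free of framework calls makes it easy to unit-test separately from Vision and the model.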
Five Apple Intelligence integrations
Apple Intelligence isn't one feature; it's a collection of surfaces. Aside integrates five of them into the chat flow:
1. Image Playground in the composer
Sparkle button next to the text field opens Apple's .imagePlaygroundSheet seeded with whatever you've typed. The generated image is attached as your next message (no extra step). Apple's modifier is one line of SwiftUI; the seeding-with-current-input + handle-the-result flow is ~15 lines around it.
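A hedged sketch of what the seeding flow might look like. PlaygroundButton, draft, and onImage are hypothetical names; the imagePlaygroundSheet modifier and the supportsImagePlayground environment value are Apple's.

```swift
#if os(iOS) || os(macOS)
import SwiftUI
import ImagePlayground

@available(iOS 18.1, macOS 15.1, *)
struct PlaygroundButton: View {
    /// Whatever the user has typed so far; used to seed the sheet.
    let draft: String
    /// Called with the generated image's file URL (hypothetical callback).
    let onImage: (URL) -> Void

    @State private var showPlayground = false
    @Environment(\.supportsImagePlayground) private var supportsImagePlayground

    var body: some View {
        Button("Generate image", systemImage: "sparkles") {
            showPlayground = true
        }
        // Hide the entry point on devices without Image Playground.
        .disabled(!supportsImagePlayground)
        .imagePlaygroundSheet(isPresented: $showPlayground, concept: draft) { url in
            // Hand the result to the caller, which can attach it as the next message.
            onImage(url)
        }
    }
}
#endif
```

The URL passed to the completion handler points at a file the system created, so a real implementation would likely copy it into the app's own storage before attaching it.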
2. Writing Tools on the input field
Long-press the text field → Rewrite / Proofread / Summarize / Make Friendly / etc. This is one SwiftUI modifier: .writingToolsBehavior(.complete). Apple wires up the menu, the rewrite mechanics, and the UI animation. You write one line. The user gets a feature that would take a week to build from scratch with a custom LLM.
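For reference, a minimal sketch of that one line in context. ComposerField is a hypothetical view name, and Aside's real composer may use a different text control.

```swift
#if os(iOS) || os(macOS)
import SwiftUI

@available(iOS 18.0, macOS 15.0, *)
struct ComposerField: View {
    @Binding var draft: String

    var body: some View {
        TextEditor(text: $draft)
            // Opt in to the full Writing Tools experience on this field.
            .writingToolsBehavior(.complete)
    }
}
#endif
```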
3. Conversation Summarize
Chat-menu item that opens a sheet with a 3-sentence summary of the conversation so far. Implementation: a fresh LanguageModelSession with summarizer instructions, streamed into a SwiftUI Text in a sheet. ~50 lines including the sheet UI.
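A rough sketch of the summarizer call, under stated assumptions: TranscriptLine stands in for Aside's Message model (which the article does not show), the instruction string is invented, and this version awaits the full response with respond(to:) rather than streaming it into the sheet.

```swift
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Hypothetical transcript line; a stand-in for the app's real message type.
struct TranscriptLine {
    let isUser: Bool
    let text: String
}

/// Flattens the conversation into a labeled transcript for the summarizer.
func transcriptText(_ lines: [TranscriptLine]) -> String {
    lines
        .map { ($0.isUser ? "User: " : "Assistant: ") + $0.text }
        .joined(separator: "\n")
}

#if canImport(FoundationModels)
@available(iOS 26.0, macOS 26.0, visionOS 26.0, *)
func summarize(_ lines: [TranscriptLine]) async throws -> String {
    // A fresh session keeps summarizer instructions out of the main chat session.
    let session = LanguageModelSession(
        instructions: "Summarize the conversation in exactly three sentences."
    )
    let response = try await session.respond(to: transcriptText(lines))
    return response.content
}
#endif
```

Using a separate session means the summary request never becomes part of the chat's own context.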
4. Smart Reply chips
After every assistant message, three short tap-to-fill reply candidates appear above the keyboard. Generated by a quick LanguageModelSession with a "give me 3 short likely replies, one per line" prompt against the last assistant message. ~1 second to generate on iPhone 15 Pro+. Tap-to-insert; user keeps editorial control (no auto-send).
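Because the model is asked for free text with "one per line", the output needs light cleanup before it becomes chips. A sketch of that parsing step follows; parseReplyChips is a hypothetical name, and the list-marker handling is a guess at what a small model might emit.

```swift
import Foundation

/// Turns raw "one reply per line" model output into at most `limit` unique chips.
func parseReplyChips(_ raw: String, limit: Int = 3) -> [String] {
    var chips: [String] = []
    for line in raw.components(separatedBy: .newlines) {
        if chips.count >= limit { break }

        var text = line.trimmingCharacters(in: .whitespaces)

        if text.hasPrefix("- ") || text.hasPrefix("* ") || text.hasPrefix("• ") {
            // Drop a leading bullet marker.
            text.removeFirst(2)
        } else {
            // Drop a leading "1. " or "2) " marker, but keep replies that
            // merely start with a digit ("3 pm works").
            let digits = text.prefix(while: { $0.isNumber })
            let rest = text.dropFirst(digits.count)
            if !digits.isEmpty && (rest.hasPrefix(". ") || rest.hasPrefix(") ")) {
                text = String(rest.dropFirst(2))
            }
        }

        text = text.trimmingCharacters(in: .whitespaces)
        // Skip blank lines and duplicates.
        if !text.isEmpty && !chips.contains(text) {
            chips.append(text)
        }
    }
    return chips
}
```

Capping and de-duplicating here keeps the chip row stable even when the model returns more lines than requested.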
5. Translate (long-press)
Long-press any message → Translate → Apple's native on-device Translation overlay. Pick a target language, see translation. Works offline once the language pack is downloaded. Apple's .translationPresentation modifier; ~5 lines.
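A minimal sketch of how that could be wired up, assuming a context menu as the long-press trigger. TranslatableBubble is a hypothetical view; translationPresentation is the system modifier from the Translation framework.

```swift
#if os(iOS) || os(macOS)
import SwiftUI
import Translation

@available(iOS 17.4, macOS 14.4, *)
struct TranslatableBubble: View {
    let text: String
    @State private var showTranslation = false

    var body: some View {
        Text(text)
            // Long-press (or right-click on macOS) reveals the Translate action.
            .contextMenu {
                Button("Translate") { showTranslation = true }
            }
            // The system overlay handles language selection and display.
            .translationPresentation(isPresented: $showTranslation, text: text)
    }
}
#endif
```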
Persistence with SwiftData
One Conversation model + one Message model. SwiftData handles the storage. The hardest part wasn't the schema — it was making the chat list re-open the LAST conversation (not the first) on app launch, which sounds trivial but actually required threading the "active conversation ID" through @AppStorage + a small dance with SwiftUI's navigation state to avoid double-opens. Build 9 fix.
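The restore decision itself can be sketched as a pure function, kept separate from the navigation plumbing. ConversationRef and conversationToOpen are hypothetical names, and the fallback to the most recently updated conversation is an assumption about sensible behavior rather than Aside's confirmed logic.

```swift
import Foundation

/// Hypothetical lightweight view of a stored conversation.
struct ConversationRef {
    let id: UUID
    let updatedAt: Date
}

/// Decides which conversation to open on launch.
/// `storedID` is the string persisted via @AppStorage (empty when nothing was saved).
func conversationToOpen(storedID: String, in conversations: [ConversationRef]) -> UUID? {
    // Prefer the conversation the user last had open, if it still exists.
    if let id = UUID(uuidString: storedID),
       conversations.contains(where: { $0.id == id }) {
        return id
    }
    // Otherwise fall back to the most recently updated one (nil if there are none).
    return conversations.max(by: { $0.updatedAt < $1.updatedAt })?.id
}
```

Computing the target once at launch and setting navigation state a single time is one way to avoid the double-open problem described above.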
Why on-device-only is a constraint, not a marketing claim
Aside has zero network calls in the chat flow. Not "we anonymize" or "we don't store" — there's no endpoint to call. The only network code in the app is the App Store rating prompt's metadata fetch (Apple's API, runs on system schedule, not in any user-action path).
This is enforced architecturally: there is no URLSession in the chat flow, no API key field anywhere in Settings, no "account" concept. If you wanted to send conversations off-device you'd have to add network code that doesn't currently exist. The constraint shapes everything else — no cloud sync means each iPhone is independent, no account means the onboarding skips a "sign in" step that would otherwise lose 40% of installs.
13 builds in three weeks — how the velocity stays
Each TestFlight build between v1.0 and the current v1.1 added one concrete user-visible thing or fixed one specific issue. Build 4: more features + contrast. Build 6: voice input. Build 9: keyboard pushup fix. Build 10: Image Playground. Build 11: Writing Tools. Build 12: Summarize. Build 13: Smart Reply. Build 14: Translate.
Tactically what keeps the velocity going:
- One feature per build. Smaller diffs are easier to debug, easier to revert, easier to ship to TestFlight without breaking the prior build.
- Headless ship pipeline. fastlane ios ship does archive + sign + pilot upload in ~90 seconds. The friction of "submit a build" is gone, so it gets done.
- App Store metadata is git-tracked. Localized listing strings, screenshots, keywords: all in fastlane/metadata/<locale>/. Reviewing a metadata change is a PR diff, not a manual ASC click-through.
File structure
- App/: SoloApp + RootView. Boots the app, decides onboarding vs main scene.
- Features/Chat/: ChatView (the meat: composer + message list + AI integrations) + SummarySheet (the on-device summarizer).
- Features/History/: HistorySidebar (search, share-as-Markdown, regenerate, delete).
- Features/Onboarding/: 4-step intro with custom illustrations.
- Features/Settings/: persona presets, appearance, language picker.
- Models/Conversation.swift: SwiftData models.
- Storage/AIService.swift: FoundationModels wrapper described above.
- Storage/VoiceDictation.swift: Speech framework for mic input.
- UI/: reusable views (message bubble, code block renderer, voice meter, etc.).
~2,800 lines of SwiftUI + FoundationModels + SwiftData + Speech + Vision across 10 files. Zero third-party dependencies.
What you get for $19
The full Xcode project, the FoundationModels wrapper, all five AI-integration patterns above (each isolated to its own SwiftUI modifier or service class), the Vision-based image-context bridge, the SwiftData persistence layer, the Markdown + code-block renderer, the voice-dictation flow with auto-send toggle, and an AGENTS.md walking an AI coding agent through how to fork it into a different chat vertical (a domain-specific assistant, a focused-task helper, a multi-persona chat, anything that wants a chat UI on top of on-device intelligence).
Build your own on-device chat
Aside source bundle — full Xcode project + 5 working Apple Intelligence integrations + AGENTS.md. $19 one-time.