When Pieter Levels launched Nomad List in 2014, he spent weeks manually tagging cities with cost-of-living data. A decade later, he publicly described using GPT-4 to generate entire feature concept lists, score ideas against his existing user data, and draft the copy for new feature announcements — all before writing a single line of code. He called it "thinking out loud with a machine that never runs out of energy." What had previously taken a week of solo brainstorming now happened in a single evening session.
His approach — treating the model as an always-available product collaborator rather than a search engine — represents a shift that separates the new generation of solo founders from those still working the old way.
Solo founders face a structural disadvantage: every hour spent thinking is an hour not spent building. Traditional product discovery — customer interviews, competitive analysis, feature prioritization, roadmap planning — was designed for teams. Each activity has a handoff. Each handoff requires another person.
AI doesn't replace those activities. It compresses them. A well-structured conversation with GPT-4o or Claude 3.5 Sonnet can surface competitive patterns you hadn't noticed, generate 30 feature ideas you can kill in 20 minutes, and produce a prioritized backlog ordered by effort vs. impact — all before your morning coffee is cold.
The operative word is structured. Founders who treat AI as a search engine get search-engine-quality output. Founders who bring a clear problem context, relevant constraints, and iterative follow-up questions get product-partner-quality output.
The most reliable pattern for using AI in product thinking is what practitioners call context-first prompting. Before asking the AI anything, you deposit a dense block of context: who your users are, what problem you solve, what you've already tried, and what constraints you're working under. Then you ask a specific question.
Compare these two approaches:
Weak: "Give me feature ideas for my project management app."
Strong: "I run a solo-founder project management tool for freelance designers. Current paying users: 340. Top complaint from last 30 support emails: they lose track of which client owes them money. I can't build accounting integrations — no budget. What are 10 lightweight features I could ship in under a week that address this pain point?"
The second prompt does not produce better results because it is longer. It produces better results because it eliminates the AI's need to guess. Every fact you supply is a constraint that prunes the possibility space toward your actual situation.
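One way to make context-first prompting a habit is to keep the context in a small reusable structure and assemble the prompt from it. The following is a minimal sketch, not a prescribed tool: the `ProductContext` class, its field names, and `build_prompt` are hypothetical names chosen for illustration, and the sample values are taken from the strong prompt above.

```python
from dataclasses import dataclass, field


@dataclass
class ProductContext:
    """Facts the model would otherwise have to guess (illustrative fields)."""
    product: str
    users: str
    top_pain_point: str
    constraints: list[str] = field(default_factory=list)


def build_prompt(context: ProductContext, question: str) -> str:
    """Place the context block first, then end with one specific question."""
    lines = [
        f"Product: {context.product}",
        f"Users: {context.users}",
        f"Top pain point: {context.top_pain_point}",
    ]
    lines += [f"Constraint: {c}" for c in context.constraints]
    lines.append("")  # blank line separates context from the question
    lines.append(f"Question: {question}")
    return "\n".join(lines)


context = ProductContext(
    product="Solo-founder project management tool for freelance designers",
    users="340 paying users",
    top_pain_point="They lose track of which client owes them money",
    constraints=["No accounting integrations", "Must ship in under a week"],
)
prompt = build_prompt(
    context,
    "What are 10 lightweight features that address this pain point?",
)
print(prompt)
```

Keeping the context separate from the question means the same block can be reused across a whole session while only the final question changes.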
Two fundamentally different modes exist for AI-assisted product thinking, and confusing them produces poor results. Generation mode asks the AI to expand the solution space: "Give me 20 possible ways a freelancer might track unpaid invoices inside a project tool." You want volume and diversity here. Suppress your critical voice.
Evaluation mode asks the AI to compress the solution space: "Here are my 20 ideas. For each one, rate the likely development effort (low/medium/high) and the likely user delight (low/medium/high), given my constraints above." Now you want rigor, and you want the AI to apply your stated constraints harshly.
In March 2023, Indie Hackers published a thread where founder Marc Köhlbrugge (WIP.co) described running exactly this two-phase loop before deciding to build WIP's public roadmap feature. He generated 40 candidates with ChatGPT, then evaluated them against three criteria: fits solo workflow, visible to community, under 3 days to ship. He moved from 40 to 3 in under two hours.
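Once evaluation mode has returned low/medium/high ratings, turning them into an ordered shortlist is simple arithmetic. The sketch below assumes you have copied the ratings into a list of dictionaries by hand; the idea names, the `LEVELS` mapping, and the tie-breaking rule (highest delight first, then lowest effort) are illustrative assumptions, not a method described by any of the founders mentioned here.

```python
# Map the qualitative ratings onto numbers so they can be compared.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def prioritize(ratings):
    """Sort ideas by highest delight first, then by lowest effort."""
    return sorted(
        ratings,
        key=lambda r: (-LEVELS[r["delight"]], LEVELS[r["effort"]]),
    )


# Hypothetical ratings, as they might be transcribed from an evaluation reply.
ratings = [
    {"idea": "Custom invoice themes", "effort": "medium", "delight": "low"},
    {"idea": "Full accounting integration", "effort": "high", "delight": "high"},
    {"idea": "Unpaid-invoice badge on each project", "effort": "low", "delight": "high"},
    {"idea": "Weekly payment reminder email", "effort": "medium", "delight": "medium"},
]

ranked = prioritize(ratings)
for position, item in enumerate(ranked, start=1):
    print(position, item["idea"], item["effort"], item["delight"])
```

The ordering rule is the part worth arguing with: a founder with very little time might sort by effort first instead, which is a one-line change to the sort key.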
CORE PRINCIPLE
AI is not a replacement for user research — it is a force-multiplier for the thinking you do between user conversations. Use it to generate hypotheses before interviews and to synthesize patterns after them. Never use AI-generated assumptions as a substitute for actual user data.
In this lab you'll practice context-first prompting for product idea generation and evaluation. Describe a real or hypothetical product you're building — including your target user, their top pain point, and one hard constraint (time, money, or technical). Then ask for feature ideas. In your next message, ask the AI to evaluate those ideas against your constraint.
Complete at least 3 exchanges to finish this lab. The AI assistant is tuned specifically for solo-founder product thinking.
Before Superhuman became famous for its product-market fit score, founder Rahul Vohra described conducting 100+ user interviews and sitting with years of NPS data he couldn't synthesize fast enough. In a widely cited 2018 First Round Review essay, he documented a manual process that took months. That exact process (tagging interview responses, finding frequency patterns, identifying the "disappointed" cohort) can now be compressed to hours using AI.
Today, solo founders running their own qualitative research can paste 50 interview summaries into Claude and receive a structured thematic analysis in minutes. The insight quality depends entirely on the prompting — but the time compression is categorical.
AI is most powerful in user research when applied to three specific synthesis tasks: thematic coding, sentiment clustering, and pain point prioritization.
Thematic coding means identifying recurring themes across qualitative responses. You paste 20–100 interview excerpts, support tickets, or app store reviews and ask the AI to identify the top 5–8 themes, with representative quotes for each. The AI does not replace your judgment — it removes the mechanical labor of the first pass.
Sentiment clustering means separating responses by emotional tone and grouping them. "Here are 80 app reviews. Separate them into: delighted (would miss this), frustrated (has a specific complaint), and indifferent. For each frustrated review, extract the specific complaint." This produces a ranked complaint list in minutes.
Pain point prioritization means asking the AI to rank complaints by frequency and severity. Once it has clustered your data, you can ask: "Which of these pain points appears most often? Which seems most emotionally intense based on language used?" You then cross-reference with your own knowledge of which users are highest-value.
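Frequency is the one part of prioritization you can check without the AI. If you ask the model to return one theme label per numbered item, a few lines of code can recount the labels and confirm the ranking it reported. This is a minimal sketch under that assumption; the `coded` data and theme names are made up for illustration.

```python
from collections import Counter

# Hypothetical output of a thematic-coding pass: (item number, theme label).
coded = [
    (1, "unpaid invoices"),
    (2, "slow exports"),
    (3, "unpaid invoices"),
    (4, "confusing onboarding"),
    (5, "unpaid invoices"),
    (6, "slow exports"),
]


def rank_themes(coded_items):
    """Count how many items carry each theme, most frequent first."""
    counts = Counter(theme for _, theme in coded_items)
    return counts.most_common()


ranking = rank_themes(coded)
for theme, count in ranking:
    print(f"{theme}: {count}")
```

Severity and user value still need your judgment; the recount only guards against the model misreporting how often a theme appeared.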
The format in which you present raw data to the AI affects output quality significantly. Three formats work reliably:
Numbered List Format: Each interview excerpt or review is numbered. This allows the AI to cite specific items in its analysis. "Review 14 and Review 37 both mention the same friction point."
Tagged Format: Each entry is prefixed with metadata. "User: Freelance designer, Plan: Pro, Tenure: 8 months — [response text]." The AI can then filter by segment automatically. "What do Pro plan users who've been with you 6+ months complain about that new users don't?"
Comparative Format: You provide two sets of responses side by side — churned users vs. retained users, feature users vs. non-feature users — and ask the AI to identify what distinguishes the groups.
In 2023, founder Rob Walling (TinySeed, Drip) noted on his podcast that AI-assisted churn analysis using the comparative format helped one portfolio company identify a single onboarding step, adding a third team member, as the clearest predictor of retention. The company had held the data for two years. The AI surfaced the pattern in an afternoon.
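The numbered and tagged formats described above are easy to produce mechanically from whatever export your support tool gives you. The sketch below assumes feedback has already been loaded into simple records; the `Feedback` class, its fields, and the `|` separator between metadata and text are illustrative choices rather than a required layout.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """One piece of user feedback plus segment metadata (illustrative fields)."""
    role: str
    plan: str
    tenure_months: int
    text: str


def to_numbered(texts):
    """Numbered list format: lets the model cite items as 'Review N'."""
    return "\n".join(f"Review {i}: {t}" for i, t in enumerate(texts, start=1))


def to_tagged(items):
    """Tagged format: segment metadata first, then the response text."""
    return "\n".join(
        f"User: {f.role}, Plan: {f.plan}, Tenure: {f.tenure_months} months | {f.text}"
        for f in items
    )


# Hypothetical sample data.
feedback = [
    Feedback("Freelance designer", "Pro", 8, "I lose track of unpaid invoices."),
    Feedback("Agency owner", "Free", 1, "Exports are slow."),
]

numbered = to_numbered([f.text for f in feedback])
tagged = to_tagged(feedback)
print(numbered)
print(tagged)
```

For the comparative format, the same helpers can be run twice, once on churned users and once on retained users, with each output pasted under its own label.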
AI can and does hallucinate. In user research synthesis, this risk manifests in a specific way: the AI may generate plausible-sounding themes or quotes that are paraphrased, blended, or invented. This is especially likely when your input data is sparse — fewer than 15–20 data points — because the model tries to fill gaps.
The mitigation is citation discipline. Always ask the AI to cite the specific numbered item that supports each claim. "For each theme you identify, cite at least two specific reviews by number." Then manually verify those citations exist in your original data. Any theme the AI cannot back with citations should be treated as a hypothesis, not a finding.
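Part of this verification can be automated. The sketch below assumes the AI was asked to cite items in the form "Review N" and that you know how many reviews you pasted in; the function name, the sample themes, and the two-citation minimum are illustrative. It only checks that cited numbers exist, not that the cited review actually supports the theme, so the manual read-through is still needed.

```python
import re


def check_citations(themes, review_count, minimum=2):
    """Return {theme: [problems]}; an empty dict means every theme passed."""
    problems = {}
    for name, claim in themes.items():
        # Collect every distinct review number cited in the claim text.
        cited = {int(n) for n in re.findall(r"Review (\d+)", claim)}
        issues = []
        missing = sorted(n for n in cited if not 1 <= n <= review_count)
        if missing:
            issues.append(f"cites nonexistent reviews: {missing}")
        valid = len(cited) - len(missing)
        if valid < minimum:
            issues.append(f"only {valid} valid citation(s), need {minimum}")
        if issues:
            problems[name] = issues
    return problems


# Hypothetical AI output for a batch of 30 reviews.
themes = {
    "Payment tracking": "Users lose track of invoices (Review 3, Review 7).",
    "Slow exports": "Exports time out (Review 12, Review 45).",
    "Onboarding": "Setup feels long.",
}

problems = check_citations(themes, review_count=30)
for theme, issues in problems.items():
    print(theme, issues)
```

Any theme that appears in the result should be downgraded to a hypothesis until you find supporting items yourself.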
IMPORTANT LIMITATION
AI synthesis is only as good as the data you feed it. If your 30 reviews are all from power users who self-selected into your beta, the AI will faithfully surface power-user themes — and completely miss the silent majority of users who churned without leaving a review. AI cannot compensate for selection bias in your research data.
In this lab you'll practice all three research synthesis techniques: thematic coding, sentiment clustering, and pain point prioritization. Paste in 8–15 short user feedback snippets (real reviews, support emails, or ones you make up for practice) and work through the synthesis workflow with the AI.
The AI will guide you through formatting your data, asking it to code themes, cluster by sentiment, and finally rank the pain points by frequency and emotional intensity. Aim for at least 3 exchanges.
In October 2022, Pieter Levels launched Photo AI, an AI headshot generator, and publicly documented building the initial version in roughly four days. A significant portion of that time went not to writing code from scratch but to generating and iterating on specifications in ChatGPT and GitHub Copilot: what the upload flow should do, what error states to handle, what the payment integration needed to check. He described the process as "writing the spec by talking to the AI until it sounds right, then handing the spec to Copilot to implement."
That description contains a precise workflow. The spec is not an artifact you write once. It is a conversation you have iteratively, where each round of AI feedback reveals an edge case or ambiguity you hadn't considered.
A product specification doesn't have to be a 40-page PRD. For a solo founder, a useful spec is a document that answers three questions for every feature: What does it do? What are the edge cases? How do you know it's working?
AI is excellent at answering the second question. You describe the happy path (the normal case where everything works) and ask: "What could go wrong? What happens if the user has a slow connection? What happens if they upload a corrupt file? What happens if two users try to do this simultaneously?" Because the model has been exposed to a broad range of software systems, it will often surface edge cases you haven't thought of.
The third question — acceptance criteria — is where AI-assisted specs become directly useful for implementation. An acceptance criterion is a specific, testable statement: "When a user uploads a file larger than 10MB, they see an error message within 2 seconds and the upload is rejected without partially saving." Writing these with AI forces precision and catches vagueness before it reaches code.
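A well-written acceptance criterion translates almost directly into an automated test. The sketch below turns the 10 MB example above into one, under several assumptions: "10MB" is read as 10 MiB, storage is a plain in-memory dictionary standing in for a real backend, and `save_upload` and `UploadRejected` are hypothetical names. It illustrates the shape of such a test rather than a production upload handler.

```python
import time

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10MB" means 10 MiB


class UploadRejected(Exception):
    """Raised when an upload fails validation."""


def save_upload(data: bytes, storage: dict, name: str) -> None:
    """Validate size before writing, so a rejection never leaves partial data."""
    if len(data) > MAX_UPLOAD_BYTES:
        raise UploadRejected("File exceeds the 10 MB limit.")
    storage[name] = data


def test_oversized_upload_is_rejected():
    storage = {}
    start = time.monotonic()
    try:
        save_upload(b"x" * (MAX_UPLOAD_BYTES + 1), storage, "big.bin")
        rejected = False
    except UploadRejected:
        rejected = True
    elapsed = time.monotonic() - start

    assert rejected          # the user gets an error
    assert storage == {}     # nothing was partially saved
    assert elapsed < 2.0     # the error arrives within 2 seconds


test_oversized_upload_is_rejected()
```

Each clause of the criterion maps to one assertion, which is a quick way to spot criteria that are too vague to test.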
Effective AI-assisted spec writing follows a five-step loop. First, you describe the feature in plain language — one paragraph, no jargon. Second, you ask the AI to restate it as a structured spec with happy path, edge cases, and acceptance criteria. Third, you read the output and mark every assumption that isn't true for your specific product. Fourth, you correct those assumptions and ask the AI to revise. Fifth, you ask one final question: "What haven't I thought of?" This last step consistently surfaces the most valuable input.
In practice, this loop takes 20–40 minutes for a medium-complexity feature. The output is a spec that would have taken a product manager half a day to write, and it is often more thorough because the AI does not share the blind spots that come from being "too close" to the product.
Builder.ai, a platform that automated software specification for non-technical clients, published internal data in 2023 showing that AI-assisted spec writing reduced ambiguity-related rework by 34% compared to manually written specs. While Builder.ai's model differs from a solo founder's workflow, the mechanism is the same: forcing edge case articulation before implementation catches the most expensive errors early.
Once a spec exists, AI can generate prototype scaffolding. This is distinct from production code. A prototype scaffold is a working skeleton — correct structure, placeholder content, no business logic — that lets you test the flow before committing to implementation.
For web products, a prompt like "Generate an HTML prototype of this upload flow based on this spec — no backend, just the UI states: idle, uploading, success, error" produces a testable artifact in seconds. You share it with a potential user over a Loom recording or a screen share. You watch where they hesitate. You update the spec. You iterate.
This prototyping pattern was central to the workflow that Tibo Louis-Lucas (Tweet Hunter, Taplio) documented in 2023, in which he described using ChatGPT to generate HTML mockups of new features within existing products before deciding whether to build them. His stated goal was to kill bad ideas with a screen recording rather than with wasted development hours.
PROCESS NOTE
AI-generated specs have a characteristic failure mode: they describe technically correct systems that are wrong for your users. A spec that covers every edge case of a file upload is useless if the feature itself doesn't address a real pain point. Always validate the feature decision with user data before investing in spec quality.
Choose one specific feature you're considering for a