# SURF NOIR DEV LOG 1

> Building Nearstalgia Bay With (and Against) AI

**Published by:** [Surf Noir](https://paragraph.com/@surfnoir/)
**Published on:** 2026-01-21
**Categories:** ai, animation, nbx, worldbuilding, storytelling, anime, 3d, midjourney, kling
**URL:** https://paragraph.com/@surfnoir/dev-log-1

## Content

### TL;DR

I spent ~60–80 hours creating an 11-minute animated episode set in the Surf Noir universe using a hybrid AI workflow (Midjourney, Freepik/Seedream, Kling, ElevenLabs, DaVinci Resolve). This was not "prompt and publish." It was directing, editing, rewriting, rejecting, and problem-solving at scale. Over 5,500 Midjourney images, 2,600 Freepik images, and 400k+ generation credits were used to arrive at a finished episode.

Because of beta access and partnerships, the out-of-pocket cost was low — but at current market rates, this episode would realistically cost $1k–$5k to produce with AI. A traditionally animated equivalent could cost anywhere from $40k–$150k+.

AI didn't replace storytelling. It compressed production distance — allowing one person to iterate like a small studio — while making taste, clarity, and intention the real bottlenecks. This dev log documents how a story-first approach, not the tools themselves, determines whether AI outputs feel like "slop" or cinema.

### INTRODUCTION: FOR THE WORLD-BUILDERS

This document is for anyone who's ever had a vision that seemed too big, too outlandish, or too impossible to do alone. If you grew up on Final Fantasy, Metal Gear Solid, Pokémon, Dragon Ball Z, One Piece, Harry Potter, Lord of the Rings—anything where you knew the creator approached building a world as their life's work, as if they had to do it or else, as if they had to literally imagine a place because the real world didn't gel with their soul—this is for you. This is a guide to show you that it's possible.
And before you roll your eyes at another "AI will change everything" manifesto, let me be clear: this process actually inspired me to re-incorporate manual art-making into my workflow. AI didn't replace craft; rather, it revealed how much craft is still required. AI is expensive. AI is time-consuming. AI requires skill, care, and creative vision to use well. And I see the loudest voices in AI influencing as the primary reason AI gets called "slop." They either lack the creativity, care, or money to push the limits of the technology and make something as good as they can. They're more concerned with showing possibilities, rage bait, and collecting engagement checks than actually making a thing.

This Dev Log is for:

- Indie creators who want to use AI as a tool, not a crutch
- My supporters who want to know what actually goes into this work
- Potential collaborators and clients who need to understand the process
- The anti-AI people who make blanket statements like "if you use AI you're not an artist" (though unfortunately, you probably won't read this)

I spent 60-80+ hours creating an 11.5-minute animated episode set in Nearstalgia Bay, a fictional coastal town in the Surf Noir Archipelago. This document breaks down everything: my creative choices, technical challenges, workflow solutions, and honest cost analysis.

My philosophy, "Make it bad, make it imperfect," was derived from a Gawx video I watched during his process that helped me break through my perfectionism. It's poignant because people think AI is all about perfection, when it has so many imperfections. Meanwhile, there's a much-needed, growing movement of people leaving the imperfections of humanity in their work to prove it's not AI. While the imperfections of AI and the imperfections of humanity manifest differently, the notion was immensely important for someone like myself who is prone to perfectionism.

Let's begin.
## PART I: CREATIVE FOUNDATIONS

### Thematic Motivations

I first came up with Surf Noir and Future Surf in 2020. It was the pandemic and we couldn't leave the house, so I was left to literally imagine a place. This took form in two ways: exploring real places from afar through books and documentaries, and exploring places that didn't exist at all through books, games, animation, my dreams, etc. In my own imagined world, I could talk about the issues that I sometimes find trouble addressing in the real world. Magical realism makes more sense to me than actual realism. And thus, Surf Noir is a vehicle for me to have fun tackling issues of classism, gentrification, corporate greed, necessary capitalism, nostalgia as a crutch, and so much more.

### The World of Surf Noir

Surf Noir is a transmedia world-building project inspired by my hometown area in Virginia—Hampton Roads, also known as The Seven Cities. It's a coastal region highly susceptible to sea level rise, connected by bridges and underwater tunnels. But in Surf Noir's world, set in 20XX, it transformed. The seven cities became seven independent island nations collectively known as The Seven Isles of Surf Noir.

**The Central Conflict:** A company called Wave-Tech (founded by the Wavecrest family) discovered a prehistoric stone with water-manipulating properties during deep-sea oil drilling. A lowly oil rig worker cleaned one off as a gift for his daughter. She wore it to prom on a full moon—the next day, freshly cleansed and sun-charged, it burned her chest. A family friend and scientist investigated. He happened to be from the Wavecrest family. They led the gold rush to mine the stone.

The stone became known as:

- **Kaimana** - what indigenous coastal tribes called it (Hawaiian for "diamond/power of the sea")
- **ORS (Oceanic Resonance Substrate)** - what Wave-Tech calls it

As global warming intensified and sea levels rose, Wave-Tech emerged as global saviors by using ORS to slow sea level rise and redirect hurricanes.
They became a corporate powerhouse that makes nearly everything, blurring the lines between corporation and government.

But there's a cost: coastal tribes who'd maintained Kaimana for generations, using it for medicine, water purification, and spiritual practices, became targets. Governments and corporations oppressed them, accusing them of keeping the stone secret, labeling them threats because the stones could be weaponized. Wave-Tech refined Kaimana into ORS—a single-use, unrechargeable version that maintains their monopoly. The authentic stones naturally recharge via sun and moon cycles. The tribes knew this. Wave-Tech destroyed that knowledge.

### Nearstalgia Bay (NBX)

Episode 1 takes place in Nearstalgia Bay, a small resort and fishing village inspired by the Outer Banks (OBX). It's not one of the Seven Isles—it's somewhere in between, which gives it narrative breathing room.

**The aesthetic:** A town where it's always sunset. I never explain this. It just is. This eternal golden hour creates:

- Visual nostalgia and liminal time
- Tourism appeal (Wave-Tech markets this)
- A frozen aesthetic preserving "the way things were"
- Melancholy beauty

NBX got by on simple life and modest family tourism. Then Wave-Tech's waterfront district arrived—introducing a South Beach feel to the quiet fishing village. The clash between old NBX (retro technologies, historic surf shacks) and new NBX (sleek ORS-powered infrastructure) drives the conflict.

**The Story:** Our protagonist, Aza, works at Excursion Club—a struggling travel agency in Mid Town. When she returns from a trip, she discovers Wave-Tech is planning a "revitalization" (gentrification) of Mid Town.

### Character Design Philosophy: Learning from Araki

While designing characters, I was reading *Manga in Theory and Practice* by Hirohiko Araki (creator of JoJo's Bizarre Adventure).
His "Golden Ratio" of character design became my foundation:

- **Appearance** - Visual design instantly communicates personality
- **Speech** - Manner of speaking reveals intelligence and temperament
- **Thoughts** - Inner world provides emotional depth
- **Actions** - Behavior defines moral code

Key principles I applied:

- **Characters drive story, not the other way around** - Never change a character to suit a plot
- **Distinctiveness above all** - Use contradictions to make characters memorable
- **The character's logic** - Every decision must make sense according to their personal rulebook
- **Desire and motivation** - Each character needs a clear driving force
- **Episodes matter more than overarching plot** - People remember moments, not myths

My character checklist before generating:

- Name, age, personality traits
- Speech quirks and mannerisms
- Core belief or motto
- Symbolic color or motif
- Signature gestures
- Favorite/least favorite things

This pre-work made generations vastly more successful because I knew who I was prompting, not just what they looked like.

### The Aesthetic Question That Held Me Back for Years

For years, I tried forcing AI into the "perfect style." I wanted something so unique, so mine, that it would be instantly recognizable. This perfectionism paralyzed me. Then I made a concession that changed everything: I adopted the One Piece approach. Eiichiro Oda's art style evolved dramatically from East Blue to Wano, but the world and story remained consistent.
Fans grew with the aesthetic improvements rather than expecting polish from day one.

My solution:

- **Primary storytelling:** 3D video game cutscene aesthetic (think PS2-era Final Fantasy, but modern)
- **In-world memories and flashbacks:** 2D sakuga-style animation
- **Print graphics and manga:** 2D illustration
- **Let the style improve over time:** Season 1 is the foundation, not the ceiling

This hybrid approach:

- Makes 3D more forgiving for iteration
- Lets me focus on storytelling NOW
- Creates visual hierarchy (present = 3D, past = 2D)
- Primes the audience for an eventual video game
- Gives permission to improve without invalidating earlier work

Most importantly: it got me out of research hell and into production.

## PART II: THE GREAT PIVOT - Why I Abandoned 2D for 3D

### The 2D Research Phase (1+ month: give or take 3 years)

The majority of my AI work since 2021 has been 2D anime-styled. For Surf Noir, I took my previous styles and began refining them, pushing toward something more unique than the standard "classic anime" look everyone else uses. My target: the Studio Trigger animation aesthetic (FLCL, Kill la Kill, Gurren Lagann).

What I loved about Trigger's style:

- Characters don't try to "look real"
- Movements are free, unrealistic, fluid
- Facial features are expressive and elastic
- There's a "controlled chaos" energy

I thought achieving this with AI would be more impressive than going realistic. I generated thousands of test images in Midjourney v7, training custom style references and exploring:

- Expressive sakuga cel-shading
- Loose, energetic linework
- Flat cel color blocks
- Elastic anatomy
- Dynamic hand-drawn energy

I made some beautiful images. I'll still publish them. But I couldn't make them work for animation.

### Why 2D Didn't Work (Three Critical Problems)

**Problem 1: Midjourney is great for ideation, terrible for consistency.** Midjourney v7 creates stunning, unique imagery. But when you need:

- The same character from multiple angles
- Consistent proportions across scenes
- Controlled variations in expression

...it falls apart.
Every generation is a new interpretation. Even with style references (`--sref`), character profiles, and careful prompting, I couldn't get the structural consistency needed for animation.

**Problem 2: Quality degradation in iterative editing.** I tried porting Midjourney images into Google Nano Banana Pro, Seedream 4.5, and Flux 2 for more controlled editing. The workflow was:

1. Generate in MJ (creative, unique)
2. Import to Nano/Seedream/Flux (structural control)
3. Make adjustments (expressions, angles, etc.)

The problem: with each edit, quality degraded. By the third or fourth iteration, the image looked muddy and over-processed. This was especially bad in 2D styles where line weight and color flatness matter. In 3D-leaning styles, this degradation was far less noticeable.

**Problem 3: Style transfer failure in Freepik models.** My Midjourney 2D styles were too unique and stylized. When I fed them into Freepik's Seedream or Nano Banana, the models couldn't retain:

- The specific color palette
- The line art quality
- The cel-shading flatness

They would default to their own interpretation of anime style—usually something more generic and "Ghibli-adjacent," losing the bold, sketchy linework and the loose energy I wanted.

### The 3D Solution

I already had a 3D style I'd developed in Midjourney—a hybrid of:

- 3D character models (video game cutscene quality)
- 2D cel-shading techniques
- Film photography lighting and color grading

Why 3D worked better:

- **Video generation favors realism.** The closer you get to photorealistic proportions and lighting, the easier video models handle it. AI video struggles with extreme 2D stylization.
- **Consistency across models.** 3D character designs translated cleanly from Midjourney → Seedream → video generation. The "language" was more universal.
- **Less quality loss in iteration.** Because 3D allows for more photographic lighting and texture detail, small degradations weren't as visible.
- **The retro game aesthetic fits the world.** NBX has that nostalgic, "you had to be there" vibe.
A PS2/early-PS3-era cutscene look actually enhances the Surf Noir nostalgia.

The aesthetic I landed on:

- Realistic materials and weathering (scuffed mechs, worn paint, dirt)
- Anime proportions and character design
- Grounded sci-fi (utility tech, lived-in world)
- Noir grit through color grading and lighting

Think: Patlabor meets Eureka Seven meets Final Fantasy X cutscenes.

### What I Learned About 2D (And Why I'll Return)

Now that Episode 1 is done, I'm more motivated to solve 2D animation, not less. Here's why: AI made me want to go back to Blender. Which is poetically ironic, because I was learning Blender in 2023 when my laptop broke and all I had was my iPad. That's what got me into AI via Midjourney in the first place.

Throughout this process, I constantly thought: "If I could just manually adjust this one thing..." or "If I had more control over this movement..." The limitations of AI video generation, especially lip sync and subtle character motion, made me realize that a hybrid approach is not the future, it's the present:

- Use AI for ideation, layout, and backgrounds
- Use manual 3D modeling for character rigging and precise control
- Combine them for final output

I still want to make a 2D anime OVA. The 2D research wasn't wasted; it was educational. I now know:

- Which models handle stylization best
- How to maintain style consistency across platforms
- Where AI breaks down and manual intervention is needed

The goal is a full-length 2D animated film. But I needed to finish something first to learn the pipeline. Episode 1 was that proving ground.
## PART III: THE TECHNICAL GAUNTLET

(Halfway through organizing this dev log, I realized I wanted to add more videos and images, so I'll be following up with a video essay version of this as well so I can actually show you.)

### The Image Generation Pipeline

My workflow evolved into this five-stage process:

**Stage 1: Ideation in Midjourney v7**

- Generate creative, stylized concept images
- Use for characters, environments, props
- Leverage MJ's unique aesthetic and prompt interpretation
- Don't worry about consistency yet, just explore

**Stage 2: Character/Asset Refinement**

- Select the best MJ generations
- Create character sheets with multiple angles:
  - Head shot with turnaround
  - Full body shot (high res)
  - Expression sheet
  - Pose sheet
  - 4-5 candid variations

**Stage 3: Consistency Lock in Freepik**

- Import MJ images as reference
- Use Seedream 4.5 (95% of generations—free during beta)
- Use Nano Banana Pro (5% of generations—500 credits per 4K image)
- Seedream: best quality and creativity
- Nano: best prompt adherence and control

**Stage 4: Scene Composition**

- Generate backgrounds separately from characters
- Use the Variations feature to maintain consistency and get alternate angles
- Inpaint characters into scenes when needed
- Export at the highest resolution possible

**Stage 5: Video Generation**

- Import to Kling (O-1 or 2.6 model)
- Generate 5-10 second clips
- Use start/end frame control when needed for transition effects
- Pray it works the first time
- Revise the prompt and run it again

**Total image generations:**

- Midjourney: 5,532 images (ideation + style research)
- Freepik: 2,663 images (final production)
  - 95% Seedream 4.5 (free during beta)
  - 5% Nano Banana Pro (500 credits per 4K gen)
- Freepik (character creation): 284 images (converting MJ to usable assets)

Note: I accidentally deleted all my work once during file management and had to regenerate 300-500 images.
These aren't counted above.

*(Images: Midjourney version vs. final scene composition)*

### The Midjourney Mastery Curve

Early in production, I felt my prompt style had become stale and wanted to improve my understanding of how to speak Midjourney's language for the scenes I wanted. I used Clarinet's Prompt Helper GPT (an official MJ community tool) to understand parameter optimization. Key learnings from that process:

**Understanding parameter intentions:**

- `--sw` (Style Weight): Controls how strongly the model follows your style references (`--sref`). High values (1000) = aggressive style adherence. Low values (100-300) = softer blending.
- `--ow` (Omni Weight): Controls how tightly the model binds to structural identity from omni-references (`--oref`). High values lock anatomy/proportions. Low values allow reinterpretation.
- `--exp` (Expression Strength): Increases visual extremity and internal contrast. Values 60-100 = energetic but stable. Higher = more painterly chaos.
- `--stylize`: Global aesthetic bias. Low (100-250) = literal/realistic. Medium (500-750) = cinematic. High (1000+) = painterly/experimental.

**My final character generation formula:**

> Character head shot in expressive sakuga cel-shaded animation style inspired by Trigger Studio aesthetics. [detailed character description]. Clean cel background, soft anime color, loose energetic linework, flat cel color blocks, elastic anatomy, consistent proportions, dynamic hand-drawn energy. --ar 58:77 --raw --sref [style reference IDs] --profile [personal profile] --stylize 50-750

**For 3D conversion:**

> 3D sakuga animation portrait of [character description] --sref [3D reference] --sw 1000 --s 750 --v 7.0 --p [profile]

**For character sheets:**

> A 2D character expression and pose sheet in sakuga cel-shaded anime style, featuring [character]. The sheet includes: full-body front pose, full-body back view, head-and-shoulders neutral, smiling, surprised, angry, sad, joyful, annoyed, shy, serious, sleepy.
> All expressions front-facing in a clean grid layout on a white background. --ar 16:9 --s 750 --v 7.0

**Critical discovery:** When using an existing character reference (`--oref`) that's not in your target style, lower `--ow` to 50-80 and `--sw` to 300-400. This frees the model to reinterpret the character in the new style rather than just "repainting" the original.

### Prompting Strategies That Worked

**The "Enhance Prompt" vs. "Auto Prompt" test:** Freepik offers two AI-assisted prompting features:

- **Auto Prompt:** AI analyzes your image and writes a description
- **Enhance Prompt:** AI takes your prompt and improves it

I tested both extensively. Results:

- **Auto Prompt:** 50% success rate. Works better when you provide at least a short description so it understands your intent.
- **Enhance Prompt:** 70% success rate. Sometimes added unwanted elements or misinterpreted style, but mostly helpful for learning the model's language.

**Best practice:** Write your own prompt first, use Enhance to see how the AI interprets it, then manually adjust based on what worked.

**Speaking to Seedream:** Seedream wants:

- Clear subject identification
- Specific style keywords ("cel-shaded," "sakuga," "loose linework")
- Lighting direction described simply
- Camera angle stated upfront
- Action verbs for motion, not abstract concepts

Example of a failed prompt: "Make this character look more dynamic and interesting"

Improved version: "Close-up shot of [reference] focusing on determined facial expression, Dutch angle, dramatic rim lighting from left"

**The "Make It Bad" principle:** Inspired by a Final Fantasy storytelling deep dive I was listening to during production, I realized: don't get trapped in your workflow. Press all the buttons. This led me to discover Freepik's experimental "Variations" feature—which, when it worked, saved me hours or even days of trying to maintain consistency across camera angles.
**How Variations works:**

1. Take an existing generation
2. Generate a batch of variations at different camera angles
3. The feature maintains core composition, style, and character identity
4. It's faster than regenerating from scratch
5. Focus on getting a different angle, not so much on consistency
6. Use this new angle shot as a reference along with your character designs and re-prompt to lock in the new angle

This became essential for:

- Getting a fucking side profile of a scene… jfc
- Adjusting expressions without full regeneration
- Fine-tuning compositions

### Consistency Techniques

**Using Seedream's reference system:** Including the character reference image helps significantly. But the real trick, when trying to change something in a scene without changing the entire scene:

❌ Don't say: "Make this, change that, place this"

✅ Instead say: "Close-up shot of [reference] focusing on ___"

This makes the model use the existing shot with stronger influence, rather than regenerating from scratch.

**Creating a character profile in Freepik:** Freepik lets you create a "character profile" that bundles multiple reference images:

- Front view
- Back view
- Side view
- Close-up headshot

When you prompt `@character_name`, it uses all references simultaneously. This improved consistency by ~40% compared to a single image reference.

### Video Generation: Where Things Got Hard

**Platform testing:** I tried:

- Kling 2.6 / O-1 (primary tool - 90% of final footage)
- Hailuo Minimax
- Sora
- Veo
- Wan 2.5 / 2.6

Total credits spent on non-Kling testing: ~15,000

Why I stuck with Kling:

- Best quality-to-cost ratio
- Most reliable motion
- Start/end frame control actually works
- Handles 3D-style characters better than competitors

**Kling workflow:**

1. Generate a still frame in Freepik (perfect composition)
2. Import to Kling as the starting frame
3. Write a motion prompt describing the action
4. Generate a 5-10 second clip
5. Pray it doesn't hallucinate chaos

**Motion prompting strategy:** Less is more. Kling wants to do too much. It loves dolly zooms and camera moves.
❌ Overly complex: "The camera slowly dollies forward while the character turns their head, maintaining focus on the eyes as light shifts across the scene"

✅ Simple and controlled: "Static shot. Character slowly turns head left. No camera movement."

**Start/end frame control:** When possible, I used both start and end frames to constrain motion. This worked ~60% of the time. The other 40%, the model would:

- Hallucinate new elements
- Change the character's appearance
- Introduce unwanted camera movement
- Create temporal artifacts

**The door scene problem:** One of my early video generations had slight inconsistencies in door size between frames. When I tried using start/end frame control, it was a mess—the model couldn't reconcile the difference.

**Solution: generate the motion in reverse.** Instead of a start/end frame pair for "Character opens door and peeks through," I did one image with the door already open:

> "Static shot: a stylized young woman with nervous eyes peeks through a slightly open, weathered blue wooden door into a cozy, cluttered library. Her eyes scan the space from left to right, then she hesitates, and gently pulls the door shut until it clicks."

Then I reversed the clip in DaVinci Resolve. Worked perfectly.

**The front desk sign shot:** Sometimes Auto Prompt actually nails it. This was one of those times:

> "Slow cinematic dolly shot gliding along the curved wooden reception counter of the warm, vintage-style 'Excursion Club' lobby, the glowing golden sign letters casting soft light onto polished wood as the camera moves from close-up of the illuminated text to a wider view revealing travel posters, world maps, and brochures in the softly blurred background, shallow depth of field, cozy ambient lighting, elegant travel-club atmosphere."

I didn't write that. The AI did. I kept it.
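To make the parameter logic from earlier in this part concrete, here's a minimal sketch of how I think about assembling these prompts. The `build_mj_prompt` helper and its defaults are my own illustrative invention, not an official Midjourney tool; it just encodes the "restyling an off-style `--oref` means lowering `--ow` and `--sw`" rule as code:

```python
# Hypothetical helper for assembling Midjourney-style prompt strings using
# the parameter guidance above (--sref/--sw, --oref/--ow, --stylize).
def build_mj_prompt(description, sref=None, sw=1000, oref=None, ow=300,
                    stylize=750, ar="16:9", restyling=False):
    """Assemble a prompt string. When restyling a character reference
    that isn't in the target style, drop --ow to ~50-80 and --sw to
    ~300-400 so the model reinterprets instead of 'repainting'."""
    if restyling and oref:
        ow, sw = 80, 400
    parts = [description, f"--ar {ar}", "--v 7.0", f"--stylize {stylize}"]
    if sref:
        parts += [f"--sref {sref}", f"--sw {sw}"]
    if oref:
        parts += [f"--oref {oref}", f"--ow {ow}"]
    return " ".join(parts)

# Converting an existing 2D character reference into the 3D house style:
prompt = build_mj_prompt(
    "3D sakuga animation portrait of [character description]",
    sref="[3D style reference]", oref="[character reference]",
    restyling=True)
print(prompt)
```

The point isn't automation (Midjourney is prompted through its own interface); it's that writing the rules down as defaults stops you from re-deriving them every session.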
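The reversed-clip trick above was done inside DaVinci Resolve, but the same idea can be scripted. As a sketch, assuming ffmpeg is installed and the clip is short (ffmpeg's `reverse`/`areverse` filters buffer the whole stream in memory, which is fine for 5-10 second generations), with hypothetical file names:

```python
# Build an ffmpeg command that reverses a short clip's video and audio,
# mirroring the "generate the close, play it backwards as the open" trick.
def build_reverse_cmd(src, dst):
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", "reverse",    # reverse the video frames
        "-af", "areverse",   # reverse the audio to match
        dst,
    ]

cmd = build_reverse_cmd("door_close.mp4", "door_open_reversed.mp4")
# import subprocess; subprocess.run(cmd, check=True)  # uncomment to render
print(" ".join(cmd))
```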
## PART IV: THE LIP SYNC NIGHTMARE

This was, without question, the most frustrating part of production.

### The Problem

AI video models that do lip sync want to make your characters look like Pixar characters (AI made me hate the Pixar style). Even when your base image is semi-realistic or anime-styled, the moment you add dialogue, faces become:

- Overly cartoonish
- Clay-like in texture
- Exaggerated, Disney-style in expression
- Detached from the original art style

### Platforms Tested

**Omnihuman (Freepik):**

- My primary lip sync tool
- Best quality when it worked
- Major issue: defaulted to Pixar-style facial animation
- Required extensive prompt engineering to minimize

**Higgsfield (Wan 2.5 / 2.6):**

- Wan 2.5: degraded visual quality significantly
- Wan 2.6: better, but characters looked like clay
- Abandoned after tests

**OpenArt (Kling Avatar 2.0):**

- Tied with Omnihuman for quality
- More artifacts than Omnihuman
- Sometimes better at maintaining style
- Used 10,000 credits testing (account worth $30/month, free via creative partnership with OpenArt)

**OpenArt (standard lip sync):**

- Too many artifacts
- Abandoned quickly

**Hedra:**

- Terrible. Just terrible.
- One test, never returned

### My Workarounds

Since I couldn't get clean lip sync consistently, I designed around it:

**1. Internal monologue.** Many lines play as voiceover thoughts rather than spoken dialogue. This allowed me to:

- Use more expressive voice acting (I recorded lines myself and did voice changes)
- Avoid mouth animation entirely
- Create a more introspective tone

**2. Off-screen dialogue.** Characters speak while the camera shows:

- Another character's reaction
- The environment
- An object of focus

(I learned this from an episode of Invincible where the creator makes a cameo in the show.)

**3. Wide shots without visible mouths.** When characters had to speak on-screen, I used:

- Extreme wide shots where mouth detail isn't visible
- Over-the-shoulder angles
- Characters facing away or in profile

**4. Strategic close-ups only.** For critical emotional moments, I accepted the lip sync compromise and used it.
But I limited this to a handful of shots in the entire episode.

### Controlling Omnihuman's Cartoonish Tendency

**Failed approach:** "Generate lip sync for this dialogue"

**Successful approach:** Hyper-specific prompting with constraints:

> [camera angle] of the [describe character] as they continue to look [direction they are facing]. Their head and eyes do not move from the position in the @Start image. No hand motion. The character is completely still and composed. Their facial expression [emotion that matches the audio you're using].

Key techniques:

- State the camera angle explicitly
- Reinforce "no extra movement"
- Direct where the eyes and face should point
- Use phrases like "completely still," "composed," "maintains position"
- Reference the exact starting frame

**Important discovery:** Close-ups work MUCH better than full-body shots. Full-body lip sync introduces:

- Arbitrary hand gestures
- Body swaying
- Unwanted head movement

Keep it tight on the face when you must use it.

**Image quality matters:** If your base image is already on the fence between Pixar-style and realism, lip sync will push it fully into Pixar. Use more realistic base images for lip sync shots.

### The Multi-Character Dialogue Experiment

I tried using ElevenLabs' multi-voice prompt feature to generate a full conversation between Aza and Kate at the train station, thinking I could:

1. Generate the conversation as one audio file
2. Feed it into Kling Avatar 2.0 for multi-character lip sync
3. Get the whole scene in one generation

**The test:** 700 credits in Kling Avatar 2.0

**The prompt:**

> The two characters are looking directly at each other while speaking. There is no hand movement. Their motions are calm and relaxed. No smiling. Character on the left says: "So where are you headed?" Character on the right says: "I gotta meet Naomi, we're staying in Waterside" Character on the left says: "Ooo fancy" ...etc.

**Result: FAILED.**
The model couldn't handle:

- Two characters speaking alternately
- Maintaining consistent expressions
- Preventing unwanted gestures
- Keeping them looking at each other

**Solution:** Split the dialogue into separate audio tracks with silence between lines, generate separate close-up shots for each character, and edit them together in post. More work, slightly functional… but ultimately less natural-feeling than my earlier off-screen dialogue workaround. I scrapped these shots.

## PART V: SOUND DESIGN & VOICE

### ElevenLabs Voice Acting

ElevenLabs became my primary voice tool. I used two accounts (don't ask me how) with a combined 15,000 credits spent.

**Voice design process:** Each character got a specific voice profile:

- **Aza:** Talia - calm, slightly raspy, never too excited
- **Jade:** Tiffany - natural and welcoming, "bubbly Black girl - Delta energy"
- **Javonte:** Ministar - too cool, poetic, whispery, British-African inspired
- **Cass:** [voice undecided] - mousy, soft-spoken
- **Kate:** Ivy - sophisticated and sassy, kawaii energy
- **Surf Shack Man:** Scott - calm and welcoming
- **Surf Shack Woman:** Ms. Harris - caring Southern mom
- **Student 1:** Revenant - youthful enthusiasm
- **Student 2:** Grechen - snooty brat
- **Teacher:** Ivanna - authority-figure warmth, the cool teacher

**The Boy's voice (custom creation):** There are NO good kid voices on ElevenLabs. All the "childlike" voices were women doing super cartoonish performances. So I made one from scratch using ElevenLabs' voice cloning. Result: I was STUNNED by the quality, expressiveness, and accuracy of vernacular. The voice felt authentic, natural, and emotionally present, especially considering my prompt was super basic. Highly recommend the custom voice feature for unique character needs.

**Multi-voice dialogue:** As mentioned earlier, I tested multi-character conversations generated in one file. The audio quality was excellent—natural pauses, realistic overlaps, proper emotional tone. Where it failed was in video generation (lip sync couldn't handle it).
But for pure audio storytelling or podcast-style content, this feature is incredible.

### Sound Effects

**ElevenLabs SFX:**

- Great for ambient sound loops
- I used a mix of ElevenLabs and Pixabay

**Pixabay:**

- Primary source for most SFX (free, high-quality)
- Footsteps, door creaks, train sounds, etc.

**Important note:** You don't have to use AI for everything. Pixabay's library is massive and free. Why generate a door creak with AI when a perfect one already exists? "Who gon be the humans" — Jamee Cornelia

**Ambient design:** NBX's eternal sunset needed a consistent soundscape:

- Distant ocean waves (looped)
- Seagull calls (sparse)
- Light wind (constant low presence)
- Urban hum in Waterside District scenes

I layered 3-4 ambient tracks per scene to create depth without overwhelming dialogue.

## PART VI: FILE MANAGEMENT & WORKFLOW CHAOS

### The Disaster I Created

I create messy. When I try to be too organized upfront, it fucks up my flow. But the downside: I accidentally deleted all of my work once. Trashed the wrong folder. Gone. 300-500 images, hours of video tests. I had to regenerate large portions from scratch. But by then my process was much more refined, so I was able to catch up to my previous stopping point in one day.

The second problem: DaVinci Resolve can't locate clips if you move files after importing.
I reorganized my folders mid-production and luckily caught this before relinking media became overwhelming. It's just something to take note of.

### What I Learned (The Hard Way)

**During production:**

- Let the chaos happen
- One big "Active Project" folder with everything dumped in
- Use DaVinci's internal organization (bins, tags, colors)
- Don't move files once they're imported

**After production:**

- THEN organize into a proper folder structure
- Create a master archive with clear naming
- Export a project file with relinked media
- Back up to an external drive and the cloud

**My eventual structure:**

```
Surf_Noir_EP01/
├── 01_Scripts/
├── 02_Storyboards/
├── 03_Assets/
│   ├── Characters/
│   ├── Backgrounds/
│   └── Props/
├── 04_Audio/
│   ├── Dialogue/
│   ├── SFX/
│   └── Music/
├── 05_Video_Generations/
│   ├── Scene_01/
│   ├── Scene_02/
│   └── ...
├── 06_Final_Edit/
└── 07_Exports/
```

**Why this matters:** When sharing your process later (like in this Dev Log), clear organization makes pulling examples and references infinitely easier. I learned this from working at ad agencies like VaynerMedia.
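The folder structure above is mechanical enough to scaffold in a few lines. This is a throwaway sketch of my own (the `scaffold` helper is hypothetical, not part of Resolve or any tool); run it once in the archive location:

```python
# Create the post-production archive tree described above.
from pathlib import Path

FOLDERS = [
    "01_Scripts", "02_Storyboards",
    "03_Assets/Characters", "03_Assets/Backgrounds", "03_Assets/Props",
    "04_Audio/Dialogue", "04_Audio/SFX", "04_Audio/Music",
    "05_Video_Generations/Scene_01", "05_Video_Generations/Scene_02",
    "06_Final_Edit", "07_Exports",
]

def scaffold(root="Surf_Noir_EP01"):
    """Create the folder tree; safe to re-run thanks to exist_ok."""
    base = Path(root)
    for folder in FOLDERS:
        (base / folder).mkdir(parents=True, exist_ok=True)
    return base
```

Add `Scene_XX` folders per episode as needed; because creation is idempotent, re-running after adding entries to `FOLDERS` just fills in the gaps.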
## PART VII: THE NUMBERS - COMPLETE COST BREAKDOWN

### Time Investment

**Total: 60-80+ hours.** This doesn't include:

- Months of pre-production conceptualizing Surf Noir
- The 1+ month 2D research phase (mostly abandoned)
- Writing scripts and dialogue
- Character design pre-work

**Breakdown estimate:**

- Character design & asset creation: 15-20 hours
- Scene composition & image generation: 20-25 hours
- Video generation & iteration: 15-20 hours
- Voice recording & sound design: 5-8 hours
- Editing in DaVinci Resolve: 10-15 hours
- Rendering & troubleshooting: 5-7 hours

**Labor rate benchmark:** $45-60/hour (industry standard for AI creative direction, based on my experience at tech startups, Hollywood ad agencies, and recent recruitment offers)

**Monetary value of labor:** $2,700 - $4,800

### Platform Costs

**Midjourney Pro (annual):**

- Normal: $60/month
- My rate: $48/month (annual discount)
- Months used: 1 month of heavy production
- Cost: $48

**Freepik credits:**

- Initial research (before the deletion): ~50,000 credits
- Actual production: 372,300 credits

Important context:

- Seedream 4.5 (≈95% of images) was free during beta
- Most credits were spent on video generation, not images
- Nano Banana Pro was used sparingly due to cost (500 credits per 4K generation)

Because Freepik's pricing model is credit-based and tiered, the exact dollar equivalency fluctuates. However, at current pricing tiers, this volume of usage would conservatively translate to hundreds of dollars to low four figures if Seedream were not free. This is a critical point: AI is only "cheap" if you're not doing much with it.

**Kling video generation:**

- Primary model: Kling 2.6 / O-1
- Nearly all final video footage generated here
- Other platforms tested (Sora, Veo, Minimax, Wan): ~15,000 credits total
- Kling proved the best quality-to-cost ratio

Exact Kling costs vary by plan and usage window, but this episode represents heavy, sustained generation, not casual testing.
**OpenArt (Lip Sync Testing):**
- Account value: ~$30/month
- Credits included: 12,000
- Credits used on lip sync: ~10,000
- Access provided via creative partnership

**ElevenLabs (Voice & SFX):**
- Free tier: 10,000 credits per account
- Two accounts used (don't ask)
- ~15,000 credits total
- Covered: full dialogue, internal monologue, select ambient sound effects

**Software Licenses:**
- DaVinci Resolve Studio: $299 perpetual license

### What This Would Cost Without "Lucky Breaks"

This episode benefited from:
- Seedream being temporarily free
- A creative partnership on OpenArt
- Already having DaVinci Resolve
- Prior sunk costs in tools I already owned

If all tools were paid at market rates today, producing this episode would realistically cost:

- Low estimate: $1,000–$1,500
- High estimate: $3,000–$5,000+

And that's before valuing labor.

### Comparison to Traditional Animation

A traditionally animated 8-minute episode at even modest indie rates would require:
- Storyboarding
- Character design
- Background art
- Layout
- Animation
- Cleanup
- Color
- Compositing
- Editing
- Sound design

Even at an extremely conservative $5,000 per finished minute, you're looking at $40,000+ minimum. More realistically: $80,000–$150,000.

AI didn't just eliminate cost. It collapsed the distance between a solo creator and studio-scale output.

## PART VIII: WHAT AI ACTUALLY CHANGED (AND WHAT IT DIDN'T)

### What AI Did NOT Do

AI did not:
- Write the story
- Design the world
- Decide the tone
- Choose the shots
- Maintain continuity
- Solve narrative problems
- Create emotional intent

Every time something worked, it was because:
- I knew what I wanted
- I could recognize when it was wrong
- I had the taste to say "no"

That's the part people ignore when they call this "slop."

### What AI DID Change

AI:
- Lowered the logistical barrier to entry
- Made iteration possible at solo scale
- Allowed me to fail faster
- Exposed weak creative instincts immediately
- Punished vague thinking
- Rewarded specificity

Most importantly, AI forced clarity. If you don't know what you want, AI will happily give you something.
That something will almost always be generic.

### Why "AI Slop" Exists

AI slop isn't a tool problem. It's a taste problem. The loudest AI influencers:
- Optimize for output, not meaning
- Confuse novelty with substance
- Treat art as content arbitrage
- Never sit with a piece long enough to refine it

This project took 60–80 hours because:
- I rejected hundreds of "good enough" outputs
- I rewrote prompts obsessively
- I rebuilt scenes that technically worked but emotionally didn't

That labor is invisible to people scrolling past results.

## PART IX: WHY THIS MATTERS (BEYOND THIS EPISODE)

### This Was Never Just an Episode

Episode 1 is:
- A narrative pilot
- A workflow test
- A proof-of-concept
- A studio dry run

It proves that:
- A solo creator can produce serialized animation
- Worldbuilding can precede monetization
- AI can be used with intention, not as spectacle

### Surf Noir as a Studio Model

Long-term, Surf Noir is:
- A transmedia IP
- A music + animation ecosystem
- A client-facing creative studio
- A testbed for hybrid AI/manual workflows

This Dev Log doubles as:
- A portfolio artifact
- A transparency document
- A capability statement

If you're a collaborator, investor, or client: this is what my process actually looks like.

## PART X: WHAT I'D DO DIFFERENTLY (HONESTLY)

1. **Lock visual language earlier.** I lost time chasing "perfect" when "coherent" would've been enough.
2. **Design around AI limitations sooner.** Lip sync taught me this the hard way. Structure the story to avoid known weaknesses.
3. **Commit to hybrid earlier.** Blender + AI would've saved hours in subtle motion and control.
4. **Ship earlier.** Nothing teaches like finishing.

## PART XI: WHAT'S NEXT

- Continued 2D research
- Blender character workflows
- Expanded Surf Noir episodes
- Music-forward storytelling
- URL + IRL integrations
- Client work under Surf Noir Studio

## FINAL THOUGHT

If you take anything from this: AI will not save you from the work. But it will meet you wherever your ambition already is. If your vision is small, it'll stay small.
If your vision is obsessive, messy, personal, and necessary — AI can help you finish it. And finishing changes everything. ## Publication Information - [Surf Noir](https://paragraph.com/@surfnoir/): Publication homepage - [All Posts](https://paragraph.com/@surfnoir/): More posts from this publication - [RSS Feed](https://api.paragraph.com/blogs/rss/@surfnoir): Subscribe to updates - [Twitter](https://twitter.com/lesurfnoir): Follow on Twitter