I usually live in the world of stylized material, but recently I took a little detour into tinkering with something more realistic to test a theory.
I have long been obsessed with finding ways to maximize control over generative outputs.
Lately I have been immersed in some really exciting storyboard-related workflows, and I was curious whether similar mechanisms could drive something more fine-grained than scenes: individual performances, for example.
During a recent chat with Gemini, a pretty amusing, meme-worthy persona organically emerged: a smug, Machiavellian networking guru dealing out uncomfortable truths.
I decided to hand him the mic for this one.
Here is how this oddball little workflow played out:
- The Prep: Started blocking with thumbnails that combined a basic 3D graybox dummy with hand-drawn sketches of expressions mapped onto it.
- The Driver: Turned those into fluid video sequences and stitched them together into a rough “proxy” performance (voice included) to act as the scaffolding for the high-detail video.
- The Look: Created the character likeness using Flux.2 with a LoRA for that rugged skin/hair detail, then pseudo-upscaled it with Nano Banana Pro.
- The Polish: Merged the performance with the high-detail visuals, toned down the typical video-model “chicken feet” artifacting, and upscaled the result to 4K 60fps with the Bytedance upscaler (which did a remarkable job).
Tools used: the Kling 2.6 and Kling O1 video models, Flux.2, Nano Banana Pro, Blender, Affinity, Topaz, Bytedance upscaler, and of course Weavy.
…and these are the original concept sketch and the high-detail “plate” shot for the character:


