Some generative models are a pleasure to work with, while others are quite a bit more temperamental.
While impressive, the Wan 2.1 video generation model (or more specifically the Wan 2.1 Fun Control variant) definitely landed in the latter category for me, but the payoff was totally worth it.
I decided to release the end-to-end video generation ComfyUI workflow I have been working on that combines Wan and Flux into a comprehensive All-In-Wan package (no apologies for the terrible pun :D).
Here’s a breakdown of what the workflow does:
- Loads a source video.
- Generates a control video using two ControlNets (OpenPose and Depth), at a resolution matching the target video.
- Extracts the first frame and its depth map, then stylizes it with Flux using an adjustable scaling factor, so you can give Wan a high-resolution styling guidance image beyond the target video resolution.
- Combines the control video and the styled starter frame (or alternatively, an external image prepared separately) to generate a Wan output video at either 480p or 720p resolution.
- Optionally upscales the video frames up to 4x with an ESRGAN-type upscale model of your choice.
- Stitches the frames together into a final video, and BOOM, done!
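To illustrate the sizing logic behind the Flux styling step: the guidance image is scaled beyond the target video resolution by an adjustable factor. A minimal sketch in Python, where the function name and the snap-to-multiple-of-16 rule (a common latent-size constraint in diffusion models) are my own illustrative assumptions, not taken from the workflow itself:

```python
# Hypothetical sketch of guidance-image sizing for the Flux styling step.
# The starter frame is upscaled by an adjustable factor beyond the target
# video resolution, then snapped to a multiple of 16. All names and the
# snapping rule are illustrative assumptions.

def guidance_resolution(target_w, target_h, scale=2.0, multiple=16):
    """Compute a high-res guidance size from the target video size."""
    w = int(round(target_w * scale / multiple)) * multiple
    h = int(round(target_h * scale / multiple)) * multiple
    return w, h

# A 854x480 (480p) target with a 2x styling factor:
print(guidance_resolution(854, 480, scale=2.0))  # (1712, 960)
```

The rounding keeps both dimensions compatible with the model's latent grid while staying as close as possible to the requested scale.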
To make things easier, the workflow includes a Note node listing all the model files you will need, so guesswork is kept to a minimum.
The workflow should also serve as a solid foundation for incorporating other variants of the Wan model, like first-frame-last-frame or the newer VACE capabilities.
And here’s the best part: it’s absolutely free, no strings attached, no upsells.
Just grab it and send some good vibes my way: All-In-Wan Workflow ZIP
Here’s a visual snapshot of the workflow:

As well as a video pan-through inside ComfyUI (no sound):
…and finally, a composite video showcase with the example assets used during workflow development.
The original source video of the boxing dude came from Vecteezy.
Now for some disclaimers:
1) This is certainly NOT a lightweight workflow.
Both Wan and Flux are quite resource-intensive on their own, and the workflow has been specifically constructed to avoid the nightmare scenario of trying to cram both of them into VRAM at the same time.
It’s good practice to:
- Bypass the Video Generation node group when generating the starter frame using Flux.
- Once the starter image is complete, select “Unload Models” from the Comfy top bar to free Flux from VRAM and make room for Wan.
- Re-enable the Video Generation node group and resume the workflow again to proceed with video generation. This method worked flawlessly during development.
- You can also skip Flux entirely and load an externally styled starter image, which the workflow also supports.
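The staged approach above can be sketched in plain Python. In ComfyUI the “Unload Models” button does the real work; this toy example only illustrates the underlying idea of dropping every reference to one model before the next one is loaded, so the two never coexist in memory. The `Model` class and function names are illustrative, not part of the workflow:

```python
# Toy sketch of the "generate, unload, then continue" pattern.
# Class and function names are illustrative assumptions.

import gc

class Model:
    """Stand-in for a heavyweight model held in (V)RAM."""
    def __init__(self, name):
        self.name = name
    def run(self, x):
        return f"{self.name}({x})"

def staged_pipeline(source_frame):
    flux = Model("flux")        # stage 1: only Flux is loaded
    starter = flux.run(source_frame)
    del flux                    # drop the reference before loading Wan
    gc.collect()                # (in ComfyUI: "Unload Models")
    wan = Model("wan")          # stage 2: only Wan is loaded
    return wan.run(starter)

print(staged_pipeline("frame0"))  # wan(flux(frame0))
```

The key design point is sequencing: each stage finishes and releases its model before the next stage allocates its own, which is exactly what bypassing the Video Generation group and unloading models between runs achieves.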
2) It should be relatively straightforward to substitute the model files with less demanding versions for a lower-spec workflow. (If you happen to make one, let me know; I’d love to check it out!)
- Flux-Dev can be replaced with Flux-Schnell (although ControlNets may have to be updated as well).
- The 14B Wan model can be replaced with the lighter and faster 1.3B variant.
- While the workflow contains an optimizations node group for Wan, be cautious with it, as it may significantly compromise generation quality, particularly for realistic outputs.
3) The workflow deliberately avoids “wireless” broadcast nodes like Anything Everywhere or Get/Set Nodes. As good as these nodes are, they introduce data-propagation and execution-order issues too often for comfort.
I decided to keep workflows as tidy and readable as possible the old-fashioned way.
(I also wrote about this in a recent LinkedIn post).
imageResizeKJv2 is deprecated; how can we replace it?
It’s not deprecated; the issue is related to a recent ComfyUI update that affected many nodes: https://github.com/kijai/ComfyUI-KJNodes/issues/280
The workflow used to have a deprecated “Upscale and Resize” KJ node, but it was replaced by the newer V2 version before publishing.