This post originally appeared as a LinkedIn article.
Recently, there has been a lot of heated rhetoric and torches-and-pitchforks sentiment around claims made by AI companies, and around generative AI and its implications for intellectual property ownership, creator rights and livelihoods, the labor market, and more.
As someone who has been immersed in digital art for decades, has led art teams, and has been a vocal critic of training data practices, yet has also worked extensively with generative AI tools within complex production workflows, I might offer a more nuanced perspective, some predictions, and even a splash of unsolicited advice that may be helpful to some.
AI does not equal training data
I’m seeing a lot of proverbial babies being thrown out with the bathwater right now, and I worry that the frequent conflation of questionable training practices with generative AI technology itself will do the creative community more harm than good by hindering the adoption of absolutely game-changing, transformative tools and workflows.
We do have a problem…
This has been exhaustively analyzed, but the current state of publicly available datasets and training practices is a legitimate cause for concern.
The worst-case outcome of this trajectory is a repeat of the dystopian precedent set by the social media complex: a landscape of centralized oligarchical control, monetization, and manipulation of mountains of indiscriminately harvested data.
…but it’s almost certainly temporary
I am highly optimistic that such an outcome will not materialize, given the inherent unsustainability of the current model, often likened to the anything-goes era of Napster in music sharing.
Unlike the effectively involuntary “secretion” of behavioral data on social media, training data for generative AI results from creative effort and intention.
This raises many legal, ethical, financial, social, and even practical considerations that vigorously push back against using “scraped” datasets, at least for the near term. (As I’m writing this, two class action lawsuits have just been announced against OpenAI…)
What unfolds next is even more transformative
Looking further ahead, though, I suspect that market adaptation and human ingenuity will not only address some of these issues, but also prompt a profound reassessment of the very nature of intellectual property and the roles of inspiration, originality, creation, curation, and interpretation within it.
It wasn’t lawsuits that made piracy virtually obsolete: it was the emerging value proposition, convenience, and integrated experience of music and video streaming that rendered piracy impractical and cumbersome by comparison.
Also, it won’t be long before compelling arguments arise about the incremental nature of every creative process. Parallels will inevitably be drawn between how AI learns, and how a human creator draws inspiration from existing works, be it a painting, a movie, a book, a song, or even historical events.
Models and datasets are about to get a lot smaller, and a lot more specific
But before we rush too far ahead, I’m convinced that the immediate future of generative AI, along with its most spectacular outcomes, will entail applying it to narrowly focused proprietary datasets by skilled creators.
Even in my own experience (in the context of generative visuals), this is where the technology truly shines.
It transforms from a magic 8-ball that churns out shiny-looking, but generic and unpredictable fluff to an ideation and iteration engine of tremendous power and highly controllable output suitable for practical real-world use.
Skill, talent, and technique move to the forefront
In conjunction with the rapid evolution of models, datasets, and tools, and as the initial euphoria of spray-and-pray prompting subsides, creator skill and rigorous technical experimentation will decisively take over the narrative.
There are already many encouraging signs of this, with talented creators showcasing strong fundamentals and technical expertise pushing the boundaries in surprising and amazing ways.
As this trend accelerates, the results will compare to wholesale generative art the way performing a Rachmaninoff piano concerto compares to plinking ‘Twinkle, Twinkle, Little Star’ on grandma’s piano.
Specialization is risky, skill diversification is in
This brings me to perhaps my most important point: workflows incorporating AI seem to strongly favor, reward, and amplify broad multi-disciplinary skill sets and higher-order creative and strategic thinking, as opposed to narrow and deep specialization.
This intuitively makes sense: AI is the ultimate specialist, a clockwork demon that lacks the ability to grapple with abstractions and holistic concepts, or meaningfully connect the dots between diverse realms of thought, emotion, and human expression.
On the other hand, the more siloed and mechanistic something is, and the easier it is to break it down into a series of repetitive technical steps, the more vulnerable it is to AI eventually doing it better and faster than it’s ever been done before.
If there’s one piece of advice I could give to anyone (particularly artists), it would be this: embrace the “Renaissance Man” ethos, broaden your foundational skills, and expand into adjacent domains, disciplines, and beyond.
The old “you can’t change the world, but you can change yourself” aphorism has rarely been more applicable.
AI will expand opportunity, mostly in unexpected ways
There is a lingering sense of impending doom about the impact of AI on the labor market.
While I’m sure some of this is warranted – although arguably reactionary hiring practices and widespread economic mismanagement are already doing more damage than AI likely ever will – there’s also an often-overlooked flipside.
There are at least two important trends at play, both of which I have personally witnessed:
The first is the decentralizing and deflationary impulse inherent in every new technology: AI tech, tools, and workflows are poised to empower and elevate the capabilities and productivity of individual creators and small teams to a degree that will catch most people off-guard.
The second is anecdotal and subtle, but perhaps even more powerful.
At Apocalypse Studios we developed an AI-assisted workflow to illustrate a series of “radio plays” that establish the intricate lore and narrative for the dark world of our in-development game Deadhaus Sonata.
This workflow allowed us to create hundreds of thematically consistent illustrations in a matter of days, which would have been prohibitively time-consuming and expensive otherwise.
The project had a surprisingly wide-ranging impact, leading to the hiring of additional voice actors, re-energizing our fan base, immersing the team in the narrative, sparking conversations that resulted in new design initiatives, generating interest from external parties, driving a 5x improvement in our social media engagement, and serving as a prop for look development and an onboarding tool for new team members.
It acted as an expansive, broad-spectrum catalyst – akin to watching neurons seek out and form new connections – and without the force-multiplier effect of AI it would not have existed at all.
Nobody has a crystal ball or knows how things will unfold, but for now, I remain hopeful that AI is not a threat to human creativity, but a tool to help write a new chapter of it.
The robot uprising
We’ll wrap up with a visual and technical breakdown of the header image used for this article, which – shockingly – was itself created using an AI-assisted hybrid workflow.
It hopefully serves as a small example to illustrate the kind of direction and techniques that will gradually take things beyond the “scraped” stigma.
- The base model used was LeonardoAI’s implementation of Stable Diffusion.
- To narrow the focus and get the right vibe, an additional fine-tuned custom model was generated using a dataset of several dozen Soviet-era propaganda posters available in the public domain (representing a consistent body of training data with no copyright issues).
- I created a hand-drawn grayscale sketch in Photoshop to use as a ControlNet guidance image, with a pretty hefty 0.6 weight.
- The baseline prompt used was extremely simple (“Soviet-style poster of a robot uprising, artificial intelligence, dynamic, vintage”), as most of the heavy lifting was done by the guidance image and the custom model. The prompt’s primary role was to allow small iterative changes while keeping the same generation seed.
- The final image is a composite of 8 different generated variant images picked from a total of about 40, and manually assembled in Photoshop.
- Final touch-ups, including some hand-painted elements and detail, and color correction to get the vintage look, were also done in Photoshop.
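
For readers who think in code, the fixed-seed iteration described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual tooling used: it does not call LeonardoAI, Stable Diffusion, or ControlNet, and the `GenerationRequest` structure, the model labels, the file name, the seed value, and the prompt tweaks are all made-up placeholders. Only the base prompt and the 0.6 guidance weight come from the breakdown itself.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class GenerationRequest:
    """One image-generation request (hypothetical structure, not a real API)."""
    prompt: str
    seed: int                                     # kept fixed so variants stay comparable
    base_model: str = "stable-diffusion"          # placeholder label
    custom_model: str = "soviet-poster-finetune"  # placeholder label
    guidance_image: str = "sketch_grayscale.png"  # hypothetical file name
    controlnet_weight: float = 0.6                # weight quoted in the breakdown


# Base prompt as quoted in the breakdown.
BASE_PROMPT = ("Soviet-style poster of a robot uprising, "
               "artificial intelligence, dynamic, vintage")


def build_variants(base: GenerationRequest, tweaks: List[str]) -> List[GenerationRequest]:
    """Return one request per prompt tweak; seed and guidance settings are unchanged."""
    return [replace(base, prompt=f"{base.prompt}, {tweak}") for tweak in tweaks]


# The seed value is arbitrary; the tweaks are invented examples.
base = GenerationRequest(prompt=BASE_PROMPT, seed=1234)
variants = build_variants(base, ["raised fist", "red banner", "factory skyline"])

for v in variants:
    print(v.seed, v.controlnet_weight, v.prompt)
```

Because the seed, guidance image, and custom model stay constant, any difference between two variants can be attributed to the prompt tweak alone, which is what makes it practical to pick and composite a handful of variants from a larger batch.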