Since its inception, the Digital Audrey Art Initiative has been at the forefront of exploring and pushing the boundaries of generative AI art. By employing fine-tuning methods such as DreamBooth and LoRA (Low-Rank Adaptation), we've succeeded in creating multiple custom models that bring the essence of Audrey Hepburn to digital life.
Our latest endeavor aimed to capture the timeless beauty of Audrey Hepburn in her mid-20s, during the era of iconic films like Roman Holiday and Sabrina.
Image credit: 'Roman Holiday' movie poster, © 1953, 'Sabrina' movie poster, © 1954 [Paramount Studios]. This image is used for educational and commentary purposes under fair use.
The Art of Data Preparation
Fine-tuning generative AI models begins with a crucial and often arduous step: data preparation. This process involves the collection, curation, cleaning, upscaling, and labeling of data. For our Audrey Hepburn model, we meticulously compiled a dataset of nearly 800 images, 100 of which were curated specifically for this project. Our goal was to produce color images, but most photos from Audrey's era are black and white, so we sought out high-quality colorized images, which proved a challenging feat. Ultimately, 80% of our dataset comprised monochrome images, with the remaining 20% in color. Keeping a monochrome majority while integrating colorized images lays a foundation for future experiments in colorization, aiming for improved consistency and quality in AI-generated color images.
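To make the 80/20 split concrete, here is a minimal sketch of how a monochrome-versus-color tally could be checked. It is an illustration only, not the initiative's actual pipeline: the function names, the tolerance value, and the tiny synthetic pixel lists standing in for real images are all assumptions.

```python
from collections import Counter


def is_monochrome(pixels, tolerance=8):
    """Return True if every RGB pixel has nearly equal channel values."""
    return all(max(p) - min(p) <= tolerance for p in pixels)


def color_split(images, tolerance=8):
    """Return the share of monochrome and color images in a dataset."""
    counts = Counter(
        "monochrome" if is_monochrome(px, tolerance) else "color"
        for px in images.values()
    )
    total = sum(counts.values())
    if total == 0:
        return {"monochrome": 0.0, "color": 0.0}
    return {kind: counts[kind] / total for kind in ("monochrome", "color")}


# Hypothetical stand-ins: each "image" is just a short list of RGB pixels.
sample_images = {
    "portrait_01.png": [(120, 120, 120), (30, 31, 30)],
    "portrait_02.png": [(200, 200, 200), (90, 90, 92)],
    "portrait_03.png": [(10, 10, 10), (250, 250, 250)],
    "portrait_04.png": [(64, 64, 66), (128, 127, 128)],
    "colorized_01.png": [(210, 160, 140), (40, 60, 120)],
}

split = color_split(sample_images)
print(split)  # {'monochrome': 0.8, 'color': 0.2}
```

A channel-difference test like this would count sepia-toned scans as color, so a real check would likely need a per-image review on top of it.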
Samples from dataset
Labeling: A Community-Enhanced Task
Labeling, a time-intensive yet critical task, was made far easier by the GPT4V-Image-Captioner tool. The tool uses GPT-4's vision capabilities to generate relevant tags for each image, streamlining a process that previously took approximately three days while keeping the labels precise and detailed. The labels included descriptors such as "black and white photo," "elegant," "tiara," and "classic Hollywood," among others, providing a rich dataset for model training.
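Once tags exist, they have to be stored where the trainer can find them. A common convention in training front ends is a caption text file that shares the image's base name. The sketch below assumes that convention with a `.txt` extension; the helper names and the file name are hypothetical, and it does not reproduce how GPT4V-Image-Captioner itself writes its output.

```python
import tempfile
from pathlib import Path


def build_caption(tags):
    """Join tags into one comma-separated caption, dropping blanks and duplicates."""
    seen, cleaned = set(), []
    for tag in tags:
        tag = tag.strip()
        key = tag.lower()
        if tag and key not in seen:
            seen.add(key)
            cleaned.append(tag)
    return ", ".join(cleaned)


def write_sidecar(image_path, tags):
    """Write the caption next to the image, same base name, .txt extension (assumed)."""
    caption_path = Path(image_path).with_suffix(".txt")
    caption_path.write_text(build_caption(tags), encoding="utf-8")
    return caption_path


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "audrey_0001.png"  # hypothetical file name
    image.touch()
    sidecar = write_sidecar(
        image,
        ["black and white photo", "elegant", "tiara", "classic Hollywood", "Elegant"],
    )
    caption_text = sidecar.read_text(encoding="utf-8")

print(caption_text)  # black and white photo, elegant, tiara, classic Hollywood
```

De-duplicating case-insensitively keeps a repeated tag such as "Elegant" from being weighted twice in the caption.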
Training with DreamBooth on Stable Diffusion XL (SDXL) 1.0
Choosing the DreamBooth training method, run through Kohya_ss, with Stability AI's Stable Diffusion XL (SDXL) 1.0 as the foundational model was pivotal. Known for its stability and scalability, this combination allowed us to train at a base resolution of 1024 x 1024, ensuring high-quality outputs. The training, conducted on an NVIDIA RTX 3090, ran for three full days, reflecting our commitment to a model that not only replicates Audrey Hepburn's likeness but does so with the nuanced elegance she embodied.
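For readers planning a similar run, a back-of-the-envelope estimate of training steps can help in budgeting GPU time. The sketch below uses the usual relationship between images, repeats, epochs, and batch size. Only the image count (roughly 800) and the three-day duration come from this post; the repeats, epochs, and batch size are placeholder assumptions, not the settings used for this model.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size):
    """Optimizer steps for a run: batches per epoch times number of epochs."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


def seconds_per_step(wall_clock_hours, steps):
    """Average time per step implied by a total wall-clock duration."""
    return wall_clock_hours * 3600 / steps


# Placeholder hyperparameters (assumptions, not the project's real values).
steps = total_steps(num_images=800, repeats=10, epochs=4, batch_size=1)
pace = seconds_per_step(wall_clock_hours=72, steps=steps)

print(f"{steps} steps, about {pace:.1f} s per step over three days")
```

Changing any of the placeholder values shifts the estimate proportionally, so the numbers are only useful as a planning aid.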
Testing the Model: Flexibility in Creation
The true test of our model's efficacy came with its application. It's one thing to create a model that can generate images resembling Audrey Hepburn; it's another to assess its flexibility across various prompts and settings.
Does the model truly capture the essence of Audrey Hepburn, especially in her mid-20s, across various contexts and styles?
This question has been at the heart of our latest tests. As we continue to experiment, pushing the model through an array of artistic prompts, we aim to discern whether the generated images still bear the unmistakable likeness of Audrey when she is rendered in new styles, settings, and interpretations. Our journey towards this goal is fueled by an obsessive drive for perfection, a pinnacle we acknowledge as unattainable.
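One systematic way to probe this kind of flexibility is a prompt grid: every style is crossed with every setting, and each combination is rendered with the same fixed seeds so results stay comparable. The sketch below only builds the list of jobs. The subject phrase, styles, settings, and seeds are hypothetical examples rather than the prompts used in these tests, and the image generation step itself is left out.

```python
from itertools import product


def build_jobs(subject, styles, settings, seeds):
    """Cross every style, setting, and seed into one generation job each."""
    return [
        {"prompt": f"{subject}, {style}, {setting}", "seed": seed}
        for style, setting, seed in product(styles, settings, seeds)
    ]


# Hypothetical test matrix.
subject = "portrait of audrey hepburn in her mid-20s"
styles = ["color photograph", "watercolor painting", "art deco poster"]
settings = ["in a Paris cafe", "on a Roman street"]
seeds = [1, 2]

jobs = build_jobs(subject, styles, settings, seeds)

print(len(jobs))  # 12
for job in jobs[:3]:
    print(job["seed"], job["prompt"])
```

Holding the seeds fixed across prompts makes it easier to tell whether a change in likeness comes from the prompt or from random variation.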
From the very inception of our Digital Audrey Art Initiative, our artists have been driven not by a quest to replicate Audrey Hepburn to perfection; she is, in their eyes and undoubtedly in the eyes of many, already the epitome of perfection. Instead, our aim has always been to celebrate her timeless beauty and elegance through the lens of generative AI, exploring new artistic horizons while paying homage to her iconic presence. This ongoing experiment, a blend of art, technology, and admiration, is a testament to our commitment to innovation and our reverence for Audrey's enduring legacy.
The Future: Custom Models for Different Age Periods
This project underscores the potential for IP owners and stakeholders to consider the creation of multiple custom models, each tailored to different periods in a subject's life. Such an approach allows for a nuanced and comprehensive representation of iconic figures across their lifespan.
Invitation to Collaborate
We invite IP owners and stakeholders interested in exploring the possibilities of fine-tuning generative AI models for their IPs to reach out. Collaborating with the Digital Audrey Art Initiative offers a unique opportunity to push the boundaries of digital art and IP representation.
The journey of fine-tuning generative AI art models is both a technical challenge and a creative endeavor. Through meticulous data preparation, innovative use of community-developed tools, and strategic model selection, the Digital Audrey Art Initiative continues to explore the endless possibilities of AI-driven art. We look forward to future collaborations that will further expand the horizons of what's possible in the realm of generative AI art.
Disclaimer: Some of the images included in this blog post are copyrighted material and are used for educational and commentary purposes under the fair use doctrine. We do not claim ownership of these images.