A new AI image generation method, InstantID, can quickly capture a person's identity and generate new images of them from a single reference image, according to a new paper published by the InstantX team in Beijing.
But Reuven Cohen, an enterprise AI consultant for Fortune 500 companies who calls InstantID the “new state-of-the-art,” told VentureBeat the new technique has a big downside: It will foster a flood of deepfake audio, image and video tools, just in time for the 2024 election.
“The use of tools like InstantID for deepfakes raises significant concerns due to the ease of creation and consistency of output with no training or fine-tuning required,” he said. “InstantID’s ability to efficiently generate identity-preserving content can lead to the creation of highly realistic and convincing deep fakes with no GPU and little CPU resources required.”
InstantID surpasses LoRA for identifiable AI image generation
InstantID, he explained, surpasses LoRA (small, fine-tuned models that train only a small number of parameters to capture things like specific characters or styles), which has led to an explosion of creations by LoRA enthusiasts shared on controversial platforms like Civitai. These include everything from AI-generated fan fiction and anime characters to photorealism and even fashion, but LoRA is arguably best known for generating porn and deepfakes.
Cohen posted about the new InstantID method on LinkedIn yesterday saying “So long, LoRA,” calling InstantID “deep fakes on steroids.”
The InstantX team’s paper, InstantID: Zero-shot Identity-Preserving Generation in Seconds, said that techniques like LoRA are hindered by high storage demands, lengthy fine-tuning processes and the need for multiple reference images. Existing ID embedding-based methods also have faced challenges, but InstantID offers a “plug and play module” that “adeptly handles image personalization in various styles using just a single facial image, while ensuring high fidelity.”
Cohen explained that InstantID is a tool for zero-shot identity-preserving generation, which differs significantly from LoRA and QLoRA. QLoRA extends the LoRA approach by first simplifying or shrinking the model’s data through quantization, further reducing the resources needed for fine-tuning.
Up until now, QLoRA was the state of the art, he said. “While LoRA and QLoRA are techniques for fine-tuning models by updating a subset of model parameters or applying quantization for efficiency, InstantID focuses on generating outputs that preserve the identity characteristics of the input data in a fast and efficient approach.”
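For readers who want a concrete picture of what Cohen means by updating only a small part of a model, the sketch below shows the common low-rank formulation of LoRA: a large frozen weight matrix W is left untouched, and only two small matrices B and A are trained, giving an effective weight of W + scale * (B x A). The function names and the layer size and rank used here are hypothetical, chosen only for illustration; this is not InstantID's code or any library's API.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in              # fine-tuning every entry of W
    lora = rank * (d_out + d_in)     # training only B (d_out x rank) and A (rank x d_in)
    return full, lora


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, scale=1.0):
    """Compute W + scale * (B @ A); W stays frozen, only B and A would be trained."""
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Hypothetical layer size and rank, for illustration only.
full, lora = lora_param_counts(4096, 4096, 8)
print(f"full fine-tune: {full:,} parameters; LoRA adapter: {lora:,} parameters")
```

Even in this reduced form, the adapter still has to be trained on example images of each subject, which is the step a zero-shot method like InstantID is described as skipping.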
It is easier than ever to create AI deepfakes
InstantID’s primary functionality is not directly related to fine-tuning models but rather to maintaining the identity aspects in generated content, he added. “Think consistency in things like the identity of an individual,” he said. “Donald Trump always looks like Donald Trump.”
And now, he cautioned, it couldn’t be easier to quickly prompt-engineer a deepfake. “Literally one click to deploy this on Hugging Face or Replicate,” he said.