Actor Said She Firmly Declined Offer From AI Firm to Serve as Voice of GPT-4o
Imagine these optics: A man asks a woman to be the voice of his new artificial intelligence assistant. She declines. He appears to go ahead and use a vocal likeness of her instead, without her permission.
The players in this drama, touching on technology, agency and consent, are OpenAI CEO Sam Altman and Hollywood megastar Scarlett Johansson, who voiced an artificial intelligence program in Spike Jonze’s 2013 film “Her.” Altman has said it’s his all-time favorite movie.
On May 13, OpenAI released GPT-4o, billed as its “new flagship model,” which will shortly include the ability to generate not just text but also image, voice and video. The same day, Altman posted on social platform X a one-word message: “her.”
“The new voice (and video) mode is the best computer interface I’ve ever used,” Altman said in a blog post the same day. “It feels like AI from the movies; and it’s still a bit surprising to me that it’s real.”
When OpenAI demoed GPT-4o, many users commented about just how similar its “Sky” AI voice assistant sounded to Johansson.
Is that by design? Johansson told NPR that Altman first approached her nine months ago, requesting to use her voice, saying it would increase the public’s trust in AI. “After much consideration and for personal reasons, I declined the offer,” she said.
While Altman recently followed up again, asking her to reconsider, she said OpenAI launched the new product before they’d connected.
The company has since “paused” the Sky voice’s availability.
“We believe that AI voices should not deliberately mimic a celebrity’s distinctive voice – Sky’s voice is not an imitation of Scarlett Johansson but belongs to a different professional actress using her own natural speaking voice,” OpenAI said in a Sunday blog post. The company said it built Sky and four other voices – labeled Breeze, Cove, Ember and Juniper – into ChatGPT and then expanded on them for the GPT-4o release, using samples from “voice and screen actors” it hired. The company declined to name the actors.
Johansson’s publicist told NPR that the actor’s attorneys have written to OpenAI seeking precise details about how the Sky voice was created.
“We cast the voice actor behind Sky’s voice before any outreach to Ms. Johansson. Out of respect for Ms. Johansson, we have paused using Sky’s voice in our products,” Altman told NPR in a written statement. “We are sorry to Ms. Johansson that we didn’t communicate better.”
A mea culpa may not be enough. Courts in California and some other states have ruled that while voices can’t be copyrighted, they can be misappropriated.
One of the most famous cases involved Bette Midler, who in 1988 sued Ford Motor Co. and its advertising agency, Young & Rubicam, for hiring one of her backup singers to mimic her voice on a “sound-alike” song from one of her albums. She won and received $400,000 in damages.
Also in 1988, Doritos maker Frito-Lay hired a performer to imitate gravelly voiced singer and actor Tom Waits for advertisements. He sued, and in 1990 a jury found in his favor, saying they’d been led to believe that Waits himself was hawking Salsa Rio Doritos. He received over $2 million in damages.
As a California court later ruled: “When a voice is a sufficient amount of a celebrity’s identity, the right of publicity protects it.”
Johansson has criticized the apparent appropriation of her voice – not least in the current era of rampant disinformation, some of it AI-powered.
“I was shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference,” she told NPR.
“In a time when we are all grappling with deepfakes and the protection of our own likeness, our own work, our own identities, I believe these are questions that deserve absolute clarity,” Johansson said.
OpenAI is one of a handful of U.S. and Chinese mega-AI startups appearing to practice Facebook’s now-retired Silicon Valley mantra – “move fast and break things” – as they claim advancing AI is worth the cost.
Along the way, they’ve met with questions of misappropriation.
Earlier this year, OpenAI defended its use of copyrighted material to train the large language models it needs to power products such as GPT-4o. The company said creating LLMs would otherwise be “impossible,” because copyright law today covers “virtually every sort of human expression,” including photographs, blog posts and software code. The firm contended that using copyrighted data to train LLMs should count as a fair-use exception to the law and said, “We provide an opt-out because it’s the right thing to do.”
The New York Times last December sued OpenAI and backer Microsoft, in part because it said ChatGPT’s responses to users’ queries sometimes delivered content that was nearly identical to Times articles. Other times, it said, ChatGPT inaccurately attributed its responses to information sourced from the Times.
The Times’ case highlights the parasitic nature of training LLMs and open questions about fair use in the AI age. Now add celebrity vocal impersonation to that list of questions.