OpenAI says it found evidence that DeepSeek used its proprietary models

As artificial intelligence technology has grown and developed in recent years, so have concerns about uncredited use of content related to it, from text found on the internet to recordings of human voices. Now, two AI companies are locked in a fight with each other over allegations of data harvesting.

To get the full picture, we’ll have to go back to last week, when a Chinese startup called DeepSeek announced the launch of its “DeepSeek-R1” product. It said in an X post that the AI performed on par with “OpenAI-o1,” a product from the well-known U.S.-based company OpenAI. This week, CBS News reported that the app has made it to the top of the Apple App Store downloads list.

According to the New York Times, this DeepSeek debut “spooked Silicon Valley tech companies and sent the U.S. financial markets into a tailspin,” since it matched the performance of pretty much any AI on the market.

DeepSeek’s arrival also came with backlash from OpenAI, which the Times reported is “reviewing evidence” that the startup broke its terms of service “by harvesting large amounts of data from its AI technologies.”

In an interview with Audacy, Senior Vice President and Technology Equity Analyst Angelo Zino of CFRA explained that “what’s also gaining a lot of attention with DeepSeek is the fact that… the claims that have been put out there recently is that they’ve been able to kind of, you know, develop this for a price of under $6 million.”

Now, the Times also explained that most AI companies rely on open sourcing, or freely sharing and re-using code, and that DeepSeek is no different, having “corralled” its code and data from across the internet. Furthermore, it said that AI systems need “massive amounts of online data to train,” learning skills from “just about all of the text on the internet.”

A process called distillation involves a company using data generated by another company “to teach similar skills to its own systems,” the Times said. While it noted that this is a common practice in the AI field, it can become “legally problematic” when a company takes data from “proprietary technology.”

That’s what OpenAI, the San Francisco-based startup that is now worth $157 billion, said DeepSeek may have done with its technology. OpenAI’s terms of service prohibit anyone from “using data generated by its systems to build technologies that compete in the same market,” the Times said.

“We know that groups in the P.R.C. [People’s Republic of China] are actively working to use methods, including what’s known as distillation, to replicate advanced U.S. AI models,” OpenAI spokeswoman Liz Bourgeois said in a statement emailed to the outlet.

Last May, Audacy reported that experts were warning that the U.S. could fall behind China in an ongoing AI race between the two countries. One of the main concerns was the amount of energy needed to power AI programs and projects.

President Donald Trump has also stressed the importance of the U.S. dominating the AI field. He recently discussed the new, $500 billion “The Stargate Project,” a joint project between SoftBank, OpenAI, Oracle, and MGX. Audacy’s “The On Deadline Podcast” will take a deep dive into his AI initiatives this week.

Regarding DeepSeek’s launch, the president said it “should be a wakeup call for our industries that we need to be laser focused on competing to win,” CBS News reported.

“We are aware of and reviewing indications that DeepSeek may have inappropriately distilled our models, and will share information as we know more,” OpenAI’s spokesperson said. “We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here.”

The New York Times said that DeepSeek did not immediately respond to a request for comment.

At the same time as OpenAI reviews potential “inappropriate” distillation by DeepSeek, it is facing its own challenges. These include lawsuits accusing it of illegally using copyrighted data to train its systems and reports that OpenAI used speech recognition technology to transcribe the audio from YouTube videos.

Featured Image Photo Credit: (Photo by Leon Neal/Getty Images)