Chinese-made DeepSeek AI model records extensive online user data, stores it in China-based servers

DeepSeek’s newest R1 large language model has already become notorious after its release cratered AI stocks, and revelations about its privacy policy might raise eyebrows even more — the company records extensive data from its online users, including keystrokes, passwords, and data entered in queries like images and text, and then stores it in China-based servers. 

Personal information, including date of birth, email addresses, phone numbers, and passwords, is all fair game, according to DeepSeek. Any content users give to the R1 LLM, from text and audio prompts to uploaded files, may also be collected by DeepSeek. And whenever someone contacts DeepSeek, it says it might keep users’ proof of identity, which presumably means documents like a driver’s license.

But that’s not all. DeepSeek records anything related to users’ hardware: IP addresses, phone models, language, etc. Its collection efforts are so thorough that the company notes “keystroke patterns or rhythms.” Cookies, a classic method of tracking users on the Internet, also contribute to user data collection.

Because R1 is 'open source,' it can be run anywhere on any hardware, which is generally good for privacy — running the model locally on your own hardware will presumably not lead to data collection. However, DeepSeek offers online access to R1 via its website and mobile app, which means the AI company handles and stores online users' data. Thankfully, DeepSeek is very transparent about what data it collects from online users, where it’s stored, and what it does with it. It details it all in its privacy policy webpage, which reveals that there’s almost nothing the company doesn’t collect.

It’s common practice for companies with lots of user data to sell that data to interested companies such as advertising firms, and DeepSeek says it might do so. The flow also runs the other way: the company admits that “advertisers, measurement, and other partners share information with us about you and the actions you have taken outside of the Service, such as your activities on other websites and apps or in stores, including the products or services you purchased, online or in person.” With all this information at its disposal, it seems that DeepSeek has the potential to know its users very intimately.

DeepSeek’s “corporate group” also has access to the data it collects to provide “certain functions, such as storage, content delivery, security, research and development, analytics, customer and technical support, and content moderation.”

As for where all this information is stored, the privacy policy says it’s all kept on servers located in China, a point that has the potential to spark serious controversy. Concerns about the personal details of Americans being in the hands of the Chinese government were a key factor in the Biden administration’s attempt to ban TikTok, raising the possibility that DeepSeek might come under similar scrutiny.

Of course, former President Biden tried to reverse the TikTok ban in his final days, and President Trump delayed the app’s fate almost as soon as he was sworn in for his second term. Thus, DeepSeek might also be shown some mercy under the right circumstances.

On the other hand, President Trump’s allies include Meta’s Mark Zuckerberg and OpenAI’s Sam Altman, and both of them are probably not very happy to see the R1 LLM run circles around their LLMs. Additionally, it’s hard to imagine that DeepSeek has made a good impression on the Republican President by inadvertently causing the stock prices of many American tech companies to fall significantly.

Developed by Chinese AI company DeepSeek, R1 is an open-source LLM that boasts cutting-edge performance while using a fraction of the computing power. With 671 billion parameters, it’s one of the largest AI models, yet it took only 2.8 million GPU hours to train. Meta’s Llama 3 required 30.8 million GPU hours, or 11 times as many.

DeepSeek boasted about these accomplishments over a month ago, but R1 launched on January 20, and the implications were fully appreciated by the stock market only yesterday. The market reacted by selling shares in AI companies like Nvidia. While the spotlight on DeepSeek has raised its profile, many have also reviewed how it handles user privacy, a particularly thorny issue for anything involving AI and software developed in China. 

Matthew Connatser

Matthew Connatser is a freelancing writer for Tom's Hardware US. He writes articles about CPUs, GPUs, SSDs, and computers in general.

  • EzzyB
    the company records extensive data from its online users, including keystrokes, passwords, and data entered in queries like images and text, and then stores it in China-based servers.

    SHOCKING! SHOCKING I SAY! :eek:

    Seriously, is anyone at all surprised by this?
  • hotaru251
    90% of people have no info that any gov would care about & that data is already collected (by private companies & their own govs) all time via multiple sources.

    if you are "that" paranoid just run it in a sandbox or on a dummy device that's only used for junk stuff (thus they never get anything of value)
  • Notton
    You know what? At this point, I don't care.
    altman/elon/zucker/gates have a long history of harvesting and selling off user data.
    Do you really think they also don't harvest all your data when using their ai models?

    At least DeepSeek is open about the data they collect, and it's open source. Whereas grok/openai/copilot/facebook is a big question mark. Who knows what they collect about you.

    If you really care about privacy, EU's GDPR is a good starting point.
  • Gaidax
    I would definitely not use it for anything work-related, as a software engineer.

    I have no illusions about Western AI chatbots and tools, but China is a whole next level low as far as accountability and morals go.
  • Dementoss
    EzzyB said:


    Seriously, is anyone at all surprised by this?
    They shouldn't be, it's as surprising as night following day...
  • ederbond
    EzzyB said:
    SHOCKING! SHOCKING I SAY! :eek:

    Seriously, is anyone at all surprised by this?
    Nothing different from what Google, MSFT, Apple, Facebook and X has been doing since forever. So what's the point?
  • pug_s
    ederbond said:
    Nothing different from what Google, MSFT, Apple, Facebook and X has been doing since forever. So what's the point?
    Believe it or not, unlike the US, China has a data privacy law (PIPL) . So your data will be housed in some Chinese server and not sold to some 3rd party.
  • USAFRet
    pug_s said:
    Believe it or not, unlike the US, China has a data privacy law (PIPL) . So your data will be housed in some Chinese server and not sold to some 3rd party.
    And then used however the govt directs them to.
  • WhteTrash
    Would trust China more than the US at this point.
  • DalaiLamar
    Gaidax said:
    I would definitely not use it for anything work-related, as a software engineer.

    I have no illusions about Western AI chatbots and tools, but China is a whole next level low as far as accountability and morals go.
    https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExZ3BmZ3dzbWlleDJ2aWp4MWxuMnhiYmppaTR6bTIyY3hjM2F3d3BkOSZlcD12MV9naWZzX3NlYXJjaCZjdD1n/5R1FM2PNw3G6AZWBsc/giphy.gif