AI-generated content and other unfavorable practices have landed longtime staple CNET on Wikipedia's list of blacklisted sources

CNET website (Image credit: Shutterstock)

In the wave of AI controversies and lawsuits, CNET has been publicly admonished since it first started posting thinly veiled AI-generated content on its site in late 2022, a scandal that has culminated in the site being demoted from Trusted to Untrusted Sources on Wikipedia [h/t Futurism].

Considering that CNET has been in the business since 1994 and maintained a top-tier reputation on Wikipedia until late 2020, this change came only after extensive debate among Wikipedia's editors, and it has drawn the attention of many in the media, including some CNET staff members.

It's important to remember that while Wikipedia is "The Free Encyclopedia that Anyone Can Edit," it's hardly the Wild West. Wikipedia's community of editors and volunteers demands citations for any information added to Wiki pages, which maintains some degree of accountability within the massive community responsible for operating Wikipedia. While it should never be used as a primary source, Wikipedia tends to be at least an excellent place to start researching a topic thanks to those citation requirements.

CNET's (apparent) fall on Wikipedia started before its AI-generated content was discovered. Back in October 2020, publisher Red Ventures acquired CNET, and Wikipedia editors began downgrading the site as evidence pointed to a drop in editorial standards and more advertiser-favored content.

But after late 2022, when Red Ventures began posting AI-generated content to what used to be one of the most reputable tech sites, Wikipedia editors almost immediately started pushing to demote CNET entirely from Wikipedia's list of reliable sources. CNET claims to have stopped posting AI-generated content, but Red Ventures' ruthless pursuit of capital and posting of misinformation on its other sites (like Healthline) has kept CNET off the current list of reliable sources.

One Wikipedia editor, Chess, was quoted in the Futurism piece as saying, "We shouldn't repeatedly put the onus on editors to prove that Red Ventures ruined a site before we can start removing it; they can easily buy or start another. I think we should look at the common denominator here, which is Red Ventures, and target the problem (a spam network) at its source."

This is a genuinely scathing take, but it might just be warranted. The issue here isn't purely the concealed use of generative AI in published articles on one of the best-known tech news sites ever. Instead, it's the fact that those AI-generated articles tend to be poorly written and inaccurate.

Before the age of AI, Wikipedia editors already had to deal with unwanted auto-generated content in the form of spambots and malicious actors. In this way, editors' treatment of AI-generated content is remarkably consistent with their past policy: it is just spam, isn't it?

In a related story from a few months ago, a self-described "SEO heist" was discovered on Twitter. It may have gone unnoticed had the person responsible not openly boasted about the "achievement," which involved scraping a competitor's site, running it all through AI, and generating an entire competing website with 1,800 articles targeting the same niche to "steal 3.6M total traffic from a competitor."

The site hurt by this so-called SEO heist is Exceljet, run by Excel expert David Bruns to help others better use Excel. Besides having his hard work stolen in perhaps the sleaziest, laziest manner possible, Bruns also discovered that most of the copied content was inaccurate. HubSpot's coverage of the story also notes that, fortunately, Google eventually caught on.

Unfortunately, the rise of generative AI is also coming at the expense of a usable Internet filled with content written by humans who can actually test and genuinely understand things. One can only hope stories like this discourage publishers from tossing aside quality control to the point of auto-generating misleading content.

That goes double when lawsuits like The New York Times v. OpenAI and Microsoft remind us that these so-called generative AIs essentially have to steal other people's work to function at all. At least when a regular thief steals an object, it still works. With generative AI, you can't even guarantee the result will be accurate, especially if you lack the expertise to tell the difference.

  • COLGeek
    Good call from a GIGO perspective.
    Reply
  • vanadiel007
    It poses an interesting question: if AI is going to revolutionize our future, according to the media reports, why would it be a bad thing to do journalism using AI?

    Where would we draw the line on how to use it, and should we even draw a line considering it's supposed to help us?
    Reply
  • TheyCallMeContra
    vanadiel007 said:
    It poses an interesting question: if AI is going to revolutionize our future, according to the media reports, why would it be a bad thing to do journalism using AI?

    Where would we draw the line on how to use it, and should we even draw a line considering it's supposed to help us?

    Which media reports?

    Also, as covered in This Article, AI cannot actually test or experience anything it "writes" about for itself. Thus, it can generate a website of over 1000 misleading or incorrect articles that still steal search results from an Actual Human who wrote An Actual Useful Resource. You're asking "why it would be a bad thing" when the reasons are directly in front of you: generative AI functions as an automated content theft machine that is incapable of actually turning around truly high-quality work. And the only times it can even come close, it does so by directly stealing from the livelihood of Actual Human Beings Who Did The Work Instead.

    "It's supposed to help us" really doesn't seem to be true at all. The loudest gen. AI fanatics on Twitter are openly hoping to weaponize it against artists and other people who they simply don't want to pay for their work. They want the reward of good art, good 3D, good writing, etc without working to make it themselves or paying an actual person to do it for them. The head of OpenAI is out here saying he wants it to "replace the median human", which sounds utterly inhuman to anyone that isn't a rich tech bro who thinks they're above the rest of humanity.

    Artificial Intelligence technology as a whole can certainly have promise in some areas, especially scientific research and such. But "generative AI" is really just a method through which wealthy media companies wish to further devalue the labor of the actual human beings that those companies and even those AIs rely on to function at all, and that's Not a good thing.
    Reply
  • umeng2002_2
    Regurgitating press releases does not make you a journalist and it doesn't make AI smart.
    Reply
  • Amdlova
    This site will come to the same fate.
    Just look at the spam, mass-market junk, and other poor choices.
    You need an ad-blocker to read anything :)
    Reply
  • plateLunch
    Is Red Ventures a private equity firm? Whenever I hear a story about a reputable company with a good product suddenly turning out junk or in financial trouble, private equity more often than not is behind the decline.
    Reply
  • DSzymborski
    This happened over a year ago!
    Reply
  • TheyCallMeContra
    DSzymborski said:
    This happened over a year ago!

    started
    Reply
  • Alvar "Miles" Udell
    Just remember that human writers published questionable, misleading, and/or outright false information, sometimes under the banner of a reputable source, long before the internet ever existed, whether accidentally or purposefully, and they continue to do so in everything from newspapers to academic articles to experimental reports.

    Also remember that much of the content on most any tech site that isn't first-party reviews is based upon press releases and reposted/reinterpreted posts from other sources. Take the TomsHardware article on using whey protein to extract gold from motherboards (as it's currently on the front page): there is no source link in the article OR even a mention of which journal the research appeared in (very bad form, Aaron Klotz), only stating that it was "Scientist Raffaele Mezzenga from the Department of Health Sciences and Technology," without even naming the institution. But if you ask AI, specifically Microsoft Copilot, where it came from, you get the source article https://onlinelibrary.wiley.com/doi/10.1002/adma.202310642 cited in its summary. TomsHardware, and all tech sites, need to have a source citation requirement in their articles, ESPECIALLY if they criticize AI for not doing that, though TH has praised Copilot for citing its sources. Copying information from a source, summarizing it and/or putting it in your own words, and/or copying direct passages without citing the source is no different than using generative AI programs that also don't cite their sources and generate an article from it.

    As far as C-Net is concerned, I haven't considered them a go-to source since the dialup days, when CBS bought the combined Cnet/Zdnet.
    Reply
  • rluker5
    A long time ago, dialup era, I used to download software from that site. I stopped when the downloads kept installing Chrome and making it my default browser.
    Reply