
Perplexity AI Stealing Forbes Content by Surreptitiously Scraping

CATEGORY
Tech news
PUBLISHED
June 24, 2024

The “do not crawl” Robots Exclusion Protocol and what Perplexity really is

The Robots Exclusion Protocol (the robots.txt file) tells search engine crawlers which areas of a website they should not crawl. Used carefully, robots.txt can keep duplicate or low-value pages out of the crawl, which can help your SEO.

A robots.txt file asks AI-powered search engines not to crawl

“Robots.txt is often overused to reduce duplicate content, thereby killing internal linking so be really careful with it. My advice is to only ever use it for files or pages that search engines should never see, or can significantly impact crawling by being allowed into.” - SEO professional

In short, the robots.txt file tells search engines which URLs not to access, but its directives are voluntary requests rather than a mandate.
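
To see what "voluntary" means in practice, here is a minimal Python sketch of how a well-behaved crawler might consult robots.txt before fetching a page. It uses the standard library's urllib.robotparser. The robots.txt content, the ExampleAIBot name, the is_allowed helper, and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("ExampleAIBot", "https://example.com/blog/post"),
        ("SomeOtherBot", "https://example.com/blog/post"),
        ("SomeOtherBot", "https://example.com/private/report"),
    ]
    for agent, url in checks:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Nothing in the protocol enforces this check. A crawler that skips it can still request the page, which is why robots.txt works as a request rather than a lock.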

As for Perplexity, it is an AI search startup whose backers are known to include Jeff Bezos's family, Nvidia, and Balaji Srinivasan, among others. Perplexity’s CEO has described his product (a chatbot that gives natural-language answers to prompts and can, the company says, access the internet in real time) as an “answer engine.”

The Perplexity AI scandal over taking unlicensed information

After the recent accusation by Forbes, however, Srinivas described it differently: he told the AP that Perplexity was a mere “aggregator of information.”

So what exactly is Perplexity? We asked the Perplexity chatbot itself, and it said, “Perplexity AI is an AI-powered search engine that combines features of traditional search engines and chatbots. It provides concise, real-time answers to user queries by pulling information from recent articles and indexing the web daily.”

But what did this AI-powered search engine do to get accused by Forbes and criticized by many publishers?

Behind the so-called AI-powered search engine Perplexity

Since the start of the generative AI boom, robots.txt has been a key tool for publishers to block tech companies from ingesting their content free of charge for use in generative AI systems that can mimic human creativity and instantly summarize articles.
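
On the publisher side, blocking comes down to a few lines in robots.txt. The sketch below shows one way to generate such a file in Python and sanity-check it with the standard library parser. The build_robots_txt and blocks helpers are our own illustrative functions, and the crawler tokens PerplexityBot and GPTBot are used only as examples; check each vendor's documentation for the current names.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_agents: list[str]) -> str:
    """Build a robots.txt that disallows the whole site for each listed agent."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in blocked_agents]
    # An empty Disallow means "no restriction" for every other crawler.
    groups.append("User-agent: *\nDisallow:\n")
    # Groups are separated by a blank line.
    return "\n".join(groups)


def blocks(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt asks user_agent not to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example tokens only; confirm the real ones in each vendor's docs.
    robots_txt = build_robots_txt(["PerplexityBot", "GPTBot"])
    print(robots_txt)
    print(blocks(robots_txt, "PerplexityBot", "https://example.com/article"))
    print(blocks(robots_txt, "SomeSearchBot", "https://example.com/article"))
```

A file like this only states the publisher's wishes. It has an effect only if the crawler identifies itself honestly and chooses to comply.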

Perplexity AI's true intention

Developer Robb Knight has written about testing Perplexity AI's summarizing capability against his own blog, which blocks AI crawlers. The result was as predicted: “I got a perfect summary of the post including various details that they couldn't have just guessed,” he said.

To dig deeper into the alleged wrongdoing of the Perplexity AI-powered engine, WIRED gave the chatbot the headlines of dozens of its articles.

Perplexity AI gets criticized for going against its word

“The results showed the chatbot at times closely paraphrasing WIRED stories, and at times summarizing stories inaccurately and with minimal attribution. In one case, the text it generated falsely claimed the WIRED…”

This is despite Perplexity once claiming that its AI offers “instant, reliable answers to any question with complete sources and citations included.”

Once again, Perplexity has gone against its word: it claims to respect the robots.txt standard, yet it can summarize WIRED articles even though WIRED has blocked its crawler.

Forbes's reaction to Perplexity's plagiarism and fabrication

On June 6, 2024, Forbes published an exclusive story about Eric Schmidt’s stealth drone project. The story proved so compelling that Perplexity's curated “pages” based on it amassed thousands of views without any clear citation.

In response to tweets about the issue, CEO Aravind Srinivas said the product has “rough edges” and that it planned to incorporate feedback. After Forbes flagged the issue, Perplexity updated the layout to more prominently credit source publications at the top of these pages.

A Forbes premium article gets stolen by Perplexity AI

Forbes Chief Content Officer Randall Lane wrote that Perplexity’s treatment of premium journalism (ripping off paywalled Forbes content and pumping it out to its subscribers for free across a variety of formats) is “the perfect case study for this critical moment” in AI.

Forbes' general counsel then sent a letter to Srinivas last Thursday, demanding that Perplexity remove the misleading articles and repay Forbes for advertising revenue earned from its alleged copyright infringement. Perplexity hasn’t taken new steps to address the criticism and continues to frustrate journalists at reputable outlets such as The New York Times, WSJ, WIRED, and others.

For assistance with managing IT projects and navigating industry challenges, Burning Bros offers expert outsourcing and staffing solutions in Vietnam. Contact us to learn how we can assist you.
