China Tries to Build New ‘Great Wall’ for AI With Rigorous Socialist Compliance Reviews


Insider Brief

  • China’s Cyberspace Administration mandates AI models align with “core socialist values” through rigorous reviews.
  • Major tech firms, including ByteDance and Alibaba, must censor AI responses to politically sensitive questions.
  • ByteDance leads in compliance, achieving a 66.4% “safety compliance rate,” vastly outperforming OpenAI’s GPT-4.

The Chinese government is intensifying its efforts to control artificial intelligence, mandating that AI models align with “core socialist values,” the Financial Times reports. The initiative represents a significant expansion of China’s censorship regime, extending beyond traditional media to advanced AI technologies. The Cyberspace Administration of China (CAC), the country’s powerful internet regulator, has been at the forefront of the push, conducting mandatory reviews of AI models developed by major tech companies and start-ups such as ByteDance, Alibaba, Moonshot and 01.AI, according to multiple sources involved in the process cited by the FT.

These reviews involve batch-testing large language models (LLMs) against a range of questions, many of which touch on politically sensitive topics and references to President Xi Jinping. The CAC’s local arms across the country are responsible for these audits, which include examining the training data and safety protocols of the AI models.

China’s regulatory approach to AI is the toughest in the world. It comes two decades after the establishment of the “Great Firewall,” designed to block foreign websites and information deemed harmful by the ruling Communist Party. Now the focus has shifted to AI-generated content, marking a new era of digital control.

An employee at a Hangzhou-based AI company, speaking anonymously, described the process as challenging. “We didn’t pass the first time; the reason wasn’t very clear so we had to go and talk to our peers,” the person said, as reported by the FT. “It takes a bit of guessing and adjusting. We passed the second time but the whole process took months.”

This stringent approval process has compelled Chinese AI companies to become adept at censoring their LLMs. This task is complicated by the need to train these models on vast amounts of English-language content. An employee at a leading AI start-up in Beijing highlighted the challenges, telling FT: “Our foundational model is very, very uninhibited [in its answers], so security filtering is extremely important.”

The initial filtering involves removing problematic information from training data and building a database of sensitive keywords. According to China’s operational guidance for AI companies, published in February, AI developers must collect thousands of sensitive keywords and questions that contravene “core socialist values.” These keywords, which may include phrases like “inciting the subversion of state power” or “undermining national unity,” must be updated weekly.
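To make the described workflow concrete, here is a minimal illustrative sketch of keyword-based pre-training data filtering. The keyword entries, function names, and matching rules are assumptions for illustration only; nothing here describes any specific company’s actual pipeline, beyond the two example phrases quoted in the guidance above.

```python
# Hypothetical sketch of keyword-based filtering of training data.
# The keyword set, function names, and matching rules are illustrative
# assumptions; real lists reportedly run to thousands of entries and
# must be refreshed weekly.

SENSITIVE_KEYWORDS = {
    "inciting the subversion of state power",
    "undermining national unity",
}

def is_clean(document: str) -> bool:
    """Return True if the document contains none of the flagged keywords."""
    lowered = document.lower()
    return not any(keyword in lowered for keyword in SENSITIVE_KEYWORDS)

def filter_corpus(documents: list[str]) -> list[str]:
    """Keep only documents that pass the keyword screen."""
    return [doc for doc in documents if is_clean(doc)]

if __name__ == "__main__":
    corpus = [
        "An ordinary news article about the weather in Hangzhou.",
        "A post accused of undermining national unity.",
    ]
    print(filter_corpus(corpus))  # only the first document survives
```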

The results of this censorship are evident to users of Chinese AI chatbots. Queries related to sensitive topics, such as the Tiananmen Square massacre on June 4, 1989, or the internet meme comparing President Xi to Winnie the Pooh, are typically rejected.

For instance, Baidu’s Ernie chatbot responds with “try a different question,” while Alibaba’s Tongyi Qianwen outputs: “I have not yet learned how to answer this question. I will keep studying to better serve you.”

Meanwhile, the FT reports, Beijing has introduced an AI chatbot based on “Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era,” reflecting the government’s drive to embed political ideology within AI systems.

Despite the stringent controls, Chinese officials aim to prevent AI from avoiding all political topics. The CAC has set limits on the number of questions LLMs can decline during safety tests, as detailed by staff at companies assisting with the regulatory process. The standards, revealed in February, stipulate that LLMs should not reject more than 5 percent of questions.

“During [CAC] testing, [models] have to respond, but once they go live, no one is watching,” a developer at a Shanghai-based internet company told FT. “To avoid potential trouble, some large models have implemented a blanket ban on topics related to President Xi.”
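The 5 percent ceiling described above amounts to a simple rate check over a set of test questions. The sketch below shows how such a check could be computed; the refusal heuristic and the sample answers are hypothetical assumptions, and only the 5 percent figure comes from the reported standard.

```python
# Illustrative check of a refusal-rate cap over a batch of test answers.
# The refusal heuristic is a hypothetical stand-in; only the 5 percent
# ceiling reflects the standard described in the article.

REFUSAL_MARKERS = ("try a different question", "i have not yet learned")
MAX_REFUSAL_RATE = 0.05  # models may decline at most 5% of test questions

def is_refusal(answer: str) -> bool:
    """Crude heuristic: treat canned deflections as refusals."""
    lowered = answer.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def refusal_rate(answers: list[str]) -> float:
    """Fraction of answers flagged as refusals."""
    return sum(is_refusal(a) for a in answers) / len(answers)

def passes_cap(answers: list[str]) -> bool:
    return refusal_rate(answers) <= MAX_REFUSAL_RATE

if __name__ == "__main__":
    sample = ["Here is a detailed answer."] * 97 + ["Try a different question."] * 3
    print(f"refusal rate: {refusal_rate(sample):.1%}, passes: {passes_cap(sample)}")
```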

Industry insiders cite Kimi, a chatbot from Beijing start-up Moonshot that rejects most questions about Xi, as an example of this keyword-based censorship. But outright refusal is not enough for less sensitive questions: Chinese engineers must also ensure their LLMs give politically acceptable answers to queries such as “does China have human rights?” or “is President Xi Jinping a great leader?”

When the Financial Times posed these questions to a chatbot developed by 01.AI, the initial response was nuanced but soon replaced by a more compliant answer, demonstrating the layers of censorship in place. Huan Li, an AI expert developing the Chatie.IO chatbot, explained that developers use classifier models to sort LLM outputs and trigger replacements when necessary.
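Huan Li’s description points to a post-generation filter: a classifier screens the model’s draft answer and, if it is flagged, a canned reply is substituted before the user sees it. The sketch below illustrates that pattern under stated assumptions; the keyword-based `classify` function stands in for a trained classifier, and the blocked terms and fallback text are hypothetical (the fallback echoes Tongyi Qianwen’s quoted response above).

```python
# Illustrative post-generation filter: a classifier screens the model's
# draft answer and a templated reply is substituted if it is flagged.
# classify() is a keyword stand-in for a trained classifier model; all
# names, terms, and thresholds here are assumptions for illustration.

FALLBACK_REPLY = "I have not yet learned how to answer this question."

def classify(text: str) -> bool:
    """Return True if the draft answer should be blocked (hypothetical rule)."""
    blocked_terms = ("tiananmen", "winnie the pooh")
    return any(term in text.lower() for term in blocked_terms)

def respond(draft_answer: str) -> str:
    """Replace a flagged draft with the canned fallback before it is shown."""
    return FALLBACK_REPLY if classify(draft_answer) else draft_answer

if __name__ == "__main__":
    print(respond("The weather in Hangzhou is mild today."))   # passes through
    print(respond("An answer that mentions Tiananmen Square."))  # replaced
```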

Experts believe ByteDance, owner of TikTok, has made significant strides in developing an LLM that closely mirrors Beijing’s narrative. A Fudan University research lab rated ByteDance’s chatbot highest in “safety compliance,” scoring 66.4 percent, far surpassing OpenAI’s GPT-4 at 7.1 percent on the same test, according to the Financial Times.