Seeking the next DeepSeek: What China’s generative AI registration data can tell us about China’s AI competitiveness

In the early months of 2025, Chinese AI startup DeepSeek caught global observers off guard by releasing a highly advanced large language model (LLM) that matched top-tier Western offerings in both performance and scale.
DeepSeek’s sudden and unexpected rise has highlighted a lack of clear understanding of China’s generative AI innovation landscape among investors, policymakers, and technology analysts outside of the country.
But there is a data set that can shed considerable light on the generative AI models and AI-powered tools being developed in China. According to Chinese regulations, any company or organization launching AI tools that could have a significant impact on the general public must register each algorithm with China’s cyberspace regulator, the Cyberspace Administration of China (CAC). That registration data is a matter of public record.
Put more simply, the Chinese government maintains a comprehensive open list of generative AI tools released in China. The list includes LLMs and other generative algorithms developed by all of China’s major AI players, as well as many you’ve probably never heard of, including:
- DeepSeek and other new-school AI startups like Baichuan and Moonshot AI
- Chinese internet platform giants like Alibaba, Tencent, Meituan, Baidu, ByteDance, and NetEase
- Chinese hardware tech firms such as Huawei, iFlyTek, SenseTime, and others
- State laboratories, state-owned enterprises, state media outlets, and government agencies
The fact that this data set exists is pretty incredible. Imagine having access to a definitive list of all public-facing generative algorithms operating in the US. But due to China’s rather heavy-handed governance of the online environment, we have this very robust tool we can use to assess the state of China’s AI ecosystem. For the purposes of this report, I have collected all of the CAC’s records on algorithm registrations into a single spreadsheet, run some data analysis on them, and from that, drawn some conclusions about China’s AI development.
In this report, I’ll use this data set to answer the following questions:
- How many generative AI tools are operating in China?
- How fast is China’s generative AI ecosystem growing?
- Where are generative AI tools being built?
- Which companies are doing the most generative AI development?
- Which sectors are seeing the most generative AI innovation?
- What’s the role of the state in China’s generative AI ecosystem?
- What types of generative AI projects are foreign companies undertaking in China?
In addition to publishing my findings from the data set here, I am also providing the full source data set for download (grab the Excel here). Containing thousands of entries, this data set provides an excellent starting point for all kinds of research on China’s generative AI industry – at a time when curiosity about what China is doing in the generative AI sector is at an all-time high, and the availability of good data on China’s tech ecosystem is at an all-time low. There’s much more to uncover in this dataset than I’ll be able to manage on my own. I will, however, try to keep the data set updated as the CAC releases new information, which happens frequently – hopefully enabling other researchers to dig deeper. If you do use this data set for your research, I ask only that you send your findings to me so I can increase my own knowledge in the area.
Two final notes before I dive in:
- The findings here are preliminary, and they should be treated as such. A couple of the data points included in this data set come from doing batch-processed queries against Chinese state databases, and verifying only a sample subset of the data for accuracy. As the data set includes thousands of records, it would take a large team many months to verify all the information. This means there are likely some small errors in the data. If you do discover errors, please send them to me so I can update the data set (and credit you if desired). There are also several back-of-the-napkin estimates in here – I have noted where those appear. If you have questions about how I arrived at some of the conclusions, feel free to email them to me.
- This report is intended to be accessible to a general audience, and as such, I’ve prioritized clarity and simplicity over academic precision. While I aimed for accuracy throughout, I did not split hairs over the precisely proper use of terms, and I heavily glossed over some legal details regarding CAC registration procedures. If you are a researcher in this field, you may take issue with the way I casually approached some of these definitions. Apologies in advance.
The executive tl;dr
- As of April 2025, there are 3,739 registered generative algorithmic tools (GATs) operating in China.
- Of those, 1,104 GATs are more likely to be advanced or foundational tools, because they are B2B enterprise tools with API access.
- The CAC is approving approximately 250-300 new GATs a month, providing a sense of the pace of industry growth.
- Approximately 2,000 companies in China are deploying public-facing generative AI, but only around 650 of those companies have built B2B enterprise tools.
- Unsurprisingly, AI development is highly concentrated in tech hubs, with the top five registration locations – Beijing, Guangdong, Shanghai, Zhejiang, and Jiangsu – accounting for nearly 80% of all GAT registrations.
- Alibaba has registered more GATs than any other company, with 66 registrations, while DeepSeek has registered only three, highlighting that the number of registrations alone is not a reliable indicator of innovativeness.
- China’s generative AI innovation landscape is still highly concentrated in foundational models and technologies, like general-purpose LLMs or other tools and multimodal video or audio generators, with 54% of registered GATs focused on general-purpose or foundational technologies.
- The competition for foundational LLM dominance is extremely diffuse, with 971 GATs claiming to be “large models,” and the field has yet to enter a consolidation phase (though DeepSeek might change that).
- Beyond foundational technologies, the fiercest sector-specific generative AI competition is concentrated in healthcare and education.
- State-affiliated entities have registered approximately 22% of the generative AI tools.
- China Mobile appears to be among the only central state-owned enterprises doing innovative generative AI development.
- Foreign companies have registered 0.5% of GATs in China.
How to read the data set
Before digging into the findings, I’d like to briefly explain what the data set includes and how to read it. Feel free to skip this section if you don’t plan to explore the data set yourself.
You’ll notice that there are multiple tabs in the data set. That’s because each tab includes registration records for a specific type of generative algorithm:
- “Deep synthesis” service algorithms (DSS – 深度合成服务算法备案)
- Generative AI services / LLMs (GAI/LLM – 生成式人工智能服务备案)
This data set contains complete records of both types, current to April 2025. For the sake of simplicity, I won’t attempt to explain the difference between DSS and GAI/LLM registrations, as the differences are a bit complicated and there’s a bit of overlap. Suffice it to say, the CAC maintains two separate algorithm registration systems that exist in parallel – one older, one newer – for different kinds of generative technologies. If you’re keen to understand the difference, I recorded a podcast with Jeremy Daum of China Law Translate on the issue for Sinica in March 2023.
Taken together, these two types of records cover most public-facing:
- Generative AI models and LLMs (such as DeepSeek)
- Generative algorithms of any kind, even if those algorithms aren’t LLMs or aren’t based on deep learning
- Apps and websites that provide generative AI features to users
I’ll be collectively referring to algorithms of both types as “generative algorithmic tools”, or GATs – for lack of a better term. To be crystal clear, the term “GAT” doesn’t just cover bleeding-edge generative technologies; it also includes very simple ones. For example, a GAT could be:
- A major foundational LLM, like DeepSeek
- An algorithm that generates e-commerce product pictures for merchants
- A chatbot that answers questions about financial markets via a mobile app
- A voice generator that acts as a customer service representative over the phone
This data set includes all the information that the CAC has released about each GAT, in both the original Chinese and auto-translated English, including:
- Registration number
- Approval date
- Name of the GAT
- Name of GAT owner / developer
- Province where the GAT is registered
- A very brief description of what the GAT does (if applicable)
- Whether the GAT is intended for B2B or B2C use cases
- App, website, or other portal in which the GAT is embedded (for B2C GATs only)
It also includes a subset of data that I added myself, namely:
- Whether the entity that registered the GAT is state-affiliated
- What sector the GAT was developed for, if known
- Which well-known Chinese company or organization the GAT owner is affiliated with, if any
- Link to the source document containing the CAC registration information, if available
Finally, throughout this report, I’ll be referencing specific algorithms by their registration numbers as listed in the data set. If you want to look up various GATs as I mention them, you can follow along by using the registration numbers to search for the algorithm in either Tab 1 or Tab 2 in the spreadsheet.
Now, on to the findings.
How many generative AI tools are operating in China?
This data set shows that as of April 2025, there are 3,739 registered GATs in China. That essentially means approximately 3,739 significant generative AI algorithms are interacting with the public on the Chinese internet. That number alone is pretty interesting – it gives a sense of how large the competitive playing field actually is. I should probably note that there are likely a few duplicates in the data set due to some quirks in the way the CAC registers algorithms, so 3,739 may be slightly high, but I’m keeping things simple here.
DSS records | GAI / LLM records | Total GATs operating in China |
3,234 | 505 | 3,739 |
This number will increase rapidly, as thousands more GATs will likely be released in the next 12 months. I say that because currently, on average, roughly 250-300 new GATs are being released on a monthly basis, giving us some sense of the pace of China’s generative AI industry development and adoption.
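As a rough sketch of the arithmetic behind these figures, the snippet below adds up the two record types and extends the total forward by 12 months. It is a naive linear extrapolation that assumes the current pace of 250-300 approvals a month simply holds; the function and variable names are illustrative and are not part of the data set.

```python
# Record counts taken from the table above (April 2025).
DSS_RECORDS = 3_234
GAI_LLM_RECORDS = 505


def project_total(current: int, monthly_low: int, monthly_high: int, months: int) -> tuple[int, int]:
    """Naive linear projection: assumes the monthly approval pace stays constant."""
    return current + monthly_low * months, current + monthly_high * months


total_gats = DSS_RECORDS + GAI_LLM_RECORDS
low, high = project_total(total_gats, monthly_low=250, monthly_high=300, months=12)

print(f"Registered GATs today: {total_gats:,}")
print(f"Projected in 12 months: {low:,} to {high:,}")
```

Under that assumption, the registry would roughly double within a year. A real forecast would need the monthly approval series, since the pace is unlikely to stay flat.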
It should be noted that companies often register more than one GAT of more than one type, so saying there are 3,739 registered GATs is not the same as saying there are 3,739 Chinese companies working on generative AI projects. In this dataset, there are 2,353 unique company names, meaning 2,353 companies deploying public-facing generative AI tools in China. But that number is a little misleading because in many cases, one tech firm may have registered multiple GATs through multiple subsidiaries. To account for subsidiaries, we might round down and guesstimate that approximately 2,000 companies are deploying public-facing generative AI in China.
The fact that there are 3,739 GATs doesn’t mean there are 3,739 potential DeepSeeks. GATs represent a wide range of generative tools with varying levels of sophistication, and many of them aren’t LLMs or foundational technologies. But one cool thing about this data set is that it helps to identify which of these GATs are likely to be more advanced, and which companies are doing more sophisticated work, because, as you’ll see in the data set, Chinese GAT records include separate designations for:
- B2B GATs that offer services via API to developers on an enterprise basis
- B2C GATs built on top of a B2B GAT – like a face-swapping app for consumers made by Developer A, which uses Developer B’s image-generating LLM as its face-swapping engine
In other words, we can surmise from the data who’s more likely to be a serious contender and who isn’t, because entities with at least one B2B AI registration (in addition to any other registrations) are more likely to be doing advanced development, while entities that only have a B2C registration probably aren’t. This makes sense, because serious AI companies doing high-level work are likely to be offering API-based access to their code to other developers, not just releasing AI-based applications based on someone else’s code.
So, if we look at only the B2B GAT records, the number of potentially consequential tools in China shrinks significantly, to a total of 1,104. These 1,104 B2B GATs are developed by approximately 650 companies (730 individual companies, roughly rounded down to 650 to account for subsidiaries). This gives us a rough estimate of the number of companies engaged in high-level, public-facing generative AI development.
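The filtering step described above can be sketched in a few lines. The records below are hypothetical placeholders, and the field names `owner` and `service_type` are assumptions for illustration only, not the spreadsheet’s actual column headers.

```python
from collections import defaultdict

# Hypothetical records for illustration only; the field names ("owner",
# "service_type") are assumptions, not the spreadsheet's real column headers.
records = [
    {"owner": "Company A", "service_type": "B2B"},
    {"owner": "Company A", "service_type": "B2C"},
    {"owner": "Company B", "service_type": "B2C"},
    {"owner": "Company C", "service_type": "B2B"},
    {"owner": "Company C", "service_type": "B2B"},
]


def owners_by_type(rows):
    """Map each service type to the set of distinct owners registered under it."""
    owners = defaultdict(set)
    for row in rows:
        owners[row["service_type"]].add(row["owner"])
    return owners


owners = owners_by_type(records)
b2b_records = [r for r in records if r["service_type"] == "B2B"]
b2b_owners = owners["B2B"]                       # at least one B2B registration
b2c_only_owners = owners["B2C"] - owners["B2B"]  # consumer-facing registrations only

print(len(b2b_records), sorted(b2b_owners), sorted(b2c_only_owners))
```

Note that counting distinct owner names does not merge subsidiaries of the same parent, which is why the report rounds its company counts down by hand.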
Which companies are doing the most GAT development?
All of China’s major tech players have registered at least one GAT, and typically many more. Some companies, like Alibaba, hold dozens of registrations covering different AI-enabled tools and LLMs developed by various subsidiaries. The table below shows the number of GAT records associated with some of China’s largest tech firms.
Chinese AI player | Number of associated GAT records |
Alibaba / Ant | 66 |
Tencent | 52 |
NetEase | 37 |
Inspur | 28 |
Baidu | 24 |
ByteDance | 20 |
SenseTime | 19 |
Zhipu AI | 14 |
Kingsoft | 11 |
iFlyTek | 10 |
Huawei | 10 |
JD | 9 |
Baichuan Intelligence | 9 |
Hikvision | 8 |
Minimax | 5 |
Meituan Dianping | 5 |
Kuaishou | 4 |
01.AI | 4 |
DeepSeek | 3 |
Moonshot AI | 3 |
You’ll notice that a higher number of records doesn’t necessarily signal that a company is more innovative – it just means the company has more subsidiaries, apps, or projects. DeepSeek, Minimax, Moonshot, and 01.AI, all new-school AI companies that are considered highly innovative, each have only three to five registrations. NetEase, a major platform company with multiple AI-enabled app releases but which, as far as I know, isn’t doing anything particularly revolutionary in the AI space, has 37 registrations for its many spinoff apps. To provide a more concrete example of how this works, below is a list of some GATs registered by Alibaba and its affiliated entities, which illustrates why a company would have more registrations the more subsidiaries and apps it controls:
GAT registration number | Alibaba-affiliated entity deploying GAT | Description of GAT |
ZheJiang-CaiNiaoWuLiuDaMoXing-20240116 | Cainiao Logistics | A large model that underpins Alibaba’s logistics subsidiary, Cainiao |
330110046572901220019 | TMall | A smart customer service chatbot on Alibaba’s e-commerce platform TMall |
330110507206401230035 | Alibaba DAMO Academy | A multi-purpose multimodal LLM developed by Alibaba’s research institute |
330110507206401240089 | Alibaba DAMO Academy | An algorithm designed to generate dance videos, also by Alibaba’s research institute |
330110391028001240025 | DingTalk | An algorithm deployed in DingTalk, Alibaba’s enterprise communication tool, which generates meeting minutes based on the audio from video conference calls |
310107429160601240029 | Ele.me | A helper bot for restaurants and merchants selling on Alibaba’s food delivery platform Ele.me |
Which industries are seeing the most generative AI innovation?
To explore which industries are seeing the most generative AI development, I tagged each GAT record with the sector that the GAT was developed to serve. I also included categories for foundational tools that aren’t specific to a certain sector. Of course, categorizing GATs by industry is highly subjective, and I didn’t trust myself to do those categorizations reliably. So I asked DeepSeek to assign a sector-specific designation to each GAT. I only let DeepSeek choose one category per GAT, and then I reviewed a subsample of 500 assignments for accuracy. DeepSeek did fairly well in terms of accuracy, though again, not perfect. The table below shows what it came up with.
Sector | Number of registered GATs |
Multimedia / creative audio-visual | 1114 |
General-purpose AI | 939 |
Education | 222 |
Health/medical | 203 |
E-commerce | 151 |
Office productivity | 139 |
Finance | 132 |
ICT | 123 |
Government / legal | 115 |
Entertainment | 97 |
Industrial | 88 |
IoT / Hardware | 79 |
Media | 63 |
Science / research | 48 |
Automotive | 47 |
HR/recruitment | 44 |
Cybersecurity | 34 |
Travel / tourism | 32 |
Enterprise | 21 |
Agriculture | 11 |
Pets | 10 |
Logistics | 7 |
Real estate | 7 |
Aerospace | 7 |
Food/cuisine | 6 |
You’ll notice that two categories – “multimedia” and “general-purpose” – have far and away the most registrations, with 1,114 and 939 registered GATs, respectively, for a total of 2,053 GATs, or 54% of the total. I would classify both of these categories as “foundational” or “cross-sector” categories, as they include tools and models that underpin most use cases of generative AI. The “multimedia / creative audio-visual” category encompasses image generators, music generators, text-to-speech, AI drawing tools, 3D environment generators, face-swapping, noise reduction, voice changing apps, and more. The “general-purpose AI” category includes foundational multi-purpose LLMs, text generators, translators, and more.
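For readers who want to check the split themselves, here is a minimal sketch that recomputes the foundational share from the sector table. The counts are copied from the table; the variable names and the choice of which two categories count as “foundational” follow the paragraph above.

```python
# Counts copied from the sector table above.
sector_counts = {
    "Multimedia / creative audio-visual": 1114,
    "General-purpose AI": 939,
    "Education": 222,
    "Health/medical": 203,
    "E-commerce": 151,
    "Office productivity": 139,
    "Finance": 132,
    "ICT": 123,
    "Government / legal": 115,
    "Entertainment": 97,
    "Industrial": 88,
    "IoT / Hardware": 79,
    "Media": 63,
    "Science / research": 48,
    "Automotive": 47,
    "HR/recruitment": 44,
    "Cybersecurity": 34,
    "Travel / tourism": 32,
    "Enterprise": 21,
    "Agriculture": 11,
    "Pets": 10,
    "Logistics": 7,
    "Real estate": 7,
    "Aerospace": 7,
    "Food/cuisine": 6,
}

# The two cross-sector categories treated as "foundational" in the text.
FOUNDATIONAL = {"Multimedia / creative audio-visual", "General-purpose AI"}

total = sum(sector_counts.values())
foundational = sum(n for sector, n in sector_counts.items() if sector in FOUNDATIONAL)
share = foundational / total

# Largest sector-specific categories once the foundational ones are set aside.
sector_specific = {s: n for s, n in sector_counts.items() if s not in FOUNDATIONAL}
top_two = sorted(sector_specific, key=sector_specific.get, reverse=True)[:2]

print(f"{foundational:,} of {total:,} GATs ({share:.1%}) are foundational")
print("Largest sector-specific categories:", top_two)
```

The sector counts sum to the full 3,739 records, and the computed share comes out at 54.9%, which the report quotes as 54%.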
The overwhelming dominance of these categories seems to suggest that generative AI competition in China is centered on core generative technologies and has not yet significantly expanded into sector-specific applications. Indeed, the vast majority of China’s major tech firms appear to be competing at this foundational level – building their own LLMs, chatbots, computer vision tools, image generators, text-to-speech tools, etc. – instead of using tools developed by other firms.
This is not necessarily a good thing for China’s AI competitiveness. In 2023, Baidu CEO Robin Li warned that this level of competitive dispersion in core AI tech is a waste of resources, stating “We need 1 million AI native applications, but we don’t need 100 large models.” Based on this data set, the industry does not yet appear to be taking Li’s admonitions to heart, and the foundational tech ecosystem seems to have gotten even more dispersed since he made those remarks, with industry development still in an explosively competitive phase, and not yet in a consolidation phase. This may be because Chinese companies are afraid that if they lose control over the foundational technology, they cede competitive advantage to their peers. However, my expectation is that as models like DeepSeek develop – along with other leaders like Alibaba’s Qwen and potentially new models from ByteDance – consolidation will begin to occur, and competition will move to applications.
In a similar vein, it’s also worth noting that, of the 3,739 total GAT records, 971 explicitly include the term “large model” (大模型). If we take that at face value, we might conclude that there are hundreds of possible DeepSeek competitors in China – but I don’t think that’s the case. I imagine that most of these “large models” are probably lightly fine-tuned copies of foundational models like DeepSeek, Alibaba’s Qwen, or Meta’s Llama. Still, it’s highly possible that another DeepSeek is lurking among these registrations and simply hasn’t yet stepped onto the international stage.
While competition has not yet truly shifted to industry-level applications, we can already see the vaguest outlines of that competition starting to take shape. I give some quick observations on this below.
Health: Beyond foundational models, healthcare appears to be the most intensely competitive area for generative AI development in China. Multiple major players have registered GATs in the healthcare space, including SenseTime, China Mobile, Tencent, Alibaba, the Chinese Academy of Sciences, iFlyTek, Baichuan Intelligence, ByteDance, Inspur, Baidu, and two foreign companies (UK’s Haleon and USA’s GE). This scrum-like phenomenon is not present in any other sector outside of foundational GATs.
Education: In the wake of the Double Reduction policy, which decimated China’s pay-to-play tutoring sector in 2021, this data set shows that education companies seem to be reinventing themselves as AI-powered edtech firms. Indeed, there are more GATs designed explicitly for education use cases than for any other specific sector, with 222 education-related GATs. Several of these GATs are also focused on English and language tutoring – despite the fact that private English tutoring schools were essentially banned under Double Reduction. These include a math education LLM by education firm TAL (Beijing-MathGPT-20231016) and TAL’s Chinese-English learning assessment algorithm (110108667686801230015), and a series of foreign language practice apps by travel and study-tour company Gaotu (110108225567701230011, 110108225567701240031, 110108225567701240023).
Government: Government applications of generative AI have so far primarily focused on civil service chatbots, legal and policy compliance or research, urban management (including smart cities, transportation, water, and policing), and, in a few cases, content censorship (mainly developed for and used by state media).
Others: The competitive playing field for generative AI applications in travel, logistics, and agriculture is still relatively open. Agriculture is an interesting example. Considering the myriad use cases for generative AI in this sector – such as predictive planting schedules, precision irrigation, disease and pest diagnostics, soil health analysis, crop genetics, automated breeding simulations, yield forecasting, and dynamic pricing models – and considering how huge the market is for related tools in China, it’s fairly surprising there are currently only 11 registered GATs, only a couple of which have the hallmarks of potentially being serious contenders.
What’s the role of the state in China’s generative AI ecosystem?
Here’s another finding that will come as no surprise to students of China’s emerging technology ecosystem: Approximately 22% of GATs are registered by companies or institutions with ties to the state. When I say “ties to the state,” I mean that the GAT owner meets at least one of the following criteria:
- Is a wholly state-owned enterprise, owned or controlled by a national or local State-owned Assets Supervision and Administration Commission (SASAC)
- Is a company controlled by a state agency outside of SASAC, like the Supreme People’s Court or the Ministry of Public Security
- Is a state-backed or state-controlled research entity, such as the Chinese Academy of Sciences (CAS)
- Is a university-invested company, where the university itself is under the control of a state agency such as the Ministry of Education
- Has a mixed ownership structure in which a state-owned entity or state investment fund holds equity alongside private parties
- Is affiliated with a state media organization, such as People’s Daily
By my count, of the 3,739 GATs currently registered, approximately 834 are held by an entity that meets one of those conditions. Once again, all the standard caveats apply: I determined state affiliation through batch-processed queries against state databases, so there are probably some false flags and errors. Plus, there are quite a few edge cases in which it is difficult to determine whether or not a company is truly “state-affiliated.” I’m sure many would argue that Huawei and ByteDance are state-affiliated companies, but I have both listed as private enterprises. Additionally, it’s important to note that affiliation does not necessarily mean direct state control over the company or the GAT, and in many cases, the state appears only tangentially connected to the project – for example, through small minority shareholding by a state-backed innovation fund. So again, 22% is a general figure, not a hard number, but I think it’s a reasonable estimate.
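The “at least one of the following criteria” rule can be expressed compactly. The sketch below is an assumption about how such a flag might be modelled, not the method actually used to build the data set; the class and field names are hypothetical and simply mirror the six criteria listed above.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    """Hypothetical flags mirroring the six criteria listed above."""
    sasac_owned: bool = False
    other_state_agency_controlled: bool = False
    state_research_entity: bool = False
    state_university_invested: bool = False
    mixed_ownership_with_state_equity: bool = False
    state_media_affiliated: bool = False

    def is_state_affiliated(self) -> bool:
        # Meeting any single criterion is enough to count as state-affiliated.
        return any(vars(self).values())


private_firm = Owner()
fund_backed_startup = Owner(mixed_ownership_with_state_equity=True)
print(private_firm.is_state_affiliated(), fund_backed_startup.is_state_affiliated())

# Share of state-affiliated GATs, using the counts quoted in the text.
STATE_AFFILIATED_GATS = 834
TOTAL_GATS = 3_739
state_share = STATE_AFFILIATED_GATS / TOTAL_GATS
print(f"State-affiliated share: {state_share:.1%}")
```

Because a single minority stake by a state-backed fund is enough to set the flag, a rule like this is deliberately broad, which is why the 22% figure should be read as an upper-end estimate of state involvement rather than a measure of state control.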
Of course, there are some GAT holders with very clear ties to the state, most notably China’s central state-owned enterprises (SOEs). In February 2025, Premier Li Qiang urged China’s SOEs to aggressively increase their AI capex and experiment with generative AI, a directive that has already resulted in several major SOEs making efforts to develop and deploy GATs. Many of the GATs developed by SOEs are designed to both address the needs of the SOE’s particular industrial sector and to support the company’s operations.
China’s big three state-owned telecommunications companies – China Mobile, China Telecom, and China Unicom – are by far the most innovative AI developers among central SOEs, and the most active deployers of generative AI tools. These three SOEs have collectively registered a total of 75 GATs, with China Mobile holding 41, China Telecom registering 28, and China Unicom registering the remaining six. Most notable of these are a series of registrations by China Mobile for its Jiutian LLM, the first-ever foundational LLM released by a central SOE. As one of China’s largest cloud platform providers, China Mobile has also released a series of GATs based on self-developed models, including code generators, chatbots, ringtone generators, voiceprint recognition systems, text-to-speech generators, medical emergency systems, smart home assistants, smart car assistants, emergency response tools, and more.
A few other major SOEs have also released industry-specific models, though none come close to the telecoms in terms of GAT development activity or innovativeness. One example is the 30-billion parameter air-travel-focused Qianrang Model (航旅纵横千穰大模型 – ZhongYangQiYe-QianRang-202406200001) developed by a company under the Civil Aviation Administration of China (CAAC). According to official descriptions of the model, Qianrang was trained on a wide variety of data collected by CAAC, and it is designed to both serve as a travel assistant that passengers can query about ticket prices and routes, and to support airport operations, including assessing passenger density and travel flow peaks, as well as “identifying risky passenger behavior” in airport terminals. Basically, it sounds like a foundational model designed for state-owned airlines.
Another example is the 70-billion parameter oil-and-gas-focused Kunlun Model (昆仑大模型 – ZhongYangQiYe-KunLun-202408090002) employed by state-owned oil and gas giant PetroChina, which is designed to support geological and earthquake research and analysis, as well as perform back-office tasks for the industry. Similarly, the research institute affiliated with electrical utility provider State Grid has released the Guangming Model (光明大模型 – ZhongYangQiYe-GuangMing-202410220006), which reportedly both performs power grid dispatching functions and supports the back-office work of employees. Yet another example is MengniuGPT, developed by state-owned dairy giant Mengniu, which is focused on providing nutrition and health information.
I did some digging into each of these models, and with the exception of those developed by China Mobile, none appear to be the result of true in-house innovation. In many cases, SOEs are not actually building these projects by themselves, but are doing so in partnership with tech companies that have stronger AI engineering capabilities. For example, state media reports that PetroChina’s Kunlun model was “developed with help from China Mobile, Huawei, and iFlytek.” Mengniu’s press releases state that it developed MengniuGPT “jointly with technology partners.” And, as soon as DeepSeek came out, State Grid announced that it would be leveraging DeepSeek to improve Guangming via distillation training, suggesting that the Guangming model doesn’t measure up to DeepSeek. It seems clear that pushing the bleeding edge is not the role SOEs are playing in China’s generative AI ecosystem. However, SOEs are contributing to AI development by serving as testing grounds for the large-scale application of generative AI to specific industrial use cases, and by providing funding to support the exploration of these applications.
While SOEs may not be pursuing particularly cutting-edge GATs, China’s state-backed research labs most certainly are. Most notably, CAS and its affiliated entities have registered at least 30 GATs covering a wide range of innovative and foundational projects, including several joint projects with tech platforms and other state actors. A few of the more interesting projects are listed in the table below:
Name of Algorithm | CAS-affiliated entity | Description |
Zidong Taichu large model (紫东太初大模型) | 中国科学院自动化研究所, (Institute of Automation, CAS) | In 2022, CAS launched an entire research academy dedicated to building Zidong Taichu, a “low-power consuming trillion parameter” foundational model. |
Baize (Bysearch) multi-modal LLM (白泽跨模态大模型算法) | 人民中科(北京)智能技术有限公司 (A tech joint venture between the CAS Institute of Automation and People’s Daily) | Developed for government and security-focused applications, Bysearch can parse text, video, and audio for content censorship, anti-fraud, and copyright protection use cases. |
Yayi large model (中科闻歌雅意大模型算法) | 北京中科闻歌科技股份有限公司 (Zhongke Wenge, an AI company incubated by CAS) | 300-billion parameter “safe and controllable” large model, ostensibly built on all-Chinese technology |
Other state-backed labs are also contributing interesting projects to the AI tool pool, including:
- PengCheng Mind (Guangdong-pengchengmind-20240124, 440305532159901240017), a 200-billion parameter foundational model developed by Pengcheng National Laboratory
- GeoGPT (330110491345701240015) by Zhejiang Laboratory, a model designed for geographic research
Where are generative AI tools being built?
Unsurprisingly, AI development is highly concentrated in China’s tech hubs, with the top five registration locations – Beijing, Guangdong, Shanghai, Zhejiang, and Jiangsu – accounting for nearly 80% of all generative AI registrations. Beijing leads the pack with 1,060 registrations – over 300 more than second-place Guangdong at 735. Top AI players such as DeepSeek, Baichuan Intelligence, Zhipu, Moonshot, and, of course, major tech platforms like Baidu and Tencent, have all registered GATs in one of these locations (primarily because that’s where their offices are).
Province | Number of registered GATs |
Beijing | 1,060 |
Guangdong | 735 |
Shanghai | 616 |
Zhejiang | 381 |
Jiangsu | 186 |
Sichuan | 131 |
Fujian | 95 |
Hubei | 80 |
Shandong | 75 |
Anhui | 48 |
Chongqing | 47 |
Hebei | 40 |
Hainan | 39 |
Hunan | 36 |
Tianjin | 31 |
Henan | 27 |
Guizhou | 17 |
Shaanxi | 14 |
Liaoning | 13 |
Shanxi | 8 |
Yunnan | 8 |
Jilin | 7 |
Guangxi | 7 |
Ningxia | 7 |
Jiangxi | 6 |
Inner Mongolia | 5 |
Heilongjiang | 4 |
Gansu | 2 |
Xinjiang | 1 |
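The concentration figures quoted above can be recomputed from the table. The sketch below copies the provincial counts and measures the top five against the report’s overall total of 3,739 records; the variable names are illustrative.

```python
# Counts copied from the province table above.
province_counts = {
    "Beijing": 1060, "Guangdong": 735, "Shanghai": 616, "Zhejiang": 381,
    "Jiangsu": 186, "Sichuan": 131, "Fujian": 95, "Hubei": 80,
    "Shandong": 75, "Anhui": 48, "Chongqing": 47, "Hebei": 40,
    "Hainan": 39, "Hunan": 36, "Tianjin": 31, "Henan": 27,
    "Guizhou": 17, "Shaanxi": 14, "Liaoning": 13, "Shanxi": 8,
    "Yunnan": 8, "Jilin": 7, "Guangxi": 7, "Ningxia": 7,
    "Jiangxi": 6, "Inner Mongolia": 5, "Heilongjiang": 4, "Gansu": 2,
    "Xinjiang": 1,
}

# Overall number of registered GATs reported for April 2025.
TOTAL_GATS = 3_739

ranked = sorted(province_counts.items(), key=lambda item: item[1], reverse=True)
top_five = ranked[:5]
top_five_total = sum(count for _, count in top_five)
top_five_share = top_five_total / TOTAL_GATS
beijing_lead = province_counts["Beijing"] - province_counts["Guangdong"]

print([name for name, _ in top_five])
print(f"Top five: {top_five_total:,} registrations ({top_five_share:.1%} of all GATs)")
print(f"Beijing's lead over Guangdong: {beijing_lead}")
```

The provincial rows add up to 3,726, slightly below the 3,739 total, so a handful of records presumably carry no provincial tag (an assumption on my part). Either denominator puts the top five at just under 80%.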
I found it interesting – but perhaps predictable – that in regions with few GAT registrations, government-funded projects sometimes make up over half of the GAT projects. For example, of the five GATs registered in Inner Mongolia, three of them were registered by state-owned or state-funded enterprises. Of the seven GATs registered in Jilin, four were registered by state-affiliated bodies. And sometimes, even when an AI company in a rural area is not directly associated with or funded by the state, all of its clients are government entities. In Ningxia, for example, a region with seven registered GATs, the Xiyan model (Ningxia-XiYanDaMoXing-202407230001) was registered by a private company, but the model is a civil service chatbot that allows the public to query their social security and housing fund information, among other government-held data. I imagine this is because, in less developed regions, government-affiliated entities are the only game in town for GAT development. They have enough money to back a generative algorithm project, and they will also be more inclined to act on a political directive to pursue AI development, even if there is no clear market-based impetus to do so.
What types of generative AI projects are foreign companies undertaking in China?
I was only able to identify 13 foreign multinationals that have GATs registered in China. Those 13 multinationals are collectively associated with a total of 19 records, or 0.5% of all records. I’m defining “foreign multinationals” fairly loosely, as international companies clearly originating outside of mainland China, Hong Kong, Macau, and Taiwan. I also didn’t do much deep digging into the ownership structure of each registering entity, so it’s quite possible that I missed some foreign-invested joint ventures or something of that nature. As of April 2025, foreign companies with registered GATs include:
- Amazon (USA – 110116585231701240019, 640502733231901240017)
- Yum China / KFC (USA – 310115524558001240019)
- GE Healthcare Systems (USA – 310115097072401240017)
- HP (USA – 110105955843701240017)
- Evernote (USA – 110105722786701240015)
- IKEA (Sweden – 310104755117001240013)
- Groupe SEB (France – 331083090034801240013)
- Ipsos Consulting (France – 110109366893901240019)
- Samsung (South Korea – 110105591886701240017, 110105591886701240025)
- Canva (Australia – 110105663460001240011, 110105663460001240029, 110105663460001240037, 110105663460001240045, 110105663460001240053)
- Charoen Pokphand Group (Thailand – 110117885612101240019)
- Lifespring Consultancy (UK – 310110932098501250011)
- Haleon (UK – 310115562563301240011)
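The record counts and the 0.5% share can be tallied with a short Python sketch. The per-firm counts come from the registration numbers listed above. The total of 3,726 records is an assumption on my part: it is the sum of the provincial table in the previous section, which may not be exactly the denominator used for the figure in the text.

```python
# Number of registration records per foreign firm, counted from the list above.
foreign_records = {
    "Amazon": 2,
    "Yum China / KFC": 1,
    "GE Healthcare Systems": 1,
    "HP": 1,
    "Evernote": 1,
    "IKEA": 1,
    "Groupe SEB": 1,
    "Ipsos Consulting": 1,
    "Samsung": 2,
    "Canva": 5,
    "Charoen Pokphand Group": 1,
    "Lifespring Consultancy": 1,
    "Haleon": 1,
}

# Assumption: total records equals the sum of the provincial table (3,726).
TOTAL_RECORDS = 3726

firm_count = len(foreign_records)
record_count = sum(foreign_records.values())
foreign_share = record_count / TOTAL_RECORDS
largest_registrant = max(foreign_records, key=foreign_records.get)

print(f"Foreign firms: {firm_count}")
print(f"Foreign records: {record_count}")
print(f"Share of all records: {foreign_share:.1%}")
print(f"Most registrations: {largest_registrant}")
```

Under that assumption, the 13 firms account for 19 records, or roughly 0.5% of the total, with Canva holding the most registrations.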
There are also a couple of edge cases where foreign involvement is unclear:
- Grammarly appears to have a GAT, though it may have been registered on the company's behalf by a local consulting firm.
- Microsoft’s former China-based AI R&D lab, Xiaoice, was spun off from Microsoft several years ago, and now seems to be engaged in extensive generative AI R&D, but it’s unclear if Microsoft still maintains any involvement.
Only one foreign company – Canva, which has the most GAT registrations of any foreign firm – has registered a B2B GAT, specifically for its creative image generation tool. All other foreign firms have registered only B2C GATs. This essentially means that foreign companies are only launching GATs in China that directly relate to the delivery or competitiveness of their existing services. They’re not actually competing in the generative AI sector – they’re competing in sectors that now require the inclusion of generative AI to be competitive. For example:
- IKEA’s GAT runs the “smart shopper” feature on its e-commerce app, an algorithm that generates product recommendations, compares products, and summarizes user reviews
- Amazon has two GATs: one is a customer service helper bot for AWS Marketplace, and the other is the AI tool backing its AWS China services
- HP’s GAT underpins the China-facing version of its AI assistant / copilot (惠小微)
- GE Healthcare’s algorithm generates answers to user queries on product manuals, helping doctors and equipment engineers answer questions related to the installation, repair, maintenance, and safety specifications of medical equipment
- Evernote’s GAT supports the generative tool embedded in its China-facing SaaS platform
That finding isn’t likely to surprise anyone, and the reasons behind the lack of foreign involvement in the sector are easy to guess – the technical and regulatory barriers introduced by Beijing are too high, as are the geopolitical risks a foreign firm faces from its own home country.
Conclusion
To my mind, the most interesting aspect of this data set is that it paints a picture of a chaotic, all-out scramble at the starting line of China’s generative AI race, as major players compete to establish and refine the core technologies on which they hope to build empires of AI-powered products and services. This data set suggests that China’s big tech firms are loath to allow their competitors to build or control the tools they will need, preferring to reinvent the wheel rather than let someone else drive. Early winners in this fierce competition – like DeepSeek and Alibaba – are only beginning to emerge.
Against that backdrop, I think it’s a mistake to view DeepSeek as some kind of anomaly, rising out of otherwise fallow ground. China’s generative AI industry – both in terms of companies exploring foundational models and those releasing consumer-facing apps – is varied and vibrant, with many new competitors entering the fray on a monthly basis. Given this environment, DeepSeek was more of an inevitability than an anomaly, and China very well may give rise to more impactful foundational LLMs in the not-too-distant future.
But foundational AI will not be the competitive focus for long, in China or elsewhere. As winners of the foundational AI contest emerge and consolidate their leads, the industry will turn its attention to building consumer-facing applications on top of existing toolkits, and the Chinese public will see an explosion of AI-based applications. That shift should be visible in this data set: we would expect to see a slowdown in the number of foundational GATs registered and a surge in sector-specific registrations.
To close things off, I’ll note that if history is any guide, this intense domestic competition within China will eventually lead to more globally competitive AI-powered tools developed by Chinese firms making their way onto the international stage. I say that because, over the past ten years, periods of intense domestic competition within China’s tech sector have typically resulted in the emergence of one or more globally competitive technologies that rival major global players. We’ve already seen this happen with digital platforms such as TikTok, Temu, and SHEIN. These companies cut their teeth on fierce competition in the social and e-commerce spaces in China, eventually landing on a winning international formula. Yet each time Chinese companies introduce new platforms or technologies that resonate with global consumers, it seems to catch the Western world off guard.
It’s time to stop being surprised: Fierce competition in China guarantees that Chinese companies will produce more high-quality digital tools that win over global consumers. US tech firms should be prepared to meet the challenge.
Questions? Comments? Corrections? Drop me a note at [email protected]. Credit and thanks to Ruby Qiu for her extensive research support.