Data is king: Why content creators must rethink their role in the AI era - FT中文网

Data is king: Why content creators must rethink their role in the AI era

Content creators may feel the most profound shift and play a more important role as data becomes a strategic asset in the AI era

Audio version: https://audio.ftmailbox.cn/album/a_1750297349_2997.mp3

This article represents only the author's own views.

As the global AI race heats up, it’s becoming clear that data doesn’t lose its value once large models reach the reasoning stage. On the contrary, it becomes even more critical because models need dynamic, up-to-date knowledge. The so-called “last mile” of high-quality datasets often determines a model’s ultimate performance.

That is likely why Facebook parent Meta Platforms (META.US) made a $14.3 billion strategic investment in Scale AI, a company focused on data labeling and cleaning for AI training.

Scale AI provides structured, high-quality datasets to OpenAI, Meta, Google and other tech giants by combining the output of massive human labor with automated pipelines. Its data labeling process involves tagging images, text or audio with meaningful metadata, such as identifying pedestrians in a photo or labeling the main point of an article. Data cleaning eliminates errors, duplicates or irrelevant material to ensure consistency and accuracy.

Another example of the growing value of quality data is a recent licensing deal between The New York Times and Amazon (AMZN.US), which allows fact-checked editorial content to be used for training AI models. The Associated Press and OpenAI have signed a similar agreement.

Though these arrangements are described as content licensing, they reflect a deeper shift: content has become data, and data has become a service. These deals highlight how media organizations are reassessing the value of their content, while AI developers continue to pursue high-quality material with growing urgency.

In contrast, the Chinese-language AI ecosystem faces unique challenges, such as a shortage of publicly available data, a lack of large-scale professional annotation and difficulty digitizing classical and cultural texts at scale. Such obstacles underscore how hard it is to develop localized large AI models.

Chinese-language materials are relatively scarce

A white paper published by Alibaba Research Institute notes that English accounts for 59.8% of all crawlable web text, while Chinese represents just 1.3%. Wikipedia, a commonly used open resource, has over 7 million English articles but only about 1.5 million in Chinese, less than a quarter of the volume.

This imbalance creates a major disadvantage. Without sufficient publicly available Chinese material, local Chinese large language models may fall far behind their English-language counterparts in natural language understanding and text generation, potentially leading to culturally mismatched outputs and a sense that these models have “consumed too much foreign ink.”

Chinese authorities have long recognized this gap and have taken steps to address it. Platforms such as People’s Daily and Xinhua are actively building curated, high-quality corpora of vetted news, commentary and policy interpretation, designed to ensure alignment with official values and to support AI safety from a moral and ideological standpoint.

Initiatives like the "Cyber Research Large Language Model" further concentrate on integrating data from legal and policy documents, state media and other publications, reinforcing alignment with Chinese values.

In China, such value alignment has become a basic requirement for any domestic AI system. While China has yet to produce a company of Scale AI’s size, several local firms, including Aishu Technology, Testin, iFlytek (002230.SZ) and Haitian Ruisheng (688787.SH), are building up their capabilities in large-scale data annotation and cleaning. The Shanghai AI Lab is also developing a platform-based material processing system that draws on policy and academic resources, laying the foundation for a “Chinese version of Scale AI.”

According to market research firm IDC, the value of China’s AI training data market was estimated at $260 million in 2023 and is expected to grow to approximately $2.32 billion by 2032, representing a compound annual growth rate of 27.4%.

Ultimately, the performance of any AI model depends on the content it consumes. In the AI era, content creators, especially those in journalism, must recognize that they are no longer merely material providers. They are now an integral part of the data services supply chain.

When news stories, commentary, academic papers and cultural archives are structured, semantically labeled and integrated into AI training pipelines, their value shifts from real-time information to durable data assets. Content creators who proactively organize and annotate their materials, and pursue licensing partnerships with AI developers, may find themselves unlocking new revenue opportunities.

It’s time for content to be seen not just as narrative, but also as infrastructure.
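
As a rough check on the IDC figures quoted in the article ($260 million in 2023, about $2.32 billion by 2032, a 27.4% compound annual growth rate), the arithmetic can be sketched in a few lines of Python. The nine-year span (2023 to 2032) and the function name `cagr` are assumptions made for illustration, not details from IDC. Under those assumptions the implied rate should land close to the quoted figure, with any small gap plausibly due to rounding in the published numbers.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in USD millions.
START_2023 = 260.0
END_2032 = 2320.0
# Assumption: growth is compounded over the 9 years from 2023 to 2032.
YEARS = 2032 - 2023
QUOTED_RATE = 0.274

# Growth rate implied by the start and end values.
implied_rate = cagr(START_2023, END_2032, YEARS)
# Market size obtained by compounding the quoted rate from the 2023 base.
projected_2032 = START_2023 * (1 + QUOTED_RATE) ** YEARS

print(f"Implied CAGR: {implied_rate:.1%}")
print(f"2032 value at quoted rate: ${projected_2032:,.0f} million")
```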

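Copyright notice: The copyright of this article belongs to FT中文网. No organization or individual may reprint, copy or otherwise use all or part of this article without permission. Violators will be held liable.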
