A more fruitful way to train AI models on synthetic data is to have them learn through collaboration or competition.
The Economist, Science and technology
A cheaper approach involves generating "synthetic data", in which one LLM makes billions of pages of text to train a second model.
English encyclopedia
Synthetic data
Synthetic data are "any production data applicable to a given situation that are not obtained by direct measurement", according to the McGraw-Hill Dictionary of Scientific and Technical Terms. Craig S. Mullins, an expert in data management, defines production data as "information that is persistently stored and used by professionals to conduct business processes."