Ling 2.0 and Ling-1T: Scaling Knowledge-Enhanced Large Language Models for Reasoning and Efficiency
Abstract
The Ling 2.0 series represents a new generation of large language models designed around knowledge enhancement, reasoning efficiency, and scalable architectural innovation. Built on a trillion-scale sparse Mixture-of-Experts (MoE) foundation, Ling-1T activates ~50B parameters per token and, through FP8 mixed-precision pipelines and 1F1B interleaved scheduling, realizes over 40% higher training throughput with negligible accuracy loss (<0.1%). This talk presents the technical journey behind Ling-mini, Ling-flash, and Ling-1T, focusing on (1) efficient large-scale training systems for trillion-parameter models; (2) the Ling Scaling Law and its implications for cross-domain reasoning; (3) hybrid attention and RL-based alignment strategies that enable both concise reasoning and long-context understanding; and (4) how these architectural and algorithmic advances empower industrial applications such as financial risk modeling and knowledge-grounded agents. We conclude with the open-source releases (under inclusionAI on Hugging Face and ModelScope) and future research directions toward trustworthy, efficient, and domain-enhanced LLMs.
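
For readers unfamiliar with sparse activation, the listing below is a minimal PyTorch sketch of generic top-k Mixture-of-Experts routing, the mechanism by which a trillion-parameter model touches only a small fraction of its weights per token. The dimensions, expert count, and top_k value are illustrative placeholders, not Ling-1T's actual router or expert configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy top-k routed Mixture-of-Experts feed-forward layer (illustrative only)."""

    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                          # x: (tokens, d_model)
        gate_logits = self.router(x)               # (tokens, num_experts)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # renormalize over the selected experts
        out = torch.zeros_like(x)
        # Loop over experts for clarity; production systems use fused, load-balanced dispatch.
        for e, expert in enumerate(self.experts):
            rows, slots = (indices == e).nonzero(as_tuple=True)
            if rows.numel() == 0:
                continue
            out[rows] += weights[rows, slots].unsqueeze(-1) * expert(x[rows])
        return out

# Each token passes through only top_k of num_experts expert MLPs, which is the
# sense in which a trillion-parameter MoE "activates" only tens of billions per token.
layer = TopKMoE()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])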