2021-07-14
Baidu Research has proposed a unified framework, named ERNIE 3.0, for pre-training large-scale knowledge-enhanced models. Based on this framework, we have trained a 10-billion-parameter model on a massive unsupervised corpus for both natural language understanding (NLU) and natural language generation (NLG).
ERNIE 3.0 achieved new state-of-the-art results across 54 Chinese NLP tasks. Its English model secured first place by yielding a better-than-human performance on SuperGLUE, an authoritative language understanding benchmark. In addition to its language understanding capabilities, ERNIE 3.0 also demonstrated impressive creative writing aptitude through its ability to compose novels, lyrics, poems, and couplets.
The paper, “ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation,” is available on arXiv.
Over the past year, large-scale pre-trained models such as GPT-3 and Switch-Transformer have achieved groundbreaking results in natural language processing (NLP) and artificial intelligence (AI), mainly due to their powerful generalization and transfer abilities. However, although these models continue to grow in size, the lack of knowledge introduced during training and the reliance on traditional fine-tuning limit their ability to solve downstream language understanding tasks.
To address these shortcomings, we have, for the first time, incorporated large-scale knowledge graphs into a 10-billion-parameter pre-trained model built on the PaddlePaddle framework. We propose a parallel pre-training task named Universal Knowledge-Text Prediction (UKTP), which requires both unstructured text and large-scale knowledge graphs.
“We concurrently feed the entity relations of knowledge graphs and plain text data into the pre-training model for joint masked training. By doing so, we improve how the pre-training model captures and shares information from structured knowledge and unstructured text, further enhancing its memory and reasoning abilities,” said Yu Sun, Distinguished Architect of NLP at Baidu.
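As a rough illustration of what joint masked training over a knowledge triple and a sentence could look like, here is a minimal Python sketch. It is not the ERNIE 3.0 implementation. The helper name build_uktp_example, the [MASK] and [SEP] tokens, the 50/50 choice between masking the relation and masking words, and the 15% word-masking rate are all assumptions made for the example, and the triple and sentence are toy data.

```python
import random

MASK = "[MASK]"  # assumed placeholder token
SEP = "[SEP]"    # assumed separator between triple and sentence


def build_uktp_example(triple, sentence_tokens, rng, mask_prob=0.15):
    """Join a (head, relation, tail) triple with a sentence and mask one side.

    Hypothetical helper: either the relation in the triple is masked, or some
    words in the sentence are masked. Returns the joined token sequence and a
    dict mapping each masked position to its original token.
    """
    head, relation, tail = triple
    triple_tokens = [head, relation, tail]
    text_tokens = list(sentence_tokens)
    labels = {}

    if rng.random() < 0.5:
        # Mask the relation, which sits at index 1 of the joined sequence.
        labels[1] = relation
        triple_tokens[1] = MASK
    else:
        # Sentence tokens start after the triple and the separator.
        offset = len(triple_tokens) + 1
        for i, tok in enumerate(text_tokens):
            if rng.random() < mask_prob:
                labels[offset + i] = tok
                text_tokens[i] = MASK
        if not labels:
            # Make sure at least one word is masked.
            i = rng.randrange(len(text_tokens))
            labels[offset + i] = text_tokens[i]
            text_tokens[i] = MASK

    return triple_tokens + [SEP] + text_tokens, labels


# Toy usage with illustrative data.
toy_triple = ("Andersen", "write", "Nightingale")
toy_sentence = "The Nightingale is a fairy tale written by Andersen".split()
tokens, labels = build_uktp_example(toy_triple, toy_sentence, random.Random(0))
print(tokens)
print(labels)
```

In this sketch, a model trained on such examples would have to recover a masked relation from the sentence, or recover masked words with help from the triple, which is one way structured and unstructured inputs can share information in a single masked-prediction task.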
ERNIE 3.0 fuses auto-regressive and auto-encoding networks, which allows the trained model to be easily tailored for both NLU and NLG tasks under zero-shot learning, few-shot learning, or fine-tuning settings.
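The difference between auto-encoding and auto-regressive processing is often expressed through the attention mask each one uses. The sketch below only illustrates those two common mask patterns in plain Python. It makes no claim about how ERNIE 3.0 actually fuses the two networks, and the function names are hypothetical.

```python
def bidirectional_mask(n):
    """Auto-encoding style: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


def causal_mask(n):
    """Auto-regressive style: position i may attend only to positions <= i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


# Print both patterns for a 4-token sequence (1 = may attend, 0 = blocked).
for name, mask in (("bidirectional", bidirectional_mask(4)),
                   ("causal", causal_mask(4))):
    print(name)
    for row in mask:
        print(row)
```

A bidirectional mask suits understanding tasks, where the whole input is visible at once, while a causal mask suits generation, where each token may depend only on the tokens before it.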
The ERNIE 3.0 framework consists of a universal representation module as the backbone shared network, as well as task-specific representation modules. The universal representation module effectively captures universal lexical and syntactic information from training data, while task-specific representation modules selectively extract top-level semantic representations for different objective tasks.
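The relationship between a shared backbone and task-specific modules can be sketched structurally as follows. This is a toy Python illustration under stated assumptions, not the ERNIE 3.0 architecture: the class names UniversalModule and TaskModule, the hand-made features, and the fixed weights are all hypothetical stand-ins for learned Transformer layers.

```python
class UniversalModule:
    """Stand-in for the shared backbone: maps tokens to feature vectors."""

    def __init__(self, dim=4):
        self.dim = dim
        self.calls = 0  # counts how often the shared module is used

    def encode(self, tokens):
        self.calls += 1
        # Toy deterministic features derived from token length (assumption).
        return [[((len(tok) + k) % 7) / 7.0 for k in range(self.dim)]
                for tok in tokens]


class TaskModule:
    """Stand-in for a task-specific module stacked on the shared backbone."""

    def __init__(self, name, backbone, weights):
        self.name = name
        self.backbone = backbone  # shared reference, not a copy
        self.weights = weights    # toy task-specific parameters

    def forward(self, tokens):
        features = self.backbone.encode(tokens)
        # Each task reads the shared features through its own weights.
        return [sum(w * x for w, x in zip(self.weights, vec))
                for vec in features]


backbone = UniversalModule(dim=4)
nlu = TaskModule("nlu", backbone, [1.0, 0.0, 0.0, 0.0])
nlg = TaskModule("nlg", backbone, [0.0, 0.0, 0.0, 1.0])

print(nlu.forward(["knowledge", "graph"]))
print(nlg.forward(["knowledge", "graph"]))
```

Both task modules hold a reference to the same backbone object, so anything the backbone learns would be available to every task, while each task keeps its own parameters on top.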
Our researchers evaluated ERNIE 3.0 across 54 Chinese NLP tasks, including sentiment analysis, opinion extraction, machine reading comprehension, text summarization, dialogue generation, and real math word problems. The experiments showed that ERNIE 3.0 obtained the best test results, with an average improvement of more than 3% across over 20 different NLP tasks.
Because labeled data is often scarce in real-world scenarios, we also tested ERNIE 3.0 under the zero-shot setting on various tasks. In this setting, ERNIE 3.0 achieved strong results and outperformed other recently proposed large-scale language models such as CPM and PanGu.
On English language understanding, ERNIE 3.0 outperformed T5 and DeBERTa on the challenging SuperGLUE benchmark with a score of 90.6, taking first place and even surpassing the human-level score of 89.8. Styled after the GLUE benchmark, SuperGLUE incorporates eight language understanding tasks and was designed to be more comprehensive, challenging, and diverse than its predecessor. This is not the first time that ERNIE has broken records. In December 2019, ERNIE 2.0 topped the GLUE leaderboard to become the world’s first model to score over 90.
ERNIE 3.0 can take creative writing to the next level. Please feel free to tinker around with the ERNIE 3.0 demo here: https://wenxin.baidu.com/wenxin/ernie.
ERNIE is already widely used in Baidu’s online products and services, such as search, newsfeed, and smart speakers. Through Baidu AI Cloud, ERNIE can also be integrated into industrial applications in sectors such as energy, finance, telecommunications, media, and education. With ERNIE 3.0, Baidu is poised to create substantial economic and social value.