Baidu Research

PLATO-2: The State-of-the-art Open-Domain Chatbot in Chinese and English

2020-07-15

With the steady progress in computer dialogue systems, people are becoming more comfortable talking with conversational agents like Google Assistant and Baidu's DuerOS. However, a natural human-bot conversation has a long way to go. Most chatbots are task-oriented dialogue systems that only specialize in target areas. What people want is a chatbot without topic restrictions, known as an open-domain chatbot.

While the open-domain chatbot is still a challenging research field, recent advances in large-scale pre-training approaches fueled by enormous text corpora have spawned cutting-edge English chatbot models like Microsoft's DialoGPT, Google's Meena, and Facebook's Blender.

We are excited to present PLATO-2, our newest open-domain chatbot model, which can talk about anything in Chinese and English and engage in deep conversations. Building on its predecessor, PLATO, PLATO-2 uses latent variables for diverse response generation and introduces an effective training method via curriculum learning. Our experiments show that PLATO-2 outperforms other state-of-the-art models by a substantial margin in both Chinese and English evaluations.

You can read the paper, "Towards Building an Open-Domain Chatbot via Curriculum Learning," on arXiv. You can find the open-sourced code on GitHub.

One-to-many mapping

One of the challenges facing dialogue generation systems is "one-to-many" mapping: a single dialogue context can correspond to multiple appropriate responses. For example, if told, "It is snowing outside," one person might say, "How about making a snowman?" while another might say, "It's so cold. I miss summer."

These different responses can be attributed to context and background knowledge, including personal attributes (gender, age, portrait, etc.), commonsense knowledge, personality, and emotion. Computer systems, however, struggle to model such one-to-many relationships, and the resulting ambiguity introduces noise into the training of dialogue systems.

To address this challenge, models like Baidu's PLATO and Microsoft's OPTIMUS represent the one-to-many relationship in a latent space. PLATO explicitly uses discrete latent variables and designs two reciprocal tasks, response generation and response selection, to boost the quality of dialogue generation. The model achieved state-of-the-art results on three publicly available datasets (Persona-Chat, DailyDialog, DSTC7-AVSD).

PLATO-2

Unlike the unidirectional network of DialoGPT and the Encoder-Decoder architecture of Meena and Blender, PLATO-2 keeps the unified network of PLATO, which handles bidirectional context encoding and unidirectional response generation through a flexible attention mechanism design. The model also adopts the pre-normalization technique used in GPT-2, where layer normalization is placed within residual connections.
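To make the "flexible attention mechanism" more concrete, here is a minimal sketch of how such a mask could be built. It assumes a single sequence made of context tokens followed by response tokens. The function name and the 0/1 encoding are illustrative choices, not taken from the PLATO-2 code base.

```python
def build_attention_mask(context_len, response_len):
    """Return a (total x total) 0/1 mask; mask[q][k] == 1 means query q may attend to key k.

    Assumption for illustration: tokens 0..context_len-1 are context,
    the remaining tokens are the response.
    """
    total = context_len + response_len
    mask = []
    for query in range(total):
        row = []
        for key in range(total):
            if key < context_len:
                # Context keys are visible to every token (bidirectional context).
                allowed = True
            else:
                # Response keys are visible only to response tokens at or
                # after that position (left-to-right generation).
                allowed = query >= key
            row.append(1 if allowed else 0)
        mask.append(row)
    return mask


if __name__ == "__main__":
    for row in build_attention_mask(context_len=2, response_len=2):
        print(row)
```

With two context tokens and two response tokens, the two context rows see only the context, while each response row sees the full context plus the response tokens generated so far.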

Researchers trained PLATO-2 via curriculum learning. As shown in the illustration below, there are two stages involved in the learning process. At stage one, a coarse-grained baseline model is trained for general response generation under the simplified one-to-one mapping relationship. At stage two, two models of fine-grained generation and evaluation are further trained for diverse response generation and response coherence estimation, respectively.
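One plausible way to combine the two stage-two models at inference time is to generate one candidate response per latent value and keep the candidate that the evaluation model scores as most coherent. The sketch below illustrates that idea only. `toy_generate` and `toy_coherence` are hypothetical stand-ins for the fine-grained generation and evaluation models, and the canned responses and prefix-matching score are invented for the example.

```python
def select_response(context, num_latent, generate_fn, score_fn):
    """Generate one candidate per discrete latent value, then pick the most coherent."""
    candidates = [generate_fn(context, z) for z in range(num_latent)]
    return max(candidates, key=lambda response: score_fn(context, response))


# Hypothetical stand-in for the generation model: each latent value
# selects a different canned response.
CANNED = [
    "How about making a snowman?",
    "It's so cold. I miss summer.",
    "I like pizza.",
]


def toy_generate(context, z):
    return CANNED[z % len(CANNED)]


# Hypothetical stand-in for the evaluation model: counts response words
# that share a four-character prefix with a context word.
def toy_coherence(context, response):
    prefixes = {w[:4] for w in context.lower().split() if len(w) >= 4}
    return sum(1 for w in response.lower().split() if w[:4] in prefixes)


if __name__ == "__main__":
    print(select_response("It is snowing outside.", 3, toy_generate, toy_coherence))
```

The point of the design is the separation of concerns: the latent variable supplies diversity, and a separate scorer decides which of the diverse candidates best fits the context.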

Our researchers also scaled up the model. While PLATO contains 12 transformer blocks with 110 million parameters, the standard PLATO-2 model has 32 transformer blocks and 32 attention heads, with 1.6 billion parameters. Because training models at this scale is increasingly compute-intensive, researchers adopted curriculum learning as an efficient way to train PLATO-2.
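As a rough sanity check on the 1.6 billion figure, the parameter count of a transformer stack can be estimated from its depth and hidden size. The sketch below assumes a hidden size of 2048 and a feed-forward inner size of four times the hidden size. Neither value is stated in this post, and the estimate ignores embeddings, biases, and layer-normalization weights.

```python
def estimate_block_params(num_blocks, hidden_size, ffn_multiplier=4):
    """Approximate weight count of a transformer stack (weight matrices only)."""
    # Self-attention: query, key, value, and output projections.
    attention = 4 * hidden_size * hidden_size
    # Feed-forward network: one up-projection and one down-projection.
    feed_forward = 2 * ffn_multiplier * hidden_size * hidden_size
    return num_blocks * (attention + feed_forward)


if __name__ == "__main__":
    # hidden_size=2048 is an assumption for illustration, not a figure from this post.
    total = estimate_block_params(num_blocks=32, hidden_size=2048)
    print(f"{total:,} weights, about {total / 1e9:.2f} billion")
```

Under these assumptions, the 32 blocks alone account for about 1.61 billion weights, which is consistent with the reported model size.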

Thanks to the powerful parallel computing capability of PaddlePaddle, training the 1.6B-parameter model took approximately three weeks on 64 Nvidia Tesla V100 GPUs. Researchers also employed gradient checkpointing to trade computation for memory.

PLATO-2 comes in separate Chinese and English models. Researchers trained the English model on a dataset of 684M samples extracted from Reddit, and the Chinese model on a dataset of 1.2B samples drawn from social media sites.
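The trade-off behind gradient checkpointing can be shown with a toy example. The sketch below is not PaddlePaddle code: it backpropagates through a chain of scalar `sin` layers, first storing every layer input and then storing only one checkpoint per segment and recomputing the rest during the backward pass. The function names and the choice of `sin` as the layer are illustrative assumptions.

```python
import math


def backprop_full(x, n_layers):
    """Standard backprop: keep every layer input for the backward pass."""
    inputs = []
    for _ in range(n_layers):
        inputs.append(x)
        x = math.sin(x)
    grad = 1.0
    for xi in reversed(inputs):
        grad *= math.cos(xi)  # d sin(x) / dx
    return x, grad, len(inputs)


def backprop_checkpointed(x, n_layers, segment):
    """Keep one input per segment; recompute the others when they are needed."""
    checkpoints = {}
    for i in range(n_layers):
        if i % segment == 0:
            checkpoints[i] = x
        x = math.sin(x)
    output = x

    grad = 1.0
    peak_stored = len(checkpoints)
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, n_layers)
        # Re-run the forward pass for this segment only (extra computation).
        segment_inputs = []
        h = checkpoints[start]
        for _ in range(start, end):
            segment_inputs.append(h)
            h = math.sin(h)
        # Conservative count: all checkpoints plus the recomputed segment.
        peak_stored = max(peak_stored, len(checkpoints) + len(segment_inputs))
        for xi in reversed(segment_inputs):
            grad *= math.cos(xi)
    return output, grad, peak_stored


if __name__ == "__main__":
    print(backprop_full(0.5, 32))
    print(backprop_checkpointed(0.5, 32, segment=4))
```

Both functions return the same output and gradient. The checkpointed version holds at most 12 values at a time instead of 32 for a 32-layer chain with segments of four, at the cost of running each layer's forward computation a second time.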

Evaluation results

In the experiments, researchers carried out static evaluations, in which the model is asked to produce a response to a given multi-turn context, as well as interactive evaluations, which consist of bot self-chat for English and human-bot chat for Chinese.

Compared with Microsoft's DialoGPT, Google's Meena, and Facebook's Blender, PLATO-2 performed better on coherence, informativeness, and engagingness in English dialogues. PLATO-2 also demonstrated a significant advantage in Chinese multi-turn chat over Microsoft's Chinese digital assistant XiaoIce.

As shown in the image below, PLATO-2 adds significant richness to its dialogue in self-chat evaluations and broadens the topic to include related issues. In contrast, the Blender model often changes the subject.

We believe with PLATO-2, we are one step closer to natural human-machine interaction. The code and model of the English-based PLATO-2 will soon be available on GitHub. We also plan to release APIs for the Chinese model.