Baidu Tech Blog

Tech blog for Baidu Research

Gram CTC: Speech Recognition with Word Piece Targets

March 2nd, 2017

Deep Speech presented an end-to-end neural architecture using the CTC loss for speech recognition in multiple languages. Today, we present Gram CTC, which extends the CTC loss function to automatically discover and predict word pieces instead of characters. Models using Gram CTC achieve state-of-the-art results on the Fisher-Swbd benchmark with a single model, demonstrating that end-to-end learning [...]

Deep Voice: Real-Time Neural Text-to-Speech for Production

February 28th, 2017

Baidu Research presents Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks. The biggest obstacle to building such a system thus far has been the speed of audio synthesis – previous approaches have taken minutes or hours to generate only a few seconds of speech. We solve this challenge and show that [...]

Bringing HPC Techniques to Deep Learning

February 21st, 2017

Summary: Neural networks have grown in scale over the past several years, and training can require a massive amount of data and computational resources. To provide the required amount of compute power, we scale models to dozens of GPUs using a technique common in high-performance computing (HPC) but underused in deep learning. This technique, the [...]

PaddlePaddle and Kubernetes Join Forces, Helping Developers Efficiently Train Deep Learning Models

February 7th, 2017

The Kubernetes community announced today that PaddlePaddle, the open source deep learning framework originally developed by Baidu, is now compatible with Kubernetes, the cluster management system, making PaddlePaddle the only deep learning framework that officially supports Kubernetes to date. The compatibility will allow developers to conveniently train large models on all major global cloud service providers [...]

Baidu’s Melody: AI-Powered Conversational Bot for Doctors and Patients

October 18th, 2016

Baidu has launched Melody, an AI-powered conversational bot designed to provide relevant information to doctors to assist with recommendations and treatment options. Melody incorporates advanced deep learning and natural language processing (NLP) technologies developed by Baidu. Melody integrates with Baidu Doctor, an app that Baidu launched in China in 2015. Andrew Ng, chief scientist, Baidu, said: [...]

SVAIL Tech Notes: Optimizing RNNs with Differentiable Graphs

June 15th, 2016

This week we posted a new Tech Note in which Jesse Engel discusses a new technique for speeding up the training of deep recurrent neural networks. This is Part II of a multi-part series detailing some of the techniques we've used here at Baidu's Silicon Valley AI Lab (SVAIL) to accelerate the training of recurrent neural networks. While Part [...]

Adam Coates Speaks to TechEmergence about Future of Speech Recognition

May 6th, 2016

Adam Coates recently sat down with Daniel Faggella from TechEmergence at our Sunnyvale office for an interview about AI, speech recognition, and natural language processing. During the interview, Coates, Director of Baidu Silicon Valley AI Lab, talked about Baidu's work in AI. He also shared his thoughts on consumer artificial intelligence applications in terms of their impact, [...]

Baidu Researchers to Present at GPU Tech Conference

March 31st, 2016

Baidu Research will participate in next week's GPU Tech Conference in San Jose, California. Here's a rundown of some of our activities there:
- Exhibit Hall: Baidu will have a table in the "AI Playground" at GTC. Drop by to say hi, watch a demo of Deep Speech, and learn about open job positions.
- [...]

Big Data vs. Big Crowds – New Research from Baidu’s Big Data Lab

March 29th, 2016

Baidu’s Big Data Lab has released a paper detailing how to use big data analytics to predict large-scale crowd formation and warn people of potentially deadly stampede events, like the tragic one that claimed the lives of 36 people in Shanghai on New Year’s Eve, 2014. The paper is titled "Early Warning of Human Crowds Based [...]

SVAIL Tech Notes: A Look at Persistent Recurrent Neural Nets

March 25th, 2016

Today we posted a new Tech Note in which Greg Diamos, a research scientist at Baidu's Silicon Valley AI Lab, discusses a new technique for speeding up the training of deep recurrent neural networks. Greg explains: At SVAIL, our mission is to create AI technology that lets us have a significant impact on hundreds of millions [...]