Institute of Deep Learning

Institute of Deep Learning 2017-05-22T04:01:03+00:00

About IDL

Baidu launched the Institute of Deep Learning in 2013. The team’s focus areas include image recognition, machine learning, robotics, human-computer interaction, 3D vision and heterogeneous computing.

Visit the Baidu IDL Beijing website →

Technical Work

CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases.
Zihang Dai, Lei Li and Wei Xu
Annual Meeting of the Association for Computational Linguistics (2016)


Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation.
Jie Zhou, Ying Cao, Xuguang Wang, Peng Li, Wei Xu
Transactions of the Association for Computational Linguistics (2016)


Online Reconstruction of Indoor Scenes from RGB-D Streams
Wang Hao, Wang Jun, Wang Liang
Conference on Computer Vision and Pattern Recognition (2016), dataset


CNN-RNN: A Unified Framework for Multi-label Image Classification
Jiang Wang, Yi Yang, Junhua Mao, Zhiheng Huang, Chang Huang, and Wei Xu
Conference on Computer Vision and Pattern Recognition, Oral (2016)


Video Paragraph Captioning using Hierarchical Recurrent Neural Networks
Haonan Yu, Jiang Wang, Yi Yang, Zhiheng Huang, Wei Xu
Conference on Computer Vision and Pattern Recognition, Oral (2016)


Attention to Scale: Scale-aware Semantic Image Segmentation
Liang-Chieh Chen, Yi Yang, Jiang Wang, Wei Xu, and Alan L Yuille
Conference on Computer Vision and Pattern Recognition (2016)


ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering
Kan Chen, Jiang Wang, Liang-Chieh Chen, Haoyuan Gao, Wei Xu, Ram Nevatia (2015)


Fully Convolutional Attention Localization Networks: Efficient Attention Localization for Fine-Grained Recognition.
Liu, Xiao, Tian Xia, Jiang Wang, and Yuanqing Lin. (2016)


Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition.
Liu, Xiao, Jiang Wang, Shilei Wen, Errui Ding, and Yuanqing Lin. (2016)


SWIFT: Compiled Inference for Probabilistic Programs
Yi Wu, Lei Li and Stuart J. Russell
Neural Information Processing Systems – Workshop on Black Box Learning and Inference (2015)


On Optimization Algorithms for Recurrent Networks with Long Short-Term Memory
Hieu Pham, Zihang Dai and Lei Li
Bay Area Machine Learning Symposium (2015)


Twisted Recurrent Network for Named Entity Recognition
Zefu Lu, Lei Li and Wei Xu 
Bay Area Machine Learning Symposium (2015)


A Deep Visual Correspondence Embedding Model for Stereo Matching Costs
Zhuoyuan Chen, Xun Sun, Liang Wang, Yinan Yu, Chang Huang
International Conference on Computer Vision (2015)


End-to-end Learning of Semantic Role Labeling Using Recurrent Neural Networks
Jie Zhou, Wei Xu
Association for Computational Linguistics (2015)


Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
Haoyuan Gao, Junhua Mao, Jie Zhou, Zhiheng Huang, Lei Wang, Wei Xu
arXiv.org (2015)


Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images
Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille
arXiv.org (2015)


Learning from Massive Noisy Labeled Data for Image Classification
Xiao, Tong, Tian Xia, Yi Yang, Chang Huang, and Xiaogang Wang
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2691-2699 (2015)


Deep Multiple Instance Learning for Image Classification and Auto-Annotation
Wu, Jiajun, Yinan Yu, Chang Huang, and Kai Yu
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3460-3469 (2015)


Multi-Objective Convolutional Learning for Face Labeling
Liu, Sifei, Jimei Yang, Chang Huang, and Ming-Hsuan Yang
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3451-3459 (2015)


Explain Images with Multimodal Recurrent Neural Networks
Mao, Junhua, Wei Xu, Yi Yang, Jiang Wang, and Alan L. Yuille
arXiv preprint arXiv:1410.1090 (2014).


Depth-based Hand Pose Estimation: Methods, Data, and Challenges
Supancic III, James Steven, Gregory Rogez, Yi Yang, Jamie Shotton, and Deva Ramanan
arXiv preprint arXiv:1504.06378 (2015)


Conditional Random Fields as Recurrent Neural Networks
Zheng, Shuai, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, and Philip Torr
arXiv preprint arXiv:1502.03240 (2015)


DenseBox: Unifying Landmark Localization with End to End Object Detection
Huang, Lichao, Yi Yang, Yafeng Deng, and Yinan Yu
arXiv preprint arXiv:1509.04874 (2015)


Look and think twice: Capturing Top-down Visual Attention with Feedback Convolutional Neural Networks
Cao, Chunshui, Xianming Liu, Yi Yang, Yinan Yu, Jiang Wang, Zilei Wang, Yongzhen Huang
In Proceedings of the IEEE International Conference on Computer Vision, pp. 2956-2964 (2015)


Maxios: Large Scale Nonnegative Matrix Factorization for Collaborative Filtering
Simon Shaolei Du, Yilin Liu, Boyi Chen and Lei Li
Neural Information Processing Systems, Workshop on Distributed Machine Learning and Matrix Computations (2014)


BFiT: From Possible-World Semantics to Random-Evaluation Semantics in Open Universe
Yi Wu, Lei Li and Stuart J. Russell
Neural Information Processing Systems – Workshop on Probabilistic Programming (2014)


Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille
arXiv.org (2014)


Bidirectional LSTM-CRF Models for Sequence Tagging
Zhiheng Huang, Wei Xu, Kai Yu
arXiv.org (2014)