Baidu AI 'supercomputer' breaks Google's image recognition record
The supercomputer, called Minwa, has 72 powerful processors and 144 graphics processors known as GPUs – high-performance specialized chips typically used to deal with visual data. Minwa's accomplishments were detailed in a paper released by Baidu on Monday.
To test Minwa's skills, Baidu let the supercomputer loose on ImageNet, a database of over one million pictures. Minwa then taught itself how to sort the images into a predefined set of roughly 1,000 categories. This required it to distinguish subtle differences between similar images, such as photos of two different breeds of dogs.
In order to complete the task, Minwa applied a “neural network” to recognize the images, training its software with high-resolution versions of pictures so that it would develop an understanding of the characteristics it was looking for.
The data was even delivered in skewed forms – using vignetting, cropping, and color and shape distortion – to ensure the software learned the important properties of each image, rather than getting bogged down with unnecessary details.
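The distortions described above are commonly called data augmentation. The following is a minimal sketch of the idea in plain Python, not Baidu's actual pipeline: the function names, the 32x32 toy image, and the parameter values are all assumptions chosen for illustration. An image is represented as a list of rows, each row a list of (r, g, b) tuples with values between 0 and 1.

```python
import random

def random_crop(img, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of the image."""
    top = rng.randint(0, len(img) - crop_h)
    left = rng.randint(0, len(img[0]) - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]

def color_shift(img, max_shift, rng):
    """Add one random offset per color channel, clamped to [0, 1]."""
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return [[tuple(min(1.0, max(0.0, c + s)) for c, s in zip(px, shift))
             for px in row] for row in img]

def vignette(img, strength):
    """Darken pixels toward the corners; pixels near the center barely change."""
    h, w = len(img), len(img[0])
    out = []
    for y, row in enumerate(img):
        ny = 2.0 * y / (h - 1) - 1.0  # -1 at the top edge, +1 at the bottom
        new_row = []
        for x, px in enumerate(row):
            nx = 2.0 * x / (w - 1) - 1.0  # -1 at the left edge, +1 at the right
            scale = 1.0 - strength * (nx * nx + ny * ny) / 2.0
            new_row.append(tuple(c * scale for c in px))
        out.append(new_row)
    return out

def augment(img, rng, crop=(24, 24), max_shift=0.1, strength=0.5):
    """Produce one randomly distorted training variant of an image."""
    out = random_crop(img, crop[0], crop[1], rng)
    out = color_shift(out, max_shift, rng)
    return vignette(out, strength)

# Hypothetical 32x32 image filled with random pixel values.
rng = random.Random(0)
image = [[(rng.random(), rng.random(), rng.random()) for _ in range(32)]
         for _ in range(32)]
augmented = augment(image, rng)
```

Calling `augment` repeatedly on the same picture yields a different variant each time, so the network sees many versions of one subject and has less opportunity to memorize incidental details of any single version.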
That training seemed to work, resulting in software that can even recognize the subject of a picture when it is printed out, held at a slanted angle, and then photographed a second time.
“Our company is now leading the race in computer intelligence,” Ren Wu, a Baidu scientist working on the project, said on Tuesday, as cited by MIT Technology Review.
The computer beat Google's image-recognition record, which stood at an error rate of 4.8 percent.
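An error rate of this kind is simply the share of test images the software gets wrong. ImageNet results are commonly reported as a "top-5" error, where a prediction counts as correct if the true label is among the model's five best guesses; whether the figure quoted here is measured that way is an assumption. The sketch below uses a hypothetical helper, `top_k_error`, and made-up toy data to show the arithmetic.

```python
def top_k_error(ranked_guesses, labels, k=5):
    """Fraction of images whose true label is missing from the top k guesses."""
    misses = sum(1 for guesses, truth in zip(ranked_guesses, labels)
                 if truth not in guesses[:k])
    return misses / len(labels)

# Toy data (invented for illustration): four images, two ranked guesses each.
labels = ["husky", "malamute", "tabby", "beagle"]
ranked_guesses = [
    ["husky", "malamute"],
    ["husky", "malamute"],
    ["tabby", "lynx"],
    ["terrier", "poodle"],
]

top1 = top_k_error(ranked_guesses, labels, k=1)  # 0.5: two of four first guesses are wrong
top2 = top_k_error(ranked_guesses, labels, k=2)  # 0.25: only the beagle is missed
```

The second image shows why the choice of k matters: the model confuses two similar dog breeds on its first guess but recovers when its second guess is allowed to count.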
“We have great power in our hands – much greater than our competitors,” Wu said.
He added that Minwa's computational power would probably put it among the 300 most powerful computers in the world if it weren't specialized for deep learning.
Considered to be one of the most powerful forms of AI, deep learning involves algorithms which have only recently made their way into the tech industry from academia. It has led to breakthroughs in speech, image, and face recognition.
The key to deep learning seems to be scale: larger data sets and larger networks tend to produce better results.
“With deep learning, [the return] just keeps going up,” Wu said, adding that previous machine-learning techniques didn't show any improvement once data was scaled beyond a certain point.
Minwa supports deep learning by allowing for an artificial neural network with hundreds of billions of connections, hundreds of times more than any previously built network.
Baidu is currently using an even larger supercomputer to analyze 14,000 hours of speech data, in order to improve the company's Chinese and English language speech recognition.