Changing the architecture of deep learning

Lizhong Chen and Fuxin Li receive NSF CAREER awards to support their research advancing deep learning technology.

Fundamental improvements to deep learning are the research target for two professors in the College of Engineering at Oregon State University.

Each comes at the topic from a different perspective: computer architecture and machine learning algorithms. Both have just received the prestigious Faculty Early Career Development (CAREER) program award from the National Science Foundation to further their research and educational outreach.

They are in good company: 24 other faculty in the School of Electrical Engineering and Computer Science have received similar awards in the past. 

Lizhong Chen, assistant professor of electrical and computer engineering, is designing a transformational approach to parallel processing, which is essential for deep learning and many other computing applications — from small, wearable devices to large supercomputers and data centers. Fuxin Li, assistant professor of computer science, will tackle the issue from the software side by creating algorithms that will simplify the tasks of machine learning. 

Faster, smaller, better

Your laptop or cellphone probably has a two- or four-core processor, but a computer used for deep learning can have more than 5,000 cores. That number of processing cores would be impossible with traditional CPUs. To deliver the processing power that modern applications need, computer architects have turned to graphics processing units (GPUs).

“You have an army of processing cores that are doing parallel processing and so some kind of coordination among the vast number of processing cores is necessary,” Chen said.

The army sergeant of data traffic for a many-core GPU is the on-chip network, which is the focus of Chen’s research for the CAREER award.

“We are working on a transformative approach to address this issue, and we’ve come up with a new design for the on-chip network. We can’t use a design created for CPUs because we are really on a very different level of parallelism,” Chen said.

The first step in the process was to identify the bottlenecks of the current system. Based on his results, Chen designed and has applied for a patent on the first routerless network-on-chip (NoC) that is smaller, uses less power, and has higher performance than traditional NoCs.

“We really like this design because we’re not trading off performance for a reduced cost. It’s a better design in all aspects,” Chen said.

For the educational component of the award, Chen plans to strengthen computer architecture education at Oregon State by offering more courses on the topic, and by developing tools that will speed up the process of simulation research for students. He will also mentor aspiring computer architects through K-12 summer programs that encourage women and underrepresented minorities to pursue STEM fields.

“The CAREER award is granted to me but also to the school, so that we can have a more established research program in architecture and GPU in particular,” Chen said.

Smarter algorithms

“Deep learning has been a huge engineering success,” Fuxin Li said. “You can get very good accuracy, sometimes beating human performance, for example, on visual classification tasks like categorizing images of dogs into 120 dog breeds. Not everyone can do that very well.”

A current approach, called convolutional neural networks, can be up to 97 percent accurate on certain tasks, but it requires a significant amount of data augmentation: creating hundreds of transformed templates from each single image to train the network. A transformation could be cropping, rescaling, or rotating the image. Li’s approach seeks to eliminate this process by considering new architectures that are invariant or equivariant to these transformations.
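To make the terms concrete, here is a minimal Python sketch of data augmentation and of what "invariant" and "equivariant" mean. It is only an illustration under simple assumptions: the image is a tiny grid of numbers, the transformations are limited to 90-degree rotations and mirror flips, and every function name is hypothetical. It is not Li's algorithm.

```python
# Illustrative sketch only: all names here are hypothetical, not from Li's research.

def rotate90(img):
    """Rotate a 2D list-of-lists image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def augment(img):
    """Data augmentation: enumerate the 8 rotated and flipped templates."""
    versions = []
    current = img
    for _ in range(4):
        versions.append(current)
        versions.append(flip_horizontal(current))
        current = rotate90(current)
    return versions

def total_brightness(img):
    """An invariant feature: its value is the same for every template."""
    return sum(sum(row) for row in img)

def threshold(img, cutoff=4):
    """An equivariant operation: it works pixel by pixel, so rotating the
    input simply rotates the output in the same way."""
    return [[1 if pixel > cutoff else 0 for pixel in row] for row in img]

image = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]

templates = augment(image)

# Invariance: one value describes all templates, so none need to be enumerated.
brightness_values = {total_brightness(t) for t in templates}

# Equivariance: transform-then-process equals process-then-transform.
commutes = threshold(rotate90(image)) == rotate90(threshold(image))
```

In this toy setting, a model built only from operations like `total_brightness` or `threshold` would handle rotations and flips without ever seeing the extra templates, which is the kind of property the new architectures aim for on real images.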

“Engineering-wise, convolutional networks are OK, because we can just have a lot of templates and we can enumerate all these different versions. But scientifically, we want to see if there is any better architecture that can handle transformations automatically, rather than creating all these different templates,” Li said.

If successful, Li’s new algorithm could fundamentally change the process of deep learning, which would have broad impacts.

For example, one impact on his own research would be better video segmentation to track objects for applications such as sports analysis, tracking salmon populations, traffic or crowd surveillance, and robotics.

“In the future, I’m hopeful we can help robots grab objects. You really need to know the exact shape of the object to grab it successfully, and we haven’t solved that very well yet,” Li said.

Li plans to mentor high school students through a summer internship program, as well as other on-campus programs designed to engage underrepresented minorities. He will also be developing a toolkit that will allow people to train deep learning algorithms without any programming experience. The CAREER award will also support a graduate student who will be able to dedicate research time to improving deep learning algorithms.

“This CAREER award will allow me to try to solve the underlying science problems of deep learning, and once that is solved it would create a new generation of deep learning algorithms,” Li said.

June 6, 2018