Casual thoughts about deep neural network design

Inspired by A ConvNet for the 2020s by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie from Facebook AI Research (FAIR) and UC Berkeley.

Convolution and the transformer are two approaches to designing a deep neural network. Recently, the transformer seems to be becoming dominant in AI research. I have been asking myself: is that true? What makes the transformer special? I do not know the answer yet, and I hope to figure it out sooner or later. What I would like to share in this blog is my experience with developing deep neural networks during my Ph.D. studies.

What’s the secret to designing a state-of-the-art artificial deep neural network?

  • Learning Structure – You need to tell the network how to extract features from the input layer by layer. For example,

Hierarchical representation by starting from small-sized patches and gradually increasing the size through merging to achieve scale-invariance

By Sieun Park
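
To make this concrete, here is a minimal sketch of the patch-merging idea behind that hierarchy, loosely following Swin Transformer; the class name and sizes are my own illustration, not code from the paper.

```python
import torch
import torch.nn as nn

class PatchMerging(nn.Module):
    """Merge each 2x2 group of patches into one, halving the spatial
    resolution and doubling the channel width (a Swin-style hierarchy)."""
    def __init__(self, dim):
        super().__init__()
        # 4 neighboring patches are concatenated (4*dim) and projected to 2*dim
        self.norm = nn.LayerNorm(4 * dim)
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)

    def forward(self, x):
        # x: (batch, height, width, dim); height and width assumed even
        x0 = x[:, 0::2, 0::2, :]  # top-left patch of each 2x2 group
        x1 = x[:, 1::2, 0::2, :]  # bottom-left
        x2 = x[:, 0::2, 1::2, :]  # top-right
        x3 = x[:, 1::2, 1::2, :]  # bottom-right
        x = torch.cat([x0, x1, x2, x3], dim=-1)  # (B, H/2, W/2, 4*dim)
        return self.reduction(self.norm(x))      # (B, H/2, W/2, 2*dim)

x = torch.randn(1, 8, 8, 96)
print(PatchMerging(96)(x).shape)  # torch.Size([1, 4, 4, 192])
```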
  • Block Design – Play with the internal representation. For example,

Achieves efficient, linear computational complexity by computing self-attention locally. (shifted window approach)

By Sieun Park
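
A rough sketch of the local-attention idea follows; it partitions the feature map into non-overlapping windows and attends only within each one, so the cost grows linearly with image area. The shifting between successive blocks is omitted, and `WindowAttention` and its parameters are my own names for illustration.

```python
import torch
import torch.nn as nn

class WindowAttention(nn.Module):
    """Self-attention restricted to non-overlapping local windows."""
    def __init__(self, dim, window=4, heads=4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (batch, height, width, dim); H and W assumed divisible by window
        B, H, W, C = x.shape
        w = self.window
        # partition into (num_windows * B, w*w, C) token groups
        x = x.view(B, H // w, w, W // w, w, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, w * w, C)
        x, _ = self.attn(x, x, x)  # attention only within each window
        # reverse the partition back to (B, H, W, C)
        x = x.view(B, H // w, W // w, w, w, C)
        return x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)

x = torch.randn(1, 8, 8, 64)
print(WindowAttention(64)(x).shape)  # torch.Size([1, 8, 8, 64])
```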
  • Size of the Convolution Kernel – A larger kernel gives each layer a wider receptive field. For example,

The researchers observed that the benefit of larger convolution kernels reaches a saturation point at 7 × 7.

from A ConvNet for the 2020s
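
One reason such large kernels are affordable in a ConvNeXt-style design is that they are depthwise: each channel gets its own k × k filter, so the parameter count grows slowly with k. A small sketch, with illustrative channel counts of my own choosing:

```python
import torch
import torch.nn as nn

# A depthwise convolution applies one k x k filter per channel, so growing
# the kernel from 3x3 to 7x7 adds relatively few parameters.
for k in (3, 5, 7):
    dw = nn.Conv2d(96, 96, kernel_size=k, padding=k // 2, groups=96)
    n_params = sum(p.numel() for p in dw.parameters())
    print(f"{k}x{k} depthwise conv: {n_params} params")

x = torch.randn(1, 96, 56, 56)
print(dw(x).shape)  # spatial size preserved: torch.Size([1, 96, 56, 56])
```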
  • Pay Attention to Temporal Learning – Play with when, in the sequence of operations, each representation is computed; the order of layers within a block matters. For example,

The position of the spatial depth-wise Conv layer is moved up.

from A ConvNet for the 2020s
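
Here is a simplified sketch of such a block, with the spatial depthwise conv placed first and the 1 × 1 channel mixing after it; it omits details like layer scale and stochastic depth that the actual ConvNeXt block uses.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """ConvNeXt-style block: the spatial (depthwise) conv runs first; the
    channel mixing (1x1 expand -> GELU -> 1x1 project) happens afterwards."""
    def __init__(self, dim):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, 7, padding=3, groups=dim)  # spatial
        self.norm = nn.LayerNorm(dim)
        self.pwconv1 = nn.Linear(dim, 4 * dim)  # channel expansion
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(4 * dim, dim)  # channel projection

    def forward(self, x):
        # x: (batch, dim, height, width)
        skip = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)            # to channels-last for Linear
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        return skip + x.permute(0, 3, 1, 2)  # residual connection

x = torch.randn(1, 96, 14, 14)
print(Block(96)(x).shape)  # torch.Size([1, 96, 14, 14])
```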
  • Training with Well-tuned Hyper-parameters – Optimizer, learning rate, batch size, activation function, and so on. There are no secrets here; just try things out.
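
For reference, one common modern starting point is AdamW with a linear warmup followed by cosine decay; the numbers below are illustrative defaults of my own, not a recipe from any particular paper.

```python
import torch

model = torch.nn.Linear(128, 10)  # stand-in for a real network
epochs, warmup = 300, 20

# AdamW with linear warmup, then cosine decay over the remaining epochs.
opt = torch.optim.AdamW(model.parameters(), lr=4e-3, weight_decay=0.05)
sched = torch.optim.lr_scheduler.SequentialLR(
    opt,
    schedulers=[
        torch.optim.lr_scheduler.LinearLR(
            opt, start_factor=1e-3, total_iters=warmup),
        torch.optim.lr_scheduler.CosineAnnealingLR(
            opt, T_max=epochs - warmup),
    ],
    milestones=[warmup],
)
```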

Final thoughts

Think about the following:

  1. Pay attention to every stage of the pipeline: Input -> Representation -> Output
  2. Spatial and Temporal Learning

Keep them in mind whenever designing a deep neural network for any task.