Peng Liu

Casual thoughts about deep neural network design

Peng Liu January 13, 2022

Ai, Artificial Intelligence, Intelligence, Network

Inspired by “A ConvNet for the 2020s” by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie from Facebook AI Research (FAIR) and UC Berkeley.

Convolutions and transformers are two approaches to designing a deep neural network. Recently, the transformer seems to be becoming dominant in AI development. I have been asking myself: is that true? What makes the transformer special? I do not know the answer yet and hope to figure it out sooner or later. What I would like to share in this post is my experience with developing deep neural networks during my Ph.D. study.

What’s the secret to designing a state-of-the-art artificial deep neural network?

Learning Structure – You need to tell the network how to extract features from the input layer by layer. For example,

Hierarchical representation by starting from small-sized patches and gradually increasing the size through merging to achieve scale-invariance
By Sieun Park
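The merging step in the quote above can be sketched in a few lines. This is a minimal illustration, assuming a feature map stored as a nested H x W x C list; the function name `merge_patches` is hypothetical, and the learned projection that real models usually apply after merging is left out.

```python
def merge_patches(feature_map):
    """Merge each 2x2 group of neighbouring patches into one patch.

    feature_map is a nested list of shape H x W x C (H and W even).
    The result has shape H/2 x W/2 x 4C: the grid gets coarser while
    each patch describes a larger region of the input.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    merged = []
    for row in range(0, height, 2):
        merged_row = []
        for col in range(0, width, 2):
            # Concatenate the channels of the four neighbours.
            merged_row.append(
                feature_map[row][col]
                + feature_map[row][col + 1]
                + feature_map[row + 1][col]
                + feature_map[row + 1][col + 1]
            )
        merged.append(merged_row)
    return merged


# A toy 4 x 4 map with one channel per patch, holding the patch index.
toy_map = [[[4 * r + c] for c in range(4)] for r in range(4)]
stage_1 = merge_patches(toy_map)  # 2 x 2 grid, 4 channels per patch
stage_2 = merge_patches(stage_1)  # 1 x 1 grid, 16 channels per patch
```

Each call halves the grid in both directions, so later stages see the input at a coarser scale.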

Block Design – Play with the internal representation. For example,

Achieves efficient, linear computational complexity by computing self-attention locally (shifted window approach).
By Sieun Park
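To see why local attention is linear in image size, it helps to count how many token pairs get scored. The sketch below is only a counting exercise under assumed sizes (a 7 x 7 window and square token grids); the function names are hypothetical, and it ignores the channel dimension and the window shifting itself.

```python
def global_attention_pairs(height, width):
    """Token pairs scored when every token attends to every token."""
    tokens = height * width
    return tokens * tokens


def window_attention_pairs(height, width, window):
    """Token pairs scored when attention stays inside window x window tiles."""
    if height % window or width % window:
        raise ValueError("window must divide height and width")
    num_windows = (height // window) * (width // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window * tokens_per_window


for side in (56, 112, 224):
    print(
        side,
        global_attention_pairs(side, side),
        window_attention_pairs(side, side, 7),
    )
```

Doubling the side length multiplies the global count by 16 but the windowed count only by 4, which is the same factor by which the number of tokens grows.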

Size of Convolution Kernel. For example,

The researchers observed that the benefit of larger convolution kernels reaches a saturation point at 7 × 7.
from A ConvNet for the 2020s
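Kernel size is not free, so it is worth checking what a larger kernel costs in weights. The following is a rough count, assuming a depthwise convolution (one filter per channel) next to a standard dense convolution; the channel width of 96 is picked only for illustration, the function names are hypothetical, and biases are ignored.

```python
def depthwise_conv_weights(channels, kernel_size):
    """Weights in a depthwise conv: one k x k filter per channel."""
    return channels * kernel_size * kernel_size


def dense_conv_weights(channels_in, channels_out, kernel_size):
    """Weights in a standard conv that mixes all input channels."""
    return channels_in * channels_out * kernel_size * kernel_size


channels = 96  # assumed width, chosen only for illustration
for k in (3, 5, 7, 9, 11):
    print(
        k,
        depthwise_conv_weights(channels, k),
        dense_conv_weights(channels, channels, k),
    )
```

The count grows with the square of the kernel size in both cases, but the depthwise version stays smaller by a factor equal to the channel count.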

Pay Attention to Temporal Learning – Play with when each representation is computed, that is, the order of operations inside a block. For example,

The position of the spatial depth-wise Conv layer is moved up.
from A ConvNet for the 2020s

Training with Fine-Tuned Hyper-parameters: optimizer, learning rate, batch size, activation functions, and so on. There are no secrets here; you just have to try.
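"Just try" usually means enumerating combinations and training each one. Here is a minimal sketch of that enumeration, assuming a plain grid search; the values in `search_space` are placeholders I picked for illustration, not recommendations, and the training step itself is left out.

```python
from itertools import product

# Hypothetical search space; every value here is a placeholder.
search_space = {
    "optimizer": ["sgd", "adamw"],
    "learning_rate": [1e-3, 4e-3],
    "batch_size": [256, 1024],
    "activation": ["relu", "gelu"],
}


def candidate_configs(space):
    """Yield one dict per combination of hyper-parameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(candidate_configs(search_space))
print(len(configs), "configurations to try")
```

Each dictionary in `configs` would be handed to your own training routine; the grid grows multiplicatively, so keep each list short.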

Final thoughts

Think about the following:

Pay attention to each stage: input -> representation -> output
Spatial and Temporal Learning

Keep them in mind whenever designing a deep neural network for any task.

Understand your data statistically before developing your model - Chapter I

Peng Liu January 7, 2022

Inspired by “Understanding 8 types of Cross-Validation” by Satyam Kumar

Cross-Validation (CV) is a critical way to evaluate machine learning models. However, it has to be applied correctly to your own data. You need to check your data and understand it statistically before developing a model on it.

There are more than eight Cross-Validation variants you may use to develop your model. Which one you should use largely depends on your data. We should check the data at least by asking: (1) is the sample size small or large? (2) are the classes balanced? (3) is it time-series data?
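These three checks are easy to script before choosing a CV scheme. The sketch below is one possible way to do it; the function name `summarize_dataset` and both thresholds (`small_threshold`, `imbalance_ratio`) are my own assumptions and should be adjusted to your problem.

```python
from collections import Counter


def summarize_dataset(labels, timestamps=None, small_threshold=1000,
                      imbalance_ratio=1.5):
    """Answer the three questions: size, class balance, time ordering."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    largest, smallest = max(counts.values()), min(counts.values())
    return {
        "n_samples": len(labels),
        # Assumed cut-off for calling a dataset "small".
        "is_small": len(labels) < small_threshold,
        "class_counts": dict(counts),
        # Assumed cut-off: largest class more than 1.5x the smallest.
        "is_imbalanced": largest / smallest > imbalance_ratio,
        # Treated as a time series whenever timestamps are supplied.
        "is_time_series": timestamps is not None,
    }


summary = summarize_dataset(["a"] * 90 + ["b"] * 10)
print(summary)
```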

The following eight types of CV are explained in “Understanding 8 types of Cross-Validation” by Satyam Kumar:

Leave p out cross-validation
Leave one out cross-validation
Holdout cross-validation
Repeated random subsampling validation
k-fold cross-validation
Stratified k-fold cross-validation
Time Series cross-validation
Nested cross-validation
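To make the difference between two of these variants concrete, here is a small sketch of plain k-fold next to stratified k-fold. It is a simplified illustration without shuffling, and the function names are hypothetical; in practice you would likely use a library implementation.

```python
from collections import defaultdict


def k_fold_indices(n_samples, k):
    """Plain k-fold: cut the index range into k contiguous folds."""
    bounds = [n_samples * fold // k for fold in range(k + 1)]
    return [list(range(bounds[f], bounds[f + 1])) for f in range(k)]


def stratified_k_fold_indices(labels, k):
    """Stratified k-fold: deal each class's indices round-robin so that
    every fold keeps roughly the overall class ratio."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return folds


labels = ["a"] * 8 + ["b"] * 4  # sorted by class, 2:1 ratio
plain_folds = k_fold_indices(len(labels), 4)
stratified_folds = stratified_k_fold_indices(labels, 4)
for name, folds in (("plain", plain_folds), ("stratified", stratified_folds)):
    print(name, [[labels[i] for i in fold] for fold in folds])
```

With class-sorted labels, the plain folds end up with a validation fold that contains only one class, while each stratified fold keeps the 2:1 ratio. Each fold serves as the validation set once, with the remaining folds used for training.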

You can find the pros and cons of each one in the article. Here are my key ideas:

Make sure your data is balanced and is not a time series. The fast and safe way to balance it is to up-sample or down-sample. After balancing the data, we can easily apply Nested Cross-Validation (why? See my previous post).
If it is time-series data, you have few CV options, and you need to use Time Series cross-validation.
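For the time-series case, the key constraint is that training data must always come before validation data. Below is a minimal sketch of an expanding-window split, assuming samples are already sorted by time; the function name and its parameters are hypothetical.

```python
def time_series_splits(n_samples, n_splits, validation_size):
    """Yield (train_indices, validation_indices) pairs in which the
    training window always ends before the validation window starts."""
    first_validation_start = n_samples - n_splits * validation_size
    if first_validation_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for split in range(n_splits):
        start = first_validation_start + split * validation_size
        # Train on everything before `start`, validate on the next block.
        yield list(range(start)), list(range(start, start + validation_size))


splits = list(time_series_splits(n_samples=10, n_splits=3, validation_size=2))
for train, validation in splits:
    print(train, validation)
```

The training window grows with each split, and no future sample ever leaks into training.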

I will keep this topic updated.

“Why You Need to Check Your Residual Plots for Regression Analysis” by Minitab Blog Editor

Peng Liu January 3, 2022

Original article: https://blog.minitab.com/en/adventures-in-statistics-2/why-you-need-to-check-your-residual-plots-for-regression-analysis

Why? To start, let’s break down and define the two basic components of a valid regression model:
Response = (Constant + Predictors) + Error
Another way we can say this is:
Response = Deterministic + Stochastic

The take-home message for me is that the residuals should represent only the unpredictable error. By checking the residual plot, you can see whether your model is missing some predictive information: if the residuals show a pattern, part of the deterministic component has leaked into the error term.
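A tiny example shows what a residual pattern looks like. The sketch below fits a straight line to data whose true relationship is curved; the data is made up for illustration, and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - slope * mean_x, slope


def residuals(xs, ys, intercept, slope):
    """Observed response minus the fitted (deterministic) part."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]  # made-up data: the true relationship is curved
intercept, slope = fit_line(xs, ys)
errors = residuals(xs, ys, intercept, slope)
for x, error in zip(xs, errors):
    print(x, round(error, 2))
```

The residuals are positive at both ends and negative in the middle. That U shape is not random error; it tells you the model is missing a squared term.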