
Casual thoughts about deep neural network design


Inspired by A ConvNet for the 2020s by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie from Facebook AI Research (FAIR) and UC Berkeley.

Convolution and the transformer are two approaches to designing a deep neural network. Recently, the transformer seems to be becoming dominant in AI development. I have been asking myself: is that true? What makes the transformer special? I do not know the answer yet and hope to figure it out sooner or later. What I would like to share in this post is my experience developing deep neural networks during my Ph.D. study.

What’s the secret to designing a state-of-the-art artificial deep neural network?

  • Learning Structure – You need to tell the network how to extract features from the input, layer by layer. For example,

Hierarchical representation by starting from small-sized patches and gradually increasing the size through merging to achieve scale-invariance

By Sieun Park
  • Block Design – Play with the internal representation. For example,

Achieves efficient, linear computational complexity by computing self-attention locally. (shifted window approach)

By Sieun Park
  • Size of Convolution Kernel. For example,

The researchers observed that larger convolution kernels are beneficial, and that the benefit saturates at 7 × 7.

from A ConvNet for the 2020s
  • Pay Attention to Temporal Learning – Play with the order in which the representations are embedded. For example,

The position of the spatial depth-wise Conv layer is moved up.

from A ConvNet for the 2020s
  • Training with Fine-Tuned Hyper-parameters: optimizer, learning rate, batch size, activation functions, and so on. There are no secrets here; you just have to try.

Final thoughts

Think about the following:

  1. Pay attention to each stage: Input -> Representation -> Output
  2. Spatial and Temporal Learning

Keep them in mind whenever designing a deep neural network for any task.

Understand your data statistically before developing your model: Chapter I

Inspired by “Understanding 8 types of Cross-Validation” by Satyam Kumar

Cross-Validation (CV) is a critical tool for evaluating machine learning models. However, it has to be applied correctly to your own data. You need to check your data and understand it statistically before developing a model on it.

There are more than eight Cross-Validation variants you can use when developing your model. Which one you should use largely depends on your data. At a minimum, we should check (1) the sample size: is it small or large? (2) the classes: are they balanced or not? (3) the ordering: is it time-series data?

The following 8 types of CV are explained in “Understanding 8 types of Cross-Validation” by Satyam Kumar.
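As a small illustration of checks (1) and (2), the sketch below counts the samples per class and reports the ratio between the largest and the smallest class. The function name and the toy labels are made up for this example. Check (3) cannot be read off the labels; you have to know how the data was collected.

```python
from collections import Counter


def describe_labels(labels):
    """Return the sample size, the per-class counts, and the majority/minority ratio."""
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return len(labels), dict(counts), ratio


# Toy labels, invented for illustration only.
labels = ["healthy"] * 90 + ["sick"] * 10
n_samples, class_counts, imbalance_ratio = describe_labels(labels)
print(n_samples, class_counts, imbalance_ratio)
```

A ratio close to 1 means the classes are roughly balanced. A large ratio suggests looking at stratified splitting or resampling before choosing a CV scheme.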

  1. Leave p out cross-validation
  2. Leave one out cross-validation
  3. Holdout cross-validation
  4. Repeated random subsampling validation
  5. k-fold cross-validation
  6. Stratified k-fold cross-validation
  7. Time Series cross-validation
  8. Nested cross-validation
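To give a feeling for what the stratified variant does differently from plain k-fold, here is a minimal standard-library sketch. It is not the implementation from the article or from any library. It deals the samples of each class out round-robin so that every fold ends up with roughly the same class proportions. The function name and the toy labels are my own.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal this class's samples out one fold at a time.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx


# Toy imbalanced labels: 20 samples of class 0 and 5 of class 1.
labels = [0] * 20 + [1] * 5
splits = list(stratified_k_fold(labels, k=5))
for train_idx, test_idx in splits:
    print("test fold:", [labels[i] for i in test_idx])
```

With plain k-fold on the same labels, some test folds could easily contain no sample of the minority class at all, which makes the per-fold scores hard to compare.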

You can find the pros and cons of each one in the article. Here are my key ideas:

  1. Make sure your data is balanced and is not a time series. The fast and safe way to balance it is to up-sample or down-sample. After balancing the data, we can easily apply nested cross-validation (why? check my previous blog)
  2. If it is time-series data, you have few CV options and need to use time series cross-validation.
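The reason time-series data needs its own scheme is that the model must never be trained on samples that come after the ones it is tested on. Below is a minimal sketch of one common form, an expanding training window. The function name and parameters are hypothetical, and the samples are assumed to be already sorted by time.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_idx, test_idx) where every training index precedes every test index."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for i in range(n_splits):
        start = first_test_start + i * test_size
        # Train on everything before the test block, then test on the block itself.
        yield list(range(start)), list(range(start, start + test_size))


ts_splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in ts_splits:
    print("train:", train_idx, "test:", test_idx)
```

Shuffled schemes such as k-fold break this ordering, so they can leak future information into training and make the scores look better than they really are.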

I will keep this topic updated.

“Why You Need to Check Your Residual Plots for Regression Analysis” by Minitab Blog Editor

Original article: https://blog.minitab.com/en/adventures-in-statistics-2/why-you-need-to-check-your-residual-plots-for-regression-analysis

Why? To start, let’s break down and define the 2 basic components of a valid regression model:

Response = (Constant + Predictors) + Error 

Another way we can say this is:

Response = Deterministic + Stochastic

For me, the take-home message is that the residuals should represent only the unpredictable error. By checking the residual plot, you can see whether your predictors are missing some of the predictive information.

Residual plots can reveal unwanted residual patterns that indicate biased results more effectively than numbers.

The residuals should be centered on zero throughout the range of fitted values and normally distributed.

Minitab's residuals versus fits plot

Now let’s look at a problematic residual plot. Keep in mind that the residuals should not contain any predictive information.

Minitab's residuals versus fit plot with bad residuals
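The same idea can be tried without a plot. The sketch below is my own toy example, not taken from the Minitab article. It fits a straight line by ordinary least squares to two synthetic data sets, one truly linear and one with a curved term the model does not include. It then checks whether the residuals still carry information by correlating them with x squared. All data and function names are invented for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def line_residuals(xs, ys):
    """Residual = observed response minus the fitted (deterministic) part."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def correlation(a, b):
    """Pearson correlation between two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)
xs = [i / 10 for i in range(-50, 51)]
noise = [rng.gauss(0, 1) for _ in xs]
ys_linear = [2 + 3 * x + e for x, e in zip(xs, noise)]
ys_curved = [2 + 3 * x + 2 * x * x + e for x, e in zip(xs, noise)]

x_squared = [x * x for x in xs]
res_linear = line_residuals(xs, ys_linear)
res_curved = line_residuals(xs, ys_curved)
for name, res in (("linear data", res_linear), ("curved data", res_curved)):
    mean_res = sum(res) / len(res)
    print(f"{name}: mean residual={mean_res:.6f}, corr with x^2={correlation(res, x_squared):.3f}")
```

Note that the average residual is essentially zero in both cases, because a least-squares fit with an intercept forces it to be. The problem only shows up as a pattern across the range of x, which is why the article asks you to look at the residual plot and not just at summary numbers.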

Further reading from Minitab:

Regression Analysis Tutorial and Examples

Regression Analysis: How Do I Interpret R-squared and Assess the Goodness-of-Fit?