
Medical Imaging Meets NIPS: A summary

This year I attended and presented a poster at the Medical Imaging Meets NIPS workshop. The workshop focused on bringing together professionals from the medical imaging and machine learning communities. Altogether there were eleven talks and two poster sessions. Here I’m going to recap some of the highlights. Presentations and posters generally discussed segmentation, classification, and/or image reconstruction.

Segmentation

Before coming to this workshop, I must admit that I did not fully understand the value of image segmentation. My thought process was always something along the lines of: why would you just want to outline something in an image and not also classify it? This workshop changed my view on the value of segmentation.

Radiation Therapy

Raj Jena, a radiologist at Cambridge University and a Microsoft researcher, gave a presentation titled “Pixel Perfectionism — Machine Learning and Adaptive Radiation Therapy.” In the talk he described how machine learning could help provide better treatments and optimize workflows. For patients to receive proper radiation therapy, it is important to pinpoint the exact boundary of the tumor: by locating the boundary between tumor and healthy tissue precisely, treatments can deliver more radiation with less risk of damaging healthy tissue. Currently, however, segmentation is done manually by radiologists, which often causes discrepancies between different radiologists that can noticeably affect treatment results. Consistency is also important in gauging the effectiveness of drugs used in combination with radiation: if the radiation is not the same across patients, it is nearly impossible to tell whether improvements are caused by the drug or by better radiation.

Machine learning offers the opportunity to provide consistent and more accurate segmentation. It is also fast: machine learning models can often run in seconds, whereas radiologists often take several hours to segment images manually. That time could be better spent plotting the course of treatment or seeing additional patients. Jena also described how machine learning could allow him to become a “super radiation oncologist.”

Slide from Jena’s talk. The “Super Radiation Oncologist” uses machine learning to constantly adapt therapy and predict effects of treatment.
Slide from Jena’s talk, detailing adaptive radiation therapy.

ML can enable oncologists both to better adapt treatments to changes in the shape and size of healthy tissues and to predict possible adverse effects of radiation therapy. For instance, Jena described how he is using simple methods such as Gaussian processes to predict potential side effects of radiation.

This was one of my favorite talks of the entire workshop and I urge you to check out Jena’s full presentation.

Building quality datasets

A common theme throughout the workshop was the quality of annotations and the difficulty of building good medical imaging datasets. This is particularly true in segmentation tasks, where a model can only be as good as its annotators, and the annotators must themselves be skilled radiologists.

One possible way to accelerate the annotation process is through active learning. Tanveer Syeda-Mahmood of IBM briefly brought this up when discussing IBM’s work in radiology. With active learning, one might start with a small labeled dataset and several expert human annotators. The ML algorithm learns the training set well enough that it can annotate easy images itself, while the experts annotate the hard edge cases. Specifically, images that the classifier scores below a certainty threshold are sent to humans for manual annotation. One of the posters (by Girro et al.) also discussed using active learning to effectively train a semantic image segmentation network.
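To make the routing concrete, here is a minimal sketch in plain Python of the thresholding step described above (the image IDs, scores, and threshold are hypothetical): predictions the model is confident about are accepted automatically, and the rest are queued for expert annotation.

```python
def split_by_confidence(predictions, threshold=0.8):
    """Split model predictions into auto-accepted and expert-review queues.

    predictions: list of (image_id, confidence) pairs, where confidence is
    the model's certainty score in [0, 1] for its proposed annotation.
    """
    auto_labeled, needs_expert = [], []
    for image_id, confidence in predictions:
        if confidence >= threshold:
            auto_labeled.append(image_id)   # model annotation accepted as-is
        else:
            needs_expert.append(image_id)   # routed to a radiologist
    return auto_labeled, needs_expert

preds = [("img01", 0.95), ("img02", 0.40), ("img03", 0.85), ("img04", 0.62)]
auto, expert = split_by_confidence(preds)
print(auto)    # ['img01', 'img03']
print(expert)  # ['img02', 'img04']
```

In a full active-learning loop, the expert-labeled images would be added back to the training set and the model retrained, so that the expert queue shrinks over successive rounds.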

Active learning may address part of the problem; however, it does not entirely solve the quality issue. The central question is how researchers can develop an accurate dataset when even the experts disagree on boundaries. On this point, Bjoern Menze presented on the construction of the BraTS dataset, one of the largest brain imaging datasets. He fused data from several different annotators in order to create the “ground truth.” Since its creation, BraTS has held several challenges: one involved segmenting all the tumors with machine learning algorithms, and the most recent focused on predicting overall survival.
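The actual BraTS fusion procedure is more sophisticated, but as a simplified illustration of label fusion, here is a numpy sketch that merges binary segmentation masks from several annotators by per-pixel majority vote (the toy 2x2 masks are made up):

```python
import numpy as np

def majority_vote(masks):
    """Fuse binary segmentation masks from several annotators.

    masks: array of shape (n_annotators, H, W) with 0/1 labels per pixel.
    Returns the per-pixel majority label (ties go to 1 here).
    """
    masks = np.asarray(masks)
    votes = masks.sum(axis=0)                      # how many annotators said 1
    return (votes * 2 >= masks.shape[0]).astype(np.uint8)

# Three annotators disagree on a tiny 2x2 "image".
annotator_masks = np.array([
    [[0, 1], [1, 1]],
    [[0, 1], [0, 1]],
    [[1, 0], [1, 1]],
])
print(majority_vote(annotator_masks))  # pixel-wise majority of the three masks
```

More elaborate fusion schemes (e.g., weighting annotators by estimated reliability, as in STAPLE) follow the same basic pattern of combining per-pixel votes.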

Localization, detection, and classification

Accurately classifying diseases found in medical images was a prominent topic at the workshop. Detecting objects/ROIs and accurately classifying them is a challenging task in medical imaging. This is largely due to the variety of modalities (and dimensions) of medical images (e.g., X-ray, MRI, CT, and ultrasound), the size of the images, and (as with segmentation) limited, sometimes low-quality, annotated training data. As such, presenters showcased an interesting variety of techniques for overcoming these obstacles.

Ivana Igsum discussed deep learning techniques in cardiac imaging. In particular, she described her work on accurately detecting calcification in arteries, and how she and her team developed methods to automatically score calcium and categorize cardiovascular disease risk. To do this, her team used a multi-layer CNN approach.

Slide from Ivana’s talk (5:46)

Later in the day, Yaroslav Nikulin presented on the winning approach from the digital mammography challenge.

Posters

Natalia Antropova, Benjamin Huynh, and Maryellen Giger of the University of Chicago had an interesting poster on using an LSTM to perform breast DCE-MRI classification. This involved inputting 3D MRI volumes from multiple time points after a contrast agent was applied. They then extracted features from these images using a CNN, fed them to the LSTM, and output a prediction. Altogether, this poster provided an interesting application of an LSTM (and CNN) to handle “4D” medical imaging data.
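As a rough, self-contained sketch of this kind of pipeline (not the authors’ implementation: the “CNN” here is just a fixed linear projection, the data is random, and all sizes are toy values), the flow of per-time-point feature extraction into an LSTM cell looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(volume, W):
    """Stand-in for a CNN: flatten the 3D volume and apply a linear map."""
    return np.tanh(W @ volume.ravel())

def lstm_step(x, h, c, params):
    """One step of a standard LSTM cell (input/forget/cell/output gates)."""
    Wx, Wh, b = params
    z = Wx @ x + Wh @ h + b
    H = h.size
    i, f, g, o = z[:H], z[H:2*H], z[2*H:3*H], z[3*H:]
    i, f, o = 1/(1+np.exp(-i)), 1/(1+np.exp(-f)), 1/(1+np.exp(-o))
    c = f * c + i * np.tanh(g)
    h = o * np.tanh(c)
    return h, c

# Toy setup: 4 post-contrast time points of an 8x8x8 "MRI volume".
T, side, feat_dim, hidden = 4, 8, 16, 8
W_cnn = rng.standard_normal((feat_dim, side**3)) * 0.01
params = (rng.standard_normal((4*hidden, feat_dim)) * 0.1,
          rng.standard_normal((4*hidden, hidden)) * 0.1,
          np.zeros(4*hidden))
W_out = rng.standard_normal(hidden) * 0.1

h, c = np.zeros(hidden), np.zeros(hidden)
for t in range(T):
    volume = rng.standard_normal((side, side, side))  # one time point
    h, c = lstm_step(extract_features(volume, W_cnn), h, c, params)

prob = 1/(1+np.exp(-(W_out @ h)))  # final benign/malignant probability
print(round(float(prob), 3))
```

The point of the recurrence is that the final hidden state summarizes how the contrast enhancement evolves over time, which is exactly the “4D” signal a single-time-point CNN would miss.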

My poster focused on my current work in progress on using object detectors to accurately localize and classify multiple conditions in chest X-rays. My major goal is to investigate how well object detection algorithms perform on a limited dataset compared with multi-label classification CNNs trained on the entire dataset. I think object detectors have a lot of potential for localizing and classifying diseases/conditions in medical images if configured and trained properly; however, they are limited by the shortage of labeled bounding-box data, which is one of the reasons I found the following poster very interesting.

Hiba Chougrad and Hamid Zouaki had an interesting poster on transfer learning for breast imaging classification. In the abstract Convolutional Neural Networks for Breast Cancer Screening: Transfer Learning with Exponential Decay, they described testing several different transfer learning methods. For example, they compared fine-tuning a CNN pretrained on ImageNet with training from randomly initialized weights. In the end, they found the optimal technique was to fine-tune the layers with an exponentially decaying learning rate: the bottom layers (i.e., the ones closest to the softmax) get the highest learning rate, and the upper layers get the lowest. This intuitively makes a lot of sense, as the layers closest to the output tend to learn the most dataset-specific features. By using these and related techniques, we can (hopefully) develop accurate models without large datasets.
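As an illustration of the idea (with a made-up layer count, base rate, and decay factor, not the authors’ exact schedule), the per-layer learning rates can be computed like this:

```python
def layerwise_learning_rates(n_layers, base_lr=1e-3, decay=0.5):
    """Per-layer learning rates that decay exponentially away from the output.

    Layer 0 is the earliest layer; layer n_layers-1 sits next to the softmax
    and receives the full base_lr, matching the scheme described above.
    """
    return [base_lr * decay ** (n_layers - 1 - i) for i in range(n_layers)]

lrs = layerwise_learning_rates(5)
print(lrs)  # earliest layers get the smallest rates
```

In a framework like Keras or PyTorch, one would then assign each rate to its layer’s parameter group; the generic, pretrained early-layer features are barely perturbed while the task-specific layers adapt freely.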

Reconstruction and generation

Heartflow

I am usually not impressed by industry pitches touting how great a product is and how it will “revolutionize [insert industry].” However, Heartflow and their DeepLumen blood vessel segmentation algorithm definitely impressed me. The product reduced unnecessary angiograms by 83% and is FDA approved. I will not go into extended detail here, but I think Heartflow is a good example of machine learning having an impact in a real-world environment.

Two of the other presenters touched on reconstruction as well. Igsum’s talk (mentioned earlier) discussed a method for reconstructing a routine-dose CT from a low-dose CT. Daniel Rueckert of Imperial College described how ML-based reconstruction could enable imaging with more time points.

Posters

One of the posters I found particularly interesting was MR-to-CT Synthesis using Cycle-Consistent Generative Adversarial Networks. In this work, the authors (Wolterink et al.) took the popular CycleGAN algorithm and used it to convert MRI images into CT images. This is a potentially very useful application that could spare patients from undergoing multiple imaging procedures. Additionally, since CTs expose patients to radiation, it could also reduce radiation exposure.
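The cycle-consistency idea at the heart of CycleGAN can be sketched in a few lines of numpy (the lambda “generators” below are toy stand-ins for the real MR-to-CT and CT-to-MR networks, which in practice are learned):

```python
import numpy as np

def cycle_consistency_loss(x, G, F):
    """L1 cycle loss ||F(G(x)) - x||_1.

    In CycleGAN this term keeps an unpaired MR->CT translation anatomically
    faithful: translating to CT and back should recover the original MR.
    """
    return np.mean(np.abs(F(G(x)) - x))

G = lambda mr: mr * 2.0 + 1.0      # toy "MR -> CT" generator
F = lambda ct: (ct - 1.0) / 2.0    # toy "CT -> MR" generator (exact inverse)
mr_image = np.random.rand(64, 64)
print(cycle_consistency_loss(mr_image, G, F))  # ~0: the cycle is consistent
```

During training this loss is added to the usual adversarial losses, penalizing generator pairs whose round trip distorts the input.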

Image from MR-to-CT synthesis article

There were also posters by Virdi et al. on Synthetic Medical Images from Dual Generative Adversarial Networks and by Mardani et al. on Deep Generative Adversarial Networks for Compressed Sensing (GANCS) Automates MRI.

Tools and platforms

Several speakers spoke about new tools aimed at making medical image analysis with machine learning more accessible to both clinicians and ML researchers. Jorge Cardoso described NiftyNet and how it enables researchers to develop medical imaging models more easily. NiftyNet is built on TensorFlow and includes many simple-to-use modules for loading high-dimensional inputs.

Poster from Makkie et al. on their neuroimaging platform

Also on the tools side, G. Varoquaux presented on Nilearn, a Python module for neuroimaging data built on top of scikit-learn. Just as scikit-learn seeks to make ML accessible to people with basic programming skills, Nilearn aims to do the same for brain imaging. The only systems-related poster was from Makkie and Liu of the University of Georgia; it focused on their brain-initiative platform for neuroimaging and how it fuses several different technologies, including Spark, AWS, and TensorFlow. Finally, the DLTK toolkit had its own poster at the conference. Altogether, there were some really interesting toolkits and platforms that should help make medical image analysis with machine learning more accessible to everyone.

Other talks

Wiro Niessen had an interesting presentation on the confluence of biomedical imaging and genetic data. In the talk he described how large genetic and imaging datasets could be combined to detect biomarkers in the images. The synthesis of the two areas could also help detect diseases early and identify more targeted treatments.

Announcements

  • At the end of her talk, Ivana Igsum announced that the first Medical Imaging with Deep Learning (MIDL) event is taking place in Amsterdam in July.
  • I’m continuously adding new papers, conferences, and tools to my machine learning healthcare curated list. However, it’s a big job so make a PR and contribute today!
  • I’m starting a new Slack channel dedicated to machine learning in healthcare. So if you are interested feel free to join.
  • My summary of Machine Learning for Healthcare (ML4H) at NIPS will be out in the next few weeks. It will cover everything from hospital operations (like LOS forecasting and hand hygiene) to mining electronic medical records, drug discovery, and analyzing genomic data. So stay tuned.

How to Successfully Incorporate Undergraduate Researchers Into a Complex Research Program at a Large Institution

Rebecca B. Weldon & Valerie F. Reyna

Human Neuroscience Institute, Department of Human Development, Cornell University, Ithaca, NY 14850.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4521737/

Working as a team is almost always better than working alone. Recently, I mentioned this thought to some excellent Ph.D. students; my goal was to work with them and produce more work with a larger impact for our academic community. Unfortunately, as we all know, Ph.D. researchers are often too busy with their own tasks to put effort into additional work. In addition, research funding concerns can also restrict cooperation. In this article, the authors discuss the importance of cooperative research and how to successfully incorporate undergraduate researchers into a complex research program at a large institution. I believe this article will help any Ph.D. student or researcher who wants to get more high-impact work done effectively.

Initial screening of potential undergraduate research assistants: Making sure it is a good fit

The first point is to make sure a potential undergraduate research assistant is a good fit for your team. The very first step in recruiting is to find students who are genuinely interested in being part of scientific research. Then we send an initial screening survey to any interested students, which usually includes basic questions about the student along with questions about career ambitions and extracurricular activities. We need to know why the student thinks he or she would be a good fit for one (or more) of our research teams. To get a better sense of whether the student is a good match for the lab, the next step is to have our graduate students talk with the undergraduate. Lastly, we recommend the student to the director of our lab for a final interview.

How to Install Anaconda + Keras/TensorFlow (Ubuntu) / PyTorch

Recently, repeated environment installations pushed me to write a tutorial to save time. In this tutorial, I would like to give a quick and straightforward guide that can lead most of us to an enjoyable start in deep learning. We assume you have already installed CUDA.

You also need to install the Nvidia drivers: http://www.linuxandubuntu.com/home/how-to-install-latest-nvidia-drivers-in-linux

Step 1: download and install Anaconda

Go to https://www.anaconda.com/download/

a) choose your system from the RED rectangle. Here, I choose Ubuntu (Linux)

b) get the download link from the BLUE rectangle by right-clicking

c) wget https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh

This command downloads the Anaconda installation package.

d) bash Anaconda-latest-Linux-x86_64.sh 

This command starts the installation. Note: replace “latest” with your Anaconda version; in this case: bash Anaconda3-5.2.0-Linux-x86_64.sh

Note: answer yes to all questions during installation except the last one:

Do you wish to proceed with the installation of Microsoft VSCode? [yes|no]

>>> no

Step 2: create a conda environment

a) cd anaconda3/bin

export PATH=/location/anaconda3/bin:$PATH

In my case: export PATH=/home/pengliu/anaconda3/bin:$PATH

b) conda create -p /location/yourenvname python=x.x

Note: if multiple users are creating their own envs separately, please make sure each env name is unique. Otherwise, conflicts may occur.

In my case: conda create -p /home/pliu/pliupy3 python=3.6

Step 3: activate your new environment

source activate /location/pliupy3

Step 4: install TensorFlow

conda install tensorflow-gpu
conda install -c nvidia cuda-toolkit

Note: to pin a specific toolkit version, e.g.: conda install -c nvidia cuda-toolkit=10.1.168

Step 5: conda install keras

Step 6: install other packages

conda install numpy

conda install tqdm

conda install pillow

Now the installation is done, and it should work well in most cases.

To test the environment:

(pliupy3) user@Server:~/anaconda3/bin$ python
>>> import tensorflow as tf
>>> print(tf.__version__)
1.12.0

To install PyTorch on Linux:

conda install pytorch torchvision -c pytorch
conda install -c anaconda scikit-image
conda install scikit-learn

pip install python-gist
conda install seaborn

conda install altair
conda install altair_saver
pip3 install torchsampler
pip3 install torchsummary
conda install -c conda-forge opencv
conda install -c conda-forge pytorch-gpu

Others

Use pip like so:

pip3 install click
pip3 install scipy
pip3 install pytables
pip3 install torch
pip3 install batchup

Note: the PyPI package for PyTorch is named torch, not pytorch.


GLIBCXX_3.4.20 not found

sudo apt-get install libstdc++6
sudo add-apt-repository ppa:ubuntu-toolchain-r/test 
sudo apt-get update
sudo apt-get upgrade


GLIBCXX_3.4.21 not defined in file libstdc++.so.6 with link time reference

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/anaconda3/lib

Conda: command not found

Cannot use conda in the terminal after installation? Just add Anaconda’s bin directory to your PATH:

for Anaconda 2 :

  1. export PATH=~/anaconda2/bin:$PATH

for Anaconda 3 :

  1. export PATH=~/anaconda3/bin:$PATH

How to change default Anaconda python environment

https://stackoverflow.com/questions/28436769/how-to-change-default-anaconda-python-environment

What is the step-by-step procedure to fix “The following packages have unmet dependencies”?

ImportError: /lib64/libm.so.6: version `GLIBC_2.23' not found

This is due to the Python version: downgrade it to Python 3.6.8 with conda install python==3.6.8

“‘1type’ as a synonym of type is deprecated; in a future version of numpy…”

This is due to the numpy version. Downgrade it to 1.16 (e.g., conda install numpy==1.16).