Before you begin
Deep learning (DL) has proven to be extremely effective in addressing a range of major biological challenges, including predicting protein structure,4 DNA sequencing,5 and drug discovery.6 The application of DL has expanded into the microbiological field,7 particularly in cellular image analysis. In traditional cellular image analysis, there are several challenges that need to be addressed.
One challenge is that parasites exhibit markedly different morphologies across the stages of their complex life cycles,8 and the shape and size of the cells can vary considerably,9 making the classification and detection of different parasites and cells difficult. Additionally, obtaining high-quality, in-focus microscopic images can be challenging7 because of factors such as the diffraction barrier and defects in optical systems.10
DL-based cellular image analysis can solve these problems to some extent. However, the black-box nature of DL often leads to unexplainable results. Incorporating the knowledge and insights of experts into the modeling process can help to mitigate this problem, but most DL-based methods have not considered the importance of knowledge from microbiologists in cellular image analysis.11,12 These methods are also highly specialized and lack the detailed instructions that most microbiologists need to apply them. As a result, it can be challenging to develop accurate and easy-to-use DL models for cellular image analysis in microbiology.
To address these challenges, this protocol introduces a knowledge-integrated DL framework for cellular image analysis in microbiology. Building upon the previous studies of our group,1,2,3 this protocol provides a comprehensive guide to implementing a wide spectrum of tasks (i.e., classification, detection, and reconstruction) in cellular image analysis. The following sections describe how the DL models integrate human expert knowledge and provide step-by-step instructions accessible to both beginners and professionals.
Description of the methods
This protocol introduces three DL models integrated with knowledge from microbiologists, namely deep cycle transfer learning (DCTL),1 geometric-feature spectrum ExtremeNet (GFS-ExtremeNet),2 and correcting out-of-focus microscopic images (COMI).3 These models are designed for the classification, detection, and reconstruction tasks of cellular images in microbiology.
DCTL and COMI are both based on cycle generative adversarial networks (CycleGAN),13 as illustrated in Figure 1A. CycleGAN comprises two sets of generator-discriminator structures; generators and discriminators are different types of neural networks with distinct functions. Generators transform the input images into a different style, while discriminators identify whether an image is synthesized or not. Unlike traditional GANs, the cycle network topology does not require one-to-one pairing of source images (DomainX) and target images (DomainY), and DCTL likewise works with unpaired images.
In DCTL, X represents the morphologically similar macroscopic objects, while Y denotes the parasites to be recognized. GeneratorX→Y transforms the macroscopic images in DomainX into their corresponding parasite images, SyntheticY. Then, GeneratorY→X restores the images in SyntheticY back to the original macroscopic images, RestorationX. A second cycle performs the same process in reverse, starting from DomainY. Finally, the discriminators distinguish between the generated images and the original images, and this feedback helps the generators improve the quality of the generated images.
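The two cycles described above can be illustrated with a minimal sketch. It assumes the L1 cycle-consistency term used in the original CycleGAN formulation and replaces the neural-network generators with toy functions on flattened lists of pixel values. The names `cycle_consistency_loss`, `brighten`, `darken`, and `darken_badly` are hypothetical and are not taken from the protocol's code repository; the adversarial (discriminator) term is omitted.

```python
from typing import Callable, List

Image = List[float]  # a flattened toy "image"; real models operate on tensors


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(
    x: Image,
    y: Image,
    gen_x_to_y: Callable[[Image], Image],
    gen_y_to_x: Callable[[Image], Image],
) -> float:
    """Sum of the reconstruction errors of both cycles (assumed L1 form)."""
    # Forward cycle: DomainX -> SyntheticY -> RestorationX
    synthetic_y = gen_x_to_y(x)
    restoration_x = gen_y_to_x(synthetic_y)
    # Reverse cycle: DomainY -> SyntheticX -> RestorationY
    synthetic_x = gen_y_to_x(y)
    restoration_y = gen_x_to_y(synthetic_x)
    return l1_distance(restoration_x, x) + l1_distance(restoration_y, y)


# Hypothetical stand-ins for GeneratorX->Y and GeneratorY->X.
def brighten(img: Image) -> Image:
    return [p + 0.5 for p in img]


def darken(img: Image) -> Image:
    return [p - 0.5 for p in img]


def darken_badly(img: Image) -> Image:
    # An imperfect inverse of `brighten`: the cycle no longer closes.
    return [p - 0.25 for p in img]


x = [0.0, 0.25, 0.5, 1.0]   # toy macroscopic image (DomainX)
y = [0.5, 0.75, 1.0, 1.5]   # toy parasite image (DomainY)

perfect_loss = cycle_consistency_loss(x, y, brighten, darken)
imperfect_loss = cycle_consistency_loss(x, y, brighten, darken_badly)
print(perfect_loss, imperfect_loss)
```

When the two generators are exact inverses, both restorations match their inputs and the loss is zero; any mismatch raises the loss, which is the signal that pushes the generators toward consistent translations without requiring paired images.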
Building upon the backbone of CycleGAN, DCTL incorporates human expert knowledge through two supplementary feature extractors, as shown in Figure 1B. Using four groups of extreme points, DCTL calculates the microscopic and macroscopic correlation (MMC)1 to find the morphologically similar macroscopic objects of each parasite as a quantitative knowledge representation (Figure 2A). CycleGAN then learns the morphological information from these two image domains and teaches the supplementary feature extractors to identify different parasites. Each feature extractor is trained on both original images and synthetic images using a cross-entropy loss function.14 Once model training is complete, the supplementary Feature ExtractorY can be applied to classify the four types of parasites.
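As a rough illustration of how a feature extractor can be trained on original and synthetic images together, the sketch below computes a softmax cross-entropy loss over a combined batch for a four-class problem. It is a simplified sketch, not the protocol's implementation: the equal weighting of original and synthetic samples, the function names, and the example logits are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (class logits, true class index)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vector of class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])


def extractor_loss(original: Sequence[Sample], synthetic: Sequence[Sample]) -> float:
    """Mean cross-entropy over original and synthetic samples.

    Assumption: both sources are weighted equally.
    """
    samples = list(original) + list(synthetic)
    return sum(cross_entropy(lg, lb) for lg, lb in samples) / len(samples)


def predict(logits: Sequence[float]) -> int:
    """Index of the highest-scoring class (one of four parasite types)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical logits a feature extractor might output for four classes.
original_batch = [([3.0, 0.1, 0.2, 0.1], 0), ([0.2, 2.5, 0.1, 0.3], 1)]
synthetic_batch = [([0.1, 0.2, 2.8, 0.0], 2), ([0.3, 0.1, 0.2, 2.2], 3)]

loss = extractor_loss(original_batch, synthetic_batch)
uniform_loss = cross_entropy([0.0, 0.0, 0.0, 0.0], 2)  # equals log(4)
print(loss, uniform_loss, predict([0.1, 2.0, -1.0, 0.3]))
```

An untrained extractor that assigns equal scores to all four classes yields a loss of log(4); training lowers the loss as the extractor grows more confident in the correct class on both real and synthetic images.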
Key resources table
|REAGENT or RESOURCE|SOURCE|IDENTIFIER|
|Software and algorithms|
|Codes and Datasets|GitHub|https://github.com/ruijunfeng/A-knowledge-integrated-deep-learning-framework-for-cellular-image-analysis-in-parasite-microbiology|
|Computing Platform: This protocol was performed on a computer with an Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40 GHz processor, two NVIDIA 2080Ti graphics cards, and 32 GB of memory. A computer with more graphics cards is recommended to accelerate training and evaluation.|Windows 10|https://www.microsoft.com/en-au/software-download/windows10|