STAR Protocols: A deep learning framework for analyzing cellular images in microbiology

Before you begin

Deep learning (DL) has proven to be extremely effective in addressing a range of major biological challenges, including predicting protein structure,4 DNA sequencing,5 and drug discovery.6 The application of DL has expanded into the microbiological field,7 particularly in cellular image analysis. Traditional cellular image analysis faces several challenges.

One challenge is that parasites display distinct morphologies at different stages of their complex life cycles,8 and the shape and size of cells can vary considerably,9 which makes the classification and detection of different parasites and cells difficult. Additionally, obtaining high-quality, in-focus microscopic images can be challenging7 because of factors such as the diffraction barrier and defects in optical systems.10

DL-based cellular image analysis can solve these problems to some extent. However, the black-box nature of DL often leads to unexplainable results. Incorporating the knowledge and insights of experts into the modeling process can mitigate this problem, but most DL-based methods have not considered the importance of knowledge from microbiologists in cellular image analysis.11,12 These methods are also highly specialized and lack the detailed instructions that most microbiologists would need. As a result, it can be challenging to develop accurate and easy-to-use DL models for cellular image analysis in microbiology.

To address these challenges, this protocol introduces a knowledge-integrated DL framework for cellular image analysis in microbiology. Building upon the previous studies of our group,1,2,3 this protocol provides a comprehensive guide to implementing a wide spectrum of tasks (i.e., classification, detection, and reconstruction) in cellular image analysis. The following sections describe how the DL models integrate human expert knowledge and provide step-by-step instructions accessible to both beginners and professionals.

Description of the methods

This protocol introduces three DL models integrated with knowledge from microbiologists, namely deep cycle transfer learning (DCTL),1 geometric-feature spectrum ExtremeNet (GFS-ExtremeNet),2 and correcting out-of-focus microscopic images (COMI).3 These models are designed for the classification, detection, and reconstruction tasks of cellular images in microbiology.

DCTL and COMI are both based on cycle generative adversarial networks (CycleGAN),13 as illustrated in Figure 1A. CycleGAN comprises two sets of generator-discriminator structures, which are different types of neural networks with distinct functionalities. Generators transform the input images into different styles, while discriminators identify whether the images are synthesized or not. Unlike traditional GANs, the cycle network topology does not require one-to-one pairing of source images (DomainX) and target images (DomainY); DCTL is one such case, because its two image domains are unpaired.

In DCTL, X represents the morphologically similar macroscopic objects, while Y denotes the parasites to be recognized. GeneratorXY transforms the macroscopic images in DomainX into their corresponding parasite images, SyntheticY. Then, GeneratorYX restores the images in SyntheticY back to the original macroscopic images, RestorationX. Another cycle performs the same process in reverse. Finally, the discriminators distinguish the generated images from the original images, and this feedback helps the generators improve the quality of the generated images.
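The two cycles described above can be sketched numerically. This is a minimal illustration only, not the protocol's implementation: the two generator functions are hypothetical stand-ins (fixed pixel-wise maps rather than trained networks), and the L1 cycle-consistency loss is the one commonly used with CycleGAN, assumed here because the text does not state the loss.

```python
import numpy as np

def generator_xy(x):
    # Hypothetical stand-in for GeneratorXY: a fixed, invertible pixel-wise map.
    return 0.5 * x + 0.1

def generator_yx(y):
    # Hypothetical stand-in for GeneratorYX: the exact inverse of generator_xy.
    return (y - 0.1) / 0.5

def cycle_consistency_loss(real, restored):
    # Mean absolute (L1) difference between a batch and its restoration.
    return float(np.mean(np.abs(real - restored)))

rng = np.random.default_rng(0)
domain_x = rng.random((2, 8, 8, 3))  # toy batch standing in for macroscopic images
domain_y = rng.random((2, 8, 8, 3))  # toy batch standing in for parasite images

# Forward cycle: DomainX -> SyntheticY -> RestorationX
synthetic_y = generator_xy(domain_x)
restoration_x = generator_yx(synthetic_y)

# Reverse cycle: DomainY -> SyntheticX -> RestorationY
synthetic_x = generator_yx(domain_y)
restoration_y = generator_xy(synthetic_x)

total_cycle_loss = (cycle_consistency_loss(domain_x, restoration_x)
                    + cycle_consistency_loss(domain_y, restoration_y))
print(total_cycle_loss)
```

Because the stand-in generators are exact inverses, the loss is numerically zero here; with trained networks, this term would penalize restorations that drift from the original images.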

Building upon the backbone of CycleGAN, DCTL incorporates human expert knowledge through two supplementary feature extractors, as shown in Figure 1B. Using four groups of extreme points, it calculates the microscopic and macroscopic correlation (MMC)1 to find the morphologically similar macroscopic objects of each parasite as a quantitative knowledge representation (Figure 2A). CycleGAN then learns the morphological information from these two image domains and teaches the supplementary feature extractors to identify different parasites. Each feature extractor is trained on both original images and synthetic images using a Cross-Entropy loss function.14 Once the model training is completed, the supplementary Feature ExtractorY can be applied to classify the four types of parasites.
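As a rough sketch of the classification objective, the cross-entropy loss over four parasite classes can be written in NumPy as follows. The logits and labels are made-up values, and pooling original and synthetic images into one batch is an assumption for illustration; the text does not specify how the two sources are weighted.

```python
import numpy as np

NUM_CLASSES = 4  # the four parasite types mentioned in the text

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-probability assigned to the true class.
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

# Hypothetical feature-extractor outputs (logits) for original images.
logits_original = np.array([[4.0, 0.0, 0.0, 0.0],
                            [0.0, 3.0, 0.0, 0.0]])
labels_original = np.array([0, 1])

# Hypothetical outputs for synthetic images: an uninformative, uniform guess.
logits_synthetic = np.zeros((2, NUM_CLASSES))
labels_synthetic = np.array([2, 3])

# Assumption: both sources are simply pooled into one batch.
loss = cross_entropy(np.concatenate([logits_original, logits_synthetic]),
                     np.concatenate([labels_original, labels_synthetic]))
print(loss)
```

A uniform prediction over four classes gives a loss of ln 4, which serves as a baseline: a feature extractor that has learned anything useful should score below it.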

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Software and algorithms
Anaconda | Anaconda v2.4.0 | https://www.anaconda.com/
Spyder | Spyder v5.3.3 | https://www.spyder-ide.org/
Python3 | Python v3.7.16 | https://www.python.org/
Tensorflow | Tensorflow v1.15.0 | https://www.tensorflow.org/
Tensorboard | Tensorboard v1.15.0 | https://pypi.org/project/tensorboard/
Tensorflow-estimator | Tensorflow-estimator v1.15.1 | https://pypi.org/project/tensorflow-estimator/
Pytorch | Pytorch v1.2.0 | https://pytorch.org/
Torchvision | Torchvision v0.4.0 | https://pytorch.org/
Keras | Keras v2.2.4 | https://keras.io/
Keras-contrib | Keras-contrib v2.0.8 | https://github.com/keras-team/keras-contrib
H5py | H5py v2.10.0 | https://www.h5py.org/
Scikit-learn | Scikit-learn v1.0.2 | https://scikit-learn.org/stable/
Matplotlib | Matplotlib v3.5.3 | https://matplotlib.org/
Scikit-image | Scikit-image v0.17.2 | https://scikit-image.org/
Opencv-python | Opencv-python v4.6.0.66 | https://pypi.org/project/opencv-python/
Pycocotools | Pycocotools v2.0.5 | https://pypi.org/project/pycocotools/2.0.5/
Tqdm | Tqdm v4.64.1 | https://tqdm.github.io/releases/
Pandas | Pandas v1.3.5 | https://pandas.pydata.org/
Numpy | Numpy v1.21.6 | https://numpy.org/
Protobuf | Protobuf v3.19.0 | https://pypi.org/project/protobuf/
Tensorflow-gpu | Tensorflow-gpu v1.15.0 | https://www.tensorflow.org/
cuDNN | cuDNN v7.6.5 | https://developer.nvidia.com/cudnn
Cudatoolkit | Cudatoolkit v10.0.130 | https://developer.nvidia.com/cuda-toolkit

Other
Codes and Datasets | Github | https://github.com/ruijunfeng/A-knowledge-integrated-deep-learning-framework-for-cellular-image-analysis-in-parasite-microbiology
Computing Platform: This protocol was performed on a computer with an Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40 GHz processor, two NVIDIA 2080Ti graphics cards, and 32 GB of memory. A computer with more graphics cards is recommended to accelerate training and evaluation. | Windows 10 | https://www.microsoft.com/en-au/software-download/windows10
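Before running the protocol, it may help to confirm that the installed packages match the versions listed above. The following is a minimal sketch rather than part of the original protocol: the helper names are hypothetical, only a subset of the table is checked, and the pip distribution name for Pytorch is assumed to be `torch`.

```python
# Subset of the pinned versions from the key resources table.
REQUIRED = {
    "tensorflow": "1.15.0",
    "torch": "1.2.0",          # pip distribution name for Pytorch
    "torchvision": "0.4.0",
    "keras": "2.2.4",
    "numpy": "1.21.6",
    "opencv-python": "4.6.0.66",
}

def installed_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        from importlib import metadata  # standard library from Python 3.8
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None
    except ImportError:
        import pkg_resources  # setuptools fallback for Python 3.7
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None

def find_mismatches(required, installed):
    """Return {package: (required, installed_or_None)} for each unmet pin."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches

installed = {name: installed_version(name) for name in REQUIRED}
for name, (wanted, found) in find_mismatches(REQUIRED, installed).items():
    print(f"{name}: required {wanted}, found {found or 'not installed'}")
```

The check uses exact string equality, which is deliberately strict; a nearby patch release may still work, but that would need to be verified separately.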
