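CellProfiler Analyst (CPA) supports interactive exploration and analysis of multidimensional data, particularly data from high-throughput, image-based experiments analyzed by its companion image analysis software, CellProfiler. Its most popular feature is a supervised machine learning system ("Classifier") that can be trained to recognize complex and subtle phenotypes for automated scoring of millions of cells.
Main tools in CellProfiler Analyst:
- Image Gallery displays images (full images and individual cells) with a variety of filtering options, and can be used interactively with the other tools.
- Classifier classifies multiple phenotypes at the cell and field-of-view level using popular supervised machine learning models.
- Plate Viewer displays data according to the spatial layout of the experiment, such as a multi-well plate or microarray.
- Scatter plots, histograms, and density plots display numeric data.
- Table Viewer displays numeric and text data as a spreadsheet; clicking a data point shows the corresponding image.
- Normalization Tool creates a new data table with normalized and feature-selected columns.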
I. Preliminary data requirements
CPA requires access to the following data sources:
- An image table and an object table containing measurements and metadata
These may reside in a MySQL or SQLite database or in a set of comma-separated value (CSV) files. A MySQL database is recommended, though you may need to consult with your local information technology staff to set up a database server. See section II.B for more information. The tables must contain a few columns that CellProfiler Analyst needs to access images and data properly: an image ID column to link the per-image and per-object tables, file path and file name columns to specify where images are stored, and X and Y location columns to specify where each object resides within the image. These configuration details are specified in a properties file. Note: if image classification is specified in the properties file, an object table is not required. See section III.
- The images that were analyzed to generate the above-mentioned tables
These can be stored either locally or remotely and accessed via HTTP. The directory structure does not matter as long as the file paths stored in the image table point to the correct images. Throughout CPA, the term image is meant to include all image data associated with an analyzed field-of-view. An image in this sense usually includes several individual monochromatic images that show the different wavelengths (channels) as well as images that show outlines of identified objects. You can specify any number of image channels (including, for example, outlines of objects that resulted from image processing) by adding path and filename columns to the image table of your database for each channel. CPA currently requires image files to be monochromatic; several individual channels can be combined into a color image for viewing within the software. CPA currently supports the following image file types: BMP, CUR, DCX, Cellomics DIB, FLI, FLC, FPX, GBR, GD, GIF, ICO, IM, IMT, IPTC/NAA, JPG/JPEG, MCIDAS, MIC, MSP, PCD, PCX, PIXAR, PNG, PPM, PSD, SGI, SPIDER, TGA, TIF/TIFF, WAL, XBM, XPM, XV Thumbnails.
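Because CPA relies on the path and file name columns pointing at the right files, it can help to check them before loading a dataset. The following is a minimal sketch, not part of CPA itself: the function names, the row format (one dictionary per image-table row), and the column names passed in are all assumptions made for illustration. Remote (HTTP) locations are only assembled, not fetched.

```python
import os

REMOTE_PREFIXES = ("http://", "https://")


def resolve_image_path(path_value, file_value):
    """Join a path-column value and a file-name-column value into one location."""
    if path_value.startswith(REMOTE_PREFIXES):
        # Remote images: always join with forward slashes.
        return path_value.rstrip("/") + "/" + file_value.lstrip("/")
    return os.path.join(path_value, file_value)


def find_missing_local_images(rows, channel_columns):
    """Return the local image locations that do not exist on disk.

    rows: iterable of dicts, one per image-table row (hypothetical format).
    channel_columns: list of (path_column, file_column) pairs, one per channel.
    """
    missing = []
    for row in rows:
        for path_col, file_col in channel_columns:
            location = resolve_image_path(row[path_col], row[file_col])
            if location.startswith(REMOTE_PREFIXES):
                continue  # remote files are not checked in this sketch
            if not os.path.isfile(location):
                missing.append(location)
    return missing
```

An empty list from `find_missing_local_images` means every local file referenced by the checked rows was found.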
Note
While designed for high-throughput, image-based biological experiments, CellProfiler Analyst is also useful for the exploration of other multi-dimensional data sets, particularly when data points are linked to images.
I.A Example image table
The image table requires one column for a unique image ID and a pair of columns for each channel represented in the images: one column for the image path, and one column for the image file name (which may include some part of the path to the image, such as the subdirectory that contains the file). These columns do not need to have specific names; you will indicate which column names correspond to image ID, image path, and image filename when configuring the properties file. The remaining columns can contain measurements and metadata about each image.
Note
While MySQL and SQLite support diverse column names, CPA will not handle column names that contain commas. In general, we advise that you use only alphanumeric characters and underscores in the names of your table columns.
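The naming advice above can be checked mechanically. This is a small sketch under the assumption that "safe" means only ASCII letters, digits, and underscores; the function name and the sample column names are hypothetical.

```python
import re

# Only ASCII letters, digits, and underscores, as advised in the note above.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def unsafe_column_names(column_names):
    """Return the column names that contain anything other than
    alphanumeric characters and underscores (commas included)."""
    return [name for name in column_names if not _SAFE_NAME.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical column names, used only to show the call.
    print(unsafe_column_names(["ImageNumber", "Cell,Area", "Image_PathName_GFP"]))
```

Any name the function returns should be renamed before the table is used with CPA.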
An image table for an experiment involving cells imaged for GFP and Hoechst would have two channels and would look something like this:
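As a rough illustration of that layout, the sketch below builds a two-channel image table in an in-memory SQLite database using Python's built-in sqlite3 module. Everything specific in it is an assumption for illustration: the table name per_image, the column names, the paths, the file names, and the well labels are invented and do not come from a real experiment.

```python
import sqlite3


def build_example_image_table():
    """Create an in-memory SQLite database holding a hypothetical image table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE per_image (
            ImageNumber            INTEGER PRIMARY KEY,  -- unique image ID
            Image_PathName_GFP     TEXT,  -- path column, GFP channel
            Image_FileName_GFP     TEXT,  -- file name column, GFP channel
            Image_PathName_Hoechst TEXT,  -- path column, Hoechst channel
            Image_FileName_Hoechst TEXT,  -- file name column, Hoechst channel
            Image_Metadata_Well    TEXT   -- example metadata column
        )
        """
    )
    # Invented rows: one row per field of view, one path/file pair per channel.
    rows = [
        (1, "/data/images", "A01_gfp.tif", "/data/images", "A01_hoechst.tif", "A01"),
        (2, "/data/images", "A02_gfp.tif", "/data/images", "A02_hoechst.tif", "A02"),
    ]
    conn.executemany("INSERT INTO per_image VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_example_image_table()
    for row in conn.execute("SELECT * FROM per_image ORDER BY ImageNumber"):
        print(row)
```

Each channel contributes exactly one path column and one file name column; measurement and metadata columns can be appended freely after them.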
I.B Example object table
The object table requires four columns: a foreign key image ID column that corresponds to the image ID in the image table, a unique object ID column, a column for the object x-location, and a column for the object y-location. CPA expects the location columns to correspond to the x-y pixel coordinates of the objects’ centroids; the corresponding column names that are produced by CellProfiler depend on the name of the objects; for example, if nuclei were measured, the column names would be Nuclei_Location_Center_X and Nuclei_Location_Center_Y. Again, these columns do not need to have specific names; you indicate which column names correspond to these functionalities when configuring the properties file. Additional columns in this table typically contain measurements for each object, but are completely up to the user.
Note
While MySQL and SQLite support diverse column names, CPA will not handle column names that contain commas. In general, we advise that you use only alphanumeric characters and underscores in the names of your table columns.
An object table for the same experiment (cells imaged for GFP and Hoechst) would look something like this:
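The four required columns can be sketched in the same way with Python's built-in sqlite3 module. The table names, the measurement column Nuclei_AreaShape_Area, and all values below are illustrative assumptions. The sketch also assumes that object IDs are numbered within each image, so the pair (ImageNumber, ObjectNumber) identifies an object; the helper orphan_objects is a hypothetical check that every object row links to an existing image row.

```python
import sqlite3


def build_example_tables():
    """Create a hypothetical image table and object table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE per_image (ImageNumber INTEGER PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE per_object (
            ImageNumber              INTEGER NOT NULL,  -- links to per_image
            ObjectNumber             INTEGER NOT NULL,  -- object ID within the image
            Nuclei_Location_Center_X REAL,              -- centroid x, in pixels
            Nuclei_Location_Center_Y REAL,              -- centroid y, in pixels
            Nuclei_AreaShape_Area    REAL,              -- example measurement
            PRIMARY KEY (ImageNumber, ObjectNumber)
        )
        """
    )
    conn.executemany("INSERT INTO per_image VALUES (?)", [(1,), (2,)])
    # Invented object rows for illustration only.
    objects = [
        (1, 1, 120.5, 88.0, 310.0),
        (1, 2, 402.2, 251.7, 295.0),
        (2, 1, 64.9, 190.3, 330.0),
    ]
    conn.executemany("INSERT INTO per_object VALUES (?, ?, ?, ?, ?)", objects)
    conn.commit()
    return conn


def orphan_objects(conn):
    """Return (ImageNumber, ObjectNumber) pairs with no matching image row."""
    query = """
        SELECT o.ImageNumber, o.ObjectNumber
        FROM per_object AS o
        LEFT JOIN per_image AS i ON o.ImageNumber = i.ImageNumber
        WHERE i.ImageNumber IS NULL
        ORDER BY o.ImageNumber, o.ObjectNumber
    """
    return conn.execute(query).fetchall()
```

An empty result from `orphan_objects` indicates that the image ID column links the two tables consistently.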
II. Installation and getting started
CPA releases
All CellProfiler-Analyst releases can be found here
II.C Using the example dataset
Download the CPA example dataset from http://cellprofileranalyst.org/ and unzip it to create the cpa_example directory. This directory contains:
- example.properties – Configuration file for CPA (see section III).
- MyTrainingSet.txt – Example training set file to be used in the Classifier (see section V).
- images/ – Images from the screen used in the example.
- per_image.csv – Comma Separated Values file for image data. This file was exported by CellProfiler’s ExportToDatabase module.
- per_object.csv – Comma Separated Values file for object data. This file was exported by CellProfiler’s ExportToDatabase module.
- example_SETUP.SQL – Used by CPA to create an internal database (SQLite). It can also be used to create a MySQL database. This file was exported by CellProfiler’s ExportToDatabase module.
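After unzipping, a quick check that the directory holds the entries listed above can save a confusing start-up error. This is a minimal sketch; the function name is hypothetical, and the check only tests that each entry exists, not that its contents are valid. On case-sensitive file systems the names must match exactly.

```python
import os

# Entries listed above for the cpa_example directory.
EXPECTED_ENTRIES = [
    "example.properties",
    "MyTrainingSet.txt",
    "images",
    "per_image.csv",
    "per_object.csv",
    "example_SETUP.SQL",
]


def missing_example_entries(directory):
    """Return the expected entries that are absent from the given directory."""
    return [
        name
        for name in EXPECTED_ENTRIES
        if not os.path.exists(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    # "cpa_example" is the directory name created by unzipping the dataset.
    print(missing_example_entries("cpa_example"))
```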
Run the CPAnalyst file created by the install process above. A dialog will appear asking you to select a properties file. Navigate to the cpa_example directory and select the example.properties file. You’re now ready to experiment with CellProfiler Analyst!
If you repost this article, please credit the source: https://www.ouq.net/2350.html