Single-molecule Fluorescence In Situ Hybridization (smFISH)

Expected run time for this demo: a few minutes.

Single-molecule Fluorescence In Situ Hybridization (smFISH) is a technique used to visualize individual RNA molecules of a specific gene.

In this tutorial we will count single mRNA molecules of the housekeeping gene MDN1 in the model organism S. cerevisiae.

As well as detecting the spots, we will also segment the nucleus as a reference channel from the DAPI staining.

Goals

  • Detect spots with low signal-to-noise ratio

  • Segment large globular-like structures as reference channel (nucleus)

Preliminary steps

The first step with every new dataset is to segment the objects. These typically are the single cells, but it can be any object where you want to detect spots.

While you can detect spots in the entire image, it is highly recommended to identify regions of interest (ROIs) and segment them.

The dataset provided with this tutorial already contains the segmentation files with the ROIs of the single cells, but if you need to segment ROIs we recommend using our other software called Cell-ACDC.

Dataset

To follow this tutorial, download the dataset from here.

This dataset was published in this publication.

After unzipping the downloaded file, you will see the following folder structure:

Position_2
└── Images
    ├── MMY116_2c_ACT1_MDN1_10_s02_MDN1.tif
    ├── MMY116_2c_ACT1_MDN1_10_s02_metadata.csv
    ├── MMY116_2c_ACT1_MDN1_10_s02_ACT1.tif
    ├── MMY116_2c_ACT1_MDN1_10_s02_DAPI.tif
    ├── MMY116_2c_ACT1_MDN1_10_s02_phase_contr.tif
    ├── MMY116_2c_ACT1_MDN1_10_s02_segmInfo.csv
    ├── MMY116_2c_ACT1_MDN1_10_s02_manualBackground.npz
    ├── MMY116_2c_ACT1_MDN1_10_s02_segm.npz
    ├── MMY116_2c_ACT1_MDN1_10_s02_acdc_output.csv
    ├── MMY116_2c_ACT1_MDN1_10_s02_last_tracked_i.txt
    └── MMY116_2c_ACT1_MDN1_10_s02_custom_combine_metrics.ini

Some of these files are generated by Cell-ACDC and they will not be discussed here.

What we can see is that we have 4 .tif files corresponding to 4 channels, whose filenames end with MDN1.tif, ACT1.tif, DAPI.tif, and phase_contr.tif.

The ACT1 channel will not be used in this tutorial.

The phase_contr channel is the channel we used to segment the single cells with Cell-ACDC. The resulting segmentation masks are saved in the file ending with segm.npz.

The MDN1 channel is the channel where we want to detect the spots.

The DAPI channel is the staining of the nucleus and we can use it in SpotMAX as the reference channel (more details below).
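The channels are identified by the end of their file names. As a minimal sketch of that idea (the helper `find_by_end_name` is hypothetical and is not part of SpotMAX or Cell-ACDC), the following matches an end name against the file names from the folder listing above:

```python
from pathlib import PurePath

# File names copied from the folder listing above.
FILENAMES = [
    "MMY116_2c_ACT1_MDN1_10_s02_MDN1.tif",
    "MMY116_2c_ACT1_MDN1_10_s02_metadata.csv",
    "MMY116_2c_ACT1_MDN1_10_s02_ACT1.tif",
    "MMY116_2c_ACT1_MDN1_10_s02_DAPI.tif",
    "MMY116_2c_ACT1_MDN1_10_s02_phase_contr.tif",
    "MMY116_2c_ACT1_MDN1_10_s02_segmInfo.csv",
    "MMY116_2c_ACT1_MDN1_10_s02_manualBackground.npz",
    "MMY116_2c_ACT1_MDN1_10_s02_segm.npz",
    "MMY116_2c_ACT1_MDN1_10_s02_acdc_output.csv",
    "MMY116_2c_ACT1_MDN1_10_s02_last_tracked_i.txt",
    "MMY116_2c_ACT1_MDN1_10_s02_custom_combine_metrics.ini",
]


def find_by_end_name(filenames, end_name):
    """Hypothetical helper: return the files whose name ends with end_name.

    The end name may be given with or without the file extension.
    """
    matches = []
    for name in filenames:
        stem = PurePath(name).stem  # file name without the extension
        if name.endswith(end_name) or stem.endswith(end_name):
            matches.append(name)
    return matches


spots_files = find_by_end_name(FILENAMES, "MDN1")       # spots channel
reference_files = find_by_end_name(FILENAMES, "DAPI")   # reference channel
segm_files = find_by_end_name(FILENAMES, "segm")        # cell segmentation masks
print(spots_files, reference_files, segm_files)
```

Matching on the end of the name rather than on a substring matters here because every file name contains both ACT1 and MDN1 in its shared prefix.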

../_images/tutorials_smFISH_yeast_figure.svg

A) Phase contrast channel used for segmentation of the cells. B) Quasar 670 channel used to visualize single molecules of mRNA of the MDN1 gene (spots channel). Arrows indicate spots with low signal-to-noise ratio. C) DAPI channel used to stain the nucleus (reference channel).

Tip

SpotMAX can take advantage of mother-bud (or sister cells) relationships. To annotate the relationship use the Cell-ACDC software. These annotations are saved in the file ending with acdc_output.csv.

Loading the dataset

Now that we have our dataset with the segmentation file of the cells, we can proceed with detecting the spots.

Important

In this tutorial we assume that you are already familiar with the analysis parameters. If not, please read about them here: Description of the parameters.

The first step is to load the dataset into the GUI. To run the GUI see here: Run SpotMAX from the GUI.

Click on the Load folder button at the top-right of the GUI. Select the Position_2 folder you downloaded and load the MDN1 channel.

When the dataset is loaded, you will see on the Analysis parameters tab on the left that some of the parameters have already been filled out.

Now let’s see how we can determine the optimal parameters for this dataset.

Setting up the parameters

The parameters are grouped into separate sections so we will go one by one.

File paths and channels

Since we want to segment the nucleus as a reference channel from the DAPI channel, we write ‘DAPI’ in the Reference channel end name parameter.

If we want to take advantage of the mother-bud (or sister cells) pairings we write ‘acdc_output.csv’ in the Table with lineage info end name parameter.

We can then decide on a Run number (in this case we leave it at 1) and, optionally, append a text at the end of the output files. For example, we could write ‘tutorial’ in the Text to append at the end of the output files parameter.

Finally, we select ‘.csv’ for the File extension of the output tables.

METADATA

Since some of the metadata is already saved in the file ending with metadata.csv, some of the entries were correctly loaded.

We need to correct the Spots reporter emission wavelength (nm) to 668 since the fluorescence probe used to image MDN1 is Quasar 670.

Now we need to determine the optimal values for the Spot minimum z-size (μm) and Resolution multiplier in y- and x- direction parameters. These are important because if the resulting Spot (z, y, x) minimum dimensions (radius) is too low we will detect multiple peaks within the same spot. On the other hand, if it is too high, we risk missing the smaller spots. For this tutorial we will use Spot minimum z-size (μm) = 1.0 and Resolution multiplier in y- and x- direction = 1.5.

Tip

The simplest way to determine these values is to use the tools available in the Tune parameters tab. See more instructions in the sections Tune parameters tab and Spot minimum z-size (μm).

Once you have entered these values, you should see the following at the Spot (z, y, x) minimum dimensions (radius) parameter:

Spot (z, y, x) minimum dimensions (radius)  (1.0, 0.4366, 0.4366) μm
                                            (4.1667, 6.0586, 6.0586) pxl
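The radius shown above can be reproduced with a short calculation. The sketch below assumes a Rayleigh-type diffraction limit, 1.22 · λ / (2 · NA), scaled by the resolution multiplier. The formula, the numerical aperture (1.4) and the voxel depth (0.24 μm) are assumptions that are not stated in this tutorial; they were chosen because they reproduce the displayed numbers. The pixel size (0.07206 μm) is the PhysicalSizeX value listed later among the SpotMAX AI parameters.

```python
def spot_min_dimensions(emission_nm, numerical_aperture, yx_multiplier,
                        z_min_um, pixel_size_um, voxel_depth_um):
    """Return the minimum spot radius as ((z, y, x) in um, (z, y, x) in pixels).

    Assumption: the lateral radius follows 1.22 * wavelength / (2 * NA),
    multiplied by the resolution multiplier. This may differ from what
    SpotMAX computes internally.
    """
    yx_um = 1.22 * emission_nm / (2 * numerical_aperture) * yx_multiplier / 1000
    zyx_um = (z_min_um, yx_um, yx_um)
    zyx_pxl = (
        z_min_um / voxel_depth_um,
        yx_um / pixel_size_um,
        yx_um / pixel_size_um,
    )
    return zyx_um, zyx_pxl


zyx_um, zyx_pxl = spot_min_dimensions(
    emission_nm=668,          # Quasar 670, from the METADATA section
    numerical_aperture=1.4,   # assumed, not stated in this tutorial
    yx_multiplier=1.5,        # Resolution multiplier in y- and x- direction
    z_min_um=1.0,             # Spot minimum z-size (um)
    pixel_size_um=0.07206,    # PhysicalSizeX
    voxel_depth_um=0.24,      # assumed, not stated in this tutorial
)
print([round(v, 4) for v in zyx_um])
print([round(v, 4) for v in zyx_pxl])
```

Under these assumptions, doubling the resolution multiplier doubles the y and x radius, which is why a value that is too high can merge neighbouring spots into a single detection.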

Pre-processing

For pre-processing, whether or not we activate Aggregate cells prior analysis should not make a big difference because we expect spots in every cell. If we already know that some cells in the image do not have spots, activating this parameter might be very important (especially if we use Thresholding for the Spots segmentation method).

We do not need to activate Remove hot pixels because this specific dataset does not have any very bright, isolated single pixels.

We leave the Initial gaussian filter sigma at 0.75 because we want to activate Sharpen spots signal prior detection. When sharpening is active, the gaussian filtered image is not used for detection but only for quantification. Using a small gaussian sigma is recommended to remove some of the background noise. With a higher sigma the smoothing would be too aggressive, especially because we are dealing with a low signal-to-noise ratio.
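To illustrate why a large sigma is too aggressive for dim spots, here is a minimal, pure-Python sketch on a synthetic 1D intensity profile. The profile values and the helper functions are made up for illustration and do not reproduce SpotMAX's own filtering code.

```python
import math


def gaussian_kernel(sigma, truncate=4.0):
    """Normalised 1D gaussian kernel, truncated at `truncate` sigmas."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve `signal` with a gaussian kernel, repeating the edge values."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += weight * signal[j]
        smoothed.append(acc)
    return smoothed


# Synthetic profile: flat background of 100 with a dim, narrow spot on top.
profile = [100.0] * 41
profile[19:22] = [115.0, 130.0, 115.0]

for sigma in (0.75, 3.0):
    peak = max(smooth(profile, sigma))
    print(f"sigma={sigma}: peak above background = {peak - 100:.1f}")
```

The larger sigma spreads the spot signal over many pixels, so the peak ends up much closer to the background level and becomes harder to separate from noise.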

Tip

You can visually inspect the result of every pre-processing filter by pressing the compute button beside each filter.

Reference channel

In this tutorial, as well as detecting the spots, we also want to segment the nucleus from the DAPI signal as the reference channel. This way we can detect whether a spot is inside or outside of the nucleus and calculate the distance from each spot to the edge of the nucleus.
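As a toy illustration of these two measurements, the sketch below uses a small 2D mask and brute-force distances in pixels. The mask, the helper names and the distance definition (distance to the nearest pixel on the other side of the nucleus boundary) are assumptions made for this example; SpotMAX works on the real masks and may define the distance differently.

```python
import math

# Toy 2D nucleus mask (1 = nucleus, 0 = rest of the cell).
nucleus = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]


def is_inside(mask, y, x):
    """True if the spot centre (y, x) lies on a nucleus pixel."""
    return bool(mask[y][x])


def distance_to_edge(mask, y, x):
    """Distance (pixels) from (y, x) to the nearest pixel across the boundary."""
    own_value = mask[y][x]
    return min(
        math.hypot(yy - y, xx - x)
        for yy, row in enumerate(mask)
        for xx, value in enumerate(row)
        if value != own_value
    )


for spot in [(2, 3), (0, 0)]:
    location = "inside" if is_inside(nucleus, *spot) else "outside"
    print(spot, location, round(distance_to_edge(nucleus, *spot), 3))
```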

For this purpose we activate Segment reference channel and also Ref. channel is single object (e.g., nucleus).

Then we set the Ref. channel gaussian filter sigma to 2.0. Finding a good sigma for the gaussian filter might require some trial and error. The idea is that we want to segment the nucleus as a round object and not segment any other artefact.

Note

When testing parameters remember to use the compute button beside the testable parameters.

Since we are not segmenting network-like structures, we leave the Sigmas used to enhance network-like structures parameter at 0.0.

Now we need to choose whether to use the ‘Thresholding’ or ‘BioImage.IO model’ for the Ref. channel segmentation method. Since we know that ‘Thresholding’ works well in this case we will use that, but feel free to experiment with any of the models available at the BioImage Model Zoo.

Next, to choose the optimal Ref. channel threshold function, we click on the compute button beside the Ref. channel segmentation method. We should see that thresholding_yen does a pretty good job at segmenting the nucleus.

Finally, we can choose whether to Save reference channel segmentation masks and Save pre-processed reference channel image.

Spots channel

We are almost done, since this is the last section that we will set up.

For the Spots segmentation method we know that ‘SpotMAX AI’ works well in this case, but feel free to experiment with ‘Thresholding’ (which is much faster than the neural networks) and with any of the models available at the BioImage Model Zoo.

Note

If this is the first time you are using the ‘SpotMAX AI’ method, SpotMAX will need to install some libraries. Keep an eye on the terminal during this time and check that installation is successful.

After selecting ‘SpotMAX AI’ you will need to configure the parameters of the model. To do so, click on the cog button beside the parameter. If you want more information about the parameters of the AI, see the section SpotMAX AI parameters. For this dataset, we know that the following parameters work well:

  • Model type: 2D

  • Preprocess across experiment: False

  • Preprocess across timepoints: False

  • Gaussian filter sigma: 1.0

  • Remove hot pixels: False

  • Config yaml filepath: SpotMAX_v2/spotmax/nnet/config.yaml

  • PhysicalSizeX: 0.07206 (same as in the METADATA section)

  • Resolution multiplier yx: 1.0 (same as in the METADATA section)

Next, we can ignore Spot detection threshold function because we are using ‘SpotMAX AI’. We do not set any filtering feature in Features and thresholds for filtering true spots, we activate Optimise detection for high spot density, and we do not activate Compute spots size (fit gaussian peak(s)).

Finally, we can choose whether to Save spots segmentation masks and Save pre-processed spots image.

Note

Since we do not activate Compute spots size (fit gaussian peak(s)), we do not need to worry about the parameters in the SpotFIT section. Also, we can leave the Configuration section deactivated; we will be asked about it when we run the analysis.

Running the analysis

Ok, we are finally ready to run the analysis!

To do so, simply click on the Run analysis... button at the top-right of the tab.

SpotMAX will now allow us to save the parameters to an INI configuration file and we choose ‘Yes’. This way we can load them back into the GUI any time we want by clicking on the Load parameters from previous analysis... button at the top-left of the tab.
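An INI file is plain text organised into sections of name = value pairs. As a rough sketch of what such a file looks like, the snippet below writes some of the values chosen in this tutorial with Python's standard configparser module and reads them back. The section and option names simply mirror the GUI labels used in this tutorial; the file that SpotMAX actually saves may use different names and will contain many more entries.

```python
import configparser
import io

# Values chosen in this tutorial, keyed by the GUI labels (assumed layout).
params = {
    "File paths and channels": {
        "Reference channel end name": "DAPI",
        "Table with lineage info end name": "acdc_output.csv",
        "Run number": "1",
        "Text to append at the end of the output files": "tutorial",
        "File extension of the output tables": ".csv",
    },
    "METADATA": {
        "Spot minimum z-size (μm)": "1.0",
        "Resolution multiplier in y- and x- direction": "1.5",
    },
    "Reference channel": {
        "Segment reference channel": "True",
        "Ref. channel is single object (e.g., nucleus)": "True",
        "Ref. channel gaussian filter sigma": "2.0",
    },
}

writer = configparser.ConfigParser(interpolation=None)
writer.optionxform = str  # keep the capitalisation of the option names
writer.read_dict(params)

buffer = io.StringIO()
writer.write(buffer)
ini_text = buffer.getvalue()
print(ini_text)

# Read the text back, as one would do when loading a saved parameters file.
reloaded = configparser.ConfigParser(interpolation=None)
reloaded.optionxform = str
reloaded.read_string(ini_text)
sigma = reloaded.getfloat("Reference channel", "Ref. channel gaussian filter sigma")
```

Keeping the parameters in a text file like this makes an analysis easy to repeat and to compare across runs.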

Next, SpotMAX will ask us whether we want to select the measurements to save and we say ‘No, save all the measurements’.

Then we choose a filename for the parameters file and the folder where to save it. We will get a dialogue confirming that the parameters were saved, together with the path to the saved file. We click ‘Ok’ and we get a reminder that the analysis will now run in the terminal and that we should keep an eye on it.

We click on ‘Ok, run now!’ and we move our attention to the terminal. In the terminal we will be asked some last questions about parameters that we did not select, and we simply confirm that we want to use the default ones.

The analysis will now run and the output files will be saved in the same folder as the dataset, in a new folder called SpotMAX_output. For details about the output files see the section Output files.

Closing remarks

At the end of the analysis you can go back to the GUI and visualize and inspect the results using the tools in the Inspect and edit results tab.

That’s it! We hope you found this tutorial useful. If you found mistakes or have any other feedback, let us know on our GitHub page or by sending us an email.

Until next time!