diff --git a/README.md b/README.md index 82256093..ee91459a 100644 --- a/README.md +++ b/README.md @@ -20,6 +20,7 @@ or navigate to any of the documents listed below and download it individually. 3. [Tutorial: Determining Moore's Law with real data in NumPy](content/mooreslaw-tutorial.md) 4. [Tutorial: Saving and sharing your NumPy arrays](content/save-load-arrays.md) 5. [Tutorial: NumPy deep learning on MNIST from scratch](content/tutorial-deep-learning-on-mnist.md) +6. [Tutorial: X-ray image processing](content/tutorial-x-ray-image-processing.md) ## Contributing diff --git a/content/tutorial-x-ray-image-processing.ipynb b/content/tutorial-x-ray-image-processing.ipynb deleted file mode 100644 index 97f12b6b..00000000 --- a/content/tutorial-x-ray-image-processing.ipynb +++ /dev/null @@ -1,726 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Tutorial: X-ray image processing" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This tutorial demonstrates how to read and process X-ray images with NumPy, imageio, Matplotlib and SciPy. You will learn how to load medical images, focus on certain parts, and visually compare them using mask filters and the [Sobel](https://en.wikipedia.org/wiki/Sobel_operator) and [Canny](https://en.wikipedia.org/wiki/Canny_edge_detector) edge detectors. X-ray image analysis can be part of your data analysis and [machine learning workflow](https://www.sciencedirect.com/science/article/pii/S235291481930214X) when, for example, you're building an algorithm that helps [detect pneumonia](https://www.kaggle.com/c/rsna-pneumonia-detection-challenge) as part of a [Kaggle](https://www.kaggle.com) [competition](https://www.kaggle.com/eduardomineo/u-net-lung-segmentation-montgomery-shenzhen). In the healthcare industry, medical image processing and analysis is particularly important when images are estimated to account for [at least 90%](https://www-03.ibm.com/press/us/en/pressrelease/51146.wss) of all medical data.\n", - "\n", - "You'll be working with radiology images from the [ChestX-ray8](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community\n", - ") dataset provided by the [National Institutes of Health (NIH)](http://nih.gov). ChestX-ray8 contains over 100,000 de-identified X-ray images in the PNG format from more than 30,000 patients. You can find ChestX-ray8's files on NIH's public Box [repository](https://nihcc.app.box.com/v/ChestXray-NIHCC) in the `/images` folder. (For more details, refer to the research [paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Wang_ChestX-ray8_Hospital-Scale_Chest_CVPR_2017_paper.pdf) published at CVPR (a computer vision conference) in 2017.)\n", - "\n", - "For your convenience, a small number of PNG images have been saved to this tutorial's repository under `/tutorial-x-ray-image-processing`, since ChestX-ray8 contains gigabytes of data and you may find it challenging to download it in batches.\n", - "\n", - "
\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Prerequisites" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The reader should have some knowledge of Python, NumPy arrays, and Matplotlib. To refresh the memory, you can take the [Python](https://docs.python.org/dev/tutorial/index.html) and Matplotlib [PyPlot](https://matplotlib.org/tutorials/introductory/pyplot.html) tutorials, and the NumPy [quickstart](https://numpy.org/devdocs/user/quickstart.html).\n", - "\n", - "The following packages are used in this tutorial:\n", - "\n", - "- [imageio](https://imageio.github.io) for reading and writing image data. The healthcare industry usually works with the [DICOM](https://en.wikipedia.org/wiki/DICOM) format for medical imaging and [imageio](https://imageio.readthedocs.io/en/stable/format_dicom.html) should be well-suited for reading that format. For simplicity, in this tutorial you'll be working with PNG files.\n", - "- [Matplotlib](https://matplotlib.org/) for data visualization.\n", - "- [SciPy](https://www.scipy.org) for multi-dimensional image processing via [`ndimage`](https://docs.scipy.org/doc/scipy/reference/ndimage.html).\n", - "\n", - "This tutorial can be run locally in an isolated environment, such as [Virtualenv](https://virtualenv.pypa.io/en/stable/) or [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). You can use [Jupyter Notebook or JupyterLab](https://jupyter.org/install) to run each notebook cell." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Table of contents" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Examine an X-ray with `imageio`\n", - "2. Combine images with `np.stack()` to demonstrate progression\n", - "3. Edge detection using the Laplacian-Gaussian, Gaussian gradient, Sobel, and Canny filters\n", - "4. Apply masks to X-rays with `np.where()`\n", - "5. Compare the results\n", - "\n", - "---" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Examine an X-ray with `imageio`" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's begin with a simple example using just one X-ray image from the ChestX-ray8 dataset. \n", - "\n", - "The file — `00000011_001.png` — has been downloaded for you and saved in the `/tutorial-x-ray-image-processing` folder.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Load the image with `imageio`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "import imageio\n", - "\n", - "DIR = 'tutorial-x-ray-image-processing/'\n", - "\n", - "xray_image = imageio.imread(os.path.join(DIR, '00000011_001.png'))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. Check that its shape is 1024x1024 pixels and that the array is made up of 8-bit integers:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(xray_image.shape)\n", - "print(xray_image.dtype)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "3. Import `matplotlib` and display the image in a grayscale colormap:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt\n", - "\n", - "plt.imshow(xray_image, cmap='gray')\n", - "plt.axis('off')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Combine images with `np.stack()` to demonstrate progression" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "With NumPy's `np.stack()` you can combine multiple X-rays to make an n-dimensional array and then show the \"health progress\" in a sequential manner.\n", - "\n", - "In the next example, instead of 1 image you'll use 8 X-ray 1024x1024-pixel images from the ChestX-ray8 dataset that have been downloaded and extracted from one of the dataset files. They are numbered from `...000.png` to `...008.png` and let's assume they belong to the same patient." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Import NumPy, read in each of the X-rays, and stack them together with `np.stack()`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "\n", - "file1 = imageio.imread(os.path.join(DIR + '00000011_000.png'))\n", - "file2 = imageio.imread(os.path.join(DIR + '00000011_001.png'))\n", - "file3 = imageio.imread(os.path.join(DIR + '00000011_003.png'))\n", - "file4 = imageio.imread(os.path.join(DIR + '00000011_004.png'))\n", - "file5 = imageio.imread(os.path.join(DIR + '00000011_005.png'))\n", - "file6 = imageio.imread(os.path.join(DIR + '00000011_006.png'))\n", - "file7 = imageio.imread(os.path.join(DIR + '00000011_007.png'))\n", - "file8 = imageio.imread(os.path.join(DIR + '00000011_008.png'))\n", - "\n", - "combined_xray_images_1 = np.stack([file1, file2, file3, file4, file5, file6, file7, file8])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Alternatively, you can `append` the image arrays as follows:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "combined_xray_images_2 = []\n", - "\n", - "for i in range(8):\n", - " single_xray_image = imageio.imread(os.path.join(DIR + '00000011_00'+str(i)+'.png'))\n", - " combined_xray_images_2.append(single_xray_image)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "_Note on performance:_\n", - "\n", - "- `append`ing the images may no be faster. If you care about performance, you should probably use `np.stack()`, as evidenced when you try to time the code with Python's `timeit`:\n", - "\n", - " ```python\n", - " %timeit combined_xray_images_1 = np.stack([file1, file2, file3, file4, file5, file6, file7, file8])\n", - " ```\n", - "\n", - " Example output:\n", - "\n", - " ```\n", - " 1.52 ms ± 49.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n", - " ```\n", - "\n", - " ```python\n", - " %timeit C = [combined_xray_images_2.append(imageio.imread(os.path.join(DIR + '00000011_00'+str(i)+'.png'))) for i in range(8)]\n", - " ```\n", - "\n", - " Example output:\n", - "\n", - " ```\n", - " 159 ms ± 2.69 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n", - " ```\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. Check the shape of the new X-ray image array containing 8 stacked images:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "combined_xray_images_1.shape" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "3. You can now display the \"health progress\" by plotting each of frames next to each other using Matplotlib:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, axes = plt.subplots(nrows=1, ncols=8, figsize=(30, 30))\n", - "\n", - "for i in range(8):\n", - " x = combined_xray_images_1[i]\n", - " axes[i].imshow(x, cmap='gray')\n", - " axes[i].axis('off')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "4. In addition, it can be helpful to show the progress as an animation. Let's create a GIF file with `imageio.mimwrite()` and display the result in the notebook:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from IPython.display import Image\n", - "\n", - "GIF_PATH = os.path.join(DIR, 'xray_image.gif')\n", - "imageio.mimwrite(GIF_PATH, combined_xray_images_1, format= '.gif', fps = 1) \n", - "\n", - "Image(filename=GIF_PATH, width=400, height=400)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Edge detection using the Laplacian-Gaussian, Gaussian gradient, Sobel, and Canny filters" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When processing biomedical data, it can be useful to emphasize the 2D [\"edges\"](https://en.wikipedia.org/wiki/Edge_detection) to focus on particular features in an image. To do that, using [image gradients](https://en.wikipedia.org/wiki/Image_gradient) can be particularly helpful when detecting the change of color pixel intensity." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### The Laplace filter with Gaussian second derivatives\n", - "\n", - "Let's start with an n-dimensional [Laplace](https://en.wikipedia.org/wiki/Laplace_distribution) filter (\"Laplacian-Gaussian\") that uses [Gaussian](https://en.wikipedia.org/wiki/Normal_distribution) second derivatives. This Laplacian method focuses on pixels with rapid intensity change in values and is combined with Gaussian smoothing to [remove noise](https://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm). Let's examine how it can be useful in analyzing 2D X-ray images.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "- The implementation of the Laplacian-Gaussian filter is relatively straightforward: 1) import the `ndimage` module from SciPy; and 2) call [`scipy.ndimage.gaussian_laplace()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_laplace.html) with a sigma (scalar) parameter, which affects the standard deviations of the Gaussian filter (you'll use `1` in the example below):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from scipy import ndimage\n", - "\n", - "xray_image_laplace_gaussian = ndimage.gaussian_laplace(xray_image, sigma=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. Display the original X-ray and the one with the Laplacian-Gaussian filter:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 10))\n", - "\n", - "axes[0].set_title('Original')\n", - "axes[0].imshow(xray_image, cmap='gray')\n", - "axes[1].set_title('Laplacian-Gaussian (edges)')\n", - "axes[1].imshow(xray_image_laplace_gaussian, cmap='gray')\n", - "for i in axes:\n", - " i.axis('off')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### The Gaussian gradient magnitude method\n", - "\n", - "Another method for edge detection that can be useful is the [Gaussian](https://en.wikipedia.org/wiki/Normal_distribution) (gradient) filter. It computes the multidimensional gradient magnitude with Gaussian derivatives and helps by remove [high-frequency](https://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf) image components.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Call [`scipy.ndimage.gaussian_gradient_magnitude()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_gradient_magnitude.html) with a sigma (scalar) parameter (for standard deviations; you'll use `2` in the example below):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "x_ray_image_gaussian_gradient = ndimage.gaussian_gradient_magnitude(xray_image, sigma=2)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. Display the original X-ray and the one with the Gaussian gradient filter:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 10))\n", - "\n", - "axes[0].set_title('Original')\n", - "axes[0].imshow(xray_image, cmap='gray')\n", - "axes[1].set_title('Gaussian gradient (edges)')\n", - "axes[1].imshow(x_ray_image_gaussian_gradient, cmap='gray')\n", - "for i in axes:\n", - " i.axis('off')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### The Sobel-Feldman operator (the Sobel filter)\n", - "\n", - "To find regions of high spatial frequency (the edges or the edge maps) along the horizontal and vertical axes of a 2D X-ray image, you can use the [Sobel-Feldman operator (Sobel filter)](https://en.wikipedia.org/wiki/Sobel_operator) technique. The Sobel filter applies two 3x3 kernel matrices — one for each axis — onto the X-ray through a [convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution). Then, these two points (gradients) are combined using the [Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) to produce a gradient magnitude." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Use the Sobel filters — ([`scipy.ndimage.sobel()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.sobel.html)) — on x- and y-axes of the X-ray. Then, calculate the distance between `x` and `y` (with the Sobel filters applied to them) using the [Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and NumPy's [`np.hypot()`](https://numpy.org/doc/stable/reference/generated/numpy.hypot.html) to obtain the magnitude. Finally, normalize the rescaled image for the pixel values to be between 0 and 255. \n", - "\n", - " [Image normalization](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) follows the `output_channel = 255.0 * (input_channel - min_value) / (max_value - min_value)` [formula](http://dev.ipol.im/~nmonzon/Normalization.pdf). Because you're using a grayscale image, you need to normalize just one channel." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "x_sobel = ndimage.sobel(xray_image, axis=0)\n", - "y_sobel = ndimage.sobel(xray_image, axis=1)\n", - "\n", - "xray_image_sobel = np.hypot(x_sobel, y_sobel)\n", - "\n", - "xray_image_sobel *= 255.0 / np.max(xray_image_sobel)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. Change the new image array data type to the 32-bit floating-point format from `float16` to [make it compatible](https://github.com/matplotlib/matplotlib/issues/15432) with Matplotlib:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print('The data type - before: ', xray_image_sobel.dtype)\n", - "\n", - "xray_image_sobel = xray_image_sobel.astype('float32')\n", - "\n", - "print('The data type - after: ', xray_image_sobel.dtype)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "3. Display the original X-ray and the one with the Sobel \"edge\" filter applied. Note that both the grayscale and `CMRmap` colormaps are used to help emphasize the edges:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15, 15))\n", - "\n", - "axes[0].set_title('Original')\n", - "axes[0].imshow(xray_image, cmap='gray')\n", - "axes[1].set_title('Sobel (edges) - grayscale')\n", - "axes[1].imshow(xray_image_sobel, cmap='gray')\n", - "axes[2].set_title('Sobel (edges) - CMRmap')\n", - "axes[2].imshow(xray_image_sobel, cmap='CMRmap')\n", - "for i in axes:\n", - " i.axis('off')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### The Canny filter\n", - "\n", - "You can also consider using another well-known filter for edge detection called the [Canny filter](https://en.wikipedia.org/wiki/Canny_edge_detector).\n", - "\n", - "First, you apply a [Gaussian](https://en.wikipedia.org/wiki/Gaussian_filter) filter to remove the noise in an image. In this example, you're using using the [Fourier](https://en.wikipedia.org/wiki/Fourier_transform) filter which smoothens the X-ray through a [convolution](https://en.wikipedia.org/wiki/Convolution) process. Next, you apply the [Prewitt filter](https://en.wikipedia.org/wiki/Prewitt_operator) on each of the 2 axes of the image to help detect some of the edges — this will result in 2 gradient values. Similar to the Sobel filter, the Prewitt operator also applies two 3x3 kernel matrices — one for each axis — onto the X-ray through a [convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution). In the end, you compute the magnitude between the two gradients using the [Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and [normalize](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) the images, as before." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Use SciPy's Fourier filters — [`scipy.ndimage.fourier_gaussian()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.fourier_gaussian.html) — with a small `sigma` value to remove some of the noise from the X-ray. Then, calculate two gradients using [`scipy.ndimage.prewitt()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.prewitt.html). Next, measure the distance between the gradients using NumPy's `np.hypot()`. Finally, [normalize](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) the rescaled image, as before." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fourier_gaussian = ndimage.fourier_gaussian(xray_image, sigma=0.05)\n", - "\n", - "x_prewitt = ndimage.prewitt(fourier_gaussian, axis=0)\n", - "y_prewitt = ndimage.prewitt(fourier_gaussian, axis=1)\n", - "\n", - "xray_image_canny = np.hypot(x_prewitt, y_prewitt)\n", - "\n", - "xray_image_canny *= 255.0 / np.max(xray_image_canny)\n", - "\n", - "print('The data type - ', xray_image_canny.dtype)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. Plot the original X-ray image and the ones with the edges detected with the help of the Canny filter technique. The edges can be emphasized using the `prism`, `nipy_spectral`, and `terrain` Matplotlib colormaps." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(20, 15))\n", - "\n", - "axes[0].set_title('Original')\n", - "axes[0].imshow(xray_image, cmap='gray')\n", - "axes[1].set_title('Canny (edges) - prism')\n", - "axes[1].imshow(xray_image_canny, cmap='prism')\n", - "axes[2].set_title('Canny (edges) - nipy_spectral')\n", - "axes[2].imshow(xray_image_canny, cmap='nipy_spectral')\n", - "axes[3].set_title('Canny (edges) - terrain')\n", - "axes[3].imshow(xray_image_canny, cmap='terrain')\n", - "for i in axes:\n", - " i.axis('off')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Apply masks to X-rays with `np.where()`" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To screen out only certain pixels in X-ray images to help detect particular features, you can apply masks with NumPy's [`np.where(condition: array_like (bool), x: array_like, y: ndarray)`](https://numpy.org/doc/stable/reference/generated/numpy.where.html) that returns `x` when `True` and `y` when `False`.\n", - "\n", - "Identifying regions of interest — certain sets of pixels in an image — can be useful and masks serve as boolean arrays of the same shape as the original image." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Retrieve some basics statistics about the pixel values in the original X-ray image you've been working with:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print('The data type of the X-ray image is: ', xray_image.dtype)\n", - "print('The minimum pixel value is: ', np.min(xray_image))\n", - "print('The maximum pixel value is: ', np.max(xray_image))\n", - "print('The average pixel value is: ', np.mean(xray_image))\n", - "print('The median pixel value is: ', np.median(xray_image))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. The array data type is `uint8` and the minimum/maximum value results suggest that all 256 colors (from `0` to `255`) are used in the X-ray. Let's visualize the _pixel intensity distribution_ of the original raw X-ray image with `ndimage.histogram()` and Matplotlib:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "pixel_intensity_distribution = ndimage.histogram(xray_image, min=np.min(xray_image), max=np.max(xray_image), bins=256)\n", - "\n", - "plt.plot(pixel_intensity_distribution)\n", - "plt.title('Pixel intensity distribution')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As the pixel intensity distribution suggests, there are many low (between around 0 and 20) and very high (between around 200 and 240) pixel values. \n", - "\n", - "3. You can create different conditional masks with NumPy's `np.where()` — for example, let's have only those values of the image with the pixels exceeding a certain threshold:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# The threshold is \"greater than 150\"\n", - "# Return the original image if true, `0` otherwise\n", - "xray_image_mask_noisy = np.where(xray_image > 150, xray_image, 0)\n", - "\n", - "plt.imshow(xray_image_mask_noisy, cmap='gray')\n", - "plt.axis('off')\n", - "plt.show()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# The threshold is \"greater than 150\"\n", - "# Return `1` if true, `0` otherwise\n", - "xray_image_mask_less_noisy = np.where(xray_image > 150, 1, 0)\n", - "\n", - "plt.imshow(xray_image_mask_less_noisy, cmap='gray')\n", - "plt.axis('off')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Compare the results" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's display some of the results of processed X-ray images you've worked with so far:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, axes = plt.subplots(nrows=1, ncols=9, figsize=(30, 30))\n", - "\n", - "axes[0].set_title('Original')\n", - "axes[0].imshow(xray_image, cmap='gray')\n", - "axes[1].set_title('Laplace-Gaussian (edges)')\n", - "axes[1].imshow(xray_image_laplace_gaussian, cmap='gray')\n", - "axes[2].set_title('Gaussian gradient (edges)')\n", - "axes[2].imshow(x_ray_image_gaussian_gradient, cmap='gray')\n", - "axes[3].set_title('Sobel (edges) - grayscale')\n", - "axes[3].imshow(xray_image_sobel, cmap='gray')\n", - "axes[4].set_title('Sobel (edges) - hot')\n", - "axes[4].imshow(xray_image_sobel, cmap='hot')\n", - "axes[5].set_title('Canny (edges) - prism)')\n", - "axes[5].imshow(xray_image_canny, cmap='prism')\n", - "axes[6].set_title('Canny (edges) - nipy_spectral)')\n", - "axes[6].imshow(xray_image_canny, cmap='nipy_spectral')\n", - "axes[7].set_title('Mask (> 150, noisy)')\n", - "axes[7].imshow(xray_image_mask_noisy, cmap='gray')\n", - "axes[8].set_title('Mask (> 150, less noisy)')\n", - "axes[8].imshow(xray_image_mask_less_noisy, cmap='gray')\n", - "for i in axes:\n", - " i.axis('off')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Next steps" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If you want to use your own samples, you can use [this image](https://openi.nlm.nih.gov/detailedresult?img=CXR3666_IM-1824-1001&query=chest%20infection&it=xg&req=4&npos=32) or search for various other ones on the [_Openi_](https://openi.nlm.nih.gov) database. Openi contains many biomedical images and it can be especially helpful if you have low bandwidth and/or are restricted by the amount of data you can download.\n", - "\n", - "To learn more about image processing in the context of biomedical image data or simply edge detection, you may find the following material useful:\n", - "\n", - "- [DICOM processing and segmentation in Python](https://www.raddq.com/dicom-processing-segmentation-visualization-in-python/) with Scikit-Image and pydicom (Radiology Data Quest)\n", - "- [Image manipulation and processing using Numpy and Scipy](https://scipy-lectures.org/advanced/image_processing/index.html) (Scipy Lecture Notes)\n", - "- [Intensity values](https://s3.amazonaws.com/assets.datacamp.com/production/course_7032/slides/chapter2.pdf) (presentation, DataCamp)\n", - "- [Object detection with Raspberry Pi and Python](https://makersportal.com/blog/2019/4/23/image-processing-with-raspberry-pi-and-python-part-ii-spatial-statistics-and-correlations) (Maker Portal)\n", - "- [X-ray data preparation and segmentation](https://www.kaggle.com/eduardomineo/u-net-lung-segmentation-montgomery-shenzhen) with deep learning (a Kaggle-hosted Jupyter notebook)\n", - "- [Image filtering](https://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf) (lecture slides, CS6670: Computer Vision, Cornell University)\n", - "- [Edge detection in Python](https://towardsdatascience.com/edge-detection-in-python-a3c263a13e03) and NumPy (Towards Data Science)\n", - "- [Edge detection](https://datacarpentry.org/image-processing/08-edge-detection/) with Scikit-Image (Data Carpentry)\n", - "- [Image gradients and gradient filtering](www.cs.cmu.edu/~16385/s17/Slides/4.0_Image_Gradients_and_Gradient_Filtering.pdf) (lecture slides, 16-385 Computer Vision, Carnegie Mellon University)\n" - ] - } - ], - "metadata": { - "colab": { - "name": "tutorial-x-ray-image-processing.ipynb", - "provenance": [], - "toc_visible": true - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} diff --git a/content/tutorial-x-ray-image-processing.md b/content/tutorial-x-ray-image-processing.md new file mode 100644 index 00000000..71a4669d --- /dev/null +++ b/content/tutorial-x-ray-image-processing.md @@ -0,0 +1,572 @@ +--- +jupytext: + text_representation: + extension: .md + format_name: myst + format_version: 0.12 + jupytext_version: 1.7.1 +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Tutorial: X-ray image processing + ++++ + +This tutorial demonstrates how to read and process X-ray images with NumPy, +imageio, Matplotlib and SciPy. You will learn how to load medical images, focus +on certain parts, and visually compare them using the +[Gaussian](https://en.wikipedia.org/wiki/Gaussian_filter), +[Laplacian-Gaussian](https://en.wikipedia.org/wiki/Laplace_distribution), +[Sobel](https://en.wikipedia.org/wiki/Sobel_operator), and +[Canny](https://en.wikipedia.org/wiki/Canny_edge_detector) filters for edge +detection. + +X-ray image analysis can be part of your data analysis and +[machine learning workflow](https://www.sciencedirect.com/science/article/pii/S235291481930214X) +when, for example, you're building an algorithm that helps +[detect pneumonia](https://www.kaggle.com/c/rsna-pneumonia-detection-challenge) +as part of a [Kaggle](https://www.kaggle.com) +[competition](https://www.kaggle.com/eduardomineo/u-net-lung-segmentation-montgomery-shenzhen). +In the healthcare industry, medical image processing and analysis is +particularly important when images are estimated to account for +[at least 90%](https://www-03.ibm.com/press/us/en/pressrelease/51146.wss) of all +medical data. + +You'll be working with radiology images from the +[ChestX-ray8](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community) +dataset provided by the [National Institutes of Health (NIH)](http://nih.gov). +ChestX-ray8 contains over 100,000 de-identified X-ray images in the PNG format +from more than 30,000 patients. You can find ChestX-ray8's files on NIH's public +Box [repository](https://nihcc.app.box.com/v/ChestXray-NIHCC) in the `/images` +folder. (For more details, refer to the research +[paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Wang_ChestX-ray8_Hospital-Scale_Chest_CVPR_2017_paper.pdf) +published at CVPR (a computer vision conference) in 2017.) + +For your convenience, a small number of PNG images have been saved to this +tutorial's repository under `tutorial-x-ray-image-processing/`, since +ChestX-ray8 contains gigabytes of data and you may find it challenging to +download it in batches. + +![image.png](tutorial-x-ray-image-processing.png) + ++++ + +## Prerequisites + ++++ + +The reader should have some knowledge of Python, NumPy arrays, and Matplotlib. +To refresh the memory, you can take the +[Python](https://docs.python.org/dev/tutorial/index.html) and Matplotlib +[PyPlot](https://matplotlib.org/tutorials/introductory/pyplot.html) tutorials, +and the NumPy [quickstart](https://numpy.org/devdocs/user/quickstart.html). + +The following packages are used in this tutorial: + +- [imageio](https://imageio.github.io) for reading and writing image data. The +healthcare industry usually works with the +[DICOM](https://en.wikipedia.org/wiki/DICOM) format for medical imaging and +[imageio](https://imageio.readthedocs.io/en/stable/format_dicom.html) should be +well-suited for reading that format. For simplicity, in this tutorial you'll be +working with PNG files. +- [Matplotlib](https://matplotlib.org/) for data visualization. +- [SciPy](https://www.scipy.org) for multi-dimensional image processing via +[`ndimage`](https://docs.scipy.org/doc/scipy/reference/ndimage.html). + +This tutorial can be run locally in an isolated environment, such as +[Virtualenv](https://virtualenv.pypa.io/en/stable/) or +[conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). +You can use [Jupyter Notebook or JupyterLab](https://jupyter.org/install) to run +each notebook cell. + ++++ + +## Table of contents + ++++ + +1. Examine an X-ray with `imageio` +2. Combine images with `np.stack()` to demonstrate progression +3. Edge detection using the Laplacian-Gaussian, Gaussian gradient, Sobel, and + Canny filters +4. Apply masks to X-rays with `np.where()` +5. Compare the results + +--- + ++++ + +## Examine an X-ray with `imageio` + ++++ + +Let's begin with a simple example using just one X-ray image from the +ChestX-ray8 dataset. + +The file — `00000011_001.png` — has been downloaded for you and saved in the +`/tutorial-x-ray-image-processing` folder. + ++++ + +**1.** Load the image with `imageio`: + +```{code-cell} ipython3 +import os +import imageio + +DIR = 'tutorial-x-ray-image-processing' + +xray_image = imageio.imread(os.path.join(DIR, '00000011_001.png')) +``` + +**2.** Check that its shape is 1024x1024 pixels and that the array is made up of +8-bit integers: + +```{code-cell} ipython3 +print(xray_image.shape) +print(xray_image.dtype) +``` + +**3.** Import `matplotlib` and display the image in a grayscale colormap: + +```{code-cell} ipython3 +import matplotlib.pyplot as plt + +plt.imshow(xray_image, cmap='gray') +plt.axis('off') +plt.show() +``` + +## Combine images with `np.stack()` to demonstrate progression + ++++ + +With NumPy's `np.stack()` you can combine multiple X-rays to make an +n-dimensional array and then show the "health progress" in a sequential manner. + +In the next example, instead of 1 image you'll use 8 X-ray 1024x1024-pixel +images from the ChestX-ray8 dataset that have been downloaded and extracted +from one of the dataset files. They are numbered from `...000.png` to +`...008.png` and let's assume they belong to the same patient. + ++++ + +**1.** Import NumPy, read in each of the X-rays, and stack them together with +`np.stack()`: + +```{code-cell} ipython3 +import numpy as np + +file1 = imageio.imread(os.path.join(DIR, '00000011_000.png')) +file2 = imageio.imread(os.path.join(DIR, '00000011_001.png')) +file3 = imageio.imread(os.path.join(DIR, '00000011_003.png')) +file4 = imageio.imread(os.path.join(DIR, '00000011_004.png')) +file5 = imageio.imread(os.path.join(DIR, '00000011_005.png')) +file6 = imageio.imread(os.path.join(DIR, '00000011_006.png')) +file7 = imageio.imread(os.path.join(DIR, '00000011_007.png')) +file8 = imageio.imread(os.path.join(DIR, '00000011_008.png')) + +combined_xray_images_1 = np.stack([file1, file2, file3, file4, file5, file6, file7, file8]) +``` + +Alternatively, you can `append` the image arrays as follows: + +```{code-cell} ipython3 +combined_xray_images_2 = [] + +for i in range(8): + single_xray_image = imageio.imread(os.path.join(DIR, '00000011_00'+str(i)+'.png')) + combined_xray_images_2.append(single_xray_image) +``` + +_Note on performance:_ + +- `append`ing the images may no be faster. If you care about performance, you + should probably use `np.stack()`, as evidenced when you try to time the code + with Python's `timeit`: + + ```python + %timeit combined_xray_images_1 = np.stack([file1, file2, file3, file4, file5, file6, file7, file8]) + ``` + + Example output: + + ``` + 1.52 ms ± 49.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) + ``` + + ```python + %timeit C = [combined_xray_images_2.append(imageio.imread(os.path.join(DIR, '00000011_00'+str(i)+'.png'))) for i in range(8)] + ``` + + Example output: + + ``` + 159 ms ± 2.69 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) + ``` + ++++ + +**2.** Check the shape of the new X-ray image array containing 8 stacked images: + +```{code-cell} ipython3 +combined_xray_images_1.shape +``` + +**3.** You can now display the "health progress" by plotting each of frames next +to each other using Matplotlib: + +```{code-cell} ipython3 +fig, axes = plt.subplots(nrows=1, ncols=8, figsize=(30, 30)) + +for i in range(8): + x = combined_xray_images_1[i] + axes[i].imshow(x, cmap='gray') + axes[i].axis('off') +``` + +**4.** In addition, it can be helpful to show the progress as an animation. +Let's create a GIF file with `imageio.mimwrite()` and display the result in the +notebook: + +```{code-cell} ipython3 +GIF_PATH = os.path.join(DIR, 'xray_image.gif') +imageio.mimwrite(GIF_PATH, combined_xray_images_1, format= '.gif', fps = 1) +``` +Which gives us: +![health_progress](tutorial-x-ray-image-processing/xray_image.gif) +## Edge detection using the Laplacian-Gaussian, Gaussian gradient, Sobel, and Canny filters + ++++ + +When processing biomedical data, it can be useful to emphasize the 2D +["edges"](https://en.wikipedia.org/wiki/Edge_detection) to focus on particular +features in an image. To do that, using +[image gradients](https://en.wikipedia.org/wiki/Image_gradient) can be +particularly helpful when detecting the change of color pixel intensity. + ++++ + +### The Laplace filter with Gaussian second derivatives + +Let's start with an n-dimensional +[Laplace](https://en.wikipedia.org/wiki/Laplace_distribution) filter +("Laplacian-Gaussian") that uses +[Gaussian](https://en.wikipedia.org/wiki/Normal_distribution) second +derivatives. This Laplacian method focuses on pixels with rapid intensity change +in values and is combined with Gaussian smoothing to +[remove noise](https://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm). Let's examine +how it can be useful in analyzing 2D X-ray images. + ++++ + +- The implementation of the Laplacian-Gaussian filter is relatively +straightforward: 1) import the `ndimage` module from SciPy; and 2) call +[`scipy.ndimage.gaussian_laplace()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_laplace.html) +with a sigma (scalar) parameter, which affects the standard deviations of the +Gaussian filter (you'll use `1` in the example below): + +```{code-cell} ipython3 +from scipy import ndimage + +xray_image_laplace_gaussian = ndimage.gaussian_laplace(xray_image, sigma=1) +``` + +Display the original X-ray and the one with the Laplacian-Gaussian filter: + +```{code-cell} ipython3 +fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 10)) + +axes[0].set_title('Original') +axes[0].imshow(xray_image, cmap='gray') +axes[1].set_title('Laplacian-Gaussian (edges)') +axes[1].imshow(xray_image_laplace_gaussian, cmap='gray') +for i in axes: + i.axis('off') +plt.show() +``` + +### The Gaussian gradient magnitude method + +Another method for edge detection that can be useful is the +[Gaussian](https://en.wikipedia.org/wiki/Normal_distribution) (gradient) filter. +It computes the multidimensional gradient magnitude with Gaussian derivatives +and helps by remove +[high-frequency](https://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf) +image components. + ++++ + +**1.** Call [`scipy.ndimage.gaussian_gradient_magnitude()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_gradient_magnitude.html) +with a sigma (scalar) parameter (for standard deviations; you'll use `2` in the +example below): + +```{code-cell} ipython3 +x_ray_image_gaussian_gradient = ndimage.gaussian_gradient_magnitude(xray_image, sigma=2) +``` + +**2.** Display the original X-ray and the one with the Gaussian gradient filter: + +```{code-cell} ipython3 +fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 10)) + +axes[0].set_title('Original') +axes[0].imshow(xray_image, cmap='gray') +axes[1].set_title('Gaussian gradient (edges)') +axes[1].imshow(x_ray_image_gaussian_gradient, cmap='gray') +for i in axes: + i.axis('off') +plt.show() +``` + +### The Sobel-Feldman operator (the Sobel filter) + +To find regions of high spatial frequency (the edges or the edge maps) along the +horizontal and vertical axes of a 2D X-ray image, you can use the +[Sobel-Feldman operator (Sobel filter)](https://en.wikipedia.org/wiki/Sobel_operator) +technique. The Sobel filter applies two 3x3 kernel matrices — one for each axis +— onto the X-ray through a [convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution). +Then, these two points (gradients) are combined using the +[Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) to +produce a gradient magnitude. + ++++ + +**1.** Use the Sobel filters — ([`scipy.ndimage.sobel()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.sobel.html)) +— on x- and y-axes of the X-ray. Then, calculate the distance between `x` and +`y` (with the Sobel filters applied to them) using the +[Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and +NumPy's [`np.hypot()`](https://numpy.org/doc/stable/reference/generated/numpy.hypot.html) +to obtain the magnitude. Finally, normalize the rescaled image for the pixel +values to be between 0 and 255. + + [Image normalization](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) + follows the `output_channel = 255.0 * (input_channel - min_value) / (max_value - min_value)` + [formula](http://dev.ipol.im/~nmonzon/Normalization.pdf). Because you're + using a grayscale image, you need to normalize just one channel. + +```{code-cell} ipython3 +x_sobel = ndimage.sobel(xray_image, axis=0) +y_sobel = ndimage.sobel(xray_image, axis=1) + +xray_image_sobel = np.hypot(x_sobel, y_sobel) + +xray_image_sobel *= 255.0 / np.max(xray_image_sobel) +``` + +**2.** Change the new image array data type to the 32-bit floating-point format +from `float16` to [make it compatible](https://github.com/matplotlib/matplotlib/issues/15432) +with Matplotlib: + +```{code-cell} ipython3 +print('The data type - before: ', xray_image_sobel.dtype) + +xray_image_sobel = xray_image_sobel.astype('float32') + +print('The data type - after: ', xray_image_sobel.dtype) +``` + +**3.** Display the original X-ray and the one with the Sobel "edge" filter +applied. Note that both the grayscale and `CMRmap` colormaps are used to help +emphasize the edges: + +```{code-cell} ipython3 +fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15, 15)) + +axes[0].set_title('Original') +axes[0].imshow(xray_image, cmap='gray') +axes[1].set_title('Sobel (edges) - grayscale') +axes[1].imshow(xray_image_sobel, cmap='gray') +axes[2].set_title('Sobel (edges) - CMRmap') +axes[2].imshow(xray_image_sobel, cmap='CMRmap') +for i in axes: + i.axis('off') +plt.show() +``` + +### The Canny filter + +You can also consider using another well-known filter for edge detection called +the [Canny filter](https://en.wikipedia.org/wiki/Canny_edge_detector). + +First, you apply a [Gaussian](https://en.wikipedia.org/wiki/Gaussian_filter) +filter to remove the noise in an image. In this example, you're using using the +[Fourier](https://en.wikipedia.org/wiki/Fourier_transform) filter which +smoothens the X-ray through a [convolution](https://en.wikipedia.org/wiki/Convolution) +process. Next, you apply the [Prewitt filter](https://en.wikipedia.org/wiki/Prewitt_operator) +on each of the 2 axes of the image to help detect some of the edges — this will +result in 2 gradient values. Similar to the Sobel filter, the Prewitt operator +also applies two 3x3 kernel matrices — one for each axis — onto the X-ray +through a [convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution). +In the end, you compute the magnitude between the two gradients using the +[Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and +[normalize](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) +the images, as before. + ++++ + +**1.** Use SciPy's Fourier filters — [`scipy.ndimage.fourier_gaussian()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.fourier_gaussian.html) +— with a small `sigma` value to remove some of the noise from the X-ray. Then, +calculate two gradients using [`scipy.ndimage.prewitt()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.prewitt.html). +Next, measure the distance between the gradients using NumPy's `np.hypot()`. +Finally, [normalize](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) +the rescaled image, as before. + +```{code-cell} ipython3 +fourier_gaussian = ndimage.fourier_gaussian(xray_image, sigma=0.05) + +x_prewitt = ndimage.prewitt(fourier_gaussian, axis=0) +y_prewitt = ndimage.prewitt(fourier_gaussian, axis=1) + +xray_image_canny = np.hypot(x_prewitt, y_prewitt) + +xray_image_canny *= 255.0 / np.max(xray_image_canny) + +print('The data type - ', xray_image_canny.dtype) +``` + +**2.** Plot the original X-ray image and the ones with the edges detected with +the help of the Canny filter technique. The edges can be emphasized using the +`prism`, `nipy_spectral`, and `terrain` Matplotlib colormaps. + +```{code-cell} ipython3 +fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(20, 15)) + +axes[0].set_title('Original') +axes[0].imshow(xray_image, cmap='gray') +axes[1].set_title('Canny (edges) - prism') +axes[1].imshow(xray_image_canny, cmap='prism') +axes[2].set_title('Canny (edges) - nipy_spectral') +axes[2].imshow(xray_image_canny, cmap='nipy_spectral') +axes[3].set_title('Canny (edges) - terrain') +axes[3].imshow(xray_image_canny, cmap='terrain') +for i in axes: + i.axis('off') +plt.show() +``` + +## Apply masks to X-rays with `np.where()` + ++++ + +To screen out only certain pixels in X-ray images to help detect particular +features, you can apply masks with NumPy's +[`np.where(condition: array_like (bool), x: array_like, y: ndarray)`](https://numpy.org/doc/stable/reference/generated/numpy.where.html) +that returns `x` when `True` and `y` when `False`. + +Identifying regions of interest — certain sets of pixels in an image — can be +useful and masks serve as boolean arrays of the same shape as the original +image. + ++++ + +**1.** Retrieve some basics statistics about the pixel values in the original +X-ray image you've been working with: + +```{code-cell} ipython3 +print('The data type of the X-ray image is: ', xray_image.dtype) +print('The minimum pixel value is: ', np.min(xray_image)) +print('The maximum pixel value is: ', np.max(xray_image)) +print('The average pixel value is: ', np.mean(xray_image)) +print('The median pixel value is: ', np.median(xray_image)) +``` + +**2.** The array data type is `uint8` and the minimum/maximum value results +suggest that all 256 colors (from `0` to `255`) are used in the X-ray. Let's +visualize the _pixel intensity distribution_ of the original raw X-ray image +with `ndimage.histogram()` and Matplotlib: + +```{code-cell} ipython3 +pixel_intensity_distribution = ndimage.histogram(xray_image, min=np.min(xray_image), max=np.max(xray_image), bins=256) + +plt.plot(pixel_intensity_distribution) +plt.title('Pixel intensity distribution') +plt.show() +``` + +As the pixel intensity distribution suggests, there are many low (between around +0 and 20) and very high (between around 200 and 240) pixel values. + +**3.** You can create different conditional masks with NumPy's `np.where()` — +for example, let's have only those values of the image with the pixels exceeding +a certain threshold: + +```{code-cell} ipython3 +# The threshold is "greater than 150" +# Return the original image if true, `0` otherwise +xray_image_mask_noisy = np.where(xray_image > 150, xray_image, 0) + +plt.imshow(xray_image_mask_noisy, cmap='gray') +plt.axis('off') +plt.show() +``` + +```{code-cell} ipython3 +# The threshold is "greater than 150" +# Return `1` if true, `0` otherwise +xray_image_mask_less_noisy = np.where(xray_image > 150, 1, 0) + +plt.imshow(xray_image_mask_less_noisy, cmap='gray') +plt.axis('off') +plt.show() +``` + +## Compare the results + ++++ + +Let's display some of the results of processed X-ray images you've worked with +so far: + +```{code-cell} ipython3 +fig, axes = plt.subplots(nrows=1, ncols=9, figsize=(30, 30)) + +axes[0].set_title('Original') +axes[0].imshow(xray_image, cmap='gray') +axes[1].set_title('Laplace-Gaussian (edges)') +axes[1].imshow(xray_image_laplace_gaussian, cmap='gray') +axes[2].set_title('Gaussian gradient (edges)') +axes[2].imshow(x_ray_image_gaussian_gradient, cmap='gray') +axes[3].set_title('Sobel (edges) - grayscale') +axes[3].imshow(xray_image_sobel, cmap='gray') +axes[4].set_title('Sobel (edges) - hot') +axes[4].imshow(xray_image_sobel, cmap='hot') +axes[5].set_title('Canny (edges) - prism)') +axes[5].imshow(xray_image_canny, cmap='prism') +axes[6].set_title('Canny (edges) - nipy_spectral)') +axes[6].imshow(xray_image_canny, cmap='nipy_spectral') +axes[7].set_title('Mask (> 150, noisy)') +axes[7].imshow(xray_image_mask_noisy, cmap='gray') +axes[8].set_title('Mask (> 150, less noisy)') +axes[8].imshow(xray_image_mask_less_noisy, cmap='gray') +for i in axes: + i.axis('off') +plt.show() +``` + +## Next steps + ++++ + +If you want to use your own samples, you can use +[this image](https://openi.nlm.nih.gov/detailedresult?img=CXR3666_IM-1824-1001&query=chest%20infection&it=xg&req=4&npos=32) +or search for various other ones on the [_Openi_](https://openi.nlm.nih.gov) +database. Openi contains many biomedical images and it can be especially helpful +if you have low bandwidth and/or are restricted by the amount of data you can +download. + +To learn more about image processing in the context of biomedical image data or +simply edge detection, you may find the following material useful: + +- [DICOM processing and segmentation in Python](https://www.raddq.com/dicom-processing-segmentation-visualization-in-python/) with Scikit-Image and pydicom (Radiology Data Quest) +- [Image manipulation and processing using Numpy and Scipy](https://scipy-lectures.org/advanced/image_processing/index.html) (Scipy Lecture Notes) +- [Intensity values](https://s3.amazonaws.com/assets.datacamp.com/production/course_7032/slides/chapter2.pdf) (presentation, DataCamp) +- [Object detection with Raspberry Pi and Python](https://makersportal.com/blog/2019/4/23/image-processing-with-raspberry-pi-and-python-part-ii-spatial-statistics-and-correlations) (Maker Portal) +- [X-ray data preparation and segmentation](https://www.kaggle.com/eduardomineo/u-net-lung-segmentation-montgomery-shenzhen) with deep learning (a Kaggle-hosted Jupyter notebook) +- [Image filtering](https://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf) (lecture slides, CS6670: Computer Vision, Cornell University) +- [Edge detection in Python](https://towardsdatascience.com/edge-detection-in-python-a3c263a13e03) and NumPy (Towards Data Science) +- [Edge detection](https://datacarpentry.org/image-processing/08-edge-detection/) with Scikit-Image (Data Carpentry) +- [Image gradients and gradient filtering](https://www.cs.cmu.edu/~16385/s17/Slides/4.0_Image_Gradients_and_Gradient_Filtering.pdf) (lecture slides, 16-385 Computer Vision, Carnegie Mellon University) diff --git a/environment.yml b/environment.yml index a10f3e83..75479662 100644 --- a/environment.yml +++ b/environment.yml @@ -9,5 +9,6 @@ dependencies: - nbval - statsmodels - pip + - imageio - pip: - jupyter-book diff --git a/requirements.txt b/requirements.txt index 744d4e70..8b9bde6a 100644 --- a/requirements.txt +++ b/requirements.txt @@ -6,3 +6,4 @@ matplotlib pytest nbval statsmodels +imageio diff --git a/site/index.md b/site/index.md index 14e940b0..edd35fa8 100644 --- a/site/index.md +++ b/site/index.md @@ -31,6 +31,7 @@ content/tutorial-svd content/mooreslaw-tutorial content/save-load-arrays content/tutorial-deep-learning-on-mnist +content/tutorial-x-ray-image-processing ``` ### Attribution