diff --git a/README.md b/README.md
index b97bd9f0e8..a57c07e7c5 100644
--- a/README.md
+++ b/README.md
@@ -1,35 +1,76 @@
-# Companion Jupyter notebooks for the book "Deep Learning with Python"
+# Companion notebooks for Deep Learning with Python
 
-This repository contains Jupyter notebooks implementing the code samples found in the book [Deep Learning with Python, 2nd Edition (Manning Publications)](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff).
+This repository contains Jupyter notebooks implementing the code samples found in the book [Deep Learning with Python, third edition (2025)](https://www.manning.com/books/deep-learning-with-python-third-edition?a_aid=keras&a_bid=76564dff)
+by Francois Chollet and Matthew Watson. You will also find the legacy notebooks for the [second edition (2021)](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff)
+and the [first edition (2017)](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff).
 
 For readability, these notebooks only contain runnable code blocks and section titles, and omit everything else in the book: text paragraphs, figures, and pseudocode. **If you want to be able to follow what's going on, I recommend reading the notebooks side by side with your copy of the book.**
 
-These notebooks use TensorFlow 2.6.
+## Running the code
+
+We recommend running these notebooks on [Colab](https://colab.google), which
+provides a hosted runtime with all the dependencies you will need. You can also
+run these notebooks locally, either by setting up your own Jupyter environment
+or by following Colab's instructions for
+[running locally](https://research.google.com/colaboratory/local-runtimes.html).
+
+By default, all notebooks will run on Colab's free-tier GPU runtime, which
+is sufficient to run all the code in this book. Chapters 8-18 will benefit
+from a faster GPU, available with a Colab Pro subscription. You can change your
+runtime type using **Runtime -> Change runtime type** in Colab's dropdown menus.
+
+## Choosing a backend
+
+The code for the third edition is written using Keras 3. As such, it can be run
+with JAX, TensorFlow, or PyTorch as its backend. To set the backend, update the
+cell at the top of each notebook, which looks like this:
+
+```python
+import os
+os.environ["KERAS_BACKEND"] = "jax"
+```
+
+This needs to be done only once per session, before Keras is imported. If you
+are in the middle of running a notebook, you will need to restart the session
+and rerun all relevant notebook cells. You can do this using
+**Runtime -> Restart Session** in Colab's dropdown menus.
+
+## Using Kaggle data
+
+This book uses datasets and model weights provided by Kaggle, an online machine
+learning community and platform. You will need to create a Kaggle login to run
+the Kaggle code in this book; instructions are given in Chapter 8.
+
+For chapters that need Kaggle data, you can log in to Kaggle once per session
+when you reach the notebook cell that calls `kagglehub.login()`. Alternatively,
+you can set up your Kaggle login information once as Colab secrets:
+
+ * Go to https://www.kaggle.com/ and sign in.
+ * Go to https://www.kaggle.com/settings and generate a Kaggle API key.
+ * Open the secrets tab in Colab by clicking the key icon on the left.
+ * Add two secrets, `KAGGLE_USERNAME` and `KAGGLE_KEY`, with the username and
+   key you just created.
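+
+For example, a cell like the following (a minimal sketch, not part of the
+book's notebooks; it assumes the two secrets above exist and that the notebook
+has been granted access to them) copies the secrets into the environment
+variables that `kagglehub` checks, so no interactive login is needed:
+
+```python
+import os
+from google.colab import userdata  # Colab-only helper for reading secrets
+
+# Expose the Colab secrets as the environment variables kagglehub looks for.
+os.environ["KAGGLE_USERNAME"] = userdata.get("KAGGLE_USERNAME")
+os.environ["KAGGLE_KEY"] = userdata.get("KAGGLE_KEY")
+```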
+ +Following this approach you will only need to copy your Kaggle secret key once, +though you will need to allow each notebook to access your secrets when running +the relevant Kaggle code. ## Table of contents * [Chapter 2: The mathematical building blocks of neural networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter02_mathematical-building-blocks.ipynb) -* [Chapter 3: Introduction to Keras and TensorFlow](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter03_introduction-to-keras-and-tf.ipynb) -* [Chapter 4: Getting started with neural networks: classification and regression](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter04_getting-started-with-neural-networks.ipynb) +* [Chapter 3: Introduction to TensorFlow, PyTorch, JAX, and Keras](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter03_introduction-to-ml-frameworks.ipynb) +* [Chapter 4: Classification and regression](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter04_classification-and-regression.ipynb) * [Chapter 5: Fundamentals of machine learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter05_fundamentals-of-ml.ipynb) -* [Chapter 7: Working with Keras: a deep dive](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter07_working-with-keras.ipynb) -* [Chapter 8: Introduction to deep learning for computer vision](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter08_intro-to-dl-for-computer-vision.ipynb) -* Chapter 9: Advanced deep learning for computer vision - - [Part 1: Image segmentation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part01_image-segmentation.ipynb) - - [Part 2: Modern convnet architecture patterns](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part02_modern-convnet-architecture-patterns.ipynb) - - [Part 3: Interpreting what convnets learn](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part03_interpreting-what-convnets-learn.ipynb) -* [Chapter 10: Deep learning for timeseries](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter10_dl-for-timeseries.ipynb) -* Chapter 11: Deep learning for text - - [Part 1: Introduction](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_part01_introduction.ipynb) - - [Part 2: Sequence models](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_part02_sequence-models.ipynb) - - [Part 3: Transformer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_part03_transformer.ipynb) - - [Part 4: Sequence-to-sequence learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_part04_sequence-to-sequence-learning.ipynb) -* Chapter 12: Generative deep learning - - [Part 1: Text 
generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part01_text-generation.ipynb) - - [Part 2: Deep Dream](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part02_deep-dream.ipynb) - - [Part 3: Neural style transfer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part03_neural-style-transfer.ipynb) - - [Part 4: Variational autoencoders](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part04_variational-autoencoders.ipynb) - - [Part 5: Generative adversarial networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part05_gans.ipynb) -* [Chapter 13: Best practices for the real world](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter13_best-practices-for-the-real-world.ipynb) -* [Chapter 14: Conclusions](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter14_conclusions.ipynb) +* [Chapter 7: A deep dive on Keras](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter07_deep-dive-keras.ipynb) +* [Chapter 8: Image Classification](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter08_image-classification.ipynb) +* [Chapter 9: Convnet architecture patterns](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_convnet-architecture-patterns.ipynb) +* [Chapter 10: Interpreting what ConvNets learn](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter10_interpreting-what-convnets-learn.ipynb) +* [Chapter 11: Image Segmentation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_image-segmentation.ipynb) +* [Chapter 12: Object Detection](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_object-detection.ipynb) +* [Chapter 13: Timeseries Forecasting](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter13_timeseries-forecasting.ipynb) +* [Chapter 14: Text Classification](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter14_text-classification.ipynb) +* [Chapter 15: Language Models and the Transformer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter15_language-models-and-the-transformer.ipynb) +* [Chapter 16: Text Generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter16_text-generation.ipynb) +* [Chapter 17: Image Generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter17_image-generation.ipynb) +* [Chapter 18: Best practices for the real world](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter18_best-practices-for-the-real-world.ipynb) diff --git a/chapter02_mathematical-building-blocks.ipynb b/chapter02_mathematical-building-blocks.ipynb index 01edc9becc..3c419b7b8f 100644 --- a/chapter02_mathematical-building-blocks.ipynb +++ 
b/chapter02_mathematical-building-blocks.ipynb @@ -6,16 +6,55 @@ "colab_type": "text" }, "source": [ - "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"tensorflow\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], "source": [ - "# The mathematical building blocks of neural networks" + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" ] }, { @@ -24,7 +63,7 @@ "colab_type": "text" }, "source": [ - "## A first look at a neural network" + "## The mathematical building blocks of neural networks" ] }, { @@ -33,7 +72,7 @@ "colab_type": "text" }, "source": [ - "**Loading the MNIST dataset in Keras**" + "### A first look at a neural network" ] }, { @@ -44,7 +83,8 @@ }, "outputs": [], "source": [ - "from tensorflow.keras.datasets import mnist\n", + "from keras.datasets import mnist\n", + "\n", "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()" ] }, @@ -114,15 +154,6 @@ "test_labels" ] }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**The network architecture**" - ] - }, { "cell_type": "code", "execution_count": 0, @@ -131,21 +162,15 @@ }, "outputs": [], "source": [ - "from tensorflow import keras\n", - "from tensorflow.keras import layers\n", - "model = keras.Sequential([\n", - " layers.Dense(512, activation=\"relu\"),\n", - " layers.Dense(10, activation=\"softmax\")\n", - "])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**The compilation step**" + "import keras\n", + "from keras import layers\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")" ] }, { @@ -156,18 +181,11 @@ }, "outputs": [], "source": [ - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"sparse_categorical_crossentropy\",\n", - " metrics=[\"accuracy\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Preparing the image data**" + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")" ] }, { @@ -184,15 +202,6 @@ "test_images = test_images.astype(\"float32\") / 255" ] }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**\"Fitting\" the model**" - ] - }, { "cell_type": "code", "execution_count": 0, @@ -204,15 +213,6 @@ "model.fit(train_images, train_labels, epochs=5, batch_size=128)" ] }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Using the model to make predictions**" - ] - }, { "cell_type": "code", "execution_count": 0, @@ -259,15 +259,6 @@ "test_labels[0]" ] }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Evaluating the model on new data**" - ] - }, { "cell_type": "code", "execution_count": 0, @@ -286,7 +277,7 @@ "colab_type": "text" }, "source": [ - "## Data representations for neural networks" + "### Data representations for neural networks" ] }, { @@ -295,7 +286,7 @@ "colab_type": "text" }, "source": [ - "### Scalars (rank-0 tensors)" + "#### Scalars (rank-0 tensors)" ] }, { @@ -328,7 +319,7 @@ "colab_type": "text" }, "source": [ - "### Vectors (rank-1 tensors)" + "#### Vectors (rank-1 tensors)" ] }, { @@ -360,7 +351,7 @@ "colab_type": "text" }, "source": [ - "### Matrices (rank-2 tensors)" + "#### Matrices (rank-2 tensors)" ] }, { @@ -383,7 +374,7 @@ "colab_type": "text" }, "source": [ - "### Rank-3 and higher-rank tensors" + "#### Rank-3 tensors and higher-rank tensors" ] }, { @@ -412,7 +403,7 @@ "colab_type": "text" }, "source": [ - "### Key attributes" + "#### Key attributes" ] }, { 
@@ -423,7 +414,8 @@ }, "outputs": [], "source": [ - "from tensorflow.keras.datasets import mnist\n", + "from keras.datasets import mnist\n", + "\n", "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()" ] }, @@ -460,15 +452,6 @@ "train_images.dtype" ] }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Displaying the fourth digit**" - ] - }, { "cell_type": "code", "execution_count": 0, @@ -478,6 +461,7 @@ "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", + "\n", "digit = train_images[4]\n", "plt.imshow(digit, cmap=plt.cm.binary)\n", "plt.show()" @@ -500,7 +484,7 @@ "colab_type": "text" }, "source": [ - "### Manipulating tensors in NumPy" + "#### Manipulating tensors in NumPy" ] }, { @@ -567,7 +551,7 @@ "colab_type": "text" }, "source": [ - "### The notion of data batches" + "#### The notion of data batches" ] }, { @@ -601,7 +585,7 @@ "outputs": [], "source": [ "n = 3\n", - "batch = train_images[128 * n:128 * (n + 1)]" + "batch = train_images[128 * n : 128 * (n + 1)]" ] }, { @@ -610,7 +594,7 @@ "colab_type": "text" }, "source": [ - "### Real-world examples of data tensors" + "#### Real-world examples of data tensors" ] }, { @@ -619,7 +603,7 @@ "colab_type": "text" }, "source": [ - "### Vector data" + "##### Vector data" ] }, { @@ -628,7 +612,7 @@ "colab_type": "text" }, "source": [ - "### Timeseries data or sequence data" + "##### Timeseries data or sequence data" ] }, { @@ -637,7 +621,7 @@ "colab_type": "text" }, "source": [ - "### Image data" + "##### Image data" ] }, { @@ -646,7 +630,7 @@ "colab_type": "text" }, "source": [ - "### Video data" + "##### Video data" ] }, { @@ -655,7 +639,7 @@ "colab_type": "text" }, "source": [ - "## The gears of neural networks: tensor operations" + "### The gears of neural networks: Tensor operations" ] }, { @@ -664,7 +648,7 @@ "colab_type": "text" }, "source": [ - "### Element-wise operations" + "#### Element-wise operations" ] }, { @@ -718,7 +702,7 @@ "t0 = time.time()\n", "for _ in range(1000):\n", " z = x + y\n", - " z = np.maximum(z, 0.)\n", + " z = np.maximum(z, 0.0)\n", "print(\"Took: {0:.2f} s\".format(time.time() - t0))" ] }, @@ -743,7 +727,7 @@ "colab_type": "text" }, "source": [ - "### Broadcasting" + "#### Broadcasting" ] }, { @@ -755,6 +739,7 @@ "outputs": [], "source": [ "import numpy as np\n", + "\n", "X = np.random.random((32, 10))\n", "y = np.random.random((10,))" ] @@ -778,7 +763,7 @@ }, "outputs": [], "source": [ - "Y = np.concatenate([y] * 32, axis=0)" + "Y = np.tile(y, (32, 1))" ] }, { @@ -809,6 +794,7 @@ "outputs": [], "source": [ "import numpy as np\n", + "\n", "x = np.random.random((64, 3, 32, 10))\n", "y = np.random.random((32, 10))\n", "z = np.maximum(x, y)" @@ -820,7 +806,7 @@ "colab_type": "text" }, "source": [ - "### Tensor product" + "#### Tensor product" ] }, { @@ -833,7 +819,9 @@ "source": [ "x = np.random.random((32,))\n", "y = np.random.random((32,))\n", - "z = np.dot(x, y)" + "\n", + "z = np.matmul(x, y)\n", + "z = x @ y" ] }, { @@ -844,11 +832,11 @@ }, "outputs": [], "source": [ - "def naive_vector_dot(x, y):\n", + "def naive_vector_product(x, y):\n", " assert len(x.shape) == 1\n", " assert len(y.shape) == 1\n", " assert x.shape[0] == y.shape[0]\n", - " z = 0.\n", + " z = 0.0\n", " for i in range(x.shape[0]):\n", " z += x[i] * y[i]\n", " return z" @@ -862,7 +850,7 @@ }, "outputs": [], "source": [ - "def naive_matrix_vector_dot(x, y):\n", + "def naive_matrix_vector_product(x, y):\n", " assert len(x.shape) == 2\n", " assert len(y.shape) == 
1\n", " assert x.shape[1] == y.shape[0]\n", @@ -881,10 +869,10 @@ }, "outputs": [], "source": [ - "def naive_matrix_vector_dot(x, y):\n", + "def naive_matrix_vector_product(x, y):\n", " z = np.zeros(x.shape[0])\n", " for i in range(x.shape[0]):\n", - " z[i] = naive_vector_dot(x[i, :], y)\n", + " z[i] = naive_vector_product(x[i, :], y)\n", " return z" ] }, @@ -896,7 +884,7 @@ }, "outputs": [], "source": [ - "def naive_matrix_dot(x, y):\n", + "def naive_matrix_product(x, y):\n", " assert len(x.shape) == 2\n", " assert len(y.shape) == 2\n", " assert x.shape[1] == y.shape[0]\n", @@ -905,7 +893,7 @@ " for j in range(y.shape[1]):\n", " row_x = x[i, :]\n", " column_y = y[:, j]\n", - " z[i, j] = naive_vector_dot(row_x, column_y)\n", + " z[i, j] = naive_vector_product(row_x, column_y)\n", " return z" ] }, @@ -915,7 +903,7 @@ "colab_type": "text" }, "source": [ - "### Tensor reshaping" + "#### Tensor reshaping" ] }, { @@ -938,8 +926,8 @@ "outputs": [], "source": [ "x = np.array([[0., 1.],\n", - " [2., 3.],\n", - " [4., 5.]])\n", + " [2., 3.],\n", + " [4., 5.]])\n", "x.shape" ] }, @@ -963,18 +951,21 @@ }, "outputs": [], "source": [ - "x = np.zeros((300, 20))\n", - "x = np.transpose(x)\n", - "x.shape" + "x = x.reshape((2, 3))\n", + "x" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" }, + "outputs": [], "source": [ - "### Geometric interpretation of tensor operations" + "x = np.zeros((300, 20))\n", + "x = np.transpose(x)\n", + "x.shape" ] }, { @@ -983,7 +974,7 @@ "colab_type": "text" }, "source": [ - "### A geometric interpretation of deep learning" + "#### Geometric interpretation of tensor operations" ] }, { @@ -992,7 +983,7 @@ "colab_type": "text" }, "source": [ - "## The engine of neural networks: gradient-based optimization" + "#### A geometric interpretation of deep learning" ] }, { @@ -1001,7 +992,7 @@ "colab_type": "text" }, "source": [ - "### What's a derivative?" + "### The engine of neural networks: Gradient-based optimization" ] }, { @@ -1010,7 +1001,7 @@ "colab_type": "text" }, "source": [ - "### Derivative of a tensor operation: the gradient" + "#### What's a derivative?" 
] }, { @@ -1019,7 +1010,7 @@ "colab_type": "text" }, "source": [ - "### Stochastic gradient descent" + "#### Derivative of a tensor operation: The gradient" ] }, { @@ -1028,7 +1019,7 @@ "colab_type": "text" }, "source": [ - "### Chaining derivatives: The Backpropagation algorithm" + "#### Stochastic gradient descent" ] }, { @@ -1037,7 +1028,7 @@ "colab_type": "text" }, "source": [ - "#### The chain rule" + "#### Chaining derivatives: The Backpropagation algorithm" ] }, { @@ -1046,7 +1037,7 @@ "colab_type": "text" }, "source": [ - "#### Automatic differentiation with computation graphs" + "##### The chain rule" ] }, { @@ -1055,52 +1046,7 @@ "colab_type": "text" }, "source": [ - "#### The gradient tape in TensorFlow" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab_type": "code" - }, - "outputs": [], - "source": [ - "import tensorflow as tf\n", - "x = tf.Variable(0.)\n", - "with tf.GradientTape() as tape:\n", - " y = 2 * x + 3\n", - "grad_of_y_wrt_x = tape.gradient(y, x)" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab_type": "code" - }, - "outputs": [], - "source": [ - "x = tf.Variable(tf.random.uniform((2, 2)))\n", - "with tf.GradientTape() as tape:\n", - " y = 2 * x + 3\n", - "grad_of_y_wrt_x = tape.gradient(y, x)" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab_type": "code" - }, - "outputs": [], - "source": [ - "W = tf.Variable(tf.random.uniform((2, 2)))\n", - "b = tf.Variable(tf.zeros((2,)))\n", - "x = tf.random.uniform((2, 2))\n", - "with tf.GradientTape() as tape:\n", - " y = tf.matmul(x, W) + b\n", - "grad_of_y_wrt_W_and_b = tape.gradient(y, [W, b])" + "##### Automatic differentiation with computation graphs" ] }, { @@ -1109,7 +1055,7 @@ "colab_type": "text" }, "source": [ - "## Looking back at our first example" + "### Looking back at our first example" ] }, { @@ -1135,10 +1081,12 @@ }, "outputs": [], "source": [ - "model = keras.Sequential([\n", - " layers.Dense(512, activation=\"relu\"),\n", - " layers.Dense(10, activation=\"softmax\")\n", - "])" + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")" ] }, { @@ -1149,9 +1097,11 @@ }, "outputs": [], "source": [ - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"sparse_categorical_crossentropy\",\n", - " metrics=[\"accuracy\"])" + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")" ] }, { @@ -1162,7 +1112,12 @@ }, "outputs": [], "source": [ - "model.fit(train_images, train_labels, epochs=5, batch_size=128)" + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=5,\n", + " batch_size=128,\n", + ")" ] }, { @@ -1171,7 +1126,7 @@ "colab_type": "text" }, "source": [ - "### Reimplementing our first example from scratch in TensorFlow" + "#### Reimplementing our first example from scratch" ] }, { @@ -1180,7 +1135,7 @@ "colab_type": "text" }, "source": [ - "#### A simple Dense class" + "##### A simple Dense class" ] }, { @@ -1191,22 +1146,23 @@ }, "outputs": [], "source": [ - "import tensorflow as tf\n", + "import keras\n", + "from keras import ops\n", "\n", "class NaiveDense:\n", - " def __init__(self, input_size, output_size, activation):\n", + " def __init__(self, input_size, output_size, activation=None):\n", " self.activation = activation\n", - "\n", - " w_shape = (input_size, output_size)\n", - " w_initial_value = 
tf.random.uniform(w_shape, minval=0, maxval=1e-1)\n",
-    "        self.W = tf.Variable(w_initial_value)\n",
-    "\n",
-    "        b_shape = (output_size,)\n",
-    "        b_initial_value = tf.zeros(b_shape)\n",
-    "        self.b = tf.Variable(b_initial_value)\n",
+    "        self.W = keras.Variable(\n",
+    "            shape=(input_size, output_size), initializer=\"uniform\"\n",
+    "        )\n",
+    "        self.b = keras.Variable(shape=(output_size,), initializer=\"zeros\")\n",
     "\n",
     "    def __call__(self, inputs):\n",
-    "        return self.activation(tf.matmul(inputs, self.W) + self.b)\n",
+    "        x = ops.matmul(inputs, self.W)\n",
+    "        x = x + self.b\n",
+    "        if self.activation is not None:\n",
+    "            x = self.activation(x)\n",
+    "        return x\n",
     "\n",
     "    @property\n",
     "    def weights(self):\n",
@@ -1219,7 +1175,7 @@
    "colab_type": "text"
   },
   "source": [
-    "#### A simple Sequential class"
+    "##### A simple Sequential class"
   ]
  },
  {
@@ -1237,15 +1193,15 @@
    "    def __call__(self, inputs):\n",
    "        x = inputs\n",
    "        for layer in self.layers:\n",
-    "          x = layer(x)\n",
+    "            x = layer(x)\n",
    "        return x\n",
    "\n",
    "    @property\n",
    "    def weights(self):\n",
-    "       weights = []\n",
-    "       for layer in self.layers:\n",
-    "           weights += layer.weights\n",
-    "       return weights"
+    "        weights = []\n",
+    "        for layer in self.layers:\n",
+    "            weights += layer.weights\n",
+    "        return weights"
   ]
  },
  {
@@ -1256,10 +1212,12 @@
   },
   "outputs": [],
   "source": [
-    "model = NaiveSequential([\n",
-    "    NaiveDense(input_size=28 * 28, output_size=512, activation=tf.nn.relu),\n",
-    "    NaiveDense(input_size=512, output_size=10, activation=tf.nn.softmax)\n",
-    "])\n",
+    "model = NaiveSequential(\n",
+    "    [\n",
+    "        NaiveDense(input_size=28 * 28, output_size=512, activation=ops.relu),\n",
+    "        NaiveDense(input_size=512, output_size=10, activation=ops.softmax),\n",
+    "    ]\n",
+    ")\n",
     "assert len(model.weights) == 4"
   ]
  },
 {
@@ -1269,7 +1227,7 @@
    "colab_type": "text"
   },
   "source": [
-    "#### A batch generator"
+    "##### A batch generator"
   ]
  },
  {
@@ -1304,26 +1262,16 @@
    "colab_type": "text"
   },
   "source": [
-    "### Running one training step"
+    "#### Running one training step"
   ]
  },
  {
-   "cell_type": "code",
-   "execution_count": 0,
+   "cell_type": "markdown",
   "metadata": {
-    "colab_type": "code"
+    "colab_type": "text"
   },
-   "outputs": [],
   "source": [
-    "def one_training_step(model, images_batch, labels_batch):\n",
-    "    with tf.GradientTape() as tape:\n",
-    "        predictions = model(images_batch)\n",
-    "        per_sample_losses = tf.keras.losses.sparse_categorical_crossentropy(\n",
-    "            labels_batch, predictions)\n",
-    "        average_loss = tf.reduce_mean(per_sample_losses)\n",
-    "    gradients = tape.gradient(average_loss, model.weights)\n",
-    "    update_weights(gradients, model.weights)\n",
-    "    return average_loss"
+    "##### The weight update step"
   ]
  },
  {
@@ -1338,7 +1286,7 @@
    "\n",
    "def update_weights(gradients, weights):\n",
    "    for g, w in zip(gradients, weights):\n",
-    "        w.assign_sub(g * learning_rate)"
+    "        w.assign(w - g * learning_rate)"
   ]
  },
  {
@@ -1349,7 +1297,7 @@
   },
   "outputs": [],
   "source": [
-    "from tensorflow.keras import optimizers\n",
+    "from keras import optimizers\n",
     "\n",
     "optimizer = optimizers.SGD(learning_rate=1e-3)\n",
     "\n",
@@ -1363,7 +1311,24 @@
    "colab_type": "text"
   },
   "source": [
-    "### The full training loop"
+    "##### Gradient computation"
  ]
 },
 {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "%%backend tensorflow\n",
+    "import tensorflow as tf\n",
+    "\n",
+    "x = tf.Variable(0.0)\n",
+    "with tf.GradientTape() as tape:\n",
+    "    y = 2 * x + 3\n",
+    "grad_of_y_wrt_x = 
tape.gradient(y, x)" ] }, { @@ -1374,6 +1339,35 @@ }, "outputs": [], "source": [ + "%%backend tensorflow\n", + "def one_training_step(model, images_batch, labels_batch):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(images_batch)\n", + " loss = ops.sparse_categorical_crossentropy(labels_batch, predictions)\n", + " average_loss = ops.mean(loss)\n", + " gradients = tape.gradient(average_loss, model.weights)\n", + " update_weights(gradients, model.weights)\n", + " return average_loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The full training loop" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", "def fit(model, images, labels, epochs, batch_size=128):\n", " for epoch_counter in range(epochs):\n", " print(f\"Epoch {epoch_counter}\")\n", @@ -1393,7 +1387,9 @@ }, "outputs": [], "source": [ - "from tensorflow.keras.datasets import mnist\n", + "%%backend tensorflow\n", + "from keras.datasets import mnist\n", + "\n", "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", "\n", "train_images = train_images.reshape((60000, 28 * 28))\n", @@ -1410,7 +1406,7 @@ "colab_type": "text" }, "source": [ - "### Evaluating the model" + "#### Evaluating the model" ] }, { @@ -1421,27 +1417,19 @@ }, "outputs": [], "source": [ + "%%backend tensorflow\n", "predictions = model(test_images)\n", - "predictions = predictions.numpy()\n", - "predicted_labels = np.argmax(predictions, axis=1)\n", + "predicted_labels = ops.argmax(predictions, axis=1)\n", "matches = predicted_labels == test_labels\n", - "print(f\"accuracy: {matches.mean():.2f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "## Summary" + "f\"accuracy: {ops.mean(matches):.2f}\"" ] } ], "metadata": { + "accelerator": "GPU", "colab": { "collapsed_sections": [], - "name": "chapter02_mathematical-building-blocks.i", + "name": "chapter02_mathematical-building-blocks", "private_outputs": false, "provenance": [], "toc_visible": true @@ -1461,7 +1449,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.0" + "version": "3.10.0" } }, "nbformat": 4, diff --git a/chapter03_introduction-to-ml-frameworks.ipynb b/chapter03_introduction-to-ml-frameworks.ipynb new file mode 100644 index 0000000000..7d29c2f859 --- /dev/null +++ b/chapter03_introduction-to-ml-frameworks.ipynb @@ -0,0 +1,1779 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Introduction to TensorFlow, PyTorch, JAX, and Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A brief history of deep learning frameworks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### How these frameworks relate to each other" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensors and variables in TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Constant tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "tf.ones(shape=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tf.zeros(shape=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tf.constant([1, 2, 3], dtype=\"float32\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Random tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.random.normal(shape=(3, 1), mean=0., stddev=1.)\n", + "print(x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.random.uniform(shape=(3, 1), minval=0., maxval=1.)\n", + "print(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Tensor assignment and the Variable class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "x = np.ones(shape=(2, 2))\n", + "x[0, 0] = 0.0" + ] + }, + { + "cell_type": 
"code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v = tf.Variable(initial_value=tf.random.normal(shape=(3, 1)))\n", + "print(v)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v.assign(tf.ones((3, 1)))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v[0, 0].assign(3.)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v.assign_add(tf.ones((3, 1)))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensor operations: Doing math in TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "a = tf.ones((2, 2))\n", + "b = tf.square(a)\n", + "c = tf.sqrt(a)\n", + "d = b + c\n", + "e = tf.matmul(a, b)\n", + "f = tf.concat((a, b), axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def dense(inputs, W, b):\n", + " return tf.nn.relu(tf.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Gradients in TensorFlow: A second look at the GradientTape API" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_var = tf.Variable(initial_value=3.0)\n", + "with tf.GradientTape() as tape:\n", + " result = tf.square(input_var)\n", + "gradient = tape.gradient(result, input_var)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_const = tf.constant(3.0)\n", + "with tf.GradientTape() as tape:\n", + " tape.watch(input_const)\n", + " result = tf.square(input_const)\n", + "gradient = tape.gradient(result, input_const)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "time = tf.Variable(0.0)\n", + "with tf.GradientTape() as outer_tape:\n", + " with tf.GradientTape() as inner_tape:\n", + " position = 4.9 * time**2\n", + " speed = inner_tape.gradient(position, time)\n", + "acceleration = outer_tape.gradient(speed, time)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Making TensorFlow functions fast using compilation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "@tf.function\n", + "def dense(inputs, W, b):\n", + " return tf.nn.relu(tf.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "@tf.function(jit_compile=True)\n", + "def dense(inputs, W, b):\n", + " return tf.nn.relu(tf.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### An end-to-end example: A linear classifier in pure TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + 
"num_samples_per_class = 1000\n", + "negative_samples = np.random.multivariate_normal(\n", + " mean=[0, 3], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class\n", + ")\n", + "positive_samples = np.random.multivariate_normal(\n", + " mean=[3, 0], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = np.vstack((negative_samples, positive_samples)).astype(np.float32)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "targets = np.vstack(\n", + " (\n", + " np.zeros((num_samples_per_class, 1), dtype=\"float32\"),\n", + " np.ones((num_samples_per_class, 1), dtype=\"float32\"),\n", + " )\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=targets[:, 0])\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_dim = 2\n", + "output_dim = 1\n", + "W = tf.Variable(initial_value=tf.random.uniform(shape=(input_dim, output_dim)))\n", + "b = tf.Variable(initial_value=tf.zeros(shape=(output_dim,)))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def model(inputs, W, b):\n", + " return tf.matmul(inputs, W) + b" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def mean_squared_error(targets, predictions):\n", + " per_sample_losses = tf.square(targets - predictions)\n", + " return tf.reduce_mean(per_sample_losses)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 0.1\n", + "\n", + "@tf.function(jit_compile=True)\n", + "def training_step(inputs, targets, W, b):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(inputs, W, b)\n", + " loss = mean_squared_error(predictions, targets)\n", + " grad_loss_wrt_W, grad_loss_wrt_b = tape.gradient(loss, [W, b])\n", + " W.assign_sub(grad_loss_wrt_W * learning_rate)\n", + " b.assign_sub(grad_loss_wrt_b * learning_rate)\n", + " return loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for step in range(40):\n", + " loss = training_step(inputs, targets, W, b)\n", + " print(f\"Loss at step {step}: {loss:.4f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model(inputs, W, b)\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.linspace(-1, 4, 100)\n", + "y = -W[0] / W[1] * x + (0.5 - b) / W[1]\n", + "plt.plot(x, y, \"-r\")\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### What makes the TensorFlow approach unique" + ] + }, + { + "cell_type": 
"markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to PyTorch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with PyTorch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensors and parameters in PyTorch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Constant tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import torch\n", + "torch.ones(size=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch.zeros(size=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch.tensor([1, 2, 3], dtype=torch.float32)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Random tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch.normal(\n", + "mean=torch.zeros(size=(3, 1)),\n", + "std=torch.ones(size=(3, 1)))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch.rand(3, 1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Tensor assignment and the Parameter class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = torch.zeros(size=(2, 1))\n", + "x[0, 0] = 1.\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = torch.zeros(size=(2, 1))\n", + "p = torch.nn.parameter.Parameter(data=x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensor operations: Doing math in PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "a = torch.ones((2, 2))\n", + "b = torch.square(a)\n", + "c = torch.sqrt(a)\n", + "d = b + c\n", + "e = torch.matmul(a, b)\n", + "f = torch.cat((a, b), dim=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def dense(inputs, W, b):\n", + " return torch.nn.relu(torch.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Computing gradients with PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_var = torch.tensor(3.0, requires_grad=True)\n", + "result = torch.square(input_var)\n", + "result.backward()\n", + "gradient = input_var.grad\n", + "gradient" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "result = torch.square(input_var)\n", + "result.backward()\n", + "input_var.grad" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": 
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "input_var.grad = None"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text"
+   },
+   "source": [
+    "#### An end-to-end example: A linear classifier in pure PyTorch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "input_dim = 2\n",
+    "output_dim = 1\n",
+    "\n",
+    "W = torch.rand(input_dim, output_dim, requires_grad=True)\n",
+    "b = torch.zeros(output_dim, requires_grad=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "def model(inputs, W, b):\n",
+    "    return torch.matmul(inputs, W) + b"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "def mean_squared_error(targets, predictions):\n",
+    "    per_sample_losses = torch.square(targets - predictions)\n",
+    "    return torch.mean(per_sample_losses)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "learning_rate = 0.1\n",
+    "\n",
+    "def training_step(inputs, targets, W, b):\n",
+    "    predictions = model(inputs, W, b)\n",
+    "    loss = mean_squared_error(targets, predictions)\n",
+    "    loss.backward()\n",
+    "    grad_loss_wrt_W, grad_loss_wrt_b = W.grad, b.grad\n",
+    "    with torch.no_grad():\n",
+    "        W -= grad_loss_wrt_W * learning_rate\n",
+    "        b -= grad_loss_wrt_b * learning_rate\n",
+    "    W.grad = None\n",
+    "    b.grad = None\n",
+    "    return loss"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text"
+   },
+   "source": [
+    "##### Packaging state and computation with the Module class"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "class LinearModel(torch.nn.Module):\n",
+    "    def __init__(self):\n",
+    "        super().__init__()\n",
+    "        self.W = torch.nn.Parameter(torch.rand(input_dim, output_dim))\n",
+    "        self.b = torch.nn.Parameter(torch.zeros(output_dim))\n",
+    "\n",
+    "    def forward(self, inputs):\n",
+    "        return torch.matmul(inputs, self.W) + self.b"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "model = LinearModel()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "torch_inputs = torch.tensor(inputs)\n",
+    "output = model(torch_inputs)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "def training_step(inputs, targets):\n",
+    "    predictions = model(inputs)\n",
+    "    loss = mean_squared_error(targets, predictions)\n",
+    "    loss.backward()\n",
+    "    optimizer.step()\n",
+    "    model.zero_grad()\n",
+    "    return loss"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text"
+   },
+   "source": [
+    "##### Making PyTorch modules fast using compilation"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "compiled_model = torch.compile(model)"
+   ]
+  },
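+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "# Editor's sketch, not from the book: a compiled module is called\n",
+    "# exactly like the original one. The first call is slow because it\n",
+    "# triggers compilation; subsequent calls reuse the compiled graph.\n",
+    "predictions = compiled_model(torch_inputs)"
+   ]
+  },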
+ "colab_type": "code" + }, + "outputs": [], + "source": [ + "@torch.compile\n", + "def dense(inputs, W, b):\n", + " return torch.nn.relu(torch.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### What makes the PyTorch approach unique" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to JAX" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with JAX" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Tensors in JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from jax import numpy as jnp\n", + "jnp.ones(shape=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "jnp.zeros(shape=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "jnp.array([1, 2, 3], dtype=\"float32\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Random number generation in JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.random.normal(size=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.random.normal(size=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def apply_noise(x, seed):\n", + " np.random.seed(seed)\n", + " x = x * np.random.normal((3,))\n", + " return x\n", + "\n", + "seed = 1337\n", + "y = apply_noise(x, seed)\n", + "seed += 1\n", + "z = apply_noise(x, seed)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import jax\n", + "\n", + "seed_key = jax.random.key(1337)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "seed_key = jax.random.key(0)\n", + "jax.random.normal(seed_key, shape=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "seed_key = jax.random.key(123)\n", + "jax.random.normal(seed_key, shape=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "jax.random.normal(seed_key, shape=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "seed_key = jax.random.key(123)\n", + "jax.random.normal(seed_key, shape=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "new_seed_key = jax.random.split(seed_key, num=1)[0]\n", + "jax.random.normal(new_seed_key, shape=(3,))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensor assignment" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + 
}, + "outputs": [], + "source": [ + "x = jnp.array([1, 2, 3], dtype=\"float32\")\n", + "new_x = x.at[0].set(10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensor operations: Doing math in JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "a = jnp.ones((2, 2))\n", + "b = jnp.square(a)\n", + "c = jnp.sqrt(a)\n", + "d = b + c\n", + "e = jnp.matmul(a, b)\n", + "e *= d" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def dense(inputs, W, b):\n", + " return jax.nn.relu(jnp.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Computing gradients with JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compute_loss(input_var):\n", + " return jnp.square(input_var)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "grad_fn = jax.grad(compute_loss)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_var = jnp.array(3.0)\n", + "grad_of_loss_wrt_input_var = grad_fn(input_var)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### JAX gradient-computation best practices" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Returning the loss value" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "grad_fn = jax.value_and_grad(compute_loss)\n", + "output, grad_of_loss_wrt_input_var = grad_fn(input_var)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Getting gradients for a complex function" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Returning auxiliary outputs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Making JAX functions fast with @jax.jit" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "@jax.jit\n", + "def dense(inputs, W, b):\n", + " return jax.nn.relu(jnp.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### An end-to-end example: A linear classifier in pure JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def model(inputs, W, b):\n", + " return jnp.matmul(inputs, W) + b\n", + "\n", + "def mean_squared_error(targets, predictions):\n", + " per_sample_losses = jnp.square(targets - predictions)\n", + " return jnp.mean(per_sample_losses)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compute_loss(state, inputs, targets):\n", + " W, b = state\n", + " predictions = model(inputs, W, b)\n", + " loss = mean_squared_error(targets, predictions)\n", + " return loss" + ] + }, + { + "cell_type": 
"code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "grad_fn = jax.value_and_grad(compute_loss)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 0.1\n", + "\n", + "@jax.jit\n", + "def training_step(inputs, targets, W, b):\n", + " loss, grads = grad_fn((W, b), inputs, targets)\n", + " grad_wrt_W, grad_wrt_b = grads\n", + " W = W - grad_wrt_W * learning_rate\n", + " b = b - grad_wrt_b * learning_rate\n", + " return loss, W, b" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_dim = 2\n", + "output_dim = 1\n", + "\n", + "W = jax.numpy.array(np.random.uniform(size=(input_dim, output_dim)))\n", + "b = jax.numpy.array(np.zeros(shape=(output_dim,)))\n", + "state = (W, b)\n", + "for step in range(40):\n", + " loss, W, b = training_step(inputs, targets, W, b)\n", + " print(f\"Loss at step {step}: {loss:.4f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### What makes the JAX approach unique" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Picking a backend framework" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"\n", + "\n", + "import keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Layers: The building blocks of deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The base `Layer` class in Keras" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "\n", + "class SimpleDense(keras.Layer):\n", + " def __init__(self, units, activation=None):\n", + " super().__init__()\n", + " self.units = units\n", + " self.activation = activation\n", + "\n", + " def build(self, input_shape):\n", + " batch_dim, input_dim = input_shape\n", + " self.W = self.add_weight(\n", + " shape=(input_dim, self.units), initializer=\"random_normal\"\n", + " )\n", + " self.b = self.add_weight(shape=(self.units,), initializer=\"zeros\")\n", + "\n", + " def call(self, inputs):\n", + " y = keras.ops.matmul(inputs, self.W) + self.b\n", + " if self.activation is not None:\n", + " y = self.activation(y)\n", + " return y" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_dense = SimpleDense(units=32, activation=keras.ops.relu)\n", + "input_tensor = keras.ops.ones(shape=(2, 784))\n", + "output_tensor = my_dense(input_tensor)\n", + "print(output_tensor.shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Automatic shape inference: Building layers on the fly" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + 
"source": [ + "from keras import layers\n", + "\n", + "layer = layers.Dense(32, activation=\"relu\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import models\n", + "from keras import layers\n", + "\n", + "model = models.Sequential(\n", + " [\n", + " layers.Dense(32, activation=\"relu\"),\n", + " layers.Dense(32),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " SimpleDense(32, activation=\"relu\"),\n", + " SimpleDense(64, activation=\"relu\"),\n", + " SimpleDense(32, activation=\"relu\"),\n", + " SimpleDense(10, activation=\"softmax\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### From layers to models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The \"compile\" step: Configuring the learning process" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([keras.layers.Dense(1)])\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"mean_squared_error\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=keras.optimizers.RMSprop(),\n", + " loss=keras.losses.MeanSquaredError(),\n", + " metrics=[keras.metrics.BinaryAccuracy()],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Picking a loss function" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding the fit method" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " inputs,\n", + " targets,\n", + " epochs=5,\n", + " batch_size=128,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history.history" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Monitoring loss and metrics on validation data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([keras.layers.Dense(1)])\n", + "model.compile(\n", + " optimizer=keras.optimizers.RMSprop(learning_rate=0.1),\n", + " loss=keras.losses.MeanSquaredError(),\n", + " metrics=[keras.metrics.BinaryAccuracy()],\n", + ")\n", + "\n", + "indices_permutation = np.random.permutation(len(inputs))\n", + "shuffled_inputs = inputs[indices_permutation]\n", + "shuffled_targets = targets[indices_permutation]\n", + "\n", + "num_validation_samples = int(0.3 * len(inputs))\n", + "val_inputs = shuffled_inputs[:num_validation_samples]\n", + "val_targets = shuffled_targets[:num_validation_samples]\n", + "training_inputs = shuffled_inputs[num_validation_samples:]\n", + "training_targets = shuffled_targets[num_validation_samples:]\n", + "model.fit(\n", + " training_inputs,\n", + " training_targets,\n", + " epochs=5,\n", + " batch_size=16,\n", + 
" validation_data=(val_inputs, val_targets),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Inference: Using a model after training" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(val_inputs, batch_size=128)\n", + "print(predictions[:10])" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter03_introduction-to-ml-frameworks", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter04_classification-and-regression.ipynb b/chapter04_classification-and-regression.ipynb new file mode 100644 index 0000000000..6e68704a45 --- /dev/null +++ b/chapter04_classification-and-regression.ipynb @@ -0,0 +1,1305 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Classification and regression" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Classifying movie reviews: A binary classification example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The IMDb dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import imdb\n", + "\n", + "(train_data, train_labels), (test_data, test_labels) = imdb.load_data(\n", + " num_words=10000\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "max([max(sequence) for sequence in train_data])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "word_index = imdb.get_word_index()\n", + "reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])\n", + "decoded_review = \" \".join(\n", + " [reverse_word_index.get(i - 3, \"?\") for i in train_data[0]]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "decoded_review[:100]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "def multi_hot_encode(sequences, num_classes):\n", + " results = np.zeros((len(sequences), num_classes))\n", + " for i, sequence in enumerate(sequences):\n", + " results[i][sequence] = 1.0\n", + " return results\n", + "\n", + "x_train = multi_hot_encode(train_data, num_classes=10000)\n", + "x_test = multi_hot_encode(test_data, num_classes=10000)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_train[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y_train = train_labels.astype(\"float32\")\n", + "y_test = test_labels.astype(\"float32\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building your model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, 
+ "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Validating your approach" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_val = x_train[:10000]\n", + "partial_x_train = x_train[10000:]\n", + "y_val = y_train[:10000]\n", + "partial_y_train = y_train[10000:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_data=(x_val, y_val),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " x_train,\n", + " y_train,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.2,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history_dict = history.history\n", + "history_dict.keys()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "history_dict = history.history\n", + "loss_values = history_dict[\"loss\"]\n", + "val_loss_values = history_dict[\"val_loss\"]\n", + "epochs = range(1, len(loss_values) + 1)\n", + "plt.plot(epochs, loss_values, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss_values, \"b\", label=\"Validation loss\")\n", + "plt.title(\"[IMDB] Training and validation loss\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.clf()\n", + "acc = history_dict[\"accuracy\"]\n", + "val_acc = history_dict[\"val_accuracy\"]\n", + "plt.plot(epochs, acc, \"r--\", label=\"Training acc\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation acc\")\n", + "plt.title(\"[IMDB] Training and validation accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(x_train, y_train, epochs=4, batch_size=512)\n", + "results = model.evaluate(x_test, y_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using a trained model to generate predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
"colab_type": "code" + }, + "outputs": [], + "source": [ + "model.predict(x_test)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Further experiments" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Wrapping up" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Classifying newswires: A multiclass classification example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Reuters dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import reuters\n", + "\n", + "(train_data, train_labels), (test_data, test_labels) = reuters.load_data(\n", + " num_words=10000\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(train_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(test_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data[10]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "word_index = reuters.get_word_index()\n", + "reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])\n", + "decoded_newswire = \" \".join(\n", + " [reverse_word_index.get(i - 3, \"?\") for i in train_data[10]]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[10]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_train = multi_hot_encode(train_data, num_classes=10000)\n", + "x_test = multi_hot_encode(test_data, num_classes=10000)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def one_hot_encode(labels, num_classes=46):\n", + " results = np.zeros((len(labels), num_classes))\n", + " for i, label in enumerate(labels):\n", + " results[i, label] = 1.0\n", + " return results\n", + "\n", + "y_train = one_hot_encode(train_labels)\n", + "y_test = one_hot_encode(test_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.utils import to_categorical\n", + "\n", + "y_train = to_categorical(train_labels)\n", + "y_test = to_categorical(test_labels)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building your model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(46, activation=\"softmax\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": 
{ + "colab_type": "code" + }, + "outputs": [], + "source": [ + "top_3_accuracy = keras.metrics.TopKCategoricalAccuracy(\n", + " k=3, name=\"top_3_accuracy\"\n", + ")\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\", top_3_accuracy],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Validating your approach" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_val = x_train[:1000]\n", + "partial_x_train = x_train[1000:]\n", + "y_val = y_train[:1000]\n", + "partial_y_train = y_train[1000:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_data=(x_val, y_val),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(loss) + 1)\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.clf()\n", + "acc = history.history[\"accuracy\"]\n", + "val_acc = history.history[\"val_accuracy\"]\n", + "plt.plot(epochs, acc, \"r--\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.clf()\n", + "acc = history.history[\"top_3_accuracy\"]\n", + "val_acc = history.history[\"val_top_3_accuracy\"]\n", + "plt.plot(epochs, acc, \"r--\", label=\"Training top-3 accuracy\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation top-3 accuracy\")\n", + "plt.title(\"Training and validation top-3 accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Top-3 accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(46, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " epochs=9,\n", + " batch_size=512,\n", + ")\n", + "results = model.evaluate(x_test, y_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "results" + ] + }, + { + "cell_type": "code", 
+ "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import copy\n", + "test_labels_copy = copy.copy(test_labels)\n", + "np.random.shuffle(test_labels_copy)\n", + "hits_array = np.array(test_labels == test_labels_copy)\n", + "hits_array.mean()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Generating predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(x_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.sum(predictions[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.argmax(predictions[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A different way to handle the labels and the loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y_train = train_labels\n", + "y_test = test_labels" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The importance of having sufficiently large intermediate layers" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(46, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=128,\n", + " validation_data=(x_val, y_val),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Further experiments" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Wrapping up" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Predicting house prices: A regression example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The California Housing Price dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import california_housing\n", + "\n", + "(train_data, train_targets), (test_data, test_targets) = (\n", + " california_housing.load_data(version=\"small\")\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data.shape" + ] + }, + { 
+ "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_data.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_targets" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "mean = train_data.mean(axis=0)\n", + "std = train_data.std(axis=0)\n", + "x_train = (train_data - mean) / std\n", + "x_test = (test_data - mean) / std" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y_train = train_targets / 100000\n", + "y_test = test_targets / 100000" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building your model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_model():\n", + " model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(1),\n", + " ]\n", + " )\n", + " model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"mean_squared_error\",\n", + " metrics=[\"mean_absolute_error\"],\n", + " )\n", + " return model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Validating your approach using K-fold validation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "k = 4\n", + "num_val_samples = len(x_train) // k\n", + "num_epochs = 50\n", + "all_scores = []\n", + "for i in range(k):\n", + " print(f\"Processing fold #{i + 1}\")\n", + " fold_x_val = x_train[i * num_val_samples : (i + 1) * num_val_samples]\n", + " fold_y_val = y_train[i * num_val_samples : (i + 1) * num_val_samples]\n", + " fold_x_train = np.concatenate(\n", + " [x_train[: i * num_val_samples], x_train[(i + 1) * num_val_samples :]],\n", + " axis=0,\n", + " )\n", + " fold_y_train = np.concatenate(\n", + " [y_train[: i * num_val_samples], y_train[(i + 1) * num_val_samples :]],\n", + " axis=0,\n", + " )\n", + " model = get_model()\n", + " model.fit(\n", + " fold_x_train,\n", + " fold_y_train,\n", + " epochs=num_epochs,\n", + " batch_size=16,\n", + " verbose=0,\n", + " )\n", + " scores = model.evaluate(fold_x_val, fold_y_val, verbose=0)\n", + " val_loss, val_mae = scores\n", + " all_scores.append(val_mae)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "[round(value, 3) for value in all_scores]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "round(np.mean(all_scores), 3)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "k = 4\n", + "num_val_samples = len(x_train) // k\n", + "num_epochs = 200\n", + "all_mae_histories = []\n", + "for i in range(k):\n", + " print(f\"Processing fold #{i + 1}\")\n", + " fold_x_val = x_train[i * num_val_samples : (i + 1) * num_val_samples]\n", + " fold_y_val = y_train[i * num_val_samples : 
(i + 1) * num_val_samples]\n", + " fold_x_train = np.concatenate(\n", + " [x_train[: i * num_val_samples], x_train[(i + 1) * num_val_samples :]],\n", + " axis=0,\n", + " )\n", + " fold_y_train = np.concatenate(\n", + " [y_train[: i * num_val_samples], y_train[(i + 1) * num_val_samples :]],\n", + " axis=0,\n", + " )\n", + " model = get_model()\n", + " history = model.fit(\n", + " fold_x_train,\n", + " fold_y_train,\n", + " validation_data=(fold_x_val, fold_y_val),\n", + " epochs=num_epochs,\n", + " batch_size=16,\n", + " verbose=0,\n", + " )\n", + " mae_history = history.history[\"val_mean_absolute_error\"]\n", + " all_mae_histories.append(mae_history)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "average_mae_history = [\n", + " np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "epochs = range(1, len(average_mae_history) + 1)\n", + "plt.plot(epochs, average_mae_history)\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Validation MAE\")\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "truncated_mae_history = average_mae_history[10:]\n", + "epochs = range(10, len(truncated_mae_history) + 10)\n", + "plt.plot(epochs, truncated_mae_history)\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Validation MAE\")\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_model()\n", + "model.fit(x_train, y_train, epochs=130, batch_size=16, verbose=0)\n", + "test_mean_squared_error, test_mean_absolute_error = model.evaluate(\n", + " x_test, y_test\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "round(test_mean_absolute_error, 3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Generating predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(x_test)\n", + "predictions[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Wrapping up" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter04_classification-and-regression", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter05_fundamentals-of-ml.ipynb b/chapter05_fundamentals-of-ml.ipynb index dd61f4ead8..2aadcc9b85 100644 --- a/chapter05_fundamentals-of-ml.ipynb +++ b/chapter05_fundamentals-of-ml.ipynb @@ -6,16 +6,55 @@ "colab_type": "text" }, "source": [ - "This is a companion notebook for the book [Deep Learning with 
Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" }, + "outputs": [], "source": [ - "# Fundamentals of machine learning" + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" ] }, { @@ -24,7 +63,7 @@ "colab_type": "text" }, "source": [ - "## Generalization: The goal of machine learning" + "## Fundamentals of machine learning" ] }, { @@ -33,7 +72,7 @@ "colab_type": "text" }, "source": [ - "### Underfitting and overfitting" + "### Generalization: The goal of machine learning" ] }, { @@ -42,7 +81,7 @@ "colab_type": "text" }, "source": [ - "#### Noisy training data" + "#### Underfitting and overfitting" ] }, { @@ -51,7 +90,7 @@ "colab_type": "text" }, "source": [ - "#### Ambiguous features" + "##### Noisy training data" ] }, { @@ -60,7 +99,7 @@ "colab_type": "text" }, "source": [ - "#### Rare features and spurious correlations" + "##### Ambiguous features" ] }, { @@ -69,7 +108,7 @@ "colab_type": "text" }, "source": [ - "**Adding white-noise channels or all-zeros channels to MNIST**" + "##### Rare features and spurious correlations" ] }, { @@ -80,7 +119,7 @@ }, "outputs": [], "source": [ - "from tensorflow.keras.datasets import mnist\n", + "from keras.datasets import mnist\n", "import numpy as np\n", "\n", "(train_images, train_labels), _ = mnist.load_data()\n", @@ -88,19 +127,12 @@ "train_images = train_images.astype(\"float32\") / 255\n", "\n", "train_images_with_noise_channels = np.concatenate(\n", - " [train_images, np.random.random((len(train_images), 784))], axis=1)\n", + " [train_images, np.random.random((len(train_images), 784))], axis=1\n", + ")\n", "\n", "train_images_with_zeros_channels = np.concatenate(\n", - " [train_images, np.zeros((len(train_images), 784))], axis=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Training the same model on MNIST data with noise channels or all-zero channels**" + " [train_images, np.zeros((len(train_images), 784))], axis=1\n", + ")" ] }, { @@ -111,41 +143,40 @@ }, "outputs": [], "source": [ - "from tensorflow import keras\n", - "from tensorflow.keras import layers\n", + "import keras\n", + "from keras import layers\n", "\n", "def get_model():\n", - " model = keras.Sequential([\n", - " layers.Dense(512, activation=\"relu\"),\n", - " layers.Dense(10, activation=\"softmax\")\n", - " ])\n", - " model.compile(optimizer=\"rmsprop\",\n", - " loss=\"sparse_categorical_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", + " model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + " )\n", + " model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + " )\n", " return model\n", "\n", "model = get_model()\n", "history_noise = model.fit(\n", - " train_images_with_noise_channels, train_labels,\n", + " train_images_with_noise_channels,\n", + " train_labels,\n", " epochs=10,\n", " batch_size=128,\n", - " validation_split=0.2)\n", + " validation_split=0.2,\n", + ")\n", "\n", "model = get_model()\n", "history_zeros = model.fit(\n", - " train_images_with_zeros_channels, train_labels,\n", + " train_images_with_zeros_channels,\n", + " train_labels,\n", " epochs=10,\n", " batch_size=128,\n", - " validation_split=0.2)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Plotting a validation accuracy comparison**" + " validation_split=0.2,\n", + ")" ] }, { @@ -157,17 +188,28 @@ "outputs": [], "source": [ "import 
matplotlib.pyplot as plt\n", + "\n", "val_acc_noise = history_noise.history[\"val_accuracy\"]\n", "val_acc_zeros = history_zeros.history[\"val_accuracy\"]\n", "epochs = range(1, 11)\n", - "plt.plot(epochs, val_acc_noise, \"b-\",\n", - " label=\"Validation accuracy with noise channels\")\n", - "plt.plot(epochs, val_acc_zeros, \"b--\",\n", - " label=\"Validation accuracy with zeros channels\")\n", + "plt.plot(\n", + " epochs,\n", + " val_acc_noise,\n", + " \"b-\",\n", + " label=\"Validation accuracy with noise channels\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " val_acc_zeros,\n", + " \"r--\",\n", + " label=\"Validation accuracy with zeros channels\",\n", + ")\n", "plt.title(\"Effect of noise channels on validation accuracy\")\n", "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", "plt.ylabel(\"Accuracy\")\n", - "plt.legend()" + "plt.legend()\n", + "plt.show()" ] }, { @@ -176,16 +218,7 @@ "colab_type": "text" }, "source": [ - "### The nature of generalization in deep learning" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Fitting a MNIST model with randomly shuffled labels**" + "#### The nature of generalization in deep learning" ] }, { @@ -203,26 +236,24 @@ "random_train_labels = train_labels[:]\n", "np.random.shuffle(random_train_labels)\n", "\n", - "model = keras.Sequential([\n", - " layers.Dense(512, activation=\"relu\"),\n", - " layers.Dense(10, activation=\"softmax\")\n", - "])\n", - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"sparse_categorical_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", - "model.fit(train_images, random_train_labels,\n", - " epochs=100,\n", - " batch_size=128,\n", - " validation_split=0.2)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "#### The manifold hypothesis" + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " random_train_labels,\n", + " epochs=100,\n", + " batch_size=128,\n", + " validation_split=0.2,\n", + ")" ] }, { @@ -231,7 +262,7 @@ "colab_type": "text" }, "source": [ - "#### Interpolation as a source of generalization" + "##### The manifold hypothesis" ] }, { @@ -240,7 +271,7 @@ "colab_type": "text" }, "source": [ - "#### Why deep learning works" + "##### Interpolation as a source of generalization" ] }, { @@ -249,7 +280,7 @@ "colab_type": "text" }, "source": [ - "#### Training data is paramount" + "##### Why deep learning works" ] }, { @@ -258,7 +289,7 @@ "colab_type": "text" }, "source": [ - "## Evaluating machine-learning models" + "##### Training data is paramount" ] }, { @@ -267,7 +298,7 @@ "colab_type": "text" }, "source": [ - "### Training, validation, and test sets" + "### Evaluating machine-learning models" ] }, { @@ -276,7 +307,7 @@ "colab_type": "text" }, "source": [ - "#### Simple hold-out validation" + "#### Training, validation, and test sets" ] }, { @@ -285,7 +316,7 @@ "colab_type": "text" }, "source": [ - "#### K-fold validation" + "##### Simple hold-out validation" ] }, { @@ -294,7 +325,7 @@ "colab_type": "text" }, "source": [ - "#### Iterated K-fold validation with shuffling" + "##### K-fold validation" ] }, { @@ -303,7 +334,7 @@ "colab_type": "text" }, "source": [ - "### Beating a common-sense baseline" + 
"##### Iterated K-fold validation with shuffling" ] }, { @@ -312,7 +343,7 @@ "colab_type": "text" }, "source": [ - "### Things to keep in mind about model evaluation" + "#### Beating a common-sense baseline" ] }, { @@ -321,7 +352,7 @@ "colab_type": "text" }, "source": [ - "## Improving model fit" + "#### Things to keep in mind about model evaluation" ] }, { @@ -330,7 +361,7 @@ "colab_type": "text" }, "source": [ - "### Tuning key gradient descent parameters" + "### Improving model fit" ] }, { @@ -339,7 +370,7 @@ "colab_type": "text" }, "source": [ - "**Training a MNIST model with an incorrectly high learning rate**" + "#### Tuning key gradient descent parameters" ] }, { @@ -354,26 +385,20 @@ "train_images = train_images.reshape((60000, 28 * 28))\n", "train_images = train_images.astype(\"float32\") / 255\n", "\n", - "model = keras.Sequential([\n", - " layers.Dense(512, activation=\"relu\"),\n", - " layers.Dense(10, activation=\"softmax\")\n", - "])\n", - "model.compile(optimizer=keras.optimizers.RMSprop(1.),\n", - " loss=\"sparse_categorical_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", - "model.fit(train_images, train_labels,\n", - " epochs=10,\n", - " batch_size=128,\n", - " validation_split=0.2)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**The same model with a more appropriate learning rate**" + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=keras.optimizers.RMSprop(learning_rate=1.0),\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images, train_labels, epochs=10, batch_size=128, validation_split=0.2\n", + ")" ] }, { @@ -384,17 +409,20 @@ }, "outputs": [], "source": [ - "model = keras.Sequential([\n", - " layers.Dense(512, activation=\"relu\"),\n", - " layers.Dense(10, activation=\"softmax\")\n", - "])\n", - "model.compile(optimizer=keras.optimizers.RMSprop(1e-2),\n", - " loss=\"sparse_categorical_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", - "model.fit(train_images, train_labels,\n", - " epochs=10,\n", - " batch_size=128,\n", - " validation_split=0.2)" + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=keras.optimizers.RMSprop(learning_rate=1e-2),\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images, train_labels, epochs=10, batch_size=128, validation_split=0.2\n", + ")" ] }, { @@ -403,7 +431,7 @@ "colab_type": "text" }, "source": [ - "### Leveraging better architecture priors" + "#### Using better architecture priors" ] }, { @@ -412,16 +440,26 @@ "colab_type": "text" }, "source": [ - "### Increasing model capacity" + "#### Increasing model capacity" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" }, + "outputs": [], "source": [ - "**A simple logistic regression on MNIST**" + "model = keras.Sequential([layers.Dense(10, activation=\"softmax\")])\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_small_model = model.fit(\n", + " train_images, train_labels, epochs=20, batch_size=128, 
validation_split=0.2\n", + ")" ] }, { @@ -432,15 +470,45 @@ }, "outputs": [], "source": [ - "model = keras.Sequential([layers.Dense(10, activation=\"softmax\")])\n", - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"sparse_categorical_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", - "history_small_model = model.fit(\n", - " train_images, train_labels,\n", + "import matplotlib.pyplot as plt\n", + "\n", + "val_loss = history_small_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n", + "plt.title(\"Validation loss for a model with insufficient capacity\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(128, activation=\"relu\"),\n", + " layers.Dense(128, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_large_model = model.fit(\n", + " train_images,\n", + " train_labels,\n", " epochs=20,\n", " batch_size=128,\n", - " validation_split=0.2)" + " validation_split=0.2,\n", + ")" ] }, { @@ -451,15 +519,14 @@ }, "outputs": [], "source": [ - "import matplotlib.pyplot as plt\n", - "val_loss = history_small_model.history[\"val_loss\"]\n", + "val_loss = history_large_model.history[\"val_loss\"]\n", "epochs = range(1, 21)\n", - "plt.plot(epochs, val_loss, \"b--\",\n", - " label=\"Validation loss\")\n", - "plt.title(\"Effect of insufficient model capacity on validation loss\")\n", + "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n", + "plt.title(\"Validation loss for a model with appropriate capacity\")\n", "plt.xlabel(\"Epochs\")\n", "plt.ylabel(\"Loss\")\n", - "plt.legend()" + "plt.legend()\n", + "plt.show()" ] }, { @@ -470,28 +537,44 @@ }, "outputs": [], "source": [ - "model = keras.Sequential([\n", - " layers.Dense(96, activation=\"relu\"),\n", - " layers.Dense(96, activation=\"relu\"),\n", - " layers.Dense(10, activation=\"softmax\"),\n", - "])\n", - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"sparse_categorical_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", - "history_large_model = model.fit(\n", - " train_images, train_labels,\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(2048, activation=\"relu\"),\n", + " layers.Dense(2048, activation=\"relu\"),\n", + " layers.Dense(2048, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_very_large_model = model.fit(\n", + " train_images,\n", + " train_labels,\n", " epochs=20,\n", - " batch_size=128,\n", - " validation_split=0.2)" + " batch_size=32,\n", + " validation_split=0.2,\n", + ")" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" }, + "outputs": [], "source": [ - "## Improving generalization" + "val_loss = history_very_large_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n", + "plt.title(\"Validation loss for a model with 
too much capacity\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" ] }, { @@ -500,7 +583,7 @@ "colab_type": "text" }, "source": [ - "### Dataset curation" + "### Improving generalization" ] }, { @@ -509,7 +592,7 @@ "colab_type": "text" }, "source": [ - "### Feature engineering" + "#### Dataset curation" ] }, { @@ -518,7 +601,7 @@ "colab_type": "text" }, "source": [ - "### Using early stopping" + "#### Feature engineering" ] }, { @@ -527,7 +610,7 @@ "colab_type": "text" }, "source": [ - "### Regularizing your model" + "#### Using early stopping" ] }, { @@ -536,7 +619,7 @@ "colab_type": "text" }, "source": [ - "#### Reducing the network's size" + "#### Regularizing your model" ] }, { @@ -545,7 +628,7 @@ "colab_type": "text" }, "source": [ - "**Original model**" + "##### Reducing the network's size" ] }, { @@ -556,35 +639,37 @@ }, "outputs": [], "source": [ - "from tensorflow.keras.datasets import imdb\n", + "from keras.datasets import imdb\n", + "\n", "(train_data, train_labels), _ = imdb.load_data(num_words=10000)\n", "\n", "def vectorize_sequences(sequences, dimension=10000):\n", " results = np.zeros((len(sequences), dimension))\n", " for i, sequence in enumerate(sequences):\n", - " results[i, sequence] = 1.\n", + " results[i, sequence] = 1.0\n", " return results\n", + "\n", "train_data = vectorize_sequences(train_data)\n", "\n", - "model = keras.Sequential([\n", - " layers.Dense(16, activation=\"relu\"),\n", - " layers.Dense(16, activation=\"relu\"),\n", - " layers.Dense(1, activation=\"sigmoid\")\n", - "])\n", - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"binary_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", - "history_original = model.fit(train_data, train_labels,\n", - " epochs=20, batch_size=512, validation_split=0.4)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Version of the model with lower capacity**" + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_original = model.fit(\n", + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" ] }, { @@ -595,26 +680,56 @@ }, "outputs": [], "source": [ - "model = keras.Sequential([\n", - " layers.Dense(4, activation=\"relu\"),\n", - " layers.Dense(4, activation=\"relu\"),\n", - " layers.Dense(1, activation=\"sigmoid\")\n", - "])\n", - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"binary_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", "history_smaller_model = model.fit(\n", - " train_data, train_labels,\n", - " epochs=20, batch_size=512, validation_split=0.4)" + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" }, + 
"outputs": [], "source": [ - "**Version of the model with higher capacity**" + "original_val_loss = history_original.history[\"val_loss\"]\n", + "smaller_model_val_loss = history_smaller_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(\n", + " epochs,\n", + " original_val_loss,\n", + " \"r--\",\n", + " label=\"Validation loss of original model\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " smaller_model_val_loss,\n", + " \"b-\",\n", + " label=\"Validation loss of smaller model\",\n", + ")\n", + "plt.title(\"Original model vs. smaller model (IMDB review classification)\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.xticks(epochs)\n", + "plt.legend()\n", + "plt.show()" ] }, { @@ -625,26 +740,56 @@ }, "outputs": [], "source": [ - "model = keras.Sequential([\n", - " layers.Dense(512, activation=\"relu\"),\n", - " layers.Dense(512, activation=\"relu\"),\n", - " layers.Dense(1, activation=\"sigmoid\")\n", - "])\n", - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"binary_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", "history_larger_model = model.fit(\n", - " train_data, train_labels,\n", - " epochs=20, batch_size=512, validation_split=0.4)" + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" }, + "outputs": [], "source": [ - "#### Adding weight regularization" + "original_val_loss = history_original.history[\"val_loss\"]\n", + "larger_model_val_loss = history_larger_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(\n", + " epochs,\n", + " original_val_loss,\n", + " \"r--\",\n", + " label=\"Validation loss of original model\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " larger_model_val_loss,\n", + " \"b-\",\n", + " label=\"Validation loss of larger model\",\n", + ")\n", + "plt.title(\"Original model vs. 
larger model (IMDB review classification)\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.xticks(epochs)\n", + "plt.legend()\n", + "plt.show()" ] }, { @@ -653,7 +798,7 @@ "colab_type": "text" }, "source": [ - "**Adding L2 weight regularization to the model**" + "##### Adding weight regularization" ] }, { @@ -664,31 +809,60 @@ }, "outputs": [], "source": [ - "from tensorflow.keras import regularizers\n", - "model = keras.Sequential([\n", - " layers.Dense(16,\n", - " kernel_regularizer=regularizers.l2(0.002),\n", - " activation=\"relu\"),\n", - " layers.Dense(16,\n", - " kernel_regularizer=regularizers.l2(0.002),\n", - " activation=\"relu\"),\n", - " layers.Dense(1, activation=\"sigmoid\")\n", - "])\n", - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"binary_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", + "from keras.regularizers import l2\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, kernel_regularizer=l2(0.002), activation=\"relu\"),\n", + " layers.Dense(16, kernel_regularizer=l2(0.002), activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", "history_l2_reg = model.fit(\n", - " train_data, train_labels,\n", - " epochs=20, batch_size=512, validation_split=0.4)" + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" }, + "outputs": [], "source": [ - "**Different weight regularizers available in Keras**" + "original_val_loss = history_original.history[\"val_loss\"]\n", + "l2_val_loss = history_l2_reg.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(\n", + " epochs,\n", + " original_val_loss,\n", + " \"r--\",\n", + " label=\"Validation loss of original model\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " l2_val_loss,\n", + " \"b-\",\n", + " label=\"Validation loss of L2-regularized model\",\n", + ")\n", + "plt.title(\n", + " \"Original model vs. 
L2-regularized model (IMDB review classification)\"\n", + ")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.xticks(epochs)\n", + "plt.legend()\n", + "plt.show()" ] }, { @@ -699,7 +873,8 @@ }, "outputs": [], "source": [ - "from tensorflow.keras import regularizers\n", + "from keras import regularizers\n", + "\n", "regularizers.l1(0.001)\n", "regularizers.l1_l2(l1=0.001, l2=0.001)" ] @@ -710,16 +885,7 @@ "colab_type": "text" }, "source": [ - "#### Adding dropout" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text" - }, - "source": [ - "**Adding dropout to the IMDB model**" + "##### Adding dropout" ] }, { @@ -730,35 +896,68 @@ }, "outputs": [], "source": [ - "model = keras.Sequential([\n", - " layers.Dense(16, activation=\"relu\"),\n", - " layers.Dropout(0.5),\n", - " layers.Dense(16, activation=\"relu\"),\n", - " layers.Dropout(0.5),\n", - " layers.Dense(1, activation=\"sigmoid\")\n", - "])\n", - "model.compile(optimizer=\"rmsprop\",\n", - " loss=\"binary_crossentropy\",\n", - " metrics=[\"accuracy\"])\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dropout(0.5),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dropout(0.5),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", "history_dropout = model.fit(\n", - " train_data, train_labels,\n", - " epochs=20, batch_size=512, validation_split=0.4)" + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 0, "metadata": { - "colab_type": "text" + "colab_type": "code" }, + "outputs": [], "source": [ - "## Summary" + "original_val_loss = history_original.history[\"val_loss\"]\n", + "dropout_val_loss = history_dropout.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(\n", + " epochs,\n", + " original_val_loss,\n", + " \"r--\",\n", + " label=\"Validation loss of original model\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " dropout_val_loss,\n", + " \"b-\",\n", + " label=\"Validation loss of dropout-regularized model\",\n", + ")\n", + "plt.title(\n", + " \"Original model vs. dropout-regularized model (IMDB review classification)\"\n", + ")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.xticks(epochs)\n", + "plt.legend()\n", + "plt.show()" ] } ], "metadata": { + "accelerator": "GPU", "colab": { "collapsed_sections": [], - "name": "chapter05_fundamentals-of-ml.i", + "name": "chapter05_fundamentals-of-ml", "private_outputs": false, "provenance": [], "toc_visible": true @@ -778,7 +977,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.0" + "version": "3.10.0" } }, "nbformat": 4, diff --git a/chapter07_deep-dive-keras.ipynb b/chapter07_deep-dive-keras.ipynb new file mode 100644 index 0000000000..be5963473c --- /dev/null +++ b/chapter07_deep-dive-keras.ipynb @@ -0,0 +1,1834 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). 
For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## A deep dive on Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A spectrum of workflows" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Different ways to build Keras models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Sequential model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential()\n", + "model.add(layers.Dense(64, activation=\"relu\"))\n", + "model.add(layers.Dense(10, activation=\"softmax\"))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.weights" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.build(input_shape=(None, 3))\n", + "model.weights" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(name=\"my_example_model\")\n", + "model.add(layers.Dense(64, activation=\"relu\", name=\"my_first_layer\"))\n", + "model.add(layers.Dense(10, activation=\"softmax\", name=\"my_last_layer\"))\n", + "model.build((None, 3))\n", + 
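"# Note (added, not in the book): build() creates the weights; summary()\n",
+    "# can now report each named layer's output shape and parameter count.\n",
+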
"model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential()\n", + "model.add(keras.Input(shape=(3,)))\n", + "model.add(layers.Dense(64, activation=\"relu\"))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.add(layers.Dense(10, activation=\"softmax\"))\n", + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Functional API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A simple example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(3,), name=\"my_input\")\n", + "features = layers.Dense(64, activation=\"relu\")(inputs)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs, name=\"my_functional_model\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(3,), name=\"my_input\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs.dtype" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features = layers.Dense(64, activation=\"relu\")(inputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs, name=\"my_functional_model\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Multi-input, multi-output models" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocabulary_size = 10000\n", + "num_tags = 100\n", + "num_departments = 4\n", + "\n", + "title = keras.Input(shape=(vocabulary_size,), name=\"title\")\n", + "text_body = keras.Input(shape=(vocabulary_size,), name=\"text_body\")\n", + "tags = keras.Input(shape=(num_tags,), name=\"tags\")\n", + "\n", + "features = layers.Concatenate()([title, text_body, tags])\n", + "features = layers.Dense(64, activation=\"relu\", name=\"dense_features\")(features)\n", + "\n", + "priority = layers.Dense(1, activation=\"sigmoid\", name=\"priority\")(features)\n", + "department = layers.Dense(\n", + " num_departments, 
activation=\"softmax\", name=\"department\"\n", + ")(features)\n", + "\n", + "model = keras.Model(\n", + " inputs=[title, text_body, tags],\n", + " outputs=[priority, department],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Training a multi-input, multi-output model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "num_samples = 1280\n", + "\n", + "title_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n", + "text_body_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n", + "tags_data = np.random.randint(0, 2, size=(num_samples, num_tags))\n", + "\n", + "priority_data = np.random.random(size=(num_samples, 1))\n", + "department_data = np.random.randint(0, num_departments, size=(num_samples, 1))\n", + "\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=[\"mean_squared_error\", \"sparse_categorical_crossentropy\"],\n", + " metrics=[[\"mean_absolute_error\"], [\"accuracy\"]],\n", + ")\n", + "model.fit(\n", + " [title_data, text_body_data, tags_data],\n", + " [priority_data, department_data],\n", + " epochs=1,\n", + ")\n", + "model.evaluate(\n", + " [title_data, text_body_data, tags_data], [priority_data, department_data]\n", + ")\n", + "priority_preds, department_preds = model.predict(\n", + " [title_data, text_body_data, tags_data]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss={\n", + " \"priority\": \"mean_squared_error\",\n", + " \"department\": \"sparse_categorical_crossentropy\",\n", + " },\n", + " metrics={\n", + " \"priority\": [\"mean_absolute_error\"],\n", + " \"department\": [\"accuracy\"],\n", + " },\n", + ")\n", + "model.fit(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " {\"priority\": priority_data, \"department\": department_data},\n", + " epochs=1,\n", + ")\n", + "model.evaluate(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " {\"priority\": priority_data, \"department\": department_data},\n", + ")\n", + "priority_preds, department_preds = model.predict(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The power of the Functional API: Access to layer connectivity" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Plotting layer connectivity" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(model, \"ticket_classifier.png\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(\n", + " model,\n", + " \"ticket_classifier_with_shape_info.png\",\n", + " show_shapes=True,\n", + " show_layer_names=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Feature extraction with a Functional model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": 
[], + "source": [ + "model.layers" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.layers[3].input" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.layers[3].output" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features = model.layers[4].output\n", + "difficulty = layers.Dense(3, activation=\"softmax\", name=\"difficulty\")(features)\n", + "\n", + "new_model = keras.Model(\n", + " inputs=[title, text_body, tags], outputs=[priority, department, difficulty]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(\n", + " new_model,\n", + " \"updated_ticket_classifier.png\",\n", + " show_shapes=True,\n", + " show_layer_names=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Subclassing the Model class" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Rewriting our previous example as a subclassed model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class CustomerTicketModel(keras.Model):\n", + " def __init__(self, num_departments):\n", + " super().__init__()\n", + " self.concat_layer = layers.Concatenate()\n", + " self.mixing_layer = layers.Dense(64, activation=\"relu\")\n", + " self.priority_scorer = layers.Dense(1, activation=\"sigmoid\")\n", + " self.department_classifier = layers.Dense(\n", + " num_departments, activation=\"softmax\"\n", + " )\n", + "\n", + " def call(self, inputs):\n", + " title = inputs[\"title\"]\n", + " text_body = inputs[\"text_body\"]\n", + " tags = inputs[\"tags\"]\n", + "\n", + " features = self.concat_layer([title, text_body, tags])\n", + " features = self.mixing_layer(features)\n", + " priority = self.priority_scorer(features)\n", + " department = self.department_classifier(features)\n", + " return priority, department" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = CustomerTicketModel(num_departments=4)\n", + "\n", + "priority, department = model(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=[\"mean_squared_error\", \"sparse_categorical_crossentropy\"],\n", + " metrics=[[\"mean_absolute_error\"], [\"accuracy\"]],\n", + ")\n", + "model.fit(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " [priority_data, department_data],\n", + " epochs=1,\n", + ")\n", + "model.evaluate(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " [priority_data, department_data],\n", + ")\n", + "priority_preds, department_preds = model.predict(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Beware: What 
subclassed models don't support" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Mixing and matching different components" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class Classifier(keras.Model):\n", + " def __init__(self, num_classes=2):\n", + " super().__init__()\n", + " if num_classes == 2:\n", + " num_units = 1\n", + " activation = \"sigmoid\"\n", + " else:\n", + " num_units = num_classes\n", + " activation = \"softmax\"\n", + " self.dense = layers.Dense(num_units, activation=activation)\n", + "\n", + " def call(self, inputs):\n", + " return self.dense(inputs)\n", + "\n", + "inputs = keras.Input(shape=(3,))\n", + "features = layers.Dense(64, activation=\"relu\")(inputs)\n", + "outputs = Classifier(num_classes=10)(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(64,))\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(inputs)\n", + "binary_classifier = keras.Model(inputs=inputs, outputs=outputs)\n", + "\n", + "class MyModel(keras.Model):\n", + " def __init__(self, num_classes=2):\n", + " super().__init__()\n", + " self.dense = layers.Dense(64, activation=\"relu\")\n", + " self.classifier = binary_classifier\n", + "\n", + " def call(self, inputs):\n", + " features = self.dense(inputs)\n", + " return self.classifier(features)\n", + "\n", + "model = MyModel()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Remember: Use the right tool for the job" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using built-in training and evaluation loops" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import mnist\n", + "\n", + "def get_mnist_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = keras.Model(inputs, outputs)\n", + " return model\n", + "\n", + "(images, labels), (test_images, test_labels) = mnist.load_data()\n", + "images = images.reshape((60000, 28 * 28)).astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28)).astype(\"float32\") / 255\n", + "train_images, val_images = images[10000:], images[:10000]\n", + "train_labels, val_labels = labels[10000:], labels[:10000]\n", + "\n", + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=3,\n", + " validation_data=(val_images, val_labels),\n", + ")\n", + "test_metrics = model.evaluate(test_images, test_labels)\n", + "predictions = model.predict(test_images)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Writing your own metrics" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "class 
RootMeanSquaredError(keras.metrics.Metric):\n", + " def __init__(self, name=\"rmse\", **kwargs):\n", + " super().__init__(name=name, **kwargs)\n", + " self.mse_sum = self.add_weight(name=\"mse_sum\", initializer=\"zeros\")\n", + " self.total_samples = self.add_weight(\n", + " name=\"total_samples\", initializer=\"zeros\"\n", + " )\n", + "\n", + " def update_state(self, y_true, y_pred, sample_weight=None):\n", + " y_true = ops.one_hot(y_true, num_classes=ops.shape(y_pred)[1])\n", + " mse = ops.sum(ops.square(y_true - y_pred))\n", + " self.mse_sum.assign_add(mse)\n", + " num_samples = ops.shape(y_pred)[0]\n", + " self.total_samples.assign_add(num_samples)\n", + "\n", + " def result(self):\n", + " return ops.sqrt(self.mse_sum / self.total_samples)\n", + "\n", + " def reset_state(self):\n", + " self.mse_sum.assign(0.)\n", + " self.total_samples.assign(0.)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\", RootMeanSquaredError()],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=3,\n", + " validation_data=(val_images, val_labels),\n", + ")\n", + "test_metrics = model.evaluate(test_images, test_labels)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using callbacks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The EarlyStopping and ModelCheckpoint callbacks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks_list = [\n", + " keras.callbacks.EarlyStopping(\n", + " monitor=\"accuracy\",\n", + " patience=1,\n", + " ),\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"checkpoint_path.keras\",\n", + " monitor=\"val_loss\",\n", + " save_best_only=True,\n", + " ),\n", + "]\n", + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=10,\n", + " callbacks=callbacks_list,\n", + " validation_data=(val_images, val_labels),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.models.load_model(\"checkpoint_path.keras\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Writing your own callbacks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from matplotlib import pyplot as plt\n", + "\n", + "class LossHistory(keras.callbacks.Callback):\n", + " def on_train_begin(self, logs):\n", + " self.per_batch_losses = []\n", + "\n", + " def on_batch_end(self, batch, logs):\n", + " self.per_batch_losses.append(logs.get(\"loss\"))\n", + "\n", + " def on_epoch_end(self, epoch, logs):\n", + " plt.clf()\n", + " plt.plot(\n", + " range(len(self.per_batch_losses)),\n", + " self.per_batch_losses,\n", + " label=\"Training loss for each batch\",\n", + " )\n", + " plt.xlabel(f\"Batch (epoch {epoch})\")\n", + " plt.ylabel(\"Loss\")\n", + " plt.legend()\n", + " 
plt.savefig(f\"plot_at_epoch_{epoch}\", dpi=300)\n", + " self.per_batch_losses = []" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=10,\n", + " callbacks=[LossHistory()],\n", + " validation_data=(val_images, val_labels),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Monitoring and visualization with TensorBoard" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "\n", + "tensorboard = keras.callbacks.TensorBoard(\n", + " log_dir=\"/full_path_to_your_log_dir\",\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=10,\n", + " validation_data=(val_images, val_labels),\n", + " callbacks=[tensorboard],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%load_ext tensorboard\n", + "%tensorboard --logdir /full_path_to_your_log_dir" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Writing your own training and evaluation loops" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Training vs. 
inference" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Writing custom training step functions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A TensorFlow training step function" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import tensorflow as tf\n", + "\n", + "model = get_mnist_model()\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "optimizer = keras.optimizers.Adam()\n", + "\n", + "def train_step(inputs, targets):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + " gradients = tape.gradient(loss, model.trainable_weights)\n", + " optimizer.apply(gradients, model.trainable_weights)\n", + " return loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "batch_size = 32\n", + "inputs = train_images[:batch_size]\n", + "targets = train_labels[:batch_size]\n", + "loss = train_step(inputs, targets)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A PyTorch training step function" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "import torch\n", + "\n", + "model = get_mnist_model()\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "optimizer = keras.optimizers.Adam()\n", + "\n", + "def train_step(inputs, targets):\n", + " predictions = model(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + " loss.backward()\n", + " gradients = [weight.value.grad for weight in model.trainable_weights]\n", + " with torch.no_grad():\n", + " optimizer.apply(gradients, model.trainable_weights)\n", + " model.zero_grad()\n", + " return loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "batch_size = 32\n", + "inputs = train_images[:batch_size]\n", + "targets = train_labels[:batch_size]\n", + "loss = train_step(inputs, targets)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A JAX training step function" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "model = get_mnist_model()\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "\n", + "def compute_loss_and_updates(\n", + " trainable_variables, non_trainable_variables, inputs, targets\n", + "):\n", + " outputs, non_trainable_variables = model.stateless_call(\n", + " trainable_variables, non_trainable_variables, inputs, training=True\n", + " )\n", + " loss = loss_fn(targets, outputs)\n", + " return loss, non_trainable_variables" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import jax\n", + "\n", + "grad_fn = jax.value_and_grad(compute_loss_and_updates, has_aux=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + 
"source": [ + "%%backend jax\n", + "optimizer = keras.optimizers.Adam()\n", + "optimizer.build(model.trainable_variables)\n", + "\n", + "def train_step(state, inputs, targets):\n", + " (trainable_variables, non_trainable_variables, optimizer_variables) = state\n", + " (loss, non_trainable_variables), grads = grad_fn(\n", + " trainable_variables, non_trainable_variables, inputs, targets\n", + " )\n", + " trainable_variables, optimizer_variables = optimizer.stateless_apply(\n", + " optimizer_variables, grads, trainable_variables\n", + " )\n", + " return loss, (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " )" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "batch_size = 32\n", + "inputs = train_images[:batch_size]\n", + "targets = train_labels[:batch_size]\n", + "\n", + "trainable_variables = [v.value for v in model.trainable_variables]\n", + "non_trainable_variables = [v.value for v in model.non_trainable_variables]\n", + "optimizer_variables = [v.value for v in optimizer.variables]\n", + "\n", + "state = (trainable_variables, non_trainable_variables, optimizer_variables)\n", + "loss, state = train_step(state, inputs, targets)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Low-level usage of metrics" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "metric = keras.metrics.SparseCategoricalAccuracy()\n", + "targets = ops.array([0, 1, 2])\n", + "predictions = ops.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])\n", + "metric.update_state(targets, predictions)\n", + "current_result = metric.result()\n", + "print(f\"result: {current_result:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "values = ops.array([0, 1, 2, 3, 4])\n", + "mean_tracker = keras.metrics.Mean()\n", + "for value in values:\n", + " mean_tracker.update_state(value)\n", + "print(f\"Mean of values: {mean_tracker.result():.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "metric = keras.metrics.SparseCategoricalAccuracy()\n", + "targets = ops.array([0, 1, 2])\n", + "predictions = ops.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])\n", + "\n", + "metric_variables = metric.variables\n", + "metric_variables = metric.stateless_update_state(\n", + " metric_variables, targets, predictions\n", + ")\n", + "current_result = metric.stateless_result(metric_variables)\n", + "print(f\"result: {current_result:.2f}\")\n", + "\n", + "metric_variables = metric.stateless_reset_state()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using fit() with a custom training loop" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Customizing fit() with TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import keras\n", + "from keras import layers\n", + "\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "loss_tracker = keras.metrics.Mean(name=\"loss\")\n", + "\n", + "class 
CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " with tf.GradientTape() as tape:\n", + " predictions = self(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + " gradients = tape.gradient(loss, self.trainable_weights)\n", + " self.optimizer.apply(gradients, self.trainable_weights)\n", + "\n", + " loss_tracker.update_state(loss)\n", + " return {\"loss\": loss_tracker.result()}\n", + "\n", + " @property\n", + " def metrics(self):\n", + " return [loss_tracker]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(optimizer=keras.optimizers.Adam())\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Customizing fit() with PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "import keras\n", + "from keras import layers\n", + "\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "loss_tracker = keras.metrics.Mean(name=\"loss\")\n", + "\n", + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " predictions = self(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + "\n", + " loss.backward()\n", + " trainable_weights = [v for v in self.trainable_weights]\n", + " gradients = [v.value.grad for v in trainable_weights]\n", + "\n", + " with torch.no_grad():\n", + " self.optimizer.apply(gradients, trainable_weights)\n", + "\n", + " loss_tracker.update_state(loss)\n", + " return {\"loss\": loss_tracker.result()}\n", + "\n", + " @property\n", + " def metrics(self):\n", + " return [loss_tracker]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(optimizer=keras.optimizers.Adam())\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Customizing fit() with JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import keras\n", + "from 
keras import layers\n", + "\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "\n", + "class CustomModel(keras.Model):\n", + " def compute_loss_and_updates(\n", + " self,\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " targets,\n", + " training=False,\n", + " ):\n", + " predictions, non_trainable_variables = self.stateless_call(\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " training=training,\n", + " )\n", + " loss = loss_fn(targets, predictions)\n", + " return loss, non_trainable_variables\n", + "\n", + " def train_step(self, state, data):\n", + " (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " metrics_variables,\n", + " ) = state\n", + " inputs, targets = data\n", + "\n", + " grad_fn = jax.value_and_grad(\n", + " self.compute_loss_and_updates, has_aux=True\n", + " )\n", + "\n", + " (loss, non_trainable_variables), grads = grad_fn(\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " targets,\n", + " training=True,\n", + " )\n", + "\n", + " (\n", + " trainable_variables,\n", + " optimizer_variables,\n", + " ) = self.optimizer.stateless_apply(\n", + " optimizer_variables, grads, trainable_variables\n", + " )\n", + "\n", + " logs = {\"loss\": loss}\n", + " state = (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " metrics_variables,\n", + " )\n", + " return logs, state" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(optimizer=keras.optimizers.Adam())\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Handling metrics in a custom train_step()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### train_step() metrics handling with TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import keras\n", + "from keras import layers\n", + "\n", + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " with tf.GradientTape() as tape:\n", + " predictions = self(inputs, training=True)\n", + " loss = self.compute_loss(y=targets, y_pred=predictions)\n", + "\n", + " gradients = tape.gradient(loss, self.trainable_weights)\n", + " self.optimizer.apply(gradients, self.trainable_weights)\n", + "\n", + " for metric in self.metrics:\n", + " if metric.name == \"loss\":\n", + " metric.update_state(loss)\n", + " else:\n", + " metric.update_state(targets, predictions)\n", + "\n", + " return {m.name: m.result() for m in self.metrics}" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
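"colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "%%backend tensorflow\n",
+    "# (Added sketch, not in the book; demo_model is illustrative.) After\n",
+    "# compile(), self.metrics holds a built-in loss tracker plus the\n",
+    "# compiled metrics; the name check in train_step() above relies on\n",
+    "# this. Print the metric names the loop will see:\n",
+    "inputs = keras.Input(shape=(28 * 28,))\n",
+    "outputs = layers.Dense(10, activation=\"softmax\")(inputs)\n",
+    "demo_model = CustomModel(inputs, outputs)\n",
+    "demo_model.compile(\n",
+    "    optimizer=keras.optimizers.Adam(),\n",
+    "    loss=keras.losses.SparseCategoricalCrossentropy(),\n",
+    "    metrics=[keras.metrics.SparseCategoricalAccuracy()],\n",
+    ")\n",
+    "print([m.name for m in demo_model.metrics])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+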
"colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(\n", + " optimizer=keras.optimizers.Adam(),\n", + " loss=keras.losses.SparseCategoricalCrossentropy(),\n", + " metrics=[keras.metrics.SparseCategoricalAccuracy()],\n", + " )\n", + " return model\n", + "\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### train_step() metrics handling with PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "import keras\n", + "from keras import layers\n", + "\n", + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " predictions = self(inputs, training=True)\n", + " loss = self.compute_loss(y=targets, y_pred=predictions)\n", + "\n", + " loss.backward()\n", + " trainable_weights = [v for v in self.trainable_weights]\n", + " gradients = [v.value.grad for v in trainable_weights]\n", + "\n", + " with torch.no_grad():\n", + " self.optimizer.apply(gradients, trainable_weights)\n", + "\n", + " for metric in self.metrics:\n", + " if metric.name == \"loss\":\n", + " metric.update_state(loss)\n", + " else:\n", + " metric.update_state(targets, predictions)\n", + "\n", + " return {m.name: m.result() for m in self.metrics}" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(\n", + " optimizer=keras.optimizers.Adam(),\n", + " loss=keras.losses.SparseCategoricalCrossentropy(),\n", + " metrics=[keras.metrics.SparseCategoricalAccuracy()],\n", + " )\n", + " return model\n", + "\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### train_step() metrics handling with JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import keras\n", + "from keras import layers\n", + "\n", + "class CustomModel(keras.Model):\n", + " def compute_loss_and_updates(\n", + " self,\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " targets,\n", + " training=False,\n", + " ):\n", + " predictions, non_trainable_variables = self.stateless_call(\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " training=training,\n", + " )\n", + " loss = self.compute_loss(y=targets, y_pred=predictions)\n", + " return loss, (predictions, non_trainable_variables)\n", + "\n", + " def train_step(self, state, data):\n", + " (\n", + " trainable_variables,\n", + " 
non_trainable_variables,\n", + " optimizer_variables,\n", + " metrics_variables,\n", + " ) = state\n", + " inputs, targets = data\n", + "\n", + " grad_fn = jax.value_and_grad(\n", + " self.compute_loss_and_updates, has_aux=True\n", + " )\n", + "\n", + " (loss, (predictions, non_trainable_variables)), grads = grad_fn(\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " targets,\n", + " training=True,\n", + " )\n", + " (\n", + " trainable_variables,\n", + " optimizer_variables,\n", + " ) = self.optimizer.stateless_apply(\n", + " optimizer_variables, grads, trainable_variables\n", + " )\n", + "\n", + " new_metrics_vars = []\n", + " logs = {}\n", + " for metric in self.metrics:\n", + " num_prev = len(new_metrics_vars)\n", + " num_current = len(metric.variables)\n", + " current_vars = metrics_variables[num_prev : num_prev + num_current]\n", + " if metric.name == \"loss\":\n", + " current_vars = metric.stateless_update_state(current_vars, loss)\n", + " else:\n", + " current_vars = metric.stateless_update_state(\n", + " current_vars, targets, predictions\n", + " )\n", + " logs[metric.name] = metric.stateless_result(current_vars)\n", + " new_metrics_vars += current_vars\n", + "\n", + " state = (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " new_metrics_vars,\n", + " )\n", + " return logs, state" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter07_deep-dive-keras", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter08_image-classification.ipynb b/chapter08_image-classification.ipynb new file mode 100644 index 0000000000..63d8e640f7 --- /dev/null +++ b/chapter08_image-classification.ipynb @@ -0,0 +1,1030 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Image classification" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to ConvNets" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(inputs)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import mnist\n", + "\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28, 28, 1))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28, 28, 1))\n", + "test_images = test_images.astype(\"float32\") / 255\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(train_images, train_labels, epochs=5, batch_size=64)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_loss, test_acc = model.evaluate(test_images, test_labels)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The convolution operation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Understanding border effects and padding" + ] + }, + { + "cell_type": "markdown", + "metadata": { + 
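"colab_type": "text"
+   },
+   "source": [
+    "*(Added sketch, not in the book.)* Border effects in one shape check: with valid padding, a 3x3 window cannot be centered on the border pixels, so each convolution shrinks the feature map; with same padding, the input is padded so the output keeps the input's spatial size."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab_type": "code"
+   },
+   "outputs": [],
+   "source": [
+    "x = keras.ops.zeros((1, 28, 28, 1))\n",
+    "# Expected: (1, 26, 26, 32) -- two border rows and columns are lost\n",
+    "print(layers.Conv2D(32, 3, padding=\"valid\")(x).shape)\n",
+    "# Expected: (1, 28, 28, 32) -- padding preserves the spatial size\n",
+    "print(layers.Conv2D(32, 3, padding=\"same\")(x).shape)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+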
"colab_type": "text" + }, + "source": [ + "##### Understanding convolution strides" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The max-pooling operation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(inputs)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(x)\n", + "model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model_no_max_pool.summary(line_length=80)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training a ConvNet from scratch on a small dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The relevance of deep learning for small-data problems" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Downloading the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import kagglehub\n", + "\n", + "kagglehub.login()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "download_path = kagglehub.competition_download(\"dogs-vs-cats\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import zipfile\n", + "\n", + "with zipfile.ZipFile(download_path + \"/train.zip\", \"r\") as zip_ref:\n", + " zip_ref.extractall(\".\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, shutil, pathlib\n", + "\n", + "original_dir = pathlib.Path(\"train\")\n", + "new_base_dir = pathlib.Path(\"dogs_vs_cats_small\")\n", + "\n", + "def make_subset(subset_name, start_index, end_index):\n", + " for category in (\"cat\", \"dog\"):\n", + " dir = new_base_dir / subset_name / category\n", + " os.makedirs(dir)\n", + " fnames = [f\"{category}.{i}.jpg\" for i in range(start_index, end_index)]\n", + " for fname in fnames:\n", + " shutil.copyfile(src=original_dir / fname, dst=dir / fname)\n", + "\n", + "make_subset(\"train\", start_index=0, end_index=1000)\n", + "make_subset(\"validation\", start_index=1000, end_index=1500)\n", + "make_subset(\"test\", start_index=1500, end_index=2500)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building your model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = layers.Rescaling(1.0 / 255)(inputs)\n", + "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=64, kernel_size=3, 
activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=512, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Data preprocessing" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.utils import image_dataset_from_directory\n", + "\n", + "batch_size = 64\n", + "image_size = (180, 180)\n", + "train_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"train\", image_size=image_size, batch_size=batch_size\n", + ")\n", + "validation_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"validation\", image_size=image_size, batch_size=batch_size\n", + ")\n", + "test_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"test\", image_size=image_size, batch_size=batch_size\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Understanding TensorFlow Dataset objects" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import tensorflow as tf\n", + "\n", + "random_numbers = np.random.normal(size=(1000, 16))\n", + "dataset = tf.data.Dataset.from_tensor_slices(random_numbers)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for i, element in enumerate(dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batched_dataset = dataset.batch(32)\n", + "for i, element in enumerate(batched_dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "reshaped_dataset = dataset.map(\n", + " lambda x: tf.reshape(x, (4, 4)),\n", + " num_parallel_calls=8)\n", + "for i, element in enumerate(reshaped_dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Fitting the model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for data_batch, labels_batch in train_dataset:\n", + " print(\"data batch shape:\", data_batch.shape)\n", + " print(\"labels 
batch shape:\", labels_batch.shape)\n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"convnet_from_scratch.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=50,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "accuracy = history.history[\"accuracy\"]\n", + "val_accuracy = history.history[\"val_accuracy\"]\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(accuracy) + 1)\n", + "\n", + "plt.plot(epochs, accuracy, \"r--\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_accuracy, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.legend()\n", + "plt.figure()\n", + "\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\"convnet_from_scratch.keras\")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using data augmentation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data_augmentation_layers = [\n", + " layers.RandomFlip(\"horizontal\"),\n", + " layers.RandomRotation(0.1),\n", + " layers.RandomZoom(0.2),\n", + "]\n", + "\n", + "def data_augmentation(images, targets):\n", + " for layer in data_augmentation_layers:\n", + " images = layer(images)\n", + " return images, targets\n", + "\n", + "augmented_train_dataset = train_dataset.map(\n", + " data_augmentation, num_parallel_calls=8\n", + ")\n", + "augmented_train_dataset = augmented_train_dataset.prefetch(tf.data.AUTOTUNE)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.figure(figsize=(10, 10))\n", + "for image_batch, _ in train_dataset.take(1):\n", + " image = image_batch[0]\n", + " for i in range(9):\n", + " ax = plt.subplot(3, 3, i + 1)\n", + " augmented_image, _ = data_augmentation(image, None)\n", + " augmented_image = keras.ops.convert_to_numpy(augmented_image)\n", + " plt.imshow(augmented_image.astype(\"uint8\"))\n", + " plt.axis(\"off\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = layers.Rescaling(1.0 / 255)(inputs)\n", + "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = 
layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=512, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "x = layers.Dropout(0.25)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)\n", + "\n", + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"convnet_from_scratch_with_augmentation.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " augmented_train_dataset,\n", + " epochs=100,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\n", + " \"convnet_from_scratch_with_augmentation.keras\"\n", + ")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a pretrained model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Feature extraction with a pretrained model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "conv_base = keras_hub.models.Backbone.from_preset(\"xception_41_imagenet\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "preprocessor = keras_hub.layers.ImageConverter.from_preset(\n", + " \"xception_41_imagenet\",\n", + " image_size=(180, 180),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Fast feature extraction without data augmentation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_features_and_labels(dataset):\n", + " all_features = []\n", + " all_labels = []\n", + " for images, labels in dataset:\n", + " preprocessed_images = preprocessor(images)\n", + " features = conv_base.predict(preprocessed_images, verbose=0)\n", + " all_features.append(features)\n", + " all_labels.append(labels)\n", + " return np.concatenate(all_features), np.concatenate(all_labels)\n", + "\n", + "train_features, train_labels = get_features_and_labels(train_dataset)\n", + "val_features, val_labels = get_features_and_labels(validation_dataset)\n", + "test_features, test_labels = get_features_and_labels(test_dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_features.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = 
keras.Input(shape=(6, 6, 2048))\n", + "x = layers.GlobalAveragePooling2D()(inputs)\n", + "x = layers.Dense(256, activation=\"relu\")(x)\n", + "x = layers.Dropout(0.25)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"feature_extraction.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " train_features,\n", + " train_labels,\n", + " epochs=10,\n", + " validation_data=(val_features, val_labels),\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "acc = history.history[\"accuracy\"]\n", + "val_acc = history.history[\"val_accuracy\"]\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(acc) + 1)\n", + "plt.plot(epochs, acc, \"r--\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.legend()\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\"feature_extraction.keras\")\n", + "test_loss, test_acc = test_model.evaluate(test_features, test_labels)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Feature extraction together with data augmentation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "conv_base = keras_hub.models.Backbone.from_preset(\n", + " \"xception_41_imagenet\",\n", + " trainable=False,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.trainable = True\n", + "len(conv_base.trainable_weights)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.trainable = False\n", + "len(conv_base.trainable_weights)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = preprocessor(inputs)\n", + "x = conv_base(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "x = layers.Dense(256)(x)\n", + "x = layers.Dropout(0.25)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
"callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"feature_extraction_with_data_augmentation.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " augmented_train_dataset,\n", + " epochs=30,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\n", + " \"feature_extraction_with_data_augmentation.keras\"\n", + ")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Fine-tuning a pretrained model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=keras.optimizers.Adam(learning_rate=1e-5),\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"fine_tuning.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " augmented_train_dataset,\n", + " epochs=30,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.models.load_model(\"fine_tuning.keras\")\n", + "test_loss, test_acc = model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter08_image-classification", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter09_convnet-architecture-patterns.ipynb b/chapter09_convnet-architecture-patterns.ipynb new file mode 100644 index 0000000000..136ebaa12d --- /dev/null +++ b/chapter09_convnet-architecture-patterns.ipynb @@ -0,0 +1,381 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## ConvNet architecture patterns" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Modularity, hierarchy, and reuse" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Residual connections" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\")(inputs)\n", + "residual = x\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + "residual = layers.Conv2D(64, 1)(residual)\n", + "x = layers.add([x, residual])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\")(inputs)\n", + "residual = x\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + "x = layers.MaxPooling2D(2, padding=\"same\")(x)\n", + "residual = layers.Conv2D(64, 1, strides=2)(residual)\n", + "x = layers.add([x, residual])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Rescaling(1.0 / 255)(inputs)\n", + "\n", + "def residual_block(x, filters, pooling=False):\n", + " residual = x\n", + " x = layers.Conv2D(filters, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2D(filters, 3, activation=\"relu\", padding=\"same\")(x)\n", + " if pooling:\n", + " x = layers.MaxPooling2D(2, padding=\"same\")(x)\n", + " residual = layers.Conv2D(filters, 1, strides=2)(residual)\n", + " elif filters != residual.shape[-1]:\n", + " residual = layers.Conv2D(filters, 1)(residual)\n", + " x = layers.add([x, residual])\n", + " return x\n", + "\n", + "x = residual_block(x, filters=32, pooling=True)\n", + "x = residual_block(x, filters=64, pooling=True)\n", + "x = residual_block(x, filters=128, pooling=False)\n", + "\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, 
outputs=outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Batch normalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Depthwise separable convolutions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Putting it together: A mini Xception-like model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import kagglehub\n", + "\n", + "kagglehub.login()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import zipfile\n", + "\n", + "download_path = kagglehub.competition_download(\"dogs-vs-cats\")\n", + "\n", + "with zipfile.ZipFile(download_path + \"/train.zip\", \"r\") as zip_ref:\n", + " zip_ref.extractall(\".\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, shutil, pathlib\n", + "from keras.utils import image_dataset_from_directory\n", + "\n", + "original_dir = pathlib.Path(\"train\")\n", + "new_base_dir = pathlib.Path(\"dogs_vs_cats_small\")\n", + "\n", + "def make_subset(subset_name, start_index, end_index):\n", + " for category in (\"cat\", \"dog\"):\n", + " dir = new_base_dir / subset_name / category\n", + " os.makedirs(dir)\n", + " fnames = [f\"{category}.{i}.jpg\" for i in range(start_index, end_index)]\n", + " for fname in fnames:\n", + " shutil.copyfile(src=original_dir / fname, dst=dir / fname)\n", + "\n", + "make_subset(\"train\", start_index=0, end_index=1000)\n", + "make_subset(\"validation\", start_index=1000, end_index=1500)\n", + "make_subset(\"test\", start_index=1500, end_index=2500)\n", + "\n", + "batch_size = 64\n", + "image_size = (180, 180)\n", + "train_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"train\",\n", + " image_size=image_size,\n", + " batch_size=batch_size,\n", + ")\n", + "validation_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"validation\",\n", + " image_size=image_size,\n", + " batch_size=batch_size,\n", + ")\n", + "test_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"test\",\n", + " image_size=image_size,\n", + " batch_size=batch_size,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from keras import layers\n", + "\n", + "data_augmentation_layers = [\n", + " layers.RandomFlip(\"horizontal\"),\n", + " layers.RandomRotation(0.1),\n", + " layers.RandomZoom(0.2),\n", + "]\n", + "\n", + "def data_augmentation(images, targets):\n", + " for layer in data_augmentation_layers:\n", + " images = layer(images)\n", + " return images, targets\n", + "\n", + "augmented_train_dataset = train_dataset.map(\n", + " data_augmentation, num_parallel_calls=8\n", + ")\n", + "augmented_train_dataset = augmented_train_dataset.prefetch(tf.data.AUTOTUNE)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "\n", + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = layers.Rescaling(1.0 / 255)(inputs)\n", + "x = layers.Conv2D(filters=32, kernel_size=5, use_bias=False)(x)\n", + "\n", + "for size in [32, 64, 128, 256, 512]:\n", + " residual = 
x\n", + "\n", + " x = layers.BatchNormalization()(x)\n", + " x = layers.Activation(\"relu\")(x)\n", + " x = layers.SeparableConv2D(size, 3, padding=\"same\", use_bias=False)(x)\n", + "\n", + " x = layers.BatchNormalization()(x)\n", + " x = layers.Activation(\"relu\")(x)\n", + " x = layers.SeparableConv2D(size, 3, padding=\"same\", use_bias=False)(x)\n", + "\n", + " x = layers.MaxPooling2D(3, strides=2, padding=\"same\")(x)\n", + "\n", + " residual = layers.Conv2D(\n", + " size, 1, strides=2, padding=\"same\", use_bias=False\n", + " )(residual)\n", + " x = layers.add([x, residual])\n", + "\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history = model.fit(\n", + " augmented_train_dataset,\n", + " epochs=100,\n", + " validation_data=validation_dataset,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Beyond convolution: Vision Transformers" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter09_convnet-architecture-patterns", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter10_interpreting-what-convnets-learn.ipynb b/chapter10_interpreting-what-convnets-learn.ipynb new file mode 100644 index 0000000000..869c82d8f5 --- /dev/null +++ b/chapter10_interpreting-what-convnets-learn.ipynb @@ -0,0 +1,827 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Interpreting what ConvNets learn" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing intermediate activations" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from google.colab import files\n", + "\n", + "# You can use this to load the file\n", + "# \"convnet_from_scratch_with_augmentation.keras\"\n", + "# you obtained in the last chapter.\n", + "files.upload()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "model = keras.models.load_model(\n", + " \"convnet_from_scratch_with_augmentation.keras\"\n", + ")\n", + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "import numpy as np\n", + "\n", + "img_path = keras.utils.get_file(\n", + " fname=\"cat.jpg\", origin=\"https://img-datasets.s3.amazonaws.com/cat.jpg\"\n", + ")\n", + "\n", + "def get_img_array(img_path, target_size):\n", + " img = keras.utils.load_img(img_path, target_size=target_size)\n", + " array = keras.utils.img_to_array(img)\n", + " array = np.expand_dims(array, axis=0)\n", + " return array\n", + "\n", + "img_tensor = get_img_array(img_path, target_size=(180, 180))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "plt.axis(\"off\")\n", + "plt.imshow(img_tensor[0].astype(\"uint8\"))\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import layers\n", + "\n", + "layer_outputs = []\n", + "layer_names = []\n", + "for layer in model.layers:\n", + " if isinstance(layer, (layers.Conv2D, layers.MaxPooling2D)):\n", + " layer_outputs.append(layer.output)\n", + " layer_names.append(layer.name)\n", + "activation_model = keras.Model(inputs=model.input, outputs=layer_outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "activations = 
activation_model.predict(img_tensor)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "first_layer_activation = activations[0]\n", + "print(first_layer_activation.shape)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "plt.matshow(first_layer_activation[0, :, :, 5], cmap=\"viridis\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "images_per_row = 16\n", + "for layer_name, layer_activation in zip(layer_names, activations):\n", + " n_features = layer_activation.shape[-1]\n", + " size = layer_activation.shape[1]\n", + " n_cols = n_features // images_per_row\n", + " display_grid = np.zeros(\n", + " ((size + 1) * n_cols - 1, images_per_row * (size + 1) - 1)\n", + " )\n", + " for col in range(n_cols):\n", + " for row in range(images_per_row):\n", + " channel_index = col * images_per_row + row\n", + " channel_image = layer_activation[0, :, :, channel_index].copy()\n", + " if channel_image.sum() != 0:\n", + " channel_image -= channel_image.mean()\n", + " channel_image /= channel_image.std()\n", + " channel_image *= 64\n", + " channel_image += 128\n", + " channel_image = np.clip(channel_image, 0, 255).astype(\"uint8\")\n", + " display_grid[\n", + " col * (size + 1) : (col + 1) * size + col,\n", + " row * (size + 1) : (row + 1) * size + row,\n", + " ] = channel_image\n", + " scale = 1.0 / size\n", + " plt.figure(\n", + " figsize=(scale * display_grid.shape[1], scale * display_grid.shape[0])\n", + " )\n", + " plt.title(layer_name)\n", + " plt.grid(False)\n", + " plt.axis(\"off\")\n", + " plt.imshow(display_grid, aspect=\"auto\", cmap=\"viridis\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing ConvNet filters" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "model = keras_hub.models.Backbone.from_preset(\n", + " \"xception_41_imagenet\",\n", + ")\n", + "preprocessor = keras_hub.layers.ImageConverter.from_preset(\n", + " \"xception_41_imagenet\",\n", + " image_size=(180, 180),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for layer in model.layers:\n", + " if isinstance(layer, (keras.layers.Conv2D, keras.layers.SeparableConv2D)):\n", + " print(layer.name)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "layer_name = \"block3_sepconv1\"\n", + "layer = model.get_layer(name=layer_name)\n", + "feature_extractor = keras.Model(inputs=model.input, outputs=layer.output)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "activation = feature_extractor(preprocessor(img_tensor))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "def compute_loss(image, filter_index):\n", + " activation = feature_extractor(image)\n", + " filter_activation = activation[:, 2:-2, 2:-2, filter_index]\n", + " return 
ops.mean(filter_activation)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Gradient ascent in TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import tensorflow as tf\n", + "\n", + "@tf.function\n", + "def gradient_ascent_step(image, filter_index, learning_rate):\n", + " with tf.GradientTape() as tape:\n", + " tape.watch(image)\n", + " loss = compute_loss(image, filter_index)\n", + " grads = tape.gradient(loss, image)\n", + " grads = ops.normalize(grads)\n", + " image += learning_rate * grads\n", + " return image" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Gradient ascent in PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "import torch\n", + "\n", + "def gradient_ascent_step(image, filter_index, learning_rate):\n", + " image = image.clone().detach().requires_grad_(True)\n", + " loss = compute_loss(image, filter_index)\n", + " loss.backward()\n", + " grads = image.grad\n", + " grads = ops.normalize(grads)\n", + " image = image + learning_rate * grads\n", + " return image" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Gradient ascent in JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import jax\n", + "\n", + "grad_fn = jax.grad(compute_loss)\n", + "\n", + "@jax.jit\n", + "def gradient_ascent_step(image, filter_index, learning_rate):\n", + " grads = grad_fn(image, filter_index)\n", + " grads = ops.normalize(grads)\n", + " image += learning_rate * grads\n", + " return image" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The filter visualization loop" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "img_width = 200\n", + "img_height = 200\n", + "\n", + "def generate_filter_pattern(filter_index):\n", + " iterations = 30\n", + " learning_rate = 10.0\n", + " image = keras.random.uniform(\n", + " minval=0.4, maxval=0.6, shape=(1, img_width, img_height, 3)\n", + " )\n", + " for i in range(iterations):\n", + " image = gradient_ascent_step(image, filter_index, learning_rate)\n", + " return image[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def deprocess_image(image):\n", + " image -= ops.mean(image)\n", + " image /= ops.std(image)\n", + " image *= 64\n", + " image += 128\n", + " image = ops.clip(image, 0, 255)\n", + " image = image[25:-25, 25:-25, :]\n", + " image = ops.cast(image, dtype=\"uint8\")\n", + " return ops.convert_to_numpy(image)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.axis(\"off\")\n", + "plt.imshow(deprocess_image(generate_filter_pattern(filter_index=2)))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "all_images = []\n", + "for filter_index in range(64):\n", + " print(f\"Processing filter {filter_index}\")\n", + " 
image = deprocess_image(generate_filter_pattern(filter_index))\n", + " all_images.append(image)\n", + "\n", + "margin = 5\n", + "n = 8\n", + "box_width = img_width - 25 * 2\n", + "box_height = img_height - 25 * 2\n", + "full_width = n * box_width + (n - 1) * margin\n", + "full_height = n * box_height + (n - 1) * margin\n", + "stitched_filters = np.zeros((full_width, full_height, 3))\n", + "\n", + "for i in range(n):\n", + " for j in range(n):\n", + " image = all_images[i * n + j]\n", + " stitched_filters[\n", + " (box_width + margin) * i : (box_width + margin) * i + box_width,\n", + " (box_height + margin) * j : (box_height + margin) * j + box_height,\n", + " :,\n", + " ] = image\n", + "\n", + "keras.utils.save_img(f\"filters_for_layer_{layer_name}.png\", stitched_filters)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing heatmaps of class activation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "img_path = keras.utils.get_file(\n", + " fname=\"elephant.jpg\",\n", + " origin=\"https://img-datasets.s3.amazonaws.com/elephant.jpg\",\n", + ")\n", + "img = keras.utils.load_img(img_path)\n", + "img_array = np.expand_dims(img, axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras_hub.models.ImageClassifier.from_preset(\n", + " \"xception_41_imagenet\",\n", + " activation=\"softmax\",\n", + ")\n", + "preds = model.predict(img_array)\n", + "preds.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras_hub.utils.decode_imagenet_predictions(preds)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.argmax(preds[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "img_array = model.preprocessor(img_array)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "last_conv_layer_name = \"block14_sepconv2_act\"\n", + "last_conv_layer = model.backbone.get_layer(last_conv_layer_name)\n", + "last_conv_layer_model = keras.Model(model.inputs, last_conv_layer.output)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "classifier_input = last_conv_layer.output\n", + "x = classifier_input\n", + "for layer_name in [\"pooler\", \"predictions\"]:\n", + " x = model.get_layer(layer_name)(x)\n", + "classifier_model = keras.Model(classifier_input, x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting the gradient of the top class: TensorFlow version" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import tensorflow as tf\n", + "\n", + "def get_top_class_gradients(img_array):\n", + " last_conv_layer_output = last_conv_layer_model(img_array)\n", + " with tf.GradientTape() as tape:\n", + " tape.watch(last_conv_layer_output)\n", + " preds = classifier_model(last_conv_layer_output)\n", + " top_pred_index = ops.argmax(preds[0])\n", + " 
top_class_channel = preds[:, top_pred_index]\n", + "\n", + " grads = tape.gradient(top_class_channel, last_conv_layer_output)\n", + " return grads, last_conv_layer_output\n", + "\n", + "grads, last_conv_layer_output = get_top_class_gradients(img_array)\n", + "grads = ops.convert_to_numpy(grads)\n", + "last_conv_layer_output = ops.convert_to_numpy(last_conv_layer_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting the gradient of the top class: PyTorch version" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "def get_top_class_gradients(img_array):\n", + " last_conv_layer_output = last_conv_layer_model(img_array)\n", + " last_conv_layer_output = (\n", + " last_conv_layer_output.clone().detach().requires_grad_(True)\n", + " )\n", + " preds = classifier_model(last_conv_layer_output)\n", + " top_pred_index = ops.argmax(preds[0])\n", + " top_class_channel = preds[:, top_pred_index]\n", + " top_class_channel.backward()\n", + " grads = last_conv_layer_output.grad\n", + " return grads, last_conv_layer_output\n", + "\n", + "grads, last_conv_layer_output = get_top_class_gradients(img_array)\n", + "grads = ops.convert_to_numpy(grads)\n", + "last_conv_layer_output = ops.convert_to_numpy(last_conv_layer_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting the gradient of the top class: JAX version" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import jax\n", + "\n", + "def loss_fn(last_conv_layer_output):\n", + " preds = classifier_model(last_conv_layer_output)\n", + " top_pred_index = ops.argmax(preds[0])\n", + " top_class_channel = preds[:, top_pred_index]\n", + " return top_class_channel[0]\n", + "\n", + "grad_fn = jax.grad(loss_fn)\n", + "\n", + "def get_top_class_gradients(img_array):\n", + " last_conv_layer_output = last_conv_layer_model(img_array)\n", + " grads = grad_fn(last_conv_layer_output)\n", + " return grads, last_conv_layer_output\n", + "\n", + "grads, last_conv_layer_output = get_top_class_gradients(img_array)\n", + "grads = ops.convert_to_numpy(grads)\n", + "last_conv_layer_output = ops.convert_to_numpy(last_conv_layer_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Displaying the class activation heatmap" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "pooled_grads = np.mean(grads, axis=(0, 1, 2))\n", + "last_conv_layer_output = last_conv_layer_output[0].copy()\n", + "for i in range(pooled_grads.shape[-1]):\n", + " last_conv_layer_output[:, :, i] *= pooled_grads[i]\n", + "heatmap = np.mean(last_conv_layer_output, axis=-1)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "heatmap = np.maximum(heatmap, 0)\n", + "heatmap /= np.max(heatmap)\n", + "plt.matshow(heatmap)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.cm as cm\n", + "\n", + "img = keras.utils.load_img(img_path)\n", + "img = keras.utils.img_to_array(img)\n", + "\n", + "heatmap = np.uint8(255 * heatmap)\n", + "\n", + 
"jet = cm.get_cmap(\"jet\")\n", + "jet_colors = jet(np.arange(256))[:, :3]\n", + "jet_heatmap = jet_colors[heatmap]\n", + "\n", + "jet_heatmap = keras.utils.array_to_img(jet_heatmap)\n", + "jet_heatmap = jet_heatmap.resize((img.shape[1], img.shape[0]))\n", + "jet_heatmap = keras.utils.img_to_array(jet_heatmap)\n", + "\n", + "superimposed_img = jet_heatmap * 0.4 + img\n", + "superimposed_img = keras.utils.array_to_img(superimposed_img)\n", + "\n", + "plt.imshow(superimposed_img)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing the latent space of a ConvNet" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter10_interpreting-what-convnets-learn", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter11_image-segmentation.ipynb b/chapter11_image-segmentation.ipynb new file mode 100644 index 0000000000..74f4c6b13d --- /dev/null +++ b/chapter11_image-segmentation.ipynb @@ -0,0 +1,701 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Image segmentation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Computer vision tasks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Types of image segmentation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training a segmentation model from scratch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Downloading a segmentation dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz\n", + "!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz\n", + "!tar -xf images.tar.gz\n", + "!tar -xf annotations.tar.gz" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import pathlib\n", + "\n", + "input_dir = pathlib.Path(\"images\")\n", + "target_dir = pathlib.Path(\"annotations/trimaps\")\n", + "\n", + "input_img_paths = sorted(input_dir.glob(\"*.jpg\"))\n", + "target_paths = sorted(target_dir.glob(\"[!.]*.png\"))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "from keras.utils import load_img, img_to_array, array_to_img\n", + "\n", + "plt.axis(\"off\")\n", + "plt.imshow(load_img(input_img_paths[9]))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def display_target(target_array):\n", + " normalized_array = (target_array.astype(\"uint8\") - 1) * 127\n", + " plt.axis(\"off\")\n", + " plt.imshow(normalized_array[:, :, 0])\n", + "\n", + "img = img_to_array(load_img(target_paths[9], color_mode=\"grayscale\"))\n", + "display_target(img)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import random\n", + "\n", + "img_size = (200, 200)\n", + "num_imgs = len(input_img_paths)\n", + "\n", + "random.Random(1337).shuffle(input_img_paths)\n", + "random.Random(1337).shuffle(target_paths)\n", + "\n", + "def path_to_input_image(path):\n", + " return img_to_array(load_img(path, target_size=img_size))\n", + "\n", + "def path_to_target(path):\n", + " img = img_to_array(\n", + " load_img(path, target_size=img_size, color_mode=\"grayscale\")\n", + " )\n", + " img = img.astype(\"uint8\") - 1\n", + " return img\n", + "\n", + "input_imgs = np.zeros((num_imgs,) + img_size + (3,), dtype=\"float32\")\n", + "targets = np.zeros((num_imgs,) + img_size + (1,), dtype=\"uint8\")\n", + "for i in range(num_imgs):\n", + " input_imgs[i] = path_to_input_image(input_img_paths[i])\n", + " targets[i] = path_to_target(target_paths[i])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_val_samples = 1000\n", + "train_input_imgs = input_imgs[:-num_val_samples]\n", + 
"train_targets = targets[:-num_val_samples]\n", + "val_input_imgs = input_imgs[-num_val_samples:]\n", + "val_targets = targets[-num_val_samples:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building and training the segmentation model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras.layers import Rescaling, Conv2D, Conv2DTranspose\n", + "\n", + "def get_model(img_size, num_classes):\n", + " inputs = keras.Input(shape=img_size + (3,))\n", + " x = Rescaling(1.0 / 255)(inputs)\n", + "\n", + " x = Conv2D(64, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2D(128, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2D(128, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2D(256, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n", + " x = Conv2D(256, 3, activation=\"relu\", padding=\"same\")(x)\n", + "\n", + " x = Conv2DTranspose(256, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(256, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(128, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(128, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(64, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + "\n", + " outputs = Conv2D(num_classes, 3, activation=\"softmax\", padding=\"same\")(x)\n", + "\n", + " return keras.Model(inputs, outputs)\n", + "\n", + "model = get_model(img_size=img_size, num_classes=3)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# \u26a0\ufe0fNOTE\u26a0\ufe0f: The following IoU metric is *very* slow on the PyTorch backend!\n", + "# If you are running with PyTorch, we recommend re-running the notebook with Jax\n", + "# or TensorFlow, or skipping to the next section of this chapter." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "foreground_iou = keras.metrics.IoU(\n", + " num_classes=3,\n", + " target_class_ids=(0,),\n", + " name=\"foreground_iou\",\n", + " sparse_y_true=True,\n", + " sparse_y_pred=False,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[foreground_iou],\n", + ")\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " \"oxford_segmentation.keras\",\n", + " save_best_only=True,\n", + " ),\n", + "]\n", + "history = model.fit(\n", + " train_input_imgs,\n", + " train_targets,\n", + " epochs=50,\n", + " callbacks=callbacks,\n", + " batch_size=64,\n", + " validation_data=(val_input_imgs, val_targets),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "epochs = range(1, len(history.history[\"loss\"]) + 1)\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.models.load_model(\"oxford_segmentation.keras\")\n", + "\n", + "i = 4\n", + "test_image = val_input_imgs[i]\n", + "plt.axis(\"off\")\n", + "plt.imshow(array_to_img(test_image))\n", + "\n", + "mask = model.predict(np.expand_dims(test_image, 0))[0]\n", + "\n", + "def display_mask(pred):\n", + " mask = np.argmax(pred, axis=-1)\n", + " mask *= 127\n", + " plt.axis(\"off\")\n", + " plt.imshow(mask)\n", + "\n", + "display_mask(mask)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a pretrained segmentation model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Downloading the Segment Anything Model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "model = keras_hub.models.ImageSegmenter.from_preset(\"sam_huge_sa1b\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.count_params()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### How Segment Anything works" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing a test image" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "path = keras.utils.get_file(\n", + " origin=\"https://s3.amazonaws.com/keras.io/img/book/fruits.jpg\"\n", + ")\n", + "pil_image = keras.utils.load_img(path)\n", + "image_array = keras.utils.img_to_array(pil_image)\n", + "\n", + "plt.imshow(image_array.astype(\"uint8\"))\n", + "plt.axis(\"off\")\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": 
"code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "image_size = (1024, 1024)\n", + "\n", + "def resize_and_pad(x):\n", + " return ops.image.resize(x, image_size, pad_to_aspect_ratio=True)\n", + "\n", + "image = resize_and_pad(image_array)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "from keras import ops\n", + "\n", + "def show_image(image, ax):\n", + " ax.imshow(ops.convert_to_numpy(image).astype(\"uint8\"))\n", + "\n", + "def show_mask(mask, ax):\n", + " color = np.array([30 / 255, 144 / 255, 255 / 255, 0.6])\n", + " h, w, _ = mask.shape\n", + " mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)\n", + " ax.imshow(mask_image)\n", + "\n", + "def show_points(points, ax):\n", + " x, y = points[:, 0], points[:, 1]\n", + " ax.scatter(x, y, c=\"green\", marker=\"*\", s=375, ec=\"white\", lw=1.25)\n", + "\n", + "def show_box(box, ax):\n", + " box = box.reshape(-1)\n", + " x0, y0 = box[0], box[1]\n", + " w, h = box[2] - box[0], box[3] - box[1]\n", + " ax.add_patch(plt.Rectangle((x0, y0), w, h, ec=\"red\", fc=\"none\", lw=2))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Prompting the model with a target point" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "input_point = np.array([[580, 450]])\n", + "input_label = np.array([1])\n", + "\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_points(input_point, plt.gca())\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs = model.predict(\n", + " {\n", + " \"images\": ops.expand_dims(image, axis=0),\n", + " \"points\": ops.expand_dims(input_point, axis=0),\n", + " \"labels\": ops.expand_dims(input_label, axis=0),\n", + " }\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs[\"masks\"].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_mask(sam_outputs, index=0):\n", + " mask = sam_outputs[\"masks\"][0][index]\n", + " mask = np.expand_dims(mask, axis=-1)\n", + " mask = resize_and_pad(mask)\n", + " return ops.convert_to_numpy(mask) > 0.0\n", + "\n", + "mask = get_mask(outputs, index=0)\n", + "\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_mask(mask, plt.gca())\n", + "show_points(input_point, plt.gca())\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_point = np.array([[300, 550]])\n", + "input_label = np.array([1])\n", + "\n", + "outputs = model.predict(\n", + " {\n", + " \"images\": ops.expand_dims(image, axis=0),\n", + " \"points\": ops.expand_dims(input_point, axis=0),\n", + " \"labels\": ops.expand_dims(input_label, axis=0),\n", + " }\n", + ")\n", + "mask = get_mask(outputs, index=0)\n", + "\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_mask(mask, plt.gca())\n", + "show_points(input_point, plt.gca())\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + 
"execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(1, 3, figsize=(20, 60))\n", + "masks = outputs[\"masks\"][0][1:]\n", + "for i, mask in enumerate(masks):\n", + " show_image(image, axes[i])\n", + " show_points(input_point, axes[i])\n", + " mask = get_mask(outputs, index=i + 1)\n", + " show_mask(mask, axes[i])\n", + " axes[i].set_title(f\"Mask {i + 1}\", fontsize=16)\n", + " axes[i].axis(\"off\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Prompting the model with a target box" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_box = np.array(\n", + " [\n", + " [520, 180],\n", + " [770, 420],\n", + " ]\n", + ")\n", + "\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_box(input_box, plt.gca())\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs = model.predict(\n", + " {\n", + " \"images\": ops.expand_dims(image, axis=0),\n", + " \"boxes\": ops.expand_dims(input_box, axis=(0, 1)),\n", + " }\n", + ")\n", + "mask = get_mask(outputs, 0)\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_mask(mask, plt.gca())\n", + "show_box(input_box, plt.gca())\n", + "plt.show()" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter11_image-segmentation", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter12_object-detection.ipynb b/chapter12_object-detection.ipynb new file mode 100644 index 0000000000..47bf3e1008 --- /dev/null +++ b/chapter12_object-detection.ipynb @@ -0,0 +1,712 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Object detection" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Single-stage vs. two-stage object detectors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Two-stage R-CNN detectors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Single-stage detectors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training a YOLO model from scratch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Downloading the COCO dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "import keras_hub\n", + "\n", + "images_path = keras.utils.get_file(\n", + " \"coco\",\n", + " \"http://images.cocodataset.org/zips/train2017.zip\",\n", + " extract=True,\n", + ")\n", + "annotations_path = keras.utils.get_file(\n", + " \"annotations\",\n", + " \"http://images.cocodataset.org/annotations/annotations_trainval2017.zip\",\n", + " extract=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import json\n", + "\n", + "with open(f\"{annotations_path}/annotations/instances_train2017.json\", \"r\") as f:\n", + " annotations = json.load(f)\n", + "\n", + "images = {image[\"id\"]: image for image in annotations[\"images\"]}\n", + "\n", + "def scale_box(box, width, height):\n", + " scale = 1.0 / max(width, height)\n", + " x, y, w, h = [v * scale for v in box]\n", + " x += (height - width) * scale / 2 if height > width else 0\n", + " y += (width - height) * scale / 2 if width > height else 0\n", + " return [x, y, w, h]\n", + "\n", + "metadata = {}\n", + "for annotation in annotations[\"annotations\"]:\n", + " id = annotation[\"image_id\"]\n", + " if id not in metadata:\n", + " metadata[id] = {\"boxes\": [], \"labels\": []}\n", + " image = images[id]\n", + " box = scale_box(annotation[\"bbox\"], image[\"width\"], image[\"height\"])\n", + " metadata[id][\"boxes\"].append(box)\n", + " metadata[id][\"labels\"].append(annotation[\"category_id\"])\n", + " metadata[id][\"path\"] = 
images_path + \"/train2017/\" + image[\"file_name\"]\n", + "metadata = list(metadata.values())" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(metadata)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "min([len(x[\"boxes\"]) for x in metadata])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "max([len(x[\"boxes\"]) for x in metadata])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "max(max(x[\"labels\"]) for x in metadata) + 1" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "metadata[435]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "[keras_hub.utils.coco_id_to_name(x) for x in metadata[435][\"labels\"]]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "from matplotlib.colors import hsv_to_rgb\n", + "from matplotlib.patches import Rectangle\n", + "\n", + "color_map = {0: \"gray\"}\n", + "\n", + "def label_to_color(label):\n", + " if label not in color_map:\n", + " h, s, v = (len(color_map) * 0.618) % 1, 0.5, 0.9\n", + " color_map[label] = hsv_to_rgb((h, s, v))\n", + " return color_map[label]\n", + "\n", + "def draw_box(ax, box, text, color):\n", + " x, y, w, h = box\n", + " ax.add_patch(Rectangle((x, y), w, h, lw=2, ec=color, fc=\"none\"))\n", + " textbox = dict(fc=color, pad=1, ec=\"none\")\n", + " ax.text(x, y, text, c=\"white\", size=10, va=\"bottom\", bbox=textbox)\n", + "\n", + "def draw_image(ax, image):\n", + " ax.set(xlim=(0, 1), ylim=(1, 0), xticks=[], yticks=[], aspect=\"equal\")\n", + " image = plt.imread(image)\n", + " height, width = image.shape[:2]\n", + " hpad = (1 - height / width) / 2 if width > height else 0\n", + " wpad = (1 - width / height) / 2 if height > width else 0\n", + " extent = [wpad, 1 - wpad, 1 - hpad, hpad]\n", + " ax.imshow(image, extent=extent)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "sample = metadata[435]\n", + "ig, ax = plt.subplots(dpi=300)\n", + "draw_image(ax, sample[\"path\"])\n", + "for box, label in zip(sample[\"boxes\"], sample[\"labels\"]):\n", + " label_name = keras_hub.utils.coco_id_to_name(label)\n", + " draw_box(ax, box, label_name, label_to_color(label))\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import random\n", + "\n", + "metadata = list(filter(lambda x: len(x[\"boxes\"]) <= 4, metadata))\n", + "random.shuffle(metadata)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Creating a YOLO model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "image_size = 448\n", + "\n", + "backbone = keras_hub.models.Backbone.from_preset(\n", + " \"resnet_50_imagenet\",\n", + ")\n", + "preprocessor = 
keras_hub.layers.ImageConverter.from_preset(\n", + " \"resnet_50_imagenet\",\n", + " image_size=(image_size, image_size),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import layers\n", + "\n", + "grid_size = 6\n", + "num_labels = 91\n", + "\n", + "inputs = keras.Input(shape=(image_size, image_size, 3))\n", + "x = backbone(inputs)\n", + "x = layers.Conv2D(512, (3, 3), strides=(2, 2))(x)\n", + "x = keras.layers.Flatten()(x)\n", + "x = layers.Dense(2048, activation=\"relu\", kernel_initializer=\"glorot_normal\")(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "x = layers.Dense(grid_size * grid_size * (num_labels + 5))(x)\n", + "x = layers.Reshape((grid_size, grid_size, num_labels + 5))(x)\n", + "box_predictions = x[..., :5]\n", + "class_predictions = layers.Activation(\"softmax\")(x[..., 5:])\n", + "outputs = {\"box\": box_predictions, \"class\": class_predictions}\n", + "model = keras.Model(inputs, outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Readying the COCO data for the YOLO model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def to_grid(box):\n", + " x, y, w, h = box\n", + " cx, cy = (x + w / 2) * grid_size, (y + h / 2) * grid_size\n", + " ix, iy = int(cx), int(cy)\n", + " return (ix, iy), (cx - ix, cy - iy, w, h)\n", + "\n", + "def from_grid(loc, box):\n", + " (xi, yi), (x, y, w, h) = loc, box\n", + " x = (xi + x) / grid_size - w / 2\n", + " y = (yi + y) / grid_size - h / 2\n", + " return (x, y, w, h)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import math\n", + "\n", + "class_array = np.zeros((len(metadata), grid_size, grid_size))\n", + "box_array = np.zeros((len(metadata), grid_size, grid_size, 5))\n", + "\n", + "for index, sample in enumerate(metadata):\n", + " boxes, labels = sample[\"boxes\"], sample[\"labels\"]\n", + " for box, label in zip(boxes, labels):\n", + " (x, y, w, h) = box\n", + " left, right = math.floor(x * grid_size), math.ceil((x + w) * grid_size)\n", + " bottom, top = math.floor(y * grid_size), math.ceil((y + h) * grid_size)\n", + " class_array[index, bottom:top, left:right] = label\n", + "\n", + "for index, sample in enumerate(metadata):\n", + " boxes, labels = sample[\"boxes\"], sample[\"labels\"]\n", + " for box, label in zip(boxes, labels):\n", + " (xi, yi), (grid_box) = to_grid(box)\n", + " box_array[index, yi, xi] = [*grid_box, 1.0]\n", + " class_array[index, yi, xi] = label" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def draw_prediction(image, boxes, classes, cutoff=None):\n", + " fig, ax = plt.subplots(dpi=300)\n", + " draw_image(ax, image)\n", + " for yi, row in enumerate(classes):\n", + " for xi, label in enumerate(row):\n", + " color = label_to_color(label) if label else \"none\"\n", + " x, y, w, h = (v / grid_size for v in (xi, yi, 1.0, 1.0))\n", + " r = Rectangle((x, y), w, h, lw=2, ec=\"black\", fc=color, alpha=0.5)\n", + " ax.add_patch(r)\n", + " for yi, row in enumerate(boxes):\n", + " for xi, box in 
enumerate(row):\n", + " box, confidence = box[:4], box[4]\n", + " if not cutoff or confidence >= cutoff:\n", + " box = from_grid((xi, yi), box)\n", + " label = classes[yi, xi]\n", + " color = label_to_color(label)\n", + " name = keras_hub.utils.coco_id_to_name(label)\n", + " draw_box(ax, box, f\"{name} {max(confidence, 0):.2f}\", color)\n", + " plt.show()\n", + "\n", + "draw_prediction(metadata[0][\"path\"], box_array[0], class_array[0], cutoff=1.0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "def load_image(path):\n", + " x = tf.io.read_file(path)\n", + " x = tf.image.decode_jpeg(x, channels=3)\n", + " return preprocessor(x)\n", + "\n", + "images = tf.data.Dataset.from_tensor_slices([x[\"path\"] for x in metadata])\n", + "images = images.map(load_image, num_parallel_calls=8)\n", + "labels = {\"box\": box_array, \"class\": class_array}\n", + "labels = tf.data.Dataset.from_tensor_slices(labels)\n", + "\n", + "dataset = tf.data.Dataset.zip(images, labels).batch(16).prefetch(2)\n", + "val_dataset, train_dataset = dataset.take(500), dataset.skip(500)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Training the YOLO model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "def unpack(box):\n", + " return box[..., 0], box[..., 1], box[..., 2], box[..., 3]\n", + "\n", + "def intersection(box1, box2):\n", + " cx1, cy1, w1, h1 = unpack(box1)\n", + " cx2, cy2, w2, h2 = unpack(box2)\n", + " left = ops.maximum(cx1 - w1 / 2, cx2 - w2 / 2)\n", + " bottom = ops.maximum(cy1 - h1 / 2, cy2 - h2 / 2)\n", + " right = ops.minimum(cx1 + w1 / 2, cx2 + w2 / 2)\n", + " top = ops.minimum(cy1 + h1 / 2, cy2 + h2 / 2)\n", + " return ops.maximum(0.0, right - left) * ops.maximum(0.0, top - bottom)\n", + "\n", + "def intersection_over_union(box1, box2):\n", + " cx1, cy1, w1, h1 = unpack(box1)\n", + " cx2, cy2, w2, h2 = unpack(box2)\n", + " intersection_area = intersection(box1, box2)\n", + " a1 = ops.maximum(w1, 0.0) * ops.maximum(h1, 0.0)\n", + " a2 = ops.maximum(w2, 0.0) * ops.maximum(h2, 0.0)\n", + " union_area = a1 + a2 - intersection_area\n", + " return ops.divide_no_nan(intersection_area, union_area)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def signed_sqrt(x):\n", + " return ops.sign(x) * ops.sqrt(ops.absolute(x) + keras.config.epsilon())\n", + "\n", + "def box_loss(true, pred):\n", + " xy_true, wh_true, conf_true = true[..., :2], true[..., 2:4], true[..., 4:]\n", + " xy_pred, wh_pred, conf_pred = pred[..., :2], pred[..., 2:4], pred[..., 4:]\n", + " no_object = conf_true == 0.0\n", + " xy_error = ops.square(xy_true - xy_pred)\n", + " wh_error = ops.square(signed_sqrt(wh_true) - signed_sqrt(wh_pred))\n", + " iou = intersection_over_union(true, pred)\n", + " conf_target = ops.where(no_object, 0.0, ops.expand_dims(iou, -1))\n", + " conf_error = ops.square(conf_target - conf_pred)\n", + " error = ops.concatenate(\n", + " (\n", + " ops.where(no_object, 0.0, xy_error * 5.0),\n", + " ops.where(no_object, 0.0, wh_error * 5.0),\n", + " ops.where(no_object, conf_error * 0.5, conf_error),\n", + " ),\n", + " axis=-1,\n", + " )\n", + " return ops.sum(error, axis=(1, 2, 3))" + ] + }, + { + "cell_type": "code", + 
"execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=keras.optimizers.Adam(2e-4),\n", + " loss={\"box\": box_loss, \"class\": \"sparse_categorical_crossentropy\"},\n", + ")\n", + "model.fit(\n", + " train_dataset,\n", + " validation_data=val_dataset,\n", + " epochs=4,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x, y = next(iter(val_dataset.rebatch(1)))\n", + "preds = model.predict(x)\n", + "boxes = preds[\"box\"][0]\n", + "classes = np.argmax(preds[\"class\"][0], axis=-1)\n", + "path = metadata[0][\"path\"]\n", + "draw_prediction(path, boxes, classes, cutoff=0.1)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "draw_prediction(path, boxes, classes, cutoff=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a pretrained RetinaNet detector" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "url = \"https://s3.us-east-1.amazonaws.com/book.keras.io/3e/seurat.jpg\"\n", + "path = keras.utils.get_file(origin=url)\n", + "image = np.array([keras.utils.load_img(path)])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "detector = keras_hub.models.ObjectDetector.from_preset(\n", + " \"retinanet_resnet50_fpn_v2_coco\",\n", + " bounding_box_format=\"rel_xywh\",\n", + ")\n", + "predictions = detector.predict(image)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "[(k, v.shape) for k, v in predictions.items()]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[\"boxes\"][0][0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "fig, ax = plt.subplots(dpi=300)\n", + "draw_image(ax, path)\n", + "num_detections = predictions[\"num_detections\"][0]\n", + "for i in range(num_detections):\n", + " box = predictions[\"boxes\"][0][i]\n", + " label = predictions[\"labels\"][0][i]\n", + " label_name = keras_hub.utils.coco_id_to_name(label)\n", + " draw_box(ax, box, label_name, label_to_color(label))\n", + "plt.show()" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter12_object-detection", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter13_timeseries-forecasting.ipynb b/chapter13_timeseries-forecasting.ipynb new file mode 100644 index 0000000000..2c60dd76be --- /dev/null +++ b/chapter13_timeseries-forecasting.ipynb @@ -0,0 +1,714 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": 
"text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Timeseries forecasting" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Different kinds of timeseries tasks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A temperature forecasting example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip\n", + "!unzip jena_climate_2009_2016.csv.zip" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "fname = os.path.join(\"jena_climate_2009_2016.csv\")\n", + "\n", + "with open(fname) as f:\n", + " data = f.read()\n", + "\n", + "lines = data.split(\"\\n\")\n", + "header = lines[0].split(\",\")\n", + "lines = lines[1:]\n", + "print(header)\n", + "print(len(lines))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "temperature = np.zeros((len(lines),))\n", + "raw_data = np.zeros((len(lines), len(header) - 1))\n", + "\n", + "for i, line in enumerate(lines):\n", + " values = [float(x) for x in line.split(\",\")[1:]]\n", + " temperature[i] = values[1]\n", + " raw_data[i, :] = values[:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from matplotlib import pyplot as plt\n", + "\n", + "plt.plot(range(len(temperature)), temperature)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + 
}, + "outputs": [], + "source": [ + "plt.plot(range(1440), temperature[:1440])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_train_samples = int(0.5 * len(raw_data))\n", + "num_val_samples = int(0.25 * len(raw_data))\n", + "num_test_samples = len(raw_data) - num_train_samples - num_val_samples\n", + "print(\"num_train_samples:\", num_train_samples)\n", + "print(\"num_val_samples:\", num_val_samples)\n", + "print(\"num_test_samples:\", num_test_samples)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "mean = raw_data[:num_train_samples].mean(axis=0)\n", + "raw_data -= mean\n", + "std = raw_data[:num_train_samples].std(axis=0)\n", + "raw_data /= std" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import keras\n", + "\n", + "int_sequence = np.arange(10)\n", + "dummy_dataset = keras.utils.timeseries_dataset_from_array(\n", + " data=int_sequence[:-3],\n", + " targets=int_sequence[3:],\n", + " sequence_length=3,\n", + " batch_size=2,\n", + ")\n", + "\n", + "for inputs, targets in dummy_dataset:\n", + " for i in range(inputs.shape[0]):\n", + " print([int(x) for x in inputs[i]], int(targets[i]))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "sampling_rate = 6\n", + "sequence_length = 120\n", + "delay = sampling_rate * (sequence_length + 24 - 1)\n", + "batch_size = 256\n", + "\n", + "train_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=0,\n", + " end_index=num_train_samples,\n", + ")\n", + "\n", + "val_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=num_train_samples,\n", + " end_index=num_train_samples + num_val_samples,\n", + ")\n", + "\n", + "test_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=num_train_samples + num_val_samples,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for samples, targets in train_dataset:\n", + " print(\"samples shape:\", samples.shape)\n", + " print(\"targets shape:\", targets.shape)\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A commonsense, non-machine-learning baseline" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def evaluate_naive_method(dataset):\n", + " total_abs_err = 0.0\n", + " samples_seen = 0\n", + " for samples, targets in 
dataset:\n", + " preds = samples[:, -1, 1] * std[1] + mean[1]\n", + " total_abs_err += np.sum(np.abs(preds - targets))\n", + " samples_seen += samples.shape[0]\n", + " return total_abs_err / samples_seen\n", + "\n", + "print(f\"Validation MAE: {evaluate_naive_method(val_dataset):.2f}\")\n", + "print(f\"Test MAE: {evaluate_naive_method(test_dataset):.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Let's try a basic machine learning model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Flatten()(inputs)\n", + "x = layers.Dense(16, activation=\"relu\")(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_dense.keras\", save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")\n", + "\n", + "model = keras.models.load_model(\"jena_dense.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "loss = history.history[\"mae\"]\n", + "val_loss = history.history[\"val_mae\"]\n", + "epochs = range(1, len(loss) + 1)\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training MAE\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation MAE\")\n", + "plt.title(\"Training and validation MAE\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Let's try a 1D convolutional model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Conv1D(8, 24, activation=\"relu\")(inputs)\n", + "x = layers.MaxPooling1D(2)(x)\n", + "x = layers.Conv1D(8, 12, activation=\"relu\")(x)\n", + "x = layers.MaxPooling1D(2)(x)\n", + "x = layers.Conv1D(8, 6, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling1D()(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_conv.keras\", save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")\n", + "\n", + "model = keras.models.load_model(\"jena_conv.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Recurrent neural networks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.LSTM(16)(inputs)\n", + "outputs = 
layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_lstm.keras\", save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")\n", + "\n", + "model = keras.models.load_model(\"jena_lstm.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding recurrent neural networks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "timesteps = 100\n", + "input_features = 32\n", + "output_features = 64\n", + "inputs = np.random.random((timesteps, input_features))\n", + "state_t = np.zeros((output_features,))\n", + "W = np.random.random((output_features, input_features))\n", + "U = np.random.random((output_features, output_features))\n", + "b = np.random.random((output_features,))\n", + "successive_outputs = []\n", + "for input_t in inputs:\n", + " output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)\n", + " successive_outputs.append(output_t)\n", + " state_t = output_t\n", + "final_output_sequence = np.stack(successive_outputs, axis=0)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A recurrent layer in Keras" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "inputs = keras.Input(shape=(None, num_features))\n", + "outputs = layers.SimpleRNN(16)(inputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "steps = 120\n", + "inputs = keras.Input(shape=(steps, num_features))\n", + "outputs = layers.SimpleRNN(16, return_sequences=False)(inputs)\n", + "print(outputs.shape)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "steps = 120\n", + "inputs = keras.Input(shape=(steps, num_features))\n", + "outputs = layers.SimpleRNN(16, return_sequences=True)(inputs)\n", + "print(outputs.shape)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(steps, num_features))\n", + "x = layers.SimpleRNN(16, return_sequences=True)(inputs)\n", + "x = layers.SimpleRNN(16, return_sequences=True)(x)\n", + "outputs = layers.SimpleRNN(16)(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting the most out of recurrent neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using recurrent dropout to fight overfitting" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.LSTM(32, recurrent_dropout=0.25)(inputs)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = 
keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " \"jena_lstm_dropout.keras\", save_best_only=True\n", + " )\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=50,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Stacking recurrent layers" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.GRU(32, recurrent_dropout=0.5, return_sequences=True)(inputs)\n", + "x = layers.GRU(32, recurrent_dropout=0.5)(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " \"jena_stacked_gru_dropout.keras\", save_best_only=True\n", + " )\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=50,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")\n", + "model = keras.models.load_model(\"jena_stacked_gru_dropout.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using bidirectional RNNs" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Bidirectional(layers.LSTM(16))(inputs)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Going even further" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter13_timeseries-forecasting", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter14_text-classification.ipynb b/chapter14_text-classification.ipynb new file mode 100644 index 0000000000..15e34f0f0c --- /dev/null +++ b/chapter14_text-classification.ipynb @@ -0,0 +1,1439 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). 
For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Text classification" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A brief history of natural language processing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Preparing text data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import regex as re\n", + "\n", + "def split_chars(text):\n", + " return re.findall(r\".\", text)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "chars = split_chars(\"The quick brown fox jumped over the lazy dog.\")\n", + "chars[:12]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def split_words(text):\n", + " return re.findall(r\"[\\w]+|[.,!?;]\", text)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "split_words(\"The quick brown fox jumped over the dog.\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocabulary = {\n", + " \"[UNK]\": 0,\n", + " \"the\": 1,\n", + " \"quick\": 2,\n", + " \"brown\": 3,\n", + " \"fox\": 4,\n", + " \"jumped\": 5,\n", + " \"over\": 6,\n", + " \"dog\": 7,\n", + " \".\": 8,\n", + "}\n", + "words = split_words(\"The quick brown fox jumped over the lazy dog.\")\n", + "indices = [vocabulary.get(word, 0) for word in words]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Character and word tokenization" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class CharTokenizer:\n", + 
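" # Map text to integer ids: lowercase, split into characters, look up each id.\n", + 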
" def __init__(self, vocabulary):\n", + " self.vocabulary = vocabulary\n", + " self.unk_id = vocabulary[\"[UNK]\"]\n", + "\n", + " def standardize(self, inputs):\n", + " return inputs.lower()\n", + "\n", + " def split(self, inputs):\n", + " return re.findall(r\".\", inputs)\n", + "\n", + " def index(self, tokens):\n", + " return [self.vocabulary.get(t, self.unk_id) for t in tokens]\n", + "\n", + " def __call__(self, inputs):\n", + " inputs = self.standardize(inputs)\n", + " tokens = self.split(inputs)\n", + " indices = self.index(tokens)\n", + " return indices" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import collections\n", + "\n", + "def compute_char_vocabulary(inputs, max_size):\n", + " char_counts = collections.Counter()\n", + " for x in inputs:\n", + " x = x.lower()\n", + " tokens = re.findall(r\".\", x)\n", + " char_counts.update(tokens)\n", + " vocabulary = [\"[UNK]\"]\n", + " most_common = char_counts.most_common(max_size - len(vocabulary))\n", + " for token, count in most_common:\n", + " vocabulary.append(token)\n", + " return dict((token, i) for i, token in enumerate(vocabulary))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class WordTokenizer:\n", + " def __init__(self, vocabulary):\n", + " self.vocabulary = vocabulary\n", + " self.unk_id = vocabulary[\"[UNK]\"]\n", + "\n", + " def standardize(self, inputs):\n", + " return inputs.lower()\n", + "\n", + " def split(self, inputs):\n", + " return re.findall(r\"[\\w]+|[.,!?;]\", inputs)\n", + "\n", + " def index(self, tokens):\n", + " return [self.vocabulary.get(t, self.unk_id) for t in tokens]\n", + "\n", + " def __call__(self, inputs):\n", + " inputs = self.standardize(inputs)\n", + " tokens = self.split(inputs)\n", + " indices = self.index(tokens)\n", + " return indices" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compute_word_vocabulary(inputs, max_size):\n", + " word_counts = collections.Counter()\n", + " for x in inputs:\n", + " x = x.lower()\n", + " tokens = re.findall(r\"[\\w]+|[.,!?;]\", x)\n", + " word_counts.update(tokens)\n", + " vocabulary = [\"[UNK]\"]\n", + " most_common = word_counts.most_common(max_size - len(vocabulary))\n", + " for token, count in most_common:\n", + " vocabulary.append(token)\n", + " return dict((token, i) for i, token in enumerate(vocabulary))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "\n", + "filename = keras.utils.get_file(\n", + " origin=\"https://www.gutenberg.org/files/2701/old/moby10b.txt\",\n", + ")\n", + "moby_dick = list(open(filename, \"r\"))\n", + "\n", + "vocabulary = compute_char_vocabulary(moby_dick, max_size=100)\n", + "char_tokenizer = CharTokenizer(vocabulary)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary length:\", len(vocabulary))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary start:\", list(vocabulary.keys())[:10])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
"print(\"Vocabulary end:\", list(vocabulary.keys())[-10:])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Line length:\", len(char_tokenizer(\n", + " \"Call me Ishmael. Some years ago--never mind how long precisely.\"\n", + ")))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocabulary = compute_word_vocabulary(moby_dick, max_size=2_000)\n", + "word_tokenizer = WordTokenizer(vocabulary)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary length:\", len(vocabulary))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary start:\", list(vocabulary.keys())[:5])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary end:\", list(vocabulary.keys())[-5:])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Line length:\", len(word_tokenizer(\n", + " \"Call me Ishmael. Some years ago--never mind how long precisely.\"\n", + ")))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Subword tokenization" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data = [\n", + " \"the quick brown fox\",\n", + " \"the slow brown fox\",\n", + " \"the quick brown foxhound\",\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def count_and_split_words(data):\n", + " counts = collections.Counter()\n", + " for line in data:\n", + " line = line.lower()\n", + " for word in re.findall(r\"[\\w]+|[.,!?;]\", line):\n", + " chars = re.findall(r\".\", word)\n", + " split_word = \" \".join(chars)\n", + " counts[split_word] += 1\n", + " return dict(counts)\n", + "\n", + "counts = count_and_split_words(data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "counts" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def count_pairs(counts):\n", + " pairs = collections.Counter()\n", + " for word, freq in counts.items():\n", + " symbols = word.split()\n", + " for pair in zip(symbols[:-1], symbols[1:]):\n", + " pairs[pair] += freq\n", + " return pairs\n", + "\n", + "def merge_pair(counts, first, second):\n", + " split = re.compile(f\"(?\")])\n", + "\n", + "def read_file(filename):\n", + " ds = tf.data.TextLineDataset(filename)\n", + " ds = ds.map(lambda x: tf.strings.regex_replace(x, r\"\\\\n\", \"\\n\"))\n", + " ds = ds.map(tokenizer, num_parallel_calls=8)\n", + " return ds.map(lambda x: tf.concat([x, suffix], -1))\n", + "\n", + "files = [str(file) for file in extract_dir.glob(\"*.txt\")]\n", + "ds = tf.data.Dataset.from_tensor_slices(files)\n", + "ds = ds.interleave(read_file, cycle_length=32, num_parallel_calls=32)\n", + "ds = ds.rebatch(sequence_length + 1, drop_remainder=True)\n", + "ds = ds.map(lambda x: (x[:-1], x[1:]))\n", + 
"ds = ds.batch(batch_size).prefetch(8)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_batches = 58746\n", + "num_val_batches = 500\n", + "num_train_batches = num_batches - num_val_batches\n", + "val_ds = ds.take(num_val_batches).repeat()\n", + "train_ds = ds.skip(num_val_batches).repeat()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building the model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import layers\n", + "\n", + "class TransformerDecoder(keras.Layer):\n", + " def __init__(self, hidden_dim, intermediate_dim, num_heads):\n", + " super().__init__()\n", + " key_dim = hidden_dim // num_heads\n", + " self.self_attention = layers.MultiHeadAttention(\n", + " num_heads, key_dim, dropout=0.1\n", + " )\n", + " self.self_attention_layernorm = layers.LayerNormalization()\n", + " self.feed_forward_1 = layers.Dense(intermediate_dim, activation=\"relu\")\n", + " self.feed_forward_2 = layers.Dense(hidden_dim)\n", + " self.feed_forward_layernorm = layers.LayerNormalization()\n", + " self.dropout = layers.Dropout(0.1)\n", + "\n", + " def call(self, inputs):\n", + " residual = x = inputs\n", + " x = self.self_attention(query=x, key=x, value=x, use_causal_mask=True)\n", + " x = self.dropout(x)\n", + " x = x + residual\n", + " x = self.self_attention_layernorm(x)\n", + " residual = x\n", + " x = self.feed_forward_1(x)\n", + " x = self.feed_forward_2(x)\n", + " x = self.dropout(x)\n", + " x = x + residual\n", + " x = self.feed_forward_layernorm(x)\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "class PositionalEmbedding(keras.Layer):\n", + " def __init__(self, sequence_length, input_dim, output_dim):\n", + " super().__init__()\n", + " self.token_embeddings = layers.Embedding(input_dim, output_dim)\n", + " self.position_embeddings = layers.Embedding(sequence_length, output_dim)\n", + "\n", + " def call(self, inputs, reverse=False):\n", + " if reverse:\n", + " token_embeddings = self.token_embeddings.embeddings\n", + " return ops.matmul(inputs, ops.transpose(token_embeddings))\n", + " positions = ops.cumsum(ops.ones_like(inputs), axis=-1) - 1\n", + " embedded_tokens = self.token_embeddings(inputs)\n", + " embedded_positions = self.position_embeddings(positions)\n", + " return embedded_tokens + embedded_positions" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.config.set_dtype_policy(\"mixed_float16\")\n", + "\n", + "vocab_size = tokenizer.vocabulary_size()\n", + "hidden_dim = 512\n", + "intermediate_dim = 2056\n", + "num_heads = 8\n", + "num_layers = 8\n", + "\n", + "inputs = keras.Input(shape=(None,), dtype=\"int32\", name=\"inputs\")\n", + "embedding = PositionalEmbedding(sequence_length, vocab_size, hidden_dim)\n", + "x = embedding(inputs)\n", + "x = layers.LayerNormalization()(x)\n", + "for i in range(num_layers):\n", + " x = TransformerDecoder(hidden_dim, intermediate_dim, num_heads)(x)\n", + "outputs = embedding(x, reverse=True)\n", + "mini_gpt = keras.Model(inputs, outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Pretraining the model" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class WarmupSchedule(keras.optimizers.schedules.LearningRateSchedule):\n", + " def __init__(self):\n", + " self.rate = 2e-4\n", + " self.warmup_steps = 1_000.0\n", + "\n", + " def __call__(self, step):\n", + " step = ops.cast(step, dtype=\"float32\")\n", + " scale = ops.minimum(step / self.warmup_steps, 1.0)\n", + " return self.rate * scale" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "schedule = WarmupSchedule()\n", + "x = range(0, 5_000, 100)\n", + "y = [ops.convert_to_numpy(schedule(step)) for step in x]\n", + "plt.plot(x, y)\n", + "plt.xlabel(\"Train Step\")\n", + "plt.ylabel(\"Learning Rate\")\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# \u26a0\ufe0fNOTE\u26a0\ufe0f: If you can run the following with a Colab Pro GPU, we suggest you\n", + "# do so. This fit() call will take many hours on free tier GPUs. You can also\n", + "# reduce steps_per_epoch to try the code with a less trained model." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_epochs = 8\n", + "steps_per_epoch = num_train_batches // num_epochs\n", + "validation_steps = num_val_batches\n", + "\n", + "mini_gpt.compile(\n", + " optimizer=keras.optimizers.Adam(schedule),\n", + " loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "mini_gpt.fit(\n", + " train_ds,\n", + " validation_data=val_ds,\n", + " epochs=num_epochs,\n", + " steps_per_epoch=steps_per_epoch,\n", + " validation_steps=validation_steps,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Generative decoding" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def generate(prompt, max_length=64):\n", + " tokens = list(ops.convert_to_numpy(tokenizer(prompt)))\n", + " prompt_length = len(tokens)\n", + " for _ in range(max_length - prompt_length):\n", + " prediction = mini_gpt(ops.convert_to_numpy([tokens]))\n", + " prediction = ops.convert_to_numpy(prediction[0, -1])\n", + " tokens.append(np.argmax(prediction).item())\n", + " return tokenizer.detokenize(tokens)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"A piece of advice\"\n", + "generate(prompt)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compiled_generate(prompt, max_length=64):\n", + " tokens = list(ops.convert_to_numpy(tokenizer(prompt)))\n", + " prompt_length = len(tokens)\n", + " tokens = tokens + [0] * (max_length - prompt_length)\n", + " for i in range(prompt_length, max_length):\n", + " prediction = mini_gpt.predict(np.array([tokens]), verbose=0)\n", + " prediction = prediction[0, i - 1]\n", + " tokens[i] = np.argmax(prediction).item()\n", + " return tokenizer.detokenize(tokens)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
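"# Average wall-clock seconds per compiled_generate call over several tries.\n", + 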
"import timeit\n", + "tries = 10\n", + "timeit.timeit(lambda: compiled_generate(prompt), number=tries) / tries" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Sampling strategies" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compiled_generate(prompt, sample_fn, max_length=64):\n", + " tokens = list(ops.convert_to_numpy(tokenizer(prompt)))\n", + " prompt_length = len(tokens)\n", + " tokens = tokens + [0] * (max_length - prompt_length)\n", + " for i in range(prompt_length, max_length):\n", + " prediction = mini_gpt.predict(np.array([tokens]), verbose=0)\n", + " prediction = prediction[0, i - 1]\n", + " next_token = ops.convert_to_numpy(sample_fn(prediction))\n", + " tokens[i] = np.array(next_token).item()\n", + " return tokenizer.detokenize(tokens)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def greedy_search(preds):\n", + " return ops.argmax(preds)\n", + "\n", + "compiled_generate(prompt, greedy_search)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def random_sample(preds, temperature=1.0):\n", + " preds = preds / temperature\n", + " return keras.random.categorical(preds[None, :], num_samples=1)[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, random_sample)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from functools import partial\n", + "compiled_generate(prompt, partial(random_sample, temperature=2.0))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(random_sample, temperature=0.8))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(random_sample, temperature=0.2))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def top_k(preds, k=5, temperature=1.0):\n", + " preds = preds / temperature\n", + " top_preds, top_indices = ops.top_k(preds, k=k, sorted=False)\n", + " choice = keras.random.categorical(top_preds[None, :], num_samples=1)[0]\n", + " return ops.take_along_axis(top_indices, choice, axis=-1)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(top_k, k=5))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(top_k, k=20))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(top_k, k=5, temperature=0.5))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a pretrained LLM" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Text generation 
with the Gemma model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "import kagglehub\n",
+ "\n",
+ "kagglehub.login()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "gemma_lm = keras_hub.models.CausalLM.from_preset(\n",
+ " \"gemma3_1b\",\n",
+ " dtype=\"float32\",\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "gemma_lm.summary(line_length=80)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "gemma_lm.compile(sampler=\"greedy\")\n",
+ "gemma_lm.generate(\"A piece of advice\", max_length=40)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "gemma_lm.generate(\"How can I make brownies?\", max_length=40)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "gemma_lm.generate(\n",
+ " \"The following brownie recipe is easy to make in just a few \"\n",
+ " \"steps.\\n\\nYou can start by\",\n",
+ " max_length=40,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "gemma_lm.generate(\n",
+ " \"Tell me about the 542nd president of the United States.\",\n",
+ " max_length=40,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "#### Instruction fine-tuning"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "import json\n",
+ "\n",
+ "# Delimit each example with simple [instruction]/[end]/[response] markers.\n",
+ "PROMPT_TEMPLATE = \"\"\"[instruction]\\n{}[end]\\n[response]\\n\"\"\"\n",
+ "RESPONSE_TEMPLATE = \"\"\"{}[end]\"\"\"\n",
+ "\n",
+ "dataset_path = keras.utils.get_file(\n",
+ " origin=(\n",
+ " \"https://hf.co/datasets/databricks/databricks-dolly-15k/\"\n",
+ " \"resolve/main/databricks-dolly-15k.jsonl\"\n",
+ " ),\n",
+ ")\n",
+ "data = {\"prompts\": [], \"responses\": []}\n",
+ "with open(dataset_path) as file:\n",
+ " for line in file:\n",
+ " features = json.loads(line)\n",
+ " if features[\"context\"]:\n",
+ " continue\n",
+ " data[\"prompts\"].append(PROMPT_TEMPLATE.format(features[\"instruction\"]))\n",
+ " data[\"responses\"].append(RESPONSE_TEMPLATE.format(features[\"response\"]))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "data[\"prompts\"][0]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "data[\"responses\"][0]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "ds = tf.data.Dataset.from_tensor_slices(data).shuffle(2000).batch(2)\n",
+ "val_ds = ds.take(100)\n",
+ "train_ds = ds.skip(100)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "preprocessor = gemma_lm.preprocessor\n",
+ "preprocessor.sequence_length = 512\n",
+ "batch = next(iter(train_ds))\n",
+ "x, y, sample_weight = preprocessor(batch)\n",
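+ "# The preprocessor returns padded token IDs and a padding mask (x),\n",
+ "# next-token targets (y), and per-token loss weights (sample_weight).\n",
+ 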
"x[\"token_ids\"].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x[\"padding_mask\"].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "sample_weight.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x[\"token_ids\"][0, :5], y[0, :5]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Low-Rank Adaptation (LoRA)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.backbone.enable_lora(rank=8)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.compile(\n", + " loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", + " optimizer=keras.optimizers.Adam(5e-5),\n", + " weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],\n", + ")\n", + "gemma_lm.fit(train_ds, validation_data=val_ds, epochs=1)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\n", + " \"[instruction]\\nHow can I make brownies?[end]\\n\"\n", + " \"[response]\\n\",\n", + " max_length=512,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\n", + " \"[instruction]\\nWhat is a proper noun?[end]\\n\"\n", + " \"[response]\\n\",\n", + " max_length=512,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\n", + " \"[instruction]\\nWho is the 542nd president of the United States?[end]\\n\"\n", + " \"[response]\\n\",\n", + " max_length=512,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Going further with LLMs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Reinforcement Learning with Human Feedback (RLHF)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using a chatbot trained with RLHF" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# \u26a0\ufe0fNOTE\u26a0\ufe0f: If you are running on the free tier Colab GPUs, you will need to\n", + "# restart your runtime and run the notebook from here to free up memory for\n", + "# this 4 billion parameter model.\n", + "import os\n", + "\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"\n", + "# Free up more GPU memory on the Jax and TensorFlow backends.\n", + "os.environ[\"XLA_PYTHON_CLIENT_MEM_FRACTION\"] = \"1.00\"\n", + "\n", + "import keras\n", + "import keras_hub\n", + "import kagglehub\n", + "import numpy as np\n", + "\n", + 
"kagglehub.login()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm = keras_hub.models.CausalLM.from_preset(\n", + " \"gemma3_instruct_4b\",\n", + " dtype=\"bfloat16\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "PROMPT_TEMPLATE = \"\"\"user\n", + "{}\n", + "model\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"Why can't you assign values in Jax tensors? Be brief!\"\n", + "gemma_lm.generate(PROMPT_TEMPLATE.format(prompt), max_length=512)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"Who is the 542nd president of the United States?\"\n", + "gemma_lm.generate(PROMPT_TEMPLATE.format(prompt), max_length=512)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Multimodal LLMs" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "image_url = (\n", + " \"https://github.com/mattdangerw/keras-nlp-scripts/\"\n", + " \"blob/main/learned-python.png?raw=true\"\n", + ")\n", + "image_path = keras.utils.get_file(origin=image_url)\n", + "\n", + "image = np.array(keras.utils.load_img(image_path))\n", + "plt.axis(\"off\")\n", + "plt.imshow(image)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.preprocessor.max_images_per_prompt = 1\n", + "gemma_lm.preprocessor.sequence_length = 512\n", + "prompt = \"What is going on in this image? 
Be concise!\"\n", + "gemma_lm.generate({\n", + " \"prompts\": PROMPT_TEMPLATE.format(prompt),\n", + " \"images\": [image],\n", + "})" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"What is the snake wearing?\"\n", + "gemma_lm.generate({\n", + " \"prompts\": PROMPT_TEMPLATE.format(prompt),\n", + " \"images\": [image],\n", + "})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Foundation models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Retrieval Augmented Generation (RAG)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### \"Reasoning\" models" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"\"\"Judy wrote a 2-page letter to 3 friends twice a week for 3 months.\n", + "How many letters did she write?\n", + "Be brief, and add \"ANSWER:\" before your final answer.\"\"\"\n", + "\n", + "gemma_lm.compile(sampler=\"random\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(PROMPT_TEMPLATE.format(prompt))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(PROMPT_TEMPLATE.format(prompt))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Where are LLMs heading next?" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter16_text-generation", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter17_image-generation.ipynb b/chapter17_image-generation.ipynb new file mode 100644 index 0000000000..b34e4d5839 --- /dev/null +++ b/chapter17_image-generation.ipynb @@ -0,0 +1,902 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Image generation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Deep learning for image generation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Sampling from latent spaces of images" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Variational autoencoders" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Implementing a VAE with Keras" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "latent_dim = 2\n", + "\n", + "image_inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\", strides=2, padding=\"same\")(\n", + " image_inputs\n", + ")\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", strides=2, padding=\"same\")(x)\n", + "x = layers.Flatten()(x)\n", + "x = layers.Dense(16, activation=\"relu\")(x)\n", + "z_mean = layers.Dense(latent_dim, name=\"z_mean\")(x)\n", + "z_log_var = layers.Dense(latent_dim, name=\"z_log_var\")(x)\n", + "encoder = keras.Model(image_inputs, [z_mean, z_log_var], name=\"encoder\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "encoder.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "class Sampler(keras.Layer):\n", + " def __init__(self, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.seed_generator = keras.random.SeedGenerator()\n", + " self.built = True\n", + "\n", + " def call(self, z_mean, z_log_var):\n", + " batch_size = ops.shape(z_mean)[0]\n", + " z_size = ops.shape(z_mean)[1]\n", + " epsilon = keras.random.normal(\n", + " (batch_size, z_size), seed=self.seed_generator\n", + " )\n", + " return z_mean + ops.exp(0.5 * z_log_var) * epsilon" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "latent_inputs = keras.Input(shape=(latent_dim,))\n", + "x = 
layers.Dense(7 * 7 * 64, activation=\"relu\")(latent_inputs)\n", + "x = layers.Reshape((7, 7, 64))(x)\n", + "x = layers.Conv2DTranspose(64, 3, activation=\"relu\", strides=2, padding=\"same\")(\n", + " x\n", + ")\n", + "x = layers.Conv2DTranspose(32, 3, activation=\"relu\", strides=2, padding=\"same\")(\n", + " x\n", + ")\n", + "decoder_outputs = layers.Conv2D(1, 3, activation=\"sigmoid\", padding=\"same\")(x)\n", + "decoder = keras.Model(latent_inputs, decoder_outputs, name=\"decoder\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "decoder.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class VAE(keras.Model):\n", + " def __init__(self, encoder, decoder, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.encoder = encoder\n", + " self.decoder = decoder\n", + " self.sampler = Sampler()\n", + " self.reconstruction_loss_tracker = keras.metrics.Mean(\n", + " name=\"reconstruction_loss\"\n", + " )\n", + " self.kl_loss_tracker = keras.metrics.Mean(name=\"kl_loss\")\n", + "\n", + " def call(self, inputs):\n", + " return self.encoder(inputs)\n", + "\n", + " def compute_loss(self, x, y, y_pred, sample_weight=None, training=True):\n", + " original = x\n", + " z_mean, z_log_var = y_pred\n", + " reconstruction = self.decoder(self.sampler(z_mean, z_log_var))\n", + "\n", + " reconstruction_loss = ops.mean(\n", + " ops.sum(\n", + " keras.losses.binary_crossentropy(x, reconstruction), axis=(1, 2)\n", + " )\n", + " )\n", + " kl_loss = -0.5 * (\n", + " 1 + z_log_var - ops.square(z_mean) - ops.exp(z_log_var)\n", + " )\n", + " total_loss = reconstruction_loss + ops.mean(kl_loss)\n", + "\n", + " self.reconstruction_loss_tracker.update_state(reconstruction_loss)\n", + " self.kl_loss_tracker.update_state(kl_loss)\n", + " return total_loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()\n", + "mnist_digits = np.concatenate([x_train, x_test], axis=0)\n", + "mnist_digits = np.expand_dims(mnist_digits, -1).astype(\"float32\") / 255\n", + "\n", + "vae = VAE(encoder, decoder)\n", + "vae.compile(optimizer=keras.optimizers.Adam())\n", + "vae.fit(mnist_digits, epochs=30, batch_size=128)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "n = 30\n", + "digit_size = 28\n", + "figure = np.zeros((digit_size * n, digit_size * n))\n", + "\n", + "grid_x = np.linspace(-1, 1, n)\n", + "grid_y = np.linspace(-1, 1, n)[::-1]\n", + "\n", + "for i, yi in enumerate(grid_y):\n", + " for j, xi in enumerate(grid_x):\n", + " z_sample = np.array([[xi, yi]])\n", + " x_decoded = vae.decoder.predict(z_sample)\n", + " digit = x_decoded[0].reshape(digit_size, digit_size)\n", + " figure[\n", + " i * digit_size : (i + 1) * digit_size,\n", + " j * digit_size : (j + 1) * digit_size,\n", + " ] = digit\n", + "\n", + "plt.figure(figsize=(15, 15))\n", + "start_range = digit_size // 2\n", + "end_range = n * digit_size + start_range\n", + "pixel_range = np.arange(start_range, end_range, digit_size)\n", + "sample_range_x = np.round(grid_x, 1)\n", + "sample_range_y = np.round(grid_y, 1)\n", + "plt.xticks(pixel_range, 
sample_range_x)\n", + "plt.yticks(pixel_range, sample_range_y)\n", + "plt.xlabel(\"z[0]\")\n", + "plt.ylabel(\"z[1]\")\n", + "plt.axis(\"off\")\n", + "plt.imshow(figure, cmap=\"Greys_r\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Diffusion models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Oxford Flowers dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "fpath = keras.utils.get_file(\n", + " origin=\"https://www.robots.ox.ac.uk/~vgg/data/flowers/102/102flowers.tgz\",\n", + " extract=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch_size = 32\n", + "image_size = 128\n", + "images_dir = os.path.join(fpath, \"jpg\")\n", + "dataset = keras.utils.image_dataset_from_directory(\n", + " images_dir,\n", + " labels=None,\n", + " image_size=(image_size, image_size),\n", + " crop_to_aspect_ratio=True,\n", + ")\n", + "dataset = dataset.rebatch(\n", + " batch_size,\n", + " drop_remainder=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from matplotlib import pyplot as plt\n", + "\n", + "for batch in dataset:\n", + " img = batch.numpy()[0]\n", + " break\n", + "plt.imshow(img.astype(\"uint8\"))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A U-Net denoising autoencoder" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def residual_block(x, width):\n", + " input_width = x.shape[3]\n", + " if input_width == width:\n", + " residual = x\n", + " else:\n", + " residual = layers.Conv2D(width, 1)(x)\n", + " x = layers.BatchNormalization(center=False, scale=False)(x)\n", + " x = layers.Conv2D(width, 3, padding=\"same\", activation=\"swish\")(x)\n", + " x = layers.Conv2D(width, 3, padding=\"same\")(x)\n", + " x = x + residual\n", + " return x\n", + "\n", + "def get_model(image_size, widths, block_depth):\n", + " noisy_images = keras.Input(shape=(image_size, image_size, 3))\n", + " noise_rates = keras.Input(shape=(1, 1, 1))\n", + "\n", + " x = layers.Conv2D(widths[0], 1)(noisy_images)\n", + " n = layers.UpSampling2D(image_size, interpolation=\"nearest\")(noise_rates)\n", + " x = layers.Concatenate()([x, n])\n", + "\n", + " skips = []\n", + " for width in widths[:-1]:\n", + " for _ in range(block_depth):\n", + " x = residual_block(x, width)\n", + " skips.append(x)\n", + " x = layers.AveragePooling2D(pool_size=2)(x)\n", + "\n", + " for _ in range(block_depth):\n", + " x = residual_block(x, widths[-1])\n", + "\n", + " for width in reversed(widths[:-1]):\n", + " x = layers.UpSampling2D(size=2, interpolation=\"bilinear\")(x)\n", + " for _ in range(block_depth):\n", + " x = layers.Concatenate()([x, skips.pop()])\n", + " x = residual_block(x, width)\n", + "\n", + " pred_noise_masks = layers.Conv2D(3, 1, kernel_initializer=\"zeros\")(x)\n", + "\n", + " return keras.Model([noisy_images, noise_rates], pred_noise_masks)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The concepts of diffusion time and diffusion schedule" + ] + }, + { + "cell_type": "code", + 
"execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def diffusion_schedule(\n", + " diffusion_times,\n", + " min_signal_rate=0.02,\n", + " max_signal_rate=0.95,\n", + "):\n", + " start_angle = ops.cast(ops.arccos(max_signal_rate), \"float32\")\n", + " end_angle = ops.cast(ops.arccos(min_signal_rate), \"float32\")\n", + " diffusion_angles = start_angle + diffusion_times * (end_angle - start_angle)\n", + " signal_rates = ops.cos(diffusion_angles)\n", + " noise_rates = ops.sin(diffusion_angles)\n", + " return noise_rates, signal_rates" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "diffusion_times = ops.arange(0.0, 1.0, 0.01)\n", + "noise_rates, signal_rates = diffusion_schedule(diffusion_times)\n", + "\n", + "diffusion_times = ops.convert_to_numpy(diffusion_times)\n", + "noise_rates = ops.convert_to_numpy(noise_rates)\n", + "signal_rates = ops.convert_to_numpy(signal_rates)\n", + "\n", + "plt.plot(diffusion_times, noise_rates, label=\"Noise rate\")\n", + "plt.plot(diffusion_times, signal_rates, label=\"Signal rate\")\n", + "\n", + "plt.xlabel(\"Diffusion time\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The training process" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class DiffusionModel(keras.Model):\n", + " def __init__(self, image_size, widths, block_depth, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.image_size = image_size\n", + " self.denoising_model = get_model(image_size, widths, block_depth)\n", + " self.seed_generator = keras.random.SeedGenerator()\n", + " self.loss = keras.losses.MeanAbsoluteError()\n", + " self.normalizer = keras.layers.Normalization()\n", + "\n", + " def denoise(self, noisy_images, noise_rates, signal_rates):\n", + " pred_noise_masks = self.denoising_model([noisy_images, noise_rates])\n", + " pred_images = (\n", + " noisy_images - noise_rates * pred_noise_masks\n", + " ) / signal_rates\n", + " return pred_images, pred_noise_masks\n", + "\n", + " def call(self, images):\n", + " images = self.normalizer(images)\n", + " noise_masks = keras.random.normal(\n", + " (batch_size, self.image_size, self.image_size, 3),\n", + " seed=self.seed_generator,\n", + " )\n", + " diffusion_times = keras.random.uniform(\n", + " (batch_size, 1, 1, 1),\n", + " minval=0.0,\n", + " maxval=1.0,\n", + " seed=self.seed_generator,\n", + " )\n", + " noise_rates, signal_rates = diffusion_schedule(diffusion_times)\n", + " noisy_images = signal_rates * images + noise_rates * noise_masks\n", + " pred_images, pred_noise_masks = self.denoise(\n", + " noisy_images, noise_rates, signal_rates\n", + " )\n", + " return pred_images, pred_noise_masks, noise_masks\n", + "\n", + " def compute_loss(self, x, y, y_pred, sample_weight=None, training=True):\n", + " _, pred_noise_masks, noise_masks = y_pred\n", + " return self.loss(noise_masks, pred_noise_masks)\n", + "\n", + " def generate(self, num_images, diffusion_steps):\n", + " noisy_images = keras.random.normal(\n", + " (num_images, self.image_size, self.image_size, 3),\n", + " seed=self.seed_generator,\n", + " )\n", + " step_size = 1.0 / diffusion_steps\n", + " for step in range(diffusion_steps):\n", + " diffusion_times = ops.ones((num_images, 1, 1, 1)) - step * step_size\n", + " noise_rates, signal_rates = 
diffusion_schedule(diffusion_times)\n", + " pred_images, pred_noises = self.denoise(\n", + " noisy_images, noise_rates, signal_rates\n", + " )\n", + " next_diffusion_times = diffusion_times - step_size\n", + " next_noise_rates, next_signal_rates = diffusion_schedule(\n", + " next_diffusion_times\n", + " )\n", + " noisy_images = (\n", + " next_signal_rates * pred_images + next_noise_rates * pred_noises\n", + " )\n", + " images = (\n", + " self.normalizer.mean + pred_images * self.normalizer.variance**0.5\n", + " )\n", + " return ops.clip(images, 0.0, 255.0)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The generation process" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Visualizing results with a custom callback" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class VisualizationCallback(keras.callbacks.Callback):\n", + " def __init__(self, diffusion_steps=20, num_rows=3, num_cols=6):\n", + " self.diffusion_steps = diffusion_steps\n", + " self.num_rows = num_rows\n", + " self.num_cols = num_cols\n", + "\n", + " def on_epoch_end(self, epoch=None, logs=None):\n", + " generated_images = self.model.generate(\n", + " num_images=self.num_rows * self.num_cols,\n", + " diffusion_steps=self.diffusion_steps,\n", + " )\n", + "\n", + " plt.figure(figsize=(self.num_cols * 2.0, self.num_rows * 2.0))\n", + " for row in range(self.num_rows):\n", + " for col in range(self.num_cols):\n", + " i = row * self.num_cols + col\n", + " plt.subplot(self.num_rows, self.num_cols, i + 1)\n", + " img = ops.convert_to_numpy(generated_images[i]).astype(\"uint8\")\n", + " plt.imshow(img)\n", + " plt.axis(\"off\")\n", + " plt.tight_layout()\n", + " plt.show()\n", + " plt.close()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### It's go time!" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = DiffusionModel(image_size, widths=[32, 64, 96, 128], block_depth=2)\n", + "model.normalizer.adapt(dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=keras.optimizers.AdamW(\n", + " learning_rate=keras.optimizers.schedules.InverseTimeDecay(\n", + " initial_learning_rate=1e-3,\n", + " decay_steps=1000,\n", + " decay_rate=0.1,\n", + " ),\n", + " use_ema=True,\n", + " ema_overwrite_frequency=100,\n", + " ),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(\n", + " dataset,\n", + " epochs=100,\n", + " callbacks=[\n", + " VisualizationCallback(),\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"diffusion_model.weights.h5\",\n", + " save_weights_only=True,\n", + " save_best_only=True,\n", + " ),\n", + " ],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Text-to-image models" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "if keras.config.backend() == \"torch\":\n", + " # The rest of this chapter will not do any training. 
The following keeps\n",
+ " # PyTorch from using too much memory by disabling gradients. TensorFlow and\n",
+ " # JAX use a much smaller memory footprint and do not need this hack.\n",
+ " import torch\n",
+ "\n",
+ " torch.set_grad_enabled(False)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "import keras_hub\n",
+ "\n",
+ "height, width = 512, 512\n",
+ "task = keras_hub.models.TextToImage.from_preset(\n",
+ " \"stable_diffusion_3_medium\",\n",
+ " image_shape=(height, width, 3),\n",
+ " dtype=\"float16\",\n",
+ ")\n",
+ "prompt = \"A NASA astronaut riding an origami elephant in New York City\"\n",
+ "task.generate(prompt)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "task.generate(\n",
+ " {\n",
+ " \"prompts\": prompt,\n",
+ " \"negative_prompts\": \"blue color\",\n",
+ " }\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "from PIL import Image\n",
+ "\n",
+ "def display(images):\n",
+ " return Image.fromarray(np.concatenate(images, axis=1))\n",
+ "\n",
+ "display([task.generate(prompt, num_steps=x) for x in [5, 10, 15, 20, 25]])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "#### Exploring the latent space of a text-to-image model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "from keras import random\n",
+ "\n",
+ "def get_text_embeddings(prompt):\n",
+ " token_ids = task.preprocessor.generate_preprocess([prompt])\n",
+ " negative_token_ids = task.preprocessor.generate_preprocess([\"\"])\n",
+ " return task.backbone.encode_text_step(token_ids, negative_token_ids)\n",
+ "\n",
+ "def denoise_with_text_embeddings(embeddings, num_steps=28, guidance_scale=7.0):\n",
+ " latents = random.normal((1, height // 8, width // 8, 16))\n",
+ " for step in range(num_steps):\n",
+ " latents = task.backbone.denoise_step(\n",
+ " latents,\n",
+ " embeddings,\n",
+ " step,\n",
+ " num_steps,\n",
+ " guidance_scale,\n",
+ " )\n",
+ " return task.backbone.decode_step(latents)[0]\n",
+ "\n",
+ "def scale_output(x):\n",
+ " x = ops.convert_to_numpy(x)\n",
+ " x = np.clip((x + 1.0) / 2.0, 0.0, 1.0)\n",
+ " return np.round(x * 255.0).astype(\"uint8\")\n",
+ "\n",
+ "embeddings = get_text_embeddings(prompt)\n",
+ "image = denoise_with_text_embeddings(embeddings)\n",
+ "scale_output(image)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "[x.shape for x in embeddings]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "from keras import ops\n",
+ "\n",
+ "def slerp(t, v1, v2):\n",
+ " v1, v2 = ops.cast(v1, \"float32\"), ops.cast(v2, \"float32\")\n",
+ " v1_norm = ops.linalg.norm(ops.ravel(v1))\n",
+ " v2_norm = ops.linalg.norm(ops.ravel(v2))\n",
+ " dot = ops.sum(v1 * v2 / (v1_norm * v2_norm))\n",
+ " theta_0 = ops.arccos(dot)\n",
+ " sin_theta_0 = ops.sin(theta_0)\n",
+ " theta_t = theta_0 * t\n",
+ " sin_theta_t = ops.sin(theta_t)\n",
+ " s0 = ops.sin(theta_0 - theta_t) / sin_theta_0\n",
+ " s1 = sin_theta_t / sin_theta_0\n",
+ " return s0 * v1 + s1 * v2\n",
+ "\n",
+ "def 
interpolate_text_embeddings(e1, e2, start=0, stop=1, num=10):\n", + " embeddings = []\n", + " for t in np.linspace(start, stop, num):\n", + " embeddings.append(\n", + " (\n", + " slerp(t, e1[0], e2[0]),\n", + " e1[1],\n", + " slerp(t, e1[2], e2[2]),\n", + " e1[3],\n", + " )\n", + " )\n", + " return embeddings" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt1 = \"A friendly dog looking up in a field of flowers\"\n", + "prompt2 = \"A horrifying, tentacled creature hovering over a field of flowers\"\n", + "e1 = get_text_embeddings(prompt1)\n", + "e2 = get_text_embeddings(prompt2)\n", + "\n", + "images = []\n", + "for et in interpolate_text_embeddings(e1, e2, start=0.5, stop=0.6, num=9):\n", + " image = denoise_with_text_embeddings(et)\n", + " images.append(scale_output(image))\n", + "display(images)" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter17_image-generation", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter18_best-practices-for-the-real-world.ipynb b/chapter18_best-practices-for-the-real-world.ipynb new file mode 100644 index 0000000000..d7e28359aa --- /dev/null +++ b/chapter18_best-practices-for-the-real-world.ipynb @@ -0,0 +1,598 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Best practices for the real world" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Getting the most out of your models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Hyperparameter optimization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using KerasTuner" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras-tuner -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "def build_model(hp):\n", + " units = hp.Int(name=\"units\", min_value=16, max_value=64, step=16)\n", + " model = keras.Sequential(\n", + " [\n", + " layers.Dense(units, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + " )\n", + " optimizer = hp.Choice(name=\"optimizer\", values=[\"rmsprop\", \"adam\"])\n", + " model.compile(\n", + " optimizer=optimizer,\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + " )\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_tuner as kt\n", + "\n", + "class SimpleMLP(kt.HyperModel):\n", + " def __init__(self, num_classes):\n", + " self.num_classes = num_classes\n", + "\n", + " def build(self, hp):\n", + " units = hp.Int(name=\"units\", min_value=16, max_value=64, step=16)\n", + " model = keras.Sequential(\n", + " [\n", + " layers.Dense(units, activation=\"relu\"),\n", + " layers.Dense(self.num_classes, activation=\"softmax\"),\n", + " ]\n", + " )\n", + " optimizer = hp.Choice(name=\"optimizer\", values=[\"rmsprop\", \"adam\"])\n", + " model.compile(\n", + " optimizer=optimizer,\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + " )\n", + " return model\n", + "\n", + "hypermodel = SimpleMLP(num_classes=10)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tuner = kt.BayesianOptimization(\n", + " build_model,\n", + " objective=\"val_accuracy\",\n", + " max_trials=20,\n", + " executions_per_trial=2,\n", + " directory=\"mnist_kt_test\",\n", + " overwrite=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tuner.search_space_summary()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()\n", + "x_train = x_train.reshape((-1, 28 * 28)).astype(\"float32\") / 255\n", + "x_test = x_test.reshape((-1, 28 * 28)).astype(\"float32\") / 255\n", + "x_train_full = x_train[:]\n", + "y_train_full = y_train[:]\n", + "num_val_samples = 10000\n", + "x_train, x_val = x_train[:-num_val_samples], x_train[-num_val_samples:]\n", + "y_train, y_val = 
y_train[:-num_val_samples], y_train[-num_val_samples:]\n", + "callbacks = [\n", + " keras.callbacks.EarlyStopping(monitor=\"val_loss\", patience=5),\n", + "]\n", + "tuner.search(\n", + " x_train,\n", + " y_train,\n", + " batch_size=128,\n", + " epochs=100,\n", + " validation_data=(x_val, y_val),\n", + " callbacks=callbacks,\n", + " verbose=2,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "top_n = 4\n", + "best_hps = tuner.get_best_hyperparameters(top_n)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_best_epoch(hp):\n", + " model = build_model(hp)\n", + " callbacks = [\n", + " keras.callbacks.EarlyStopping(\n", + " monitor=\"val_loss\", mode=\"min\", patience=10\n", + " )\n", + " ]\n", + " history = model.fit(\n", + " x_train,\n", + " y_train,\n", + " validation_data=(x_val, y_val),\n", + " epochs=100,\n", + " batch_size=128,\n", + " callbacks=callbacks,\n", + " )\n", + " val_loss_per_epoch = history.history[\"val_loss\"]\n", + " best_epoch = val_loss_per_epoch.index(min(val_loss_per_epoch)) + 1\n", + " print(f\"Best epoch: {best_epoch}\")\n", + " return best_epoch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_best_trained_model(hp):\n", + " best_epoch = get_best_epoch(hp)\n", + " model = build_model(hp)\n", + " model.fit(\n", + " x_train_full, y_train_full, batch_size=128, epochs=int(best_epoch * 1.2)\n", + " )\n", + " return model\n", + "\n", + "best_models = []\n", + "for hp in best_hps:\n", + " model = get_best_trained_model(hp)\n", + " model.evaluate(x_test, y_test)\n", + " best_models.append(model)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "best_models = tuner.get_best_models(top_n)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The art of crafting the right search space" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The future of hyperparameter tuning: automated machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Model ensembling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Scaling up model training with multiple devices" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Multi-GPU training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Data parallelism: Replicating your model on each GPU" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Model parallelism: Splitting your model across multiple GPUs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Distributed training in practice" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Getting your hands on two or more GPUs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using data parallelism with JAX" + ] + }, + { + "cell_type": "markdown", + "metadata": { 
+ "colab_type": "text" + }, + "source": [ + "##### Using model parallelism with JAX" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### The DeviceMesh API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### The LayoutMap API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### TPU training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using step fusing to improve TPU utilization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Speeding up training and inference with lower-precision computation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Understanding floating-point precision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Float16 inference" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Mixed-precision training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using loss scaling with mixed precision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Beyond mixed precision: float8 training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Faster inference with quantization" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "x = ops.array([[0.1, 0.9], [1.2, -0.8]])\n", + "kernel = ops.array([[-0.1, -2.2], [1.1, 0.7]])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def abs_max_quantize(value):\n", + " abs_max = ops.max(ops.abs(value), keepdims=True)\n", + " scale = ops.divide(127, abs_max + 1e-7)\n", + " scaled_value = value * scale\n", + " scaled_value = ops.clip(ops.round(scaled_value), -127, 127)\n", + " scaled_value = ops.cast(scaled_value, dtype=\"int8\")\n", + " return scaled_value, scale\n", + "\n", + "int_x, x_scale = abs_max_quantize(x)\n", + "int_kernel, kernel_scale = abs_max_quantize(kernel)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "int_y = ops.matmul(int_x, int_kernel)\n", + "y = ops.cast(int_y, dtype=\"float32\") / (x_scale * kernel_scale)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "ops.matmul(x, kernel)" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter18_best-practices-for-the-real-world", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": 
"python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/README.md b/second_edition/README.md new file mode 100644 index 0000000000..53b72c363f --- /dev/null +++ b/second_edition/README.md @@ -0,0 +1,30 @@ +# Second edition notebooks + +These are the notebooks for the second edition of the book, originally published in 2021. These notebooks use `tf.keras` with TensorFlow 2.16. + +## Table of contents + +* [Chapter 2: The mathematical building blocks of neural networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter02_mathematical-building-blocks.ipynb) +* [Chapter 3: Introduction to Keras and TensorFlow](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter03_introduction-to-keras-and-tf.ipynb) +* [Chapter 4: Getting started with neural networks: classification and regression](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter04_getting-started-with-neural-networks.ipynb) +* [Chapter 5: Fundamentals of machine learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter05_fundamentals-of-ml.ipynb) +* [Chapter 7: Working with Keras: a deep dive](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter07_working-with-keras.ipynb) +* [Chapter 8: Introduction to deep learning for computer vision](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb) +* Chapter 9: Advanced deep learning for computer vision + - [Part 1: Image segmentation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter09_part01_image-segmentation.ipynb) + - [Part 2: Modern convnet architecture patterns](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb) + - [Part 3: Interpreting what convnets learn](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb) +* [Chapter 10: Deep learning for timeseries](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter10_dl-for-timeseries.ipynb) +* Chapter 11: Deep learning for text + - [Part 1: Introduction](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter11_part01_introduction.ipynb) + - [Part 2: Sequence models](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter11_part02_sequence-models.ipynb) + - [Part 3: Transformer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter11_part03_transformer.ipynb) + - [Part 4: Sequence-to-sequence learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb) +* Chapter 12: Generative deep learning + - [Part 1: 
Text generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part01_text-generation.ipynb) + - [Part 2: Deep Dream](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part02_deep-dream.ipynb) + - [Part 3: Neural style transfer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part03_neural-style-transfer.ipynb) + - [Part 4: Variational autoencoders](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part04_variational-autoencoders.ipynb) + - [Part 5: Generative adversarial networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part05_gans.ipynb) +* [Chapter 13: Best practices for the real world](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter13_best-practices-for-the-real-world.ipynb) +* [Chapter 14: Conclusions](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter14_conclusions.ipynb) diff --git a/second_edition/chapter02_mathematical-building-blocks.ipynb b/second_edition/chapter02_mathematical-building-blocks.ipynb new file mode 100644 index 0000000000..01edc9becc --- /dev/null +++ b/second_edition/chapter02_mathematical-building-blocks.ipynb @@ -0,0 +1,1469 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# The mathematical building blocks of neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## A first look at a neural network" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Loading the MNIST dataset in Keras**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(train_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(test_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_labels" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The network architecture**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The compilation step**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preparing the image data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**\"Fitting\" the model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(train_images, train_labels, epochs=5, batch_size=128)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using the model to make predictions**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
"colab_type": "code" + }, + "outputs": [], + "source": [ + "test_digits = test_images[0:10]\n", + "predictions = model.predict(test_digits)\n", + "predictions[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0].argmax()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0][7]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_labels[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Evaluating the model on new data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_loss, test_acc = model.evaluate(test_images, test_labels)\n", + "print(f\"test_acc: {test_acc}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Data representations for neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Scalars (rank-0 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "x = np.array(12)\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Vectors (rank-1 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([12, 3, 6, 14, 7])\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Matrices (rank-2 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]])\n", + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Rank-3 and higher-rank tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]],\n", + " [[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]],\n", + " [[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]]])\n", + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Key attributes" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.ndim" + 
] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.dtype" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying the fourth digit**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "digit = train_images[4]\n", + "plt.imshow(digit, cmap=plt.cm.binary)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[4]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Manipulating tensors in NumPy" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100, :, :]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100, 0:28, 0:28]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[:, 14:, 14:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[:, 7:-7, 7:-7]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The notion of data batches" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch = train_images[:128]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch = train_images[128:256]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "n = 3\n", + "batch = train_images[128 * n:128 * (n + 1)]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Real-world examples of data tensors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Vector data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Timeseries data or sequence data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Image data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Video data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## The gears of neural networks: tensor operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Element-wise operations" + ] + }, + { + 
"cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_relu(x):\n", + " assert len(x.shape) == 2\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] = max(x[i, j], 0)\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_add(x, y):\n", + " assert len(x.shape) == 2\n", + " assert x.shape == y.shape\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] += y[i, j]\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import time\n", + "\n", + "x = np.random.random((20, 100))\n", + "y = np.random.random((20, 100))\n", + "\n", + "t0 = time.time()\n", + "for _ in range(1000):\n", + " z = x + y\n", + " z = np.maximum(z, 0.)\n", + "print(\"Took: {0:.2f} s\".format(time.time() - t0))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "t0 = time.time()\n", + "for _ in range(1000):\n", + " z = naive_add(x, y)\n", + " z = naive_relu(z)\n", + "print(\"Took: {0:.2f} s\".format(time.time() - t0))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Broadcasting" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "X = np.random.random((32, 10))\n", + "y = np.random.random((10,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y = np.expand_dims(y, axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "Y = np.concatenate([y] * 32, axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_add_matrix_and_vector(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 1\n", + " assert x.shape[1] == y.shape[0]\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] += y[j]\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "x = np.random.random((64, 3, 32, 10))\n", + "y = np.random.random((32, 10))\n", + "z = np.maximum(x, y)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Tensor product" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.random.random((32,))\n", + "y = np.random.random((32,))\n", + "z = np.dot(x, y)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_vector_dot(x, y):\n", + " assert len(x.shape) == 1\n", + " assert len(y.shape) == 1\n", + " assert x.shape[0] == y.shape[0]\n", + " z = 0.\n", + " for i in range(x.shape[0]):\n", + " z += x[i] * y[i]\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + 
"metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_matrix_vector_dot(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 1\n", + " assert x.shape[1] == y.shape[0]\n", + " z = np.zeros(x.shape[0])\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " z[i] += x[i, j] * y[j]\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_matrix_vector_dot(x, y):\n", + " z = np.zeros(x.shape[0])\n", + " for i in range(x.shape[0]):\n", + " z[i] = naive_vector_dot(x[i, :], y)\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_matrix_dot(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 2\n", + " assert x.shape[1] == y.shape[0]\n", + " z = np.zeros((x.shape[0], y.shape[1]))\n", + " for i in range(x.shape[0]):\n", + " for j in range(y.shape[1]):\n", + " row_x = x[i, :]\n", + " column_y = y[:, j]\n", + " z[i, j] = naive_vector_dot(row_x, column_y)\n", + " return z" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Tensor reshaping" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images = train_images.reshape((60000, 28 * 28))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[0., 1.],\n", + " [2., 3.],\n", + " [4., 5.]])\n", + "x.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = x.reshape((6, 1))\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.zeros((300, 20))\n", + "x = np.transpose(x)\n", + "x.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Geometric interpretation of tensor operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A geometric interpretation of deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## The engine of neural networks: gradient-based optimization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### What's a derivative?" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Derivative of a tensor operation: the gradient" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Stochastic gradient descent" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Chaining derivatives: The Backpropagation algorithm" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The chain rule" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Automatic differentiation with computation graphs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The gradient tape in TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "x = tf.Variable(0.)\n", + "with tf.GradientTape() as tape:\n", + " y = 2 * x + 3\n", + "grad_of_y_wrt_x = tape.gradient(y, x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.Variable(tf.random.uniform((2, 2)))\n", + "with tf.GradientTape() as tape:\n", + " y = 2 * x + 3\n", + "grad_of_y_wrt_x = tape.gradient(y, x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "W = tf.Variable(tf.random.uniform((2, 2)))\n", + "b = tf.Variable(tf.zeros((2,)))\n", + "x = tf.random.uniform((2, 2))\n", + "with tf.GradientTape() as tape:\n", + " y = tf.matmul(x, W) + b\n", + "grad_of_y_wrt_W_and_b = tape.gradient(y, [W, b])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Looking back at our first example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(train_images, train_labels, epochs=5, batch_size=128)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Reimplementing our first example from scratch in TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A simple Dense class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" 
+ }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "class NaiveDense:\n", + " def __init__(self, input_size, output_size, activation):\n", + " self.activation = activation\n", + "\n", + " w_shape = (input_size, output_size)\n", + " w_initial_value = tf.random.uniform(w_shape, minval=0, maxval=1e-1)\n", + " self.W = tf.Variable(w_initial_value)\n", + "\n", + " b_shape = (output_size,)\n", + " b_initial_value = tf.zeros(b_shape)\n", + " self.b = tf.Variable(b_initial_value)\n", + "\n", + " def __call__(self, inputs):\n", + " return self.activation(tf.matmul(inputs, self.W) + self.b)\n", + "\n", + " @property\n", + " def weights(self):\n", + " return [self.W, self.b]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A simple Sequential class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class NaiveSequential:\n", + " def __init__(self, layers):\n", + " self.layers = layers\n", + "\n", + " def __call__(self, inputs):\n", + " x = inputs\n", + " for layer in self.layers:\n", + " x = layer(x)\n", + " return x\n", + "\n", + " @property\n", + " def weights(self):\n", + " weights = []\n", + " for layer in self.layers:\n", + " weights += layer.weights\n", + " return weights" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = NaiveSequential([\n", + " NaiveDense(input_size=28 * 28, output_size=512, activation=tf.nn.relu),\n", + " NaiveDense(input_size=512, output_size=10, activation=tf.nn.softmax)\n", + "])\n", + "assert len(model.weights) == 4" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A batch generator" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import math\n", + "\n", + "class BatchGenerator:\n", + " def __init__(self, images, labels, batch_size=128):\n", + " assert len(images) == len(labels)\n", + " self.index = 0\n", + " self.images = images\n", + " self.labels = labels\n", + " self.batch_size = batch_size\n", + " self.num_batches = math.ceil(len(images) / batch_size)\n", + "\n", + " def next(self):\n", + " images = self.images[self.index : self.index + self.batch_size]\n", + " labels = self.labels[self.index : self.index + self.batch_size]\n", + " self.index += self.batch_size\n", + " return images, labels" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Running one training step" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def one_training_step(model, images_batch, labels_batch):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(images_batch)\n", + " per_sample_losses = tf.keras.losses.sparse_categorical_crossentropy(\n", + " labels_batch, predictions)\n", + " average_loss = tf.reduce_mean(per_sample_losses)\n", + " gradients = tape.gradient(average_loss, model.weights)\n", + " update_weights(gradients, model.weights)\n", + " return average_loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 1e-3\n", + "\n", + "def update_weights(gradients, weights):\n", + " for g, w in zip(gradients, weights):\n", + 
" w.assign_sub(g * learning_rate)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import optimizers\n", + "\n", + "optimizer = optimizers.SGD(learning_rate=1e-3)\n", + "\n", + "def update_weights(gradients, weights):\n", + " optimizer.apply_gradients(zip(gradients, weights))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The full training loop" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def fit(model, images, labels, epochs, batch_size=128):\n", + " for epoch_counter in range(epochs):\n", + " print(f\"Epoch {epoch_counter}\")\n", + " batch_generator = BatchGenerator(images, labels)\n", + " for batch_counter in range(batch_generator.num_batches):\n", + " images_batch, labels_batch = batch_generator.next()\n", + " loss = one_training_step(model, images_batch, labels_batch)\n", + " if batch_counter % 100 == 0:\n", + " print(f\"loss at batch {batch_counter}: {loss:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255\n", + "\n", + "fit(model, train_images, train_labels, epochs=10, batch_size=128)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Evaluating the model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model(test_images)\n", + "predictions = predictions.numpy()\n", + "predicted_labels = np.argmax(predictions, axis=1)\n", + "matches = predicted_labels == test_labels\n", + "print(f\"accuracy: {matches.mean():.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter02_mathematical-building-blocks.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter03_introduction-to-keras-and-tf.ipynb b/second_edition/chapter03_introduction-to-keras-and-tf.ipynb similarity index 100% rename from chapter03_introduction-to-keras-and-tf.ipynb rename to second_edition/chapter03_introduction-to-keras-and-tf.ipynb diff --git a/chapter04_getting-started-with-neural-networks.ipynb b/second_edition/chapter04_getting-started-with-neural-networks.ipynb similarity index 100% rename from chapter04_getting-started-with-neural-networks.ipynb rename to second_edition/chapter04_getting-started-with-neural-networks.ipynb diff 
--git a/second_edition/chapter05_fundamentals-of-ml.ipynb b/second_edition/chapter05_fundamentals-of-ml.ipynb new file mode 100644 index 0000000000..dd61f4ead8 --- /dev/null +++ b/second_edition/chapter05_fundamentals-of-ml.ipynb @@ -0,0 +1,786 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Fundamentals of machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Generalization: The goal of machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Underfitting and overfitting" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Noisy training data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Ambiguous features" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Rare features and spurious correlations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Adding white-noise channels or all-zeros channels to MNIST**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "import numpy as np\n", + "\n", + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "train_images_with_noise_channels = np.concatenate(\n", + " [train_images, np.random.random((len(train_images), 784))], axis=1)\n", + "\n", + "train_images_with_zeros_channels = np.concatenate(\n", + " [train_images, np.zeros((len(train_images), 784))], axis=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training the same model on MNIST data with noise channels or all-zero channels**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "def get_model():\n", + " model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + " ])\n", + " model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + " return model\n", + "\n", + "model = get_model()\n", + "history_noise = model.fit(\n", + " train_images_with_noise_channels, train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2)\n", + "\n", + "model = get_model()\n", + "history_zeros = model.fit(\n", + " 
train_images_with_zeros_channels, train_labels,\n", + "    epochs=10,\n", + "    batch_size=128,\n", + "    validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting a validation accuracy comparison**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "val_acc_noise = history_noise.history[\"val_accuracy\"]\n", + "val_acc_zeros = history_zeros.history[\"val_accuracy\"]\n", + "epochs = range(1, 11)\n", + "plt.plot(epochs, val_acc_noise, \"b-\",\n", + "         label=\"Validation accuracy with noise channels\")\n", + "plt.plot(epochs, val_acc_zeros, \"b--\",\n", + "         label=\"Validation accuracy with zeros channels\")\n", + "plt.title(\"Effect of noise channels on validation accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The nature of generalization in deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Fitting a MNIST model with randomly shuffled labels**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "# .copy() gives an independent array; a [:] slice of a NumPy array is a\n", + "# view, so shuffling it in place would also scramble train_labels\n", + "random_train_labels = train_labels.copy()\n", + "np.random.shuffle(random_train_labels)\n", + "\n", + "model = keras.Sequential([\n", + "    layers.Dense(512, activation=\"relu\"),\n", + "    layers.Dense(10, activation=\"softmax\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + "              loss=\"sparse_categorical_crossentropy\",\n", + "              metrics=[\"accuracy\"])\n", + "model.fit(train_images, random_train_labels,\n", + "          epochs=100,\n", + "          batch_size=128,\n", + "          validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The manifold hypothesis" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Interpolation as a source of generalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Why deep learning works" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Training data is paramount" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Evaluating machine-learning models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training, validation, and test sets" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Simple hold-out validation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### K-fold validation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Iterated K-fold validation with shuffling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Beating a common-sense baseline" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source":
[ + "### Things to keep in mind about model evaluation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Improving model fit" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Tuning key gradient descent parameters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training a MNIST model with an incorrectly high learning rate**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])\n", + "model.compile(optimizer=keras.optimizers.RMSprop(1.),\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The same model with a more appropriate learning rate**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])\n", + "model.compile(optimizer=keras.optimizers.RMSprop(1e-2),\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Leveraging better architecture priors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Increasing model capacity" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A simple logistic regression on MNIST**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([layers.Dense(10, activation=\"softmax\")])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_small_model = model.fit(\n", + " train_images, train_labels,\n", + " epochs=20,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "val_loss = history_small_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(epochs, val_loss, \"b--\",\n", + " label=\"Validation loss\")\n", + "plt.title(\"Effect of insufficient model capacity on validation loss\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(96, activation=\"relu\"),\n", + " layers.Dense(96, 
activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_large_model = model.fit(\n", + " train_images, train_labels,\n", + " epochs=20,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Improving generalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Dataset curation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Feature engineering" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using early stopping" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Regularizing your model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Reducing the network's size" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Original model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import imdb\n", + "(train_data, train_labels), _ = imdb.load_data(num_words=10000)\n", + "\n", + "def vectorize_sequences(sequences, dimension=10000):\n", + " results = np.zeros((len(sequences), dimension))\n", + " for i, sequence in enumerate(sequences):\n", + " results[i, sequence] = 1.\n", + " return results\n", + "train_data = vectorize_sequences(train_data)\n", + "\n", + "model = keras.Sequential([\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_original = model.fit(train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Version of the model with lower capacity**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_smaller_model = model.fit(\n", + " train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Version of the model with higher capacity**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_larger_model = model.fit(\n", + " 
train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Adding weight regularization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Adding L2 weight regularization to the model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import regularizers\n", + "model = keras.Sequential([\n", + " layers.Dense(16,\n", + " kernel_regularizer=regularizers.l2(0.002),\n", + " activation=\"relu\"),\n", + " layers.Dense(16,\n", + " kernel_regularizer=regularizers.l2(0.002),\n", + " activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_l2_reg = model.fit(\n", + " train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Different weight regularizers available in Keras**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import regularizers\n", + "regularizers.l1(0.001)\n", + "regularizers.l1_l2(l1=0.001, l2=0.001)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Adding dropout" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Adding dropout to the IMDB model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dropout(0.5),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dropout(0.5),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_dropout = model.fit(\n", + " train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter05_fundamentals-of-ml.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter07_working-with-keras.ipynb b/second_edition/chapter07_working-with-keras.ipynb similarity index 100% rename from chapter07_working-with-keras.ipynb rename to second_edition/chapter07_working-with-keras.ipynb diff --git a/chapter08_intro-to-dl-for-computer-vision.ipynb b/second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb similarity index 99% rename from chapter08_intro-to-dl-for-computer-vision.ipynb 
rename to second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb index 60072bce8a..2459d444c4 100644 --- a/chapter08_intro-to-dl-for-computer-vision.ipynb +++ b/second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb @@ -264,6 +264,17 @@ "!kaggle competitions download -c dogs-vs-cats" ] }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!unzip -qq dogs-vs-cats.zip" + ] + }, { "cell_type": "code", "execution_count": 0, @@ -1210,4 +1221,4 @@ }, "nbformat": 4, "nbformat_minor": 0 -} \ No newline at end of file +} diff --git a/chapter09_part01_image-segmentation.ipynb b/second_edition/chapter09_part01_image-segmentation.ipynb similarity index 100% rename from chapter09_part01_image-segmentation.ipynb rename to second_edition/chapter09_part01_image-segmentation.ipynb diff --git a/chapter09_part02_modern-convnet-architecture-patterns.ipynb b/second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb similarity index 100% rename from chapter09_part02_modern-convnet-architecture-patterns.ipynb rename to second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb diff --git a/chapter09_part03_interpreting-what-convnets-learn.ipynb b/second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb similarity index 100% rename from chapter09_part03_interpreting-what-convnets-learn.ipynb rename to second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb diff --git a/chapter10_dl-for-timeseries.ipynb b/second_edition/chapter10_dl-for-timeseries.ipynb similarity index 100% rename from chapter10_dl-for-timeseries.ipynb rename to second_edition/chapter10_dl-for-timeseries.ipynb diff --git a/chapter11_part01_introduction.ipynb b/second_edition/chapter11_part01_introduction.ipynb similarity index 100% rename from chapter11_part01_introduction.ipynb rename to second_edition/chapter11_part01_introduction.ipynb diff --git a/chapter11_part02_sequence-models.ipynb b/second_edition/chapter11_part02_sequence-models.ipynb similarity index 100% rename from chapter11_part02_sequence-models.ipynb rename to second_edition/chapter11_part02_sequence-models.ipynb diff --git a/chapter11_part03_transformer.ipynb b/second_edition/chapter11_part03_transformer.ipynb similarity index 100% rename from chapter11_part03_transformer.ipynb rename to second_edition/chapter11_part03_transformer.ipynb diff --git a/chapter11_part04_sequence-to-sequence-learning.ipynb b/second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb similarity index 99% rename from chapter11_part04_sequence-to-sequence-learning.ipynb rename to second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb index a08929dedb..8f7bf72641 100644 --- a/chapter11_part04_sequence-to-sequence-learning.ipynb +++ b/second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb @@ -106,6 +106,8 @@ "import tensorflow as tf\n", "import string\n", "import re\n", + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", "\n", "strip_chars = string.punctuation + \"\u00bf\"\n", "strip_chars = strip_chars.replace(\"[\", \"\")\n", @@ -403,6 +405,8 @@ " padding_mask = tf.cast(\n", " mask[:, tf.newaxis, :], dtype=\"int32\")\n", " padding_mask = tf.minimum(padding_mask, causal_mask)\n", + " else:\n", + " padding_mask = mask\n", " attention_output_1 = self.attention_1(\n", " query=inputs,\n", " value=inputs,\n", @@ -618,4 +622,4 @@ }, "nbformat": 4, "nbformat_minor": 0 -} \ No newline at end of file 
+} diff --git a/chapter12_part01_text-generation.ipynb b/second_edition/chapter12_part01_text-generation.ipynb similarity index 97% rename from chapter12_part01_text-generation.ipynb rename to second_edition/chapter12_part01_text-generation.ipynb index 1c43438d3b..f683c1d73b 100644 --- a/chapter12_part01_text-generation.ipynb +++ b/second_edition/chapter12_part01_text-generation.ipynb @@ -293,6 +293,8 @@ " padding_mask = tf.cast(\n", " mask[:, tf.newaxis, :], dtype=\"int32\")\n", " padding_mask = tf.minimum(padding_mask, causal_mask)\n", + " else:\n", + " padding_mask = mask\n", " attention_output_1 = self.attention_1(\n", " query=inputs,\n", " value=inputs,\n", @@ -391,6 +393,8 @@ " self.model_input_length = model_input_length\n", " self.temperatures = temperatures\n", " self.print_freq = print_freq\n", + " vectorized_prompt = text_vectorization([prompt])[0].numpy()\n", + " self.prompt_length = np.nonzero(vectorized_prompt == 0)[0][0]\n", "\n", " def on_epoch_end(self, epoch, logs=None):\n", " if (epoch + 1) % self.print_freq != 0:\n", @@ -401,7 +405,9 @@ " for i in range(self.generate_length):\n", " tokenized_sentence = text_vectorization([sentence])\n", " predictions = self.model(tokenized_sentence)\n", - " next_token = sample_next(predictions[0, i, :])\n", + " next_token = sample_next(\n", + " predictions[0, self.prompt_length - 1 + i, :]\n", + " )\n", " sampled_token = tokens_index[next_token]\n", " sentence += \" \" + sampled_token\n", " print(sentence)\n", diff --git a/chapter12_part02_deep-dream.ipynb b/second_edition/chapter12_part02_deep-dream.ipynb similarity index 100% rename from chapter12_part02_deep-dream.ipynb rename to second_edition/chapter12_part02_deep-dream.ipynb diff --git a/chapter12_part03_neural-style-transfer.ipynb b/second_edition/chapter12_part03_neural-style-transfer.ipynb similarity index 100% rename from chapter12_part03_neural-style-transfer.ipynb rename to second_edition/chapter12_part03_neural-style-transfer.ipynb diff --git a/chapter12_part04_variational-autoencoders.ipynb b/second_edition/chapter12_part04_variational-autoencoders.ipynb similarity index 100% rename from chapter12_part04_variational-autoencoders.ipynb rename to second_edition/chapter12_part04_variational-autoencoders.ipynb diff --git a/chapter12_part05_gans.ipynb b/second_edition/chapter12_part05_gans.ipynb similarity index 100% rename from chapter12_part05_gans.ipynb rename to second_edition/chapter12_part05_gans.ipynb diff --git a/chapter13_best-practices-for-the-real-world.ipynb b/second_edition/chapter13_best-practices-for-the-real-world.ipynb similarity index 99% rename from chapter13_best-practices-for-the-real-world.ipynb rename to second_edition/chapter13_best-practices-for-the-real-world.ipynb index 68736349e6..1d4b3b28c6 100644 --- a/chapter13_best-practices-for-the-real-world.ipynb +++ b/second_edition/chapter13_best-practices-for-the-real-world.ipynb @@ -244,6 +244,7 @@ "source": [ "def get_best_trained_model(hp):\n", " best_epoch = get_best_epoch(hp)\n", + " model = build_model(hp)\n", " model.fit(\n", " x_train_full, y_train_full,\n", " batch_size=128, epochs=int(best_epoch * 1.2))\n", diff --git a/chapter14_conclusions.ipynb b/second_edition/chapter14_conclusions.ipynb similarity index 100% rename from chapter14_conclusions.ipynb rename to second_edition/chapter14_conclusions.ipynb