|
1525 | 1525 | "cell_type": "markdown",
|
1526 | 1526 | "metadata": {},
|
1527 | 1527 | "source": [
|
1528 |
| - "The overall ResNet architecture consists of stacking multiple ResNet blocks, of which some are downsampling the input. When talking about ResNet blocks in the whole network, we usually group them by the same output shape. Hence, if we say the ResNet has `[3,3,3]` blocks, it means that we have 3 times a group of 3 ResNet blocks, where a subsampling is taking place in the fourth and seventh block. The same notation is used by many other implementations such as in the [torchvision library](https://pytorch.org/docs/stable/_modules/torchvision/models/resnet.html#resnet18) from PyTorch. Our code looks as follows:" |
| 1528 | + "The overall ResNet architecture consists of stacking multiple ResNet blocks, some of which downsample the input. When talking about ResNet blocks in the whole network, we usually group them by output shape. Hence, if we say the ResNet has `[3,3,3]` blocks, it means that we have 3 groups of 3 ResNet blocks each, where subsampling takes place in the fourth and seventh block. The ResNet with `[3,3,3]` blocks on CIFAR10 is visualized below.\n", |
| 1529 | + "\n", |
| 1530 | + "<center width=\"100%\"><img src=\"resnet_notation.svg\" width=\"500px\"></center>\n", |
| 1531 | + "\n", |
| 1532 | + "The three groups operate on the resolutions $32\\times32$, $16\\times16$ and $8\\times8$, respectively. The blocks in orange denote ResNet blocks with downsampling. The same notation is used by many other implementations, such as the [torchvision library](https://pytorch.org/docs/stable/_modules/torchvision/models/resnet.html#resnet18) from PyTorch. Thus, our code looks as follows:" |
1529 | 1533 | ]
|
1530 | 1534 | },
|
1531 | 1535 | {
|
|
1540 | 1544 | " \"\"\"\n",
|
1541 | 1545 | " Inputs: \n",
|
1542 | 1546 | " num_classes - Number of classification outputs (10 for CIFAR10)\n",
|
1543 |
| - " num_blocks - List with the number of ResNet blocks to use. The first block of each group uses downsampling, expect the first.\n", |
| 1547 | + " num_blocks - List with the number of ResNet blocks to use. The first block of each group uses downsampling, except the first.\n", |
1544 | 1548 | " c_hidden - List with the hidden dimensionalities in the different blocks. Usually multiplied by 2 the deeper we go.\n",
|
1545 | 1549 | " act_fn_name - Name of the activation function to use, looked up in \"act_fn_by_name\"\n",
|
1546 | 1550 | " block_name - Name of the ResNet block, looked up in \"resnet_blocks_by_name\"\n",
|
|
1576 | 1580 | " blocks = []\n",
|
1577 | 1581 | " for block_idx, block_count in enumerate(self.hparams.num_blocks):\n",
|
1578 | 1582 | " for bc in range(block_count):\n",
|
1579 |
| - " subsample = (bc == 0 and block_idx > 0) # Subsample the first block of each \"super-block\", except the very first one.\n", |
| 1583 | + " subsample = (bc == 0 and block_idx > 0) # Subsample the first block of each group, except the very first one.\n", |
1580 | 1584 | " blocks.append(\n",
|
1581 | 1585 | " self.hparams.block_class(c_in=c_hidden[block_idx if not subsample else (block_idx-1)],\n",
|
1582 | 1586 | " act_fn=self.hparams.act_fn,\n",
|
|
2273 | 2277 | "name": "python",
|
2274 | 2278 | "nbconvert_exporter": "python",
|
2275 | 2279 | "pygments_lexer": "ipython3",
|
2276 |
| - "version": "3.7.3" |
| 2280 | + "version": "3.7.4" |
2277 | 2281 | }
|
2278 | 2282 | },
|
2279 | 2283 | "nbformat": 4,
|
|
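The grouping notation and the `subsample` condition in the diff above can be checked with a small standalone sketch. It reuses the loop structure and the `subsample` expression from the changed code, but everything else is an assumption made for illustration: the helper name `block_schedule`, the channel sizes `[16, 32, 64]`, and the idea that each subsampling block halves the spatial resolution (which matches the $32\times32$, $16\times16$, $8\times8$ progression described in the markdown cell). It does not build any actual network layers.

```python
def block_schedule(num_blocks, c_hidden, input_resolution=32):
    """Hypothetical helper: list per-block settings for a grouped ResNet.

    num_blocks       - number of blocks per group, e.g. [3, 3, 3]
    c_hidden         - channels per group (assumed values, not from the diff)
    input_resolution - spatial size entering the first group (32 for CIFAR10)
    """
    schedule = []
    resolution = input_resolution
    for block_idx, block_count in enumerate(num_blocks):
        for bc in range(block_count):
            # Same condition as in the notebook code: subsample the first
            # block of each group, except the very first group.
            subsample = (bc == 0 and block_idx > 0)
            if subsample:
                # Assumption: a subsampling block halves height and width.
                resolution //= 2
            # A subsampling block still receives the previous group's channels.
            c_in = c_hidden[block_idx - 1] if subsample else c_hidden[block_idx]
            schedule.append({
                "group": block_idx,
                "c_in": c_in,
                "c_out": c_hidden[block_idx],
                "subsample": subsample,
                "resolution": resolution,
            })
    return schedule


schedule = block_schedule(num_blocks=[3, 3, 3], c_hidden=[16, 32, 64])
for number, block in enumerate(schedule, start=1):
    print(number, block)
```

With `[3,3,3]`, only the fourth and seventh entries have `subsample` set, which agrees with the description in the markdown cell and with the reworded comment on the `subsample` line.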