
Semantic segmentation toolkit based on Visual Transformers

Semantic segmentation assigns each pixel in an image to a semantic category, covering both objects (e.g., bicycle, car, person) and stuff (e.g., road, bench, sky).
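To make the per-pixel idea concrete, here is a minimal sketch in plain Python. It assumes a model has already produced one score per class for every pixel; the class names and scores are made up for illustration and do not come from this toolkit.

```python
from typing import List

# Hypothetical label set for illustration; each real dataset defines its own.
CLASSES = ["road", "car", "sky"]


def argmax_label_map(scores: List[List[List[float]]]) -> List[List[int]]:
    """Turn per-class score maps into a label map.

    scores[c][y][x] is the score of class c at pixel (y, x).
    The result holds, for every pixel, the index of the highest-scoring class.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    label_map = []
    for y in range(height):
        row = []
        for x in range(width):
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        label_map.append(row)
    return label_map


# A 2x2 "image" with made-up scores for the three classes.
scores = [
    [[0.7, 0.2], [0.1, 0.1]],  # road
    [[0.2, 0.7], [0.1, 0.2]],  # car
    [[0.1, 0.1], [0.8, 0.7]],  # sky
]
label_map = argmax_label_map(scores)
named = [[CLASSES[i] for i in row] for row in label_map]
print(named)
```

In practice the score maps are tensors produced by the segmentation head, and the argmax is a single tensor operation rather than a Python loop.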

Environment

This code is developed under the following configurations:

Hardware: 1/2/4/8 GPU for training and testing
Software: CentOS 6.10, CUDA=10.2, Python=3.8, Paddle=2.1.0

Installation

  1. Create a conda virtual environment and activate it.
conda create -n paddlevit python=3.8
conda activate paddlevit
  2. Install PaddlePaddle following the official instructions, e.g.,
conda install paddlepaddle-gpu==2.1.0 cudatoolkit=10.2 --channel https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/
  3. Install PaddleViT.
git clone https://github.com/BR-IDL/PaddleViT.git
cd PaddleViT/semantic_segmentation
pip3 install -r requirements.txt
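After installation, a quick sanity check can confirm that the interpreter and Paddle versions match the environment listed above. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper name, and it only relies on `paddle.__version__` and `paddle.is_compiled_with_cuda()`.

```python
import sys


def check_environment() -> dict:
    """Collect the Python version and, if Paddle is installed, its version and CUDA build flag."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "paddle": None,       # stays None when Paddle cannot be imported
        "cuda_build": None,   # True if the installed Paddle was built with CUDA
    }
    try:
        import paddle
    except ImportError:
        return info
    info["paddle"] = paddle.__version__
    info["cuda_build"] = paddle.is_compiled_with_cuda()
    return info


if __name__ == "__main__":
    print(check_environment())
```

For a fuller check that actually runs a small computation on the device, Paddle also provides `paddle.utils.run_check()`.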

Demo

We provide a demo script demo.py. This script performs inference on single images. You can put the input images in ./demo/img.

cd demo
CUDA_VISIBLE_DEVICES=0 python3 demo.py \
    --config ${CONFIG_FILE} \
    --model_path ${MODEL_PATH} \
    --pretrained_backbone ${PRETRAINED_BACKBONE} \
    --img_dir ${IMAGE_DIRECTORY} \
    --results_dir ${RESULT_DIRECTORY}

Example:

cd demo
CUDA_VISIBLE_DEVICES=0 python3 demo.py \
    --config ../configs/setr/SETR_PUP_Large_768x768_80k_cityscapes_bs_8.yaml \
    --model_path ../pretrain_models/setr/SETR_PUP_cityscapes_b8_80k.pdparams \
    --pretrained_backbone ../pretrain_models/backbones/vit_large_patch16_224.pdparams \
    --img_dir ./img/ \
    --results_dir ./results/

Quick start: training and testing models

1. Preparing data

Pascal-Context dataset

Download the Pascal-Context dataset. Specifically, download PASCAL VOC2010 from http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar and the annotation file from https://codalabuser.blob.core.windows.net/public/trainval_merged.json, then run the script voc2010_to_pascalcontext.py to generate "pascal_context/SegmentationClassContext". The dataset should have this basic structure:

pascal_context
|-- Annotations
|-- ImageSets
|-- JPEGImages
|-- SegmentationClass
|-- SegmentationClassContext
|-- SegmentationObject
|-- trainval_merged.json
|-- voc2010_to_pascalcontext.py

ADE20K dataset

Download the ADE20K dataset from http://sceneparsing.csail.mit.edu/. It should have this basic structure:

ADEChallengeData2016
|-- annotations
|   |-- training
|   `-- validation
|-- images
|   |-- training
|   `-- validation
|-- objectInfo150.txt
`-- sceneCategories.txt

Cityscapes dataset

Download the Cityscapes dataset from https://www.cityscapes-dataset.com/. Annotation files matching the pattern **labelTrainIds.png are used for Cityscapes training; they are generated by the script convert_cityscapes.py. The dataset should have this basic structure:

cityscapes
|-- gtFine
|   |-- test
|   |-- train
|   `-- val
|-- leftImg8bit
|   |-- test
|   |-- train
|   `-- val

Trans10kV2 dataset

Download the Trans10kV2 dataset from Google Drive or Baidu Drive (code: oqms). It should have this basic structure:

Trans10K_cls12
|-- test
|   |-- images
|   `-- masks_12
|-- train
|   |-- images
|   `-- masks_12
|-- validation
|   |-- images
|   `-- masks_12
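Before training or testing, it can help to confirm that a dataset folder matches the layouts listed above. The sketch below is a hypothetical helper, not a script shipped with the repository. It checks only a subset of the directories shown in the trees, and it assumes the dataset root folder carries the same name as in the listing.

```python
from pathlib import Path
from typing import Dict, List

# Sub-directories taken from the directory trees above, keyed by root folder name.
EXPECTED: Dict[str, List[str]] = {
    "pascal_context": ["ImageSets", "JPEGImages", "SegmentationClassContext"],
    "ADEChallengeData2016": [
        "annotations/training", "annotations/validation",
        "images/training", "images/validation",
    ],
    "cityscapes": [
        "gtFine/train", "gtFine/val",
        "leftImg8bit/train", "leftImg8bit/val",
    ],
    "Trans10K_cls12": [
        "train/images", "train/masks_12",
        "validation/images", "validation/masks_12",
        "test/images", "test/masks_12",
    ],
}


def missing_entries(dataset_root: str) -> List[str]:
    """Return the expected sub-directories that are absent under dataset_root."""
    root = Path(dataset_root)
    expected = EXPECTED.get(root.name)
    if expected is None:
        raise ValueError("unknown dataset folder name: {}".format(root.name))
    return [rel for rel in expected if not (root / rel).is_dir()]
```

For example, `missing_entries("/data/cityscapes")` (the path is an assumption) returns an empty list when every checked folder is present.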

2. Testing

Single-scale testing on single GPU

CUDA_VISIBLE_DEVICES=0 python3 val.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
    --model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams

Multi-scale testing on single GPU

CUDA_VISIBLE_DEVICES=0 python3 val.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
    --model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams \
    --multi_scales True

Single-scale testing on multi GPU

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch val.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
    --model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams

Multi-scale testing on multi GPU

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch val.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml \
    --model_path ./pretrain_models/setr/SETR_MLA_pascal_context_b8_80k.pdparams \
    --multi_scales True

Note:

  • The --model_path option accepts the path to the pretrained weights file of the segmentation model (e.g., SETR).
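The four testing variants above differ only in the visible GPUs, the launcher, and the --multi_scales flag. When evaluating several checkpoints, the command line can be assembled programmatically. The sketch below is a hypothetical wrapper (`build_val_command` is not part of the repository) that uses only the flags shown in the commands above.

```python
from typing import Dict, List, Sequence, Tuple


def build_val_command(
    config: str,
    model_path: str,
    gpus: Sequence[int] = (0,),
    multi_scales: bool = False,
) -> Tuple[Dict[str, str], List[str]]:
    """Return (extra environment variables, argv) for one val.py run."""
    if not gpus:
        raise ValueError("at least one GPU id is required")
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpus)}
    argv = ["python3"]
    if len(gpus) > 1:
        # Multi-GPU runs go through the distributed launcher, as in the commands above.
        argv += ["-u", "-m", "paddle.distributed.launch"]
    argv += ["val.py", "--config", config, "--model_path", model_path]
    if multi_scales:
        argv += ["--multi_scales", "True"]
    return env, argv
```

The result can be passed to `subprocess.run(argv, env={**os.environ, **env})` from the semantic_segmentation directory.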

3. Training

Training on single GPU

CUDA_VISIBLE_DEVICES=0 python3 train.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml

Note:

  • Training options such as lr, image size, and model layers can be changed in the .yaml file passed to --config. All available settings can be found in ./config.py.
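The general pattern behind this note is that default settings are defined in code and a .yaml file overrides only the values it mentions. The following plain-Python sketch illustrates that pattern. It is not the repository's config.py, and the setting names (`TRAIN`, `BASE_LR`, `ITERS`, `DATA`, `CROP_SIZE`) are hypothetical.

```python
import copy
from typing import Any, Dict

# Hypothetical defaults; the real ones live in ./config.py.
DEFAULTS: Dict[str, Any] = {
    "TRAIN": {"BASE_LR": 0.001, "ITERS": 80000},
    "DATA": {"CROP_SIZE": [480, 480]},
}


def merge_overrides(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of defaults with values from overrides applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested section: merge key by key instead of replacing the whole section.
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-in for the parsed content of a .yaml file that changes only the learning rate.
overrides = {"TRAIN": {"BASE_LR": 0.01}}
config = merge_overrides(DEFAULTS, overrides)
print(config["TRAIN"])
```

Settings that the override does not mention keep their default values, which is why a .yaml file only needs to list what differs.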

Training on multi GPU

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -u -m paddle.distributed.launch train.py \
    --config ./configs/setr/SETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml

Note:

  • Training options such as lr, image size, and model layers can be changed in the .yaml file passed to --config. All available settings can be found in ./config.py.

Contact

If you have any questions regarding this repo, please create an issue.