Content Overview
- Install necessary dependencies
- Import required libraries
- Custom dataset preparation for semantic segmentation
- Configure the DeepLabV3 Mobilenet model for custom dataset
- Create the Task object (tfm.core.base_task.Task) from the config_definitions.TaskConfig
This tutorial trains a DeepLabV3 model with a MobileNetV2 backbone from the TensorFlow Model Garden package (tensorflow-models).
Model Garden contains a collection of state-of-the-art models, implemented with TensorFlow's high-level APIs. The implementations demonstrate best practices for modeling, letting users take full advantage of TensorFlow for their research and product development.
Dataset: Oxford-IIIT Pets
- The Oxford-IIIT pet dataset is a 37-category pet image dataset with roughly 200 images for each class. The images have large variations in scale, pose, and lighting. All images have an associated ground truth annotation of breed.
This tutorial demonstrates how to:
- Use models from the TensorFlow Models package.
- Train/Fine-tune a pre-built DeepLabV3 model with a MobileNetV2 backbone for semantic segmentation.
- Export the trained/tuned DeepLabV3 model
Install necessary dependencies
pip install -U -q "tf-models-official"
Import required libraries
import os
import pprint
import numpy as np
import matplotlib.pyplot as plt
from IPython import display
import tensorflow as tf
import tensorflow_datasets as tfds
import orbit
import tensorflow_models as tfm
from official.vision.data import tfrecord_lib
from official.vision.utils import summary_manager
from official.vision.serving import export_saved_model_lib
from official.vision.utils.object_detection import visualization_utils
pp = pprint.PrettyPrinter(indent=4) # Set Pretty Print Indentation
print(tf.__version__) # Check the version of tensorflow used
%matplotlib inline
2024-02-02 12:12:13.799558: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-02 12:12:13.799625: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-02 12:12:13.801330: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2.15.0
Custom dataset preparation for semantic segmentation
Models in the Official repository (of Model Garden) require the dataset to be in the TFRecord format.
Please check this resource to learn more about the TFRecord data format.
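To make the format concrete, here is a minimal sketch (not part of the original tutorial) of writing one tf.train.Example to a TFRecord file and parsing it back; the feature names are illustrative only:
# Minimal sketch: a TFRecord file holds serialized tf.train.Example protos.
example = tf.train.Example(features=tf.train.Features(feature={
    'image/height': tf.train.Feature(int64_list=tf.train.Int64List(value=[128])),
    'image/width': tf.train.Feature(int64_list=tf.train.Int64List(value=[128])),
}))
with tf.io.TFRecordWriter('/tmp/example.tfrecord') as writer:
  writer.write(example.SerializeToString())
# Reading back: parse each serialized proto against a feature spec.
feature_spec = {
    'image/height': tf.io.FixedLenFeature([], tf.int64),
    'image/width': tf.io.FixedLenFeature([], tf.int64),
}
for raw in tf.data.TFRecordDataset('/tmp/example.tfrecord'):
  print(tf.io.parse_single_example(raw, feature_spec))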
The oxford_iiit_pet:3.*.* dataset is loaded from TensorFlow Datasets:
(train_ds, val_ds, test_ds), info = tfds.load(
'oxford_iiit_pet:3.*.*',
split=['train+test[:50%]', 'test[50%:80%]', 'test[80%:100%]'],
with_info=True)
info
tfds.core.DatasetInfo(
name='oxford_iiit_pet',
full_name='oxford_iiit_pet/3.2.0',
description="""
The Oxford-IIIT pet dataset is a 37 category pet image dataset with roughly 200
images for each class. The images have large variations in scale, pose and
lighting. All images have an associated ground truth annotation of breed.
""",
homepage='http://www.robots.ox.ac.uk/~vgg/data/pets/',
data_dir='gs://tensorflow-datasets/datasets/oxford_iiit_pet/3.2.0',
file_format=tfrecord,
download_size=773.52 MiB,
dataset_size=774.69 MiB,
features=FeaturesDict({
'file_name': Text(shape=(), dtype=string),
'image': Image(shape=(None, None, 3), dtype=uint8),
'label': ClassLabel(shape=(), dtype=int64, num_classes=37),
'segmentation_mask': Image(shape=(None, None, 1), dtype=uint8),
'species': ClassLabel(shape=(), dtype=int64, num_classes=2),
}),
supervised_keys=('image', 'label'),
disable_shuffling=False,
splits={
'test': <SplitInfo num_examples=3669, num_shards=4>,
'train': <SplitInfo num_examples=3680, num_shards=4>,
},
citation="""@InProceedings{parkhi12a,
author = "Parkhi, O. M. and Vedaldi, A. and Zisserman, A. and Jawahar, C.~V.",
title = "Cats and Dogs",
booktitle = "IEEE Conference on Computer Vision and Pattern Recognition",
year = "2012",
}""",
)
Helper function to encode the dataset as TFRecords
def process_record(record):
  keys_to_features = {
      'image/encoded': tfrecord_lib.convert_to_feature(
          tf.io.encode_jpeg(record['image']).numpy()),
      'image/height': tfrecord_lib.convert_to_feature(record['image'].shape[0]),
      'image/width': tfrecord_lib.convert_to_feature(record['image'].shape[1]),
      # Oxford-IIIT masks use labels {1, 2, 3}; shift them to {0, 1, 2} so
      # they line up with num_classes = 3 configured below.
      'image/segmentation/class/encoded': tfrecord_lib.convert_to_feature(
          tf.io.encode_png(record['segmentation_mask'] - 1).numpy())
  }
  example = tf.train.Example(
      features=tf.train.Features(feature=keys_to_features))
  return example
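As a quick sanity check (illustrative, not a tutorial step), one record can be serialized and parsed back with the matching feature spec:
# Hypothetical round-trip of a single record through the Example proto.
sample = next(iter(train_ds))
serialized = process_record(sample).SerializeToString()
parsed = tf.io.parse_single_example(serialized, {
    'image/encoded': tf.io.FixedLenFeature([], tf.string),
    'image/height': tf.io.FixedLenFeature([], tf.int64),
    'image/width': tf.io.FixedLenFeature([], tf.int64),
    'image/segmentation/class/encoded': tf.io.FixedLenFeature([], tf.string),
})
print(parsed['image/height'].numpy(), parsed['image/width'].numpy())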
Write TFRecords to a folder
output_dir = './oxford_iiit_pet_tfrecords/'
LOG_EVERY = 100

if not os.path.exists(output_dir):
  os.mkdir(output_dir)
def write_tfrecords(dataset, output_path, num_shards=1):
  # One TFRecordWriter per shard; records go round-robin across shards.
  writers = [
      tf.io.TFRecordWriter(
          output_path + '-%05d-of-%05d.tfrecord' % (i, num_shards))
      for i in range(num_shards)
  ]
  for idx, record in enumerate(dataset):
    if idx % LOG_EVERY == 0:
      print('On image %d' % idx)
    tf_example = process_record(record)
    writers[idx % num_shards].write(tf_example.SerializeToString())
  # Close the writers so buffered records are flushed to disk.
  for writer in writers:
    writer.close()
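Once the shards are written, a small helper like the following (illustrative, not part of the tutorial) can count the records back to verify nothing was dropped:
# Hypothetical verification helper: count records across a split's shards.
def count_records(pattern):
  files = tf.io.gfile.glob(pattern)
  return sum(1 for _ in tf.data.TFRecordDataset(files))
# e.g. count_records('./oxford_iiit_pet_tfrecords/train*') after the cells below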
Write training data as TFRecords
output_train_tfrecs = output_dir + 'train'
write_tfrecords(train_ds, output_train_tfrecs, num_shards=10)
On image 0
On image 100
On image 200
On image 300
On image 400
On image 500
On image 600
On image 700
On image 800
Corrupt JPEG data: 240 extraneous bytes before marker 0xd9
Corrupt JPEG data: premature end of data segment
On image 900
On image 1000
On image 1100
On image 1200
On image 1300
On image 1400
On image 1500
On image 1600
On image 1700
On image 1800
On image 1900
On image 2000
On image 2100
On image 2200
On image 2300
On image 2400
On image 2500
On image 2600
On image 2700
On image 2800
On image 2900
On image 3000
On image 3100
On image 3200
On image 3300
On image 3400
On image 3500
On image 3600
On image 3700
On image 3800
On image 3900
On image 4000
On image 4100
On image 4200
On image 4300
On image 4400
On image 4500
On image 4600
On image 4700
On image 4800
On image 4900
On image 5000
On image 5100
On image 5200
On image 5300
On image 5400
On image 5500
Write validation data as TFRecords
output_val_tfrecs = output_dir + 'val'
write_tfrecords(val_ds, output_val_tfrecs, num_shards=5)
On image 0
On image 100
On image 200
On image 300
On image 400
On image 500
On image 600
On image 700
On image 800
On image 900
On image 1000
On image 1100
Write test data as TFRecords
output_test_tfrecs = output_dir + 'test'
write_tfrecords(test_ds, output_test_tfrecs, num_shards=5)
On image 0
On image 100
On image 200
On image 300
On image 400
On image 500
On image 600
On image 700
Configure the DeepLabV3 Mobilenet model for custom dataset
train_data_tfrecords = './oxford_iiit_pet_tfrecords/train*'
val_data_tfrecords = './oxford_iiit_pet_tfrecords/val*'
test_data_tfrecords = './oxford_iiit_pet_tfrecords/test*'
trained_model = './trained_model/'
export_dir = './exported_model/'
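The later training and export steps write into these folders; if you prefer to create them up front, a short loop does it (a convenience sketch, not required by the tutorial):
# Optional: create the output folders explicitly.
for folder in (trained_model, export_dir):
  os.makedirs(folder, exist_ok=True)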
In Model Garden, the collections of parameters that define a model are called configs. Model Garden can create a config based on a known set of parameters via a factory.
Use the mnv2_deeplabv3_pascal experiment configuration, as defined by tfm.vision.configs.semantic_segmentation.mnv2_deeplabv3_pascal.
Please find all the registered experiments here.
The configuration defines an experiment to train a DeepLabV3 model with a MobileNetV2 backbone and an ASPP decoder.
There are also other alternative experiments available, such as:
- seg_deeplabv3_pascal
- seg_deeplabv3plus_pascal
- seg_resnetfpn_pascal
- mnv2_deeplabv3plus_cityscapes
and more. One can switch to them by changing the experiment name argument to the get_exp_config function, as shown after the next cell.
exp_config = tfm.core.exp_factory.get_exp_config('mnv2_deeplabv3_pascal')
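For example, switching to the DeepLabV3+ variant would be a one-line change (shown for illustration only; the rest of this tutorial keeps mnv2_deeplabv3_pascal):
# Illustrative alternative, not used below:
# exp_config = tfm.core.exp_factory.get_exp_config('seg_deeplabv3plus_pascal')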
model_ckpt_path = './model_ckpt/'
if not os.path.exists(model_ckpt_path):
  os.mkdir(model_ckpt_path)
!gsutil cp gs://tf_model_garden/cloud/vision-2.0/deeplab/deeplabv3_mobilenetv2_coco/best_ckpt-63.data-00000-of-00001 './model_ckpt/'
!gsutil cp gs://tf_model_garden/cloud/vision-2.0/deeplab/deeplabv3_mobilenetv2_coco/best_ckpt-63.index './model_ckpt/'
Copying gs://tf_model_garden/cloud/vision-2.0/deeplab/deeplabv3_mobilenetv2_coco/best_ckpt-63.data-00000-of-00001...
Operation completed over 1 objects/28.2 MiB.
Copying gs://tf_model_garden/cloud/vision-2.0/deeplab/deeplabv3_mobilenetv2_coco/best_ckpt-63.index...
Operation completed over 1 objects/12.5 KiB.
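If you want to peek inside the downloaded checkpoint, tf.train.load_checkpoint can list its variables. This is an optional inspection, not a tutorial step:
# Optional: inspect a few variable names and shapes in the checkpoint.
reader = tf.train.load_checkpoint(model_ckpt_path + 'best_ckpt-63')
shape_map = reader.get_variable_to_shape_map()
for name in sorted(shape_map)[:5]:
  print(name, shape_map[name])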
Adjust the model and dataset configurations so that they work with the custom dataset.
num_classes = 3
WIDTH, HEIGHT = 128, 128
input_size = [HEIGHT, WIDTH, 3]
BATCH_SIZE = 16
# Backbone Config
exp_config.task.init_checkpoint = model_ckpt_path + 'best_ckpt-63'
exp_config.task.freeze_backbone = True
# Model Config
exp_config.task.model.num_classes = num_classes
exp_config.task.model.input_size = input_size
# Training Data Config
exp_config.task.train_data.aug_scale_min = 1.0
exp_config.task.train_data.aug_scale_max = 1.0
exp_config.task.train_data.input_path = train_data_tfrecords
exp_config.task.train_data.global_batch_size = BATCH_SIZE
exp_config.task.train_data.dtype = 'float32'
exp_config.task.train_data.output_size = [HEIGHT, WIDTH]
exp_config.task.train_data.preserve_aspect_ratio = False
exp_config.task.train_data.seed = 21 # Reproducible training data
# Validation Data Config
exp_config.task.validation_data.input_path = val_data_tfrecords
exp_config.task.validation_data.global_batch_size = BATCH_SIZE
exp_config.task.validation_data.dtype = 'float32'
exp_config.task.validation_data.output_size = [HEIGHT, WIDTH]
exp_config.task.validation_data.preserve_aspect_ratio = False
exp_config.task.validation_data.groundtruth_padded_size = [HEIGHT, WIDTH]
exp_config.task.validation_data.seed = 21 # Reproducible validation data
exp_config.task.validation_data.resize_eval_groundtruth = True # To enable validation loss
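A typo in an input path would only surface at training time, so it can help to confirm that each glob pattern matches the shards written earlier (an optional check, not part of the tutorial):
# Optional: each pattern should match the shard files written above.
for pattern in (train_data_tfrecords, val_data_tfrecords, test_data_tfrecords):
  print(pattern, '->', len(tf.io.gfile.glob(pattern)), 'files')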
Adjust the trainer configuration.
logical_device_names = [logical_device.name
                        for logical_device in tf.config.list_logical_devices()]

if 'GPU' in ''.join(logical_device_names):
  print('This may be broken in Colab.')
  device = 'GPU'
elif 'TPU' in ''.join(logical_device_names):
  print('This may be broken in Colab.')
  device = 'TPU'
else:
  print('Running on CPU is slow, so only train for a few steps.')
  device = 'CPU'
train_steps = 2000
exp_config.trainer.steps_per_loop = int(train_ds.__len__().numpy() // BATCH_SIZE)  # steps_per_loop = num_of_training_examples // train_batch_size

exp_config.trainer.summary_interval = exp_config.trainer.steps_per_loop
exp_config.trainer.checkpoint_interval = exp_config.trainer.steps_per_loop
exp_config.trainer.validation_interval = exp_config.trainer.steps_per_loop
exp_config.trainer.validation_steps = int(val_ds.__len__().numpy() // BATCH_SIZE)  # validation_steps = num_of_validation_examples // eval_batch_size
exp_config.trainer.train_steps = train_steps
exp_config.trainer.optimizer_config.warmup.linear.warmup_steps = exp_config.trainer.steps_per_loop
exp_config.trainer.optimizer_config.learning_rate.type = 'cosine'
exp_config.trainer.optimizer_config.learning_rate.cosine.decay_steps = train_steps
exp_config.trainer.optimizer_config.learning_rate.cosine.initial_learning_rate = 0.1
exp_config.trainer.optimizer_config.warmup.linear.warmup_learning_rate = 0.05
This may be broken in Colab.
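The trainer settings above give a linear warmup over the first loop followed by a cosine decay to zero at train_steps. The following plain-Python sketch approximates that shape for intuition; it is not Model Garden's exact implementation:
import math

def lr_at(step, warmup_steps=344, warmup_lr=0.05, base_lr=0.1, decay_steps=2000):
  # Cosine decay from base_lr to 0 over decay_steps (alpha=0 in the config).
  def cosine(s):
    return base_lr * 0.5 * (1 + math.cos(math.pi * min(s, decay_steps) / decay_steps))
  if step < warmup_steps:
    # Linear ramp from warmup_lr up to the cosine value at warmup_steps.
    return warmup_lr + (cosine(warmup_steps) - warmup_lr) * step / warmup_steps
  return cosine(step)

print([round(lr_at(s), 4) for s in (0, 100, 344, 1000, 2000)])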
Print the modified configuration.
pp.pprint(exp_config.as_dict())
display.Javascript('google.colab.output.setIframeHeight("500px");')
{ 'runtime': { 'all_reduce_alg': None,
'batchnorm_spatial_persistent': False,
'dataset_num_private_threads': None,
'default_shard_dim': -1,
'distribution_strategy': 'mirrored',
'enable_xla': False,
'gpu_thread_mode': None,
'loss_scale': None,
'mixed_precision_dtype': None,
'num_cores_per_replica': 1,
'num_gpus': 0,
'num_packs': 1,
'per_gpu_thread_count': 0,
'run_eagerly': False,
'task_index': -1,
'tpu': None,
'tpu_enable_xla_dynamic_padder': None,
'use_tpu_mp_strategy': False,
'worker_hosts': None},
'task': { 'allow_image_summary': True,
'differential_privacy_config': None,
'eval_input_partition_dims': [],
'evaluation': { 'report_per_class_iou': True,
'report_train_mean_iou': True},
'export_config': {'rescale_output': False},
'freeze_backbone': True,
'init_checkpoint': './model_ckpt/best_ckpt-63',
'init_checkpoint_modules': ['backbone', 'decoder'],
'losses': { 'class_weights': [],
'gt_is_matting_map': False,
'ignore_label': 255,
'l2_weight_decay': 4e-05,
'label_smoothing': 0.0,
'loss_weight': 1.0,
'mask_scoring_weight': 1.0,
'top_k_percent_pixels': 1.0,
'use_binary_cross_entropy': False,
'use_groundtruth_dimension': True},
'model': { 'backbone': { 'mobilenet': { 'filter_size_scale': 1.0,
'model_id': 'MobileNetV2',
'output_intermediate_endpoints': False,
'output_stride': 16,
'stochastic_depth_drop_rate': 0.0},
'type': 'mobilenet'},
'decoder': { 'aspp': { 'dilation_rates': [],
'dropout_rate': 0.0,
'level': 4,
'num_filters': 256,
'output_tensor': False,
'pool_kernel_size': [],
'spp_layer_version': 'v1',
'use_depthwise_convolution': False},
'type': 'aspp'},
'head': { 'decoder_max_level': None,
'decoder_min_level': None,
'feature_fusion': None,
'level': 4,
'logit_activation': None,
'low_level': 2,
'low_level_num_filters': 48,
'num_convs': 0,
'num_filters': 256,
'prediction_kernel_size': 1,
'upsample_factor': 1,
'use_depthwise_convolution': False},
'input_size': [128, 128, 3],
'mask_scoring_head': None,
'max_level': 6,
'min_level': 3,
'norm_activation': { 'activation': 'relu',
'norm_epsilon': 0.001,
'norm_momentum': 0.99,
'use_sync_bn': True},
'num_classes': 3},
'name': None,
'train_data': { 'additional_dense_features': [],
'apply_tf_data_service_before_batching': False,
'aug_policy': None,
'aug_rand_hflip': True,
'aug_scale_max': 1.0,
'aug_scale_min': 1.0,
'autotune_algorithm': None,
'block_length': 1,
'cache': False,
'crop_size': [],
'cycle_length': 10,
'decoder': { 'simple_decoder': { 'attribute_names': [ ],
'mask_binarize_threshold': None,
'regenerate_source_id': False},
'type': 'simple_decoder'},
'deterministic': None,
'drop_remainder': True,
'dtype': 'float32',
'enable_shared_tf_data_service_between_parallel_trainers': False,
'enable_tf_data_service': False,
'file_type': 'tfrecord',
'global_batch_size': 16,
'groundtruth_padded_size': [],
'image_feature': { 'feature_name': 'image/encoded',
'mean': ( 123.675,
116.28,
103.53),
'num_channels': 3,
'stddev': ( 58.395,
57.120000000000005,
57.375)},
'input_path': './oxford_iiit_pet_tfrecords/train*',
'is_training': True,
'output_size': [128, 128],
'prefetch_buffer_size': None,
'preserve_aspect_ratio': False,
'resize_eval_groundtruth': True,
'seed': 21,
'sharding': True,
'shuffle_buffer_size': 1000,
'tf_data_service_address': None,
'tf_data_service_job_name': None,
'tfds_as_supervised': False,
'tfds_data_dir': '',
'tfds_name': '',
'tfds_skip_decoding_feature': '',
'tfds_split': '',
'trainer_id': None,
'weights': None},
'train_input_partition_dims': [],
'validation_data': { 'additional_dense_features': [],
'apply_tf_data_service_before_batching': False,
'aug_policy': None,
'aug_rand_hflip': True,
'aug_scale_max': 1.0,
'aug_scale_min': 1.0,
'autotune_algorithm': None,
'block_length': 1,
'cache': False,
'crop_size': [],
'cycle_length': 10,
'decoder': { 'simple_decoder': { 'attribute_names': [ ],
'mask_binarize_threshold': None,
'regenerate_source_id': False},
'type': 'simple_decoder'},
'deterministic': None,
'drop_remainder': False,
'dtype': 'float32',
'enable_shared_tf_data_service_between_parallel_trainers': False,
'enable_tf_data_service': False,
'file_type': 'tfrecord',
'global_batch_size': 16,
'groundtruth_padded_size': [128, 128],
'image_feature': { 'feature_name': 'image/encoded',
'mean': ( 123.675,
116.28,
103.53),
'num_channels': 3,
'stddev': ( 58.395,
57.120000000000005,
57.375)},
'input_path': './oxford_iiit_pet_tfrecords/val*',
'is_training': False,
'output_size': [128, 128],
'prefetch_buffer_size': None,
'preserve_aspect_ratio': False,
'resize_eval_groundtruth': True,
'seed': 21,
'sharding': True,
'shuffle_buffer_size': 1000,
'tf_data_service_address': None,
'tf_data_service_job_name': None,
'tfds_as_supervised': False,
'tfds_data_dir': '',
'tfds_name': '',
'tfds_skip_decoding_feature': '',
'tfds_split': '',
'trainer_id': None,
'weights': None} },
'trainer': { 'allow_tpu_summary': False,
'best_checkpoint_eval_metric': 'mean_iou',
'best_checkpoint_export_subdir': 'best_ckpt',
'best_checkpoint_metric_comp': 'higher',
'checkpoint_interval': 344,
'continuous_eval_timeout': 3600,
'eval_tf_function': True,
'eval_tf_while_loop': False,
'loss_upper_bound': 1000000.0,
'max_to_keep': 5,
'optimizer_config': { 'ema': None,
'learning_rate': { 'cosine': { 'alpha': 0.0,
'decay_steps': 2000,
'initial_learning_rate': 0.1,
'name': 'CosineDecay',
'offset': 0},
'type': 'cosine'},
'optimizer': { 'sgd': { 'clipnorm': None,
'clipvalue': None,
'decay': 0.0,
'global_clipnorm': None,
'momentum': 0.9,
'name': 'SGD',
'nesterov': False},
'type': 'sgd'},
'warmup': { 'linear': { 'name': 'linear',
'warmup_learning_rate': 0.05,
'warmup_steps': 344},
'type': 'linear'} },
'preemption_on_demand_checkpoint': True,
'recovery_begin_steps': 0,
'recovery_max_trials': 0,
'steps_per_loop': 344,
'summary_interval': 344,
'train_steps': 2000,
'train_tf_function': True,
'train_tf_while_loop': True,
'validation_interval': 344,
'validation_steps': 68,
'validation_summary_subdir': 'validation'} }
<IPython.core.display.Javascript object>
Set up the distribution strategy.
# Setting up the Strategy
if exp_config.runtime.mixed_precision_dtype == tf.float16:
tf.keras.mixed_precision.set_global_policy('mixed_float16')
if 'GPU' in ''.join(logical_device_names):
distribution_strategy = tf.distribute.MirroredStrategy()
elif 'TPU' in ''.join(logical_device_names):
tf.tpu.experimental.initialize_tpu_system()
tpu = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='/device:TPU_SYSTEM:0')
distribution_strategy = tf.distribute.experimental.TPUStrategy(tpu)
else:
print('Warning: this will be really slow.')
distribution_strategy = tf.distribute.OneDeviceStrategy(logical_device_names[0])
print("Done")
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1', '/job:localhost/replica:0/task:0/device:GPU:2', '/job:localhost/replica:0/task:0/device:GPU:3')
Done
Create the Task object (tfm.core.base_task.Task) from the config_definitions.TaskConfig.
The Task object has all the methods necessary for building the dataset, building the model, and running training and evaluation. These methods are driven by tfm.core.train_lib.run_experiment.
model_dir = './trained_model/'

with distribution_strategy.scope():
  task = tfm.core.task_factory.get_task(exp_config.task, logging_dir=model_dir)
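With the task created, training and evaluation are driven by tfm.core.train_lib.run_experiment. A sketch of that call using the objects defined above (the actual training run follows in the next part of the tutorial):
# Sketch: run training and evaluation with the Model Garden train library.
model, eval_logs = tfm.core.train_lib.run_experiment(
    distribution_strategy=distribution_strategy,
    task=task,
    mode='train_and_eval',
    params=exp_config,
    model_dir=model_dir,
    run_post_eval=True)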
Originally published on the TensorFlow website.