行业报告 AI展会 数据标注 标注供求
数据标注数据集
主页 > 数据挖掘 > 正文

TensorFlow object_detect 修改配置文件进行数据增强处

数据增强处理可以用于图像数据集不够充分的情况下对各个类别图像进行数据扩充,以此弥补因图像数据量不足以及图像单一所造成训练后的模型泛化能力不强等问题。

目前图像数据增强常用的方法有:图像旋转、翻转(水平、垂直镜像处理)、模糊、光照调整、加噪声、图像平移、图像缩放以及图像拼接等方法。大多数是先将原有的图像通过写好的程序进行数据增强并同时保存至本地,然后将原图像和数据增强后的图像整合输入模型中训练(其实我也是这样的方法处理,总结下:效率较慢;原图像大小如果较大,数据增强后数据量大小较大,耗时耗空间)。

最近在用TensorFlow object_detect API做目标检测时发现可以通过在配置文件中增添数据增强方法,该配置文件中数据增强模块会调用preprocessor.py程序中的一系列数据增强的方法,因此在将训练数据输入模型中进行训练时,TensorFlow object_detect通过设置好的数据增强方法对输入的数据进行随机增强处理(效率较快;不用保存图像至本地,省时省空间;)

通过实际配置文件为例,如:object_detection/samples/configs/ssd_mobilenet_v1_300x300_coco14_sync.config

配置文件中

data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}

 

 

 

上面两个增强方法分别为随机水平翻转和随机裁剪,若需要做更多的增强处理方法可以按照配置文件中的格式添加,可以添加的数据增强方法可以查看preprocessor.py中的函数。

preprocessor.py地址可参考https://github.com/tensorflow/models/blob/master/research/object_detection/core/preprocessor.py

preprocessor.py中数据增强预处理方法映射可以如下所示,根据下列函数名按照上述格式在配置文件中增添便可对数据进行增强处理。

(以上是个人近期研究后的观点,若理解有误望提出修正)。

prep_func_arg_map = {
normalize_image: (fields.InputDataFields.image,),
random_horizontal_flip: (
fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
groundtruth_instance_masks,
groundtruth_keypoints,
),
random_vertical_flip: (
fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
groundtruth_instance_masks,
groundtruth_keypoints,
),
random_rotation90: (
fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
groundtruth_instance_masks,
groundtruth_keypoints,
),
random_pixel_value_scale: (fields.InputDataFields.image,),
random_image_scale: (
fields.InputDataFields.image,
groundtruth_instance_masks,
),
random_rgb_to_gray: (fields.InputDataFields.image,),
random_adjust_brightness: (fields.InputDataFields.image,),
random_adjust_contrast: (fields.InputDataFields.image,),
random_adjust_hue: (fields.InputDataFields.image,),
random_adjust_saturation: (fields.InputDataFields.image,),
random_distort_color: (fields.InputDataFields.image,),
random_jitter_boxes: (fields.InputDataFields.groundtruth_boxes,),
random_crop_image: (fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
groundtruth_label_weights,
groundtruth_label_confidences,
multiclass_scores,
groundtruth_instance_masks, groundtruth_keypoints),
random_pad_image: (fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes),
random_crop_pad_image: (fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
groundtruth_label_weights,
groundtruth_label_confidences,
multiclass_scores),
random_crop_to_aspect_ratio: (
fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
groundtruth_label_weights,
groundtruth_label_confidences,
multiclass_scores,
groundtruth_instance_masks,
groundtruth_keypoints,
),
random_pad_to_aspect_ratio: (
fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
groundtruth_instance_masks,
groundtruth_keypoints,
),
random_black_patches: (fields.InputDataFields.image,),
retain_boxes_above_threshold: (
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
groundtruth_label_weights,
groundtruth_label_confidences,
multiclass_scores,
groundtruth_instance_masks,
groundtruth_keypoints,
),
image_to_float: (fields.InputDataFields.image,),
random_resize_method: (fields.InputDataFields.image,),
resize_to_range: (
fields.InputDataFields.image,
groundtruth_instance_masks,
),
resize_to_min_dimension: (
fields.InputDataFields.image,
groundtruth_instance_masks,
),
scale_boxes_to_pixel_coordinates: (
fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
groundtruth_keypoints,
),
resize_image: (
fields.InputDataFields.image,
groundtruth_instance_masks,
),
subtract_channel_mean: (fields.InputDataFields.image,),
one_hot_encoding: (fields.InputDataFields.groundtruth_image_classes,),
rgb_to_gray: (fields.InputDataFields.image,),
ssd_random_crop: (fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
groundtruth_label_weights,
groundtruth_label_confidences,
multiclass_scores,
groundtruth_instance_masks,
groundtruth_keypoints),
ssd_random_crop_pad: (fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
groundtruth_label_weights,
groundtruth_label_confidences,
multiclass_scores),
ssd_random_crop_fixed_aspect_ratio: (
fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
groundtruth_label_weights,
groundtruth_label_confidences,
multiclass_scores,
groundtruth_instance_masks,
groundtruth_keypoints),
ssd_random_crop_pad_fixed_aspect_ratio: (
fields.InputDataFields.image,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
groundtruth_label_weights,
groundtruth_label_confidences,
multiclass_scores,
groundtruth_instance_masks,
groundtruth_keypoints,
),
convert_class_logits_to_softmax: (multiclass_scores,),
}

微信公众号

声明:本站部分作品是由网友自主投稿和发布、编辑整理上传,对此类作品本站仅提供交流平台,转载的目的在于传递更多信息及用于网络分享,并不代表本站赞同其观点和对其真实性负责,不为其版权负责。如果您发现网站上有侵犯您的知识产权的作品,请与我们取得联系,我们会及时修改或删除。

网友评论:

发表评论
请自觉遵守互联网相关的政策法规,严禁发布色情、暴力、反动的言论。
评价:
表情:
用户名: 验证码:点击我更换图片
SEM推广服务

Copyright©2005-2026 Sykv.com 可思数据 版权所有    京ICP备14056871号

关于我们   免责声明   广告合作   版权声明   联系我们   原创投稿   网站地图  

可思数据 数据标注行业联盟

扫码入群
扫码关注

微信公众号

返回顶部