An experiment in replacing the YOLOv3 backbone with MobileNetV3, implemented in TF/Keras and inspired by qqwweee/keras-yolo3 and xiaochus/MobileNetV3.


Generate your own annotation file and class names file.
One row for one image;
Row format: image_file_path box1 box2 ... boxN;
Box format: x_min,y_min,x_max,y_max,class_id (no space).
For the VOC dataset, try running python voc_annotation.py
Here is an example:

        path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3
        path/to/img2.jpg 120,300,250,600,2
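
To make the row format concrete, here is a minimal sketch of a parser for one annotation row. The function name parse_annotation_line is hypothetical and is not part of this repository; it assumes image paths contain no spaces, since the row is split on whitespace.

```python
from typing import List, Tuple

# One box: x_min, y_min, x_max, y_max, class_id
Box = Tuple[int, int, int, int, int]


def parse_annotation_line(line: str) -> Tuple[str, List[Box]]:
    """Split one annotation row into the image path and its list of boxes."""
    parts = line.strip().split()
    if not parts:
        raise ValueError("empty annotation line")
    image_path, box_fields = parts[0], parts[1:]
    boxes: List[Box] = []
    for field in box_fields:
        values = field.split(",")
        if len(values) != 5:
            raise ValueError(f"expected 5 comma-separated values, got {field!r}")
        x_min, y_min, x_max, y_max, class_id = (int(v) for v in values)
        boxes.append((x_min, y_min, x_max, y_max, class_id))
    return image_path, boxes


path, boxes = parse_annotation_line(
    "path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3"
)
print(path, boxes)
```

A check like this can be run over the whole annotation file before training to catch rows with stray spaces inside a box.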

Modify train.py and start training.
python train.py

To train from scratch, set load_pretrained=False. If training was interrupted, set load_pretrained=True so that weights are loaded from weights_path, then restart training.
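
The resume logic can be sketched roughly as follows. This is an illustration, not the code in train.py: the helper maybe_load_weights is hypothetical, and it only assumes that the model object exposes a Keras-style load_weights(path) method.

```python
import os


def maybe_load_weights(model, load_pretrained: bool, weights_path: str) -> bool:
    """Load weights into `model` if requested; return True when weights were loaded."""
    if not load_pretrained:
        # Training from scratch: keep the freshly initialized weights.
        return False
    if not os.path.isfile(weights_path):
        raise FileNotFoundError(f"weights file not found: {weights_path}")
    # Resume from a previous run (or start from pretrained weights).
    model.load_weights(weights_path)
    return True
```

Failing early on a missing file avoids silently training from scratch when a resume was intended.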


Use --help to see usage of yolo_video.py:

usage: yolo_video.py [-h] [--model MODEL] [--anchors ANCHORS]
                     [--classes CLASSES] [--gpu_num GPU_NUM] [--image]
                     [--input] [--output]

positional arguments:
  --input            Video input path
  --output           Video output path

optional arguments:
  -h, --help         show this help message and exit
  --model MODEL      path to model weight file, default model_data/yolo.h5
  --anchors ANCHORS  path to anchor definitions, default
  --classes CLASSES  path to class definitions, default
  --gpu_num GPU_NUM  Number of GPU to use, default 1
  --image            Image detection mode, will ignore all positional arguments
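
For readers who want to see how these flags fit together, the following is a minimal argparse sketch that mirrors the usage text above. It is an approximation, not the actual source of yolo_video.py. The defaults for --anchors and --classes are not shown in the help text, so None is used here as a placeholder, and build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags listed in the usage text."""
    parser = argparse.ArgumentParser(prog="yolo_video.py")
    parser.add_argument("--model", type=str, default="model_data/yolo.h5",
                        help="path to model weight file")
    # Placeholder defaults: the real defaults are not shown in the usage text.
    parser.add_argument("--anchors", type=str, default=None,
                        help="path to anchor definitions")
    parser.add_argument("--classes", type=str, default=None,
                        help="path to class definitions")
    parser.add_argument("--gpu_num", type=int, default=1,
                        help="number of GPUs to use")
    parser.add_argument("--image", action="store_true",
                        help="image detection mode")
    # --input and --output take an optional value, matching "[--input] [--output]".
    parser.add_argument("--input", nargs="?", type=str, default=None,
                        help="video input path")
    parser.add_argument("--output", nargs="?", type=str, default=None,
                        help="video output path")
    return parser


args = build_parser().parse_args(["--input", "path/to/video.mp4", "--gpu_num", "2"])
print(args.input, args.gpu_num, args.image)
```

Under these assumptions, passing --image alone switches to image detection mode, and the video paths keep their placeholder defaults.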


