0 Introduction

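The steps below are a summary based on the procedure described by 李斯琦:

Notes:
(1) I did not copy this project's files directly into the SSD (weiliu89/caffe) directory. Instead, the project sits in a sibling directory next to the SSD checkout, which keeps things clearer. Paths must be adjusted accordingly.
(2) All code and documentation are on GitHub: ZhangXinNan/SSD_scene_text_detection
(3) My directory layout is as follows: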

github
    caffe_ssd
    SSD_scene_text_detection

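1 SceneText data overview

After downloading and extracting the data there are two folders, test-textloc-gt and train-textloc. Place them under xx/SSD_scene_text_detection/data/scenetext. The resulting directory layout is: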

SSD_scene_text_detection
    data
        scenetext
            test-textloc-gt
            train-textloc

Both folders contain files named xxx.jpg and gt_xxx.txt, where xxx is a number and gt_xxx.txt holds the annotations for xxx.jpg. For example, less data/scenetext/train-textloc/gt_100.txt shows:

158,128,412,182,"Footpath"
442,128,501,170,"To"
393,198,488,240,"and"
63,200,363,242,"Colchester"
71,271,383,313,"Greenstead"

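Each line annotates one text region and has 5 comma-separated columns: the first 4 are the bounding-box coordinates and the last is the text content.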
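To make the format concrete, here is a minimal sketch of parsing one annotation line. The function name parse_gt_line is my own, and I am assuming the four numbers are in xmin, ymin, xmax, ymax order, which matches the sample values above but is not stated in the data itself.

```python
def parse_gt_line(line):
    """Parse one line of a gt_xxx.txt file into (xmin, ymin, xmax, ymax, text).

    Assumption: the four leading integers are xmin, ymin, xmax, ymax.
    """
    # Split on the first four commas only, so commas inside the quoted
    # text in the fifth column are preserved.
    parts = line.strip().split(",", 4)
    xmin, ymin, xmax, ymax = (int(p) for p in parts[:4])
    text = parts[4].strip()
    # Drop exactly one pair of surrounding double quotes, if present.
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        text = text[1:-1]
    return xmin, ymin, xmax, ymax, text
```

Calling it on the first sample line, parse_gt_line('158,128,412,182,"Footpath"'), gives the four integers followed by the string Footpath.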

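2 Generating the files needed for the LMDB conversion

2.1 Generating trainval.txt and test.txt

2.1.1 Generating the annotation XML files

First convert the provided gt_**.txt label files to Pascal VOC XML format.
See create_xml.py.
The script writes each XML file into the same directory as the corresponding jpg and txt files. For example: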

data/scenetext/test-textloc-gt/156.jpg
data/scenetext/test-textloc-gt/156.xml
data/scenetext/test-textloc-gt/gt_156.txt
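For readers who want to see what such a conversion involves, the following is a rough sketch of building a Pascal VOC style annotation with the standard library. It is not the code in create_xml.py: the function name build_voc_xml, the single class name "text", and the fixed depth of 3 are assumptions of mine, and only the fields commonly read from VOC annotations are written.

```python
import xml.etree.ElementTree as ET


def build_voc_xml(filename, width, height, boxes, class_name="text"):
    """Return a Pascal VOC style XML string for one image.

    boxes is an iterable of (xmin, ymin, xmax, ymax) tuples in pixels.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"  # assumption: 3-channel images

    for box in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = class_name
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bndbox, tag).text = str(value)

    return ET.tostring(root, encoding="unicode")
```

The image width and height have to be read from the jpg itself, since the gt files do not contain them.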

2.1.2 Generating the file lists

See create_train_test_file.py.
The generated trainval.txt and test.txt are written directly into the data/scenetext directory:

SSD_scene_text_detection
    data
        scenetext
            test-textloc-gt
            train-textloc
            test.txt
            trainval.txt
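As an illustration of what a list file can contain, here is a small sketch that pairs each image with its XML annotation. I am assuming the usual SSD convention of one "image_path annotation_path" pair per line, with paths relative to the data root. The function name make_list_lines is hypothetical and the real create_train_test_file.py may build its paths differently.

```python
import os


def make_list_lines(subdir, filenames):
    """Return 'image annotation' lines for every xxx.jpg that has an xxx.xml.

    subdir is the folder name used as the path prefix (assumed relative to
    the data root); filenames is the listing of that folder.
    """
    names = set(filenames)
    lines = []
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".jpg":
            continue
        xml_name = stem + ".xml"
        # Skip images that have no annotation file next to them.
        if xml_name in names:
            lines.append("{0}/{1} {0}/{2}".format(subdir, name, xml_name))
    return lines
```

A typical call would be make_list_lines("train-textloc", os.listdir("data/scenetext/train-textloc")), with the returned lines written to trainval.txt.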

2.2 Generating test_name_size.txt

See test_name_size.py, which produces train_name_size.txt and test_name_size.txt:

SSD_scene_text_detection
    data
        scenetext
            test-textloc-gt
            train-textloc
            test.txt
            trainval.txt
            test_name_size.txt
            train_name_size.txt
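The name-size file is simple enough to sketch. I am assuming the layout used by SSD's own VOC tooling, one "name height width" line per image with the name given without its extension. The helper names below are mine, and reading the size with Pillow is an assumption rather than what test_name_size.py necessarily does.

```python
import os


def name_size_line(image_path, height, width):
    """Format one line as 'name height width' (name without extension)."""
    stem = os.path.splitext(os.path.basename(image_path))[0]
    return "{} {} {}".format(stem, height, width)


def image_size(image_path):
    """Return (height, width) of an image. Assumes Pillow is installed."""
    from PIL import Image  # imported lazily so the formatter works without it

    with Image.open(image_path) as img:
        width, height = img.size  # Pillow reports (width, height)
    return height, width
```

Note the order: Pillow returns width first, while the line is written height first, so the two are swapped deliberately.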

2.3 Generating labelmap_voc_scenetext.prototxt

Create the file labelmap_voc_scenetext.prototxt with the following content:

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "text"
  label: 1
  display_name: "text"
}
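When checking the label map by hand, it can help to load it into a plain dictionary. The sketch below does this with a regular expression and no Caffe dependency. It assumes the fields appear in the order name, label, display_name as in the file above, and read_labelmap is a name I made up. Scripts in the SSD codebase would more likely parse the file through Caffe's protobuf classes.

```python
import re

# Matches one item block with fields in the order name, label, display_name.
ITEM_RE = re.compile(
    r'item\s*\{\s*name:\s*"([^"]*)"\s*label:\s*(\d+)\s*'
    r'display_name:\s*"([^"]*)"\s*\}'
)


def read_labelmap(text):
    """Return {label_id: display_name} from labelmap prototxt text."""
    return {int(label): display for _name, label, display in ITEM_RE.findall(text)}
```

For the two-item file above this yields a mapping from 0 to background and from 1 to text.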

3 Generating the LMDB

sh create_lmdbdata_scenetext.sh

Notes:
1. Set caffe_root to the directory where the Caffe version of SSD was built.
For example: caffe_root=/data/zhangxin/github/caffe_ssd
2. Point PYTHONPATH at the caffe/python directory.
For example: export PYTHONPATH=/data/dmcvcache/zhangxin/caffe_ssd/python
3. If you hit ImportError: No module named skimage.io,
install the package: pip install scikit-image
4. The downloaded VGG_ILSVRC_16_layers_fc_reduced.caffemodel file must be placed under caffe_ssd/models/VGGNet/.
5. Error:

F0628 14:18:57.290895 33280 relu_layer.cu:26] Check failed: error == cudaSuccess (11 vs. 0)  invalid argument
*** Check failure stack trace: ***
@     0x7fdf009ffe6d  (unknown)
@     0x7fdf00a01ced  (unknown)
@     0x7fdf009ffa5c  (unknown)
@     0x7fdf00a0263e  (unknown)
F0628 14:18:57.298449 33281 relu_layer.cu:26] Check failed: error == cudaSuccess (11 vs. 0)  invalid argument
*** Check failure stack trace: ***
@     0x7fdf0a9b0ea0  caffe::ReLULayer<>::Forward_gpu()
@     0x7fdf009ffe6d  (unknown)
@     0x7fdf0a796435  caffe::Net<>::ForwardFromTo()
@     0x7fdf00a01ced  (unknown)
@     0x7fdf0a7967a7  caffe::Net<>::Forward()
@     0x7fdf009ffa5c  (unknown)
F0628 14:18:57.298449 33281 relu_layer.cu:26] Check failed: error == cudaSuccess (11 vs. 0)  invalid argument
F0628 14:18:57.304498 33225 relu_layer.cu:26] Check failed: error == cudaSuccess (11 vs. 0)  invalid argument
*** Check failure stack trace: ***
@     0x7fdf00a0263e  (unknown)
@     0x7fdf0a907ba8  caffe::Solver<>::Step()
@     0x7fdf009ffe6d  (unknown)
@     0x7fdf0a90da66  caffe::P2PSync<>::InternalThreadEntry()
@     0x7fdf00a01ced  (unknown)
@     0x7fdf0a9b0ea0  caffe::ReLULayer<>::Forward_gpu()
@     0x7fdf0a8feea0  caffe::InternalThread::entry()
@     0x7fdf009ffa5c  (unknown)
@     0x7fdf0a796435  caffe::Net<>::ForwardFromTo()
@     0x7fdefed0224a  (unknown)
@     0x7fdf00a0263e  (unknown)
@     0x7fdefb853df3  start_thread
@     0x7fdf0a7967a7  caffe::Net<>::Forward()
@     0x7fdf0a9b0ea0  caffe::ReLULayer<>::Forward_gpu()
@     0x7fdefb5813dd  __clone
^C[1]   Done                    nohup python ssd_icdar_scenetext.py

Solution:

I added the following lines to commands using multiple “arch” flags in Nvidia's NVCC compiler, and the error does not occur anymore.
-gencode=arch=compute_20,code=\"sm_20,compute_20\"
-gencode=arch=compute_30,code=\"sm_30,compute_30\"
-gencode=arch=compute_35,code=\"sm_35,compute_35\"
-gencode=arch=compute_50,code=\"sm_50,compute_50\" 

4 Training

python ssd_icdar_scenetext.py

5 Detection

Use ssd_detect.py, which is rewritten from examples/ssd_detect.ipynb in the caffe_ssd directory.
Usage:

python ssd_detect.py \
--gpu_id 0 \
--labelmap_file data/scenetext/labelmap_voc_scenetext.prototxt \
--model_def models/VGGNet/scenetext/SSD_300x300/deploy.prototxt \
--image_resize 300 \
--model_weights models/VGGNet/scenetext/SSD_300x300/VGG_scenetext_SSD_300x300_iter_50000.caffemodel \
--image_file data/scenetext/test-textloc-gt/101.jpg
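For context on what happens to the network output, here is a rough sketch of the post-processing step, written without Caffe so it can be read on its own. It assumes each detection row has the layout [image_id, label, confidence, xmin, ymin, xmax, ymax] with coordinates normalized to the 0..1 range, which is how the SSD example notebook reads the detection output. The function name filter_detections and the default threshold of 0.6 are my own choices, not taken from ssd_detect.py.

```python
def filter_detections(rows, image_width, image_height, conf_threshold=0.6):
    """Keep confident detections and convert their boxes to pixel coordinates.

    rows: iterable of [image_id, label, confidence, xmin, ymin, xmax, ymax]
    with box coordinates normalized to [0, 1] (assumed layout).
    Returns a list of (label, confidence, xmin, ymin, xmax, ymax) in pixels.
    """
    results = []
    for _image_id, label, conf, xmin, ymin, xmax, ymax in rows:
        if conf < conf_threshold:
            continue  # drop low-confidence boxes
        results.append((
            int(label),
            conf,
            int(round(xmin * image_width)),
            int(round(ymin * image_height)),
            int(round(xmax * image_width)),
            int(round(ymax * image_height)),
        ))
    return results
```

The integer label in each result can then be mapped to a display name through the label map from section 2.3.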