Text detection mainly based on the CTPN (connectionist text proposal network) model in TensorFlow; ID card detection

Overview

text-detection-ctpn

Scene text detection based on CTPN (connectionist text proposal network), implemented in TensorFlow. The original paper can be found here, and the original Caffe repo can be found here. For more detail about the paper and code, see this blog. If you have any questions, check the existing issues first; if the problem persists, open a new issue.


NOTICE: Thanks to banjin-xjy, banjin and I have reconstructed this repo. The old repo was based on Faster-RCNN and retained tons of useless code and dependencies, which made it hard to understand and maintain, so we reconstructed it. The old code is saved in branch master.


roadmap

  • reconstruct the repo
  • cython nms and bbox utils
  • loss function as referred in paper
  • oriented text connector
  • BLSTM

setup

nms and bbox utils are written in cython, hence you have to build the library first.

cd utils/bbox
chmod +x make.sh
./make.sh

It will generate nms.so and bbox.so in the current folder.
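For readers unfamiliar with what an nms module computes, here is a minimal pure-Python sketch of greedy non-maximum suppression. It illustrates the general technique only: the function name `nms_reference`, the `[x1, y1, x2, y2, score]` box layout and the default threshold are assumptions for illustration and are not taken from the repo's cython code, which may use a different area convention.

```python
def nms_reference(boxes, iou_threshold=0.5):
    """Greedy NMS over boxes given as [x1, y1, x2, y2, score].

    Returns the indices of the kept boxes, highest score first.
    """
    def area(b):
        return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

    def iou(a, b):
        # Intersection rectangle of the two boxes (may be empty).
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = area(a) + area(b) - inter
        return inter / union if union > 0 else 0.0

    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][4], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

A compiled version does the same work much faster, which is presumably why the utilities are built with cython here.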


demo

  • follow setup to build the library
  • download the ckpt file from google drive or baidu yun
  • put checkpoints_mlt/ in text-detection-ctpn/
  • put your images in data/demo and run the demo from the root; the results will be saved in data/res
python ./main/demo.py

training

prepare data

  • First, download the pre-trained VGG net model and put it in data/vgg_16.ckpt. You can download it from tensorflow/models.
  • Second, download the dataset we prepared from google drive or baidu yun. Put the downloaded data in data/dataset/mlt, then start the training.
  • Alternatively, you can prepare your own dataset with the following steps.
  • Modify DATA_FOLDER and OUTPUT in utils/prepare/split_label.py according to your dataset, then run split_label.py from the root:
python ./utils/prepare/split_label.py
  • It will generate the prepared data in data/dataset/.
  • A demo of the input file format for split_label.py can be found in gt_img_859.txt, and the corresponding output file is img_859.txt. A demo image of the prepared data is shown below.
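The CTPN paper represents a text line as a sequence of fixed-width (16 pixel) proposals, so preparing labels typically means cutting each line-level box into narrow vertical slices. The sketch below shows that idea for an axis-aligned box with inclusive integer coordinates. `split_box`, its argument layout and the alignment to a 16-pixel grid are assumptions for illustration; the actual split_label.py may differ, for example in how it handles polygons or image resizing.

```python
def split_box(x1, y1, x2, y2, stride=16):
    """Cut an axis-aligned box into slices aligned to a `stride`-pixel grid.

    Coordinates are inclusive integers; returns a list of (x1, y1, x2, y2).
    """
    slices = []
    # Start at the grid column that contains x1.
    start = (x1 // stride) * stride
    for left in range(start, x2 + 1, stride):
        right = left + stride - 1
        # Clip the first and last slice to the original box.
        slices.append((max(left, x1), y1, min(right, x2), y2))
    return slices


print(split_box(10, 5, 40, 20))
```

Each slice keeps the full height of the original box; only the horizontal extent is divided.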


train

Simply run

python ./main/train.py
  • The model provided in checkpoints_mlt was trained on a GTX 1070 for 50k iterations. It takes about 0.25s per iteration, so 50k iterations take about 3.5 hours.
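The time estimate follows directly from the two figures quoted above, and the same arithmetic can be reused for other iteration counts or hardware. The variable names below are only for illustration.

```python
iterations = 50_000
seconds_per_iter = 0.25  # figure quoted above for a GTX 1070

total_seconds = iterations * seconds_per_iter
total_hours = total_seconds / 3600
print(f"{total_seconds:.0f} s = {total_hours:.2f} h")
```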

some results

NOTICE: all the photos used below are collected from the internet. If it affects you, please contact me to delete them.


oriented text connector

  • The oriented text connector has been implemented. It works, but still needs further improvement.
  • The left figure is the result for DETECT_MODE H, the right figure for DETECT_MODE O.

Comments
  • How to export model for Tensorflow Serving?

    Following some tutorials on exporting models for TensorFlow Serving, I came up with the trial below:

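    import os
    import tensorflow as tf
    # cfg, cfg_from_file and get_network are imported from the repo (omitted in the original post)

    cfg_from_file('ctpn/text.yml')
    config = tf.ConfigProto(allow_soft_placement=True)
    with tf.Session(config=config) as sess:
            net = get_network("VGGnet_test")
    
            saver = tf.train.Saver()
            ckpt = tf.train.get_checkpoint_state(cfg.TEST.checkpoints_path)
            if ckpt is None or not ckpt.model_checkpoint_path:
                # get_checkpoint_state returns None when no checkpoint is found
                raise IOError('Missing pre-trained model in: {}'.format(cfg.TEST.checkpoints_path))
            saver.restore(sess, ckpt.model_checkpoint_path)
    
            # The main export trial is here
            #############################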
            export_path = os.path.join(
                tf.compat.as_bytes('/tmp/ctpn'),
                tf.compat.as_bytes(str(1)))
            builder = tf.saved_model.builder.SavedModelBuilder(export_path)
    
            freezing_graph = sess.graph
            prediction_signature = tf.saved_model.signature_def_utils.predict_signature_def(
                inputs={'input': freezing_graph.get_tensor_by_name('Placeholder:0')},
                outputs={'output': freezing_graph.get_tensor_by_name('Placeholder_1:0')}
            )
    
            builder.add_meta_graph_and_variables(
                sess,
                [tf.saved_model.tag_constants.SERVING],
                signature_def_map={
                    tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature
                },
                clear_devices=True)
    
            builder.save()
            print('[INFO] Export SavedModel into {}'.format(export_path))
            #############################
    

    The output for freezing_graph is:

    Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
    Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
    Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
    Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32)
    Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32)
    Tensor("rpn_cls_score/Reshape_1:0", shape=(?, ?, ?, 20), dtype=float32)
    Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
    Tensor("Reshape_2:0", shape=(?, ?, ?, 20), dtype=float32)
    Tensor("rpn_bbox_pred/Reshape_1:0", shape=(?, ?, ?, 40), dtype=float32)
    Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
    

    After I got the exported model in /tmp/ctpn/1, I tried to load it into the TensorFlow Serving server:

    tensorflow_model_server --port=9000 --model_name=ctpn --model_base_path=/tmp/ctpn
    

    But it raised an error:

    Loading servable: {name: detector version: 1} failed: Not found: Op type not registered 'PyFunc' in binary running on [...]. Make sure the Op and Kernel are registered in the binary running in this process.
    

    So there are 2 questions:

    • Am I right about the inputs (Placeholder:0) and the outputs (Placeholder_1:0) of prediction_signature?
    • Where am I missing PyFunc?
    opened by hiepph 20
  • Excuse me, do you have any tricks for training the model? Why does it detect nothing after I train on the MLT dataset with default parameters for 50000 iters?

    Excuse me, do you have any tricks for training the model? I trained on the MLT dataset with the default parameters, and after 50000 iters it detects nothing. Why?

    opened by cjt222 15
  • BiLSTM and Training Time

    Thanks for sharing your implementation with us. I have implemented CTPN with Caffe, which failed to converge when adding LSTM. First, I want to ask whether you have added the BiLSTM in your code or not. I am new to TensorFlow. After looking at the code, I think you implemented just the LSTM, not the BiLSTM. Is that right? Second, how long did you train your model? I ran the train script of your program on a GPU device, and it seems it would take 5-6 days to finish the first 180000 iterations.

    Thanks very much.

    opened by TaoDream 14
  • I tried to change the code from python2 to python3

    I tried to port the code from python2 to python3. After fixing all the errors and running the code, it raised the error below. I do not know where it comes from or how to fix it. What can I do? Thank you so much.

    2017-10-26 14:07:03.163176: W tensorflow/core/framework/op_kernel.cc:1158] Unknown: KeyError: b'TEST' Traceback (most recent call last): File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call return fn(*args) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn status, run_metadata) File "/usr/local/lib/python3.6/contextlib.py", line 88, in exit next(self.gen) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status pywrap_tensorflow.TF_GetCode(status)) tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]] [[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "ctpn/demo.py", line 95, in _, _ = test_ctpn(sess, net, im) File "/root/chengjuntao/text-detection-ctpn/lib/fast_rcnn/test.py", line 171, in test_ctpn rois = sess.run([net.get_output('rois')[0]],feed_dict=feed_dict) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run run_metadata_ptr) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run feed_dict_string, options, run_metadata) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run target_list, options, run_metadata) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]] [[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

    Caused by op 'rois/PyFunc', defined at: File "ctpn/demo.py", line 85, in net = get_network("VGGnet_test") File "/root/chengjuntao/text-detection-ctpn/lib/networks/factory.py", line 20, in get_network return VGGnet_test() File "/root/chengjuntao/text-detection-ctpn/lib/networks/VGGnet_test.py", line 14, in init self.setup() File "/root/chengjuntao/text-detection-ctpn/lib/networks/VGGnet_test.py", line 68, in setup .proposal_layer(_feat_stride, anchor_scales, 'TEST', name='rois')) File "/root/chengjuntao/text-detection-ctpn/lib/networks/network.py", line 28, in layer_decorated layer_output = op(self, layer_input, *args, **kwargs) File "/root/chengjuntao/text-detection-ctpn/lib/networks/network.py", line 241, in proposal_layer [tf.float32,tf.float32]) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 198, in py_func input=inp, token=token, Tout=Tout, name=name) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_script_ops.py", line 38, in _py_func name=name) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op op_def=op_def) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op original_op=self._default_original_op, op_def=op_def) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in init self._traceback = _extract_stack()

    UnknownError (see above for traceback): KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]] [[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

    opened by cjt222 14
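The `KeyError: b'TEST'` in the traceback above is consistent with a Python 2 to 3 difference: under Python 3, string arguments passed through `tf.py_func` arrive as `bytes`, so a dictionary keyed by `str` no longer matches. The sketch below reproduces that mismatch with a plain dict. `fake_cfg`, its contents and `lookup_mode` are hypothetical stand-ins, not the repo's config object, and whether this is the only cause in the report is an assumption.

```python
# Hypothetical stand-in for a config keyed by mode name.
fake_cfg = {"TRAIN": {"mode": "train"}, "TEST": {"mode": "test"}}


def lookup_mode(cfg, key):
    """Return the sub-config for `key`, accepting either str or bytes."""
    if isinstance(key, bytes):
        # Python 3: bytes and str never compare equal, so decode first.
        key = key.decode("utf-8")
    return cfg[key]


# A bytes key is not found directly, but works after decoding.
print(b"TEST" in fake_cfg)
print(lookup_mode(fake_cfg, b"TEST"))
```

Decoding at the point where the key enters the Python function keeps the rest of the code unchanged.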
  • ModuleNotFoundError: No module named 'lib'

    When I try to execute demo.py I receive this error:

    File "demo.py", line 9, in <module>
        from lib.networks.factory import get_network
    ModuleNotFoundError: No module named 'lib'

    Can someone please help me with this?

    opened by SuryaprakashNSM 12
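`No module named 'lib'` usually means the interpreter was started from a directory where the `lib` package is not importable. One common workaround, sketched below under the assumption that `lib/` sits in the repository root one level above the script's folder, is to put the repository root on `sys.path` before the import. Running the script from the repository root is the other option. `add_repo_root` is a hypothetical helper, not part of the repo.

```python
import os
import sys


def add_repo_root(script_path, levels_up=1):
    """Insert the directory `levels_up` above the script's folder into sys.path."""
    root = os.path.dirname(os.path.abspath(script_path))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Calling `add_repo_root(__file__)` at the top of the script, before `from lib.networks.factory import get_network`, would be the intended usage.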
  • AttributeError: 'NoneType' object has no attribute 'model_checkpoint_path'

    File "/home/hjq/PycharmProjects/Tensorflow-OCR/text-detection-ctpn-master/ctpn/demo.py", line 103, in <module>
        raise 'Check your pretrained {:s}'.format(ckpt.model_checkpoint_path)
    AttributeError: 'NoneType' object has no attribute 'model_checkpoint_path'

    opened by seawater668 11
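`tf.train.get_checkpoint_state` returns `None` when the given directory has no `checkpoint` index file, which is what makes `ckpt.model_checkpoint_path` fail in the report above. A small pre-flight check, sketched below with the standard library only, can give a clearer message before TensorFlow is involved. `find_checkpoint_index` is a hypothetical helper, and the directory to pass is whatever your config points at.

```python
import os


def find_checkpoint_index(ckpt_dir):
    """Return the path of the `checkpoint` index file in ckpt_dir, or None if absent."""
    index_path = os.path.join(ckpt_dir, "checkpoint")
    return index_path if os.path.isfile(index_path) else None
```

If this returns `None`, the checkpoint files were most likely unpacked into a different folder than the one the demo reads from.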
  • After CTPN. What is your idea?

    Hello. eragonruan! I was very impressed with your code. One more time. Thank you so much.

    I have a problem. When the image passes through CTPN, a green box is drawn. If I want to feed this image to an OCR engine, how should I separate the text regions (green boxes)? My idea is to use OpenCV, but the green box is drawn on top of the text, so it hides the text. Can I draw a thinner line to solve the problem?

    opened by rudebono 10
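One way to keep the drawn box from covering the text is to crop each detected region from the original, unannotated image and pass the crops to the OCR engine, instead of drawing first. The sketch below computes padded, clamped crop rectangles from axis-aligned boxes and slices them out of an image stored as a list of pixel rows. `crop_regions`, the `(x1, y1, x2, y2)` box layout with inclusive coordinates, and the padding value are assumptions for illustration. With OpenCV/NumPy images the same idea is array slicing, `img[top:bottom, left:right]`.

```python
def crop_regions(image, boxes, pad=2):
    """Crop padded box regions from `image` (a list of pixel rows).

    Boxes are (x1, y1, x2, y2) with inclusive coordinates; crops are
    clamped to the image bounds.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    crops = []
    for x1, y1, x2, y2 in boxes:
        left, top = max(0, x1 - pad), max(0, y1 - pad)
        right = min(width, x2 + pad + 1)
        bottom = min(height, y2 + pad + 1)
        crops.append([row[left:right] for row in image[top:bottom]])
    return crops
```

A few pixels of padding is a common choice because detectors tend to clip the edges of characters.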
  • win10 on cpu encounter KeyError: b'TEST' error

    @eragonruan My environment is win10 + cpu + tensorflow1.3. I have spent a lot of time on this and I'm confused. Please help me in your spare time. Thanks!

    Here is the error: `2018-07-26 11:11:58.723047: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. 2018-07-26 11:11:58.728604: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32) Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32) Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32) Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32) Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32) Tensor("rpn_cls_score/Reshape_1:0", shape=(?, ?, ?, 20), dtype=float32) Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32) Tensor("Reshape_2:0", shape=(?, ?, ?, 20), dtype=float32) Tensor("rpn_bbox_pred/Reshape_1:0", shape=(?, ?, ?, 40), dtype=float32) Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32) Loading network VGGnet_test... Restoring from checkpoints/VGGnet_fast_rcnn_iter_50000.ckpt... 
done 2018-07-26 11:12:09.006369: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\framework\op_kernel.cc:1192] Unknown: KeyError: b'TEST' Traceback (most recent call last): File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1327, in _do_call return fn(*args) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1306, in _run_fn status, run_metadata) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\contextlib.py", line 88, in exit next(self.gen) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status pywrap_tensorflow.TF_GetCode(status)) tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_2, rpn_bbox_pred/Reshape_1, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "./ctpn/demo.py", line 97, in _, _ = test_ctpn(sess, net, im) File "D:\workspace\text-detection-ctpn-master\lib\fast_rcnn\test.py", line 51, in test_ctpn rois = sess.run([net.get_output('rois')[0]],feed_dict=feed_dict) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 895, in run run_metadata_ptr) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1124, in _run feed_dict_tensor, options, run_metadata) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1321, in _do_run options, run_metadata) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1340, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_2, rpn_bbox_pred/Reshape_1, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]

    Caused by op 'rois/PyFunc', defined at: File "./ctpn/demo.py", line 82, in net = get_network("VGGnet_test") File "D:\workspace\text-detection-ctpn-master\lib\networks\factory.py", line 8, in get_network return VGGnet_test() File "D:\workspace\text-detection-ctpn-master\lib\networks\VGGnet_test.py", line 15, in init self.setup() File "D:\workspace\text-detection-ctpn-master\lib\networks\VGGnet_test.py", line 56, in setup .proposal_layer(_feat_stride, anchor_scales, 'TEST', name='rois')) File "D:\workspace\text-detection-ctpn-master\lib\networks\network.py", line 21, in layer_decorated layer_output = op(self, layer_input, *args, **kwargs) File "D:\workspace\text-detection-ctpn-master\lib\networks\network.py", line 215, in proposal_layer [tf.float32,tf.float32]) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\ops\script_ops.py", line 203, in py_func input=inp, token=token, Tout=Tout, name=name) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 36, in _py_func name=name) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op op_def=op_def) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\framework\ops.py", line 2630, in create_op original_op=self._default_original_op, op_def=op_def) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\framework\ops.py", line 1204, in init self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

    UnknownError (see above for traceback): KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_2, rpn_bbox_pred/Reshape_1, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]`

    opened by Crocodiles 9
  • UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

    When I tried to train on my own dataset, I ran into this core dump:

    Computing bounding-box regression targets...
    bbox target means:
    [[ 0. 0. 0. 0.]
     [ 0. 0. 0. 0.]]
    [ 0. 0. 0. 0.]
    bbox target stdevs:
    [[ 0.1 0.1 0.2 0.2]
     [ 0.1 0.1 0.2 0.2]]
    [ 0.1 0.1 0.2 0.2]
    Normalizing targets
    done
    Solving...
    /data/resys/var/python2.7.3/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
      "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
    Segmentation fault (core dumped)

    This fault is caused by the function train_model() in lib/fast_rcnn/train.py, when it runs train_op = opt.apply_gradients(list(zip(grads, tvars)), global_step=global_step).

    Have you ever faced this error?

    opened by louisly 8
  • WHERE ARE gt_img_1001.txt ... gt_img_6000.txt FILES?!

    I cloned the project, downloaded the pretrained model and the training data, and unzipped them, which gave me a folder named TEXTVOC containing other folders (Annotations, ImagesSets and JPEGImages). I placed TEXTVOC in the data folder and edited the paths in split_label.py as follows: path = '/home/hani/Desktop/text-detection-ctpn/data/TEXTVOC/JPEGImages' gt_path = '/home/hani/Desktop/text-detection-ctpn/data/TEXTVOC/Annotations'. But when I run it with python lib/prepare_training_data/split_label.py, I get this error:

    /home/hani/Desktop/text-detection-ctpn/data/TEXTVOC/JPEGImages/img_1001.jpg
    Traceback (most recent call last):
      File "lib/prepare_training_data/split_label.py", line 34, in <module>
        with open(gt_file, 'r') as f:
    FileNotFoundError: [Errno 2] No such file or directory: '/home/hani/Desktop/text-detection-ctpn/data/TEXTVOC/Annotations/gt_img_1001.txt'
    

    As shown, gt_img_1001.txt is missing, which means either the gt_path is wrong or those files do not exist. I looked into all folders and didn't find any file starting with gt_*. PS: I also ran this command: ln -s TEXTVOC VOCdevkit2007, so I didn't miss anything that should be done.

    opened by maky-hnou 7
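The error above shows the script deriving the label name `gt_img_1001.txt` from the image name `img_1001.jpg`. Before running the split step it can help to list which images lack a matching label file. This is a minimal sketch, assuming that `gt_<image stem>.txt` naming and flat image and label folders. `missing_labels` and its `image_exts` default are hypothetical, not part of the repo.

```python
import os


def missing_labels(image_dir, label_dir, image_exts=(".jpg", ".jpeg", ".png")):
    """Return image file names that have no gt_<stem>.txt in label_dir."""
    missing = []
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in image_exts:
            continue  # skip non-image files
        label_path = os.path.join(label_dir, "gt_" + stem + ".txt")
        if not os.path.isfile(label_path):
            missing.append(name)
    return missing
```

If every image is reported as missing, the label folder probably holds annotations in another format (or none at all) rather than `gt_*.txt` files.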
  • how to run the train scripts

    In split_label.py there are two paths, path = '/media/D/code/OCR/text-detection-ctpn/data/mlt_english+chinese/image' and gt_path = '/media/D/code/OCR/text-detection-ctpn/data/mlt_english+chinese/label'. The code reads the files under each of these two paths. What kind of files should be placed in ....../label? Thanks.

    opened by jibadallz 7
  • Bump tensorflow-gpu from 1.4.0 to 2.9.3

    Bumps tensorflow-gpu from 1.4.0 to 2.9.3.

    Release notes

    Sourced from tensorflow-gpu's releases.

    TensorFlow 2.9.3

    Release 2.9.3

    This release introduces several vulnerability fixes:

    TensorFlow 2.9.2

    Release 2.9.2

    This releases introduces several vulnerability fixes:

    ... (truncated)

    Changelog

    Sourced from tensorflow-gpu's changelog.

    Release 2.9.3

    This release introduces several vulnerability fixes:

    Release 2.8.4

    This release introduces several vulnerability fixes:

    ... (truncated)

    Commits
    • a5ed5f3 Merge pull request #58584 from tensorflow/vinila21-patch-2
    • 258f9a1 Update py_func.cc
    • cd27cfb Merge pull request #58580 from tensorflow-jenkins/version-numbers-2.9.3-24474
    • 3e75385 Update version numbers to 2.9.3
    • bc72c39 Merge pull request #58482 from tensorflow-jenkins/relnotes-2.9.3-25695
    • 3506c90 Update RELEASE.md
    • 8dcb48e Update RELEASE.md
    • 4f34ec8 Merge pull request #58576 from pak-laura/c2.99f03a9d3bafe902c1e6beb105b2f2417...
    • 6fc67e4 Replace CHECK with returning an InternalError on failing to create python tuple
    • 5dbe90a Merge pull request #58570 from tensorflow/r2.9-7b174a0f2e4
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • Bump numpy from 1.14.2 to 1.22.0

    Bumps numpy from 1.14.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Model Quantization

    The pretrained model is favorable as it detects text with high accuracy. However, inference on CPU takes a long time, which is not acceptable for production deployment. Is there a way to quantize the model?

    What I've tried is to convert the checkpoint model to saved_model format, then load the saved_model and perform quantization with the TFLite converter. The code snippet is as follows:

    # Load checkpoint and convert to saved_model
    import tensorflow as tf

    trained_checkpoint_prefix = "checkpoints_mlt/ctpn_50000.ckpt"
    export_dir = "exported_model"
    
    graph = tf.Graph()
    with tf.compat.v1.Session(graph=graph) as sess:
        # Restore from checkpoint
        loader = tf.compat.v1.train.import_meta_graph(trained_checkpoint_prefix + ".meta")
        loader.restore(sess, trained_checkpoint_prefix)
    
        # Export checkpoint to SavedModel while the session is still open
        builder = tf.compat.v1.saved_model.builder.SavedModelBuilder(export_dir)
        builder.add_meta_graph_and_variables(sess,
                                             [tf.saved_model.TRAINING, tf.saved_model.SERVING],
                                             strip_default_attrs=True)
        builder.save()
    

    As a result, I got a .pb file and a variables folder with checkpoint and index files inside. An error was then raised when I tried to perform quantization:

    converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8  # or tf.uint8
    converter.inference_output_type = tf.int8  # or tf.uint8
    tflite_quant_model = converter.convert()
    

    This is the error:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-4-03205673177f> in <module>
         11 converter.inference_input_type = tf.int8  # or tf.uint8
         12 converter.inference_output_type = tf.int8  # or tf.uint8
    ---> 13 tflite_quant_model = converter.convert()
    
    ~/virtualenvironment/tf2/lib/python3.6/site-packages/tensorflow/lite/python/lite.py in convert(self)
        450     # TODO(b/130297984): Add support for converting multiple function.
        451     if len(self._funcs) != 1:
    --> 452       raise ValueError("This converter can only convert a single "
        453                        "ConcreteFunction. Converting multiple functions is "
        454                        "under development.")
    
    ValueError: This converter can only convert a single ConcreteFunction. Converting multiple functions is under development.
    

    My understanding is that this error is raised because the model requires multiple inputs, input_image and input_im_info. I would appreciate any help.
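
One thing that may be worth checking, offered as an assumption and not verified against this checkpoint: the export above passes no signature_def_map, so the SavedModel may carry no serving signature at all, while the converter expects exactly one. A model with two inputs can still be described by a single signature. The sketch below illustrates this on a small stand-in graph. Only the input names input_image and input_im_info come from the issue; the variable, the arithmetic, the fixed shapes, and the output name cls_prob are hypothetical.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical stand-in for the restored CTPN graph. Only the two input names
# are taken from the issue; everything else here is a placeholder.
export_dir = os.path.join(tempfile.mkdtemp(), "exported_model")

graph = tf.Graph()
with graph.as_default(), tf.compat.v1.Session(graph=graph) as sess:
    input_image = tf.compat.v1.placeholder(
        tf.float32, [1, 32, 32, 3], name="input_image")
    input_im_info = tf.compat.v1.placeholder(
        tf.float32, [1, 3], name="input_im_info")
    bias = tf.compat.v1.get_variable(
        "bias", shape=[3], initializer=tf.compat.v1.zeros_initializer())
    cls_prob = tf.identity(
        tf.reduce_mean(input_image, axis=[1, 2]) + input_im_info + bias,
        name="cls_prob")
    sess.run(tf.compat.v1.global_variables_initializer())

    # A single signature that lists both inputs and the output.
    signature = tf.compat.v1.saved_model.signature_def_utils.predict_signature_def(
        inputs={"input_image": input_image, "input_im_info": input_im_info},
        outputs={"cls_prob": cls_prob},
    )
    builder = tf.compat.v1.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.SERVING],
        signature_def_map={"serving_default": signature},
        strip_default_attrs=True,
    )
    builder.save()

# Reload the SavedModel and list the signatures it actually contains.
with tf.compat.v1.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.compat.v1.saved_model.loader.load(
        sess, [tf.saved_model.SERVING], export_dir)

signature_keys = sorted(meta_graph.signature_def.keys())
input_keys = sorted(meta_graph.signature_def["serving_default"].inputs.keys())
print(signature_keys, input_keys)
```

With the real checkpoint, the input and output tensors would instead be looked up in the restored graph, for example with graph.get_tensor_by_name. Separately, the int8 settings in the converter snippet (TFLITE_BUILTINS_INT8 with int8 input and output types) normally also require converter.representative_dataset to be set so that activation ranges can be calibrated.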

    opened by leeshien 0
  • Bump ipython from 5.1.0 to 7.16.3

    Bump ipython from 5.1.0 to 7.16.3

    Bumps ipython from 5.1.0 to 7.16.3.

    dependencies 
    opened by dependabot[bot] 0
Releases(untagged-48d74c6337a71b6b5f87)
Owner
Shaohui Ruan
Interested in machine learning & computer vision