Deep Learning Chinese Word Segment

Overview

References

  The BiLSTM+CRF model in this project follows the paper http://www.aclweb.org/anthology/N16-1030, and the IDCNN+CRF model follows https://arxiv.org/abs/1702.02098

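Build

  1. Install the bazel build tool and tensorflow (this project currently requires tf 1.0.0alpha or later)

  2. Change to this project's code directory and run ./configure

  3. Build the backend service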

    bazel build //kcws/cc:seg_backend_api

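Training

  1. Follow the WeChat official account 待字闺中 and reply kcws to get the corpus download link.

  2. Extract the corpus into a directory

  3. Change to the code directory and run:

python kcws/train/process_anno_file.py <corpus directory> pre_chars_for_w2v.txt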

bazel build third_party/word2vec:word2vec

First, build a preliminary vocabulary

./bazel-bin/third_party/word2vec/word2vec -train pre_chars_for_w2v.txt -save-vocab pre_vocab.txt -min-count 3

Handle low-frequency words

python kcws/train/replace_unk.py pre_vocab.txt pre_chars_for_w2v.txt chars_for_w2v.txt

Train word2vec

./bazel-bin/third_party/word2vec/word2vec -train chars_for_w2v.txt -output vec.txt -size 50 -sample 1e-4 -negative 5 -hs 1 -binary 0 -iter 5

Build the training-corpus generation tool

bazel build kcws/train:generate_training

Generate the corpus

./bazel-bin/kcws/train/generate_training vec.txt <corpus directory> all.txt

Produce the train.txt and test.txt files

python kcws/train/filter_sentence.py all.txt

  4. Install tensorflow, change to the kcws code directory, and run:

python kcws/train/train_cws.py --word2vec_path vec.txt --train_data_path <absolute path to train.txt> --test_data_path test.txt --max_sentence_len 80 --learning_rate 0.001

The IDCNN model is used by default; pass --use_idcnn False to switch to the BiLSTM model.

  5. Generate the vocab

bazel build kcws/cc:dump_vocab

./bazel-bin/kcws/cc/dump_vocab vec.txt kcws/models/basic_vocab.txt

  6. Export the trained model

python tools/freeze_graph.py --input_graph logs/graph.pbtxt --input_checkpoint logs/model.ckpt --output_node_names "transitions,Reshape_7" --output_graph kcws/models/seg_model.pbtxt

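  7. Download the POS tagging model (a temporary solution; later documentation will cover training and exporting the POS tagging model)

    Download pos_model.pbtxt from https://pan.baidu.com/s/1bYmABk into the kcws/models/ directory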

  8. Run the web service

./bazel-bin/kcws/cc/seg_backend_api --model_path=kcws/models/seg_model.pbtxt --vocab_path=kcws/models/basic_vocab.txt --max_sentence_len=80

Use the absolute path to seg_model.pbtxt for --model_path.

POS tagging training instructions:

https://github.com/koth/kcws/blob/master/pos_train.md
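The low-frequency handling in step 3 of the training instructions can be pictured with a small sketch. It is not the code of kcws/train/replace_unk.py: it assumes the vocabulary file written by word2vec -save-vocab has one "<token> <count>" pair per line, that the corpus tokens are separated by whitespace, and that rare tokens are mapped to a placeholder called "<UNK>" here. The function names and the placeholder are illustrative only.

```python
def load_vocab(vocab_lines):
    """Collect the tokens listed in a "<token> <count>" vocabulary file."""
    vocab = set()
    for line in vocab_lines:
        parts = line.split()
        if parts:  # skip empty lines
            vocab.add(parts[0])
    return vocab


def replace_unk(sentence, vocab, unk="<UNK>"):
    """Replace every whitespace-separated token that is missing from vocab.

    The "<UNK>" placeholder is an assumption made for this sketch.
    """
    return " ".join(tok if tok in vocab else unk for tok in sentence.split())


# Tokens that fell below -min-count are absent from the saved vocabulary,
# so they are the ones that get replaced.
vocab = load_vocab(["的 120", "我 45", "们 30"])
print(replace_unk("我 们 的 喵", vocab))
```

After such a pass, word2vec would learn one shared vector for all rare tokens instead of a separate, poorly estimated vector for each.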

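Custom dictionary

Custom dictionaries are currently supported at the decoding stage; see kcws/cc/test_seg.cc for how to use one. The dictionary is a text file in which every line has the following format:

<custom entry>\t<weight>

For example:

蓝瘦香菇 4

The weight is a positive integer, usually 4 or higher; the larger the weight, the more important the entry.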
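To check a dictionary file before handing it to the decoder, the format above can be validated with a few lines of Python. This is a minimal sketch, not part of kcws: parse_custom_dict is a hypothetical helper, and it accepts either a tab or spaces before the weight because the example line above shows a space.

```python
def parse_custom_dict(lines):
    """Parse "<entry>\\t<weight>" lines into a {entry: weight} dict."""
    entries = {}
    for lineno, raw in enumerate(lines, 1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        parts = line.rsplit(None, 1)  # split off the trailing weight
        if len(parts) != 2 or not parts[1].isdigit() or int(parts[1]) <= 0:
            raise ValueError("line %d: expected '<entry>\\t<weight>'" % lineno)
        entries[parts[0]] = int(parts[1])
    return entries


user_dict = parse_custom_dict(["蓝瘦香菇\t4", "", "待字闺中\t6"])
print(user_dict)
```

Rejecting malformed lines early keeps a missing or non-positive weight from reaching the decoding stage unnoticed.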

demo

http://45.32.100.248:9090/

Appendix: a company-name recognition demo trained with the same model:

http://45.32.100.248:18080

Comments
  • bazel build //kcws/cc:seg_backend_api reports an error

    Hi, bazel build //kcws/cc:seg_backend_api reports an error:

    ERROR: /root/kcws/third_party/gflags/BUILD:12:1: Executing genrule //third_party/gflags:gflags-srcs failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 77.

    opened by maczhao 15
  • ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted.

    Hi, I ran into some errors when building kcws. How can I fix them?

    The errors are as follows:

    [[email protected] cc]# /opt/BioDir/dl/bazel-0.4.3/output/bazel build //kcws/cc:seg_backend_api
    WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.build/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.
    WARNING: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/workspace.bzl:13:5: path_prefix was specified to tf_workspace but is no longer used and will be removed in the future.
    WARNING: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/workspace.bzl:15:5: tf_repo_name was specified to tf_workspace but is no longer used and will be removed in the future.
    ERROR: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/core/platform/default/build_config/BUILD:108:1: error loading package '@jpeg//': Extension file not found. Unable to load package for '//third_party:common.bzl': BUILD file not found on package path and referenced by '@org_tensorflow//tensorflow/core/platform/default/build_config:jpeg'.
    ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted.
    INFO: Elapsed time: 2.612s

    I built bazel as follows:

    [[email protected] bazel-0.4.3]# bash ./compile.sh INFO: You can skip this first step by providing a path to the bazel binary as second argument: INFO: ./compile.sh compile /path/to/bazel  Building Bazel from scratch.......  Building Bazel with Bazel. .WARNING: /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE:1: Workspace name in /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE (@io_bazel) does not match the name given in the repository's definition (@bazel_tools); this will cause a build error in future versions. INFO: Found 1 target... INFO: From Compiling third_party/ijar/platform_utils.cc [for host]: third_party/ijar/platform_utils.cc: In function 'bool devtools_ijar::write_file(const char*, mode_t, const void*, size_t)': third_party/ijar/platform_utils.cc:67:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (write(fd, data, size) != size) { ^ INFO: From Compiling third_party/ijar/platform_utils.cc: third_party/ijar/platform_utils.cc: In function 'bool devtools_ijar::write_file(const char*, mode_t, const void*, size_t)': third_party/ijar/platform_utils.cc:67:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (write(fd, data, size) != size) { ^ INFO: From Compiling third_party/ijar/ijar.cc: third_party/ijar/ijar.cc: In member function 'virtual bool devtools_ijar::JarStripperProcessor::Accept(const char*, devtools_ijar::u4)': third_party/ijar/ijar.cc:66:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (filename_len >= CLASS_EXTENSION_LENGTH) { ^ INFO: From Compiling third_party/ijar/ijar.cc [for host]: third_party/ijar/ijar.cc: In member function 'virtual bool devtools_ijar::JarStripperProcessor::Accept(const char*, devtools_ijar::u4)': third_party/ijar/ijar.cc:66:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (filename_len >= CLASS_EXTENSION_LENGTH) { ^ INFO: From Compiling 
src/main/cpp/blaze_util_posix.cc: src/main/cpp/blaze_util_posix.cc: In function 'void blaze::Daemonize(const string&)': src/main/cpp/blaze_util_posix.cc:190:28: warning: ignoring return value of 'int dup(int)', declared with attribute warn_unused_result [-Wunused-result] (void) dup(STDOUT_FILENO); // stderr (2>&1) ^ src/main/cpp/blaze_util_posix.cc: In function 'uint64_t blaze::AcquireLock(const string&, bool, bool, blaze::BlazeLock*)': src/main/cpp/blaze_util_posix.cc:578:30: warning: ignoring return value of 'int ftruncate(int, __off_t)', declared with attribute warn_unused_result [-Wunused-result] (void) ftruncate(lockfd, 0); ^ src/main/cpp/blaze_util_posix.cc:583:47: warning: ignoring return value of 'ssize_t write(int, const void*, size_t)', declared with attribute warn_unused_result [-Wunused-result] (void) write(lockfd, msg.data(), msg.size()); ^ INFO: From JavacBootstrap src/java_tools/buildjar/java/com/google/devtools/build/buildjar/libbootstrap_JarOwner.jar [for host]: warning: Implicitly compiled files were not subject to annotation processing. Use -proc:none to disable annotation processing or -implicit to specify a policy for implicit compilation. 1 warning INFO: From Building src/main/protobuf/libextra_actions_base_java_proto.jar (1 source jar): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/java_tools/junitrunner/java/com/google/testing/coverage/JacocoCoverage.jar (9 source files): Note: src/java_tools/junitrunner/java/com/google/testing/coverage/MethodProbesMapper.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/tools/android/java/com/google/devtools/build/android/ziputils/libziputils_lib.jar (12 source files): Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. 
INFO: From Building src/main/java/com/google/devtools/build/lib/libconcurrent.jar (18 source files): Note: src/main/java/com/google/devtools/build/lib/concurrent/AbstractQueueVisitor.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building third_party/java/apkbuilder/apkbuilder.jar (15 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libutil.jar (45 source files): Note: src/main/java/com/google/devtools/build/lib/util/OrderedSetMultimap.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/cmdline/libcmdline.jar (10 source files): Note: src/main/java/com/google/devtools/build/lib/cmdline/RepositoryName.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/skyframe/libskyframe.jar (67 source files): Note: src/main/java/com/google/devtools/build/skyframe/ReverseDepsUtilImpl.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libsyntax.jar (86 source files): Note: src/main/java/com/google/devtools/build/lib/syntax/BuiltinFunction.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libpackages-internal.jar (98 source files): Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. 
INFO: From Building src/main/java/com/google/devtools/build/lib/actions/libactions.jar (91 source files): Note: src/main/java/com/google/devtools/build/lib/actions/Actions.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libbuild-base.jar (381 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libproto-rules.jar (13 source files): Note: src/main/java/com/google/devtools/build/lib/rules/proto/ProtoCommon.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/query2/libquery2.jar (12 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/query2/libquery-output.jar (10 source files): Note: src/main/java/com/google/devtools/build/lib/query2/output/QueryOutputUtils.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/rules/genquery/libgenquery.jar (2 source files): Note: src/main/java/com/google/devtools/build/lib/rules/genquery/GenQuery.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/rules/cpp/libcpp.jar (80 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. 
INFO: From Building src/main/java/com/google/devtools/build/lib/libpython-rules.jar (15 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libjava-compilation.jar (37 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: src/main/java/com/google/devtools/build/lib/rules/java/JavaCompileAction.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libjava-rules.jar (32 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libandroid-rules.jar (59 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libideinfo.jar (4 source files): Note: src/main/java/com/google/devtools/build/lib/ideinfo/AndroidStudioInfoAspect.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/rules/objc/libobjc.jar (114 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: src/main/java/com/google/devtools/build/lib/rules/objc/IterableWrapper.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libruntime.jar (94 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. 
INFO: From Building src/main/java/com/google/devtools/build/lib/sandbox/libsandbox.jar (16 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/worker/libworker.jar (11 source files): Note: src/main/java/com/google/devtools/build/lib/worker/WorkerSpawnStrategy.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libbazel-rules.jar (87 source files, 14 resources): Note: src/main/java/com/google/devtools/build/lib/bazel/rules/java/BazelJavaSemantics.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. Target //src:bazel up-to-date: bazel-bin/src/bazel INFO: Elapsed time: 178.725s, Critical Path: 170.17s WARNING: /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE:1: Workspace name in /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE (@io_bazel) does not match the name given in the repository's definition (@bazel_tools); this will cause a build error in future versions.

    Build successful! Binary is here: /opt/BioDir/dl/bazel-0.4.3/output/bazel

    opened by Sun-shan 12
  • Error when running bazel build //kcws/cc:seg_backend_api

    ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package '': Encountered error while reading extension file 'tensorflow/workspace.bzl': no such package '@org_tensorflow//tensorflow': local_repository rule //external:org_tensorflow must specify an existing directory. INFO: Elapsed time: 0.049s

    build on: centos6.8 x64 no gpu support Build label: 0.4.1- (@non-git) tensorflow-0.11.0

    opened by busyfree 11
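  • Question about the tagging part

    Hello, yesterday I went carefully through the POS tagging module you recently added and found that a few steps seem to have problems. I tried changing them myself, and it now runs end to end with 99.57% accuracy. Could you take a look? The problems are:

    1. In step 5, the argument "lines_withpos.txt" is passed in, but the code never writes anything to it. I think the code should write out each tag together with its index.
    2. In step 6, the third argument should be the dictionary generated in the previous step, "lines_withpos.txt", rather than "pos_vocab.txt".

    Does this look correct to you?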

    opened by oneapmlj 7
  • gflags link failed

    Linking against the third-party gflags failed.

    I fixed it by using a self-compiled gflags; this may be a gflags version issue. The change made to the BUILD files:

    
    --- a/third_party/glog/BUILD
    +++ b/third_party/glog/BUILD
    @@ -45,10 +45,7 @@ cc_library(
             "include/glog/stl_logging.h",
             "include/glog/vlog_is_on.h",
         ],
    -    deps = [
    -      "//third_party/gflags:gflags-cxx",
    -
    -    ],
    +    linkopts = ["-lgflags"],
         hdrs = [
             "include/glog/logging.h",
         ],
    
    opened by Vimos 7
  • Runtime error after changing the maximum value of max_word_num

    koth, I have a question. In seg_backend_api.cc I modified DEFINE_int32(max_word_num, 300, "max num of word per sentence "); and set the value to 300, because the sentences I test contain many characters. At runtime I get the following error: E0918 11:23:35.434610 26934 tfmodel.cc:88] Error during inference: Invalid argument: Input to reshape is a tensor with 640 values, but the requested shape requires a multiple of 1200 [[Node: Reshape_7 = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _output_shapes=[[?,300,4]], _device="/job:localhost/replica:0/task:0/cpu: 0"](idcnn_1/scores, Reshape_7/shape)]] 2017-09-18 11:23:35.434675: E kcws/cc/tf_seg_model.cc:321] Error during inference:

    Does this mean I need to retrain the word_vocab.txt file under models, or is it something else? If word_vocab.txt is the problem, how is that text file trained? Thanks for any help.

    opened by younger911 6
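The numbers in the error message above can be checked by hand. The Reshape_7 node in the log expects a shape of [?, 300, 4], that is, a multiple of 300 × 4 = 1200 values, while the tensor it received holds 640. The sketch below only replays that arithmetic; it assumes 4 tag scores per position (taken from the logged shape) and uses the README's --max_sentence_len=80 as the other candidate length. check_reshape is a hypothetical helper, not part of kcws.

```python
def check_reshape(num_values, max_len, num_tags):
    """Return the batch size if num_values fits [?, max_len, num_tags], else None."""
    per_sentence = max_len * num_tags
    if num_values % per_sentence:
        return None  # not a whole number of sentences, so Reshape would fail
    return num_values // per_sentence


print(check_reshape(640, 300, 4))  # length 300, as in the logged Reshape_7 shape
print(check_reshape(640, 80, 4))   # length 80, the README's --max_sentence_len
```

Under these assumptions, 640 values correspond to two sequences of length 80, which would suggest that the sequence length built into the graph and the length fed at inference disagree. This is a reading of the numbers, not a confirmed diagnosis.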
  • F tensorflow/core/platform/cpu_feature_guard.cc:35] The TensorFlow library was compiled to use AVX2 instructions, but these aren't available on your machine.

    [[email protected]](/cdn-cgi/l/email-protection):/mnt/kcws# export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl
    [[email protected]](/cdn-cgi/l/email-protection):/mnt/kcws# pip install --upgrade $TF_BINARY_URL

    TensorFlow installed this way runs fine by itself, but starting this project throws the error above.

    opened by weisong82 6
  • About the default segmentation quality

    After following the instructions, I get the segmentation result below. It is not very accurate. Is this normal? { "msg": "OK", "segments": [ "赵雅", "淇", "洒泪", "道", "歉", " ", "和林", "丹", "没", "有", "任", "何", "经济", "关", "系" ], "status": 0 }

    duplicate 
    opened by dengzz 5
  • embedding_size AssertionError

    At the final training step, that is, when running: python kcws/train/train_cws_lstm.py --word2vec_path vec.txt --train_data_path <absolute path to train.txt> --test_data_path test.txt --max_sentence_len 80 --learning_rate 0.001

    I get this error:

    Traceback (most recent call last):
      File "kcws/train/train_cws_lstm.py", line 262, in <module>
        tf.app.run()
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
        sys.exit(main(sys.argv[:1] + flags_passthrough))
      File "kcws/train/train_cws_lstm.py", line 228, in main
        FLAGS.word2vec_path, FLAGS.num_hidden)
      File "kcws/train/train_cws_lstm.py", line 62, in __init__
        self.c2v = self.load_w2v(c2vPath)
      File "kcws/train/train_cws_lstm.py", line 132, in load_w2v
        assert (dim == (FLAGS.embedding_size))
    AssertionError

    I then changed tf.app.flags.DEFINE_integer("embedding_size", 50, "embedding size") in train_cws_lstm.py to tf.app.flags.DEFINE_integer("embedding_size", 200, "embedding size") and it worked.

    opened by rockyzhengwu 5
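The assertion in this report fires when the dimension of the vectors in vec.txt differs from the embedding_size flag. A mismatch of this kind can be spotted before training by reading the first line of the vector file. The sketch below assumes vec.txt is in word2vec's text format (-binary 0), whose first line is "<vocab_size> <dim>"; the helper names are hypothetical and not part of kcws.

```python
import io


def read_w2v_header(fp):
    """Return (vocab_size, dim) from the first line of a text-format vector file."""
    vocab_size, dim = fp.readline().split()[:2]
    return int(vocab_size), int(dim)


def check_embedding_size(fp, embedding_size):
    """Raise ValueError if the file's vector dimension differs from embedding_size."""
    _, dim = read_w2v_header(fp)
    if dim != embedding_size:
        raise ValueError(
            "vector file has dim %d but embedding_size is %d" % (dim, embedding_size)
        )
    return dim


# In practice fp would be open("vec.txt"); an in-memory header is used here.
print(check_embedding_size(io.StringIO("5000 50\n"), 50))
```

Either retraining word2vec with a -size that matches the flag or changing the flag to match the file, as done in the report, would make the two values agree.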
  • MemoryError at the last step of the POS tagging model

    $ python tools/freeze_graph.py --input_graph pos_logs/graph.pbtxt --input_checkpoint pos_logs/model.ckpt --output_node_names "transitions,Reshape_9" --output_graph kcws/models/pos_model.pbtxt Traceback (most recent call last): File "tools/freeze_graph.py", line 202, in app.run(main=main, argv=[sys.argv[0]] + unparsed) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run _sys.exit(main(_sys.argv[:1] + flags_passthrough)) File "tools/freeze_graph.py", line 134, in main FLAGS.variable_names_blacklist) File "tools/freeze_graph.py", line 93, in freeze_graph text_format.Merge(f.read().decode("utf-8"), input_graph_def) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 525, in Merge descriptor_pool=descriptor_pool) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 579, in MergeLines return parser.MergeLines(lines, message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 612, in MergeLines self._ParseOrMerge(lines, message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 627, in _ParseOrMerge self._MergeField(tokenizer, message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 727, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 815, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 727, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 815, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 727, in _MergeField merger(tokenizer, message, field) File 
"/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 815, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 727, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 815, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 714, in _MergeField tokenizer.Consume(':') File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 1078, in Consume if not self.TryConsume(token): File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 1065, in TryConsume self.NextToken() File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 1314, in NextToken match = self._TOKEN.match(self._current_line, self._column) MemoryError

    opened by kinghuangdd 4
  • New error when building the backend service

    Hello, running bazel build //kcws/cc:seg_backend_api fails with the following errors:

    ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:5:1: Reassignment of builtin build function 'package_name' not permitted. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/glog/BUILD:5:1: Reassignment of builtin build function 'package_name' not permitted. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:41:1: Target '//third_party/gflags:empty.cc' contains an error and its package is in error and referenced by '//third_party/gflags:gflags-cxx'. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:41:1: Target '//third_party/gflags:include/gflags/gflags_declare.h' contains an error and its package is in error and referenced by '//third_party/gflags:gflags-cxx'. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:41:1: Target '//third_party/gflags:lib/libgflags.a' contains an error and its package is in error and referenced by '//third_party/gflags:gflags-cxx'. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:41:1: Target '//third_party/gflags:include/gflags/gflags.h' contains an error and its package is in error and referenced by '//third_party/gflags:gflags-cxx'. ERROR: /home/di/pycharmProjects/segment/kcws/base/BUILD:3:1: Target '//third_party/gflags:gflags-cxx' contains an error and its package is in error and referenced by '//base:base'. ERROR: /home/di/pycharmProjects/segment/kcws/base/BUILD:3:1: Target '//third_party/glog:glog-cxx' contains an error and its package is in error and referenced by '//base:base'. ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted. INFO: Elapsed time: 0.167s

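    Running bazel build third_party/word2vec:word2vec succeeds, but the other commands, such as bazel build kcws/train:generate_training and bazel build kcws/cc:dump_vocab, all fail with errors similar to the above. Adding licenses(["notice"]) to the BUILD files did not help either. Do you know what causes this? If you have time, could you take a look? Many thanks!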

    opened by yufengzhixing 4
  • Error when building the backend service

    WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files: /home/cly/github/kcws/tools/bazel.rc INFO: Writing tracer profile to '/home/cly/.cache/bazel/_bazel_cly/271de499a4ab5fb7350261a41335ecd2/command.profile.gz' ERROR: /home/cly/github/kcws/WORKSPACE:5:1: name 'new_http_archive' is not defined ERROR: /home/cly/github/kcws/WORKSPACE:18:1: name 'new_http_archive' is not defined ERROR: /home/cly/github/kcws/WORKSPACE:34:1: name 'http_archive' is not defined ERROR: error loading package '': Encountered error while reading extension file 'tools/build_defs/repo/http.bzl': no such package '@bazel_tools//tools/build_defs/repo': error loading package 'external': Could not load //external package ERROR: error loading package '': Encountered error while reading extension file 'tools/build_defs/repo/http.bzl': no such package '@bazel_tools//tools/build_defs/repo': error loading package 'external': Could not load //external package INFO: Elapsed time: 0.032s INFO: 0 processes. FAILED: Build did NOT complete successfully (0 packages loaded)

    opened by lingyiliu016 2
  • error C++ compilation of rule '@protobuf//:protobuf' failed (Exit 2). cl: command line error D8021: invalid numeric argument '/Wwrite-strings'

    ERROR: C:/users/thomas/appdata/local/temp/_bazel_thomas/infhcau0/external/protob uf/BUILD:113:1: C++ compilation of rule '@protobuf//:protobuf' failed (Exit 2): cl.exe failed: error executing command cd C:/users/thomas/appdata/local/temp/_bazel_thomas/infhcau0/execroot/main

    SET INCLUDE=F:\Tools\Microsoft Visual Studio 14.0\VC\INCLUDE;F:\Tools\Microsof t Visual Studio 14.0\VC\ATLMFC\INCLUDE;C:\Program Files (x86)\Windows Kits\10\in clude\10.0.14393.0\ucrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\inclu de\um;C:\Program Files (x86)\Windows Kits\10\include\10.0.14393.0\shared;C:\Prog ram Files (x86)\Windows Kits\10\include\10.0.14393.0\um;C:\Program Files (x86)\W indows Kits\10\include\10.0.14393.0\winrt; SET LIB=F:\Tools\Microsoft Visual Studio 14.0\VC\LIB\amd64;F:\Tools\Microsof t Visual Studio 14.0\VC\ATLMFC\LIB\amd64;C:\Program Files (x86)\Windows Kits\10
    lib\10.0.14393.0\ucrt\x64;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib \um\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.14393.0\um\x64; SET PATH=F:\Tools\Microsoft Visual Studio 14.0\Common7\IDE\CommonExtensions
    Microsoft\TestWindow;F:\Tools\Microsoft Visual Studio 14.0\VC\BIN\amd64;C:\WINDO WS\Microsoft.NET\Framework64\v4.0.30319;F:\Tools\Microsoft Visual Studio 14.0\VC \VCPackages;F:\Tools\Microsoft Visual Studio 14.0\Common7\IDE;F:\Tools\Microsoft Visual Studio 14.0\Common7\Tools;F:\Tools\Microsoft Visual Studio 14.0\Team Too ls\Performance Tools\x64;F:\Tools\Microsoft Visual Studio 14.0\Team Tools\Perfor mance Tools;C:\Program Files (x86)\Windows Kits\10\bin\x64;C:\Program Files (x86 )\Windows Kits\10\bin\x86;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\b in\NETFX 4.6.1 Tools\x64;;C:\WINDOWS\system32 SET PWD=/proc/self/cwd SET TEMP=C:\Users\Thomas\AppData\Local\Temp SET TMP=C:\Users\Thomas\AppData\Local\Temp F:/Tools/Microsoft Visual Studio 14.0/VC/bin/amd64/cl.exe /c external/protobuf /src/google/protobuf/struct.pb.cc /Fobazel-out/msvc_x64-fastbuild/bin/external/p rotobuf/objs/protobuf/external/protobuf/src/google/protobuf/struct.pb.o /nologo /DCOMPILER_MSVC /DNOMINMAX /D_WIN32_WINNT=0x0600 /D_CRT_SECURE_NO_DEPRECATE /D CRT_SECURE_NO_WARNINGS /D_SILENCE_STDEXT_HASH_DEPRECATION_WARNINGS /bigobj /Zm50 0 /J /Gy /GF /EHsc /wd4351 /wd4291 /wd4250 /wd4996 /Iexternal/protobuf /Ibazel-o ut/msvc_x64-fastbuild/genfiles/external/protobuf /Iexternal/bazel_tools /Ibazel- out/msvc_x64-fastbuild/genfiles/external/bazel_tools /Iexternal/protobuf/src /Ib azel-out/msvc_x64-fastbuild/genfiles/external/protobuf/src /Iexternal/bazel_tool s/tools/cpp/gcc3 /showIncludes /MT /Od /Z7 -DHAVE_PTHREAD -Wall -Wwrite-strings -Woverloaded-virtual -Wno-sign-compare -Wno-unused-function. cl: 命令行 error D8021 :无效的数值参数“/Wwrite-strings” Target //kcws/cc:seg_backend_api failed to build ____Elapsed time: 2.704s, Critical Path: 0.13s

    opened by thomas1984 2
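  • About --output_node_names in model export

    What do --output_node_names "transitions,Reshape_9" and "transitions,Reshape_7" mean in model export?

    The output nodes given at export time serve as the model's outputs during decoding. Shouldn't these two names be specified at training time? I found the definition of the Reshape_7 output in bilstm.py, but I could not find the definition of the Reshape_9 output used in POS training, nor the definition of transitions. Are these two default TensorFlow output nodes, or something else? Could you please explain? Thanks.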

    opened by forever1dream 3