实体识别

  • Web: entity recognition; named entity recognition; entity identification
  1. 指定实体识别(Named Entity Recognition,NER):识别和提取指定的实体。

    Named Entity Recognition (NER): recognize and extract named entities.

  2. HMM与自动规则提取相结合的中文命名实体识别

    Chinese named entity recognition combining HMM with automatic rule extraction

  3. 基于本体的WEB挖掘在信息检索中的设计与实现;中医症状病机实体识别及其关系挖掘研究

    Design and Implementation of Ontology-Based Web Mining in Information Retrieval; Research on Entity Recognition of Symptoms and Pathogenesis in Traditional Chinese Medicine and Mining of Their Relations

  4. 中文组织机构名的识别是中文信息处理中的一个重要任务,也是命名实体识别(Named Entity Recognition)研究的重点之一。

    Recognition of Chinese organization names is an important task in Chinese information processing and one of the main focuses of Named Entity Recognition (NER) research.

  5. 我们建立了两步的基于CRF模型的中文命名实体识别系统。

    We built a two-step Chinese named entity recognition system based on the CRF model.

  6. 实验表明,中文命名实体识别总的精确率、召回率和F值分别达到了86.93%,83.69%,85.28%。

    Experiments show that the overall precision, recall, and F-score of Chinese named entity recognition reach 86.93%, 83.69%, and 85.28%, respectively.

  7. 数据ETL过程中的实体识别方法

    An entity identification method for the data ETL process

  8. 本文实现的中文命名实体识别系统采用了隐马尔可夫模型(Hidden Markov Model,HMM)与自动规则提取相结合的方法。

    This paper presents a Chinese named entity recognition system that combines the Hidden Markov Model (HMM) with rules automatically extracted from the training corpus.

  9. 实验结果表明,在融入产品的品牌特征和系列特征之后,系统对产品名实体识别的F值提升了8.42%。

    The experimental results show that after brand and series features of products are added to the feature template, the F-score of the product named entity recognition system improves by 8.42%.

  10. 在863组织的命名实体识别评测中,系统的准确率、召回率和F值分别达到了81.93%,78.20%,80.02%。

    In a named entity recognition evaluation organized by the 863 Program, the precision, recall, and F-score of the system reach 81.93%, 78.20%, and 80.02%, respectively.

  11. 在这篇论文中,我们把机器学习方法用于中文命名实体识别,这里我们使用了最大熵和Boosting两种机器学习的方法。

    In this thesis, two machine learning methods are applied to Chinese named entity recognition: maximum entropy and boosting.

  12. 基于Snowball方法的命名实体识别

    Snowball-based Named Entity Recognition

  13. XML已经成为数据表示,存储与交换的标准,在XML信息的识别与整合应用中,XML数据的实体识别技术有着大量的需求。

    XML has become the standard for data representation, storage, and exchange. In applications that identify and integrate XML information, entity identification techniques for XML data are in great demand.

  14. 提出了一种基于DOM树和两层角色HMM标注的产品命名实体识别算法,实验表明该算法具有较好的识别效果。

    A product named entity recognition algorithm based on the DOM tree and two-level role HMM tagging is proposed. Experiments show that the algorithm achieves good recognition results.

  15. 在自然语言处理领域,实体识别是信息提取、句法分析、机器翻译、面向Semantic Web的元数据标注等应用领域重要的基础性工具。

    In the field of natural language processing, entity recognition is an important basic tool for applications such as information extraction, syntactic analysis, machine translation, and metadata annotation for the Semantic Web.

  16. 目前,在XML数据实体识别的研究中,主要的方法是基于距离度量和相似性函数,而且研究人员忽略了XML数据实体识别的优化问题。

    At present, the main methods in research on entity identification for XML data are based on distance measures and similarity functions, and researchers have overlooked the optimization of XML entity identification.

  17. 得出结论:简单向量距离分类法在该领域的效果与SVM不相上下,并且命名实体识别会使结果有一定提高。

    The results show that simple vector-distance classification performs about as well as SVM in this domain, and that named entity recognition improves the results to some extent.

  18. 为此,文中主要说明了四种中文命名实体识别方法,包括规则、隐马尔可夫模型(Hidden Markov Model,HMM)、最大熵模型(Maximum Entropy,ME)和条件随机域(Conditional Random Fields,CRF)。

    To this end, the paper describes four approaches to Chinese named entity recognition: rule-based methods, the Hidden Markov Model (HMM), the Maximum Entropy (ME) model, and Conditional Random Fields (CRF).

  19. 针对统计方法对语料依赖性强的问题,本文提出改进的基于TBL的日文名实体识别后处理方法。

    To address the strong corpus dependency of statistical methods, this thesis proposes an improved TBL-based post-processing method for Japanese named entity recognition.

  20. 在命名实体识别方面,利用半Markov条件随机场对评论中的实体进行识别,综合运用了包括实体级词典特征在内的多类特征,有效的解决了实体变体的问题,提高了识别的准确率。

    For named entity recognition, a semi-Markov conditional random field is used to recognize the entities in reviews. It draws on several kinds of features, including entity-level dictionary features, which effectively resolves the problem of entity variants and improves recognition accuracy.

  21. 在命名实体识别方面,本文采用改进的Viterbi算法对初始观察序列重新标注,并求出最佳的状态序列。

    For named entity recognition, we use a third-order Hidden Markov Model with an improved Viterbi algorithm to re-tag the initial observation sequence and obtain the best state sequence.

  22. 本文针对中文的命名实体识别问题,描述了正则化Winnow算法,并给出了基于较大规模语料库的实验结果。

    This paper describes the regularized Winnow algorithm for Chinese named entity recognition and presents experimental results on a fairly large corpus.

  23. 首次把条件随机域(CRF)模型应用到了中文名实体识别中,且根据中文的特点,定义了多种特征模板。

    In this paper, the conditional random fields (CRF) model, a probabilistic model well suited to labeling sequence data, is applied to Chinese named entity recognition (CNER) for the first time, and several feature templates are defined according to the characteristics of Chinese.

  24. 通过对实体识别、实体参数和属性参数的提取方法、相关技术进行的研究,确定了加工特征提取的实现方法,建立了STEP-NC词法解释器模型。

    Based on a study of entity recognition and of the methods and related techniques for extracting entity parameters and attribute parameters, an implementation method for machining feature extraction is determined and a lexical interpreter model for STEP-NC is established.

  25. 与大多数自然语言处理技术一样,命名实体识别的方法主要分为两大类:基于规则(rule-based)的方法和基于统计(statistic-based)的方法。

    As with most natural language processing techniques, methods for named entity recognition fall into two main classes: rule-based and statistic-based.

  26. 基于领域特点,我们在实验中主要采用知识表辅助机器学习的方法,统计模型选用了条件随机场(CRF)。命名实体识别是信息抽取的基础。

    Given the characteristics of the domain, our experiments mainly adopt knowledge-table-assisted machine learning, with Conditional Random Fields (CRF) as the statistical model. Named entity recognition is the basis of information extraction.

  27. 中文人名识别是中文命名实体识别(NER)的一个重点工作,广泛应用于信息检索、信息抽取、机器翻译等领域。

    Chinese personal name recognition is a key task in Chinese Named Entity Recognition (NER) and is widely used in information retrieval, information extraction, machine translation, and other fields.

  28. 本文对基于B/S方式的教材管理信息系统开发进行系统分析,给出了系统的架构和结合MVC模式及对象到关系映射技术的系统开发模式,并以教材采购为例使用系统实体识别方法识别类。

    This paper presents a system analysis for developing a B/S-based textbook management information system. It gives the system architecture and a development pattern that combines the MVC pattern with object-relational mapping and, taking textbook purchasing as an example, identifies classes using a system entity identification method.

  29. 命名实体识别(NER)是信息抽取的基础模块,在信息检索、机器翻译、数据挖掘、自动文摘等领域发挥着重要作用。

    Named entity recognition (NER) is a basic module of information extraction and plays an important role in information retrieval, machine translation, data mining, automatic summarization, and other fields.

  30. 在开发的实现语义数据集成的联通统一客户资料系统(UCIS)中,用实体识别算法进行测试,得到的平均返回率和精度分别为86.3%、96.5%,能够满足工程应用的要求。

    Tested with the entity identification algorithm in UCIS (UniCom Client Information System), a system developed to implement semantic data integration, the average recall and precision are 86.3% and 96.5% respectively, which meets the requirements of engineering applications.
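
Note on the figures in examples 6 and 10: the quoted F-scores are consistent with the balanced F-score (F1), the harmonic mean of precision and recall. The sentences themselves do not say which F-measure was used, so F1 is an assumption here, and the function name `f_score` is only illustrative. A minimal sketch that recomputes F from the quoted precision and recall:

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F %) as quoted in examples 6 and 10
reported = [(86.93, 83.69, 85.28), (81.93, 78.20, 80.02)]

for p, r, f in reported:
    print(f"P={p} R={r} computed F={f_score(p, r):.2f} reported F={f}")
```

Under that assumption, the recomputed values agree with the reported ones to two decimal places.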