Acta Electronica Sinica, 2013

Smoothing Techniques in Head-Driven Parsing

DOI: 10.3969/j.issn.0372-2112.2013.07.015, PP. 1337-1342

Keywords: syntactic parsing model, smoothing algorithm, head-driven parsing, clustering algorithm


Abstract:

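Data sparsity is a central problem in head-driven parsing, and class-based statistical language models are an important way of dealing with data sparsity in statistical models. Building on an analysis of classical smoothing algorithms, this paper proposes a word clustering algorithm based on semantic dependency information and mutual information, and uses the absolute weight difference method to construct a variable-length language model, in which the value of n varies with how much each history word contributes to predicting the current word. An improved head-driven parsing model based on semantic classes and the variable-length model is then proposed; it both strengthens the disambiguation ability of the parsing model and addresses the severe data sparsity problem. The improved model performs markedly better: precision and recall are 84.53% and 82.41%, respectively, and the F-measure is 2.02 percentage points higher than that of Collins' head-driven parsing model.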
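
To illustrate why class-based models help with data sparsity, the following is a minimal sketch of the standard class-based bigram decomposition, P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i). It is not the paper's algorithm: the word-to-class map `word2class` is assigned by hand here (the paper derives its classes from semantic dependency information and mutual information), the model is fixed at n = 2 rather than variable-length, and all function names and the toy corpus are assumptions made for illustration.

```python
from collections import Counter


def train_class_bigram(sentences, word2class):
    """Collect the counts needed for a class-based bigram model."""
    class_bigrams = Counter()  # count(c_prev, c_cur)
    class_history = Counter()  # count(c_prev) when it occurs as a history
    class_totals = Counter()   # count(c) over all tokens
    word_counts = Counter()    # count(w) over all tokens
    for sent in sentences:
        classes = [word2class[w] for w in sent]
        word_counts.update(sent)
        class_totals.update(classes)
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_history[prev] += 1
    return class_bigrams, class_history, class_totals, word_counts


def class_bigram_prob(prev_word, word, model, word2class):
    """Estimate P(word | prev_word) as P(c_cur | c_prev) * P(word | c_cur)."""
    class_bigrams, class_history, class_totals, word_counts = model
    c_prev, c_cur = word2class[prev_word], word2class[word]
    if class_history[c_prev] == 0 or class_totals[c_cur] == 0:
        return 0.0
    p_class = class_bigrams[(c_prev, c_cur)] / class_history[c_prev]
    p_word = word_counts[word] / class_totals[c_cur]
    return p_class * p_word


# Toy corpus and a hand-assigned (hypothetical) class map: N = noun, V = verb.
word2class = {"cats": "N", "dogs": "N", "fish": "N", "eat": "V", "chase": "V"}
sentences = [["cats", "eat", "fish"], ["dogs", "chase", "cats"]]
model = train_class_bigram(sentences, word2class)

# The word bigram "dogs eat" never occurs in the corpus, so a word-level
# maximum-likelihood estimate would be zero; the class bigram N -> V is seen.
p_unseen = class_bigram_prob("dogs", "eat", model, word2class)
print(p_unseen)
```

Because the probability mass is shared across all words of a class, the unseen pair "dogs eat" still receives a non-zero estimate. A practical model would additionally smooth the class-level counts, which this sketch omits.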

References

[1]  Jesus Vilares,Miguel A Alonso,Manuel Vilares.Extraction of complex index terms in non-English IR:A shallow parsing based approach[J].Information Processing and Management,2008,44(4):1517-1537.
[2]  Aviran S,Siegel P H,Wolf J K.Optimal parsing trees for run-length coding of biased data[J].IEEE Transactions on Information Theory,2008,54(2):841-849.
[3]  Daniel Jurafsky,James H.Martin.Speech and Language Processing[M].New Jersey:Prentice Hall,2009.210-265.
[4]  Eugene Charniak.Statistical parsing with a context-free grammar and word statistics [A].Proceedings of the 14th National Conference on Artificial Intelligence [C].Menlo Park,1997.598-603.
[5]  Collins M.Head-Driven Statistical Models for Natural Language Parsing [D].Pennsylvania:The University of Pennsylvania,1999.65-78.
[6]  Collins M.Head-driven statistical models for natural language parsing[J].Computational Linguistics,2003,29(4):589-637.
[7]  Chen S F,Goodman J.An empirical study of smoothing techniques for language modeling [A].Proceedings of the 34th Annual Meeting on Association for Computational Linguistics [C].Stroudsburg:Association for Computational Linguistics, 1996.310-318.
[8]  Bikel DM,Miller S,et al.Nymble:A high-performance learning name-finder[A].Proceedings of the 5th Conf on Applied Natural Language Processing [C].Stroudsburg:Association for Computational Linguistics,1997.194-201.
[9]  Lee L.Similarity-Based Approaches to Natural Language Processing [D].Cambridge,MA:Harvard University,1997.25-87.
[10]  DAI Yin-Tang,WU Cheng-Rong,et al.Hierarchically classified probabilistic grammar parsing[J].Journal of Software,2011,22(2):245-257.(in Chinese)
[11]  ZHOU De-yu,HE Yu-lan.Discriminative training of the hidden vector state model for semantic parsing[J].IEEE Transactions on Knowledge and Data Engineering,2009,21(1):66-77.
[12]  SUN Ang,JIANG Ming-hu,HE Yi-fan,et al.Chinese question answering based on syntax analysis and answer classification[J].Acta Electronica Sinica,2008,36(5):833-839.(in Chinese)
[13]  CHEN Yi-heng,QIN Bing,et al.Search result clustering based on centroid optimization by ontology extraction[J].Acta Electronica Sinica,2008,36(12A):166-171.(in Chinese)
[14]  David M Magerman.Statistical decision-tree models for parsing [A].Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics [C].Cambridge,1995.276-283.
[15]  Eugene Charniak.A maximum-entropy-inspired parser [A].Proceedings of the First Conference of the North American Chapter of the Association for Computational Linguistics [C].Seattle,2000.132-139.
[16]  LIU Shui,LI Sheng,ZHAO Tie-Jun,et al.Directly smooth interpolation algorithm in head-driven parsing[J].Journal of Software,2009,20(11):2915-2924.(in Chinese)
[17]  Jelinek F,Mercer R L.Interpolated estimation of Markov source parameters from sparse data [A].Proceedings of the Workshop on Pattern Recognition in Practice [C].New York:Institute of Electrical and Electronics,1980.381-397.
[18]  Witten IH,Bell TC.The zero-frequency problem:Estimating the probabilities of novel events in adaptive text compression[J].IEEE Transactions on Information Theory,1991,37(4):1085-1094.
[19]  Gao Jian-feng,Goodman J,Miao Jiang-bo.The use of clustering techniques for language model-application to Asian language[J].Computational Linguistics and Chinese Language Processing,2001,6(1):27-60.
[20]  YUAN Li-chi.Word clustering based on similarity and vari-gram language model[J].Journal of Chinese Computer Systems,2009,30(5):912-915.(in Chinese)
[21]  YUAN Li-chi.Dependency language paring model based on word clustering[J].Journal of Central South University:Natural Science,2011,42(7):2023-2027.(in Chinese)
