Extracting clinical experiences from ancient literature of Traditional Chinese Medicine via deep learning

Affiliation:

1. Department of Computer Science, Sichuan University; 2. College of Medical Information Engineering, Chengdu University of TCM; 3. College of Basic Medicine, Chengdu University of TCM

CLC number:

TP391

Fund Project:

National Key Research and Development Project (2020YFB0704502); National Natural Science Foundation of China (61801058)

    Abstract:

    Ancient literature of Traditional Chinese Medicine (TCM) contains rich clinical experiences: empirical summaries of diagnosis and treatment recorded by ancient practitioners, which embody the theoretical framework and ideological basis on which TCM formed and developed. However, these valuable experiences are both voluminous and scattered across different documents, so TCM practitioners can hardly obtain them quickly and comprehensively by hand. Document retrieval tools only provide document-level screening and cannot support such fine-grained information acquisition. In addition, the differences between ancient and modern Chinese limit the effectiveness of mainstream text analysis tools. We therefore propose an information extraction task over ancient TCM literature aimed at acquiring clinical experiences, that is, identifying the text fragments that describe clinical experiences. We manually annotate sample data for training and testing extraction models, and design a deep-learning-based sequence labeler to perform the task. Because the small amount of annotated data may cause overfitting, we introduce adversarial training and virtual adversarial training to improve the generalization ability of the model. Experiments on the clinical experience dataset verify the effectiveness of the model and show that obtaining TCM clinical experiences from ancient literature with information extraction techniques is feasible. The work provides a promising research baseline and a reusable annotated dataset for this new information extraction task.
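The abstract frames the task as sequence labeling whose output is a set of text fragments. As a minimal sketch of that last step, the Python below turns per-character tags into fragments. The BIO scheme, the label name `EXP`, the function name `decode_fragments`, and the toy input are all assumptions made for illustration; the abstract does not state which tagging scheme or label set the authors use.

```python
from typing import List, Tuple


def decode_fragments(chars: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert per-character BIO tags into (start, end, text) fragments.

    `end` is exclusive. The tag names "B-EXP", "I-EXP" and "O" are
    hypothetical; they are not taken from the paper.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    fragments: List[Tuple[int, int, str]] = []
    start = None  # index where the currently open fragment begins, if any

    for i, tag in enumerate(tags):
        if tag == "B-EXP":
            # A new fragment begins; close the one that is still open.
            if start is not None:
                fragments.append((start, i, "".join(chars[start:i])))
            start = i
        elif tag == "I-EXP":
            # Continue the open fragment; a stray "I" opens a new one.
            if start is None:
                start = i
        else:
            # Any other tag is treated as "O" and closes the open fragment.
            if start is not None:
                fragments.append((start, i, "".join(chars[start:i])))
                start = None

    # Close a fragment that runs to the end of the sequence.
    if start is not None:
        fragments.append((start, len(chars), "".join(chars[start:])))
    return fragments


# Toy input: placeholder characters, not real annotated text.
toy_chars = list("abcdefgh")
toy_tags = ["O", "B-EXP", "I-EXP", "I-EXP", "O", "O", "B-EXP", "I-EXP"]
toy_fragments = decode_fragments(toy_chars, toy_tags)
```

Tagging at the character level avoids depending on a word segmenter, which is one plausible way to sidestep the weakness of mainstream tools on ancient Chinese that the abstract mentions; whether the authors made this choice is not stated here.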

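Cite this article

Citation format: Lu Yongmei, Bu Lingmei, Chen Li, Yu Zhonghua, Zhang Tingting, Ye Ying. Extracting clinical experiences from ancient literature of Traditional Chinese Medicine via deep learning[J]. Journal of Sichuan University (Natural Science Edition), 2022, 59: 023005.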

History
  • Received: 2021-09-26
  • Last revised: 2021-10-19
  • Accepted: 2021-10-22
  • Published online: 2022-04-01