A Feature Selection Method for Trojan Traffic

CLC number:

TP391

Fund project:

National Natural Science Foundation of China; other


A Feature Selection Method for Remote Access Trojan’s Traffic

    Abstract:

    Existing Trojan detection methods based on anomalous session-flow behavior commonly suffer from poor detection performance because the selected features are insufficiently representative and carry redundant information. To address this, a feature selection method is proposed. First, captured traffic is used to analyze Trojan communication behavior; relevant attributes are extracted for each communication stage, and features are derived from each attribute to obtain a sufficiently rich feature set. Then, to measure the importance of each feature and the correlation between features, an improved feature importance evaluation coefficient and a joint correlation evaluation coefficient based on correlation information entropy are proposed, and a feature selection algorithm based on a sequential backward selection strategy is designed to obtain a feature subset of adaptive size. In each iteration, the algorithm computes the evaluation coefficients of the features and completes the selection by ranking them. To verify the algorithm's effectiveness, it is compared with the FCBF and IG algorithms using naive Bayes and support vector machine (SVM) classifiers. Compared with FCBF, recall on the two classifiers increases by 3.76% and 1.64%, and F1 increases by 1.04 and 0.99, respectively. Compared with IG, recall increases by 6.46% and 4.96%, and F1 increases by 3.56 and 3.18, respectively. The experimental results show that the proposed algorithm effectively selects features across the attributes of Trojan traffic, overcomes the influence of correlation between features, and improves the detection of Trojan communication traffic while reducing feature dimensionality.
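
    The abstract does not give the formulas for the two evaluation coefficients, so the following is only a minimal sketch of the general idea: a sequential backward selection loop that re-scores every remaining feature in each round and drops the lowest-ranked one until all remaining scores clear a threshold. Symmetric uncertainty stands in for both the importance coefficient and the correlation coefficient, and the names `backward_select`, `min_score`, and the toy columns are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(xs, ys):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(xs, ys)))
    return 2.0 * mutual_info / (hx + hy)


def backward_select(features, labels, min_score=0.2):
    """Drop the worst-scoring feature per round until every score >= min_score.

    features: dict mapping feature name -> list of discrete values.
    The score (relevance to the label minus mean correlation with the other
    remaining features) is an assumed stand-in, not the paper's coefficients.
    """
    selected = dict(features)
    while len(selected) > 1:
        scores = {}
        for name, column in selected.items():
            importance = symmetric_uncertainty(column, labels)
            overlaps = [
                symmetric_uncertainty(column, other)
                for other_name, other in selected.items()
                if other_name != name
            ]
            scores[name] = importance - sum(overlaps) / len(overlaps)
        worst = min(scores, key=scores.get)
        if scores[worst] >= min_score:
            break  # every remaining feature is good enough: adaptive subset size
        del selected[worst]
    return sorted(selected)


# Toy data (made up): the label is determined by x and y together,
# x_dup duplicates x, and noise is independent of the label.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "x": [0, 0, 0, 0, 1, 1, 1, 1],
    "x_dup": [0, 0, 0, 0, 1, 1, 1, 1],
    "y": [0, 0, 1, 1, 0, 0, 1, 1],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
kept = backward_select(features, labels)
print(kept)
```

    On this toy data the loop removes the `noise` column first and then one of the two duplicated columns, which illustrates how recomputing the scores after each removal lets the subset size adapt to the data instead of being fixed in advance.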

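Cite this article

Citation format: Zhang Yu (张瑜), Liu Xiaojie (刘晓洁), Li Beibei (李贝贝). A Feature Selection Method for Trojan Traffic [J]. Journal of Sichuan University (Natural Science Edition), 2021, 58: 012004.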

History
  • Received: 2020-05-11
  • Last revised: 2020-09-18
  • Accepted: 2020-09-20
  • Published online: 2021-01-20