Abstract: The significance of discovering new spam is analyzed, and a similarity measurement algorithm for discovering new spam, named Spam-SMA, is designed. The algorithm uses N-grams as the features for comparison. Based on this algorithm, a new spam discovery mechanism is proposed within the score-based rule anti-spam framework (such as SpamAssassin), and the mechanism is implemented as an extension module of SpamAssassin. Repeated experiments on a mail server show that the mechanism discovers new spam effectively.
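The abstract does not give the details of Spam-SMA, so the following is only a minimal sketch of the general idea it describes: comparing messages by their N-gram features and feeding the result into a score-based rule. The cosine measure, the character trigram choice, and the names `ngrams`, `cosine_similarity`, `similarity_score`, `threshold`, and `rule_score` are all assumptions made for illustration and are not taken from the paper or from SpamAssassin.

```python
import math
from collections import Counter


def ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two n-gram count profiles (an assumed measure, not Spam-SMA itself)."""
    if not a or not b:
        return 0.0
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)


def similarity_score(message: str, known_spam: list[str], n: int = 3,
                     threshold: float = 0.5, rule_score: float = 2.0) -> float:
    """Return rule_score if the message resembles any known spam sample, else 0.0.

    threshold and rule_score are hypothetical values chosen for this sketch.
    """
    profile = ngrams(message, n)
    best = max((cosine_similarity(profile, ngrams(s, n)) for s in known_spam),
               default=0.0)
    return rule_score if best >= threshold else 0.0
```

In a score-based framework, a value such as the one returned by `similarity_score` would be added to the scores of the other rules that fire, and the total would be compared with the filter's spam threshold.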
Basic information:
DOI:10.16441/j.cnki.hdxb.2008.s1.040
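CLC number: TP393.098
Citation:
[1] Zhang Dengke, Yi Xiushuang, Wang Xingwei. A new spam discovery mechanism based on similarity measurement [J]. Periodical of Ocean University of China (Natural Science Edition), 2008, 38(S1): 147-150. DOI:10.16441/j.cnki.hdxb.2008.s1.040.
Funding:
National High Technology Research and Development Program of China (2006AA01Z214); National Natural Science Foundation of China (60673159, 70671020); Program for New Century Excellent Talents in University; Key Project of Science and Technology Research of the Ministry of Education (108040); Specialized Research Fund for the Doctoral Program of Higher Education (20060145012, 20070145017); Natural Science Foundation of Liaoning Province (20062022); Program for Changjiang Scholars and Innovative Research Team in University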
2008-10-15