jivoi/awesome-ml-for-cybersecurity

GitHub: jivoi/awesome-ml-for-cybersecurity

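A curated list of resources at the intersection of machine learning and cybersecurity, covering datasets, papers, books, tutorials, and courses, giving security practitioners and researchers a structured guide to the field.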

Stars: 9210 | Forks: 1931

# Awesome Machine Learning for Cyber Security [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)

A curated list of amazingly awesome tools and resources related to the use of machine learning for cyber security.

## Table of Contents

- [Datasets](#-datasets)
- [Papers](#-papers)
- [Books](#-books)
- [Talks](#-talks)
- [Tutorials](#-tutorials)
- [Courses](#-courses)
- [Miscellaneous](#-miscellaneous)

## [↑](#table-of-contents) Datasets

* [HIKARI-2021 Dataset](https://zenodo.org/record/5199540)
* [Samples of Security Related Data](http://www.secrepo.com/)
* [DARPA Intrusion Detection Data Sets](https://www.ll.mit.edu/r-d/datasets) [ [1998](https://www.ll.mit.edu/r-d/datasets/1998-darpa-intrusion-detection-evaluation-dataset) / [1999](https://www.ll.mit.edu/r-d/datasets/1999-darpa-intrusion-detection-evaluation-dataset) ]
* [Stratosphere IPS Data Sets](https://stratosphereips.org/category/dataset.html)
* [Open Data Sets](http://csr.lanl.gov/data/)
* [Data Capture from National Security Agency](http://www.westpoint.edu/crc/SitePages/DataSets.aspx)
* [The ADFA Intrusion Detection Data Sets](https://www.unsw.adfa.edu.au/australian-centre-for-cyber-security/cybersecurity/ADFA-IDS-Datasets/)
* [NSL-KDD Data Sets](https://github.com/defcom17/NSL_KDD)
* [Malicious URLs Data Sets](http://sysnet.ucsd.edu/projects/url/)
* [Multi-Source Cyber-Security Events](http://csr.lanl.gov/data/cyber1/)
* [KDD Cup 1999 Data](http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html)
* [Web Attack Payloads](https://github.com/foospidy/payloads)
* [WAF Malicious Queries Data Sets](https://github.com/faizann24/Fwaf-Machine-Learning-driven-Web-Application-Firewall)
* [Malware Training Data Sets](https://github.com/marcoramilli/MalwareTrainingSets)
* [Aktaion Data Sets](https://github.com/jzadeh/Aktaion/tree/master/data)
* [CRIME Database from DeepEnd Research](https://www.dropbox.com/sh/7fo4efxhpenexqp/AADHnRKtL6qdzCdRlPmJpS8Aa/CRIME?dl=0)
* [Publicly Available PCAP Files](http://www.netresec.com/?page=PcapFiles)
* [2007 TREC Public Spam Corpus](https://plg.uwaterloo.ca/~gvcormac/treccorpus07/)
* [Drebin Android Malware Dataset](https://www.sec.cs.tu-bs.de/~danarp/drebin/)
* [PhishingCorpus Dataset](https://monkey.org/~jose/phishing/)
* [EMBER](https://github.com/endgameinc/ember)
* [Vizsec Research](https://vizsec.org/data/)
* [SHERLOCK](http://bigdata.ise.bgu.ac.il/sherlock/index.html#/)
* [Probing / Port Scan - Dataset](https://github.com/gubertoli/ProbingDataset)
* [Aegean Wireless Intrusion Dataset (AWID)](http://icsdweb.aegean.gr/awid/)
* [BODMAS PE Malware Dataset](https://whyisyoung.github.io/BODMAS/)

## [↑](#table-of-contents) Papers

* [Generating Network Intrusion Detection Dataset Based on Real and Encrypted Synthetic Attack Traffic](https://www.mdpi.com/2076-3417/11/17/7868/htm)
* [Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks](https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/melicher)
* [Outside the Closed World: On Using Machine Learning for Network Intrusion Detection](http://ieeexplore.ieee.org/document/5504793/?reload=true)
* [Anomalous Payload-Based Network Intrusion Detection](https://link.springer.com/chapter/10.1007/978-3-540-30143-1_11)
* [Malicious PDF Detection Using Metadata and Structural Features](http://dl.acm.org/citation.cfm?id=2420987)
* [Adversarial Support Vector Machine Learning](https://dl.acm.org/citation.cfm?id=2339697)
* [Exploiting Machine Learning to Subvert Your Spam Filter](https://dl.acm.org/citation.cfm?id=1387709.1387716)
* [CAMP – Content Agnostic Malware Protection](http://www.covert.io/research-papers/security/CAMP%20-%20Content%20Agnostic%20Malware%20Protection.pdf)
* [Notos – Building a Dynamic Reputation System for DNS](http://www.covert.io/research-papers/security/Notos%20-%20Building%20a%20dynamic%20reputation%20system%20for%20dns.pdf)
* [Kopis – Detecting Malware Domains at the Upper DNS Hierarchy](http://www.covert.io/research-papers/security/Kopis%20-%20Detecting%20malware%20domains%20at%20the%20upper%20dns%20hierarchy.pdf)
* [Pleiades – From Throw-away Traffic to Bots – Detecting the Rise of DGA-based Malware](http://www.covert.io/research-papers/security/From%20throw-away%20traffic%20to%20bots%20-%20detecting%20the%20rise%20of%20dga-based%20malware.pdf)
* [EXPOSURE – Finding Malicious Domains Using Passive DNS Analysis](http://www.covert.io/research-papers/security/Exposure%20-%20Finding%20malicious%20domains%20using%20passive%20dns%20analysis.pdf)
* [Polonium – Tera-Scale Graph Mining for Malware Detection](http://www.covert.io/research-papers/security/Polonium%20-%20Tera-Scale%20Graph%20Mining%20for%20Malware%20Detection.pdf)
* [Nazca – Detecting Malware Distribution in Large-Scale Networks](http://www.covert.io/research-papers/security/Nazca%20-%20%20Detecting%20Malware%20Distribution%20in%20Large-Scale%20Networks.pdf)
* [PAYL – Anomalous Payload-based Network Intrusion Detection](http://www.covert.io/research-papers/security/PAYL%20-%20Anomalous%20Payload-based%20Network%20Intrusion%20Detection.pdf)
* [Anagram – A Content Anomaly Detector Resistant to Mimicry Attack](http://www.covert.io/research-papers/security/Anagram%20-%20A%20Content%20Anomaly%20Detector%20Resistant%20to%20Mimicry%20Attack.pdf)
* [Applications of Machine Learning in Cyber Security](https://www.researchgate.net/publication/283083699_Applications_of_Machine_Learning_in_Cyber_Security)
* [Data Mining for Building Network Attack Detection Systems (Russian)](http://vak.ed.gov.ru/az/server/php/filer.php?table=att_case&fld=autoref&key%5B%5D=100003407)
* [Selecting Data Mining Techniques for Enterprise Network Intrusion Detection Systems (Russian)](http://engjournal.ru/articles/987/987.pdf)
* [A Neural Network Approach to Hierarchical Representation of Computer Networks in Information Security Tasks (Russian)](http://engjournal.ru/articles/534/534.pdf)
* [Intelligent Data Analysis Methods and Intrusion Detection (Russian)](http://vestnik.sibsutis.ru/uploads/1459329553_3576.pdf)
* [Dimension Reduction in Network Attack Detection Systems](http://elib.bsu.by/bitstream/123456789/120105/1/v17no3p284.pdf)
* [Rise of the Machines: Machine Learning and Its Cyber Security Applications](https://www.nccgroup.trust/globalassets/our-research/uk/whitepapers/2017/rise-of-the-machines-preliminaries-wp-new-template-final_web.pdf)
* [Machine Learning in Cyber Security: Age of the Centaurs](https://go.recordedfuture.com/hubfs/white-papers/machine-learning.pdf)
* [Automatically Evading Classifiers: A Case Study on PDF Malware Classifiers](https://www.cs.virginia.edu/~evans/pubs/ndss2016/)
* [Weaponizing Data Science for Social Engineering: Automated E2E Spear Phishing on Twitter](https://www.blackhat.com/docs/us-16/materials/us-16-Seymour-Tully-Weaponizing-Data-Science-For-Social-Engineering-Automated-E2E-Spear-Phishing-On-Twitter.pdf)
* [Machine Learning: A Threat-Hunting Reality Check](https://s3-eu-central-1.amazonaws.com/evermade-fsecure-assets/wp-content/uploads/2019/09/17153425/countercept-whitepaper-machine-learning.pdf)
* [Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection](https://arxiv.org/abs/1708.06525)
* [Practical Secure Aggregation for Privacy-Preserving Machine Learning](https://eprint.iacr.org/2017/281.pdf)
* [DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning](https://acmccs.github.io/papers/p1285-duA.pdf)
* [eXpose: A Character-Level Convolutional Neural Network with Embeddings for Detecting Malicious URLs, File Paths and Registry Keys](https://arxiv.org/pdf/1702.08568.pdf)
* [Big Data Technologies for Security Event Correlation Based on Event Type Accounting (Russian)](http://cyberrus.com/wp-content/uploads/2018/02/2-16-524-17_1.-Kotenko.pdf)
* [A Study of Neural Networks for Detecting Low-Intensity Application-Layer DDoS Attacks (Russian)](http://cyberrus.com/wp-content/uploads/2018/02/23-29-524-17_3.-Tarasov.pdf)
* [Detecting Malicious PowerShell Commands Using Deep Neural Networks](https://arxiv.org/pdf/1804.04177.pdf)
* [Machine Learning DDoS Detection for Consumer Internet of Things Devices](https://arxiv.org/pdf/1804.04159.pdf)
* [Anomaly Detection in Computer Systems through Intelligent Analysis of System Logs (Russian)](http://cyberrus.com/wp-content/uploads/2018/06/33-43-226-18_4.-Sheluhin.pdf)
* [EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models](https://arxiv.org/pdf/1804.04637.pdf)
* [A State-of-the-Art Survey of Malware Detection Approaches Using Data Mining Techniques](https://link.springer.com/article/10.1186/s13673-018-0125-x)
* [Investigation of Malicious Portable Executable File Detection on the Network Using Supervised Learning Techniques](https://www.researchgate.net/publication/318665164_Investigation_of_malicious_portable_executable_file_detection_on_the_network_using_supervised_learning_techniques)
* [Machine Learning in Cybersecurity: A Guide](https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=633583)
* [Outside the Closed World: On Using Machine Learning for Network Intrusion Detection](https://personal.utdallas.edu/~muratk/courses/dmsec_files/oakland10-ml.pdf)
* [Machine Learning Based Network Vulnerability Analysis of Industrial Internet of Things](https://arxiv.org/abs/1911.05771)
* [Hopper: Modeling and Detecting Lateral Movement](https://arxiv.org/pdf/2105.13442.pdf1)
* [Finding Effective Security Strategies through Reinforcement Learning and Self-Play](https://arxiv.org/abs/2009.08120)
* [Intrusion Prevention through Optimal Stopping](https://arxiv.org/abs/2111.00289)
* [Cyber Risk Management: AI-Generated Warnings of Threats (Thesis)](https://stacks.stanford.edu/file/druid:mw190gm2975/faberSubmission-augmented.pdf)

## [↑](#table-of-contents) Books

* [Data Mining and Machine Learning in Cybersecurity](https://www.amazon.com/Data-Mining-Machine-Learning-Cybersecurity/dp/1439839425)
* [Machine Learning and Data Mining for Computer Security](https://www.amazon.com/Machine-Learning-Mining-Computer-Security/dp/184628029X)
* [Network Anomaly Detection: A Machine Learning Perspective](https://www.amazon.com/Network-Anomaly-Detection-Learning-Perspective/dp/1466582081)
* [Machine Learning and Security: Protecting Systems with Data and Algorithms](https://www.amazon.com/Machine-Learning-Security-Protecting-Algorithms/dp/1491979909)
* [Introduction to Artificial Intelligence for Security Professionals](https://github.com/cylance/IntroductionToMachineLearningForSecurityPros/blob/master/IntroductionToArtificialIntelligenceForSecurityProfessionals_Cylance.pdf)
* [Mastering Machine Learning for Penetration Testing](https://www.packtpub.com/networking-and-servers/mastering-machine-learning-penetration-testing)
* [Malware Data Science: Attack Detection and Attribution](https://nostarch.com/malwaredatascience)

## [↑](#table-of-contents) Talks

* [Using Machine Learning to Support Information Security](https://www.youtube.com/watch?v=tukidI5vuBs)
* [Defending Networks with Incomplete Information](https://www.youtube.com/watch?v=36IT9VgGr0g)
* [Applying Machine Learning to Network Security Monitoring](https://www.youtube.com/watch?v=vy-jpFpm1AU)
* [Measuring the IQ of Your Threat Intelligence Feeds](https://www.youtube.com/watch?v=yG6QlHOAWiE)
* [Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing](https://www.youtube.com/watch?v=6JMEKnes-w0)
* [Applied Machine Learning for Data Exfil and Other Fun Topics](https://www.youtube.com/watch?v=dGwH7m4N8DE)
* [Secure Because Math: A Deep-Dive on ML-Based Monitoring](https://www.youtube.com/watch?v=TYVCVzEJhhQ)
* [Machine Duping 101: Pwning Deep Learning Systems](https://www.youtube.com/watch?v=JAGDpJFFM2A)
* [Delta Zero, KingPhish3r – Weaponizing Data Science for Social Engineering](https://www.youtube.com/watch?v=l7U0pDcsKLg)
* [Defeating Machine Learning: What Your Security Vendor Is Not Telling You](https://www.youtube.com/watch?v=oiuS1DyFNd8)
* [CrowdSource: Crowd-Trained Machine Learning Model for Malware Capability Detection](https://www.youtube.com/watch?v=u6a7afsD39A)
* [Defeating Machine Learning: Systemic Deficiencies for Detecting Malware](https://www.youtube.com/watch?v=sPtbDUJjhbk)
* [Packet Capture Village – Theodora Titonis – How Machine Learning Finds Malware](https://www.youtube.com/watch?v=2cQRSPFSY-s)
* [Build an Antivirus in 5 Min – Fresh Machine Learning #7. A fun video to watch](https://www.youtube.com/watch?v=iLNHVwSu9EA&t=245s)
* [Hunting for Malware with Machine Learning](https://www.youtube.com/watch?v=zT-4zdtvR30)
* [Machine Learning for Threat Detection](https://www.youtube.com/watch?v=qVwktOa-F34)
* [Machine Learning and the Cloud: Disrupting Threat Detection and Prevention](https://www.youtube.com/watch?v=fRklX97iGIw)
* [Fraud Detection Using Machine Learning and Deep Learning](https://www.youtube.com/watch?v=gHtN4jU69W0)
* [The Applications of Deep Learning on Traffic Identification](https://www.youtube.com/watch?v=yZ-Y1WCM0lc)
* [Defending Networks with Incomplete Information: A Machine Learning Approach](https://www.youtube.com/watch?v=_0CRSF6yPB4)
* [Machine Learning and Data Science](https://vimeo.com/112702666)
* [Advances in Cloud-Scale Machine Learning for Cyber-Defense](https://www.youtube.com/watch?v=skSIIvvZFIk)
* [Applied Machine Learning: Defeating Modern Malicious Documents](https://www.youtube.com/watch?v=ZAuCEgA3itI)
* [Automated Prevention of Ransomware with Machine Learning and GPOs](https://www.rsaconference.com/writable/presentations/file_upload/spo2-t11_automated-prevention-of-ransomware-with-machine-learning-and-gpos.pdf)
* [Learning to Detect Malware by Mining the Security Literature](https://www.usenix.org/conference/enigma2017/conference-program/presentation/dumitras)
* [Clarence Chio and Anto Joseph - Practical Machine Learning in Infosecurity](https://conference.hitb.org/hitbsecconf2017ams/materials/D1T3%20-%20Clarence%20Chio%20and%20Anto%20Joseph%20-%20Practical%20Machine%20Learning%20in%20Infosecurity.pdf)
* [Advances in Cloud-Scale Machine Learning for Cyber-Defense](https://www.youtube.com/watch?v=6Slj2FV9CLA)
* [Machine Learning-Based Techniques for Network Intrusion Detection](https://www.youtube.com/watch?v=-EUJgpiJ8Jo)
* [Practical Machine Learning in Infosec](https://www.youtube.com/watch?v=YF2dm6GZf2U)
* [AI and Security](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/07/AI_and_Security_Dawn_Song.pdf)
* [AI in InfoSec](https://vimeo.com/230502013)
* [Beyond the Blacklists: Detecting Malicious URLs through Machine Learning](https://www.youtube.com/watch?v=Kd3svc9HZ0Y)
* [Machine Learning Fueled Cyber Threat Hunting](https://www.youtube.com/watch?v=c-c-IQ5pFXw)
* [Weaponizing Machine Learning: Humanity Was Overrated](https://www.youtube.com/watch?v=QbX7BhjOOvY)
* [Machine Learning, Offense, and the Future of Automation](https://www.youtube.com/watch?v=BWFdxAG_TGk)
* [Bringing Red vs. Blue to Machine Learning](https://www.youtube.com/watch?v=e5O0Oxt5dYI)
* [Explaining Machine Learning with Azure and the Titanic Dataset](https://www.youtube.com/watch?v=x1DfjUEYm0k)
* [Using Machines to Exploit Machines](https://www.youtube.com/watch?v=VuLvzL-WbBQ)
* [Analyzing Active Directory Event Logs Using Visualization and Machine Learning](https://www.youtube.com/watch?v=ISbbzaCGBns)
* [Hardening Machine Learning Defenses Against Adversarial Attacks](https://www.youtube.com/watch?v=CAwua_lugV8)
* [Deep Neural Networks for Hackers: Methods, Applications, and Open Source Tools](https://www.youtube.com/watch?v=fKJ8sTi6H88)
* [Machine Learning in the Daily Work of a Threat Hunter](https://www.youtube.com/watch?v=vWMRVhDCpao)
* [The Real Deal About AI: Machine Learning for Cybersecurity - Josh Fu](https://www.youtube.com/watch?v=RzakalH1eL8)
* [Automated Detection of Software Vulnerabilities Using Deep Learning](https://www.youtube.com/watch?v=tpzT8ppx5-s)
* [Building and Breaking a Machine Learning System - Johann Rehberger](https://www.youtube.com/watch?v=-SV80sIBhqY)
* [Vulnerabilities of Machine Learning Infrastructure - Sergey Gordeychik](https://www.youtube.com/watch?v=5bWyY3kocdE)

## [↑](#table-of-contents) Tutorials

* [Machine Learning Based Password Strength Classification](http://web.archive.org/web/20170606022743/http://fsecurify.com/machine-learning-based-password-strength-checking/)
* [Using Machine Learning to Classify Packet Captures](https://medium.com/@siddharthsatpathy.ss/introducing-flowmeter-97e0507862b6)
* [Using Machine Learning to Detect Malicious URLs](http://web.archive.org/web/20170514093208/http://fsecurify.com/using-machine-learning-detect-malicious-urls/)
* [Using Deep Learning to Break a Captcha System](https://deepmlblog.wordpress.com/2016/01/03/how-to-break-a-captcha-system/)
* [Data Mining for Network Security and Intrusion Detection](https://www.r-bloggers.com/data-mining-for-network-security-and-intrusion-detection/)
* [Applying Machine Learning to Improve Your Intrusion Detection System](https://securityintelligence.com/applying-machine-learning-to-improve-your-intrusion-detection-system/)
* [Analyzing Botnets with Suricata and Machine Learning](http://blogs.splunk.com/2017/01/30/analyzing-botnets-with-suricata-machine-learning/)
* [fWaf – Machine Learning Driven Web Application Firewall](http://web.archive.org/web/20170706222016/http://fsecurify.com/fwaf-machine-learning-driven-web-application-firewall/)
* [Deep Session Learning for Cyber Security](https://blog.cyberreboot.org/deep-session-learning-for-cyber-security-e7c0f6804b81#.eo2m4alid)
* [Machine Learning for Malware Detection](http://resources.infosecinstitute.com/machine-learning-malware-detection/)
* [ShadowBrokers Leak: A Machine Learning Approach](https://marcoramilli.blogspot.ru/2017/04/shadowbrokers-leak-machine-learning.html)
* [Practical Machine Learning in Infosec - Virtualbox Image and Materials](https://docs.google.com/document/d/1v4plS1EhLBfjaz-9GHBqspTH7vnrJfqLrLjeP9k9i9A/edit)
* [A Machine Learning Toolkit for Large-Scale eCrime Forensics](http://blog.trendmicro.com/trendlabs-security-intelligence/defplorex-machine-learning-toolkit-large-scale-ecrime-forensics/)
* [WebShell Detection by Machine Learning](https://github.com/lcatro/WebShell-Detect-By-Machine-Learning)
* [Build Machine Learning Models for the SOC](https://www.fireeye.com/blog/threat-research/2018/06/build-machine-learning-models-for-the-soc.html)
* [Detecting Web Attacks with Recurrent Neural Networks](https://aivillage.org/posts/detecting-web-attacks-rnn/)
* [Machine Learning for Red Teams, Part 1](https://silentbreaksecurity.com/machine-learning-for-red-teams-part-1/)
* [Detecting Reverse Shell with Machine Learning](https://www.cyberbit.com/blog/endpoint-security/detecting-reverse-shell-with-machine-learning/)
* [Obfuscated Command Line Detection Using Machine Learning](https://www.fireeye.com/blog/threat-research/2018/11/obfuscated-command-line-detection-using-machine-learning.html)
* [Detecting Web Attacks with Recurrent Neural Networks (Russian)](https://habr.com/ru/company/pt/blog/439202/)
* [Clear and Creepy Danger of Machine Learning: Hacking Passwords](https://towardsdatascience.com/clear-and-creepy-danger-of-machine-learning-hacking-passwords-a01a7d6076d5)
* [Discovering Anomalous Patterns Based on Parent-Child Process Relationships](https://www.elastic.co/cn/blog/discovering-anomalous-patterns-based-on-parent-child-process-relationships)
* [Machine Learning for Detecting Phishing Websites](https://faizanahmad.tech/blog/2020/02/phishytics-machine-learning-for-phishing-websites-detection/)
* [Password Hunting with Machine Learning in Active Directory](https://blog.hunniccyber.com/password-hunting-with-ml-in-active-directory/)
* [How to Build a Machine Learning Based Computer Attack Detection System on Your Own (Russian)](https://habr.com/ru/post/538296/)

## [↑](#table-of-contents) Courses

* [Data Mining for Cyber Security by Stanford](http://web.stanford.edu/class/cs259d/)
* [Data Science and Machine Learning for Infosec](http://www.pentesteracademy.com/course?id=30)
* [Cybersecurity Data Science on Udemy](https://www.udemy.com/cybersecurity-data-science)
* [Machine Learning for Red Team Hackers on Udemy](https://www.udemy.com/course/machine-learning-for-red-team-hackers/)
* [Machine Learning for Security](https://security.kiwi/docs/introduction/)

## [↑](#table-of-contents) Miscellaneous

* [System Predicts 85 Percent of Cyber-Attacks Using Input from Human Experts](http://news.mit.edu/2016/ai-system-predicts-85-percent-cyber-attacks-using-input-human-experts-0418)
* [A Machine Learning Packet Classification Tool That Looks at Packet Headers](https://github.com/deepfence/FlowMeter)
* [A List of Open Source Cyber Security Projects Using Machine Learning](http://www.mlsec.org/)
* [Source Code About Machine Learning and Security](https://github.com/13o-bbr-bbq/machine_learning_security)
* [Source Code for Mastering Machine Learning for Penetration Testing](https://github.com/PacktPublishing/Mastering-Machine-Learning-for-Penetration-Testing)
* [Convolutional Neural Network for Analyzing Pentest Screenshots](https://github.com/BishopFox/eyeballer)
* [Big Data and Data Science for Security and Fraud Detection](http://www.kdnuggets.com/2015/12/big-data-science-security-fraud-detection.html)
* [StringSifter - A Machine Learning Tool That Ranks Strings Based on Their Relevance for Malware Analysis](https://github.com/fireeye/stringsifter)

## License

![cc license](http://i.creativecommons.org/l/by-sa/4.0/88x31.png)

This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International](http://creativecommons.org/licenses/by-sa/4.0/) license.

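Tags: AI security, AMSI bypass, Apex, Chat Copilot, DARPA, DAST, KDD99, ML, TruffleHog, WAF, web security, artificial intelligence, threat detection, learning resources, academic papers, security datasets, malicious URL detection, malware analysis, tutorials, machine learning, user-mode hook bypass, cybersecurity, network traffic analysis, blue team analysis, resource collections, reverse engineering tools, privacy protection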