Batch Website Backup File Scanner: ihoneyBakFileScan_Modify
Author: Sec-Labs | Published:
Project Introduction
A batch website backup file scanner, with additional filename rules and a smaller memory footprint.
Project Address
https://github.com/VMsec/ihoneyBakFileScan_Modify
Changes added on 2022.2.8
Added more backup-file fuzzing rules
Changed how backup-file sizes are computed (pip3 install hurry-filesize)
Changed the rule for deciding whether a backup file exists
Switched to multi-threaded scanning with lower memory usage
Tested: a 1-core / 1 GB VPS can sustain 500 threads at full load
1. Introduction
1.1 Risks of leaked website backup files:
1. Backup files left on the site (e.g. database dumps or site source-code archives) make it much easier for an attacker to gain control of the site.
2. Sensitive-file disclosure is a high-severity vulnerability class: leaked database configuration, admin panel paths, or physical paths give an attacker a foothold for further attacks and leave the system's door wide open.
3. Because the target backup file can be large (xxx.G), even more sensitive data may be exposed.
4. Once downloaded, a backup can be used for code auditing, enabling far more damaging follow-up attacks.
5. The disclosed information exposes sensitive details about the server, which an attacker can use to push the intrusion further.
1.2 Dependencies
Development environment:
Python 3.5.3
pip3.5 (pip 10.0.1)
requests 2.19.1
Install the third-party libraries:
pip3.5 install requests
pip3 install hurry-filesize
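The hurry.filesize dependency turns a raw byte count (from the Content-Length header) into a short human-readable string such as '2K' or '3M'. As a rough illustration of that behavior, here is a stdlib-only sketch (not the library itself; the function name is ours):

```python
def human_size(n: int, units=('B', 'K', 'M', 'G', 'T')) -> str:
    """Integer-division size formatting, similar in spirit to hurry.filesize.size()."""
    for u in units[:-1]:
        if n < 1024:
            return '{}{}'.format(n, u)
        n //= 1024
    return '{}{}'.format(n, units[-1])

print(human_size(0))                 # 0B
print(human_size(2048))              # 2K
print(human_size(3 * 1024 * 1024))   # 3M
```

The scanner later relies on exactly this property: a response whose formatted size starts with 0 (e.g. '0B') is treated as an empty, and therefore uninteresting, file.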
1.3 Tool internals:
1. Common suffixes:
* '.rar', '.zip', '.gz', '.sql.gz', '.tar.gz' ...
2. File-header (magic byte) identification:
* rar: 526172211a0700cf9073
* zip: 504b0304140000000800
* gz: 1f8b080000000000000b (also covers '.sql.gz'; '1f8b0800' is used as the keyword)
* tar.gz: 1f8b0800
* sql: each export tool writes a different file header
* Adminer
* mysqldump
* phpMyAdmin
* navicat
3. Identifying how a database dump was exported:
* Export method: header string: leading header bytes (hex):
* mysqldump: -- MySQL dump: 2d2d204d7953514c
* phpMyAdmin: -- phpMyAdmin SQL Dump: 2d2d207068704d794164
* navicat: /* Navicat : 2f2a0a204e617669636174
* Adminer: -- Adminer x.x.x MySQL dump: 2d2d2041646d696e6572 (xxx.sql added on May 9)
* Navicat MySQL Data Transfer: /* Navicat: 2f2a0a4e617669636174
* An unidentified export format: -- -------: 2d2d202d2d2d2d2d2d2d
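The header checks above boil down to hex-encoding the first bytes of a response and comparing the result against known signature prefixes. A minimal sketch of that idea, using a subset of the signatures from the table (the function name is illustrative, not from the tool):

```python
from binascii import b2a_hex

# Hex signatures from the table above (leading bytes of each format)
SIGNATURES = {
    'rar': '526172',
    'zip': '504b03',
    'gz': '1f8b0800',                      # also matches .sql.gz and .tar.gz
    'mysqldump': '2d2d204d7953514c',       # "-- MySQL"
    'phpMyAdmin': '2d2d207068704d794164',  # "-- phpMyAd"
    'Adminer': '2d2d2041646d696e6572',     # "-- Adminer"
}

def match_signature(first_bytes: bytes):
    """Return the format whose signature prefixes the given bytes, else None."""
    head = b2a_hex(first_bytes).decode()
    for name, sig in SIGNATURES.items():
        if head.startswith(sig):
            return name
    return None

print(match_signature(b'PK\x03\x04\x14\x00'))   # zip
print(match_signature(b'-- MySQL dump 10.13'))  # mysqldump
```

Only a few bytes of the body need to be fetched for this check, which is why the scanner's commented-out header check reads just 10 bytes from a streamed response.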
4. Automatically generates a scan dictionary from the target domain:
➜ ihoneyBakFileScan python3.5 ihoneyBakFileScan.py -u https://www.ihoney.net.cn
[ ] https://www.ihoney.net.cn/__zep__/js.zip
[ ] https://www.ihoney.net.cn/faisunzip.zip
[ ] https://www.ihoney.net.cn/www.ihoney.net.cn.rar
[ ] https://www.ihoney.net.cn/wwwihoneynetcn.rar
[ ] https://www.ihoney.net.cn/ihoneynetcn.rar
[ ] https://www.ihoney.net.cn/ihoney.net.cn.rar
[ ] https://www.ihoney.net.cn/www.rar
[ ] https://www.ihoney.net.cn/ihoney.rar
[*] https://www.ihoney.net.cn/www.ihoney.net.cn.zip size:0M
[ ] https://www.ihoney.net.cn/wwwihoneynetcn.zip
[ ] https://www.ihoney.net.cn/ihoneynetcn.zip
[ ] https://www.ihoney.net.cn/ihoney.net.cn.zip
[ ] https://www.ihoney.net.cn/www.zip
[ ] https://www.ihoney.net.cn/ihoney.zip
[ ] https://www.ihoney.net.cn/www.ihoney.net.cn.gz
[ ] https://www.ihoney.net.cn/wwwihoneynetcn.gz
[ ] https://www.ihoney.net.cn/ihoneynetcn.gz
[ ] https://www.ihoney.net.cn/ihoney.net.cn.gz
[ ] https://www.ihoney.net.cn/www.gz
[ ] https://www.ihoney.net.cn/ihoney.gz
[ ] https://www.ihoney.net.cn/www.ihoney.net.cn.sql.gz
[ ] https://www.ihoney.net.cn/wwwihoneynetcn.sql.gz
[ ] https://www.ihoney.net.cn/ihoneynetcn.sql.gz
[ ] https://www.ihoney.net.cn/ihoney.net.cn.sql.gz
[ ] https://www.ihoney.net.cn/www.sql.gz
[ ] https://www.ihoney.net.cn/ihoney.sql.gz
[ ] https://www.ihoney.net.cn/www.ihoney.net.cn.tar.gz
[ ] https://www.ihoney.net.cn/wwwihoneynetcn.tar.gz
[ ] https://www.ihoney.net.cn/ihoneynetcn.tar.gz
[ ] https://www.ihoney.net.cn/ihoney.net.cn.tar.gz
[ ] https://www.ihoney.net.cn/www.tar.gz
[ ] https://www.ihoney.net.cn/ihoney.tar.gz
[ ] https://www.ihoney.net.cn/www.ihoney.net.cn.sql
[ ] https://www.ihoney.net.cn/wwwihoneynetcn.sql
[ ] https://www.ihoney.net.cn/ihoneynetcn.sql
[ ] https://www.ihoney.net.cn/ihoney.net.cn.sql
[ ] https://www.ihoney.net.cn/www.sql
[ ] https://www.ihoney.net.cn/ihoney.sql
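The candidate names above follow a simple pattern: several stems derived from the hostname, crossed with the common archive suffixes. A condensed sketch of that generation logic (function name and trimmed suffix list are ours for illustration):

```python
from urllib.parse import urlparse

def domain_wordlist(url):
    """Derive filename stems from the hostname, then cross with common suffixes."""
    host = urlparse(url).hostname        # e.g. 'www.ihoney.net.cn'
    labels = host.split('.')
    stems = {
        host,                            # www.ihoney.net.cn
        host.replace('.', ''),           # wwwihoneynetcn
        host.split('.', 1)[-1],          # ihoney.net.cn
        ''.join(labels[1:]),             # ihoneynetcn
        labels[0],                       # www
        labels[1],                       # ihoney
    }
    suffixes = ['.rar', '.zip', '.gz', '.sql.gz', '.tar.gz', '.sql']
    return sorted(stem + suf for stem in stems for suf in suffixes)

words = domain_wordlist('https://www.ihoney.net.cn')
print(len(words))  # 36
```

The real tool additionally prepends a fixed wordlist of common names (www, backup, db, ...) and more suffixes; see the core code in section 4.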
5. Automatically records successfully found backup URLs to a file named with the current timestamp
For example, 20180616_16-28-14.txt:
https://www.ihoney.net.cn/ihoney.tar.gz size:0M
https://www.ihoney.net.cn/www.ihoney.net.cn.zip size:0M
2. Usage
Options:
-h --help       show this help message
-f --url-file   file containing URLs for batch scanning; each URL should specify http:// or https://, otherwise http:// is assumed
-t --thread     number of threads (100 recommended)
-u --url        target URL for a single-site scan
-d --dict-file  custom scan dictionary
Examples:
Batch URL scan:  python3.5 ihoneyBakFileScan.py -t 100 -f url.txt
Single URL scan: python3.5 ihoneyBakFileScan.py -u https://www.ihoneysec.top/
python3.5 ihoneyBakFileScan.py -u www.ihoney.net.cn
python3.5 ihoneyBakFileScan.py -u www.ihoney.net.cn -d dict.txt
3. ChangeLog:
[2018.04.20] First released on T00ls: identifies rar and zip backups by file header, auto-generates a scan dictionary from the domain name, and records found backup URLs to a file
[2018.04.26]
Appended the backup size to each successful hit, to quickly spot usable backups.
Added .sql detection, also by file header; three headers are recognized so far, from different export tools: 1. mysqldump, 2. phpMyAdmin, 3. Navicat.
[2018.05.19] Added two formats exported by Adminer: baidu.sql and baidu.sql.gz
[2018.05.31] Added the Navicat MySQL Data Transfer export format and one more unidentified export format
[2018.06.16] Fixed scanning of https sites, and split this tool out of the old project as a standalone one
[2018.06.18] Switched from threads plus a queue to a process pool, improving scan speed
4. Core code:
# -*- coding: UTF-8 -*-
# 2018.04.20 www.T00ls.net
# __author__: ihoneysec
import requests
import logging
from binascii import b2a_hex
import threading
from queue import Queue
from argparse import ArgumentParser
from copy import deepcopy
from datetime import datetime
from hurry.filesize import size

requests.packages.urllib3.disable_warnings()
logging.basicConfig(level=logging.WARNING, format="%(message)s")
def vlun(q, df):
    while q.empty() is not True:
        urltarget = q.get()
        # print(urltarget)
        '''
        rar_byte = '526172'
        zip_byte = '504b03'
        gz_byte = '1f8b080000000000000b'
        mysqldump_byte = '2d2d204d7953514c'
        phpmyadmin_byte = '2d2d207068704d794164'
        navicat_byte = '2f2a0a204e6176696361'
        adminer_byte = '2d2d2041646d696e6572'
        other_byte = '2d2d202d2d2d2d2d2d2d'
        navicat_MDT_byte = '2f2a0a4e617669636174'
        tar_gz_byte = '1f8b0800'
        '''
        try:
            r = requests.get(url=urltarget, headers=headers, timeout=timeout, allow_redirects=False, stream=True, verify=False)
            # content = b2a_hex(r.raw.read(10)).decode()
            # Default to '' so the 'in' checks below never run against None
            content_type = r.headers.get('Content-Type', '')
            if r.status_code == 200 and 'html' not in content_type and 'xml' not in content_type and 'json' not in content_type and 'javascript' not in content_type:
                '''
                rarsize = int(r.headers.get('Content-Length'))
                if rarsize >= 1024000000:
                    unit = int(rarsize) // 1024 // 1024 / 1000
                    rarsize = str(unit) + 'G'
                elif rarsize >= 1024000:
                    unit = int(rarsize) // 1024 // 1024
                    rarsize = str(unit) + 'M'
                else:
                    unit = int(rarsize) // 1024
                    rarsize = str(unit) + 'K'
                if content.startswith(rar_byte) or content.startswith(zip_byte) or content.startswith(gz_byte) or \
                        content.startswith(mysqldump_byte) or content.startswith(phpmyadmin_byte) or \
                        content.startswith(navicat_byte) or content.startswith(adminer_byte) or \
                        content.startswith(other_byte) or content.startswith(navicat_MDT_byte) or \
                        content.startswith(tar_gz_byte):
                # if int(unit) > 0:
                '''
                tmp_rarsize = int(r.headers.get('Content-Length'))
                rarsize = str(size(tmp_rarsize))
                if int(rarsize[0:-1]) > 0:
                    logging.warning('[ success ] {} size:{}'.format(urltarget, rarsize))
                    with open(df, 'a') as f:
                        try:
                            f.write(str(urltarget) + ' ' + 'size:' + str(rarsize) + '\n')
                        except:
                            pass
                else:
                    logging.warning('[ fail ] {}'.format(urltarget))
            else:
                logging.warning('[ fail ] {}'.format(urltarget))
        except Exception:
            pass
        q.task_done()
def urlcheck(target=None, ulist=None):
    if target is not None and ulist is not None:
        if target.startswith('http://') or target.startswith('https://'):
            if target.endswith('/'):
                ulist.append(target)
            else:
                ulist.append(target + '/')
        else:
            line = 'http://' + target
            if line.endswith('/'):
                ulist.append(line)
            else:
                ulist.append(line + '/')
    return ulist
def dispatcher(url_file=None, url=None, max_thread=1, dic=None):
    urllist = []
    check_urllist = []
    global q
    if url_file is not None and url is None:
        with open(str(url_file)) as f:
            while True:
                line = str(f.readline()).strip()
                if line:
                    urllist = urlcheck(line, urllist)
                else:
                    break
    elif url is not None and url_file is None:
        url = str(url.strip())
        urllist = urlcheck(url, urllist)
    else:
        pass
    with open(datefile, 'w'):
        pass
    for u in urllist:
        cport = None
        # str.lstrip() strips a *set of characters*, not a prefix
        # (e.g. 'http://test.com'.lstrip('http://') -> 'est.com'), so slice the scheme off instead
        if u.startswith('http://'):
            ucp = u[len('http://'):]
        elif u.startswith('https://'):
            ucp = u[len('https://'):]
        if '/' in ucp:
            ucp = ucp.split('/')[0]
            if ':' in ucp:
                cport = ucp.split(':')[1]
                ucp = ucp.split(':')[0]
            www1 = ucp.split('.')
        else:
            www1 = ucp.split('.')
        wwwlen = len(www1)
        wwwhost = ''
        for i in range(1, wwwlen):
            wwwhost += www1[i]
        current_info_dic = deepcopy(dic)  # deep copy
        suffixFormat = ['.zip','.rar','.tar.gz','.tgz','.tar.bz2','.tar','.jar','.war','.7z','.bak','.sql','.gz','.sql.gz','.tar.tgz']
        domainDic = [ucp, ucp.replace('.', ''), wwwhost, ucp.split('.', 1)[-1], www1[0], www1[1]]
        for s in suffixFormat:
            for d in domainDic:
                current_info_dic.extend([d + s])
        for info in current_info_dic:
            url = str(u) + str(info)
            check_urllist.append(url)
            print("[add check] " + url)
    q = Queue()
    for url in check_urllist:
        q.put(url)
    for index in range(max_thread):
        thread = threading.Thread(target=vlun, args=(q, datefile))
        # thread.daemon = True
        thread.start()
    q.join()
if __name__ == '__main__':
    usageexample = '\n    Example: python3.5 ihoneyBakFileScan.py -t 100 -f url.txt\n'
    usageexample += '             '
    usageexample += 'python3.5 ihoneyBakFileScan.py -u https://www.example.com/'
    parser = ArgumentParser(add_help=True, usage=usageexample, description='A Website Backup File Leak Scan Tool.')
    parser.add_argument('-f', '--url-file', dest="url_file", help="Example: url.txt")
    parser.add_argument('-t', '--thread', dest="max_threads", nargs='?', type=int, default=1, help="Max threads")
    parser.add_argument('-u', '--url', dest='url', nargs='?', type=str, help="Example: http://www.example.com/")
    parser.add_argument('-d', '--dict-file', dest='dict_file', nargs='?', help="Example: dict.txt")
    args = parser.parse_args()
    # Default dictionary: accurate scanning mode, generated automatically from the domain name.
    tmp_suffixFormat = ['.zip','.rar','.tar.gz','.tgz','.tar.bz2','.tar','.jar','.war','.7z','.bak','.sql','.gz','.sql.gz','.tar.tgz']
    # 77 entries
    tmp_info_dic = ['1','127.0.0.1','2010','2011','2012','2013','2014','2015','2016','2017','2018','2019','2020','2021','2022','2023','2024','2025','__zep__/js','admin','archive','asp','aspx','auth','back','backup','backups','bak','bbs','bin','clients','code','com','customers','dat','data','database','db','dump','engine','error_log','faisunzip','files','forum','home','html','index','joomla','js','jsp','local','localhost','master','media','members','my','mysql','new','old','orders','php','sales','site','sql','store','tar','test','user','users','vb','web','website','wordpress','wp','www','wwwroot','root']
    # 130 entries (alternative, more aggressive dictionary)
    #tmp_info_dic = ['__zep__/js','0','00','000','012','1','111','123','127.0.0.1','2','2010','2011','2012','2013','2014','2015','2016','2017','2018','2019','2020','2021','2022','2023','2024','2025','234','3','333','4','444','5','555','6','666','7','777','8','888','9','999','a','about','admin','app','application','archive','asp','aspx','auth','b','back','backup','backups','bak','bbs','beifen','bin','cache','clients','code','com','config','core','customers','dat','data','database','db','download','dump','engine','error_log','extend','files','forum','ftp','home','html','img','include','index','install','joomla','js','jsp','local','login','localhost','master','media','members','my','mysql','new','old','orders','output','package','php','public','root','runtime','sales','server','shujuku','site','sjk','sql','store','tar','template','test','upload','user','users','vb','vendor','wangzhan','web','website','wordpress','wp','www','wwwroot','wz','数据库','数据库备份','网站','网站备份']
    info_dic = []
    for a in tmp_info_dic:
        for b in tmp_suffixFormat:
            info_dic.extend([a + b])
    datefile = datetime.now().strftime('%Y%m%d_%H-%M-%S.txt')
    headers = {'User-Agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36", }
    timeout = 10
    try:
        if args.dict_file:
            # Custom scan dictionary.
            # Not recommended for bulk scans: it is prone to false positives and reduces efficiency.
            custom_dict = list(set([i.replace("\n", "") for i in open(str(args.dict_file), "r").readlines()]))
            info_dic.extend(custom_dict)
        if args.url:
            dispatcher(url=args.url, max_thread=args.max_threads, dic=info_dic)
        elif args.url_file:
            dispatcher(url_file=args.url_file, max_thread=args.max_threads, dic=info_dic)
        else:
            print("[!] Please specify a URL, or URL file name.")
    except Exception as e:
        print(e)
Tags: tool sharing, sensitive-information discovery tools