Python Crawler Anti-Scraping: Front-End JS Encryption of URL Parameters, with a Python/Java Decryption Algorithm
Source: CSDN. Original link: https://blog.csdn.net/qq_36853469/article/details/103364412
Only the initial (unencrypted) value is given; the encrypted URL has to be built from it.

The encrypted URL:

Before and after encryption:
Value available before encryption: 4028492d6e647e23016eb10507286507
It is processed into:
2576c1666c61673d33266e616d653d47435f4a59266b65793d3430323834393264366536343765323330313665623130353037323836353037
After encryption:
http://ggzy.haikou.gov.cn/login.do?method=newDetail&param=2576c1666c61673d33266e616d653d47435f4a59266b65793d3430323834393264366536343765323330313665623130353037323836353037
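To see what the processing step does, the sample parameter can be hex-decoded. The following is a minimal sketch, not the site's own code. The helper name `decode_param` and the decision to skip the first six characters are assumptions drawn from the sample: those six characters are not part of the ASCII-hex payload, and the post does not explain how they are derived.

```python
# Sample value of the `param` query argument, copied from the example above.
SAMPLE_PARAM = (
    "2576c1"
    "666c61673d33266e616d653d47435f4a59266b65793d"
    "3430323834393264366536343765323330313665623130353037323836353037"
)

# Assumption: the first 6 characters are a prefix, not part of the payload.
PREFIX_LEN = 6


def decode_param(param: str, prefix_len: int = PREFIX_LEN) -> str:
    """Hex-decode everything after the prefix into ASCII text."""
    return bytes.fromhex(param[prefix_len:]).decode("ascii")


if __name__ == "__main__":
    print(decode_param(SAMPLE_PARAM))
```

Run on the sample, this prints `flag=3&name=GC_JY&key=4028492d6e647e23016eb10507286507`, so the "encryption" here is the ASCII text of a small query string written out as hexadecimal, with the original value in the `key` field.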

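After a quick pass through the page, I located the JS that performs the encryption.

I then checked a reference blog post (https://blog.csdn.net/brantni/article/details/48025479), which confirmed that this is the encryption method. Following its JS + Java back-end decryption approach, the Python version is: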
# -*- coding: UTF-8 -*-
'''
@Author: Jason
Before encryption: 4028492d6e647e23016eb10507286507
Second segment after encryption: 2576c1666c61673d33266e616d653d47435f4a59266b65793d 3430323834393264366536343765323330313665623130353037323836353037
'''
'''
Python implementation of the decryption method
'''
import random
import re


class DecgyptHN(object):
    def __init__(self):
        self.originStr = "4028492d6e647e23016eafe1e4f75359"
        self.first_url = "http://ggzy.haikou.gov.cn/login.do?method=newDetail&param="
        self.second_random = str(random.randint(100, 999))
        self.third_url = "6c1666c61673d33266e616d653d5a435f4a59266b65793d"

    def dectypt(self, originStr):
        # Despite its name, this method hex-encodes originStr for use in the URL.
        # The original patterns ([a-zA-Z]* and [_-+.]*) also matched the empty
        # string, so this branch was always taken; search for one real match.
        if re.search(r'[a-zA-Z_+.\-]', originStr):
            tempList = []
            # An earlier attempt assigned to originStr[i] in place and failed,
            # because Python strings are immutable; collect the pieces instead.
            for ch in originStr:
                code = ord(ch)  # Unicode code point of the character
                tempList.append(self.toHex(code))  # code point as hexadecimal
            temp = "".join(tempList)
            return temp + "{1"
        else:  # digits only: convert the whole number to hexadecimal
            temp = self.toHex(int(originStr))
            return temp + "{0"

    def toHex(self, num):
        """
        :type num: int
        :rtype: str
        """
        chaDic = {10: 'a', 11: 'b', 12: 'c', 13: 'd', 14: 'e', 15: 'f'}
        if num >= 0:
            hexStr = ""
            while num >= 16:
                rest = num % 16
                hexStr = chaDic.get(rest, str(rest)) + hexStr
                num //= 16
            hexStr = chaDic.get(num, str(num)) + hexStr
            return hexStr
        else:
            if num == -2147483648:  # special case: minimum 32-bit integer
                return "80000000"
            num = -num  # work on the magnitude
            bitList = [0] * 31
            tail = 30
            while num >= 2:  # convert the number to binary
                rest = num % 2
                bitList[tail] = rest
                tail -= 1
                num //= 2
            bitList[tail] = num
            for i in range(31):  # ones' complement
                bitList[i] = 1 if bitList[i] == 0 else 0
            tail = 30
            add = 1
            while add + bitList[tail] == 2:  # add 1 to the ones' complement
                bitList[tail] = 0
                tail -= 1
            bitList[tail] = 1
            bitList = [1] + bitList  # prepend the sign bit
            hexStr = ""
            for i in range(0, 32, 4):  # binary to hexadecimal
                add = 0
                for j in range(0, 4):
                    add += bitList[i + j] * 2 ** (3 - j)
                hexStr += chaDic.get(add, str(add))
            return hexStr

    def main(self):
        fourth_url = self.dectypt(self.originStr)
        full_url = self.first_url + self.second_random + self.third_url + fourth_url
        return full_url


if __name__ == "__main__":
    HN = DecgyptHN()
    full_url = HN.main()
    print(full_url)
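The same encoding can be sketched more compactly with Python's built-in formatting. This is an illustrative sketch under stated assumptions, not the site's algorithm. `to_hex` and `build_param` are hypothetical names, and the `flag=...&name=...&key=...` layout is what the sample parameter yields when hex-decoded. The six-character prefix is passed in unchanged because the post does not explain how it is derived; the script above uses three random digits followed by `6c1`.

```python
def to_hex(text: str) -> str:
    """Hex-encode each character's code point, two digits per ASCII character."""
    return "".join(format(ord(ch), "02x") for ch in text)


def build_param(key: str, name: str, prefix: str, flag: str = "3") -> str:
    """Assemble prefix + hex("flag=...&name=...&key=" + key)."""
    payload = "flag={}&name={}&key={}".format(flag, name, key)
    return prefix + to_hex(payload)


if __name__ == "__main__":
    # Values copied from the before/after example in this post.
    sample_key = "4028492d6e647e23016eb10507286507"
    param = build_param(sample_key, name="GC_JY", prefix="2576c1")
    print("http://ggzy.haikou.gov.cn/login.do?method=newDetail&param=" + param)
    # Round trip: decoding the payload gives back the query-style text.
    print(bytes.fromhex(param[6:]).decode("ascii"))
```

With the sample key, `name="GC_JY"` and prefix `2576c1`, `build_param` reproduces the encrypted parameter from the example. For printable ASCII, `format(..., "02x")` gives the same digits as the hand-written `toHex`. Unlike the script, this sketch does not append a `{1` suffix, since the sample parameter does not contain one.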