Python crawler anti-scraping: front-end JS encryption of URL parameters, with Python and Java decryption algorithms
Source: CSDN. Original link: https://blog.csdn.net/qq_36853469/article/details/103364412
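Only the initial value is given.
The encrypted URL:
Before and after encryption:
The value available before encryption: 4028492d6e647e23016eb10507286507
It is processed into: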
2576c1666c61673d33266e616d653d47435f4a59266b65793d3430323834393264366536343765323330313665623130353037323836353037
After encryption, the full URL is:
http://ggzy.haikou.gov.cn/login.do?method=newDetail&param=2576c1666c61673d33266e616d653d47435f4a59266b65793d3430323834393264366536343765323330313665623130353037323836353037
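To see what the processed value contains, the sample can be decoded back with Python's built-in hex support. This is a minimal sketch, not the site's own code: `decode_param` and `PREFIX_LEN` are hypothetical names, and treating the first 6 characters (`2576c1`) as an opaque prefix is an assumption read off the sample value.

```python
# The sample `param` value from the URL above.
SAMPLE_PARAM = (
    "2576c1666c61673d33266e616d653d47435f4a59266b65793d"
    "3430323834393264366536343765323330313665623130353037323836353037"
)

# Assumption: the first 6 characters are a prefix that is not hex-encoded text.
PREFIX_LEN = 6


def decode_param(param, prefix_len=PREFIX_LEN):
    """Turn the hex part of `param` back into its ASCII text."""
    return bytes.fromhex(param[prefix_len:]).decode("ascii")


if __name__ == "__main__":
    # prints: flag=3&name=GC_JY&key=4028492d6e647e23016eb10507286507
    print(decode_param(SAMPLE_PARAM))
```

Under that assumption, everything after the prefix is the two-digit hex code of each character in a small query string, with the original key as its last field.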
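After a quick pass through the page, the JS that does the encryption was identified.
A reference blog post, https://blog.csdn.net/brantni/article/details/48025479, confirmed that this is the encryption method. Following its JS + Java back-end decryption approach, the Python version is: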
# -*- coding: UTF-8 -*-
'''
@Author: Jason
Before encryption: 4028492d6e647e23016eb10507286507
Second segment after encryption:
2576c1666c61673d33266e616d653d47435f4a59266b65793d
3430323834393264366536343765323330313665623130353037323836353037
'''
'''
Python implementation of the decryption method
'''
import random
import re


class DecgyptHN(object):
    def __init__(self):
        self.originStr = "4028492d6e647e23016eafe1e4f75359"
        self.first_url = "http://ggzy.haikou.gov.cn/login.do?method=newDetail&param="
        self.second_random = str(random.randint(100, 999))
        # "6c1" followed by the hex of "flag=3&name=ZC_JY&key="
        self.third_url = "6c1666c61673d33266e616d653d5a435f4a59266b65793d"

    def dectypt(self, originStr):
        # The original test, re.findall(r'[a-zA-Z]*', ...), always returned a
        # non-empty list, so the all-digits branch could never run.
        if re.search(r'[a-zA-Z_+.\-]', originStr):
            # Strings are immutable, so collect each character's hex code in a
            # list instead of assigning to originStr[i].
            tempList = []
            for ch in originStr:
                code = ord(ch)  # convert to the Unicode code point
                tempList.append(self.toHex(code))  # convert to hexadecimal
            temp = "".join(tempList)
            # Type marker appended by this port; note that the sample param
            # shown earlier carries no such suffix.
            return temp + "{1"
        else:
            # All digits: convert the whole number to hexadecimal
            temp = self.toHex(int(originStr))
            return temp + "{0"

    def toHex(self, num):
        """
        :type num: int
        :rtype: str
        """
        chaDic = {10: 'a', 11: 'b', 12: 'c', 13: 'd', 14: 'e', 15: 'f'}
        if num >= 0:
            hexStr = ""
            while num >= 16:
                rest = num % 16
                hexStr = chaDic.get(rest, str(rest)) + hexStr
                num //= 16
            hexStr = chaDic.get(num, str(num)) + hexStr
            return hexStr
        else:
            if num == -2147483648:  # special case: most negative 32-bit value
                return "80000000"
            num = -num  # work with the absolute value
            bitList = [0] * 31
            tail = 30
            while num >= 2:  # convert the number to binary
                rest = num % 2
                bitList[tail] = rest
                tail -= 1
                num //= 2
            bitList[tail] = num
            for i in range(31):  # ones' complement
                bitList[i] = 1 if bitList[i] == 0 else 0
            tail = 30
            add = 1
            while add + bitList[tail] == 2:  # add 1 to the ones' complement
                bitList[tail] = 0
                tail -= 1
            bitList[tail] = 1
            bitList = [1] + bitList  # prepend the sign bit
            hexStr = ""
            for i in range(0, 32, 4):  # binary to hexadecimal
                add = 0
                for j in range(0, 4):
                    add += bitList[i + j] * 2 ** (3 - j)
                hexStr += chaDic.get(add, str(add))
            return hexStr

    def main(self):
        fourth_url = self.dectypt(self.originStr)
        full_url = self.first_url + self.second_random + self.third_url + fourth_url
        return full_url


if __name__ == "__main__":
    HN = DecgyptHN()
    full_url = HN.main()
    print(full_url)
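The per-character hex conversion in the listing can also be written with Python's built-in `bytes.hex()`. The sketch below is an alternative, not the author's code: `build_param`, `MARKER`, and the `digits` argument are hypothetical names, and the layout (three digits, then `6c1`, then the hex of `flag=...&name=...&key=...`) is an assumption read off the sample value earlier in the article. It omits the `{1` suffix that the listing appends, since the sample value does not have one.

```python
import random

# Assumption: "6c1" is a fixed marker copied from the sample value; the
# article does not explain what it means.
MARKER = "6c1"


def build_param(key, name="GC_JY", flag=3, digits=None):
    """Hypothetical helper: 3 digits + marker + hex of the query string."""
    if digits is None:
        # Three random digits, as in the listing's second_random.
        digits = str(random.randint(100, 999))
    query = "flag={}&name={}&key={}".format(flag, name, key)
    # bytes.hex() emits two lowercase hex digits per byte.
    return digits + MARKER + query.encode("ascii").hex()


if __name__ == "__main__":
    print(build_param("4028492d6e647e23016eb10507286507"))
```

Passing `digits="257"` reproduces the sample value from the article, which is a quick way to check the assumed layout before requesting any pages.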