src/eric7/WebBrowser/SafeBrowsing/SafeBrowsingUrl.py

Tue, 23 Apr 2024 11:26:04 +0200

author
Detlev Offenbach <detlev@die-offenbachs.de>
date
Tue, 23 Apr 2024 11:26:04 +0200
branch
eric7
changeset 10692
9becf9ca115c
parent 10503
6a37b6ac3928
child 11090
f5f5f5803935
permissions
-rw-r--r--

Changed the source code and the source code documentation to improve the indication of unused method/function arguments.

5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
1 # -*- coding: utf-8 -*-
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
2
10439
21c28b0f9e41 Updated copyright for 2024.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9653
diff changeset
3 # Copyright (c) 2017 - 2024 Detlev Offenbach <detlev@die-offenbachs.de>
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
4 #
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
5
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
6 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
7 Module implementing an URL representation suitable for Google Safe Browsing.
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
8 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
9
9473
3f23dbf37dbe Resorted the import statements using isort.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9462
diff changeset
10 import contextlib
3f23dbf37dbe Resorted the import statements using isort.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9462
diff changeset
11 import hashlib
3f23dbf37dbe Resorted the import statements using isort.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9462
diff changeset
12 import posixpath
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
13 import re
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
14 import socket
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
15 import struct
7192
a22eee00b052 Started removing runtime support for Python2 and PyQt4.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 6942
diff changeset
16 import urllib.parse
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
17
9413
80c06d472826 Changed the eric7 import statements to include the package name (i.e. eric7) in order to not fiddle with sys.path.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9221
diff changeset
18 from eric7 import Preferences
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
19
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
20
8207
d359172d11be Applied some more code simplifications suggested by the new Simplify checker.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7988
diff changeset
21 class SafeBrowsingUrl:
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
22 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
23 Class implementing an URL representation suitable for Google Safe Browsing.
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
24 """
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
25
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
26 #
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
27 # Modeled after the URL class of the gglsbl package.
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
28 # https://github.com/afilipovich/gglsbl
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
29 #
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
30 def __init__(self, url):
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
31 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
32 Constructor
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
33
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
34 @param url URL to be embedded
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
35 @type str
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
36 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
37 self.__url = url
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
38
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
39 def hashes(self):
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
40 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
41 Public method to get the hashes of all possible permutations of the URL
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
42 in canonical form.
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
43
7988
c4c17121eff8 Updated source code documentation with the new tags.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7923
diff changeset
44 @yield URL hashes
c4c17121eff8 Updated source code documentation with the new tags.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7923
diff changeset
45 @ytype bytes
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
46 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
47 for variant in self.permutations(self.canonical()):
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
48 urlHash = self.digest(variant)
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
49 yield urlHash
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
50
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
51 def canonical(self):
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
52 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
53 Public method to convert the URL to the canonical form.
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
54
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
55 @return canonical form of the URL
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
56 @rtype str
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
57 """
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
58
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
59 def fullUnescape(u):
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
60 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
61 Method to recursively unescape an URL.
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
62
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
63 @param u URL string to unescape
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
64 @type str
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
65 @return unescaped URL string
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
66 @rtype str
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
67 """
7192
a22eee00b052 Started removing runtime support for Python2 and PyQt4.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 6942
diff changeset
68 uu = urllib.parse.unquote(u)
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
69 if uu == u:
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
70 return uu
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
71 else:
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
72 return fullUnescape(uu)
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
73
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
74 def quote(s):
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
75 """
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
76 Method to quote a string.
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
77
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
78 @param string to be quoted
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
79 @type str
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
80 @return quoted string
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
81 @rtype str
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
82 """
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
83 safeChars = "!\"$&'()*+,-./:;<=>?@[\\]^_`{|}~"
7192
a22eee00b052 Started removing runtime support for Python2 and PyQt4.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 6942
diff changeset
84 return urllib.parse.quote(s, safe=safeChars)
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
85
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
86 url = self.__url.strip()
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
87 url = url.replace("\n", "").replace("\r", "").replace("\t", "")
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
88 url = url.split("#", 1)[0]
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
89 if url.startswith("//"):
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
90 url = Preferences.getWebBrowser("DefaultScheme")[:-3] + url
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
91 if len(url.split("://")) <= 1:
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
92 url = Preferences.getWebBrowser("DefaultScheme") + url
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
93 url = quote(fullUnescape(url))
8561
641304b46f08 Next batch of changes for QtWebEngine as of Qt 6.2.0.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 8312
diff changeset
94 urlParts = urllib.parse.urlsplit(url)
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
95 if not urlParts[0]:
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
96 url = Preferences.getWebBrowser("DefaultScheme") + url
8561
641304b46f08 Next batch of changes for QtWebEngine as of Qt 6.2.0.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 8312
diff changeset
97 urlParts = urllib.parse.urlsplit(url)
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
98 protocol = urlParts.scheme
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
99 host = fullUnescape(urlParts.hostname)
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
100 path = fullUnescape(urlParts.path)
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
101 query = urlParts.query
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
102 if not query and "?" not in url:
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
103 query = None
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
104 if not path:
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
105 path = "/"
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
106 path = posixpath.normpath(path).replace("//", "/")
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
107 if path[-1] != "/":
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
108 path += "/"
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
109 port = urlParts.port
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
110 host = host.strip(".")
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
111 host = re.sub(r"\.+", ".", host).lower()
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
112 if host.isdigit():
9462
e65379fdbd97 Changed code to resolve or acknowledge some potential security issues.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9413
diff changeset
113 with contextlib.suppress(OSError):
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
114 host = socket.inet_ntoa(struct.pack("!I", int(host)))
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
115 if host.startswith("0x") and "." not in host:
9462
e65379fdbd97 Changed code to resolve or acknowledge some potential security issues.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9413
diff changeset
116 with contextlib.suppress(OSError):
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
117 host = socket.inet_ntoa(struct.pack("!I", int(host, 16)))
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
118 quotedPath = quote(path)
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
119 quotedHost = quote(host)
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
120 if port is not None:
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
121 quotedHost = "{0}:{1}".format(quotedHost, port)
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
122 canonicalUrl = "{0}://{1}{2}".format(protocol, quotedHost, quotedPath)
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
123 if query is not None:
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
124 canonicalUrl = "{0}?{1}".format(canonicalUrl, query)
5808
7bf90dcae4e1 Started implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
125 return canonicalUrl
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
126
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
127 @staticmethod
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
128 def permutations(url):
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
129 """
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
130 Static method to determine all permutations of host name and path
10503
6a37b6ac3928 Renamed some modules/variables/settings to get rid (mostly) of inappropriate words.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 10439
diff changeset
131 which can be applied to blocked URLs.
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
132
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
133 @param url URL string to be permuted
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
134 @type str
7988
c4c17121eff8 Updated source code documentation with the new tags.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7923
diff changeset
135 @yield permutated URL strings
c4c17121eff8 Updated source code documentation with the new tags.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7923
diff changeset
136 @ytype str
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
137 """
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
138
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
139 def hostPermutations(host):
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
140 """
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
141 Method to generate the permutations of the host name.
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
142
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
143 @param host host name
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
144 @type str
7988
c4c17121eff8 Updated source code documentation with the new tags.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7923
diff changeset
145 @yield permutated host names
c4c17121eff8 Updated source code documentation with the new tags.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7923
diff changeset
146 @ytype str
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
147 """
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
148 if re.match(r"\d+\.\d+\.\d+\.\d+", host):
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
149 yield host
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
150 return
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
151 parts = host.split(".")
5811
5358a3c7995f Done implementing the SafeBrowsingAPIClient class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5809
diff changeset
152 partsLen = min(len(parts), 5)
5358a3c7995f Done implementing the SafeBrowsingAPIClient class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5809
diff changeset
153 if partsLen > 4:
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
154 yield host
5811
5358a3c7995f Done implementing the SafeBrowsingAPIClient class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5809
diff changeset
155 for i in range(partsLen - 1):
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
156 yield ".".join(parts[i - partsLen :])
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
157
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
158 def pathPermutations(path):
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
159 """
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
160 Method to generate the permutations of the path.
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
161
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
162 @param path path to be processed
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
163 @type str
7988
c4c17121eff8 Updated source code documentation with the new tags.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7923
diff changeset
164 @yield permutated paths
c4c17121eff8 Updated source code documentation with the new tags.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 7923
diff changeset
165 @ytype str
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
166 """
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
167 yield path
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
168 query = None
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
169 if "?" in path:
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
170 path, query = path.split("?", 1)
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
171 if query is not None:
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
172 yield path
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
173 pathParts = path.split("/")[0:-1]
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
174 curPath = ""
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
175 for i in range(min(4, len(pathParts))):
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
176 curPath = curPath + pathParts[i] + "/"
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
177 yield curPath
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
178
7192
a22eee00b052 Started removing runtime support for Python2 and PyQt4.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 6942
diff changeset
179 protocol, addressStr = urllib.parse.splittype(url)
a22eee00b052 Started removing runtime support for Python2 and PyQt4.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 6942
diff changeset
180 host, path = urllib.parse.splithost(addressStr)
10692
9becf9ca115c Changed the source code and the source code documentation to improve the indication of unused method/function arguments.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 10503
diff changeset
181 _user, host = urllib.parse.splituser(host)
7192
a22eee00b052 Started removing runtime support for Python2 and PyQt4.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 6942
diff changeset
182 host, port = urllib.parse.splitport(host)
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
183 host = host.strip("/")
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
184 seenPermutations = set()
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
185 for h in hostPermutations(host):
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
186 for p in pathPermutations(path):
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
187 u = "{0}{1}".format(h, p)
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
188 if u not in seenPermutations:
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
189 yield u
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
190 seenPermutations.add(u)
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
191
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
192 @staticmethod
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
193 def digest(url):
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
194 """
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
195 Static method to calculate the SHA256 digest of an URL string.
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
196
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
197 @param url URL string
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
198 @type str
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
199 @return SHA256 digest of the URL string
5817
a5f6c9128500 Started implementing the SafeBrowsingCache class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5811
diff changeset
200 @rtype bytes
5809
5b53c17b7d93 Done implementing the SafeBrowsingUrl class.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 5808
diff changeset
201 """
9221
bf71ee032bb4 Reformatted the source code using the 'Black' utility.
Detlev Offenbach <detlev@die-offenbachs.de>
parents: 9209
diff changeset
202 return hashlib.sha256(url.encode("utf-8")).digest()

eric ide

mercurial