Tue, 23 Apr 2024 11:26:04 +0200
Changed the source code and the source code documentation to improve the indication of unused method/function arguments.
# -*- coding: utf-8 -*-

# Copyright (c) 2017 - 2024 Detlev Offenbach <detlev@die-offenbachs.de>
#

"""
Module implementing a URL representation suitable for Google Safe Browsing.
"""

import contextlib
import hashlib
import posixpath
import re
import socket
import struct
import urllib.parse

from eric7 import Preferences


class SafeBrowsingUrl:
    """
    Class implementing a URL representation suitable for Google Safe Browsing.
    """

    #
    # Modeled after the URL class of the gglsbl package.
    # https://github.com/afilipovich/gglsbl
    #
    def __init__(self, url):
        """
        Constructor

        @param url URL to be embedded
        @type str
        """
        self.__url = url

    def hashes(self):
        """
        Public method to get the hashes of all possible permutations of the URL
        in canonical form.

        @yield URL hashes
        @ytype bytes
        """
        for variant in self.permutations(self.canonical()):
            urlHash = self.digest(variant)
            yield urlHash

    def canonical(self):
        """
        Public method to convert the URL to the canonical form.

        @return canonical form of the URL
        @rtype str
        """

        def fullUnescape(u):
            """
            Method to recursively unescape a URL.

            @param u URL string to unescape
            @type str
            @return unescaped URL string
            @rtype str
            """
            uu = urllib.parse.unquote(u)
            if uu == u:
                return uu
            else:
                return fullUnescape(uu)

        def quote(s):
            """
            Method to quote a string.

            @param s string to be quoted
            @type str
            @return quoted string
            @rtype str
            """
            safeChars = "!\"$&'()*+,-./:;<=>?@[\\]^_`{|}~"
            return urllib.parse.quote(s, safe=safeChars)

        # Remove surrounding whitespace, embedded line breaks and tabs, and
        # the fragment.
        url = self.__url.strip()
        url = url.replace("\n", "").replace("\r", "").replace("\t", "")
        url = url.split("#", 1)[0]
        if url.startswith("//"):
            # Scheme-relative URL: prepend the default scheme without its
            # trailing "//" (the setting is assumed to end with "://", as the
            # concatenation a few lines below implies).
            url = Preferences.getWebBrowser("DefaultScheme")[:-2] + url
        if len(url.split("://")) <= 1:
            url = Preferences.getWebBrowser("DefaultScheme") + url
        url = quote(fullUnescape(url))
        urlParts = urllib.parse.urlsplit(url)
        if not urlParts[0]:
            url = Preferences.getWebBrowser("DefaultScheme") + url
            urlParts = urllib.parse.urlsplit(url)
        protocol = urlParts.scheme
        host = fullUnescape(urlParts.hostname)
        path = fullUnescape(urlParts.path)
        query = urlParts.query
        if not query and "?" not in url:
            query = None
        if not path:
            path = "/"
        # Normalize the path and make sure it ends with a slash.
        path = posixpath.normpath(path).replace("//", "/")
        if path[-1] != "/":
            path += "/"
        port = urlParts.port
        # Normalize the host: strip leading and trailing dots, collapse
        # consecutive dots and convert to lower case.
        host = host.strip(".")
        host = re.sub(r"\.+", ".", host).lower()
        # Convert a host given as a decimal or hexadecimal integer to
        # dotted-quad notation.
        if host.isdigit():
            with contextlib.suppress(OSError):
                host = socket.inet_ntoa(struct.pack("!I", int(host)))
        if host.startswith("0x") and "." not in host:
            with contextlib.suppress(OSError):
                host = socket.inet_ntoa(struct.pack("!I", int(host, 16)))
        quotedPath = quote(path)
        quotedHost = quote(host)
        if port is not None:
            quotedHost = "{0}:{1}".format(quotedHost, port)
        canonicalUrl = "{0}://{1}{2}".format(protocol, quotedHost, quotedPath)
        if query is not None:
            canonicalUrl = "{0}?{1}".format(canonicalUrl, query)
        return canonicalUrl

    @staticmethod
    def permutations(url):
        """
        Static method to determine all permutations of host name and path
        which can be applied to blocked URLs.

        @param url URL string to be permuted
        @type str
        @yield permuted URL strings
        @ytype str
        """

        def hostPermutations(host):
            """
            Method to generate the permutations of the host name.

            @param host host name
            @type str
            @yield permuted host names
            @ytype str
            """
            # A dotted-quad IP address is yielded unchanged.
            if re.match(r"\d+\.\d+\.\d+\.\d+", host):
                yield host
                return
            # Consider at most the last five host name components.
            parts = host.split(".")
            partsLen = min(len(parts), 5)
            if partsLen > 4:
                yield host
            for i in range(partsLen - 1):
                yield ".".join(parts[i - partsLen :])

        def pathPermutations(path):
            """
            Method to generate the permutations of the path.

            @param path path to be processed
            @type str
            @yield permuted paths
            @ytype str
            """
            yield path
            query = None
            if "?" in path:
                path, query = path.split("?", 1)
            if query is not None:
                yield path
            # Yield up to four leading directory prefixes, starting at "/".
            pathParts = path.split("/")[0:-1]
            curPath = ""
            for i in range(min(4, len(pathParts))):
                curPath = curPath + pathParts[i] + "/"
                yield curPath

        # NOTE: The urllib.parse.split*() helpers used below have been
        # deprecated since Python 3.8.
        protocol, addressStr = urllib.parse.splittype(url)
        host, path = urllib.parse.splithost(addressStr)
        _user, host = urllib.parse.splituser(host)
        host, port = urllib.parse.splitport(host)
        host = host.strip("/")
        seenPermutations = set()
        for h in hostPermutations(host):
            for p in pathPermutations(path):
                u = "{0}{1}".format(h, p)
                if u not in seenPermutations:
                    yield u
                    seenPermutations.add(u)

    @staticmethod
    def digest(url):
        """
        Static method to calculate the SHA256 digest of a URL string.

        @param url URL string
        @type str
        @return SHA256 digest of the URL string
        @rtype bytes
        """
        return hashlib.sha256(url.encode("utf-8")).digest()
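
The host and path permutation logic of `permutations()` and the hashing of `digest()` can be tried without eric7. The following is a minimal standalone sketch, not part of the module: the function names `host_suffixes`, `path_prefixes`, `expressions` and `sha256_digest` are hypothetical, and the host and path are assumed to be already canonicalized and passed in separately, so no URL splitting is shown.

```python
import hashlib
import re


def host_suffixes(host):
    """Yield the host name variants (hypothetical helper)."""
    # A dotted-quad IP address is yielded unchanged.
    if re.match(r"\d+\.\d+\.\d+\.\d+", host):
        yield host
        return
    parts = host.split(".")
    count = min(len(parts), 5)
    # With five or more components the exact host is yielded first.
    if count > 4:
        yield host
    # Then the suffixes built from the last `count` components, always
    # keeping at least two components.
    for i in range(count - 1):
        yield ".".join(parts[i - count:])


def path_prefixes(path):
    """Yield the path variants (hypothetical helper)."""
    yield path
    bare, sep, _query = path.partition("?")
    if sep:
        # The path without its query string.
        yield bare
    # Up to four leading directory prefixes, starting at "/".
    prefix = ""
    for part in bare.split("/")[:-1][:4]:
        prefix += part + "/"
        yield prefix


def expressions(host, path):
    """Combine host and path variants, skipping duplicates."""
    seen = set()
    for h in host_suffixes(host):
        for p in path_prefixes(path):
            expr = h + p
            if expr not in seen:
                seen.add(expr)
                yield expr


def sha256_digest(expr):
    """Return the raw 32-byte SHA-256 digest of an expression."""
    return hashlib.sha256(expr.encode("utf-8")).digest()


if __name__ == "__main__":
    for expr in expressions("a.b.c", "/1/2.html?param=1"):
        print(expr, sha256_digest(expr).hex())
```

For the host `a.b.c` and the path `/1/2.html?param=1` this sketch produces eight expressions: four path variants for each of the two host variants `a.b.c` and `b.c`.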