ProjectWeb/Html5Prettifier.py

author: Detlev Offenbach <detlev@die-offenbachs.de>
date: Wed, 30 Dec 2020 11:02:10 +0100
changeset: 34:a6d8718f37b5
parent: 30:38092622e612
child: 35:a3f1dcf94fe4
permissions: -rw-r--r--

Updated copyright for 2021.

annotated changesets (author of all: Detlev Offenbach <detlev@die-offenbachs.de>):
5:31bc1ef6f624 Added the HTML prettifier. (all lines not listed below)
30:38092622e612 Removed support for Python2. (parent 29; the return statements in tagPrettify and commentPrettify)
34:a6d8718f37b5 Updated copyright for 2021. (parent 30; the copyright line)

# -*- coding: utf-8 -*-

# Copyright (c) 2015 - 2021 Detlev Offenbach <detlev@die-offenbachs.de>
#

"""
Module implementing a class to prettify HTML code.
"""

import re

from PyQt5.QtCore import QObject

import Preferences


class Html5Prettifier(QObject):
    """
    Class implementing the HTML5 prettifier.
    """
    def __init__(self, html, parent=None):
        """
        Constructor

        @param html HTML text to be prettified (string)
        @param parent reference to the parent object (QObject)
        """
        super(Html5Prettifier, self).__init__(parent)

        self.__html = html
        self.__indentWidth = Preferences.getEditor("IndentWidth")

    def getPrettifiedHtml(self):
        """
        Public method to prettify the HTML code.

        @return prettified HTML code (string)
        """
        from bs4 import BeautifulSoup
        soup = BeautifulSoup(self.__html, "html.parser")
        prettyHtml = soup.prettify(formatter=self.tagPrettify)
        # prettify comments
        prettyHtml = re.sub("<!--(.*?)-->", self.commentPrettify, prettyHtml,
                            flags=re.DOTALL)
        # indent all HTML
        prettyHtml = re.sub("^( *)(.*)$",
                            r"{0}\2".format(r"\1" * self.__indentWidth),
                            prettyHtml, flags=re.MULTILINE)

        return prettyHtml

    def tagPrettify(self, tag):
        """
        Public method to prettify HTML tags.

        @param tag tag to be prettified (string)
        @return prettified tag (string)
        """
        return re.sub(" {{1,{0}}}".format(self.__indentWidth), " ", tag,
                      flags=re.MULTILINE)

    def commentPrettify(self, matchobj):
        """
        Public method to prettify HTML comments.

        @param matchobj reference to the match object (re.MatchObject)
        @return prettified comment (string)
        """
        if re.search("\n", matchobj.group()):
            return self.tagPrettify(matchobj.group())
        else:
            return matchobj.group()
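
The two regular-expression steps in getPrettifiedHtml can be tried in isolation. The sketch below is a minimal standalone illustration, not the plugin's code: the function names widen_indent and prettify_comments and the sample strings are made up for this example, and it needs neither PyQt5 nor bs4. It assumes input that is already indented by one space per nesting level, which is what BeautifulSoup's prettify() produces by default. It shows how repeating the \1 backreference multiplies each line's leading spaces, and how a callable passed to re.sub can limit space collapsing to comments that span several lines.

```python
import re


def widen_indent(text, indent_width):
    """Multiply each line's run of leading spaces by indent_width."""
    # Group 1 captures the leading spaces, group 2 the rest of the line.
    # Writing the backreference \1 indent_width times repeats the captured
    # indentation, so one space per level becomes indent_width spaces.
    replacement = r"\1" * indent_width + r"\2"
    return re.sub(r"^( *)(.*)$", replacement, text, flags=re.MULTILINE)


def prettify_comments(html, indent_width):
    """Collapse space runs, but only in comments that span several lines."""
    def collapse(match):
        comment = match.group()
        if "\n" not in comment:
            # Single-line comments are returned unchanged.
            return comment
        # Each run of 1..indent_width spaces becomes a single space.
        return re.sub(" {{1,{0}}}".format(indent_width), " ", comment)

    # re.DOTALL lets "." match newlines, so multi-line comments are found.
    return re.sub(r"<!--(.*?)-->", collapse, html, flags=re.DOTALL)


sample = "<ul>\n <li>\n  item\n </li>\n</ul>"
print(widen_indent(sample, 4))

comments = "<!-- a    b -->\n<!--\n    c\n-->"
print(prettify_comments(comments, 4))
```

With an indent width of 4, the nested "item" line in the first sample moves from two leading spaces to eight, and only the second, multi-line comment in the second sample has its spaces collapsed. Building the replacement by string repetition avoids a Python-level callback for every line, at the cost of a replacement template that grows with the indent width.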
