ProjectWeb/Html5Prettifier.py

author: Detlev Offenbach <detlev@die-offenbachs.de>
date: Wed, 01 Jan 2020 11:58:59 +0100
changeset 29: 38577502d613
parent 28: ed9d3d3857af
child 30: 38092622e612
permissions: -rw-r--r--

Updated copyright for 2020.

# -*- coding: utf-8 -*-

# Copyright (c) 2015 - 2020 Detlev Offenbach <detlev@die-offenbachs.de>
#

"""
Module implementing a class to prettify HTML code.
"""

from __future__ import unicode_literals

import re


from PyQt5.QtCore import QObject

import Preferences


class Html5Prettifier(QObject):
    """
    Class implementing the HTML5 prettifier.
    """
    def __init__(self, html, parent=None):
        """
        Constructor

        @param html HTML text to be prettified (string)
        @param parent reference to the parent object (QObject)
        """
        super(Html5Prettifier, self).__init__(parent)

        self.__html = html
        self.__indentWidth = Preferences.getEditor("IndentWidth")

    def getPrettifiedHtml(self):
        """
        Public method to prettify the HTML code.

        @return prettified HTML code (string)
        """
        from bs4 import BeautifulSoup
        soup = BeautifulSoup(self.__html, "html.parser")
        prettyHtml = soup.prettify(formatter=self.tagPrettify)
        # prettify comments
        prettyHtml = re.sub("<!--(.*?)-->", self.commentPrettify, prettyHtml,
                            flags=re.DOTALL)
        # indent all HTML
        prettyHtml = re.sub("^( *)(.*)$",
                            r"{0}\2".format(r"\1" * self.__indentWidth),
                            prettyHtml, flags=re.MULTILINE)

        return prettyHtml

    def tagPrettify(self, tag):
        """
        Public method to prettify HTML tags.

        @param tag tag to be prettified (string)
        @return prettified tag (string)
        """
        result = re.sub(" {{1,{0}}}".format(self.__indentWidth), " ", tag,
                        flags=re.MULTILINE)
        return result

    def commentPrettify(self, matchobj):
        """
        Public method to prettify HTML comments.

        @param matchobj reference to the match object (re.MatchObject)
        @return prettified comment (string)
        """
        if re.search("\n", matchobj.group()):
            result = self.tagPrettify(matchobj.group())
        else:
            result = matchobj.group()
        return result
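
The "indent all HTML" step in getPrettifiedHtml can be tried on its own, without PyQt5 or BeautifulSoup. The sketch below is not part of the module. It assumes input indented by one space per nesting level, which is what BeautifulSoup's prettify() emits by default, and it uses a hypothetical helper name, scale_indent, that takes the indent width as a plain argument instead of reading it from Preferences.

```python
import re


def scale_indent(text, indent_width):
    """Multiply the leading spaces of every line by indent_width.

    Hypothetical helper for illustration only; the module does this
    inline in getPrettifiedHtml.
    """
    # r"\1" * indent_width repeats the back-reference to the captured
    # leading spaces, so n leading spaces become n * indent_width spaces.
    replacement = r"{0}\2".format(r"\1" * indent_width)
    # re.MULTILINE makes ^ and $ match at every line boundary.
    return re.sub(r"^( *)(.*)$", replacement, text, flags=re.MULTILINE)


sample = "<html>\n <body>\n  <p>\n  </p>\n </body>\n</html>"
print(scale_indent(sample, 4))
```

With an indent width of 4, the line " <body>" comes out with four leading spaces and "  <p>" with eight, while unindented lines are left alone because the captured group is empty.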

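The comment handling combines two regular expressions: a non-greedy, DOTALL match that finds each comment, and a callback that squeezes runs of spaces only when the comment spans several lines. The following is a minimal sketch of that idea outside the class. The names squeeze_spaces, prettify_comment and prettify_comments, and the fixed INDENT_WIDTH of 4, are assumptions made for illustration; the module itself uses tagPrettify, commentPrettify and the "IndentWidth" editor preference.

```python
import re

INDENT_WIDTH = 4  # assumed value; the module reads it from Preferences


def squeeze_spaces(text):
    """Replace each run of 1..INDENT_WIDTH spaces with a single space."""
    return re.sub(" {{1,{0}}}".format(INDENT_WIDTH), " ", text)


def prettify_comment(match):
    """Callback for re.sub: rewrite only comments that span several lines."""
    comment = match.group()
    return squeeze_spaces(comment) if "\n" in comment else comment


def prettify_comments(html):
    """Apply prettify_comment to every HTML comment in the text."""
    # The non-greedy (.*?) stops at the first "-->"; re.DOTALL lets "."
    # match newlines so multi-line comments are found as one match.
    return re.sub("<!--(.*?)-->", prettify_comment, html, flags=re.DOTALL)


html = "<!-- a    b -->\n<!--\n    x\n-->"
print(prettify_comments(html))
```

Note that a run longer than INDENT_WIDTH is not collapsed to one space: eight spaces with a width of 4 are matched as two runs of four and become two spaces.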