eric6/ThirdParty/CharDet/chardet/enums.py

Wed, 13 Jul 2022 15:34:50 +0200

author
Detlev Offenbach <detlev@die-offenbachs.de>
date
Wed, 13 Jul 2022 15:34:50 +0200
branch
with_python2
changeset 9225
bf799f79455c
parent 6942
2602857055c5
permissions
-rw-r--r--

Revisions <no_multi_processing, Variables Viewer, with_python2> closed.

5714
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
1 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
2 All of the Enums that are used throughout the chardet package.
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
3
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
4 :author: Dan Blanchard (dan.blanchard@gmail.com)
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
5 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
6
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
7
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
8 class InputState(object):
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
9 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
10 This enum represents the different states a universal detector can be in.
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
11 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
12 PURE_ASCII = 0
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
13 ESC_ASCII = 1
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
14 HIGH_BYTE = 2
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
15
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
16
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
17 class LanguageFilter(object):
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
18 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
19 This enum represents the different language filters we can apply to a
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
20 ``UniversalDetector``.
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
21 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
22 CHINESE_SIMPLIFIED = 0x01
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
23 CHINESE_TRADITIONAL = 0x02
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
24 JAPANESE = 0x04
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
25 KOREAN = 0x08
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
26 NON_CJK = 0x10
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
27 ALL = 0x1F
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
28 CHINESE = CHINESE_SIMPLIFIED | CHINESE_TRADITIONAL
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
29 CJK = CHINESE | JAPANESE | KOREAN
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
30
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
31
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
32 class ProbingState(object):
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
33 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
34 This enum represents the different states a prober can be in.
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
35 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
36 DETECTING = 0
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
37 FOUND_IT = 1
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
38 NOT_ME = 2
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
39
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
40
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
41 class MachineState(object):
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
42 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
43 This enum represents the different states a state machine can be in.
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
44 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
45 START = 0
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
46 ERROR = 1
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
47 ITS_ME = 2
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
48
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
49
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
50 class SequenceLikelihood(object):
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
51 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
52 This enum represents the likelihood of a character following the previous one.
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
53 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
54 NEGATIVE = 0
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
55 UNLIKELY = 1
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
56 LIKELY = 2
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
57 POSITIVE = 3
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
58
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
59 @classmethod
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
60 def get_num_categories(cls):
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
61 """:returns: The number of likelihood categories in the enum."""
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
62 return 4
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
63
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
64
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
65 class CharacterCategory(object):
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
66 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
67 This enum represents the different categories language models for
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
68 ``SingleByteCharsetProber`` put characters into.
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
69
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
70 Anything less than CONTROL is considered a letter.
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
71 """
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
72 UNDEFINED = 255
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
73 LINE_BREAK = 254
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
74 SYMBOL = 253
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
75 DIGIT = 252
90c57b50600f Updated chardet to 3.0.2.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
76 CONTROL = 251

eric ide

mercurial