Wed, 13 Jul 2022 15:34:50 +0200
Revisions <no_multi_processing, Variables Viewer, with_python2> closed.
5310
f2b774d78b4a
Updated chardet to version 2.3.0.
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
Chardet: The Universal Character Encoding Detector
--------------------------------------------------

.. image:: https://img.shields.io/travis/chardet/chardet/stable.svg
   :alt: Build status
   :target: https://travis-ci.org/chardet/chardet

.. image:: https://img.shields.io/coveralls/chardet/chardet/stable.svg
   :target: https://coveralls.io/r/chardet/chardet

.. image:: https://img.shields.io/pypi/v/chardet.svg
   :target: https://warehouse.python.org/project/chardet/
   :alt: Latest version on PyPI

.. image:: https://img.shields.io/pypi/l/chardet.svg
   :alt: License

Detects
 - ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
 - Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
 - EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)
 - EUC-KR, ISO-2022-KR (Korean)
 - KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
 - ISO-8859-5, windows-1251 (Bulgarian)
 - ISO-8859-1, windows-1252 (Western European languages)
 - ISO-8859-7, windows-1253 (Greek)
 - ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
 - TIS-620 (Thai)

.. note::
   Our ISO-8859-2 and windows-1250 (Hungarian) probers have been temporarily
   disabled until we can retrain the models.

Requires Python 2.6, 2.7, or 3.3+.

Installation
------------

Install from `PyPI <https://pypi.python.org/pypi/chardet>`_::

    pip install chardet

Documentation
-------------

For users, docs are now available at https://chardet.readthedocs.io/.

Command-line Tool
-----------------

chardet comes with a command-line script which reports on the encodings of one
or more files::

    % chardetect somefile someotherfile
    somefile: windows-1252 with confidence 0.5
    someotherfile: ascii with confidence 1.0

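A similar report can be produced from Python. The following is a minimal
sketch, not part of chardet itself: ``format_report`` and ``report_file`` are
hypothetical helper names, and the sketch assumes only that ``chardet.detect``
accepts bytes and returns a dictionary with ``encoding`` and ``confidence``
keys.

```python
def format_report(name, result):
    """Format a detection result in the shape of the chardetect output.

    `result` is assumed to be a dict with "encoding" and "confidence" keys,
    as returned by chardet.detect().
    """
    return "{}: {} with confidence {}".format(
        name, result["encoding"], result["confidence"]
    )


def report_file(path):
    """Read a file as raw bytes, detect its encoding, and format the result."""
    # Imported lazily so that format_report() can be used without chardet.
    import chardet

    # Detection works on bytes, so the file is opened in binary mode.
    with open(path, "rb") as handle:
        return format_report(path, chardet.detect(handle.read()))
```

Calling ``report_file("somefile")`` would return one line in the same shape as
the ``chardetect`` output; the encoding and confidence depend on the file's
contents.
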
About
-----

This is a continuation of Mark Pilgrim's excellent chardet. Previously, two
versions needed to be maintained: one that supported Python 2.x and one that
supported Python 3.x. We've recently merged with `Ian Cordasco <https://github.com/sigmavirus24>`_'s
`charade <https://github.com/sigmavirus24/charade>`_ fork, so now we have one
coherent version that works for Python 2.6+.

:maintainer: Dan Blanchard