Mon, 28 Dec 2009 16:03:33 +0000
Started porting eric4 to Python3
0
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
2 | <html lang="en"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
3 | <head> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
4 | <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
5 | <title>Supported encodings [Universal Encoding Detector]</title> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
6 | <link rel="stylesheet" href="css/chardet.css" type="text/css"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
7 | <link rev="made" href="mailto:mark@diveintomark.org"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
8 | <meta name="generator" content="DocBook XSL Stylesheets V1.65.1"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
9 | <meta name="keywords" content="character, set, encoding, detection, Python, XML, feed"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
10 | <link rel="start" href="index.html" title="Documentation"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
11 | <link rel="up" href="index.html" title="Documentation"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
12 | <link rel="prev" href="faq.html" title="Frequently asked questions"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
13 | <link rel="next" href="usage.html" title="Usage"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
14 | </head> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
15 | <body id="chardet-feedparser-org" class="docs"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
16 | <div class="z" id="intro"><div class="sectionInner"><div class="sectionInner2"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
17 | <div class="s" id="pageHeader"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
18 | <h1><a href="/">Universal Encoding Detector</a></h1> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
19 | <p>Character encoding auto-detection in Python. As smart as your browser. Open source.</p> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
20 | </div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
21 | <div class="s" id="quickSummary"><ul> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
22 | <li class="li1"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
23 | <a href="http://chardet.feedparser.org/download/">Download</a> ·</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
24 | <li class="li2"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
25 | <a href="index.html">Documentation</a> ·</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
26 | <li class="li3"><a href="faq.html" title="Frequently Asked Questions">FAQ</a></li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
27 | </ul></div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
28 | </div></div></div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
29 | <div id="main"><div id="mainInner"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
30 | <p id="breadcrumb">You are here: <a href="index.html">Documentation</a> → <span class="thispage">Supported encodings</span></p> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
31 | <div class="section" lang="en"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
32 | <div class="titlepage"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
33 | <div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
34 | <div><h2 class="title"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
35 | <a name="encodings" class="skip" href="#encodings" title="link to this section"><img src="images/permalink.gif" alt="[link]" title="link to this section" width="8" height="9"></a> Supported encodings</h2></div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
36 | <div><div class="abstract"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
37 | <h3 class="title"></h3> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
38 | <p><span class="application">Universal Encoding Detector</span> currently supports over two dozen character encodings.</p> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
39 | </div></div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
40 | </div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
41 | <div></div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
42 | </div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
43 | <div class="itemizedlist"><ul> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
44 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
45 | <tt class="literal">Big5</tt>, <tt class="literal">GB2312</tt>/<tt class="literal">GB18030</tt>, <tt class="literal">EUC-TW</tt>, <tt class="literal">HZ-GB-2312</tt>, and <tt class="literal">ISO-2022-CN</tt> (Traditional and Simplified Chinese)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
46 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
47 | <tt class="literal">EUC-JP</tt>, <tt class="literal">SHIFT_JIS</tt>, and <tt class="literal">ISO-2022-JP</tt> (Japanese)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
48 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
49 | <tt class="literal">EUC-KR</tt> and <tt class="literal">ISO-2022-KR</tt> (Korean)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
50 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
51 | <tt class="literal">KOI8-R</tt>, <tt class="literal">MacCyrillic</tt>, <tt class="literal">IBM855</tt>, <tt class="literal">IBM866</tt>, <tt class="literal">ISO-8859-5</tt>, and <tt class="literal">windows-1251</tt> (Russian)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
52 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
53 | <tt class="literal">ISO-8859-2</tt> and <tt class="literal">windows-1250</tt> (Hungarian)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
54 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
55 | <tt class="literal">ISO-8859-5</tt> and <tt class="literal">windows-1251</tt> (Bulgarian)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
56 | <li><tt class="literal">windows-1252</tt></li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
57 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
58 | <tt class="literal">ISO-8859-7</tt> and <tt class="literal">windows-1253</tt> (Greek)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
59 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
60 | <tt class="literal">ISO-8859-8</tt> and <tt class="literal">windows-1255</tt> (Visual and Logical Hebrew)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
61 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
62 | <tt class="literal">TIS-620</tt> (Thai)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
63 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
64 | <tt class="literal">UTF-32</tt> <acronym title="Big Endian">BE</acronym>, <acronym title="Little Endian">LE</acronym>, 3412-ordered, or 2143-ordered (with a <acronym title="Byte Order Mark">BOM</acronym>)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
65 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
66 | <tt class="literal">UTF-16</tt> <acronym title="Big Endian">BE</acronym> or <acronym title="Little Endian">LE</acronym> (with a <acronym title="Byte Order Mark">BOM</acronym>)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
67 | <li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
68 | <tt class="literal">UTF-8</tt> (with or without a <acronym title="Byte Order Mark">BOM</acronym>)</li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
69 | <li><acronym>ASCII</acronym></li> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
70 | </ul></div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
71 | <a name="id667094"></a><table class="caution" border="0" summary=""> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
72 | <tr><td rowspan="2" align="center" valign="top" width="1%"><img src="images/caution.png" alt="Caution" title="" width="24" height="24"></td></tr> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
73 | <tr><td colspan="2" align="left" valign="top" width="99%">Due to inherent similarities between certain encodings, some encodings may be detected incorrectly. In my tests, the most problematic case was Hungarian text encoded as <tt class="literal">ISO-8859-2</tt> or <tt class="literal">windows-1250</tt> (encoded as one but reported as the other). Also, Greek text encoded as <tt class="literal">ISO-8859-7</tt> was often mis-reported as <tt class="literal">ISO-8859-2</tt>. Your mileage may vary.</td></tr> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
74 | </table> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
75 | </div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
76 | <div class="footernavigation"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
77 | <div style="float: left">← <a class="NavigationArrow" href="faq.html">Frequently asked questions</a> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
78 | </div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
79 | <div style="text-align: right"> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
80 | <a class="NavigationArrow" href="usage.html">Usage</a> →</div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
81 | </div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
82 | <hr> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
83 | <div id="footer"><p class="copyright">Copyright © 2006, 2007, 2008 Mark Pilgrim · <a href="mailto:mark@diveintomark.org">mark@diveintomark.org</a> · <a href="license.html">Terms of use</a></p></div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
84 | </div></div> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
85 | </body> |
de9c2efb9d02
Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff
changeset
|
86 | </html> |