ThirdParty/CharDet/docs/supported-encodings.html

Mon, 28 Dec 2009 16:03:33 +0000

author
Detlev Offenbach <detlev@die-offenbachs.de>
date
Mon, 28 Dec 2009 16:03:33 +0000
changeset 0
de9c2efb9d02
child 3537
7662053c3906
permissions
-rw-r--r--

Started porting eric4 to Python3

0
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
1 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
2 <html lang="en">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
3 <head>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
4 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
5 <title>Supported encodings [Universal Encoding Detector]</title>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
6 <link rel="stylesheet" href="css/chardet.css" type="text/css">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
7 <link rev="made" href="mailto:mark@diveintomark.org">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
8 <meta name="generator" content="DocBook XSL Stylesheets V1.65.1">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
9 <meta name="keywords" content="character, set, encoding, detection, Python, XML, feed">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
10 <link rel="start" href="index.html" title="Documentation">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
11 <link rel="up" href="index.html" title="Documentation">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
12 <link rel="prev" href="faq.html" title="Frequently asked questions">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
13 <link rel="next" href="usage.html" title="Usage">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
14 </head>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
15 <body id="chardet-feedparser-org" class="docs">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
16 <div class="z" id="intro"><div class="sectionInner"><div class="sectionInner2">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
17 <div class="s" id="pageHeader">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
18 <h1><a href="/">Universal Encoding Detector</a></h1>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
19 <p>Character encoding auto-detection in Python. As smart as your browser. Open source.</p>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
20 </div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
21 <div class="s" id="quickSummary"><ul>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
22 <li class="li1">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
23 <a href="http://chardet.feedparser.org/download/">Download</a> ·</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
24 <li class="li2">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
25 <a href="index.html">Documentation</a> ·</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
26 <li class="li3"><a href="faq.html" title="Frequently Asked Questions">FAQ</a></li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
27 </ul></div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
28 </div></div></div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
29 <div id="main"><div id="mainInner">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
30 <p id="breadcrumb">You are here: <a href="index.html">Documentation</a><span class="thispage">Supported encodings</span></p>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
31 <div class="section" lang="en">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
32 <div class="titlepage">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
33 <div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
34 <div><h2 class="title">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
35 <a name="encodings" class="skip" href="#encodings" title="link to this section"><img src="images/permalink.gif" alt="[link]" title="link to this section" width="8" height="9"></a> Supported encodings</h2></div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
36 <div><div class="abstract">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
37 <h3 class="title"></h3>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
38 <p><span class="application">Universal Encoding Detector</span> currently supports over two dozen character encodings.</p>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
39 </div></div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
40 </div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
41 <div></div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
42 </div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
43 <div class="itemizedlist"><ul>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
44 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
45 <tt class="literal">Big5</tt>, <tt class="literal">GB2312</tt>/<tt class="literal">GB18030</tt>, <tt class="literal">EUC-TW</tt>, <tt class="literal">HZ-GB-2312</tt>, and <tt class="literal">ISO-2022-CN</tt> (Traditional and Simplified Chinese)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
46 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
47 <tt class="literal">EUC-JP</tt>, <tt class="literal">SHIFT_JIS</tt>, and <tt class="literal">ISO-2022-JP</tt> (Japanese)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
48 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
49 <tt class="literal">EUC-KR</tt> and <tt class="literal">ISO-2022-KR</tt> (Korean)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
50 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
51 <tt class="literal">KOI8-R</tt>, <tt class="literal">MacCyrillic</tt>, <tt class="literal">IBM855</tt>, <tt class="literal">IBM866</tt>, <tt class="literal">ISO-8859-5</tt>, and <tt class="literal">windows-1251</tt> (Russian)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
52 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
53 <tt class="literal">ISO-8859-2</tt> and <tt class="literal">windows-1250</tt> (Hungarian)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
54 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
55 <tt class="literal">ISO-8859-5</tt> and <tt class="literal">windows-1251</tt> (Bulgarian)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
56 <li><tt class="literal">windows-1252</tt></li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
57 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
58 <tt class="literal">ISO-8859-7</tt> and <tt class="literal">windows-1253</tt> (Greek)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
59 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
60 <tt class="literal">ISO-8859-8</tt> and <tt class="literal">windows-1255</tt> (Visual and Logical Hebrew)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
61 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
62 <tt class="literal">TIS-620</tt> (Thai)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
63 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
64 <tt class="literal">UTF-32</tt> <acronym title="Big Endian">BE</acronym>, <acronym title="Little Endian">LE</acronym>, 3412-ordered, or 2143-ordered (with a <acronym title="Byte Order Mark">BOM</acronym>)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
65 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
66 <tt class="literal">UTF-16</tt> <acronym title="Big Endian">BE</acronym> or <acronym title="Little Endian">LE</acronym> (with a <acronym title="Byte Order Mark">BOM</acronym>)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
67 <li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
68 <tt class="literal">UTF-8</tt> (with or without a <acronym title="Byte Order Mark">BOM</acronym>)</li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
69 <li><acronym>ASCII</acronym></li>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
70 </ul></div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
71 <a name="id667094"></a><table class="caution" border="0" summary="">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
72 <tr><td rowspan="2" align="center" valign="top" width="1%"><img src="images/caution.png" alt="Caution" title="" width="24" height="24"></td></tr>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
73 <tr><td colspan="2" align="left" valign="top" width="99%">Due to inherent similarities between certain encodings, some encodings may be detected incorrectly. In my tests, the most problematic case was Hungarian text encoded as <tt class="literal">ISO-8859-2</tt> or <tt class="literal">windows-1250</tt> (encoded as one but reported as the other). Also, Greek text encoded as <tt class="literal">ISO-8859-7</tt> was often mis-reported as <tt class="literal">ISO-8859-2</tt>. Your mileage may vary.</td></tr>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
74 </table>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
75 </div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
76 <div class="footernavigation">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
77 <div style="float: left">← <a class="NavigationArrow" href="faq.html">Frequently asked questions</a>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
78 </div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
79 <div style="text-align: right">
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
80 <a class="NavigationArrow" href="usage.html">Usage</a> →</div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
81 </div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
82 <hr>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
83 <div id="footer"><p class="copyright">Copyright © 2006, 2007, 2008 Mark Pilgrim · <a href="mailto:mark@diveintomark.org">mark@diveintomark.org</a> · <a href="license.html">Terms of use</a></p></div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
84 </div></div>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
85 </body>
de9c2efb9d02 Started porting eric4 to Python3
Detlev Offenbach <detlev@die-offenbachs.de>
parents:
diff changeset
86 </html>

eric ide

mercurial