[SCM] WebKit Debian packaging branch, debian/experimental, updated. upstream/1.3.3-10851-g50815da
tkent at chromium.org
Wed Dec 22 18:13:46 UTC 2010
The following commit has been merged in the debian/experimental branch:
commit 68ad9ecb23568890090e395da15b1f75916352e6
Author: tkent at chromium.org <tkent at chromium.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Date: Thu Dec 9 00:54:23 2010 +0000
Yensign hack should work with Shift_JIS and ISO-2022-JP encodings.
https://bugs.webkit.org/show_bug.cgi?id=49714
Reviewed by Alexey Proskuryakov.
WebCore:
IE chooses a font which shows a yen sign for the 0x5c code point on a page
encoded in x-mac-japanese, ISO-2022-JP, EUC-JP, Shift_JIS, Shift_JIS_X0213-2000,
x-sjis, or Windows-31J.
We have emulated this behavior by replacing 0x5c with 0xa5 for EUC-JP and
Shift_JIS_X0213-2000. This change adds the other encodings listed above.
Also, we move the HashSet initialization for isJapanese() and
backslashAsCurrencySymbol() to TextEncodingRegistry.cpp because it is
easier to make it thread-safe there.
* platform/text/TextEncoding.cpp:
(WebCore::TextEncoding::isJapanese): Just calls isJapaneseEncoding().
(WebCore::TextEncoding::backslashAsCurrencySymbol): Uses shouldShowBackslashAsCurrencySymbolIn().
* platform/text/TextEncodingRegistry.cpp:
(WebCore::addEncodingName): Moved from TextEncoding.cpp, and stop using atomicCanonicalTextEncodingName().
(WebCore::buildQuirksSets): Added. Initializes HashSets for isJapaneseEncoding() and shouldShowBackslashAsCurrencySymbolIn().
(WebCore::isJapaneseEncoding):
(WebCore::shouldShowBackslashAsCurrencySymbolIn):
(WebCore::extendTextCodecMaps): Add a call to buildQuirksSets().
* platform/text/TextEncodingRegistry.h:
LayoutTests:
Use Shift_JIS instead of Shift_JIS_X0213-2000 because Shift_JIS_X0213-2000
encoding is available only on Mac.
Add a test for ISO-2022-JP.
* editing/selection/find-yensign-and-backslash-expected.txt:
* editing/selection/find-yensign-and-backslash.html:
* platform/chromium/test_expectations.txt:
git-svn-id: http://svn.webkit.org/repository/webkit/trunk@73566 268f45cc-cd09-0410-ab3c-d52691b4dbfc
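
For readers skimming the patch, the lookup described in the commit message can be sketched roughly as follows. This is an illustration only, not the WebKit code: yenSignEncodings(), showsBackslashAsYen() and backslashReplacement() are hypothetical names, and std::unordered_set<std::string> stands in for the WTF HashSet<const char*> of canonical name pointers that the patch uses. Unlike the real registry, this sketch compares names by value and does not resolve aliases such as x-sjis.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the set filled by buildQuirksSets() in the patch.
// The names are the ones the patch registers in nonBackslashEncodings.
static const std::unordered_set<std::string>& yenSignEncodings()
{
    static const std::unordered_set<std::string> encodings = {
        "x-mac-japanese",
        "ISO-2022-JP",
        "EUC-JP",
        "Shift_JIS",
        "Shift_JIS_X0213-2000",
    };
    return encodings;
}

// Roughly what shouldShowBackslashAsCurrencySymbolIn() does: a null-safe
// membership test on the encoding name (by value here, by pointer in WebKit).
bool showsBackslashAsYen(const char* canonicalName)
{
    return canonicalName && yenSignEncodings().count(canonicalName) > 0;
}

// Roughly what TextEncoding::backslashAsCurrencySymbol() returns:
// U+00A5 (yen sign) for the quirk encodings, U+005C (backslash) otherwise.
char16_t backslashReplacement(const char* canonicalName)
{
    return showsBackslashAsYen(canonicalName) ? char16_t(0x00A5) : u'\\';
}
```

Under these assumptions, backslashReplacement("Shift_JIS") yields 0x00A5, while "UTF-8" or a null name yields a plain backslash, which matches the expectations in the updated layout test.
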
diff --git a/LayoutTests/ChangeLog b/LayoutTests/ChangeLog
index 25cd880..f7a2f68 100644
--- a/LayoutTests/ChangeLog
+++ b/LayoutTests/ChangeLog
@@ -1,3 +1,18 @@
+2010-12-08 Kent Tamura <tkent at chromium.org>
+
+ Reviewed by Alexey Proskuryakov.
+
+ Yensign hack should work with Shift_JIS and ISO-2022-JP encodings.
+ https://bugs.webkit.org/show_bug.cgi?id=49714
+
+ Use Shift_JIS instead of Shift_JIS_X0213-2000 because Shift_JIS_X0213-2000
+ encoding is available only on Mac.
+ Add a test for ISO-2022-JP.
+
+ * editing/selection/find-yensign-and-backslash-expected.txt:
+ * editing/selection/find-yensign-and-backslash.html:
+ * platform/chromium/test_expectations.txt:
+
2010-12-08 Andy Estes <aestes at apple.com>
Reviewed by Darin Adler.
diff --git a/LayoutTests/editing/selection/find-yensign-and-backslash-expected.txt b/LayoutTests/editing/selection/find-yensign-and-backslash-expected.txt
index baa0493..6f0ee93 100644
--- a/LayoutTests/editing/selection/find-yensign-and-backslash-expected.txt
+++ b/LayoutTests/editing/selection/find-yensign-and-backslash-expected.txt
@@ -1,11 +1,13 @@
\-in-body
-
+
Results
We can find a backslash in EUC-JP page by finding a yen sign: PASS
We can find a backslash in EUC-JP text control by finding a yen sign: PASS
-We can find a backslash in Shift_JIS_X0213-2000 page by finding a yen sign: PASS
-We can find a backslash in Shift_JIS_X0213-2000 text control by finding a yen sign: PASS
+We can find a backslash in Shift_JIS page by finding a yen sign: PASS
+We can find a backslash in Shift_JIS text control by finding a yen sign: PASS
+We can find a backslash in ISO-2022-JP page by finding a yen sign: PASS
+We can find a backslash in ISO-2022-JP text control by finding a yen sign: PASS
We can NOT find a backslash in UTF8 page by finding a yen sign: PASS
We can NOT find a backslash in UTF8 text control by finding a yen sign: PASS
diff --git a/LayoutTests/editing/selection/find-yensign-and-backslash.html b/LayoutTests/editing/selection/find-yensign-and-backslash.html
index 1b748a1..216a3b6 100644
--- a/LayoutTests/editing/selection/find-yensign-and-backslash.html
+++ b/LayoutTests/editing/selection/find-yensign-and-backslash.html
@@ -1,6 +1,5 @@
<!DOCTYPE html>
<html>
-
<head>
<meta charset="UTF-8">
<script>
@@ -22,7 +21,7 @@ function shouldBeTrue(condition, testName)
function test()
{
// With these encodings, backslashes will be transcoded into yen signs.
- var encodings = ["EUC-JP", "Shift_JIS_X0213-2000"];
+ var encodings = ["EUC-JP", "Shift_JIS", "ISO-2022-JP"];
for (var i = 0; i < encodings.length; i++) {
var encoding = encodings[i];
var frameDocument = frames[i].document;
@@ -36,19 +35,17 @@ function test()
</script>
</head>
-
<body onload="test()">
<div>\-in-body</div>
<input value=\-in-input>
<iframe src="data:text/html;charset=EUC-JP,<body>\-in-body<input value=\-in-input></body>"></iframe>
-<iframe src="data:text/html;charset=Shift_JIS_X0213-2000,<body>\-in-body<input value=\-in-input></body>"></iframe>
+<iframe src="data:text/html;charset=Shift_JIS,<body>\-in-body<input value=\-in-input></body>"></iframe>
+<iframe src="data:text/html;charset=ISO-2022-JP,<body>\-in-body<input value=\-in-input></body>"></iframe>
<p>Results</p>
<p id="results">
</p>
-
</body>
-
</html>
diff --git a/LayoutTests/platform/chromium/test_expectations.txt b/LayoutTests/platform/chromium/test_expectations.txt
index c3c3b11..71a2dee 100644
--- a/LayoutTests/platform/chromium/test_expectations.txt
+++ b/LayoutTests/platform/chromium/test_expectations.txt
@@ -676,7 +676,6 @@ BUG28916 MAC : editing/pasteboard/paste-xml.xhtml = TEXT
// Flaky
BUG31803 MAC LINUX : editing/inserting/12882.html = IMAGE PASS
-BUG38653 MAC : editing/selection/find-yensign-and-backslash.html = TEXT
BUGWK45438 : editing/spelling/spelling-backspace-between-lines.html = TEXT
// Tests added in r69269.
diff --git a/WebCore/ChangeLog b/WebCore/ChangeLog
index de7f771..b47ccdf 100644
--- a/WebCore/ChangeLog
+++ b/WebCore/ChangeLog
@@ -1,3 +1,31 @@
+2010-12-08 Kent Tamura <tkent at chromium.org>
+
+ Reviewed by Alexey Proskuryakov.
+
+ Yensign hack should work with Shift_JIS and ISO-2022-JP encodings.
+ https://bugs.webkit.org/show_bug.cgi?id=49714
+
+ IE chooses a font which shows a yensign for 0x5c code point for a page
+ encoded in x-mac-japanese, ISO-2022-JP, EUC-JP, Shift_JIS, Shift_JIS_X0213-2000,
+ x-sjis, and Windows-31J.
+ We have emulated this behavior by replacing 0x5c with 0xa5 for EUC-JP and
+ Shift_JIS_X0213-2000. This change adds other encodings above.
+
+ Also, we move the HashSet initialization for isJapanese() and
+ backslashAsCurrencySymbol() to TextEncodingRegistry.cpp because of
+ ease of making them multi-thread safe.
+
+ * platform/text/TextEncoding.cpp:
+ (WebCore::TextEncoding::isJapanese): Just calls isJapaneseEncoding().
+ (WebCore::TextEncoding::backslashAsCurrencySymbol): Uses shouldShowBackslashAsCurrencySymbolIn().
+ * platform/text/TextEncodingRegistry.cpp:
+ (WebCore::addEncodingName): Moved from TextEncoding.cpp, and stop using atomicCanonicalTextEncodingName().
+ (WebCore::buildQuirksSets): Added. Initializes HashSets for isJapaneseEncoding() and shouldShowBackslashAsCurrencySymbolIn().
+ (WebCore::isJapaneseEncoding):
+ (WebCore::shouldShowBackslashAsCurrencySymbolIn):
+ (WebCore::extendTextCodecMaps): Add a call to buildQuirksSets().
+ * platform/text/TextEncodingRegistry.h:
+
2010-12-08 Andy Estes <aestes at apple.com>
Reviewed by Darin Adler.
diff --git a/WebCore/platform/text/TextEncoding.cpp b/WebCore/platform/text/TextEncoding.cpp
index 58e691f..33313a0 100644
--- a/WebCore/platform/text/TextEncoding.cpp
+++ b/WebCore/platform/text/TextEncoding.cpp
@@ -40,19 +40,11 @@
#include "GOwnPtr.h"
#endif
#include <wtf/text/CString.h>
-#include <wtf/HashSet.h>
#include <wtf/OwnPtr.h>
#include <wtf/StdLibExtras.h>
namespace WebCore {
-static void addEncodingName(HashSet<const char*>& set, const char* name)
-{
- const char* atomicName = atomicCanonicalTextEncodingName(name);
- if (atomicName)
- set.add(atomicName);
-}
-
static const TextEncoding& UTF7Encoding()
{
static TextEncoding globalUTF7Encoding("UTF-7");
@@ -173,39 +165,12 @@ bool TextEncoding::usesVisualOrdering() const
bool TextEncoding::isJapanese() const
{
- if (noExtendedTextEncodingNameUsed())
- return false;
-
- DEFINE_STATIC_LOCAL(HashSet<const char*>, set, ());
- if (set.isEmpty()) {
- addEncodingName(set, "x-mac-japanese");
- addEncodingName(set, "cp932");
- addEncodingName(set, "JIS_X0201");
- addEncodingName(set, "JIS_X0208-1983");
- addEncodingName(set, "JIS_X0208-1990");
- addEncodingName(set, "JIS_X0212-1990");
- addEncodingName(set, "JIS_C6226-1978");
- addEncodingName(set, "Shift_JIS_X0213-2000");
- addEncodingName(set, "ISO-2022-JP");
- addEncodingName(set, "ISO-2022-JP-2");
- addEncodingName(set, "ISO-2022-JP-1");
- addEncodingName(set, "ISO-2022-JP-3");
- addEncodingName(set, "EUC-JP");
- addEncodingName(set, "Shift_JIS");
- }
- return m_name && set.contains(m_name);
+ return isJapaneseEncoding(m_name);
}
UChar TextEncoding::backslashAsCurrencySymbol() const
{
- if (noExtendedTextEncodingNameUsed())
- return '\\';
-
- // The text encodings below treat backslash as a currency symbol.
- // See http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx for more information.
- static const char* const a = atomicCanonicalTextEncodingName("Shift_JIS_X0213-2000");
- static const char* const b = atomicCanonicalTextEncodingName("EUC-JP");
- return (m_name == a || m_name == b) ? 0x00A5 : '\\';
+ return shouldShowBackslashAsCurrencySymbolIn(m_name) ? 0x00A5 : '\\';
}
bool TextEncoding::isNonByteBasedEncoding() const
diff --git a/WebCore/platform/text/TextEncodingRegistry.cpp b/WebCore/platform/text/TextEncodingRegistry.cpp
index 6bf5552..c0c0255 100644
--- a/WebCore/platform/text/TextEncodingRegistry.cpp
+++ b/WebCore/platform/text/TextEncodingRegistry.cpp
@@ -36,6 +36,7 @@
#include <wtf/Assertions.h>
#include <wtf/HashFunctions.h>
#include <wtf/HashMap.h>
+#include <wtf/HashSet.h>
#include <wtf/StdLibExtras.h>
#include <wtf/StringExtras.h>
#include <wtf/Threading.h>
@@ -125,6 +126,8 @@ static Mutex& encodingRegistryMutex()
static TextEncodingNameMap* textEncodingNameMap;
static TextCodecMap* textCodecMap;
static bool didExtendTextCodecMaps;
+static HashSet<const char*>* japaneseEncodings;
+static HashSet<const char*>* nonBackslashEncodings;
static const char* const textEncodingNameBlacklist[] = {
"UTF-7"
@@ -249,6 +252,59 @@ static void buildBaseTextCodecMaps()
#endif
}
+static void addEncodingName(HashSet<const char*>* set, const char* name)
+{
+ // We must not use atomicCanonicalTextEncodingName() because this function is called in it.
+ const char* atomicName = textEncodingNameMap->get(name);
+ if (atomicName)
+ set->add(atomicName);
+}
+
+static void buildQuirksSets()
+{
+ // FIXME: Having isJapaneseEncoding() and shouldShowBackslashAsCurrencySymbolIn()
+ // and initializing the sets for them in TextEncodingRegistry.cpp look strange.
+
+ ASSERT(!japaneseEncodings);
+ ASSERT(!nonBackslashEncodings);
+
+ japaneseEncodings = new HashSet<const char*>();
+ addEncodingName(japaneseEncodings, "EUC-JP");
+ addEncodingName(japaneseEncodings, "ISO-2022-JP");
+ addEncodingName(japaneseEncodings, "ISO-2022-JP-1");
+ addEncodingName(japaneseEncodings, "ISO-2022-JP-2");
+ addEncodingName(japaneseEncodings, "ISO-2022-JP-3");
+ addEncodingName(japaneseEncodings, "JIS_C6226-1978");
+ addEncodingName(japaneseEncodings, "JIS_X0201");
+ addEncodingName(japaneseEncodings, "JIS_X0208-1983");
+ addEncodingName(japaneseEncodings, "JIS_X0208-1990");
+ addEncodingName(japaneseEncodings, "JIS_X0212-1990");
+ addEncodingName(japaneseEncodings, "Shift_JIS");
+ addEncodingName(japaneseEncodings, "Shift_JIS_X0213-2000");
+ addEncodingName(japaneseEncodings, "cp932");
+ addEncodingName(japaneseEncodings, "x-mac-japanese");
+
+ nonBackslashEncodings = new HashSet<const char*>();
+ // The text encodings below treat backslash as a currency symbol for IE compatibility.
+ // See http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx for more information.
+ addEncodingName(nonBackslashEncodings, "x-mac-japanese");
+ addEncodingName(nonBackslashEncodings, "ISO-2022-JP");
+ addEncodingName(nonBackslashEncodings, "EUC-JP");
+ // Shift_JIS_X0213-2000 is not the same encoding as Shift_JIS on Mac. We need to register both of them.
+ addEncodingName(nonBackslashEncodings, "Shift_JIS");
+ addEncodingName(nonBackslashEncodings, "Shift_JIS_X0213-2000");
+}
+
+bool isJapaneseEncoding(const char* canonicalEncodingName)
+{
+ return canonicalEncodingName && japaneseEncodings && japaneseEncodings->contains(canonicalEncodingName);
+}
+
+bool shouldShowBackslashAsCurrencySymbolIn(const char* canonicalEncodingName)
+{
+ return canonicalEncodingName && nonBackslashEncodings && nonBackslashEncodings->contains(canonicalEncodingName);
+}
+
static void extendTextCodecMaps()
{
#if USE(ICU_UNICODE)
@@ -277,6 +333,7 @@ static void extendTextCodecMaps()
#endif
pruneBlacklistedCodecs();
+ buildQuirksSets();
}
PassOwnPtr<TextCodec> newTextCodec(const TextEncoding& encoding)
diff --git a/WebCore/platform/text/TextEncodingRegistry.h b/WebCore/platform/text/TextEncodingRegistry.h
index 81b7c4c..16844c6 100644
--- a/WebCore/platform/text/TextEncodingRegistry.h
+++ b/WebCore/platform/text/TextEncodingRegistry.h
@@ -39,12 +39,12 @@ namespace WebCore {
// Use TextEncoding::encode to encode, since it takes care of normalization.
PassOwnPtr<TextCodec> newTextCodec(const TextEncoding&);
- // Only TextEncoding should use this function directly.
+ // Only TextEncoding should use the following functions directly.
const char* atomicCanonicalTextEncodingName(const char* alias);
const char* atomicCanonicalTextEncodingName(const UChar* aliasCharacters, size_t aliasLength);
-
- // Only TextEncoding should use this function directly.
bool noExtendedTextEncodingNameUsed();
+ bool isJapaneseEncoding(const char* canonicalEncodingName);
+ bool shouldShowBackslashAsCurrencySymbolIn(const char* canonicalEncodingName);
#ifndef NDEBUG
void dumpTextEncodingNameMap();
--
WebKit Debian packaging