From: <mi...@us...> - 2017-03-19 23:04:52
Revision: 8050
http://sourceforge.net/p/docutils/code/8050
Author: milde
Date: 2017-03-19 23:04:50 +0000 (Sun, 19 Mar 2017)
Log Message:
-----------
Fix [ 313 ] differentiate apostrophe from single quote (if possible).
Note that this is not possible for an apostrophe at the end of a word.
Modified Paths:
--------------
trunk/docutils/HISTORY.txt
trunk/docutils/docutils/utils/smartquotes.py
trunk/docutils/test/test_transforms/test_smartquotes.py
Modified: trunk/docutils/HISTORY.txt
===================================================================
--- trunk/docutils/HISTORY.txt 2017-03-13 21:49:51 UTC (rev 8049)
+++ trunk/docutils/HISTORY.txt 2017-03-19 23:04:50 UTC (rev 8050)
@@ -58,6 +58,8 @@
- Update quote definitions for languages et, fi, ro, sv, tr, uk.
- New quote definitions for hr, hsb, hu, lv, sl.
+ - Fix [ 313 ] Differentiate apostrophe from closing single quote
+ (if possible).
* docutils/writers/_html_base.py
Modified: trunk/docutils/docutils/utils/smartquotes.py
===================================================================
--- trunk/docutils/docutils/utils/smartquotes.py 2017-03-13 21:49:51 UTC (rev 8049)
+++ trunk/docutils/docutils/utils/smartquotes.py 2017-03-19 23:04:50 UTC (rev 8050)
@@ -160,21 +160,21 @@
Backslash Escapes
=================
-If you need to use literal straight quotes (or plain hyphens and
-periods), SmartyPants accepts the following backslash escape sequences
-to force non-smart punctuation. It does so by transforming the escape
-sequence into a character:
+If you need to use literal straight quotes (or plain hyphens and periods),
+`smartquotes` accepts the following backslash escape sequences to force
+ASCII punctuation. Note that you need two backslashes, because Docutils
+expands backslash escapes, too.
-======== ===== =========
-Escape Value Character
-======== ===== =========
-``\\\\`` \ \\
-\\" " "
-\\' ' '
-\\. . .
-\\- - \-
-\\` ` \`
-======== ===== =========
+======== =========
+Escape Character
+======== =========
+``\\`` \\
+``\\"`` \\"
+``\\'`` \\'
+``\\.`` \\.
+``\\-`` \\-
+``\\``` \\`
+======== =========
This is useful, for example, when you want to use straight quotes as
foot and inch marks: 6\\'2\\" tall; a 17\\" iMac.
@@ -274,7 +274,7 @@
continue not caring. Using straight quotes -- and sticking to the 7-bit
ASCII character set in general -- is certainly a simpler way to live.
-Even if you I *do* care about accurate typography, you still might want to
+Even if you *do* care about accurate typography, you still might want to
think twice before educating the quote characters in your weblog. One side
effect of publishing curly quote characters is that it makes your
weblog a bit harder for others to quote from using copy-and-paste. What
@@ -305,16 +305,45 @@
``'Twas the night before Christmas.``
In the case above, SmartyPants will turn the apostrophe into an opening
-single-quote, when in fact it should be a closing one. I don't think
-this problem can be solved in the general case -- every word processor
-I've tried gets this wrong as well. In such cases, it's best to use the
-proper character for closing single-quotes (``’``) by hand.
+single-quote, when in fact it should be the `right single quotation mark`
+character which is also "the preferred character to use for apostrophe"
+(Unicode). I don't think this problem can be solved in the general case --
+every word processor I've tried gets this wrong as well. In such cases, it's
+best to use the proper character for closing single-quotes (’) by hand.
+In English, the same character is used for apostrophe and closing single
+quote (both plain and "smart" ones). For other locales (French, Italian,
+Swiss, ...) "smart" single closing quotes differ from the curly apostrophe.
+ .. class:: language-fr
+
+ Il dit : "C'est 'super' !"
+
+If the apostrophe is used at the end of a word, it cannot be distinguished
+from a single quote by the algorithm. Therefore, a text like::
+
+ .. class:: language-de-CH
+
+ "Er sagt: 'Ich fass' es nicht.'"
+
+will get a single closing guillemet instead of an apostrophe.
+
+This can be prevented by use of the curly apostrophe character (’) in
+the source:
+
+ .. class:: language-de-CH
+
+ "Er sagt: 'Ich fass' es nicht.'" → "Er sagt: 'Ich fass’ es nicht.'"
+
+
Version History
===============
-1.7 2012-11-19
+1.7.1: 2017-03-19
+ - Update and extend language-dependent quotes.
+ - Differentiate apostrophe from single quote.
+
+1.7: 2012-11-19
- Internationalization: language-dependent quotes.
1.6.1: 2012-11-06
@@ -370,6 +399,7 @@
endash = u'–' # "–" EN DASH
emdash = u'—' # "—" EM DASH
ellipsis = u'…' # "…" HORIZONTAL ELLIPSIS
+ apostrophe = u'’'
# quote characters (language-specific, set in __init__())
#
@@ -395,10 +425,10 @@
'da-x-altquot': u'„“‚‘',
'de': u'„“‚‘',
'de-x-altquot': u'»«›‹',
- 'de-CH': u'«»‹›',
+ 'de-ch': u'«»‹›',
'el': u'«»“”',
'en': u'“”‘’',
- 'en-UK': u'‘’“”',
+ 'en-uk': u'‘’“”',
'eo': u'“”‘’',
'es': u'«»“”',
'es-x-altquot': u'“”‘’',
@@ -410,7 +440,7 @@
'fr': (u'« ', u' »', u'‹ ', u' ›'), # with narrow no-break space
'fr-x-altquot': u'«»‹›', # for use with manually set spaces
# 'fr-x-altquot2': (u'“ ', u' ”', u'‘ ', u' ’'), # rarely used
- 'fr-CH': u'«»‹›',
+ 'fr-ch': u'«»‹›',
'gl': u'«»“”',
'he': u'”“»«',
'he-x-altquot': u'„”‚’',
@@ -420,7 +450,7 @@
'hsb-x-altquot':u'»«›‹',
'hu': u'„”«»',
'it': u'«»“”',
- 'it-CH': u'«»‹›',
+ 'it-ch': u'«»‹›',
'it-x-altquot': u'“”‘’',
# 'it-x-altquot2': u'“„‘‚', # antiquated?
'ja': u'「」『』',
@@ -432,7 +462,7 @@
'pl': u'„”«»',
'pl-x-altquot': u'«»“”',
'pt': u'«»“”',
- 'pt-BR': u'“”‘’',
+ 'pt-br': u'“”‘’',
'ro': u'„”«»',
'ru': u'«»„“',
'sk': u'„“‚‘',
@@ -446,8 +476,8 @@
# 'tr-x-altquot2': u'“„‘‚', # antiquated?
'uk': u'«»„“',
'uk-x-altquot': u'„“‚‘',
- 'zh-CN': u'“”‘’',
- 'zh-TW': u'「」『』',
+ 'zh-cn': u'“”‘’',
+ 'zh-tw': u'「」『』',
}
def __init__(self, language='en'):
@@ -454,7 +484,7 @@
self.language = language
try:
(self.opquote, self.cpquote,
- self.osquote, self.csquote) = self.quotes[language]
+ self.osquote, self.csquote) = self.quotes[language.lower()]
except KeyError:
self.opquote, self.cpquote, self.osquote, self.csquote = u'""\'\''
@@ -624,14 +654,22 @@
)
' # the quote
(?=\w) # followed by a word character
- """ % (dec_dashes,), re.VERBOSE)
+ """ % (dec_dashes,), re.VERBOSE | re.UNICODE)
text = opening_single_quotes_regex.sub(r'\1'+smart.osquote, text)
+ # In many locales, single closing quotes are different from apostrophe:
+ if smart.csquote != smart.apostrophe:
+ apostrophe_regex = re.compile(r"(?<=(\w|\d))'(?=\w)", re.UNICODE)
+ text = apostrophe_regex.sub(smart.apostrophe, text)
+
closing_single_quotes_regex = re.compile(r"""
(%s)
'
- (?!\s | s\b | \d)
- """ % (close_class,), re.VERBOSE)
+ (?!\s | # whitespace
+ s\b |
+ \d # digits ('80s)
+ )
+ """ % (close_class,), re.VERBOSE | re.UNICODE)
text = closing_single_quotes_regex.sub(r'\1'+smart.csquote, text)
closing_single_quotes_regex = re.compile(r"""
@@ -638,7 +676,7 @@
(%s)
'
(\s | s\b)
- """ % (close_class,), re.VERBOSE)
+ """ % (close_class,), re.VERBOSE | re.UNICODE)
text = closing_single_quotes_regex.sub(r'\1%s\2' % smart.csquote, text)
# Any remaining single quotes should be opening ones:
@@ -879,7 +917,7 @@
pass
from docutils.core import publish_string
- docstring_html = publish_string(__doc__, writer_name='html')
+ docstring_html = publish_string(__doc__, writer_name='html5')
print docstring_html
@@ -912,11 +950,3 @@
self.assertEqual(sp(text), text)
unittest.main()
-
-
-
-
-__author__ = "Chad Miller <sma...@ch...>"
-__version__ = "1.5_1.6: Fri, 27 Jul 2007 07:06:40 -0400"
-__url__ = "http://wiki.chad.org/SmartyPantsPy"
-__description__ = "Smart-quotes, smart-ellipses, and smart-dashes for weblog entries in pyblosxom"
Modified: trunk/docutils/test/test_transforms/test_smartquotes.py
===================================================================
--- trunk/docutils/test/test_transforms/test_smartquotes.py 2017-03-13 21:49:51 UTC (rev 8049)
+++ trunk/docutils/test/test_transforms/test_smartquotes.py 2017-03-19 23:04:50 UTC (rev 8050)
@@ -41,7 +41,7 @@
totest['transitions'] = ((SmartQuotes,), [
["""\
-Test "smart quotes", 'single smart quotes',
+Test "smart quotes", 'secondary smart quotes',
"'nested' smart" quotes
-- and ---also long--- dashes.
""",
@@ -48,11 +48,11 @@
u"""\
<document source="test data">
<paragraph>
- Test “smart quotes”, ‘single smart quotes’,
+ Test “smart quotes”, ‘secondary smart quotes’,
“‘nested’ smart” quotes
– and —also long— dashes.
"""],
-[r"""Escaped \\"smart quotes\\", \\'single smart quotes\\',
+[r"""Escaped \\"smart quotes\\", \\'secondary smart quotes\\',
\\"\\'nested\\' smart\\" quotes
\\-- and -\\--also long-\\-- dashes.
""",
@@ -59,7 +59,7 @@
u"""\
<document source="test data">
<paragraph>
- Escaped "smart quotes", 'single smart quotes',
+ Escaped "smart quotes", 'secondary smart quotes',
"'nested' smart" quotes
-- and ---also long--- dashes.
"""],
@@ -155,8 +155,12 @@
["""\
.. class:: language-de
-German "smart quotes" and 'single smart quotes'.
+German "smart quotes" and 'secondary smart quotes'.
+.. class:: language-en-UK
+
+British "quotes" use single and 'secondary quotes' double quote signs.
+
.. class:: language-foo
"Quoting style" for unknown languages is 'ASCII'.
@@ -163,17 +167,19 @@
.. class:: language-de-x-altquot
-Alternative German "smart quotes" and 'single smart quotes'.
+Alternative German "smart quotes" and 'secondary smart quotes'.
""",
u"""\
<document source="test data">
<paragraph classes="language-de">
- German „smart quotes“ and ‚single smart quotes‘.
+ German „smart quotes“ and ‚secondary smart quotes‘.
+ <paragraph classes="language-en-uk">
+ British ‘quotes’ use single and “secondary quotes” double quote signs.
<paragraph classes="language-foo">
"Quoting style" for unknown languages is 'ASCII'.
<paragraph classes="language-de-x-altquot">
- Alternative German »smart quotes« and ›single smart quotes‹.
- <system_message level="2" line="7" source="test data" type="WARNING">
+ Alternative German »smart quotes« and ›secondary smart quotes‹.
+ <system_message level="2" line="11" source="test data" type="WARNING">
<paragraph>
No smart quotes defined for language "foo".
"""],
@@ -181,28 +187,31 @@
totest_de['transitions'] = ((SmartQuotes,), [
["""\
-German "smart quotes" and 'single smart quotes'.
+German "smart quotes" and 'secondary smart quotes'.
-.. class:: language-en-UK
+.. class:: language-en
-English "smart quotes" and 'single smart quotes'.
+English "smart quotes" and 'secondary smart quotes'.
""",
u"""\
<document source="test data">
<paragraph>
- German „smart quotes“ and ‚single smart quotes‘.
- <paragraph classes="language-en-uk">
- English “smart quotes” and ‘single smart quotes’.
+ German „smart quotes“ and ‚secondary smart quotes‘.
+ <paragraph classes="language-en">
+ English “smart quotes” and ‘secondary smart quotes’.
"""],
])
totest_de_alt['transitions'] = ((SmartQuotes,), [
["""\
-Alternative German "smart quotes" and 'single smart quotes'.
+Alternative German "smart quotes" and 'secondary smart quotes'.
+In this case, the apostrophe isn't a closing secondary quote!
+
.. class:: language-en-UK
-English "smart quotes" and 'single smart quotes' have no alternative.
+British "quotes" use single and 'secondary quotes' double quote signs
+(there are no alternative quotes defined).
.. class:: language-ro
@@ -211,9 +220,12 @@
u"""\
<document source="test data">
<paragraph>
- Alternative German »smart quotes« and ›single smart quotes‹.
+ Alternative German »smart quotes« and ›secondary smart quotes‹.
+ <paragraph>
+ In this case, the apostrophe isn’t a closing secondary quote!
<paragraph classes="language-en-uk">
- English “smart quotes” and ‘single smart quotes’ have no alternative.
+ British ‘quotes’ use single and “secondary quotes” double quote signs
+ (there are no alternative quotes defined).
<paragraph classes="language-ro">
Romanian „smart quotes” and «secondary» smart quotes.
"""],