[css-text-4] text-spacing needs to handle non-fullwidth punctuation also #6091

Open
fantasai opened this issue Mar 10, 2021 · 10 comments
Labels
css-text-4 i18n-clreq Chinese language enablement i18n-jlreq Japanese language enablement i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. Needs Feedback/Review

Comments

@fantasai
Collaborator

“京都（日本）” This doesn't collapse between the closing parenthesis and the closing quote, but it really ought to. (Note that curly quotes are used in Chinese, and are rendered fullwidth even though they use the same codepoint.)

@xfq xfq added i18n-clreq Chinese language enablement i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. labels Mar 11, 2021
@xfq
Member

xfq commented Mar 11, 2021

See clreq § 3.1.6.1 and § 3.1.6.2 for requirements for Chinese layout.

@macnmm

macnmm commented Mar 12, 2021

JLReq TF is currently discussing this class of issues. Font information is needed to augment the spacing rules of Unicode code points when the code point could be full-width or not depending on the font, and the collapse of extra space to achieve a zero-point is a necessary first step of correct glyph spacing adjustment (the second being to selectively restore some spacing).

@fantasai
Collaborator Author

fantasai commented Sep 12, 2022

The single and double curly quotation marks are explicitly called out in the spec already, because of their usage in Chinese. So this issue is focused on other punctuation, such as non-fullwidth brackets, guillemets, etc. E.g. {京都（日本）}

@kidayasuo

We've discussed this at the JLReq TF meeting and developed a note (in Japanese). In sum, when a proportional opening bracket is placed before a fullwidth opening bracket (cl-01 in JLReq), and when a proportional closing bracket is placed after a fullwidth closing bracket or a fullwidth fullstop & comma (cl-02/06/07), the extra space within the fullwidth character should be removed.

e.g. these cases
[「
」]
。]
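
As a rough illustration of the rule summarized above, the adjacent pairs can be detected from Unicode properties alone. This is only a sketch, not the JLReq TF's or the spec's algorithm: the function name `trim_positions`, the use of the Ps/Pe general categories to stand in for "bracket", the use of East_Asian_Width F/W to stand in for "fullwidth", and the hand-picked set of fullwidth stops and commas are all assumptions made for the example. As noted earlier in the thread, real behavior also depends on the font.

```python
import unicodedata

# Assumption: a hand-picked stand-in for JLReq's fullwidth full stops and commas.
FULLWIDTH_STOPS_COMMAS = set("。．、，")


def is_wide(ch):
    """Treat East_Asian_Width F or W as 'fullwidth' (an approximation)."""
    return unicodedata.east_asian_width(ch) in ("F", "W")


def trim_positions(text):
    """Return (index, side) pairs naming the fullwidth character whose
    built-in blank space would be removed, and on which side."""
    hits = []
    for i in range(len(text) - 1):
        a, b = text[i], text[i + 1]
        cat_a, cat_b = unicodedata.category(a), unicodedata.category(b)
        # Proportional opening bracket before a fullwidth opening bracket.
        if cat_a == "Ps" and not is_wide(a) and cat_b == "Ps" and is_wide(b):
            hits.append((i + 1, "start"))
        # Proportional closing bracket after a fullwidth closing bracket,
        # full stop, or comma.
        if cat_b == "Pe" and not is_wide(b) and (
            (cat_a == "Pe" and is_wide(a)) or a in FULLWIDTH_STOPS_COMMAS
        ):
            hits.append((i, "end"))
    return hits


print(trim_positions("[「京都」]"))
```

For the three cases listed above, the sketch flags the fullwidth character in each pair and leaves ordinary ASCII-only text such as `[a]` untouched.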

@himorin
Contributor

himorin commented Nov 16, 2022

> We've discussed this at the JLReq TF meeting and developed a note (in Japanese). In sum, when a proportional opening bracket is placed before a fullwidth opening bracket (cl-01 in JLReq), and when a proportional closing bracket is placed after a fullwidth closing bracket or a fullwidth fullstop & comma (cl-02/06/07), the extra space within the fullwidth character should be removed.

The JLReq TF has agreed to update the jlreq(-d) spacing property document along the lines above (a version defined in terms of Unicode properties is under development). The update will cover all pairs of non-fullwidth punctuation (as used in Latin script, extended to the whole of Unicode) and fullwidth punctuation (characters whose EAW is F or W and that were included in the JLReq definitions), by:

  • adding character classes for the non-wide counterparts of the already defined classes A/B/BA (for example "Ap" or something similar)
  • expanding the existing table for omitting spacing to use the newly added character classes, copying conditions where appropriate

for more, see e.g.: w3c/jlreq#340 (comment)
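
To make the "wide vs. non-wide counterpart" split concrete, here is a small sketch that sorts bracketing punctuation by East_Asian_Width. The function name `punctuation_class` and the class labels it returns are hypothetical and are not the class names used in jlreq-d; using the Ps/Pe general categories to pick out brackets is likewise an assumption of the example.

```python
import unicodedata


def punctuation_class(ch):
    """Label a bracket as wide or non-wide, opening or closing.

    Returns None for anything that is not Ps (open) or Pe (close).
    The labels are illustrative only, not jlreq-d class names.
    """
    cat = unicodedata.category(ch)
    if cat not in ("Ps", "Pe"):
        return None
    role = "opening" if cat == "Ps" else "closing"
    # EAW F or W is treated as fullwidth, as in the comment above.
    wide = unicodedata.east_asian_width(ch) in ("F", "W")
    return ("wide " if wide else "non-wide ") + role


for ch in "（(》]":
    print(f"U+{ord(ch):04X}", punctuation_class(ch))
```

A table for omitting spacing could then be keyed on pairs of such labels, though which pairs actually trim is for the JLReq document to define.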

@frivoal
Collaborator

frivoal commented Jan 27, 2023

Added a commit that should handle non-CJK "proportional" bracketing punctuation: cc2ee4c

It does not handle non-CJK stops and commas like ）. (full width closing parenthesis followed by ASCII period) or 」, (full width right corner bracket followed by ASCII comma), or equivalent things in other writing systems.

Should it? If so, how do we identify the relevant set(s) of characters?

It also doesn't handle ambiguous punctuation from the Pf / Pi categories (like ‘ ’ or « ») other than double curly quotes which are already special-cased due to Chinese. That's probably unavoidable though.
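
For reference, the Pi / Pf assignments of the quotation marks mentioned above can be checked with Python's standard `unicodedata` module. The helper name `quote_categories` is made up for this sketch; it only reports general categories and says nothing about how a given language or font uses each mark.

```python
import unicodedata


def quote_categories(chars):
    """Map each character to its Unicode general category."""
    return {f"U+{ord(c):04X}": unicodedata.category(c) for c in chars}


# Single curly quotes, guillemets, and double curly quotes.
for codepoint, category in quote_categories("‘’«»“”").items():
    print(codepoint, category)
```

The category alone records only "initial" or "final"; it cannot say whether a particular mark opens or closes a quotation in a given language, which is where the ambiguity comes from.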

@xfq
Member

xfq commented Mar 2, 2023

The CLReq TF discussed this issue and concluded that the spacing around non-fullwidth punctuation should not be changed in Chinese, because these marks are not part of Chinese punctuation and should be handled according to the rules of Western punctuation.

We'd be happy to continue discussing a possible solution with the JLReq TF and the CSSWG, for example, different behaviour depending on the language?

@MurakamiShinyu
Collaborator

I understand that some Chinese fonts, such as "Microsoft JhengHei", have fullwidth brackets center-aligned within the full width, and that this causes problems if they are trimmed next to adjacent non-fullwidth punctuation.

How about Korean text layout? In Korean text, non-fullwidth punctuation is primarily used, and fullwidth brackets are also used occasionally. So I thought that the text-spacing handling of adjacent fullwidth and non-fullwidth punctuation might be important for Korean text.

I checked Korean Wikipedia articles that contain adjacent fullwidth and non-fullwidth punctuation.

sample1

… 잉글랜드 유복한 집안에서 태어나 런던으로 이주하고서 본격 작품 활동을 시작하여 일약 명성을 얻었고, 생전에 '영국 최고의 극작가' 지위에 올랐다. 《로미오와 줄리엣》, 《햄릿》처럼 인간 내면을 통찰한 걸작을 남겼으며, …

sample2

… 로버트 듀발, 리처드 해리스 주연의 1993년 영화 《월터와 프랭크》(원제 《헤밍웨이와 레슬링하기》)는 플로리다의 해안 마을에서 은퇴한 두 친구의 우정을 다루었다. …
  • Hemingway, Ernest (2013). 《헤밍웨이 단편선(1~2 합본)》. 번역 김욱동. …
  • Hemingway, Ernest (2013) [1932]. 《오후의 죽음》. 번역 장왕록. …

These examples use 《 (U+300A LEFT DOUBLE ANGLE BRACKET) and 》 (U+300B RIGHT DOUBLE ANGLE BRACKET), which are East_Asian_Width=Wide, but interestingly these fullwidth brackets appear to be half width with my browser's default Korean font setting (Korean font: "Apple SD Gothic Neo").
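
A quick way to find such adjacent pairs in running text is sketched below. It is an illustration only: the function name `mixed_width_pairs` is made up, "punctuation" is approximated as any general category starting with P, and "fullwidth" as East_Asian_Width F or W. It looks at code points only, so it cannot tell how wide a given font actually draws the glyph.

```python
import unicodedata


def is_punct(ch):
    """Any Unicode punctuation category (Ps, Pe, Pi, Pf, Po, Pd, Pc)."""
    return unicodedata.category(ch).startswith("P")


def is_wide(ch):
    """East_Asian_Width F or W, used as a stand-in for 'fullwidth'."""
    return unicodedata.east_asian_width(ch) in ("F", "W")


def mixed_width_pairs(text):
    """List adjacent punctuation pairs where exactly one side is wide."""
    pairs = []
    for a, b in zip(text, text[1:]):
        if is_punct(a) and is_punct(b) and is_wide(a) != is_wide(b):
            pairs.append(a + b)
    return pairs


sample2 = "영화 《월터와 프랭크》(원제 《헤밍웨이와 레슬링하기》)는"
print(unicodedata.east_asian_width("《"))
print(mixed_width_pairs(sample2))
```

On the sample2 fragment this picks out the pairs where 》 is followed by an ASCII parenthesis, which are the spots where the spacing differs between the fonts compared here.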

Screenshots, with Korean font "Apple SD Gothic Neo":

[Screenshots: ko-sample1-AppleSDGothicNeo, ko-sample2a-AppleSDGothicNeo, ko-sample2b-AppleSDGothicNeo]

If such Korean fonts are very common, text-spacing-trim would not be necessary for Korean text. However, I found that this is not the case for other Korean fonts, such as AppleGothic, AppleMyungjo, Noto Sans KR, or Source Han Sans K.

Screenshots, with Korean font "AppleGothic":

[Screenshots: ko-sample1-AppleGothic, ko-sample2a-AppleGothic, ko-sample2b-AppleGothic]

In these results, the spacing between adjacent fullwidth and non-fullwidth punctuation does not look very good.

We will need to hear from Korean text layout experts.

@MurakamiShinyu
Collaborator

For the example case above, text-spacing: trim-all may be a better option:

@xfq
Member

xfq commented Mar 29, 2023

> The CLReq TF discussed this issue and concluded that the spacing around non-fullwidth punctuation should not be changed in Chinese, because these marks are not part of Chinese punctuation and should be handled according to the rules of Western punctuation.

We discussed this issue again during the I18N ⇔ CSS Call and another CLReq Editors' Call, and we're now OK with trimming the spacing around non-fullwidth punctuation in Chinese (i.e., the current JLReq/CSSWG consensus).

7 participants