emacs-orgmode@gnu.org archives
* How to force markup without spaces
@ 2012-11-19  5:32 cinsky
  2012-11-19  7:11 ` Vladimir Lomov
  0 siblings, 1 reply; 13+ messages in thread
From: cinsky @ 2012-11-19  5:32 UTC
  To: emacs-orgmode


Hi,

AFAIK, if marked-up text (=code=, *bold*, ...) is directly followed
by a non-whitespace character, it will not be rendered as markup:

   =hello=there
   /not/italic

This may be the right decision for English text, but in some languages
a postposition (a grammatical particle) is attached directly to the
preceding noun without a space, so this causes trouble.  (The following
text contains Korean characters in UTF-8; you may need a Korean font to
read it properly.)

   =printf=는
   =bold=로
   =철수=는

I'm sure some other languages have the same problem
(e.g. Japanese or Chinese).

Is there any way to force markup in this situation?

If this cannot be implemented easily, how about introducing a new
escape character, so that no whitespace has to be inserted between the
marked-up text and the postfix text that follows?  For example:

  =printf=\is      => rendered in HTML: <code>printf</code>is
  *bold*\asdf      => rendered in HTML: <b>bold</b>asdf
  /철수/\는        => rendered in HTML: <i>철수</i>는

I can't say the above solution is well designed, but I'm sure
you get the point.

Thanks.

-- 
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
Korean Ver: http://www.cinsk.org/cfaqs/

* Re: How to force markup without spaces
  2012-11-19  5:32 cinsky
@ 2012-11-19  7:11 ` Vladimir Lomov
  2012-11-19 10:06   ` Seong-Kook Shin
  0 siblings, 1 reply; 13+ messages in thread
From: Vladimir Lomov @ 2012-11-19  7:11 UTC
  To: cinsky; +Cc: emacs-orgmode

Hello,

Maybe this will help you:
http://article.gmane.org/gmane.emacs.orgmode/46263/match=zero+width+space
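
The `match=zero+width+space' part of that link points at the
zero-width-space workaround: put U+200B ZERO WIDTH SPACE between the
closing marker and the text that follows. A minimal sketch of a helper
for typing that character; the command name is hypothetical and not part
of Org:

```elisp
;; Hypothetical helper, not part of Org: insert the invisible
;; separator that lets emphasis end right before a non-space character.
(defun my/org-insert-zero-width-space ()
  "Insert U+200B ZERO WIDTH SPACE at point."
  (interactive)
  (insert-char #x200b))
```

With point right after the closing marker (e.g. after the second "=" in
=printf=는), M-x my/org-insert-zero-width-space adds the separator.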


* Re: How to force markup without spaces
  2012-11-19  7:11 ` Vladimir Lomov
@ 2012-11-19 10:06   ` Seong-Kook Shin
  2012-11-19 14:40     ` Suvayu Ali
  0 siblings, 1 reply; 13+ messages in thread
From: Seong-Kook Shin @ 2012-11-19 10:06 UTC
  To: Vladimir Lomov; +Cc: emacs-orgmode

Yes, thanks for the solution.

By the way, I'd prefer the "word joiner" character (U+2060) to the
"zero width space" character (U+200B), because a postposition should
not be separated from its noun at a line break.

Anyway, is there any plan to implement this feature in another way?
The solution you provided ties the Org document to Unicode,
so it can't be used with other character encodings.

Thanks.

-- 
C FAQs: http://c-faq.com/
Korean: http://www.cinsk.org/cfaqs/

* Re: How to force markup without spaces
  2012-11-19 10:06   ` Seong-Kook Shin
@ 2012-11-19 14:40     ` Suvayu Ali
  2012-12-13 21:26       ` Bastien
  0 siblings, 1 reply; 13+ messages in thread
From: Suvayu Ali @ 2012-11-19 14:40 UTC
  To: emacs-orgmode

On Mon, Nov 19, 2012 at 07:06:10PM +0900, Seong-Kook Shin wrote:
> 
> Anyway, is there any plan to implement this feature in another way?
> The solution you provided ties the Org document to Unicode,
> so it can't be used with other character encodings.
> 

AFAIK, this will not be included;

  <http://thread.gmane.org/gmane.emacs.orgmode/59881/focus=59971>


-- 
Suvayu

Open source is the future. It sets us free.

* Re: How to force markup without spaces
  2012-11-19 14:40     ` Suvayu Ali
@ 2012-12-13 21:26       ` Bastien
  2022-07-25 17:50         ` K
  2022-07-25 18:27         ` K
  0 siblings, 2 replies; 13+ messages in thread
From: Bastien @ 2012-12-13 21:26 UTC
  To: Suvayu Ali; +Cc: emacs-orgmode

Hi,

Suvayu Ali <fatkasuvayu+linux@gmail.com> writes:

>> Anyway, is there any plan to implement this feature in another way?
>> The solution you provided ties the Org document to Unicode,
>> so it can't be used with other character encodings.
>> 
>
> AFAIK, this will not be included;
>
>   <http://thread.gmane.org/gmane.emacs.orgmode/59881/focus=59971>

More precisely, this can be included once we decide to drop support
for Emacs 22.

Does anyone know the current backward-compatibility state of the
major built-in Emacs packages (Gnus, ERC, etc.) with respect to Emacs 22?

Thanks,

-- 
 Bastien

* Re: How to force markup without spaces
  2012-12-13 21:26       ` Bastien
@ 2022-07-25 17:50         ` K
  2022-07-25 18:27         ` K
  1 sibling, 0 replies; 13+ messages in thread
From: K @ 2022-07-25 17:50 UTC
  To: Bastien, Suvayu Ali; +Cc: emacs-orgmode


Hello everyone, I am a Chinese user and have also come across this problem.

Bastin once wrote this almost a decade ago:

> More precisely, this can be included once we decide to drop support
> for Emacs 22.
> 
> Does anyone know the current backward-compatibility state of the
> major built-in Emacs packages (Gnus, ERC, etc.) with respect to Emacs 22?
> 
> Thanks,
> 

Now that Emacs 28.1 has been released, could this problem be solved?

Although we have the zero-width space workaround, in some fonts the
character is not rendered with zero width. So it would be nice to solve
this problem.


* Re: How to force markup without spaces
  2022-07-25 18:27         ` K
@ 2022-07-25 19:02           ` K
  2022-07-26  1:26             ` Ihor Radchenko
  0 siblings, 1 reply; 13+ messages in thread
From: K @ 2022-07-25 19:02 UTC
  To: k_foreign; +Cc: bzg, emacs-orgmode

> Bastin once wrote this almost a decade ago:

Sorry for the misspelling, the name is Bastien, not Bastin.

The thread and post I am mentioning is at
https://list.orgmode.org/orgmode/87bodxy77m.fsf@bzg.ath.cx/




* Re: How to force markup without spaces
  2022-07-25 19:02           ` K
@ 2022-07-26  1:26             ` Ihor Radchenko
  2022-07-26  2:23               ` Max Nikulin
  0 siblings, 1 reply; 13+ messages in thread
From: Ihor Radchenko @ 2022-07-26  1:26 UTC
  To: K; +Cc: bzg, emacs-orgmode

K <k_foreign@outlook.com> writes:

> The thread and post I am mentioning is at
> https://list.orgmode.org/orgmode/87bodxy77m.fsf@bzg.ath.cx/

That thread references yet another thread at
http://thread.gmane.org/gmane.emacs.orgmode/59881/focus=59971
However, gmane links are no longer working.
Do you happen to know which thread id or subject the link is referring
to in the mailing list archive?

Regarding markup without spaces: we have discussed something
called "inline special blocks" in
https://orgmode.org/list/87a6b8pbhg.fsf@posteo.net
Such blocks could be used as an alternative markup.

Another idea we have discussed is using something similar to Markdown
format: **bold**, //italics//, __underline__, etc. It is less verbose
compared to the special blocks, which should be valuable for
Japanese/Chinese/other languages with no spaces between words.

Best,
Ihor


* Re: How to force markup without spaces
  2022-07-26  1:26             ` Ihor Radchenko
@ 2022-07-26  2:23               ` Max Nikulin
  2022-07-26  4:26                 ` K K
  0 siblings, 1 reply; 13+ messages in thread
From: Max Nikulin @ 2022-07-26  2:23 UTC
  To: emacs-orgmode

On 26/07/2022 08:26, Ihor Radchenko wrote:
> K writes:
> 
>> The thread and post I am mentioning is at
>> https://list.orgmode.org/orgmode/87bodxy77m.fsf@bzg.ath.cx/
> 
> That thread references yet another thread at
> http://thread.gmane.org/gmane.emacs.orgmode/59881/focus=59971
> However, gmane links are no longer working.
> Do you happen to know which thread id or subject the link is referring
> to in the mailing list archive?

https://list.orgmode.org/orgmode/9C09CF9B-5B8F-4435-98D0-7E0B32BA5ACA@nf.mpg.de/T/
Stefan Vollmar. suggestion for org-emphasis-regexp-components: *U*nited 
*N*ations. 2012-09-05  8:05 UTC

However, the suggestion there was precisely to use U+200B ZERO WIDTH SPACE,
and that is actually implemented, since `org-emphasis-regexp-components'
currently contains [:space:].

U+2060 WORD JOINER (suggested earlier in this thread) is not a space, so
currently it cannot be used in this role. A recent mention of this character:
Tom Gillespie. On zero width spaces and Org syntax. Fri, 3 Dec 2021 
20:04:28 -0800. 
https://CA+G3_PM4cxHa8bU+3QG541UiOauLNAQFZQu-+UKczx3itOeTHg@mail.gmail.com
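
To see which of the two characters the [:space:] class accepts, something
like the following can be evaluated in an Org buffer. This is only a
sketch: both function names are hypothetical, and the result depends on
the syntax table of the current buffer, so no outcome is claimed here.

```elisp
(require 'org)

;; Hypothetical helper: does CHAR match the regexp class [[:space:]]?
(defun my/space-class-p (char)
  "Return t if CHAR matches [[:space:]], else nil.
The class is resolved against the syntax table of the current buffer."
  (and (string-match-p "[[:space:]]" (string char)) t))

;; Hypothetical helper: expose the part of the variable that matters here.
(defun my/org-emphasis-post-chars ()
  "Return the POST element of `org-emphasis-regexp-components'.
It lists the characters allowed right after a closing marker."
  (nth 1 org-emphasis-regexp-components))

;; Evaluate these in an Org buffer and compare the results:
;;   (my/space-class-p #x200b)   ; ZERO WIDTH SPACE
;;   (my/space-class-p #x2060)   ; WORD JOINER
```

A character that is neither in the POST string nor matched by [:space:]
cannot terminate emphasis under the current rules.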

K, could you please clarify your particular use case? Some other
workarounds, e.g. custom links, were discussed during the last couple
of years.

P.S. list.orgmode.org supports search by gmane article number:
     https://list.orgmode.org/orgmode/?q=gmane%3A59881
see
Kyle Meyer. yhetil.org/orgmode now supports searching by Gmane ID. Thu, 
23 Apr 2020 04:43:20 +0000 
https://list.orgmode.org/87k126revr.fsf@kyleam.com

Another recipe to fetch the article (from the same message) is
     w3m -m nntp://news.gmane.io/gmane.emacs.orgmode/59971




* Re: How to force markup without spaces
  2022-07-26  2:23               ` Max Nikulin
@ 2022-07-26  4:26                 ` K K
  2022-07-26  6:30                   ` Max Nikulin
  0 siblings, 1 reply; 13+ messages in thread
From: K K @ 2022-07-26  4:26 UTC
  To: Max Nikulin; +Cc: emacs-orgmode@gnu.org, Ihor Radchenko

On 2022-07-26 Tue. 09:23 +0700,Max Nikulin wrote:

> However, the suggestion there was precisely to use U+200B ZERO WIDTH SPACE,
> and that is actually implemented, since `org-emphasis-regexp-components'
> currently contains [:space:].
> ...
> K, could you, please, clarify what is your particular use case?

My bad, I misunderstood the "feature" mentioned in the old post.

My use case is to emphasize Chinese characters without inserting any spaces, not even zero-width ones. For example, "中文*测*试" should be enough to emphasize "测".

I am using zero-width spaces right now, and they work fine in org-mode buffers, but when exporting to PDF via LaTeX, the U+200B ZERO WIDTH SPACE character is not zero-width in certain fonts, so I would rather not use that character.

On Tue, 26 Jul 2022 09:26:42 +0800, Ihor Radchenko wrote:
> Another idea we have discussed is using something similar to Markdown
> format: **bold**, //italics//, __underline__, etc. It is less verbose
> compared to the special blocks, which should be valuable for
> Japanese/Chinese/other languages with no spaces between words.

By the way, it seems that my use case is already handled by markdown-mode. In a markdown-mode buffer, "中文**测**试" does make "测" bold.

* Re: How to force markup without spaces
  2022-07-26  4:26                 ` K K
@ 2022-07-26  6:30                   ` Max Nikulin
  0 siblings, 0 replies; 13+ messages in thread
From: Max Nikulin @ 2022-07-26  6:30 UTC
  To: emacs-orgmode

On 26/07/2022 11:26, K K wrote:
> On 2022-07-26 Tue. 09:23 +0700,Max Nikulin wrote:
> 
>> > However, the suggestion there was precisely to use U+200B ZERO WIDTH
>> > SPACE, and that is actually implemented, since
>> > `org-emphasis-regexp-components' currently contains [:space:].
>> > ...
>> > K, could you, please, clarify what is your particular use case?
> 
> My bad, I misunderstood the "feature" mentioned in the old post.
> 
> My use case is to emphasize Chinese characters without inserting any
> spaces, not even zero-width ones. For example, "中文*测*试" should
> be enough to emphasize "测".
> 
> I am using zero-width spaces right now, and they work fine in org-mode
> buffers, but when exporting to PDF via LaTeX, the U+200B ZERO WIDTH SPACE
> character is not zero-width in certain fonts, so I would rather not use
> that character.

I have not tested it, but I expect you can use any of the following:
- an export filter that removes zero-width spaces at the last export stage
  (I assume your documents contain them only as a markup workaround)
- #+latex_header: \DeclareUnicodeCharacter{200B}{}
- a custom link

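    #+begin_src elisp :results none :exports both
      ;; Define a "sep" link type whose only job is to delimit emphasis.
      (org-link-set-parameters
       "sep"
       :export (lambda (path desc backend)
                 (if (org-export-derived-backend-p backend 'org)
                     ;; Org export: keep the link so the result still parses.
                     (org-link-make-string (concat "sep:" path) desc)
                   ;; Any other backend: emit only the description.
                   (or desc ""))))
    #+end_src
    "中文[[sep:][*测*]]试"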

   https://list.orgmode.org/ssp8e7$ah2$1@ciao.gmane.io/
   Max Nikulin. Re: [RFC] Creole-style / Support for
   **emphasis**__within__**a word**. Tue, 25 Jan 2022 23:27:50 +0700
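
The export-filter option from the list above might be sketched as
follows. Like the rest of this message it is untested against a real
export; the function name is hypothetical, and it assumes that U+200B
appears in the document only as a markup separator.

```elisp
(require 'ox)

;; Hypothetical filter, not part of Org. Export filters are called
;; with three arguments: the text, the backend and the info plist.
(defun my/org-export-strip-zwsp (text _backend _info)
  "Return TEXT with every U+200B ZERO WIDTH SPACE removed."
  (replace-regexp-in-string (string #x200b) "" text t t))

;; Run it on the complete output, i.e. at the last export stage.
(add-to-list 'org-export-filter-final-output-functions
             #'my/org-export-strip-zwsp)
```

Because the filter runs on the final output of every backend, the
separators stay in the Org source but never reach LaTeX or HTML.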

In another thread we are discussing the advantages and problems of
switching from PdfLaTeX to LuaLaTeX for non-Latin scripts. The latter is a
Unicode engine. I am curious what your opinion is from the standpoint of
the Chinese language, namely the amount of customization required in each
case. I think it is better to either start a dedicated thread or find the
part of the discussion related to fonts and babel (LaTeX package) setup.



* Re: How to force markup without spaces
@ 2022-07-26 10:24 K
  0 siblings, 0 replies; 13+ messages in thread
From: K @ 2022-07-26 10:24 UTC
  To: Max Nikulin; +Cc: emacs-orgmode

On Tue, 2022-07-26 at 13:30 +0700, Max Nikulin wrote:

> I have not tested it, but I expect you can use any of the following:
> - an export filter that removes zero-width spaces at the last export
>   stage (I assume your documents contain them only as a markup
>   workaround)
> - #+latex_header: \DeclareUnicodeCharacter{200B}{}
> - a custom link
> 
>     #+begin_src elisp :results none :exports both
>       (org-link-set-parameters
>        "sep"
>        :export (lambda (path desc backend)
>                (if (org-export-derived-backend-p backend 'org)
>                    (org-link-make-string (concat "sep:" path) desc)
>                  (or desc ""))))
>     #+end_src
>     "中文[[sep:][*测*]]试"

I tested the second workaround, replacing \DeclareUnicodeCharacter{200B}{} with \newunicodechar{​}{} because I am using xelatex, which does not support the former.
It works fine so far.

> In another thread we are discussing the advantages and problems of
> switching from PdfLaTeX to LuaLaTeX for non-Latin scripts. The latter is
> a Unicode engine. I am curious what your opinion is from the standpoint
> of the Chinese language, namely the amount of customization required in
> each case. I think it is better to either start a dedicated thread or
> find the part of the discussion related to fonts and babel (LaTeX
> package) setup.

As far as I know, Chinese users commonly use the ctex package (https://ctan.org/pkg/ctex) to handle Chinese typesetting, and they prefer xelatex and lualatex over pdflatex. ctex supports fewer fonts with pdflatex than with xelatex etc. (see page 7 of its PDF documentation). So I just use xelatex and don't have much experience with pdflatex.

When using ctex, you just need to declare \documentclass{ctexart} (ctexart is the ctex version of the article class) to use Chinese characters. Then, if your system has the required default fonts, the PDF documents should be OK.
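
On the Org side, that setup could be wired up roughly like this. It is a sketch under assumptions: the sectioning commands are copied from Org's stock "article" entry rather than taken from the ctex documentation, and the document is expected to select the class with "#+latex_class: ctexart".

```elisp
(require 'ox-latex)

;; Compile exported documents with xelatex instead of pdflatex.
(setq org-latex-compiler "xelatex")

;; Register a "ctexart" class for use with "#+latex_class: ctexart".
;; Assumption: the same sectioning commands as Org's "article" entry.
(add-to-list 'org-latex-classes
             '("ctexart"
               "\\documentclass{ctexart}"
               ("\\section{%s}" . "\\section*{%s}")
               ("\\subsection{%s}" . "\\subsection*{%s}")
               ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
               ("\\paragraph{%s}" . "\\paragraph*{%s}")
               ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
```

Whether the default fonts are found still depends on the system, as noted above.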

