* On zero width spaces and Org syntax
From: Juan Manuel Macías @ 2021-12-03 12:48 UTC
To: orgmode
3 replies; 15+ messages in thread

Hi all,

It is usually recommended, as you know, to insert a zero width space character (Unicode U+200B) as a sort of delimiter mark to handle cases of emphasis within a word (for example, =/meta/literature=) and other contexts where emphasis marks are not recognized (for example =[/literature/]=). I believe that as a one-off workaround it is not bad; however, I find it problematic that this character is part, more or less de facto, of the Org syntax. For two main reasons:

1. It is an invisible character, and therefore it is difficult to control and manage. I think it is not good practice to introduce this type of character implicitly in a plain text document.

2. It is more natural that this type of space character is part of the 'output' and not of the 'input'. In the input it is better to introduce them not implicitly but through their representation. For example, in LaTeX (with LuaTeX) using the command '\char"200B{}' (or '^^^^200b'), '&#x200B;' in HTML, etc.

In any case, as an implicit character, I do not find it appropriate for the syntax of a markup language. The marks should simply be ASCII characters, IMHO. So what if Org had a specific delimiter mark for the scenarios described above? For example, something like this:

#+begin_example
/meta/''literature
*meta*''literature
[''*literature*'']
#+end_example

WDYT?

Best regards,

Juan Manuel
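For illustration, point 2 of the message above (keep the literal character out of the input and emit a representation of it in the output) can be sketched as a plain string substitution. This is a minimal Python sketch and not how Org's exporters work; the names `export_zwsp` and `REPRESENTATIONS` are hypothetical. The LaTeX form is the LuaTeX command quoted in the message, and the HTML form is the standard numeric character reference for U+200B.

```python
# Hypothetical sketch: replace literal zero width spaces with a visible,
# backend-specific representation. Not part of Org or of any Org exporter.

ZWSP = "\u200b"  # ZERO WIDTH SPACE

# Assumed mapping from a backend name to the text emitted for U+200B.
REPRESENTATIONS = {
    "latex": '\\char"200B{}',  # LuaTeX command quoted in the message
    "html": "&#x200B;",        # HTML numeric character reference
}


def export_zwsp(text: str, backend: str) -> str:
    """Return text with every literal U+200B replaced for the given backend."""
    return text.replace(ZWSP, REPRESENTATIONS[backend])


if __name__ == "__main__":
    source = "/meta/" + ZWSP + "literature"
    print(export_zwsp(source, "html"))
    print(export_zwsp(source, "latex"))
```

The substitution is deliberately naive: it does not parse Org markup, so it would also rewrite zero width spaces inside verbatim text or source blocks.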
* Re: On zero width spaces and Org syntax
From: Greg Minshall @ 2021-12-03 19:03 UTC
To: Juan Manuel Macías; +Cc: orgmode

Juan Manuel,

> however, I find it problematic that this character is part, more or
> less de facto, of the Org syntax. For two main reasons:

in fact, i am always queasy when i enter ZWNBSP in a .org (or any other) file. some sort of "visible" sequence would be great. backwards compatibility might be a problem.

your last example

: [''*literature*'']

seems a bit of sleight-of-hand, though. iiuc, text inside square brackets isn't highlighted currently, and ZWNBSP doesn't (afaict) turn on highlighting. (maybe there's been recent discussion, modifications of this?) i.e., if the goal is to *expand* the realm of highlighting, might that not be a separate issue?

cheers, Greg
* Re: On zero width spaces and Org syntax
From: Juan Manuel Macías @ 2021-12-03 20:30 UTC
To: Greg Minshall; +Cc: orgmode

Hi Greg, thank you for your comment,

Greg Minshall writes:

> in fact, i am always queasy when i enter ZWNBSP in a .org (or any other)
> file. some sort of "visible" sequence would be great. backwards
> compatibility might be a problem.

Yes, I agree. I think that in this case a new mark would not compromise backward compatibility, as this hypothetical new mark would serve the same function as the zero width space, i.e. delimiting to preserve emphasis. Of course one could go on using a zero-width space, though I still think that this is rather a one-off workaround and should not form part of the syntax.

> your last example
>
> : [''*literature*'']
>
> seems a bit of sleight-of-hand, though. iiuc, text inside square
> brackets isn't highlighted currently, and ZWNBSP doesn't (afaict) turn
> on highlighting. (maybe there's been recent discussion, modifications
> of this?)

The idea would be to use a kind of 'protection mark' to allow something in a context where it is not allowed: a passport ;-). As emphasis marks are recognized before and after a single quote, I thought that maybe a sequence of two single quotes could function here as a protection mark (screenshot: https://i.imgur.com/cPIH9qa.png). For example:

#+begin_example
| Some examples where emphasis marks are not allowed | Protected emphasis marks |
|----------------------------------------------------+--------------------------|
| /meta/literature                                   | /meta/''literature       |
| [/literature/]                                     | [''/literature/'']       |
| <*literature*>                                     | <''*literature*''>       |
| meta/*literature*                                  | meta/''*literature*      |
#+end_example

With the protection marks we get (in LaTeX, for example):

\emph{meta}literature
[\emph{literature}]
<\textbf{literature}>
meta/\textbf{literature}

Best regards,

Juan Manuel
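As an illustration only, the protection-mark idea in the table above could be prototyped outside Org as a text preprocessing step that turns the visible mark into the zero width space already used as a workaround. The Python sketch below makes several assumptions that are not in the message: the mark is only rewritten when it touches one of the characters `* / _ = ~ +`, the function name `protect_to_zwsp` is hypothetical, and the `''` mark itself is a proposal from this thread and not Org syntax.

```python
import re

# Hypothetical sketch of the proposed '' protection mark as a preprocessing
# step. The '' mark is NOT Org syntax; it is only a suggestion in this thread.

ZWSP = "\u200b"  # ZERO WIDTH SPACE, the existing workaround character

# Assumption: a '' pair counts as a protection mark only when it directly
# follows or directly precedes an emphasis marker character.
_PROTECTION_MARK = re.compile(r"(?<=[*/_=~+])''|''(?=[*/_=~+])")


def protect_to_zwsp(text: str) -> str:
    """Replace each protection mark with a literal zero width space."""
    return _PROTECTION_MARK.sub(ZWSP, text)


if __name__ == "__main__":
    for sample in ("/meta/''literature", "[''/literature/'']", "<''*literature*''>"):
        # repr() makes the otherwise invisible character show up as \u200b
        print(repr(protect_to_zwsp(sample)))
```

A pair of single quotes that does not touch an emphasis marker (as in `it''s`) is left alone under this assumed rule; a real implementation would have to decide such cases inside the Org parser, not in a regular expression.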
* Re: On zero width spaces and Org syntax
From: Tim Cross @ 2021-12-03 21:48 UTC
To: emacs-orgmode

Juan Manuel Macías <maciaschain@posteo.net> writes:

> Hi all,
>
> It is usually recommended, as you know, to insert a zero width space
> character (Unicode U+200B) as a sort of delimiter mark to handle
> cases of emphasis within a word (for example, =/meta/literature=)
> and other contexts where emphasis marks are not recognized (for example
> =[/literature/]=). I believe that as a one-off workaround it is not bad;
> however, I find it problematic that this character is part, more or less
> de facto, of the Org syntax. For two main reasons:
>
> 1. It is an invisible character, and therefore it is difficult to
> control and manage. I think it is not good practice to introduce this
> type of character implicitly in a plain text document.
>
> 2. It is more natural that this type of space character is part of the
> 'output' and not of the 'input'. In the input it is better to introduce
> them not implicitly but through their representation. For example, in
> LaTeX (with LuaTeX) using the command '\char"200B{}' (or '^^^^200b'),
> '&#x200B;' in HTML, etc.
>
> In any case, as an implicit character, I do not find it appropriate for
> the syntax of a markup language. The marks should simply be ASCII
> characters, IMHO. So what if Org had a specific delimiter mark for the
> scenarios described above? For example, something like this:
>
> #+begin_example
>
> /meta/''literature
>
> *meta*''literature
>
> [''*literature*'']
>
> #+end_example
>
> WDYT?
>
> Best regards,
>
> Juan Manuel

I think I am in agreement regarding most of your points about the use of the zero-width character. I see it as a type of escape hatch which provides a solution in some less frequent situations. It is a somewhat clever kludge to enable markup in some situations not supported by the basic markup syntax. I'm happy with its status as a kludge and would not want to see it become an official part of the syntax. Where we may differ is in whether we actually want to add inner word markup support at all.

I'm somewhat surprised and more than a little concerned at how much interest and focus on modifying the markup syntax of org the question of inner word markup has generated. This seems to be a symptom of a more general trend towards adding and extending org mode to meet the needs of everyone and I'm concerned this is overlooking the key strength of org mode - simplicity.

Consider how many times we have had requests for inner word markup in the last 18 years. I've seen such requests only a very few times. Certainly not frequently enough to consider modification of the markup syntax to accommodate such a requirement.

A key philosophy of org mode is simplicity - it makes the easy stuff simple and the hard stuff possible. The thing about simple solutions is that they will inevitably have limitations. If you don't want those limitations, then you use a more complex feature rich markup, such as LaTeX, HTML, XML etc. Ideally, your system will provide some escape hatches to allow you to do things not supported by the base markup syntax. Those escape hatches will usually be less convenient and often look quite ugly, but that is fine because they are an escape hatch which is used infrequently. Better still is if the system provides some way to make a specific escape hatch easier to use in a document (such as via a macro). The basic org markup syntax has worked remarkably well for 18 years. Nearly all the proposed additions or alterations to support inner word markup will complicate the syntax or introduce potential new ambiguities and/or complexity in processing to support a feature which has been rarely asked for and which has other, less convenient and often ugly, solutions which work.

One of org's strengths has been the ability to export documents to multiple formats. One way this has been made possible is by keeping the markup syntax simple - a basic markup which is well supported by all export back ends. Once you start adding more complex markup support, you see a blow out of complexity in the export back ends. Worse yet, you get results which are surprising to the end user or which simply don't work correctly with some formats. To avoid this, it is critical to keep the markup syntax as simple and straight-forward as possible, even if that means some limitations on what can be done with the markup.

My vote is to simply maintain the status quo. Don't modify the syntax, don't make the zero-width space character somewhat special or processed in any special way during export. In short, accept that inner word markup has only limited support and if that is a requirement which is critical to your use case, accept that org mode may not be the right solution for your requirements.
* Re: On zero width spaces and Org syntax
From: Juan Manuel Macías @ 2021-12-04 1:26 UTC
To: Tim Cross; +Cc: orgmode

Tim Cross writes:

> I think I am in agreement regarding most of your points about the use of
> the zero-width character. I see it as a type of escape hatch which
> provides a solution in some less frequent situations. It is a somewhat
> clever kludge to enable markup in some situations not supported by the
> basic markup syntax. I'm happy with its status as a kludge and would not
> want to see it become an official part of the syntax. Where we may
> differ is in whether we actually want to add inner word markup support
> at all.
>
> I'm somewhat surprised and more than a little concerned at how much
> interest and focus on modifying the markup syntax of org the question of
> inner word markup has generated. This seems to be a symptom of a more
> general trend towards adding and extending org mode to meet the needs of
> everyone and I'm concerned this is overlooking the key strength of org
> mode - simplicity.
>
> Consider how many times we have had requests for inner word markup in
> the last 18 years. I've seen such requests only a very few times.
> Certainly not frequently enough to consider modification of the markup
> syntax to accommodate such a requirement.
>
> A key philosophy of org mode is simplicity - it makes the easy stuff
> simple and the hard stuff possible. The thing about simple solutions is
> that they will inevitably have limitations. If you don't want those
> limitations, then you use a more complex feature rich markup, such as
> LaTeX, HTML, XML etc. Ideally, your system will provide some escape
> hatches to allow you to do things not supported by the base markup
> syntax. Those escape hatches will usually be less convenient and often
> look quite ugly, but that is fine because they are an escape hatch
> which is used infrequently. Better still is if the system provides some
> way to make a specific escape hatch easier to use in a document (such as
> via a macro). The basic org markup syntax has worked remarkably well for
> 18 years. Nearly all the proposed additions or alterations to support
> inner word markup will complicate the syntax or introduce potential new
> ambiguities and/or complexity in processing to support a feature which
> has been rarely asked for and which has other, less convenient and often
> ugly, solutions which work.
>
> One of org's strengths has been the ability to export documents to
> multiple formats. One way this has been made possible is by keeping the
> markup syntax simple - a basic markup which is well supported by all
> export back ends. Once you start adding more complex markup support, you
> see a blow out of complexity in the export back ends. Worse yet, you get
> results which are surprising to the end user or which simply don't work
> correctly with some formats. To avoid this, it is critical to keep the
> markup syntax as simple and straight-forward as possible, even if that
> means some limitations on what can be done with the markup.
>
> My vote is to simply maintain the status quo. Don't modify the syntax,
> don't make the zero-width space character somewhat special or processed
> in any special way during export. In short, accept that inner word markup
> has only limited support and if that is a requirement which is critical to
> your use case, accept that org mode may not be the right solution for
> your requirements.

Thank you very much for the detailed and precise exposition of your point of view. I appreciate it.

First of all, a point that I consider essential in this and other debates that arise here is that there is no single conception of Org that should prevail as (say) "the canon". Org is so polyhedral and so multifaceted that there are as many conceptions of Org as there are users of Org. Well, what I have just said is in itself one more conception of Org, and I assume that other users may think that Org is not all the things I say it is. At the end of the day, only one thing matters, over and above theories and doctrines: if Org is useful to you and helps you do your work, great.

A few months ago (and I think I already shared it here) I finished the typesetting and layout of a dictionary of almost 1000 pages, and I did it using a workflow I have developed that combines Org/Org-Publish and LuaTeX. And now, using the same method, I am working on an ancient-Greek/Spanish bilingual critical edition. So I don't think I can be suspected of believing that Org doesn't cover the needs of my workflow.

As for the matter of emphasis marks between words, I believe that this is not the underlying problem, but rather the (slight) inconsistency of the markup in certain contexts. Think, for example, of a text where you have to put many words in italics, enclosed between brackets. I don't care if that type of text is 'typical' or 'non-typical', 'majority' or 'non-majority'. It is simply a kind of scenario that is absolutely legitimate and feasible, and right now I could cite more than one type of text along those lines. Since I started using Org I have been running into these little inconsistencies. None is insurmountable, of course, nor have I had to abandon Org over such minor issues. Fortunately, Org is more than just a markup language, and it offers lots of alternative resources and extensibility. Org is GNU Emacs. Org is not Markdown.

My proposal here also does not arise from an irrepressible desire to add more complexity to the syntax. If it's recommended that the user, in certain contexts, implicitly enter a zero-width space (which, I insist, is a practice that should be avoided as much as possible in a plain text document), why not at least offer a graphical alternative, a *real* mark whose role is *exactly* the same as that of the zero-width space? Is that adding more complexity??? Honestly I think it's exactly the opposite.

In any case, I have suggested that new mark as a possibility, in case it is interesting to implement it, since a thread has emerged these days about the topic of intra-word syntax. Discussions and threads that arise about these questions and any others are perfectly legitimate, natural, and welcome. Please: there are no issues more 'important' than others; no two Org users are the same. What you do not find useful, another user may perhaps find indispensable. And vice versa. And I think no one is in a position to state what the average Org user does or does not want, given that we do not know even 1% of Org users.

Best regards,

Juan Manuel
* Re: On zero width spaces and Org syntax
From: Tom Gillespie @ 2021-12-04 4:04 UTC
To: Juan Manuel Macías; +Cc: Tim Cross, orgmode

An important note: for intra-word markup you probably want to use word joiner U+2060 and not zero width space, because a zero width space allows layout to break the word, whereas a word joiner does not. We may need to check to make sure that U+2060 counts as whitespace for the purposes of markup.

> 2. It is more natural that this type of space character is part of the
> 'output' and not of the 'input'.

That is not relevant in this case. However, Org export should not be emitting byte-literal zero width spaces either; that causes a NASTY surprise for the user. All that Org does in this pass is pass something along for the user. The kludge is a kludge because it just happens to be compatible with Org syntax, that is all. I agree that significant whitespace is decidedly undesirable; unfortunately Org already has some, though it is nowhere near as bad as Markdown with its trailing whitespace. There also happen to be ways to mitigate issues with non-printing chars via font-locking etc. to make them visible when authoring. This is another good reason to use macros as well --- they can be documented.

> As for the matter of emphasis marks between words, I believe that this
> is not the underlying problem, but rather the (slight) inconsistency of
> the markup in certain contexts. Think, for example, of a text where you
> have to put many words in italics, enclosed between brackets. I don't
> care if that type of text is 'typical' or 'non-typical', 'majority' or
> 'non-majority'. It is simply a kind of scenario that is absolutely
> legitimate and feasible, and right now I could cite more than one type
> of text along those lines.

The problem here is that there is an unbalanced design tradeoff. Supporting intra-word markup using Org's simple markup syntax actually introduces more inconsistencies elsewhere (see my note at the end about where the burden of proof lies with regard to statements like this). Further, we also have to consider the impact of such a change across the whole population of Emacs users and use cases. Adding complexity to support a very narrow use case, and one that will produce inconsistencies elsewhere, means that the whole community is forced to bear the burden of that complexity.

This is the principle that I think Tim touches on in terms of keeping simple things simple. Complexity in pursuit of niche use cases is never worth the cost when it has to be borne by the 99% of users who will never need such things. Further, Org provides not just a single solution to these cases, but multiple solutions. Worst case, it is also possible to fail over to text macros, which are an absurdly powerful escape hatch for users that have advanced (read: niche) needs.

> My proposal here also does not arise from an irrepressible desire to add
> more complexity to the syntax. If it's recommended that the user, in
> certain contexts, implicitly enter a zero-width space (which, I insist,
> is a practice that should be avoided as much as possible in a plain text
> document), why not at least offer a graphical alternative, a *real* mark
> whose role is *exactly* the same as that of the zero-width space? Is that
> adding more complexity??? Honestly I think it's exactly the opposite.

This has the same problems as other proposals about this, whether they are escape chars or other syntactic additions. It complicates the syntax for the community as a whole. It may simplify it for your particular use case, but not when averaged out with everyone else.

I think one approach is to encourage the use of \emph{a}b and friends. They are printable and hide nothing. I would also suggest that we work to update other export backends to support \emph where possible.

> In any case, I have suggested that new mark as a possibility, in case it
> is interesting to implement it, since a thread has emerged these days
> about the topic of intra-word syntax. Discussions and threads that
> arise about these questions and any others are perfectly legitimate,
> natural, and welcome. Please: there are no issues more 'important' than
> others; no two Org users are the same. What you do not find useful,
> another user may perhaps find indispensable. And vice versa. And I
> think no one is in a position to state what the average Org user does
> or does not want, given that we do not know even 1% of Org users.

I think we have a fairly good idea in this particular case. If someone wanted to do a more thorough study of existing org files in the wild to see whether they are using a workaround it would certainly be interesting, if unlikely to reject the null hypothesis. Take a survey of all the HTML in the world and see how many of the documents that use any markup at all make use of intra-word markup. I'm guessing it is a vanishingly small percentage.

If we could figure out how to implement intra-word markup in a way that didn't induce complexity it would be done, and probably would already have been done, and I suspect people might use it. There are very few syntax changes that reduce the complexity for Org (though there are some). The rest have major costs, both in implementation time, and in disruption of workflows, and hunting down of edge cases, and total complexity.

The burden of proof for syntax changes lies squarely with the individual(s) suggesting the change to show that it can be done without disrupting the existing implementation, without inducing complexity, and without changing the interpretation of existing documents. I say this as someone who has at least one major syntax change suggestion in the pipeline. Requesting a syntax change is among the most deeply invasive and complex things that can be done. I know that syntax is also the most obvious thing to users; it is their interface to the format, after all! However, each individual shares that interface with thousands of other people. The maintainers have to speak for those thousands who never read, much less respond on, this mailing list, and that almost always means that the response will be one that is decidedly conservative.

I don't mean to be dismissive of the suggestion, but a lot of time is spent on this list walking back ideas that have not had sufficient time put into understanding what the unintended consequences would be, so I wouldn't say that it is irresponsible, I would say instead that it lacks sufficient rigor and depth to be seriously considered. If you can add those to this proposal (e.g. in the form of a patch) then I suspect it would get a much warmer reception.

Best,
Tom
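A hedged illustration of two points from the message above: non-printing characters can be made visible while authoring, and U+200B and U+2060 are distinct characters whose Unicode properties can be inspected. The Python sketch below is not Emacs font-locking and says nothing about how Org itself classifies these characters; `reveal_invisibles`, `describe`, and the `INVISIBLES` tuple are hypothetical names chosen for this example.

```python
import unicodedata

# Hypothetical helpers for inspecting invisible characters in a text file.
# They only report Unicode properties; Org's own markup rules are separate.

# Characters mentioned in this thread: ZWSP, word joiner, and ZWNBSP.
INVISIBLES = ("\u200b", "\u2060", "\ufeff")


def reveal_invisibles(text: str) -> str:
    """Replace each listed invisible character with a visible <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(ch):04X}>" if ch in INVISIBLES else ch for ch in text
    )


def describe(ch: str) -> str:
    """Summarize the Unicode name, general category, and isspace() result."""
    return (
        f"U+{ord(ch):04X} {unicodedata.name(ch)} "
        f"category={unicodedata.category(ch)} isspace={ch.isspace()}"
    )


if __name__ == "__main__":
    print(reveal_invisibles("/meta/\u200bliterature"))
    for ch in INVISIBLES:
        print(describe(ch))
```

Python's `str.isspace()` reflects Unicode properties only; whether Org treats U+200B or U+2060 as whitespace for markup purposes is decided by Org's own rules and would need to be checked there.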
* Re: On zero width spaces and Org syntax
From: Juan Manuel Macías @ 2021-12-04 5:29 UTC
To: Tom Gillespie; +Cc: orgmode

Tom Gillespie writes:

> I don't mean to be dismissive of the suggestion, but a lot of
> time is spent on this list walking back ideas that have not
> had sufficient time put into understanding what the
> unintended consequences would be, so I wouldn't say
> that it is irresponsible, I would say instead that it lacks
> sufficient rigor and depth to be seriously considered. If you
> can add those to this proposal (e.g. in the form of a patch)
> then I suspect it would get a much warmer reception.

I am afraid that I am explaining myself badly, and it is not my intention for this matter to become endlessly entangled. I have no intention of proposing any patch on this. I'm not strongly requesting that this feature be included, and I am not interested in starting a crusade to defend it (and as for lack of rigor and depth, well, that's your subjective opinion). It's simpler than that. Since a thread on these questions came up recently, it occurred to me to suggest this idea as a *possibility*, in case anyone found it interesting and wanted to explore it. Nothing more.

In fact, I don't think I would use this hypothetical feature much, if it were implemented, because for these scenarios I prefer to use Org macros or other resources that I have implemented for my workflow. But maybe users would prefer this to inserting a zero-width space character (which is a tricky and quite ugly workaround and should not be recommended). Or maybe not. I really don't know. I don't know all the Org users in the world; do you?

Anyway, I want to point out one thing again. The scenarios and contexts that are being described here are far from a "very narrow use case". And I don't think it's very appropriate to hide the lack of something with the excuse that no one is going to need it. Intra-word emphasis is used a lot in (for example) linguistics books and texts, grammars, etc. That you *ignore* this fact does not mean that it does not exist.

regards,

jm
* Re: On zero width spaces and Org syntax
From: Max Nikulin @ 2021-12-04 15:26 UTC
To: emacs-orgmode

On 04/12/2021 04:48, Tim Cross wrote:
>
> My vote is to simply maintain the status quo. Don't modify the syntax,
> don't make the zero-width space character somewhat special or processed
> in any special way during export. In short, accept that inner word markup
> has only limited support and if that is a requirement which is critical to
> your use case, accept that org mode may not be the right solution for
> your requirements.

Tim, you are skeptical concerning usage of Org markup outside of Emacs, though some subscribers of this list support such an idea in the hope of collaborating with colleagues, and for other reasons. Keeping the status quo on questions like this increases the risk that other tools will adopt different workarounds and that incompatible dialects will appear.

From the point of view of popularizing Org it is better to make some decision: either the zero-width space should become a part of the syntax, or some other printable marker should be chosen to suppress the effect of Org markup or, vice versa, to activate some construct.
* Re: On zero width spaces and Org syntax
From: Tim Cross @ 2021-12-04 20:29 UTC
To: emacs-orgmode

Max Nikulin <manikulin@gmail.com> writes:

> On 04/12/2021 04:48, Tim Cross wrote:
>> My vote is to simply maintain the status quo. Don't modify the syntax,
>> don't make the zero-width space character somewhat special or processed
>> in any special way during export. In short, accept that inner word markup
>> has only limited support and if that is a requirement which is critical to
>> your use case, accept that org mode may not be the right solution for
>> your requirements.
>
> Tim, you are skeptical concerning usage of Org markup outside of Emacs, though
> some subscribers of this list support such an idea in the hope of collaborating
> with colleagues, and for other reasons. Keeping the status quo on questions like
> this increases the risk that other tools will adopt different workarounds and
> that incompatible dialects will appear.

This is a misrepresentation of my position. I've never stated I'm sceptical of org markup outside of Emacs. I'm sceptical of org mode outside of Emacs, but have never expressed an opinion of org markup outside of Emacs. However, I will now....

Org markup outside Emacs is very much a secondary concern that would be a nice to have for some workflows, but should be achieved with zero impact on Emacs users. Org mode and the markup it uses is primarily an Emacs mode. In fact, making it easier for non-Emacs users to use org mode is almost certainly working against the FSF philosophy. I'm pretty certain RMS would be very unhappy with any efforts to allow users to use org mode in products like MS Visual Code. While it is fine for 3rd party systems to try and mimic org mode, it is totally contrary to GNU philosophy for a GNU project to actively support or enable such functionality in non-free solutions.

Any decisions to make changes to org mode must be primarily for the benefit of Emacs users. When such decisions also have benefit for non-Emacs users, that is great, but it should not be a driving factor in making decisions regarding change or extensions to org mode.

> From the point of view of popularizing Org it is better to make some decision:
> either the zero-width space should become a part of the syntax, or some other
> printable marker should be chosen to suppress the effect of Org markup or,
> vice versa, to activate some construct.

Chasing popularity is always a mistake and should never be used as an argument for change.

We are also talking about something where there is little evidence of demand. We have a single post from someone asking how to support inner word emphasis and suddenly there are threads about modifying syntax, modifying back ends and a dozen proposals on how to support this 'feature'.

A question I would ask is this: if extending and adding broader support for emphasis is so straight-forward, why do we already have so many issues reported about incorrect application of markup? We have not been successful in eliminating existing ambiguities with the markup and yet some would have us charge off and add even more complexity.

Rather than extending markup syntax, let's focus on fixing the real issues we already have. There have been far more posts to this list about that than about inner word emphasis. For example, the many posts about markup and links.

With respect to the status of zero width space, I'm not convinced we need to do anything. Would it be classified as a kludge? Probably. Does it provide an escape hatch for some situations? Yes. Does that mean it needs to be formally recognised and added to the syntax? No. Does the existence of this kludge make implementation of org mode markup for other tools more difficult or less clear? Probably. Should that be a primary concern for Emacs org-mode? No. Should it be something we consider when making decisions? Sure, but only as a secondary consideration.

What the need for the zero width space kludge really means is that in some situations, we have some ambiguity in the existing syntax. Can we fix those ambiguities? I don't know - so far, I've not seen a proposal which doesn't introduce as many problems as it solves (though Tomp's @@ proposal looks interesting, but lots more analysis is required).

The zero width kludge is certainly a symptom of limitations in the existing syntax definition. However, I don't think it is the cure and I don't agree it needs to be formally recognised as part of the syntax. If we can find the correct cure, the zero width kludge will not be necessary (or will only be necessary in extreme and rare edge cases).
* Re: On zero width spaces and Org syntax
From: Eric S Fraga @ 2021-12-06 11:40 UTC
To: Tim Cross; +Cc: emacs-orgmode

On Saturday, 4 Dec 2021 at 08:48, Tim Cross wrote:
> My vote is to simply maintain the status quo.

A very strong +1 on this. Org has enough /escape mechanisms/, as you call them, to cater for special cases, and these include @@...@@, babel, and filters, amongst others. The simplicity of org is a major advantage.

--
: Eric S Fraga, with org release_9.5.1-243-gad53c5 in Emacs 29.0.50
: Latest paper written in org: https://arxiv.org/abs/2106.05096
* Re: On zero width spaces and Org syntax
From: Marcin Borkowski @ 2021-12-04 6:43 UTC
To: Juan Manuel Macías; +Cc: orgmode

On 2021-12-03, at 13:48, Juan Manuel Macías <maciaschain@posteo.net> wrote:

> Hi all,
>
> It is usually recommended, as you know, to insert a zero width space
> character (Unicode U+200B) as a sort of delimiter mark to solve the
> scenarios of emphasis within a word (for example, =/meta/literature=)
> and others contexts where emphasis marks are not recognized (for example
> =[/literature/]=). I believe that as a puntual workaround it is not bad;
> however, I find it problematic that this character is part, more or less
> de facto, of the Org syntax. For two main reasons:
>
> 1. It is an invisible character, and therefore it is difficult to
> control and manage. I think it is not good practice to introduce this
> type of characters implicitly in a plain text document.
>
> 2. It is more natural that this type of space characters are part of the
> 'output' and not of the 'input'. In the input it is better to introduce
> them not implicitly but through their representation. For example, in
> LaTeX (with LuaTeX) using the command '\char"200B{}' (or '^^^^200b'),
> '&#x200B;' in HTML, etc.
>
> In any case, as an implicit character, I do not see it appropriate for
> the syntax of a markup language. The marks should be simply ascii
> characters, IMHO. So what if Org had a specific delimiter mark for the
> scenarios described above? For example, something like that:

Hi all,

I've skimmed through this discussion. FWIW, I also use zero-width spaces in my Org files for this precise reason. However, I agree that extending syntax is dangerous.
How about a solution (or maybe it's only a "solution"...) where:

1. We take care to modify the "official" exporters to throw out the ZWSs. Or even better, convert them to something reasonable, e.g. with LaTeX they can be discarded or converted to some command – possibly even one defined in the preamble – so that nothing is lost. I'd even say that an option deciding what to do with those could be nice.

2. We modify Emacs itself to somehow highlight the ZWS. There is (kind of) a precedent – a no-breaking space is already fontified with =nobreak-space= face. At the very least, make whitespace-mode somehow show ZWSs (which it doesn't now, and I'd probably say it's a bug).

I know that my point 2. is a bit controversial, since it could lead to alignment issues where a ZWS is displayed as something with a positive width. OTOH, even now changing the face of a ZWS leads to a narrow (1-pixel wide) line of a different color. Is there a way to make it a bit stronger?

Just some random ideas,

--
Marcin Borkowski
http://mbork.pl
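[Editorial sketch, not part of the original message.] Point 1 above can already be approximated on the user side with an export filter, one of the "filters" escape mechanisms mentioned elsewhere in this thread. The block below is a minimal sketch under stated assumptions: the function name my/org-export-strip-zws is hypothetical, and the filter removes every U+200B from the final output rather than converting it to a backend-specific command.

#+begin_src emacs-lisp
;; Hypothetical helper; the name is an assumption, not part of Org.
(defun my/org-export-strip-zws (text backend _info)
  "Remove zero width spaces (U+200B) from the exported TEXT.
Return nil (leaving TEXT unchanged) when BACKEND derives from the
Org backend, so that Org-to-Org export keeps the delimiters."
  (unless (org-export-derived-backend-p backend 'org)
    ;; FIXEDCASE and LITERAL are both t: plain string replacement.
    (replace-regexp-in-string "\u200b" "" text t t)))

;; Register the filter once the export framework is loaded.
(with-eval-after-load 'ox
  (add-to-list 'org-export-filter-final-output-functions
               #'my/org-export-strip-zws))
#+end_src

Because the filter runs on the final output string, it also strips zero width spaces that were meant to survive in the result (for example inside source blocks), so it may be too blunt for some documents.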
* Re: On zero width spaces and Org syntax
From: Ihor Radchenko @ 2021-12-04 7:22 UTC
To: Marcin Borkowski; +Cc: Juan Manuel Macías, orgmode

Marcin Borkowski <mbork@mbork.pl> writes:

> 2. We modify Emacs itself to somehow highlight the ZWS. There is (kind
> of) a precedent – a no-breaking space is already fontified with
> =nobreak-space= face. At the very least, make whitespace-mode somehow
> show ZWSs (which it doesn't now, and I'd probably say it's a bug).
>
> I know that my point 2. is a bit controversial, since it could lead to
> alignment issues where a ZWS is displayed as something with a positive
> width. OTOH, even now changing the face of a ZWS leads to a narrow
> (1-pixel wide) line of a different color. Is there a way to make it
> a bit stronger?

We can try to create an accent. Try the following:

1. Open a new empty Org buffer
2. Disable font-lock-mode
3. M-: (insert (compose-string "a" nil nil (list ?a '(bl . tl) ?␣)))

The result will look like the attached image.

Best,
Ihor

[Attachment: example.png (image/png, 2020 bytes)]
* Re: On zero width spaces and Org syntax
From: Marcin Borkowski @ 2021-12-04 17:37 UTC
To: Ihor Radchenko; +Cc: Juan Manuel Macías, orgmode

On 2021-12-04, at 08:22, Ihor Radchenko <yantar92@gmail.com> wrote:

> Marcin Borkowski <mbork@mbork.pl> writes:
>> 2. We modify Emacs itself to somehow highlight the ZWS. There is (kind
>> of) a precedent – a no-breaking space is already fontified with
>> =nobreak-space= face. At the very least, make whitespace-mode somehow
>> show ZWSs (which it doesn't now, and I'd probably say it's a bug).
>>
>> I know that my point 2. is a bit controversial, since it could lead to
>> alignment issues where a ZWS is displayed as something with a positive
>> width. OTOH, even now changing the face of a ZWS leads to a narrow
>> (1-pixel wide) line of a different color. Is there a way to make it
>> a bit stronger?
>
> We can try to create an accent. Try the following:
> 1. Open a new empty Org buffer
> 2. Disable font-lock-mode
> 3. M-: (insert (compose-string "a" nil nil (list ?a '(bl . tl) ?␣)))
>
> The result will look like the attached image.

I'm not sure if I like that idea - looks great, but I'd be a bit afraid of unintended consequences.

Either way, personally I can live with ZWSs in my Org files, so whatever is decided, it's fine with me.

Best,

--
Marcin Borkowski
http://mbork.pl
* Re: On zero width spaces and Org syntax
From: Robert Pluim @ 2021-12-06 16:01 UTC
To: Marcin Borkowski; +Cc: Juan Manuel Macías, orgmode

>>>>> On Sat, 04 Dec 2021 07:43:35 +0100, Marcin Borkowski <mbork@mbork.pl> said:

    Marcin> 2. We modify Emacs itself to somehow highlight the ZWS. There is (kind
    Marcin> of) a precedent – a no-breaking space is already fontified with
    Marcin> =nobreak-space= face. At the very least, make whitespace-mode somehow
    Marcin> show ZWSs (which it doesn't now, and I'd probably say it's a bug).

Thereʼs no need to modify Emacs: see `glyphless-char-display-control'. ZWS falls under 'format-control'.

Robert
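[Editorial sketch, not part of the original message.] One way to act on this pointer from an init file is shown below. It is a sketch under assumptions: the choice of the hex-code display method is arbitrary (the variable's documentation lists the available methods), and the setting affects every character in the format-control group, not only U+200B.

#+begin_src emacs-lisp
;; Change only the `format-control' entry; the other groups keep their
;; current display method.  `hex-code' is an arbitrary choice here.
(let ((control (copy-alist glyphless-char-display-control)))
  (setf (alist-get 'format-control control) 'hex-code)
  ;; `customize-set-variable' runs the variable's setter, so the
  ;; display table is updated rather than just the alist.
  (customize-set-variable 'glyphless-char-display-control control))
#+end_src

Going through customize-set-variable rather than a plain setq matters here, because the change only takes effect once the variable's setter has rebuilt the glyphless display table.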
* Re: On zero width spaces and Org syntax
From: Greg Minshall @ 2021-12-06 16:42 UTC
To: Robert Pluim; +Cc: Juan Manuel Macías, orgmode

Robert,

> Thereʼs no need to modify Emacs: see
> `glyphless-char-display-control'. ZWS falls under 'format-control'.

very nice. thanks!

cheers, Greg