emacs-orgmode@gnu.org archives
* Ox-html: Replace <b> with <strong> and <i> with <em>
@ 2018-10-24  0:38 Kaushal Modi
  2018-10-24  6:04 ` Nicolas Goaziou
  0 siblings, 1 reply; 10+ messages in thread
From: Kaushal Modi @ 2018-10-24  0:38 UTC (permalink / raw)
  To: emacs-org list

Hello,

I am not an HTML expert. But recently off-list, I learnt that <b> and <i>
tags aren't recommended to be used for styling any more (for a while now).

Instead <strong> and <em> should be used respectively.

If there are no objections, I can commit this little change to the master
branch.

References:

- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/b
- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/i#Usage_Notes

--
Kaushal Modi


* Re: Ox-html: Replace <b> with <strong> and <i> with <em>
  2018-10-24  0:38 Ox-html: Replace <b> with <strong> and <i> with <em> Kaushal Modi
@ 2018-10-24  6:04 ` Nicolas Goaziou
  2018-10-24 15:14   ` Kaushal Modi
  0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Goaziou @ 2018-10-24  6:04 UTC (permalink / raw)
  To: Kaushal Modi; +Cc: emacs-org list

Hello,

Kaushal Modi <kaushal.modi@gmail.com> writes:

> I am not an HTML expert. But recently off-list, I learnt that <b> and <i>
> tags aren't recommended to be used for styling any more (for a while now).
>
> Instead <strong> and <em> should be used respectively.
>
> If there are no objections, I can commit this little change to the master
> branch.
>
> References:
>
> - https://developer.mozilla.org/en-US/docs/Web/HTML/Element/b
> -
> https://developer.mozilla.org/en-US/docs/Web/HTML/Element/i#Usage_Notes

No objection from me. Thank you!

Regards,

-- 
Nicolas Goaziou


* Re: Ox-html: Replace <b> with <strong> and <i> with <em>
  2018-10-24  6:04 ` Nicolas Goaziou
@ 2018-10-24 15:14   ` Kaushal Modi
  2018-10-24 21:00     ` Tim Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Kaushal Modi @ 2018-10-24 15:14 UTC (permalink / raw)
  To: emacs-org list

On Wed, Oct 24, 2018 at 2:04 AM Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote:
>
>
> No objection from me. Thank you!

Actually, before making this change, I started reading up on the HTML5
spec on the b, strong, i, em tags, and now I am as confused as ever.

Facts:

- b and i are not deprecated
- b and strong are both valid but their use depends on the writer's
context (but Org mode has just one mark for either "*")
- i and em are both valid but their use depends on the writer's
context (but Org mode has just one mark for either "/").

From "em" docs[em], in the NOTE section there:

> The em element isn’t a generic "italics" element. Sometimes, text is intended to stand out from the rest of the paragraph, as if it was in a different mood or voice. For this, the i element is more appropriate.

See the b tag docs[b] and i tag docs[i], and this W3C FAQ on using b
and i tags[faq] for more.


*Summary* (/see what I did there?/):

I guess there's no need to change what "*" and "/" do right now in
ox-html, as there doesn't seem to be "one right way" to do things here.

And folks strongly wanting to use <strong> and <em> for bold and
italic can customize org-html-text-markup-alist.
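
For instance, a minimal sketch of such a customization. It assumes that
the default value of org-html-text-markup-alist maps bold to
"<b>%s</b>" and italic to "<i>%s</i>", and that ox-html looks entries
up by key so that the first match wins; please check both against your
Org version before relying on it.

```elisp
;; Sketch only: put new entries for `bold' and `italic' in front of the
;; existing ones instead of rewriting the whole alist.  The entries for
;; code, verbatim, strike-through and underline are left untouched.
(with-eval-after-load 'ox-html
  (setq org-html-text-markup-alist
        (append '((bold   . "<strong>%s</strong>")
                  (italic . "<em>%s</em>"))
                org-html-text-markup-alist)))
```

With this in an init file, *bold* and /italic/ should export as
<strong>bold</strong> and <em>italic</em>. The same change can be made
interactively with M-x customize-variable.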

HTML experts, please chime in.



[em]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-em-element
[b]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-b-element
[i]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-i-element
[faq]: https://www.w3.org/International/questions/qa-b-and-i-tags


* Re: Ox-html: Replace <b> with <strong> and <i> with <em>
  2018-10-24 15:14   ` Kaushal Modi
@ 2018-10-24 21:00     ` Tim Cross
  2018-10-26  5:24       ` *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>] Garreau, Alexandre
  0 siblings, 1 reply; 10+ messages in thread
From: Tim Cross @ 2018-10-24 21:00 UTC (permalink / raw)
  To: Kaushal Modi; +Cc: emacs-org list


I'll start by stating I'm definitely not an HTML expert.

I do believe we should move away from b/i to strong/em as I think these
are the correct semantic tags to use and are generally what is
preferred. This means they are also likely to already have appropriate
'styling' in many 'canned' styles and valid consistent interpretations
for different media types. 

The problem with b and i is that they specify how rather than what and
don't always make sense for all possible media types. For example, what
does 'bold' or 'italic' mean for a screen reader?

I don't think this is something that is urgent, but it is the direction
we should go. The only real reason for sooner rather than later is that
we can probably simplify some of the exporters and ensure any new
exporters are correct and won't need to be changed retrospectively.

Tim

-- 
Tim Cross


* *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>]
  2018-10-24 21:00     ` Tim Cross
@ 2018-10-26  5:24       ` Garreau, Alexandre
  2018-10-26 20:15         ` Tim Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Garreau, Alexandre @ 2018-10-26  5:24 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-org list, Kaushal Modi

Sorry, I just rediscovered this thread, which interests me and which I
shouldn’t have let go:

On 2018-10-25 at 08:00, Tim Cross wrote:
> Kaushal Modi <kaushal.modi@gmail.com> writes:
>> […]
>> - b and i are not deprecated
>> - b and strong are both valid but their use depends on the writer's
>> context (but Org mode has just one mark for either "*")
>> - i and em are both valid but their use depends on the writer's
>> context (but Org mode has just one mark for either "/").
>>
>> […]
>> 
>> From "em" docs[em], in the NOTE section there:
>>> The em element isn’t a generic "italics" element. Sometimes, text
>>> is intended to stand out from the rest of the paragraph, as if it
>>> was in a different mood or voice. For this, the i element is more
>>> appropriate.
>>
>> […]
>>
>> I guess there's no need to change what "*" and "/" do right now in
>> ox-html, as there doesn't seem "one right way" to do things here.
>>
>> And folks strongly wanting to use <strong> and <em> for bold and
>> italic can customize org-html-text-markup-alist.
>>
>> HTML experts, please chime in.
>
> I'll start by stating I'm definitely not an HTML expert.

I don’t exactly know what an expert is, and I’m certainly not a
professional, but I have spent some time working out the semantic
meaning of the various HTML specs.

More specifically, I have a strong interest in semantics and typography,
and I spent a lot of time, on my now deleted-recreated-then-lost GitHub
account and by mail, trying to convince people to switch to more
semantic markup (oh, and to use complex CSS selectors rather than
classes, and to stop using <div> and <span> altogether) and to better
typography (such as curly quotes, single quotes inside quotes, and many
things specific to French).

> The problem with b and i is that they specify how rather than what and
> don't always make sense for all possible media types. For example, what
> does 'bold' or 'italic' mean for a screen reader?

As far as I remember, italic is often pronounced with a different
pitch.  Bold is probably pronounced differently too, but I don’t
remember how.  I need to recheck with Orca and Firefox add-ons (I’ll do
that for a later mail).  This may vary across screen readers, so I
might have to find a friend with a Windows computer running NVDA, JAWS
or some other non-free program to either ask or check.

I believe the most correct handling for screen readers would be, for
<i>, to switch to the language given by its lang or xml:lang attribute
if there is one, and otherwise to speak slower and at a slightly higher
pitch; and, for <b>, to use the exact same higher pitch as for caps,
without changing speed, and to add the text to an easily reachable
“keyword list”, just as for <dfn>.

FYI: italic, bold and underline were all invented in typography as
special ways of *purposely* making text harder to read.  Both the intent
and the result are that a reader who takes more time to read something
in italic, for instance, will memorize it better and have more time to
think about it, which increases the importance of that something.

In what follows, “from far” means looking at the document as a whole
without focusing on reading a particular part of it.  It doesn’t mean
being at a great distance and still being able to read the text, as is
the case for uppercase.

Italic is the best of these, the most readable: it is only noticed when
reading, close to the text, not “from far”, and it doesn’t break the
structure, the flow, or the “typographic grey” (“gris typographique”;
I’m not aware of the English term).  It is hence commonly used for
emphasis (the best usage: if the passage gets long it gets hard to read,
but that reflects the fact that the original meaning was itself hard to
grasp, hear or say), for citing the titles of artistic works (such as
books: a conventional usage, but still okay, as titles are mostly short
anyway), and for quotations (a discouraged usage, as quotations can get
long (and thus unreadable) and quote marks already cover this; italic
should *never* be used *along* with them, as it is terribly redundant
and almost no serious professional printer does that).

Bold is sometimes harder to read and sometimes, if not too bold,
easier; however, its text is really easy to “notice” from far.  It is
therefore normally recommended *exclusively* for text structures whose
*role* is to purposely cut the text into parts, that is: *outlines*.
However, in an attempt at pseudo-backward compatibility (“but look,
everybody was okay since the beginning”), the W3C found another use for
bold besides outlines: keywords.  These are *meant* to be seen from far,
are usually short (one word), yet don’t alter the text structure, and
might not be candidates for <dfn> (though most of the time they should
be).

Underline should, in theory, be banned everywhere.  It is an especially
simple and awful way of making text unreadable: it cuts the legs of
letters with descenders (making them as hard to read as italic) *and* is
easy to spot from far, yet, unlike with bold, you notice the underline
without quickly and easily grasping the word itself.  If I recall
correctly, it was invented for typewriters because italic wasn’t
available, and it is the poorest possible substitute for it.  It is also
used in handwritten text, as few people nowadays actually try to write
italics or bold by hand, and the rest are often unable to do so.  Most
of the time I have seen it used in handwriting to annotate and highlight
text.  Conventions have developed around this: in typewritten as well as
handwritten text, you normally use it *only* for the titles of artistic
works (instead of italic), and for blue hyperlinks.  It is sad that it
has become such an important convention, but it is done, clear, and well
established.

The W3C meaning of “added text” seems somewhat artificial to me, as
underline is no more conventional for “added changes” than any other
typographic convention.  However, it is necessarily *one of these*, as
it is commonly used to highlight and annotate text (although HTML has
the <mark> tag for that).

> I do believe we should move away from b/i to strong/em as I think these
> are the correct semantic tags to use and are generally what is
> preferred. This means they are also likely to already have appropriate
> 'styling' in many 'canned' styles and valid consistent interpretations
> for different media types. 

This is unsemantic (it gives Org markup a presentational rather than
semantic role, so I strongly oppose it) and could break true
accessibility.  Ideally, I’d say Org should have more markup, to be
compatible with HTML, which recently, with XHTML 1, XHTML 2 and HTML5,
has become one of the richest and most clearly defined markup languages
available.  However, as Org, like Markdown and reStructuredText, tries
to achieve some compatibility with classical plain-text markup, such as
in email, and given the semantics I have observed, I’d suggest the
following:
- tag “*” as <em>, and maybe find cases where <b> might be appropriate
  (for keywords, typically).  An interesting experiment would be, for
  some given languages (English, to begin with), to detect whether an
  article (“the”, “a”, “an”…) is part of the markup, in which case it is
  not a keyword (hence <em>); if the article *precedes* the markup, then
  it is more probably a keyword (but not necessarily);
- tag “/” as <cite>, as this matches the most accurate and most common
  meaning of “/”.  “_” might be appropriate for this as well, but may be
  redundant (so a safe custom variable, potentially usable as
  buffer-local, would do better).  However, there are some cases where
  “/” would be more appropriate as <i> (I’d say the vast majority of
  such occurrences are words from foreign languages; the others are most
  often incorrect and abusive uses of “/”);
- tag “_” as either <cite>, if that variable has the appropriate value,
  or <ins>, but *only* if near “+” markup.  Otherwise, as Org only uses
  “[]” for hyperlinks, I don’t know.
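
The article heuristic from the first item could be sketched roughly as
follows.  This is only an illustration: the function name
my/guess-star-tag and its two arguments are hypothetical, nothing like
it exists in Org, and a real experiment would have to hook into the
exporter and cover more than English.

```elisp
;; Hypothetical helper: guess whether *CONTENTS* is emphasis or a keyword.
;; CONTENTS is the text between the stars; PRECEDING-WORD is the word
;; just before the markup, or nil if there is none.
(defun my/guess-star-tag (contents preceding-word)
  "Return \"em\" or \"b\" for CONTENTS, using English articles as a hint."
  (let ((articles '("the" "a" "an"))
        (first-word (car (split-string contents))))
    (cond
     ;; The markup itself starts with an article: treat it as emphasis.
     ((and first-word (member (downcase first-word) articles)) "em")
     ;; An article right before the markup: more probably a keyword.
     ((and preceding-word (member (downcase preceding-word) articles)) "b")
     ;; No hint either way: fall back to emphasis.
     (t "em"))))
```

For example, (my/guess-star-tag "the point" "is") gives "em", while
(my/guess-star-tag "keyword" "a") gives "b".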

Note that, indeed, <strong> has no real use.  If it were up to me, it
would be banned.  Maybe its most accurate use would be for uppercase,
urgent emphasis text: *URGENT: READ THIS NOW OR YOU WILL DIE*.  If you
absolutely wanted to, you might use <strong> for uppercase emphasis
text, or for emphasis text containing “urgent:” or “important:” and
their localized equivalents (format-level linguistic imperialism, blah
blah: note that, for that very reason, this would work as is for French,
but I and many other people would, funnily enough, feel more reassured,
respected or whatever if we were blessed by being in a list whose car is
"fr").

> I don't think this is something that is urgent, but it is the
> direction we should go. The only real reason for sooner rather than
> later is that we can probably simplify some of the exporters and
> ensure any new exporters are correct and won't need to be change
> retrospectively.

This has to be semantics work carried over to *all* semantic backends.
As there are “accessibility” workarounds for almost all formats (even
PDF, which is understandable, as it has become important and widely used
although it was originally meant only for printing, hence display, not
semantics (but you know, these days, you can put JavaScript in
these…)), this may mean “every backend”.


* Re: *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>]
  2018-10-26  5:24       ` *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>] Garreau, Alexandre
@ 2018-10-26 20:15         ` Tim Cross
  2018-10-27 12:52           ` Garreau, Alexandre
  0 siblings, 1 reply; 10+ messages in thread
From: Tim Cross @ 2018-10-26 20:15 UTC (permalink / raw)
  To: Garreau, Alexandre; +Cc: emacs-org list, Kaushal Modi


I have either misunderstood most of your position or I simply disagree
with it - I'm not sure which.

- Much of what you argue seems to be based around ideas associated with
  typography. IMO this is where things fall down. Typography is really
  only relevant to 'printing' (either on paper or screen). Markup is not
  just about printing - it is about conveying what the author wanted and
  how that is best interpreted will depend on the media being used
  (i.e. how the content is 'rendered') and should largely be up to the
  consumer. 

- I am a screen reader user. While you are correct that pitch, tone,
  speed and different voices are often used to convey things like 'bold'
  or 'italic', there is no universally accepted rule for this
  interpretation, at least not in the same sense as there is with
  typography. We all know what bold or italic looks like, but there is
  no agreement as to what these should sound like. When you use Jaws,
  you will get a different result from when you use Orca or Emacspeak or
  Window Eyes or .... However, this shouldn't really matter - how these
  are 'rendered' should ideally be under the control of the individual
  consuming the content. When I consume a document, it should be my
  decision as to how the content is presented and for me, interpreting
  'strong' or 'emphasis' seems to be far clearer than 'bold' or
  'italic'.

- I don't believe there is any strong reason that the markup used by org
  should have any strong reference to HTML in appearance. Org supports
  many different backends, many of which don't have anything to do with
  HTML at all. It is perhaps unfortunate that Org syntax and markdown
  are quite different (though I feel the unfortunate part is that
  markdown didn't follow org more closely as I much prefer Org's syntax
  to most markdown semantics).  

- Probably the number 1 issue I come across when dealing with markup is
  the expectation too many authors have that things will be rendered in
  the browser in a specific way (a particular font, colour, position,
  size, etc). This is a mistake. The big advantage of electronic
  presentation is that for the first time, the consumer can have control
  over the presentation - they can customise it to meet their
  requirements or preferences. The problem with <b> and <i> is that they
  give authors an expectation that their content will be rendered in a
  specific way. Some may argue that the author should be able to control
  how their content is rendered. I think this is misleading because,
  unlike with printed material, the author has no control over the
  presentation media - they don't know how large the screen is, what the
  capabilities of the screen are, what fonts are installed,
  etc. Therefore, tags which focus on meaning, i.e. "I want this to stand
  out" or "I want this to be emphasised", are clearer than tags which say
  "make this bold" or "make this italic".

The debate over <i>, <b>, <strong> and <em> is likely to continue for
some years yet. I do think things are moving towards <strong>/<em> and
nearly everything I read these days recommends these over <i> and
<b>. It is pretty well accepted that XHTML was a mistake and HTML5 goes
a long way to address the issues introduced with XHTML - I think XHTML
as a standard is pretty much relegated to an evolutionary dead end. I do
agree <div> is overused. In particular, HTML5 has a number of new tags
which should be used to convey document structure which would be a
better choice than <div> with different 'class' attributes. However, we
will continue to see a lot of div tags, even when authors begin to use
newer tags - at least it is a lot better than the early days when
everything was stuck inside tables! Backends which generate HTML should
be generating HTML5 compliant output if for no other reason than it is
clearer and easier than XHTML. 

As to the OP's original question regarding changing <b> and <i> in HTML
backends - while I would vote for strong/em over b/i, I don't think
there is any real need to do this, certainly not in the short term. As
was pointed out b/i has not been deprecated, so it is still valid. There
is no suggestion to change Org's own internal markup (ironically
referred to as bold and italic!), so overall, the status quo seems fine.

Tim
--
Tim Cross


* Re: *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>]
  2018-10-26 20:15         ` Tim Cross
@ 2018-10-27 12:52           ` Garreau, Alexandre
  2018-10-28 21:19             ` Tim Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Garreau, Alexandre @ 2018-10-27 12:52 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-org list, Kaushal Modi

On 2018/10/27 at 07:15, Tim Cross wrote:
> I have either misunderstood most of your position or I simply disagree
> with it - I'm not sure which.

Maybe a mix of both?  I hope it’s a misunderstanding, but if it’s not, I
want to understand it too, so that we can reach a constructive agreement.

> - Much of what you argue seems to be based around ideas associated with
>   typography. IMO this is where things fall down. Typography is really
>   only relevant to 'printing' (either on paper or screen). Markup is not
>   just about printing - it is about conveying what the author wanted and

Indeed.  But many people do not abstract what they mean to write and
still (often, poorly) think in terms of “italic” and “bold” (the Org
manual, as you later said, even does so).  What I wanted to underline is
that “italic” and “bold” (and, to some extent, underline too) are not
just arbitrary display-level characteristics that happened to acquire a
meaning later: *first* a *meaning* was wanted, and *then* they were
invented as an imperfect, more or less good, way of translating those
meanings or their intents to display (as imperfect as a bitmap or a
handwritten drawing of a circle is to the Bézier curve or equation of
the circle that produced it, or as sampled and compressed audio is to
the function that produced the audio (such as a LilyPond musical score
or the resulting MIDI file)).

I want to extract as much of the original meaning as possible (be it
about attention, memorization, structuring, etc.; very abstract
cognitive human features are still more common than visual-recognition
features) so it can be better applied everywhere, without the burden
and constraints of the original medium (display).  I add a little
history because I like to rehistoricize things into their material and
social background, so as not to see them as static, ahistoric,
uncreated, uncriticizable concepts.  Concepts and tools are made to
serve people, not the opposite.

>   how that is best interpreted will depend on the media being used
>   (i.e. how the content is 'rendered') and should largely be up to the
>   consumer. 

Yes, totally.  This is why I believe we should, at best, try to give a
clear and defined meaning to why we use the *, / and _ markers, rather
than just translating them to the traditional <em>, <strong>, and <ins>
tags, which were actually just a poor 1-to-1 mapping of the old <i>, <b>
and <u> tags, which had no meaning, and still have a confused, complex
and not backward-compatible meaning.

It is also why it might sometimes be better to set up user options, so
that if authors disagree with what is meant by their markers, they can
change it, so that in the end it gives the correct semantic markup and
everybody gets the same, intended, meaning.
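
To make the idea of such a user option concrete, here is a minimal
sketch in Python (not Org's actual implementation).  The names
DEFAULT_MARKUP and export_inline are hypothetical, the defaults are only
one possible choice, and nesting and the finer points of Org's emphasis
rules are deliberately ignored.  As far as I know, ox-html exposes
something comparable through `org-html-text-markup-alist'.

```python
import html
import re

# Hypothetical defaults: which HTML element each Org marker exports to.
DEFAULT_MARKUP = {"*": "strong", "/": "em", "_": "ins"}

# One pass over the text: a marker, content that starts and ends with a
# non-space character, then the same marker again.  The lookarounds keep
# things like "2 * 3" or "a/b/c" from being treated as markup.
_SPAN = re.compile(r"(?<![\w*/_])([*/_])(\S(?:.*?\S)?)\1(?![\w*/_])")


def export_inline(text, markup=None):
    """Convert *x*, /x/ and _x_ spans using a user-overridable mapping."""
    mapping = {**DEFAULT_MARKUP, **(markup or {})}

    def replace(match):
        tag = mapping[match.group(1)]
        return f"<{tag}>{match.group(2)}</{tag}>"

    # Escape first so that literal "<" or "&" in the text stay literal.
    return _SPAN.sub(replace, html.escape(text, quote=False))
```

An author who reads * as plain emphasis and / as "alternate voice" could
then call export_inline(text, {"*": "em", "/": "i"}) without touching
the document itself.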

It is also why, ideally, for the web, I wish server-side CSS had never
existed and we only used CSS as a user-customization language (but most
websites still have poor semantic tagging, and complex compositions of
tags still have no clearly defined meaning, so in the end it comes down
to either guessing or a request to add yet another tag to the already
complex HTML spec).

> - I am a screen reader user. While you are correct that pitch, tone,
>   speed and different voices are often used to convey things like 'bold'
>   or 'italic', there is no universally accepted rule for this
>   interpretation, at least not in the same sense as there is with
>   typography.

I know, that’s why I wanted to check with Orca, NVDA, and maybe Jaws too
if I could.

>   We all know what bold or italic looks like, but there is no
>   agreement as to what these should sound like. When you use Jaws, you
>   will get a different result from when you use Orca or Emacspeak or
>   Window Eyes or .... However, this shouldn't really matter - how
>   these are 'rendered' should ideally be under the control of the
>   individual consuming the content. When I consume a document, it
>   should be my decision as to how the content is presented and for me,
>   interpreting 'strong' or 'emphasis' seems to be far clearer than
>   'bold' or 'italic'.

That’s why I’d like * and / to get a better meaning than bold and italic.
To me it is already widely accepted that * is sometimes considered as
bold, but more widely used for emphasis.  So it should be treated as
such (and, personally, I mean by this that it could just as well be
rendered in italic on display, for instance, or with whatever emphasis
method the user prefers; it should be configurable).

/ is a much harder problem, as it has been used, because of its slanted
appearance, to mean italic, so sometimes it’s used for emphasis, and
sometimes for the other uses of italic.  Ideally I’d like it to be
settled that it’s not for emphasis (it’s far less used and supported
than * for that, and * already serves this purpose very well
informally), so that implementations derive some other meaning for it,
to get richer semantics.

> - I don't believe there is any strong reason that the markup used by org
>   should have any strong reference to HTML in appearance. Org supports
>   many different backends, many of which don't have anything to do with
>   HTML at all. It is perhaps unfortunate that Org syntax and markdown
>   are quite different (though I feel the unfortunate part is that
>   markdown didn't follow org more closely as I much prefer Org's syntax
>   to most markdown semantics).  

I don’t like Markdown either, nor reStructuredText.  I talked a lot
about HTML for two reasons: the discussion was initially about it, and
it is, afaik, the richest and best-known semantic markup language.  It
is *way* richer than LaTeX, Org, md, rst, etc., maybe even ODT and
Texinfo, but I’m unsure.

However, * and / export to Texinfo with the same tags as HTML, that is,
respectively strong and emphasis, which I find sad, as * is what is
mostly used for emphasis (and two levels are pretty much not needed,
while richer semantics could be).  ODT seems to use “<span>” with
“style-name="Emphasis"”: I heard ODT could be somewhat semantic, but I
don’t know if that’s the best they can do (maybe this “style-name” has
standard semantics?  Because to me styling is for presentation, and
tagging for semantics).

Another problem with many backends is that they’re made for printing,
or are less semantic: PDF is not made for semantics, although I heard
somewhere that there were attempts to make it so (which sounds silly, as
it is tailored for printing and supports almost no dynamic
modifications; it would be better to stop using PDFs at all in, e.g.,
administration).

> - Probably the number 1 issue I come across when dealing with markup is
>   the expectation too many authors have that things will be rendered in
>   the browser in a specific way (a particular font, colour, position,
>   size, etc). This is a mistake. The big advantage of electronic
>   presentation is that for the first time, the consumer can have control
>   over the presentation - they can customise it to meet their
>   requirements or preferences.

*Exactly*.  Except that the web then became commercial, and businesses
found it an especially good way to control what users see, almost as
fully as in advertisements (so it brings them control and power, and
also money, secondarily: if they use non-semantic tags and only <div>
and <span> in an awfully complex SGML soup, then no user is able to
control anything), just as the French Minitel would.  They began first
to abuse display-level tagging, then to abuse CSS and HTML style soup
(80% <div> and <span>, and enormous stylesheets, yay, what progress!
…><  yet now we have worse: less CSS, less “style”, and more “data-*”
and non-free surveillance JavaScript to replace them).

>   The problem with <b> and <i> is that it gives authors an expectation
>   their content will be rendered in a specific way.

Not anymore, since the W3C, somewhat breaking backward compatibility,
decided that <b> is for “keywords” without special emphasis that are not
definitions (there’s already <dfn> for that, afaik), and that <i> is for
“differently-pronounced phrasing content” without emphasis, such as
text pronounced with a tone of disgust, or foreign-language text (so if
you want to embed French words not used enough to be in an English
dictionary, and if it’s not a real quotation (<q lang="fr">), you should
use <i lang="fr">du texte en français</i>).

So I can theoretically decide that every word marked up with <b> goes
into a local list of easily reachable (for instance with keystrokes)
“keywords”, like lynx, and that <b> should be displayed normally, but in
blue, and that would be a standards-compliant WWW user agent.
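
As a rough illustration of such a user agent feature, the sketch below
collects the text of every <b> element into a keyword list, using only
Python's standard html.parser module.  KeywordCollector and
collect_keywords are names made up for this example, and no real browser
is claimed to work this way.

```python
from html.parser import HTMLParser


class KeywordCollector(HTMLParser):
    """Gather the text content of each outermost <b> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many <b> elements are currently open
        self.keywords = []    # collected keywords, in document order
        self._buffer = []     # text pieces of the <b> being read

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                word = "".join(self._buffer).strip()
                if word:
                    self.keywords.append(word)
                self._buffer = []

    def handle_data(self, data):
        if self.depth:
            self._buffer.append(data)


def collect_keywords(markup):
    """Return the list of <b> keywords found in an HTML string."""
    parser = KeywordCollector()
    parser.feed(markup)
    parser.close()
    return parser.keywords
```

How the list is then presented (a keystroke-reachable menu, blue text,
something spoken) is left entirely to the user agent, which is the
point.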

>   Some may argue that the author should be able to control how their
>   content is rendered. I think this is misleading because unlike
>   printed material, the author has no control over the presentation
>   media - they don't know how large the screen is, what the
>   capabilities of the screen is, what fonts are installed
>   etc. Therefore, tags which focus on meaning i.e. I want this to
>   stand out or I want this to be emphasised are clearer than tags
>   which say to make this bold or make this italic.

Yes they can: they can require, server-side, that you connect from a
local network, on computers provided by the organization running the
place (I have already seen that), while checking what you do and how
you do it client-side with proprietary JavaScript, or even require a
tablet with a retina screen within a given range of screen sizes, on
iOS… and… btw… this already exists, there’s an app for it: the App
Store (Google Play too).  They provide an HTML/CSS UI, controlled by
proprietary software, distributed only for their devices, and
theoretically only working on those (at least the apps are developed,
configured, and tested that way).  And afaik developers don’t mind
making their software more usable with TalkBack (I don’t even know if
there’s such a thing for iOS).

The excuse of “the device is not always the same” is to me a weak one:
with special political and commercial restrictions this variability can
be reduced, and then it could be considered a “reasonable workaround”
(which it is not).  What should be stressed is that it breaks
accessibility, doesn’t comply with standards, will certainly break
forward compatibility and legally mandated interoperability, and, as
with proprietary software, gives power to authors (or, more often but
not always, companies) and deprives users of what they could and should
have.  This power is comparable to the power gained through
advertisements and propaganda.


> The debate over <i>, <b>, <strong> and <em> is likely to continue for
> some years yet.  I do think things are moving towards <strong>/<em>
> and nearly everything I read these days recommends these over <i> and
> <b>.

There are companies (and some individuals, or countries) who gain power
by doing so, just as they do by pushing proprietary software (though on
a different level), so I don’t believe they will ever stop doing
anything equivalent.  Nor will they advertise that they do so.

So this is not a debate.  Just as there is no “free vs proprietary”
debate, or “climate change vs this-is-god/a-myth” debate: the advocates
of the first have arguments and facts, while the actors of the second
either state that their enemies are idealists, state that their goals
are unrealizable, or say they “do so because they have no choice” or
“do their best not to harm”, and then push more and more pervasive and
unadvertised ways of harming, such as DRM and proprietary JavaScript,
or, in our case, sometimes use <em> and <strong>, but allow users to
publish content using <b> and <i> (and colors!!!), and make their whole
website a soup of <div> and <span>, heavily relying on a gigantic style
soup, based on a site-specific CSS stylesheet, that is partially
generated server-side and partially, heavily, “improved” by (that is:
dependent upon) proprietary JavaScript.

Btw, this is what Google does.  And Google is quite evidently the
biggest feudal lord on the Web.

> It is pretty well accepted that XHTML was a mistake and HTML5 goes a
> long way to address the issues introduced with XHTML - I think XHTML
> as a standard is pretty much relegated to an evolutionary dead end.

XHTML was a beautiful standard and was dismissed because all the money
and resources were placed on HTML5, whose main selling point was new
media resources (namely <audio> and <video>) and new sensor/multimedia
APIs for JavaScript.  So that, hopefully, killed Flash, Java, and almost
Silverlight (which had its niche anyway), and replaced them with
obfuscated proprietary JavaScript and DRM.  The only progress is that we
have free software to execute it, hence we may have a better time
reverse-engineering and hacking it.

XHTML was a dead end not because of evolution, but because, from what I
heard, it was pretty much, like XML compared to SGML, something promoted
by academics, AI/semiology (study of semantics) researchers and
universities (the same who made XPath, RDF, XML Schemas, DTD, SPARQL,
SQL predecessors, logical languages, etc.), so as to make the whole
thing more extensible, modular, and factorized.  For instance, embedding
MathML (semantic markup for mathematical language), instead of using
images (TeX-generated, of course) to display formulas, comes from
XHTML, and doing that in HTML5 is called “XHTML5”.  Same for SVG.  Same
for XLink (more powerful than HTML hyperlinking facilities, nearer to
what was originally meant for first-browser HTML links (dual-end
links, multi-links, etc.), hence more semantic, and currently used, as
a small subset, in SVG and ODT to mark links).

XHTML was actually a great step toward semantics, and its death is
extremely sad, even for semantics.

But nowadays, though they are members as well, just like Mozilla, we
know that the W3C is run by Netflix, the BBC, and other
display-oriented DRM companies.  So HTML5 won over XHTML, because XML,
pure logic, and semiology (what made “semantic” stuff possible at all)
do not interest these companies.  Codec support and DRM do.

> I do agree <div> is over used. In particular, HTML5 has a number of
> new tags which should be used to convey document structure which would
> be a better choice than <div> with different 'class' attributes.

XHTML2 had those, too (not all of them yet), before being killed by
HTML5.  Also, not only new elements, but combinations of those, can be
used to replace “style”.  For instance, the “article article” CSS
selector refers to what the HTML specs call a “comment” (and “article >
footer:first-child” is what we call a “header” in mail: it contains
secondary metadata).  But currently, everywhere on the web, comments do
not always use the “article” tag, or are always outside the commented
article tag, and use a special non-standard CSS style to mark them.
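
To show that nothing more than the element structure is needed, here is
a small sketch, again with Python's standard html.parser, that extracts
what an “article article” selector would match.  CommentFinder and
find_comments are hypothetical names, and the sketch assumes comments
are marked up as <article> elements nested in the commented <article>.

```python
from html.parser import HTMLParser


class CommentFinder(HTMLParser):
    """Collect the text of <article> elements nested in an <article>."""

    def __init__(self):
        super().__init__()
        self.article_depth = 0   # number of currently open <article>s
        self.comments = []       # text of each second-level <article>
        self._buffer = None      # text pieces of the comment being read

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
            if self.article_depth == 2:
                # Entering an article inside an article: a comment.
                self._buffer = []

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            if self.article_depth == 2 and self._buffer is not None:
                text = " ".join("".join(self._buffer).split())
                self.comments.append(text)
                self._buffer = None
            self.article_depth -= 1

    def handle_data(self, data):
        # Deeper nesting (replies) is kept as part of the comment text.
        if self._buffer is not None:
            self._buffer.append(data)


def find_comments(markup):
    """Return the text of every comment found in an HTML string."""
    finder = CommentFinder()
    finder.feed(markup)
    finder.close()
    return finder.comments
```

No site-specific class name appears anywhere in this code, so it would
work unchanged on every site that follows the nesting convention.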

Note that this is a non-standard, opaque and almost proprietary format:
it’s not patented, it’s not copyrighted, but its implementation (either
CSS or proprietary JavaScript) is most of the time.  It’s not binary, it
is based on a cleartext format, but obfuscated, and, unlike what
cleartext advocates have said, far harder to reverse-engineer than a
single binary format: each one is easily guessable, of course, but
whereas a binary format is difficult and takes time to make, a text
format is easy and quick to make, so there are *millions* of them, at
least one per website, and nobody will ever be able to reverse-engineer
them all (unlike binary formats, which are fewer).

> However, we will continue to see a lot of div tags, even when authors
> begin to use newer tags - at least it is a lot better than the early
> days when everything was stuck inside tables!

Ah, the old days… now they’ve got CSS-level tables.  But they still need
a div tag soup for it, instead of advanced selectors, so they keep
making the (X)HTML more complex, bloated and unreadable.

Instead, in XML, we had DSSSL (inspired by Scheme) and XSLT
(Turing-complete, purely functional, XML syntax) to transform it into
display-level things: you *never* need to change the XHTML, only your
stylesheet gets more complex as your display becomes more complex,
and you can do *anything* with your initial, semantic tags.

But now that XHTML is dead, this kind of power is reserved for
LaTeX3, Lisp, and such.

> Backends which generate HTML should be generating HTML5 compliant
> output if for no other reason than it is clearer and easier than
> XHTML.

XHTML allows embedding content and using more semantic markup.  For
instance it could allow using MathML for math equations (used a lot in
Org, though through AMS-LaTeX syntax), instead of the MathJax currently
used in Org mode export, which relies on JavaScript and display-level
CSS.

> As to the OP's original question regarding changing <b> and <i> in HTML
> backends - while I would vote for strong/em over b/i, I don't think
> there is any real need to do this, certainly not in the short term. As
> was pointed out b/i has not been deprecated, so it is still valid. There
> is no suggestion to change Org's own internal markup (ironically
> referred to as bold and italic!), so overall, the status quo seems fine.

I believe we should begin making it semantic, for the sake of
forward compatibility, and it is always better to break backward
compatibility, or introduce new specs, as soon as possible, because as
the implementation gets more widely used, the burden of changing it
incompatibly increases as well.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>]
  2018-10-27 12:52           ` Garreau, Alexandre
@ 2018-10-28 21:19             ` Tim Cross
  2018-10-28 21:46               ` Neil Jerram
  2018-10-28 22:43               ` *markup*, /markup/ and _markup_ true semantics Garreau, Alexandre
  0 siblings, 2 replies; 10+ messages in thread
From: Tim Cross @ 2018-10-28 21:19 UTC (permalink / raw)
  To: Garreau, Alexandre; +Cc: emacs-org list, Kaushal Modi


On reading your response, we are probably not as far apart as I first
thought. However, we have now wandered into a discussion which probably
isn't appropriate for this list. It is now in the realm of something
that would probably be better discussed over a good bottle of red or a
nice cold beer!

There is lots that is 'broken' with the web and I suspect much of it we
will just have to live with and hope whatever the next evolution brings
us learns from our mistakes.

Tim

Garreau, Alexandre <galex-713@galex-713.eu> writes:

> On 2018/10/27 at 07:15, Tim Cross wrote:
>> I have either misunderstood most of your position or I simply disagree
>> with it - I'm not sure which.
>
> maybe a mix of both? I hope it’s a misunderstandnment but if it’s not I
> want to understand too so to get to a constructive agreement.
>
>> - Much of what you argue seems to be based around ideas associated with
>>   typography. IMO this is where things fall down. Typography is really
>>   only relevant to 'printing' (either on paper or screen). Markup is not
>>   just about printing - it is about conveying what the author wanted and
>
> Indeed.  But many people do not abstract what they mean to write and
> still (often, poorly) think in terms of “italic” and “bold” (the org
> manual, as you later said, even do so).  What I wanted to underline is
> that both “italic” and “bold” (and underline too somewhat) are not just
> arbitrary display-level caracteristic that had the particularity to
> later get a meaning: *first* a *meaning* was wanted, and *then* they
> were invented as an imperfect, more or less good, way to translate these
> meanings or their intents to display (it’s as imperfect as a bitmap or
> handwriting of a circle, or a sampled and compressed audio, is to the
> bezier curve or equation of a circle which resulted in it, or the
> function that produced the audio (such as a LilyPond musical partition
> or a resulting MIDI file)).
>
> I’m willing to extract as much of the original meaning (be it about
> attention, memorization, structuration, etc. (very abstract cognitive
> human features are still more common than visual-recognition features))
> so it can be then better applied everywhere, without the burden and
> constraints of the original media (display), with a little of history
> because I like to rehistoricize things into their material and social
> background, so not to see them as a static, ahistoric, uncreated,
> uncriticizable, concept.  Concepts and tools are made for people to
> serve them, not the opposite.
>
>>   how that is best interpreted will depend on the media being used
>>   (i.e. how the content is 'rendered') and should largely be up to the
>>   consumer. 
>
> Yes totally, this is why I believe we, at best, should try to give clear
> and defined meaning to why do we use *, / and _-tags, rather than just
> translating them to the traditional <em>, <strong>, and <ins> tags, that
> were actually just a poor 1-to-1 wrapping to the old <i>, <b> and <u>
> tags, which had no meaning, and still have confused, complex and not
> backward-compatible meaning.
>
> And why sometimes it might be better to set up user options, so if
> authors disagree with what is meant by their tags, they can change it,
> so in the end that gives the correct semantic markup and everybody will
> get the same, intended, meaning.
>
> Also why, ideally, for the web, I wished server-side CSS never existed
> and we only used it as a user-customization language (but still most
> websites have poor semantic tagging, and complex tags composition have
> still no clear defined meaning so it the end it becomes either guessing,
> either a request to add yet-another tag to the already complex HTML
> spec).
>
>> - I am a screen reader user. While you are correct that pitch, tone,
>>   speed and different voices are often used to convey things like 'bold'
>>   or 'italic', there is no universally accepted rule for this
>>   interpretation, at least not in the same sense as there is with
>>   typography.
>
> I know, that’s why I wanted to check with Orca, NVDA, and maybe Jaws too
> if I could.
>
>>   We all know what bold or italic looks like, but there is no
>>   agreement as to what these should sound like. When you use Jaws, you
>>   will get a different result from when you use Orca or Emacspeak or
>>   Window Eyes or .... However, this shouldn't really matter - how
>>   these are 'rendered' should ideally be under the control of the
>>   individual consuming the content. When I consume a document, it
>>   should be my decision as to how the content is presented and for me,
>>   interpreting 'strong' or 'emphasis' seems to be far clearer than
>>   'bold' or 'italic'.
>
> That’s why I’d like * and / to get better meaning than bold and italic.
> For me it is already widely accepted that * is, sometimes, considered as
> bold, but more widely used for emphasis.  So it should be considered as
> such (and, personally, I’ve meant this so that it could begin rendering
> with italic on display for instance, or whatever is the favorite
> emphasis method of the user, it should be configurable).
>
> / is a way harder problem as it has been used because of its slanted
> appearance, to mean italic, so sometimes it’s used for emphasis,
> sometimes for other uses of emphasis.  Ideally I’d like to be acted it’s
> not for emphasis (it’s way less used and supported than * for it, and *
> already serves this purpose very well informally), so implementations
> derive some other meaning for it, to get richer semantics.
>
>> - I don't believe there is any strong reason that the markup used by org
>>   should have any strong reference to HTML in appearance. Org supports
>>   many different backends, many of which don't have anything to do with
>>   HTML at all. It is perhaps unfortunate that Org syntax and markdown
>>   are quite different (though I feel the unfortunate part is that
>>   markdown didn't follow org more closely as I much prefer Org's syntax
>>   to most markdown semantics).  
>
> I don’t like markdown either, nor ReStructuredText.  Why I talked a lot
> about HTML is for two reasons: the discussion was initially about it,
> and it is, afaik, the richest and most known semantical markup
> language.  It is *way* richer than LaTeX, org, md, rst, etc. maybe even
> odt and texinfo, but I’m unsure.
>
> However the * and / exports to texinfo with the same tags as html, that
> is respectively strong and emphasis, which I find sad as * is what is
> mostly used for emphasis (and too levels are pretty much not needed, why
> richer semantics could).  ODT seems to use “<span>” with “style-name="Emphasis"”: I
> heard ODT could be somewhat semantic, but I don’t know if that the best
> they can do (maybe this “style-name” has standard semantics? because to
> me styling is for presentation, and tagging for semantics).
>
> Also a problem of many backends is they’re made for printing or less
> semantic: pdf is not made for semantics, although I heard somewhere that
> they were trial to make it so (which sounds silly as it is tailored for
> printing and supports almost no dynamic modifications, it would be
> better to stop using PDFs at all, in, eg, administration).
>
>> - Probably the number 1 issue I come across when dealing with markup is
>>   the expectation too many authors have that things will be rendered in
>>   the browser in a specific way (a particular font, colour, position,
>>   size, etc). This is a mistake. The big advantage of electronic
>>   presentation is that for the first time, the consumer can have control
>>   over the presentation - they can customise it to meet their
>>   requirements or preferences.
>
> *Exactely*.  Except that then, web become commercial, and businesses
> have found it especially good way to control what users saw almost as
> fully as in advertisements (so it can bring control, power to them, and
> also money, secondarily (if they use non-semantic tags and only <div>
> and <span> in awfully complex sgml soup, then no user is able to control
> anything)), just as French minitel would, and they begun first to abuse
> display-level tagging, then to abuse CSS and html-style-soup (full of
> 80% of <div> and <span>, and enormous CSSes, yay! what a progress!  …><
> yet now we have worse: less CSS, less “style”, and more “data-*” and
> non-free surveillance javascript to replace them).
>
>>   The problem with <b> and <i> is that it gives authors an expectation
>>   their content will be rendered in a specific way.
>
> Not anymore, since W3C, somewhat breaking backward-compatibility,
> decided <b> is for “keywords” without special emphasis, and not being a
> definition (there’s already <dfn> afaik for that), and <i> is for
> ”differently-pronounced phrasing content”, without emphasis, such as
> text prounced with a tone of disgust, or foreign-language text (so if
> you want to embed french words not used enough to be in english
> dictionary, and if it’s nor a real quotation (<q lang="fr>), you should
> use <i lang="fr">du texte en française</i>).
>
> So I can theorically decide that any word markuped <b> may compose a
> local list of easily reachable (for instance with keystrokes)
> “keywords”, like lynx, that b should be displayed normally, but in blue,
> and that would be a standard-complying www user-agent.
>
>>   Some may argue that the author should be able to control how their
>>   content is rendered. I think this is misleading because unlike
>>   printed material, the author has no control over the presentation
>>   media - they don't know how large the screen is, what the
>>   capabilities of the screen is, what fonts are installed
>>   etc. Therefore, tags which focus on meaning i.e. I want this to
>>   stand out or I want this to be emphasised are clearer than tags
>>   which say to make this bold or make this italic.
>
> Yes they can: they can require you server-side connecting from a local
> network on computers furnished by the organization of the place (already
> saw that), while checking what do you do and how to make you doing it
> client-side with proprietary javascript, or even to have a tablet with
> retina screen with a such range of screen sizes, on iOS… and… btw… this
> already exists, there’s an app for it: AppStore (GooglePlay too): they
> furnish HTML/CSS UI, controled by proprietary software, only distributed
> for their devices, theorically only working on those (at least the apps
> are developed, configured, and tested so).  And afaik developers don’t
> mind making their software more usable with TalkBack (I don’t even know
> if there’s a such thing for iOS).
>
> The excuse of “the device is not always the same” is to me a weak one:
> this can, with special political and commercial restrictions, be
> lowered, and then it could be considered a “reasonable workaround”
> (while this is not).  What should be advertised is it breaks
> accessibility, don’t comply standard, will certainly break
> forward-compatibility, legally-mandatory interoperability, and, as for
> proprietary software, gives power to authors (or, more often but not
> always, companies) and deprive users of what they could and should
> have.  This power is comparable to what power is gained through
> advertisements, propaganda.
>
>
>> The debate over <i>, <b>, <strong> and <em> is likely to continue for
>> some years yet.  I do think things are moving towards <strong>/<em>
>> and nearly everything I read these days recommends these over <i> and
>> <b>.
>
> There are companies (and some individual, or countries) who gain power
> by doing so, just as they can do by pushing proprietary software (yet on
> a different level), so I don’t believe they will ever stop doing
> anything equivalent.  Nor advertise they would do so.
>
> So this is not a debate.  Like there is no “free vs proprietary” debate,
> or “climate change vs this-is-god/a-myth” debate: the advocate of the
> first have arguments and facts, the actors of the second are either
> stating their ennemies are idealist, stating their goal are
> unrealizable, or they “do so because they have no choice”, or “do their
> best not harm”, and then push more and more pervasive and unadvertised
> way of harming, such as DRM and proprietary javascript, or, in our case,
> use sometimes <em> and <strong>, but allow users to publish content
> using <b> and <i> (and colors!!!), and making their whole website a soup
> of <div> and <span>, heavily relying on a gigantic style soup, based on
> a site-specific CSS stylesheet, that will be partially generated
> server-side, partially heavily “improve” (that is: depend upon)
> proprietary javascript.
>
> Btw, this is what Google does.  And Google is quite evidently the
> biggest feudal lord on the Web.
>
>> It is pretty well accepted that XHTML was a mistake and HTML5 goes a
>> long way to address the issues introduced with XHTML - I think XHTML
>> as a standard is pretty much relegated to an evolutionary dead end.
>
> XHTML was a beautiful standard and was dismissed because all the money
> and resources were placed on HTML5, whose main selling point was new
> media resources (namely <audio> and <video>) and new sensors/multimedia
> API for javascript.  So that hopefully killed flash, java, and almost
> silverlight (which had its niche anyway), and replaced it with
> proprietary obfusced javascript and DRM.  The only evolution is we have
> free softwares to execute it, hence we may have better time
> reverse-engineering and hacking it.
>
> XHTML was a dead end not because evolution, but because, from what I
> heard, it was pretty much, as XML compared to SGML, something promoted
> by academics, IA/semiology (study of semantics) researchers and
> universities (the same who made XPath, RDF, XML Schemas, DTD, SPARQL,
> SQL predecessors, logical languages, etc.), so to make the whole thing
> more extensible, modular, and factorized.  For instance, the fact of
> embedding MathML (semantic markuping for maths language), instead of
> using images (TeX-generated, of course) to display formulas, come from
> XHTML, and doing that in HTML5 is called “XHTML5”.  Same for SVG.  Same
> for XLinks (more powerful than HTML hyperlinking facilities, more near
> to what was originally meant for first-browser HTML links (dual-end
> links, multi-links, etc.), hence more semantic, and currently, as a
> small subset, used in SVG and ODT to mark link.
>
> XHTML was a great step toward semantics actually, and its death is
> extremely sad, even for semantics.
>
> But, nowadays, though they’re member as well, just as Mozilla, we know
> now W3C is ran by Netflix, BBC, and other displayfull DRM companies.  So
> HTML5 won over XHTML, because XML, pure logics, and semiology (what made
> “semantic” stuff possible at all) do not interest these companies.
> Codec-supports and DRM do.
>
>> I do agree <div> is over used. In particular, HTML5 has a number of
>> new tags which should be used to convey document structure which would
>> be a better choice than <div> with different 'class' attributes.
>
> XHTML2 had those, too (not all yet), before being killed by HTML5.
> Also, not only new elements, but combinations of those, can be used to
> replace “style”.  For instance, the “article article” CSS selector refers
> to what HTML specs call a “comment” (and “article > footer:first-child” is
> what we call in mail a “header”: it contains secondary metadata).  But
> currently, everywhere on the web, comments do not always use the
> “article” tag, or are always outside the commented article tag, and use
> a special non-standard CSS style to mark them.
>
> Note this is a non-standard, opaque and almost proprietary format: it’s
> not patented, it’s not copyrighted, but its implementation (either CSS
> or proprietary javascript) most of the time is.  It’s not binary, it is
> based on a cleartext format, but obfuscated, and unlike what cleartext
> advocates have said, far harder to reverse-engineer than a
> single binary format: easily guessable, of course, but whereas a
> binary format is difficult and takes time to make, a text format is easy
> and quick to make, so there are *millions* of them, at least one per
> website, and nobody will ever be able to reverse-engineer them all (unlike
> binary formats, which are fewer).
>
>> However, we will continue to see a lot of div tags, even when authors
>> begin to use newer tags - at least it is a lot better than the early
>> days when everything was stuck inside tables!
>
> Ah the old days… now they’ve got CSS-level tables.  But they still need
> a div tag soup for it, instead of advanced selectors, so they keep
> complicating the (X)HTML, and making it bloated and unreadable.
>
> Instead, in XML, we had DSSSL (inspired by Scheme) and XSLT
> (Turing-complete, purely functional, XML syntax) to transform it into
> display-level things: you *never* need to change the XHTML, only your
> stylesheet will get more complex as your display becomes more complex,
> and you can do *anything* with your initial, semantic tags.
>
> But now that XHTML is dead, this kind of power is reserved for
> LaTeX3, Lisp, and such.
>
>> Backends which generate HTML should be generating HTML5 compliant
>> output if for no other reason than it is clearer and easier than
>> XHTML.
>
> XHTML allows embedding content and using more semantic markup.  For
> instance it could allow using MathML for math equations (used a lot in
> org, though through AMS-LaTeX syntax), instead of the MathJax currently
> used in org-mode export, which relies on javascript and display-level CSS.
>
>> As to the OP's original question regarding changing <b> and <i> in HTML
>> backends - while I would vote for strong/em over b/i, I don't think
>> there is any real need to do this, certainly not in the short term. As
>> was pointed out b/i has not been deprecated, so it is still valid. There
>> is no suggestion to change Org's own internal markup (ironically
>> referred to as bold and italic!), so overall, the status quo seems fine.
>
> I believe we should begin making it semantic, for
> forward-compatibility, and it is always better to break backward-compat,
> or introduce new specs, as soon as possible, because as the
> implementation gets more widely used, the burden of changing it
> incompatibly increases as well.
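
To make the b/i versus strong/em change discussed above concrete, here is a minimal sketch in Python.  It is only an illustration: the `semanticize` function and the `PRESENTATIONAL_TO_SEMANTIC` table are hypothetical names, not part of Org, and a real change would be made in the exporter itself rather than by post-processing its output.  The regex approach is deliberately naive and assumes well-formed tags.

```python
import re

# Hypothetical mapping, for illustration only (not taken from Org's source).
PRESENTATIONAL_TO_SEMANTIC = {"b": "strong", "i": "em"}

# Match opening or closing <b>/<i> tags, with optional attributes.
# The tag name must be followed by whitespace or ">", so <br>, <img>
# and <body> are left alone.
_TAG = re.compile(r"<(/?)(b|i)(\s[^<>]*)?>", re.IGNORECASE)


def semanticize(html: str) -> str:
    """Rename <b>/<i> tags to <strong>/<em>, keeping any attributes."""

    def rename(match: re.Match) -> str:
        slash = match.group(1)
        name = match.group(2).lower()
        attrs = match.group(3) or ""
        return f"<{slash}{PRESENTATIONAL_TO_SEMANTIC[name]}{attrs}>"

    return _TAG.sub(rename, html)


if __name__ == "__main__":
    print(semanticize('<p><b>bold</b> and <i class="x">italic</i><br></p>'))
```

Note that this only renames tags; it cannot decide whether a given bold span really carries "strong importance", which is the semantic question the thread is about.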


-- 
Tim Cross

* Re: *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>]
  2018-10-28 21:19             ` Tim Cross
@ 2018-10-28 21:46               ` Neil Jerram
  2018-10-28 22:43               ` *markup*, /markup/ and _markup_ true semantics Garreau, Alexandre
  1 sibling, 0 replies; 10+ messages in thread
From: Neil Jerram @ 2018-10-28 21:46 UTC (permalink / raw)
  To: Tim Cross, Garreau, Alexandre; +Cc: emacs-org list, Kaushal Modi

Tim Cross <theophilusx@gmail.com> writes:

> On reading your response, we are probably not as far apart as I first
> thought. However, we have now wandered into a discussion which probably
> isn't appropriate for this list. It is now in the realm of something
> that would probably be better discussed with a good bottle of red or a
> nice cold beer!
>
> There is lots that is 'broken' with the web and I suspect much of it we
> will just have to live with and hope whatever the next evolution brings
> us learns from our mistakes.
>
> Tim
>
> Garreau, Alexandre <galex-713@galex-713.eu> writes:
>
>> On 2018/10/27 at 07:15, Tim Cross wrote:
>>> I have either misunderstood most of your position or I simply disagree
>>> with it - I'm not sure which.
>>
>> maybe a mix of both? I hope it’s a misunderstanding, but if it’s not I
>> want to understand too, so as to reach a constructive agreement.
[...]

Well for what it's worth I have enjoyed reading all this; especially
Alexandre's exposition of the difference between italic and bold when
being read 'from near' and 'from far'.  Thanks both.

    Neil

* Re: *markup*, /markup/ and _markup_ true semantics
  2018-10-28 21:19             ` Tim Cross
  2018-10-28 21:46               ` Neil Jerram
@ 2018-10-28 22:43               ` Garreau, Alexandre
  1 sibling, 0 replies; 10+ messages in thread
From: Garreau, Alexandre @ 2018-10-28 22:43 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-org list

On 2018/10/29 at 08:19, Tim Cross wrote:
> On reading your response, we are probably not as far apart as I first
> thought. However, we have now wandered into a discussion which probably
> isn't appropriate for this list. It is now in the realm of something
> that would probably be better discussed with a good bottle of red or a
> nice cold beer!

I don’t drink alcohol x) but I pretty much got the idea yeah ^^

> There is lots that is 'broken' with the web and I suspect much of it we
> will just have to live with and hope whatever the next evolution brings
> us learns from our mistakes.

Indeed.  But just waiting for “whatever the next evolution” brings is not
the right attitude, I believe.  Lisp and Emacs are great grounds for
experimentation and improvement, I believe, too.  What interests me is
exiting the web as much as possible, while trying to stay somewhat
compatible with the good things, but even more with older and better
things (RFC822, NNTP, DSSSL, semiology studies, true logic
programming, etc.).

I recall the time I was full of energy and constantly changing whatever
was producing or containing HTML to make it more semantic.  One day I
should find the energy again to do that for multiple org backends, and,
regarding the non-semantic org backends, try to unify that with, if not
some stylesheet system, the customizations used by elisp web browsers
such as w3 or eww.

end of thread, other threads:[~2018-10-28 22:43 UTC | newest]

Thread overview: 10+ messages
2018-10-24  0:38 Ox-html: Replace <b> with <strong> and <i> with <em> Kaushal Modi
2018-10-24  6:04 ` Nicolas Goaziou
2018-10-24 15:14   ` Kaushal Modi
2018-10-24 21:00     ` Tim Cross
2018-10-26  5:24       ` *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>] Garreau, Alexandre
2018-10-26 20:15         ` Tim Cross
2018-10-27 12:52           ` Garreau, Alexandre
2018-10-28 21:19             ` Tim Cross
2018-10-28 21:46               ` Neil Jerram
2018-10-28 22:43               ` *markup*, /markup/ and _markup_ true semantics Garreau, Alexandre
