Ihor Radchenko <yantar92@posteo.net> writes:

> Nathaniel Nicandro <nathanielnicandro@gmail.com> writes:

Hello,

I've finally implemented the solution I discussed previously:
inserting a zero width space as a boundary character after each ANSI
sequence to separate it from the text that follows.  This handles the
scenario where deleting the end byte of a sequence causes ansi-color
to treat the partially deleted sequence plus the character directly
after it as a new sequence.  The visible symptom was that the
invisible region containing the sequence would swallow characters
that were never meant to be part of it.

So for example, suppose you have a control sequence, ^[[42m, where m is
the end byte that marks the sequence as a color sequence.  Let point be
signified by *.  If we have

^[[42m*text

then deleting the end byte would result in

^[[42*text

Since t is also a valid end byte, the fontification process will
still recognize the whole thing as a valid sequence, and the t would
then become part of the invisible region containing the sequence.
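
To make the problem concrete, here is a rough sketch in Python (not
Elisp, and not part of the patch).  It assumes a CSI pattern in the
ECMA-48 shape (ESC [, parameter bytes, intermediate bytes, one final
byte), which I believe approximates what ansi-color matches.  The
name hidden_span is made up for this illustration.

```python
import re

# Assumed CSI shape: ESC [ <parameter bytes> <intermediate bytes> <final byte>.
# This is an approximation for illustration, not the patch's actual regexp.
CSI = re.compile(r"\x1b\[[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]")


def hidden_span(text):
    """Return the leading part of TEXT that would be treated as a sequence."""
    match = CSI.match(text)
    return match.group(0) if match else None


before = "\x1b[42mtext"  # ^[[42m followed by text
after = "\x1b[42text"    # the same buffer after deleting the end byte m

print(repr(hidden_span(before)))
print(repr(hidden_span(after)))
```

In the second case the match extends over the t, which is the
character that used to be ordinary text.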

To prevent this, I have introduced the rule that every valid sequence
must have a zero width space immediately after it, and that this space
stays in the buffer even when deleting into it with, for example,
backward-delete-char.  Let the zero width space be signified by |.
If we have

^[[42m|*text

then deletion into the space would now result in

^[[42*|text

i.e., the effect is that the deletion went past the space, leaving it
alone, and deleted the end byte of the control sequence.  Since the
control sequence is no longer valid, due to the space being at the
position of the end byte, it becomes visible.

If you then insert a valid end byte, e.g. m, then the effect is

^[[42m|*text

i.e., point moved past the space character.

So the main changes in this patch compared to previous versions are
the rule of maintaining a zero width space after valid sequences and
the handling of deletion into the space and insertion in front of it.
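
The rule can be modelled roughly as follows.  This is a Python sketch
of the behaviour described above, not the Elisp in the patch; the
buffer is a plain string, point is an index into it, and the names
delete_backward and insert_char are hypothetical.  It also assumes
that a zero width space which does not follow a valid sequence is
deleted like any other character.

```python
import re

ZWSP = "\u200b"  # zero width space used as the boundary character

# Assumed CSI shape, anchored so that it must end exactly at the end of
# the string being searched (an approximation, not the patch's regexp).
CSI_AT_END = re.compile(r"\x1b\[[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]\Z")


def delete_backward(buf, point):
    """Delete one character before POINT, stepping over a boundary space."""
    if point > 0 and buf[point - 1] == ZWSP and CSI_AT_END.search(buf[:point - 1]):
        point -= 1  # leave the boundary space alone
    if point == 0:
        return buf, point
    return buf[:point - 1] + buf[point:], point - 1


def insert_char(buf, point, ch):
    """Insert CH at POINT; move past the space if a sequence was completed."""
    buf = buf[:point] + ch + buf[point:]
    point += 1
    if buf[point:point + 1] == ZWSP and CSI_AT_END.search(buf[:point]):
        point += 1  # the sequence is valid again, so point lands after the space
    return buf, point


buf, point = "\x1b[42m" + ZWSP + "text", 6  # ^[[42m|*text
buf, point = delete_backward(buf, point)     # ^[[42*|text
buf, point = insert_char(buf, point, "m")    # ^[[42m|*text
```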

>
> I tried to test your newest patch with the example file you provided and
> I notice two things that would be nice:
>
> 1. It is a bit confusing to understand why one or other text is colored
>    without seeing the escape characters. Some customization like
>    `org-link-descriptive' and a command like `org-toggle-link-display'
>    would be nice. I can see some users prefer seeing the escape codes.

I've gone ahead and implemented toggling the visibility of the
escape sequences.  The variable is `org-ansi-hide-sequences` and the
function is `org-toggle-ansi-display`.

I just used buffer-invisibility-spec for this.

>
> 2. Using overlays for fontification is problematic. In your example
>    file, table alignment becomes broken when escape sequences are hidden
>    inside overlays:
>
>    | ^[[31mcell 1 | cell 2 |
>    | cell 3       | cell 4 |
>
>    looks like
>
>    |       cell 1 | cell 2 |
>    | cell 3       | cell 4 |
>
>    Using text properties would make table alignment work without
>    adjustments in the org-table.el code.
>

I've gone ahead and used text properties instead of overlays.

>> One thing I would like to start doing is writing some tests for this
>> feature.  It would be great if someone could point me to some tests
>> that I can peruse so that I can get an idea of how I can go about
>> writing some of my own.  Also, are there any procedures or things I
>> should be aware of when trying to write my own tests?
>
> Check out testing/README file in the Org repository.
>
> Unfortunately, we do not yet have any existing tests for font-locking in
> Org tests. You may still refer to the files in testing/lisp/ to see some
> example tests.
>
> Also, Emacs has built-in library to help writing font-lock tests -
> faceup.el. You may consider using it. Its top comment also contains a
> number of references to various tools that could be useful to diagnose
> font-locking code.

I have not looked into testing this feature yet.

Feedback appreciated!