Here are some quotes from the previous discussion, with my replies below each.



* Remarks #1, by Ihor Radchenko at Sat, 07 Sep 2024 11:53

Ihor Radchenko writes:
> > Sébastien Gendre writes:
> > I can search for another project.
> > 
> > I don't know how many work it would be needed to develop a search motor
> > specifically for Org-mode. But doing the indexing on Org-mode files could
> > let the user control the indexation of each page and section directly
> > from them. With buffer settings and heading properties.
> 
> Let me clarify.
> Unless an indexer is very simple, we do not really want it in Org - it
> will put a lot of extra load on maintainers.
> 
> On the other side, we do not want to tie things to some project that
> may fade out in 10 years into the future.
> 
> The way I see search engine support in Org mode is either:
> 
> 1. Using some really established project that we can expect to last for
>    many years.
> 
> 2. Implementing pluggable search support where users can choose which
>    indexer/searcher to use
> 
> (2) will be the best.
> 
> So, may you please search across available search engines and see what
> they have in common. Then, we can work out some infrastructure that is
> generic enough to plug a custom engine.
> 
> Then, pagefind can be the default (it is MIT license - GPL compatible),
> unless someone proposes a better alternative.

Yes, you are right, pluggable search support is the best choice. I am
adding it to the summary of this discussion, in the first message of
this thread.

I will keep looking at other search engines and write a summary of
what they have in common. I will publish it in this thread, attached to
the first message.



* Remarks #2, by Max Nikulin at Sun, 8 Sep 2024 21:46 and 22:55

Max Nikulin writes:

> On 07/09/2024 18:53, Ihor Radchenko wrote:
> > Then, pagefind can be the default (it is MIT license - GPL compatible),
> 
> It might be more tricky:
> 
> <https://yhetil.org/emacs-devel/861q671idt.fsf@gnu.org>
> emacs-devel. Re: [Nicolas Graves] [PATCH v6 01/10] rde: emacs: Start 
> emacs in --daemon mode, with shepherd and pid-file. Sun, 12 May 2024 
> 12:36:46 +0300
> > Nicolas Graves:
> >> The code is given as MIT-0, hence also the two different licenses for
> >> the two functions sd_notify and sd_is_socket. Not an expert on licenses
> >> either, but with a proper flag about what this function's license is, I
> >> guess it should be fine, since other projects also do that.
> 
> Eli Zaretskii:
> > The license is only half of the problem.  Every non-trivial
> > contribution to Emacs must have its copyright assigned to the FSF,
> > because the FSF is in charge of protecting the Emacs sources,
> > something that only the copyright holder can do, at least in some
> > countries.  You will need to assign the copyright as well (a
> > relatively simple procedure of filling a form and emailing it), but if
> > the code is not yours, you cannot assign its copyright.

Max Nikulin writes:

> On 08/09/2024 21:46, Max Nikulin wrote:
> > On 07/09/2024 18:53, Ihor Radchenko wrote:
> >> Then, pagefind can be the default (it is MIT license - GPL compatible),
> > 
> > It might be more tricky:
> 
> Sorry for the noise. Of course, if you are not going to include any 
> pagefind code into Org then it is not an issue.

If we use PageFind, isn't it possible to not bundle it with Org mode but
have an Elisp function that downloads it? Like what Elpy does with its
Python dependencies?



* Remarks #3, by Orm Finnendahl at Sun, 8 Sep 2024 20:36

Orm Finnendahl writes:

> that makes no sense to me whatsoever: Post processing is already
> possible and built into org-export. pagefind is an external product
> with its own binaries, not written in elisp nor being by any means
> connected to emacs and compiling index files on generated HTML files
> is exactly that: A post process.
> 
> The javascript needed and all processing scripts can easily be
> included in the header, so I don't see any point in this, except
> writing a tutorial, how to integrate pagefind into someone's HTML
> output with the means already available with the existing backend.

It is already possible to add a search field, run the PageFind indexing
and install its JS/CSS by putting custom HTML in the preamble and
writing a custom post-processing function. That is what I have started
to do.

But it requires the user to write custom HTML and a custom Elisp
function, to understand how PageFind works, etc.
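
As an illustration, such a manual setup could look roughly like the
sketch below. The project name, the paths and the function name
`my/pagefind-index' are placeholders. It assumes that the `pagefind'
binary is installed, that the site is served from the root of a web
server, and that the Org version in use calls the completion function
with the project property list as its only argument.

```emacs-lisp
;; Sketch only: "my-site", the paths and `my/pagefind-index' are
;; placeholders, not part of Org or of PageFind.
(require 'ox-publish)

(defun my/pagefind-index (project-plist)
  "Run the pagefind indexer on the publishing directory of PROJECT-PLIST."
  (let ((dir (expand-file-name
              (plist-get project-plist :publishing-directory))))
    ;; `pagefind --site DIR' writes the index and the UI files (JS/CSS)
    ;; into DIR/pagefind/.
    (unless (eq 0 (call-process "pagefind" nil "*pagefind*" nil
                                "--site" dir))
      (error "pagefind failed, see buffer *pagefind*"))))

(add-to-list
 'org-publish-project-alist
 `("my-site"
   :base-directory "~/site/org/"
   :publishing-directory "~/site/public/"
   :publishing-function org-html-publish-to-html
   ;; Load the UI files generated by the indexer.
   :html-head ,(concat
                "<link href=\"/pagefind/pagefind-ui.css\" rel=\"stylesheet\">\n"
                "<script src=\"/pagefind/pagefind-ui.js\"></script>")
   ;; Search field, placed at the top of every page.
   :html-preamble ,(concat
                    "<div id=\"search\"></div>\n"
                    "<script>window.addEventListener('DOMContentLoaded',"
                    " function () {"
                    " new PagefindUI({ element: '#search' }); });</script>")
   ;; Index the generated HTML once publishing is finished.
   :completion-function my/pagefind-index))
```

Even in this reduced form, the user has to know three things: the HTML
expected by the PageFind UI, the command line of the indexer, and the
org-publish hooks.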

The feature I suggest is to let users have this search engine on their
website simply by setting an org-publish option to "t".

A local search engine doesn't seem to be a niche feature. Jupyter
Book [1] and Read the Docs documentation [2] both have one. As Org mode
is a fantastic tool for writing online documentation, a local search
engine that is easy to set up would be a good feature. It is also very
useful for blogs, to let visitors search for information in old posts.

If this feature is simple to use, it lets the user concentrate on
writing the content.

And regarding the inclusion, or not, of other software: could we have
an Elisp function that downloads PageFind? If not, we can display a
message asking the user to install it on their own if Emacs doesn't
find PageFind in $PATH.
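
The second option, asking the user to install it, is simple to sketch.
The function name `my/pagefind-executable' is hypothetical. The
download variant is left out on purpose, since it would depend on the
platform and on where the releases are hosted.

```emacs-lisp
;; Sketch only: `my/pagefind-executable' is a hypothetical name.
(defun my/pagefind-executable ()
  "Return the full path of the pagefind binary.
Signal a `user-error' asking for a manual install if it is missing."
  ;; `executable-find' searches `exec-path', which Emacs normally
  ;; initializes from $PATH.
  (or (executable-find "pagefind")
      (user-error
       "Cannot find pagefind in `exec-path'; please install it first")))
```

The publishing code could call this function before indexing, so that
the user gets a clear message instead of a failed process.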


Orm Finnendahl writes:

> And that's not even contemplating, why someone would want to throw a
> multipage site search indexer onto single page HTML output which
> doesn't work on static files opened from local disks ;-)

For the single HTML export, indexing is not necessary. But including a
search field could still be useful if the page has been generated
separately from org-publish?

For the last part, about files opened from local disks, what do you mean?



* Remarks #4, by Ihor Radchenko at 08 Sep 2024 18:42

Ihor Radchenko writes:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> > The javascript needed and all processing scripts can easily be
> > included in the header, so I don't see any point in this, except
> > writing a tutorial, how to integrate pagefind into someone's HTML
> > output with the means already available with the existing backend.
> >
> > And that's not even contemplating, why someone would want to throw a
> > multipage site search indexer onto single page HTML output which
> > doesn't work on static files opened from local disks ;-)
> 
> I agree that including indexer is just a question of adding specific
> javascript.
> 
> But I think that it would be useful to provide some default toggle to
> include such a tool without having to know the details of what js to
> include and where. Just as a simple user option.
> 
> It should probably work for single page as well. I do not see why not.

I agree with Ihor, the goal is to have a simple user option to enable
this feature.



* Remarks #5, by Orm Finnendahl at 8 Sep 2024 21:25

Orm Finnendahl writes:

> Am Sonntag, den 08. September 2024 um 18:42:51 Uhr (+0000) schrieb
> Ihor Radchenko:
> > I agree that including indexer is just a question of adding specific
> > javascript.
> > 
> > But I think that it would be useful to provide some default toggle to
> > include such a tool without having to know the details of what js to
> > include and where. Just as a simple user option.
> >
> 
>  whatever. The js is already included in the pagefind distribution, so
> it is a simple
> 
> #+HTML_HEAD: <script src="./pagefind/pagefind-ui.js"></script>
> 
> in the org file and the searchbar html in the preamble (or wherever).

This would require the user to write custom HTML code and also a custom
Elisp function to launch the indexing and the installation of the
JS/CSS. And the user would also need to understand how PageFind works.

That is a lot of steps, and they require different skills.

But if we provide a simple option to enable in org-publish, getting a
search engine could be very simple for the user.

Of course, we are not going to support every use case of a search
engine, only the simplest and most useful ones. And the pluggable system
leaves users the possibility to hack it the way they want.
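
To make the idea of a pluggable system more concrete, here is a minimal
sketch. Nothing like it exists in Org today: the variable
`my/org-publish-search-indexers' and the function
`my/org-publish-search-index' are invented names, and the real design
still has to be worked out once the common points between engines are
known.

```emacs-lisp
;; Sketch only: every name starting with `my/' is hypothetical.
(defvar my/org-publish-search-indexers
  (list (cons 'pagefind
              (lambda (dir)
                ;; `pagefind --site DIR' indexes the HTML files in DIR.
                (call-process "pagefind" nil nil nil "--site" dir))))
  "Alist mapping an engine symbol to an indexer function.
Each function takes the publishing directory as its only argument.")

(defun my/org-publish-search-index (engine dir)
  "Index the published files in DIR with the indexer registered for ENGINE."
  (let ((indexer (alist-get engine my/org-publish-search-indexers)))
    (unless indexer
      (user-error "No search indexer registered for `%s'" engine))
    (funcall indexer (expand-file-name dir))))
```

A user who prefers another engine would only have to push a new
(SYMBOL . FUNCTION) pair onto the alist, without touching the rest of
the publishing code.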


Orm Finnendahl writes:

> > It should probably work for single page as well. I do not see why
> > not.
> 
>  sure it works. I just question the raison d'etre, when single page
> search is already integrated into webbrowsers. But as always there
> will be people arguing that it is necessary to have a search bar with
> pop up results integrated into the page and of course there is nothing
> wrong with that. I use pagefind myself, but the site I'm working on
> (built with the multipage exporter BTW) currently contains more than
> 400 files where the browser search can't help.

A multi-page search engine is of course most useful with a multi-page
publication.

For a single page, I see it as useful only if you export one HTML file
that you plan to include in an already existing multi-file website. But
it is maybe too specific for us to support on a single-page export.



* Remarks #6, by Ihor Radchenko at Mon, 09 Sep 2024 16:40

Ihor Radchenko writes:

> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> >> It should probably work for single page as well. I do not see why
> >> not.
> >
> >  sure it works. I just question the raison d'etre, when single page
> > search is already integrated into webbrowsers. But as always there
> > will be people arguing that it is necessary to have a search bar with
> > pop up results integrated into the page and of course there is nothing
> > wrong with that. I use pagefind myself, but the site I'm working on
> > (built with the multipage exporter BTW) currently contains more than
> > 400 files where the browser search can't help.
> 
> Agree. So, having search functionality probably makes more sense within
> ox-publish, where we may also run post-processing to generate search
> terms or whatever extra work is needed to make the indexer work.

I 100% agree with you.


* EOF

That's it, I hope I have not forgotten anyone.


Best regards

-------
Gendre Sébastien




[1] https://jupyterbook.org/

[2] Example here: https://slixmpp.readthedocs.io