emacs-orgmode@gnu.org archives
* Keeping an advanced dictionary in Org-mode?
@ 2011-06-06  9:38 Christian Moe
  2011-06-06 14:50 ` Alan E. Davis
  2011-06-07 10:55 ` Julian Bean
  0 siblings, 2 replies; 5+ messages in thread
From: Christian Moe @ 2011-06-06  9:38 UTC (permalink / raw)
  To: Org Mode

Hi,

Is anybody using Org-mode to build an advanced dictionary with 
sub-entries, tags etc.? Would you be willing to share a setup?

For example, the obvious way to build a dictionary would be to use a 
dictionary list (I borrow a few English-French lines from the 
wonderful WordReference.com site):

- pine ::
   (/paɪn/)
   1. /m noun/ [bot.] pin; *stripped ~* pin décapé.
   2. /intr verb/ languir (*for* après; *to do* de faire)

This looks nice, but unfortunately, you cannot set tags or properties 
on dictionary terms, so it's not particularly amenable to fancy 
searching, mapping etc.

On the other hand, you could do something like this:

* pine
   :PROPERTIES:
   :Pronunciation: /paɪn/
   :END:
** pin                                  :bot:
    :PROPERTIES:
    :Word_class: noun
    :Gender:   m
    :END:
    *stripped ~* pin décapé.
** languir
    :PROPERTIES:
    :Word_class: verb
    :Transitivity: intr
    :END:
    (*for* après; *to do* de faire)

It's a pain to do, and because of outline folding, it could be a pain 
to look up meanings, and you might need to do some serious 
post-processing on the export to make it look anything like a 
dictionary. But when you're done, you could extract a list of all 
botanical terms (:bot:), or of words and pronunciations only... etc.
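
As a rough illustration of the kind of extraction meant here, the following Python sketch reads an outline like the one above and pulls out the headings tagged :bot: and the headword/pronunciation pairs. It is only a sketch under simplifying assumptions: the function names (`parse_entries`, `tagged`, `pronunciations`) are made up for this example, tag inheritance is ignored, and only the heading, tag, and property-drawer syntax shown in the example is handled. Inside Emacs, Org's own tag and property searches would be the natural route.

```python
import re

ORG_SAMPLE = """\
* pine
  :PROPERTIES:
  :Pronunciation: /paɪn/
  :END:
** pin                                  :bot:
   :PROPERTIES:
   :Word_class: noun
   :Gender:   m
   :END:
   *stripped ~* pin décapé.
** languir
   :PROPERTIES:
   :Word_class: verb
   :Transitivity: intr
   :END:
   (*for* après; *to do* de faire)
"""

# A heading is one or more stars at the start of the line, a title,
# and an optional trailing :tag1:tag2: group.
HEADING = re.compile(r"^(\*+)\s+(.*?)(?:\s+:([\w@:]+):)?\s*$")
# A property line inside a drawer looks like ":Key: value".
PROPERTY = re.compile(r"^\s*:([^:\s]+):\s*(.*?)\s*$")


def parse_entries(text):
    """Return one dict per heading: level, title, tags, props, parent title."""
    entries, stack, in_drawer = [], [], False
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            level = len(heading.group(1))
            # Drop headings that are not ancestors of the new one.
            while stack and stack[-1]["level"] >= level:
                stack.pop()
            entry = {
                "level": level,
                "title": heading.group(2),
                "tags": heading.group(3).split(":") if heading.group(3) else [],
                "props": {},
                "parent": stack[-1]["title"] if stack else None,
            }
            entries.append(entry)
            stack.append(entry)
            in_drawer = False
            continue
        stripped = line.strip()
        if stripped == ":PROPERTIES:":
            in_drawer = True
        elif stripped == ":END:":
            in_drawer = False
        elif in_drawer and entries:
            prop = PROPERTY.match(line)
            if prop:
                entries[-1]["props"][prop.group(1)] = prop.group(2)
    return entries


def tagged(entries, tag):
    """(parent title, title) for every heading carrying the tag directly."""
    return [(e["parent"], e["title"]) for e in entries if tag in e["tags"]]


def pronunciations(entries):
    """(title, pronunciation) for every heading with a Pronunciation property."""
    return [(e["title"], e["props"]["Pronunciation"])
            for e in entries if "Pronunciation" in e["props"]]


entries = parse_entries(ORG_SAMPLE)
print(tagged(entries, "bot"))
print(pronunciations(entries))
```

On the sample entry this lists "pin" (under "pine") as the only botanical term, and "pine" with /paɪn/ as the only headword with a pronunciation.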

So for my growing pile of translation notes, I might like to keep that 
kind of thing. But there are so many ways it could be organized - what 
do you put in subheadings? what in entry text below subheadings? what 
in tags, what in properties? etc. So if someone has an example that 
works for them, I'd like to see one.

(Org may not be the best tool for this job, of course, but it's the 
right tool for me...)

Yours,
Christian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Keeping an advanced dictionary in Org-mode?
From: Alan E. Davis @ 2011-06-06 14:50 UTC (permalink / raw)
  To: mail; +Cc: Org Mode

FWIW:

I won't get into it much for now, but I have used a "band format" for
lexical data.  There are other names for this type of free-form database.  I
wrote a crude elisp routine to convert entries into LaTeX-formatted files.

A "band" is a record, so to speak.  I am not very well qualified in this,
but was able to use it to record lexical data.  You may find some linguists'
websites where this or similar formats are explained.  A record starts with
a double-dotted key, and information categories may be made up on the fly,
marked by single-dotted keys, each preceded by at least two spaces.  I think
it's convenient for each record to end with a line feed, as well.

..HW <headword>  .D <local dialect>  .GE <English gloss>  .NS <scientific name>
  .NCE <Common Name>  .NCs <Spanish Common Name>  .R <remark>
  .RC <Remark on Cultural Significance>

This is just a made-up case, but perhaps you can catch the drift.

Here are a couple of simple cases from my files:

..hw tutubi   .lang vis  .nce dragonfly    .source FSD
..HW sigai    .lang vis .ge (mollusc) shell, when empty
..hw soksok  .ec gecko  .cg  .la ilo  .src hanna .n
..hw locus  .ec octopus  .cg  .la ilongo  .src hannah .n see nucus [vis];
kuus [chuukese]
..hw tikling  .ec heron  .cg  .la vis  .from fsd
..hw nucus  .ec octopus  .cg  .la vis  .src fsd, hannah  .n related to
chuukese kuus
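
To make the format concrete, here is a minimal Python sketch of a reader for records like these. It follows the rule as stated above (a record opens with a double-dotted key; later keys are single-dotted and preceded by at least two spaces). Several things are assumptions made for the example and not part of the original setup: the function names, lower-casing the keys, treating a line that does not start with ".." as the wrapped continuation of the previous record, and keeping only the last value if a key repeats.

```python
import re

# Either the opening "..key" at the very start of the record, or a
# ".key" preceded by at least two whitespace characters.
FIELD = re.compile(r"(?:^\.\.|\s{2,}\.)([A-Za-z]+)(?=\s|$)")


def parse_record(line):
    """Split one band record into a dict of key -> value (possibly empty)."""
    line = line.strip()
    if not line.startswith(".."):
        raise ValueError("a record must start with a double-dotted key")
    matches = list(FIELD.finditer(line))
    record = {}
    for i, match in enumerate(matches):
        # A value runs from the end of its key to the start of the next key.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
        record[match.group(1).lower()] = line[match.end():end].strip()
    return record


def parse_bands(text):
    """Parse a block of records, joining wrapped continuation lines."""
    joined = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(".."):
            joined.append(line)
        elif joined:
            joined[-1] += " " + line  # continuation of the previous record
    return [parse_record(record) for record in joined]


SAMPLE = """\
..hw tikling  .ec heron  .cg  .la vis  .from fsd
..hw nucus  .ec octopus  .cg  .la vis  .src fsd, hannah  .n related to
chuukese kuus
"""

records = parse_bands(SAMPLE)
for record in records:
    print(record)
```

Keys followed by a single space, as in a few of the lines above, would not be split by this pattern; a looser pattern would catch them at the cost of misreading values that happen to contain " .word".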

Fairly straightforward elisp would scan a record and wrap each item in a
particular typeface.

To give an idea of the output: each line was output as an \item in a list.
This got to be a LITTLE cumbersome, perhaps, and someone good at coding
would do it differently.  The idea is that a lisp routine scans the records
and spits out list items.  This could be any kind of output, and perhaps org
mode would be a good way to rig a routine to scan list items and output
different band types as slanted (\sl), roman, or italicized components.

\item [{\sl k\'{u}\'{u}s\/}$_{3}$]   \index{k\'{u}\'{u}s} \quad     Small,
night-time octopus.   \HADJ  E\'{e}t.

\item [{\sl k\'{u}\'{u}s\/}$_{4}$]   \index{k\'{u}\'{u}s} \quad     Daytime
octopus.   {\sc syn\/}:\ {\sl  nippach}.    {\sc alt\/}:\ {\sl
k\'{u}\'{u}h}.    \HADJ  F\'{o}n\'{o}.

\item [{\sl k\'{u}\'{u}sen neepwin\/}]   \index{k\'{u}\'{u}sen neepwin}
\quad    {\sc see\/}:\ {\sl nippachin neepwin}.    Even though this is not
said, it would be the correct way to say it. \HADJ  Wonip.
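
The scan-and-emit step can be sketched in a few lines of Python. This is not the original elisp routine, only an illustration of the idea: a record, given here as a plain dict of band keys to values, is turned into one \item line in roughly the style shown above. The function names, the choice of which bands to print, and the labels in `LABELS` are all assumptions made for the example; accented letters are passed through unchanged and backslashes are not escaped.

```python
# Characters that need a backslash in ordinary LaTeX text.
LATEX_SPECIALS = {"&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
                  "_": r"\_", "{": r"\{", "}": r"\}"}

# Hypothetical mapping from band key to the small-caps label printed for it.
LABELS = {"la": "lang", "src": "src", "n": "note"}


def escape(text):
    """Escape the LaTeX special characters listed in LATEX_SPECIALS."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def to_item(record):
    """Render one record as an \\item: slanted headword, roman gloss, labelled bands."""
    head = escape(record["hw"])
    parts = [r"\item [{\sl %s\/}]" % head, r"\index{%s}" % head, r"\quad"]
    gloss = record.get("ec") or record.get("ge")
    if gloss:
        parts.append(escape(gloss) + ".")
    for key, label in LABELS.items():
        value = record.get(key)
        if value:  # bands that are missing or empty are skipped
            parts.append(r"{\sc %s\/}:\ %s." % (label, escape(value)))
    return " ".join(parts)


example = {"hw": "nucus", "ec": "octopus", "cg": "", "la": "vis"}
print(to_item(example))
```

For the example record this prints `\item [{\sl nucus\/}] \index{nucus} \quad octopus. {\sc lang\/}:\ vis.`, which can be placed inside a description-style list environment.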


This may not be an appealing approach.  I am still pleased with the ability
to flexibly add band keys on the fly, during data entry, and the potential
to use LaTeX as a frontend.  HTML would also be useful, depending on how you
wish to read your dictionary.

Not a perfect system.  Linguists have done better.  Robert Hsu of the
University of Hawaii built a system around SPITBOL and maybe SNOBOL4.  I was
hopelessly lost trying to use those, but elisp did what little I needed.  I
think that it may be possible to organize a database using org-mode.

For now, I have a capture template for data entry, such as it is:

("=" "lex" entry (file+headline "lexicon.org" "Unsorted")
 "* ..hw %^{Headword}  .gs %^{Scientific Name}  .ge %^{English Gloss}  .ec %^{English Common Name}  .cg %^{Category}  .la %^{Language}  .src %^{Informant} .n %^{Note} %?  .dt %u "
 :prepend t :immediate-finish t)

Again, FWIW.  To me, a great deal.  Maybe to others, not so great of a
deal.

Alan



* Re: Keeping an advanced dictionary in Org-mode?
From: Christian Moe @ 2011-06-06 18:08 UTC (permalink / raw)
  To: Alan E. Davis; +Cc: Org Mode

Hi,

Thanks, these pointers were really helpful -- whether I end up doing 
something similar, or using them to work out how I want to do this in 
Org, or using other tools I was able to discover in five minutes after 
you'd pointed me to the right search keywords!

( Like SIL's Toolbox:
http://www.sil.org/computing/catalog/show_software.asp?id=79 )

Yours,
Christian


* Re: Keeping an advanced dictionary in Org-mode?
From: Alan E. Davis @ 2011-06-07 10:21 UTC (permalink / raw)
  To: mail; +Cc: Org Mode


Thank you for the link.

On Tue, Jun 7, 2011 at 4:08 AM, Christian Moe <mail@christianmoe.com> wrote:

> Hi,
>
> Thanks, these pointers were really helpful -- whether I end up doing
> something similar, or using them to work out how I want to do this in Org,
> or using other tools I was able to discover in five minutes after you'd
> pointed me to the right search keywords!

I would be interested in anything you come up with.

Alan


* Re: Keeping an advanced dictionary in Org-mode?
From: Julian Bean @ 2011-06-07 10:55 UTC (permalink / raw)
  To: mail; +Cc: Org Mode


On 6 Jun 2011, at 10:38, Christian Moe wrote:

> ** languir
>   :PROPERTIES:
>   :Word_class: verb
>   :Transitivity: intr
>   :END:
>   (*for* après; *to do* de faire)
> 
> It's a pain to do, and because of outline folding, it could be a pain to look up meanings, and you might need to do some serious post-processing on the export to make it look anything like a dictionary. But when you're done, you could extract a list of all botanical terms (:bot:), or of words and pronunciations only... etc.


Column View is your friend. Both for lookup and for data entry.

Jules
