From: Tom Gillespie
Date: Sat, 3 Apr 2021 23:11:03 -0700
Subject: A formal grammar for Org
To: emacs-orgmode
List-Id: "General discussions about Org-mode."

Dear all,

Here is a draft of a formal grammar for Org mode [1]. It is still in a rough state, despite quite a bit of work. However, following some changes that improve performance when parsing real (big) Org files, I think it is time to share it with the community so that we can start gathering feedback.

I have found a number of opportunities for simplifying the Org grammar, sometimes by extending it to make it more regular and, in the process, adding useful features. These are much easier to understand with this grammar in hand as a reference.

The grammar itself is implemented using Racket's #lang brag (see [2] for an overview of brag's syntax). I had considered breaking it up into literate sections in an Org file, but for now decided to leave it as a single file to simplify the development workflow. As a result, the full implementation is fairly long [3].

Comments and feedback would be greatly appreciated.

Best!
Tom

1. https://github.com/tgbugs/laundry
2. https://docs.racket-lang.org/brag/#%28part._.The_language%29
3. https://github.com/tgbugs/laundry/blob/master/org-mode/parser.rkt