From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sun, 29 Jan 2023 18:05:14 +0000
To: emacs-orgmode@gnu.org
Subject: Inline markup: How does org identify nested code/verbatim?
Followup-To: emacs-orgmode@gnu.org
Message-Id: <4P4fNv4F48z6tmQ@submission01.posteo.de>
List-Id: "General discussions about Org-mode."

Hi folks,

this is a question about Org mode development itself. How you do this is
magic to me. ;) I would like to learn it because I am writing a kind of
Org parser in Python.

Here is a text with code nested inside verbatim:

    This =is ~code~ in verbatim= text.

Exporting this to HTML (via org-html-export-as-html) gives:

    This is ~code~ in verbatim text.

Awesome! :D

On its own, I am able to identify code or verbatim with a regex that has
three capture groups: the character before the opening marker, the
content between the markers, and the character after the closing marker.

for verbatim:

    "(^|[ .,;:\-?!({\"'])=(.*?)=([ .,;:\-?!)}\"']|$)"

for code:

    "(^|[ .,;:\-?!({\"'])~(.*?)~([ .,;:\-?!)}\"']|$)"

But they don't work together. In the example above I need to apply the
verbatim regex first to get the right result. If I applied the code regex
first it wouldn't work, because it would find the ~code~ without knowing
that it is surrounded by =verbatim=.

I don't know what my users will feed into my software: verbatim in code
or code in verbatim. So I have to figure out which regex to apply first.

How does Org solve this problem? I don't need a full working solution,
just an idea.

One approach I have in mind is to run both regexes separately and then
compare the results "somehow":

    Verbatim: ['This', ' ', 'is ~code~ in verbatim', ' ', 'text.']
    Code    : ['This =is', ' ', 'code', ' ', 'in verbatim= text.']

"Somehow"!

Another approach I have in mind is something I would call a nested
regex: constructing a regex pattern that looks for verbatim with code in
it. And the other way around, of course.
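
A third idea, sketched here only as an illustration and not claimed to be
how Org itself does it: merge both patterns into one and scan the line
once from left to right. Whichever marker opens first wins, and its
content is kept literally, so the inner markers are never looked at and
the ordering question disappears. The names INLINE, KIND and tokenize are
made up for this sketch. Two deliberate deviations from the patterns
above, both assumptions: the before/after characters are checked with
lookarounds instead of capture groups so that they are not consumed, and
the content must be at least one character long.

```python
import re

# Same "before" and "after" character sets as in the two patterns above,
# but as lookarounds, so the neighbouring character is not consumed.
PRE = r"""(?:^|(?<=[ .,;:\-?!({"']))"""
POST = r"""(?=[ .,;:\-?!)}"']|$)"""

# One pattern for both kinds: the opening marker is captured and the
# closing marker must be the same character (named backreference).
# ".+?" instead of ".*?" so that "==" is not an empty match (assumption).
INLINE = re.compile(PRE + r"(?P<marker>[=~])(?P<body>.+?)(?P=marker)" + POST)

KIND = {"=": "verbatim", "~": "code"}


def tokenize(line):
    """Split one line into (kind, text) pairs in a single left-to-right pass.

    The first marker that opens a valid span wins; its body is taken
    literally, so markers nested inside it are not interpreted.
    """
    tokens = []
    pos = 0
    for match in INLINE.finditer(line):
        if match.start() > pos:
            tokens.append(("text", line[pos:match.start()]))
        tokens.append((KIND[match.group("marker")], match.group("body")))
        pos = match.end()
    if pos < len(line):
        tokens.append(("text", line[pos:]))
    return tokens


if __name__ == "__main__":
    print(tokenize("This =is ~code~ in verbatim= text."))
    print(tokenize("This ~is =verbatim= in code~ text."))
```

For the example line this yields a text token "This ", one verbatim token
"is ~code~ in verbatim" and a text token " text."; with the markers
swapped it yields one code token instead. Because the surrounding
characters are not consumed, two spans separated by a single space are
both found.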