From: Titus von der Malsburg
To: Joost Kremers
Cc: emacs-orgmode@gnu.org
Subject: Re: CSL-JSON support for =parsebib=
Date: Fri, 07 May 2021 16:22:41 +0000
Message-ID: <8735uyikdq.fsf@posteo.de>
In-reply-to: <8735uywosb.fsf@fastmail.fm>
References: <87fsyy6at8.fsf@fastmail.fm> <875yzuk9qd.fsf@posteo.de> <875yzuwv4z.fsf@fastmail.fm> <87tuneis8a.fsf@posteo.de> <8735uywosb.fsf@fastmail.fm>
List-Id: "General discussions about Org-mode."

On 2021-05-07 Fri 16:47, Joost Kremers wrote:
> On Fri, May 07 2021, Titus von der Malsburg wrote:
>>> Apparently, =json-parse-{buffer|string}= then gives you a symbol with a space
>>> in it...
>>
>> I now see that symbol names “can contain any characters whatever” [1]. But many
>> characters need to be escaped (like spaces) which isn’t pretty.
>
> Agreed. But if you pass such a symbol to =symbol-name= or to =(format "%s")=,
> the escape character is removed, so when it comes to displaying those symbols to
> users, it shouldn't matter much.
>
> Note, though, that the keys in CSL-JSON don't seem to contain any spaces or
> other weird characters. There are just lower case a-z and dash, that's all.

I agree that weird characters are unlikely to be an issue. Nonetheless, strings seem slightly more future-proof. Funky Unicode stuff is now appearing everywhere (I’ve seen emoji being used for variable names) and the situation could be different a couple of years down the line.

>>> This works for the Elisp library =json.el=, but Emacs 27 can be compiled with
>>> native JSON support, which, however, doesn't provide this option,
>>> unfortunately.
>>
>> I see. In this case it might make sense to propose string keys as a feature for
>> json.c. The key is a string anyway at some point during parsing, so avoiding the
>> conversion to symbol may actually be the best way to speed things up.
>
> True. I'll ask on emacs-devel. Personally, I'd prefer strings, too, but I'm a
> bit hesitant about doing the conversion myself, esp. given that in Ebib, all the
> keys would need to be converted back before I can save a file.

Sure, converting all keys in parsebib is not attractive.
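To make the escaping point and the =json.el= key option concrete, here is a minimal sketch. It is only an illustration, not code from parsebib or bibtex-completion. The sample key and JSON snippet are made up, and it assumes that the =json.el= option referred to above is the variable =json-key-type=.

```elisp
;; Illustrative sketch only; the sample key and JSON below are made up.
(require 'json)

;; A symbol whose name contains a space is printed with an escape
;; character when it is written in read syntax ...
(prin1-to-string (intern "container title"))

;; ... but `symbol-name' and (format "%s" ...) return the bare name.
(symbol-name (intern "container title"))   ; => "container title"
(format "%s" (intern "container title"))   ; => "container title"

;; json.el: `json-key-type' controls the type of object keys.
;; Assumption: this is the option meant in the quoted text above.
(defun my/json-read-with-string-keys (json-text)
  "Parse JSON-TEXT with json.el, returning an alist with string keys.
Hypothetical helper, for illustration only."
  (let ((json-object-type 'alist)
        (json-key-type 'string))
    (json-read-from-string json-text)))

(my/json-read-with-string-keys "{\"container-title\": \"Some Journal\"}")
```

With string keys, lookups in the resulting alist need an =equal=-based test, e.g. =(assoc "container-title" entry)=, rather than =assq= or =alist-get= with its default =eq= test.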

>>> That would be easy to support, but IMHO is better handled in
>>> bibtex-completion:
>>> just parse the buffer and then call =gethash= on the resulting hash table. Or
>>> what use-case do you have in mind?
>>
>> One use case: bibtex-completion drops fields that aren’t needed early on to save
>> memory and CPU cycles. (Some people work with truly enormous bibliographies,
>> like crypto.bib with ~60K entries.) But this means that we sometimes have to
>> read an individual entry again if we need more fields that were dropped earlier.
>> In this case I’d like to be able to read just one entry without having to
>> reparse the complete bibliography.
>
> Makes sense. For .bib sources, this should be fairly easy to do. For .json, I
> can't really say how easy it would be. It's not difficult to find the entry key
> in the buffer, but from there you'd have to be able to find the start of the
> entry in order to parse it. Currently, I don't know how to do that.

Not a big deal. Since it’s just about individual entries and the code isn’t super central, we can easily hack something together.

>>>> - Functions for resolving strings and cross-references.
> [...]
>>> parsebib has a lower-level API and a higher-level API, and the latter does
>>> essentially what you suggest here. I thought bibtex-completion was already
>>> using it...
>>
>> Nope. I think the high-level API didn’t exist when I wrote my code in 2014.
>
> No, it didn't. I seem to remember, though, that you gave me the idea for the
> higher-level API, which is probably why I assumed you were using it.
>
> So that part of =parsebib= hasn't been tested much... (Ebib doesn't use it,
> either). If you do decide to start using it, please test it and report any
> issues you find. And let me know if I can help with testing.

The organically grown parsing code in bibtex-completion has been bugging me for a while, so I'm keen on rewriting it. But I may not get to it until the summer. I'll keep you posted when I start working on it.

Titus