From: Juan Manuel Macías
To: orgmode
Subject: [off topic] List all non-latin characters in a buffer
Date: Tue, 16 Aug 2022 15:19:57 +0000
Message-ID: <87leroqkpu.fsf@posteo.net>

Sorry for the off-topic post, but I thought this homemade function, which I wrote some time ago for my work, might be useful to some Orgers.

When executed in a buffer, the `list-non-latin-chars' function opens a window displaying a list of all the characters in that document that lie outside the Basic Latin block. Each item in the list contains the character, its Unicode name, and its hexadecimal code point. For example:

殿  CJK IDEOGRAPH-6BBF  #6bbf

Each item is also a button (created with button.el). When the button is activated, there are currently two options:

a: execute occur on that character in the document;
b: execute describe-char on that character.

By default, the list includes characters from any Unicode block other than Basic Latin, which means that the zero width space character, a very famous character on this mailing list, is included :-)

And here is the code (lexical binding is required). Of course, feedback is welcome.
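
As a quick illustration of the item format described above, separate from the full listing, here is a minimal standalone sketch. It works on a string instead of a buffer and only builds the "character, name, hexadecimal code" lines, without any buttons. The function name `my/non-ascii-entries' is hypothetical and is not part of the code in this message; treating "non (basic) Latin" as "code point above U+007F" is the same assumption the full listing makes.

#+begin_src emacs-lisp
;; Hypothetical helper, for illustration only.
(defun my/non-ascii-entries (string)
  "Return one \"CHAR  NAME  #HEX\" line per distinct non-ASCII char in STRING.
Entries are returned in order of first appearance."
  (let (seen)
    ;; Keep each character above U+007F once.
    (dolist (ch (string-to-list string))
      (when (and (> ch #x7f) (not (memq ch seen)))
        (push ch seen)))
    ;; Format each character with its Unicode name and hex code point.
    (mapcar (lambda (ch)
              (format "%c  %s  #%x"
                      ch
                      (get-char-code-property ch 'name)
                      ch))
            (nreverse seen))))
#+end_src

Evaluating (my/non-ascii-entries (buffer-string)) in a buffer should give roughly the lines that the list buffer displays, though possibly in a different order.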
Best regards,

Juan Manuel

#+begin_src emacs-lisp
;; -*- lexical-binding: t; -*-
;; Lexical binding is required: the button actions below are closures
;; over `buf' and `char'.

(defvar ext-chars-actions-list
  (list
   (list ?a "Occur"
         (lambda (buf char)
           (with-current-buffer buf
             ;; `occur' takes a regexp, so quote the character.
             (occur (regexp-quote char)))))
   (list ?b "Describe char"
         (lambda (buf char)
           (with-current-buffer buf
             (save-excursion
               (goto-char (point-min))
               ;; Describe the first occurrence of CHAR in the buffer.
               (when (search-forward char nil t)
                 (describe-char (1- (point)))))))))
  "Actions offered when a button is activated.
Each element has the form (KEY LABEL FUNCTION).  FUNCTION is called
with the source buffer name and the character as a string.")

(defun ext-chars-choose-action (buf char)
  "Prompt for an action in `ext-chars-actions-list' and apply it to CHAR in BUF."
  (let ((opt (read-char-choice
              (concat "Choose action >>\n\n"
                      (mapconcat (lambda (item)
                                   (format "%c: %s" (car item) (nth 1 item)))
                                 ext-chars-actions-list
                                 " --- "))
              (mapcar #'car ext-chars-actions-list))))
    (funcall (nth 2 (assq opt ext-chars-actions-list)) buf char)))

(defvar ext-chars-list nil
  "Non-ASCII characters (one-character strings) found by `list-non-latin-chars'.")

(defun list-non-latin-chars ()
  "List every non-ASCII character of the current buffer as a button."
  (interactive)
  (setq ext-chars-list nil)
  (let ((buf (buffer-name))
        entries)
    (save-excursion
      (goto-char (point-min))
      ;; Collect each distinct character outside U+0000-U+007F.
      (while (re-search-forward "[^[:ascii:]]" nil t)
        (add-to-list 'ext-chars-list (match-string-no-properties 0))))
    ;; Build one (CHAR . "CHAR  NAME  #HEX") pair per character.
    (setq entries
          (mapcar (lambda (char)
                    (let ((code (string-to-char char)))
                      (cons char
                            (format "%s  %s  #%x"
                                    char
                                    (get-char-code-property code 'name)
                                    code))))
                  ext-chars-list))
    (let ((temp-buf (format "*non latin chars in %s*" buf)))
      (when (get-buffer temp-buf)
        (kill-buffer temp-buf))
      (with-current-buffer (get-buffer-create temp-buf)
        ;; necessary for Arabic, Hebrew, etc.
        (setq bidi-display-reordering nil)
        ;; insert buttons list
        (dolist (entry entries)
          (let ((char (car entry)))
            (insert-button (cdr entry)
                           'action (lambda (_button)
                                     (ext-chars-choose-action buf char))))
          (insert "\n\n")))
      (pop-to-buffer temp-buf)
      (goto-char (point-min))
      (view-mode))))
#+end_src

-- 
--
------------------------------------------------------
Juan Manuel Macías

https://juanmanuelmacias.com

https://lunotipia.juanmanuelmacias.com

https://gnutas.juanmanuelmacias.com