emacs-orgmode@gnu.org archives
* Preserve formatting when copy/pasting from HTML
@ 2014-05-21 10:47 Tory S. Anderson
  2014-05-21 12:04 ` Bastien
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Tory S. Anderson @ 2014-05-21 10:47 UTC
  To: emacs-orgmode

We often read online articles with headings and sometimes subheadings. They may also include bold, italic, and hyperlinks, all of which are supported by Org. Is there any way to preserve this formatting if I copy-paste into org/emacs, the same way it's preserved when I paste into Word or into a Google Document/email? Or is this fundamentally difficult in emacs? It would be a tremendous feature. 

- Tory


* Re: Preserve formatting when copy/pasting from HTML
From: Bastien @ 2014-05-21 12:04 UTC
  To: Tory S. Anderson; +Cc: emacs-orgmode

Hi Tory,

torys.anderson@gmail.com (Tory S. Anderson) writes:

> We often read online articles with headings and sometimes
> subheadings. They may also include bold, italic, and hyperlinks, all
> of which are supported by Org. Is there any way to preserve this
> formatting if I copy-paste into org/emacs, the same way it's preserved
> when I paste into Word or into a Google Document/email? Or is this
> fundamentally difficult in emacs? It would be a tremendous feature.

It would be easy to code something that sometimes works, with a dumb
parsing of the most common HTML tags and rendering into Org syntax.
But this would be really fragile and doing it right is fundamentally
difficult.
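
To make the "sometimes works" case concrete, here is a minimal sketch of such a dumb tag rewrite using sed. The function name naive_html_to_org is made up for illustration and is not part of Org or Emacs; it assumes a sed that accepts -E, and it only handles tags that open and close on the same line with no tags nested inside them.

```shell
#!/bin/sh
# naive_html_to_org: hypothetical helper, not part of Org or Emacs.
# Reads HTML on stdin and rewrites a handful of common tags to Org syntax.
# Assumption: each tag pair sits on one line and contains no nested tags.
naive_html_to_org() {
  sed -E \
    -e 's#<h1[^>]*>([^<]*)</h1>#* \1#g' \
    -e 's#<h2[^>]*>([^<]*)</h2>#** \1#g' \
    -e 's#<(b|strong)>([^<]*)</(b|strong)>#*\2*#g' \
    -e 's#<(i|em)>([^<]*)</(i|em)>#/\2/#g' \
    -e 's#<a href="([^"]*)">([^<]*)</a>#[[\1][\2]]#g' \
    -e 's#</?p>##g'
}

# Usage: printf '%s\n' '<p>Some <b>bold</b> text</p>' | naive_html_to_org
```

The fragility shows up quickly: nested markup such as <b><i>x</i></b> comes out as <b>/x/</b>, and tags with extra attributes, tags split across lines, lists, and tables are not handled at all.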

I'd love to see someone prove that I'm too pessimistic, though.

Best,

-- 
 Bastien


* Re: Preserve formatting when copy/pasting from HTML
From: Albert Krewinkel @ 2014-05-21 12:06 UTC
  To: Tory S. Anderson; +Cc: emacs-orgmode

torys.anderson@gmail.com (Tory S. Anderson) writes:

> We often read online articles with headings and sometimes subheadings. They
> may also include bold, italic, and hyperlinks, all of which are supported by
> Org. Is there any way to preserve this formatting if I copy-paste into
> org/emacs, the same way it's preserved when I paste into Word or into a Google
> Document/email? Or is this fundamentally difficult in emacs? It would be a
> tremendous feature.

A suggestion for a workaround: You might get decent results using
Pandoc[1] and pandoc-mode[2].  Pandoc can parse HTML and convert it to
org-mode markup.  There is also a helpful answer on stackexchange[3]
(just replace "markdown" with "org").  You might be able to use the above
tools to integrate the mentioned techniques into org's capture mechanism.
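
As a rough illustration of the Pandoc route, the sketch below pipes an HTML fragment through pandoc with the HTML reader and the Org writer. The wrapper name html_to_org and the sample fragment are made up for illustration; it assumes pandoc is installed and on PATH, and the exact layout of the output may differ between pandoc versions.

```shell
#!/bin/sh
# html_to_org: hypothetical wrapper name, chosen here for illustration.
# Reads HTML on stdin and writes Org markup on stdout using pandoc's
# HTML reader (-f html) and Org writer (-t org).
html_to_org() {
  pandoc -f html -t org
}

# Sample fragment (made up) with a heading, bold, italic, and a link.
sample='<h1>Title</h1><p>Some <b>bold</b> and <i>italic</i> text with a <a href="http://orgmode.org">link</a>.</p>'

# Usage: printf '%s\n' "$sample" | html_to_org
```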

HTH,
Albert

[1] http://johnmacfarlane.net/pandoc
[2] https://github.com/joostkremers/pandoc-mode
[3] http://unix.stackexchange.com/questions/78395/

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124


* Re: Preserve formatting when copy/pasting from HTML
From: Ilya Shlyakhter @ 2014-05-21 18:52 UTC
  To: emacs-orgmode

torys.anderson@gmail.com (Tory S. Anderson) writes:

> We often read online articles with headings and sometimes
> subheadings. They may also include bold, italic, and hyperlinks, all
> of which are supported by Org. Is there any way to preserve this
> formatting if I copy-paste into org/emacs, the same way it's preserved
> when I paste into Word or into a Google Document/email? Or is this
> fundamentally difficult in emacs? It would be a tremendous feature.

There is an extension for doing this when copying from emacs-w3m into
org: http://orgmode.org/w/?p=org-mode.git;a=blob_plain;f=lisp/org-w3m.el;hb=HEAD


* Re: Preserve formatting when copy/pasting from HTML
From: Tory S. Anderson @ 2014-05-23 17:31 UTC
  To: Albert Krewinkel; +Cc: emacs-orgmode

I'm thoroughly impressed by pandoc. Quite the magnificent program! Using pandoc and xclip I was able to do what I wanted. As you mentioned, I am able to copy from, say, a Wikipedia page, and paste as (mostly) properly formatted Org markup. My code is:

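# Runs until interrupted. Each pass converts the clipboard's HTML to Org.
# -quiet keeps the last xclip in the foreground until another application
# takes over the clipboard, so the loop waits there for the next copy.
while :; do
  xclip -o -selection clipboard -t text/html |
    pandoc -r html -w org |
    xclip -i -selection clipboard -quiet
done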
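
A possible refinement, offered as a sketch only: when the copied data has no HTML flavor (plain text from a terminal, say), the loop above may replace the clipboard with empty output. The one-shot variant below first asks xclip which targets are on offer and falls back to plain text. The names has_html_target and clip_html_to_org are made up for illustration; it assumes xclip prints one target per line for -t TARGETS and that the source application advertises the target exactly as text/html.

```shell
#!/bin/sh
# has_html_target: hypothetical helper. Succeeds if the list of clipboard
# targets read from stdin (one per line) contains exactly "text/html".
has_html_target() {
  grep -qx 'text/html'
}

# clip_html_to_org: hypothetical helper. Prints the clipboard as Org markup
# when an HTML flavor is available, otherwise prints it unchanged.
clip_html_to_org() {
  if xclip -o -selection clipboard -t TARGETS | has_html_target; then
    xclip -o -selection clipboard -t text/html | pandoc -f html -t org
  else
    xclip -o -selection clipboard
  fi
}

# Usage (needs a running X session, xclip, and pandoc): clip_html_to_org
```

Because this prints to stdout instead of writing back to the clipboard, the original copy stays untouched and the output can be redirected or inserted wherever it is needed.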

Thanks!
- Tory


