emacs-orgmode@gnu.org archives
* Open Peer-Review Reproducible Publication with Org and GRASS
@ 2016-06-03 14:19 Ken Mankoff
From: Ken Mankoff @ 2016-06-03 14:19 UTC (permalink / raw)
  To: Org Mode, 'grass-user grass-user'

Hi Org and GRASS lists,

I just wanted to let these two lists know that I've just posted a paper written in Org and using GRASS (text-mode) and Python for the analysis. My goal was to create not just an open access publication, but a fully reproducible publication. This is an early announcement, and the paper may not pass peer review.

The Supplemental Material is the Org file with all the code to generate the document, beginning with downloading the 3rd party data that is input to our analysis, the GRASS code to perform the analysis, and the Python code to regenerate the figures.

I don't think I did a great job on the reproducible part because I have a highly customized .emacs, etc. All the information necessary to replicate the work should be in the Supplemental Material, but it might not be easy to do so. Anyway, I think it is a step in the right direction.

To make it easier to reproduce... including my emacs.org seems overkill. Including a Virtual Machine that contains everything, including my ~/.emacs.d/ and all the software and data seems like the right thing to do, but journals don't want to host a 20 GB VM with the publication.

Thanks to people on these two lists who have developed the software and helped me use it.

   -k.
   
http://www.the-cryosphere-discuss.net/tc-2016-113/

* Re: Open Peer-Review Reproducible Publication with Org and GRASS
@ 2016-06-03 14:57 ` Brett Viren
From: Brett Viren @ 2016-06-03 14:57 UTC (permalink / raw)
  To: Ken Mankoff; +Cc: 'grass-user grass-user', Org Mode

Thanks for your example.

A few ideas:

- When you begin developing your paper, or sometime before submission,
  make a break from your personal ~/.emacs.d/ environment and begin
  processing the .org in an explicitly configured Emacs session.  Submit
  the needed, minimal, paper-specific Emacs setup as part of the
  supplementary material.

- Bundle the document building into a shell script which calls Emacs so
  that you can ensure that your personal ~/.emacs.d/ is excluded and only
  the paper-specific Emacs setup is used.  It also helps users to rebuild
  the paper, especially if they are not yet Emacs aficionados.

- Instead of a multi-GB VM image, provide a few-kB Dockerfile which can be
  used to build a Linux container with base OS and all required
  applications needed to run the Babel code blocks.

- The Dockerfile could go so far as to create a user account, get the
  supplementary material from a repository or the publisher's web page,
  unpack and run the shell script which calls Emacs to build the
  document.  If you go this far then in principle just this Dockerfile
  is enough to reproduce the paper - but this will rely on some binaries
  to remain available (Docker base OS images and OS packages).

The reliance on long-term availability of the Docker base OS image and
binary packages is problematic for long term automated reproducibility.
However, even after those bits disappear from the 'net the Dockerfile
serves as a concise and explicit recipe for future humans to follow.
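
A minimal sketch of the build-script idea above, written in Python rather
than shell so that it matches the paper's analysis language.  The file
names paper.org and paper-init.el are hypothetical, Emacs is assumed to
be on the PATH, and a LaTeX/PDF export is assumed to be the build target.
The -Q option starts Emacs without the personal init file, so only the
paper-specific setup passed with -l is loaded.

```python
import subprocess
from pathlib import Path


def build_command(org_file, init_file):
    """Return the Emacs command line for an isolated batch export."""
    return [
        "emacs",
        "-Q",                    # skip the personal init file and site startup files
        "--batch",               # run non-interactively and exit when done
        "-l", str(init_file),    # load only the paper-specific setup (hypothetical file)
        str(org_file),           # visit the Org source
        "-f", "org-latex-export-to-pdf",  # export the visited buffer to PDF
    ]


def build_paper(org_file="paper.org", init_file="paper-init.el"):
    """Run the export; raises CalledProcessError if Emacs exits non-zero."""
    org_path = Path(org_file).resolve()
    init_path = Path(init_file).resolve()
    subprocess.run(
        build_command(org_path, init_path),
        check=True,
        cwd=org_path.parent,     # run next to the Org file so relative paths resolve
    )
```

Calling build_paper() from the directory holding the supplementary
material would then rebuild the document.  The paper-specific init file
would have to enable whatever Babel languages the Org file uses; that
part is not shown here.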

-Brett.




* Re: Open Peer-Review Reproducible Publication with Org and GRASS
@ 2016-06-06  9:08 ` Christian Moe
From: Christian Moe @ 2016-06-06  9:08 UTC (permalink / raw)
  To: Ken Mankoff; +Cc: Org Mode


This is really interesting on several levels. Thanks for posting.

Yours,
Christian



* Re: Open Peer-Review Reproducible Publication with Org and GRASS
@ 2016-06-06 20:11 ` Daniele Pizzolli
From: Daniele Pizzolli @ 2016-06-06 20:11 UTC (permalink / raw)
  To: Org Mode

On Fri, Jun 03 2016, Ken Mankoff wrote:

> Hi Org and GRASS lists,
>
> I just wanted to let these two lists know that I've just posted a
> paper written in Org and using GRASS (text-mode) and Python for the
> analysis. My goal was to create not just an open access publication,
> but a fully reproducible publication. This is an early announcement,
> and the paper may not pass peer review.

Hello Ken,

Thanks for sharing.  I have had only a brief look so far; here are some
early thoughts while the data downloads.

[...]

> To make it easier to reproduce... including my emacs.org seems
> overkill. Including a Virtual Machine that contains everything,
> including my ~/.emacs.d/ and all the software and data seems like the
> right thing to do, but journals don't want to host a 20 GB VM with the
> publication.

Well, you need only the recipe to build the VM.  Fortunately there are
a lot of tools that help to do this.  Unfortunately there are too
many, and not all of them focus on the reproducible side...

In the meantime I saw a few main problems in your current document:

- No "one-step build"
- Hardcoded paths, for example
  root = "/Users/mankoff/Documents/Papers/B/Brinkerhoff/"
- I did not find the GRASS installation part; I guess I will hit an
  error later...
- The LaTeX toolchain installation is missing
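
For the hardcoded path, one possible fix is to take the project root
from the environment and fall back to the current directory.  This is
only a sketch: the variable name PAPER_ROOT and the data subdirectory
are hypothetical and do not come from the Org file.

```python
import os
from pathlib import Path


def project_root(env_var="PAPER_ROOT"):
    """Return the project root from the environment, else the current directory."""
    value = os.environ.get(env_var)
    if value:
        return Path(value).expanduser().resolve()
    return Path.cwd()


root = project_root()
data_dir = root / "data"   # hypothetical subdirectory, for illustration only
```

A reader could then run the code from wherever the supplementary
material was unpacked, or set PAPER_ROOT explicitly, without editing
the source.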

Are you interested in fixing those issues?  If yes please share your
progress or ideas.

Best,
Daniele


* Re: Open Peer-Review Reproducible Publication with Org and GRASS
@ 2016-06-06 20:12   ` Daniele Pizzolli
From: Daniele Pizzolli @ 2016-06-06 20:12 UTC (permalink / raw)
  To: Org Mode

On Fri, Jun 03 2016, Brett Viren wrote:

> Thanks for your example.
>
> A few ideas:
>
> - When you begin developing your paper, or sometime before submission,
>   make a break from your personal ~/.emacs.d/ environment and begin
>   processing the .org in an explicitly configured Emacs session.  Submit
>   the needed, minimal, paper-specific Emacs setup as part of the
>   supplementary material.
>
> - Bundle the document building into a shell script which calls Emacs so
>   that you can ensure that your personal ~/.emacs.d/ is excluded and only
>   the paper-specific Emacs setup is used.  It also helps users to rebuild
>   the paper, especially if they are not yet Emacs aficionados.

Hello,

nice, but I suggest starting from scratch in a Virtual Machine.

I am currently trying to develop a generic, extensible template for
this.

So far I have only done custom-tailored installations.

When I have something interesting I will share it.  In the meantime I
will follow threads related to this topic closely!

> - Instead of a multi-GB VM image, provide a few-kB Dockerfile which can be
>   used to build a Linux container with base OS and all required
>   applications needed to run the Babel code blocks.
>
> - The Dockerfile could go so far as to create a user account, get the
>   supplementary material from a repository or the publisher's web page,
>   unpack and run the shell script which calls Emacs to build the
>   document.  If you go this far then in principle just this Dockerfile
>   is enough to reproduce the paper - but this will rely on some binaries
>   to remain available (Docker base OS images and OS packages).
>
> The reliance on long-term availability of the Docker base OS image and
> binary packages is problematic for long term automated reproducibility.
> However, even after those bits disappear from the 'net the Dockerfile
> serves as a concise and explicit recipe for future humans to follow.

I think that by using only Debian packages and the Debian archive you
will be able to reproduce the environment with little effort in the
future, but I have never tried hard.

I would suggest using Vagrant rather than Docker (because it is
simpler to start with, expressive enough, and supports free-software-only
providers such as libvirt), together with some shell scripts or Ansible
playbooks.

I do not know the licence of the data, but I suggest to make at least
one private mirror and use a variable to switch between the two.

You may also want to verify sha256sum and/or gpg signatures on the
data and on the code (that should be hosted using a source code
management system).
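
The mirror switch and the checksum verification could look roughly like
this.  It is a sketch under assumptions: the mirror URLs are
placeholders, the DATA_MIRROR variable name is hypothetical, and the
expected digests would have to be recorded by the author when the data
is first downloaded.  GPG signature checking is not covered.

```python
import hashlib
import os

# Hypothetical mirror URLs; replace with the real upstream and private copies.
MIRRORS = {
    "upstream": "https://example.org/data/",
    "private": "https://mirror.example.net/data/",
}


def data_url(filename, mirror=None):
    """Build the download URL; the mirror defaults to $DATA_MIRROR or upstream."""
    mirror = mirror or os.environ.get("DATA_MIRROR", "upstream")
    return MIRRORS[mirror] + filename


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Raise ValueError if the file's digest differs from the recorded one."""
    actual = sha256_of(path)
    if actual != expected_hex.lower():
        raise ValueError(f"{path}: expected {expected_hex}, got {actual}")
```

Running verify() right after each download makes a silently changed or
truncated input file fail loudly instead of producing different figures.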

Best,
Daniele


* Re: Open Peer-Review Reproducible Publication with Org and GRASS
@ 2016-06-06 23:01 ` Samuel Wales
From: Samuel Wales @ 2016-06-06 23:01 UTC (permalink / raw)
  To: Ken Mankoff; +Cc: grass-user grass-user, Org Mode

On 6/3/16, Ken Mankoff <mankoff@gmail.com> wrote:
> My
> goal was to create not just an open access publication, but a fully
> reproducible publication. This is an early announcement, and the paper may
> not pass peer review.

thank you, sincerely, for acting to help fix one of science's serious crises.

this stuff matters.  in biomedicine and related academic fields, it
matters profoundly.

ok, back to org/grass talk.

-- 
The Kafka Pandemic: http://thekafkapandemic.blogspot.com

The disease DOES progress.  MANY people have died from it.  And
ANYBODY can get it.

Denmark: free Karina Hansen NOW.
