Schedule: https://dataoneorg.github.io/provathon-2017/
Uses https://github.com/cloudyr/RoogleVision to send photos to Google’s Cloud Vision API and extract useful information from archaeological field photography
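For orientation, a minimal Python sketch of the kind of request a wrapper such as RoogleVision sends to the Cloud Vision API. This is not RoogleVision's own code (that package is R); it only builds the JSON body for the public `images:annotate` REST endpoint and stops short of sending it, since authentication depends on your own Google Cloud setup. The function name `build_annotate_request` and the placeholder bytes are made up for illustration.

```python
import base64

# Public REST endpoint of the Cloud Vision API (v1); a real call needs an
# API key or OAuth token, which is deliberately left out of this sketch.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_type="LABEL_DETECTION", max_results=10):
    """Build the JSON body for a single-image annotate request.

    Hypothetical helper for illustration; not part of RoogleVision.
    """
    return {
        "requests": [
            {
                # The API expects the image as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # One feature per kind of analysis wanted (labels, text, ...).
                "features": [{"type": feature_type, "maxResults": max_results}],
            }
        ]
    }


# Placeholder bytes standing in for a field photo read from disk.
body = build_annotate_request(b"not a real photo", feature_type="TEXT_DETECTION")
```

Swapping `feature_type` between `LABEL_DETECTION` and `TEXT_DETECTION` is the main knob for field photos: labels describe the scene, text detection picks up writing visible in the frame.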
Secret Life Of Data (SLO data)
Requesting participants for survey
Crosswalking: helping to translate between contexts, for example from R to Python
Rocker: Docker containers with R and other tools ready to go, minimal config
Using R Markdown, GitHub, Zenodo
“Our path to better science in less time using open data science tools” https://www.nature.com/articles/s41559-017-0160
http://ohi-science.org/
Organisational and cultural dimensions of reproducible research
What shifts are required to promote reproducible science? What might be some unintended consequences?
Provenance Editing via the web https://dataoneorg.github.io/provathon-2017/prov-editing/prov-editing.html
Tools for uploading Provenance in R https://nceas.github.io/oss-lessons/publishing-data/upload-data.html, demo files at https://github.com/NCEAS/oss-lessons
R Recordr package: https://github.com/NCEAS/recordr, vignette: https://raw.githubusercontent.com/NCEAS/recordr/master/vignettes/intro_recordr.Rmd
YesWorkflow: https://github.com/yesworkflow-org/yw-prototypes
Production version: https://github.com/yesworkflow-org/yw
Abstraction? Maybe gets another * if you use notebooks/rmd… hopefully commenting is getting better…
YW in the browser: http://absflow.westus.cloudapp.azure.com/
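A small sketch of what YesWorkflow markup looks like in a script, since the commenting question above hinges on it. YW reads structured comments (`@begin`, `@in`, `@out`, `@end`) and never executes the code, so the annotations are just comments to Python. The script, the block names and the CSV columns are all invented for illustration.

```python
import csv
import io

# @begin clean_measurements
# @in raw_text
# @out clean_rows


def clean_measurements(raw_text):
    # @begin parse_rows
    # @in raw_text
    # @out parsed_rows
    parsed_rows = list(csv.DictReader(io.StringIO(raw_text)))
    # @end parse_rows

    # @begin drop_missing
    # @in parsed_rows
    # @out clean_rows
    clean_rows = [row for row in parsed_rows if row["value"] != ""]
    # @end drop_missing
    return clean_rows


# @end clean_measurements
```

The nested `@begin`/`@end` pairs are what give YW a two-level view: one outer block with two steps inside it, connected through `parsed_rows`.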
…
Whole Tale: ‘give them the computer’ https://dashboard-dev.wholetale.org/
Whole Tale demo: https://dashboard-dev.wholetale.org/
currently supported front-ends: Jupyter, RStudio; custom front-ends will be easy to add; docker/machine config (later)
running RStudio in WT dashboard
after leaving the environment, the state is preserved; when reconnecting, the same state will be visible
How to deal with “idle time”? What is WT doing right now? How should it be?
Discussion about different use cases and modes of use
crowd-sourced use of shared tale
WT vs GitHub-based collaboration
pair programming
common use case:
publish work via Docker Hub
run via WT (without knowing anything about docker)
⇒ new use case!? If you have accounts and credentials for different cloud providers (e.g. Amazon), WT will take care of “patching you through”
expose lower-level info via log-file “dashboard-log.txt”
similar to std-error
lets you see which cloud resources have been used
What cloud providers are most popular with this community?
⇒ Feedback sought on the tale presentation in the UI:
category, title, abstract, …
how to search, browse, …
⇒ compare with https://rpubs.com/
search by username; contributor
how far to go down the road? e.g. see https://en.wikipedia.org/wiki/MyExperiment
more focused: search by datasets used
Chris describes possible “breaks in provenance chain”
extended file metadata / attributes
support for workflows through cobbled-together containers!?
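One possible reading of the “breaks in provenance chain” point above, sketched in Python: a step uses a file that no recorded step generated and that was never declared as an original input. This is an assumption about what was meant, not a summary of what Chris presented, and the record layout (`name`, `used`, `generated`) and the function name are hypothetical.

```python
def find_chain_breaks(steps, declared_inputs):
    """Return (step name, item) pairs where a used item has no recorded origin.

    steps: list of dicts with keys "name", "used" and "generated"
           (hypothetical layout, loosely modelled on PROV used/generated links).
    declared_inputs: set of items known to come from outside the workflow.
    """
    # Everything that some recorded step claims to have produced.
    generated = set()
    for step in steps:
        generated |= set(step["generated"])

    breaks = []
    for step in steps:
        for item in sorted(step["used"]):
            # Used, but neither produced by a recorded step nor declared as input.
            if item not in generated and item not in declared_inputs:
                breaks.append((step["name"], item))
    return breaks


example_steps = [
    {"name": "clean", "used": {"raw.csv"}, "generated": {"clean.csv"}},
    {"name": "plot", "used": {"clean.csv", "lookup.csv"}, "generated": {"fig.png"}},
]
example_breaks = find_chain_breaks(example_steps, declared_inputs={"raw.csv"})
```

Here `lookup.csv` is the break: the plot step depends on it, but nothing in the record says where it came from.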