Joseph M. introduced the issues around data storage and reproducibility, and started a conversation on producing guidelines and templates for reproducible research.

Data storage and R templates for research analyses

Joseph Moxon presented his take on data storage best practices when working in R, and started a conversation on producing R templates for research analyses.

Ira C.: Data are often too large to be stored in the same location as the scripts.

Cesar H.: Is in the process of submitting data. Looked for a good example of metadata in the Tropical Data Hub and couldn’t find one. Also looked at papers from his team and other JCU coworkers, and found incomplete data, unavailable code, mislabelled files, etc.

Wytamma W.: GitHub would be a great place to host private repositories where we can share code and comment on each other's scripts.

Where to from here?

Ira C.: As Joseph suggested, it would be good to subdivide into disciplines and produce guidelines. These could cover where to store the data used, what has to be included in the scripts, etc.

Joseph M.: It would be good to identify key areas that are likely to share similar needs and challenges (e.g., population genomics/phylogenomics, medical trials).

Expressions of interest

Peter C.: Keen to help produce guidelines for pop gen/genomic data.

Lorenzo B.: Keen to help produce guidelines for pop gen/genomic data.

Examples of reproducible data and useful resources

Legana F.: Just published a paper, and has a script on GitHub that reproduces everything done for the publication.

Cesar H.: Ten simple solutions for Digital data storage

Cesar H.: Shared his paper, which will have its code available once it has been peer-reviewed.

Lorenzo B.: Good example of pop gen data availability and reproducibility here, see supplementary information.

Coding challenges with Rosalind

Ira suggested adding an extra session (15 min on top of the usual 11 am to midday) to go over 1-2 Rosalind coding problems. Everyone seemed to agree. Peter C. will start a Slack channel for it.

Find the website here