
# Training on data management

Lidiski project. Tuesday 2023-12-05 9:30 CET.

By the end of the workshop, participants will have acquired the basic principles of efficient management of their research data and documents: avoiding duplication, handling multiple versions, writing appropriate documentation and following consistent naming practices.

## Specific objectives:

- Identify the different __requirements__ of data storage, analysis and visualisation.

- Implement __good practices__ for naming files and variables on your own data set.

- Organise your data according to __tidy__ principles.

- Start a __data dictionary__ for your data.
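
As an illustration of the last objective, here is a minimal sketch of how a data dictionary could be started from the header of a CSV file. It uses only the Python standard library. The function names, the output fields (`variable`, `type`, `n_missing`, `description`, `unit`) and the sample data are assumptions made for this example, not a prescribed format.

```python
import csv
import io


def guess_type(values):
    """Guess a coarse type from the non-empty values of one column."""
    if not values:
        return "unknown"
    for label, cast in (("integer", int), ("number", float)):
        try:
            for value in values:
                cast(value)
            return label
        except ValueError:
            continue
    return "text"


def start_data_dictionary(csv_text):
    """Return one dictionary entry per column of a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows if row[name] != ""]
        entries.append({
            "variable": name,
            "type": guess_type(values),
            "n_missing": len(rows) - len(values),
            "description": "",  # to be written by hand
            "unit": "",         # to be written by hand
        })
    return entries


# Hypothetical sample data, for illustration only.
sample = """animal_id,species,weight_g
1,NL,40.5
2,DM,
3,DM,48
"""

data_dictionary = start_data_dictionary(sample)
for entry in data_dictionary:
    print(entry)
```

The guessed types are only a starting point. The descriptions and units still have to come from someone who knows the data.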


License: [CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/).


[![slides](https://forgemia.inra.fr/umr-astre/training/data-management/-/raw/main/img/title-page.png?ref_type=heads)](https://umr-astre.pages.mia.inra.fr/training/data-management/)


## Preparation

Bring a copy of one or several of your data sets and project files to practise on.

If you don't have anything at hand, you can use [this one](https://ndownloader.figshare.com/files/2252083) (from [Data Carpentry](https://datacarpentry.org/spreadsheet-ecology-lesson/)).


## Practical exercise

Your mission: apply these principles to one of your projects.

- Re-organise your project files in a suitable directory structure

- Rename files using descriptive and systematic name structures

- Restructure a data set in a __tidy__ format

- Write the documentation (meta-data) of the data set
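
To make the tidy restructuring step more concrete, the sketch below reshapes a small "wide" table (one column per year) into a "long" one (one row per observation). It uses only the Python standard library. The function `wide_to_long` and the sample columns (`site`, `year`, `count`) are hypothetical names chosen for this example.

```python
import csv
import io


def wide_to_long(rows, id_columns, variable_name, value_name):
    """Turn each non-identifier column of every row into its own row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_columns:
                continue
            tidy_row = {key: row[key] for key in id_columns}
            tidy_row[variable_name] = column  # the former column header
            tidy_row[value_name] = value      # the former cell content
            long_rows.append(tidy_row)
    return long_rows


# Hypothetical wide table: the years are stored as column headers.
wide_csv = """site,2021,2022
A,12,15
B,7,9
"""

wide_rows = list(csv.DictReader(io.StringIO(wide_csv)))
tidy_rows = wide_to_long(wide_rows, ["site"], "year", "count")
for row in tidy_rows:
    print(row)
```

In the result, every variable (`site`, `year`, `count`) has its own column and every observation its own row. Values are left as text here, so converting them to numbers would be a separate step.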

__Document your own work__. Take some screenshots before and after your changes. Explain what you have changed and justify your decisions. This will be your __deliverable__ for the workshop.

Note: work on a __copy__ of your project, so that you feel free to experiment and make mistakes. There is no need to be exhaustive: it is simply an exercise. A few examples of each type of change will suffice.
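
For the renaming step, a dry run can help you review the new names before anything changes, even on a copy. The sketch below assumes a hypothetical convention of `date_project_description.extension` and uses only the Python standard library. The function names are illustrative, and the convention is one possible choice, not a rule of the workshop.

```python
import re
from pathlib import Path


def systematic_name(date, project, description, suffix):
    """Build a name such as 2023-12-05_lidiski_field-counts.csv."""
    def slug(text):
        # Lower-case, and replace each run of other characters by one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return f"{date}_{slug(project)}_{slug(description)}{suffix.lower()}"


def plan_renames(folder, date, project):
    """Return (old_path, new_path) pairs without renaming anything."""
    plan = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            new_name = systematic_name(date, project, path.stem, path.suffix)
            plan.append((path, path.with_name(new_name)))
    return plan


example = systematic_name("2023-12-05", "Lidiski", "Final data (v2) NEW", ".XLSX")
print(example)
```

Once the plan looks right, each pair could be applied with `old_path.rename(new_path)`. Keeping the plan itself (old name, new name) is also a simple way to document what was changed.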


## Some related tools


1. [Stats tips](https://umr-astre.pages.mia.inra.fr/guidelines/statstips/)

2. [dataspice](https://docs.ropensci.org/dataspice/index.html): An R package to facilitate the creation of metadata