Requirements: datalad must be installed (some elements will only run in bash), datalad needs ssh and token access to talk to the GIN instance, and the corresponding user should have owner rights to the G-Node organisation.
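A quick way to check the software part of these requirements is sketched below. The helper name missing_tools is hypothetical, and which tools to pass to it is an assumption based on the requirements above. It only checks that the commands are on the PATH, not that the token or the owner rights are in place.

```shell
# Print the name of every command (given as arguments) that is not on the PATH.
# Returns 0 when nothing is missing, 1 otherwise.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}
```

For example, calling missing_tools datalad git git-annex ssh should print nothing when everything is installed.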
Refer to the Research folder structure quick start guide, then clone the repository or download its content.
Make it a git repository and add submodules, or modify folders to become submodules.
Using datalad, you can do all that with:
datalad create -f -c text2git
datalad create -f -c text2git -d . 06_dissemination
datalad create -d . 03_data/001_rawdata
datalad create -d . 03_data/001_deriveddata
datalad create -c text2git -d . 04_data_analysis/001_analysiscode
datalad create -f -c text2git -d . 05_figures/990_shared_figures
Note that some of them use the text2git option, so that text files will be added to git and not to git-annex.
Because the tonic application does not support repositories without annex, we did not use the --no-annex option for the code repository; this might be done in the repositories later on (https://github.com/G-Node/tonic/issues/52).
You may have seen that the analysis code is a git-only repository: there will be no annexed files there. Alternatively, you may use only the text2git option with:
datalad create -c text2git -d . 04_data_analysis/001_analysiscode
Logic: create sets up the datalad and git-annex information in the folder; -c text2git tells it to put text files into git in that repository; -d . tells it that the superdataset (the parent repository) is located in the folder where the command is run; -f forces the creation if the folder already exists.
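Once the create commands above have run, each subdataset should be registered as a submodule in the .gitmodules file of the parent dataset. The following is a minimal sketch for checking this with plain git. The helper name unregistered_subdatasets is hypothetical, and it assumes that submodule names contain no spaces.

```shell
# Print the expected subdataset paths that are NOT registered in a
# .gitmodules file. Arguments: the .gitmodules file, then the expected paths.
unregistered_subdatasets() {
  gitmodules="$1"
  shift
  # Every "submodule.<name>.path" value is a registered subdataset path.
  registered=$(git config -f "$gitmodules" --get-regexp '^submodule\..*\.path$' | cut -d' ' -f2-)
  for expected in "$@"; do
    printf '%s\n' "$registered" | grep -Fxq -- "$expected" || echo "$expected"
  done
}
```

Called from the template root as unregistered_subdatasets .gitmodules 06_dissemination 03_data/001_rawdata (and so on for the other folders), it should print nothing when every folder has been registered.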
We will run git annex config --set annex.addunlocked true in some datasets. There, annexed files will not be locked when using datalad save or the scripts. This takes twice as much space on non-Windows computers, but allows people to change binary files without needing to unlock them first. We will also populate the readme files and make sure they are not git-annexed, by setting md files to be added to git in the repositories that do not have the “text2git” option.
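The md rule relies on the annex.largefiles attribute written to .gitattributes by the commands that follow. As a sketch, the attribute that applies to a given file can be inspected with git check-attr once those commands have run. The helper name largefiles_rule is hypothetical, and the check only reads .gitattributes, not any annex.largefiles value stored in the git configuration.

```shell
# Print the annex.largefiles attribute that applies to a file in a repository.
# "nothing" means the file is added to git, "unspecified" means that no
# .gitattributes rule matched the file.
largefiles_rule() {
  repo="$1"
  file="$2"
  git -C "$repo" check-attr annex.largefiles -- "$file" | sed 's/.*: annex\.largefiles: //'
}
```

For example, largefiles_rule 03_data/001_rawdata README_dataraw.md should report nothing after the .gitattributes line has been added there.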
git annex config --set annex.addunlocked true
cd 06_dissemination
git annex config --set annex.addunlocked true
cd ../05_figures/990_shared_figures
git annex config --set annex.addunlocked true
cd ../../
echo "**/*.md annex.largefiles=nothing" >> 03_data/001_rawdata/.gitattributes
echo "**/*.md annex.largefiles=nothing" >> 03_data/001_deriveddata/.gitattributes
echo "
Here comes the raw data (that is data coming directly from the research hardware used, or manually entered (for example in spreadsheets))
It is good practice to never modify files here; make copies in the derived data folder if you need to clean the data." >> 03_data/001_rawdata/README_dataraw.md
echo "
Here comes derived data, usually this repository would be:
- trashed if the raw data is published
- published if the raw data is archived
" >> 03_data/001_deriveddata/README_dataderived.md
echo "
# Figures
Put here figures/graphs you want to share or use in dissemination files (poster/presentation)
" >> 05_figures/990_shared_figures/README_figures.md
echo "
# code for analysis repository
Note: This is not a pure git repository.
If you want a pure git repository, create a new one manually and use only datalad scripts.
" >> 04_data_analysis/001_analysiscode/README_analysis-code.md
You will need to add an application token for your user, then create the sibling with
datalad create-sibling-gin G-Node/template_03 -s gin -r --api https://gindata.biologie.hu-berlin.de --existing reconfigure --credential juliencolomb
PS: you need to have an application token for datalad to do this.
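After the sibling has been created, each dataset should have a git remote called gin (the name given with -s above). A minimal sketch for spotting datasets where this is not the case, using plain git; the helper name dirs_without_remote is hypothetical.

```shell
# Print the directories that have no git remote with the given name.
# Arguments: the remote name, then the directories to check.
dirs_without_remote() {
  remote="$1"
  shift
  for dir in "$@"; do
    git -C "$dir" remote get-url "$remote" >/dev/null 2>&1 || echo "$dir"
  done
}
```

Called from the template root as dirs_without_remote gin . 06_dissemination 03_data/001_rawdata (and so on), it should print nothing if the recursive sibling creation reached every subdataset.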
The .gitmodules file needs to be corrected by hand at this point, because the URL entries will not be correct. Replace each of them with a relative URL of the form ../template_03-06_dissemination, and so on, changing every slash of the submodule path into a -.
open .gitmodules
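As an alternative to editing the file by hand, the same correction can be sketched with git config. The helper name fix_gitmodules_urls is hypothetical; it assumes the naming rule described above (../ followed by the template name, a -, and the submodule path with every slash turned into a -) and submodule names without spaces. Check the resulting file before saving.

```shell
# Rewrite the url of every submodule in a .gitmodules file to
# ../<prefix>-<path with each "/" replaced by "-">.
# Arguments: the .gitmodules file, then the prefix (e.g. template_03).
fix_gitmodules_urls() {
  gitmodules="$1"
  prefix="$2"
  # Collect the "submodule.<name>.path <path>" pairs before editing the file.
  entries=$(git config -f "$gitmodules" --get-regexp '^submodule\..*\.path$')
  printf '%s\n' "$entries" | while read -r key path; do
    [ -n "$key" ] || continue
    flat=$(printf '%s' "$path" | tr '/' '-')
    # "${key%.path}.url" is the url key of the same submodule section.
    git config -f "$gitmodules" "${key%.path}.url" "../${prefix}-${flat}"
  done
}
```

For this template the call would be fix_gitmodules_urls .gitmodules template_03, which turns the entry for 03_data/001_rawdata into ../template_03-03_data-001_rawdata.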
You may want to add scripts for updating the repositories; see the script section of this website, or go directly to the GitHub page at https://github.com/tonic-team/synchronisation_scripts
DO NOT USE the scripts to push the template, as scripts usually add other elements (extra submodules for example) you do not want here.
You may want to add extra information and readme files, especially in the newly created submodules (the ones created without -f above).
Putative changes:
- .Rprofile file to give info to RStudio users
You will first need to save your changes:
datalad save -r -m "template created"
Then push your changes:
datalad push --to gin -r
Since you may have made several saves at this point, we want to clean the history, so that new repositories created from the template start with 1 or 2 commits.
To do so, we will rename the master branch to “oberste”, delete the master branch on gin, and create a new branch:
Let’s first rename the master branch:
git branch -M oberste
git push --set-upstream gin oberste
git push -d gin master
Now, let’s create a new branch with an empty history:
git checkout --orphan a_main
git add -A
git commit -m "Template initialisation"
git push --set-upstream gin a_main
Then make the a_main branch the default branch in the browser (Settings::Branches).
Note: If an a_main branch already exists, you will need to delete it first:
git branch -d a_main
git push -d gin a_main
The clean branch is called a_main because, in the absence of a “master” branch, GIN takes the first branch in alphabetical order as the default branch.
Depending on the tool you use to clone the repositories created from the template, you may have to check out the a_main branch to see the content.
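To confirm that the orphan branch really starts from a clean history, the number of commits reachable from each branch can be counted with git rev-list. This is a minimal sketch; the helper name commit_count is hypothetical.

```shell
# Print the number of commits reachable from a branch.
# Arguments: the repository folder, then the branch name.
commit_count() {
  git -C "$1" rev-list --count "$2"
}
```

Run in the template root, commit_count . a_main should print 1 right after the "Template initialisation" commit, while commit_count . oberste still reports every save made while building the template.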