5 Create a project
5.1 Create a GitHub repository
To avoid any conflict between RStudio and GitHub, let’s create your repository on GitHub first.
Go to your GitHub page to create a new repository.
Add a creative name and a description to your repository. Please leave it as public, since that will allow you to create a GitHub page to share your work with all of us.
After that, we are going to copy the https link of your repository to clone it with RStudio. Just click on the red button.
Then let’s move to RStudio.
To clone (download) your repository from GitHub, click on the Project button at the top right of RStudio (1˚), then on New Project....
A window named New Project Wizard will appear. Select Version Control, then Git.
This will open the Clone Git Repository window. Here, just paste the link that I asked you to copy earlier into the Repository URL: field. At Create project as subdirectory of:, select a directory that is easy for you to access, which will make it easier to locate later.
I suggest you create a folder in your Documents to keep your workflowr projects organized.
5.2 Create a workflowr project
The workflowr package helps you organize an analysis, aiming to improve project management, reproducibility, and teamwork. It works with the version control software Git. Git saves all the changes that you make to your project along the way, so you can easily go back to older versions or track your changes and bugs.
So let’s start by installing the workflowr package:
install.packages("workflowr")
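If you rerun this setup often, a small guard avoids reinstalling the package every time. The sketch below is one possible way to do it; ensure_package is a hypothetical helper name used only for illustration, not part of workflowr.

```r
# Hypothetical helper: install a package only when it is missing,
# then report (invisibly) whether it can now be loaded.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Only trigger a possible installation when working interactively.
if (interactive()) {
  ensure_package("workflowr")
}
```

The helper returns TRUE when the package is available, so it can also be used as a quick check at the top of a script.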
5.2.1 Starting your workflowr project
Let’s start by loading the workflowr package, and then run its wflow_git_config function:
library(workflowr)
wflow_git_config(user.name = "YourGitHubUserName", user.email = "YourGitHubEmail")
The wflow_git_config function saves the username and email linked to your GitHub account. They are required to let you push the changes in your project. This configuration is only necessary once per computer.
Then we are going to create the workflowr directory structure with the wflow_start function. Note that you already have a main directory for your project, but if you follow my steps everything will be fine.
wflow_start(directory = ".",
name = "YourRepositoryName",
git = TRUE,
existing = TRUE)
Obs.:
- The dot . represents your working directory. It tells workflowr to create the new folders in your working directory, not in a new folder.
- Use the same name for your project as for your GitHub repository.
- The git and existing arguments tell workflowr that you will use Git as version control and that the folder already exists, respectively.
wflow_start will provide the following template of subdirectories:
myproject/
|-- .gitignore
|-- .Rprofile
|-- _workflowr.yml
|-- analysis/ # This is the most important folder:
| | # it stores all the R Markdown files
| | # with your analyses for this project.
| |-- about.Rmd
| |-- index.Rmd # This Rmd file generates the homepage of your
| | # website. Here you can describe the project
| | # and link to the Rmd files with your
| | # analyses.
| |-- license.Rmd
| \-- _site.yml # This file controls the look of your website:
| # layout, theme, navigation bar, ...
|-- code/ # Store here any code that is not appropriate to include
| | # in your Rmd files, or functions that you wrote and
| | # will call from the analysis using source().
| \-- README.md
|-- data/ # Add all your raw data files here.
| \-- README.md
|-- docs/ # This folder stores all the html pages created from your Rmd
| # files. IT SHOULD NOT BE EDITED BY THE USER.
|-- myproject.Rproj
|-- output/ # Save all the output from your analysis here,
| | # like data, results, figures, ...
| | # Even pre-processed data files should be saved here.
| \-- README.md
|-- README.md
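To confirm that the folders in the template above were actually created, a quick check from the R console can help. This is a minimal sketch in base R; missing_workflowr_dirs is a hypothetical helper name, and the list of expected folders is simply copied from the template.

```r
# Hypothetical helper: report which of the expected workflowr folders
# are missing from a project directory.
missing_workflowr_dirs <- function(project_dir = ".") {
  expected <- c("analysis", "code", "data", "docs", "output")
  present <- dir.exists(file.path(project_dir, expected))
  expected[!present]
}

# An empty character vector means every expected folder was found
# in the current working directory.
missing_workflowr_dirs(".")
```

If the result lists any folder names, check that your working directory is the project folder you cloned from GitHub.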
workflowr also provides a template format for your Rmd files that can be used to create your GitHub Pages websites, like this one!
You can find more ideas on how to customize the theme and layout of your project website here.
5.3 Tidyverse functions
There are lots of great resources online for learning the basic tidyverse functions.
Here you will find a lot of cheat sheets of the wonderful world of tidyverse and so much more.
5.3.1 Code chunks
In R Markdown files, your R code must be inside a code chunk for RStudio to recognize it as code.
So what is a code chunk?
Here's one:
```r
dim(iris)
#> [1] 150 5
```
You can also use R code in the middle of a sentence, as in: 2 + 2 is 4. All you need to do is write your code surrounded by a pair of back-ticks, starting with the letter r, like this:
# Two plus two equals `r 2 + 2`
R Markdown allows you to create chunks for several programming languages, like Python.
In RStudio there is a +c button in the toolbar below your Rmd file name. Try it to see which other languages you can use in an Rmd file.
5.3.2 Hotkeys
It is pretty critical to learn a few of these hotkeys, especially the following:
macOS | Windows | Action |
---|---|---|
CMD + Option + I | Ctrl + Alt + I | create chunk |
Shift + CMD + M | Shift + Ctrl + M | %>% pipe operator |
Option + - | Alt + - | <- assignment operator |
CMD + Enter | Ctrl + Enter | submit (run) lines of code in your Rmd or R script to the console |
The magrittr package has several operators that are very useful for managing data.
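As a small illustration of the %>% operator from the table above, the sketch below computes the same value with nested calls and with the pipe. It assumes the magrittr package is installed; the numbers are arbitrary.

```r
library(magrittr)

# Without the pipe: nested calls are read from the inside out.
nested <- sum(sqrt(c(1, 4, 9)))

# With the pipe: each step is read from left to right.
piped <- c(1, 4, 9) %>% sqrt() %>% sum()

# Both approaches give the same result.
identical(nested, piped)
```

The piped version tends to be easier to read once an analysis chains more than two or three steps.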
5.4 Using Rmarkdown
Here are some guides to improve your Rmd writing. You can use headers, add emphasis, create tables, include a figure, and add links to useful websites.
- Markdown Basic Syntax
- R Markdown Reference Guide - From RStudio.
5.5 Using workflowr
We will open the index.Rmd file using the wflow_open function:
wflow_open("analysis/index.Rmd")
In this file you can update the title of the index page and start writing the main objectives of this repository, like:
This repository was created to assist my learning experience with GitHub and workflowr.
My first R code at this project will be at this [git hub page](PCA.html)
That’s great, but we still do not have the PCA.html file, so let’s create its source file with the wflow_open function.
wflow_open("analysis/PCA.Rmd")
That should create the PCA.Rmd file and open it in RStudio, so you should be looking at it now.
You can update the title, replacing the abbreviation with Principal Components Analysis, and add a new intro for the analysis that we are going to do in this R Markdown file.
You can follow the example of this website WorkFlowRExample.
As we already have some changes in our project, we can update our repository by running wflow_status and wflow_publish (sending the commits to GitHub is covered in section 5.6).
- wflow_status checks whether any files in the analysis folder have changed and require the html pages to be built again, and reports any new, deleted, or modified files in your repository, always comparing with the last version (commit).
- wflow_publish commits (saves, takes a snapshot of) the changes to the Rmd files in the analysis folder, then creates or updates the html files and figures, and commits these new html files and figures as well.
After running wflow_status, you should see something like this.
Status of 4 Rmd files
Totals:
3 Unpublished
1 Scratch
The following Rmd files require attention:
Unp analysis/about.Rmd
Unp analysis/index.Rmd
Unp analysis/license.Rmd
Scr analysis/PCA.Rmd
Key: Unp = Unpublished, Scr = Scratch (Untracked)
The current Git status is:
status substatus file
untracked untracked .DS_Store
untracked untracked 2.1 Script Var BLUPs.R
untracked untracked Data_Crosses_Density_chart.txt
untracked untracked Parentais selecionados.xlsx
To publish your html website using wflow_publish, you will need to provide a short message that will be attached to the Git commit.
wflow_publish(files = "analysis/*.Rmd", message = "Test")
Current working directory: /Users/lbd54/Documents/GitHub/CassavaReproductiveBarriers
Building 3 file(s):
Building analysis/about.Rmd
log directory created: /var/folders/33/g0c9br3d0rx_bvhf9jsc0t9mcdw1j5/T//RtmphiTKma/workflowr
Building analysis/index.Rmd
Building analysis/license.Rmd
Summary from wflow_publish
**Step 1: Commit analysis files**
No files to commit
**Step 2: Build HTML files**
Summary from wflow_build
Settings:
combine: "or" clean_fig_files: TRUE
The following were built externally each in their own fresh R session:
docs/about.html
docs/index.html
docs/license.html
Log files saved in /var/folders/33/g0c9br3d0rx_bvhf9jsc0t9mcdw1j5/T//RtmphiTKma/workflowr
**Step 3: Commit HTML files**
Summary from wflow_git_commit
The following was run:
$ git add docs/about.html docs/index.html docs/license.html docs/figure/about.Rmd docs/figure/index.Rmd docs/figure/license.Rmd docs/site_libs docs/.nojekyll
$ git commit -m "Build site."
The following file(s) were included in commit 96ce162:
docs/about.html
docs/index.html
docs/license.html
However, I prefer to create/update the html files using the Knit button and then commit them myself. This strategy reduces the number of commits in your repository, which makes it easier to find an older version. The Knit button also lets you check whether your website looks as you expected without having to commit each time you rebuild it.
You can ask RStudio to create your html website by pressing the Knit button, as shown below.
RStudio will create/update your html file and save it in the docs folder. After you repeat this step for all your Rmd files and check that all of them look as you expected, you can commit these changes and send them to GitHub. See the next section.
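If you prefer the console to the Knit button, workflowr's wflow_build function also renders the html files without creating a commit. The lines below are a minimal sketch, assuming your working directory is the project root and that analysis/PCA.Rmd exists as created earlier in this chapter.

```r
library(workflowr)

# Only attempt a build inside a workflowr project (assumed to be the
# current working directory, identified here by its _workflowr.yml file).
if (file.exists("_workflowr.yml")) {
  # Render one page into docs/ without committing anything.
  wflow_build(files = "analysis/PCA.Rmd")
}
```

Like the Knit button, this leaves the commit step to you, so the number of commits in the repository stays under your control.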
5.6 Using Git to save your updates at GitHub
Git has four main operations:
- clone: copies your repository to a specific directory on your computer.
- pull: updates the cloned repository on your computer with the new updates in your GitHub repository.
- commit: saves a version of your repository with your new code, files, and outputs. It does not send it to GitHub.
- push: sends all your new commits/updates in the project to GitHub. After pushing your repository to GitHub you can share or clone your updates on any computer.
As a good committing practice, commit your updates only after you finish your work or a part of the project, so you will reduce the number of commits in your project.
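The pull, commit, and push operations listed above can also be run from the R console with workflowr's wflow_git_pull, wflow_git_commit, and wflow_git_push functions. The sketch below only previews each step by setting dry_run = TRUE; the file name and commit message are illustrative assumptions, and it presumes the project is already linked to a GitHub repository.

```r
library(workflowr)

# Run only inside a workflowr project (assumed to be the current
# working directory, identified here by its _workflowr.yml file).
if (file.exists("_workflowr.yml")) {
  # pull: preview fetching the latest changes from GitHub.
  print(wflow_git_pull(dry_run = TRUE))

  # commit: preview saving a snapshot of one file (example file and message).
  print(wflow_git_commit(files = "analysis/PCA.Rmd",
                         message = "Add PCA analysis",
                         dry_run = TRUE))

  # push: preview sending the local commits to GitHub.
  print(wflow_git_push(dry_run = TRUE))
}
```

Removing dry_run = TRUE performs the actual operation, so it is worth reading the preview first.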
5.6.1 Git in RStudio
To commit your updates, just click on the commit button in this menu. This will open a window called RStudio: Review Changes.
In this window, you can stage (confirm) the changes that you made in all the files. You can make this decision per chunk: you just have to decide whether to stage chunk (keep the changes) or discard chunk (keep the file as it was at the last commit).
DO NOT FORGET TO WRITE A SHORT MEANINGFUL MESSAGE ABOUT THE NEW CHANGES FOR THIS COMMIT.
Just click commit and then push your new commit to GitHub by clicking on the green arrow.
If it is your first time pushing a commit to GitHub from your computer, RStudio will ask for your GitHub user and a password. The password you should provide is a personal access token. This link explains what you need to do to generate one.
Remember to save this token in a safe place, since you might need it again.
5.7 Publishing on GitHub (Pages)
OK, your project is already on GitHub, but now we need to tell GitHub how to build your website, so let’s go to your GitHub repository. GitHub link
On your repository page, click on Settings.
Then select the Pages section in the sidebar menu.
You will see a section called Source. GitHub needs to know which branch, and which folder inside this branch, holds your html files. So click on the None button and select Branch: main, then in the new window with a folder symbol select the /docs folder, and save.
Congratulations, your website will be created; just wait a few minutes. The link will appear in a box similar to this one.
Your site is ready to be published at
https://YourUserName.github.io/YourRepositoryName/
Copy this link and go back to your repository page by clicking on your repository name, UserName/RepositoryName.
On the right side of the page there is a section called About with a gear. Click on the gear, paste your website link into the Website field, and save the changes.
Now everyone who has access to your repository can see your project website just by clicking on the link provided in the About section.
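The website address follows the pattern shown in the GitHub message above, so you can also build it yourself from your user name and repository name. A minimal sketch in base R; github_pages_url is a hypothetical helper name, and the two arguments are the same placeholders used in this chapter.

```r
# Hypothetical helper: build the GitHub Pages address for a project site
# from the GitHub user name and the repository name.
github_pages_url <- function(user, repo) {
  sprintf("https://%s.github.io/%s/", user, repo)
}

github_pages_url("YourUserName", "YourRepositoryName")
#> [1] "https://YourUserName.github.io/YourRepositoryName/"
```

This is handy when you keep several workflowr projects and want to list all their websites in one place.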