T32 Training Sessions
1
Introduction
1.1
Acknowledgements
1.2
License
Session I
2
The Center for Data and Bioinformation Services
2.1
Visit Our New Web Portal
2.2
Data Management Planning
2.3
Data Collection
2.4
Finding Data
2.5
Data Sharing and the UMB Data Catalog
2.6
Workshops
2.7
Data Analysis Support
2.8
Working with Molecular Data
2.9
Research Computing
2.10
Expert Consultations
2.11
Project Contribution
3
Best Practices for Research Data Management
3.1
Why Data Management?
3.1.1
Funding Agency Requirements
3.1.2
Publisher Requirements
3.1.3
Why Data Management? – What’s in it for me?
3.1.4
Don’t end up here!
Data Management Best Practices
3.2
Data Management Planning
3.2.1
Data Lifecycle
3.2.2
Planning
3.2.3
Data Management Plans (DMPs)
3.2.4
Data Management Workflows
3.3
Data Collection
3.3.1
Variables - Best Practices
3.3.2
Data Documentation
3.4
File Organization
3.4.1
File Naming Issues
3.4.2
File Naming Conventions
3.4.3
File Naming Recap
3.5
Storage
3.5.1
Storage Solutions at UMB
3.5.2
Cloud Storage Options
3.5.3
Backup Considerations
3.5.4
Security Considerations
3.6
Preservation
3.6.1
Preservation Issues
3.6.2
Open Software Formats
3.6.3
Data Formats
3.7
Providing Access
3.7.1
Why Share Data?
3.7.2
Data Sharing Challenges
3.7.3
Providing Access to Data
3.7.4
Data Repositories
3.7.5
UMB Data Catalog
3.7.6
Data Publishing
3.8
Conclusion
3.9
Attributions
3.10
Photo References
Session II
4
Introduction to R and RStudio
4.1
Learning Objectives
4.2
Why learn R?
4.3
Starting out in R
4.3.1
Downloading, Installing and Running R
4.3.2
RStudio
4.4
Working in the Console
4.5
Objects
4.5.1
Creating Objects
4.6
Saving code in an R script
4.6.1
Setting your Working Directory
4.7
Functions and their arguments
4.7.1
Challenge: using functions
4.7.2
Getting Help
4.8
Packages
4.9
Vectors
4.9.1
Missing Data
4.9.2
Mixing types
4.9.3
Indexing and Subsetting vectors
4.9.4
Data frames and tibbles
5
Welcome to the Tidyverse
5.1
Install
5.2
The Data
5.3
Importing data
5.4
Working with columns
5.4.1
select()
5.4.2
Renaming columns
5.4.3
Creating new columns with mutate()
5.5
Working with rows
5.5.1
filter()
5.5.2
Grouping and Summarizing data
5.6
Plotting with ggplot2
6
Joining Datasets
6.1
Long vs Wide formats
Session III
7
Reproducible Project Management
7.1
RStudio Projects
7.1.1
What is Real?
7.1.2
Where does your analysis live?
7.1.3
Creating an RStudio project
7.2
Version Control and RStudio
7.2.1
Why Git?
7.2.2
What’s GitHub?
7.3
Setting up a remote repository on GitHub
7.4
Connecting RStudio to GitHub
7.4.1
Introduce yourself to Git
7.5
Get a personal access token (PAT)
7.5.1
Create the PAT
7.5.2
Put your PAT into the Git credential store
7.6
Checking out a project from a version control remote repository
7.6.1
Clone the new GitHub repository to your computer via RStudio
7.7
Make some changes, save, commit
7.8
Push your local changes online to GitHub
7.9
Confirm the local change propagated to the GitHub remote
7.10
Clean up
8
Reproducible Reports with R Markdown
8.1
What is R Markdown
8.2
R Markdown Related Packages
8.3
How does R Markdown work
8.3.1
Creating an R Markdown file
8.3.2
R Markdown Basic Components
8.3.3
Markdown
8.3.4
R code chunks
8.4
Resources for R Markdown
9
Shiny Apps
9.1
Shiny app basics
9.2
Create an empty Shiny app
9.2.1
Alternate way to create a Shiny app: separate UI and server files
9.2.2
Let RStudio fill out a Shiny app template for you
9.3
An example Shiny App
9.4
Host your Shiny App
9.5
Resources
Appendix
A
Further Resources
A.1
Installing R and RStudio
A.2
The Tidyverse
A.3
Graphing in R
A.4
R Based Technologies
B
Selected Glossary of R Terminology
C
Packages and Functions Used
T32 Working with Data Training
3.10
Photo References