2.7 Data Analysis Support

Software evaluation
R programming language
Data wrangling
Web resources for Statistics

Now I will talk about how we support data analysis. Typically, data analysis is thought of as performing calculations on a dataset to reveal underlying information, and that is accurate. When we talk about data analysis, though, we are referring to a broader process of working with data that includes software evaluation and selection, data collection, data organization, data wrangling, exploratory data analysis, calculation of statistics, inference and modeling, and data visualization.

When it comes to analyzing your data, you could use pen and paper, but most people use software because, as a dataset grows, the analysis can become very involved. We have expertise in selecting software for a wide array of data analysis tasks. Some software is general-purpose, meaning it can handle many tasks, and some is very specific, developed to work with a particular type of data. Keep this in mind when you’re deciding which software to use.

One piece of software we routinely teach, and encourage researchers to learn and use, is the R programming language. There are many reasons why it’s a great choice. Some of the important ones to consider are: 1) It’s open source, which means it’s free to use, so there are no license fees. 2) It’s a mature programming language, so there is a great deal of support on the web for nearly any type of data analysis you can imagine doing. On top of that, the packages that expand R’s capabilities are typically written by experts in their fields, many of whom work in the health sciences, so it’s well suited to what you’re doing. 3) Reporting and reproducibility. The rmarkdown package gives you the ability to create a reproducible workflow of your data analysis (a record of how you analyzed your data) and create a report at the same time. This is very powerful because, if you need to modify part of your analysis, you are merely changing a small bit of code and re-rendering the report instead of redoing the whole analysis by hand. On top of that, you can easily share your report with colleagues, which makes it easier to communicate what you found during your data analysis.

Fundamentally, data wrangling is the process of transforming or preparing data for analysis. It is known to be one of the most time-consuming parts of data analysis. We think the time required can be reduced with some forethought, especially if you collect data in such a way that it is already being structured for analysis as it’s collected. How you structure your data collection will determine how swiftly you can get through the wrangling process. We would love to talk to you about data wrangling and how to structure your data for efficient downstream analysis and visualization.
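
To make the point about structure concrete, here is a minimal sketch in base R. It assumes a made-up collection sheet with one column per visit; the data frame, the column names (patient_id, visit1_sbp, visit2_sbp) and the values are all hypothetical and only there for illustration. It shows the kind of reshaping that is needed when data is collected "wide", and how simple a summary becomes once each row holds a single observation.

```r
# Hypothetical "wide" collection sheet: one row per patient,
# one column per visit. All names and values are made up.
wide <- data.frame(
  patient_id = c("P01", "P02", "P03"),
  visit1_sbp = c(120, 135, 118),
  visit2_sbp = c(118, 130, 121),
  stringsAsFactors = FALSE
)

# Wrangling step: rebuild the data in "long" form,
# one row per patient per visit.
long <- data.frame(
  patient_id = rep(wide$patient_id, times = 2),
  visit      = rep(c(1, 2), each = nrow(wide)),
  sbp        = c(wide$visit1_sbp, wide$visit2_sbp),
  stringsAsFactors = FALSE
)

# Once the data is long, a per-visit summary is a single call.
visit_means <- aggregate(sbp ~ visit, data = long, FUN = mean)
print(visit_means)
```

If the data had been recorded in the long layout from the start, the middle step would not be needed at all, which is the kind of time saving we have in mind.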

At this time, we do not offer direct support for choosing the statistical methods that will best explain what is significant in your findings. However, we have pointed many researchers to resources on the web for choosing the right statistics, and we’d be happy to do the same for you.