July 2021
July 1
If you share code plus data, it's a good idea to adopt a few defensive techniques to ensure that the data are what the code expects.
- R Function A Day (@rfunctionaday) July 1, 2021
The {assert_that} function from {assertthat} 📦 provides just the tool! https://t.co/kQT74S0AQX #rstats #DataScience pic.twitter.com/ScTmUxQxnr
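As a rough sketch of what such a defensive check might look like: the data frame `scores` and the helper `check_scores()` below are made up for illustration, and the conditions are only examples of what a script could verify before using shared data.

```r
library(assertthat)

# Hypothetical data frame standing in for the shared data
scores <- data.frame(id = 1:3, score = c(0.2, 0.5, 0.9))

# Hypothetical helper: every condition must be TRUE,
# otherwise assert_that() stops with an informative error
check_scores <- function(df) {
  assert_that(
    is.data.frame(df),
    has_name(df, "score"),
    is.numeric(df$score),
    noNA(df$score)
  )
}

ok <- check_scores(scores)
```

Calling `check_scores()` at the top of an analysis script makes the code fail early, rather than produce silently wrong results, when the data do not match these expectations.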
July 2
Either for aesthetic or for highlighting purposes, you may sometimes wish to draw borders around legend keys in {ggplot2} plots. 🖼️
- R Function A Day (@rfunctionaday) July 2, 2021
The {keybox} function from {ggfun} 📦 does exactly this, easily and flexibly https://t.co/dBAjEqKzY2 #rstats #DataScience pic.twitter.com/mvFxIeInyj
July 3
If you don't use RMarkdown and copy-paste software output to report statistics, you'd want to check that no errors were made in the process.
- R Function A Day (@rfunctionaday) July 3, 2021
The {statcheck} function from the eponymous 📦 does this (for single or multiple files)! https://t.co/HdW6CKKPO8 #rstats #DataScience pic.twitter.com/tgC5ysvz6T
July 4
Correctly specifying a distribution family for a regression model can improve estimate accuracy. But what if we're unsure?
- R Function A Day (@rfunctionaday) July 4, 2021
The {check_distribution} function from {performance} 📦 uses Random Forest to help you reconsider the choice ⚠️ https://t.co/qD7cQvNLLz #rstats #DataScience pic.twitter.com/MLhkCT2cwh
July 5
While scraping web data in R, sometimes we may wish that the extracted text layout mimics its browser/HTML behavior (e.g., ignore whitespace).
- R Function A Day (@rfunctionaday) July 5, 2021
The {html_text2} function from {rvest} 📦 helps with exactly this! https://t.co/QJwe3nXx43 #rstats #DataScience pic.twitter.com/T7NU9MYKqL
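To make the difference concrete, here is a small sketch that compares `html_text()` with `html_text2()` on a made-up HTML snippet (the snippet itself is an assumption, not taken from the tweet).

```r
library(rvest)

# Made-up HTML with irregular whitespace and a <br> line break
page <- minimal_html("<p>Hello    world,
      this is   R.<br>Second line</p>")
para <- html_element(page, "p")

raw_text      <- html_text(para)   # keeps the source whitespace as written
rendered_text <- html_text2(para)  # collapses whitespace, turns <br> into a newline
```

Printing both with `writeLines()` shows how the second version reads the way the paragraph would appear in a browser.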
July 6
Simple slopes analyses can help understand interaction effects in linear regression.
- R Function A Day (@rfunctionaday) July 6, 2021
The {sim_slopes} function from {interactions} 📦 provides an easy way to both run and visualize this analysis for 2-way or 3-way interactions! https://t.co/4Xvc4FMrEr #rstats #DataScience pic.twitter.com/70AFS2QQcF
July 7
A histogram is a good visualization to represent the distribution of numeric data.
- R Function A Day (@rfunctionaday) July 7, 2021
The {gghistostats} function from {ggstatsplot} 📦 provides ready-made histograms (with additional descriptive and inferential statistics) https://t.co/zSmT4MBGjX #rstats #DataScience pic.twitter.com/xMFwXedQSO
July 8
If you want to use non-standard fonts or characters, getting graphics devices to work with them can be a pain.
- R Function A Day (@rfunctionaday) July 8, 2021
The {showtext_auto} function from {showtext} 📦 supports a large collection of font formats and graphics devices! https://t.co/bt3SzZfyhI #rstats #DataScience pic.twitter.com/bYmyfZEfhL
July 9
Sometimes we may wish to provide descriptive labels for colors we are using, but may not know how to label them.
- R Function A Day (@rfunctionaday) July 9, 2021
The {name} function from {ColorNameR} 📦 produces color labels in multiple languages and colorspaces! https://t.co/RVkqjq7Ksz #rstats #DataScience pic.twitter.com/MmD1MWPHF3
July 10
Sometimes we may wish to check all relevant assumptions for a regression model in one go.
- R Function A Day (@rfunctionaday) July 10, 2021
The {check_model} function from {performance} 📦 does exactly this and also provides elegant visualizations with helpful pointers https://t.co/4SIIL0u9Jn #rstats #DataScience pic.twitter.com/EMmye4qAZk
July 11
If you are writing manuscripts in RMarkdown, you may wish to auto-generate citations for all R packages used in the document.
- R Function A Day (@rfunctionaday) July 11, 2021
The {write_bib} function from {knitr} 📦 does exactly this! https://t.co/JYux01tB7h #rstats #DataScience pic.twitter.com/YvOiaPeYfY
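A minimal sketch of how this might be used: the package list and the temporary file path below are placeholders chosen for illustration. In a real manuscript you would more likely pass `.packages()` and point `file` at the `.bib` file referenced in the YAML header.

```r
library(knitr)

# Placeholder output path; a real document would use e.g. "packages.bib"
bib_file <- tempfile(fileext = ".bib")

# Write BibTeX entries for the listed packages (keys get the "R-" prefix by default)
write_bib(c("base", "knitr"), file = bib_file)

bib_lines <- readLines(bib_file)
```

The generated entries can then be cited in the text with keys such as `@R-knitr`.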
July 12
While plotting time series data, we may wish to plot several subseries corresponding to the periods of interest (seasons, months, etc.).
- R Function A Day (@rfunctionaday) July 12, 2021
The {ggfreqplot} function from {ggfortify} 📦 makes this task effortless! https://t.co/rRmcQbKv4U #rstats #DataScience pic.twitter.com/Vic17LXkM9
July 13
If the 📦s being used happen to export functions with identical names, calling such a function may fail.
- R Function A Day (@rfunctionaday) July 13, 2021
Aside from the :: qualifier, the {conflict_prefer} function from {conflicted} 📦 can solve this conflict by prioritizing one function https://t.co/n96Mt8kHx3 #rstats #DataScience pic.twitter.com/V9jZwxADoM
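For illustration, here is a small sketch using the well-known clash between `dplyr::filter()` and `stats::filter()`. In {conflicted} the function is exported as `conflict_prefer()`; the choice of {dplyr} and of `mtcars` as example data is an assumption made for this sketch.

```r
library(conflicted)
library(dplyr)

# With {conflicted} loaded, an ambiguous call to filter() would raise an error.
# Declare once that dplyr's version should win:
conflict_prefer("filter", "dplyr")

four_cyl <- filter(mtcars, cyl == 4)
```

Declaring preferences near the top of a script keeps the rest of the code free of repeated `pkg::` prefixes.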
July 14
Exploratory data analysis often involves specifying and comparing multiple regression models.
- R Function A Day (@rfunctionaday) July 14, 2021
The {compare_parameters} function from {parameters} 📦 provides dot-and-whisker plots to display and compare regression estimates! https://t.co/vmuGta5W9i #rstats #DataScience pic.twitter.com/Xyve8SMk1Q
July 15
Either your computing environment or good-practice recommendations may compel you to check file paths, character encoding, etc. in your script.
- R Function A Day (@rfunctionaday) July 15, 2021
The {is_} function family from {xfun} 📦 provides tools to run such checks easily https://t.co/LdRTIKXOCA #rstats #DataScience pic.twitter.com/YCtgZEufae
July 16
Sometimes we may wish to assess the polarity (positive, negative, neutral) of text data.
- R Function A Day (@rfunctionaday) July 16, 2021
The {sentiment} function from {sentimentr} 📦 provides a convenient and flexible way to approximate the sentiment of the text by sentence https://t.co/mMpxUxXyVP #rstats #DataScience pic.twitter.com/9oXGUwVTnf
July 17
During the exploratory phase, we may wish to visualize and model data quickly and thoroughly.
- R Function A Day (@rfunctionaday) July 17, 2021
The {ggwithinstats} function from {ggstatsplot} 📦 does this for one-way repeated measures designs via plots with statistical details https://t.co/lOT7qa37z8 #rstats #DataScience pic.twitter.com/UxoUQr1cvB
July 18
To keep related data together, you might sometimes create dataframes with columns that themselves contain dataframes. 🧳
- R Function A Day (@rfunctionaday) July 18, 2021
Since working with them can be a pain, the {unpack} function from {tidyr} 📦 helps you "unpack" them! https://t.co/rt0ekSiMUg #rstats #DataScience pic.twitter.com/m2O8OQRDgF
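A small sketch of the idea, using a made-up tibble whose `dims` column is itself a data frame (the column names are invented for this example):

```r
library(tidyr)

# Made-up data: `dims` is a data frame column with two inner columns
df <- tibble::tibble(
  id   = 1:2,
  dims = tibble::tibble(width = c(10, 20), height = c(5, 8))
)

# Promote the inner columns of `dims` to top-level columns
flat <- unpack(df, dims)
```

If the inner names might clash with existing columns, the `names_sep` argument of `unpack()` can prefix them with the name of the outer column.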
July 19
If you use {dplyr} π¦ for data analysis, you may sometimes wish to carry out statistical analysis on a grouped data frame.
- R Function A Day (@rfunctionaday) July 19, 2021
If the statistical function requires the whole dataframe, the {cur_data} function provides just the tool! https://t.co/IemOGQqqrH #rstats #DataScience pic.twitter.com/MzEnBq0tp0
July 20
Visualizing a variable's distribution via a violin plot is easy in {ggplot2}, but we may wish to avoid the redundant mirroring of the density plot.
- R Function A Day (@rfunctionaday) July 20, 2021
The {stat_density_ridges} function from {ggridges} 📦 provides just the geometric layer! https://t.co/x6yAh0hHW9 #rstats #DataScience pic.twitter.com/V3JhOamnFb
July 21
During analysis, model selection may involve the specification of multiple models and formally testing if they are different.
- R Function A Day (@rfunctionaday) July 21, 2021
The {test_performance} function from {performance} 📦 performs and summarizes indices from these tests https://t.co/Qd25p55XPq #rstats #DataScience pic.twitter.com/EvmEOVXNRn
July 22
While visualizing data across a combination of variables, {facet_wrap} in {ggplot2} creates small multiples. But what if the variables are nested?
- R Function A Day (@rfunctionaday) July 22, 2021
The {facet_nested_wrap} function from {ggh4x} 📦 handles exactly such designs! https://t.co/WVp154GvPA #rstats #DataScience pic.twitter.com/P6ZjOUF9eV
July 23
Sometimes you may wish to do something in R and can't think of any package that might be helpful.
- R Function A Day (@rfunctionaday) July 23, 2021
In such cases, the {findPackage} function from {packagefinder} 📦 can search and return relevant CRAN packages given the keywords https://t.co/0IiwJtafTq #rstats #DataScience pic.twitter.com/n946xE4cAS
July 24
Sometimes you may wish to write SQL queries (for practice?) without access to a database.
- R Function A Day (@rfunctionaday) July 24, 2021
In such cases, you can use the {dbWriteTable} function from {DBI} 📦 to copy a dataframe to a database table, and then write queries! https://t.co/Gu3HwjrR0S #rstats #DataScience pic.twitter.com/z6IZnW0o1o
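One way this could look in practice is sketched below. It assumes the {RSQLite} package as the backend (the tweet does not name one) and uses the built-in `mtcars` data as the table to query. Note that the function is spelled `dbWriteTable()` in {DBI}.

```r
library(DBI)

# Assumption: an in-memory SQLite database provided by {RSQLite}
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a data frame into a database table
dbWriteTable(con, "mtcars", mtcars)

# Practice query: number of cars per cylinder count
res <- dbGetQuery(
  con,
  "SELECT cyl, COUNT(*) AS n FROM mtcars GROUP BY cyl ORDER BY cyl"
)

dbDisconnect(con)
```

Because the database lives only in memory, nothing is written to disk and the table disappears once the connection is closed.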
July 25
Simulating parameter draws can sometimes be a (computationally faster) alternative to bootstrapping. ⏲️
- R Function A Day (@rfunctionaday) July 25, 2021
The {simulate_parameters} function from {parameters} 📦 can run and visualize such simulations for various regression models https://t.co/SHdA3JANeq #rstats #DataScience pic.twitter.com/sRe2BhPKjW
July 26
In case you are looking for an alternate, "operator" way to access object attributes in R, you can use the infix attribute accessor (%@%) from {rlang} 📦! https://t.co/C0T97lOtWV #rstats #DataScience pic.twitter.com/zAYw4JCo5i
- R Function A Day (@rfunctionaday) July 26, 2021
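A quick sketch of the operator in action, using a made-up vector with a custom `units` attribute (both the object and the attribute name are invented for illustration):

```r
library(rlang)

# Made-up object carrying a custom attribute
x <- structure(1:3, units = "cm")

# Read the attribute; roughly equivalent to attr(x, "units")
u <- x %@% "units"

# The replacement form sets the attribute
x %@% "units" <- "mm"
```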
July 27
Sometimes you need to plot data from different geographical entities into a grid and may wish to preserve the original geographical orientation of the entities.
- R Function A Day (@rfunctionaday) July 27, 2021
The {facet_geo} function from {geofacet} 📦 produces such a grid! 🗺️ https://t.co/IUIQ1B7rUw #rstats #DataScience pic.twitter.com/MamwLytO4F
July 28
Operating on multiple columns in a row-wise manner is fairly straightforward in {dplyr} π¦.
- R Function A Day (@rfunctionaday) July 28, 2021
In this workflow, the {c_across} function allows you to use the tidy selection syntax to select columns to operate on https://t.co/gmFEid0WDK #rstats #DataScience pic.twitter.com/8uMiwmcxtq
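As a small illustration, the sketch below sums a few made-up questionnaire columns (`q1` to `q3`, invented for this example) within each row.

```r
library(dplyr)

# Made-up data: three item scores per respondent
df <- tibble(
  id = 1:3,
  q1 = c(1, 4, 7),
  q2 = c(2, 5, 8),
  q3 = c(3, 6, 9)
)

totals <- df %>%
  rowwise() %>%                                        # treat each row as its own group
  mutate(total = sum(c_across(starts_with("q")))) %>%  # tidy-select the q* columns
  ungroup()
```

Without `rowwise()`, the same `sum()` call would add up whole columns rather than the values within each row.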
July 29
Either out of curiosity or to improve its performance, sometimes you may want to time your R code.
- R Function A Day (@rfunctionaday) July 29, 2021
The {tic}/{toc} functions from {tictoc} 📦 provide just the tool ⏲️ https://t.co/yCu9ArIdBd #rstats #DataScience pic.twitter.com/PyHJCSm2NG
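A minimal sketch of timing a code chunk this way; the sorting task is only a stand-in for whatever code you want to time.

```r
library(tictoc)

tic("sorting")                 # start a named timer
sorted <- sort(runif(1e5))     # stand-in for the code being timed
timing <- toc(quiet = TRUE)    # stop the timer without printing

# toc() invisibly returns the start and stop times (in seconds)
elapsed <- unname(timing$toc - timing$tic)
```

Dropping `quiet = TRUE` makes `toc()` print the elapsed time directly, which is usually all you need in interactive use.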
July 30
To ensure the reproducibility of an R script, you may wish it to download the package versions that were available on a certain date.
- R Function A Day (@rfunctionaday) July 30, 2021
The {groundhog.library} function from {groundhog} 📦 creates a local library with the needed package versions https://t.co/bOhOXRzfMZ #rstats #DataScience pic.twitter.com/1pLfn8uU4q
July 31
ROC curves provide a convenient way to compare responses and predictions of a binomial model.
- R Function A Day (@rfunctionaday) July 31, 2021
The {performance_roc} function from {performance} 📦 computes the AUC metric and visualizes ROC curves for a collection of models https://t.co/ZKty2kA5Br #rstats #DataScience pic.twitter.com/LYJgfvZHIF