Update on fgeo

Overview

What I’ve been doing lately can be divided into two parts:

fgeo: building the infrastructure to facilitate the expansion and maintenance of ForestGEO’s R code.
Some specific projects:
- map
- soil krig
- allodb

Although I’m happy to discuss specific projects, I am most interested in discussing the most general aspects of the development of ForestGEO’s code. That is, fgeo and working in private versus public.

fgeo: Goals

CENTRALIZE

to make all the code available via one package.
to make all the code discoverable via one website.

MODULARIZE

to make it easy to add, remove, or change code with minimal impact on the entire system.

The future

INSTALLS (and loads)
- abundance
- biomass
- demography (mortality, growth, recruitment)
- map
- spatial
- topography
SUGGEST
- utils
- ctfs
- packages by partners
- packages by others
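
One way to handle the difference between installed and suggested packages is sketched below. The helper name `use_suggested()` is hypothetical and not part of fgeo; it relies only on base R's `requireNamespace()`, and the package names are stand-ins.

```r
# Hypothetical helper (not part of fgeo): check whether a suggested
# package is installed before relying on it.
use_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is suggested but not installed.")
    return(FALSE)
  }
  TRUE
}

# "stats" ships with R, so the check succeeds.
use_suggested("stats")

# A made-up package name fails the check and prints a message.
use_suggested("notARealPackage123")
```

Core packages would not need this check, because installing fgeo would install them too.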

fgeo: Demo coding

Before fgeo is loaded, calls to its functions fail because the packages are not attached

search()

## [1] ".GlobalEnv"        "package:stats"     "package:graphics" 
## [4] "package:grDevices" "package:utils"     "package:datasets" 
## [7] "package:methods"   "Autoloads"         "package:base"

If fgeo is loaded, all the core fgeo packages become available

library(fgeo)

## -- Attaching packages ---------------------------------------------- fgeo 0.0.0.9000 --

## v forestr         0.0.0.9000     v bciex           0.0.0.9000
## v map             0.0.0.9007     v fgeo.demography 0.0.0.9000

##

search()

##  [1] ".GlobalEnv"              "package:fgeo.demography"
##  [3] "package:bciex"           "package:map"            
##  [5] "package:forestr"         "package:fgeo"           
##  [7] "package:stats"           "package:graphics"       
##  [9] "package:grDevices"       "package:utils"          
## [11] "package:datasets"        "package:methods"        
## [13] "Autoloads"               "package:base"
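
A meta-package can produce this effect by calling `library()` on each core package when it is itself attached. The sketch below is an assumption about the mechanism, not fgeo's actual code; `attach_core()` is a hypothetical name, and the packages used are stand-ins that ship with R.

```r
# Sketch: attach each package in a list unless it is already attached.
# fgeo's real list of core packages is not used here; "tools" and
# "stats" are stand-ins that ship with every R installation.
attach_core <- function(pkgs) {
  already <- paste0("package:", pkgs) %in% search()
  to_attach <- pkgs[!already]
  for (pkg in to_attach) {
    library(pkg, character.only = TRUE, warn.conflicts = FALSE)
  }
  # Return the names that were newly attached, invisibly.
  invisible(to_attach)
}

attach_core(c("tools", "stats"))

# Check that the stand-in package is now on the search path.
"package:tools" %in% search()
```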

fgeo notifies if there are conflicts with other packages.

# Silent if there are no conflicts
fgeo_conflicts()

# Loading a package that will cause conflicts
library(ctfs)

## 
## Attaching package: 'ctfs'

## The following objects are masked from 'package:fgeo.demography':
## 
##     growth, mortality, recruitment

## The following object is masked from 'package:forestr':
## 
##     abundance

# Vocal if there are conflicts
fgeo_conflicts()

## -- Conflicts ------------------------------------------------------ fgeo_conflicts() --
## x ctfs::abundance()   masks forestr::abundance()
## x ctfs::growth()      masks fgeo.demography::growth()
## x ctfs::mortality()   masks fgeo.demography::mortality()
## x ctfs::recruitment() masks fgeo.demography::recruitment()

detach("package:ctfs")
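
The idea behind a conflict report can be sketched with two toy environments standing in for attached packages. This is a simplified illustration, not how `fgeo_conflicts()` is implemented; `find_conflicts()` is a hypothetical name. Base R offers a related built-in, `conflicts(detail = TRUE)`.

```r
# Two toy environments stand in for two attached packages.
masking <- new.env()
masked <- new.env()
assign("abundance", function() "from masking", envir = masking)
assign("growth", function() "from masking", envir = masking)
assign("abundance", function() "from masked", envir = masked)
assign("biomass", function() "from masked", envir = masked)

# Hypothetical helper: names defined in both environments are conflicts.
find_conflicts <- function(masking_env, masked_env) {
  sort(intersect(ls(masking_env), ls(masked_env)))
}

# Only "abundance" is defined in both environments.
find_conflicts(masking, masked)
```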

fgeo: Demo website

https://forestgeo.github.io/fgeo/

Packages
Tutorials
Apps

Working privately versus publicly: How does it impact development?

I just put everything in a public place rather than trying to have this granular access control; it simplifies things greatly. Working in the open has simplified a lot of decisions, that’s nice.

– Jenny Bryan (interview).

PROBLEM

Private packages are difficult to:

test
install
get feedback on

Why do we need private packages?

BENEFITS OF WORKING PUBLICLY

Free support on Travis https://travis-ci.com/plans
Cheap on codecov https://codecov.io/pricing, maybe free – I’ve requested a discount
Easier to share with partners and thus easier to get feedback.
- I don’t need to authorize each partner
- partners can install software without getting an authorization token
- no problem to install all packages via fgeo

IN-DEVELOPMENT CODE WILL EVENTUALLY BE PUBLIC

If the package contains no private data, I can see no benefit of keeping it private that will not eventually disappear.
Once a package is advertised, it will likely be public. Then there will also be in-development code, which is the same situation we have now.

RELEASED AND IN-DEVELOPMENT CODE CAN BE DIFFERENTIATED

In-development code is generally tagged with a version number ending in .9###.
Released code lives in https://github.com/forestgeo/<package>/releases, e.g.:
- https://github.com/forestgeo/map/releases
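
The version convention can be checked programmatically. The sketch below assumes that "in development" means a fourth version component of 9000 or more, as in 0.0.0.9000; `is_dev_version()` is a hypothetical helper, not part of fgeo.

```r
# Hypothetical helper: TRUE if the version has a fourth component of
# 9000 or more (the development-version convention assumed here).
is_dev_version <- function(version) {
  parts <- as.integer(strsplit(as.character(version), "[.-]")[[1]])
  length(parts) >= 4 && parts[[4]] >= 9000
}

is_dev_version("0.0.0.9007")  # development version
is_dev_version("1.0.0")       # released version
```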

Q & A

Why fgeo and not forestgeo?

We need a short yet indicative prefix to differentiate our packages when a name is unavailable; e.g.
- fgeo.demography
- fgeo.utils

Why modularity (project/topic oriented management)?

Issues are focused on a specific topic.
- Don’t bother collaborators with irrelevant discussions.
- Don’t obscure active issues with inactive issues on a different topic.
Avoid running out of informative names for internal functions.
Allow organizing code in useful ways (e.g. https://forestgeo.github.io/fgeo/reference/index.html):
- by who developed that code
  - us
  - partners
  - others (e.g. suggest BIOMASS)
- by how much we trust it

How small can a module be?

Examples of small packages:

Package here: https://krlmlr.github.io/here/reference/index.html
Package reprex: http://reprex.tidyverse.org/reference/index.html

Examples of meta-packages

assertive: https://cran.r-project.org/web/packages/assertive/index.html
tidyverse: https://www.tidyverse.org/packages/