I programmatically confirmed that the data I had stored earlier is identical to the last dataset Shameema sent for checking.
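A check of this kind can be sketched in base R as follows. The two small data frames are made-up stand-ins for the stored copy and the dataset received; only `identical()` and `all.equal()` are real functions here, and this is not necessarily the exact check that was run.

```r
# Hypothetical stand-ins for the stored copy and the newly received dataset.
stored <- data.frame(
  PlotName = c("Bukit Timah Primary", "Bukit Timah Secondary"),
  PlotCensusNumber = c(1L, 2L),
  stringsAsFactors = FALSE
)
received <- stored

# identical() is strict: values, types, and attributes must all match.
same_strict <- identical(stored, received)

# all.equal() allows tiny numeric differences; it returns TRUE or a
# character vector describing the differences, hence the isTRUE() wrapper.
same_tolerant <- isTRUE(all.equal(stored, received))

same_strict
same_tolerant
```

When the strict check fails, calling `all.equal()` without `isTRUE()` is a convenient way to see what differs.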
Surprisingly, the counts by PlotName and PlotCensusNumber differ between Shameema's computation (first table) and mine (second table).
#> # A tibble: 10 x 3
#>    plotname              plotcensusnumber  count
#>    <chr>                            <dbl>  <dbl>
#>  1 Bukit Timah Secondary               1.  6700.
#>  2 Bukit Timah Secondary               2.  6723.
#>  3 Bukit Timah Secondary               3.  7603.
#>  4 Bukit Timah Primary                 1. 13472.
#>  5 Bukit Timah Primary                 2. 14343.
#>  6 Bukit Timah Primary                 3. 15177.
#>  7 Bukit Timah Primary                 4. 16122.
#>  8 Bukit Timah Primary                 5. 18738.
#>  9 Bukit Timah Primary                 6. 18637.
#> 10 Bukit Timah Big Trees               1. 11019.
#> # A tibble: 60 x 3
#> # Groups:   PlotName, PlotCensusNumber [60]
#>    PlotName              PlotCensusNumber     n
#>    <chr>                            <int> <int>
#>  1 Bukit Timah Secondary                1  6700
#>  2 Bukit Timah Secondary                2  6723
#>  3 Bukit Timah Secondary                3  7603
#>  4 Bukit Timah Primary                  1 13461
#>  5 Bukit Timah Primary                  2 14328
#>  6 Bukit Timah Primary                  3 15151
#>  7 Bukit Timah Primary                  4 16091
#>  8 Bukit Timah Primary                  5 18691
#>  9 Bukit Timah Primary                  6 18589
#> 10 Bukit Timah Primary                 NA   178
#> 11 Bukit Timah Big Trees                1 11019
#> 12 41569                               NA     1
#> 13 36565                               NA     2
#> 14 35285                               NA     2
#> 15 33763                               NA     2
#> # ... with 45 more rows
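The size of the discrepancy can be worked out directly from the printed counts. The sketch below copies the Bukit Timah Primary counts for censuses 1 to 6 from the two tables (the vector names are only for illustration) and subtracts them. The total shortfall, 178, equals the number of Bukit Timah Primary rows in my table whose PlotCensusNumber is NA, which suggests, though it does not prove, that those are the rows missing from my per-census counts.

```r
# Counts for Bukit Timah Primary, censuses 1 to 6, copied from the tables above.
shameema <- c(13472, 14343, 15177, 16122, 18738, 18637)
mine     <- c(13461, 14328, 15151, 16091, 18691, 18589)

# Rows missing from my counts, per census.
shortfall <- shameema - mine
shortfall
#> [1] 11 15 26 31 47 48

# Total shortfall across the six censuses.
sum(shortfall)
#> [1] 178
```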
Most importantly, in my data some values of PlotCensusNumber are NA.
#> Observations: 356
#> Variables: 32
#> $ PlotName         <chr> "Bukit Timah Primary", "10849", "Bukit Timah ...
#> $ PlotCensusNumber <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ Tag              <chr> "C3-22552", NA, "C3-22552", NA, "C3-22552", N...
#> $ DBHID            <int> 4849, NA, 24650, NA, 26259, NA, 27524, NA, 82...
#> $ PlotID           <int> 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, ...
#> $ StemID           <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ StemNumber       <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ StemTag          <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ PrimaryStem      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ CensusID         <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ DBH              <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ LargeStem        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ Family           <chr> "Clusiaceae", NA, "Clusiaceae", NA, "Clusiace...
#> $ Genus            <chr> "Calophyllum", "main", "Calophyllum", NA, "Ca...
#> $ SpeciesName      <chr> "ferrugineum", "2", "ferrugineum", "3", "ferr...
#> $ Mnemonic         <chr> "CALOBM", "1", "CALOBM", "2", "CALOBM", "3", ...
#> $ Subspecies       <chr> "NULL", "18", "NULL", "21", "NULL", "19", "NU...
#> $ SpeciesID        <int> 112, NA, 112, NA, 112, NA, 112, NA, 112, NA, ...
#> $ SubspeciesID     <chr> "NULL", "1993-05-04", "NULL", "1995-12-18", "...
#> $ QuadratName      <chr> "C3", "12177", "C3", "13135", "C3", "NULL", "...
#> $ QuadratID        <int> 26, NA, 26, NA, 26, NA, 26, NA, 26, NA, 26, N...
#> $ PX               <dbl> 42.6, 1.0, 42.6, 1.0, 42.6, 1.0, 42.6, 1.0, 4...
#> $ PY               <dbl> 57.5, NA, 57.5, NA, 57.5, NA, 57.5, NA, 57.5,...
#> $ QX               <dbl> 2.6, NA, 2.6, NA, 2.6, NA, 2.6, NA, 2.6, NA, ...
#> $ QY               <dbl> 17.5, NA, 17.5, NA, 17.5, NA, 17.5, NA, 17.5,...
#> $ TreeID           <int> 10849, NA, 10849, NA, 10849, NA, 10849, NA, 1...
#> $ HOM              <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ ExactDate        <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
#> $ Date             <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ ListOfTSM        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ HighHOM          <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ Status           <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
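Rows like these can be surfaced and isolated roughly as follows. This is a sketch on a made-up toy table (`vft` and its values are hypothetical) and assumes the dplyr package is installed: `count()` keeps NA as its own group, so the affected rows show up in the summary, and `filter(is.na(...))` extracts them for inspection.

```r
library(dplyr)

# Toy data standing in for the full table; the values are made up.
vft <- tibble(
  PlotName = c("Bukit Timah Primary", "10849", "Bukit Timah Primary",
               "Bukit Timah Secondary"),
  PlotCensusNumber = c(NA, NA, 1L, 1L),
  Tag = c("C3-22552", NA, "C3-22553", "A1-00001")
)

# Count rows per plot and census; NA forms its own group.
n_by_census <- vft %>%
  count(PlotName, PlotCensusNumber)
n_by_census

# Keep only the rows where PlotCensusNumber is missing.
problematic <- vft %>%
  filter(is.na(PlotCensusNumber))

# One line per column: name, type, and the first values.
glimpse(problematic)
```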
Here is the problematic data; NAs are represented as empty values.
Any idea what is going on? Should this data be discarded?
–
On Tue, Mar 6, 2018 at 6:15 PM, Shameema Jafferjee Esufali shameemaesufali@gmail.com wrote:
Mauro, I ran an export with NA filled in for null values and got something that works in R. Please find it attached.
save(ViewNA, file = "/home/asiaplots/CTFSRPackage/bukittimah/ViewNA.rdata")
table(ViewNA$Plot, ViewNA$PlotCensusNumber)
The new data had different column names:
On Sat, Mar 10, 2018 at 3:36 AM, Mauro Lepore maurolepore@gmail.com wrote:
Hi Shameema,
Compared to other ViewFullTables, the dataset you attached (ViewNA.rdata) has different column names. Can you fix that and resend?
From: Shameema Jafferjee Esufali shameemaesufali@gmail.com
Date: Fri, Mar 9, 2018 at 7:55 PM
I attach the CSV file and the corresponding rdata file.
Great! Now the data has the expected column names.
#> character(0)
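A check along these lines prints `character(0)` when no expected column name is missing. This is a sketch with made-up name vectors (the old name `Plot` is an assumption for illustration), not the code that produced the output above; `setdiff()` is base R.

```r
# Hypothetical column names; the real tables have many more columns.
expected_names   <- c("PlotName", "PlotCensusNumber", "Tag")
old_export_names <- c("Plot", "PlotCensusNumber", "Tag")
new_export_names <- c("PlotName", "PlotCensusNumber", "Tag")

# Names that are expected but absent from each export.
missing_old <- setdiff(expected_names, old_export_names)
missing_new <- setdiff(expected_names, new_export_names)

missing_old
missing_new
```

Because `setdiff()` is not symmetric, running it in both directions also reveals unexpected extra columns.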
And the count of observations per PlotCensusNumber now equals what Shameema computed.
Shameema’s computation:
#> # A tibble: 10 x 3
#>    plotname              plotcensusnumber  count
#>    <chr>                            <dbl>  <dbl>
#>  1 Bukit Timah Secondary               1.  6700.
#>  2 Bukit Timah Secondary               2.  6723.
#>  3 Bukit Timah Secondary               3.  7603.
#>  4 Bukit Timah Primary                 1. 13472.
#>  5 Bukit Timah Primary                 2. 14343.
#>  6 Bukit Timah Primary                 3. 15177.
#>  7 Bukit Timah Primary                 4. 16122.
#>  8 Bukit Timah Primary                 5. 18738.
#>  9 Bukit Timah Primary                 6. 18637.
#> 10 Bukit Timah Big Trees               1. 11019.
My computation (same counts):
#> # A tibble: 10 x 3
#> # Groups:   PlotName, PlotCensusNumber [10]
#>    PlotName              PlotCensusNumber     n
#>    <chr>                            <int> <int>
#>  1 Bukit Timah Secondary                1  6700
#>  2 Bukit Timah Secondary                2  6723
#>  3 Bukit Timah Secondary                3  7603
#>  4 Bukit Timah Primary                  1 13472
#>  5 Bukit Timah Primary                  2 14343
#>  6 Bukit Timah Primary                  3 15177
#>  7 Bukit Timah Primary                  4 16122
#>  8 Bukit Timah Primary                  5 18738
#>  9 Bukit Timah Primary                  6 18637
#> 10 Bukit Timah Big Trees                1 11019