2 Material and methods

Description of the materials and methods used to derive the database (to come).

The algorithm development is structured in two steps:
1. In-situ data are used to explore and compute the model coefficients.
2. Matchup validation is performed with simulated data.

Month and year of sampling for each dataset integrated into this study

| Characteristic | Overall, N = 364¹ | CHONe, N = 120 | CoastJB, N = 161 | PMZA-RIKI, N = 22 | WISEMan, N = 61 |
|---|---|---|---|---|---|
| Month, n (%) | | | | | |
| January | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| February | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| March | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| April | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| May | 27 (7.4) | 26 (22) | 0 (0) | 1 (4.5) | 0 (0) |
| June | 62 (17) | 57 (48) | 0 (0) | 5 (23) | 0 (0) |
| July | 85 (23) | 0 (0) | 81 (50) | 4 (18) | 0 (0) |
| August | 141 (39) | 9 (7.5) | 65 (40) | 6 (27) | 61 (100) |
| September | 40 (11) | 22 (18) | 15 (9.3) | 3 (14) | 0 (0) |
| October | 8 (2.2) | 6 (5.0) | 0 (0) | 2 (9.1) | 0 (0) |
| November | 1 (0.3) | 0 (0) | 0 (0) | 1 (4.5) | 0 (0) |
| December | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Year, n (%) | | | | | |
| 2015 | 18 (4.9) | 0 (0) | 0 (0) | 18 (82) | 0 (0) |
| 2016 | 12 (3.3) | 9 (7.5) | 0 (0) | 3 (14) | 0 (0) |
| 2017 | 80 (22) | 79 (66) | 0 (0) | 1 (4.5) | 0 (0) |
| 2018 | 77 (21) | 0 (0) | 77 (48) | 0 (0) | 0 (0) |
| 2019 | 177 (49) | 32 (27) | 84 (52) | 0 (0) | 61 (100) |

¹ n (%)

| Characteristic | N | Overall, N = 364¹ | CHONe, N = 120 | CoastJB, N = 161 | PMZA-RIKI, N = 22 | WISEMan, N = 61 |
|---|---|---|---|---|---|---|
| Region, n (%) | 364 | | | | | |
| EGSL | | 203 (56) | 120 (100) | 0 (0) | 22 (100) | 61 (100) |
| JB | | 161 (44) | 0 (0) | 161 (100) | 0 (0) | 0 (0) |
| SPM [mg/L], Range | 347 | 1 - 110 | 2 - 38 | 1 - 110 | 1 - 9 | 3 - 35 |
| Ag(440) [m⁻¹], Range | 343 | 0.16 - 11.50 | 0.16 - 4.97 | 0.94 - 11.50 | 0.21 - 0.50 | 0.28 - 2.66 |
| Ag(295) [m⁻¹], Range | 343 | 2 - 100 | 2 - 41 | 12 - 100 | 3 - 6 | 4 - 25 |
| Ag(275) [m⁻¹], Range | 343 | 3 - 128 | 3 - 53 | 17 - 128 | 4 - 8 | 5 - 33 |
| A(532) [m⁻¹], Range | 104 | 0.09 - 1.20 | 0.09 - 1.20 | NA | 0.12 - 0.35 | 0.15 - 0.52 |
| Bbp(532) [m⁻¹], Range | 276 | 0.00 - 0.46 | 0.00 - 0.10 | 0.01 - 0.46 | 0.01 - 0.02 | 0.00 - 0.04 |

¹ n (%); Range. NA: no measurements available.

| Characteristic | N | N = 161 |
|---|---|---|
| Region, n (%) | 161 | |
| JB | | 161 (100) |
| SPM [mg/L], Range | 161 | 1 - 110 |
| Ag(440) [m⁻¹], Range | 155 | 0.94 - 11.50 |
| Ag(295) [m⁻¹], Range | 155 | 12 - 100 |
| Ag(275) [m⁻¹], Range | 155 | 17 - 128 |
| A(532) [m⁻¹], Range | 0 | NA |
| Bbp(532) [m⁻¹], Range | 144 | 0.01 - 0.46 |

Sensor bands

| Sensor | Bands (central wavelength, nm) |
|---|---|
| OLI | B1 (443), B2 (482), B3 (561), B4 (655), B5 (865) |
| MSI | B1 (443), B2 (492), B3 (560), B4 (665), B5 (704), B6 (740), B7 (783), B8 (833), B8a (865) |
| OLCI | Oa1 (400), Oa2 (412.5), Oa3 (442), Oa4 (490), Oa5 (510), Oa6 (560), Oa7 (620), Oa8 (665), Oa9 (673.75), Oa10 (681.25), Oa11 (708.75), Oa12 (753.75), Oa13 (761.25), Oa14 (764.38), Oa15 (767.5), Oa16 (778.75), Oa17 (865), Oa18 (885) |
| MODISA | B1 (412), B2 (443), B3 (469), B4 (488), B5 (531), B6 (547), B7 (555), B8 (645), B9 (667), B10 (678), B11 (748), B12 (859), B13 (869) |
| MERIS | B1 (413), B2 (443), B3 (490), B4 (510), B5 (560), B6 (620), B7 (665), B8 (681), B9 (709), B10 (754), B11 (762), B12 (779), B13 (865) |
| SeaWiFS | B1 (412), B2 (443), B3 (490), B4 (510), B5 (555), B6 (670), B7 (765), B8 (865) |

2.1 Dataset for training and testing

2.2 Summary stats for train and test

Because the models proposed here are purely empirical, it is important to define the range over which they are applicable. The tables below present summary statistics for each retrieved optically active constituent (OAC) in the training and test datasets.

It is also worth noting that, because the modeled relationships depend on a variety of complex, cumulative effects (i.e., specific IOPs), the time range of the measurements also matters: one cannot assume that OAC concentrations and distributions remain constant.
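The range of applicability discussed above can be checked programmatically before a retrieved value is trusted. The following is a minimal sketch, not the procedure used in this study: the (min, max) pairs are copied from the training columns of the regional summary tables in this section, while the dictionary layout and the function name `in_training_range` are illustrative assumptions.

```python
# Training ranges (min, max) copied from the "train" columns of the
# EGSL and JB summary tables. The structure and names are hypothetical.
TRAIN_RANGES = {
    "EGSL": {"SPM": (1.2, 34.6), "Ag_440": (0.16, 4.27), "Bbp_532": (0.004, 0.061)},
    "JB": {"SPM": (1.0, 71.0), "Ag_440": (1.14, 11.50), "Bbp_532": (0.01, 0.46)},
}


def in_training_range(region, variable, value):
    """Return True if `value` lies within the training range for this region and variable."""
    low, high = TRAIN_RANGES[region][variable]
    return low <= value <= high


# A retrieved SPM of 50 falls outside the EGSL training range
# but inside the JB one, so it is an extrapolation only for EGSL.
print(in_training_range("EGSL", "SPM", 50.0))
print(in_training_range("JB", "SPM", 50.0))
```

A value flagged as out of range is not necessarily wrong, but it lies where the empirical relationship has not been constrained by data.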

Global summary

| Characteristic | N | Overall, N = 360¹ | EGSL, N = 199 | JB, N = 161 |
|---|---|---|---|---|
| matchup, n (%) | 360 | 116 (32) | 61 (31) | 55 (34) |
| SPM, Median (IQR) Range | 343 | 6 (3, 10) 1 - 110 | 7 (5, 10) 1 - 38 | 4 (2, 8) 1 - 110 |
| PIM, Median (IQR) Range | 334 | 5 (3, 8) 0 - 101 | 6 (4, 8) 0 - 35 | 4 (2, 8) 1 - 101 |
| POM, Median (IQR) Range | 174 | 1.54 (1.19, 1.93) 0.55 - 3.75 | 1.54 (1.19, 1.93) 0.55 - 3.75 | NA |
| Ag_440, Median (IQR) Range | 339 | 1.56 (0.94, 2.29) 0.16 - 11.50 | 1.04 (0.48, 1.66) 0.16 - 4.27 | 1.89 (1.55, 3.37) 0.94 - 11.50 |
| Ag_295, Median (IQR) Range | 339 | 16 (9, 22) 2 - 100 | 10 (5, 16) 2 - 40 | 20 (17, 33) 12 - 100 |
| Ag_275, Median (IQR) Range | 339 | 21 (12, 30) 3 - 128 | 13 (8, 21) 3 - 53 | 28 (24, 44) 17 - 128 |
| Bbp_532, Median (IQR) Range | 274 | 0.03 (0.01, 0.06) 0.00 - 0.46 | 0.01 (0.01, 0.02) 0.00 - 0.08 | 0.05 (0.03, 0.10) 0.01 - 0.46 |
| PIM_frac, Median (IQR) Range | 292 | 79 (72, 85) 31 - 97 | 79 (72, 84) 31 - 93 | 79 (72, 85) 35 - 97 |
| set, n (%) | 360 | | | |
| test | | 116 (32) | 61 (31) | 55 (34) |
| train | | 244 (68) | 138 (69) | 106 (66) |

¹ n (%); Median (IQR) Range. NA: no measurements available.
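Each cell of the summary tables reports a median, the interquartile bounds, and the range. As a hedged illustration of how such a cell can be computed, the sketch below uses only the Python standard library; the function name `summarize` and the sample values are made up for the example and are not taken from the dataset. An empty variable returns `None` so that it can be reported as NA instead of an infinite range.

```python
import statistics


def summarize(values):
    """Return n, median, quartiles, min, and max, ignoring missing values (None)."""
    data = sorted(v for v in values if v is not None)
    if not data:
        return None  # no observations: report NA
    if len(data) == 1:
        q1 = q3 = data[0]  # quartiles are undefined for one point; reuse it
    else:
        # "inclusive" interpolates linearly between the observed min and max
        q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "n": len(data),
        "median": statistics.median(data),
        "q1": q1,
        "q3": q3,
        "min": data[0],
        "max": data[-1],
    }


# Illustrative values only (not measurements from this study).
print(summarize([1, 2, 3, 4, 5, None]))
print(summarize([]))
```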

2.2.1 By Region

EGSL summary

| Characteristic | N | test, N = 61 | train, N = 138 |
|---|---|---|---|
| SPM, Median (IQR) Range | 182 | 6.7 (5.3, 8.9) 1.6 - 38.3 | 7.6 (5.1, 10.4) 1.2 - 34.6 |
| PIM, Median (IQR) Range | 174 | 5.7 (4.4, 8.0) 1.1 - 34.9 | 5.8 (3.8, 8.7) 0.4 - 32.3 |
| POM, Median (IQR) Range | 174 | 1.60 (1.27, 1.85) 0.55 - 3.47 | 1.54 (1.17, 1.97) 0.59 - 3.75 |
| Ag_440, Median (IQR) Range | 184 | 1.38 (0.85, 1.87) 0.28 - 3.84 | 0.92 (0.43, 1.50) 0.16 - 4.27 |
| Ag_295, Median (IQR) Range | 184 | 13 (8, 17) 3 - 35 | 9 (5, 14) 2 - 40 |
| Ag_275, Median (IQR) Range | 184 | 18 (11, 23) 4 - 46 | 12 (7, 18) 3 - 53 |
| Bbp_532, Median (IQR) Range | 130 | 0.016 (0.008, 0.019) 0.003 - 0.077 | 0.013 (0.007, 0.023) 0.004 - 0.061 |
| PIM_frac, Median (IQR) Range | 174 | 80 (74, 84) 62 - 92 | 79 (72, 84) 31 - 93 |

JB summary

| Characteristic | N | test, N = 55 | train, N = 106 |
|---|---|---|---|
| SPM, Median (IQR) Range | 161 | 3 (2, 6) 1 - 110 | 5 (3, 9) 1 - 71 |
| PIM, Median (IQR) Range | 160 | 3 (2, 5) 1 - 101 | 5 (3, 9) 1 - 67 |
| POM, Median (IQR) Range | 0 | NA | NA |
| Ag_440, Median (IQR) Range | 155 | 1.83 (1.44, 3.20) 0.94 - 6.58 | 1.91 (1.58, 3.41) 1.14 - 11.50 |
| Ag_295, Median (IQR) Range | 155 | 20 (16, 32) 12 - 61 | 21 (18, 34) 13 - 100 |
| Ag_275, Median (IQR) Range | 155 | 27 (23, 43) 17 - 80 | 28 (24, 45) 18 - 128 |
| Bbp_532, Median (IQR) Range | 144 | 0.04 (0.03, 0.09) 0.01 - 0.24 | 0.06 (0.03, 0.10) 0.01 - 0.46 |
| PIM_frac, Median (IQR) Range | 118 | 78 (72, 83) 52 - 92 | 80 (72, 87) 35 - 97 |