Table 1.
Sample Nameᵃ | | Random Forest Classificationᵇ | | | NMR+DS Array ROC Curveᶜ | | |
---|---|---|---|---|---|---|---|
 | | NMR | DS Array | NMR+DS Array | AUC | NMR Ratioᵈ (% NMR) | DS Array Ratioᵉ (% DS) |
2015 Vineyard | |||||||
SMV1 | Santa Maria Valley | 0.88 | 1.00 | 1.00 | 0.95 | 0.93 (0.07) | 0.07 (0.04) |
SMV2 | Santa Maria Valley | 1.00 | 0.88 | 1.00 | 0.98 | 0.73 (0.05) | 0.27 (0.15) |
SRH1 | Santa Rita Hills | 0.75 | 0.88 | 0.88 | 0.91 | 0.4 (0.05) | 0.6 (0.56) |
AS1 | Arroyo Seco | 0.63 | 1.00 | 0.88 | 0.81 | 0.88 (0.1) | 0.12 (0.11) |
AS2 | Arroyo Seco | 0.75 | 0.88 | 1.00 | 0.98 | 1.00 (0.07) | 0.00 (0.00) |
SNC1 | Sonoma Coast | 1.00 | 0.88 | 1.00 | 0.98 | 0.68 (0.08) | 0.32 (0.3) |
SNC2 | Sonoma Coast | 0.75 | 1.00 | 1.00 | 0.99 | 0.2 (0.01) | 0.8 (0.44) |
CRN1 | Carneros | 1.00 | 1.00 | 1.00 | 0.92 | 0.64 (0.08) | 0.36 (0.33) |
RRV1 | Russian River Valley | 1.00 | 0.75 | 1.00 | 0.99 | 0.93 (0.07) | 0.07 (0.04) |
RRV2 | Russian River Valley | 1.00 | 0.86 | 1.00 | 0.98 | 0.52 (0.06) | 0.48 (0.44) |
RRV3 | Russian River Valley | 1.00 | 1.00 | 1.00 | 0.99 | 0.67 (0.05) | 0.33 (0.19) |
AV1 | Anderson Valley | 0.86 | 0.86 | 0.86 | 0.94 | 0.92 (0.11) | 0.08 (0.07) |
AV2 | Anderson Valley | 0.88 | 1.00 | 1.00 | 0.94 | 0.64 (0.08) | 0.36 (0.33) |
OR1 | Willamette Valley | 1.00 | 1.00 | 1.00 | 0.99 | 0.80 (0.06) | 0.20 (0.11) |
OR2 | Willamette Valley | 1.00 | 0.88 | 1.00 | 0.95 | 0.32 (0.04) | 0.68 (0.63) |
Averageᶠ | | 0.90±0.12 | 0.92±0.08 | 0.97±0.05 | 0.95±0.05 | | |
p-valueᵍ (individual vs combination) | | 0.045 | 0.05 | | | | |
2016 Vineyard | |||||||
SMV1 | Santa Maria Valley | 1.00 | 0.75 | 1.00 | 0.97 | 0.87 (0.06) | 0.13 (0.07) |
SMV2 | Santa Maria Valley | 0.88 | 0.88 | 1.00 | 0.96 | 0.72 (0.09) | 0.28 (0.26) |
SRH1 | Santa Rita Hills | 1.00 | 1.00 | 1.00 | 0.95 | 0.64 (0.08) | 0.36 (0.33) |
AS1 | Arroyo Seco | 1.00 | 0.88 | 1.00 | 0.95 | 0.47 (0.03) | 0.53 (0.3) |
AS2 | Arroyo Seco | 0.88 | 0.88 | 0.88 | 0.95 | 0.40 (0.03) | 0.60 (0.33) |
SNC1 | Sonoma Coast | 1.00 | 1.00 | 1.00 | 0.97 | 0.67 (0.05) | 0.33 (0.19) |
SNC2 | Sonoma Coast | 1.00 | 0.75 | 1.00 | 0.96 | 0.80 (0.10) | 0.20 (0.19) |
CRN1 | Carneros | 0.88 | 0.50 | 0.88 | 0.99 | 0.73 (0.05) | 0.27 (0.15) |
RRV1 | Russian River Valley | 1.00 | 0.75 | 1.00 | 1.00 | 1.00 (0.07) | 0.00 (0.00) |
RRV2 | Russian River Valley | 1.00 | 0.75 | 1.00 | 0.96 | 0.88 (0.10) | 0.12 (0.11) |
RRV3 | Russian River Valley | 1.00 | 1.00 | 1.00 | 0.99 | 0.40 (0.02) | 0.60 (0.22) |
AV1 | Anderson Valley | 1.00 | 0.75 | 1.00 | 1.00 | 0.87 (0.06) | 0.13 (0.07) |
AV2 | Anderson Valley | 1.00 | 1.00 | 1.00 | 0.97 | 0.72 (0.09) | 0.28 (0.26) |
OR1 | Willamette Valley | 1.00 | 1.00 | 1.00 | 0.99 | 0.80 (0.04) | 0.20 (0.07) |
OR2 | Willamette Valley | 1.00 | 0.63 | 1.00 | 0.99 | 1.00 (0.07) | 0.00 (0.00) |
Average | | 0.975±0.05 | 0.83±0.15 | 0.98±0.04 | 0.97±0.02 | | |
p-value (individual vs combination) | | 0.64 | 0.001 | | | | |
Vineyard Totals | |||||||
Average | | 0.94±0.10 | 0.88±0.13 | 0.98±0.05 | 0.96±0.04 | | |
p-value (vineyard vs combination) | | 0.050 | 0.0002 | | | | |
2015 Region | |||||||
Santa Maria Valley | SMV1, SMV2 | 1.00 | 0.94 | 1.00 | 0.97 | 0.93 (0.07) | 0.07 (0.04) |
Santa Rita Hills | SRH1 | 0.63 | 0.63 | 0.88 | 0.91 | 0.36 (0.04) | 0.64 (0.59) |
Arroyo Seco | AS1, AS2 | 0.94 | 1.00 | 0.94 | 0.93 | 0.72 (0.09) | 0.28 (0.26) |
Sonoma Coast | SNC1, SNC2 | 0.81 | 0.81 | 1.00 | 0.96 | 0.60 (0.03) | 0.40 (0.15) |
Carneros | CRN1 | 0.83 | 0.83 | 1.00 | 0.87 | 0.53 (0.04) | 0.40 (0.22) |
Russian River Valley | RRV1, RRV2, RRV3 | 1.00 | 0.78 | 1.00 | 0.99 | 0.87 (0.06) | 0.13 (0.07) |
Anderson Valley | AV1, AV2 | 0.93 | 0.87 | 0.93 | 0.96 | 0.73 (0.05) | 0.27 (0.15) |
Willamette Valley | OR1, OR2 | 1.00 | 0.94 | 1.00 | 0.99 | 0.80 (0.04) | 0.20 (0.07) |
Average | | 0.89±0.12 | 0.85±0.11 | 0.97±0.05 | 0.948±0.039 | | |
p-value (individual vs combination) | | 0.147 | 0.018 | | | | |
2016 Region | |||||||
Santa Maria Valley | SMV1, SMV2 | 0.94 | 1.00 | 1.00 | 0.99 | 0.80 (0.04) | 0.20 (0.07) |
Santa Rita Hills | SRH1 | 1.00 | 0.75 | 1.00 | 0.93 | 0.80 (0.06) | 0.20 (0.11) |
Arroyo Seco | AS1, AS2 | 1.00 | 0.94 | 1.00 | 0.98 | 0.90 (0.04) | 0.10 (0.04) |
Sonoma Coast | SNC1, SNC2 | 1.00 | 0.88 | 1.00 | 0.99 | 0.67 (0.05) | 0.33 (0.19) |
Carneros | CRN1 | 0.88 | 0.50 | 0.88 | 0.99 | 0.80 (0.06) | 0.20 (0.11) |
Russian River Valley | RRV1, RRV2, RRV3 | 1.00 | 0.83 | 1.00 | 0.98 | 1.00 (0.05) | 0.00 (0.00) |
Anderson Valley | AV1, AV2 | 1.00 | 0.75 | 1.00 | 0.99 | 1.00 (0.05) | 0.00 (0.00) |
Willamette Valley | OR1, OR2 | 1.00 | 0.69 | 1.00 | 0.98 | 0.73 (0.05) | 0.27 (0.15) |
Average | | 0.977±0.043 | 0.792±0.147 | 0.984±0.041 | 0.980±0.020 | | |
p-value (individual vs combination) | | 0.736 | 0.005 | | | | |
Region Totals | |||||||
Average | | 0.935±0.101 | 0.821±0.132 | 0.976±0.044 | 0.964±0.035 | | |
p-value (region vs combination) | | 0.1529 | 0.0002 | | | | |
ᵃ List of the fifteen vineyard IDs and their associated regions.
ᵇ Random forest (RF) classification accuracy ranges from 0 to 1, where 1 is perfect classification. Accuracy is reported for the NMR data alone, the DS array data alone, and the combined dataset.
ᶜ ROC: receiver operating characteristic curve; AUC: area under the ROC curve. AUC ranges from 0 to 1, where 1 indicates perfect classification. ROC and AUC were calculated using the combined NMR spectroscopy and DS array datasets.
ᵈ NMR ratio is the proportion of the total features used in the ROC curve that come from the 1D ¹H NMR spectrum. % NMR is the percentage of the total number of NMR features that were used in the ROC curve.
ᵉ DS array ratio is the proportion of the total features used in the ROC curve that come from the DS array data. % DS is the percentage of the total number of DS array features that were used in the ROC curve.
ᶠ Column averages are presented as average ± standard deviation.
ᵍ p-values were calculated with a Student's t-test.
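
As an illustration of how the summary rows relate to the per-vineyard values, the sketch below recomputes the average ± standard deviation for the 2015 vineyard NMR column and a two-sample t statistic against the combined NMR+DS array column. The accuracies are transcribed from the table; everything else is an assumption made for illustration: the table does not state whether the standard deviation is the sample or population form, nor whether the t-test was paired or unpaired, so both standard deviations are shown and an unpaired, pooled-variance test is assumed. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev, stdev, variance

# 2015 vineyard RF accuracies transcribed from Table 1 (rows SMV1 ... OR2).
nmr_2015 = [0.88, 1.00, 0.75, 0.63, 0.75, 1.00, 0.75, 1.00,
            1.00, 1.00, 1.00, 0.86, 0.88, 1.00, 1.00]
combined_2015 = [1.00, 1.00, 0.88, 0.88, 1.00, 1.00, 1.00, 1.00,
                 1.00, 1.00, 1.00, 0.86, 1.00, 1.00, 1.00]


def summarize(values):
    """Return (mean, sample SD, population SD) for one table column."""
    return mean(values), stdev(values), pstdev(values)


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Assumes an unpaired test; the table footnote does not say which
    variant was actually used.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var * (1 / na + 1 / nb))


nmr_mean, nmr_sd_sample, nmr_sd_pop = summarize(nmr_2015)
t_stat = pooled_t_statistic(nmr_2015, combined_2015)

print(f"NMR 2015: {nmr_mean:.2f} (sample SD {nmr_sd_sample:.3f}, "
      f"population SD {nmr_sd_pop:.3f})")
print(f"t statistic, NMR vs combined: {t_stat:.2f} "
      f"(df = {len(nmr_2015) + len(combined_2015) - 2})")
```

Converting the t statistic to a p-value requires the t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.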