Database schema for LAGOS, including the two main modules: LAGOSGEO (green box) and LAGOSLIMNO (blue box). The component that links the two modules is the ‘aggregated lakes’ table (LAGOS lakes), which holds the unique identifier and spatial location for all 50,000 lakes. LAGOSGEO data are stored in horizontal tables that are all linked back to the spatial extents for which they are calculated and ultimately to each of the 50,000 individual lakes. The LAGOSGEO data include information for each lake, calculated at a range of spatial extents within which the lake is located (such as its watershed, its HUC 12, or its state). Each green box identifies a theme of data, the number of metrics calculated for that theme, and the number of years over which the data are sampled. LAGOSLIMNO data are stored in vertical tables that are also all linked back to the aggregated lakes table. The ‘limno values’ table and associated tables (in blue) hold the values from the ecosystem-level datasets for water quality; each value is also linked to other tables that describe features of that value, such as the water depth at which it was taken, the flags associated with it, and other metadata at the data-value level. The ‘program-level’ tables (in purple) include information about the program responsible for collecting the data. Finally, the ‘source lakes’ table and associated tables include information about each lake, where available. Note that a single source can have multiple programs that represent different datasets provided to LAGOS.
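
The linkage described in this caption can be illustrated with a minimal sketch, which is not the actual LAGOS schema. It assumes that "horizontal" means one row per spatial extent with one column per metric, and that "vertical" means one row per measured value. All table names, column names, and data values below (`lagos_lakes`, `geo_huc12_landuse`, `limno_values`, `pct_forest`, and so on) are hypothetical and chosen only for illustration, as is the shortcut of storing the HUC 12 identifier directly on the lake row.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory database with hypothetical LAGOS-like tables."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        -- Aggregated lakes: one row per lake, the hub both modules link to.
        CREATE TABLE lagos_lakes (
            lake_id  INTEGER PRIMARY KEY,
            lat      REAL,
            lon      REAL,
            huc12_id TEXT          -- assumed link to a spatial extent
        );
        -- Horizontal (wide) table: one row per extent, one column per metric.
        CREATE TABLE geo_huc12_landuse (
            huc12_id        TEXT PRIMARY KEY,
            pct_forest      REAL,
            pct_agriculture REAL
        );
        -- Vertical (long) table: one row per value, with value-level metadata.
        CREATE TABLE limno_values (
            value_id       INTEGER PRIMARY KEY,
            lake_id        INTEGER REFERENCES lagos_lakes(lake_id),
            variable       TEXT,
            value          REAL,
            sample_depth_m REAL,
            flag           TEXT
        );
        """
    )
    # Made-up rows, for illustration only.
    con.execute("INSERT INTO lagos_lakes VALUES (1, 44.5, -89.5, 'HUC_A')")
    con.execute("INSERT INTO geo_huc12_landuse VALUES ('HUC_A', 60.0, 25.0)")
    con.executemany(
        "INSERT INTO limno_values "
        "(lake_id, variable, value, sample_depth_m, flag) VALUES (?, ?, ?, ?, ?)",
        [
            (1, "total_phosphorus", 12.0, 0.5, None),
            (1, "chlorophyll_a", 3.5, 0.5, "estimated"),
        ],
    )
    return con


def lake_summary(con, lake_id):
    """Join water-quality values to a geospatial metric through the lake row."""
    return con.execute(
        """
        SELECT v.variable, v.value, g.pct_forest
        FROM lagos_lakes AS l
        JOIN geo_huc12_landuse AS g ON g.huc12_id = l.huc12_id
        JOIN limno_values AS v ON v.lake_id = l.lake_id
        WHERE l.lake_id = ?
        ORDER BY v.variable
        """,
        (lake_id,),
    ).fetchall()
```

Calling `lake_summary(build_demo(), 1)` returns one row per water-quality value, each paired with the land-use metric of the lake's HUC 12. Under these assumptions, adding a new water-quality variable to the vertical table only adds rows, whereas adding a new metric to a horizontal table adds a column.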