EFSA Journal. 2018 Aug 23;16(8):e05377. doi: 10.2903/j.efsa.2018.5377
ASPECT OF THE MODEL TO BE EVALUATED BY THE RISK ASSESSOR – GUTS model application for lethal effects (Yes/No)

1. Evaluation of the problem definition

The problem definition needs to explain how the modelling fits into the risk assessment and how it can be used to address the specific protection goals. For GUTS, questions to be answered are likely to be those that are set out in Chapter 3. Nevertheless, the problem definition should make clear the following points:

(a) Is the regulatory context for the model application documented?
(b) Is the question that has to be answered by the model clearly formulated?
(c) Is the model output suitable to answer the formulated questions?
(d) Was the choice of the test species clearly described and justified, also considering all the available valid information (including literature)?
(e) Is the species to be modelled specified? – Is it clear whether the model is being used with a Tier‐1 test species i.e. Tier‐2C1 or with one or more relevant species (which might include the Tier‐1 species) i.e. Tier‐2C2?

2. Evaluation of the quality of the supporting experimental data

In this part of the evaluation, it is checked whether the experimental data with which the model is compared (both calibration and validation data sets) have been subjected to quality control. The focus is on the data quality, i.e. the laboratory conditions, set‐up, chemical analytics and similar. Additional specific criteria for the suitability of the data sets for model calibration and validation are evaluated later in more detail (Sections 7 and 9 of this checklist).

(a) Has the quality of the data used been considered and documented? (see list of OECD test guidelines in Chapter 7, Table 6)
(b) Have all available data been used (either for calibration or for validation)? If not, is there a justification why some information has not been used?
(c) Has it been checked whether the actual exposure profile in the study matches the intended profile in the test (± 20%)? If not, have measured concentrations been used for the modelling instead of nominal ones?
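The ± 20% check in item (c) can be sketched as follows; the concentration values and the helper `within_tolerance` are hypothetical illustrations, not a prescribed procedure:

```python
# Hypothetical nominal vs. measured concentrations (e.g. in µg/L) at sampling times.
nominal = [10.0, 10.0, 10.0, 10.0]
measured = [9.1, 8.3, 10.9, 7.6]

def within_tolerance(nom, meas, tol=0.20):
    """Return True if every measured value deviates by at most tol (fraction)
    from the corresponding nominal value."""
    return all(abs(m - n) <= tol * n for n, m in zip(nom, meas))

# If the check fails, measured concentrations should be used for the modelling
# instead of the nominal ones.
use_measured = not within_tolerance(nominal, measured)
```

Here the last measured value (7.6) deviates by more than 20% from the nominal 10.0, so the check flags that measured concentrations should be used.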

3. Evaluation of the conceptual model

Provided that GUTS models are used to address mortality/immobility effects in fish or invertebrates, the conceptual model is suitable to address the specific protection goals, so no further evaluation is required (see Chapters 2.1, 2.2 and 4.1).

4. Evaluation of the formal model

The formal model contains the equations and algorithms to be used in the model. For GUTS models, the equations are standardised, so that no further check is necessary (see Chapter 4.1.1). It has to be documented, however, which GUTS model version is used (e.g. full or reduced model).
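For orientation, the reduced stochastic-death variant (GUTS-RED-SD) combines scaled damage kinetics dDw/dt = kd(Cw − Dw) with a hazard rate h(t) = b·max(Dw − zw, 0) + hb and survival S(t) = exp(−∫h). A minimal numerical sketch for constant exposure, using illustrative (not reference) parameter values and a simple Euler scheme:

```python
import math

def survival_guts_red_sd(times, conc, kd, b, zw, hb=0.0, dt=0.01):
    """Integrate GUTS-RED-SD for a constant exposure concentration `conc`.
    dDw/dt = kd*(Cw - Dw); hazard h = b*max(Dw - zw, 0) + hb; S = exp(-cum. hazard).
    Returns survival probabilities at the (increasing) time points in `times`."""
    Dw, H, t = 0.0, 0.0, 0.0   # scaled damage, cumulative hazard, time
    out = []
    for t_end in times:
        while t < t_end:
            Dw += dt * kd * (conc - Dw)              # damage kinetics (Euler step)
            H += dt * (b * max(Dw - zw, 0.0) + hb)   # accumulate hazard
            t += dt
        out.append(math.exp(-H))
    return out
```

With a threshold zw above the attainable damage level and no background hazard, survival stays at 1; with zw = 0, survival declines monotonically, which is the kind of sanity behaviour an evaluator can verify quickly.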

5. Evaluation of the computer model

The formal model is converted into a model that can run on a computer (the computer model). For GUTS models, the computer model can be tested by showing the model performance for the GUTS ring‐test data and performing some further checks (see Section 7.5).

(a) Has the GUTS implementation used been tested against the ring‐test data set (see Section 4.2)?
(b) Were GUTS parameters estimated for the ring‐test data and compared to the reference values, including confidence or credible intervals (Appendices B.6 and B.7)?
(c) Is a set of default scenarios (e.g. standard scenarios, extreme cases, see Section 4.1.2) simulated and checked?
(d) Are all data and parameters provided to allow an independent implementation of GUTS to be run?
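One way to operationalise the comparison in item (b) is to check whether each ring-test reference value falls inside the fitted confidence/credible interval. The parameter names and all numbers below are hypothetical, not actual ring-test values:

```python
# Hypothetical fitted parameters as (estimate, lower, upper) 95% intervals,
# and hypothetical ring-test reference values to compare against.
fitted = {"kd": (0.71, 0.55, 0.90), "bw": (0.28, 0.20, 0.39)}
reference = {"kd": 0.74, "bw": 0.30}

def covered(fit, ref):
    """For each parameter, flag whether the reference value lies inside
    the fitted confidence/credible interval."""
    return {p: lo <= ref[p] <= hi for p, (_, lo, hi) in fit.items()}

coverage = covered(fitted, reference)  # both reference values fall inside here
```

A parameter flagged False would prompt a closer look at the implementation or the optimisation settings.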

6. Evaluation of the regulatory model – the environmental scenarios

For GUTS models using FOCUS simulations (or other Member State‐specific exposure simulations) as exposure input, no further definition and check of the environmental conditions is needed: pesticide concentrations will be generated using the relevant FOCUS (or MS‐specific) exposure simulations, which consider factors such as soil, rainfall and agronomic practice, while the (effect) model will have been calibrated on data collected under standard laboratory conditions. Fixing the environmental scenario to the conditions of the calibration experiments is considered appropriate because the modelling will be used with the equivalent of Tier‐1 or Tier‐2 assessment factors, so the extrapolation from laboratory to field conditions is already covered.

7. Evaluation of the regulatory model – parameter estimation

Parameter estimation requires a suitable data set, the correct application of a parameter optimisation routine, and comprehensive documentation of methods and results. Model parameters are always estimated for a specific combination of species and compound (see Chapter 3 for background information). Supporting data for GUTS models are mortality or immobility data; they have to be of sufficient quality (Section 2 of this checklist) and fulfil a set of basic criteria. Please check the following items to evaluate the calibration data, the parameter optimisation process and the results (see Sections 4.1.3.1 and 7.6.2):

(a) Is it clear which parameters have been taken from literature or other sources and which have been fitted to data? – If used, are values from the literature reasonable and justified?
(b) Are raw observations of mortality or immobility reported for at least five time‐points?
(c) Do the calibration data span treatment levels from no effects up to strong effects, ideally full effects (e.g. 0% survival)?
(d) Have all data available for calibration been used? If not, is there a justification?
(e) Has the duration of the experiment been adjusted so that the full time course of the pesticide's toxicity is captured?
(f) Has the model parameter estimation been adequately documented, including settings of optimisation routines, and type and settings of the numerical solver that was used for solving the differential equations?
(g) If Bayesian inference has been used, are priors on model parameters reported? – If a frequentist approach has been used, are starting values for the optimisation reported?

(h) Are the estimated parameter values reported including confidence/credible intervals?
(i) Is the method used to derive these intervals reported and documented?
(j) Are the optimal values of the objective function for calibration (e.g. the log‐likelihood) resulting from the parameter optimisation reported?
(k) Are plots of the calibrated GUTS models in comparison with the calibration data over time provided, and does the visual match appear of acceptable quality?
(l) Has a posterior predictive check been performed and documented?
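For item (j), the objective function typically used for survival data is, up to a combinatorial constant, the multinomial likelihood of death counts per observation interval. A minimal sketch with made-up observation data:

```python
import math

def log_likelihood(deaths, survivors_end, S):
    """Multinomial log-likelihood (up to a constant) of observed death counts,
    given modelled survival probabilities S[i] at each observation time
    (S[0] = 1 at t = 0). deaths[i] is the number dying in interval (t[i], t[i+1]]."""
    ll = 0.0
    for i, d in enumerate(deaths):
        p = S[i] - S[i + 1]                 # modelled probability of dying in interval i
        ll += d * math.log(max(p, 1e-12))   # floor avoids log(0)
    ll += survivors_end * math.log(max(S[-1], 1e-12))  # animals still alive at the end
    return ll

# Hypothetical data: 2 deaths in the first interval, 3 in the second, 5 survivors.
ll = log_likelihood([2, 3], 5, [1.0, 0.8, 0.5])
```

Reporting this optimal value, as item (j) asks, lets an evaluator re-run an independent fit and compare objective-function values directly.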

8. Evaluation of the sensitivity and uncertainty analysis

For the reduced GUTS models, the influence of the model parameters on the model results is sufficiently well known. If included, results of sensitivity analyses can demonstrate that the model has been implemented correctly. For GUTS models other than the reduced versions, a sensitivity analysis should be included in future applications and checked against the following list.

(a) Has a sensitivity analysis been performed and adequately documented?
(b) Are the results of the sensitivity analysis presented so that the most sensitive parameters can be identified?
(c) Is the parameter uncertainty for the most important TKTD model parameters propagated to the model outputs, and are the results of the uncertainty propagation documented?
(d) Are the model outputs reported including confidence/credible intervals?
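One simple form such an analysis can take is a local, one-at-a-time perturbation of each parameter; the sketch below is a generic illustration, with a toy function standing in for a GUTS endpoint (e.g. survival at the end of the test):

```python
def oat_sensitivity(model, params, delta=0.1):
    """One-at-a-time sensitivity: relative change in model output per
    relative change of each parameter (a crude local sensitivity index)."""
    base = model(params)
    indices = {}
    for name, value in params.items():
        perturbed = dict(params, **{name: value * (1 + delta)})
        indices[name] = (model(perturbed) - base) / (base * delta)
    return indices

# Toy stand-in for a GUTS endpoint; "a" contributes twice as strongly as "b".
toy = lambda p: p["a"] * 2 + p["b"]
s = oat_sensitivity(toy, {"a": 1.0, "b": 1.0})
```

Sorting the resulting indices by absolute value directly answers item (b): the most sensitive parameters are those with the largest indices.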

9. Evaluation of the model by comparison with independent measurements (model validation)

Validation data are used to test the model performance for predictions of mortality/immobility under exposure profiles which have not been used for model calibration. The performance of the model is usually evaluated by comparing relevant model outputs with measurements (often referred to as model validation). For GUTS, relevant outputs are the simulated mortality/immobility probability or LPX/EPX values. The following checklist is mandatory only for invertebrates; for vertebrates, a case‐by‐case check is needed (see also Sections 7.7.2 and 4.1.4.5).

(a) Are effect data available from experiments under time‐variable exposure?
(b) Is mortality or immobility reported for at least seven time‐points in the validation data set?
(c) Are two exposure profiles tested, with at least two pulses each, separated by no‐exposure intervals of different durations?
(d) Is the individual depuration and repair time (DRT95) calculated, and is the duration of the no‐exposure intervals defined accordingly?
(e) Is each profile tested at a minimum of three concentration levels, in order to obtain low, medium and high effects at the end of the respective experiment?
(f) Has attention been paid to the duration of the experiments, considering the time course of toxicity development for the specific pesticide?
(g) Does the visual match (‘visual fit’ in FOCUS Kinetics, 2006) indicate acceptable quality of the model predictions in comparison with the validation data?
(h) Do the reported quantitative model performance criteria (e.g. PPC, NRMSE, SPPE) indicate a sufficient model performance?
(i) Has the performance of the model been reported in an objective and reproducible way?
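Item (d) relies on the DRT95 and item (h) on quantitative performance criteria. The sketch below uses DRT95 = ln(20)/kd for the reduced model (time for 95% of scaled damage to be cleared) and common definitions of NRMSE and SPPE, which may differ in detail from those prescribed in the guidance:

```python
import math

def drt95(kd):
    """Depuration/repair time for the reduced GUTS model: with first-order
    damage decay D(t) = D0*exp(-kd*t), 95% is cleared when t = ln(20)/kd."""
    return math.log(20.0) / kd

def nrmse(obs, pred):
    """Root-mean-square error of predicted vs observed survivor numbers,
    normalised by the mean observation (often reported as a percentage)."""
    mse = sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)
    return math.sqrt(mse) / (sum(obs) / len(obs))

def sppe(obs_end, pred_end, n_initial):
    """Survival-probability prediction error at the end of the test, in %:
    difference between observed and predicted final survivors, relative
    to the initial number of animals."""
    return 100.0 * (obs_end - pred_end) / n_initial
```

For example, with kd = 0.5 per day the DRT95 is about 6 days, so no-exposure intervals both shorter and longer than this would probe the model's handling of incomplete versus complete recovery between pulses.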

10. Evaluation of model use

When using a TKTD model for regulatory purposes, inputs of species‐ and compound‐specific model parameters and of exposure profile data are required to run the model under new conditions. At this stage, it is important that the model is well documented and that it is clear how the model works. Please check the following items:

(a) Is the use of the model sufficiently documented?
(b) Is an executable implementation of the model made available to the reviewer, or is at least the source code provided?
(c) Has a summary sheet been provided by the applicant? The summary sheet should provide quick access to the comprehensive documentation with sections corresponding to the ones of this checklist.
(d) Does the exposure profile used with the TKTD model come from the same source as the PEC used with the Tier‐1 effects data? For example, if FOCUS Step 3 maximum values were used at Tier‐1, do the exposure profiles used at Tier‐2C come from the same FOCUS Step 3 modelling? If the exposure profile comes from any other source (e.g. different scenarios, different inputs, different model), has this been checked?
(e) Further points to be checked by evaluators:
– Use an independent implementation of GUTS to test whether the output of the evaluated model implementation can be reproduced for some parameter sets.
– The MOSAIC_GUTS web‐platform (http://pbil.univ-lyon1.fr/software/mosaic/guts) can be used to test the model calibration.
– The GUTS Shiny App (http://lbbe-shiny.univ-lyon1.fr/guts-shinyapp/) can be used to test model predictions under a specific constant or time‐variable exposure profile, given the set of model parameters.