To disentangle the model-derived predictions into distinct linguistic dimensions, the lexical (word-level) predictions were analyzed along three dimensions: syntactic, phonemic, and semantic. For the prediction about syntactic category, part-of-speech (POS) tagging was performed over all potential sentences (e.g., “It made the boy sad to think,” “It made the boy sad to see,” etc.). To compute the phonemic prediction, each predicted word was decomposed into its constituent phonemes, and the predicted probabilities were used as a contextual prior in a phoneme model (Fig. 6). For the semantic prediction, the expected semantic vector was obtained as a weighted average over the GloVe embeddings of all words predicted by GPT-2.
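As an illustration of the syntactic and semantic steps, the following is a minimal sketch and not the implementation used in the study. It assumes that the category prediction is obtained by summing the lexical probabilities of all candidate words sharing a POS tag, and that the weights of the semantic average are the predicted word probabilities, renormalized over the candidate set. The function names, the toy probabilities, the tags, and the two-dimensional vectors standing in for GloVe embeddings are all hypothetical.

```python
from collections import defaultdict


def pos_distribution(word_probs, pos_of):
    """Sum the probabilities of all candidate words that share a POS tag.

    Assumption: probabilities are renormalized over the candidate set.
    """
    total = sum(word_probs.values())
    dist = defaultdict(float)
    for word, p in word_probs.items():
        dist[pos_of[word]] += p / total
    return dict(dist)


def expected_vector(word_probs, embeddings):
    """Probability-weighted average of the candidate words' embeddings."""
    total = sum(word_probs.values())
    dim = len(next(iter(embeddings.values())))
    expected = [0.0] * dim
    for word, p in word_probs.items():
        for i, x in enumerate(embeddings[word]):
            expected[i] += (p / total) * x
    return expected


# Toy values for the context "It made the boy sad to ..." (hypothetical).
word_probs = {"think": 0.5, "see": 0.25, "him": 0.25}
pos_of = {"think": "VERB", "see": "VERB", "him": "PRON"}
# Stand-ins for GloVe vectors; real embeddings have far more dimensions.
embeddings = {"think": [1.0, 0.0], "see": [0.0, 1.0], "him": [1.0, 1.0]}

pos_probs = pos_distribution(word_probs, pos_of)
semantic_prediction = expected_vector(word_probs, embeddings)
```

With these toy values, the predicted probability of a verb is 0.75 and the expected semantic vector is [0.75, 0.5]. The phonemic prediction is omitted here because it depends on the phoneme model shown in Fig. 6.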