2024 May 28;12:RP84855. doi: 10.7554/eLife.84855

Figure 1. Data and processing pipeline overview.

(a), left, depicts an example news article and the types of data extracted from its text. Green and blue highlights mark the quotes and their associated speakers, as identified by the coreNLP pipeline. A custom script, described in the Methods section, identifies all citations. (a), right, charts the analyses performed on the names and locations extracted from news articles and from papers published by Nature. (b) shows the types and numbers of articles used in the analyses.

Figure 1—figure supplement 1. Benchmark data.

The performance of gender prediction for pipeline-identified quoted speakers.