Abstract
This paper describes a coding algorithm and corresponding dataset of negotiation interventions and negotiation interactions by country parties and groupings in the multilateral negotiations under the United Nations Framework Convention on Climate Change (UNFCCC). The data is obtained by scraping and automatically coding the negotiation summaries published in the Earth Negotiations Bulletins (ENBs) between 1995 and 2023. The data is validated by comparing it with a hand-coded dataset of negotiation interactions under the UNFCCC. One limitation discovered upon validation is that our automated procedure finds significantly fewer opposition interactions than the hand-coding procedure. The main reason for this is that the algorithm identifies negotiation interactions on the basis of individual sentences, while the hand coding is able to identify them across sentences and even paragraphs. However, the distribution of opposition interactions seems to be representative of the larger dataset and therefore not substantively biased. We describe possible uses of this data in research, and provide the algorithm, which can be adapted for application to other negotiations covered by the ENBs.
Subject terms: Politics, Climate change
Background & Summary
Climate change is one of the defining global challenges of our time. Understanding when, why, and which countries – and other stakeholders – decide to cooperate in addressing this challenge is therefore a key goal of environmental governance research. This article aims to help address this goal by offering an up-to-date dataset of countries’ oral interventions and exchanges during the meetings of the United Nations Framework Convention on Climate Change (UNFCCC).
The UNFCCC is the main multilateral body under which its 198 parties (197 countries plus the European Union) regularly meet to discuss and agree actions for addressing climate change. This includes, for example, the 1997 Kyoto Protocol and the 2015 Paris Agreement, and all decisions implementing the provisions in those agreements.
Meetings of the UNFCCC Conference of the Parties (COP) or its permanent subsidiary bodies – the Subsidiary Body for Implementation (SBI) and the Subsidiary Body for Scientific and Technological Advice (SBSTA) – have taken place at least every six months since the Convention entered into force in 1995, while other ad-hoc subsidiary bodies might meet more frequently when new agreements are being negotiated. During these meetings, delegates from each party and from admitted observer organizations (such as environmental NGOs, civil society groups, business associations, or intergovernmental organizations) get together during one or two weeks to exchange views on the implementation of existing agreements, discuss the need to establish new bodies, and negotiate new mitigation or adaptation measures. These discussions typically start with open plenaries in which the agenda is agreed, and which are open to observers. They then progress to specialized contact groups and other informal settings in which technical discussions on specific agenda items are held, or to drafting groups in which the text of new agreements is drafted1,2. Such informal meetings may or may not be open to observers, depending on how contentious the issues under discussion are. For the most intractable and final issues, facilitators conduct bilateral meetings with individual parties or party groups, trying to identify parties’ real bottom lines and potential areas of compromise. Secrecy and confidentiality are key components of these bilateral consultations3,4. Periodically, the progress of work under contact and informal groups and in bilateral discussions is reported back to and discussed in the plenary.
Another crucial characteristic of the UNFCCC negotiations is the tendency of parties to organize into groups – coalitions or alliances – of like-minded countries, which join forces to coordinate positions and negotiate together. These groups (called “groupings” in this paper) help to structure and simplify the discussions, as reaching agreement is easier between 15 or 20 groups than between almost 200 parties5.
As in many other multilateral bodies, including the Council of the European Union6,7, the World Trade Organization8, and even the UN General Assembly9, decisions under the UNFCCC are traditionally taken by consensus10. This means that the agreement of all parties – or at least a lack of open dissent – is needed for any decision to be taken. It also means that there are no records of roll-call votes that can provide information about which parties support which decisions. In such a context, the discussions during the negotiation phase become crucial for understanding the decision-making process. However, there is hardly any systematic data available on consensus decision-making processes, despite how widespread they are (but see7,9,11 for some notable exceptions).
So far, such negotiations have therefore been analysed mostly through qualitative case studies of individual negotiation rounds4,12,13. While such in-depth analyses are very valuable due to their high level of detail and their frequent use of insider information, more systematic data can help researchers gain a comparative understanding of negotiation processes over time, or across issue areas.
While there are no official transcripts of the discussions in the UNFCCC meetings, since 1995 the International Institute for Sustainable Development (IISD) has published daily summaries of the negotiations in its Earth Negotiations Bulletins (ENBs). IISD publishes such summaries not only for the climate change negotiations, but also for around 50 other sustainable development or environment-related intergovernmental bodies and agreements (for an overview, see https://enb.iisd.org/negotiations). The ENBs thus represent a crucial historical record of environmental negotiations, which can be used to understand not only their substance and progress, but also how active the various parties have been over time, and how they interact with each other. The ENBs are written by experienced academics who are specifically trained to write independent, concise and objective reports14. As a result, the bulletins are clearly structured, and use clear, objective, and consistent language that is ideal for automatic coding.
The ENBs’ main target audience is negotiation delegates themselves, who use the bulletins to follow daily negotiation progress. However, researchers have used them in the past to map the issues being discussed in the climate negotiations and countries’ level of activity in the discussions14, to explain how differential treatment has affected bargaining behaviour in the negotiations11, to trace the emergence of negotiation coalitions (groupings) over time15, or to describe the positions of particular groupings16,17.
Some of this previous work has used hand-coding of the ENBs to collect systematic data on the negotiations11,18. This article builds upon that work to generate a Python package to systematically code the ENB negotiation summaries. It thereby presents a new, comprehensive dataset of participation, cooperation and conflict in the UN climate negotiations during the period 1995–2023. The dataset compiles information on:
which country parties or groupings (more generally called entities in the following sections) speak in the negotiations and when (interventions);
the cooperative or conflictual ways in which parties and groupings interact with each other through their statements in the negotiations (cooperative and conflictual interactions).
For each intervention and interaction, the dataset records the intervening actors, the negotiation date, an identifier for the respective ENB issue, the sentence within that ENB issue from which the information was obtained, as well as the corresponding headings and subheadings within the ENB issue. These headings and subheadings are then used to classify the interventions and interactions in terms of the negotiation body as well as the agenda item or issue area within the negotiations to which they relate. The data can be used, for example, to analyse which parties and groupings are active in the negotiations overall, at specific points in time, or with respect to specific issue areas. As an example, Fig. 1 illustrates how actively countries around the world have participated orally in the negotiations across the whole period covered by the dataset. Given that the EU usually speaks as one entity, all of its member states have been assigned the EU’s number of interventions. Interventions by other groupings are not included in this illustration. Clearly, while Global North countries have been the most active individual countries so far, emerging economies including China, Brazil, India and several Latin-American countries have been very active as well.
Fig. 1.
Oral interventions in UNFCCC negotiations in the period 1995–2023, as reported in the ENBs. Only interventions by individual country parties are displayed in the map.
The dataset can also be used to explore which countries raise their own voice, and which ones rely on the groupings (coalitions and ad-hoc groups) to which they belong in order to make their positions known. Social network analysis methods can be applied to the data to investigate who agrees or disagrees with whom at what points in time during the negotiations. Figure 2, for example, shows the network of cooperative negotiation interactions during the year 2015, in which the Paris Agreement was adopted. Nodes are classified according to whether the entity involved is an individual country or a country grouping. While many individual countries were active in the negotiations, groupings clearly take a central role.
Fig. 2.
Network of cooperative negotiation interactions in the year 2015. Individual countries displayed in light blue, groupings in yellow. The size of the nodes indicates degree centrality.
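As a minimal illustration of the network analysis mentioned above, the degree centrality used to size the nodes in Fig. 2 can be computed directly from the interaction pairs in the dataset. The function below is a plain-Python sketch, not part of the released package; real analyses would typically use a dedicated library such as networkx and might weight repeated interactions instead of deduplicating them:

```python
from collections import Counter

def degree_centrality(edges):
    """Normalized degree centrality from (entity_a, entity_b) interaction pairs.
    Pairs are canonicalized so that (A, B) and (B, A) count as one tie, and
    repeated ties are counted once; a weighted analysis would keep multiplicities."""
    deg = Counter()
    for a, b in {tuple(sorted(e)) for e in edges}:  # unique, undirected ties
        deg[a] += 1
        deg[b] += 1
    n = len(deg)
    # Standard normalization: divide by the maximum possible degree (n - 1)
    return {v: d / (n - 1) for v, d in deg.items()} if n > 1 else {}
```

For example, with ties EU–NORWAY, NORWAY–EU, and EU–CHINA, the EU receives a centrality of 1.0 and the two partners 0.5 each.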
Figure 3, finally, shows two networks of cooperative negotiation interactions in the year 2015, one focusing only on negotiations relating to mitigation, the other on adaptation. Nodes are classified, this time, according to whether the entity involved belongs to the Annex I group of developed countries, or to the non-Annex I group of developing countries. The network graphs suggest that discussions on mitigation are quite polarized, with several small groups of countries completely separated from the main network. Notably, important developed country parties such as the United States and the EU have no ties to developing countries. The network of adaptation interactions, in contrast, is notably more cohesive, with more links between developed and developing countries. This suggests that discussions on adaptation are less contentious than those on mitigation. In addition, vulnerable countries, in particular the Least Developed Countries (LDCs) but also the Alliance of Small Island States (AOSIS), play a more central role in the adaptation discussions.
Fig. 3.
Network of cooperative negotiation interactions by issue area in the year 2015. Annex I parties displayed in brown, non-Annex I parties in green.
The goal of the dataset is therefore to provide a resource to other scholars for advancing research on the climate change negotiations, but also to offer an automatic coding algorithm that can easily be applied to other environmental negotiations reported in the ENBs. Both the dataset for the climate negotiations and the Python code for obtaining the data are available open source.
Methods
We develop an algorithm to extract oral interventions, as well as on behalf, agreement, support, and opposition interactions from the reports by the Earth Negotiations Bulletin (ENB). The extraction of interventions relies on a list of parties and groupings to identify every mention of each of them in the substantive sections of the ENB reports. The algorithm to extract interactions relies on the grammatical structure of sentences to identify when parties and groupings speak on behalf of each other, agree together, support each other, or oppose one another. The ENB staff uses specific language and structure to report on the negotiations, such as “Australia, the UK, and the US proposed...” and “China, opposed by Vanuatu and Lesotho, suggested to...”, which makes it possible to define rules for extracting interactions between parties.
Our algorithm works in two steps:
Tag parties, groupings, and interaction markers (e.g., list of parties, “Party A opposed by Party B”, etc.) with a tailor-made part-of-speech tagger.
Parse tagged sentences to extract interventions and interaction data.
In total, we extract 57375 interventions and 86763 interactions from 91 meetings between 1995 and 2023.
Data Collection & Pre-processing
The ENB provides three types of reports of the climate negotiations. Curtain raisers are published on the first day of a negotiation meeting, and provide an overview of the state of play and the issues to be discussed at the meeting. The daily issues cover one day of negotiations, providing an account of the main points addressed during the discussions. Summary reports provide an overview of the discussions during the full 1- or 2-week meeting, as well as a summary of the outcomes (decisions and agreements taken) and an analysis of the process. For completeness, we extracted interventions and interactions from all types of ENB reports. However, because the curtain raisers do not offer an account of the current negotiations, and the summary reports typically repeat much of the information already provided in the daily issues, researchers using the data may wish to exclude curtain raisers and summary reports from their analysis.
The ENB reports provide an overview of all negotiations, while maintaining the confidentiality of sensitive discussions. This means that they explicitly mention which actors (country parties, country groupings, and observer organizations) make a statement or support which view only when reporting on meetings that are open to observers. When reporting on closed meetings, the reports just summarize the positions and proposals being discussed, but without attribution. The decision about when to close meetings to observers is taken by the parties on a case-by-case basis. While the overall policy is to maintain transparency as far as possible, when discussions turn very sensitive they may be closed. This happens, in particular, towards the end of a negotiation, when the main sticking points need to be decided upon at a political level. Usually, the progress and results of closed discussions are periodically reported and discussed in plenary meetings that are open to observers, and therefore covered in our dataset. For this reason, we are confident that the dataset reflects discussions across all issue areas addressed in the negotiations, even if it only covers those discussions that are reported in full.
From the ENB website (https://enb.iisd.org), we retrieve issues of the bulletin in HTML format for all UNFCCC meetings between 1995 and 2023. In total, we download 748 issues that cover 29 Conferences of the Parties, 29 Subsidiary Bodies meetings, and 33 talks, summits, dialogues, and workshops. We preprocess the HTML files in order to recover a clean text from which we will extract interventions and interactions between parties. We keep both the titles (denoted by <strong> tags or <h> tags) and the content of paragraphs (denoted by <p> tags). We remove all other HTML tags. We also remove the sections “In the corridors” (rumours about the negotiations), “brief analysis” (short summary of the day), “things to look for” (pointers to key events for the upcoming day), and other similar sections containing commentary about – rather than summary of – the negotiations.
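The pre-processing step can be sketched with Python’s standard-library HTML parser. The class name below is hypothetical and the sketch simplifies the actual pipeline (for instance, it does not separately track bolded text at the start of a paragraph, which the ENBs sometimes use as an implicit subheading):

```python
from html.parser import HTMLParser

class ENBTextExtractor(HTMLParser):
    """Collect heading and paragraph text from an ENB HTML issue,
    dropping all other markup (navigation, scripts, etc.)."""
    KEEP = {"p", "strong", "h2", "h3", "h4"}

    def __init__(self):
        super().__init__()
        self.chunks = []   # (tag, text) pairs in document order
        self._open = None  # the kept tag we are currently inside, if any
        self._buf = []

    def handle_starttag(self, tag, attrs):
        # Only open a new chunk at the top level, so a <strong> nested
        # inside a <p> stays part of the paragraph text
        if tag in self.KEEP and self._open is None:
            self._open, self._buf = tag, []

    def handle_data(self, data):
        if self._open is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._open:
            text = "".join(self._buf).strip()
            if text:
                self.chunks.append((self._open, text))
            self._open = None

parser = ENBTextExtractor()
parser.feed("<h2>SBSTA</h2><p>The <strong>US</strong> proposed text.</p><div>menu</div>")
```

After feeding an issue, `parser.chunks` contains an ordered list of headings and paragraphs, with content such as the `<div>` navigation above discarded.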
Finally, we split the text into sentences from which we will extract the interventions and interactions data. While doing this, the headings and subheadings preceding each sentence, as well as a paragraph counter, are concatenated into a variable “heading”. Because the data is split into individual sentences, the algorithm is unable to identify relational references across multiple sentences. The implications of this limitation are analyzed in the Section Technical Validation.
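This splitting step can be sketched as follows, assuming (kind, text) chunks produced by a pre-processing pass over the HTML. The naive sentence splitter shown here mishandles abbreviations, which the actual package must treat more carefully:

```python
import re

def sentences_with_headings(chunks):
    """Split paragraph text into sentences and attach the current heading,
    subheading, and a paragraph counter, mirroring the "heading" variable."""
    rows, heading, subheading, par = [], "", "", 0
    for kind, text in chunks:
        if kind == "h2":
            heading, subheading = text, ""      # new main heading resets subheading
        elif kind in {"h3", "h4", "strong"}:
            subheading = text
        else:                                   # a paragraph of report text
            par += 1
            # Naive splitter: sentence-final punctuation, whitespace, capital
            for sent in re.split(r"(?<=[.!?])\s+(?=[A-Z])", text):
                label = " | ".join(x for x in (heading, subheading) if x)
                rows.append({"heading": f"{label} | par.{par}", "sentence": sent})
    return rows
```

Each resulting row carries exactly the per-sentence context that the later extraction steps rely on.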
Multi-Word Tokenization
To identify parties and party groupings (together referred to as “entities”) in sentences, we define a list of parties and groupings with their aliases (i.e., all variations of an entity name, including typos). For example, the grouping “G77/China” appears under several spellings: “Group of 77 and China”, “G77-China”, and “G-77/China”. We provide this list together with the code. It includes 201 parties and 43 groupings. Groupings are identified based on their appearance in the ENBs themselves, on official listings on the UNFCCC website (https://unfccc.int/process-and-meetings/parties-non-party-stakeholders/parties/party-groupings), and on the literature5.
The sentences in our dataset are then split into multi-word tokens derived from the list of entities. For example, “USA”, “the US”, and “United States of America” form three different tokens derived from the canonical “US” entity.
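The tokenization can be sketched as a longest-match replacement over the alias table. The table below is a small hypothetical excerpt; the full lists are distributed with the code:

```python
import re

# Hypothetical excerpt of the alias table: variant spellings -> canonical name
ALIASES = {
    "United States of America": "US",
    "USA": "US",
    "the US": "US",
    "Group of 77 and China": "G77/China",
    "G77-China": "G77/China",
    "G-77/China": "G77/China",
}

def canonicalize(sentence, aliases=ALIASES):
    """Replace each alias with its canonical entity name, matching longer
    aliases first so multi-word names are not broken up by shorter matches."""
    for alias in sorted(aliases, key=len, reverse=True):
        sentence = re.sub(rf"\b{re.escape(alias)}\b", aliases[alias], sentence)
    return sentence
```

For instance, `canonicalize("The United States of America and G77-China spoke.")` yields `"The US and G77/China spoke."`.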
Part-of-Speech Tagging
To identify interactions in sentences, we use part-of-speech tagging with custom tags to mark up tokens. We tag party tokens with a <PAR> tag and grouping tokens with a <GRP> tag. We tag on behalf, support, and opposition interactions, marked for example with “on behalf of”, “supported by”, and “opposed by”, with <OBH>, <SUP>, and <OPP> tags, respectively. We also tag agreement interactions, i.e., lists of parties and groups that are assumed to state the same negotiation position, with the <AGR> tag. It may happen that parties are listed for reasons other than sharing the same position, e.g., when the ENB lists the authors of various amendment proposals. This would be captured as an agreement by our algorithm, making it a false positive. We note, however, that this occurs very rarely. We report in Table 1 the full list of interactions and their markers.
Table 1.
List of interactions, their markers, and the recorded data points.
| Interaction | Tag | Markers [A “marker” B] | Example | Data point |
|---|---|---|---|---|
| Agreement | <AGR> | List of parties separated by commas and “and/or/with” | ‘The US, with AUSTRALIA, SLOVENIA, NORWAY, and CANADA, supported the consideration of a multi-year work programme, and the streamlining of each sessions agenda.’ | (A, B, agr.) & (B, A, agr.) |
| On behalf of | <OBH> | “speaking on behalf of”, “on behalf of”, “speaking for”, “for” | ‘VENEZUELA, for Bolivia, Cuba and Paraguay, expressed concerns with offsetting.’ | (A, B, on behalf) |
| Support | <SUP> | “supported by” | ‘JAMAICA, supported by SUDAN, the PHILIPPINES, VANUATU and ZAMBIA, called on the CMP to facilitate mobilization of additional funds during CMP 8.’ | (B, A, support) |
| Opposition | <OPP> | “opposed by”, “while”, “whereas” | ‘CANADA and the US, opposed by PAKISTAN and INDIA, suggested considering preambular text at a later stage of negotiations.’ | (B, A, opposition) |
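The tagging step described above can be sketched as a lookup over the entity and marker lists. The sets below are small hypothetical excerpts (the full lists ship with the code), and in practice the common preposition “for” requires contextual disambiguation before it can be treated as an on-behalf marker:

```python
# Hypothetical excerpts of the party, grouping, and marker lists
PARTIES = {"US", "CHINA", "NORWAY", "VENEZUELA", "JAMAICA"}
GROUPINGS = {"EU", "G77/China", "AOSIS"}
MARKERS = {"on behalf of": "<OBH>", "for": "<OBH>",
           "supported by": "<SUP>", "opposed by": "<OPP>"}

def tag_tokens(tokens):
    """Attach the custom part-of-speech tags to a token sequence in which
    multi-word markers have already been merged into single tokens."""
    tagged = []
    for tok in tokens:
        if tok in PARTIES:
            tagged.append((tok, "<PAR>"))
        elif tok in GROUPINGS:
            tagged.append((tok, "<GRP>"))
        elif tok == ",":
            tagged.append((tok, "<,>"))       # commas matter for the patterns
        elif tok.lower() in MARKERS:
            tagged.append((tok, MARKERS[tok.lower()]))
        else:
            tagged.append((tok, None))        # ordinary word, no custom tag
    return tagged
```

Applied to the token sequence for “NORWAY, supported by CHINA, proposed …”, the function yields the tag sequence `<PAR><,><SUP><PAR><,>` over which the extraction patterns can then be matched.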
Extracting Interactions With Regular Expressions
We use regular expressions to match specific structures in sentences and extract interactions. For each type of interaction, we define one or more patterns to match. For example, the regular expression <PAR><,>?<OBH><GRP><,>? is able to match both the patterns “Party A, on behalf of Grouping B,” and “Party A for Grouping B” to extract on behalf interactions. Because the order of the interacting entities carries information about the nature of the interaction, we also record the data points as ordered pairs. We show in Table 1 this ordering for each type of interaction.
All in all, this approach enables us to automatically extract interactions in sentences as complex as
“NORWAY, supported by AUSTRALIA and the EU, but opposed by BRAZIL, CHINA, INDIA and KENYA, suggested [...].”
From this sentence, our algorithm correctly extracts (i) an agreement between Brazil, China, India, and Kenya, (ii) their opposition to Norway, Australia, and the EU, and (iii) the support of Australia and the EU to Norway.
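One way to run regular expressions over tag sequences is to render each tagged token as `<TAG:index>` and match the resulting string. The sketch below implements only the on-behalf pattern from Table 1 and illustrates the idea rather than reproducing the package’s actual implementation:

```python
import re

def render(tagged):
    """Render tagged tokens as '<TAG:index>' items so that patterns such as
    <PAR><,>?<OBH><GRP><,>? can be matched with an ordinary regex. Tokens
    without a custom tag are skipped; indices point back into `tagged`."""
    return "".join(f"<{t[1:-1]}:{i}>" for i, (_, t) in enumerate(tagged) if t)

# Regex form of the on-behalf pattern <PAR><,>?<OBH><GRP><,>? from Table 1
ON_BEHALF = re.compile(r"<PAR:(\d+)>(?:<,:\d+>)?<OBH:\d+><GRP:(\d+)>(?:<,:\d+>)?")

def extract_on_behalf(tagged):
    out = []
    for m in ON_BEHALF.finditer(render(tagged)):
        a = tagged[int(m.group(1))][0]   # the party speaking
        b = tagged[int(m.group(2))][0]   # the grouping spoken for
        out.append((a, b, "on-behalf"))
    return out
```

The captured indices make it straightforward to recover the original tokens, so the ordered data points of Table 1 fall out of the match directly.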
Interventions
As a by-product of tagging parties and groupings to extract interactions, it is straightforward to extract interventions, i.e., a party or party grouping speaking during the negotiations. We use the following rules to count interventions:
If party A speaks during a negotiation, then we count one intervention for A.
If party A speaks on behalf of a grouping B, then we count an intervention for the grouping B but not for the party A.
If party A speaks on behalf of another party B, then we count an intervention for both parties.
This enables us to extract 57375 interventions by parties.
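The three counting rules can be sketched as follows, assuming a list of (speaker, on_behalf_of) pairs extracted in the previous step; the grouping list is a hypothetical excerpt of the one shipped with the code:

```python
from collections import Counter

GROUPINGS = {"EU", "G77/China", "AOSIS"}   # hypothetical excerpt

def count_interventions(speeches):
    """speeches: (speaker, on_behalf_of) pairs, with on_behalf_of None
    when an entity speaks only for itself."""
    counts = Counter()
    for speaker, behalf in speeches:
        if behalf is None:
            counts[speaker] += 1      # rule 1: A speaks, A is credited
        elif behalf in GROUPINGS:
            counts[behalf] += 1       # rule 2: grouping credited, delegate not
        else:
            counts[speaker] += 1      # rule 3: party for party, both credited
            counts[behalf] += 1
    return counts
```

For example, Venezuela speaking on behalf of AOSIS credits only AOSIS, while Venezuela speaking on behalf of Bolivia credits both countries.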
Headings, Negotiation Bodies, and Issue Areas
We leverage the structural organization of the ENBs to extract information on negotiation bodies and issue areas. The ENBs use a hierarchical structure with headings and subheadings to organize content; usually, main headings identify negotiation bodies, while subheadings indicate specific agenda items or topics being discussed. However, the formatting conventions used in the ENBs have evolved over the years. To account for these variations, we designed our algorithm to recognize multiple structural markers in the HTML source code: primary headings (tagged as <h2>), secondary headings (<h3>), tertiary headings (<h4>), bolded text (<strong>), italicized text (<em>), as well as formatted text at the beginning of paragraphs that serves as implicit subheadings. By capturing all these different formatting patterns, our algorithm successfully identifies the relevant organizational structure for 56979 out of 57375 interventions (99.3%), so that only 396 interventions without a corresponding heading or subheading remain. Once extracted, we apply keyword-based methods to classify the headings and subheadings into specific negotiation bodies and issue areas.
Negotiation bodies are the various groups into which the negotiations are structured. They include, for example, the plenaries of the Conference of the Parties (COP) and of its permanent subsidiary bodies (SBSTA and SBI), but also groups that are created on an ad-hoc basis to negotiate new agreements (such as the Ad-Hoc Group on the Berlin Mandate – AGBM, tasked with negotiating the Kyoto Protocol, or the Ad Hoc Working Group on the Durban Platform for Enhanced Action – ADP, which was in charge of negotiating what became the Paris Agreement). There are also smaller bodies, such as contact groups or informal groups, usually convened to discuss specific issues in the agenda. In addition, the COPs include High-Level Segments where ministers or heads of state can issue statements. Information on the corresponding negotiation body is not available in the headings and subheadings of 16082 out of 57375 interventions (28.1%) and 23031 out of 86763 interactions (26.5%). There are several reasons for this. A central one is that the full dataset includes the curtain raiser issues of the ENBs. These offer an overview of previous developments in the negotiations, rather than a summary of the current ones, so their headings and subheadings do not necessarily specify a negotiation body. In other cases, headings and subheadings include information on issue areas, but not on the negotiation bodies in which these are being discussed.
Issue areas reflect the substantive topics under negotiation. These include central aspects of addressing climate change, such as mitigation, adaptation, or the provision of financial support to developing countries, but also more specific and technical issues such as forestry and land use change, agriculture, reporting, compliance, or the science-policy interface. Discussions may also deal with the organization of the negotiations themselves, or with institutional arrangements. We classify issue areas using keywords selected with the purpose of identifying the various agenda items discussed under the UNFCCC. For selecting appropriate keywords, we rely on the classification of agenda items by Allan and Bhandary19, but expand their classification to include some more detailed sub-issues, in particular within the overall area of mitigation (such as, e.g., market mechanisms, forestry, or emissions from aviation and maritime transport). Issue areas may partly overlap (for example, when dealing with reporting about mitigation actions). We allow for such overlaps by creating individual dummy variables for each issue area. Overall, the classification of issue areas covers 65.3% of the interventions and 63.4% of the interactions. Missing data on issue areas is due to the fact that the ENB headings and subheadings may at times not include information on issue areas, but just on negotiation bodies. This is particularly the case when discussions take place in plenaries that may cover many issue areas at the same time, or in ad-hoc bodies negotiating various aspects of a new agreement.
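The dummy-variable construction can be sketched as a keyword lookup over the “heading” field. The keyword lists shown here are hypothetical excerpts; the released codebook defines the actual lists:

```python
# Hypothetical excerpt of the issue-area keyword lists
ISSUE_KEYWORDS = {
    "mitigation": ["mitigation", "emission reduction"],
    "adaptation": ["adaptation"],
    "finance": ["finance", "financial mechanism"],
}

def classify_issue_areas(heading):
    """Return one dummy variable per issue area. Because keyword hits are
    independent, overlapping areas can switch on several dummies at once."""
    h = heading.lower()
    return {area: int(any(kw in h for kw in keywords))
            for area, keywords in ISSUE_KEYWORDS.items()}
```

A heading such as “SBI | Adaptation Fund | par.3” switches on the adaptation dummy only, while a heading mentioning both finance and mitigation would switch on both.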
For more complete coding of issue areas, researchers using this dataset can apply natural language processing methods to the sentences themselves to find out further details on the topics or issue areas addressed in negotiation interventions and interactions. We refrain from offering that type of classification within our algorithm because our aim is to offer a dataset that mirrors the structure of the negotiations, as reflected in the ENBs, as closely as possible. Applying NLP methods to the sentences adds a further layer of interpretation and subjectivity that each researcher should decide upon for themselves.
Data Records
The dataset comprises three main data files. All of them are available in csv format for ease of access and use, and can be obtained from the SWISSUBase data repository20.
The data file “2719_UN_ClimateNegotiations_Data_Interventions_v1.0.csv” records every time each party or grouping (i.e., each identified entity) is mentioned in the text of each ENB issue (following the coding rules above, and importantly, after excluding irrelevant sections such as the brief analysis of the meeting, news from the corridors, or the outlook to future meetings). It includes six variables, as shown in Table 2. A second version of the interventions data, including the classification of negotiation bodies and of issue areas, can be found in the file “2719_UN_ClimateNegotiations_Data_Interventions_with_issue_areas_v1.0.csv”.
Table 2.
Variables covered in the three data files.
| Data file | Variable | Variable type | Description |
|---|---|---|---|
| Interventions | id | Numeric | Numeric ID for each recorded negotiation intervention made by an entity (country or country grouping). |
| Interventions | issue_id | Numeric | Numeric ID for each ENB report recorded. Notice that these IDs do not correspond to the ENB issue numbers found in the pdf version of the bulletins. |
| Interventions | entity | String | Entity name or acronym. |
| Interventions | date | String | Date on which the negotiation intervention took place. |
| Interventions | heading | String | Headings and subheadings found in the ENB report that correspond to the current sentence. To better visualize the structure of the ENB report, a numbering of the paragraphs within the ENB report is prepended. |
| Interventions | sentence | String | Sentence from which the negotiation intervention was recorded. Notice that the same sentence may appear several times in the dataset if it mentions various entities. |
| Interactions | id | Numeric | Numeric ID for each recorded negotiation interaction between two entities. |
| Interactions | issue_id | Numeric | Numeric ID for each ENB report recorded. Notice that these IDs do not correspond to the ENB issue numbers found in the pdf version of the bulletins. |
| Interactions | entity_a | String | Name or acronym of first entity (country or grouping) participating in the negotiation interaction. |
| Interactions | entity_b | String | Name or acronym of second entity (country or grouping) participating in the negotiation interaction. |
| Interactions | type | String | Type of negotiation interaction recorded. There are four types: “agreement”, “on-behalf”, “support”, and “opposition”. Agreement takes place when Country A expresses the same position or statement as Country B. It is coded as a bidirectional interaction: if Country A agrees with Country B, then Country B also agrees with Country A. On behalf takes place when an ad-hoc group of countries (including Country B, but possibly also Country C, D, E, which would be coded as additional observations) delivers a common position statement on an issue, so that Country A is speaking for all of the others. This coding does not include instances when Country A speaks on behalf of a grouping to which it is a recognized member. It is coded as a unidirectional interaction from Country A towards Country B. Support takes place when Country B’s statement or position is supported by Country A. It is coded as a unidirectional interaction from Country A towards Country B. Opposition takes place when Country A opposes Country B’s statement or position. It is coded as a unidirectional interaction. |
| Interactions | date | String | Date on which the negotiation interaction took place. |
| Interactions | heading | String | Headings and subheadings found in the ENB report that correspond to the current sentence. To better visualize the structure of the ENB report, a numbering of the paragraphs within the ENB report is prepended. |
| Interactions | sentence | String | Sentence from which the current negotiation interaction was coded. |
| Issues | id | Numeric | Numeric ID for each ENB report recorded (same as issue_id above). Notice that these IDs do not correspond to the ENB issue numbers found in the pdf version of the bulletins. |
| Issues | issue_date | Date | Date covered by the ENB issue. |
| Issues | location | String | Location of the meeting (city, country). |
| Issues | meeting | String | Name of the negotiation meeting. |
| Issues | meeting_date | String | Whole time period covered by the negotiation meeting. |
| Issues | type | String | Type of ENB issue. For each negotiation meeting, the first issue is usually a “curtain raiser”, summarizing the developments before the meeting rather than during the meeting. At the end of each meeting, the last issue is a “summary”, which covers both discussions during the latest day of negotiations, and also a summary of the whole meeting and the decisions taken. In between there are daily “issues”, which cover the actual negotiations. Some meetings do not have daily issues and are only covered by a summary. For analysis of negotiation interactions, it may be advisable to remove all curtain raisers and also the summaries of those meetings that have daily issues. |
| Issues | url | String | Link to the html version of the ENB. |
The data file “2719_UN_ClimateNegotiations_Data_Interactions_v1.0.csv” records, for each ENB issue, which parties or groupings interacted with each other during the negotiations, and in what way. There are three types of cooperative interactions (agreeing with, supporting, or speaking on behalf of another entity), as well as one type of conflictual interaction (opposing another entity). A second version of the interactions data, including the classification of negotiation bodies and of issue areas, can be found in the file “2719_UN_ClimateNegotiations_Data_Interactions_with_issue_areas_v1.0.csv”.
The data file “2719_UN_ClimateNegotiations_Data_Issues_v1.0.csv” records the information about each ENB issue and the respective negotiation meeting covered, including, for example, its date, location, meeting type and number, as well as the link to the original ENB report (see Table 2 for further details). The ENB issue ID allows for combining the information from the three data files if necessary.
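Combining the files via the ENB issue ID can be sketched with a plain dictionary join. The rows below stand in for records read with `csv.DictReader` from the files above; a pandas merge on `issue_id` would serve equally well:

```python
def join_with_issues(rows, issues):
    """Attach meeting metadata from the Issues file to intervention or
    interaction rows via the shared ENB issue ID. Both arguments are
    lists of dicts as produced by csv.DictReader."""
    by_id = {issue["id"]: issue for issue in issues}
    joined = []
    for row in rows:
        meta = by_id.get(row["issue_id"], {})   # empty dict if ID is unknown
        joined.append({**row,
                       "meeting": meta.get("meeting", ""),
                       "location": meta.get("location", "")})
    return joined
```

Each intervention or interaction record thereby gains the meeting name and location of the ENB issue it was extracted from.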
All variables are described in Table 2.
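The join between the interactions and issues files via the ENB issue ID can be sketched as follows. This is only an illustration: the record layout and the key name `issue_id` are assumptions, and the actual variable names are documented in Table 2 and the codebook.

```python
# Toy records standing in for the interactions and issues files; the
# real column names may differ (see Table 2) -- "issue_id" is assumed.
interactions = [
    {"issue_id": 101, "sender": "EU", "target": "G77", "interaction": "support"},
    {"issue_id": 102, "sender": "G77", "target": "USA", "interaction": "opposition"},
]
issues = {
    101: {"meeting": "COP 3", "type": "issue"},
    102: {"meeting": "COP 3", "type": "summary"},
}

# Attach meeting metadata to each interaction via the shared ENB issue ID.
merged = [{**row, **issues[row["issue_id"]]} for row in interactions]
```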
Two additional text files are provided. They list the parties and the party groupings that have been identified as being active in the negotiations, as well as their aliases, and were created manually. These lists were used as input for the algorithm to identify interventions and interactions.
The file “2719_UN_ClimateNegotiations_Data_Parties_v1.0.txt” lists all identified parties with a canonical name and, optionally, further aliases (other ways in which the same party is mentioned in the ENBs). In addition, it lists, for each party, the groupings of which it is a member.
The file “2719_UN_ClimateNegotiations_Data_Groupings_v1.0.txt” lists all identified groupings (i.e., alliances or coalitions) with a canonical name and, optionally, with further aliases.
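As the exact file syntax is not reproduced here, the following parser is only a sketch under the assumption of a semicolon-separated line format (canonical name; aliases; groupings); the real files may use a different layout.

```python
def parse_party_line(line):
    """Parse one party entry of the assumed form
    'Canonical name; alias1, alias2; grouping1, grouping2'."""
    def split_list(field):
        return [item.strip() for item in field.split(",")] if field else []

    canonical, aliases, groupings = (part.strip() for part in line.split(";"))
    return {
        "canonical": canonical,
        "aliases": split_list(aliases),
        "groupings": split_list(groupings),
    }

party = parse_party_line("United States of America; US, USA; Umbrella Group")
```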
Finally, a codebook is provided in the file “2719_UN_ClimateNegotiations_Doc_Codebook.pdf”. This includes the data file descriptions, variable descriptions, as well as the lists of negotiation bodies and issue areas.
Technical Validation
To validate the dataset described above, we compared it to a previously hand-coded dataset11,18, which was set up in a very similar way. The hand-coded dataset uses the daily ENB issues about the UNFCCC negotiations to gather systematic information on the participation of parties (i.e., countries that are parties to the UNFCCC) and groupings (i.e., country alliances and coalitions) and on their cooperative and conflictual negotiation behavior. It was collected between 2012 and 2015 by a group of researchers at the University of Zurich and covers the negotiations during the years 1995–2013. Its coding procedure served as a blueprint for developing the algorithm presented above.
In a nutshell, the research team read the reports and manually identified the senders and targets of cooperative and conflictual interactions between party-party or party-grouping dyads. In addition, the human coders annotated the type of interaction, the date on which it took place, the ENB issue in which it was reported, details of the meeting in which it took place, and a quote of the text from which the interaction was coded.
To identify the dyadic interactions, the coders identified sentences that contained specific formulations indicating the various types of cooperative and conflictual interactions. The following types of interaction indicate cooperation: “speaking on behalf of”, “supporting”, and “agreeing with one another”. The interactions “opposition” and “criticism” indicate conflict. As the interaction “criticism” only appears rarely and is rather difficult to distinguish from “opposition”, these two categories were aggregated for the analysis of the data11. Hence, both the hand-coded and the machine-coded dataset presented in this data descriptor rely on the same types of negotiation interactions.
The similarity of the coding procedure – particularly in terms of the substantive interaction categories identified – for both datasets suggests that a comparison between them is an appropriate tool to assess the validity of the machine-coded data. Hence, in this technical validation, we present a number of simple statistics that illustrate that the two data collection strategies yield similar results. Specifically, we compare the distribution of interaction categories and their development over time, the overlap between the sets of parties and party groupings identified, and the distribution of their frequencies in both datasets.
However, there are also important differences between the two coding procedures. Crucially, while the machine-coded dataset relies on individual sentences to identify negotiation interactions on the basis of a limited set of grammatical constructions, the human coders were allowed to read across sentences and to identify more varied and subtle formulations of support or opposition that may stretch across sentences or even full paragraphs. They were also able to interpret implicit references to particular actors (e.g., through the use of pronouns, as in “The US said [...]. They emphasized that [...]”) to identify negotiation interactions. In the natural language processing community, this problem is known as coreference resolution (see http://nlpprogress.com/english/coreference_resolution.html). Several approaches have been proposed, but they all require large amounts of data to train powerful language models, and the size of the present dataset is insufficient for this. By visual inspection, however, we determined that the number of occurrences of such implicit references in the ENB reports is relatively limited.
Furthermore, the human coders were also able to code instances of agreement or opposition when the content of the text (rather than its grammatical form) suggested this (e.g., “Party A supported option 1. Parties B and C preferred option 2. Party D also supported option 1.”). Overall, this resulted in a larger number of interactions identified in the hand-coded dataset, which is particularly noticeable for opposition interactions, as we will see below. These differences in the unit of analysis used in both coding procedures prevent us from applying conventional performance metrics such as precision or recall statistics.
There is also a slight difference in the way in which the “on behalf” interaction was coded, because instances in which one party speaks on behalf of another admit two interpretations. The interpretation in the machine-coded dataset emphasizes that one actor (entity A) is more active in speaking on behalf of another actor (entity B); the relationship is unidirectional. If entity A speaks on behalf of several other entities, this is recorded as several “on behalf” observations, and bi-directional “agreement” between those other entities is recorded in addition. The interpretation in the hand-coded dataset instead emphasizes that a group of entities is speaking together, on the assumption that they have prepared a joint statement or joint intervention in advance. The relationship is thus bi-directional, and all combinations between entity A and the (possibly several) entities B are coded as a “joint statement”. To allow the comparison between both datasets, the “on behalf” interactions in the machine-coded dataset were therefore recoded to match the coding in the hand-coded one.
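The recoding of “on behalf” interactions into the hand-coded convention can be sketched as follows: each statement by a speaker A on behalf of entities B1…Bn becomes a bi-directional “joint statement” for every unordered pair among all entities involved. This is a minimal sketch; the record format is an assumption.

```python
from itertools import combinations

def recode_on_behalf(speaker, represented):
    """Recode one 'on behalf' statement into the hand-coded convention:
    a bi-directional 'joint statement' for every unordered pair among
    the speaker and the entities it represents."""
    entities = [speaker, *represented]
    return [
        {"pair": frozenset(pair), "interaction": "joint statement"}
        for pair in combinations(entities, 2)
    ]

# One statement on behalf of two entities yields 3 unordered pairs.
records = recode_on_behalf("Entity A", ["Entity B1", "Entity B2"])
```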
Finally, the hand-coded dataset only covers the period from 1995 to 2013. For this reason, we can only test the validity of the machine-coding algorithm up to this point in time.
Data Cleaning
Before comparison, some key data cleaning steps were necessary. As explained above, the machine-coded dataset includes all types of ENB reports: not only the daily issues, but also the curtain raisers and the summaries. As the curtain raisers only provide context but do not report on the actual negotiations, and the summary reports largely repeat the content of the daily issues for the same meeting, these were not included in the collection of the hand-coded dataset, to avoid duplicates and save time. To ensure comparability, we also exclude them from the machine-coded dataset. Only for the few meetings in which – due to budgetary limitations – no daily ENB issues were produced do both the hand-coded and the machine-coded datasets include the corresponding summaries.
Further, the machine-coded dataset contains an “others” party category. The algorithm identified this category in sentences like the following: “CANADA, supported by AUSTRALIA, and others ...”. The hand-coded dataset does not contain such an actor category, hence we remove it from the machine-coded dataset.
Finally, we ensure that all parties and groupings are named in the same way. To do so, we transform the names of country parties to ISO codes, and create consistent abbreviations for the country groupings.
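Taken together, the three cleaning steps can be sketched as follows. The column names, type labels, and the ISO mapping excerpt are illustrative assumptions, not the exact values used in the released files.

```python
# Excerpt of an assumed name-to-ISO mapping; the real mapping covers all parties.
ISO = {"Saudi Arabia": "SAU", "United States of America": "USA"}

def clean(rows, meetings_with_daily_issues):
    kept = []
    for r in rows:
        # Step 1: drop curtain raisers, and drop summaries whenever the
        # meeting also has daily issues.
        if r["type"] == "curtain raiser":
            continue
        if r["type"] == "summary" and r["meeting"] in meetings_with_daily_issues:
            continue
        # Step 2: drop interactions involving the "others" pseudo-party.
        if "others" in (r["sender"], r["target"]):
            continue
        # Step 3: harmonize party names to ISO codes where applicable.
        kept.append({**r,
                     "sender": ISO.get(r["sender"], r["sender"]),
                     "target": ISO.get(r["target"], r["target"])})
    return kept

rows = [
    {"type": "issue", "meeting": "COP 1", "sender": "Saudi Arabia", "target": "EU"},
    {"type": "summary", "meeting": "COP 1", "sender": "EU", "target": "others"},
]
cleaned = clean(rows, meetings_with_daily_issues={"COP 1"})
```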
Comparison across Interaction Categories over Time
In the following, we compare the coverage of both datasets across the identified interactions, parties, and groupings over time. For this comparison, we focus on the period 1995–2013, since the hand-coded dataset covers only those years. We begin with a summary of the types of interactions recorded in both datasets, as presented in Table 3 and Fig. 4. Table 3 displays the number of interactions by type in (i) the full version of the machine-coded dataset, covering the period 1995–2023, (ii) the cleaned version of the machine-coded dataset, shortened to cover the same time period and ENB issues as the hand-coded dataset and with the “on behalf” interactions recoded, and (iii) the hand-coded dataset. Note that the number of “on behalf” interactions increases substantially as a result of the recoding. Figure 4 compares the two cleaned datasets for the period 1995–2013.
Table 3.
Summary statistics of machine-coded (full and clean versions) and hand-coded datasets.
| Interaction | Machine-coded full (% of total) | Machine-coded 1995-2013 (% of total) | Hand-coded 1995-2013 (% of total) |
|---|---|---|---|
| Agreement | 79472 (91.6%) | 29474 (76.9%) | 40812 (66.3%) |
| On behalf of | 797 (0.9%) | 5652 (14.7%) | 8172 (13.3%) |
| Opposition | 3934 (4.5%) | 1807 (4.7%) | 10080 (16.4%) |
| Support | 2560 (3.0%) | 1406 (3.7%) | 2482 (4.0%) |
| Total | 86763 | 38339 | 61546 |
Fig. 4.

Interaction types in hand-coded (green) and machine-coded (blue) datasets.
The first remarkable observation is that, when comparing the cleaned datasets, the hand-coded dataset covers many more interactions than the machine-coded dataset. Hence, the algorithm does indeed identify fewer interactions than human coders. The main reason for this deviation has been explained above: In the machine-coded dataset, the unit of analysis is the sentence, and the interactions are identified on the basis of a few predefined grammatical constructions and of explicit mentions of entity pairs. In contrast, the hand-coded dataset covers a larger set of expressions for each interaction, including some that are described across several sentences within the same paragraph.
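A stylized version of such sentence-level matching – with a toy alias list and a single pattern, not the package's actual rules – might look like this:

```python
import re

# Toy alias list and one illustrative pattern; the actual party lists
# and grammatical constructions live in the enb-mining package.
PARTY = r"(?:CANADA|AUSTRALIA|JAPAN|the EU)"
PATTERN = re.compile(
    rf"(?P<target>{PARTY}), (?P<kind>supported|opposed) by (?P<sender>{PARTY})"
)

# In "C, supported by D", D is the sender and C is the target.
match = PATTERN.search("CANADA, supported by AUSTRALIA, proposed new text.")
sender, target = match.group("sender"), match.group("target")
```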
However, as shown in Table 3 and also in Fig. 4, the proportion of the various types of interactions is quite similar in both datasets. The one exception is the opposition interactions, for which our algorithm identifies a much smaller set of observations than the human coders. For this reason, we manually investigate why there is such a large discrepancy for opposition interactions.
The results are shown in Table 4. Clearly, the vast majority of opposition interactions not identified by our algorithm were identified by the human coders across separate sentences. While some of them (around 17% of all opposition interactions) could potentially be found on the basis of words denoting opposition (markers such as “cautioned against”, “expressed concern”, “objected”, “opposed”, “preferred” or similar), most of them (around 61%) were coded by humans as opposition due to their substantive content. Modifying the algorithm to identify interactions across sentences would, however, imply a wholly different approach. Adding further markers to the current algorithm (such as “but”, “objected”, “disagreed”, “rejected”, “questioned”, “opposing”) would allow us to identify only up to 1.7% additional opposition interactions, while adding more complexity and risking false positives. We therefore refrain from modifying the algorithm despite these findings, but bear in mind that opposition may be underrepresented in the machine-coded dataset.
Table 4.
Opposition interactions: reasons for large discrepancy.
| Type of discrepancy | % of total |
|---|---|
| Interaction identified in both datasets (i.e., no discrepancy) | 15.9% |
| Interaction identified in machine-coded but not in hand-coded dataset | 2.2% |
| Interaction not identified in machine-coded dataset due to: | |
| Miscoding in hand-coded dataset | 0.4% |
| Whole ENB issue not being included | 0.6% |
| Complex sentence structure, despite existing markers | 3.0% |
| Markers not included in algorithm | 1.7% |
| Lack of markers (opposition implicit in content of sentence) | 0.6% |
| Being found in separate sentence (with clear markers) | 17.4% |
| Being found in separate sentence (based on their substantive content) | 60.5% |
Figure 5 visualizes the distribution of the four different interaction types over time. Clearly, both datasets reveal similar patterns of interaction. Even though the hand-coded dataset contains overall a greater number of interactions, the machine-coded dataset reflects the valleys and peaks of the time series very well. Again, the exception is the opposition category, where the machine-coded and hand-coded dataset deviate more strongly.
Fig. 5.
Interaction types over time, hand-coded (green) and machine-coded (blue) datasets.
Similarity of Parties and Groupings
Even if the machine-coding procedure identifies fewer negotiation interactions than human coding, the results are still valid if they cover a representative sample of those interactions. Above, we have seen that this is the case for most interaction types. Here, we focus on the extent to which the machine-coded data contains the same or at least a very similar set of parties and groupings, both in their role as senders and as targets of an interaction. Here, a sender is the actor that is active in the interaction, and the target is the one that receives it. For example, in the sentence “Entity A supports entity B”, entity A is the sender and entity B the target. In contrast, in the expression “Entity C, opposed by entity D, stated...”, entity D is the sender and entity C is the target.
We evaluate the coverage of senders and targets in both datasets over time and assess whether the respective sets also contain the same parties and groupings. Our results suggest two general observations. First, the number of senders and targets in both datasets is very similar (see Table 5). Also over time, both datasets cover a very similar number of senders and targets (see Fig. 6).
Table 5.
Total and unique number of senders and targets.
| Characteristic | Hand-coded Total | Hand-coded Unique | Hand-coded % unique | Machine-coded Total | Machine-coded Unique | Machine-coded % unique |
|---|---|---|---|---|---|---|
| Number of senders | 212 | 8 | 3.8% | 209 | 5 | 2.4% |
| Number of targets | 214 | 11 | 5.1% | 208 | 5 | 2.4% |
Fig. 6.

Sender and target countries over time, hand-coded (green) and machine-coded (blue) datasets.
Second, the two datasets also contain overwhelmingly the same parties and groupings within their respective sets of senders and targets. Put differently, only a small share of senders and targets is found in only one of the datasets (i.e., is “unique” to it). Thus, only 5 parties or groupings (2.4%) appear as senders in the machine-coded dataset only, and only 8 entities (3.8%) are unique to the hand-coded set of senders. The picture is similar for the targets of an interaction: we observe 5 (2.4%) unique targets in the machine-coded dataset and 11 (5.1%) in the hand-coded dataset. Overall, the observed differences are rather small. A closer look into both datasets reveals that the deviations are partly due to a handful of countries that have very rarely made statements in the negotiations (Serbia-Montenegro, North Korea, Liechtenstein, and the Republic of the Congo), but mostly due to a wider coverage of groupings in the hand-coded dataset. This is related, on the one hand, to some differences in the classification of groupings in the two datasets. For example, the “Umbrella Group” of industrialized countries emerged from the older group “JUSCANZ” after Switzerland decided to leave it. While the two groups are named separately in the machine-coded dataset, they are given the same name in the hand-coded dataset. Similarly, the group of small island developing states (recognized as SIDS under the United Nations) is usually represented by the “Alliance of Small Island States” (AOSIS). While both names are identified in the machine-coded dataset, only the spelling AOSIS is used in the hand-coded dataset. On the other hand, other groupings (Central America, OPEC, COMIFAC) do not appear in the machine-coded dataset because they are only involved in interactions recorded over several sentences, at least during the period compared with the hand-coded dataset.
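The “unique” counts reported in Table 5 correspond to simple set differences, as in this toy example (the entity names and set sizes are illustrative, not the real figures):

```python
# Toy sender sets; "unique" entities appear in only one dataset.
hand_senders = {"USA", "EU", "G77", "AOSIS", "OPEC"}
machine_senders = {"USA", "EU", "G77", "AOSIS", "LIE"}

unique_to_hand = hand_senders - machine_senders
unique_to_machine = machine_senders - hand_senders
share_unique_hand = len(unique_to_hand) / len(hand_senders)
```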
Investigating Activity and Popularity of Parties and Groupings
Beyond the number and similarity of senders and targets across the two datasets, it is also important to understand if the datasets identify the same parties or groupings to be important, or put differently to be “central” to the UNFCCC negotiations. Further, we assess whether the two data collection techniques find the same most common interaction dyads, i.e. pairs of parties or groupings that interact with each other. We investigate this in two ways.
First, we identify the parties and groupings with the highest total number of mentions as senders or targets in each dataset (full time coverage) and compare whether the respective subsets comprise the same parties and groupings. Specifically, we investigate the top 10, 20, and 30 senders and targets. For the top 10 senders and targets, we also look at potential differences between cooperative and conflictual interactions. The analysis shows that the two datasets are similar with regard to the most important senders and targets, also for the conflictual interactions (see Table 6 and Figs. 7, 8 and 9). The maximum deviation is found among the top 30 senders, with three parties or groupings being unique to the respective dataset. The share of deviation does not surpass 10% across all subsets, at least when looking at cooperative and conflictual interactions together.
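This top-n comparison reduces to counting mentions and intersecting the resulting rankings; the following sketch uses hypothetical mention counts for illustration only:

```python
from collections import Counter

def unique_in_top_n(hand_counts, machine_counts, n):
    """Number of top-n entities (by mention count) unique to each dataset."""
    top_hand = {e for e, _ in Counter(hand_counts).most_common(n)}
    top_machine = {e for e, _ in Counter(machine_counts).most_common(n)}
    return len(top_hand - top_machine), len(top_machine - top_hand)

# Hypothetical mention counts: the top-3 sets differ by one entity each.
hand = {"EU": 90, "USA": 80, "JPN": 40, "SAU": 30}
machine = {"EU": 70, "USA": 60, "CHN": 35, "JPN": 25}
unique_hand, unique_machine = unique_in_top_n(hand, machine, n=3)
```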
Table 6.
Total and unique most frequent senders and targets.
| Subset | Hand-coded (% unique) | Machine-coded (% unique) |
|---|---|---|
| Unique top 10 senders | 1 (10.0%) | 1 (10.0%) |
| Unique top 10 targets | 1 (10.0%) | 1 (10.0%) |
| Unique top 20 senders | 1 (5.0%) | 1 (5.0%) |
| Unique top 20 targets | 1 (5.0%) | 1 (5.0%) |
| Unique top 30 senders | 3 (10.0%) | 3 (10.0%) |
| Unique top 30 targets | 3 (10.0%) | 3 (10.0%) |
| Unique top 10 senders, cooperation | 1 (10.0%) | 1 (10.0%) |
| Unique top 10 targets, cooperation | 2 (20.0%) | 2 (20.0%) |
| Unique top 10 senders, conflict | 0 (0.0%) | 0 (0.0%) |
| Unique top 10 targets, conflict | 1 (10.0%) | 1 (10.0%) |
Fig. 7.
Top 10 sender and target countries, full datasets.
Fig. 8.
Top 10 sender and target countries, only cooperative interactions.
Fig. 9.
Top 10 sender and target countries, only conflictual interactions.
More concretely, Figs. 7 to 9 show that across both the cooperative and the conflictual interactions, the EU, USA, Japan, Australia, Canada, China and Saudi Arabia are always among the top 10 most active senders and most popular targets in both datasets; in addition, the Group of 77 and China (G77) appears prominently among the conflictual interactions in both datasets. In short, these are the most actively engaged actors in the oral climate negotiations. A similar conclusion can be reached for the top 20 and the top 30 senders and targets. Nonetheless, while this shows that the machine-coded and the hand-coded datasets are consistent in identifying overwhelmingly the same parties and groupings as being the most active and the most popular, the order of the parties within those top 10 deviates between the two datasets, as the bar colours in Figures 7 to 9 illustrate.
Second, we study whether the two datasets yield a similar pattern of interactions, i.e. whether the same pairs of negotiation partners are dominant in the negotiations. Our results show that also in terms of negotiation partners the two datasets yield similar results. Specifically, we investigate this for the top 20 pairs and separately for cooperative and conflictual interactions, see Figs. 10 and 11. Substantively, these figures show that both datasets consistently identify pairs of developed countries, such as USA – EU, USA – Australia, or EU – Japan as being among the most cooperative ones, while the conflictual interactions are dominated by pairs of developed and developing countries or groupings, including the EU – G77, USA – G77, or EU – Saudi Arabia.
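Counting dyads requires treating pairs as unordered, so that A→B and B→A interactions accrue to the same pair; a minimal sketch with toy data:

```python
from collections import Counter

# Toy interaction list; dyads are unordered, so EU->G77 and G77->EU
# opposition count toward the same pair.
interactions = [
    ("USA", "EU", "agreement"), ("EU", "USA", "support"),
    ("EU", "G77", "opposition"), ("G77", "EU", "opposition"),
]
conflict_pairs = Counter(
    frozenset((s, t)) for s, t, kind in interactions if kind == "opposition"
)
top_pair, n_interactions = conflict_pairs.most_common(1)[0]
```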
Fig. 10.
Top 20 country pairs, only cooperative interactions.
Fig. 11.
Top 20 country pairs, only conflictual interactions.
Importantly for this technical validation, the composition of the top negotiation pairs for both the cooperative and the conflictual interactions does not deviate significantly between both datasets, even though their order does (again, as illustrated by the bar colours). This helps to reduce our concerns regarding the conflictual interactions: even though the machine-coding procedure identifies significantly fewer conflictual interactions, those that are identified follow a distribution similar to that in the hand-coded dataset. This supports our belief that there are no systematic deviations between both datasets, and that the machine-coding procedure does a good job of identifying the true patterns of cooperation and conflict described in the ENBs. For this reason, while the machine-coded dataset may not be appropriate for making inferences about the relative number of conflictual interactions with respect to the cooperative ones, it seems to be representative enough for making inferences within the set of opposition interactions.
Summary of Findings of Technical Validation
Although we observe differences between the hand-coded and the machine-coded datasets, particularly in the number of negotiation interactions identified, the similarities in the distribution of those interactions across interaction types, single actors, and actor pairs dominate. The largest differences concern the number of opposition interactions. These differences, however, mostly disappear when looking at the top 10 to top 30 parties, groupings, and negotiation pairs. We conclude, therefore, that the overwhelming similarities should allow researchers to reach the same general conclusions when using either dataset.
Usage Notes
To produce the dataset, a Python package called “enb-mining” has been created. It is available in the following GitHub repository: https://github.com/victorkristof/enb-mining. Setup instructions for running the package are detailed in the readme file included in the repository. Importantly, two lists of entities need to be created for the package to work: a list of parties with their aliases and, optionally, the groupings that they belong to, and a list of groupings with their aliases. These lists are provided for the case of the UNFCCC negotiations. In addition, the classification of headings into negotiation bodies and issue areas relies on keywords that are specific to the climate change negotiations.
To adapt the algorithm to collect information from other environmental negotiations covered by the ENBs, several small amendments to the scripts need to be made. First, the specific negotiation to be coded and the path to it within the ENB website need to be specified in the script client.py. Second, in the script 1-list-issues.py, the list of missing meetings needs to be adapted (in case some relevant ENB issues – for example from the old archives – exist but are not listed in the current ENB website structure) or set to empty. Also in this script, the negotiation to be coded needs to be specified in accordance with how it is defined in client.py, and the number of website pages needs to be updated. Third, the lists of parties and especially of groupings need to be adapted to include those entities that are relevant to the negotiation being recorded. Fourth, the lists of negotiation bodies and issue areas and their respective keywords, defined in the script “5-classify-headings.py”, need to be adapted.
Acknowledgements
We acknowledge the financial support from the Swiss Network for International Studies (SNIS) under project CC20035: “What International Negotiators Promise and Domestic Policymakers Adopt: Policy and Politics in the Two-Level Climate Change Regime”, as well as from the Swiss National Science Foundation (SNSF) and the French National Research Agency (ANR) under the project “Beyond Coalitions: Small States in Climate Negotiations (BeCoSS Climate)” (grant number 219067).
Author contributions
Data and paper conceptualization and writing: P.C., V.K. and M.K. ENB coding algorithm: T.C. (early version), V.K. (supervision of T.C. and improved algorithm), P.C. (recording of headings, negotiation bodies and issue areas). Early data quality checks: P.C. Technical validation: M.K. with support from P.C. We also acknowledge research support by Mathis Do Cao.
Data availability
The dataset has been deposited in the SWISSUBase data repository and is available under 10.48573/8vqm-7z98 (ref. 20).
Code availability
As described above under usage notes, the code for creating the dataset is available in the following GitHub repository: https://github.com/victorkristof/enb-mining. The code and data for the technical validation are available as a subdirectory in the same repository.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Paula Castro, Victor Kristof.
References
- 1.Bodansky, D., Brunnée, J. & Rajamani, L. International Climate Change Law (Oxford University Press, Oxford, New York, 2017).
- 2.Yamin, F. & Depledge, J. The international climate change regime: a guide to rules, institutions and procedures (Cambridge University Press, Cambridge; New York, 2004).
- 3.Chan, N. The temporal emergence of developing country coalitions. In Klöck, C., Castro, P., Weiler, F. & Blaxekjær, L. Ø. (eds.) Coalitions in the Climate Change Negotiations, 53–69, 10.4324/9780429316258 (Routledge, Abingdon, 2021).
- 4.Dimitrov, R. S. The Paris agreement on climate change: Behind closed doors. Global Environmental Politics 16, 1–11, 10.1162/GLEP_a_00361 (2016).
- 5.Klöck, C., Castro, P., Weiler, F. & Blaxekjær, L. Ø. Coalitions in the Climate Change Negotiations (Routledge, Abingdon, 2020).
- 6.Naurin, D. Generosity in intergovernmental negotiations: The impact of state power, pooling and socialisation in the Council of the European Union. European Journal of Political Research 54, 726–744, 10.1111/1475-6765.12104 (2015).
- 7.McKibben, H. E. & Western, S. D. Levels of Linkage: Across-Agreement versus Within-Agreement Explanations of Consensus Formation among States. International Studies Quarterly 58, 44–54, 10.1111/isqu.12071 (2014).
- 8.Ehlermann, C.-D. & Ehring, L. Decision-Making in the World Trade Organization: Is the Consensus Practice of the World Trade Organization Adequate for Making, Revising and Implementing Rules on International Trade? Journal of International Economic Law 8, 51–75, 10.1093/jielaw/jgi004 (2005).
- 9.Häge, F. & Hug, S. Consensus Decisions and Similarity Measures in International Organizations. International Interactions 42, 503–529, 10.1080/03050629.2016.1138107 (2016).
- 10.Vihma, A. Climate of Consensus: Managing Decision Making in the UN Climate Change Negotiations. Review of European, Comparative & International Environmental Law 24, 58–68, 10.1111/reel.12093 (2015).
- 11.Castro, P. & Kammerer, M. The Institutionalization of a Cleavage: How Differential Treatment Affects State Behavior in the Climate Negotiations. International Studies Quarterly 65, 683–698, 10.1093/isq/sqab045 (2021).
- 12.Obergassel, W. et al. Turning Point Glasgow? An Assessment of the Climate Conference COP26. Carbon & Climate Law Review 15, 271–281, 10.21552/cclr/2021/4/4 (2021).
- 13.Odell, J. S. Breaking Deadlocks in International Institutional Negotiations: The WTO, Seattle, and Doha. International Studies Quarterly 53, 273–299, 10.1111/j.1468-2478.2009.00534.x (2009).
- 14.Venturini, T. et al. Three maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data & Society 1, 2053951714543804, 10.1177/2053951714543804 (2014).
- 15.Castro, P. & Klöck, C. Fragmentation in the Climate Change Negotiations: Taking Stock of the Evolving Coalition Dynamics. In Klöck, C., Castro, P., Weiler, F. & Blaxekjær, L. Ø. (eds.) Coalitions in the Climate Change Negotiations, 17–34, 10.4324/9780429316258 (Routledge, Abingdon, 2021).
- 16.Watts, J. AILAC and ALBA: Differing visions of Latin America in climate change negotiations. In Klöck, C., Castro, P., Weiler, F. & Blaxekjær, L. Ø. (eds.) Coalitions in the Climate Change Negotiations, 156–174, 10.4324/9780429316258 (Routledge, Abingdon, 2021), 1 edn.
- 17.Blaxekjær, L. Ø., Lahn, B., Nielsen, T. D., Green-Weiskel, L. & Fang, F. The narrative position of the Like-Minded Developing Countries in global climate negotiations. In Klöck, C., Castro, P., Weiler, F. & Blaxekjær, L. Ø. (eds.) Coalitions in the Climate Change Negotiations, 113–135, 10.4324/9780429316258 (Routledge, Abingdon, 2021), 1 edn.
- 18.Castro, P. Relational Data between Parties to the UN Framework Convention on Climate Change, 10.7910/DVN/IVLEHB (2017).
- 19.Allan, J. I. & Bhandary, R. R. What's on the agenda? UN climate change negotiation agendas since 1995. Climate Policy 24, 153–163, 10.1080/14693062.2022.2120453 (2024).
- 20.Castro, P., Kristof, V., Kammerer, M. & Cogne, T. Participation, Cooperation and Conflict in UN Climate Negotiations, 10.48573/8VQM-7Z98 (2025).
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
The dataset has been deposited in the SWISSUBase data repository and is available under 10.48573/8vqm-7z98 (ref. 20).
As described above under usage notes, the code for creating the dataset is available in the following GitHub repository: https://github.com/victorkristof/enb-mining. The code and data for the technical validation are available as a subdirectory in the same repository.