Abstract
Recent works have introduced prompt learning for Event Argument Extraction (EAE), since prompt-based approaches transform downstream tasks into a format more consistent with the training task of the Pre-trained Language Model (PLM). This helps bridge the gap between downstream tasks and model training. However, these previous works overlooked the possibility of multiple events, and the complex relationships between them, within a sentence. To address this issue, we propose Event Co-occurrences Prefix Event Argument Extraction (ECPEAE). ECPEAE utilizes the co-occurrences events prefixes module to incorporate the template information corresponding to all events present in the current input as prefixes. This co-occurring event knowledge assists the model in handling complex event relationships. Additionally, to emphasize the template corresponding to the current event being extracted and enhance its constraint on the output format, we employ the present event bias module to integrate the template information into the attention calculation at each layer of the model. Furthermore, we introduce an adjustable copy mechanism to overcome potential noise introduced by the additional information in the attention calculation at each layer. We validate our model on two widely used EAE datasets, ACE2005-EN and ERE-EN. Experimental results demonstrate that our ECPEAE model achieves state-of-the-art performance on both datasets. Moreover, the results show that our model also adapts effectively to low-resource environments with different training sizes.
Subject terms: Computer science, Software
Introduction
Event Argument Extraction (EAE) is a critical subtask within the task of Event Extraction (EE)1–4. EAE aims to extract the arguments of event mentions and associate them with predefined roles within event types, forming structured event knowledge5–8. This structured event knowledge can be further utilized to improve downstream tasks such as dialogue systems9, recommendation systems10, question answering11–13, information retrieval, knowledge graph construction, and public affairs management14–16. However, previous prompt-based approaches in EAE typically only consider the current event type being extracted, overlooking the possibility of multiple events and their complex relationships within the input. This issue has not yet been effectively resolved.
As shown in figure 1, given the event type Movement: Transport present in the event mention and its corresponding trigger take, three relevant arguments are extracted and associated with the corresponding roles: someone (Agent), rifle (Artifact), mall (Destination). Given the presence of another event Life: Injure in the event mention and its corresponding trigger injure, four relevant arguments are extracted and associated with the corresponding roles: someone (Agent), people (Victim), rifle (Instrument), mall (Place). Thus, a single input contains two different events triggered by different trigger words, yet they share multiple token spans as their arguments, such as someone, rifle, and mall. This indicates a strong causal relationship between the two events17. However, previous works18–20 neglect the influence of other events in the input during extraction and focus only on the single event being considered. How to leverage the information from different events in the input to help the model handle the complex relationships among multiple events remains a challenging task.
Fig. 1.
Two examples of event extraction from an event mention, from which two event records (Movement: Transport and Life: Injure) were obtained.
To address the aforementioned issue, as shown in figure 2, we propose Event Co-occurrences Prefix Event Argument Extraction (ECPEAE). In order to fully leverage the causal relationships among multiple events present in the input, ECPEAE introduces the co-occurrences events prefixes module. This module concatenates and encodes the templates corresponding to all event types in the input, generating a dense information matrix. This matrix is then incorporated into the attention calculations of each layer in the Pre-trained Language Model (PLM) in the form of prefixes20,21. Through this approach, ECPEAE can utilize the information from all events in the input, thus assisting the model in handling complex event relationships more effectively. Furthermore, to enhance the constraints of the current event’s template on the output and expand the potential role relationship information within the current template, we design the present event bias module. This module extracts potential role relationship information from the template corresponding to the current event and incorporates this information as a bias into the attention calculations of each layer in the PLM. Lastly, we introduce an adjustable copy mechanism to overcome potential noise introduced by the additional information in each layer’s attention calculation.
Fig. 2.
An illustration of ECPEAE for predicting a Contact:contact event. The backbone of ECPEAE is BART-large, which consists of 12 encoder layers and 12 decoder layers. The left part of the figure shows the overall structure of the present event bias module, whose three inputs respectively represent Input-1, Input-2, and Input-3.
In addition, we find that the model also achieves excellent performance in low-resource environments, performing event argument extraction with different proportions (3%, 5%, 10%, 20%, 30%, and 50%) of training data and achieving better results than the base model DEGREE(EAE).
We summarize our contributions as follows:
We propose the co-occurrences events prefixes module, which utilizes information from the templates corresponding to all event types present in the event mention to help the model handle complex event relationships among multiple events.
We introduce the present event bias module to expand the potential role relationship information within the current template. We also incorporate an adjustable copy mechanism to eliminate potential noise introduced by the additional information.
We experimentally validate the outstanding performance of ECPEAE in a full-resource setting on the ACE2005-EN and ERE datasets. ECPEAE achieves state-of-the-art results on both datasets. Moreover, in extremely low-resource environments, it outperforms the baseline model DEGREE by over 10% and surpasses the current state-of-the-art models across various data proportions.
Related work
Previously, the Event Argument Extraction (EAE) task22 always appeared as a subtask of the Event Extraction (EE) task4,23, and previous work usually performed event argument extraction after completing the Event Detection (ED) task, where the ED task aims to obtain the trigger words and corresponding event types present in event mentions. However, as the field of ED has matured24,25, the EAE task has come to be viewed as an important challenge within EE and is thus studied separately.
Classification-based learning
Prior work has typically viewed EAE as a classification task, leading to the emergence of several classification-based models26–32. Some of these models incorporate global features for joint extraction29, others annotate event types as important information for sequence labeling33, and others use event types as known information, utilizing the rich semantic information of event types and their corresponding role sets for trigger word and argument extraction34. However, these methods share the common drawbacks of classification models: the need to manually design the optimal combination of different subtasks, poor portability, and the need for a large amount of carefully hand-labeled training data.
Unlike traditional classification models, our work treats the EAE task as a generative task, which is more flexible than classification-based models and shows better performance in low-resource environments.
Generation-based learning
Recently, more and more work has used generation-based models for EAE tasks18,35–37, because generation-based models are more flexible: they can not only be applied to other datasets easily, but can also handle similar tasks of other types38,39. In addition, the generation-based model is more concise and easier to understand than the classification-based model because of its end-to-end nature40,41, accomplishing both subtasks of EAE at once. Recent advances in prompt learning have brought generation-based models superior performance compared to other methods17,20,42,43, and have yielded impressive results in areas such as low-resource18,20 and transfer learning36,37. The prompt-based generation approach designs a different prompt for each event type. Although different models use different prompts18,19,37, they all contain templates, which connect the roles corresponding to an event type using natural language, dominate the format of the model’s generation, and are the main factor affecting model effectiveness18.
However, previous models typically concatenate the event mention with the template corresponding to the type of event to be extracted and input them into the model. The potential association information between the roles in the template guides the PLM to extract the correct arguments. These methods usually only consider the single event currently being extracted and ignore the effect of the strong causality between multiple events present in the input. As a prompt-based generative approach, ECPEAE differs from previous work by not only considering the current single event type being extracted but also designing a co-occurrences events prefixes module to incorporate information from all event types present in the input, thereby assisting the model in handling complex event relationships among multiple events. Furthermore, the present event bias module is employed to enhance the potential relationship information among roles within the template corresponding to the current event type. An adjustable copy mechanism is introduced to overcome potential noise introduced by the additional information in each layer’s attention calculation.
Methodology
As shown in figure 2, we display an example from the ACE dataset, where the input contains the event mention and the prompt. The prompt consists of the current event description, the event trigger, and the template corresponding to the event type to be extracted. The template for the current event type, T_e, is encoded by the present event bias module and participates in attention computation as a bias (indicated by “Bias” in figure 2). All templates corresponding to event types in the event mention are encoded by the co-occurrences events prefixes module and participate in attention computation as prefixes. Finally, we obtain the output text by replacing placeholders in the template with arguments. With simple regular expressions, we can extract the final arguments. As shown in figure 2, although the sentence triggers numerous events, these events often share the same token spans as arguments, demonstrating strong causal relationships between events17. When the model receives the template information corresponding to these events, it can leverage these strong causal relationships to assist in argument extraction.
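As a concrete illustration of the decoding step described above, the sketch below (using a hypothetical template, role mapping, and model output, not the dataset's actual prompts) turns a template into a regular expression whose named capture groups recover the arguments:

```python
import re

# Hypothetical template-based decoding sketch: each placeholder phrase
# becomes a non-greedy named capture group, and matching the generated
# text against the resulting pattern recovers the arguments.
template = "somebody was sent to jail by some organization in somewhere"
roles = {"somebody": "Person", "some organization": "Agent", "somewhere": "Place"}

pattern = re.escape(template)
for i, placeholder in enumerate(roles):
    # Replace each placeholder once with a named group.
    pattern = pattern.replace(re.escape(placeholder), f"(?P<r{i}>.+?)", 1)

output = "Smith was sent to jail by the police in Baghdad"
match = re.fullmatch(pattern, output)
arguments = {role: match.group(f"r{i}") for i, role in enumerate(roles.values())}
# arguments maps each role to the extracted textual span
```

Roles whose placeholders the model leaves unchanged would simply capture the original placeholder text and can be filtered out afterwards.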
In this section, we first provide a formal definition of prompt-based generative EAE in section 3.1, then introduce ECPEAE in section 3.2, detailing the proposed co-occurrences events prefixes module in section 3.2.1 and the present event bias module in section 3.2.2. In section 3.2.3 we describe how the information from these two modules affects the generation of the PLM. Finally, we detail the adjustable copy mechanism in section 3.3.
Task definition
A simple example of event argument extraction in a generative task based on prompt learning can be formulated as:
H = Encoder(X ⊕ P_e)    (1)
Y = Decoder(H)    (2)
where the backbone of the model is generally a PLM with an encoder-decoder transformer structure, such as BART44 and T545, ⊕ denotes the sequence concatenation operation, X is the event mention for EAE, e is an event type contained in X, P_e is the prompt corresponding to e, and Y is the output of the model after replacing the corresponding role placeholders in the template with the extracted arguments. Whether the model performs the EAE task correctly depends on whether the generated arguments in Y are correct and placed in the correct role placeholders. After obtaining Y, we can use regular expressions to extract the arguments generated in Y; we call this process decoding. R_e = {r_1, …, r_n} is the set of roles associated with event type e, and A_e = {a_1, …, a_m} is the set of arguments of e, where each a_i is a textual span within X that represents a role in R_e. It should be noted that not all roles in the template have corresponding arguments in the event mention, so A_e does not generally have a one-to-one correspondence with R_e.
Different from traditional EAE methods, ECPEAE does not only consider the template contained in P_e; it also inputs the templates corresponding to all event types that occur in X into the model, providing a more diverse and broader view.
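To make the input construction concrete, the sketch below assembles an ECPEAE-style input string. The exact prompt format, field names, and separator tokens here are assumptions for illustration; only the "<point>"-separated concatenation of co-occurring templates is taken from the paper:

```python
# Hypothetical input assembly: the current event's template goes into
# the prompt, while the templates of ALL events in the mention are
# concatenated separately for the co-occurrences events prefixes module.
event_mention = "someone took a rifle to the mall and injured people"
templates = {
    "Movement:Transport": "something was sent to somewhere from some place",
    "Life:Injure": "somebody led to some victim injured in somewhere",
}
current_event, trigger = "Movement:Transport", "took"

# Prompt for the current event (format is an assumption).
prompt = f"Event type: {current_event}. Trigger: {trigger}. Template: {templates[current_event]}"
model_input = f"{event_mention} </s> {prompt}"

# Co-occurring templates, separated by the special "<point>" token.
T_co = " <point> ".join(templates.values())
```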
ECPEAE
ECPEAE inherits the simple and easily extensible DEGREE(EAE)18 as its base model. The backbone of DEGREE(EAE) is the BART-large44 model, which has an encoder-decoder Transformer structure with 12 encoder layers and 12 decoder layers. In its core components, as shown in figure 2, we propose co-occurrences-self-attention and co-occurrences-cross-attention to replace the self-attention in the traditional encoder and the cross-attention in the traditional decoder, introducing co-occurring event information and present event information into the PLM.
In prompt-based learning methods for EAE, the model’s input typically consists of the event mention and the prompt. To enhance the performance of EAE models in low-resource settings, we drew inspiration from the prompt design in DEGREE18. In ECPEAE, the input prompt includes not only the template and event trigger but also the event type description, which provides a simple natural language description of the event (we present the event descriptions of ACE2005-EN in table 12 and the event templates of ACE2005-EN in table 13; for more information about the ERE dataset, please refer to https://github.com/PlusLabNLP/DEGREE).
Table 12.
Event Type Description of ACE2005-EN from DEGREE.
| Event Type | Event Type Description |
|---|---|
| Life_Be_Born | the event is related to life, and someone is given birth to |
| Life_Marry | the event is related to life and someone is married |
| Life_Divorce | the event is related to life and someone was divorced |
| Life_Injure | the event is related to life and someone is injured |
| Life_Die | the event is related to life and someone died |
| Movement_Transport | the event is related to movement. The event occurs when a weapon or vehicle is moved from one place to another |
| Transaction_Transfer_Ownership | the event is related to the transaction the event occurs when an item or an organization is sold or given to some other |
| Transaction_Transfer_Money | the event is related to the transaction the event occurs when someone is giving |
| Business_Start_Org | the event is related to a new organization being created |
| Business_Merge_Org | the event is related to two or more organizations coming together to form a new organization |
| Business_Declare_Bankruptcy | the event is related to some organization declaring bankruptcy |
| Business_End_Org | the event is related to some organization ceasing to exist |
| Conflict_Attack | the event is related to conflict and some violent physical act |
| Conflict_Demonstrate | the event is related to a large number of people coming together to protest |
| Contact_Meet | the event is related to a group of people meeting and interacting with one another face-to-face |
| Contact_Phone_Write | the event is related to people phone calling or messaging one another |
| Personnel_Start_Position | the event is related to a person begins working for an organization or a hiring manager |
| Personnel_End_Position | the event is related to a person stops working for an organization or a hiring manager |
| Personnel_Nominate | the event is related to a person being nominated for a position |
| Personnel_Elect | the event is related to a candidate wins an election |
| Justice_Arrest_Jail | the event is related to a person getting arrested or a person being sent to jail |
| Justice_Release_Parole | the event is related to an end to someone’s custody in prison |
| Justice_Trial_Hearing | the event is related to a trial or hearing for someone |
| Justice_Charge_Indict | the event is related to someone or some organization being accused of a crime |
| Justice_Sue | the event is related to a court proceeding that has been initiated, and someone sues the other |
| Justice_Convict | the event is related to someone being found guilty of a crime |
| Justice_Sentence | the event is related to someone being sentenced to punishment because of a crime |
| Justice_Fine | the event is related to someone being issued a financial punishment |
| Justice_Execute | the event is related to someone being executed to death |
| Justice_Extradite | the event is related to justice; the event occurs when a person was extradited from one place to another place |
| Justice_Acquit | the event is related to someone being acquitted |
| Justice_Pardon | the event is related to someone being pardoned |
| Justice_Appeal | the event is related to someone appealing the decision of a court |
Table 13.
Event Template of ACE2005-EN from DEGREE.
| Event Type | Event Template |
|---|---|
| Life_Be_Born | somebody was born in somewhere |
| Life_Marry | somebody got married in somewhere |
| Life_Divorce | somebody divorced in somewhere |
| Life_Injure | somebody or some organization led to some victim injured by some way in somewhere |
| Life_Die | somebody or some organization led to some victim died by some way in somewhere |
| Movement_Transport | something was sent to somewhere from some place by some vehicle. somebody or some organization was responsible for the transport |
| Transaction_Transfer_Ownership | someone got something from some seller in somewhere |
| Transaction_Transfer_Money | someone paid some other in somewhere |
| Business_Start_Org | somebody or some organization launched some organization in somewhere |
| Business_Merge_Org | some organization was merged |
| Business_Declare_Bankruptcy | some organization declared bankruptcy |
| Business_End_Org | some organization dissolved |
| Conflict_Attack | some attacker attacked some facility, someone, or some organization by some way in somewhere |
| Conflict_Demonstrate | some people or some organization protest at somewhere |
| Contact_Meet | some people or some organization met at somewhere |
| Contact_Phone_Write | some people or some organization called or texted messages at somewhere |
| Personnel_Start_Position | somebody got new job and was hired by some people or some organization in somewhere |
| Personnel_End_Position | somebody stopped working for some people or some organization at somewhere |
| Personnel_Nominate | somebody was nominated by somebody or some organization to do a job |
| Personnel_Elect | somebody was elected a position, and the election was voted by some people or some organization in somewhere |
| Justice_Arrest_Jail | somebody was sent to jailed or arrested by somebody or some organization in somewhere |
| Justice_Release_Parole | somebody was released by some people or some organization from somewhere |
| Justice_Trial_Hearing | somebody, prosecuted by some other, faced a trial in somewhere. The hearing was judged by some adjudicator |
| Justice_Charge_Indict | somebody was charged by some other in somewhere. The adjudication was judged by some adjudicator |
| Justice_Sue | somebody was sued by some other in somewhere. The adjudication was judged by some adjudicator |
| Justice_Convict | somebody was convicted of a crime in somewhere. The adjudication was judged by some adjudicator |
| Justice_Sentence | somebody was sentenced to punishment in somewhere. The adjudication was judged by some adjudicator |
| Justice_Fine | some people or some organization in somewhere was ordered by some adjudicator to pay a fine |
| Justice_Execute | somebody was executed by somebody or some organization at somewhere |
| Justice_Extradite | somebody was extradicted to somewhere from some place. somebody or some organization was responsible for the extradition |
| Justice_Acquit | somebody was acquitted of the charges by some adjudicator |
| Justice_Pardon | somebody received a pardon from some adjudicator |
| Justice_Appeal | some other in somewhere appealed the adjudication from some adjudicator |
Co-occurrences event prefix
Unlike previous prompt-based generative methods that only consider the current event type being extracted, we aim to incorporate all event information from the event mention. In prompt-based generative approaches, templates provide semantic guidance for the model’s generation; by leveraging the label semantics in the template, the model can capture event arguments more effectively18. For this reason, we start from the templates and integrate their information to incorporate all event information from the current input. Specifically, as shown in figure 2, we design the co-occurrences events prefixes module. First, this module concatenates all templates T_1, …, T_k corresponding to the events present in the current input X, where T_i represents the template corresponding to event e_i in X, and k denotes the number of events in X. The templates are separated by a special token “<point>”, and we refer to the representation obtained by concatenating all templates separated by “<point>” as T_co:
T_co = T_1 ⊕ <point> ⊕ T_2 ⊕ … ⊕ <point> ⊕ T_k    (3)
Then, an encoder is used to encode the concatenated templates and extract the semantic information from T_co:
H_co = Encoder(T_co)    (4)
We do not impose any restrictions on the choice of encoder, considering both parameter count and performance. In our implementation, we utilize the BART encoder. It is important to note that the BART encoder used here is separate from the backbone BART-large; they do not share parameters.
After obtaining the semantic information H_co, inspired by Xiang Lisa Li21 and I-Hung Hsu18, we integrate this information into the model by concatenating it as prefixes to the backbone. First, we introduce a vector l, which serves as the Q input for the multi-head attention and determines the length of the prefixes to be generated and concatenated with the backbone. Then, we duplicate the semantic information H_co and use it as the K and V inputs for the multi-head attention module:
l ∈ R^(L×d)    (5)
where d represents the hidden dimension of the model and L the prefix length. This results in the prefix representation P:
P = MultiHeadAttn(Q = l, K = H_co, V = H_co)    (6)
Finally, the output of the multi-head attention is passed through a series of feed-forward networks to generate the prefixes, which are then concatenated with the backbone. The specific concatenation method is discussed in section 3.2.3.
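A simplified, single-head sketch of the prefix generation described above is given below. All dimensions and values are toy stand-ins: `queries` plays the role of the learned length-L vector l, and `H_co` stands in for the encoder states of the concatenated templates:

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention: each query row attends over all keys.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)])
    return out

L, d, n = 4, 8, 10  # prefix length, hidden dim, template tokens (toy values)
queries = [[0.1 * (i + j) for j in range(d)] for i in range(L)]  # stands in for l
H_co = [[0.05 * (i - j) for j in range(d)] for i in range(n)]    # stands in for Encoder(T_co)
prefix = attention(queries, H_co, H_co)  # L x d; later passed through FFNs
```

In the full model this runs as multi-head attention over learned parameters, but the shape logic is the same: L query rows pooled over the template states yield an L × d prefix.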
Present event bias
In order to expand the potential role relationship information within the current template and strengthen the template’s constraint on the output format, as shown on the left in figure 2, we design the present event bias module, which requires three inputs, namely Input-1, Input-2, and Input-3. Input-1 corresponds to T_e, the template of the current event; Input-2 and Input-3 respectively correspond to the K and V of the BART-large encoder that participate in multi-head attention, after the prefixes have been concatenated to them.
Specifically, in the first step, T_e is encoded by the encoder to obtain its vector representation H_e. It is important to note that the encoder used here is the same as the one used in section 3.2.1 for encoding T_co and shares the same parameters:
H_e = Encoder(T_e)    (7)
In the second step, after obtaining the vector representation H_e of T_e, we introduce l_b. Unlike the l in the co-occurrences events prefixes module, l_b in the present event bias module is not a hyperparameter but a fixed value; in ECPEAE it is set to 64 to ensure that the value obtained after passing through a series of feed-forward networks has a shape that can be multiplied with Input-2 and Input-3. We use l_b as the Q input for multi-head attention and H_e as the K and V inputs, which results in B:
B = MultiHeadAttn(Q = l_b, K = H_e, V = H_e)    (8)
In the third step, the result of the multi-head attention is passed through a series of feed-forward networks and multiplied with Input-2 and Input-3. Finally, the multiplied result is averaged to obtain a vector of the same length as the attention weights (attn-weights). This vector is then integrated into the computation of the backbone as a bias. The specific integration method is discussed in section 3.2.3.
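The three steps above can be sketched with toy tensors. The exact reduction is one plausible reading of the description: the feed-forward output modulates K and V elementwise, and averaging over the hidden dimension yields one bias value per position. All dimensions and values are made up for illustration:

```python
# Toy sketch of the present event bias computation. `ffn_out` stands in
# for one row of FFN(MultiHeadAttn(l_b, H_e, H_e)); K and V stand in for
# Input-2 and Input-3 (the backbone encoder's keys and values).
n, d = 6, 4  # sequence length, hidden dim (toy values)
ffn_out = [0.1 * (j + 1) for j in range(d)]
K = [[0.2 * (i + j) for j in range(d)] for i in range(n)]
V = [[0.1 * (i - j) for j in range(d)] for i in range(n)]

def modulate(mat, vec):
    # Elementwise multiplication of every row with the FFN output.
    return [[m * v for m, v in zip(row, vec)] for row in mat]

mk, mv = modulate(K, ffn_out), modulate(V, ffn_out)
# Average over the hidden dimension to get one bias per key position,
# matching the length of the attention-weight rows.
bias = [(sum(rk) + sum(rv)) / (2 * d) for rk, rv in zip(mk, mv)]
```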
Co-occurrences-cross/self-attention
In this section, we provide a detailed explanation of how to integrate the prefixes obtained from the co-occurrences events prefixes module and the bias obtained from the present event bias module into the backbone. Specifically, we integrate them into the attention modules of the backbone. Due to the encoder-decoder Transformer structure of BART-large, each encoder layer contains a self-attention module, while each decoder layer contains a self-attention module and a cross-attention module. Through our experiments, we found that adding the prefixes to the self-attention module in the encoder layers and the cross-attention module in the decoder layers yields the best results.
In ECPEAE, the prefixes are introduced in the cross-attention of each decoder layer in BART-large. The prefixes are concatenated with the Key and Value inputs of the multi-head attention in the cross-attention, thereby incorporating the information from the templates corresponding to all event types in the event mention. We refer to the cross-attention in the decoder layer that incorporates the prefixes as “co-occurrences-cross-attention”.
In addition to the co-occurrences-cross-attention, ECPEAE also introduces the prefixes in each layer of the encoder. However, this presents a new challenge, as the BART-large encoder has 12 layers: while ECPEAE incorporates the prefix information derived from T_co in each encoder layer, that information is only introduced at the input stage. In order to enhance the potential relationship information between roles in the template corresponding to the current target event type and to strengthen the constraints on the model’s output, we not only include the T_co information in each encoder layer but also aim to additionally introduce the T_e information.
In ECPEAE, the bias that integrates the T_e information is added to the attn-weights, i.e., the output of the MatMul operation shown in figure 2. This addition incorporates the T_e information into each layer of the encoder, thereby strengthening the constraint of T_e on the model’s output. We refer to the self-attention that incorporates the prefixes and bias as “co-occurrences-self-attention”.
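A single-head sketch of this "co-occurrences-self-attention" is shown below: the prefix rows are concatenated to K and V, and the bias is added to the raw attention scores before the softmax. All tensors are toy stand-ins for the BART-large internals:

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

n, L, d = 5, 3, 4  # input tokens, prefix length, hidden dim (toy values)
Q = [[0.1 * (i + j) for j in range(d)] for i in range(n)]
K = [[0.2 * (i + j) for j in range(d)] for i in range(n)]
V = [[0.1 * i for _ in range(d)] for i in range(n)]
prefix = [[0.3 for _ in range(d)] for _ in range(L)]   # from the prefixes module
bias = [0.05 * i for i in range(n + L)]                # from the bias module, one per key

K_ext, V_ext = prefix + K, prefix + V  # prepend prefix rows to keys/values
out = []
for q in Q:
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K_ext]
    scores = [s + b for s, b in zip(scores, bias)]  # add bias to attn-weights
    w = softmax(scores)
    out.append([sum(wi * v[j] for wi, v in zip(w, V_ext)) for j in range(d)])
```

Note that the output keeps the original sequence length n: the prefix only extends the keys and values that each query attends over.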
Training
The model is trained so that it correctly extracts arguments from event mentions and uses them to replace the placeholders for the corresponding roles in the template. During training, although we input the templates of all co-occurring event types in the event mention into the model, the model only extracts the arguments of the current event to be extracted, and the format of the generated output text is the same as that of the template T_e.
Because we modify part of the structure of BART-large, which differs from the way BART-large is trained44, we introduce a common copy mechanism46 to strengthen the restriction on the output text format and constrain the generated results:
P(y_t) = p_gen · P_vocab(y_t) + (1 − p_gen) · Σ_{i: x_i = y_t} α_{t,i}    (9)
where P_vocab(y_t) is the prediction of the traditional generative model for the current token, p_gen controls the probability of generating from the vocabulary and is calculated from the last decoder hidden state in BART-large, x_i is each token of the input, and α_{t,i} is the probability of copying directly from the input, calculated from the cross-attention weights in the last decoder layer. Moreover, we also use the regularization method proposed by20, introducing a regularization term on p_gen into the traditional copy mechanism to encourage the model to copy from the input more:
L = L_gen + λ · Σ_t p_gen^(t)    (10)
where λ is a hyperparameter.
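The mixing in eq. (9) can be illustrated with a tiny example. The vocabulary, probabilities, and attention weights below are invented; the point is that the final distribution interpolates the generator's vocabulary distribution with a copy distribution built from the cross-attention weights over the input tokens:

```python
# Toy sketch of the copy mechanism: p_gen mixes a vocabulary
# distribution with a copy distribution over the input tokens.
vocab = ["someone", "rifle", "mall", "the"]
p_vocab = {"someone": 0.1, "rifle": 0.2, "mall": 0.3, "the": 0.4}
input_tokens = ["someone", "the", "rifle", "the"]  # toy input, all in-vocab
alpha = [0.5, 0.1, 0.1, 0.3]                       # cross-attention weights, sum to 1
p_gen = 0.7                                        # probability of generating

# Accumulate copy probability mass onto each vocabulary word.
p_copy = {w: 0.0 for w in vocab}
for token, a in zip(input_tokens, alpha):
    p_copy[token] += a

p_final = {w: p_gen * p_vocab[w] + (1 - p_gen) * p_copy[w] for w in vocab}
```

Since both component distributions sum to one here, the mixed distribution does as well; the regularizer in eq. (10) then pushes p_gen down so that more mass flows through the copy term.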
Experiments
Datasets
We tested ECPEAE on two widely used datasets for EAE: ACE2005 and ERE. The ACE2005 dataset contains Arabic, English, and Chinese documents; the English portion, ACE2005-EN, was selected for this experiment. The ACE2005-EN dataset contains 599 annotated English documents, 33 event types, and 22 argument roles. The ERE dataset contains 458 annotated English documents, 38 event types, and 21 argument roles. The statistics of the datasets are shown in table 1.
Table 1.
Dataset statistics.
| Dataset | Split | #Sents | #Events | #Roles |
|---|---|---|---|---|
| ACE2005-EN | Train | 19,216 | 4,419 | 6,607 |
| Dev | 901 | 468 | 759 | |
| Test | 832 | 403 | 576 | |
| ERE-EN | Train | 14,736 | 6,208 | 8,924 |
| Dev | 901 | 468 | 759 | |
| Test | 676 | 424 | 689 |
We followed the data processing method of previous work18,29,30 to preprocess the datasets. We also used the data splitting method of18, which splits the training data into different percentages (3%, 5%, 10%, 20%, 30%, and 50%), and used the original validation set and test set to evaluate the performance of the model in low-resource environments.
Evaluation metrics
We adopt the same evaluation criteria as prior works18,29. Since this is a generative task, we pay particular attention to the F1 metric. We report the F1 scores of argument identification (Arg-I) and argument classification (Arg-C):
Arg-I: an argument is correctly identified if its offset matches the ground truth in the event mention.
Arg-C: an argument is correctly classified if its offset and the role’s label both match the ground truth.
The Arg-C metric better reflects whether the model extracts the correct arguments and associates them with their matching roles, thereby generating correct event records. Therefore, the EAE task pays more attention to the F1 metric of Arg-C.
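The two metrics can be computed as set-level F1 scores over (span, role) predictions. The spans and roles below are invented for illustration; Arg-I compares offsets only, while Arg-C also requires the role label to match:

```python
# Sketch of Arg-I / Arg-C F1 computation. Spans are (start, end) token
# offsets; the gold and predicted sets are toy examples.
def f1(pred, gold):
    tp = len(pred & gold)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

gold = {((0, 1), "Agent"), ((4, 5), "Artifact"), ((9, 10), "Destination")}
pred = {((0, 1), "Agent"), ((4, 5), "Instrument")}  # second role is wrong

arg_c = f1(pred, gold)                                   # span AND role must match
arg_i = f1({s for s, _ in pred}, {s for s, _ in gold})   # span only
```

Here both predicted spans are correctly identified (Arg-I credit), but only one carries the correct role (Arg-C credit), so Arg-C is necessarily no higher than Arg-I.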
Baseline method
We divide the models used for comparison with ECPEAE into two groups. The first group comprises classification-based methods.
EEQA47: sets dynamic thresholds to extract arguments in an end-to-end manner based on question answering (QA) task.
OneIE29: the model incorporates global features to capture cross-subtask and cross-instance interaction, and is the SOTA classification-based model. The second group comprises generation-based methods.
PAIE19: a prompt and a span selector are used together to extract arguments, and a bipartite matching loss is designed to find the optimal role label; it is the SOTA model under the full-resource setting for the 2022 EAE task.
DEGREE(EAE)18: an end-to-end model for conditional generation under low-resource conditions using prompts; it is the SOTA model under the low-resource setting for the 2022 EE task.
TabEAE17: the model extends PAIE19 into a non-autoregressive generation framework to extract the arguments of multiple events concurrently; it is the SOTA model under the high-resource setting for the 2023 EAE task.
AMPERE20: the model introduces the abstract meaning representation (AMR) of the event mention into EAE; it is the SOTA model under the low-resource setting for the 2023 EAE task.
Implementation details
We used the BART-large provided by Facebook on the Huggingface website as the backbone model introduced in section 3.2; it has around 406 million parameters. We used a single NVIDIA A40 Tensor Core GPU (45 GB) to train ECPEAE and replicate the other papers’ experiments. We used AdamW48 as the optimizer, with the length of l (section 3.2.1) set to 40 and the hyperparameter λ (section 3.3) set to 1. We set the batch size to 4.
High-resource event argument extraction
We evaluate ECPEAE and the baseline methods under the high-resource, supervised learning setting. As shown in Table 2, our model outperforms the current state-of-the-art models on the key metric Arg-C on both datasets. Compared with the base model DEGREE(EAE), ECPEAE achieves comprehensive improvements on all metrics: on the ACE2005-EN dataset, Arg-I improves by 2.36% and Arg-C by 2.74%; on the ERE dataset, Arg-I improves by 0.15% and Arg-C by 4.04%. In addition, ECPEAE exceeds TabEAE, the SOTA model for the EAE task in the high-resource environment, with a 0.26% gain in Arg-C on the ACE2005-EN dataset. The strong performance of ECPEAE demonstrates the effectiveness of incorporating knowledge of co-occurring events into EAE models.
Table 2.
Results on ACE2005-EN and ERE for high-resource event argument extraction. For each column, we bold the highest score. *We report the numbers from18. ECPEAE achieves new state-of-the-art performance on the most important metric of event argument extraction, Arg-C.
| Model | ACE2005-EN Arg-I | ACE2005-EN Arg-C | ERE Arg-I | ERE Arg-C |
|---|---|---|---|---|
| *Classification-based* | | | | |
| EEQA | 70.5 | 68.9 | 72.4 | 67.6 |
| OneIE* | 73.2 | 69.3 | 75.3 | 70.0 |
| *Generation-based* | | | | |
| PAIE | 75.85 | 72.42 | - | - |
| DEGREE(EAE) | 73.15 | 70.70 | 76.09 | 68.57 |
| TabEAE | **75.92** | 73.18 | - | - |
| AMPERE | 74.63 | 71.22 | **77.74** | 71.21 |
| ECPEAE | 75.52 | **73.45** | 76.25 | **72.62** |
Low-resource event argument extraction
Because prompt-based conditional generation can fully exploit the language understanding ability of the pre-trained language model, ECPEAE also performs well in low-resource environments. To study its performance under different proportions of data, we subsample the training set to 3%, 5%, 10%, 20%, 30%, and 50% of the original data, following the method proposed in AMPERE20. The model is evaluated with the original validation and test sets.
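The subsampling itself is straightforward; a hypothetical sketch (the seed and rounding rule are our assumptions, and AMPERE's released splits may differ):

```python
import random

def subsample(train_set, proportion, seed=0):
    # Hypothetical low-resource split: randomly keep a fixed proportion of
    # training instances; the validation and test sets are left untouched.
    rng = random.Random(seed)
    n = max(1, round(len(train_set) * proportion))
    return rng.sample(train_set, n)
```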
As shown in Table 3, ECPEAE achieves better results than the baseline methods at most proportions of the ACE2005-EN and ERE datasets. Across the six proportions of the ACE2005-EN dataset, Arg-C fails to reach the best result in only two cases; on the ERE dataset, in only one case. At very low data volumes (3%, 5%, and 10%), the improvement is especially pronounced. Fig. 3 depicts the performance of the different models on ACE2005-EN and ERE-EN under low-resource conditions more directly.
Table 3.
Argument Identification and Argument Classification F1-Score on ACE2005-EN and ERE for low-resource event argument extraction. For each column, we bold the highest score.
Argument Identification F1-Score (%)

| Model | ACE2005-EN 3% | 5% | 10% | 20% | 30% | 50% | ERE-EN 3% | 5% | 10% | 20% | 30% | 50% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PAIE | 38.9 | 46.8 | 62.1 | **69.0** | 69.3 | 73.6 | - | - | - | - | - | - |
| TabEAE | 33.9 | 47.3 | 59.1 | 68.4 | 70.3 | **74.1** | - | - | - | - | - | - |
| DEGREE(EAE) | 50.2 | 47.9 | 61.3 | 63.7 | 69.1 | 71.6 | 47.8 | 52.6 | 50.7 | 65.3 | 67.6 | **75.0** |
| AMPERE | 53.4 | 57.6 | 65.5 | 65.3 | 71.0 | 71.2 | **56.9** | 62.5 | **56.2** | 70.2 | 74.7 | 74.0 |
| ECPEAE | **55.9** | **59.6** | **68.2** | **69.0** | **71.6** | 72.4 | 56.3 | **63.4** | 54.4 | **72.4** | **74.8** | 74.9 |

Argument Classification F1-Score (%)

| Model | ACE2005-EN 3% | 5% | 10% | 20% | 30% | 50% | ERE-EN 3% | 5% | 10% | 20% | 30% | 50% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PAIE | 34.7 | 42.3 | 57.1 | 64.3 | 64.9 | 70.0 | - | - | - | - | - | - |
| TabEAE | 27.7 | 38.4 | 52.8 | 61.2 | 65.9 | **71.1** | - | - | - | - | - | - |
| DEGREE(EAE) | 43.7 | 40.5 | 56.3 | 59.4 | 64.5 | 68.7 | 43.8 | 40.2 | 41.5 | 56.4 | 60.1 | 69.3 |
| AMPERE | 46.1 | 48.3 | 60.0 | **65.3** | 67.8 | 68.4 | 46.0 | 47.7 | **45.6** | 60.8 | 64.3 | 69.1 |
| ECPEAE | **47.4** | **51.5** | **61.5** | 64.8 | **67.9** | 69.7 | **49.1** | **51.2** | 44.9 | **65.6** | **64.9** | **69.6** |
Fig. 3.
Argument Identification and Argument Classification F1-scores on ACE2005-EN and ERE for low-resource event argument extraction. The figure compares ECPEAE with the other models under low-resource conditions.
Compared to the base model DEGREE18, ECPEAE achieves improvements across all data proportions, except for a 0.1% reduction in Arg-I at 50% of the ERE dataset. The largest increases on the ACE2005-EN dataset occur when only 5% of the training set is used, where Arg-I improves by 11.7% and Arg-C by 11%. The largest increases on the ERE dataset are also at the 5% proportion, where Arg-I improves by 10.8% and Arg-C by 11%.
Compared to AMPERE20, the SOTA model in the low-resource environment, ECPEAE fails to exceed it on Arg-C in only two of the twelve settings (six proportions on two datasets). The largest increases on the ACE2005-EN dataset occur at the 5% proportion, where Arg-I improves by 2% and Arg-C by 3.2%. The largest increases on the ERE dataset occur at the 20% proportion, where Arg-I improves by 2.2% and Arg-C by 4.8%.
The gains of ECPEAE over DEGREE under low-resource conditions demonstrate the effectiveness of incorporating co-occurring events into the EAE task, and its gains over AMPERE demonstrate the overall superiority of ECPEAE.
Comparison with large language models
To demonstrate the superiority of our model, we conducted experiments with currently widely used Large Language Models (LLMs): ChatGPT-3.5, GPT-4o-mini, GPT-4o, and Llama3-8B. The results are shown in Table 4. Our model outperforms the best-performing LLM, GPT-4o-mini, by 37.22% in Arg-I and 40.77% in Arg-C. These results indicate that LLMs do not excel at complex extraction tasks such as event argument extraction, which aligns with the conclusions of previous studies49,50. The prompt used in the experiments is shown in Table 5.
Table 4.
Comparison experiment results with LLMs.
| Model | Arg-I | Arg-C |
|---|---|---|
| ChatGPT-3.5 | 35.57 | 27.19 |
| GPT4o-mini | 38.30 | 32.68 |
| GPT4o | 34.98 | 30.97 |
| Llama3-8B | 25.11 | 20.15 |
| ECPEAE | 75.52 | 73.45 |
Table 5.
Prompt for LLMs.
| You will perform event argument extraction tasks in the news domain. |
| Please follow the steps below to identify the arguments corresponding to the roles in the sentence delimited by <sent>. |
| If a role does not have a corresponding argument, strictly output None. |
| In step 4, I will provide you with an example marked by <eg>. |
| 1 - The trigger word ’Former’ marked with <t> triggers a Personnel.End-Position event. |
| 2 - The event ’Personnel.End-Position’ corresponds to the list of roles: Person, Entity, Place. |
| 3 - Please output the role names and their corresponding arguments in JSON format. |
| 4 - I will give you an example as follows: |
| <eg> Given a sentence: the <t> former <t> governor of basra province also surrendered to coalition forces today . |
| You need to output: “Person”: “governor”, “Entity”: “basra province”, “Place”: “basra province” <eg>. |
| Sentence: <sent> <t> Former <t> senior banker Callum McCarthy begins what is one of the most important jobs in London ’s financial world in September , when incumbent Howard Davies steps down . <sent> |
Analysis
Argument extraction in complex event relations
To analyze the effectiveness of incorporating information from all events present in the input when handling complex event inputs, we investigate the model's capability in two aspects: multiple event inputs and overlapping events.
Multiple Event Inputs. Table 6 lists the number of events per event mention in the test sets of the ACE2005-EN and ERE datasets: “1” denotes event mentions containing only one event, “2” denotes event mentions containing two events, and so on. We categorize event mentions with only one event as ACE-1 and ERE-1, and event mentions with two or more events as ACE-2+ and ERE-2+, and evaluate the trained DEGREE18 and ECPEAE models on these subsets.
Table 6.
Number of events present in the test set event mentions in ACE2005-EN and ERE.
| Dataset | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| ACE2005-EN | 185 | 82 | 15 | 1 | 1 |
| ERE | 115 | 43 | 12 | 0 | 1 |
As shown in Table 7, when ECPEAE incorporates the semantic information of all events in the event mention, it improves the Arg-C metric by 1.24% over DEGREE on event mentions with co-occurring events (two or more events) in the ACE2005-EN dataset, and by 3.89% on such mentions in the ERE dataset. Even on event mentions without co-occurring events (only one event), ECPEAE performs strongly compared to DEGREE. This demonstrates that incorporating co-occurring event information into ECPEAE contributes to the improvement of the EAE task.
Table 7.
The table shows the performance of DEGREE(EAE) and ECPEAE on event mentions containing only one event versus event mentions containing multiple events, reporting Arg-I and Arg-C F1 scores on the ACE2005-EN and ERE datasets. For each column, we bold the highest score. Our model outperforms DEGREE(EAE) in most settings, for both single-event and multi-event mentions.
| Model | ACE-1 Arg-I | ACE-1 Arg-C | ACE-2+ Arg-I | ACE-2+ Arg-C | ERE-1 Arg-I | ERE-1 Arg-C | ERE-2+ Arg-I | ERE-2+ Arg-C |
|---|---|---|---|---|---|---|---|---|
| DEGREE(EAE) | 72.36 | 69.51 | 73.77 | 71.60 | **78.57** | 70.78 | 73.95 | 66.67 |
| ECPEAE | **76.86** | **74.12** | **74.69** | **72.84** | 77.08 | **73.75** | **75.00** | **70.56** |
Overlapping Events. To study the model's ability to handle complex event relationships, we also identified overlapping events in the test sets of both datasets. Overlapping events are multiple events whose arguments share the same token span. For example, in Fig. 1, the Movement: Transport event triggered by “take” and the Life: Injure event triggered by “injury” both share the text span “someone”, making them overlapping events.
Table 8 shows that ECPEAE outperforms DEGREE(EAE) in handling overlapping events on both datasets: ECPEAE improves the Arg-C metric on overlapping events by 2.03% on the ACE dataset and by 11.42% on the ERE dataset. Moreover, ECPEAE also improves over DEGREE(EAE) in most of the non-overlapping cases. This highlights the superiority of ECPEAE in handling overlapping events.
Table 8.
The table illustrates the capabilities of the two models in handling overlapping events on the two datasets. “Overlap” denotes overlapping events, while “Non-overlap” denotes non-overlapping events. For each column, we bold the highest score.
| Model | ACE2005-EN Overlap Arg-I | Overlap Arg-C | Non-overlap Arg-I | Non-overlap Arg-C | ERE Overlap Arg-I | Overlap Arg-C | Non-overlap Arg-I | Non-overlap Arg-C |
|---|---|---|---|---|---|---|---|---|
| DEGREE(EAE) | 76.61 | 74.85 | 71.68 | 68.92 | 75.27 | 64.52 | **76.41** | 70.15 |
| ECPEAE | **78.03** | **76.88** | **74.54** | **71.83** | **81.28** | **75.94** | 74.00 | **70.61** |
Ablation studies
In this section, we study the effectiveness of each proposed module by adding the modules to the base model DEGREE(EAE), eventually arriving at our final model ECPEAE. Specifically, we compare the following variants:
baseline: the base model DEGREE(EAE), without any additional information.
+ copy: no prefix or bias information is added, but the adjustable copy mechanism is added.
+ copy and prefixes: the copy mechanism and prefix information are added, without bias information.
+ copy and bias: the copy mechanism and bias information are added, without prefix information.
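As a rough illustration of what the adjustable copy mechanism computes, the sketch below mixes the generator's vocabulary distribution with a copy distribution gated by an adjustable weight; this gating form is our assumption, not the paper's exact formulation in section 3.3:

```python
import numpy as np

def adjustable_copy(p_vocab, copy_attn, src_ids, gate):
    # Mix the generator's vocabulary distribution with a copy distribution
    # scattered from attention over the source tokens; `gate` in [0, 1] is
    # the adjustable weight between generating and copying.
    p_copy = np.zeros_like(p_vocab)
    np.add.at(p_copy, src_ids, copy_attn)          # scatter attention mass onto source token ids
    return gate * p_vocab + (1.0 - gate) * p_copy  # convex mixture stays a distribution
```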
As shown in Table 9, compared with the base model DEGREE(EAE), ECPEAE achieves a 2.36% improvement in Arg-I and a 2.75% improvement in Arg-C. Compared with + copy and bias, ECPEAE shows a 1.52% increase in Arg-I and a 2.45% increase in Arg-C. The degraded performance of + copy and bias after removing prefixes is, we believe, mainly due to two reasons. First, no knowledge of co-occurring events is introduced. Second, even without that knowledge, the model still needs the Template-encoder to produce the bias, which greatly increases the number of trained parameters and degrades performance. Therefore, we believe the comparison between + copy and bias and ECPEAE best shows the effectiveness of introducing knowledge of co-occurring events.
Table 9.
Ablation study for the components in the ECPEAE on event argument extraction with ACE2005-EN.
| Model | ACE2005-EN | |
|---|---|---|
| Arg-I | Arg-C | |
| baseline | 73.16 | 70.70 |
| + copy | 74.62 | 72.42 |
| + copy and prefixes | 74.47 | 72.54 |
| + copy and bias | 74.00 | 71.00 |
| ECPEAE | 75.52 | 73.45 |
Prefix position
The backbone of ECPEAE, BART-large, has three modules that compute multi-head attention: self-attention in the encoder, self-attention in the decoder, and cross-attention in the decoder. To find a reasonable position for the prefixes in BART-large, we enumerated combinations of the three attention modules and evaluated the model under each combination. As shown in Table 10, adding prefixes to the encoder self-attention and the decoder cross-attention yields the best performance.
Table 10.
Argument Identification and Argument Classification F1-scores on ACE2005-EN for high-resource event argument extraction. For each column, we bold the highest score. The table shows the effect on event argument extraction when prefixes are placed at different multi-head attention locations in BART-large.
| Encoder self-attention | Decoder self-attention | Decoder cross-attention | Arg-I | Arg-C |
|---|---|---|---|---|
| ✗ | ✗ | ✗ | 74.00 | 71.00 |
| ✓ | ✗ | ✗ | 73.83 | 71.60 |
| ✗ | ✓ | ✗ | 74.31 | 71.55 |
| ✗ | ✗ | ✓ | 75.19 | 72.42 |
| ✓ | ✓ | ✗ | 73.56 | 71.17 |
| ✓ | ✗ | ✓ | **75.51** | **73.44** |
| ✗ | ✓ | ✓ | 71.71 | 69.47 |
| ✓ | ✓ | ✓ | 74.39 | 71.99 |
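All three positions modify attention in the same way: learned prefix key/value states are prepended so that every query position can also attend to the prefix. A minimal single-head sketch of this computation (our simplification of per-layer, per-head prefix-tuning21):

```python
import numpy as np

def attention_with_prefix(q, k, v, prefix_k, prefix_v):
    # Single-head scaled dot-product attention with learned prefix
    # key/value states prepended to the regular keys and values.
    k = np.concatenate([prefix_k, k], axis=0)  # (prefix_len + seq_k, dim)
    v = np.concatenate([prefix_v, v], axis=0)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over prefix + sequence
    return weights @ v                                        # (seq_q, dim)
```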
Case study
To better demonstrate the gain of ECPEAE over the base model DEGREE, we present a case from the ACE2005-EN test set in the full-resource environment. As shown in Fig. 4, the event mention contains two event types: Conflict: Attack, triggered by “strikes”, and Movement: Transport, triggered by “airlifting”. Fig. 4 shows that DEGREE fails to predict “Iraq” for the role Place among the gold arguments of the Conflict: Attack event. ECPEAE, with more information, uses the template knowledge of the co-occurring Movement: Transport event to successfully extract the argument “Iraq” and match it to the role Place.
Fig. 4.
One test set case from ACE2005-EN. The figure shows how co-occurring event information helps the generation in event argument extraction.
Conclusion
In this paper, we differentiate ourselves from previous prompt-based generative approaches by not only considering the current target event type but also integrating information from all events present in the input. This allows our model to generate outputs that consider the complex event structures in the input. To achieve this, we propose the co-occurrences events prefixes module, which extracts information from all event-specific templates present in the input and incorporates them as prefixes in the attention computation of each layer in the model. To enhance the semantic information of the target event-specific template, we extract the information of the current target event’s corresponding template and incorporate it into the model as a bias. Finally, to overcome the potential noise introduced by the additional information in the attention computation of each layer, we introduce an adjustable copy mechanism. Experimental results demonstrate that our model achieves state-of-the-art performance on the ACE2005-EN and ERE datasets, and it also proves its effectiveness in low-resource settings.
Limitation
Our model incorporates information about co-occurring events and has been proven to be effective. However, there is still a lot of room for improvement.
ECPEAE is a generative approach based on prompt learning, so it inherits the limitation of prompt learning, namely manual template design. Future work could focus on automatically generating templates or prompts.
The EAE task is a sub-task of the EE task. Although the ED task has been extensively studied, error propagation between ED and EAE will always exist when the two tasks are studied separately.
Many previous classification-based methods have been proven effective for EAE, but most have not yet been introduced into prompt-based generative approaches. How to integrate these methods into prompt-based generative tasks is also worth exploring.
Error analysis
In this section, we explore the causes of the errors made by different models. We randomly selected 200 samples from the ACE2005-EN test set and manually annotated the incorrectly predicted samples, classifying the errors into five categories:
Wrong Span: This refers to assigning a specific role to an incorrect span that does not overlap with the gold standard.
Over-extract: This refers to the model predicting an argument role that does not exist in the document.
Under-extract: This refers to the model failing to predict any role when the gold standard has that role.
Partial: This refers to cases where some extracted spans are substrings of the gold standard spans.
Overlap: This refers to cases where the extracted spans partially overlap with the gold standard spans.
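The five categories can be operationalized roughly as follows; the span representation (half-open token offsets) and the check order are our assumptions about the annotation rules:

```python
def classify(gold_span, pred_span):
    # gold_span / pred_span: (start, end) token offsets, end exclusive;
    # None means the role is absent on that side.
    if gold_span is None and pred_span is None:
        return "Correct"
    if pred_span is None:
        return "Under-extract"   # gold role exists, nothing predicted
    if gold_span is None:
        return "Over-extract"    # role predicted but absent in gold
    if pred_span == gold_span:
        return "Correct"
    g0, g1 = gold_span
    p0, p1 = pred_span
    if g0 <= p0 and p1 <= g1:
        return "Partial"         # predicted span is a substring of gold
    if p0 < g1 and g0 < p1:
        return "Overlap"         # spans partially overlap
    return "Wrong Span"          # disjoint spans
```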
The detailed results are shown in Table 11. The table shows that ECPEAE demonstrates superior capability compared to DEGREE in avoiding over-extraction and under-extraction, where it reduces errors the most.
Table 11.
The table shows the number of errors of each type made by the two models on 200 randomly selected examples. “Gold” represents the gold label, while “Pred” represents the model's prediction.
| Category | Example | DEGREE | ECPEAE |
|---|---|---|---|
| Wrong Span | Gold: Muslims protested at Baghdad. Pred: Muslims protested at mosque. | 13 | 14 |
| Over-extract | Gold: someone got apartment from Belgrade in Belgrade. Pred: nanny got apartment from somewhere in Belgrade. | 60 | 57 |
| Under-extract | Gold: MCI paid Ebbers in somewhere. Pred: someone paid some other in somewhere. | 46 | 38 |
| Partial | Gold: Hunter and star divorced in Los Angeles. Pred: Hunter divorced in Los Angeles. | 5 | 6 |
| Overlap | Gold: EU met at somewhere. Pred: George Papandreou and EU met at somewhere. | 5 | 4 |
Author contributions
Jiaren Peng: Software, Writing - original draft, Writing - review & edit, Visualization. Wenzhong Yang: Conceptualization, Funding acquisition, Supervision. Fuyuan Wei: Formal analysis, Data curation. Liang He: Formal analysis, Data curation. Long Yao: Software. Hongzhen Lv: Software. All authors reviewed the manuscript.
Funding
This work is supported by the “Tianshan Talent” Research Project of Xinjiang (No. 2022TSYCLJ0037), the National Natural Science Foundation of China (No. 62262065), and the Science and Technology Program of Xinjiang (No. 2022B01008).
Data availability
The datasets used and analysed during the current study are available in the LDC repository, “https://catalog.ldc.upenn.edu/LDC2006T06”.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Peng, J., Yang, W., Wei, F. & He, L. Prompt for extraction: Multiple templates choice model for event extraction. Knowledge-Based Systems 289, 111544. 10.1016/j.knosys.2024.111544 (2024). [Google Scholar]
- 2.Huang, H. et al. A multi-graph representation for event extraction. Artif. Intell. 332, 104144. 10.1016/J.ARTINT.2024.104144 (2024). [Google Scholar]
- 3.Chen, R., Qin, C., Jiang, W. & Choi, D. Is a large language model a good annotator for event extraction? In Wooldridge, M. J., Dy, J. G. & Natarajan, S. (eds.) Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2014, February 20-27, 2024, Vancouver, Canada, 17772–17780, 10.1609/AAAI.V38I16.29730 (AAAI Press,) (2024).
- 4.Zhu, M. et al. LC4EE: llms as good corrector for event extraction. In Ku, L., Martins, A. & Srikumar, V. (eds.) Findings of the Association for Computational Linguistics, ACL 2024, Bangkok, Thailand and virtual meeting, August 11-16, 2024, 12028–12038, 10.18653/V1/2024.FINDINGS-ACL.715 (Association for Computational Linguistics,) (2024).
- 5.Doddington, G. R. et al. The automatic content extraction (ACE) program - tasks, data, and evaluation. In Proceedings of the Fourth International Conference on Language Resources and Evaluation, LREC 2004, May 26-28, 2004, Lisbon, Portugal (European Language Resources Association,) (2004).
- 6.Ahn, D. The stages of event extraction. In Proceedings of the Workshop on Annotating and Reasoning about Time and Events (2006).
- 7.Wei, K. et al. Implicit event argument extraction with argument-argument relational knowledge. IEEE Transactions on Knowledge and Data Engineering 35, 8865–8879 (2022). [Google Scholar]
- 8.Wei, K. et al. Trigger is not sufficient: Exploiting frame-aware knowledge for implicit event argument extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 4672–4682 (2021).
- 9.Zhang, T., Chen, M. & Bui, A. A. T. Diagnostic prediction with sequence-of-sets representation learning for clinical events. In Michalowski, M. & Moskovitch, R. (eds.) Artificial Intelligence in Medicine - 18th International Conference on Artificial Intelligence in Medicine, AIME 2020, Minneapolis, MN, USA, August 25-28, 2020, Proceedings, vol. 12299 of Lecture Notes in Computer Science, 348–358, 10.1007/978-3-030-59137-3_31 (Springer, 2020). [DOI] [PMC free article] [PubMed]
- 10.Li, M. et al. GAIA: A fine-grained multimedia knowledge extraction system. In Celikyilmaz, A. & Wen, T. (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, ACL 2020, Online, July 5-10, 2020, 77–86, 10.18653/V1/2020.ACL-DEMOS.11 (Association for Computational Linguistics,) (2020).
- 11.Costa, T. S., Gottschalk, S. & Demidova, E. Event-qa: A dataset for event-centric question answering over knowledge graphs. In d’Aquin, M., Dietze, S., Hauff, C., Curry, E. & Cudré-Mauroux, P. (eds.) CIKM ’20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 19-23, 2020, 3157–3164, 10.1145/3340531.3412760 (ACM, 2020).
- 12.Wu, J., Mu, T., Thiyagalingam, J. & Goulermas, J. Y. Memory-aware attentive control for community question answering with knowledge-based dual refinement. IEEE Transactions on Systems, Man, and Cybernetics: Systems 53, 3930–3943 (2023). [Google Scholar]
- 13.Wu, J. et al. Question-aware dynamic scene graph of local semantic representation learning for visual question answering. Pattern Recognition Letters 170, 93–99 (2023). [Google Scholar]
- 14.Zhang, H., Liu, X., Pan, H., Song, Y. & Leung, C. W. ASER: A large-scale eventuality knowledge graph. In Huang, Y., King, I., Liu, T. & van Steen, M. (eds.) WWW ’20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 201–211, 10.1145/3366423.3380107 (ACM / IW3C2,) (2020).
- 15.Rospocher, M. et al. Building event-centric knowledge graphs from news. J. Web Semant. 37–38, 132–151. 10.1016/j.websem.2015.12.004 (2016). [Google Scholar]
- 16.Li, Z., Ding, X. & Liu, T. Constructing narrative event evolutionary graph for script event prediction. In Lang, J. (ed.) Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, 4201–4207, 10.24963/ijcai.2018/584 (ijcai.org, 2018).
- 17.He, Y., Hu, J. & Tang, B. Revisiting event argument extraction: Can EAE models learn better when being aware of event co-occurrences? In Rogers, A., Boyd-Graber, J. L. & Okazaki, N. (eds.) Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, 12542–12556, 10.18653/v1/2023.acl-long.701 (Association for Computational Linguistics,) (2023).
- 18.Hsu, I. et al. DEGREE: A data-efficient generation-based event extraction model. In Carpuat, M., de Marneffe, M. & Ruíz, I. V. M. (eds.) Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022, Seattle, WA, United States, July 10-15, 2022, 1890–1908, 10.18653/v1/2022.naacl-main.138 (Association for Computational Linguistics,) (2022).
- 19.Ma, Y. et al. Prompt for extraction? PAIE: prompting argument interaction for event argument extraction. In Muresan, S., Nakov, P. & Villavicencio, A. (eds.) Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, 6759–6774, 10.18653/v1/2022.acl-long.466 (Association for Computational Linguistics,) (2022).
- 20.Hsu, I., Xie, Z., Huang, K., Natarajan, P. & Peng, N. AMPERE: amr-aware prefix for generation-based event argument extraction model. In Rogers, A., Boyd-Graber, J. L. & Okazaki, N. (eds.) Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, 10976–10993, 10.18653/v1/2023.acl-long.615 (Association for Computational Linguistics,) (2023).
- 21.Li, X. L. & Liang, P. Prefix-tuning: Optimizing continuous prompts for generation. In Zong, C., Xia, F., Li, W. & Navigli, R. (eds.) Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, 4582–4597, 10.18653/V1/2021.ACL-LONG.353 (Association for Computational Linguistics,) (2021).
- 22.Zhang, G. et al. Hyperspherical multi-prototype with optimal transport for event argument extraction. In Ku, L., Martins, A. & Srikumar, V. (eds.) Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024, 9271–9284, 10.18653/V1/2024.ACL-LONG.502 (Association for Computational Linguistics,) (2024).
- 23.Cao, H. et al. Oneee: A one-stage framework for fast overlapping and nested event extraction. In Calzolari, N. et al. (eds.) Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022, 1953–1964 (International Committee on Computational Linguistics,) (2022).
- 24.Wang, Z. et al. CLEVE: contrastive pre-training for event extraction. In Zong, C., Xia, F., Li, W. & Navigli, R. (eds.) Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, 6283–6297, 10.18653/v1/2021.acl-long.491 (Association for Computational Linguistics,) (2021).
- 25.Fei, H., Ren, Y. & Ji, D. A tree-based neural network model for biomedical event trigger detection. Inf. Sci. 512, 175–185. 10.1016/J.INS.2019.09.075 (2020). [Google Scholar]
- 26.Chen, Y., Xu, L., Liu, K., Zeng, D. & Zhao, J. Event extraction via dynamic multi-pooling convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers, 167–176, 10.3115/v1/p15-1017 (The Association for Computer Linguistics,) (2015).
- 27.Nguyen, T. M. & Nguyen, T. H. One for all: Neural joint modeling of entities and events. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, 6851–6858, 10.1609/aaai.v33i01.33016851 (AAAI Press, 2019).
- 28.Nguyen, T. H., Cho, K. & Grishman, R. Joint event extraction via recurrent neural networks. In Knight, K., Nenkova, A. & Rambow, O. (eds.) NAACL HLT 2016, The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego California, USA, June 12-17, 2016, 300–309, 10.18653/v1/n16-1034 (The Association for Computational Linguistics,) (2016).
- 29.Lin, Y., Ji, H., Huang, F. & Wu, L. A joint neural model for information extraction with global features. In Jurafsky, D., Chai, J., Schluter, N. & Tetreault, J. R. (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, 7999–8009, 10.18653/v1/2020.acl-main.713 (Association for Computational Linguistics,) (2020).
- 30.Wadden, D., Wennberg, U., Luan, Y. & Hajishirzi, H. Entity, relation, and event extraction with contextualized span representations. In Inui, K., Jiang, J., Ng, V. & Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3-7, 2019, 5783–5788, 10.18653/v1/D19-1585 (Association for Computational Linguistics,) (2019).
- 31.Yang, S., Feng, D., Qiao, L., Kan, Z. & Li, D. Exploring pre-trained language models for event extraction and generation. In Korhonen, A., Traum, D. R. & Màrquez, L. (eds.) Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, 5284–5294, 10.18653/v1/p19-1522 (Association for Computational Linguistics,) (2019).
- 32.Wei, K. et al. Guide the many-to-one assignment: Open information extraction via iou-aware optimal transport. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 4971–4984 (2023).
- 33.Sheng, J. et al. Casee: A joint learning framework with cascade decoding for overlapping event extraction. In Zong, C., Xia, F., Li, W. & Navigli, R. (eds.) Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, Online Event, August 1-6, 2021, vol. ACL/IJCNLP 2021 of Findings of ACL, 164–174, 10.18653/v1/2021.findings-acl.14 (Association for Computational Linguistics, 2021).
- 34.Wang, S., Yu, M., Chang, S., Sun, L. & Huang, L. Query and extract: Refining event extraction as type-oriented binary decoding. In Muresan, S., Nakov, P. & Villavicencio, A. (eds.) Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, May 22-27, 2022, 169–182, 10.18653/v1/2022.findings-acl.16 (Association for Computational Linguistics, 2022).
- 35.Lu, Y. et al. Text2event: Controllable sequence-to-structure generation for end-to-end event extraction. In Zong, C., Xia, F., Li, W. & Navigli, R. (eds.) Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, 2795–2806, 10.18653/v1/2021.acl-long.217 (Association for Computational Linguistics, 2021).
- 36.Liu, X., Huang, H., Shi, G. & Wang, B. Dynamic prefix-tuning for generative template-based event extraction. In Muresan, S., Nakov, P. & Villavicencio, A. (eds.) Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, 5216–5228, 10.18653/v1/2022.acl-long.358 (Association for Computational Linguistics, 2022).
- 37.Li, S., Ji, H. & Han, J. Document-level event argument extraction by conditional generation. In Toutanova, K. et al. (eds.) Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online, June 6-11, 2021, 894–908, 10.18653/v1/2021.naacl-main.69 (Association for Computational Linguistics, 2021).
- 38.Lu, Y. et al. Unified structure generation for universal information extraction. In Muresan, S., Nakov, P. & Villavicencio, A. (eds.) Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, 5755–5772, 10.18653/v1/2022.acl-long.395 (Association for Computational Linguistics, 2022).
- 39.Lou, J. et al. Universal information extraction as unified semantic matching. In Williams, B., Chen, Y. & Neville, J. (eds.) Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2023, Thirteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2023, Washington, DC, USA, February 7-14, 2023, 13318–13326, 10.1609/aaai.v37i11.26563 (AAAI Press, 2023).
- 40.Fei, H., Wu, S., Ren, Y. & Zhang, M. Matching structure for dual learning. In Chaudhuri, K. et al. (eds.) International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA, vol. 162 of Proceedings of Machine Learning Research, 6373–6391 (PMLR, 2022).
- 41.Fei, H. et al. Lasuie: Unifying information extraction with latent adaptive structure-aware generative language model. In Koyejo, S. et al. (eds.) Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022 (2022).
- 42.Zhuang, L., Fei, H. & Hu, P. Knowledge-enhanced event relation extraction via event ontology prompt. Inf. Fusion 100, 101919. 10.1016/j.inffus.2023.101919 (2023).
- 43.Li, J. et al. Unified named entity recognition as word-word relation classification. In Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI 2022, The Twelfth Symposium on Educational Advances in Artificial Intelligence, EAAI 2022, Virtual Event, February 22 - March 1, 2022, 10965–10973, 10.1609/aaai.v36i10.21344 (AAAI Press, 2022).
- 44.Lewis, M. et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Jurafsky, D., Chai, J., Schluter, N. & Tetreault, J. R. (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, 7871–7880, 10.18653/v1/2020.acl-main.703 (Association for Computational Linguistics, 2020).
- 45.Raffel, C. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 21, 140:1–140:67 (2020).
- 46.See, A., Liu, P. J. & Manning, C. D. Get to the point: Summarization with pointer-generator networks. In Barzilay, R. & Kan, M. (eds.) Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, 2017, Volume 1: Long Papers, 1073–1083, 10.18653/v1/P17-1099 (Association for Computational Linguistics, 2017).
- 47.Du, X. & Cardie, C. Event extraction by answering (almost) natural questions. In Webber, B., Cohn, T., He, Y. & Liu, Y. (eds.) Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, 671–683, 10.18653/v1/2020.emnlp-main.49 (Association for Computational Linguistics, 2020).
- 48.Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019 (OpenReview.net, 2019).
- 49.Chen, R., Qin, C., Jiang, W. & Choi, D. Is a large language model a good annotator for event extraction? In Wooldridge, M. J., Dy, J. G. & Natarajan, S. (eds.) Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2024, February 20-27, 2024, Vancouver, Canada, 17772–17780, 10.1609/aaai.v38i16.29730 (AAAI Press, 2024).
- 50.Ma, Y., Cao, Y., Hong, Y. & Sun, A. Large language model is not a good few-shot information extractor, but a good reranker for hard samples! In Findings of the Association for Computational Linguistics: EMNLP 2023, 10572–10601 (2023).
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
The datasets used and analysed during the current study are available in the LDC repository, “https://catalog.ldc.upenn.edu/LDC2006T06”.