Abstract
In scientific writing, positive credit and negative criticism often appear in the text surrounding citations, providing useful clues about whether a study can be reproduced. In this study, we focus on citation sentiment analysis, which aims to determine the sentiment polarity that the citation context carries towards the cited paper. We first annotated a citation sentiment corpus of clinical trial papers. Using this corpus, we then examined the effectiveness of word n-gram, sentiment lexicon, and problem-specific structure features for citation sentiment analysis. The combination of all three feature sets achieved the highest Micro-F score of 0.860 and Macro-F score of 0.719, indicating that machine learning methods are feasible for citation sentiment analysis in biomedical publications. A comprehensive comparison between citation sentiment analysis of clinical trial papers and of other, general domains was also conducted, highlighting the unique challenges within this domain.
INTRODUCTION
Sentiment analysis aims to determine the sentiment polarity conveyed by a segment of text with respect to a specific entity (the opinion target)1,2. This entity can be an individual, an object, or a topic. With the expanding use of social media, such as blogs and social networks, since the early 2000s, sentiment analysis has attracted increasing attention. Massive amounts of opinion data conveying personal sentiments are available online, which are valuable for business intelligence applications such as product search engines3. These intelligent systems are supported by sentiment analysis methods that automatically identify sentiments in product and service reviews3, 4. A quick perusal of existing work shows that most state-of-the-art studies approach sentiment analysis as a classification problem. Depending on the granularity of the text, sentiment analysis can be conducted at three levels: document-level5, sentence-level6, and aspect-level4. Typically, sentiment corpora are composed of reviews, e.g., movie reviews5 and product reviews4, news articles6, 7, and tweets8, 9. Researchers have applied different strategies for sentiment analysis, including keyword spotting, lexical affinity, statistical methods, and concept-level techniques10. In the biomedical domain, sentiment analysis has already been used for the detection of adverse drug reactions11, assessment of suicide risk12, evaluation of doctor services13, hospital quality surveys14, and public health policy making15.
Citation sentiment analysis is an application of sentiment analysis within citation content analysis16, which aims to determine the sentiment polarity that the citation context carries towards the cited paper. In citation sentiment analysis, the opinion target is the cited paper, and the sentiment polarity can be positive, negative, or neutral. Citation content analysis studies are useful for investigating the numerical, literal, and sociocultural17 aspects of citations, focusing on the classification of citation functions18, roles19, and academic influence measures20. Some of the categories in these studies are also related to citation sentiment, e.g., ‘Contrast/Conflict’ and ‘Similarity/Consistency’ in the study by Agarwal et al19. Recently, citation sentiment analysis has emerged as a novel research topic in this area. Athar and Teufel21, 22 proposed using citation sentiment analysis to identify whether a particular act of citing was done with positive or negative intent in computational linguistics publications.
One potential application of citation sentiment analysis is to improve bibliometric measures. In traditional bibliometrics, the impact of a research article is usually measured by citation frequency. Diverse citation-counting-based measures, such as the Hirsch index23 and the g-index24, offer a quantitative proxy for impact. However, all citations are treated equally in these measures, without distinguishing the essential function of each citation, i.e., whether the cited work is credited or criticized. Citation sentiment analysis could therefore improve existing bibliometric measures by introducing different weights based on the specific sentiment of each citation. Moreover, it can provide evidence for scientific authoring support25 and citation bias analysis26, 27.
Another potential application of citation sentiment analysis in the biomedical domain is to detect non-reproducible studies. Over the last few years, the reproducibility of existing research has become an important issue in the biomedical domain. In a recent research paper, researchers from Amgen Inc. found that 47 of 53 ‘landmark’ oncology publications could not be replicated28. Moreover, increasing numbers of research articles are being retracted from prestigious journals due to reproducibility issues. It is therefore important to detect non-reproducible studies as early as possible, to avoid wasting resources, e.g., on expensive drug-discovery projects that attempt to confirm non-reproducible results. Here we propose that the sentiment polarity conveyed in the citation context could be a useful clue to the reproducibility of the cited paper. For example,
I. Thus, inhibition of SATB1 expression does not appear to alter the aggressive phenotype of breast cancer cell lines in vitro, in contrast to the results reported by Han et al. [1].
In example I above, a negative opinion is expressed towards the cited paper (reference id: 1), implying that the citing work could not reproduce the cited work. The reproducibility of a study could thus be evaluated by aggregating the sentiments dispersed across the publications that cite it. Citation sentiment analysis therefore constitutes a preliminary step towards reproducibility evaluation. However, there are few studies on citation sentiment analysis in the biomedical domain. The most closely related study is from Yu25, who analyzed the needs for citation sentiment analysis and suggested that it should consider the sentiment of both specific claims and citations.
To address the critical problem of evaluating research reproducibility, this study proposes to identify citation sentiment in clinical trial papers using machine learning methods. We first constructed a citation-level annotated sentiment analysis corpus composed of 285 clinical-trial-related publications. Based on this corpus, we developed a machine learning classifier and systematically compared different features and feature combinations. The experimental results show that it is feasible to use machine learning classifiers to identify citation sentiment in biomedical publications. Notably, both the annotation and the detection of citation sentiment are context-enhanced. To the best of our knowledge, this is one of the first studies to address automated citation sentiment analysis in the biomedical domain.
METHODS
This study attempts to classify the sentiment polarity of citations in clinical trial papers, i.e., the sentiment relation between the citing and the cited paper, based on sentiment analysis of the citation context. We first annotated a citation sentiment analysis corpus containing the discussion sections extracted from 285 clinical trial papers. Citation sentiment polarity was annotated at the citation level following an annotation guideline. A simple rule-based method was used to extract the citation context, a set of on-topic sentences. Citation sentiment polarity was then classified using machine learning methods with features extracted from the citation context. Performance was evaluated using 10-fold cross-validation.
Data Set Annotation
Data Set Selection
The data used in this study are derived from clinical trial papers. Some clinical trials are ‘self-correcting’ studies, founded on the replication of earlier trials. Clinical trial papers therefore contain the opinionated citations we sought and suit our focus on the reproducibility of biomedical research. We used the query “clinical trial[Title] AND (“research and review articles”[filter] AND “open access”[filter] AND “2004/06/01”[PubDate]: “2014/05/31”[PubDate])” to search PubMed Central (PMC)*. We then downloaded all the papers in the search results from the PMC archive using the OAI-PMH service† and randomly selected 285 papers for corpus construction. Clinical trial papers have a purely descriptive style and a very standardized structure known as Introduction, Methods, Results, and Discussion (IMRAD). Most opinionated citations were found in the discussion section; therefore, we extracted only the content of the discussion section. A multiple-reference citation, e.g., ‘[1], [3]’ or ‘[3–10]’, was treated as a single citation, since its references share the same sentiment polarity. Based on the above criteria, we obtained 4182 citations.
Annotation Schema
In previous work21, 22, each sentence containing a citation was treated as the citation context, and sentiment was labeled at the sentence level. In our study, we annotated each citation based on its context. Citation-level annotation was adopted for the following reasons. First, in scientific papers, the context scope of a citation varies widely, from a clause to several sentences, or even a paragraph. Second, a single sentence may contain more than one citation, each with a different sentiment polarity.
Based on our observations of the problem, we developed a simple and practical annotation scheme, illustrated as a decision tree in Figure 1 below. No instructions about the use of cue phrases were given. This study does not consider the correctness of the claims, ideas, results, or conclusions presented in the citing or cited paper. The opinion target is the cited paper, not a scientific claim. For example, both citations in “Xxx et al. [3] concluded that …, which did not agree with Yyy et al. [7].” are labeled as neutral.
Figure 1.
Simple annotation scheme designed as a decision tree
Annotation Process
Three annotators were employed for the annotation task. Following the scheme discussed above, all citations were annotated by the two primary annotators, one of whom was a physician and served as the domain expert. The third annotator was involved when the two primary annotators had differing annotations for the same citation.
Citation Context Extraction
Citation context is defined as the textual statement that contains the citation: a set of on-topic sentences (those describing the target citation) surrounding it. We hypothesized that the citation context provides relevant and related information that helps identify the citation sentiment. The sentiment of most citations can be determined from the sentence containing the citation itself; however, some require further sentential context. A simple rule-based method was developed to extract the context for each citation. First, for each citation (the target citation), we extracted the whole sentence containing it. Then, if the next sentence did not contain other citations and there was a contrastive discourse relation between the two sentences, the next sentence was also added to the citation context. To determine the contrastive relation, a small set of cue-phrase-based patterns was used, including ‘however(,) …’, ‘conversely, …’, etc. This citation context extraction partially addresses the problem of complex discourse structure, where the true opinion may be expressed outside the citing sentence.
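As a rough sketch, the extraction rule described above might look like the following; the cue-phrase list and the citation-marker pattern are illustrative assumptions, since the paper names only ‘however’ and ‘conversely’ explicitly:

```python
import re

# Illustrative sketch of the rule-based context extraction. The cue-phrase
# list and citation-marker regex are assumptions, not the paper's exact rules.
CONTRAST_CUES = re.compile(
    r'^(however|conversely|in contrast|nevertheless)\b', re.IGNORECASE)
CITATION = re.compile(r'\[\d+(?:\s*[,\u2013-]\s*\d+)*\]')  # e.g. [1], [3-10]

def extract_context(sentences, i):
    """Return the context for the citation in sentences[i]: the citing
    sentence, plus the following sentence when it opens with a contrastive
    cue and contains no citation of its own."""
    context = [sentences[i]]
    if i + 1 < len(sentences):
        nxt = sentences[i + 1]
        if CONTRAST_CUES.match(nxt) and not CITATION.search(nxt):
            context.append(nxt)
    return ' '.join(context)

sents = ["Our results differ from those of Han et al. [1].",
         "However, the sample sizes were not comparable."]
print(extract_context(sents, 0))  # both sentences joined
```

With this rule, the second sentence is pulled into the context of the citation in the first, which is exactly the case where the true opinion spills past the citing sentence.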
Citation Sentiment Polarity Classification
Features
Considering the characteristics of citation sentiment analysis, we systematically extracted the following features: word n-grams, sentiment lexicon features, and problem-specific structure features.
(1) Word n-grams
All unigrams, bigrams, and trigrams in the citation context were extracted as features. Since negation words affect polarity, they must be taken into account in sentiment analysis. The total count of negation phrases found within the citation context was therefore extracted as a feature. Furthermore, artificial words such as ‘NOT_w’ were generated when the word ‘w’ fell within the scope of a negation. In this study, the negation scope was set to the two words to the right of a negation.
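The negation handling can be sketched as follows; the negation word list is an assumption, while the ‘NOT_’ prefixing and the two-word scope follow the text above:

```python
# Sketch of the negation-aware n-gram features. The NEGATIONS set is an
# assumption; the 'NOT_' prefix and two-word scope follow the paper.
NEGATIONS = {'not', 'no', 'never', 'without'}

def mark_negation(tokens, scope=2):
    """Prefix the `scope` tokens following a negation word with 'NOT_'."""
    out, remaining = [], 0
    for tok in tokens:
        if tok.lower() in NEGATIONS:
            out.append(tok)
            remaining = scope
        elif remaining > 0:
            out.append('NOT_' + tok)
            remaining -= 1
        else:
            out.append(tok)
    return out

def ngrams(tokens, n):
    """All contiguous n-grams, joined with underscores."""
    return ['_'.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

toks = mark_negation("does not appear to alter the phenotype".split())
# -> ['does', 'not', 'NOT_appear', 'NOT_to', 'alter', 'the', 'phenotype']
```

The marked tokens are then fed to `ngrams` with n = 1, 2, 3, so that ‘NOT_appear’ and ‘appear’ become distinct features.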
(2) Sentiment lexicons
Researchers use sentiment words to express opinions, which can serve as clues for citation sentiment analysis. We manually created a sentiment lexicon specific to biomedical research papers, comprising 53 positive and 46 negative phrases extracted from the corpus we built. Most of them are high-frequency words in the biomedical literature. The presence or absence of each sentiment word was extracted as a feature, in addition to its polarity. We assigned the opposite polarity when a sentiment word was detected within the scope of a negation word.
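A minimal sketch of the lexicon features follows; the phrase sets are toy stand-ins for the 53 positive and 46 negative entries (the actual lexicon is not reproduced here), and the polarity flip under negation is omitted:

```python
# Toy stand-ins for the paper's lexicon; entries here are illustrative only.
POSITIVE = {'consistent with', 'in agreement with', 'confirmed'}
NEGATIVE = {'in contrast to', 'failed to', 'inconsistent with'}

def lexicon_features(context):
    """Count lexicon hits and record their presence/absence as features."""
    text = context.lower()
    pos = sum(p in text for p in POSITIVE)
    neg = sum(n in text for n in NEGATIVE)
    return {'pos_hits': pos, 'neg_hits': neg,
            'has_pos': int(pos > 0), 'has_neg': int(neg > 0)}

print(lexicon_features("Our data are consistent with that of other studies [TC]."))
```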
(3) Structure features
Based on observation of the examples, we found that citation sentiment is constrained by intra- and inter-sentence features, including contrastive discourse relations, sentiment words, negation words, comparative relations, and the relative position and direction of the citing work, the target, and other non-target cited works. We extracted and combined these features in order to generate structure features, which have better representational power because they retain intra- and inter-sentence sequential relations. In scientific writing, researchers often use particular constructions to express sentiment about a citation, such as comparative expressions and complex discourse. We therefore designed a feature extraction pipeline to capture these complex structures. As shown in Table 1, the pipeline is composed of two steps: 1) extract and tag all citing work tokens, comparative tokens, negation words, sentiment tokens, and contrastive discourse cue words in the citation context. This is rule-based processing: we built a series of cue-phrase-based patterns to recognize concept entities such as citing works, negation words, comparative expressions, and contrastive discourse relations; 2) generate the unigrams, bigrams, and trigrams of the recognized concepts as features. If there is a contrastive discourse relation in the citation context, we also generate direction features, representing the position and direction information of the citing work, cited works, and the contrast cues, as shown in the last row of Table 1.
Table 1.
An illustration of the structure features extraction.
| Example | Our data are generally consistent with that of other studies [TC], but not with studies where a single dose of paracetamol was administered [OC]. | These values are lower than those reported by French et al. [20]. | |
|---|---|---|---|
| Step 1 | [Our data]CITINGWORK are generally [consistent with]POSITIVE that of other studies [TC]TC, [but]CONTRAST [not]NEGATION with studies where a single dose of paracetamol was administered [OC]OC | [These values]CITINGWORK are [lower]COMPARATIVE [than]THAN those reported by French et al. [TC]TC | |
| Step 2 | Unigram | ‘CITINGWORK’, ‘POSITIVE’, ‘TC’, ‘CONTRAST’, ‘NEGATION’, ‘OC’ | ‘CITINGWORK’, ‘COMPARATIVE’,‘THAN’, ‘TC’ |
| Bigram | ‘CITINGWORK_POSITIVE’, ‘POSITIVE_TC’,‘TC_CONTRAST’, ‘CONTRAST_NEGATION’, ‘NEGATION_OC’ | ‘CITINGWORK_COMPARATIVE’, ‘COMPARATIVE_THAN’, ‘THAN_TC’ | |
| Trigram | ‘CITINGWORK_POSITIVE_TC’, ‘POSITIVE_TC_CONTRAST’, ‘TC_CONTRAST_NEGATION’, ‘CONTRAST_NEGATION_OC’ | ‘CITINGWORK_COMPARATIVE_THAN’, ‘COMPARATIVE_THAN_TC’ | |
| Direction | ‘CITINGWORK_CONTRAST_DIR’, ‘TC_CONTRAST_DIR’, ‘CONTRAST_OC_DIR’ | ||
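Step 2 of the pipeline can be sketched as below: the function consumes the concept tags produced by step 1 and emits the n-gram and direction features of Table 1. Restricting direction features to the work tokens (CITINGWORK/TC/OC) is inferred from the table's last row, not stated explicitly in the text:

```python
# Sketch of step 2: concept tags -> unigram/bigram/trigram features, plus
# direction features encoding position relative to a contrast cue.
# Limiting direction features to work tokens is inferred from Table 1.
WORK_TAGS = {'CITINGWORK', 'TC', 'OC'}

def structure_features(tags):
    feats = []
    for n in (1, 2, 3):  # concept unigrams, bigrams, trigrams
        feats += ['_'.join(tags[i:i + n]) for i in range(len(tags) - n + 1)]
    if 'CONTRAST' in tags:  # direction relative to the contrast cue
        pivot = tags.index('CONTRAST')
        for i, t in enumerate(tags):
            if t in WORK_TAGS:
                feats.append(t + '_CONTRAST_DIR' if i < pivot
                             else 'CONTRAST_' + t + '_DIR')
    return feats

# Tag sequence for the first example in Table 1
tags = ['CITINGWORK', 'POSITIVE', 'TC', 'CONTRAST', 'NEGATION', 'OC']
print(structure_features(tags))
```

On this tag sequence the function reproduces the features listed in Table 1, including the direction features ‘CITINGWORK_CONTRAST_DIR’, ‘TC_CONTRAST_DIR’, and ‘CONTRAST_OC_DIR’.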
Machine Learning Algorithms
Before feature extraction, the reference markers in each sentence, such as ‘[3]’, ‘[8, 9]’, and ‘[2, 6–10]’, were identified. A citation context may contain several citations; consequently, we replaced the text of the target citation and of other citations with the tokens ‘[TC]’ and ‘[OC]’, respectively. We then extracted features from the citation context and represented each context as a feature vector. A Support Vector Machine (SVM) was employed for its known performance on previous sentiment analysis tasks; specifically, the LIBLINEAR29 implementation was used. SVM parameters were optimized on the training set.
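A hedged sketch of this setup is shown below using scikit-learn's LinearSVC, which wraps the same LIBLINEAR library; the contexts and labels are toy placeholders, not the annotated corpus:

```python
# Toy sketch of the classification setup; scikit-learn's LinearSVC wraps
# LIBLINEAR. The training data below are placeholders, not the real corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

contexts = ["our findings are consistent with the results of [TC]",
            "in contrast to the findings of [TC] we observed no effect",
            "[TC] reported the prevalence of the disease in this cohort"]
labels = ["positive", "negative", "neutral"]

# Word 1-3-grams as features, linear SVM as classifier
clf = make_pipeline(CountVectorizer(ngram_range=(1, 3)), LinearSVC())
clf.fit(contexts, labels)
print(clf.predict(["these results are in contrast to [TC]"]))
```

In the paper's setup the `CountVectorizer` stage would be replaced by the full feature vector (n-grams, lexicon, and structure features) described above.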
Evaluation
This study adopted standard evaluation metrics. We report accuracy (Acc.) and the Micro- and Macro-F scores across all categories. As the corpus is imbalanced, we further report precision (P), recall (R), and F-measure (F) for each category. In the experiments, 10-fold cross-validation was performed; all performances reported below are averages over the 10 folds. We report results with different combinations of features.
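The reported metrics can be computed as in the sketch below (toy labels): Macro-F averages the per-class F scores, while Micro-F pools true/false positives and negatives across classes before computing F.

```python
# Per-class precision/recall/F, plus macro- and micro-averaged F.
def prf(gold, pred, cls):
    tp = sum(g == p == cls for g, p in zip(gold, pred))
    fp = sum(p == cls and g != cls for g, p in zip(gold, pred))
    fn = sum(g == cls and p != cls for g, p in zip(gold, pred))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f

def macro_f(gold, pred, classes):
    return sum(prf(gold, pred, c)[2] for c in classes) / len(classes)

def micro_f(gold, pred, classes):
    tp = sum(sum(g == p == c for g, p in zip(gold, pred)) for c in classes)
    fp = sum(sum(p == c and g != c for g, p in zip(gold, pred)) for c in classes)
    fn = sum(sum(g == c and p != c for g, p in zip(gold, pred)) for c in classes)
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

gold = ['neutral', 'neutral', 'positive', 'negative']
pred = ['neutral', 'positive', 'positive', 'neutral']
classes = ['positive', 'negative', 'neutral']
print(macro_f(gold, pred, classes), micro_f(gold, pred, classes))
```

Note that, pooled over all classes of a single-label task as here, Micro-F coincides with accuracy; the paper does not specify its exact aggregation.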
RESULTS
Table 2 shows descriptive statistics of the corpus, which contains 702 positive, 308 negative, and 3172 neutral samples. The kappa value between the two primary annotators’ annotations was 0.504, a moderate level of agreement.
Table 2.
Descriptive statistics of the corpus.
| Number of Documents | Number of Citations | # Positive | #Neutral | #Negative |
|---|---|---|---|---|
| 285 | 4182 | 702 (16.79%) | 3172 (75.85%) | 308 (7.36%) |
Table 3 shows the results of citation sentiment analysis on the annotated corpus. Different combinations of features were tested; the default parameters of the LIBLINEAR package were used. We treat the results achieved with word n-gram features as the baseline. Adding sentiment lexicon features or structure features significantly increased the recall, and the overall performance, for both the negative and positive classes. With a combination of all the features, we reached the best overall performance (accuracy of 0.870) and the highest F for each class: 0.511 for negative, 0.924 for neutral, and 0.723 for positive.
Table 3.
Results of 10-fold cross-validation using different combinations of features.
| Features | Overall | Per Category | |||||
|---|---|---|---|---|---|---|---|
| Acc. | Micro-F | Macro-F | Category | P | R | F | |
| (1) | 0.853 | 0.834 | 0.647 | Negative | 0.630 | 0.244 | 0.351 |
| Neutral | 0.867 | 0.972 | 0.916 | ||||
| Positive | 0.802 | 0.581 | 0.674 | ||||
| (1)+(2) | 0.869 | 0.858 | 0.714 | Negative | 0.700 | 0.386 | 0.498 |
| Neutral | 0.885 | 0.967 | 0.924 | ||||
| Positive | 0.822 | 0.638 | 0.719 | ||||
| (1)+(3) | 0.868 | 0.856 | 0.711 | Negative | 0.713 | 0.380 | 0.496 |
| Neutral | 0.882 | 0.969 | 0.923 | ||||
| Positive | 0.825 | 0.630 | 0.714 | ||||
| (1)+(2)+(3) | 0.870 | 0.860 | 0.719 | Negative | 0.711 | 0.399 | 0.511 |
| Neutral | 0.886 | 0.966 | 0.924 | ||||
| Positive | 0.823 | 0.644 | 0.723 | ||||
Feature sets: (1) Word n-grams; (2) Sentiment lexicons; (3) Structure features.
DISCUSSION
We conducted a study on identifying the sentiment polarity of citation contexts in clinical trial papers using machine learning methods. The combined features from word n-grams, sentiment lexicons, and structure information achieved the highest Micro-F score of 0.860 and Macro-F score of 0.719, validating the promise of automated machine learning methods for analyzing sentiment polarity in citation contexts from the biomedical literature.
In this study, we constructed a citation analysis corpus from 285 randomly selected clinical-trial-related research articles. To the best of our knowledge, this is the first citation sentiment analysis corpus in the biomedical domain. The annotators had a moderate inter-annotator agreement, with a kappa of 0.504, suggesting that citation sentiment identification is challenging even for domain experts. We also found that biomedical researchers expressed their sentiments in many different ways that cannot be identified simply by using sentiment lexicons. Sometimes the annotators had to link distant citation context to determine the sentiment polarity, which makes automatic identification with machine learning methods even more difficult.
Compared to general sentiment analysis studies on product reviews, news articles, and Twitter, citation sentiment analysis of the biomedical literature presents several unique challenges.
- Most citations were neutral with respect to sentiment. The purpose of citing was to obtain information, such as background, methods, and data, from the cited work, and authors typically used objective words in the citation context. As a result, the corpus we constructed for citation sentiment analysis is imbalanced, as shown in Table 2.
- Sentiment polarity was often expressed in a comparative sentence. Typically, authors used comparative claims to express opinions, either refuting or confirming a previous work, as comparative claims clearly describe a similarity or difference relation between the citing and the cited work. The negativity or positivity was implied by the comparison with the authors’ own work. An example is noted below:
III. Compared with previous studies using intratumoral or intravenous infusion of rAd-p53 [6], this study achieved higher overall clinical response rates with intra-arterial administration.
- Scientists are reluctant to be critical when they cannot reproduce previous results. In some cases, the expression of citation sentiment is very implicit, without any explicit linguistic cues such as negative words. See below for an example.
IV. It has been reported that the inclusion of micronutrients can decrease the number of acute respiratory infections [36]. In the present study, no differences were found in respiratory tract infections among treatment groups.
- The context scope of a citation varied widely, from a single clause to multiple sentences. Meanwhile, adversative transition words were used extensively to connect these clauses or sentences. The citation sentiment was constrained and implied by these intra- and inter-sentence discourse relations. A multi-sentence example is provided below.
V. Previous reports by Ulrichs et al. have shown that IFN-g producing cells against ESAT6 in tuberculosis patients increase post-tuberculosis therapy [18]. However, we were unable to demonstrate any difference in ESAT6-induced IFN- γ responses between patients prior to and post-treatment. This is in concordance with a recent report by Coussens et al. [14] that did not show any change in ESAT-6 induced IFN-g in patients’ post-antituberculous therapy.
In conclusion, imbalanced data, comparative expression, implicit sentiment and complex discourse structure made the citation sentiment analysis more challenging in the biomedical domain.
Clinical-trial-related biomedical literature is usually about biomedical findings, and biomedical researchers express sentiment indicating whether their work confirms, supports, or agrees with the cited paper. The literature in other domains, such as computer science, is more about theoretical models, algorithms, and application systems. Scientists therefore express sentiment differently in different domains. For example, “agree with” and “confirm” are common in positive citation contexts in the biomedical domain but uncommon in the computer science literature.
Although the overall performance across the three sentiment categories is promising, the performance for the negative category still needs improvement. An error analysis for the negative category showed that the low performance was caused by implicit sentiment expression and complex discourse structures. Authors generally tried to avoid expressing explicit personal judgments, especially negative criticism, of the cited paper. Further, the intentions of the authors were sometimes not available to the content analyst. For example, the sentence “We did not observe any reductions in mortality, as did Ferrer and colleagues [14].” requires more subjective interpretation to assign the citation to one of the defined sentiment categories. Textual entailment techniques may be needed to determine whether such a statement contradicts the claim of the cited paper. For simplicity, our study used only a rule-based citation context extraction method; an on-topic sentence could be missed if the cue words fall outside the scope of two consecutive sentences. Furthermore, other discourse relations related to sentiment analysis30 were not taken into account.
The method we used was not designed for learning from imbalanced data, and state-of-the-art methods for imbalanced learning may improve the performance on the minority classes; this is one aim of our future work. Although it is hard to collect enough training samples for the minority classes, massive numbers of citations can be obtained easily, so semi-supervised methods could be a way to handle the imbalanced data problem in citation sentiment analysis. Another limitation of this study is that data were collected from clinical trial papers only; further optimization would therefore be required to extend the work to all biomedical papers.
CONCLUSION
In this study, we manually annotated a citation sentiment corpus of clinical trial papers and developed a machine learning approach to determine the sentiment polarity of biomedical citations. We conducted experiments to examine the effectiveness of common features (i.e., word n-grams and sentiment lexicons) and structure features in classifying citation sentiment. Our results and analysis indicate that it is feasible to determine biomedical citation sentiment using machine learning approaches, although challenges remain. In the future, we will enhance our approach by employing methods for imbalanced data and discourse relation analysis techniques that could extract more informative context features.
Acknowledgments
This study is supported in part by grants from NLM 2R01LM010681-05, NIGMS 1R01GM103859 and 1R01GM102282. The first author (JX) is partially supported by NSFC 61203378.
References
- 1.Pang B, Lee L. Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval. 2008;2(1–2):1–135. [Google Scholar]
- 2.Liu B. Sentiment analysis and opinion mining. Morgan & Claypool Publishers; 2012. [Google Scholar]
- 3.Lerman K, Blair-Goldensohn S, McDonald R. Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics. Athens, Greece: Association for Computational Linguistics; 2009. Sentiment summarization: evaluating and learning user preferences; pp. 514–22. [Google Scholar]
- 4.Hu M, Liu B. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. Seattle, WA, USA: ACM; 2004. Mining and summarizing customer reviews; pp. 168–77. [Google Scholar]
- 5.Pang B, Lee L, Vaithyanathan S. Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10. Association for Computational Linguistics; 2002. Thumbs up? sentiment classification using machine learning techniques; pp. 79–86. [Google Scholar]
- 6.Riloff E, Wiebe J. Proceedings of the 2003 conference on Empirical methods in natural language processing. Association for Computational Linguistics; 2003. Learning extraction patterns for subjective expressions; pp. 105–12. [Google Scholar]
- 7.Koppel M, Shtrimberg I. Good news or bad news? let the market decide. Computing Attitude and Affect in Text: Theory and Applications. 2006;20:297–301. [Google Scholar]
- 8.Agarwal A, Xie B, Vovsha I, Rambow O, Passonneau R. Proceedings of the Workshop on Languages in Social Media Portland. Oregon: Association for Computational Linguistics; 2011. Sentiment analysis of Twitter data; pp. 30–8. [Google Scholar]
- 9.Bollen J, Mao H. Twitter mood as a stock market predictor. Computer. 2011;44(10):91–4. [Google Scholar]
- 10.Cambria E, Schuller B, Xia Y, Havasi C. New avenues in opinion mining and sentiment analysis. IEEE Intelligent Systems. 2013;28(2):15–21. [Google Scholar]
- 11.Sharif H, Zaffar F, Abbasi A, Zimbra D. Proceedings of the Sixth ASE International Conference on Social Computing. 2014. Detecting adverse drug reactions using a sentiment classification framework. [Google Scholar]
- 12.Pestian J, Pestian J, Pawel M, Brett S, Ozlem U, John H. Sentiment analysis of suicide notes: A Shared Task. Biomedical Informatics Insights. 2012;3 doi: 10.4137/BII.S9042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Wallace BC, Paul MJ, Sarkar U, Trikalinos TA, Dredze M. A large-scale quantitative analysis of latent factors and sentiment in online doctor reviews. Journal of the American Medical Informatics Association : JAMIA. 2014 Nov-Dec;21(6):1098–103. doi: 10.1136/amiajnl-2014-002711. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Greaves F, Laverty AA, Cano DR, et al. Tweets about hospital quality: a mixed methods study. BMJ quality & safety. 2014 Oct;23(10):838–46. doi: 10.1136/bmjqs-2014-002875. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Larson HJ, Smith DMD, Paterson P, et al. Measuring vaccine confidence: analysis of data obtained by a media surveillance system used to analyse public concerns about vaccines. The Lancet Infectious Diseases. 2013;13(7):606–13. doi: 10.1016/S1473-3099(13)70108-7. [DOI] [PubMed] [Google Scholar]
- 16.Swales J. Citation analysis and discourse analysis. Applied Linguistics. 1986;7(1):39–56. [Google Scholar]
- 17.Zhang G, Ding Y, Milojević S. Citation content analysis (CCA): a framework for syntactic and semantic analysis of citation content. Journal of the American Society for Information Science and Technology. 2013;64(7):1490–503. [Google Scholar]
- 18.Teufel S, Siddharthan A, Tidhar D. Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing. Sydney, Australia: Association for Computational Linguistics; 2006. Automatic classification of citation function; pp. 103–10. [Google Scholar]
- 19.Agarwal S, Choubey L, Yu H. Automatically classifying the role of citations in biomedical articles. AMIA Annu Symp Proc 2010. 2010:11–5. [PMC free article] [PubMed] [Google Scholar]
- 20.Zhu X, Turney P, Lemire D, Vellino A. Measuring academic influence: not all citations are equal. Journal of the Association for Information Science and Technology. 2015;66(2):408–27. [Google Scholar]
- 21.Athar A. Proceedings of the ACL 2011 Student Session. Portland, Oregon: Association for Computational Linguistics; 2011. Sentiment analysis of citations using sentence structure-based features; pp. 81–7. [Google Scholar]
- 22.Athar A, Teufel S. Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Montreal, Canada: Association for Computational Linguistics; 2012. Context-enhanced citation sentiment detection; pp. 597–601. [Google Scholar]
- 23.Hirsch JE. An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America. 2005 Nov 15;102(46):16569–72. doi: 10.1073/pnas.0507655102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Egghe L. Theory and practise of the g-index. Scientometrics. 2006;69(1):131–52. [Google Scholar]
- 25.Yu B. Proceedings of the 76th ASIS&T Annual Meeting: Beyond the Cloud: Rethinking Information Boundaries. Montreal, Quebec, Canada: American Society for Information Science; 2013. Automated citation sentiment analysis: what can we learn from biomedical researchers; pp. 1–9. [Google Scholar]
- 26.Ioannidis JP. Contradicted and initially stronger effects in highly cited clinical research. J Am Med Assoc. 2005 Jul 13;294(2):218–28. doi: 10.1001/jama.294.2.218. [DOI] [PubMed] [Google Scholar]
- 27.Greenberg SA. How citation distortions create unfounded authority: analysis of a citation network. Bmj. 2009;339:b2680. doi: 10.1136/bmj.b2680. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Begley CG, Ellis LM. Drug development: Raise standards for preclinical cancer research. Nature. 2012 Mar 29;483(7391):531–3. doi: 10.1038/483531a. [DOI] [PubMed] [Google Scholar]
- 29.Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J. LIBLINEAR: a library for large linear classification. J Mach Learn Res. 2008;9:1871–4. [Google Scholar]
- 30.Somasundaran S, Wiebe J, Ruppenhofer J. Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1. Manchester, United Kingdom: Association for Computational Linguistics; 2008. Discourse level opinion interpretation; pp. 801–8. [Google Scholar]

