Automatic Classification of Structured Product Labels for Pregnancy Risk Drug Categories, a Machine Learning Approach

Laritza M Rodriguez; Dina Demner Fushman

2015 Nov 5;2015:1093–1102.


PMCID: PMC4765680 PMID: 26958248

Abstract

Using regular expressions and manual review, we processed 18,341 FDA-approved drug product labels to determine whether any of the five standard pregnancy drug risk categories was mentioned in the label. After excluding 81 labels with multiple risk categories, 83% of the labels had a risk category within the text and 17% did not. We trained a Sequential Minimal Optimization algorithm on the labels containing pregnancy risk information, segmented into standard document sections. To evaluate the classifier on the testing set, we used the Micromedex drug risk categories. The precautions section had the best performance for assigning drug risk categories, achieving Accuracy 0.79, Precision 0.66, Recall 0.64 and F1 measure 0.65. Missing pregnancy risk categories could be suggested using machine learning algorithms trained on the existing publicly available pregnancy risk information.

Keywords: machine learning, pregnancy, drug risk, data-mining, knowledge extraction, document classification

Background and Justification

The use of drugs and other substances during pregnancy and lactation can be damaging for the developing embryo and fetus, and is in general discouraged unless there is a strong medical reason. At any given time over 10 million women are pregnant or lactating in the United States¹; these women often need to use drugs and other substances while under the care of medical providers, likely other than obstetricians. Drug effects on the embryo and fetus vary during different stages of development, and the hemodynamic changes in pregnant women alter the absorption rates and pharmacologic action of medications and substances. The obstetrical and pharmacological literature on the effects of drugs on the embryo and the fetus is extensive; however, the information is static, difficult to maintain and not readily available at the point of care. Ideally, this information should be provided in the FDA-approved Structured Product Labels (SPL) made available to the public by the National Library of Medicine (NLM)². The SPL for each drug is created by the pharmaceutical manufacturer; the FDA requires it to provide complete information on the use, dose, content, contraindications, side effects and warnings of drugs and chemical products for human and animal consumption. The drug information is stored in a structured standardized format with document sections defined by the LOINC³ document standard. The LOINC codes and descriptions for the SPL document sections are listed in Table 1.

Table 1:

SPL document sections LOINC codes

LOINC Code	SPL section name
34066-1	FDA package insert Boxed warning section
34067-9	FDA package insert Indications and usage section
34068-7	FDA package insert Dosage and administration section
34069-5	FDA package insert How supplied section
34070-3	FDA package insert Contraindications section
34071-1	FDA package insert Warnings section
34076-0	FDA package insert Information for patients section
34084-4	FDA package insert Adverse reactions section
34088-5	FDA package insert Overdosage section
34089-3	FDA package insert Description section
34090-1	FDA package insert Clinical pharmacology section
38056-8	FDA package insert Structured product labeling supplemental patient material
42230-3	FDA package insert Structured product labeling patient package insert section
42231-1	FDA package insert Structured product labeling medguide section
42232-9	FDA package insert Precautions section


Title 21 of the Code of Federal Regulations (CFR), issued by the Food and Drug Administration (FDA), describes the specific requirements on content and format of labeling for human prescription drug and biological products⁴. It includes guidelines for a section on specific populations describing the effects on pregnancy and lactation. The CFR indicates that “the section may be omitted only if the drug is not absorbed systemically and the drug is not known to have a potential for indirect harm to the fetus”. However, the pregnancy section is missing from some labels, even for drugs whose ingredients have this information in the labels of other brand names. This information is available in proprietary collections, such as Micromedex® Solutions, an evidence-based clinical resource curated by experienced professionals in the healthcare field⁵. The resource includes the Micromedex Pharmacological Knowledge with a monograph for pharmaceutical products in brand and ingredient forms. The pregnancy risk categories in the Micromedex monographs include curated classes from the FDA classification, the Australian categorization system for prescribing medicines in pregnancy (ACPM), and a simplified classification defined by Micromedex. However, not all documents have categories from all three classification systems: some have only FDA categories and some only the simplified Micromedex categories. For the purposes of this study we considered Micromedex “Fetal risk is minimal” equivalent to FDA categories A and B, Micromedex “Fetal risk cannot be ruled out” equivalent to FDA category C, and Micromedex “Fetal risk has been demonstrated” equivalent to FDA categories D and X. Table 2 summarizes the FDA pregnancy drug risk categories and the Micromedex classification.

Table 2:

Pregnancy risk Categories

FDA Pregnancy Risk Drug Categories
A	No risk in controlled human studies: Adequate and well-controlled human studies have failed to demonstrate a risk to the fetus in the first trimester of pregnancy (and there is no evidence of risk in later trimesters).
B	No risk in other studies: Animal reproduction studies have failed to demonstrate a risk to the fetus and there are no adequate and well-controlled studies in pregnant women OR Animal studies have shown an adverse effect, but adequate and well-controlled studies in pregnant women have failed to demonstrate a risk to the fetus in any trimester.
C	Risk not ruled out: Animal reproduction studies have shown an adverse effect on the fetus and there are no adequate and well-controlled studies in humans, but potential benefits may warrant use of the drug in pregnant women despite potential risks.
D	Positive evidence of risk: There is positive evidence of human fetal risk based on adverse reaction data from investigational or marketing experience or studies in humans, but potential benefits may warrant use of the drug in pregnant women despite potential risks.
X	Contraindicated in Pregnancy: Studies in animals or humans have demonstrated fetal abnormalities and/or there is positive evidence of human fetal risk based on adverse reaction data from investigational or marketing experience, and the risks involved in use of the drug in pregnant women clearly outweigh potential benefits.
N	FDA has not yet classified the drug into a specified pregnancy category.
Micromedex Pregnancy Category Definitions
Fetal risk is minimal	The weight of an adequate body of evidence suggests this drug poses minimal risk when used in pregnant women or women of childbearing potential.
Fetal risk cannot be ruled out	Available evidence is inconclusive or is inadequate for determining fetal risk when used in pregnant women or women of childbearing potential. Weigh the potential benefits of drug treatment against potential risks before prescribing this drug during pregnancy.
Fetal risk has been demonstrated	Evidence has demonstrated fetal abnormalities or risks when used during pregnancy or in women of childbearing potential. An alternative to this drug should be prescribed during pregnancy or in women of childbearing potential.


The FDA pregnancy risk category is assigned according to scientific evidence of the effects of drugs on the developing embryo and fetus, based on animal and human studies⁶. Clinical obstetrical pharmacological studies on the effects of drugs focus on limited numbers of drugs, require expert review and are often limited to a small number of drug classes¹,⁷. In recent years the Advisory Committee on Prescription Medicines (ACPM) in Australia⁸ and the FDA have made an effort to change the pregnancy risk category classification and to base the risk classification on human studies and population registries that gather accurate information on the effects of drugs on humans. These efforts will take time and extensive data analysis. For now, the most complete and widely accepted classification remains the FDA risk category. As mentioned above, the information in the SPLs is incomplete and does not include all available drugs. We sought to augment the existing information and assign missing pregnancy risk drug categories to SPLs. To that end, we applied machine learning algorithms to extract knowledge from the free text and leverage the existing categories assigned to some of the documents. Automatic SPL classification has the potential to aid expert groups when studying the effects of drugs belonging to drug classes and subclasses, and could ultimately be applied to aid prescribers at the point of care.

Materials and Methods

We used the labels of 18,341 prescription drugs in the DailyMed collection, available for download in XML format from the National Library of Medicine at http://dailymed.nlm.nih.gov/.

We used the following regular expression in Perl to identify and extract the standard pregnancy risk categories from the labels: /pregnan.{3,50}category[^a-z0-9]+([abcdnx])(?:[^a-z0-9]|$)/i

We manually checked the pregnancy risk class assigned by the regular expression. Manual checking allowed us to find cases missed by the regular expression, for example, SPLs in which “pregnancy” is misspelled, the words “pregnancy” and “category” are separated by other text, or the category is set off by punctuation.
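
The extraction step can be sketched in Python as follows. This is a minimal port of the Perl pattern quoted above, not the authors' script: the function names, the use of `re.DOTALL`, and the `None` marker for multi-category labels are assumptions made for illustration.

```python
import re

# Python port of the Perl pattern quoted above; re.IGNORECASE mirrors the /i flag.
# re.DOTALL is an added assumption so that "." can also span line breaks.
CATEGORY_PATTERN = re.compile(
    r"pregnan.{3,50}category[^a-z0-9]+([abcdnx])(?:[^a-z0-9]|$)",
    re.IGNORECASE | re.DOTALL,
)

def find_categories(label_text):
    """Return the sorted list of distinct category letters mentioned in one label."""
    return sorted({m.group(1).upper() for m in CATEGORY_PATTERN.finditer(label_text)})

def assign_class(label_text):
    """Return a single class, 'N' when nothing matches, or None for multi-category labels."""
    found = find_categories(label_text)
    if not found:
        return "N"
    if len(found) > 1:
        return None  # hypothetical marker for labels to set aside for review
    return found[0]
```

A label whose text never matches is assigned N here, which is why the manual review described above remains necessary for misspellings and unusual phrasing.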

In the process, we identified two sets of documents: those that had risk categories and those with no mention of a pregnancy risk category in the text. We used the standard FDA categories (A, B, C, D, X), and N when no category was mentioned in the text, to assign a risk category class to each document. Some drugs have more than one pregnancy risk category: a drug that is safe in the second trimester of pregnancy might be contraindicated in the third trimester (e.g., NSAIDs), and labels for drugs of this type mention two categories. However, only 81 documents had two categories; we removed these labels from our data set because such sparse classes were insufficient to train the classifier. This left us with 15,221 labels in the training set and 3,039 in the testing set, for an overall total of 18,260 documents. The class distribution of the labels is summarized in Table 3.

Table 3:

Prescription Drugs: Distribution of Class Labels for Pregnancy Drug Categories.

Class	Training set labels (gold standard)	Test set labels (gold standard)	Test set labels correctly assigned by classifier	Test set labels misclassified as class (number of labels)
A	198(1%)	64(2.11%)	61	C (3)
B	3,587(24%)	458(15.10%)	370	C(88)
C	9,050(59%)	1668(54.98%)	1665	B(3)
D	1,704(11%)	596(19.64%)	579	B(3), C (12), X(2)
X	682(4%)	248(8.17%)	233	B(3), C(12)
Total	15,221(83% of total)	3,034(17% of total)	2,908(95.85%)	126(4.15%)


We segmented the documents using the XML section tags and then converted the sections to plain text for the machine learning approach. Each section preserved the original document unique identifier and the pregnancy risk class label. In the testing set, the unknown class label field was replaced with a “?” to comply with WEKA format requirements for unclassified documents.
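
As an illustration of the file format involved, the sketch below writes section text to WEKA's ARFF format, using "?" for an unknown class. The attribute names (`doc_id`, `text`, `class`) and both helper functions are hypothetical; the paper does not describe the actual conversion scripts.

```python
def arff_escape(text):
    """Quote a string value for ARFF, escaping backslashes and quotes and flattening line breaks."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace("'", "\\'")
        .replace("\n", " ")
        .replace("\r", " ")
    )
    return "'" + escaped + "'"

def to_arff(relation, rows, classes=("A", "B", "C", "D", "X")):
    """Build ARFF text from (doc_id, section_text, label) rows; a label of None becomes '?'."""
    lines = [
        "@relation " + relation,
        "@attribute doc_id string",
        "@attribute text string",
        "@attribute class {" + ",".join(classes) + "}",
        "@data",
    ]
    for doc_id, text, label in rows:
        value = label if label is not None else "?"  # '?' marks a missing value in ARFF
        lines.append(",".join([arff_escape(doc_id), arff_escape(text), value]))
    return "\n".join(lines) + "\n"
```

Keeping the document identifier as its own attribute is one way to preserve the link between each section and its source label, as described above.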

Machine Learning

Support vector machines (SVM) are extensively used in biomedical text processing and are known to produce good classification results⁹. Marafino and colleagues demonstrated similarly successful results in a different clinical domain¹⁰ using N-gram SVMs. Other authors have compared different machine learning approaches for document classification, including Naive Bayes, Stacking, Boosting and Feature Selection¹¹. We experimented with Naive Bayes and SMO, with and without feature selection. SMO had the best cross-validation results; therefore, we used the Sequential Minimal Optimization (SMO) implementation of the method described by Platt¹², available in WEKA¹³, for our final experiments.

To generate the word vectors used as input to the classifier, we applied the unsupervised 'StringToWordVector' filter to both training and testing sets for all the document sections with the following parameters: lower case and normalization, string delimiters, word counts, N-grams restricted to 3 words per string, no stemmers, no stop word list, and preserving the 400 most frequent strings per document section. We applied the SMO classifier to each of the training document sections with 10-fold cross validation to decide which document section had the highest predictive value for the classification task. We used the weighted-average receiver operating characteristic (ROC) area obtained from the cross validation to select the best performer. The SPL precautions sections had the best performance for document classification for pregnancy risk categories. We applied the resulting classification model trained on the precautions section to the precautions sections of the testing set. We then evaluated the performance on the testing set as described in the next section.
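
The feature settings listed above can be approximated in plain Python as shown below. This is only a sketch of the idea (lowercasing, 1- to 3-word strings, word counts, a 400-term vocabulary); it does not reproduce WEKA's filter, its normalization option, or its exact tokenization, and all function names are illustrative.

```python
import re
from collections import Counter

def ngrams(text, max_n=3):
    """Lowercase, split on non-alphanumeric characters, and yield 1- to max_n-word strings."""
    tokens = [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def build_vocabulary(documents, keep=400):
    """Keep the `keep` most frequent n-grams across the documents of one section."""
    totals = Counter()
    for doc in documents:
        totals.update(ngrams(doc))
    return [term for term, _ in totals.most_common(keep)]

def vectorize(doc, vocabulary):
    """Return word counts for one document, ordered like the vocabulary."""
    counts = Counter(ngrams(doc))
    return [counts[term] for term in vocabulary]
```

The vocabulary would be built from the training sections only and then reused to vectorize the testing sections, so that both sets share the same feature positions.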

Machine Learning Model Evaluation

To evaluate the performance of the classifier we limited the set to single ingredient drugs using the SPL document mapping to RxNorm drug terminology standard¹⁴. To do this we used the RxMix¹⁵ tool providing RxCUIs for Standard Clinical Drugs (SCD) as input and limiting the output term type to ingredients (IN). We extracted the ingredients for both the training and the testing sets separately. From the Micromedex Pharmaceutical Knowledgebase monographs we manually extracted the pregnancy risk category for these ingredients.

As previously described, the Micromedex monographs include the FDA pregnancy risk category and/or the simplified Micromedex® Fetal Risk Classification. For evaluation purposes we normalized the risk categories of all three sets to the three Micromedex® categories: Micromedex Fetal risk is minimal equivalent to FDA categories A and B, Micromedex Fetal risk cannot be ruled out equivalent to FDA category C, and Micromedex Fetal risk has been demonstrated equivalent to FDA categories D and X.
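
The normalization described above amounts to a small lookup table. A minimal sketch follows; the dictionary and function names are illustrative, and only the mapping itself comes from the text.

```python
# Mapping of FDA letters to the three Micromedex categories, as stated in the text.
FDA_TO_MICROMEDEX = {
    "A": "Fetal risk is minimal",
    "B": "Fetal risk is minimal",
    "C": "Fetal risk cannot be ruled out",
    "D": "Fetal risk has been demonstrated",
    "X": "Fetal risk has been demonstrated",
}

def normalize(category):
    """Map an FDA letter to its Micromedex category; pass other values through unchanged."""
    cleaned = category.strip()
    return FDA_TO_MICROMEDEX.get(cleaned.upper(), cleaned)
```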

We calculated accuracy, precision, recall and F1 measure to compare how well the known risk categories in the training set of the FDA-approved SPL documents agree with Micromedex manually curated expert knowledgebase.
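
For reference, the measures can be computed from paired gold-standard and assigned labels as sketched below. The text does not say how per-class values were averaged across the three categories, so this sketch returns per-class figures and leaves the averaging open; the function names are illustrative.

```python
def evaluate(gold, predicted):
    """Return accuracy and a dict of per-class (precision, recall, F1) tuples."""
    assert gold and len(gold) == len(predicted)
    pairs = list(zip(gold, predicted))
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    per_class = {}
    for label in sorted(set(gold) | set(predicted)):
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class[label] = (precision, recall, f1)
    return accuracy, per_class

def constant_baseline(label, n_items):
    """Assign the same label to every item, as a majority-class baseline does."""
    return [label] * n_items
```

For a constant baseline, recall for the assigned class is 1 by construction and its precision equals the accuracy.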

To evaluate the performance of the classifier on the test set, we calculated the same measures, using Micromedex as the reference standard. We also established a baseline classifier by assigning the majority class label (Fetal risk cannot be ruled out) to the testing set. We tested the statistical significance of the differences in performance measures between the model and the baseline, taking into account sample size and the number of classes, according to the description by Combrisson and Jerbi¹⁶.
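
One way to account for sample size and number of classes is a binomial chance-level threshold: the accuracy that random guessing among equally likely classes would exceed with probability at most alpha. The sketch below follows that idea; whether the authors performed exactly this calculation is an assumption, and the function name is illustrative.

```python
from math import comb

def chance_threshold(n_samples, n_classes, alpha=0.001):
    """Accuracy at the (1 - alpha) quantile of a binomial chance distribution."""
    p = 1.0 / n_classes
    cumulative = 0.0
    for k in range(n_samples + 1):
        # Probability of exactly k correct guesses out of n_samples.
        cumulative += comb(n_samples, k) * p**k * (1 - p) ** (n_samples - k)
        if cumulative >= 1 - alpha:
            return k / n_samples
    return 1.0

# Example call with the sample size and class count reported in the Results.
threshold = chance_threshold(200, 3, alpha=0.001)
```

An observed accuracy above this threshold would be unlikely under random guessing at the chosen alpha. The direct summation is adequate for a few hundred samples; larger samples would call for a numerically safer formulation.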

Error Analysis

To analyze the errors of the model, we examined the differences in classification between the Micromedex set and the testing set for the most extreme case of disagreement, in which Micromedex categorizes the drug as demonstrated fetal risk and the classifier assigns minimal fetal risk.

Results

The prescription collection of product labels included 18,341 documents. We excluded 81 labels with mentions of multiple risk categories. The distribution of the risk categories for the training set was Class A 1%, Class B 24%, Class C 59%, Class D 11%, and Class X 4%; the training set accounts for 83% of the documents, and the testing set for 17% (Table 3). The distribution for the class labels assigned to the testing set by the classifier was: Class A 2.11%, Class B 15.10%, Class C 54.98%, Class D 19.64%, and Class X 8.17% (Table 3).

As mentioned above, we selected the sections most likely to contain information about pregnancy risk using ROC. The highest ROC weighted average for the cross validation of the SMO on the training set was for the Precautions section (0.99); the values for the other document sections are in Table 4.

Table 4:

Results of Sequential Minimal Optimization ROC Weighted Average on Document Sections of the Training Set with 10-Fold Cross Validation

Section	ROC Weighted Average
How supplied section	0.74
Indications and usage section	0.89
Information for patients section	0.90
Description section	0.90
Contraindications section	0.90
Adverse reactions section	0.91
Overdosage section	0.92
Dosage and administration section	0.93
Clinical pharmacology section	0.94
Boxed warning section	0.94
Warnings section	0.94
Structured product labelling supplemental patient material	0.95
Structured product labelling medguide section	0.96
Structured product labelling patient package insert section	0.96
Precautions section	0.99


The 15,221 labels in the training set mapped to 685 distinct single ingredients (IN). The 3,039 labels in the testing set mapped to 286 single ingredients. In the training set, we found 37 single ingredients with more than one risk category. Unlike the 81 documents we initially identified as having two risk categories in the text and removed, these 37 ingredients had different categories assigned in different documents, indicating inconsistency in the content labeling across different manufacturers. We analyzed the content of two of the 37 ingredients for which we encountered more than one pregnancy risk category in the training set (Table 5).

Table 5:

Ingredients with mention of more than one category in the training set

Drug Name		Document Counts for Different Risk Categories for the same Ingredient
RxCui	Ingredient Name	A	B	C	D	X	Total
2582	Clindamycin		101	1			102
3992	Epinephrine		51	40			91
3355	Diclofenac		5	84			89
4815	Glyburide		80	6			86
4053	Erythromycin		50	3			53
7299	Neomycin			43	9		52
142438	Gentamicin Sulfate (USP)			30	18		48
10753	Tretinoin			38	3		41
8134	Phenobarbital		4	22	14		40
11002	Urea		20	13			33
11124	Vancomycin		1	31			32
10627	Tobramycin		14		17		31
7213	Ipratropium		26	1			27
4450	Fluconazole			14	13		27
6468	Loperamide		3	23			26
6703	Megestrol				12	13	25
1223	Atropine		7	17			24
5487	Hydrochlorothiazide		2	14	7		23
33272	phendimetrazine			17		5	22
2623	Clotrimazole		5	16			21
11295	Water		1	19			20
25789	glimepiride		12	7			19
7242	Naloxone		7	12			19
3498	Diphenhydramine		17	1			18
2409	Chlorthalidone		9		9		18
1886	Caffeine			10		7	17
6585	Magnesium Sulfate	6		2	7		15
6470	Lorazepam			1	12		13
10368	Terbutaline		4	7			11
6628	Mannitol		3	6			9
6694	Mefloquine		6	1			7
6054	Isoproterenol		1	6			7
6854	Methoxsalen			5	1		6
19831	Budesonide		1	3			4
61148	Somatropin		1	2			3
6878	Methylene blue			1		2	3
4986	Chorionic Gonadotropin			1		2	3
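
Counts like those in Table 5 can be obtained by grouping label-level categories by ingredient and keeping the ingredients with more than one category. A minimal sketch, assuming each label has already been reduced to a hypothetical (ingredient, category) pair:

```python
from collections import defaultdict

def conflicting_ingredients(label_records):
    """Return {ingredient: {category: document count}} for ingredients with more than one category."""
    counts = defaultdict(lambda: defaultdict(int))
    for ingredient, category in label_records:
        counts[ingredient][category] += 1
    return {
        ingredient: dict(by_category)
        for ingredient, by_category in counts.items()
        if len(by_category) > 1
    }
```

Applied to the 15 Magnesium Sulfate labels, this grouping would yield the row shown in the table (6 in A, 2 in C, 7 in D), while an ingredient labeled consistently would be dropped.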


We manually reviewed the Micromedex knowledgebase monographs for 685 ingredients in the training set and 286 ingredients in the testing set.

The SPL training set and the Micromedex reviewed ingredients set had 200 ingredient-class pairs in common, and the Micromedex and the testing set had 238 ingredient-class pairs in common.

The comparison between the SPL training set and the Micromedex set resulted in Accuracy 0.94, Precision 0.90, Recall 0.88 and F1 measure 0.89. The performance of the classifier tested against the Micromedex knowledge base was Accuracy 0.79, Precision 0.66, Recall 0.64 and F1 measure 0.65. Measures for assigning the most frequent class were Accuracy 0.58, Precision 0.58, Recall 1, and F1 measure 0.74. For a sample size of 200 and 3 classes, the difference between the classifier and the baseline was statistically significant (p<0.001).
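
As a consistency check, the F1 values can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The inputs below are the rounded figures from the text, so small differences from the reported values are expected.

```latex
\begin{align*}
F_1 &= \frac{2PR}{P + R} \\
\text{SPL training set vs. Micromedex:}\quad F_1 &= \frac{2 \times 0.90 \times 0.88}{0.90 + 0.88} = \frac{1.584}{1.78} \approx 0.89 \\
\text{Classifier vs. Micromedex:}\quad F_1 &= \frac{2 \times 0.66 \times 0.64}{0.66 + 0.64} = \frac{0.8448}{1.30} \approx 0.65 \\
\text{Majority-class baseline:}\quad F_1 &= \frac{2 \times 0.58 \times 1}{0.58 + 1} = \frac{1.16}{1.58} \approx 0.73
\end{align*}
```

The first two values match the reported figures. The baseline comes out at about 0.73 from the rounded inputs, slightly below the reported 0.74, which presumably reflects a calculation from unrounded precision.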

Error Analysis

We found four drugs in extreme disagreement between the testing set and Micromedex, where the drug is clearly contraindicated in Micromedex and the classifier assigned minimal fetal risk.

Diazepam: The SPL warning for this drug is in the Warnings section, and not in the Precautions section used to train the classifier. Both the SPL and Micromedex clearly classify it as contraindicated.

Estradiol: The pregnancy section for this drug in the SPL refers to no apparent increased risk of birth defects in women who have used the drug in low-dose form as a contraceptive during early pregnancy. There is no further statement on use of the drug in other stages of pregnancy or at higher doses.

Levonorgestrel: The SPL for this drug has no mention of use during pregnancy. Micromedex has a clear statement of fetal risks, which affirms that the drug is contraindicated in women who are or may become pregnant.

Meprobamate: Both the SPL and Micromedex affirm that there is positive evidence of fetal risk, but that the drug may be used if it is needed in life-threatening situations for the pregnant woman and no other drug can be effective.

Discussion and Conclusions

We demonstrated that it is possible to automatically classify drug documents into pregnancy risk categories using standard document classification machine learning algorithms, thereby extracting valuable information from free text documents. Several authors have used the FDA SPL documents to extract valuable clinical information using different approaches. Fung and colleagues demonstrated the feasibility of extracting drug indication information from FDA SPLs using publicly available natural language processing tools¹⁷. Further, Khare and colleagues describe a method to extract structured and normalized indications from FDA drug labels¹⁸. Culbertson and colleagues used semantic natural language processing (SemRep) to extract adverse drug event information from black box warnings¹⁹. Deleger et al. used a hybrid natural language processing method to extract indications, contraindications, overdosage, and adverse reactions from FDA SPL documents²⁰. To the best of our knowledge, our work is the first to attempt automatic drug document classification based on pregnancy risk categories.

Our work also demonstrates that the automatic classification method provides better precision and accuracy than assuming the most frequent medium-level risk for all drugs with unknown pregnancy risks.

The performance of the classifier was somewhat affected by inconsistencies in the content of the labels for the same ingredients across different manufacturers, and by inconsistencies in the document section in which the pregnancy warning statement is included. Inconsistencies in the information in the SPL documents were demonstrated by Duke and colleagues²¹, who reported a 68% discrepancy in the labeling across manufacturers in the Warnings section of bioequivalent medications (identical active ingredient). Duke et al. also report discrepancies in the reported adverse events, post-marketing reports, and even indications.

It is not surprising that there are disagreements in the pregnancy risk categorization among similar drugs in the same source. In Table 5 we included all the ingredients for which more than one risk category is mentioned for the same ingredient in different documents, and we analyzed two of those drugs in detail. The first, Magnesium Sulfate (RxCUI 6585), is in 15 SPL documents, all of which indicate that the substance is safe for use during pregnancy and is a drug of choice in the treatment of pregnancy-induced hypertension and the prevention of seizures in these patients. Only 6 documents, however, label it as category A. Two documents label it as category C, warning that use of the drug for more than 7 days can cause bone anomalies in the fetus, and that if it is used 2 hours before delivery the fetus may suffer from severe hypocalcemia. Further, 7 documents label the same substance as category D based on animal studies showing that prolonged use of the drug can cause bone alterations and alter the reproductive capacity of the fetus. Another example of multiple drug categories for the same ingredient is Fluconazole (RxCUI 4450): for this drug, tablets for vaginal application have a risk category A, while tablets for oral treatment have a risk category C. The drug is not expected to be absorbed systemically when used in topical or vaginal applications, and is therefore not expected to cause the teratogenic effects demonstrated in animals with oral treatment, but there is no conclusive knowledge on the amount of drug absorbed systemically with vaginal application.

Other examples of disagreements in the reference standards are the assignments of a single risk category to drugs whose risks to the developing fetus differ depending on gestational age. We excluded from our study documents that included more than one category, but some manufacturers label the drug with only the highest risk category, as we demonstrated with the analysis of the text of the 15 labels containing Magnesium Sulfate. This analysis indicates that, in the future, gestational age needs to be taken into consideration when automatically assigning pregnancy risk categories to drugs.

The Federal Register rule Content and Format of Labeling for Human Prescription Drug and Biological Products; Requirements for Pregnancy and Lactation Labeling²², issued on December 4, 2014, removes the pregnancy risk category labeling from the Structured Product Labels and implements the requirement to include patient registry information when available. However, the FDA and its equivalent Australian organization have attempted this change for several years. Progress is slow and not likely to happen without the aid of automatic methods. Patient registries on prescription medication use are slowly growing; the information obtained from these data sources will require curation before it is applicable to clinical documentation. Patient registries will also benefit from automatic classification systems similar to the one we tested. Moreover, patient registries are focused on drug classes and the information is not yet available to the public.

Our study demonstrates that it is possible to extract useful clinical information from free text using automatic machine learning algorithms, although more work is needed. The F1 score of 0.89 obtained when comparing the known SPL category assignments with Micromedex could be considered an achievable level of agreement for machine learning methods. Additional experiments are needed to test whether accuracy, precision and recall can be improved by deeper understanding of the context, such as the gestational age, and through additional syntactic and semantic features of the SPL precautions sections. Automatic classification of drug labels lacking pregnancy risk category information can aid clinicians and patients who are attempting to identify the effects of drugs similar to those for which the information is available in text format. Our approach could also be of general interest for extraction of other information that might be incomplete in some drug labels, such as contraindications or side effects. For example, the secondary effects of tricyclic antidepressants (TCAs) depend on their pharmacological action on histamine or serotonin receptors, and a system that could automatically classify these drugs could be of great benefit at the point of care. Information on drugs changes every day: continued research both in the pharmaceutical industry and in clinical practice unveils previously unknown information, and the FDA is continuously releasing drug safety communications²³. It is important to provide a mechanism by which identical drugs can be automatically classified utilizing the existing knowledge.

Limitations

Our work is limited to prescription drugs; however, over-the-counter drugs and homeopathic medications also have the potential to cause harm to the developing fetus and need to be explored in the future.

Figure 1: Example of an SPL header and Warnings section

Acknowledgments

Institutional: This work was funded by the Intramural Research Program of the National Library of Medicine. Individuals: Phil Wolf BS

References

1. Nahum GG, Uhl K, Kennedy DL. Antibiotic use in pregnancy and lactation: what is and is not known about teratogenic and toxic risks. Obstet Gynecol. 2006;107(5):1120–1138. doi: 10.1097/01.AOG.0000216197.26783.b5.
2. DailyMed. [Accessed November 26, 2014]. http://dailymed.nlm.nih.gov/dailymed/index.cfm.
3. LOINCManual.pdf. [Accessed March 6, 2015]. http://loinc.org/downloads/files/LOINCManual.pdf.
4. Title 21–Food and Drugs Chapter I–Food and Drug Administration. Department of Health and Human Services Subchapter C–Drugs: General. http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/cfrsearch.cfm?fr=201.57.
5. Micromedex® 20 (electronic version). Truven Health Analytics, Greenwood Village, Colorado, USA. [Accessed December 18, 2014]. http://micromedex.com/pharmaceutical.
6. Pregnancy.pdf. [Accessed November 26, 2014]. https://depts.washington.edu/druginfo/Formulary/Pregnancy.pdf.
7. Gallego Úbeda M, Delgado Téllez de Cepeda L, Campos Fernández de Sevilla M de LA, De Lorenzo Pinto A, Tutau Gómez F. An update in drug use during pregnancy: risk classification. Farm Hosp Órgano Of Expr Científica Soc Esp Farm Hosp. 2014;38(4):364–378. doi: 10.7399/fh.2014.38.4.7395.
8. Advisory Committee on Prescription Medicines (ACPM) | Therapeutic Goods Administration (TGA). [Accessed November 26, 2014]. http://www.tga.gov.au/committee/advisory-committee-prescription-medicines-acpm.
9. Kilicoglu H, Demner-Fushman D, Rindflesch TC, Wilczynski NL, Haynes RB. Towards automatic recognition of scientifically rigorous clinical research evidence. J Am Med Inform Assoc JAMIA. 2009;16(1):25–31. doi: 10.1197/jamia.M2996.
10. Marafino BJ, Davies JM, Bardach NS, Dean ML, Dudley RA. N-gram support vector machines for scalable procedure and diagnosis classification, with applications to clinical free text data from the intensive care unit. J Am Med Inform Assoc JAMIA. 2014;21(5):871–875. doi: 10.1136/amiajnl-2014-002694.
11. Aphinyanaphongs Y, Tsamardinos I, Statnikov A, Hardin D, Aliferis CF. Text categorization models for high-quality article retrieval in internal medicine. J Am Med Inform Assoc JAMIA. 2005;12(2):207–216. doi: 10.1197/jamia.M1641.
12. Fast Training of Support Vector Machines using Sequential Minimal Optimization. [Accessed November 26, 2014]. http://common-lisp.net/p/cl-machine-learning/git/cl-svm/research/platt-smo-book.pdf.
13. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. SIGKDD Explor Newsl. 2009;11(1):10–18. doi: 10.1145/1656274.1656278.
14. Nelson SJ, Zeng K, Kilbourne J, Powell T, Moore R. Normalized names for clinical drugs: RxNorm at 6 years. J Am Med Inform Assoc JAMIA. 2011;18(4):441–448. doi: 10.1136/amiajnl-2011-000116.
15. RxMix. [Accessed December 8, 2014]. http://mor.nlm.nih.gov/RxMix/#.
16. Combrisson E, Jerbi K. Exceeding chance level by chance: The caveat of theoretical chance levels in brain signal classification and statistical assessment of decoding accuracy. J Neurosci Methods. 2015. doi: 10.1016/j.jneumeth.2015.01.010.
17. Fung KW, Jao CS, Demner-Fushman D. Extracting drug indication information from structured product labels using natural language processing. J Am Med Inform Assoc JAMIA. 2013;20(3):482–488. doi: 10.1136/amiajnl-2012-001291.
18. Khare R, Li J, Lu Z. LabeledIn: cataloging labeled indications for human drugs. J Biomed Inform. 2014;52:448–456. doi: 10.1016/j.jbi.2014.08.004.
19. Culbertson A, Fiszman M, Shin D, Rindflesch TC. Semantic processing to identify adverse drug event information from black box warnings. AMIA Annu Symp Proc AMIA Symp AMIA Symp. 2013;2013:266.
20. Li Q, Deleger L, Lingren T, et al. Mining FDA drug labels for medical conditions. BMC Med Inform Decis Mak. 2013;13:53. doi: 10.1186/1472-6947-13-53.
21. Duke J, Friedlin J, Li X. Consistency in the safety labeling of bioequivalent medications. Pharmacoepidemiol Drug Saf. 2013;22(3):294–301. doi: 10.1002/pds.3351.
22. Federal Register | Content and Format of Labeling for Human Prescription Drug and Biological Products; Requirements for Pregnancy and Lactation Labeling. [Accessed December 18, 2014]. https://www.federalregister.gov/articles/2014/12/04/2014-28241/content-and-format-of-labeling-for-human-prescription-drug-and-biological-products-requirements-for#h-11.
23. Drug Safety and Availability > FDA Drug Safety Communication: FDA has reviewed possible risks of pain medicine use during pregnancy. [Accessed February 20, 2015]. http://www.fda.gov/Drugs/DrugSafety/ucm429117.htm.


