Evidence-based Complementary and Alternative Medicine : eCAM. 2017 Oct 11;2017:8279109. doi: 10.1155/2017/8279109

Prescription Function Prediction Using Topic Model and Multilabel Classifiers

Lidong Wang 1, Yin Zhang 2, Yun Zhang 3, Xiaodong Xu 4, Shihua Cao 1
PMCID: PMC5662811  PMID: 29234434

Abstract

Determining a prescription's function is one of the challenging problems in Traditional Chinese Medicine (TCM). In past decades, TCM has been widely researched through various methods in computer science, but none concentrates on predicting the function of a new prescription. In this study, two methods are presented to address this issue. The first method is based on a novel supervised topic model named Label-Prescription-Herb (LPH), which incorporates herb-herb compatibility rules into the learning process. The second method is based on multilabel classifiers built on TFIDF features and herbal attribute features. Experiments reveal that both methods perform well, but the multilabel classifiers slightly outperform the LPH-based method. The prediction results can provide valuable information for new prescription discovery before clinical testing.

1. Introduction

Traditional Chinese Medicine (TCM) is a unique medical knowledge system in China and has become a popular complementary treatment in Western countries. Currently there are 100,000 formulae based on continuous clinical records. A formula is a prescription that has been validated by pharmacology and clinical practice. Researchers have made great efforts to study and utilize those formulae to discover new prescriptions hidden in the formulae data [1]. To discover a new prescription for disease treatment, researchers have to analyze the efficiency of related herbs and combine several herbs in proper proportion according to TCM theory. Then, the function of the new prescription has to be proved through repeated clinical tests, which require a large amount of manpower and material resources. If a new prescription's function can be predicted computationally in advance, the results would provide a valuable reference for the subsequent clinical practice.

It has been found that data mining approaches play critical roles in TCM related topics, such as new drug discovery [1], syndrome differentiation [2–4], herbal combinational rule mining [5, 6], symptom name normalization [7], intelligent diagnosis [8], and treatment pattern mining [9]. Most of the previous research was related to relationship mining, such as herb-symptom relationships [8, 10, 11] and herb-herb relationships [6]. Wang et al. [6] created a herbal network to present the herb-herb correlation. Chen et al. [8] detected the patterns between herbs and symptoms by using a tripartite information network. Recently, more and more researchers have adopted topic models to mine the correlation between TCM objects. Lin et al. [10] proposed a symptom-herb-therapies-diagnosis topic model to diagnose the disease and administer appropriate drugs and treatments given a patient's symptoms. Zhang et al. [4] proposed a Symptom-Herb-Diagnosis Topic (SHDT) model to extract multiple relationships among symptoms, herb combinations, and diagnoses from large-scale CM clinical data. The proposed model was useful in discovering the common TCM diagnosis and treatment patterns. Jiang et al. [11] applied Linked LDA to extract the herb-symptom patterns. Yao et al. [9] employed Labeled LDA (Labeled Latent Dirichlet Allocation) to mine treatment patterns in TCM clinical cases, but the mining result was not satisfactory. Unlike these studies, we concentrate on prescription function prediction through topic detection and incorporate compatibility rule mining into the topic model.

In TCM theory, a prescription's function is affected mainly by the following factors: the attributes of herbs, the compatibility rules of paired herbs, and the dosages. Based on this, we present two methods to predict a prescription's function. The first method is based on topic modeling. A novel topic model named LPH (Label-Prescription-Herb) is proposed to incorporate the results of compatibility rule mining into the learning process. It can automatically learn the posterior distribution of each herb in a prescription conditioned on the prescription's label set (function set). The second method is based on feature extraction and multilabel classifiers. We extract an N-dimensional feature vector for each prescription from its herbal attributes and TFIDF (Term Frequency-Inverse Document Frequency) features and then employ several popular and competitive classifiers to validate our method.

The rest of the paper is organized as follows. Section 2 presents the detailed steps of our methods for prescription function prediction. Section 3 provides analyses and discussion of our experimental results, together with conclusions and future work.

2. Methods

The framework of our methods is shown in Figure 1, with details presented in the following subsections.

Figure 1. The framework of our methods.

The herb dataset and formula dataset are extracted from our project CKCEST (http://zcy.ckcest.cn/tcm/) (Chinese Knowledge Center for Engineering Science and Technology). In the first method, we conduct compatibility rule mining from the formula dataset and then incorporate the results into the learning process of topic modeling. The objective of topic modeling is to learn the “topic-word” (function-herb) structure with supervision. The prescription's most likely labels can then be inferred by thresholding its posterior probability over function labels. In the second method, we treat our prediction task as a multiclass, multilabel classification problem. We extract feature space based on TFIDF weighting and herbal attributes and then train the multilabel classification model by using the features.

2.1. Prediction Based on Topic Model

In this section, we propose a supervised topic model named Label-Prescription-Herb (LPH) to mine treatment patterns in the herbs of the formula dataset. Although a prescription consists of two or more individual herbs, some of them act as pairs in the treatment. In this subsection we introduce the method to mine the compatibility rules.

2.1.1. Compatibility Rule Mining

In TCM theory, compatibility refers to the combination of two or more herbs based on the clinical settings and the properties of herbs [12]. The efficiency of a single herb is usually limited, but when two herbs used together treat a disease better than either herb alone, we say that these two herbs have a compatibility rule. In China, many herbs have strong compatibility rules that have been learned from ancient times to the modern period. However, the existing 917 herb pairs in the Chinese Paired Herb Database are inadequate for our prediction task. Thus, computer intelligence can be employed to discover more pairs for further research. When two herbs are frequently used in combination with each other, they are more likely to be paired drugs. We propose a method based on support degree [13] and dependency relationship for compatibility rule mining between herb $h_i$ and herb $h_j$, which consists of the following steps:

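Step 1. Compute the support of the pair:

$$\mathrm{support} = p(h_i, h_j). \tag{1}$$

Step 2. Compute the dependency of the pair:

$$\mathrm{dependency} = \frac{p(h_i, h_j)}{p(h_i)\, p(h_j)}. \tag{2}$$

Step 3. Combine the two values:

$$\mathrm{Cor} = a \cdot \mathrm{support} + b \cdot \mathrm{dependency}. \tag{3}$$

Step 4. Rank all possible herb pairs according to their associated value of Cor.

Step 5. Return top-N pairs.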

Here support denotes the joint probability of occurrence of two herbs $h_i$ and $h_j$. In Step 3, we combine the support attribute ($p(h_i, h_j)$) and the dependency attribute (the ratio of $p(h_i, h_j)$ to $p(h_i)\,p(h_j)$). Note that we remove Glycyrrhizae Radix from the mining results, since analyzing compatibility rules between Glycyrrhizae Radix and other herbs is uninformative: this herb is used merely to decrease or moderate the medicinal side effects of all herbs in a prescription.
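The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `mine_herb_pairs` and the representation of a prescription as a list of herb names are assumptions, and probabilities are estimated as simple relative frequencies over prescriptions. The defaults a = b = 0.5 are the values reported in the experimental setup.

```python
from collections import Counter
from itertools import combinations


def mine_herb_pairs(prescriptions, top_n, a=0.5, b=0.5,
                    excluded=("Glycyrrhizae Radix",)):
    """Rank herb pairs by Cor = a * support + b * dependency (equations (1)-(3)).

    prescriptions: list of herb-name lists (hypothetical input format).
    excluded: herbs dropped before pairing, as the text does for Glycyrrhizae Radix.
    """
    n = len(prescriptions)
    herb_count = Counter()
    pair_count = Counter()
    for herbs in prescriptions:
        unique = sorted(set(herbs) - set(excluded))
        herb_count.update(unique)
        pair_count.update(combinations(unique, 2))

    scored = []
    for (hi, hj), count in pair_count.items():
        support = count / n                                   # p(h_i, h_j)
        dependency = support / ((herb_count[hi] / n) * (herb_count[hj] / n))
        scored.append(((hi, hj), a * support + b * dependency))

    # Steps 4 and 5: rank by Cor (ties broken by name) and keep the top N.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]
```

Note that the dependency term rewards pairs of rare herbs that almost always occur together, while the support term rewards pairs that are simply frequent; the weights a and b trade these off.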

2.1.2. Topic Model Description on TCM

LDA (Latent Dirichlet Allocation) is a completely unsupervised method that models each document as a mixture of topics [14]. The model outputs a discrete probability distribution over words for each topic and a discrete distribution over topics for each document. However, LDA is not appropriate for multilabeled corpora because it generates automatic summaries of topics that have no direct correspondence with the label set. A simple solution to this problem is to assign a document's words to its labels rather than to a latent and possibly less interpretable semantic space. At present there exists some related research, such as Labeled LDA [15] and partially Labeled LDA [16].

Analogous to the relationship among documents, topics, and words, we can treat herbs as “words.” A prescription (formula) is a bag of herbs, and we can treat it as a structured “document.” Correspondingly, a prescription's function can be considered as a “topic.” Thus, we employ topic models to mine the latent relationship between function labels and herbs. The topic model for our prediction task should incorporate supervision by constraining the model to use only those “topics” that correspond to a prescription's label set. Since the combination of herbs contributes a factor to the function prediction, we consider the role of herb pairs in the topic learning process.

We define some notations. Let each prescription $p$ be represented by a tuple consisting of a list of herbs, $H^{(p)} = \{h_1, h_2, \ldots, h_{N_p}\}$, and a list of binary topic presence/absence indicators $\Lambda^{(p)} = \{l_1, l_2, \ldots, l_K\}$, where each $h_i \in \{1, \ldots, V\}$ and each $l_k \in \{0, 1\}$. Here $N_p$ is the prescription length, $V$ is the total number of herbs extracted from the formula dataset, and $K$ is the total number of function labels. We set the number of functions in our model to be the number of unique labels $K$.

2.1.3. LPH Model

To incorporate compatibility rules into the topic model, we introduce a variable $x_i$ to indicate whether herb $h_i$ has a compatibility rule with herb $h_j$. If $x_i = 1$, then $h_i$ and $h_j$ are paired herbs; otherwise, they are generated from the distribution associated with their function label. The graphical model of the LPH model is shown in Figure 2.

Figure 2. Graphical model of improved Labeled LDA.

In Figure 2, $\beta_k$ is a vector consisting of the parameters of the multinomial distribution corresponding to the $k$th function label. $\gamma_i$ is the prior parameter for variable $x_i$. $\alpha$ are the parameters of the Dirichlet topic prior and $\eta$ are the parameters of the herb prior, while $\Phi_k$ is the label prior for function $k$. The generative process for the LPH model is given as follows:

  1. For each function $k \in [1, \ldots, K]$, generate $\beta_k$ from a Dirichlet distribution with prior parameter $\eta$, that is, $\beta_k \sim \mathrm{Dir}(\eta)$.

  2. For each prescription $p$:

    (a) For each function $k \in [1, \ldots, K]$, generate function label (topic) presence/absence indicators $\Lambda_k$ from a Bernoulli distribution with prior parameter $\Phi_k$, that is, $\Lambda_k \sim \mathrm{Bernoulli}(\Phi_k)$.
    (b) Generate the parameters of the Dirichlet function prior $\alpha^{(p)}$ from the label projection matrix $L$ and the predefined Dirichlet priors $\alpha$, that is, $\alpha^{(p)} = L \times \alpha$.
    (c) Generate function mixture $\theta$ from the Dirichlet distribution $\mathrm{Dir}(\alpha^{(p)})$, that is, $\theta \sim \mathrm{Dir}(\alpha^{(p)})$.

  3. For each herb $h_i$, $i \in \{1, \ldots, N_p\}$:

    (a) Generate $x_i$ from the Bernoulli distribution $\mathrm{Bernoulli}(\gamma_i)$, that is, $x_i \sim \mathrm{Bernoulli}(\gamma_i)$.
    (b) Generate function $f$ from the multinomial distribution $\mathrm{Mult}(\theta)$, that is, $f \sim \mathrm{Mult}(\theta)$.
    (c) If $x_i = 0$, generate a herb $h_i$ from the multinomial distribution $\mathrm{Mult}(\beta_f)$, that is, $h_i \sim \mathrm{Mult}(\beta_f)$; if $x_i = 1$, generate a herb pair $(h_i, h_j)$ from $\mathrm{Mult}(\beta_f)$, that is, $(h_i, h_j) \sim \mathrm{Mult}(\beta_f)$.

During step (2)(b), the label projection matrix $L$ is used to project the Dirichlet prior vector $\alpha = \{\alpha_1, \ldots, \alpha_K\}$ into a lower dimension $\alpha^{(p)}$. For instance, suppose $K = 6$ and that a prescription $p$ has labels given by $\Lambda^{(p)} = (0, 0, 0, 1, 1, 0)$, which implies $L$ would be

$$L = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}. \tag{4}$$

The $i$th row of $L$ has an entry of 1 in column $j$ if and only if the $i$th label in prescription $p$ is equal to the function $j$ and 0 otherwise. Then, function mixture $\theta$ is drawn from a Dirichlet distribution with parameters $\alpha^{(p)} = L \times \alpha = (\alpha_4, \alpha_5)^{T}$.
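The projection in step (2)(b) can be reproduced with plain lists. The sketch below is illustrative only; the helper names `label_projection` and `project_prior` are not from the paper.

```python
def label_projection(indicators):
    """Build L: one row per present label, with a 1 in that label's column."""
    present = [k for k, flag in enumerate(indicators) if flag == 1]
    width = len(indicators)
    return [[1 if col == k else 0 for col in range(width)] for k in present]


def project_prior(indicators, alpha):
    """Compute alpha^(p) = L x alpha, keeping only the entries of present labels."""
    matrix = label_projection(indicators)
    return [sum(l * a for l, a in zip(row, alpha)) for row in matrix]


# The example from the text: K = 6 and labels (0, 0, 0, 1, 1, 0).
L = label_projection([0, 0, 0, 1, 1, 0])
alpha_p = project_prior([0, 0, 0, 1, 1, 0], [1, 2, 3, 4, 5, 6])
```

With the placeholder prior `[1, 2, 3, 4, 5, 6]`, `alpha_p` keeps only the fourth and fifth entries, matching $(\alpha_4, \alpha_5)^T$.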

During step (3)(a), when the parameter $x_i$ for the herb $h_i$ is observed from the compatibility rule mining results, the prior parameter $\gamma_i$ is separated from the rest of the model. Analogous to Labeled LDA, for prescription $p$, we restrict $\theta$ to be defined over topics corresponding to its prior labels $\Lambda^{(p)}$. This restriction ensures that all the topic assignments are limited to the prescription's labels.

2.1.4. Learning and Inference

Exact inference for LPH is intractable; thus several approximate schemes have been proposed to infer the model. We use collapsed Gibbs sampling [17] to estimate the probability of a function label $k$ assigned to the herb $h_i$ in a prescription. We first choose initial states for the Markov chain randomly; then we calculate the conditional distributions $p(f_i = k \mid f_{-i})$ and $p(f_{(i,j)} = k \mid f_{-i,-j})$ as follows, where $f_{-i}$ denotes all herbs' function label assignments excluding $h_i$, and $f_{-i,-j}$ denotes all herbs' function label assignments excluding $h_i$ and $h_j$.

If $x_i = 0$,

$$p(f_i = k \mid f_{-i}) \propto \frac{n_{-i,k}^{h_i} + \eta_{h_i}}{n_{-i,k}^{(\cdot)} + \eta^{T}\mathbf{1}} \times \frac{n_{-i,k}^{(p)} + \alpha_k}{n_{-i,\cdot}^{(p)} + \alpha^{T}\mathbf{1}}. \tag{5}$$

If $x_i = 1$,

$$p(f_{(i,j)} = k \mid f_{-i,-j}) \propto \frac{\bigl(n_{-i,k}^{h_i} + \eta_{h_i}\bigr)\bigl(n_{-j,k}^{h_j} + \eta_{h_j}\bigr)}{n_{-i,-j,k}^{(\cdot)} + \eta^{T}\mathbf{1}} \times \frac{n_{-i,-j,k}^{(p)} + \alpha_k}{n_{-i,-j,\cdot}^{(p)} + \alpha^{T}\mathbf{1}}. \tag{6}$$

In (5), $n_{-i,k}^{h_i}$ is the count of herb $h_i$ in function $k$, $n_{-i,k}^{(\cdot)}$ is the total number of herbs assigned to function $k$, $n_{-i,k}^{(p)}$ is the number of times herbs in prescription $p$ are assigned to function $k$, and $n_{-i,\cdot}^{(p)}$ is the number of herbs in $p$. All counts exclude the current assignment. In (6), all counts exclude the current two herbs $h_i$ and $h_j$. Note that once a herb pair $(h_i, h_j)$ is assigned to the function $k$, the two herbs $h_i$ and $h_j$ are assigned to the topic simultaneously.

After Gibbs sampling iterations, we estimate the function-herb multinomial distribution β and the prescription function mixture θ as follows:

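If $x_i = 0$, then

$$\theta_{p,k} = \frac{n_{-i,k}^{(p)} + \alpha_k}{n_{-i,\cdot}^{(p)} + \alpha^{T}\mathbf{1}}, \qquad \beta_{k,h_i} = \frac{n_{-i,k}^{h_i} + \eta_{h_i}}{n_{-i,k}^{(\cdot)} + \eta^{T}\mathbf{1}}. \tag{7}$$

If $x_i = 1$, then

$$\theta_{p,k} = \frac{n_{-i,-j,k}^{(p)} + \alpha_k}{n_{-i,-j,\cdot}^{(p)} + \alpha^{T}\mathbf{1}}, \qquad \beta_{k,(h_i,h_j)} = \frac{\bigl(n_{-i,k}^{h_i} + \eta_{h_i}\bigr)\bigl(n_{-j,k}^{h_j} + \eta_{h_j}\bigr)}{n_{-i,-j,k}^{(\cdot)} + \eta^{T}\mathbf{1}}. \tag{8}$$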
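As a rough illustration of the single-herb case, the sketch below evaluates the conditional in (5) for one herb and normalizes it over the prescription's own labels. It assumes symmetric scalar priors, so that $\eta^T\mathbf{1}$ becomes `eta * vocab_size` and $\alpha^T\mathbf{1}$ becomes `alpha * len(labels)`; the function name and the dictionary-based count tables are hypothetical, and the paired case (6) is not covered.

```python
def single_herb_conditional(herb, labels, n_kh, n_k, n_pk, alpha, eta, vocab_size):
    """Normalized p(f_i = k | f_-i) for x_i = 0, restricted to the given labels.

    n_kh[k][herb]: count of herb under function k (current herb excluded)
    n_k[k]:        total herbs assigned to function k (current herb excluded)
    n_pk[k]:       herbs of this prescription assigned to k (current herb excluded)
    """
    n_p = sum(n_pk[k] for k in labels)
    weights = {}
    for k in labels:
        herb_term = (n_kh[k].get(herb, 0) + eta) / (n_k[k] + eta * vocab_size)
        prescription_term = (n_pk[k] + alpha) / (n_p + alpha * len(labels))
        weights[k] = herb_term * prescription_term
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}
```

A Gibbs sweep would draw a new label for the herb from this distribution and then add the herb back into the count tables; restricting the loop to `labels` is what ties every assignment to the prescription's label set.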

2.1.5. Function Prediction

During multilabel prediction, inferring the best set of labels for an unlabeled prescription at test time is more complex: it involves assessing all function label assignments and returning the assignment that has the highest posterior probability. This is impractical, since there are $2^K$ possible function label assignments. For the purpose of this paper, we infer the conditional probability of function labels (topics) given a new prescription by using Bayes' rule (see (9)). The prescription's most probable labels can then be inferred by suitably thresholding its posterior probability over function labels. Suppose a new prescription $p$ consists of a set of herbs $H^{(p)} = \{h_1, h_2, \ldots, h_{N_p}\}$; then $p(k \mid H^{(p)})$ is calculated as follows, where an exponent such as $[x_i = 0]$ equals 1 when the condition holds and 0 otherwise:

$$p(k \mid H^{(p)}) \propto \prod_{h_i, h_j \in H^{(p)}} \bigl(p(h_i \mid k)\, p(k)\bigr)^{[x_i = 0]} \cdot \bigl(p(h_i, h_j \mid k)\, p(k)\bigr)^{[x_i = 1]} = \prod_{h_i, h_j \in H^{(p)}} \bigl(\beta_{k,h_i}\, p(k)\bigr)^{[x_i = 0]} \cdot \bigl(\beta_{k,(h_i,h_j)}\, p(k)\bigr)^{[x_i = 1]}. \tag{9}$$

To simplify calculation, $p(k)$ can be treated as a constant and $p(k \mid H^{(p)})$ can be calculated as follows:

$$p(k \mid H^{(p)}) \propto \prod_{h_i, h_j \in H^{(p)}} \bigl(\beta_{k,h_i}\bigr)^{[x_i = 0]} \cdot \bigl(\beta_{k,(h_i,h_j)}\bigr)^{[x_i = 1]}. \tag{10}$$
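The scoring rule in (10) followed by thresholding might look like the sketch below. Several details are assumptions rather than part of the paper: the function name, the dictionary layout of the learned distributions, and in particular the greedy rule that lets each herb join at most one mined pair (the text does not say how overlapping pairs are resolved).

```python
def predict_functions(herbs, mined_pairs, beta_single, beta_pair, threshold):
    """Return the labels whose unnormalized score from (10) exceeds the threshold.

    beta_single[k][herb] and beta_pair[k][(h_i, h_j)] hold learned probabilities;
    entries missing from these tables are treated as probability 0 (assumption).
    """
    remaining = set(herbs)
    used_pairs = []
    # Assumption: pairs are matched greedily in sorted order, one pair per herb.
    for hi, hj in sorted(mined_pairs):
        if hi in remaining and hj in remaining:
            used_pairs.append((hi, hj))
            remaining -= {hi, hj}

    predicted = []
    for k in beta_single:
        score = 1.0
        for pair in used_pairs:          # herbs with x_i = 1
            score *= beta_pair[k].get(pair, 0.0)
        for herb in remaining:           # herbs with x_i = 0
            score *= beta_single[k].get(herb, 0.0)
        if score > threshold:
            predicted.append(k)
    return predicted
```

Because the score is a product of many small probabilities, useful thresholds are tiny and depend on prescription length; working with log scores would be a natural variation.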

2.2. Feature Extraction

In this section, we adopt the TFIDF method and herbal attributes to extract a prescription's features.

2.2.1. TFIDF Features

TFIDF is often used as a weighting factor in information retrieval and text mining. In TCM, some herbs that appear frequently, such as Glycyrrhizae Radix, tend to have little influence on a prescription's function. In this work, we employ TFIDF to reflect the importance of a herb for a prescription in a collection. A prescription is treated as a “document,” and the corresponding herbs are treated as “terms.” So, we denote $\mathrm{TF}(h_i) = F(h_i)$, which is the frequency of $h_i$, and define $\mathrm{IDF}(h_i) = \log(N / F'(h_i))$, where $N$ is the number of prescriptions and $F'(h_i) = |\{j : h_i \in p_j\}|$ is the number of prescriptions containing the herb $h_i$. Then, the TFIDF feature for the herb $h_i$ can be denoted as follows:

$$\mathrm{TFIDF}(h_i) = F(h_i) \times \log\frac{N}{F'(h_i)}. \tag{11}$$

Based on this, we use the TFIDF features to represent a prescription:

$$p = (t_1, t_2, \ldots, t_m), \tag{12}$$

where $t_i = \mathrm{TFIDF}(h_i)$ if the prescription contains herb $h_i$, otherwise 0, and $m$ is the total number of unique herbs.

However, a prescription contains no information about the number of occurrences of each herb. Thus, we cannot calculate $F(h_i)$ by counting. To solve this problem, we set the herb's dosage as its initial weight. The dosage information can reflect the importance of a herb in a prescription but should be standardized before our task, since different herbs have different usual dosages. For instance, the usual dosage for Pseudoginseng is 3 g ~ 9 g, while that of Dioscoreae Rhizoma is 15 g ~ 30 g. So, the dosages of herbs in a prescription may not be directly comparable. For a prescription, we first standardize each herb's dosage before the TFIDF weighting phase by the following rule:

$$d_i' = \frac{d_i}{d_{\max} + d_{\min}}, \tag{13}$$

where $d_i$ is the actual dosage of herb $h_i$ in a prescription, $d_{\max}$ is its maximum usual dosage, and $d_{\min}$ is the minimum usual dosage. Table 1 shows an example of dosage standardization on prescription “Ma Huang Tang.” The standardized dosage keeps the order of the original data; that is, if a herb has a higher dose in prescription $p_A$ than in prescription $p_B$, it remains in the same order after standardization. Then, $F(h_i)$ can be calculated as

$$F(h_i) = \frac{d_i'}{\sum_{j=1}^{N_p} d_j'}. \tag{14}$$

Table 1. Dosage standardization for “Ma Huang Tang” (g).

| Ma Huang Tang | d_i | d_min | d_max | d_i' (standardized) |
|---|---|---|---|---|
| Ephedrae Herba | 9 | 2 | 9 | 0.82 |
| Cinnamomi Ramulus | 6 | 3 | 9 | 0.50 |
| Armeniacae Semen Amarum | 6 | 4.5 | 9 | 0.44 |
| Glycyrrhizae Radix | 3 | 1.5 | 9 | 0.29 |
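The standardization and weighting rules (11), (13), and (14) can be checked against Table 1 with a short script. The function names are illustrative, and the use of the natural logarithm in `tfidf` is an assumption, since the base of the logarithm is not stated.

```python
import math


def standardize_dosage(dose, d_min, d_max):
    """Equation (13): divide the actual dose by the sum of the usual bounds."""
    return dose / (d_max + d_min)


def herb_frequencies(doses, usual_ranges):
    """Equation (14): normalize standardized doses so they sum to 1."""
    standardized = {
        herb: standardize_dosage(dose, *usual_ranges[herb])
        for herb, dose in doses.items()
    }
    total = sum(standardized.values())
    return {herb: value / total for herb, value in standardized.items()}


def tfidf(frequency, n_prescriptions, n_containing):
    """Equation (11); natural log is an assumption, the base is not stated."""
    return frequency * math.log(n_prescriptions / n_containing)


# Values from Table 1, in grams.
doses = {
    "Ephedrae Herba": 9,
    "Cinnamomi Ramulus": 6,
    "Armeniacae Semen Amarum": 6,
    "Glycyrrhizae Radix": 3,
}
usual_ranges = {  # herb -> (d_min, d_max)
    "Ephedrae Herba": (2, 9),
    "Cinnamomi Ramulus": (3, 9),
    "Armeniacae Semen Amarum": (4.5, 9),
    "Glycyrrhizae Radix": (1.5, 9),
}
frequencies = herb_frequencies(doses, usual_ranges)
```

Rounding the standardized doses to two decimals reproduces the last column of Table 1, and `frequencies` then gives each herb's share $F(h_i)$ within the prescription.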

2.2.2. Attribute Features

The attributes of each herb, named “channel tropism,” “nature & flavor,” and “efficiency,” are described with certain terms. For instance, “nature” refers to the temperature characteristics of the herb, such as “cold,” “hot,” and “warm.” “Flavor” refers to the taste property of the herb, such as “sour,” “bitter,” and “sweet.”

For each prescription, we sort the herbs according to their $F(h_i)$ and select the top two herbs to represent the prescription. For each herb $h_i$, we collect 9 attributes in “nature & flavor,” 12 attributes in “channel tropism,” and 46 attributes in “efficiency,” that is, 67 attributes per herb. Then, the attribute feature vector for a prescription can be denoted as $V = \{v_1, v_2, \ldots, v_m\}$, where $m = 134$ (2 herbs × 67 attributes) and $v_i \in [0, 1]$. If a herb contains feature $i$, the corresponding $v_i$ is 1, otherwise 0. Some specific attributes, such as “slightly bitter” and “slightly hot,” are quantified as 0.5.
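A minimal sketch of this encoding is given below. It assumes the prescription vector is the concatenation of the two selected herbs' attribute vectors, which is consistent with m = 134 but not stated explicitly; the function name, the four-term vocabulary, and the second herb are purely illustrative (the real vocabulary has 67 terms).

```python
def attribute_vector(top_herbs, herb_attributes, vocabulary):
    """Concatenate per-herb attribute encodings: 1 if present, 0.5 if 'slightly', else 0."""
    vector = []
    for herb in top_herbs:
        attributes = herb_attributes[herb]
        for name in vocabulary:
            if name in attributes:
                vector.append(1.0)
            elif "slightly " + name in attributes:
                vector.append(0.5)
            else:
                vector.append(0.0)
    return vector


# Tiny hypothetical vocabulary; "Ephedrae Herba" follows Table 3, "Herb B" is made up.
vocabulary = ["spicy", "bitter", "warm", "lungs"]
herb_attributes = {
    "Ephedrae Herba": {"spicy", "slightly bitter", "warm", "lungs", "bladder"},
    "Herb B": {"spicy", "warm"},
}
features = attribute_vector(["Ephedrae Herba", "Herb B"], herb_attributes, vocabulary)
```

Here `features` has length 2 × 4 = 8; with the full 67-term vocabulary the same function would yield the 134-dimensional vector described above.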

We consider our prediction task as a multilabel classification problem: given a training set consisting of prescriptions with multiple function labels, predict the set of labels appropriate for each prescription in the test set. Based on the above features, several one-vs-rest classifiers are trained to test our method. These classifiers are SVM (Support Vector Machine), Adaboost, and Bayes Network, which are popular and highly competitive baselines in previous work [18].

3. Results

We collected 3055 formulae (https://github.com/violetconch/label-prescription-herb-model) and 972 herbs for our experiments. The former were derived from our project CKCEST (http://zcy.ckcest.cn/tcm/search/classifybrowse?type=pre#), and the latter were derived from the well-known book «Great Dictionary of Chinese Medicine» (https://pan.baidu.com/s/1c14N27Y). Examples of formula data and herb data are listed in Tables 2 and 3.

Table 2. An example of a formula.

| Field | Value |
|---|---|
| Formula | Ma Huang Tang |
| Herbs | Ephedrae Herba (9 g), Cinnamomi Ramulus (9 g), Armeniacae Semen Amarum (6 g), Glycyrrhizae Radix (3 g) |
| Function | Relieving exterior syndrome |

Table 3. The detailed information about “Ephedrae Herba.”

| Field | Value |
|---|---|
| Herb | Ephedrae Herba |
| Efficiency | Inducing perspiration, relieving superficies by cooling, opening the inhibited lung-energy, relieving asthma, clearing dam, subsidence of a swelling |
| Nature & flavor | Spicy, slightly bitter, warm |
| Channel tropism | Lungs, bladder |
| Usual dosage | 2 g ~ 9 g |

3.1. Setup

In the compatibility rule mining step, our method returned the top-N herb pairs according to their associated Cor value, which was used to decide the parameter $x_i$ during the process of topic modeling. The parameters $a$ and $b$ in (3) were both set to 0.5 through repeated experiments.

In the topic modeling-based method, we set the number of topics $K$ to the number of function labels, which was 20. The number of unique herbs extracted from the 3055 formulae was 972. Moreover, we set the hyperparameters $\alpha = 50/K$ and $\eta = 0.1$ and the iteration number $l = 500$.

In the multilabel classifier-based method, we combined the TFIDF feature space and attribute features to represent a formula. The dimension of the TFIDF feature space $p$ was set to 972, the number of unique herbs. The dimension of the attribute features $V$ was 134. The resulting feature vector of each formula therefore had 972 + 134 = 1106 dimensions. We adopted several classifiers (SVM, Adaboost, and Bayes Network) using 4-fold cross validation on the 3055 formulae.

We designed five experiments to conduct our prediction task:

  (a) Topic modeling based on Labeled LDA

  (b) Topic modeling based on LPH

  (c) TFIDF feature space

  (d) Attribute feature space

  (e) TFIDF + attribute feature space.

For experiments (a) and (b), we calculated the probability $p(k \mid H^{(p)})$ for the new prescription $p$, where $k \in [1, \ldots, K]$. The label $k$ was returned when it satisfied the following condition:

$$p(k \mid H^{(p)}) > T, \tag{15}$$

where T was the threshold. For experiments (c)~(e), these feature vectors were generated and used as inputs to classifiers. We tuned the SVMs' shared cost parameter C (=10). The “TFIDF + attributes” features were denoted as pV. The prediction was considered as a 20-class, multilabel classification problem. Each test was performed 10 times to obtain the average performance. We scored each method based on Precision, Recall, and Micro-F1 as our evaluation measures. These measures were defined as follows:

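$$\mathrm{Precision} = \frac{\text{total number of correct labels predicted by a method}}{\text{total number of labels predicted by a method}}, \tag{16}$$

$$\mathrm{Recall} = \frac{\text{total number of correct labels predicted by a method}}{\text{total number of real labels}}, \tag{17}$$

$$\text{Micro-F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{18}$$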
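These three measures pool label counts over all test prescriptions before dividing. A small sketch, with a hypothetical function name and label sets represented as Python sets:

```python
def micro_scores(true_labels, predicted_labels):
    """Micro-averaged Precision, Recall, and F1 over lists of label sets."""
    correct = sum(len(t & p) for t, p in zip(true_labels, predicted_labels))
    n_predicted = sum(len(p) for p in predicted_labels)
    n_true = sum(len(t) for t in true_labels)
    precision = correct / n_predicted if n_predicted else 0.0
    recall = correct / n_true if n_true else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

For example, with true label sets `[{"a", "b"}, {"c"}]` and predictions `[{"a"}, {"c", "d"}]`, two of three predicted labels are correct and two of three real labels are recovered, so all three scores equal 2/3.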

3.2. Experimental Result

3.2.1. Compatibility Rule Mining

We use Precision@N metric to evaluate the effectiveness of our method and then determine the number of returned herb pairs. Precision@N is the ratio of correct pairs to the N returned pairs. The returned pairs are assumed to be correct when they have compatibility rule according to expert's instructions. The experimental results are shown in Table 4.

Table 4. Experimental results of compatibility rule mining.

| Number of returned herb pairs | Precision@N | Number of returned herb pairs | Precision@N |
|---|---|---|---|
| 100 | 100/100 | 1000 | 913/1000 |
| 200 | 200/200 | 1100 | 974/1100 |
| 300 | 294/300 | 1200 | 1026/1200 |
| 400 | 383/400 | 1300 | 1078/1300 |
| 500 | 472/500 | 1400 | 1135/1400 |
| 600 | 550/600 | 1500 | 1166/1500 |
| 700 | 630/700 | 1600 | 1171/1600 |
| 800 | 711/800 | 1700 | 1173/1700 |
| 900 | 809/900 | 1800 | 1174/1800 |

Based on the above results, once more than 1500 pairs are returned, the number of correct pairs barely increases (1166 at N = 1500 versus 1174 at N = 1800). Thus, the top 1500 herb pairs are returned in our experiment. The mining results are visualized in Figure 3. Each vertex in the graph represents a herb. An edge is drawn between a pair of herbs if they have a compatibility rule. As shown in Figure 3, one herb can have compatibility rules with several other herbs. For instance, Ginseng Radix can be combined with Atractylodis Macrocephalae Rhizoma, Zingiberis Rhizoma, Dioscoreae Rhizoma, Angelicae Sinensis Radix, or Cervi Cornu Pantotrichum to promote different treatment effects. These results suggest that efficient algorithms can mine latent compatibility rules, which would be useful to TCM practitioners for further study.
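The choice of N = 1500 can be read off Table 4 by looking at how many additional correct pairs each extra block of 100 returned pairs contributes. The snippet below only re-tabulates the published counts; the helper names are illustrative.

```python
# Correct pairs among the top N returned pairs, copied from Table 4.
correct_at_n = {
    100: 100, 200: 200, 300: 294, 400: 383, 500: 472, 600: 550,
    700: 630, 800: 711, 900: 809, 1000: 913, 1100: 974, 1200: 1026,
    1300: 1078, 1400: 1135, 1500: 1166, 1600: 1171, 1700: 1173, 1800: 1174,
}


def precision_at_n(n):
    """Precision@N: fraction of the N returned pairs judged correct."""
    return correct_at_n[n] / n


def marginal_gain(n, step=100):
    """Additional correct pairs obtained by going from N - step to N."""
    return correct_at_n[n] - correct_at_n[n - step]
```

Going from 1400 to 1500 pairs still adds 31 correct pairs, whereas each later block of 100 adds at most 5, which is the flattening the text refers to.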

Figure 3. Detected 1500 pairs of herbs.

3.2.2. Topic Discovery

Tables 5 and 6 show four topics detected by the LPH model, and Table 7 shows two topics detected by the Labeled LDA model. Each topic lists its top 20 herbs. As shown in Tables 5 and 6, most of the top 20 herbs have functions related to the topic, but several detected herbs do not, such as Plantaginis Semen in the “cleaning heat” topic and Glycyrrhizae Radix in the “relieving uneasiness of mind” topic. Although Plantaginis Semen has low posterior probability and no direct correspondence to the topic, the herb is an important component in some prescriptions having the corresponding function. Glycyrrhizae Radix is detected in most topics, since it is frequently used in many formulae to regulate the actions of all other herbs. Note that Glycyrrhizae Radix is removed from the combinational rule mining results (see Section 2.1.1), not from the topic modeling results; thus it can be assigned to a topic (function) as a single herb in the results of topic discovery.

Table 5. Topics discovered by LPH model.

| Cleaning heat | Probability | Relieving uneasiness of mind | Probability |
|---|---|---|---|
| Szechwan Lovage Rhizome, Angelicae Sinensis Radix | 0.05953 | Polygalae Radix | 0.04842 |
| Unprocessed Rehmanniae Radix | 0.05431 | Ginseng Radix, Atractylodis Macrocephalae Rhizoma | 0.03805 |
| Atractylodis Macrocephalae Rhizoma, Paeoniae Radix Alba | 0.03238 | Rhei Radix | 0.03574 |
| Scutellariae Radix | 0.02507 | Jujubae Fructus | 0.03259 |
| Paeoniae Radix Alba | 0.02403 | Glycyrrhizae Radix | 0.02017 |
| Phellodendri Chinensis Cortex, Anemarrhenae Rhizoma | 0.02403 | Angelicae Sinensis Radix | 0.01960 |
| Glycyrrhizae Radix | 0.02298 | Poria, Szechwan Lovage Rhizome | 0.01615 |
| Poria | 0.02194 | Fossil Fragments, Ostreae Concha | 0.01615 |
| Rehmanniae Radix | 0.01881 | Zingiberis Rhizoma | 0.01384 |
| Coptidis Rhizoma | 0.01776 | Coptidis Rhizoma | 0.01384 |
| Dichroae Radix | 0.01672 | Acori Tatarinowii Rhizoma | 0.01038 |
| Ophiopogonis Radix | 0.01567 | Fresh Rehmanniae Radix | 0.01038 |
| Forsythiae Fructus | 0.01463 | Kansui Radix | 0.01038 |
| Cimicifugae Rhizoma, Clerodendron Cyrtophyllum Turcz | 0.01254 | Dried Rehmanniae Radix | 0.01038 |
| Ginseng Radix | 0.01254 | Aconiti Lateralis Radix Praeparata, Pinelliae Rhizoma | 0.01038 |
| Plantaginis Semen | 0.01254 | Schisandrae Chinensis Fructus | 0.01038 |
| Saposhnikoviae Radix, Notopterygii Rhizoma | 0.01254 | Realgar | 0.00923 |
| Ostreae Concha | 0.01150 | Salviae Miltiorrhizae Radix | 0.00923 |
| Mume Fructus | 0.01150 | Saposhnikoviae Radix | 0.00923 |
| Cinnamomi Ramulus, Paeoniae Radix Alba | 0.00856 | Scrophulariae Radix | 0.00923 |

Table 6. Topics discovered by LPH model.

| Replenishing and restoring | Probability | Dispelling internal cold | Probability |
|---|---|---|---|
| Atractylodis Macrocephalae Rhizoma, Ginseng Radix | 0.05533 | Zingiberis Rhizoma Recens | 0.04842 |
| Poria, Szechwan Lovage Rhizome | 0.05297 | Glycyrrhizae Radix | 0.03805 |
| Astragali Radix | 0.03708 | Codonopsis Radix | 0.03574 |
| Angelicae Sinensis Radix, Dioscoreae Rhizoma | 0.03120 | Pinelliae Rhizoma, Poria | 0.03459 |
| Glycyrrhizae Radix | 0.02767 | Atractylodis Macrocephalae Rhizoma, Angelicae Sinensis Radix | 0.02421 |
| Codonopsis Radix | 0.02649 | Astragali Radix | 0.01960 |
| Rehmanniae Radix Praeparata, Angelicae Sinensis Radix | 0.02531 | Cinnamomi Ramulus | 0.01615 |
| Paeoniae Radix Alba | 0.02096 | Paeoniae Radix Alba, Szechwan Lovage Rhizome | 0.01499 |
| Pinelliae Rhizoma | 0.01325 | Fossil Fragments, Ostreae Concha | 0.01384 |
| Dried Rehmanniae Radix | 0.01325 | Leonuri Herba | 0.01384 |
| Asini Corii Colla, Angelicae Sinensis Radix | 0.00943 | Asini Corii Colla, Angelicae Sinensis Radix | 0.01384 |
| Schisandrae Chinensis Fructus, Atractylodis Macrocephalae Rhizoma | 0.00943 | Ginseng Radix | 0.01269 |
| Asari Radix, Zingiberis Rhizoma | 0.00943 | Atractylodis Rhizoma | 0.01154 |
| Cornu Cervi Pantotrichum | 0.00943 | Radix Asparagi | 0.01038 |
| Salviae Miltiorrhizae Radix, Schisandrae Chinensis Fructus | 0.00943 | Saposhnikoviae Radix, Angelicae Pubescentis Radix | 0.01038 |
| Zingiberis Rhizoma Recens | 0.00825 | Zingiberis Rhizoma | 0.01038 |
| Polygalae Radix | 0.00825 | Platycodonis Radix | 0.00923 |
| Poria | 0.00825 | Salviae Miltiorrhizae Radix | 0.00923 |
| Gastrodiae Rhizoma | 0.00825 | Ephedrae Herba | 0.00923 |
| Sophorae Flavescentis Radix | 0.00707 | Aconiti Lateralis Radix Praeparata | 0.00820 |

Table 7. Topics discovered by Labeled LDA model.

| Cleaning heat | Probability | Relieving uneasiness of mind | Probability |
|---|---|---|---|
| Unprocessed Rehmanniae Radix | 0.03172 | Polygalae Radix | 0.04112 |
| Glycyrrhizae Radix | 0.02984 | Glycyrrhizae Radix | 0.03945 |
| Szechwan Lovage Rhizome | 0.02773 | Ginseng Radix | 0.03712 |
| Ophiopogonis Radix | 0.02678 | Salviae Miltiorrhizae Radix | 0.03226 |
| Scutellariae Radix | 0.02421 | Rhei Radix | 0.03226 |
| Moutan Cortex | 0.01933 | Jujubae Fructus | 0.02110 |
| Anemarrhenae Rhizoma | 0.01933 | Angelicae Sinensis Radix | 0.02110 |
| Atractylodis Macrocephalae Rhizoma | 0.01847 | Fresh Rehmanniae Radix | 0.02110 |
| Rehmanniae Radix | 0.01847 | Poria | 0.01958 |
| Paeoniae Radix Alba | 0.01847 | Scrophulariae Radix | 0.01646 |
| Ginseng Radix | 0.01811 | Coptidis Rhizoma | 0.01617 |
| Coptidis Rhizoma | 0.01652 | Zingiberis Rhizoma | 0.01617 |
| Forsythiae Fructus | 0.01584 | Kansui Radix | 0.01617 |
| Cinnamomi Ramulus | 0.01437 | Fossil Fragments | 0.01025 |
| Phellodendri Chinensis Cortex | 0.01437 | Acori Tatarinowii Rhizoma | 0.00943 |
| Saposhnikoviae Radix | 0.01394 | Aconiti Lateralis Radix Praeparata | 0.00943 |
| Mume Fructus | 0.01394 | Pinelliae Rhizoma | 0.00943 |
| Poria | 0.01386 | Dried Rehmanniae Radix | 0.00872 |
| Chinese Herbaceous Peony | 0.00945 | Lycii Fructus | 0.00845 |
| Ostreae Concha | 0.00835 | Realgar | 0.00845 |

In other topics, we can find similar results as well. Most of the herbs (marked by the rectangle) that do not have intensive correlation with the topic have low probability. A pair of herbs tend to indicate more intensive correlation with the corresponding topics than a single herb, such as Ginseng Radix and Atractylodis Macrocephalae Rhizoma from “relieving uneasiness of mind” topic and Atractylodis Macrocephalae Rhizoma and Angelicae Sinensis Radix from “dispelling internal cold” topic. Therapeutic effects can be promoted by the coordination of two herbs. In addition, many individual herbs are inactive in the corresponding topic but become active in combination with other herbs, such as Paeoniae Radix Alba and Szechwan Lovage Rhizome from “dispelling internal cold” topic. However, Labeled LDA cannot discover combinations of effective interacting herbs (see Table 7).

3.2.3. Function Prediction

In employing the LPH model to solve the multilabel classification problem, we should determine the threshold T in (15). However, there is no theoretical basis to automatically choose an optimal threshold. In this study, we provide the experimental results using different thresholds (see Table 8).

Table 8. Average performance of topic model-based method.

| Threshold T | Labeled LDA Precision | Labeled LDA Recall | Labeled LDA Micro-F1 | LPH Precision | LPH Recall | LPH Micro-F1 |
|---|---|---|---|---|---|---|
| 1e-5 | 0.6102 | 0.1187 | 0.1987 | 0.8124 | 0.1025 | 0.1820 |
| 1e-6 | 0.7317 | 0.2658 | 0.3899 | 0.6075 | 0.2031 | 0.3044 |
| 1e-7 | 0.6567 | 0.3278 | 0.4373 | 0.6874 | 0.3295 | 0.4455 |
| 1e-8 | 0.5927 | 0.4076 | 0.4830 | 0.7220 | 0.4187 | 0.5300 |
| 1e-9 | 0.5365 | 0.4127 | 0.4665 | 0.6267 | 0.4203 | 0.5031 |

Table 9 shows the classification performance of the multilabel classifiers. Comparing the two methods, the multilabel classifiers perform slightly better than the topic model-based methods: the best Micro-F1 is 0.5827 (SVM with TFIDF + attributes) versus 0.5300 (LPH). As shown in Table 8, the value of the threshold has a strong influence on the classification results. We take T = 1e-8 as the optimal value, since both topic models reach their highest Micro-F1 there. With this T, LPH outperforms Labeled LDA on Micro-F1 (0.5300 versus 0.4830). The results demonstrate that incorporating compatibility rules into the topic model can improve prediction accuracy. The recall of both models is unsatisfactory, as the posterior probability highlights the most probable function labels but neglects others.

Table 9. Average performance of multilabel classifiers.

| Classifier | Feature space | Precision | Recall | Micro-F1 |
|---|---|---|---|---|
| SVM | TFIDF | 0.6202 | 0.3945 | 0.4822 |
| SVM | Attributes | 0.6510 | 0.4102 | 0.5033 |
| SVM | TFIDF + attributes | 0.7359 | 0.4823 | 0.5827 |
| Adaboost | TFIDF | 0.5729 | 0.3102 | 0.4025 |
| Adaboost | Attributes | 0.6856 | 0.3358 | 0.4508 |
| Adaboost | TFIDF + attributes | 0.6894 | 0.3475 | 0.4621 |
| Bayes Network | TFIDF | 0.5126 | 0.4325 | 0.4691 |
| Bayes Network | Attributes | 0.6179 | 0.4218 | 0.5013 |
| Bayes Network | TFIDF + attributes | 0.6397 | 0.5124 | 0.5690 |

Table 9 shows that, for every classifier, TFIDF features alone give the lowest Micro-F1. Herbal attributes have better predictive ability than TFIDF features, which indicates that “channel tropism,” “nature & flavor,” and “efficiency” carry valuable information for function prediction; this is consistent with TCM theory. Combining the two feature spaces outperforms either one alone. Among the classifiers, SVM achieves the highest Micro-F1 (0.5827) on the “TFIDF + attributes” feature space.
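The “TFIDF + attributes” feature space can be pictured as a simple concatenation of two vectors per prescription. The sketch below is illustrative only and is not the feature extraction used in the experiments: it assumes that term frequency is a herb's share of the total standardized dosage and that prescription-level attributes are dosage-weighted averages of per-herb attribute vectors. The herb names, dosages, and attribute values are placeholders.

```python
import math


def tfidf_features(prescriptions, vocabulary):
    """One TFIDF row per prescription (assumed tf: herb's share of total dosage)."""
    n_docs = len(prescriptions)
    doc_freq = {h: sum(1 for p in prescriptions if h in p) for h in vocabulary}
    rows = []
    for p in prescriptions:
        total = sum(p.values())
        row = []
        for h in vocabulary:
            tf = p.get(h, 0.0) / total if total else 0.0
            idf = math.log(n_docs / doc_freq[h]) if doc_freq[h] else 0.0
            row.append(tf * idf)
        rows.append(row)
    return rows


def attribute_features(prescriptions, herb_attributes, n_attributes):
    """Dosage-weighted average of per-herb attribute vectors (an assumption)."""
    rows = []
    for p in prescriptions:
        total = sum(p.values())
        row = [0.0] * n_attributes
        for herb, dose in p.items():
            weight = dose / total if total else 0.0
            for i, value in enumerate(herb_attributes[herb]):
                row[i] += weight * value
        rows.append(row)
    return rows


# Placeholder data: herb -> dosage, and herb -> two quantified attributes.
prescriptions = [{"herb_a": 9.0, "herb_b": 9.0}, {"herb_b": 6.0, "herb_c": 12.0}]
herb_attributes = {"herb_a": [1.0, 0.0], "herb_b": [0.0, 1.0], "herb_c": [1.0, 1.0]}
vocabulary = sorted(herb_attributes)

tfidf = tfidf_features(prescriptions, vocabulary)
attrs = attribute_features(prescriptions, herb_attributes, 2)

# "TFIDF + attributes": concatenate the two feature vectors row by row.
combined = [t + a for t, a in zip(tfidf, attrs)]
print(combined)
```

Each combined row could then be passed to any multilabel classifier. Note that a herb appearing in every prescription receives a TFIDF weight of zero in this sketch, so its contribution would come only from the attribute part of the vector.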

3.3. Discussion

From the compatibility rule mining results, we can see that our method can effectively discover herb pairs with combinational rules. The method is not meant to perfectly model TCM reality, but to function as a tool for TCM practitioners. Also, it can indicate herbs that are likely to be used together for special therapeutic effects and allow researchers to make attempts at further study.

From the topic discovery results, we can see that it is feasible to employ the supervised topic model to predict the function of a new prescription. Incorporating compatibility rules into the topic modeling process improves accuracy on this task. The results are more satisfactory than those of Labeled LDA because the efficiency of a pair of herbs is more explicit than that of a single herb, which contributes to the function prediction for a new prescription.

Both proposed kinds of methods can provide valuable information for new prescription discovery before clinical test procedures [16], but each has its limitations. The method based on multilabel classifiers involves complicated and tedious feature extraction steps, such as dosage standardization and attribute quantification, while the LPH topic model cannot choose the optimal threshold automatically. Although the SVM classifier and the LPH model improve function prediction performance, the results are still not very satisfactory. In future work, it may be possible to combine the two methods to improve prediction accuracy.

4. Conclusions

This paper has presented two methods for prescription function prediction. In the first method, we employ a novel supervised topic model named LPH to calculate a prescription's most likely function labels. In the second method, we extract a feature space based on TFIDF weighting and herbal attributes and use these features to build multilabel classifiers. Results on real-world datasets show the effectiveness of our methods. The results can provide valuable information for new prescription discovery.

When doctors write a prescription for a patient, they should follow the principle of “Jun,” “Chen,” “Zuo,” and “Shi,” which plays a significant role in determining a prescription's function. In the future, we plan to analyze the components of a prescription based on its herbal attributes and dosage information. In other words, the herbs in a prescription might be clustered into four classes by data mining algorithms. The results may further improve the accuracy of our prediction task.

Acknowledgments

This study was funded by Zhejiang Provincial Natural Science Foundation of China under Grant no. LQ14F020008, National Natural Science Foundation of China under Grant no. 61602402, and Chinese Knowledge Center for Engineering Science and Technology (CKCEST).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  • 1. Yang H., Chen J., Tang S., et al. New drug R&D of traditional Chinese medicine: role of data mining approaches. Journal of Biological Systems. 2009;17(3):329–347. doi: 10.1142/S0218339009002971.
  • 2. Liu X., Hong W., Song J., Zhang T. Using formal concept analysis to visualize relationships of syndromes in Traditional Chinese Medicine. Medical Biometrics. 2010;6165:315–324. doi: 10.1007/978-3-642-13923-9_34.
  • 3. Yang T., Wu C., Xu Z., Ding Y. The syndrome differentiation model and program of traditional Chinese medicine based on the fuzzy recognition. Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013; December 2013; pp. 285–287.
  • 4. Zhang X.-P., Zhou X.-Z., Huang H.-K., Feng Q., Chen S.-B., Liu B.-Y. Topic model for Chinese medicine diagnosis and prescription regularities analysis: case on diabetes. Chinese Journal of Integrative Medicine. 2011;17(4):307–313. doi: 10.1007/s11655-011-0699-x.
  • 5. Qiao S. J., Tang C. J. Mining the compatibility rule of multidimensional medicines based on dependence model sets. Journal of Sichuan University (Engineering and Science Edition). 2007;39(4):134–138.
  • 6. Wang L., Zhang Y., Xu X. A novel group detection method for finding related Chinese herbs. Journal of Information Science and Engineering. 2015;31(4):1387–1411.
  • 7. Wang Y., Yu Z., Jiang Y., Xu K., Chen X. Automatic symptom name normalization in clinical records of traditional Chinese medicine. BMC Bioinformatics. 2010;11, article no. 40. doi: 10.1186/1471-2105-11-40.
  • 8. Chen J., Poon J., Poon S. K., Xu L., Sze D. M. Y. Mining symptom-herb patterns from patient records using tripartite graph. Evidence-based Complementary and Alternative Medicine. 2015;2015:14. doi: 10.1155/2015/435085.
  • 9. Yao L., Zhang Y., Wei B., et al. Discovering treatment pattern in Traditional Chinese Medicine clinical cases by exploiting supervised topic model and domain knowledge. Journal of Biomedical Informatics. 2015;58:260–267. doi: 10.1016/j.jbi.2015.10.012.
  • 10. Lin F., Zhang Z., Lin S.-F., Zeng J.-S., Gan Y.-F. Study of TCM clinical records based on LSA and LDA SHTDT model. Experimental and Therapeutic Medicine. 2016;12(1):288–296. doi: 10.3892/etm.2016.3285.
  • 11. Jiang Z., Zhou X., Zhang X., Chen S. Using link topic model to analyze traditional Chinese medicine clinical symptom-herb regularities. Proceedings of the IEEE 14th International Conference on e-Health Networking, Applications and Services, Healthcom 2012; October 2012; pp. 15–18.
  • 12. Wang S., Hu Y., Tan W., et al. Compatibility art of traditional Chinese medicine: from the perspective of herb pairs. Journal of Ethnopharmacology. 2012;143(2):412–423. doi: 10.1016/j.jep.2012.07.033.
  • 13. Salam A., Khayal M. S. H. Mining top-k frequent patterns without minimum support threshold. Knowledge and Information Systems. 2012;30(1):57–86. doi: 10.1007/s10115-010-0363-3.
  • 14. Blei D. M., Ng A. Y., Jordan M. I. Latent Dirichlet allocation. Journal of Machine Learning Research. 2003;3(4-5):993–1022.
  • 15. Ramage D., Hall D., Nallapati R., Manning C. D. Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '09); August 2009; pp. 248–256.
  • 16. Yang H.-J., Shen D., Xu H.-Y., Lu P. A new strategy in drug design of Chinese medicine: theory, method and techniques. Chinese Journal of Integrative Medicine. 2012;18(11):803–806. doi: 10.1007/s11655-012-1270-x.
  • 17. Griffiths T. L., Steyvers M. Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(1):5228–5235. doi: 10.1073/pnas.0307752101.
  • 18. Zhang M.-L., Zhou Z.-H. A review on multi-label learning algorithms. IEEE Transactions on Knowledge and Data Engineering. 2014;26(8):1819–1837. doi: 10.1109/TKDE.2013.39.

Articles from Evidence-based Complementary and Alternative Medicine : eCAM are provided here courtesy of Wiley
