PLoS One. 2021 Jul 9;16(7):e0253415. doi: 10.1371/journal.pone.0253415

Unsupervised multi-source domain adaptation with no observable source data

Hyunsik Jeon 1, Seongmin Lee 1, U Kang 1,*
Editor: Thippa Reddy Gadekallu
PMCID: PMC8270218  PMID: 34242258

Abstract

Given trained models from multiple source domains, how can we predict the labels of unlabeled data in a target domain? Unsupervised multi-source domain adaptation (UMDA) aims to predict the labels of unlabeled target data by transferring the knowledge of multiple source domains. UMDA is a crucial problem in many real-world scenarios where no labeled target data are available. Previous approaches to UMDA assume that data are observable in all domains. However, in many practical scenarios source data are not easily accessible due to privacy or confidentiality issues, although classifiers trained in the source domains are readily available. In this work, we target data-free UMDA, where source data are not observable at all: a realistic and crucial problem that has not been studied before. To solve data-free UMDA, we propose DEMS (Data-free Exploitation of Multiple Sources), a novel architecture that adapts target data to the source domains without exploiting any source data, and estimates the target labels by exploiting pre-trained source classifiers. Extensive experiments for data-free UMDA on real-world datasets show that DEMS provides the state-of-the-art accuracy, up to 27.5 percentage points higher than that of the best baseline.

Introduction

Given trained models from multiple source domains, how can we predict the labels of unlabeled data in a target domain? Unsupervised multi-source domain adaptation (UMDA) aims to predict the labels of unlabeled target data by utilizing the knowledge of multiple source domains. Many previous works [1–9] on UMDA have focused on finding domain-invariant features $z$ of data $x$ in order to transfer the knowledge of the conditional probability $p(y|z)$, where $y$ denotes the label of data $x$, from the source domains to the target domain. It is thus essential for UMDA that the data $x$ be observable in all domains, so that the conditional probabilities $p(z|x)$ of all domains can be estimated while finding the domain-invariant features $z$.

However, in many practical scenarios source data are not accessible due to privacy or confidentiality issues, although models of the conditional probability $p(y|x)$ learned in the source domains are readily available. For instance, a hospital may access disease classifiers trained at other hospitals, but not the data those classifiers were trained on, because of privacy concerns. Fig 1 illustrates the UMDA problem under two different constraints. Finding a shared manifold $z$ and translating data between domains is problematic when source data are not observable at all (Fig 1b), compared to the setting where data are observable in all domains (Fig 1a).

Fig 1. An illustration of unsupervised multi-source domain adaptation (UMDA) problems.


(a) illustrates the UMDA problem with observable source data, and (b) illustrates the data-free UMDA problem with no observable source data. It is challenging to reduce the distribution discrepancy between the source and target domains in (b) since there are no accessible source data.

In this paper, we focus on data-free UMDA (Fig 1b), a more difficult but practical problem of knowledge transfer from multiple source domains to an unlabeled target domain. The main challenges are: 1) we cannot directly estimate the target conditional probability $p(y|x)$ since target labels are not given, and 2) we cannot directly learn the shared manifold $z$ between domains since there is no information about the source domain data distributions $p(x)$. We propose DEMS (Data-free Exploitation of Multiple Sources), a novel architecture that adapts target data to the source domains without using any source data and estimates the target labels by exploiting pre-trained source classifiers. To the best of our knowledge, there has been no previous approach for data-free UMDA.

Table 1 compares DEMS with baseline algorithms for data-free UMDA from various perspectives. Since data-free UMDA is a new problem with no previous studies, we introduce several baselines. The first one is Best Single Source, which applies each source classifier individually and selects the best-performing one. The second one is Average, which averages the results of all source classifiers. The third one is Weighted Sum, which combines the results of all source classifiers using heuristically computed domain proximities. DEMS is the only method that utilizes multiple sources, considers domain proximity, and adapts target data to the source domains. Table 2 lists the symbols used in this paper. The contributions of this work are as follows:

Table 1. Comparison of DEMS and other methods.

Method | Utilizes multiple sources | Considers domain proximity | Domain adaptation
Best Single Source | X | X | X
Average | O | X | X
Weighted Sum | O | O | X
DEMS (proposed) | O | O | O

DEMS is the only method supporting all the desired properties.

Table 2. Table of frequently-used symbols.

Symbol | Description
$T$ | Target domain
$S_k$ | $k$-th source domain
$M_{S_k}$ | $k$-th source classifier
$x_T$, $y_T$ | Data and label of the target domain
$x_{S_k}$, $y_{S_k}$ | Data and label of the $k$-th source domain
$A_k$ | Adaptation model from the target domain to the $k$-th source domain
$E$ | Encoder
$D_T$ | Decoder for the target domain
$D_{S_k}$ | Decoder for the $k$-th source domain
  • Problem Formulation. We formulate the new problem of data-free UMDA, which is a challenging but important task for transfer learning (see Fig 1b). Unlike traditional UMDA, data-free UMDA needs to handle the issue of inaccessible source data.

  • Approach. We propose DEMS, a novel approach to solve data-free UMDA. DEMS adapts target data to source domains and exploits given source classifiers based on our proposed domain proximity. DEMS learns the adaptation functions while regulating the classification results of the source classifiers after adaptation.

  • Performance. Our extensive experiments demonstrate that DEMS provides the state-of-the-art accuracy, up to 27.5 percentage points higher than that of the best baseline (see Fig 2).

Fig 2. Classification accuracy.


DEMS shows the best classification accuracy for five target domains; each percentage indicates the accuracy increase compared to the second-best one for each target domain.

Related work

Domain adaptation (DA) aims at transferring the knowledge of a source domain to a different but related target domain. Unsupervised domain adaptation (UDA) leverages a labeled source dataset to predict the labels of an unlabeled target dataset. Various approaches for UDA have been proposed, including adversarial methods [10–13], distance-based methods [14–18], and optimal transport [19, 20].

Recent works [1–9] address unsupervised multi-source domain adaptation (UMDA), which transfers knowledge from multiple source domains rather than a single one to an unlabeled target domain. UMDA offers the potential for superior performance by exploiting the knowledge of multiple source domains, but poses the challenges of reducing the discrepancy between multiple domains and obtaining appropriate domain-invariant features. Many previous works have tackled UMDA with various approaches; Table 3 summarizes their key differences. Zhao et al. [5] propose an adversarial-network-based approach with generalization bounds for UMDA. Xu et al. [6] propose Deep Cocktail Network, which addresses the domain and category shifts among multiple source domains in a multi-way adversarial manner. Peng et al. [9] introduce moment matching to UMDA to dynamically align moments of low-dimensional features in the source and target domains while training source classifiers. However, these approaches assume that source data are observable and train adaptation networks to align the manifolds of the source and target domains. Thus they are not applicable to our setting, where no source data are accessible due to strict privacy or confidentiality issues. In contrast, DEMS trains adaptation networks using target data while regulating the outputs of the given source classifiers.

Table 3. Comparison of different latent space transformation methods for unsupervised multi-source domain adaptation (UMDA).

Method | Source data accessibility | Feature alignment method
[5, 6] | Accessible | Adversarial approach
[9] | Accessible | Discrepancy-based approach
DEMS (proposed) | Inaccessible | Source classifier-based approach

Previous studies have proposed adversarial and discrepancy-based approaches that require source data. In contrast, DEMS works without source data by carefully utilizing the source classifiers.

Proposed method

Problem definition

Suppose there are $N$ source domains $S_1, S_2, \ldots, S_N$ and one target domain $T$, where all domains have different data distributions. We are given pre-trained source classifiers $\{M_{S_k}: x_{S_k} \mapsto y_{S_k}\}_{k=1}^{N}$ that predict the labels of data from the corresponding source domains $\{S_k\}_{k=1}^{N}$, and an unlabeled target dataset $X_T = \{x_T^i\}_{i=1}^{n_T}$ from the target domain $T$; for simplicity, we assume the target dataset is sampled from a uniform label distribution. Each source classifier $M_{S_k}$ is trained on a labeled dataset $\{(x_{S_k}^i, y_{S_k}^i)\}_{i=1}^{n_{S_k}}$ drawn from the corresponding domain data distribution $p_{S_k}(x, y)$. Note that the source datasets are unavailable to us; only the source classifiers are available. In this work, we assume 1) homogeneity, i.e. the source and target domains have similar feature spaces and label distributions, and 2) a closed label set, i.e. $y_{S_k}, y_T \in Y$ for $k = 1, 2, \ldots, N$, where $Y$ is the label space shared by all domains. The goal of data-free UMDA is to accurately predict the target labels $Y_T = \{y_T^i\}_{i=1}^{n_T}$ of the corresponding target data $X_T = \{x_T^i\}_{i=1}^{n_T}$.

Method overview

In UMDA, directly training a target classifier $M_T: x_T \mapsto y_T$ on the target dataset is not possible since the target labels are not observable. Thus, most UMDA methods train $N$ adaptation functions $\{A_k: X_T \to X_{S_k}\}_{k=1}^{N}$ and exploit the pre-trained source classifiers $\{M_{S_k}\}_{k=1}^{N}$ to predict the target labels $Y_T$ of the target data $X_T$. However, in data-free UMDA, we face the challenge of defining the objective function to train the adaptation functions $\{A_k\}_{k=1}^{N}$, since the source data are unobservable and we have no information about the source data distributions $p_{S_k}(x)$ that were used to train $M_{S_k}$.

To address this challenge, we propose DEMS (Data-free Exploitation of Multiple Sources), a novel method for the unsupervised multi-source domain adaptation problem when the source data are entirely unavailable. We cannot directly learn the adaptation of the target data to the source domains since we have no information about the source domains at all. Hence, instead of directly learning the translation between the target and source domains, we regulate the classification results of the source classifiers.

We introduce four ideas in DEMS to regulate the classification results.

  • The first idea is label consistency regularization which regulates the label predictions of all source classifiers to be similar. The adapted examples from the target domain to the source domains should all have the same label if the adaptation functions work properly; we relax the constraint so that the conditional probability p(y|x) of adapted examples should be similar across all source domains.

  • The second idea is batch entropy regularization, which maximizes the label entropy of a shuffled mini-batch. The labels of randomly selected target examples are uniformly distributed; note that we assume the target dataset is sampled from a uniform label distribution. Thus, we maximize the batch entropy to prevent mode collapse, where most of the target examples are mapped to a specific label.

  • The third idea combines instance entropy regularization and pseudo-labeling, which minimize the label entropy of each instance. A target example naturally has a single clear label. Thus, the adapted examples should all have clear labels if the adaptation functions work properly; we minimize the label entropy after adaptation. We further bolster the entropy minimization by assigning pseudo labels to highly confident target data and minimizing the cross-entropy loss between the predictions and the pseudo labels.

  • The last idea is reconstruction regularization that forces an autoencoder to reconstruct target data from the shared manifold. The autoencoder helps find the manifold without losing meaningful information. Thus, we introduce the autoencoder in DEMS with shared parameters and reconstruct target examples to learn their manifold effectively.

The overall architecture of DEMS is depicted in Fig 3. DEMS adapts the target features $X_T$ to the source domains $\{S_k\}_{k=1}^{N}$ via an encoder and decoders to exploit the source classifiers $\{M_{S_k}\}_{k=1}^{N}$. Each adaptation function $A_k: X_T \to X_{S_k}$ is divided into two components: an encoder $E$ and a decoder $D_{S_k}$. The encoder $E$ takes a target example $x_T$ as input and returns its low-dimensional representation vector $z$; $E$ is shared across all domain adaptation functions. The decoder $D_{S_k}$ takes the vector $z$ as input and returns $\hat{x}_{S_k}$, the data translated into the domain $S_k$. Additionally, we introduce a decoder $D_T$ that decodes the low-dimensional representation $z$ back into the target domain $T$. We describe the label prediction and the objective function of DEMS in the following.

Fig 3. Overall architecture of DEMS.


Method details

Label prediction

For each unlabeled target instance $x_T$, DEMS exploits the pre-trained source models $\{M_{S_k}\}_{k=1}^{N}$ to predict its label $y_T$. Specifically, the label predicted by DEMS is formulated as:

$\hat{y}_T = \sum_{k=1}^{N} w_{S_k} M_{S_k}(\hat{x}_{S_k})$.  (1)

In the equation, $\hat{x}_{S_k} = D_{S_k}(E(x_T))$ denotes the data instance translated into source domain $S_k$ using the encoder $E$ and the decoder $D_{S_k}$; $0 \le w_{S_k} \le 1$ (Eq 2) denotes the weight for the source domain $S_k$. All weights add up to 1, i.e. $\sum_{k=1}^{N} w_{S_k} = 1$, so DEMS predicts the label $\hat{y}_T$ of data $x_T$ as a weighted sum of the source classifiers' predictions after domain adaptation. DEMS depends more on the prediction of a source classifier with a higher proximity:

$w_{S_k} = \frac{\exp(\Phi(T, S_k)/\lambda_1)}{\sum_{k'=1}^{N} \exp(\Phi(T, S_{k'})/\lambda_1)}$,  (2)

where $\Phi(A, B)$ (Eq 3) denotes the degree of proximity between domains $A$ and $B$, and $\lambda_1 > 0$ is a hyperparameter that controls the balance of dependency on the source domains. For instance, all the source classifiers contribute almost equally to the label prediction if $\lambda_1$ is large, while a source classifier with higher proximity $\Phi$ dominates the label prediction if $\lambda_1$ is close to 0.

It is challenging to estimate the degree of proximity between domains since the data distributions $p(x)$ of all domains except the target are not observable. Our approach is to learn it via the objective function; the degree of proximity $\Phi(A, B)$ between domains $A$ and $B$ is defined by

$\Phi(A, B) = v_A^{\top} v_B$,  (3)

where $v_A, v_B \in \mathbb{R}^d$ are learnable parameters of dimensionality $d$; that is, the degree of proximity between domains $A$ and $B$ is estimated by the inner product of their embedding vectors, which are trained during the optimization process. A minimal sketch of this prediction pipeline is given below.
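The following is a minimal PyTorch sketch of the prediction pipeline in Eqs 1–3, assuming softmax-normalized classifier outputs; the module and argument names (DomainProximity, predict_target_labels, encoder, decoders, classifiers) are our own illustrative choices, not the authors' released code.

```python
import torch
import torch.nn as nn

class DomainProximity(nn.Module):
    """Learnable domain embeddings for the proximity Phi of Eq 3."""
    def __init__(self, num_domains, dim=10):
        super().__init__()
        # one learnable embedding per domain: index 0 for the target T,
        # indices 1..N for the sources S_1..S_N
        self.v = nn.Embedding(num_domains, dim)

    def forward(self, a, b):
        # Phi(A, B) = <v_A, v_B>  (Eq 3)
        return (self.v(a) * self.v(b)).sum(-1)

def predict_target_labels(x_t, encoder, decoders, classifiers, prox, lam1=1.0):
    """Weighted prediction of Eq 1 with proximity-based weights from Eq 2."""
    N = len(decoders)
    z = encoder(x_t)                                    # shared manifold z
    t = torch.zeros(1, dtype=torch.long)                # target domain index
    phi = torch.cat([prox(t, torch.tensor([k + 1])) for k in range(N)])
    w = torch.softmax(phi / lam1, dim=0)                # Eq 2
    # adapt to each source, classify, and take the weighted sum (Eq 1)
    return sum(w[k] * torch.softmax(classifiers[k](decoders[k](z)), dim=1)
               for k in range(N))
```

Note that the target domain's embedding (index 0) enters only through Eq 2, while the source-source proximities are used by the label-consistency loss described next.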

Objective function

DEMS is trained to minimize the following loss:

$\mathcal{L}_{total} = \alpha \mathcal{L}_{label} + \beta \mathcal{L}_{entropy} + \gamma \mathcal{L}_{pseudo} + \mathcal{L}_{recon}$,  (4)

which consists of four loss terms $\mathcal{L}_{label}$, $\mathcal{L}_{entropy}$, $\mathcal{L}_{pseudo}$, and $\mathcal{L}_{recon}$. $\alpha$, $\beta$, and $\gamma$ are nonnegative hyperparameters that balance the loss terms. We define these loss terms in Eqs 5 and 9–11, respectively.

Label consistency regularization

The aim of domain adaptation is to translate the domain-specific features of an example from the target domain to a source domain while preserving its semantics. If a target example $x_T$ is adapted to multiple source domains while preserving its semantics, the conditional probabilities $p(y|x)$ of the adapted examples should be similar across all source domains. For instance, if an example has a high probability of label 4 in the target domain, the adapted examples should likewise have a high probability of label 4 in every source domain. To guarantee this property, we propose a label-consistency regularization for multi-source domain adaptation:

$\mathcal{L}_{label} = \binom{N}{2}^{-1} \sum_{1 \le i < j \le N} r_{S_i, S_j} \, \mathrm{JSD}(\hat{y}_{S_i} \| \hat{y}_{S_j})$,  (5)

where $\hat{y}_{S_k} = M_{S_k}(\hat{x}_{S_k})$ is the label probability distribution of $x_T$ estimated by the source classifier $M_{S_k}$ after adaptation to the source domain $S_k$. $\mathrm{JSD}(\cdot)$ denotes the Jensen-Shannon divergence [21], a symmetrized and smoothed version of the Kullback-Leibler divergence [22]. The Jensen-Shannon divergence measures the distance between two probability distributions; a small JSD indicates that the two distributions are similar, and a large JSD indicates otherwise. $r_{S_i, S_j}$ (Eq 6) is the degree of proximity between $S_i$ and $S_j$ normalized by the sum of all pairwise proximities between source domains:

$r_{S_i, S_j} = \frac{\exp(\Phi(S_i, S_j)/\lambda_2)}{\sum_{1 \le i' < j' \le N} \exp(\Phi(S_{i'}, S_{j'})/\lambda_2)}$.  (6)

$r_{S_i, S_j}$ strengthens label consistency between close source domains while mitigating it between distant ones. $\lambda_2 > 0$ is a hyperparameter that controls the degree of the regularization.
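Below is a minimal sketch of the label-consistency loss (Eqs 5 and 6), assuming each y_hats[k] is the (B, C) probability output of source classifier k on the adapted mini-batch, and phi_ss is an N × N tensor of learned source-source proximities; the helper names are hypothetical.

```python
import itertools
import torch

def jsd(p, q, eps=1e-8):
    # Jensen-Shannon divergence between two batches of distributions,
    # averaged over the batch
    m = 0.5 * (p + q)
    kl = lambda a, b: (a * (torch.log(a + eps) - torch.log(b + eps))).sum(1)
    return (0.5 * (kl(p, m) + kl(q, m))).mean()

def label_consistency_loss(y_hats, phi_ss, lam2=1.0):
    N = len(y_hats)
    pairs = list(itertools.combinations(range(N), 2))
    # r_{S_i,S_j}: softmax of pairwise proximities over all source pairs (Eq 6)
    r = torch.softmax(torch.stack([phi_ss[i, j] for i, j in pairs]) / lam2, 0)
    # binomial(N, 2)^{-1} times the proximity-weighted sum of JSDs (Eq 5)
    losses = torch.stack([jsd(y_hats[i], y_hats[j]) for i, j in pairs])
    return (r * losses).sum() / len(pairs)
```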

Entropy regularizations

Entropy regularization comprises two distinct losses based on information entropy [23]: 1) the batch-entropy loss $\mathcal{L}_{be}$, which maximizes the label entropy of a batch, and 2) the instance-entropy loss $\mathcal{L}_{ie}$, which minimizes the label entropy of each instance.

We assume that the target dataset is balanced across classes, i.e. examples are sampled with similar probability from each label, which is a common prior for real-world data. Under this assumption, the average of all target label probabilities follows the uniform distribution $(\frac{1}{|C|}, \frac{1}{|C|}, \ldots, \frac{1}{|C|})$, where $C$ denotes the set of classes. Using the fact that the uniform distribution has the maximum information entropy, we define the batch-entropy loss as follows:

$\mathcal{L}_{be} = -\frac{1}{N} \sum_{k=1}^{N} H\!\left(\frac{1}{|B|} \sum_{i \in B} \hat{y}_{S_k}^{i}\right)$,  (7)

where $B$ is the set of instances in a mini-batch $\{x_T^i \sim p_T(x)\}$, and $H(\cdot)$ denotes the information entropy [23]; the mini-batch is also balanced across classes since it is randomly sampled from the whole dataset. By minimizing the batch-entropy loss, we force the average of the batch-wise label probabilities estimated by each source classifier after adaptation toward a uniform distribution.

From another perspective, each target instance inherently has a single clear label, i.e. a one-hot label probability, even if the exact label is unknown. Based on the fact that a one-hot probability distribution has the minimum information entropy [23], we define the instance-entropy loss as follows:

$\mathcal{L}_{ie} = \frac{1}{N|B|} \sum_{k=1}^{N} \sum_{i \in B} H(\hat{y}_{S_k}^{i})$.  (8)

We finally define the total entropy loss as the sum of the batch-entropy loss (Eq 7) and the instance-entropy loss (Eq 8):

$\mathcal{L}_{entropy} = \mathcal{L}_{be} + \mathcal{L}_{ie}$.  (9)
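The entropy regularizations translate directly into code. Below is a minimal sketch of Eqs 7–9 under the same assumption that y_hats[k] holds the (B, C) probability outputs of source classifier k after adaptation.

```python
import torch

def entropy(p, eps=1e-8):
    return -(p * torch.log(p + eps)).sum(-1)   # Shannon entropy H(p)

def entropy_loss(y_hats):
    # batch entropy (Eq 7): the entropy of the batch-averaged prediction is
    # maximized, so its negative is added to the loss
    l_be = -torch.stack([entropy(p.mean(0)) for p in y_hats]).mean()
    # instance entropy (Eq 8): the per-example entropy is minimized directly
    l_ie = torch.stack([entropy(p).mean() for p in y_hats]).mean()
    return l_be + l_ie                          # Eq 9
```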

Pseudo label

A high confidence of the predicted label of a target example, estimated by Eq 1, indicates that the example is successfully adapted to the source domains and clearly classified by the source classifiers. Accordingly, we employ pseudo-labels that bolster the current predictions by treating the predicted label as the ground-truth label. The pseudo-label loss is formulated as a cross-entropy between the predictions and the pseudo-labels:

$\mathcal{L}_{pseudo} = -\frac{1}{|b|} \sum_{i \in b} \sum_{j \in C} (\bar{y}_T^i)_j \log (\hat{y}_T^i)_j$,  (10)

where $C$ is the set of classes, $\bar{y}_T = \mathrm{Dirac}(\hat{y}_T)$, and $(y)_j$ denotes the probability of the $j$-th class in $y$; $\hat{y}_T$ is the target label predicted by DEMS (Eq 1). $\mathrm{Dirac}(\cdot)$ is a function that produces a Dirac distribution; for simplicity, we use one-hot vectorization, which sets the maximum probability to 1 and the rest to 0. Only examples that satisfy $\max_j (\hat{y}_T)_j > \epsilon$, where $0 \le \epsilon \le 1$ is a hyperparameter that sets the confidence threshold, are sampled from the mini-batch $B$; $b \subseteq B$ in Eq 10 denotes the selected subset of the mini-batch.
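Below is a minimal sketch of the pseudo-label loss (Eq 10), assuming y_hat_t is the (B, C) weighted prediction of Eq 1; the function name and the default threshold are illustrative.

```python
import torch

def pseudo_label_loss(y_hat_t, epsilon=0.9, eps=1e-8):
    conf, labels = y_hat_t.max(dim=1)       # confidence and argmax class
    mask = conf > epsilon                   # keep only confident examples b
    if not mask.any():                      # no confident example in batch
        return y_hat_t.new_zeros(())
    picked = y_hat_t[mask]
    # cross-entropy against the one-hot (Dirac) pseudo-labels: only the
    # log-probability of the argmax class survives the inner sum of Eq 10
    return -torch.log(picked.gather(1, labels[mask].unsqueeze(1)) + eps).mean()
```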

Reconstruction

Autoencoders [24], which encode input data into low-dimensional vectors and decode them back to the original space under a reconstruction objective, learn a meaningful low-dimensional manifold while preventing a simple copy of the input data. We employ an autoencoder that shares the encoder $E$ to find the low-dimensional manifold $z$. The reconstruction loss is formulated as follows:

$\mathcal{L}_{recon} = \|x_T - \hat{x}_T\|_1$,  (11)

where $\hat{x}_T = D_T(E(x_T))$ is the reconstruction of $x_T$ by the encoder $E$ and the decoder $D_T$, and $\|\cdot\|_1$ denotes the $l_1$ norm.

Algorithm 1 Training DEMS (Data-free Exploitation of Multiple Sources)

Require: unlabeled target dataset $X_T = \{x_T^i\}_{i=1}^{n_T}$

Require: trained source classifiers $\{M_{S_k}: x_{S_k} \mapsto y_{S_k}\}_{k=1}^{N}$

Require: adaptation networks $\{A_k: X_T \to X_{S_k}\}_{k=1}^{N}$

Require: hyperparameters $\alpha$, $\beta$, $\gamma$, $\lambda_1$, $\lambda_2$, and $\epsilon$

Ensure: trained adaptation networks $\{A_k: X_T \to X_{S_k}\}_{k=1}^{N}$

1: for epoch = 1 to num_epochs do

2:  Calculate the label consistency loss $\mathcal{L}_{label}$ (Eq 5)

3:  Calculate the batch-entropy loss $\mathcal{L}_{be}$ (Eq 7)

4:  Calculate the instance-entropy loss $\mathcal{L}_{ie}$ (Eq 8)

5:  Calculate the entropy loss $\mathcal{L}_{entropy} \leftarrow \mathcal{L}_{be} + \mathcal{L}_{ie}$ (Eq 9)

6:  Predict the target labels $\hat{y}_T$ (Eq 1) and keep only those that satisfy $\max_j (\hat{y}_T)_j > \epsilon$

7:  Calculate the pseudo-label loss $\mathcal{L}_{pseudo}$ (Eq 10)

8:  Calculate the reconstruction loss $\mathcal{L}_{recon}$ (Eq 11)

9:  Calculate the total loss $\mathcal{L}_{total} \leftarrow \alpha \mathcal{L}_{label} + \beta \mathcal{L}_{entropy} + \gamma \mathcal{L}_{pseudo} + \mathcal{L}_{recon}$ (Eq 4)

10:  Update the parameters of $\{A_k\}_{k=1}^{N}$ to minimize $\mathcal{L}_{total}$

11: end for

Algorithm

We summarize the training algorithm of DEMS in Algorithm 1. DEMS takes initialized adaptation networks $\{A_k: X_T \to X_{S_k}\}_{k=1}^{N}$ and trains them by exploiting the pre-trained source classifiers without any source data. DEMS calculates the total loss $\mathcal{L}_{total}$ in lines 2 to 9. Then, in line 10, DEMS updates the parameters of the adaptation networks $\{A_k\}_{k=1}^{N}$ to minimize the total loss $\mathcal{L}_{total}$. This is repeated until the adaptation networks $\{A_k\}_{k=1}^{N}$ are trained properly; we use a validation set, and training proceeds until the total loss $\mathcal{L}_{total}$ on the validation set reaches its lowest value. After training, DEMS predicts the target labels of the test data by Eq 1 using the trained adaptation networks. The predicted target labels are evaluated against the ground-truth labels, and we report the accuracies in the next section. The computational complexity depends on the architecture of the encoder and decoders. For a CNN-based architecture, the computational complexity of label prediction in DEMS is $O(HWk^2MN)$, where $H$ and $W$ are the height and width of the input image, $k$ is the kernel size, and $M$ and $N$ are the numbers of input and output channels.
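Putting the pieces together, the following is a compact sketch of Algorithm 1 that reuses the loss sketches above; the data loader of unlabeled target batches, the optimizer handling, and the per-batch recomputation are our simplifying assumptions, not the authors' exact implementation.

```python
import torch

def train_dems(loader, encoder, decoders, classifiers, dec_t, prox, opt,
               alpha=0.1, beta=1.0, gamma=1.0, num_epochs=100):
    N = len(decoders)
    src = torch.arange(1, N + 1)                   # source domain indices
    for epoch in range(num_epochs):
        for x_t in loader:                         # unlabeled target batch
            z = encoder(x_t)
            y_hats = [torch.softmax(clf(dec(z)), dim=1)
                      for dec, clf in zip(decoders, classifiers)]
            phi_ss = prox.v(src) @ prox.v(src).T               # pairwise Phi (Eq 3)
            l_label = label_consistency_loss(y_hats, phi_ss)   # line 2, Eq 5
            l_entropy = entropy_loss(y_hats)                   # lines 3-5, Eq 9
            y_hat_t = predict_target_labels(x_t, encoder, decoders,
                                            classifiers, prox) # line 6, Eq 1
            l_pseudo = pseudo_label_loss(y_hat_t)              # line 7, Eq 10
            l_recon = (x_t - dec_t(z)).abs().mean()            # line 8, Eq 11 (mean-reduced L1)
            loss = (alpha * l_label + beta * l_entropy
                    + gamma * l_pseudo + l_recon)              # line 9, Eq 4
            opt.zero_grad()                                    # line 10
            loss.backward()
            opt.step()
```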

Experiments

We conduct experiments to answer the following questions:

  • Q1. Accuracy. How accurate is DEMS on real-world datasets?

  • Q2. Qualitative analysis. How well does DEMS adapt a given target example to source domains?

  • Q3. Parameter sensitivity. How much do $\epsilon$ (Eq 10) and $\lambda$ (Eqs 2 and 6) affect the accuracy?

Experimental settings

Datasets

We use five digit image datasets: MNIST [25], MNIST-M [10], SVHN [26], SynDigits [27], and USPS [28], summarized in Table 4; Fig 4 shows sample images from each dataset. For SynDigits, we use a randomly selected subset of 60,000 images out of 479,400 for training and validation; the subset is considered to carry sufficient domain knowledge since a classifier trained on it achieves 95.9% accuracy. We use the original datasets for the others. All five datasets are scaled to the size of (3 × 32 × 32) to have the same input dimensionality. In the experiments, we set one of them as the target and the rest as the sources.

Table 4. Summary of datasets.
Dataset | Features | Training | Validation | Test
MNIST | 1 × 28 × 28 | 55,000 | 5,000 | 10,000
MNIST-M | 3 × 32 × 32 | 55,000 | 5,000 | 10,000
SVHN | 3 × 32 × 32 | 68,257 | 5,000 | 26,032
SynDigits | 3 × 32 × 32 | 55,000 | 5,000 | 9,553
USPS | 1 × 16 × 16 | 6,291 | 1,000 | 2,007
Fig 4. Sample images (10 classes).


Baselines

We set three baselines: Best single source, Average, and Weighted sum. Best single source feeds the target data directly into each source classifier, and the source classifier that yields the best performance is chosen. Average feeds the target data into all source classifiers and averages the resulting label probabilities to predict the target labels. Weighted sum takes a weighted sum of the results after feeding the target data into the source classifiers; we use Eq 2 for the weights and set $\Phi(T, S_k) = \xi - \mathcal{L}_{entropy}^{T \to S_k}$, where $\mathcal{L}_{entropy}^{T \to S_k}$ is the sum of the batch-entropy and instance-entropy losses estimated when the target data are fed directly into the source classifier $M_{S_k}$. $\xi$ is a hyperparameter, and we set it to 1 in all experiments. The intuition behind this definition of $\Phi(T, S_k)$ is that $\mathcal{L}_{entropy}^{T \to S_k}$ is presumably low if the degree of proximity between $T$ and $S_k$ is high; a sketch of this heuristic is given below.
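A small sketch of the Weighted sum baseline's heuristic proximity, assuming probs_k is the (B, C) softmax output of source classifier M_{S_k} on the raw, non-adapted target data:

```python
import torch

def heuristic_proximity(probs_k, xi=1.0, eps=1e-8):
    H = lambda p: -(p * torch.log(p + eps)).sum(-1)   # Shannon entropy
    l_be = -H(probs_k.mean(0))        # batch-entropy loss on raw target data
    l_ie = H(probs_k).mean()          # instance-entropy loss
    return xi - (l_be + l_ie)         # Phi(T, S_k), plugged into Eq 2
```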

Network architecture

We pre-train ResNet14 [29] on each dataset to generate the source classifiers. We adopt the generator architecture of CycleGAN [30]: the encoder is composed of two convolutional layers with stride 2 followed by three residual blocks [29], and each decoder is composed of three residual blocks followed by two transposed convolutional layers with stride 2. We use batch normalization [31] in the encoder and the decoders. Note that an appropriate network architecture should be selected for each application domain; recurrent neural networks [32] and graph autoencoders [33] could be selected in the natural language processing domain [34, 35] and the graph domain [36–39], respectively.
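As a rough sketch of this architecture, the following PyTorch modules implement a shared encoder and a per-domain decoder in the CycleGAN-generator style described above; the channel widths and kernel sizes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

    def forward(self, x):
        return torch.relu(x + self.body(x))            # residual connection

class Encoder(nn.Module):
    """Two stride-2 convolutions followed by three residual blocks."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 4, 2, 1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, 2 * ch, 4, 2, 1), nn.BatchNorm2d(2 * ch), nn.ReLU(),
            *[ResBlock(2 * ch) for _ in range(3)])

    def forward(self, x):
        return self.net(x)                              # shared manifold z

class Decoder(nn.Module):
    """Three residual blocks followed by two stride-2 transposed convolutions."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            *[ResBlock(2 * ch) for _ in range(3)],
            nn.ConvTranspose2d(2 * ch, ch, 4, 2, 1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Tanh())

    def forward(self, z):
        return self.net(z)                              # image of shape (3, 32, 32)
```

In DEMS, one Encoder instance is shared across all adaptation functions, while a separate Decoder is instantiated for the target domain and for each source domain.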

Training details

We first minimize $\mathcal{L}_{recon}$ during the first 5 epochs, initialize $\{D_{S_k}\}_{k=1}^{N}$ with the trained $D_T$, and then minimize $\mathcal{L}_{total}$. Finally, the classification accuracy on the target test dataset is reported at the lowest validation loss $\mathcal{L}_{total}$ over 100 epochs. Each experiment is performed 5 times with different random seeds, and the standard deviation is reported along with the average. We use the hyperparameters that give the best performance. We set $\alpha = 0.1$, $\beta = 1$, and $\gamma = 1$, each chosen from {0.1, 0.5, 1, 5, 10}, in Eq 4. Unless otherwise noted, $\epsilon$ (Eq 10) is set to 0.9, chosen from {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}. We set $\lambda_1$ (Eq 2) and $\lambda_2$ (Eq 6) to the same value $\lambda$, which is set to 1, chosen from {0.125, 0.25, 0.5, 1, 2, 4, 8}. We set the dimensionality of $v_A$ and $v_B$ to 10 in Eq 3. All networks are trained with the Adam optimizer [40] with learning rate 0.001, $l_2$ regularization coefficient 0.0001, $\beta_1 = 0.9$, and $\beta_2 = 0.999$. We implement all code in PyTorch and perform a grid search to find the best hyperparameters, using a workstation with an RTX 2080 Ti.

Accuracy

Overall performance

We compare DEMS with the baselines for data-free UMDA. Table 5 shows the classification accuracy. DEMS shows the best performance, outperforming the baselines in all experiments. In particular, the performance gaps between DEMS and the baselines are large for the MNIST-M target, which has very complex patterns as shown in Fig 4; DEMS achieves 27.5 percentage points higher accuracy than the best baseline. In all experiments except the USPS target, Average and Weighted sum, which exploit the knowledge of multiple source domains, perform worse than Best single source, which exploits the knowledge of a single source domain. This demonstrates how challenging the data-free UMDA problem is and supports the contribution of this work.

Table 5. Classification accuracy of DEMS and baselines.
Target dataset | Best single source (single source) | Average (multi-source) | Weighted sum (multi-source) | DEMS (proposed, multi-source)
MNIST | 97.65 ± 0.75% | 94.87 ± 1.22% | 96.37 ± 0.40% | 99.01 ± 0.12%
MNIST-M | 45.03 ± 3.74% | 33.50 ± 1.72% | 40.91 ± 1.24% | 72.57 ± 3.20%
SVHN | 71.87 ± 3.53% | 23.11 ± 1.61% | 56.09 ± 5.17% | 76.60 ± 1.39%
SynDigits | 91.89 ± 1.79% | 60.47 ± 5.69% | 78.66 ± 4.37% | 93.74 ± 0.79%
USPS | 82.03 ± 3.77% | 84.54 ± 5.31% | 88.09 ± 2.04% | 96.14 ± 0.41%

The remaining datasets, excluding the target, are used as sources. The best method is in bold, and the second best is underlined. Note that DEMS gives the best performance.

Ablation study

We conduct an ablation study to evaluate how each loss of DEMS contributes to the performance; Table 6 shows the results. Note that each of the proposed losses in the objective function (Eq 4) contributes significantly to the performance of DEMS, demonstrating the effectiveness of our ideas.

Table 6. Ablation study on MNIST-M target dataset.
DEMS | DEMS−$\mathcal{L}_{label}$ | DEMS−$\mathcal{L}_{be}$ | DEMS−$\mathcal{L}_{ie}$ | DEMS−$\mathcal{L}_{pseudo}$ | DEMS−$\mathcal{L}_{recon}$
72.69 ± 2.60% | 65.65 ± 5.55% | 10.33 ± 0.62% | 59.58 ± 3.57% | 43.88 ± 1.73% | 11.11 ± 1.59%

DEMS−$\mathcal{L}_{*}$ denotes a variant of DEMS with the loss $\mathcal{L}_{*}$ excluded from $\mathcal{L}_{total}$. Note that each loss contributes significantly to the accuracy of DEMS.

Qualitative analysis

We qualitatively analyze DEMS and its variants DEMS−$\mathcal{L}_{*}$ to evaluate how well DEMS adapts data to different domains; DEMS−$\mathcal{L}_{*}$ denotes a variant of DEMS with the loss $\mathcal{L}_{*}$ excluded from $\mathcal{L}_{total}$. Note that the baseline algorithms are not analyzed qualitatively since they do not adapt data to different domains (see Table 1). For DEMS−$\mathcal{L}_{*}$, we select the three variants DEMS−$\mathcal{L}_{pseudo}$, DEMS−$\mathcal{L}_{be}$, and DEMS−$\mathcal{L}_{recon}$, which show the lowest accuracies in the ablation study (see Table 6).

Fig 5 visualizes adapted samples from MNIST-M to MNIST, SVHN, SynDigits, and USPS, respectively. DEMS (Fig 5b) translates the images into noise at the beginning of training (epoch 1). As training progresses, however, meaningful patterns of the target images (e.g. the shapes of digits rather than the backgrounds) are detected and adapted to each source domain (epoch 7). As training progresses further (epoch 30), DEMS focuses its adaptation on the closer source domains (MNIST, SVHN, and SynDigits) rather than the distant source domain (USPS), and its classification performance improves. DEMS−$\mathcal{L}_{pseudo}$ (Fig 5c) successfully adapts most classes to MNIST and SynDigits, but fails to adapt some classes (digits 3, 7, and 9) to the source domains, yielding degraded classification performance. DEMS−$\mathcal{L}_{be}$ (Fig 5d) and DEMS−$\mathcal{L}_{recon}$ (Fig 5e) do not learn to adapt the target data to the source domains at all.

Fig 5. Visualization of image adaptation from MNIST-M to other source domains.


Fig (a) enumerates the target samples for Figs (b), (c), (d), and (e). The target samples are adapted by adaptation networks trained with different losses. For DEMS (Fig (b)), the adaptation gradually focuses on the close source domains (MNIST, SVHN, and SynDigits), resulting in improved performance. For DEMS−$\mathcal{L}_{pseudo}$ (Fig (c)), some classes (digits 3, 7, and 9) fail to be adapted to the source domains. For DEMS−$\mathcal{L}_{be}$ and DEMS−$\mathcal{L}_{recon}$ (Figs (d) and (e)), the adaptation is not learned at all.

Parameter sensitivity

Sensitivity of ϵ

The hyperparameter $\epsilon$, which appears in $\mathcal{L}_{pseudo}$ (Eq 10), governs the confidence threshold for pseudo-labels. As $\epsilon$ increases, the selected examples have higher confidence but fewer examples are selected; as $\epsilon$ decreases, more examples are selected but their confidence decreases. As shown in Fig 6a, the accuracy is highest when $\epsilon$ is 0.9 for all datasets, and it drops significantly in the extreme case $\epsilon = 1$. The results demonstrate that DEMS is best optimized with high-quality pseudo-labels.

Fig 6. Sensitivity of accuracy to the hyperparameters ϵ (Eq 10) and λ (Eqs 2 and 6).


Sensitivity of λ

The hyperparameter $\lambda$, which appears in Eqs 2 and 6, controls the balance of dependency between domains; note that $\lambda_1 = \lambda_2 = \lambda$ in our experiments. If $\lambda$ is a large positive value, all the source classifiers contribute almost equally to the target label prediction in Eq 1, and even source classifiers that are not close to each other are regulated to output similar predictions in Eq 5. Conversely, if $\lambda$ is close to zero, a source classifier closer to the target domain contributes more to the target label prediction in Eq 1, and source classifiers that are not close to each other are less regulated to output similar predictions in Eq 5. Fig 6b shows that the best results are obtained when $\lambda = 1$ for all target domains, and the performance degrades if $\lambda$ is too large or too small. In particular, SVHN, which has relatively complex patterns, shows severely degraded performance when $\lambda$ is larger than 2, which means that considering nearby sources rather than all sources is more helpful for a complex target.

Conclusion

We propose DEMS (Data-free Exploitation of Multiple Sources), a novel architecture for multi-source domain adaptation without any observable source data. DEMS learns to adapt target data to each source domain in order to exploit the pre-trained source classifiers. Experiments on real-world datasets show that DEMS outperforms the baselines by up to 27.5 percentage points in accuracy, successfully learning the adaptation functions and exploiting the source classifiers for target label prediction. However, DEMS assumes that the source and target domains have similar feature spaces and share the same label space; thus, DEMS is not applicable to domain adaptation between heterogeneous domains. Future work includes extending DEMS to transfer knowledge between heterogeneous domains, e.g. from images to text or vice versa, which may require a careful design of the adaptation networks.

Data Availability

The data and code are available at: https://github.com/snudatalab/DEMS.

Funding Statement

This work was supported by Institute of Information & communications Technology Planning & Evaluation(IITP) grant funded by the Korea government(MSIT) (No.2020-0-00894, Flexible and Efficient Model Compression Method for Various Applications and Environments). The Institute of Engineering Research and ICT at Seoul National University provided research facilities for this work. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Gan, C., Yang, T., & Gong, B. Learning attributes equals multi-source domain generalization. In CVPR (2016).
  • 2.Hoffman, J., Kulis, B., Darrell, T., & Saenko, K. Discovering latent domains for multisource domain adaptation. In ECCV (2012).
  • 3.Sun, Q., Chattopadhyay, R., Panchanathan, S., & Ye, J. A two-stage weighting framework for multi-source domain adaptation. In NeurIPS (2011).
  • 4.Zhang, K., Gong, M., & Schölkopf, B. Multi-source domain adaptation: A causal view. In AAAI (2015).
  • 5.Zhao, H., Zhang, S., Wu, G., Moura, J. M. F., Costeira, J. P., & Gordon, G. J. Adversarial multiple source domain adaptation. In NeurIPS (2018).
  • 6.Xu, R., Chen, Z., Zuo, W., Yan, J., & Lin, L. Deep cocktail network: Multi-source unsupervised domain adaptation with category shift. In CVPR (2018).
  • 7.Roy, S., Siarohin, A., Sangineto, E., Sebe, N., & Ricci, E. Trigan: Image-to-image translation for multi-source domain adaptation. CoRR (2020).
  • 8. Ben-David S., Blitzer J., Crammer K., Kulesza A., Pereira F., & Vaughan J. W. A theory of learning from different domains. Mach. Learn. 79(1-2), 151–175 (2010). doi: 10.1007/s10994-009-5152-4
  • 9.Peng, X., Bai, Q., Xia, X., Huang, Z., Saenko, K., & Wang, B. Moment matching for multi-source domain adaptation. In ICCV (2019).
  • 10.Ganin, Y. & Lempitsky, V. S. Unsupervised domain adaptation by backpropagation. In ICML (2015).
  • 11.Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., & Krishnan, D. Unsupervised pixel-level domain adaptation with generative adversarial networks. In CVPR (2017).
  • 12.Tzeng, E., Hoffman, J., Saenko, K., & Darrell, T. Adversarial discriminative domain adaptation. In CVPR (2017).
  • 13.Long, M., Cao, Z., Wang, J., & Jordan, M. I. Conditional adversarial domain adaptation. In NeurIPS (2018).
  • 14.Long, M., Zhu, H., Wang, J., & Jordan, M. I. Deep transfer learning with joint adaptation networks. In ICML (2017).
  • 15.Long, M., Zhu, H., Wang, J., & Jordan, M. I. Unsupervised domain adaptation with residual transfer networks. In NeurIPS (2016).
  • 16.Long, M., Cao, Y., Wang, J., & Jordan, M. I. Learning transferable features with deep adaptation networks. In ICML (2015).
  • 17.Zellinger, W., Grubinger, T., Lughofer, E., Natschläger, T., & Saminger-Platz, S. Central moment discrepancy (CMD) for domain-invariant representation learning. In ICLR (2017).
  • 18.Chen, C., Chen, Z., Jiang, B., & Jin, X. Joint domain alignment and discriminative feature learning for unsupervised deep domain adaptation. In AAAI (2019).
  • 19.Courty, N., Flamary, R., Habrard, A., & Rakotomamonjy, A. Joint distribution optimal transportation for domain adaptation. In NeurIPS (2017).
  • 20.Damodaran, B. B., Kellenberger, B., Flamary, R., Tuia, D., & Courty, N. Deepjdot: Deep joint distribution optimal transport for unsupervised domain adaptation. In ECCV (2018).
  • 21. Lin J. Divergence measures based on the Shannon entropy. IEEE Trans. Inf. Theory 37(1), 145–151 (1991). doi: 10.1109/18.61115
  • 22. Kullback S. & Leibler R. A. On information and sufficiency. The Annals of Mathematical Statistics 22(1), 79–86 (1951). doi: 10.1214/aoms/1177729694
  • 23. Shannon C. E. A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948). doi: 10.1002/j.1538-7305.1948.tb01338.x
  • 24.Masci, J., Meier, U., Ciresan, D. C., & Schmidhuber, J. Stacked convolutional auto-encoders for hierarchical feature extraction. In ICANN (2011).
  • 25. LeCun Y., Bottou L., Bengio Y., & Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). doi: 10.1109/5.726791
  • 26.Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., & Ng, A. Y. Reading digits in natural images with unsupervised feature learning. (2011).
  • 27.Roy, P., Ghosh, S., Bhattacharya, S., & Pal, U. Effects of degradations on deep neural network architectures. CoRR (2018).
  • 28.Hastie, T., Friedman, J. H., & Tibshirani, R. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics. Springer (2001).
  • 29.He, K., Zhang, X., Ren, S., & Sun, J. Deep residual learning for image recognition. In CVPR (2016).
  • 30.Zhu, J., Park, T., Isola, P., & Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV (2017).
  • 31.Ioffe, S. & Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Bach, F. R. & Blei, D. M., editors, ICML (2015).
  • 32.Sutskever, I., Vinyals, O., & Le, Q. V. Sequence to sequence learning with neural networks. In Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. D., & Weinberger, K. Q., editors, Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada (2014).
  • 33.Kipf, T. N. & Welling, M. Variational graph auto-encoders. CoRR (2016).
  • 34.Clark, K., Luong, M., Le, Q. V., & Manning, C. D. ELECTRA: pre-training text encoders as discriminators rather than generators. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net (2020).
  • 35.He, J., Wang, X., Neubig, G., & Berg-Kirkpatrick, T. A probabilistic formulation of unsupervised text style transfer. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020 (2020).
  • 36.Ahamad, R. Z., Javed, A. R., Mehmood, S., Khan, M. Z., Noorwali, A., & Rizwan, M. Interference mitigation in d2d communication underlying cellular networks: Towards green energy. CMC-COMPUTERS MATERIALS & CONTINUA (2021).
  • 37.Alazab, A., Venkatraman, S., Abawajy, J., & Alazab, M. An optimal transportation routing approach using gis-based dynamic traffic flows. In ICMTA 2010: Proceedings of the International Conference on Management Technology and Applications (2010).
  • 38.Naeem, A., Javed, A. R., Rizwan, M., Abbas, S., Lin, J. C., & Gadekallu, T. R. DARE-SEP: A hybrid approach of distance aware residual energy-efficient SEP for WSN. IEEE Trans. Green Commun. Netw. (2021).
  • 39.Priya, R. M. S., Maddikunta, P. K. R., M., P., Koppu, S., Gadekallu, T. R., Chowdhary, C. L., et al. An effective feature engineering for DNN using hybrid PCA-GWO for intrusion detection in iomt architecture. Comput. Commun. (2020).
  • 40.Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. In Bengio, Y. & LeCun, Y., editors, ICLR (2015).

Decision Letter 0

Thippa Reddy Gadekallu

26 Apr 2021

PONE-D-21-11372

Unsupervised Multi-Source Domain Adaptation with No Observable Source Data

PLOS ONE

Dear Dr. Kang,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Based on the comments received from the reviewers and my own observation, I recommend minor revisions for the paper.

Please submit your revised manuscript by Jun 10 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Thippa Reddy Gadekallu

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: - Please highlight the contribution clearly in the introduction

- this paper lacks in Novelty of the proposed approach. The author should highlight the contribution clearly in the introduction and provide a comparison note with existing studies.

- Some Paragraphs in the paper can be merged and some long paragraphs can be split into two.

- The quality of the figures can be improved more. Figures should be eye-catching. It will enhance the interest of the reader.

- What are the computational resources reported in the state of the art for the same purpose?

- Please cite each equation and clearly explain its terms.

- Math work should be written math mode.

- What are the evaluations used for the verification of results?

- Clearly highlight the terms used in the algorithm and explain them in the text.

- Authors should add the most recent reference:

1) DARE-SEP: A Hybrid Approach of Distance Aware Residual Energy-Efficient SEP for WSN, IEEE Transactions on Green Communications and Networking

2) Interference Mitigation in D2D Communication Underlying Cellular Networks: Towards Green Energy, Computers, Materials & Continua

Reviewer #2: 1. What are the limitations of the existing works that motivated the current research?

2. Summarize the key findings from the related works in the form of a table.

3. Some of the recent works such as the following on DNN/ML can be discussed in the paper: "An Effective Feature Engineering for DNN using Hybrid PCA-GWO for Intrusion Detection in IoMT Architecture, An optimal transportation routing approach using GIS-based dynamic traffic flows"

4. Present the computational complexity of the proposed approach.

5. Compare the current work with the recent state-of-the-art.

6. Discuss the limitations of the current work in the conclusion.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jul 9;16(7):e0253415. doi: 10.1371/journal.pone.0253415.r002

Author response to Decision Letter 0


3 Jun 2021

1. Reviewer 1.

• (R1-1) Please highlight the contribution clearly in the introduction.

– (A1-1) The contents of the contribution list have been revised to clearly highlight the contributions (lines 38-40 and lines 43-44 in introduction section).

• (R1-2) This paper lacks in novelty of the proposed approach. The author should highlight the contribution clearly in the introduction and provide a comparison note with existing studies.

– (A1-2) We revised the contribution list (lines 38-40 and lines 43-44 in introduction section) and provided a comparison note with the competitors in the introduction (lines 29-34 in introduction section). The existing studies for unsupervised multi-source domain adaptation (UMDA) assume that source data are observable and they train the adaptation networks to align manifolds of source and target domains. Thus, the existing methods are not applicable to our setting where source data are not observable. On the other hand, DEMS trains the adaptation networks while regulating the results of the source classifiers (lines 67-72 in related work section).

• (R1-3) Some Paragraphs in the paper can be merged and some long paragraphs can be split into two.

– (A1-3) We reviewed the overall manuscript to reorganize the paragraphs in it. Especially, we itemized the main ideas which were in one long paragraph into individual items (lines 104-126 in proposed method section).

• (R1-4) The quality of the figures can be improved more. Figures should be eye-catching. It will enhance the interest of the reader.

– (A1-4) We improved Figure 3, which previously looked complicated, to be eye-catching.

• (R1-5) What are the computational resources reported in the state-of-the-art for the same purpose?

– (A1-5) We implemented all the codes using PyTorch and trained all networks including DEMS and competitors using RTX 2080 Ti (lines 288-289 in experiments section).

• (R1-6) Please cite each equation and clearly explain its terms.

– (A1-6) We cited all equations in the manuscript and clearly explained the terms (lines 143, 147, 164, 179, 203, 204, and 213 in experiments section).

• (R1-7) Math work should be written math mode.

– (A1-7) We already wrote the math works by math mode. If there is anything we missed, please let us know and we will fix it.

• (R1-8) What are the evaluations used for the verification of results?

– (A1-8) We used a validation set and trained DEMS until the loss L_{total} on the validation set was the lowest. Each experiment was performed 5 times with different random seeds, and we reported the standard deviation along with the averaged accuracy. We added this description in lines 232-233 in the proposed method section and lines 278-281 in the experiments section.

• (R1-9) Clearly highlight the terms used in the algorithm and explain them in the text.

– (A1-9) We summarized the training algorithm of DEMS and explained the process in algorithm section (lines 226-240 in proposed method section).

• (R1-10) Authors should add the most recent reference: 1) DARE-SEP: A Hybrid Approach of Distance Aware Residual Energy-Efficient SEP for WSN, IEEE Transactions on Green Communications and Networking, and 2) Interference Mitigation in D2D Communication Underlying Cellular Networks: Towards Green Energy, Computers, Materials & Continua.

– (A1-10) We added the two references (lines 273-276 in experiments section).

2. Reviewer 2.

• (R2-1) What are the limitations of the existing works that motivated the current research?

– (A2-1) Previous works have focused on unsupervised multi-source domain adaptation (UMDA) where source data are accessible. Thus, they trained the adaptation networks to align manifolds between source and target domains using the source and the target data. However, source data are not easily accessible in practical scenarios although source classifiers are readily accessible. Hence, we were motivated to develop a method to train adaptation networks using the source classifiers and the target data, without using the source data. We added such discussion in lines 67-72 in the related works section.

• (R2-2) Summarize the key findings from the related works in the form of a table.

– (A2-2) We summarized the key findings from the related works and compared them with our proposed method in Table 3 (lines 61-62 in related works section).

• (R2-3) Some of the recent works such as the following on DNN/ML can be discussed in the paper: ”An Effective Feature Engineering for DNN using Hybrid PCA-GWO for Intrusion Detection in IoMT Architecture”, and ”An optimal transportation routing approach using GIS-based dynamic traffic flows”

– (A2-3) We added the two references (lines 273-276 in experiments section).

• (R2-4) Present computational complexity of the proposed approach.

– (A2-4) We added the computational complexity of DEMS (lines 236-240 in proposed method section).

• (R2-5) Compare the current work with recent state-of-the-art.

– (A2-5) The data-free UMDA is a novel problem since there were no previous approaches which can work without the source data. Nevertheless, we introduced several baselines in Table 1 and explained them in the introduction (lines 29-34 in introduction section).

• (R2-6) Discuss about the limitations of the current work in conclusion.

– (A2-6) We included the limitations of the work in conclusion section (lines 351-354 in conclusion section).

Attachment

Submitted filename: rebuttal_letter.pdf

Decision Letter 1

Thippa Reddy Gadekallu

7 Jun 2021

Unsupervised Multi-Source Domain Adaptation with No Observable Source Data

PONE-D-21-11372R1

Dear Dr. Kang,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Thippa Reddy Gadekallu

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have addressed almost all my suggestions. I would like to accept this paper.

Reviewer #2: The authors have done a good job in addressing all the comments and suggestions. The paper is improved significantly and is in a good shape now. I recommend the paper to be accepted in the current form.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Thippa Reddy Gadekallu

29 Jun 2021

PONE-D-21-11372R1

Unsupervised Multi-Source Domain Adaptation with No Observable Source Data 

Dear Dr. Kang:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Thippa Reddy Gadekallu

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: rebuttal_letter.pdf

    Data Availability Statement

    The data and code are available at: https://github.com/snudatalab/DEMS.

