PLOS One. 2022 Nov 15;17(11):e0277550. doi: 10.1371/journal.pone.0277550

The silent majority: The typical Canadian sex worker may not be who we think

Lynn Kennedy 1,*
Editor: Hamid Sharifi2
PMCID: PMC9665380  PMID: 36378670

Abstract

Background

Most sex worker population studies measure population at discrete points in time and very few studies have been done in industrialized democracies. The purpose of this study is to consider how time affects the population dynamics of contact sex workers in Canada using publicly available internet advertising data collected over multiple years.

Methods

Approximately 3.6 million web pages were collected from advertising sites used by contact sex workers between November 2014 and December 2016 inclusive. Contacts were extracted from ads and used to identify advertisers. First names were used to estimate the number of workers represented by an advertiser. Counts of advertisers and names were adjusted for missing data and overcounting. Two approaches for correcting overcounts were compared. Population estimates were generated weekly, monthly and for the two-year period. The length of time advertisers were active was also estimated. Estimates were also compared with related research.

Results

Canadian sex workers typically advertised individually or in small collectives (median name count 1, IQR 1–2, mean 1.8, SD 4.4). Advertisers were active for a mean of 73.3 days (SD 151.8, median 14, IQR 1–58). At least 83.5% of advertisers were female. The scaled weekly, monthly and two-year estimates for female sex workers represented 0.2%, 0.3% and 2% of the 2016 Canadian female population aged 20–49, respectively. White advertisers were the largest ethnic group (53%).

Conclusions

Sex work in Canada is a more pervasive phenomenon than indicated by spot estimates and the length of the data collection period is an important variable. Non-random samples used in qualitative research in Canada likely do not reflect the larger sex worker population represented in advertising. The overall brevity of advertising activity suggests that workers typically exercise agency, reflecting the findings of other Canadian research.

Introduction

The purpose of this study is to provide a more comprehensive picture of the population dynamics of contact sex workers in Canada by examining sex worker advertising behavior over multiple years. How does the population change over time? Most importantly, can we say that all Canadian sex workers are represented in the debate around policy and in the research that is often used as the basis for policy?

The majority of studies that estimate sex worker populations on a national scale only attempt to generate estimates at a single point in time. These studies are nevertheless usually costly, large-scale efforts that use a variety of techniques. These techniques include: in-person interviews [1–9]; respondent-driven and token-based sampling (where researchers use an initial group of participants to recruit other participants) [4–6]; in-person site counts (where researchers identify geographic “hotspots” that a population of interest is known to frequent, typically by recruiting and interviewing local experts, and visually enumerate the people at that location over a preselected time period) [3,4,6–8]; and indirect mapping via service delivery statistics (where non-governmental organization (NGO) or police statistics are used as a basis to infer population) [3,7,9–11]. One study from New Zealand combined counts of street workers with in-person enumeration based on newspaper and internet advertising [12]. With the exception of Abel et al. [12], female sex workers (FSW) or female-identified sex workers are the subject of the research. Most of the research is limited to populations in the Global South.

A number of criticisms of existing population studies have been put forward. Both Cusick et al. [11] and Abel et al. [12] describe how population estimates can be difficult to confirm and can inflate numbers when based on NGO and police records as these are often kept long after workers have left the industry. Population estimates that depend on specific locales or social networks [3–6,13] can leave out workers who are not part of these contexts. Workers who travel, for example, or who are only in the industry for short periods of time may be excluded.

Population studies are not the only research affected by sampling methodology. Many Canadian qualitative studies of sex workers use non-random samples [14–24]. It is an open question whether non-random samples typically used in this research accurately reflect the demographic composition of the populations they intend to represent. This is an important question as this research is often used as the basis for government policy.

Internet advertising has become the dominant form of advertising for sex workers in Canada. Internet-based advertising has been shown to be a powerful tool that can significantly improve risk-management, safety, and communication between sex workers and clients. Research has shown that online communication has allowed sex workers to communicate significant details to clients including their preferences and personal health practices [14,25].

This study shows that population estimates can be made at a much lower cost compared to other methods as data collection and analysis are mostly automated. The source records, created by the advertisers themselves, can be extensive, improving the completeness of estimates and including otherwise hidden populations. Perhaps most important, the length of time individuals are involved as contact sex workers can be estimated. While the majority appear to be in the industry for relatively brief periods of time, there is no “one size fits all” description that applies to all sex workers.

Methods

Overview

This study provides evidence for the number of sex workers advertising over specific time periods using publicly available internet advertising data. Data was collected and analyzed over a two year period using a combination of open source and custom software tools [26]. Fig 1 provides an overview of the processing pipeline and Table 1 outlines the steps taken to generate population statistics. This process was done in four main phases. The first phase involved downloading data from the target websites. Once files were downloaded, they were analyzed and grouped based on contact metadata. Names were detected and counted in ads to estimate the actual number of workers represented. Collected metadata was then analyzed for errors and finally statistics were generated. Most steps were done once but some, such as cluster analysis, name interpolation and scaling, were done on an ad hoc basis when analyzing population for a specific time period.

Fig 1. Flow diagram illustrating the data processing pipeline used to generate population estimates.

Table 1. Data processing steps for population estimation.

Steps were automated except where indicated.

| Step | Description |
|---|---|
| Collect data | Custom download programs make local copies of web page files and associated images for further processing and analysis. Date, time and gender metadata are collected here. |
| Metadata extraction | |
| Analyze images | Images are hashed to identify related images and analyzed for faces. |
| Extract contacts | Find phone numbers and email addresses in ad text. |
| Extract names | Find first names in ad text. |
| Extract other metadata | Identify other variables of interest. Gender and social context (collective vs individual) are the main ones used in this analysis. |
| Cluster analysis | Analyze contacts to find those that co-occur in ads. Treat these related contacts as a single virtual contact. This is done as needed for a given time period. |
| Error estimation and mitigation | |
| Check for bad contact data* | The automated extraction process for contact data can pick up invalid contact information. Invalid contacts are either removed or combined with related contacts. |
| Estimate proportion of valid advertisers* | Not every advertiser is a contact sex worker. The number of relevant advertisers is counted from a sample of advertisers based on the criteria described in supplemental materials S1 File. This is used in the scaling calculation described below. |
| Estimate proportion of valid names* | For any given contact, names may have been extracted in error. For a random sample of advertiser-name pairs, the number of correct pairs is counted. This is used in the scaling formula described below. |
| Image validation* | Check samples of images for the validity of image hashes. Determine the optimum confidence level for face detection. See S1 Appendix. |
| Estimate contact change rate | Some advertisers change contacts periodically. This can be measured from ad metadata and related images. The rate of contact change is used in the scaling calculation described below. |
| Estimate probability name is new | Some workers change their names. The probability that any name seen is in fact referring to a worker that has not been seen before is measured using methods similar to the contact change measure. |
| Interpolate missing name counts | When advertisers lack name data, add in name counts based on median values for individual and collective advertisers. This is done as needed for a given time period. |
| Apply scaling | A scaling calculation reduces the advertiser and worker counts based on the contact change rate and the proportions of valid names and contacts. This is done as needed for a given time period. |
| Measures and statistical analyses | |
| Generate population estimates | Sum raw, interpolated and scaled name and advertiser counts for multiple time periods. Generate descriptive statistics for shorter periods. Name counts are stratified by gender. |
| Identify trends | Create linear regression models for ad count, advertisers and workers versus month for the study period. |
| Days online | Generate descriptive statistics for the number of days online for advertisers. |
| Social context | Measure social context by comparing proportions of individual versus collective advertisers and workers. |

*indicates the step was not automated.

Data collection

The sites analyzed in this study represent where the majority of sex work advertising occurred in Canada during the 2014–2016 study period according to advisors from the Sex, Power, Agency, Consent, Environment and Safety Project (SPACES) [24]. SPACES was initiated in 2012 at the University of British Columbia to explore health and safety issues experienced by off-street sex workers. The SPACES advisors were people with experience in contact sex work either as workers or third parties who were users of such websites.

Each source website had a unique structure. Customized downloaders for each site were developed to gather ads and associated images [“downloaders” in 26]. Sites were checked for new ad pages at least every 15 minutes. It was assumed to be very unlikely that an advertiser would post and then delete an ad in less than 15 minutes.

For the purposes of time delimited population estimates, source data was restricted to classified sites that provided time-stamped ads from distinct advertisers where each ad was a unique web page. Content from other sources was analyzed separately for purposes of comparison.

Metadata extraction

Ad text was isolated from each web page and was parsed for metadata [26]. The metadata used for the population estimates consisted of contact information, first names, gender and whether an advertiser represented a collective or an individual. Gender was inferred from the location of the ad webpage on the source website encoded in the ad urls. Contact information and names were discovered in the ad text itself. Images were analyzed as a way to further connect advertisers identified by contact information. The extracted data was stored in a MariaDB database for later analysis [27].

Contacts to advertisers

The main strategy for estimating population was to group ads together based on contact information. These groups of ads were considered the output of an advertiser, an entity which could represent one or more workers. For the purpose of identifying advertisers two main types of primary contacts were extracted: phone numbers and email addresses. Fig 2 illustrates the decision making process for accepting or rejecting contacts in ads.

Fig 2. Contact and name processing flow diagram.

Phone numbers were often in a numeric form, such as 416-555-1234. However, some phone numbers were obscured using combinations of numerals and words, for example “sevenseveneight 5five5 5421”. To identify advertisers, all phone numbers were converted into a common numeric form: 778 555 5421. All extracted phone numbers were required to have valid North American area codes and were checked against the original ad to see if the phone number could be matched. Any phone number that could not be matched because it had been obscured was checked visually before being included as a contact for that advertiser.
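
The conversion to a common numeric form can be illustrated with a minimal Python sketch. It is not the study's extraction code: the word-to-digit map, the function name `normalize_phone` and the handling of a leading country code are assumptions made for illustration, and area-code validation is omitted. The function expects a candidate phone string rather than a whole ad.

```python
import re

# Hypothetical word-to-digit map; the lexicon actually used in the study is not given.
WORD_DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

# Match spelled-out digits (longest words first) or single numerals.
_TOKEN = re.compile("|".join(sorted(WORD_DIGITS, key=len, reverse=True)) + r"|\d")


def normalize_phone(candidate):
    """Return 'AAA EEE NNNN' for a 10-digit candidate, or None otherwise."""
    tokens = _TOKEN.findall(candidate.lower())
    digits = "".join(WORD_DIGITS.get(tok, tok) for tok in tokens)
    # Assumption: an 11-digit number starting with 1 carries a country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return f"{digits[:3]} {digits[3:6]} {digits[6:]}"


print(normalize_phone("sevenseveneight 5five5 5421"))  # 778 555 5421
print(normalize_phone("416-555-1234"))                 # 416 555 1234
```

Returning `None` for anything that does not reduce to ten digits mirrors the idea that doubtful candidates should go to visual inspection rather than be accepted silently.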

Similarly, extracted emails could be misspelled or be extracted in error if the advertiser used the @ symbol in a way that might mimic an email address. Emails with poorly formed domains were visually inspected. Additionally, the Levenshtein distance [28] was calculated for every pair of emails. Email pairs with a Levenshtein distance of 2 or less were flagged for visual inspection. Groups of emails which appeared to be simple misspellings of each other were given a single canonical email address.
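
As an illustration of the flagging step, the following Python sketch computes the Levenshtein distance with the standard dynamic-programming recurrence and lists email pairs at distance 2 or less. The function names and the example addresses are hypothetical and are not taken from the study's code or data.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def flag_similar(emails, max_distance=2):
    """Return sorted pairs of distinct emails within max_distance edits."""
    unique = sorted({e.lower() for e in emails})
    return [(x, y) for x, y in combinations(unique, 2)
            if levenshtein(x, y) <= max_distance]


emails = ["anna@example.com", "ana@example.com", "zoe@example.org"]
print(flag_similar(emails))  # [('ana@example.com', 'anna@example.com')]
```

Comparing every pair is quadratic in the number of addresses, so a large collection would likely need blocking (for example by domain) before pairwise comparison; the flagged pairs are candidates for visual inspection, not automatic merges.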

Most of the time advertisers could be identified by a single contact. However, in some cases ads could contain multiple contacts. To avoid overcounting, groups of co-occurring contacts were given a virtual contact identifier called a cluster. These clusters were identified using the DBSCAN algorithm [“pop/clusters.py” in 26,29]. Fig 3 illustrates a virtual contact Cluster1 that contains the contacts Phone1, Phone2, and Email1 in a group of ads. Contacts were considered related when they appeared together in at least one ad. Clusters are only meaningful in the context of a specific time period and are created as needed when population estimates are calculated. Any contact could either be standalone or in exactly one cluster for any time period studied.

Fig 3. Example cluster created by DBSCAN.

Ads are linked by common contacts. The virtual contact Cluster1 is associated with the ads containing contacts Phone1, Phone2 and Email1. The primary contacts Phone1, Phone2 and Email1 are considered “neighbors” because each contact appeared in at least one ad with another contact in the cluster.
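
The grouping of co-occurring contacts can be sketched in a few lines of Python. This sketch deliberately swaps in a different technique: it uses a union-find structure to compute connected components over co-occurrence, not the DBSCAN implementation referenced above. It assumes that a single shared ad is enough to link two contacts; the function name and the example data are hypothetical.

```python
def cluster_contacts(ads):
    """Group contacts that co-occur in ads.

    ads: iterable of collections of contact strings, one collection per ad.
    Returns a list of sorted contact lists, one per group (standalone
    contacts appear as single-element groups).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for contacts in ads:
        contacts = list(contacts)
        for c in contacts:
            find(c)  # register every contact, including standalone ones
        for other in contacts[1:]:
            parent[find(other)] = find(contacts[0])  # link to the first contact

    groups = {}
    for c in parent:
        groups.setdefault(find(c), set()).add(c)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


ads = [["Phone1", "Phone2"], ["Phone2", "Email1"], ["Phone3"]]
print(cluster_contacts(ads))
# [['Email1', 'Phone1', 'Phone2'], ['Phone3']]
```

In this example Phone1 and Email1 never share an ad, yet they end up in the same group because both co-occur with Phone2, which matches the "neighbors" description in the Fig 3 caption.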

Advertisers to people

In the context of this study, advertisers could be thought of as a hidden variable represented by contacts. However, ultimately advertisers represent people. In order to determine approximately how many individuals an advertiser represented, first names were detected in ads. Most workers only used first names to identify themselves.

Detecting names in ads was a multi-step process of language model generation, refinement and finally application. First, a list of 1000 popular female and 1000 popular male first names [30] was compared against trigrams extracted from ads containing those name strings. Candidate name strings that appeared to be mostly used as names were retained. These seed names were used to identify words that typically preceded names (for example “name is …” or “je m’appelle …”). Trigrams containing these context words were then used to identify a larger set of names actually used in ads.

To count names in ads, name strings were identified in ad text where characters were converted to lowercase and all punctuation was removed except single quotes [“namelist” in 26,31]. As shown in Fig 2, when a potential name was found, it was saved in a canonical form with repeated letters removed. For example, a name “Angellaaaaaa” in the original text would be converted to the canonical form of “angela” before being stored. Names found in individual ads were collected and attached to the associated advertiser. Each canonical name used by an advertiser was treated as a unique worker for the purposes of population estimates.
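
The canonical form described above can be sketched as follows. This is an illustrative reading of the description, not the study's “namelist” code: the function name is hypothetical, and the regular expressions are an assumption about how "punctuation" and "repeated letters" were defined.

```python
import re


def canonical_name(raw):
    """Lowercase, strip punctuation except single quotes, collapse repeats."""
    text = raw.lower()
    text = re.sub(r"[^\w\s']", "", text)      # drop punctuation, keep quotes
    text = re.sub(r"(\w)\1+", r"\1", text)    # 'll' -> 'l', 'aaaaaa' -> 'a'
    return text.strip()


print(canonical_name("Angellaaaaaa"))  # angela
print(canonical_name("Anna!!"))        # ana
```

One consequence of collapsing every repeated letter is that distinct spellings such as “Anna” and “Ana” share a canonical form, so under this reading they would be counted as one name for a given advertiser.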

An advertiser was considered a collective representing multiple individual workers if the advertiser either had more than one name associated with them or used keywords that indicated the advertiser represented a group of workers. The following keywords were used to detect collective advertisers missing name data: “models”, “girls”, “we”, “our”, “us”, “spa”, “agency”, “club”, “nous”, “filles”, “agence”, “four hands”, “duo”, “trio”, “roommate” and “couple”.
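
A minimal Python sketch of this rule is shown below. The keyword list is taken from the text; the function name, the case-insensitive whole-word matching and the example ads are assumptions, since the matching rules used in the study are not described here.

```python
import re

COLLECTIVE_KEYWORDS = [
    "models", "girls", "we", "our", "us", "spa", "agency", "club", "nous",
    "filles", "agence", "four hands", "duo", "trio", "roommate", "couple",
]

# Assumption: keywords match as whole words, ignoring case.
_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in COLLECTIVE_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def is_collective(ad_texts, names):
    """True if the advertiser has more than one name or uses a collective keyword."""
    if len(set(names)) > 1:
        return True
    return any(_KEYWORD_RE.search(text) for text in ad_texts)


print(is_collective(["Visit our spa today"], []))         # True
print(is_collective(["Hi, my name is Anna"], ["ana"]))    # False
print(is_collective(["Call me"], ["ana", "bela"]))        # True
```

Whole-word matching matters for short keywords: without it, “us” and “nous” would match inside unrelated words such as “bonus”.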

Image analysis

Images were hashed with a perceptual hashing algorithm [32,33] to identify similar images. Common images could be used to detect when advertisers had changed contacts. Faces were detected using the Python mtcnn module to identify images with people [34] (see also S1 Appendix). Facial analysis was used as an aid in identifying workers when generating error parameters.

Error detection and mitigation

Error estimates

The metadata extraction process could produce errors. Erroneous contacts were removed and co-occurring contacts were combined as described above. To further correct for overcounting, the probability that an advertiser represented contact sex workers, P(a relevant), was estimated by visually inspecting ads from a random sample of advertisers using the criteria outlined in supplemental materials S1 File. Secondly, P(n valid), the probability that a name had been correctly extracted for a given advertiser, was estimated by visually inspecting a random sample of advertiser-name pairs.

Calculating rates of contact change

Population estimates could be inflated when an advertiser changed name or contact information. To measure these changes, sequences of ads were examined and frequencies of changed names and contacts were tallied [“pop/namechanged.pl” and “pop/idchanged.pl” in 26]. Face images in both cases were used as a proxy to identify an individual advertiser independently of name or contact. Ads with common face images but new contacts or names were considered changed.

Ad sequences used for detecting name change contained ads with at least one face image and only one contact per ad. For contact change, ad sequences from Site 3 were used. These advertisers could be identified independently with an internal chat id. The Site 3 ads all contained only one name and at least one face image. As Site 3 advertisers could change chat ids, the rate of chat id change was also measured by counting the number of chat ids associated with Site 3 contacts.

Scaling calculation for advertiser and worker counts

To mitigate the effects of errors and advertisers changing name and contact information, Python modules were developed to adjust or scale population and advertiser counts for any time period [“pop” in 26]. A base module pop.py collected advertisers and names for a specific time period, generated clusters, added median name counts where names were missing and scaled back the counts based on the formulae described below. This base module was run repeatedly for different time periods by multipop.py to generate descriptive statistics.

Advertisers, who could represent one or more workers based on groups of ads, were estimated for a given period using Eq 1:

$$\hat{N}_{\text{advertisers}} = P(a_{\text{relevant}}) \sum_{a \in \text{Advertisers}} P(a_{\text{unique}}) \qquad (1)$$

Where $\hat{N}_{\text{advertisers}}$ is the adjusted number of advertisers for a given period. P(a relevant) is the measured probability that any given advertiser is relevant. Advertisers is the original set of advertisers active during the period. P(a unique) is the probability that an advertiser had not changed contacts during this time.

P(a unique) was estimated using Eq 2:

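$$P(a_{\text{unique}}) = \frac{1}{1 + \text{NewContacts}} = \frac{1}{1 + \left(\frac{\text{Period}}{\text{Days}(a)} - 1\right) R_{\text{idchange}}} \qquad (2)$$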

Where Period is the total number of days in the period, Days(a) is the number of days that advertiser a was online during the period and 1 ≤ Days(a) ≤ Period. The constant Ridchange is the measured rate per day that an advertiser adds new contacts. If we assume that advertisers tend to advertise for the same length of time each time they advertise, the number of days where an advertiser may create a new contact would be approximately the ratio of the Period and Days(a) minus the one known period in the interval.

Sex worker population for a given period based on name counts was estimated using Eq 3:

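$$\hat{N}_{\text{workers}} = P(n_{\text{valid}})\, P(n_{\text{unique}})\, P(a_{\text{relevant}}) \sum_{a \in \text{Advertisers}} \text{Names}(a)\, P(a_{\text{unique}}) \qquad (3)$$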

Where $\hat{N}_{\text{workers}}$ is the estimated sex worker population for a given period. Names(a) is the estimated name count found for advertiser a in the period or, if no names were found, the median number of names based on social context (individual or collective). P(n valid) is the measured probability that any detected name is valid for a given advertiser and P(n unique) is the measured probability that a name was not changed.
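
Eqs 1 to 3 can be read as weighted sums over advertisers, where each advertiser is discounted by the probability that it is not a repeat of an earlier advertiser under a new contact. The Python sketch below illustrates that reading. It is not the study's pop.py: all function and parameter names are hypothetical, and the inputs in the usage lines are made-up values chosen to keep the arithmetic easy to follow.

```python
def p_unique(days_online, period_days, r_idchange):
    """Eq 2: probability that an advertiser did not change contacts."""
    new_contacts = (period_days / days_online - 1) * r_idchange
    return 1.0 / (1.0 + new_contacts)


def scaled_advertisers(days_by_advertiser, period_days, r_idchange, p_relevant):
    """Eq 1: relevance-weighted sum of per-advertiser uniqueness."""
    return p_relevant * sum(
        p_unique(d, period_days, r_idchange) for d in days_by_advertiser
    )


def scaled_workers(advertisers, period_days, r_idchange,
                   p_relevant, p_name_valid, p_name_unique):
    """Eq 3: advertisers is an iterable of (days_online, name_count) pairs."""
    total = sum(
        names * p_unique(days, period_days, r_idchange)
        for days, names in advertisers
    )
    return p_name_valid * p_name_unique * p_relevant * total


# Made-up inputs: a 30-day period and an exaggerated change rate of 0.5.
print(p_unique(30, 30, 0.5))                                      # 1.0
print(p_unique(10, 30, 0.5))                                      # 0.5
print(scaled_workers([(30, 2), (10, 1)], 30, 0.5, 1.0, 1.0, 1.0))  # 2.5
```

An advertiser online for the whole period keeps full weight, while one online for a third of the period is discounted, which is how short-lived advertisers contribute less to the scaled counts.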

Ridchange is defined by Eq 4:

$$R_{\text{idchange}} = R_{\text{id changes/day}} + R_{\text{chatid change}} \qquad (4)$$

Where Rid changes/day is the measured rate of new contacts per day and Rchatid change is the measured rate of new chat ids per day for Site 3 advertisers.

The 95 percent confidence intervals for Formulas 1 and 2 were calculated by first finding the confidence intervals (CI) for the P and R parameters. The scaling calculations were then rerun using the lower and upper CI values for the input parameters. For the probabilities the CI was calculated using Eq 5 [35]:

$$CI = P \pm z \sqrt{\frac{P(1 - P)}{N}} \qquad (5)$$

Where z is the z function value for the 95 percent confidence interval, P is the parameter value and N is the sample size used to determine P. The root sum of squares, shown in Eq 6, was used to calculate the confidence interval CI for the Ridchange parameter that combined two rates:

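$$CI = R_{\text{idchange}} \pm z \sqrt{\frac{R_{\text{id changes/day}}^{2}}{N_{\text{id changes/day}}} + \frac{R_{\text{chatid change}}^{2}}{N_{\text{chatid change}}}} \qquad (6)$$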
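
The two interval calculations in Eqs 5 and 6 can be sketched in Python as follows. This is an illustration under stated assumptions: z is taken as 1.96 for a 95 percent interval, the function names are hypothetical, and the inputs in the usage lines are made-up values rather than the study's measured parameters.

```python
import math

Z_95 = 1.96  # assumed z value for a 95 percent confidence interval


def proportion_ci(p, n, z=Z_95):
    """Eq 5: normal-approximation interval for a proportion p from n samples."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def combined_rate_ci(r1, n1, r2, n2, z=Z_95):
    """Eq 6: root-sum-of-squares interval for the sum of two rates."""
    total = r1 + r2
    half_width = z * math.sqrt(r1 ** 2 / n1 + r2 ** 2 / n2)
    return total - half_width, total + half_width


low, high = proportion_ci(0.5, 100)
print(round(low, 3), round(high, 3))  # 0.402 0.598

low, high = combined_rate_ci(0.3, 9, 0.4, 16)
print(round(low, 4), round(high, 4))  # 0.4228 0.9772
```

The lower and upper bounds produced this way are the values that would be fed back into the scaling calculations to obtain the interval for a population estimate.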

Estimating advertisers using image data

An alternate way to estimate advertisers uses image sharing. If we know how many images advertisers use on average and we assume that advertisers changing contacts use their own images and tend to use the same images in ads, advertiser counts can be estimated with Eq 7:

$$\hat{N}_{\text{advertisers}} = \frac{\text{Unique images}}{\text{Av images per advertiser} \times R_{\text{image reused}}}\, P(a_{\text{relevant}}) \qquad (7)$$

Where $\hat{N}_{\text{advertisers}}$ is the estimated number of advertisers for the duration of the study, Unique images is the count of unique image hashes found associated with advertisers, Av images per advertiser is the average number of images used by any advertiser, Rimage reused is the average number of times images were reused by advertisers, and P(a relevant) is the measured proportion of relevant advertisers. A limitation of this technique is that the image parameters must be measured separately for each time period.

Measures and statistical analyses

Advertiser population estimates were generated for the whole two-year period as well as monthly and weekly. Advertiser population estimates consisted of two variables: the raw advertiser count and the scaled estimate from Eq 1. Scaled advertiser estimates for the two-year period were also calculated using Eq 7. Descriptive statistics were generated for monthly and weekly estimates. Days online were estimated for all advertisers, calculated as the number of days between the first date and last date ads from that advertiser were seen.

Similarly, worker population estimates were generated weekly, monthly and for the two year period stratified by gender and social context (individual vs collective). These estimates consisted of three variables: the raw name count for the time period, the name count where advertisers missing names are assigned median name counts and a scaled estimate that corrects for advertisers changing contacts and names. Descriptive statistics were generated for monthly and weekly population counts. As a point of comparison with existing research, scaled population estimates for cis female workers were compared with the 2016 Canadian 20–49 year old female population from Statistics Canada [36].

Monthly trends were calculated for downloaded ads, advertisers and workers. These trends were considered for all workers as well as stratified by gender. The R lm function [37] was used to calculate univariate linear models between the number of months from the start of data collection (independent variable, range 0–25) and the dependent variables of downloaded ad count, adjusted advertiser count and adjusted worker count for each month.
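
For readers who do not use R, the univariate model can be illustrated in Python with an ordinary least squares fit of a monthly series against the month index. This sketch is a stand-in for the R lm call rather than a reproduction of it: the function name is hypothetical, the example series is made up, and only the slope, intercept and slope standard error are computed.

```python
import math


def linear_trend(monthly_values):
    """OLS fit of values against month index 0..n-1.

    Returns (slope, intercept, slope_standard_error); needs at least 3 points.
    """
    n = len(monthly_values)
    months = range(n)
    mean_x = sum(months) / n
    mean_y = sum(monthly_values) / n
    sxx = sum((x - mean_x) ** 2 for x in months)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(months, monthly_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(months, monthly_values))
    slope_se = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, slope_se


# Made-up series that grows by exactly 2 per month.
slope, intercept, slope_se = linear_trend([10, 12, 14, 16])
print(slope, intercept)  # 2.0 10.0
```

The slope corresponds to the beta parameter reported per month, and the slope standard error is the quantity from which a p value would be derived.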

Verifying population estimates

The advertiser and worker population estimates were compared to four other data sources as well as an earlier analysis (see S2 Appendix). Firstly, the per-capita average weekly estimate of active Canadian workers of all genders was compared with a 2006 spot estimate for indoor workers in New Zealand [12]. Secondly, advertisers from non-classified sites during this period were compared with the advertisers identified from classified ads. Thirdly, the demographic breakdown of a large sample from Argento et al. [15] was compared with the composition of the advertisers identified in this study. Lastly, advertiser estimates for two periods, November 1, 2014 to August 1, 2015 and November 1, 2015 to August 1, 2016, were compared to chat id counts from Site 3 collected between November 1, 2021 and August 1, 2022.

Ethics statement

All source data used in this study consisted of publicly available data at the time it was collected and was collected in accordance with the policies of the sites in effect at the time. The methods used conform to the ethical standards of the Canadian Sociological Association (section 4.10 II) and the American Sociological Association (section 10.5 c) [38,39]. As the replicability of the main results of this paper is important, a data set is provided as part of the supporting information along with the code used to process it. However, in order to protect the safety and privacy of advertisers and third parties, all identifying information has been removed including the names of the source websites.

Results

Downloaded ads

A total of 3641544 web pages were collected from websites hosting Canadian contact sex work advertising between November 1, 2014 and December 31, 2016. The majority of the pages, 3545247 (97.36%), were collected from six classified ad sites designated here as Sites 1 to 6. The classified ad pages, where the advertiser and publication date were unambiguous, were used in the time-based analysis. As a comparison, 96297 pages were downloaded from Canadian non-classified adult advertising sites over the same time period. Table 2 provides a breakdown of what was downloaded.

Table 2. Web pages collected per source in 2014–2016.

| Source | Pages collected | Percent |
|---|---|---|
| Site 1 | 851206 | 23.37% |
| Site 2 | 2057728 | 56.51% |
| Site 3 | 220071 | 6.04% |
| Site 4 | 409381 | 11.24% |
| Site 5 | 5832 | 0.16% |
| Site 6 | 1029 | 0.03% |
| Non classified sites | 96297 | 2.64% |
| Total classified ads | 3545247 | 97.36% |
| Total ads | 3641544 | 100.00% |

Advertisers who used classified advertising tended to only advertise on one classified site. Of the 6 websites, advertisers used on average 1.08 sites to advertise (standard deviation 0.3, median 1.0).

Contact information was not found in 399318 (11.0%) of the classified ads used in the 2014–2016 population estimates. A sample of 6955 ads representing 3975 unique cleaned ad texts was evaluated using the criteria in supplemental materials S1 File. The ads were judged to be relevant contact sex work ads 76.4% of the time (95% CI 75.4%–77.4%). Advertisers on average produced 17.8 ads (SD = 199.8). Assuming this rate for ads without contact information, 17139 +/- 0.5 advertisers (unscaled) may have been missed, an additional 9.3%.

Error estimates

The probability that a name was valid for a given advertiser, P(n valid), was 0.9612 +/- 0.0322 based on a random sample of 3415 unique advertiser-name pairs. The probability that a given advertiser represented relevant contact sex workers, P(a relevant), was 0.9500 +/- 0.0294 based on a random sample of 3999 advertisers.

Contacts were found to change much more frequently than names. Advertisers were estimated to change contacts at a rate of 0.0223 +/- 0.0017 contacts per day from a sample of 8454 ads representing 928 Site 3 advertisers, all of which had ads with images containing faces and only one chat id, name and contact represented in the ad. Ads with the same chat id and face images were considered to belong to the same advertiser.

As it was free to register on Site 3 it was relatively easy for advertisers to have more than one chat id. Counting the number of chat ids associated with 4761 single contacts on Site 3 showed that advertisers created 0.0024 +/- 0.0001 new chat ids per day. The rate of chat id change was combined with the rate of contact change to arrive at the estimated true rate of contact change in the calculation that scaled back the advertiser and name counts described in the scaling calculation section above.

The probability that a name had not been changed by an advertiser P(n unique) was 0.9953 +/- 0.0940. The probability was estimated based on a sample of 221865 ads representing 431 advertiser/name pairs. This sample was not restricted by site. All ads had only one contact and name and had images containing faces. A name was considered changed when the contact and face images had been reused with a different name.

Population estimates

Fig 4 illustrates month to month trends during the 2014–2016 study period and Table 3 summarizes linear regression results between month and various population measures. Throughout this period there was a significant positive relationship between almost all measured statistics and month. Only new ads did not increase significantly. A dip in ad volume in November 2015 was the result of the downloaders being shut down for a two week period early in the month.

Fig 4. Canadian monthly counts for the period 2014-11-01 to 2016-12-31.

a) ads downloaded; b) contacts, advertisers and clusters; c) estimated population by gender; d) male and transgendered population detail.

Table 3. Changes by month from start of data collection for 2014–2016.

| Variable | Beta parameter | Standard error | p value |
|---|---|---|---|
| Ad volume | 3501.5 | 620.9 | p<0.001 |
| New ads* | 439.4 | 588.9 | 0.463 |
| Scaled advertisers | 471.5 | 31.3 | p<0.001 |
| Clusters | 18.2 | 2.7 | p<0.001 |
| Scaled workers | 662.2 | 55.9 | p<0.001 |
| Scaled cis-female | 608.2 | 51.0 | p<0.001 |
| Scaled cis-male | 92.2 | 6.1 | p<0.001 |
| Scaled trans-female | 26.3 | 1.6 | p<0.001 |

*result not significant.

The estimated number of Canadian advertisers for the 2014–2016 study period was 75600 (95% CI 74087–77219) based on the scaling formula described in Eq 1. Canadian worker estimates were generated using Eq 3. An average of 16846 (SD 5858) workers of all genders were estimated to be active weekly; monthly, this average increased to 26326 (SD 5481). Over the two-year study period, 169473 sex workers (95% CI 166870–172226) were estimated to be active at least once. Population counts for the 2014–2016 study period are summarized in Table 4, stratified by self-identified gender, comparing the original raw counts, counts with missing name data interpolated, and the scaled estimates that correct for overcounting. Advertisers were estimated to be active for a mean of 73.3 days (SD 151.8, median 14, IQR 1–58).

Table 4. Name and advertiser counts for the period between 2014-11-01 and 2016-12-31.

Gender is based on ad category. Percentages are relative to total names. Corrected estimates are generated using Eq 3 for names and Eq 1 for advertisers.

| Category | Unscaled counts | Missing names added | Scaled (corrected) counts |
|---|---|---|---|
| Cis female names | 252133 (84.7%) | 258795 (83.5%) | 141669 (95% CI 139469–143996, 83.6%) |
| Cis male names | 15333 (5.1%) | 18141 (5.9%) | 8013 (95% CI 7843–8194, 4.7%) |
| Transgender names | 4931 (1.7%) | 5050 (1.6%) | 2943 (95% CI 2903–2985, 1.7%) |
| Other* names | 25407 (8.5%) | 27736 (8.9%) | 16849 (95% CI 16656–17051, 9.9%) |
| Total names | 297805 | 309924 | 169473 (95% CI 166870–172226) |
| Advertisers | 172767 | n/a | 75600 (95% CI 74087–77219) |

*refers to ads where no gender was indicated.

The majority group was female-identified sex workers, who were estimated to represent 83.6% of workers overall (N = 141669, 95% CI 139469–143996). Scaled counts in any given week in 2014–2016 suggest that an average of 13575 (SD 4994) cis female sex workers were active in Canada. Monthly, the average was 21344 (SD 5028). The Canadian adult female population between 20 and 49 years of age in 2016 was 7205721 [40]. Thus the weekly average would represent 0.2% (one in 531), the monthly average 0.3% (one in 338) and the two-year estimate 2.0% (one in 51). Most workers self-identified as white (53%) based on Site 3 data (see S3 Appendix).
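The proportions above follow directly from the reported counts. A minimal Python sketch of the arithmetic, using only figures quoted in this section (the variable and function names are illustrative, not from the study):

```python
# Figures quoted in the text; names are illustrative only.
FEMALE_POP_20_49 = 7_205_721  # 2016 Canadian female population aged 20-49 [40]

estimates = {
    "weekly average": 13_575,
    "monthly average": 21_344,
    "two-year total": 141_669,
}

def share(workers, population):
    """Return (percentage of the population, N in 'one in N')."""
    return 100 * workers / population, round(population / workers)

results = {label: share(n, FEMALE_POP_20_49) for label, n in estimates.items()}
for label, (pct, one_in) in results.items():
    print(f"{label}: {pct:.1f}% (one in {one_in})")
```

Because each estimate is divided by the same denominator, the roughly tenfold gap between the weekly and two-year figures carries straight through to the population shares.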

Advertisers estimated from image reuse

The estimated number of advertisers for 2014–2016 based on image reuse was calculated to be 69562 (95% CI 69085–70047) from Eq 7, where Unique images was 1640209, Av images per advertiser was 16 (SD 87), R_imagereuse was 1.4 (SD 1.3) and P(a relevant) was measured to be 0.95. This is 92% (90%–94%) of the estimated 75600 (95% CI 74087–77219) advertisers for this period from Eq 1.
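Eq 7 is not restated in this section. The quoted figure can nevertheless be reproduced if the equation is assumed to divide the unique image count by the product of images per advertiser and the image reuse ratio, then scale by the relevant proportion. That form is an inference from the numbers, and the inputs below are the rounded values given in the text, so the sketch is a plausibility check rather than the study's calculation:

```python
# Assumed form of Eq 7, inferred from the quoted values (an assumption):
#   advertisers = unique_images / (images_per_advertiser * image_reuse) * p_relevant
unique_images = 1_640_209
images_per_advertiser = 16   # rounded mean quoted in the text
image_reuse = 1.4            # rounded mean quoted in the text
p_relevant = 0.95

image_based = unique_images / (images_per_advertiser * image_reuse) * p_relevant
contact_based = 75_600       # Eq 1 estimate for the same period

print(round(image_based))
print(f"{image_based / contact_based:.0%} of the Eq 1 estimate")
```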

Cluster analysis

Ads with multiple contacts were treated as single advertisers using clustering as described above. Clusters could represent individuals or groups of workers. There was a weekly average of 1013 clusters (SD 242), increasing to a monthly average of 1614 (SD 172). Over the 2014–2016 data collection period 12886 clusters were identified. Where clusters occurred they were typically not large. For the entire 2014–2016 period the median number of contacts represented by a cluster was 2 (IQR 2–3, mean 3.0, SD 7.5).
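The clustering method is described earlier in the paper and is not restated here. As a rough illustration only, one common way to merge co-occurring contacts is to link contacts that appear in the same ad and take the connected components. The sketch below assumes that approach, uses made-up contacts, and assumes that only groups of two or more contacts count as clusters; it should not be read as the author's algorithm.

```python
from collections import defaultdict
from statistics import median

# Hypothetical ads: each set holds the contacts found in one ad.
ads = [
    {"555-0101", "a@example.com"},
    {"a@example.com", "555-0102"},
    {"555-0199"},
    {"555-0150", "555-0151"},
]

parent = {}

def find(x):
    """Return the representative contact for x's group (union-find)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps lookups short
        x = parent[x]
    return x

def union(a, b):
    """Merge the groups containing contacts a and b."""
    ra, rb = find(a), find(b)
    if ra != rb:
        parent[rb] = ra

# Contacts that co-occur in an ad are linked into the same group.
for contacts in ads:
    first, *rest = sorted(contacts)
    find(first)
    for other in rest:
        union(first, other)

groups = defaultdict(set)
for contact in list(parent):
    groups[find(contact)].add(contact)

# Assumption: a "cluster" is a group with more than one contact.
clusters = [g for g in groups.values() if len(g) > 1]
sizes = sorted(len(c) for c in clusters)
print(len(clusters), "clusters; median contacts per cluster:", median(sizes))
```

Linking is transitive, so two ads that never share a page can still end up in one cluster through a shared contact, which is how very large clusters can form over a long collection period.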

Individual versus collective advertisers

In 2014–2016, the unscaled number of collective advertisers was significantly smaller than the number of individual advertisers: 80040 versus 87499 respectively (collective proportion 47.8%, CI 47.5–48.0%, p<0.001). However, the unscaled name counts suggest that the number of workers who work collectively is significantly larger: 217847 collective versus 87499 individual (collective proportion 71.3%, CI 71.2–71.5%, p<0.001). The R prop.test function [37] was used to compare proportions.
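The study used R's prop.test. As a cross-check in Python, the sketch below computes a Wilson score interval for each collective proportion from the counts quoted above. R's prop.test applies a continuity correction by default, which this sketch omits; with samples this large the difference should be negligible. The function name is illustrative.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (no continuity correction)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half

individual = 87_499  # individual advertisers, each counted as one name
for label, collective in [("advertisers", 80_040), ("names", 217_847)]:
    p, lo, hi = wilson_interval(collective, collective + individual)
    print(f"collective share of {label}: {p:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```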

Overall there was a median of 1 name per advertiser (IQR 1–2, average 1.8, SD 4.4). Collective advertisers had a median of 2 names per advertiser (IQR 1–3, average 2.4, SD 6.3). Most advertisers were associated with two names or fewer (unscaled N = 147434, 88%).

Comparing the results with other data sources

Comparison with a New Zealand population estimate

Proportional to population, the estimated number of workers in this study is similar to estimates described in the New Zealand PLRC report from 2008 [8]. The New Zealand study did not distinguish workers based on gender. The male and female 20 to 49 year old population from the 2006 New Zealand census was 1777770 [41]. Of the 2396 workers counted in the report in February to March 2006, 2143 were off-street workers, representing 121 workers per 100000. The researchers took great care to ensure that workers represented in the study were in fact active at the time they were counted.

The weekly Canadian average reported here of 16846 (SD 5858) represents 122 (SD 43) workers per 100000 for the 20 to 49 year old population of 13761540 in 2016 [36]. A proportions z-test indicated no significant difference (p = 0.5) between the Canadian and New Zealand population estimates. While these proportions are tantalizingly close, we should be cautious in their interpretation. The time distance between the New Zealand and Canadian estimates may have an effect on population size. Both studies likely represent the majority of sex workers at the time they were conducted, however, neither this study nor the New Zealand study should be interpreted as an exhaustive census.
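The per-100000 rates and the test result can be checked with a pooled two-proportion z-test. The sketch below is one plausible reading of the "proportions z-test" mentioned above: it treats the Canadian weekly average as if it were a count, which is an assumption, and the function name is illustrative.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns both rates, z and the two-sided p value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return p1, p2, z, p_value

# Canada: weekly average of workers, 2016 population aged 20-49 [36]
# New Zealand: off-street workers, 2006 census population aged 20-49 [41]
rate_ca, rate_nz, z, p_value = two_proportion_z(16_846, 13_761_540, 2_143, 1_777_770)

print(f"Canada: {rate_ca * 1e5:.0f} per 100000")
print(f"New Zealand: {rate_nz * 1e5:.0f} per 100000")
print(f"z = {z:.2f}, p = {p_value:.2f}")
```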

Comparison with non-classified advertising

Advertisers who did not use classified advertising but nevertheless had an online presence were rare. Advertisers who used non-classified websites numbered 1550 or 1.0% of all advertisers (unscaled count). Of these, 1028 or 0.7% of all advertisers (unscaled), were found to exclusively use non-classified advertising sites.

Comparison with Argento et al.

Argento et al. [15] provide a detailed demographic breakdown of a large sample of participants (N = 852). Study participation was limited to cis and trans women from the Vancouver, BC area. Sexual minority participants were much more prevalent in Argento et al. at 36.3% (N = 309) compared to the 1.7% trans women represented in the scaled population counts found here. Argento et al. did not distinguish between different types of gender nonconformity. Indigenous participants were also much more prevalent at 38.8% (N = 331) compared to the 1.3% represented in the Site 3 data (see S3 Appendix). Primarily street-involved participants represented 50.7% (N = 432) of the sample. Of the 420 off-street workers, the R prop.test function [37] showed that, similar to the advertising data, significantly more worked in a collective context: 253 or 60.2% (CI 55.4–64.9%, p<0.001).

Comparison with Site 3 data from 2021–2022

After 2016, Site 3 appears to have become the dominant advertising site for sex workers in Canada. Data collection was resumed using the same methods described above in October 2021 and is ongoing. The raw count of Site 3 advertisers, directly measured using chat id metadata from November 1, 2021 to July 31, 2022 was 48832. The advertisers used an average of 1.1 chat ids (SD 0.6) based on a sample of 2605 advertisers who used single phone numbers. The proportion of relevant advertisers was 0.68 (2707 from a sample of 4000 advertisers) based on criteria outlined in supplemental materials S1 File. Thus, the actual number of advertisers for this period is estimated to be 30042.8 (95% CI 30042.5–30043.0).

Between November 1, 2014 and August 1, 2015 there were an estimated 33145 advertisers (95% CI 32670–33654). Between November 1, 2015 and August 1, 2016 the number of advertisers grew to 47361 (95% CI 46766–47990). The 2021–2022 estimate is 90.6% and 63.4% of these two 2014–2016 estimates respectively. The COVID-19 pandemic has most likely affected the number of active contact sex workers in 2021–2022.
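The Site 3 estimate and the two percentages can be reproduced from the quoted figures if the raw chat id count is divided by the average number of chat ids per advertiser and then multiplied by the relevant proportion. That scaling form is inferred from the numbers rather than stated in this section, so the sketch below is a check under that assumption:

```python
# Quoted inputs for November 1, 2021 to July 31, 2022 (Site 3).
raw_chat_ids = 48_832
chat_ids_per_advertiser = 1.1      # rounded mean quoted in the text
p_relevant = 2707 / 4000           # relevant advertisers in the sample

# Assumed scaling: correct for multiple chat ids, then keep the relevant share.
estimate_2021 = raw_chat_ids / chat_ids_per_advertiser * p_relevant

# Estimates for the matching November-to-August windows in 2014-2016.
earlier = {"2014-2015": 33_145, "2015-2016": 47_361}
ratios = {period: estimate_2021 / n for period, n in earlier.items()}

print(f"2021-2022 estimate: {estimate_2021:.1f}")
for period, ratio in ratios.items():
    print(f"relative to {period}: {ratio:.1%}")
```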

Discussion

The purpose of this study was to gain insight into how sex worker populations change over time. Ad web pages from sites commonly used by sex workers were downloaded from November 1, 2014 to December 31, 2016. Analysis of primary contacts in the ads identified advertisers and analysis of first names in ads identified individual workers associated with these advertisers. Advertisers were active for a mean of 73.3 days (SD 151.8, median 14, IQR 1–58) and 88% represented two workers or fewer. Population estimates generated weekly, monthly and over two years showed that the number of advertisers and workers increased significantly as the length of the analysis period increased, providing evidence that workers frequently enter and exit the industry. Comparisons with other data sources suggest that the metadata extraction and scaling techniques used are plausible on both short and long time scales. However, the demographic stratification represented in the advertising data does not appear to match that found in a recent qualitative study [15], suggesting that non-random sampling strategies used in qualitative research may not accurately reflect the greater Canadian sex worker population.

The effect of time

The element of time turned out to be crucial for interpreting population estimates. Even after controlling for changing contact information and co-occurring contacts, recently active workers were part of a much larger cohort of workers only intermittently active. The average estimated number of workers active week to week represented one tenth the estimated population for the whole two year period. This order of magnitude difference may seem surprising. However, in the context of the Canadian economy it is plausible.

If financial stress is a motivation for entry into sex work, it is relevant that a very large number of women live in poverty in Canada: in 2015 the proportion was estimated to be 14.7% [42]. Sex work, where it intersects with poverty, may represent one of many informal survival strategies [12,43–45]. The brevity of involvement for most advertisers indicates sex work was likely not permanent employment for most workers. However, it would be a mistake to assume that all sex workers are economically disadvantaged. Prior research shows that there is wide variation in what sex workers earn [12,43,44] (see also S4 Appendix). Other types of informal workers frequently do not require their informal income to survive [46,47] and this may be true for many sex workers. The demographic patterns described in this study lend support for the view, also extensively reported in the literature, that Canadian sex workers exercise a substantial degree of agency in how they engage with the industry even in the face of structural obstacles [16,18,20,48–51].

Research sponsored by the New Zealand Prostitution Law Reform Committee (PLRC) [7,12], some of which is described above, further illustrates how population counts can be misinterpreted when the dimension of time is not taken into account. Four estimates were generated as part of this research. Two were generated before the enactment of the Prostitution Reform Act of 2003 (PRA) based on NGO and police statistics: 8000 workers from the New Zealand Prostitutes Collective (NZPC) and 5932 workers based on police records. Two were generated after based on a direct enumeration of workers: 2396 in February-March 2006 and 2332 in June-October 2007 [8,12]. The pre-2003 counts, based on cumulative statistics, overestimated the number of active workers because many workers were included who had already left the industry.

Implications for research and policy

How representative are the samples used in the large body of Canadian qualitative research? Benoit and Shaver [52] note that this is difficult to determine when the characteristics of the greater population are unknown. Sample sizes can be small: 10 samples used in 13 studies [14,15,19,21–24,43,44,53–55] ranged from 21 to 852 (median 206.5, IQR 65.25–461.75, mean 288, SD 291.3). Based on the scaled population estimates, a minimum random sample of 206 cis female workers would be required to achieve a 5% confidence interval at a 95% confidence level [56]. Only three studies [14,15,23] had more than the minimum cis female participants. However, none of these studies used random sampling. Adding to the problem, Canadian research often shares participants between studies. For example, An Evaluation of Sex Workers Health Access (AESHA), a growing cohort of cis and trans female workers from Vancouver, BC, is shared by three studies [14,15,23]. Another sample of 218 participants from six Canadian municipalities is shared by five studies [17,19,21,53,54] including a working paper from 2014 that is the source of the often quoted statistic that Canadian sex workers are in the industry for an average of 10 years [54].

Non-random samples can be misleading even when large. For example, Argento et al. [15], which had the largest sample overall (N = 852), describe their participants as “highlighting the overrepresentation of gender and sexual minorities and Indigenous women among sex workers in Vancouver”. This statement is not consistent with the demographic makeup of the much larger group of online advertisers. For example, the proportion of Indigenous women in the Canadian female population is 4.8% based on the 2016 census [57]. However, the proportion of advertisers who self-identified as Indigenous in 2014–2016 was 1.3% (see S3 Appendix), indicating that Indigenous people are likely underrepresented in the industry.

In contrast, trans people are likely to be overrepresented but not to the extent indicated by Canadian research. A census test conducted in 2019 by Statistics Canada [58] found the proportion of trans people at that time to be 0.35%: far less than the 1.7% estimated here. In contrast, the AESHA cohorts had very high proportions of LGBTQ2S participants (25.3% to 36.3%) but made no distinction between trans and other gender nonconformity making direct comparison difficult. Studies that explicitly identified trans participants [17,43,44,55], while proportionally fewer than the AESHA cohorts, also had participation much greater than 1.7% (mean 7.8%, SD 5.15%). Proportions of men in Canadian research (mean 17.2%, SD 0.6%) were also higher than the 4.7% of workers estimated here [17,43,44].

Canadian studies often track workers over multiple years [14,15,23,43] potentially giving the impression that the majority will require help in exiting the industry. However, prior research shows that many workers find such offers of assistance intrusive [8,59]. This study supports this perspective. Advertising for more than one year was very uncommon and, as described in S4 Appendix, long-term individual advertisers routinely take breaks from advertising.

If the majority of sex workers only have sporadic involvement in the industry, what are they likely to need from policy? Because these workers are not well represented in research, this question is difficult to answer. More research specifically targeted at these workers is needed to determine what may be appropriate.

Directions for future research

The clustering algorithm used to disambiguate co-occurring contacts revealed that, over time, workers can participate in large scale social networks. The largest cluster found covered almost the entire country and was estimated to represent over 800 people; how pervasive is this type of ad hoc collective activity? Secondly, an analysis of advertiser restrictions showed that 16.89% of advertisers restrict clients based on skin color; what is motivating this choice?

Online classified ad data can be used as the basis for integrated qualitative research as metadata, once identified, can be used to stratify samples for further investigation. Parallelizing metadata extraction with data collection is important as advertisers are typically transient and timely contact is essential. Even a cursory review of existing research shows that integration of these research streams is needed to ensure that qualitative samples are representative and conversely to ensure that archival metadata is accurate.

Limitations

The population estimates presented in this study should not be interpreted as an exhaustive census of sex workers in Canada during 2014–2016. Instead the scaled estimates, extracted from the very large selection of classified ads collected, most likely represent a lower bound on the actual population.

It is possible that the collected ads may be an incomplete set. The list of sites used as starting points may not have been complete and metadata used to quantify advertisers and workers was not always available. We should also remember that, while other venues became less prominent during the study period [60], not all workers use online advertising. Furthermore, not all online advertising was usable for population counts. Issues with other online sources included not knowing if co-occurring contacts were from related advertisers and whether advertisers were active at the time the data was collected.

Estimates of contact change based on images have limitations: advertisers changing both contacts and images at the same time could be missed and advertisers that reuse images from other unrelated advertisers could be erroneously included. It is not possible to mitigate these sources of error from archival data alone. Other hard to detect sources of error were: advertisers using multiple identities simultaneously and workers changing work contexts without leaving the industry. Relevant qualitative research is needed to help resolve these questions.

Conclusions

This study is believed to be the first to consider sex worker population in the context of long term advertising behavior in an industrialized democracy using online archival sources. The data presented suggests that most workers are likely only active for brief periods of time. While there may be more than one reason for this pattern we must consider that the majority of workers in fact exercise agency and have autonomy in how they practice sex work. The fact that most workers were advertising in a collective context does not diminish this possibility as in general these appear to be small ad hoc collectives similar to the Small Owner Operated Brothel (SOOB) model described in the New Zealand research [12].

Intermittently active sex workers considered over long periods of time represent a much larger population than what one would expect from worker estimates from shorter time periods. Demographic studies must take into account the duration of the data collection period and how long any individual worker was active in that period. Similarly, to avoid misleading results, the demographic composition of non-random samples should be situated in the spectrum of workers actually active for the time period being considered.

The online advertising space provides opportunities to engage with sex workers in ways that may not have been possible before. The challenge for researchers, policy makers and advocacy groups will be to ensure that underrepresented groups are included in any discussion on sex workers’ future conditions of work.

Supporting information

S1 Appendix. Image validation.

(DOCX)

S2 Appendix. Comparison with the SPACES study.

(DOCX)

S3 Appendix. Multiple populations.

(DOCX)

S4 Appendix. Factors affecting advertiser longevity.

(DOCX)

S1 File. Criteria for deciding relevance.

(XLSX)

S2 File. Detailed spreadsheet with breakout of days online by demographic category.

(XLSX)

S3 File. Detailed spreadsheet with breakout of number of ads by demographic category.

(XLSX)

Data Availability

Data may be found at https://osf.io/mebvp/.

Funding Statement

The authors received no specific funding for this work.

Decision Letter 0

Hamid Sharifi

18 May 2022

PONE-D-22-00964

The Silent Majority: Evidence for Part Time Sex Work in Canada

PLOS ONE

Dear Dr. Kennedy,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Thanks so much for submitting your work to PLOS ONE.

The manuscript was evaluated by two reviewers and you can find the comments below.

Please provide these items in the revised version.

- Please provide the aim of the study clearly in the abstract and full text. The purpose should be written at the end of the introduction of the abstract and full text.

- The conclusion should be in the context of the results.

- The introduction is too long and could be shortened.

- Please provide the ethical consideration and ethics code if available.

- The number of Tables and Figures is too high. If possible, please merge some tables; you can also move some of the Tables or Figures to the appendix.

- Please provide the limitations of the study to the discussion.

Please submit your revised manuscript by Jul 02 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Hamid Sharifi

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In your Methods section, please include additional information about your dataset and ensure that you have included a statement specifying whether the collection and analysis method complied with the terms and conditions for the source of the data.

3. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

4. Please amend your list of authors on the manuscript to ensure that each author is linked to an affiliation. Authors’ affiliations should reflect the institution where the work was done (if authors moved subsequently, you can also list the new affiliation stating “current affiliation:….” as necessary).

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 1. Summary of the research

The authors conducted a study on the dynamics of contact sex work in Canada, how the population changed over time and representation of sex workers in policy debate and research used for policy. They found the population of sex workers to be dynamic, with most sex workers having only brief involvement, while more of them are inactive than those active. The authors concluded that as policy in Canada is based on qualitative research using small, self-selecting samples, the perspectives of these short-term workers, who are the silent majority, may not be adequately represented.

The authors show that though most previous studies were large scale using multiple techniques, they generated population estimates at a single point in time, focusing on female sex workers as subjects. Few studies tried to directly estimate sex work population demographics from publicly available internet advertising data. Because of the methodologies used by those studies, the estimates are difficult to confirm and tend to overestimate the population of sex workers. Using online data not only includes populations enumerated using the previous methodologies but also encompasses the more transient sex workers usually left out, who advertise online. This study therefore adds value to the current body of knowledge as it improves accuracy and completeness of other methodologies to shed more light on the part time and transient sex workers in Canada who are actually the majority.

Though there are issues with the study methodology that need attention, the authors did a good job of describing their methodology and how the data was analyzed, including parsing metadata, contacts and images, and analyzing other variables to describe sex worker advertising behavior. The methods section however needs to be rewritten to ensure it flows better and addresses flaws highlighted below.

The results are also presented in detail though some tables and description of the results could be improved. The authors were able to bring out the element of time which turned out to be key to interpreting population estimates. Limitations were well described with follow up actions.

My overall recommendation is that the study needs major revisions, especially the methods section, with some reanalysis once some of the methodological issues are addressed.

2. Examples and evidence

2.1. Major issues

2.1.1. The abstract introduction does not fully summarize the context. From the third sentence (line 8 to line 11), the source of the information is unclear. If it is from literature, the reference should be provided. In the event it is from the current study, these would be results which should be better moved to the results section.

2.1.2. The Methods section needs a lot of work as it does not flow well, does not describe the methodology well and includes results (e.g. lines 112 – 114). I think that it needs some major reorganization and rewrite. I suggest that lines 112-114 be moved to the results section. The methods could then start with a description of the identification of the web sites to be used as the source of data, then explaining why the 6 sites were selected (part of this is available in lines 116 – 120).

2.1.3. In paragraph 2, starting on line 121, the authors write that the downloaded ads should be considered as a sample. There is need to explore and explain to what extent the downloaded ads are representative of all ads on the site. Without this being addressed, it is not clear how the authors can be sure that the findings of the study can be generalizable.

2.1.4. In the description of the contents of table 2 (end of line 127 to start of line 129), the reasons for excluding some sections in sites 2 and 3, other than to simplify analysis and reduce irrelevant posts (though it’s not clear to what extent) are not clear to me. I believe this may actually have introduced sampling bias. I would therefore suggest that the excluded section be reintroduced, and analysis done again.

2.1.5. The authors explain in line 135-136 that because the study uses publicly available data, there is no need for ethical approval. In the next sentence starting in line 136, they however report that there is personal identifying information in the data which they had to remove. I agree that it is not necessary to get consent from the participants, which would otherwise be very difficult if not impossible. However, I believe that because of the presence of personal identifying information in the data, the authors should seek ethical approval for the study.

2.1.6. The first section under results, starting from the sentence in line 227 up to line 237, seems to be describing methods. I would suggest that this information be moved to the methods section.

2.1.7. With regards to the tables presented in the manuscript, in general, I believe that there are too many and there are a number of issues that need to be addressed. I will give examples of issues with some of the tables and actions that can be considered. Table 1 does not contain adequate information to justify its inclusion, while other tables, like table 8, are too complex and will need to either be simplified, be converted to a graphic presentation or be removed altogether and the data described in the text. Some tables are too busy and difficult to read (e.g. table 7 and especially Table 10), such that it is difficult to fully understand the message they are meant to convey. The categories may need to be regrouped and/or reorganized. In Table 10, including disparate groups like individual, male, French, escort, etc, in the same column may be comparing dissimilar variables. On the other hand, some tables (e.g. Table 1) have very scanty information to the extent that it is not clear what they are meant to communicate. Some, like table 15, have excessively long titles with too much information which could preferably be moved to the text, while others like Table 1 are too brief and need to be made more descriptive. Table 12 seems to have a wrong title while others have headers that need review and revision, e.g. table 13, in which the last column header is called “names”, but the information contained in the column is actually numbers. For table 16, the reference in the text (line 348) seems to be referring to another table and needs correction. Finally, some tables are broadly referred to in the text, but their contents are not summarized in the text, for example tables 1, 3, 5, 7 and 8.

2.1.8. The evidence on how the industry regulates itself through economics of supply and demand (presented in line 425-478) from table 16 may be more appropriately presented in the results section. The interpretation of the evidence could however still be presented in the discussion section.

2.1.9. The change mentioned in the sentence in lines 249-250 that reads “A change on Site 1 which concealed contact information resulted in few contacts being extracted after June 2015” may affect the findings of the change in sex worker population dynamics over time. The effect of this change may need to be taken into account in the analysis and interpretation of relevant results.

2.1.10. In the sub-section on the effect of time (line 427-434), many of the figures presented were not described in the results section. They therefore seem misplaced and should be moved to the results. Their interpretation is what may be more relevant to be included in the discussion section.

2.1.11. In the discussion section, with regards to the implication of policy (ln 465 – 515), I feel that there is inadequate data presented. More analysis could be done from the data used to construct table 16 and more data could be presented in the results section to give a good background for discussion. With little data presented on the implications of policy, I feel that there is a bit of overreach with regards to the conclusions and it will help to present a bit more data to adequately support the conclusions.

2.2. Minor issues

2.2.1. I feel that the title does not adequately convey the key features of the article, though it does spark interest. I suggest that “part time” in the title be replaced with something about the transient nature of sex work in Canada, and other components such as online advertising, long term behavior of sex workers (since it is the first study to consider the long-term behavior of sex workers based on online advertising).

2.2.2. The sentence in line 114 starting with “Table 1 outlines...” is very vague and needs to highlight what was collected. The table itself may need to be moved to the results section.

2.2.3. The statement in line 133-134 that reads “Advertisers with no valid ads had all their ads checked for relevance” is not clear to me. My assumption is that if they had no valid ads, they should be excluded. It can be deleted.

2.2.4. Table 8 and Table 10 look very busy and I would suggest reducing the number of columns by putting the standard deviation in brackets in the same cell as the average days. The number of decimal points could also be harmonized and reduced possibly to 1 decimal point. The median IQR could also be added, and be put in the same cell in brackets after the relevant median day values.

2.2.5. Fig 1 is mentioned in lines 245-247, but the summary of the findings is not included in the text, and the text needs to be updated accordingly.

2.2.6. Under limitations (line 411), the relevance of the statement “At least four additional studies could be written based on this archival data” is not clear to me. I would suggest that it be deleted.

2.2.7. In the conclusion, with regards to the sentence starting in line 517, it is not clear what techniques are being referred to. It may be better for the authors to expound on what they are referring to.

Reviewer #2: It is a very important study, but we need to be sure that the people included are sex workers, so I think that the selection process should have a peer review to be sure that the same results are reached.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr Brian Chirombo (MBChB, MPH)

Reviewer #2: Yes: Edgard J. Narvaez D.

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Nov 15;17(11):e0277550. doi: 10.1371/journal.pone.0277550.r002

Author response to Decision Letter 0


2 Sep 2022

PONE-D-22-00964

The Silent Majority: Evidence for Part Time Sex Work in Canada

PLOS ONE

Dear Dr. Kennedy,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Thanks so much for submitting your work to PLOS ONE.

The manuscript was evaluated by two reviewers and you can find the comments below.

Please provide these items in the revised version.

- Please provide the aim of the study clearly in the abstract and full text. The purpose should be written at the end of the introduction of the abstract and full text.

Added the following to the abstract: “The purpose of this study is to consider how time affects the population dynamics of contact sex workers in Canada using publicly available internet advertising data collected over multiple years.”

- The conclusion should be in the context of the results.

Moved some material from the conclusion to the discussion.

- Introduction is too long and could be shortened.

Shortened the introduction to fewer than 600 words and moved some material to the discussion.

- Please provide the ethical consideration and ethics code if available.

added the following ethics statement: “All source data used in this study consisted of publicly available data at the time it was collected and was collected in accordance with the policies of the sites in effect at the time. The methods used are conformant with the ethical standards of the Canadian Sociology Association (section 4.10 II) and the American Sociology Association (section 10.5 c) [34,35]. As the replicability of the main results of this paper is important, a data set is provided as part of the supporting information along with the code used to process it. However, in order to protect the safety and privacy of advertisers and third parties, all identifying information has been removed including the names of the source websites.”

- The number of tables and figures is too high. If possible, please merge some tables; you can also move some of the tables or figures to the appendix.

Reduced the number of tables in the main paper to 4.

Removed two figures but added 3 figures (for a total of 4) to the methods to better illustrate the data processing pipeline, how contacts were handled and how the clustering algorithm for contacts works.

- Please provide the limitations of the study to the discussion.

Moved the limitations to the end of the discussion and expanded them.

Please submit your revised manuscript by Jul 02 2022 Sep 03 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager. and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?.

We look forward to receiving your revised manuscript.

Kind regards,

Hamid Sharifi

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/ and

https://journals.plos.org/

2. In your Methods section, please include additional information about your dataset and ensure that you have included a statement specifying whether the collection and analysis method complied with the terms and conditions for the source of the data.

This has been done.

3. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/.

This is no longer the case. An anonymized data set is now linked in the supplemental materials.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

This is no longer an issue.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/.

Data is available here: https://osf.io/mebvp/

We will update your Data Availability statement on your behalf to reflect the information you provide.

4. Please amend your list of authors on the manuscript to ensure that each author is linked to an affiliation. Authors’ affiliations should reflect the institution where the work was done (if authors moved subsequently, you can also list the new affiliation stating “current affiliation:….” as necessary).

Data was shared with the University of British Columbia as part of the SPACES project as described in the paper with no direct involvement in the project. If you could provide some clarification on how to handle this situation that would be much appreciated.

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: I Don't Know

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 1. Summary of the research

The authors conducted a study on the dynamics of contact sex work in Canada, how the population changed over time and representation of sex workers in policy debate and research used for policy. They found the population of sex workers to be dynamic, with most sex workers having only brief involvement, while more of them are inactive than those active. The authors concluded that as policy in Canada is based on qualitative research using small, self-selecting samples, the perspectives of these short-term workers, who are the silent majority, may not be adequately represented.

The authors show that though most previous studies were large scale using multiple techniques, they generated population estimates at a single point in time, focusing on female sex workers as subjects. Few studies tried to directly estimate sex work population demographics from publicly available internet advertising data. Because of the methodologies used by those studies, the estimates are difficult to confirm and tend to overestimate the population of sex workers. Using online data not only includes populations enumerated using the previous methodologies but also encompasses the more transient sex workers usually left out, who advertise online. This study therefore adds value to the current body of knowledge as it improves accuracy and completeness of other methodologies to shed more light on the part time and transient sex workers in Canada who are actually the majority.

Though there are issues with the study methodology that need attention, the authors did a good job of describing their methodology and how the data was analyzed, including parsing metadata, contacts and images, and analyzing other variables to describe sex worker advertising behavior. The methods section however needs to be rewritten to ensure it flows better and addresses flaws highlighted below.

The results are also presented in detail though some tables and description of the results could be improved. The authors were able to bring out the element of time which turned out to be key to interpreting population estimates. Limitations were well described with follow up actions.

My overall recommendation is that the study needs major revisions, especially the methods section, with some reanalysis once some of the methodological issues are addressed.

This has been done. The data was reanalyzed as recommended. Comparisons were also completed to see how population estimates derived from advertising relate to other estimates reported in the literature.

2. Examples and evidence

2.1. Major issues

2.1.1. The abstract introduction does not fully summarize the context. From the third sentence (line 8 to line 11), the source of the information is unclear. If it is from literature, the reference should be provided. In the event it is from the current study, these would be results which should be better moved to the results section.

Lines 8-11 have been removed and the purpose of the study has been added instead.

2.1.2. The Methods section needs a lot of work as it does not flow well, does not describe the methodology well and includes results (e.g. lines 112 – 114). I think that it needs some major reorganization and rewrite. I suggest that lines 112-114 be moved to the results section. The methods could then start with a description of the identification of the web sites to be used as the source of data, then explaining why the 6 sites were selected (part of this is available in lines 116 – 120).

This section has been extensively revised. Some sections have been moved to results.

The sites should be considered participants in the study along with the advertisers and the individual people they represent. The sites have been de-identified because of the legal situation in Canada vis-a-vis advertising for sex work at the moment (the Protection of Communities and Exploited Persons Act).

Added flow diagrams to show the data gathering and extraction process which is intended to make the processing steps clearer.

Added a section on sources of error and how they were mitigated.

Added a section on techniques to validate the collected data by comparing it with other research results as well as techniques to check internal validity.

2.1.3. In paragraph 2, starting on line 121, the authors write that the downloaded ads should be considered as a sample. There is need to explore and explain to what extent the downloaded ads are representative of all ads on the site. Without this being addressed, it is not clear how the authors can be sure that the findings of the study can be generalizable.

Explained in more detail how often ads were searched and indicated that all new ads on the sites were downloaded at least every 15 minutes. This download frequency was in fact necessary to keep up with the large volume of ads published.

Also included an analysis of data that was not included in the analysis as a form of comparison.

Note that even if the set of ads is not complete we are not in danger of overcounting the potential number of workers - which turns out to be quite large in any case.

2.1.4. In the description of the contents of table 2 (end of line 127 to start of line 129), the reasons for excluding some sections in sites 2 and 3, other than to simplify analysis and reduce irrelevant posts (though it’s not clear to what extent) are not clear to me. I believe this may actually have introduced sampling bias. I would therefore suggest that the excluded section be reintroduced, and analysis done again.

This has been removed and a reanalysis of the complete data set was done instead. The results are basically unchanged.

2.1.5. The authors explain in line 135-136 that because the study uses publicly available data, there is no need for ethical approval. In the next sentence starting in line 136, they however report that there is personal identifying information in the data which they had to remove. I agree that it is not necessary to get consent from the participants, which would otherwise be very difficult if not impossible. However, I believe that because of the presence of personal identifying information in the data, the authors should seek ethical approval for the study.

External ethics approval is not possible at the moment as the study is not being conducted in an institutional context. Great care has been taken to ensure that participants’ privacy is respected and that ethics guidelines have been followed.

The SPACES study received ethics approval from the University of British Columbia to use the data collected. The SPACES investigators did not do any data collection but instead used a subset of the data described in this study (see Appendix D). All data was from publicly available sources at the time of collection, and although a reasonable expectation of privacy may not apply in this case as persons observed were advertising to the public, layers of anonymization were enforced in the publicly released data to discourage data linkage.

Updated the ethics statement (see above). Data was collected in a way that is conformant to the ethics guidelines of the American Sociology Association and the Canadian Sociology Association.

2.1.6. The first section under results, starting from the sentence in line 227 up to line 237, seems to be describing methods. I would suggest that this information be moved to the methods section.

This has been removed as it exists already in other parts of the paper.

2.1.7. With regard to the tables presented in the manuscript, in general, I believe that there are too many of them and that there are a number of issues that need to be addressed. I will give examples of issues with some of the tables and actions that can be considered. Table 1 does not contain adequate information to justify its inclusion, while other tables, like Table 8, are too complex and will need to either be simplified, be converted to a graphic presentation or be removed altogether and the data described in the text. Some tables are too busy and difficult to read (e.g. Table 7 and especially Table 10), such that it is difficult to fully understand the message they are meant to convey. The categories may need to be regrouped and/or reorganized. In Table 10, including disparate groups like individual, male, French, escort, etc., in the same column may be comparing dissimilar variables. On the other hand, some tables (e.g. Table 1) have very scanty information to the extent that it is not clear what they are meant to communicate. Some, like Table 15, have excessively long titles with too much information which could preferably be moved to the text, while others like Table 1 are too brief and need to be made more descriptive. Table 12 seems to have a wrong title while others have headers that need review and revision, e.g. Table 13, in which the last column header is called “names”, but the information contained in the column is actually numbers. For Table 16, the reference in the text (line 348) seems to be referring to another table and needs correction. Finally, some tables are broadly referred to in the text, but their contents are not summarized in the text, for example Tables 1, 3, 5, 7 and 8.

Removed most of the tables from the main text and updated the text to describe the tables that remain.

Created appendices for the more detailed information on how demographic variables affect advertising behavior. Included the original Excel spreadsheets for the two tables summarizing this behavior as supplemental materials, and added IQR to these spreadsheets.

2.1.8. The evidence on how the industry regulates itself through economics of supply and demand (presented in line 425-478) from table 16 may be more appropriately presented in the results section. The interpretation of the evidence could however still be presented in the discussion section.

Revised these paragraphs and, after reviewing the correlation between the actual number of charges and the estimated number of advertisers (in contrast to the figures provided, which use per capita measurements), removed the province-to-province comparison. Generally, the number of charges is correlated with the number of advertisers (Pearson correlation 0.84, p < 0.001). Why this is the case is a topic for future research, as under the PCEPA workers cannot be charged with an offense.
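For readers who want to see what the correlation quoted above measures, the following is a minimal sketch of a Pearson correlation between two sets of counts. The function name `pearson_r` and both lists are hypothetical placeholders chosen for illustration; they are not the study's data and will not reproduce the reported value of 0.84.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical placeholder counts per region (NOT the study's data).
advertisers_by_region = [1200, 450, 3100, 800, 5200]
charges_by_region = [30, 12, 70, 25, 90]

r = pearson_r(advertisers_by_region, charges_by_region)
print(f"Pearson r = {r:.2f}")
```

Because raw counts in larger regions tend to be larger for both variables, a correlation of this kind mostly reflects scale, which is one reason per capita figures and raw counts can tell different stories.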

2.1.9. The change mentioned in the sentence in lines 249-250 that reads “A change on Site 1 which concealed contact information resulted in few contacts being extracted after June 2015” may affect the findings of the change in sex worker population dynamics over time. The effect of this change may need to be taken into account in the analysis and interpretation of relevant results.

Despite missing some data, we still see an increasing population of workers as time scale increases while weekly spot estimates remain in the range of 15-20k workers. The fact that we know some data is missing will not change this effect as the majority of the data was intact.
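The effect described in this response can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the paper's estimation procedure: it uses a synthetic population in which one new advertiser appears each day and stays active for 14 days (the `Advertiser` class, the `active_in_window` helper and all numbers are hypothetical). It shows how spot counts stay flat while the count of distinct advertisers grows with the length of the observation window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertiser:
    first_day: int  # first day an ad was seen (0-based)
    last_day: int   # last day an ad was seen (inclusive)


def active_in_window(advertisers, start, end):
    """Count advertisers with at least one active day in [start, end]."""
    return sum(1 for a in advertisers if a.first_day <= end and a.last_day >= start)


# Synthetic toy population: one new advertiser per day, each active for 14 days.
advertisers = [Advertiser(day, day + 13) for day in range(365)]

# Spot estimates over consecutive 7-day windows in the middle of the year.
weekly = [active_in_window(advertisers, s, s + 6) for s in range(100, 200, 7)]
# Distinct advertisers seen over a 30-day window and over the full year.
monthly = active_in_window(advertisers, 100, 129)
yearly = active_in_window(advertisers, 0, 364)

print("weekly counts:", weekly)
print("30-day count:", monthly)
print("365-day count:", yearly)
```

With constant turnover, every weekly window sees the same number of advertisers, yet the number of distinct advertisers keeps rising as the window lengthens. Missing some ads lowers each count but does not remove this pattern.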

2.1.10. In the sub-section on the effect of time (line 427-434), many of the figures presented were not described in the results section. They therefore seem misplaced and should be moved to the results. Their interpretation is what may be more relevant to be included in the discussion section.

These have been moved to the results section as requested.

2.1.11. In the discussion section, with regards to the implication of policy (ln 465 – 515), I feel that there is inadequate data presented. More analysis could be done from the data used to construct table 16 and more data could be presented in the results section to give a good background for discussion. With little data presented on the implications of policy, I feel that there is a bit of overreach with regards to the conclusions and it will help to present a bit more data to adequately support the conclusions.

See the response to 2.1.8.

2.2. Minor issues

2.2.1. I feel that the title does not adequately convey the key features of the article, though it does spark interest. I suggest that “part time” in the title be replaced with something about the transient nature of sex work in Canada, and other components such as online advertising, long term behavior of sex workers (since it is the first study to consider the long-term behavior of sex workers based on online advertising).

Changed the title. In Canada, the view generally espoused in the press, derived from qualitative research that almost universally uses non-random samples, is that most sex workers work for 10 years in the industry. This study shows that this view is too simplistic, and the title reflects this: our beliefs are likely wrong. What is really needed are studies that use better sampling techniques.

2.2.2. The sentence in line 114 starting with “Table 1 outlines...” is very vague and needs to highlight what was collected. The table itself may need to be moved to the results section.

This has been updated.

2.2.3. The statement in line 133-134 that reads “Advertisers with no valid ads had all their ads checked for relevance” is not clear to me. My assumption is that if they had no valid ads, they should be excluded. It can be deleted.

This has been removed for clarity. The approach being taken now is to start with all advertisers then reduce the number based on a measured number that were shown to not be advertising contact sex work. The corrected population estimates reflect this. The actual number of excluded advertisers works out to be 5%.

2.2.4. Table 8 and Table 10 look very busy and I would suggest reducing the number of columns by putting the standard deviation in brackets in the same cell as the average days. The number of decimal points could also be harmonized and reduced possibly to 1 decimal point. The median IQR could also be added, and be put in the same cell in brackets after the relevant median day values.

These have been replaced with excel spreadsheets as described above and have been removed from the main text to the appendices.

2.2.5. Fig 1 is mentioned in lines 245-247, but the summary of the findings is not included in the text, and the text needs to be updated accordingly.

Added a summary of the findings to the text.

2.2.6. Under limitations (line 411), the relevance of the statement “At least four additional studies could be written based on this archival data” is not clear to me. I would suggest that it be deleted.

Removed this.

2.2.7. In the conclusion, with regards to the sentence starting in line 517, it is not clear what techniques are being referred to. It may be better for the authors to expound on what they are referring to.

This has been removed from the conclusions and expanded and added to the discussion. The techniques being referred to are the data collection and metadata extraction techniques described in the paper. The hope is that the expanded content will make this clearer.

The issue, as described above, with much of the qualitative research in Canada is that it depends on non-random samples. This issue can be partly mitigated by using advertisers as a basis for random samples.

Reviewer #2: It is a very important study, but we need to be sure that the people included are sex workers, so I think that the selection process should have a peer review to be sure that the same results are reached.

Peer review of the source sites was done as part of the SPACES project. All of the sites selected were recommended by industry insiders who used them. An expanded section is added to the methods to emphasize this. Both Site 1 and Site 2 have been the subject of other research (Boekner et al. for example).

Because of privacy issues regarding potentially identifying workers, the peer review has been limited to the original members of the SPACES team. However, Appendix D is added which compares the data presented here with the SPACES investigators’ preliminary results. The results in this study are very conservative in comparison as the current project undertakes additional error control measures to minimize overcounting.

Included in the supplemental materials S1 File are the criteria used to decide if ads were related to contact sex work.

The analysis of image data provides another way to validate the advertisers. The 1,752,880 unique images found for an estimated 70,000-75,000 advertisers suggest these are indeed real people. See Appendix A for an analysis of the images.
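As a rough back-of-the-envelope check on the figures quoted above (not the analysis in Appendix A), the ratio of unique images to advertisers can be computed directly. The variable names are illustrative; only the three numbers come from the response.

```python
# Figures quoted in the author response.
unique_images = 1_752_880
advertisers_low = 70_000
advertisers_high = 75_000

# Fewer advertisers means more images per advertiser, and vice versa.
images_per_advertiser_max = unique_images / advertisers_low
images_per_advertiser_min = unique_images / advertisers_high

print(f"{images_per_advertiser_min:.1f} to {images_per_advertiser_max:.1f} "
      "unique images per advertiser on average")
```

This works out to roughly 23 to 25 unique images per advertiser on average. It is a simple mean and says nothing about how images are distributed across advertisers.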

To improve external validity the population estimates are compared with other studies and a more recent data set from 2021-2022. This more recent data set is available in anonymized form on the supplemental materials site.

Ultimately, the hope is that other researchers will attempt to replicate the results. That would be a very positive outcome.

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr Brian Chirombo (MBChB, MPH)

Reviewer #2: Yes: Edgard J. Narvaez D.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Hamid Sharifi

17 Oct 2022

PONE-D-22-00964R1

The silent majority: the typical Canadian sex worker may not be who we think

PLOS ONE

Dear Dr. Kennedy,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 01 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Hamid Sharifi

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have adequately addressed my comments raised in the previous round of review.

However, the revised manuscript has some new minor issues that I believe the authors need to address. Being minor, however, these issues entail only a minor revision of the manuscript.

1. The sentence starting in line 109 that begins with "Issues with these other source ....." could be better moved to the discussion section.

2. In Table 3 (from line 349), the authors present p values less than 0.001 as decimal numerals. However, the standard convention is for such p values to be expressed as p<0.001 regardless of the actual decimal numeral. I would therefore suggest that the authors consider following the standard convention and present all the p values that are less than 0.001 as p<0.001.

3. In lines 388, 390 and 427, the authors present Odds Ratios (ORs) without the corresponding confidence intervals (CIs). I would suggest that they consider presenting all ORs with their corresponding CIs as per standard convention.

Reviewer #2: Comments and questions raised in the previous revision have been incorporated or clarified.

The methodology is clear although a bit extensive.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Brian C Chirombo, MBChB, MPH

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Nov 15;17(11):e0277550. doi: 10.1371/journal.pone.0277550.r004

Author response to Decision Letter 1


26 Oct 2022

PONE-D-22-00964R1

The silent majority: the typical Canadian sex worker may not be who we think

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

• All references were checked for retractions. No retractions were found as of 2022-10-17.

Reviewers' comments:

Reviewer #1: The authors have adequately addressed my comments raised in the previous round of review.

However, the revised manuscript has some new minor issues that I believe the authors need to address. Being minor, however, these issues entail only a minor revision of the manuscript.

1. The sentence starting in line 109 that begins with "Issues with these other source ....." could be better moved to the discussion section.

• This has been moved to the Discussion > Limitations section paragraph 2.

2. In Table 3 (from line 349), the authors present p values less than 0.001 as decimal numerals. However, the standard convention is for such p values to be expressed as p<0.001 regardless of the actual decimal numeral. I would therefore suggest that the authors consider following the standard convention and present all the p values that are less than 0.001 as p<0.001.

• These have been changed to use the convention.

3. In lines 388, 390 and 427, the authors present Odds Ratios (ORs) without the corresponding confidence intervals (CIs). I would suggest that they consider presenting all ORs with their corresponding CIs as per standard convention.

• The calculation had two purposes:

a. Show that just using advertisers as a measure of collective vs individual can be misleading.

b. Show that, while the Argento et al. sample was different on some measures, it was comparable to the advertising data on the collective vs individual dimension when considering indoor workers.

• The comparisons were redone using the R prop.test function. The text has been updated to reflect the results.

• Previously, the odds ratio was calculated by dividing the probability of an advertiser or name being found in a collective context by the probability of the same being found in an individual context. For example, for advertisers this would be:

Odds Ratio = p(adv is collective)/p(adv is individual) = (Collective Adv/All Adv)/(Individual Adv/All Adv) = (Collective Adv)/(Individual Adv)

As these are simple frequency counts, there is no confidence interval. In retrospect, prop.test seems to be more informative.
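The arithmetic described above can be illustrated with a minimal Python sketch. It is not the author's code: the counts are hypothetical, the function names are invented for illustration, and the comparison is a plain two-proportion z-test with a Wald interval. R's prop.test applies a continuity correction by default, so its output would differ slightly from this sketch.

```python
import math


def odds(collective: int, individual: int) -> float:
    """Odds of collective vs individual; the shared denominator cancels out."""
    total = collective + individual
    return (collective / total) / (individual / total)


def two_proportion_test(x1: int, n1: int, x2: int, n2: int,
                        z_crit: float = 1.959964):
    """Compare two proportions x1/n1 and x2/n2.

    Returns (z, two-sided p-value, 95% Wald CI for p1 - p2).
    No continuity correction is applied (unlike R's prop.test default).
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis p1 == p2.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_pooled
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Unpooled standard error for the interval on the difference.
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return z, p_value, (diff - z_crit * se_diff, diff + z_crit * se_diff)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the calculation.
    print("odds:", odds(collective=300, individual=700))
    z, p, ci = two_proportion_test(300, 1000, 400, 1000)
    label = "p<0.001" if p < 0.001 else f"p={p:.3f}"
    print(f"z={z:.2f}, {label}, 95% CI for difference: {ci[0]:.3f} to {ci[1]:.3f}")
```

Unlike the single ratio of counts, the two-sample comparison yields both a p-value and an interval for the difference in proportions, which is the kind of information the reviewer asked for.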

Reviewer #2: Comments and questions raised in the previous revision have been incorporated or clarified.

The methodology is clear although a bit extensive.

• Because this is the first time this has been attempted, the methods are necessarily more detailed. In the future, if related work is being described, it is hoped that the Methods in this paper can be used as a reference.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 2

Hamid Sharifi

31 Oct 2022

The silent majority: the typical Canadian sex worker may not be who we think

PONE-D-22-00964R2

Dear Dr. Kennedy,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Hamid Sharifi

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Hamid Sharifi

5 Nov 2022

PONE-D-22-00964R2

The silent majority: the typical Canadian sex worker may not be who we think

Dear Dr. Kennedy:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Hamid Sharifi

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Appendix. Image validation.

    (DOCX)

    S2 Appendix. Comparison with the SPACES study.

    (DOCX)

    S3 Appendix. Multiple populations.

    (DOCX)

    S4 Appendix. Factors affecting advertiser longevity.

    (DOCX)

    S1 File. Criteria for deciding relevance.

    (XLSX)

    S2 File. Detailed spreadsheet with breakout of days online by demographic category.

    (XLSX)

    S3 File. Detailed spreadsheet with breakout of number of ads by demographic category.

    (XLSX)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    Data may be found at https://osf.io/mebvp/.


    Articles from PLOS ONE are provided here courtesy of PLOS
