2023 Nov 25;54(3):429–450. doi: 10.1177/03063127231212506

Domesticating data: Traveling and value-making in the data economy

Clémence Pinel, Mette N Svendsen
PMCID: PMC11119098  PMID: 38006306

Abstract

Data are versatile objects that can travel across contexts. While data’s travels have been widely discussed, little attention has been paid to the sites from where and to which data flow. Drawing upon ethnographic fieldwork in two connected data-intensive laboratories and the concept of domestication, we explore what it takes to bring data ‘home’ into the laboratory. As data come and dwell in the home, they are made to follow rituals, and as a result, data are reshaped and form ties with the laboratory and its practitioners. We identify four main ways of domesticating data. First, through storytelling about the data’s origins, data practitioners draw the boundaries of their laboratory. Second, through standardization, staff transform samples into digital data that can travel well while ruling what data can be let into the home. Third, through formatting, data practitioners become familiar with their data and at the same time imprint the data, thus making them belong to their home. Finally, through cultivation, staff turn data into a resource for knowledge production. Through the lens of domestication, we see the data economy as a collection of homes connected by flows, and it is because data are tamed and attached to homes that they become valuable knowledge tools. Such domestication practices also have broad implications for staff, who in the process of ‘homing’ data, come to belong to the laboratory. To conclude, we reflect on what these domestication processes—which silence unusual behaviours in the data—mean for the knowledge produced in data-intensive research.

Keywords: data, domestication, data journeys, boundary, value, data economy


For many biomedical researchers, producing valuable knowledge involves generating, collecting, formatting and utilizing vast quantities—and different sorts—of data (Leonelli, 2012; Meloni, 2016). As social scientists studying data-intensive biomedical research, we are interested in how practitioners handle data for the production of knowledge. In our ethnographic research, we have mapped the local practices of database managers or data analysts making, maintaining, and utilizing data. However, as we witnessed practitioners interacting with data in their local settings, we observed their concerns for the global origins of their data. Specifically, practitioners often pointed out that the data held within their laboratory had varied origins, with many actors outside their laboratory contributing their expertise, skills, technologies, or other resources to the making of data. These actors included research volunteers donating blood, biotechnology companies using DNA sequencing or mass spectrometry technologies to generate omics data, or database curators making data available.

These varied actors act in, and come to constitute, what we term the global data economy. More specifically, in the ever-expanding global data economy made up of national governments, healthcare institutions, and Big Tech (Birch et al., 2021; Faulkner-Gurstein & Wyatt, 2021), they form a particular domain focused on research. As practitioners described the varied origins of their data, they hinted at a boundary (Gieryn, 1999) between an inside—their laboratory—where data could be accumulated, linked and worked on, and an outside—beyond the walls of their laboratory—from where and to which data travel. They expressed concerns about what bringing such data into their laboratory meant, not just in the sense of the crossing of a threshold between an outside and an inside, but more in the sense of creating a ‘home’ for data that had travelled, where meaningful ties could be formed between this home, data practitioners, and the data themselves.

Inspired by anthropologist Ingold (2000) we see the home as a place of dwelling. That is, beyond the physical structure, the home is a ‘container for life activities’ (Ingold, 2000, p. 185) organized around rituals, such as welcomes and introductions, and characterized by mutual exchanges between entities. As such, the home is where relationships are formed (Coolen & Meesters, 2012), and identities are shaped (Clark, 2007).

To trace the origins of data, critical data scholars have shed light upon different forms of ‘data work’ (Bonde et al., 2019; Leonelli, 2016) through which data are produced and turned into effective instruments. For example, they discuss the work of patients contributing their data through digital patient-reported outcomes (Torenholt et al., 2020), professionals caring for their data (Pinel et al., 2020), or research staff processing their data before analysis (Mayernik, 2019). Scholars have also sought to study data and their origins by focusing on ‘data journeys’ (Leonelli & Tempini, 2020), unpacking how data are made and utilized across contexts. They point to the contextual nature of data (Berg & Goorman, 1999), whereby data are always entangled with the context of their production, and discuss what it takes to enable the mobility of data. Leonelli (2014), in her study of databases for model organisms in experimental biology, defines data journeys as acts of de-contextualizing and re-contextualizing data. Data first need to be de-contextualized, which means disentangling data from their original context by, for example, formatting them in particular ways so that they can travel and be used for new purposes. Then, to be adopted in a new research context, data need to be re-contextualized through work such as providing detailed metadata about their origins, including information about the procedures and instruments used, or the staff involved in producing the data (Edwards et al., 2011).

From this literature we grasp the scale and diversity of the data economy, where many actors are involved in producing, maintaining and utilizing data. These studies tend to focus on data’s original contexts, where data are structured by varied epistemologies and practices, and on unpacking what facilitates data journeys, such as acts of de- or re-contextualization. In other words, existing literature places the emphasis on the practices enacted for data to flow and enable their use across contexts. In this article, we take a different approach. Rather than consider data travels as happening across contexts—a broad term encompassing the circumstances and situations within which something exists or happens—we approach data travels by focusing on the laboratory from where and to which data travel, conceptualizing the laboratory as a home where data can ‘live.’ The home—in contrast to a context—comes with doors that let some entities in and keep others out; implies routines, rituals, and familiarity; and involves attachments being formed between the entities living in the home. Specifically, we pay attention to what happens as data come and dwell in the home of the laboratory. As mobile entities, data travel the global data economy, successively leaving and entering laboratories. But data traveling is not always smooth or straightforward. Not all data are given a home in the laboratory, and living in the home of the laboratory is deeply transformative. As data come into the laboratory and dwell in this space, they are made to follow the rituals of the home. For data to be of value, ties need to be built between data, places, and people. Crucially, in the home of the laboratory, data need to behave in particular ways to become useful tools, following particular codes, standards, and routines. Through these processes of dwelling, data are shaped and adopt a new identity, at the same time that attachments are formed between data, the home, and practitioners.
As such, data come to belong to the home of the laboratory. This is a mutually constitutive process, as the home and practitioners operating in this home are themselves shaped by the experience of living with these data. Through exploring data’s travels with the concept of the home, we can get to grips with richer forms of data work than the de- and re-contextualization of data, especially the transformative, ritualistic, and affective processes that shape how data travel and come to belong to laboratories in the global data economy.

Domesticating the wild, building a home

As we examine the forms of data work necessary for data to inhabit a home in the global data economy, we also explore which data are made to dwell and belong in the home, which are excluded, how ties are formed between data, data practitioners, and the laboratory, and how identities are shaped. We conceptualize this form of data work as domestication.

The concept of domestication captures ongoing relationships between different entities, traditionally between people, animals, and plants (Cassidy, 2007). These relations, while they can take different forms, rest on a distinction between the domestic and the wild. Deriving from the Latin ‘domus,’ domestication implies a home, where life sources like plants and animals are brought in. Outside the home, such life sources are considered wild—that is, they are deemed uncultivated and uncivilized, thought of as uncontrolled, irregular, and even reckless. The domestication of such entities therefore implies addressing their wildness, by taming—but also cultivating—the wild to derive value (Anderson, 1997).

In practice, domestication usually entails spatial containment (Wilson, 2007). Animals and plants come to be domesticated through physical enclosure in the home. In this space they are isolated from the wilderness, and they evolve in close proximity to humans. The process of bringing these entities into the home means imposing control. As Clutton-Brock (1989, p. 7) puts it, domestication is both a cultural and biological process whereby animals are ‘incorporated into the social structure of the human group and become objects of ownership.’ With its emphasis on the home, the concept of domestication is tied to questions of belonging. Yuval-Davis (2011) defines belonging as both an emotional attachment and a feeling of being ‘at home’ within particular social groups and/or places, while also highlighting the ‘politics of belonging’ at play, whereby boundaries are drawn around communities, including certain people while excluding others (Pinel & Svendsen, 2021; Svendsen, 2022). In domestication processes, actors work on building a home and as they do so, they draw social and spatial boundaries. At the same time, as wild entities are brought into the home, they are made to feel at home. For example, through origin stories, entities are provided with roots to a space, which they share with other entities in that home. As wild entities are domesticated within the home, they are made to belong to a particular space, and to a social and political collectivity.

Domestication also implies transformations. Broadly speaking, any relation or exchange with an Other is potentially a source of transformation: as we interact with others, we both give and take, care and are cared for, which leaves traces on our bodies and identities. Through domestication, different entities are brought together into a home, and it is this intimacy that is generative of transformations. In his study of animal domestication, Clark (2007) speaks of the ‘generosity’ of domestication, whereby close interactions in the home lead to imitation processes and the incorporation of features from those who differ from us. Such transformations do not just concern animals or plants brought into the home. Domestication is a mutually constitutive process and implies that both parties are transformed in the process of living together (Lien et al., 2018). For instance, Haraway (2007), in her account of human–animal encounters, discusses the domestication of dogs and particularly points out how dogs and people configure one another in the process. Or as Diprose (2002) points out, ‘We turn into our partners, even our dogs, just by dwelling with them.’ In other words, in this line of argument, by living together in the home, the domesticator shapes the wild and the wild shapes the domesticator. The transformations that result from domestication can also involve ‘taming’ processes. For example, the home is characterized by codes, norms, and rituals (Ciborra, 1999). When animals and plants enter the home, these codes, norms, and rituals are extended to its new members. Entering the home and living in it thus means learning about and appropriating them, and in the process, altering one’s identity by absorbing aspects of an alien culture. Through domestication, entities are altered, and in the process they are made to fit the domus and its purposes. Put differently, they are made to belong.

Domestication processes are also tied to questions of value and value-making. This is something we are reminded of when analysing traditional domestication narratives, which associate domestication with civilization (Childe, 1928). In this line of thought, domestication is associated with human progress: through domestication, humans began cultivating animals and plants, thus producing a surplus that allowed (but also depended on) larger human settlements to herd the animals, plant, or harvest, which in turn paved the way for human population growth, division of labour, private property, and state formation (Lien et al., 2018). While we should be critical of such a grand narrative, it does underline how the concept of domestication implies value-making processes. Entities are brought in, tamed, and transformed to make them productive, thereby turning them into resources from which value can be derived.

In this article, we draw on the concept of domestication to analyse data and data travel in the global data economy. As we do so, we build on a line of literature within science and technology studies (STS) that has applied the concept of domestication in contexts other than those of plants or animals. In particular, authors have studied technology and its implementation in practice as domestication processes (Haddon, 2011; Pols & Willems, 2011; Silverstone, 1994; see also Sovacool & Hess, 2017 for a good summary of domestication theory in critical studies of technology). They analyse how users integrate technology into their home by pointing to the ways technology is ‘tamed’ by users, that is, how it is changed by users as they adopt it in their daily lives, while they also illustrate how users change in return, thus conceptualizing this mutual process as the domestication of technology by users.

Inspired by these STS scholars, in our article we apply the concept of domestication to data and data travels to explore what it means to bring and integrate data into the home of the laboratory. We find that it speaks to our informants’ concerns about how to work with data that have travelled widely. In our analysis, we see local laboratories as homes that data practitioners build for their data to dwell in. This is a protected and familiar space that data practitioners control and structure. At the same time, the concept of the home implies a boundary between the home and the outside. In our case, this outside is a space that is, for data practitioners, out of control and unknown. In employing these categories, we do not think of them as ontological givens, but as helpful cultural categories to shed light on boundaries between different spaces.

In this paper, our focus is on what happens when data are made to enter the laboratory. Crucially, the concept of domestication helps us shift analytical focus from data flowing across contexts to data dwelling in laboratories, to explore what data living in the home of the laboratory—made of servers, tools, standards, and data practitioners—entails for data themselves and others in the home. We thus come to examine how this intimacy in the home shapes data, practitioners, and the laboratory. Exploring data as they enter the home and become domesticated also means examining what is let into the home—that is, which data are included or excluded from these domestication processes. Finally, as we explore data travels through the lens of domestication, we also come to examine value-making processes in the global data economy, specifically exploring how domestication can ‘cultivate’ data. It is worth noting, however, that the home is a relative concept. What constitutes the home for one, may constitute the outside for another. We observed this first-hand when traveling with data from one laboratory to another as we conducted ethnographic fieldwork in different sites which were connected by data and samples. We expand on our study and methodology in the next section.

The study and methods

The study on which this article is based aimed to investigate the global data flows underpinning contemporary biomedicine. Inspired by what Sheller and Urry (2006) describe as a ‘mobility turn’ in the social sciences, we were keen to examine which data move, how they move, and what facilitates such movements, while also asking what does not move and why. To this end, one author (Pinel) conducted ethnographic fieldwork in two empirical sites connected by data, the first of which is the Wilson Lab. Well-known for its database of phenotypic and genotypic data originating from a cohort of healthy volunteers from the Northern Isles of Scotland, the Wilson Lab is composed on the one hand of staff managing the cohort and curating the resulting database, and on the other hand of researchers using the available data for the production of knowledge. The second site is a biotechnology company called Genos. Based in Croatia, Genos has built a niche in the study of glycans, which are sugar molecules surrounding proteins in the blood that influence the immune system. Having developed a method for the high-throughput analysis of glycans, Genos sells analysis services to laboratories conducting cohort studies, turning biological materials into omics data. Both the Wilson Lab and Genos are involved in collaborations globally, exchanging data and samples with clinical facilities, research institutes, pharmaceutical companies or biobanks, and as such, we see them as participating in the global data economy. As part of a large EU-funded consortium that encourages partnerships between regions of Europe and between research and commercial entities, the Wilson Lab and Genos collaborated. This entailed Genos analysing samples from the Wilson Lab, turning them into a high-throughput dataset to be added to the lab’s database. As such, the two sites are connected by data, with samples traveling from the Wilson Lab to Genos, and subsequently, omics data traveling from Genos to the Wilson Lab. 
This flow of samples and data from one site to another is what motivated our ethnographic mobility. In this article, when we refer to data, we refer to both physical materials (like biological samples such as blood) and digital information, including information produced from such materials.

Ethnographic fieldwork took place in October 2019 at Genos and in February 2020 at the Wilson Lab. In both sites, it involved observing data practices. Pinel observed laboratory staff working at the bench with biological materials, witnessed database workers processing the newly acquired omics data, and watched researchers analyse datasets using computational methods. Her fieldwork also entailed participating in the daily life of the teams, sitting in meetings, attending seminars, and sharing lunch with team members, as well as conducting in-depth, semi-structured interviews (seven at Genos and 12 at the Wilson Lab). The ethnographic data we draw upon consist of field notes from participant observation, informal conversations with staff, reflections from meetings, as well as transcripts of these interviews.

In the following sections, we introduce our empirical material as we travel across the Wilson Lab and Genos. In each section, the site in focus becomes the home as we discuss data practitioners’ efforts to domesticate the data after these data have travelled. We identify four dimensions to data domestication. First, through storytelling about the origins of their data, data practitioners draw the boundaries of their laboratory while creating ties between data and the home of the laboratory. Second, through standardization processes, data practitioners create data that can travel well, while shaping the home by selecting which data can be let in. Third, through formatting, data practitioners become familiar with their data and, at the same time, imprint their data to make them belong to their home. And finally, through cultivation, data practitioners turn domesticated data points into a resource that can be used for knowledge production.

Domestication as storytelling

Many stories circulate at the Wilson Lab. Some are destined for the general public, others are addressed to volunteers who donate blood and become part of the cohort, while other stories speak to data practitioners. A common thread they share is that they make the Wilson Lab a home for data and samples while also drawing its boundaries. In this section, we unpack how storytelling, and more specifically, origin stories about data, become ways of domesticating data by tracing their origins and attaching them to the home of the laboratory.

In January 2020, Jim, the Principal Investigator (PI) of the Wilson Lab, traveled to Orkney and Shetland, the Northern Isles of Scotland, UK, to promote the new round of recruitment of volunteers to the cohort study. He gave two lectures discussing some of his work based on the cohort data. ‘The word cohort means a group of people that share something in common.’ This is how Jim introduced the cohort. He went on to explain that, as a population-based cohort, the common denominator bringing this cohort together was first and foremost its ties to a place—the Northern Isles of Scotland. Only those with ties to the particular space that is Orkney or Shetland could participate in the cohort. In his lecture, Jim reminded the audience of the shared past of the people of the islands. Referring to existing evidence, Jim pointed to a map and showed the proximity of Orkney and Shetland to Norway. Following an arrow linking the two geographical areas, he added,

It won’t surprise anyone in this room to be told the history of Orkney and Shetland, and in particular that the Norse men, and women to some degree, crossed the North Sea and colonized these places, the Northern Isles. (…) So these populations are part Norse, part Scottish.

By pointing to the Norse history of the Northern Isles of Scotland, Jim referred to the origin story of ancient settler groups, the Vikings. Anthropologists have long analysed origin stories (Malinowski, 1922/1992) and pointed to how social groups articulate origin stories to understand experiences and support particular identities (Ginsburg, 1987; Morgan, 2009). In our case, by referring to the shared past and imaginary of the Vikings, Jim articulated and reinforced the identity of the local volunteers as descendants of Vikings, which may have facilitated their recruitment to the cohort (Hinterberger, 2012). But what we find particularly interesting is how Jim’s narrative discursively drew the boundaries of the cohort. It did so by articulating a strong connection between the people who donated biological materials and a particular place in the Northern Isles of Scotland. By referring to the rich history of the islands and the imaginary of the Vikings, Jim stressed that there was something extraordinary about this location and its people. The Wilson Lab’s cohort brings together people with such extraordinary origins. Those who do not share these roots are left out and not given a home at the Wilson Lab.

While Norse history constitutes the origin story of the people of Orkney and Shetland, the data held in the Wilson Lab have their own origin story. Many tales circulate about how the data came to be, many of which come from Jim himself, who was instrumental in building the cohort. In an interview, Jim placed emphasis on the ‘logistical endeavour’ of data collection in remote islands:

This is slightly different because it is 200-something miles away from here [Edinburgh]. I mean I didn’t do any genetics for years; I became a master of logistics. We went to Glasgow and we bought a Mercedes box lorry and then we went and found a lorry fitter and, like, this sort of man who sells camper vans, to fit out the insides of this lorry such that it had a waiting room, a toilet, a nurses’ office, sink for cups of tea and then a clinic room and then, at the back of the clinic room, a lab. All inside this lorry. And then I had to persuade a nurse to be able to drive the lorry, nurses don’t like driving heavy goods vehicles. I actually drove it, I drove it from Glasgow to Orkney. And this lorry went round four different islands, because of course Orkney is an archipelago, six months here, three months there.

In this origin story, Jim emphasized that the data held in the cohort study originate from remote islands and are the result of great logistical efforts by staff adjusting to accessibility challenges. For Jim, these particular conditions for collecting data make the cohort study stand out from other data collection endeavours. Other elements in his story similarly helped draw the boundaries of the cohort study:

And there were other crazy things, like out on the little islands, one time a crow flew into, what do you call them, electricity wires, and the whole islands had their power cut. And so the nurse had to grab all of our frozen tubes and wrap them in a duvet and get on a plane and fly them into Kirkwall [Orkney’s main town] and run to the lab to freeze them. And you know, in the van, you would get very strong winds in Orkney and there’s loads of things on the clinic measurement sheets, ‘gale force winds, no measurements today,’ and then later on it’s like ‘gale force 9,’ they got quite good at telling the exact level of the wind.

Jim’s narrative described the cohort data as originating in a remote part of the world characterized by inhospitable weather and a landscape difficult to travel through, causing logistical challenges for data collection. These stories did not just stay with Jim, who led the building of the cohort and database, but they were borrowed by members of the lab as they described their work. Analysts often referred to these tales to talk about the data they worked with and its specificities. For example, this is what Anna, a PhD student, did in an interview when asked to describe the data she works with:

It is just quite nice, and especially you hear stories of Jim, when he first started taking samples, it was like, they bought an old van and carved out the back of it to be a mobile thing. And the roads on the island are tiny. You could imagine just going around and taking samples from people. Yeah, because it’s small and unique, and yeah, it’s nice. You know more about it than maybe working on the UK Biobank.

Working with the data, Anna is reminded of their origins in the remote area of the Orkney and Shetland archipelago. She specifically distinguished between their own data, which have distinctive geographical and logistical roots, and the UK Biobank data, which they also use without knowing much about where these data originate. The origin stories serve a number of purposes for Anna and the laboratory. In particular, we see these stories as having a ritual character that shapes practitioners’ attachments to the data and their belonging to the laboratory. Through storytelling, Anna becomes familiar with the origins of the data she works with and learns about the lab’s history and identity. As staff hear origin stories about the cohort data from Jim and others, they understand ‘what it took’ to produce the data they work with. These stories foster their loyalty and care toward the data and the group (Pinel et al., 2020), thus encouraging attachments to the data and shaping their identity as Wilson Lab members.

Such stories also do something to the data in relation to the home of the Wilson Lab. We analyse these origin stories as technologies of ‘homing’ that help bring the data into the laboratory. Ween and Swanson (2018) describe how, through tracking technologies, even animals considered wild, such as salmon, can become domesticated. Such technologies, they show, ‘root salmon to particular places’ and domesticate them. In our case, stories about how the data came to be function as technologies that track the history and movements of data, connecting them to the laboratory and its staff. Through these stories, the data are tied to a location in the Northern Isles of Scotland, and one also understands the process of creating the cohort and database—including the logistics and human labour involved in creating samples and data points. As they track the origins of the data in Orkney and Shetland, these origin stories in fact root the data to the laboratory that enabled their creation. In bringing the data home, these origin stories constitute processes of domestication.

To summarize, the data’s origin stories delineate the home of the Wilson Lab and its boundaries in the global data economy while making staff belong to this home. We argue that it is through origin stories that the Wilson Lab enacts a ‘politics of belonging’ (Yuval-Davis, 2011). These origin stories successively draw the boundaries of the home of the Wilson Lab by delineating who the volunteers donating data are (only people with roots in the Islands), what sort of data are included (only data that originate from such location), and which staff belong to the laboratory.

Domestication as standardization

After visiting the islands and extracting bodily materials from volunteers, samples are turned into data. This is a transformation process which (for the most part) takes place away from the Wilson Lab, with a series of laboratories contracted to produce the data for the Wilson Lab’s database. Genos is one of these laboratories. In 2019, about two thousand plasma samples were shipped from the Wilson Lab to Genos, and omics data were returned in a digital form a few months later. The Wilson Lab’s data are thus made through such travels beyond the walls of the laboratory. But this outside constitutes the home for others, like Genos staff. Crucially, producing data across the Wilson Lab and Genos entails particular domestication practices. In what follows, we travel to Genos and explore how they turn themselves into a home for samples originating from collaborators like the Wilson Lab.

Genos specializes in the production of high-throughput omics data. This entails turning human samples (substances stored in a tube) into omics data (digital objects stored in an electronic file). This transformation process is highly standardized, with detailed protocols and procedures. We see this standardization as a way of enacting control over the samples that have now entered Genos—in other words, of domesticating them (Clutton-Brock, 1989). Through processes of standardization, data are made to follow the norms of their new-found home. These standardization processes function like rituals that enable old ties to be controlled and new ties to be formed. To start with, collaborators are given detailed instructions about how and when to ship their samples, so as to ensure safe travel. Upon arrival at Genos, samples are counted, labelled, checked, and then placed in freezers while they await processing. The processing and analysis of samples take place in batches of 96 samples, corresponding to the size of an analysis plate. After receiving the samples, Genos staff carefully consider which samples to allocate to which batch. This is done through randomization, a process Marko, the senior data analyst at Genos, described in an interview:

So basically, my first task is to properly randomize samples, in order to, [so] that technical details do not correlate with biological results. If we have 500 cases, 500 controls, and we are doing batches in 96, you must make sure that on each batch, you will have 48 cases and 48 controls, so that they don’t alter.

By technical details, Marko refers to the origins of the samples in the laboratories conducting cohort studies. There may be variation in the way samples were collected from volunteers, with differences in instruments, staff, or conditions. From Genos’s standpoint, these sample collection practices are out of their control. Randomization as a ritual performed by Genos staff happens in two steps. First, Marko obtains additional information on each sample sent by the laboratory, including, for example, age, sex, and case or control status. This helps Genos trace the samples’ origins. Second, based on this information, Marko decides on the distribution of samples on the analysis plates. He thus constructs the different batches, according to which the analysis will be performed. Once the samples are randomized, they are no longer considered in relation to their roots in the cohort studies and the people from whom they originate (if need be, such relations can be recovered). In other words, the origins of the samples are tamed, and they become ready to be analysed and transformed at Genos. Samples become domesticated. Or, to use the hospitality metaphor (Ciborra, 1999; Lanzara, 2016), through randomization, the host (Genos staff) reaches out to the ‘guests’ (the data) to learn about their culture and history, at the same time that the host imposes norms and rules on how to behave in the new home. This is important because it is partly through that process that the ‘guests’ become ‘residents’ in the home, where new local histories and ties can be formed.
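Marko’s stratified randomization can be sketched in simplified form. The code below is our illustrative reconstruction, not Genos’s actual pipeline (which is not available to us); the function name and sample identifiers are invented. It captures the logic he describes: equal numbers of cases and controls on each 96-well plate, so that case/control status does not correlate with batch.

```python
import random

def randomize_batches(cases, controls, batch_size=96, seed=42):
    """Illustrative sketch: allocate an equal number of cases and
    controls to each analysis plate, then shuffle positions within
    the plate, so technical (batch) effects do not correlate with
    biological status. Not Genos's actual code."""
    rng = random.Random(seed)
    cases, controls = list(cases), list(controls)
    rng.shuffle(cases)
    rng.shuffle(controls)
    half = batch_size // 2
    batches = []
    while len(cases) >= half and len(controls) >= half:
        batch = cases[:half] + controls[:half]
        cases, controls = cases[half:], controls[half:]
        rng.shuffle(batch)  # positions on the plate are also randomized
        batches.append(batch)
    return batches

# Marko's example: 500 cases and 500 controls, plates of 96
plates = randomize_batches([f"case_{i}" for i in range(500)],
                           [f"ctrl_{i}" for i in range(500)])
# each full plate holds 48 cases and 48 controls
```

Leftover samples that cannot fill a balanced plate would, in practice, require a further allocation decision; the sketch simply stops once a full balanced plate can no longer be formed.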

Following randomization, laboratory staff proceed with sample preparation and analysis. Preparing a batch of samples for analysis takes about three days. Using centrifuges, air pumps, pipettes, collection plates, and chemicals, laboratory staff manipulate the substance contained in the samples so as to extract particular proteins. It is these proteins that are analysed to produce omics data. After processing in the laboratory, staff obtain an analysis plate ready to insert into a high-throughput analysis machine. The analysis takes a full 24 hours, after which they obtain a series of numbers corresponding to each sample on the analysis plate—these are the data points that will make a dataset. To prepare and analyse the samples, staff carefully follow detailed protocols, cataloguing their work in corresponding ‘worksheets’ and ‘storage sheets.’ Together, standardized protocols and accompanying documentation provide a history of the data points produced by Genos. This details who was involved, when, with what materials, and in what conditions. At the same time, as with any standardized protocol, Genos leaves room for deviations. When working at the bench, staff mark if any deviations from the protocols occurred, describing ‘what happened’ and the conditions for these deviations. This can include, for example, writing about the analysis plate ‘sitting’ and ‘waiting’ on the bench in between steps because of the unavailability of an instrument.

Standard protocols and annotated documents thus situate data points in time and in the space that is Genos. Together, they document what happens to samples as they are processed at Genos and turned into omics data. In other words, these standardization and documentation efforts enable ties to be formed between Genos and the newly produced data. This resembles what Leonelli (2014) and others (Edwards et al., 2011) term ‘metadata,’ which provide information about data’s provenance and enable them to travel to new contexts where they can be re-contextualized and made use of. Yet, the concept of domestication adds new insights to this process. Domestication enables us to also see how these protocols and annotated documents, as they enable roots to be formed in Genos, facilitate the data’s subsequent travels. This is particularly important because Genos is a service provider, contracted by laboratories to transform samples into data points. Samples are only in transit at Genos for the purpose of their transformation. When Genos sends the produced data to their customers, it also provides information about the protocol used, the samples’ randomization, and whether any issues arose during the process that could lead to ‘problematic’ data points. As staff member Katarina put it in an interview:

At the end, we provide with our results a comment, on [a] number of samples that were, you know, problematic from the beginning or at some point. Sometimes they’re completely excluded and not sent, and then we also give an explanation, that they haven’t passed quality control. Sometimes we obtain results that are, let’s say, good, but during the procedure something happened, and then we label those as ‘potentially problematic’, and then advise them to exclude them. You know, from our technical side we have to provide that information. (…) This is our policy, to say ‘technically we had some problems, we advise you to exclude those, but we leave it up to you.’

By sharing detailed information about their standardized processes for producing datasets, Genos looks to foster accountability and transparency towards the laboratories who have their samples analysed. Standardization and documentation efforts facilitate trust in the data produced, thus enabling their subsequent travels in the global data economy. It is because the newly produced data have clear roots at Genos, with documentation detailing how they came to be, that they can be domesticated further on in their more permanent homes, like the Wilson Lab, where they will be added to existing databases.

There is something else at play here. Katarina mentions that they may exclude samples from the analysis or remove data points from the produced dataset. Samples may have been damaged during shipment, or can be difficult to work with because Genos does not have enough information about their origins to assess their present status. That is, upon arrival at Genos, some samples may be excluded before they are even processed—they are left at the door and not given entry into the home that is Genos. Staff thus learn about the origins of their data outside their institution in order to examine whether they are worthy of inclusion in their home. Later on, the processing and analysis of samples may render some data points ‘problematic’ because of technical issues in the laboratory. Even with standardized processes and detailed protocols, Genos sometimes struggles to turn samples into data. We suggest this in fact reflects data that resist domestication. Despite Genos’s efforts, there are some samples and data that staff find difficult to control or transform as they see fit. In other words, some samples and data cannot easily be tamed, and in that respect, they remain wild. Such data points will be included in the produced dataset and travel to the laboratories who contracted Genos, but their belonging to the dataset is uncertain: because they cannot be trusted or utilized with certainty for biological analysis, untamed data are less valuable to data practitioners. As such, the standardization and documentation efforts we discuss do more than track the history of data and enable their travel. They also shape the home and what goes into it. Crucially, as we employ the lens of domestication, we suggest conceptualizing these efforts as rituals that enable staff to police the boundaries of their home by evaluating what data can be included, while pointing to data that cannot be controlled and should therefore be excluded.

At Genos, bringing in samples from the outside—that is, from the laboratories who sent samples to be analysed—thus entails very particular domestication processes. It first means tracing their origins and ‘taming’ them through randomization. Using standardized processes and detailed documentation, Genos transforms samples into omics data, giving new origins to the data produced, this time rooting data to the home that is Genos. However, this rooting is only momentary. At Genos, domestication means temporarily hedging in samples and transforming them into omics data in a way that enables the resulting data to travel easily and find more permanent homes outside Genos. Crucially, these standardization processes are intertwined with exclusion mechanisms, and as such, they shape the home by ruling what data should go into it.

Domestication as formatting

Once produced, the new omics data do not stay at Genos. Instead, they are sent in a digital format to laboratories, like the Wilson Lab, which have contracted Genos for the analysis of their samples. In this section, we focus our attention on the efforts of the Wilson Lab’s staff to bring the newly produced data home after their passage through Genos, the outside. We analyse what the travel of data from Genos to the Wilson Lab entails, unpacking the particular set of domestication practices involved.

The Wilson Lab’s database brings together an array of datasets. To house these datasets, the lab invested in a server that would store and make data accessible. In this setup, each dataset can be found according to a consistent file-path, together with accompanying documentation describing the data and their origins. The Wilson Lab server reminds us of the role of architecture in domestication practices. Anthropologist Wilson (2007) argued that the construction of permanent settlements creates the necessary circumstances under which plants and animals can be domesticated. In our case, we see the Wilson Lab server as the digital equivalent of physical enclosure in permanent settlements, as it provides the material circumstances for containing and domesticating data. The server constitutes the house where various datasets can be brought home after their travels and transformations in the wild. But it takes more than a digital structure (a house) for data to belong to the home. Through formatting along specific standards and tools, data practitioners domesticate data and thicken their ties to the home of the laboratory.

Pinel learned about such standards and tools during the Education Week, a yearly event which (in the words of senior analyst Michael, who organizes the event) is motivated by ‘the lack of standardization’ in the ways tools are used throughout the lab. Or as he puts it, ‘We’ve got data, and then we’ve got tools, but the [analysis] pipelines might be different from one person to another.’ These tools are valued artefacts that serve the important purpose of bringing in and processing data from the collaborators and partners who produce or share data with the lab. Throughout the Education Week, data practitioners took turns presenting the different tools available within the Wilson Lab server. The first session was focused on a tool called a ‘munger’ that enables the merging of data files that researchers may have received from collaborators. Two PhD students, Sofia and Edith, presented the tool. In the Wilson Lab open-plan office, Sofia and Edith walked up to a large screen on one of the walls, each with their laptop. Edith connected hers to the screen, which then displayed a presentation. She focused on the code used in RStudio to process the data. She explained that when one receives a data file from a collaborator, these data usually need to be formatted in ‘Wilson format.’ Edith talked through the different steps built into the tool. At the end of the presentation, Michael walked to the whiteboard and added a few words about ‘the rationale for using this tool.’ He drew what he termed ‘the flow of data’: the ‘old data’ (the data they would receive from a collaborator), connected by an arrow to an ‘auto-process’ where data are being configured, connected by an arrow to the output, which is manipulated by ‘bespoke configuration in R.’ He explained that this tool is a powerful way to format data received from another source. Using this tool, practitioners transform their data as they enter the Wilson Lab. That is, data are tamed. Such domestication practices are essential because the Wilson Lab is dependent on partners beyond their walls, like Genos, to produce the data they work with. Or as Michael put it, ‘We can’t control what’s sent to us. But we can control our output.’

For data to dwell in the designed server, they have to be of a particular format. These are standards that dictate what sort of file types can be used, how to name files, how to structure them, or what units should be used to indicate values. The Wilson Lab’s tools, like the one presented here, enable analysts to rename the columns in a standardized fashion drawing upon the ‘Wilson format.’ Below, Anna explains the process of receiving data from a provider and reformatting it:

The most difficult thing, as always, is not really to do with the data, the science, but the format you get this in, in Excel spreadsheets. (…) If you are just looking at the document, it’s very helpful, they had things in different coloured texts to mean different things, and things in different coloured cells. But none of that reads in when you read it in a computer, so you have to manually code. So the most annoying thing was not even the measures themselves. It was the translating to something that was readable.

The Excel spreadsheets Anna portrayed as unreadable and difficult to work with are the sort of data she first deals with when receiving data from partners. She enacts a set of practices to import such data to make them fit the formats of the team. As data enter the home, they are transformed and data originating from the ‘outside’ become just like any other data dwelling in the home. These formatting processes that transform data to make them compatible with data coming from other sources resonate with Leonelli’s (2014) analysis of data practitioners’ work de- and re-contextualizing their data. However, thinking about them as part of domestication processes, we come to see how formatting data also means fostering familiarity and intimacy in the home. These formatting processes in fact function as another ritual happening in the home, which ‘welcomes’ the newly arrived data, at the same time that it makes practitioners feel at ease with them. As data practitioners like Anna format their data, they not only transform them, but also become familiar with and get to know them.
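The formatting work Anna describes can be sketched as follows. The lab’s actual munger is written in R and we did not have access to it, so this is an illustrative Python reconstruction; the ‘Wilson format’ column names and the provider’s headers are invented for the example.

```python
# Illustrative sketch of a 'munging' step: rename the columns of an
# incoming dataset to a house standard and coerce values to expected
# types. The mapping and house column names are hypothetical.
COLUMN_MAP = {                      # provider's header -> house standard
    "Sample ID": "sample_id",
    "Prot. conc. (mg/dl)": "protein_mg_dl",
    "Sex (M/F)": "sex",
}

def to_house_format(rows):
    """Keep only recognized columns, rename them, and parse numbers."""
    out = []
    for row in rows:
        new = {COLUMN_MAP[k]: v for k, v in row.items() if k in COLUMN_MAP}
        new["protein_mg_dl"] = float(new["protein_mg_dl"])
        out.append(new)
    return out

incoming = [{"Sample ID": "S-001", "Prot. conc. (mg/dl)": "4.2",
             "Sex (M/F)": "F", "Notes (see colour key)": "yellow cell"}]
formatted = to_house_format(incoming)
# -> [{"sample_id": "S-001", "protein_mg_dl": 4.2, "sex": "F"}]
```

Note what the sketch cannot do: information carried only in cell colours, as in Anna’s spreadsheets, does not survive a plain read-in, which is precisely why she must ‘manually code’ it before a tool like this can take over.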

Finally, thinking about these processes through the lens of domestication helps us see how formatting data is also about imprinting data. In her study of Dolly the sheep, Franklin (2007) discusses domestication as imprinting, whereby particular social, cultural or biological patterns can be applied and ‘stamped’ on the domesticated entities. At the Wilson Lab, the standard formats function like stamps that can be glued to data and signal their belonging to the Wilson Lab’s house. Using the accepted tools, the data are formatted and as such, they are made to look like Wilson Lab data. One could also say that through formatting, the data are tattooed ‘Wilson Lab’ on their surface. This imprinting of data is a way for the lab to make data belong in its home by creating a lasting connection with the data. At the same time, data practitioners, in their home, imprint their data through formatting as a way to assert control and ownership over them, thus claiming that the data that have returned from the wild are indeed theirs.

The imprinting of data through formatting not only has implications for data, but also for data practitioners. Or as Franklin (2007) suggests, imprinting is a dual process that impacts both humans and the entities being domesticated. One could analyse the formatting of data originating from the wild as a training exercise for young scholars in the lab. As analysts use accepted tools to shape their data, they learn about the formats and standards the group operates with, therefore adopting particular practices that mark their belonging to the lab. By dwelling in the home and taking part in the rituals of formatting data, practitioners build ties to the home and thus cultivate their belonging to the laboratory. Along with scholars who underline the mutuality of domestication processes (Cassidy, 2007; Lien et al., 2018), we find that by domesticating and imprinting their data, the lab also domesticates and imprints its staff. Domesticating data entails domesticating people.

Domestication as cultivation

At the Wilson Lab, data are domesticated for the purpose of knowledge production. That is, now that data have been brought into the home of the Wilson Lab and formatted, they are not just accumulated. Rather, data practitioners see domesticated data as a resource that ought to be utilized to derive value. In this section, we focus our attention on this particular dimension of data domestication, analysing the Wilson Lab’s efforts to use the data in their home for the production of knowledge.

Broadly speaking, to use their data and exploit it for value production, data practitioners employ two sets of practices. First, they look to ‘understand their data’ and perform quality control in order to build a dataset that can function as a resource and be used for analysis. Second, they use analysis tools and methods to draw biological meanings out of their dataset. In the day-to-day life of the lab, these two sets of practices are not as clear cut, but rather intertwine and feed one another (e.g. through the analysis process, which includes several attempts and iterations, data practitioners often come to better understand their data). For the sake of our analysis, we describe them separately.

Let’s begin with data practitioners performing quality control. When starting a new project, analysts assess the data they are to work with. Specifically, they adopt visualization techniques to observe the distribution of their data and identify patterns. As they do so, data practitioners read the documentation provided by data producers like Genos. This helps them understand the origins of their data outside the laboratory, in order to explain some of their ‘behaviours.’ For example, analysts might be able to relate outliers in their dataset to events in the biography of data. The outliers that data analysts spot may be the ‘problematic’ data points, to use Katarina’s term, that data producers document and provide explanations for. One could argue that, at the Wilson Lab, these data points, even when formatted, remain wild—they are difficult to control and are the result of technical difficulties outside their home (the data producers), which makes it impossible for data practitioners to use them for knowledge production in their own home, at least at present. As such, untamed data at the Wilson Lab are unusable. Based on the documentation provided by data producers, analysts can decide to exclude such data points from their dataset. In other instances, however, such documentation can foster inclusion and lead to specific forms of data work. That is the case, for example, with batch effects. Below, Anna explained how she assesses the proteomics data she works with:

You usually want to have extra information. Like I know we have details on the proteins, I know we have it for other datasets as well, exactly how the samples were randomized. So, if the samples are on a plate, I know for each one of the samples what position it was on the plate. So that we correct for that. … Because we know that with QPCR [laboratory technique], there is—you get better measurements in the middle, and worse measurements to the edge, so we correct for that.

From the documentation provided by Genos, Anna learns about the batches in which the analysis was performed and relates this information to patterns in her data—for example, she understands that the different clusters of data points she observes are due to analysis batches. To include them in her dataset for her analysis, Anna needs to ‘correct’ for batch effects, which means applying statistical methods to make datasets homogeneous and coherent for analysis. In the example above, after learning about the analysis batches their contractor used to produce their data, Anna corrected her data, thus rendering invisible some of the experimental variation shaping her data. Through statistical manipulation, it no longer matters in what batch, or in what position on that batch, each sample was analysed; instead, Anna obtains a harmonized dataset. As a result of this process, Anna gains a workable dataset that can be used in the production of knowledge. At the same time, the processes of quality control are ways to continue building familiarity with the data, as practitioners further familiarize themselves with the data that have entered their home so they can utilize them for knowledge production.
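One simple form of the correction Anna describes can be illustrated as follows. Anna’s actual pipeline runs in R and adjusts for plate position as well; this Python sketch shows only the most basic variant, batch mean-centering, on invented numbers, to make the logic of ‘rendering technical variation invisible’ concrete.

```python
# Minimal sketch of batch correction by mean-centering: shift each
# analysis batch onto the overall (grand) mean so batch-to-batch
# technical offsets no longer dominate the measurements. The data
# are invented; real pipelines use richer statistical models.
from statistics import mean

def center_batches(values, batch_ids):
    grand = mean(values)
    groups = {}
    for v, b in zip(values, batch_ids):
        groups.setdefault(b, []).append(v)
    batch_means = {b: mean(vs) for b, vs in groups.items()}
    # subtract each batch's own mean, then restore the grand mean
    return [v - batch_means[b] + grand for v, b in zip(values, batch_ids)]

values  = [1.0, 1.2, 0.8, 2.0, 2.2, 1.8]   # batch b2 is shifted upward
batches = ["b1", "b1", "b1", "b2", "b2", "b2"]
corrected = center_batches(values, batches)
# after correction, both batches share the same mean
```

The sketch also makes visible the trade-off discussed below: centering removes the batch offset whether it was technical or biological, which is exactly why practitioners first read the producer’s documentation before deciding what to correct for.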

Data practitioners at the Wilson Lab are sometimes also faced with data that challenge their standard practices for processing data in the home. This may be the case when staff deal with data types they have never worked with before or data originating from a new data provider. In such instances, ‘understanding one’s data’ and performing quality control can bring about changes to their own protocols. For example, in the quote below, Anna recalled that when receiving data from a new provider they were faced with different ways of reporting data:

So we get units that we sort of have to live with because it is a widely used company and lots of people have it for their cohorts, so we have to accept that these are the units that it comes in, and there is precedence in other papers, for how to use the data.… Once we made the decision to stick with, we were going to have that process to include those files that were below our level of detection. And document that.

For Anna, working with these new data meant adapting the lab’s standard procedures for processing data by including new files and documentation. We thus learn how living with new data in the home of the Wilson Lab sometimes entails adapting to it and changing domestic protocols to make the best use of the available data. Intimacy between data and practitioners thus brings about change, not just in the data, but also in the home, thereby changing practices, routines, and staff competences. Also noteworthy here is how the Wilson Lab standards that Anna used to domesticate her data before analysis are shared by many in the community of data scientists. Here, the company providing the Wilson Lab with data shapes what data processing can be done on the data now living in the home of the laboratory. In other words, the standards of the home are both shaped by and shape what counts as standards outside the home.

Through these processes of quality control and correction, data practitioners turn data points into datasets ready for analysis. As they do so, they mould data into a research resource that can be cultivated for knowledge production. This is something we are able to grasp because domestication implies resource-making (Ween & Swanson, 2018). Resource-making is a social and political process that takes work: it transforms a source with the potential for productive wealth into a resource that can be used and lead to actual wealth (Ferry & Limbert, 2008).

In the Wilson Lab, data analysts do not just format their data and perform quality control to re-contextualize their data, but they transform their data into a resource for the home from which value can be extracted. Specifically, data practitioners learn about the data’s specificities because they want to know how best to use that resource. They are in fact often concerned about missing out on what they call ‘relevant associations’ present in the data, which might seem like outliers or ‘noise.’ Without knowing one’s data, the risk is that such portions of the data might be thrown out or corrected for because it is assumed that these outliers or noise are due to technical variation (and not biological variation). Or as Michael puts it, ‘What’s quite possible is that we destroy associations. So there could be things to find that we don’t find.’

Data domestication practices such as correction not only shape the datasets analysts work with, but they also mould what knowledge can be produced, and by association, they lead to forms of non-knowledge (Frickel et al., 2010; Pinel, 2022)—things that, because of the way data are shaped, cannot be found. As Kleinman and Suryanarayanan (2013) argue, ‘the production of knowledge is always matched by the corresponding production of ignorance.’ In their study of the role of insecticides in Colony Collapse Disorder, they demonstrate the influence of ‘epistemic forms’ defined as ‘the suite of concepts, methods, measures, and interpretations that shapes the ways in which actors produce knowledge and ignorance in their professional/intellectual fields of practice’ (Kleinman & Suryanarayanan, 2013). In our case, the practice of correcting data can be understood as an epistemic form that prescribes what counts as valuable data. Or as Bowker (2005) argues, the way we record information in archives profoundly shapes the kind of knowledge we produce. The domestication of data through their organizing and formatting also means silencing some data, which ultimately shapes what can and cannot be learnt from the data. However, for data practitioners at the Wilson Lab, becoming familiar with one’s data, learning about their origins, and performing quality control is a way to minimize non-knowledge. Instead of immediately removing what may be perceived as ‘noise’ in their data, data practitioners are encouraged to carefully assess their data and try to explain its behaviours.

The next step for data practitioners is to use the dataset they built for analysis. Many different analyses can be performed, and these depend on the projects data practitioners are involved in. One common analysis is a Genome-Wide Association Study (GWAS). In the Wilson Lab, a standard analysis pipeline is available to perform such an analysis. This pipeline is stored on the server, with documentation describing what it does and how to use it, with detailed RStudio commands, as well as examples. During the Education Week, Anna presented the GWAS tool to the group, demonstrating on her dataset what it does to data and the sort of results it produces. While she developed this analysis tool for her own project and based on her dataset, it can be used widely and applied to other datasets in the lab. When asked about the usefulness of such tools, Anna explained that ‘it’s mostly about efficiency and reliability’: members of the lab have built these tools based on Wilson Lab data, and the tools have been thoroughly tested and have proved efficient in producing results. Or as Anna put it, these tools help to ‘draw biological stories’ out of the available data. In other words, these analysis tools also constitute epistemic forms (Kleinman & Suryanarayanan, 2013) that shape what can be known from the data, and as a result, what we come to ignore.

At stake here is the cultivation of Wilson Lab’s data. Domestication is not just about controlling or transforming entities, but rather control and transformation in the home are ways to produce value from the domesticated entities (Anderson, 1997). Data practitioners enact a set of practices to turn their data into a resource that can be exploited to produce value. Using accepted tools, they shape and exploit their data to maximize the benefits to be derived from this resource. These data practices have implications for the sort of knowledge produced in that they function as epistemic forms which shape what we know, and consequently, what we ignore.

Discussion

Much of the critical scholarship on the contemporary data economy has focused on data journeys and how they are facilitated through specific forms of data work (Hoeyer et al., 2017; Kalender & Holmberg, 2019; Leonelli & Tempini, 2020). In this article, we too see data journeys as central to the contemporary data economy. But rather than focusing on travels per se and what enables them, we take the home, from which and to where data flow, as our analytical point of interest. The home, in the context of the data economy, constitutes a space delineated by borders, where data can be produced, stored, or used. To attend to the home, we draw on the concept of domestication and explore what happens to data as they come home after their travels across the data economy.

By thinking about data travels through the lens of domestication, we unpack how a series of rituals—such as storytelling and standardization—help make data dwell and form ties in the home of the laboratory. We also point out how, through practices such as formatting and quality control, data are transformed into a resource of the home from which value can be produced. Crucially, through the lens of domestication, we shed light on the mutually constitutive nature of such practices, whereby intimacy in the home not only shapes data but also data practitioners, their identity, belonging and skills. These practices also shape the home by policing its boundaries and ruling what data should go into it. By focusing on the home and pointing to its rituals, and the familiarity and intimacy it enables, the lens of domestication thus foregrounds rich forms of data work that go beyond practices of de- or re-contextualization (Leonelli, 2014). It brings to the fore the affective and ritualistic dimension of data travels, and how traveling across homes in the data economy is a deeply transformative process that leaves marks on data, but also on the people handling them and the laboratories through which they travel.

We suggest that the more data travel across the global data economy, the more data have to be domesticated. Crucially, well-domesticated data are data that can travel well and be used widely. When Genos staff adopt standard protocols to handle samples and carefully document how they turn samples into omics data, they not only tie the newly produced data to Genos, but also make possible data travels beyond Genos’s borders, enabling such data to be further domesticated in other homes. As such, the grounding of data in the home through domestication facilitates their travels through the global data economy.

As we unpack data domestication practices, we foreground the politics of belonging (Yuval-Davis, 2011) inherent to the data economy. While data may travel widely, they come with ties to places and people which have to be created, organized and mapped out for data to be valuable instruments. Data, like people, always come from somewhere, and living in a particular place means building a home for oneself with ties to a space and its people. The domestication of data in the home of the laboratory creates and organizes such ties. More broadly, we observe how domestication processes are involved in drawing the very boundaries delineating homes in the data economy. As data practitioners bring certain data into the home, they in fact enact the boundaries of the home by articulating who should be let in and who should be left out.

In this article, we come to see the global data economy as a collection of homes connected by flows, thus echoing the etymology of the term ‘economy’, originating from the Greek words ‘oikos’ (the house) and ‘nomos’ (management). Put that way, rather than seeing the global data economy as constituted by endless flows, we come to think about it as household management that happens simultaneously across several homes. Managing the home means opening the door to the outside to let some data in, while it also entails welcoming and shaping the data that have entered. These insights, we argue, shed an important light on valuation processes in the data economy. Rather than thinking about valuation as a matter of flows and controls across boundaries (Pinel, 2021; Vezyridis & Timmons, 2021), we see value creation in the data economy as a matter of belonging. Crucially, we show how rendering data valuable in the home means making them belong. This involves crafting and organizing the data’s ties to the home, rather than only imposing control and claiming ownership over data that have travelled.

Data domestication practices, while they take place in the home, have implications for the sort of knowledge that is produced and the knowledge that is left undone. As data practitioners tame data, using the standards and formats operating in the home, they make some things present while keeping absent what cannot be tamed, what does not fit, or what cannot be documented (Ratner & Ruppert, 2019). These processes mould data flows across the global data economy and shape the data available for knowledge production. While these data practices aim to make the most of the available data to produce knowledge about the world, they also silence certain data, and in that way constitute gaps in both data and knowledge.

Acknowledgments

Our first thanks go to members of the laboratories, who generously let Clémence into their daily work and enthusiastically shared their experiences and knowledge. We also thank the MeInWe team at the University of Copenhagen for stimulating discussions and providing valuable feedback on earlier versions of this article. Finally, we are grateful to the three anonymous reviewers for their insightful comments on the manuscript.

Author biographies

Clémence Pinel is a postdoc within the Centre for Medical Science and Technology Studies at the University of Copenhagen. In 2018, she completed her PhD at King’s College London and was awarded the Doctoral Prize from the UK Association for Studies in Innovation, Science and Technology (AsSIST-UK). Her work focuses on knowledge production and valuation practices in the life sciences, with a specific interest in data-intensive research.

Mette N. Svendsen is Professor in Medical Anthropology in the Centre for Medical Science and Technology Studies at the University of Copenhagen. She has pioneered several team-based multisited, multispecies, and interdisciplinary research projects. Her book Near Human (2022) made significant contributions to the study of ethical, existential, and political aspects of life science research and medical practice.

Footnotes

Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Carlsberg Foundation (grant number CF17-0016, PI Mette N. Svendsen).

ORCID iD: Clémence Pinel https://orcid.org/0000-0003-3874-0373

References

1. Anderson K. (1997). A walk on the wild side: A critical geography of domestication. Progress in Human Geography, 21(4), 463–485.
2. Berg M., Goorman E. (1999). The contextual nature of medical information. International Journal of Medical Informatics, 56(1–3), 51–60.
3. Birch K., Cochrane D., Ward C. (2021). Data as asset? The measurement, governance, and valuation of digital personal data by Big Tech. Big Data & Society, 8(1), 20539517211017308.
4. Bonde M., Bossen C., Danholt P. (2019). Data-work and friction: Investigating the practices of repurposing healthcare data. Health Informatics Journal, 25(3), 558–566.
5. Bowker G. C. (2005). Memory practices in the sciences. MIT Press.
6. Cassidy R. (2007). Introduction: Domestication reconsidered. In Cassidy R., Mullin M. (Eds.), Where the wild things are now: Domestication reconsidered (pp. 1–26). Berg Publishers.
7. Childe G. (1928). The most ancient East: The oriental prelude to European pre-history. Kegan Paul, Trench, Trubner.
8. Ciborra C. (1999). Hospitality and IT. In Kakola T. (Ed.), Proceedings of the 22nd Information Systems Research Seminar in Scandinavia (IRIS 22) (pp. 1–9). University of Jyväskylä, Department of Computer Science and Information Systems.
9. Clark N. (2007). Animal interface: The generosity of domestication. In Cassidy R., Mullin M. (Eds.), Where the wild things are now: Domestication reconsidered (pp. 49–70). Berg Publishers.
10. Clutton-Brock J. (1989). Introduction to domestication. In Clutton-Brock J. (Ed.), The walking larder: Patterns of domestication, pastoralism, and predation (pp. 7–9). Unwin Hyman.
11. Coolen H., Meesters J. (2012). Editorial special issue: House, home and dwelling. Journal of Housing and the Built Environment, 27(1), 1–10.
12. Diprose R. (2002). Corporeal generosity: On giving with Nietzsche, Merleau-Ponty, and Levinas. State University of New York Press.
13. Edwards P. N., Mayernik M. S., Batcheller A. L., Bowker G. C., Borgman C. L. (2011). Science friction: Data, metadata, and collaboration. Social Studies of Science, 41(5), 407–414.
14. Faulkner-Gurstein R., Wyatt D. (2021). Platform NHS: Reconfiguring a public service in the age of digital capitalism. Science, Technology, & Human Values, 48(4), 888–908.
15. Ferry E., Limbert M. (2008). Introduction. In Ferry E., Limbert M. (Eds.), Timely assets: The politics of resources and their temporalities (pp. 3–24). School for Advanced Research Press.
16. Franklin S. (2007). Dolly mixtures: The remaking of genealogy. Duke University Press.
17. Frickel S., Gibbon S., Howard J., Kempner J., Ottinger G., Hess D. J. (2010). Undone science: Charting social movement and civil society challenges to research agenda setting. Science, Technology, & Human Values, 35(4), 444–473.
18. Gieryn T. F. (1999). Cultural boundaries of science. University of Chicago Press.
19. Ginsburg F. (1987). Procreation stories: Reproduction, nurturance, and procreation in life narratives of abortion activists. American Ethnologist, 14(4), 623–636.
20. Haddon L. (2011). Domestication analysis, objects of study, and the centrality of technologies in everyday life. Canadian Journal of Communication, 36(2), 311–323.
21. Haraway D. J. (2007). When species meet. University of Minnesota Press.
22. Hinterberger A. (2012). Investing in life, investing in difference: Nations, populations and genomes. Theory, Culture & Society, 29(3), 72–93.
23. Hoeyer K., Tupasela A., Rasmussen M. B. (2017). Ethics policies and ethics work in cross-national genetic research and data sharing: Flows, nonflows, and overflows. Science, Technology, & Human Values, 42(3), 381–404.
24. Ingold T. (2000). The perception of the environment: Essays on livelihood, dwelling and skill. Routledge.
25. Kalender U., Holmberg C. (2019). Courtesy work: Care practices for quality assurance in a cohort study. Social Studies of Science, 49(4), 583–604.
26. Kleinman D. L., Suryanarayanan S. (2013). Dying bees and the social production of ignorance. Science, Technology, & Human Values, 38(4), 492–517.
27. Lanzara G. (2016). Shifting practices: Reflections on technology, practice, and innovation. MIT Press.
28. Leonelli S. (2012). Introduction: Making sense of data-driven research in the biological and biomedical sciences. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 43(1), 1–3.
29. Leonelli S. (2014). What difference does quantity make? On the epistemology of big data in biology. Big Data & Society, 1(1), 2053951714534395.
30. Leonelli S. (2016). Data-centric biology: A philosophical study. University of Chicago Press.
31. Leonelli S., Tempini N. (Eds.) (2020). Data journeys in the sciences. Springer.
32. Lien M. E., Swanson H. A., Ween G. (2018). Introduction: Naming the beast – exploring the otherwise. In Swanson H. A., Lien M. E., Ween G. (Eds.), Domestication gone wild: Politics and practices of multispecies relations (pp. 1–30). Duke University Press.
33. Malinowski B. (1992 [1922]). Argonauts of the Western Pacific. Routledge.
34. Mayernik M. S. (2019). Metadata accounts: Achieving data and evidence in scientific research. Social Studies of Science, 49(5), 732–757.
35. Meloni M. (2016). Political biology: Science and social values in human heredity from eugenics to epigenetics. Palgrave.
36. Morgan L. (2009). Icons of life: A cultural history of human embryos. University of California Press.
37. Pinel C. (2021). Renting valuable assets: Knowledge and value production in academic science. Science, Technology, & Human Values, 46(2), 275–297.
38. Pinel C. (2022). What counts as the environment in epigenetics? Knowledge and ignorance in the entrepreneurial university. Science as Culture, 31(3), 311–333.
39. Pinel C., Prainsack B., Mckevitt C. (2020). Caring for data: Value creation in a data-intensive research laboratory. Social Studies of Science, 50(2), 175–197.
40. Pinel C., Svendsen M. N. (2021). In search of ‘extra data’: Making tissues flow from personal to personalised medicine. Big Data & Society. Advance online publication. https://doi.org/10.1177/20539517211035664
41. Pols J., Willems D. (2011). Innovation and evaluation: Taming and unleashing telecare technology. Sociology of Health & Illness, 33(3), 484–498.
42. Ratner H., Ruppert E. (2019). Producing and projecting data: Aesthetic practices of government data portals. Big Data & Society, 6(2), 2053951719853316.
43. Sheller M., Urry J. (2006). The new mobilities paradigm. Environment and Planning A: Economy and Space, 38(2), 207–226.
44. Silverstone R. (1994). Television and everyday life. Routledge.
45. Sovacool B. K., Hess D. J. (2017). Ordering theories: Typologies and conceptual frameworks for sociotechnical change. Social Studies of Science, 47(5), 703–750.
46. Svendsen M. N. (2022). Near human: Border zones of species, life, and belonging. Rutgers University Press.
47. Torenholt R., Saltbæk L., Langstrup H. (2020). Patient data work: Filtering and sensing patient-reported outcomes. Sociology of Health & Illness, 42(6), 1379–1393.
48. Vezyridis P., Timmons S. (2021). E-infrastructures and the divergent assetization of public health data: Expectations, uncertainties, and asymmetries. Social Studies of Science, 51(4), 606–627.
49. Ween G., Swanson H. A. (2018). How the salmon found its way home: Science, state ownership, and the domestication of wild fish. In Swanson H. A., Lien M. E., Ween G. (Eds.), Domestication gone wild: Politics and practices of multispecies relations (pp. 196–214). Duke University Press.
50. Wilson P. (2007). Agriculture or architecture? The beginning of domestication. In Cassidy R., Mullin M. (Eds.), Where the wild things are now: Domestication reconsidered (pp. 101–122). Berg Publishers.
51. Yuval-Davis N. (2011). The politics of belonging: Intersectional contestations. Sage.

Articles from Social Studies of Science are provided here courtesy of SAGE Publications