Aquarium: open-source laboratory software for design, execution and data management

Page created by Elmer Patterson
 
CONTINUE READING
Aquarium: open-source laboratory software for design, execution and data management
Synthetic Biology, 2021, 6(1): ysab006

                                                                                doi: 10.1093/synbio/ysab006
                                                                                Advance Access Publication Date: 30 January 2021
                                                                                Research Article

Aquarium: open-source laboratory software for design,
execution and data management

                                                                                                                                                           Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
Justin Vrana 1, Orlando de Lange 2, Yaoyu Yang 2, Garrett Newman2,
Ayesha Saleem2, Abraham Miller2, Cameron Cordray2, Samer Halabiya2,
Michelle Parks2, Eriberto Lopez2, Sarah Goldberg 2, Benjamin Keller 2,
Devin Strickland 2, and Eric Klavins2,*
1
 Department of Bioengineering, University of Washington, Seattle, WA, USA and 2Department of Electrical
and Computer Engineering, University of Washington, Seattle, WA, USA
*Corresponding author: E-mail: klavins@uw.edu

Abstract
Automation has been shown to improve the replicability and scalability of biomedical and bioindustrial research. Although
the work performed in many labs is repetitive and can be standardized, few academic labs can afford the time and money
required to automate their workflows with robotics. We propose that human-in-the-loop automation can fill this critical
gap. To this end, we present Aquarium, an open-source, web-based software application that integrates experimental
design, inventory management, protocol execution and data capture. We provide a high-level view of how researchers can
install Aquarium and use it in their own labs. We discuss the impacts of the Aquarium on working practices, use in biofoun-
dries and opportunities it affords for collaboration and education in life science laboratory research and manufacture.

Key words: automation; replicability; software; LIMS

1 Introduction                                                                  to either top-down enforcement, as with clinical research, or to
                                                                                the fact that doing so is simpler and more reliable, as with kits
As the scale of scientific research expands, systems to support                 for common procedures such as plasmid DNA isolation.
replicable methods are increasingly important (1). Critical to                  Similarly, record keeping varies from hand-written laboratory
replicability is the question of how experiments are described                  notebooks and manually curated digital records to fully struc-
and how closely these descriptions are followed. Working prac-                  tured electronic notebooks.
tices for conducting biology experiments fall on a continuum                        A consequence of less structured approaches is that many
between artisanal and highly standardized. On one end, idio-                    experiments cannot be repeated by different researchers (2, 3).
syncratic decisions made on the fly by a few people appear                      These types of issues are typically described under the terms
throughout the experimental design. At worst, these decisions                   replicability (reproducibility of results) (4–7). While replicability in
are poorly documented, while at best they introduce extra ex-                   science is a complex and multifaceted issue (8, 9), enforcement
perimental factors that make results challenging to compare                     of standardization of experimental work can result in more rep-
and replicate. At the other end, researchers follow well-                       licable data collection (10, 11). However, the reality is that main-
established protocols and do not deviate from them. This is due                 taining this standardization and record keeping is laborious and

Submitted: 1 October 2020; Received (in revised form): 12 January 2021; Accepted: 22 January 2021
C The Author(s) 2021. Published by Oxford University Press.
V
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/
licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
For commercial re-use, please contact journals.permissions@oup.com

                                                                                                                                                  1
Aquarium: open-source laboratory software for design, execution and data management
2   | Synthetic Biology, 2021, Vol. 6, No. 1

researchers often fail to enforce strict standards on themselves     design plans using a graphical user interface (GUI) that resem-
at the bench (12).                                                   bles a sketch board (Figure 1). Plans are built from workflows,
    Robotic automation has been proposed as a solution for rep-      essentially stereotyped series of procedures, in which materials
licable large-scale experimentation by delivering consistent per-    pass from an initial state to a final state. Within plans, each in-
formance of delicate high-throughput procedures (13).                put sample passes through a series of work modules termed
However, commercially available liquid handling robots and           operations (Figure 2b), to produce desired output samples and
general-purpose manipulators have high-upfront costs and are         data. For example, a plan that ends with a sequence-verified
laborious to reconfigure or reprogram (14). While robotic sys-       plasmid stock (Figure 1) might include a series of operations
tems are continually improving, labs solely reliant on robots to     such as PCR, DNA assembly, bacterial transformation and plas-
carry out experimental work are rare. Many experiments in-           mid DNA purification. A plan may represent a fixed workflow
volve tasks for which a human operator is well suited and more       that always executes in the same way, or it may be extended as
cost-effective than currently available robots; for example, tasks   it is executed. Thus, enabling the use of Aquarium for either
involving delicate hand-eye coordination or variable inputs and      manufacturing or exploratory research and development.

                                                                                                                                            Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
protocols. Hence, it seems likely that human researchers will be         Operations correspond to units of work that can be per-
involved in benchtop experimentation for some time, and thus         formed on one sample, by one person, within a single work ses-
the potential for non-standardized execution and sub-optimal         sion. Each operation will generally output a sample that can be
record keeping will remain as a concern.                             stored or used in a variety of other operations. Operations are
    To address this challenge, we built Aquarium—a web-based         wired together such that the output of one operation is auto-
application that integrates experimental design, inventory           matically routed to and triggers the execution of one or more
management, protocol execution, and data collection.                 subsequent operations (Figure 2b). Each operation is defined by
Aquarium supports flexible development and deployment of             valid inputs and outputs as well as a detailed laboratory proto-
standardized workflows, composed of modular protocols that           col, written in Krill, that renders on-screen instructions to guide
drive on-screen, step-by-step instructions for human techni-         technicians. An Aquarium plan can encode arbitrarily large and
cians. During execution, experimental data and metadata are
                                                                     complex programs of work that progress automatically. As the
captured in forms or uploaded as files. The software automates
                                                                     plan is executed, the state of inventory is automatically updated
computations involved in preparing and tracking samples
                                                                     and data are captured, stored and made available through the
through protocol execution. Aquarium also provides features to
                                                                     GUI as well as the Python API.
plan complex experiments involving many samples and
protocols.
    Aquarium integrates two key software innovations:                2.2 Executing laboratory work
Aquarium Workflow Language (AWL) for defining custom labo-           Aquarium contains its own laboratory information manage-
ratory workflows and Krill, a protocol language for describing       ment system (LIMS) that tracks lab inventory. Through the
replicable laboratory instructions. AWL is a dataflow program-       LIMS, changes in inventory are recorded automatically as a part
ming language (15) that represents a laboratory workflow as a        of protocol and workflow execution, rather than requiring man-
network of modular work units linked by inputs and outputs.          ual updates. Hence, the workflow planner and Krill can reliably
Borrowing concepts from visual programming languages, such           use the LIMS as an up-to-date representation of the laboratory.
as Scratch (16), protocols are represented graphically as blocks         In Aquarium, a physical object is referred to as an item. Each
that can be wired together to create workflows. Krill is a Ruby      item has a recorded location and may have associated data
domain-specific language that complements AWL by capturing           (Figure 3). An item is an instance of a sample, which is a class of
granular instructions for a protocol as computer code. In addi-      physical objects in the laboratory defined by a set of descriptors
tion to complex procedural steps such as if-then statements,         determined by a sample type. The information fields used to de-
loops and calculations, Krill has methods to facilitate sample       fine each sample type are chosen by the user and then apply to
flow management and render instructions for technicians              every sample of that sample type. For example, a user may de-
working at the bench. Through AWL and Krill, Aquarium pro-           fine a ‘plasmid’ sample type, including fields for information on
vides interactive web-based interfaces to build executable pro-      sequence, length and selectable markers, which would have to
tocols, design experimental workflows based on these                 be defined for each plasmid sample in the database. Using sam-
protocols, manage the execution of protocols in the lab and au-      ple types ensures that inventory descriptions are standardized
tomatically record the resulting data.                               while allowing the flexibility of custom definitions based on
    Aquarium also features a Python application program inter-       user needs.
face (API), called Trident that provides a common interface for          Each item belongs to a user-defined object type. One key pa-
other applications and scripts to interact with Aquarium, for ex-    rameter for the definition of an object type is the default location
ample, in planning complex workflows or extracting detailed          wizard for items of this object type. Location wizards correspond
datasets. These three programmatic interfaces, combined with
                                                                     to storage locations such as fridges and freezers and are repre-
inventory management and human-centered execution, make
                                                                     sented as matrices with unique numbered positions for items
Aquarium a comprehensive, open-source software platform
                                                                     within boxes, organized into rows and shelves (Figure 3b). Other
that facilitates low-cost scaling of laboratory research while
                                                                     than the location wizard, the name of each object type is its
retaining replicability and flexibility.
                                                                     most important defined parameter and will indicate a physical
                                                                     state of the sample, reflecting how it manifests or is used in the
2 Results                                                            laboratory. For example, a plasmid sample might have items as-
                                                                     sociated with it belonging to a number of object types such as
2.1 Planning laboratory work with Aquarium
                                                                     ‘Plasmid miniprep’, ‘Gibson assembly reaction product’ or ‘E. coli
From the perspective of the researcher, planning laboratory          overnight culture’. As items are generated in the course of labo-
work is the primary interaction with Aquarium. Researchers           ratory work, they are automatically assigned ID numbers as
Synthetic Biology, 2021, Vol. 6, No. 1 |           3

                                                                                                                                                                             Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
Figure 1. Workflow designer interface. Experimental plans can be created by dragging operation types (not shown) onto the designer canvas (right side) to create opera-
tions. Selecting a given operation input or output node users are prompted to select from a list of compatible up or downstream operations to create a custom work-
flow. Available inventory for each operation input is selected in the input view (bottom) and the I/O view (left). Designer tools are available for creating templates and
modifying/copying plans. Additionally there are several plan tools (top left) available for investigating input/output specifications, managing existing plans, and
launching plans. The designer also features annotation capabilities, allowing embedded text (such as Markdown; https://daringfireball.net/projects/markdown/),
images or links.

well as locations according to the location wizard defined by the                       and generate data, add and remove inventory, display videos
item’s object type.                                                                     and photos, retrieve user input, create interactive timers, output
   Once the items required to initiate an operation in a plan are                       audio alarms and send emails, among other functions. Krill
available in the lab, the inputs to the operation are satisfied and                     extends Ruby (17), a popular, dynamic, object-oriented language
the operation can be executed. In the manager subsystem, an                             used in web development. To facilitate development, Krill pro-
executable operation is batched into a job with operations of the                       vides ways of creating libraries of reusable code. A version con-
same type to be executed together (Figure 2). A job may include                         trol system allows a lab to record changes to their protocols over
operations from many different plans and researchers but can                            time, or revert their protocols to past versions.
only be created from operations of the same type. A strict exe-                             The Krill protocol is supplemented by two functions that fa-
cution policy governs how and when operations can be batched                            cilitate the proper execution of the protocol (Figure 4). The cost
into jobs and executed in the lab (Supplementary Figure S1).                            model calculates how much an operation may cost at the time of
Running a job launches a graphical user interface that displays                         execution. Cost models can use any information from
step-by-step instructions to guide a technician through the                             Aquarium to perform cost calculations, but typically calcula-
steps of the operation protocol (Figure 4).                                             tions use properties of samples or items used in an operation.
                                                                                        For example, an operation that orders a synthesized piece of
                                                                                        DNA may use the length and sequence of the DNA and check
2.3 Rendering instructions using the krill protocol
                                                                                        with a vendor website to establish an accurate monetary cost.
language
                                                                                        Some protocols may be labor-intensive, and so operation cost
The Krill protocol language produces detailed, context-specific                         models can include an estimation of the ‘labor rate’ and the av-
instructions for how materials and data should be handled for                           erage length of time required to complete the protocol.
each operation (Figure 4). To accomplish this, Krill provides                           Additional features in Aquarium allow the generation of
methods that allow arbitrary computations to be rendered dy-                            monthly spending reports and budget tracking for each user or
namically, so that specific instructions presented to the techni-                       user group. The precondition defines conditions required for a
cian can be made to reflect not only the number of items being                          protocol to be run; the default precondition is always true.
processed, their locations and ID numbers, but also the results of                      Preconditions are a critical part of an operation’s execution pol-
calculations such as pipetting volumes based on the molarity of                         icy (Supplementary Figure S1) and can be used to institute more
a solution. Thus, a Krill protocol describes a procedure in enough                      complex experimental workflows. For instance, enforcing a 12 h
detail that it can be replicated by another person or lab by follow-                    delay to wait for an E. coli plate to grow. Like the cost model, pre-
ing specific instructions on a tablet or computer screen. Krill                         condition code may use any of the other subsystems to estab-
methods include those to execute complex calculations, retrieve                         lish running conditions. For example, an operation that runs a
4   | Synthetic Biology, 2021, Vol. 6, No. 1

        (a)                                                                         (b)

                                                                                                                                                                             Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
        (c)                                                                         (d)

Figure 2. AWL. Depictions of operation, operation type, plan, and job models that comprise an AWL. (a) An example of a ‘Bacterial Transformation’ operation type is
displayed. Operation types define specific ways in which input samples and items can be processed to produce outputs. Each operation type contains specifications for
its input and output types. For example, the ‘DNA’ input of a hypothetical bacterial transformation operation type may be satisfied by a ‘maxiprep of plasmid library’
or a ‘miniprep of plasmid’. Input and output types are entirely customizable and may include any number of sample type and object type specifications. Sample rout-
ing, if provided, ensures the input and output samples are mapped correctly upon operation execution; here the input ‘DNA’ sample will be mapped to the
‘Transformed Cells’ output sample. Non-inventory inputs (i.e. parameters) can also be defined as inputs to operation types. (b) An example of a bacterial transforma-
tion connected to a colony PCR operation. Operations types are instantiated to operations when their input and output types are satisfied by items. Here, a bacterial
transformation uses specific items in the LIMS and a parameter (37 C) to produce a bacterial plate. After executing the transformation, the output plate is wired to the
colony PCR operation. The colony PCR outputs the amplicon to an empty well in a stripwell. Notice that the operation type sample routing ensures sample information
(here pUC19-GFP, sample 442) is maintained throughout the series of tasks. (c) Operations from several different researchers and different plans can be batched to-
gether into jobs if they have the same operation types. For instance, all ‘Colony PCR’ operations from all users can be run as a single job. (d) Once operations have been
batched into jobs, jobs can be run divorced from Aquarium plans because all necessary information for execution is included in the job. These jobs can then be per-
formed concurrently by separate technicians.

colony PCR on a bacterial plate may halt operation if there are                         data as attachments to inventory items, operations or plans.
data associated with the plate indicating there was contamina-                          Data associations may include numerical (e.g. DNA concentra-
tion, or if the plate is less than 12 h old and thus not ready to be                    tion), text (e.g. experiment notes) or file uploads (e.g. sequenc-
run. Finally, the documentation contains human-readable mark-                           ing results). Data associations can be created manually through
down text that describes how the protocols are used and                                 the GUI or automatically by a protocol during a job. During exe-
executed.                                                                               cution, protocols may include specific steps instructing the
                                                                                        technician to record or upload data (Figure 4-vii). Post-
2.4 Recording and accessing experimental data                                           execution, data can be accessed by researchers through the GUI,
                                                                                        or by scripting via Aquarium’s Python API, known as Trident.
Aquarium records data in several ways. By default, Aquarium                             Trident allows custom Python applications that power visual-
automatically logs protocol metadata during job execution.                              izations, interfaces, reports or machine-learning workflows to
These metadata include operations batched within the job as                             communicate easily with Aquarium.
well as the identity of submitting users and IDs of source plans
for each operation. Job logs also capture technician identity, job
start and end time, timestamps for each step in a protocol,
                                                                                        2.5 Interacting with Aquarium
job error records, and inventory handled. In addition to the job                        There are five major interfaces in Aquarium: designer, plans,
metadata, Aquarium has generic data associations that record                            manager, samples and developer. The designer tab provides
Synthetic Biology, 2021, Vol. 6, No. 1 |   5

 (a)                                                                                  descriptions are based on our experience using the system
                                                                                      while recognizing that individual members of laboratory per-
                                                                                      sonnel have often adopted multiple or blended roles.
                                                                                          Researchers, including graduate students and postdocs, use
                                                                                      Aquarium’s LIMS to define new samples and the AWL (Figure 1)
                                                                                      to design and launch plans. Once a researcher is ready to launch
                                                                                      a plan, costs are computed with the operation cost models, and
                                                                                      the researcher assigns the total to a budget that Aquarium uses
                                                                                      to automatically track spending and generate reports. After
                                                                                      launching, plans become visible within the plans interface,
                                                                                      where the researcher can see the status of each operation in the
                                                                                      plan, as well as access collected data. Some power users in the
 (b)                                                                                  researcher role entirely bypass the Aquarium browser front-end

                                                                                                                                                            Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
                                                                                      and instead add sample definitions, submit plans and retrieve
                                                                                      data through the Trident API. We have found that these tools al-
                                                                                      low researchers to spend minimal time at the lab bench, and
                                                                                      more time reading literature, planning experiments and analyz-
                                                                                      ing data.
Figure 3. Aquarium inventory types. The procedures of a given research lab will           Developers use Aquarium’s IDE to specify operation types and
require handling of multiple different sample types, representing things like
                                                                                      associate code (Figure 4). In our experience Aquarium protocol
DNA plasmid, various cell lines or chemical reagents. (a) For instance samples of
                                                                                      drafting typically begins with a pre-existing paper-based or digi-
type ‘Plasmid’ may be defined by a marker, length or sequence. Properties for
sample types are defined by the user allowing for custom definition of inven-         tal protocol as references, with the developer often working
tory. In Aquarium, there are no hardcoded concepts of ‘Plasmids’ or any other         with an experimentalist for guidance. Once a protocol has been
form of laboratory inventory. Once a sample type has been defined samples can         tested in the IDE, it can be deployed, making it available to add
then be added to the database. Items are physical manifestations of samples           to plans and run in jobs. Developers also work with researchers
and always have a location, as well as the ability to carry data associations. Each
                                                                                      and managers to develop the cost model, documentation and
item is of a given type, known as an object type defining relevant physical prop-
                                                                                      preconditions to create cohesive workflows (Figure 4d).
erties relevant to laboratory handling. (b) Each item produced automatically gets
assigned a location. How this location is assigned is reconfigurable in Aquarium.         Managers batch operations and schedule and launch jobs, in
Here, a   20C freezer is designated ‘M20’ and has three dimensions (shelf, box        the process deciding how many operations to include in each
and position). The location designation and capacity along the dimensions are         job, when each job should be executed and which technician
customizable. Which types of items go into which locations are defined in the         should run the job.
item’s object type. Proper management of item locations in Aquarium is critical
                                                                                          Technicians execute jobs at the lab bench in accordance with
as this feeds into protocol execution, which may use (or alter) item location dur-
ing execution.
                                                                                      the on-screen instructions provided by the protocol code
                                                                                      (Figure 4). Instructions typically include item retrieval and stor-
                                                                                      age, sample preparation and handling, operation of laboratory
access to the AWL interface, in which users can draft and
                                                                                      instruments and data uploading. Technicians may be guided to
launch plans, selecting from available operation types. The
                                                                                      directly upload data files from cameras or other equipment, or
plans interface offers a summary of launched plans including
                                                                                      asked to create data based on prompts (e.g. answering whether
up to date sample and status data. The manager tab is used to
                                                                                      or not a band of a given length is present on a gel).
batch operations into jobs and run them. Within the manager                               As well as formalizing personnel roles, Aquarium facilitates
interface, all operations are accessible, grouped by category, op-                    a conceptual shift to thinking about all laboratory work, includ-
eration type and status. Launching a job from the manager in-                         ing both manufacturing and experimentation, as composed of
terface starts with the on-screen instructions used by                                modular units, with the steps of each modular unit standard-
technicians (Figure 4). The samples interface provides search-                        ized. Standardization of laboratory methods can be beneficial
able access to inventory. The developer tab provides access to                        for replicability (10, 11) and Aquarium provides a means to en-
an Interactive Development Environment (IDE) where Krill code                         sure that standardized procedures are both established and fol-
for protocols can be written and tested directly in the web                           lowed. This arrangement can also reduce experimental bias and
browser (Figure 5).                                                                   shield sensitive sample information.

3 Discussion                                                                          3.2 Aquarium use cases
3.1 Specialization of roles                                                           Aquarium has been used for a number of applications beyond
                                                                                      the work of academic research groups. These include biofoun-
Aquarium facilitates, but does not require, a division of person-                     dries, service laboratories and laboratory skills training.
nel roles roughly corresponding to the different front-end inter-                        Biofoundaries are facilities providing laboratory services to
faces, thereby facilitating the standardization of laboratory                         the synthetic biology research community, generally including
workflows as a low-cost and flexible alternative to robotic auto-                     plasmid assembly and strain construction (1). Aquarium’s built-
mation systems. The module composition approach imple-                                in abstraction barrier between design and execution, and sys-
mented by AWL is intended to allow for flexible workflow                              tem for efficient task management are well suited to support
design, reflecting the reality of discovery-phase research, while                     biofoundries. Aquarium has supported a biofoundry at the
gaining the benefits of standardization, including replicability                      University of Washington, the UW BIOFAB that was first devel-
and efficiency gains from batched jobs. Laboratory roles can fur-                     oped for internal use in 2014 and then made publically accessi-
ther be divided into lab managers, scheduling and assigning                           ble in 2016. Between 2014 and 2020, the UW BIOFAB has run
jobs, and technicians, executing jobs. The following role                             over 30 000 jobs, serving 319 different users. BIOFAB technicians
6   | Synthetic Biology, 2021, Vol. 6, No. 1

 (a)

                                                                                                                                                                                  Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
 (b)

Figure 4. Krill protocol. An example of Krill protocol code and its corresponding rendered protocol instructions. Operation types have four sets of code that govern op-
eration behavior and scheduling: Krill protocol, precondition, documentation and cost. (a) Shown here is a snippet of Krill protocol code for the load template step in a
PCR amplification protocol with lettering highlighting aspects of the code. Operations are batched into a single job; when the job is executed, Krill code can access the
input and output information of all batched operations. (i) In this simple example, the template volume is calculated for each operation before generating instructions
for loading template DNA into wells. (ii) A new protocol step is rendered with a show block. (i–vii) Within the show block, various elements are rendered for the techni-
cian. (iii, iv) A title is displayed and has two checkboxes. Checkboxes must be checked before proceeding to the next step, forcing the user to be attentive. Operations
are iterated to display a table that uses the computed template volume, inputs and outputs of the operations. (v) Tables can be interactive and may include text inputs
or checkable boxes. (vi) A visible warning is displayed. (vii) A selection input instructs the user to select from a list of options; numerical, textual and file upload inputs
are also possible through Krill. (viii) Finally, an SVG graphics element can be rendered on the fly using operation information. (b) Optional precondition code governs
when operations can be scheduled into jobs. Cost model computes monetary costs prior to plan launch and documentation provides readable instruction about the un-
derlying protocol.

have assembled 23 million base pairs of DNA using 8.8 million                             crop analytics, and forensics. The impacts of the global pan-
base pairs of fragment DNA amplified in-house, and have built                             demics (such as COVID-19) highlighted the importance of low-
over 5700 different yeast strains. This work has supported syn-                           cost, flexible tools that can support the rapid scaling of labora-
thetic biology research efforts of the Klavins lab and collabora-                         tory services both in terms of throughput and geographical
tors (18–22), as well as other users with no shared research                              reach. Aquarium was recently used to support an HIV-
interests. The UW BIOFAB first implemented cloning and yeast                              resistance screening workflow for use in the developing world
construction services but has since moved on to offer plant cul-                          (23), taking advantage of Aquarium’s graphical technician inter-
tivation and transformation, mammalian cell culturing, protein                            face, data collection management and options for rapid deploy-
engineering, and next-generation sequencing and other                                     ment into new devices and locations.
workflows.                                                                                    Given the instructional efficacy of the technician interface,
    Operated by private companies or public institutions, service                         we have also found utility for using Aquarium as an education
laboratories support clinical diagnostics, agricultural soil and                          tool, teaching university laboratory courses with the software.
Synthetic Biology, 2021, Vol. 6, No. 1 |            7

                                                                                                                                                                                Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
Figure 5: Operation type integrated development environment (IDE). New operation types are developed through the operation type IDE. Input/output specifications
are defined for each operation, along with sample type and object type specifications for each input or output. Several tools are available (top) for editing Krill protocol,
precondition, cost and documentation code. A built-in protocol testing environment is available (top right) to speed workflow development.

Aquarium’s technician interface (Figure 4) delivers step-by-step                             An online hub (https://www.aquarium.bio/) for sharing and
instructions at the lab bench, reducing the need to front load                           peer-curation of Aquarium workflows supports the growing
learning of methods, similar to Just-in-Time Teaching (24). It                           user community. Aquarium workflows currently can be
has also been used to support undergraduate laboratory train-                            exported and published as Github repositories. Current develop-
ing courses at the University of Washington and elsewhere.                               ment plans include simplification of the Krill protocol language
                                                                                         to lower barriers and allow for wider use of Aquarium in life sci-
3.3 Comparison with other software                                                       ence research labs.
                                                                                             While Aquarium provides a way to formalize scientific work-
Aquarium is part of a growing ecosystem of laboratory research
                                                                                         flows and their execution so as to allow researchers without
software that we believe will become central to the working
                                                                                         high-level knowledge of the protocol to perform experiments
practices of researchers over the next decade. Similar to
                                                                                         reliably, it does not provide guidance on experimental design
Aquarium, existing software platforms like those provided by
                                                                                         choices. However, there have been many recent advances on
Benchling, Riffyn, Teselagen and Transcriptic have fully fea-
                                                                                         computer-aided design (CAD) tools for science (25–29). Using
tured LIMS capabilities that connect to protocols and workflows
in meaningful ways. However, as far as we are aware, Aquarium                            Aquarium and its Python API provides a way to execute experi-
is unique in its support for human-centric workflow execution                            mental plans developed by CAD software and return results in a
which allows labs to leverage existing equipment and person-                             machine-readable format. In the future, one can imagine com-
nel. This is in contrast to robotic automation approaches, like                          binations of such systems that mediate automatic design and
those provided by Transcriptic, that integrate workflow execu-                           submission of experiments, execution through Aquarium, auto-
tion primarily via laboratory robotics, which allows high-                               mated extraction and analysis of results, and rapid redesign.
throughput experimental automation. Other platforms special-
ize in other aspects of the laboratory research process, such as
Riffyn, which provides sophisticated tools for connecting and
                                                                                         4 Materials and methods
integrating workflow processing to data analytics; Benchling
and Teselagen integrate aspects of biodesign to LIMS and work-                           4.1 Glossary of Aquarium terminology
flow processes. Unlike these tools, Aquarium has limited capa-
                                                                                         The following is provided for disambiguation and covers
bilities for data processing and biodesign. Instead, Aquarium
                                                                                         Aquarium terminology used in this article for which an alterna-
focuses on flexible workflow planning and execution, leaving
                                                                                         tive common-usage definition exists. For a more complete de-
design and data analytics to other software better suited to that
                                                                                         scription of relevant terminology, please refer to the
task, such as those mentioned above. Aquarium’s open-source
                                                                                         documentation found at www.aquarium.bio.
Python API and flexible LIMS invite future integrations of
                                                                                             Sample: A biologically unique entity, with properties defined
Aquarium with other software systems.
                                                                                         by the needs of the user. A description of a specific plasmid is a
                                                                                         sample.
3.4 Future development of Aquarium                                                           Sample type: A category of samples, such as ‘Plasmid’ or
We support a growing Aquarium user community and are                                     ‘Mammalian Cell Line’.
aware of at least eight groups that have set up and operated in-                             Item: A physical manifestation of a sample that exists in the
dependent Aquarium servers for applications ranging from                                 laboratory. A miniprep stock is an item of a given plasmid
plant transgenics, to microbial strain construction to biomedical                        sample.
diagnostics. Aquarium is distributed under the MIT license to                                Object type: A category of items that includes a name and a
promote adoption by, and contributions from, users in any set-                           default location, and belongs to a particular sample type.
ting, whether academic, commercial or educational.                                       Examples could be ‘Plasmid Stock’ or ‘400 mL Bottle of Media’.
8   | Synthetic Biology, 2021, Vol. 6, No. 1

    Operation: The basic unit of laboratory work planned in           (DARPA), the Department of Defense or the United States
Aquarium, in which inputs are converted to outputs according          Government.
to a protocol defined using the Krill protocol language.
    Operation type: A protocol definition, which governs how          Acknowledgements
human-readable instructions are rendered for a batch of opera-
tions, and how operations change the inventory and other data.        Justin Vrana, Orlando de Lange, Devin Strickland, and Ben
    Plan: A set of operations that are linked by connecting inputs    Keller wrote and revised the manuscript. Eric Klavins and
and outputs.                                                          Yaoyu Yang conceptualized the software and wrote most of
    Job: A batch of operations of the same type that are run con-     the application code. Ben Keller, Abe Miller, Garrett
currently by a technician following instructions generated from       Newman, Phuong Le, Justin Vrana contributed to the appli-
the operation type protocol written in Krill.                         cation code. Abe Miller and Ben Keller prepared documenta-
    Krill: The domain-specific language used to define protocols,     tion and installation scripts. Tileli Amimeur, Nick Bolten,
a core element of an operation type. Krill extends Ruby by in-        Leandra Brettner, Cameron Cordray, Miles Gander, Sarah

                                                                                                                                               Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
cluding methods specific for managing Aquarium objects and            Goldberg, Samer Halabiya, Seunghee Jang, Yokesh
generating on-screen instructions for technicians.                    Jayakumar, Eriberto Lopez, Jon Luntzel, Cannon Mallory,
    Data association: Key-value pair that is associated with          Abraham Miller, Garrett Newman, Michelle Parks, Sundipta
plans, operations or items. Data associations are added auto-         Rao, Ayesha Saleem, Devin Strickland, Chris Takahashi,
matically during the execution of a job or manually by a user.        Justin Vrana, Yaoyu Yang and David Younger developed
                                                                      Aquarium workflows. Cami Corday, Samer Halabyla, Aza
4.2 Aquarium license and software availability                        Allen and Michelle Parks executed and tested workflows
Aquarium is distributed under the open-source MIT license.            and contributed to project ideas while managing the experi-
Aquarium, documentation and installation instructions are             mental laboratory.
freely available (https://www.aquarium.bio/) along with links to
                                                                      Conflict of interest statement. None declared.
Dockerized versions of the software. Code is maintained on
Github (https://github.com/aquariumbio/aquarium). Aquarium’s
Python API (Trident) is also under the open-source MIT license        References
and is hosted on the open-source python repository at PyPI            1. Jessop-Fabre,M.M. and Sonnenschein,N. (2019) Improving re-
(https://pypi.org/project/pydent/) and its documentation and in-          producibility in synthetic biology. Front. Bioeng. Biotechnol., 7,
stallation instructions are also freely available (https://aquari         18.
umbio.github.io/trident/).                                            2. Begley,C.G. and Ellis,L.M. (2012) Drug development: raise
                                                                          standards for preclinical cancer research. Nature, 483,
4.3 Aquarium software implementation                                      531–533.
                                                                      3. Fang,F.C. and Casadevall,A. (2012) Reforming science: struc-
Aquarium is implemented as a browser-based Ruby-on-Rails
                                                                          tural reforms. Infect. Immun., 80, 897–901.
application (https://rubyonrails.org/), with an AngularJS (https://
                                                                      4. National Academies of Sciences, Engineering, and Medicine.
angularjs.org/) and HTML5 front-end. The current implementa-
                                                                          (2019) Reproducibility and Replicability in Science. National
tion of Krill leverages Ruby (https://www.ruby-lang.org/en/),
                                                                          Academies Press.
which is a popular, dynamic, object-oriented language used in
                                                                      5. Goodman,S.N., Fanelli,D. and Ioannidis,J.P.A. (2016) What
web development. Data models are implemented using a
                                                                          does research reproducibility mean? Sci. Transl. Med., 8,
MySQL relational database (https://dev.mysql.com/).
                                                                          341ps12.
    Aquarium is distributed as a Docker (https://www.docker.
                                                                      6. Prinz,F., Schlange,T. and Asadullah,K. (2011) Believe it or not:
com) image (https://hub.docker.com/repository/docker/aquari
                                                                          how much can we rely on published data on potential drug
umbio/aquarium), along with Docker Compose (https://docs.
                                                                          targets? Nat. Rev. Drug Discov., 10, 712–712.
docker.com/compose/) scripts (https://github.com/aquarium
                                                                      7. Ioannidis,J.P.A. (2005) Why most published research findings
bio/aquarium-deployment) that can be used to orchestrate
                                                                          are false. PLoS Med., 2, e124.
backend, relational database and front-end services. The              8. Baker,M. (2016) 1,500 scientists lift the lid on reproducibility.
Trident API is implemented in Python using open-source librar-            Nature, 533, 452–454.
ies and is available as a Python package via pypi.org (https://       9. Fanelli,D. (2018) Opinion: is science really facing a reproduc-
pypi.org/project/pydent/).                                                ibility crisis, and do we need it to? Proc. Natl. Acad. Sci. USA,
                                                                          115, 2628–2631.
SUPPLEMENTARY DATA                                                    10. Fonio,E., Golani,I. and Benjamini,Y. (2012) Measuring behav-
                                                                          ior of animal models: faults and remedies. Nat. Methods, 9,
Supplementary Data are available at SYNBIO Online.                        1167–1170.
                                                                      11. Arroyo-Araujo,M., Graf,R., Maco,M., van Dam,E., Schenker,E.,
Funding                                                                   Drinkenburg,W., Koopmans,B., de Boer,S.F., Cullum-
                                                                          Doyle,M., Noldus,L.P.J.J. et al. (2019) Reproducibility via coor-
This material is based upon work supported by the Defense                 dinated standardization: a multi-center study in a Shank2 ge-
Advanced Research Projects Agency (DARPA) under                           netic rat model for autism spectrum disorders. Sci. Rep., 9,
Contract No. HR001117C0095. Any opinions, findings and                    1–10.
conclusions or recommendations expressed in this material             12. Nussbeck,S.Y., Weil,P., Menzel,J., Marzec,B., Lorberg,K. and
are those of the authors and do not necessarily reflect the               Schwappach,B. (2014) The laboratory notebook in the 21st
views of the Defense Advanced Research Projects Agency                    century: the electronic laboratory notebook would enhance
Synthetic Biology, 2021, Vol. 6, No. 1 |    9

    good scientific practice and increase research productivity.           21. Khakhar,A., Bolten,N.J., Nemhauser,J. and Klavins,E. (2016)
    EMBO Rep., 15, 631–634.                                                    Cell–cell communication in yeast using auxin biosynthesis
13. Miles,B. and Lee,P.L. (2018) Achieving reproducibility and                 and auxin responsive CRISPR transcription factors. ACS
    closed-loop automation in biological experimentation with                  Synth. Biol., 5, 279–286.
    an IoT-enabled lab of the future. SLAS Technol., 23, 432–439.          22. Gander,M.W., Vrana,J.D., Voje,W.E., Carothers,J.M. and
14. Naugler,C. and Church,D.L. (2019) Automation and artificial                Klavins,E. (2017) Digital logic circuits in yeast with
    intelligence in the clinical laboratory. Crit. Rev. Clin. Lab. Sci.,       CRISPR-dCas9 NOR gates. Nat. Commun., 8, 15459.
    56, 98–110.                                                            23. Panpradist,N., Beck,I.A., Vrana,J., Higa,N., McIntyre,D.,
15. Dennis,J.B. (1975) First Version of a Data Flow Procedure                  Ruth,P.S., So,I., Kline,E.C., Kanthula,R., Wong-On-Wing,A. et al.
    Language. Programming Symposium, Springer Berlin                           (2019) OLA-Simple: a software-guided HIV-1 drug resistance
    Heidelberg, Berlin, Heidelberg. 362–376.                                   test for low-resource laboratories. EBioMedicine, 50, 34–44.
16. Resnick,M., Maloney,J., Monroy-Hernández,A., Rusk,N.,                 24. Marrs,K.A. and Novak,G. (2004) Just-in-time teaching in biol-
    Eastmond,E., Brennan,K., Millner,A., Rosenbaum,E., Silver,J.,              ogy: creating an active learner classroom using the Internet.

                                                                                                                                                     Downloaded from https://academic.oup.com/synbio/article/6/1/ysab006/6124325 by guest on 13 December 2021
    Silverman,B. et al. (2009) Scratch: programming for all.                   Cell Biol. Educ., 3, 49–61.
    Commun. ACM, 52, 60–67.                                                25. Rohl,C.A., Strauss,C.E.M., Misura,K.M.S. and Baker,D. (2004)
17. Ruby Programming Language. (2020) https://www.ruby-lang.                   Protein structure prediction using rosetta. Methods Enzymol.,
    org/en/ (20 October 2020, date last accessed).                             383, 66–93.
18. Younger,D., Berger,S., Baker,D. and Klavins,E. (2017)                  26. Hillson,N.J., Rosengarten,R.D. and Keasling,J.D. (2012) j5 DNA as-
    High-throughput characterization of protein-protein interac-               sembly design automation software. ACS Synth. Biol., 1, 14–21.
    tions by reprogramming yeast mating. Proc. Natl. Acad. Sci.            27. Untergasser,A., Cutcutache,I., Koressaar,T., Ye,J., Faircloth,B.C.,
    USA, 114, 12166–12171.                                                     Remm,M., Rozen,S.G. et al. (2012) Primer3—new capabilities
19. Khakhar,A., Leydon,A.R., Lemmex,A.C., Klavins,E. and                       and interfaces. Nucleic Acids Res., 40, e115–e115.
    Nemhauser,J.L. (2018) Synthetic hormone-responsive tran-               28. Appleton,E., Tao,J., Haddock,T. and Densmore,D. (2014)
    scription factors can monitor and re-program plant develop-                Interactive assembly algorithms for molecular cloning. Nat.
    ment. Elife, 7, e34702.                                                    Methods, 11, 657–662.
20. Groves,B., Khakhar,A., Nadel,C.M., Gardner,R.G. and Seelig,G.          29. Nielsen,A.A.K., Der,B.S., Shin,J., Vaidyanathan,P., Paralanov,V.,
    (2016) Rewiring MAP kinases in Saccharomyces cerevisiae to reg-            Strychalski,E.A., Ross,D., Densmore,D. and Voigt,C.A. (2016)
    ulate novel targets through ubiquitination. Elife, 5, e15200.              Genetic circuit design automation. Science, 352, aac7341.
You can also read