AZoMaterials spoke with Philip Skinner and Kyle Mouallem about how structured data capture in Revvity Signals One can reduce manual data entry, improve search and traceability, and support analytics and machine learning. They discuss the three table types available in Signals One, how in-notebook visualizations are configured, and what it takes to integrate instrument outputs and move structured results into a data lake for cross-experiment analysis.
What do you mean by “structured” data in a lab notebook, and why do tools like Excel make retrieval and audit harder later on?
Philip: I’m Philip Skinner, and I work with our Signals Notebook and Signals One platform, a SaaS application focused on R&D for scientific organizations.
If you think about a lab notebook in a paper world, that content, although it may be organized and rational, isn’t very retrievable. You can read it and scan it, but that’s about it.
A lot of the data people capture electronically is still unstructured. Something like Microsoft Excel is incredibly powerful and obviously prevalent across scientific industries, but the data in there is not reliably structured. It may appear in rows and columns, but “structured” means you can search by values in a given column, and you have repeatable ways of capturing the same properties.
People try to achieve that in Excel through templates, and those templates drift. So data captured for assay results, process data, and similar purposes ends up in tools like Excel, Word, and PowerPoint, and many legacy applications suffer from the same problem.
The value is freedom for the end-user scientist, but with that freedom comes inconsistency. It’s hard to audit because you can’t predict if it was captured the same way every time, and it’s hard to query it by the properties it represents. What we’ve focused on is capturing data in more predictable ways, using structured tables that are repeatable in how they capture and represent data.

Image Credit: KanawatTH/Shutterstock.com
Once you have that kind of consistent structure, what does it unlock downstream that makes the constraint worth it for scientists?
Philip: We try to think beyond the feature and get to the underlying “why.” What problem are we solving when we capture structure, and what workflows does that support?
First, it supports the notebook’s core job: a narrative record of what you did, in a way that reduces cognitive load for the next person who has to read it. Then it supports retrieval. With structured tables, you can do faceted or filtered search, closer to something like shopping filters. If you’ve got a table with values in the 15 to 16 range, you can search specifically for tables with values between 15 and 15.5 in that property, rather than picking up unrelated entries where some other number happens to match.
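As a rough illustration of that difference (a sketch only, with made-up column names rather than anything specific to Signals One), a typed column supports a numeric range filter that plain text matching cannot:

```python
# Illustrative only: typed range filter versus text matching.
# Column names and values are invented for the example.
rows = [
    {"sample_id": "S-001", "viscosity_cp": 15.2},
    {"sample_id": "S-002", "viscosity_cp": 15.8},
    {"sample_id": "S-003", "viscosity_cp": 16.0},
]

# Structured capture lets you filter on the property itself...
hits = [r for r in rows if 15.0 <= r["viscosity_cp"] <= 15.5]
print(hits)  # only S-001

# ...whereas a plain text search for "15" would also match S-002
# and any other field where that substring happens to appear.
```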
Once the data is in, you also want to understand it. We’ve been expanding the ways you can capture structured data and then graph it, so you can see trends, differences, and outliers.
And then there’s the “garbage in, garbage out” issue with AI and machine learning. Clean, consistently captured data is critical for the quality of what comes out of the models. So the drivers are: reliable capture, precise retrieval, in situ visualization for insight, and clean inputs for AI/ML.
Signals One now has three table structures. When should a scientist use an admin-defined table versus a variation table versus a hierarchical table?
Philip: The first is what we call an admin-defined table, or ADT, which is basically a simple table. It has a two-level hierarchy because we allow headers at the top, and then a structured table below. Each property has a type (time, date, number, and so on). You can set conditional data entry, equations, and use headers both for metadata and to insert constants into calculations. You can also integrate other sources in Signals, or your own databases, for example, pulling in a list of samples by sample ID. ADTs have been around for many years.
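To make the idea concrete, here is a loose sketch of what an ADT definition captures; the field names and structure are illustrative assumptions, not the actual Signals One schema:

```python
# Illustrative sketch of an admin-defined table (ADT): header constants plus typed columns.
# Field names are hypothetical, not the actual Signals One schema.
adt_definition = {
    "headers": {"project": "text", "dilution_factor": "number"},  # metadata / constants
    "columns": {
        "sample_id":      {"type": "text"},
        "measured_conc":  {"type": "number", "unit": "mg/mL"},
        "qc_pass":        {"type": "boolean"},
        "timestamp":      {"type": "datetime"},
        # Equation column: uses a header constant in the calculation.
        "corrected_conc": {"type": "number", "unit": "mg/mL",
                           "equation": "measured_conc * dilution_factor"},
        # Conditional entry: only required when the QC flag is false.
        "failure_reason": {"type": "text", "required_if": "qc_pass == False"},
    },
}
```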
A few years ago, in partnership with Kyle, we built the Variation Table. It supports batch-style work where you have a top-level component list (ingredients), and then a series of variations that share the same rows but change values in certain columns. You get the same ingredients, but different amounts across the variations, and you can replicate that component table quickly while introducing variability only where it’s needed.
More recently, we introduced the Hierarchical Table. This is a multi-layer relational table, almost like a relational database inside an object in an experiment. It lets data flow down and aggregate back up. In a formulation example, you might have headers, then batches, each batch containing layers, and each layer containing ingredients. This fits cases where the “everything is the same, with small tweaks” model doesn’t hold. You can replicate properties down through the hierarchy and aggregate totals back up.
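As a rough sketch of that "flow down, aggregate up" shape (field names are made up for illustration, not the actual data model):

```python
# Illustrative sketch: a hierarchical table where ingredient amounts
# aggregate up through layers to the batch total.
batch = {
    "name": "Batch A",
    "layers": [
        {"name": "Base coat", "ingredients": [{"name": "Resin",    "mass_g": 40.0},
                                              {"name": "Solvent",  "mass_g": 10.0}]},
        {"name": "Top coat",  "ingredients": [{"name": "Resin",    "mass_g": 25.0},
                                              {"name": "Additive", "mass_g": 5.0}]},
    ],
}

def layer_total(layer):
    return sum(i["mass_g"] for i in layer["ingredients"])

def batch_total(b):
    return sum(layer_total(layer) for layer in b["layers"])

print(batch_total(batch))  # 80.0 g, aggregated back up from the ingredient level
```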
Everything we built into ADTs, like math, conditional entry, and integrations, carries through into Variation Tables and Hierarchical Tables. The difference is the shape of the data.
Why did you start in-notebook visualization with a scatter plot, and how does the admin-versus-end-user control model work?
Philip: It started from a simple ask: why can’t we just have a bar chart on an admin-defined table? We investigated, found tooling that let us script chart generation, and proved we could represent table data elegantly. The problem was that doing it required scripting, and we didn’t want subject matter experts to have to deal with that, so we needed a configuration experience with buttons, sliders, and options.
When we talked to people who wanted charting, what they actually needed first was a scatter plot. There was also a need for consistency, where managers wanted a consistent look and feel. So rather than the end user defining the chart, we wanted the SME or configuration admin to define it.
Now the admin can define the scatter plot on top of the table: what’s on the X-axis and Y-axis, whether there’s a secondary Y-axis, and how colors and shapes are assigned. You can plug in example data in the configuration view to sanity-check the output, and you can either let axes scale automatically or preset known ranges.
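A hypothetical configuration sketch of the kind of settings described here (the keys are illustrative, not the actual Signals One configuration format):

```python
# Hypothetical admin-side scatter plot configuration; keys are illustrative only.
scatter_config = {
    "x_axis":  {"column": "time_min", "label": "Time (min)", "range": "auto"},
    "y_axis":  {"column": "temperature_c", "label": "Temperature (°C)", "range": [20, 120]},
    "y2_axis": {"column": "pressure_bar", "label": "Pressure (bar)"},  # optional secondary axis
    "color_by": "batch_id",
    "shape_by": "operator",
    "end_user_editable": True,  # whether bench scientists may override these settings
}
```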
Bench scientists still want flexibility, so we’re also adding the ability for end users to change those settings, if the admin allows it. Inside an experiment, there’s a small configuration icon on the graph that exposes the settings. If they have edit access, they can save changes so they persist.
We also tied this into a parallel project around sending data downstream. We use “external actions,” which let a customer launch an application they built. From the graph and table view, you can select one or more points and execute the external action, sending those selected points to the downstream workflow.
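The hand-off to an external action might look roughly like this; the endpoint and payload shape are assumptions for illustration, not the documented interface:

```python
# Hypothetical sketch of sending selected plot points to a customer-built downstream app.
import json
import urllib.request

selected_points = [
    {"row_id": "r-017", "x": 42.0, "y": 15.3},
    {"row_id": "r-021", "x": 47.5, "y": 15.1},
]

req = urllib.request.Request(
    "https://example.internal/downstream-workflow",  # customer-owned endpoint (example only)
    data=json.dumps({"experiment": "EXP-0042", "points": selected_points}).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # commented out: illustrative only
```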
Kyle, from a day-to-day workflow perspective, what was the first “real” problem that admin-defined tables solved for your scientists?
Kyle Mouallem: I’m Kyle, and I work in digitization with R&D scientists across electronics and semiconductor-related workflows. I’ve been with Merck KGaA for about nine years. Five of those were spent working in R&D, pilot, and scale-up, and for the last four years, I’ve been working with IT and scientists to get better automated data flows into and out of the ELN.
My job has three main components: finding areas where scientists are getting slowed down, translating that into actionable configuration and engineering work, and then convincing scientists this is going to save them time. In R&D, things are flexible, but at some level, data can be structured.
Admin-defined tables were the first thing we really started working with. We use them broadly, usually built around instruments or processes, with multiple ADTs in a single template. When a scientist says they can’t structure their data, I ask, “What is it that you’re actually measuring?” Every time you use that instrument, the same data comes off it. So we structure around what does not change, and then give flexibility by letting scientists insert or remove certain tables depending on what they’re doing.
A concrete example is an ADT for a pilot plant batch sheet. It pulls data from inventory, including an assay, which for us means the concentration of the material, and then uses that concentration in a formula in a column right next to where the information comes in. Previously, with multiple lots of materials, a lot of time went into tracking down the concentration for a specific lot to fill out the formulation sheet. This saves time, and if the concentration isn’t available, we can update it in inventory, and it’s automatically available going forward.
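A minimal sketch of that calculation, assuming made-up lot numbers and assay values:

```python
# Batch-sheet calculation sketch: look up the assay (concentration) for a specific lot,
# then compute how much of that lot to charge. Numbers and field names are illustrative.
inventory = {
    "LOT-1234": {"material": "Monomer X", "assay_wt_pct": 48.5},
    "LOT-5678": {"material": "Monomer X", "assay_wt_pct": 51.2},
}

def required_charge_g(lot_id, target_active_g):
    """Mass of as-received material needed to deliver the target amount of active."""
    assay = inventory[lot_id]["assay_wt_pct"] / 100.0
    return target_active_g / assay

print(round(required_charge_g("LOT-1234", 100.0), 1))  # ~206.2 g of LOT-1234
```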
What specific formulation R&D challenge pushed you to collaborate on variation tables, and how does that change how technicians move through an experiment?
Kyle: In bench-top formulation R&D, a scientist is typically trying to make multiple variations in a single experiment to save time and get more data faster. You might set up multiple batches, say six buckets, where most of what’s in each bucket is the same. The technician goes ingredient by ingredient across all buckets, so they only have to focus on measuring one chemical at a time, and then there are small differences between batches.
That’s where the variation table came in. It lets you create multiple variants with small changes. There’s a button to clone a variant, so you copy everything and then edit only what’s different.
You could ask, “Why not copy and paste an admin-defined table multiple times?” The value is the views. The horizontal summary matches how technicians enter data, going one ingredient at a time across variants. It’s easier than scrolling through an experiment to find and edit multiple separate tables. The vertical summary is more for scientists, giving a quick overview of component differences across all batches.
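A small pandas sketch (with invented column names) shows how both summaries are just two pivots of the same underlying rows:

```python
# Illustrative only: the two views of a variation table, built from the same rows.
import pandas as pd

rows = pd.DataFrame([
    {"variant": "V1", "ingredient": "Resin",    "mass_g": 40.0},
    {"variant": "V1", "ingredient": "Additive", "mass_g": 2.0},
    {"variant": "V2", "ingredient": "Resin",    "mass_g": 40.0},
    {"variant": "V2", "ingredient": "Additive", "mass_g": 4.0},
])

# "Horizontal" view: one row per ingredient, variants as columns (how technicians weigh out).
print(rows.pivot(index="ingredient", columns="variant", values="mass_g"))

# "Vertical" view: one row per variant, ingredients as columns (quick overview for scientists).
print(rows.pivot(index="variant", columns="ingredient", values="mass_g"))
```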
Once instrument data starts flowing into the notebook automatically, what does the path to a data lake look like, and what does it enable?
Kyle: Even with structured tables, manual entry is still tedious. The question becomes: why have someone fill out tables when you don’t have to?
We’ve been working on instrument connectivity using middleware, in our case Scitara, to pull data directly from instruments and push it into Signals. The idea is that users run the tool and either use a sample ID to pull data or name folders after the experiment. With a single button press, data flows directly into the experiment. Once the experiment is created, a process checks for files, and as soon as those files drop into the right folder, they get pulled into the admin-defined table automatically.
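A generic sketch of that pattern follows; it is not the Scitara or Signals One API, and the endpoint, token, and payload shape are assumptions:

```python
# Generic folder-watcher sketch: parse new result files from a folder named after the
# experiment and post the rows to the notebook via a REST endpoint (hypothetical).
import csv
import json
import time
import urllib.request
from pathlib import Path

WATCH_DIR = Path("/data/instruments/EXP-0042")
ENDPOINT = "https://example.signals.local/api/adt/EXP-0042/rows"  # hypothetical endpoint
seen = set()

def push_rows(rows):
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(rows).encode(),
        headers={"Content-Type": "application/json", "Authorization": "Bearer <token>"},
    )
    urllib.request.urlopen(req)

while True:
    for f in WATCH_DIR.glob("*.csv"):
        if f not in seen:
            with f.open() as fh:
                push_rows(list(csv.DictReader(fh)))  # one table row per CSV row
            seen.add(f)
    time.sleep(10)  # poll every 10 seconds
```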
That saves time and reduces manual transfers, whether that’s USB, copy/paste, or typing. And once the data comes in automatically, you can view it in a useful way. The scatter plot updating in near real time is a big deal. If you’re running something like a distillation column with temperature probes, you can have the notebook open and watch trends as they develop.
Once you have structure and automation, you also want to pull the data out. We’re setting up flows into a data lake, for example, Foundry, but it could also be SQL-based systems, Snowflake, and so on. The goal is to combine data across experiments so you can analyze at an experiment-by-experiment or batch-by-batch level, and stop rebuilding project datasets manually in Excel.
Setting up schemas takes work. Every time we pull in an admin-defined table, we add it into the ontology definition, specifying what each column means and making sure we’re not duplicating concepts. We pull data out via the API. Signals One has an expansive REST API, so you can extract table data as JSON, parse it, and push it to whatever warehouse or lake you’re using. There’s usually some cleaning and restructuring needed, especially with variation tables, but it’s workable.
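A sketch of that extraction step under stated assumptions (the endpoint path and JSON shape are illustrative, not the documented API):

```python
# Pull ADT content as JSON via a REST API, flatten it into rows tagged with the source
# template (for ontology mapping downstream), and stage it for the data lake.
import json
import urllib.request
import pandas as pd

BASE = "https://example.signals.revvity.com/api"  # hypothetical base URL

def fetch_table(table_id, token):
    req = urllib.request.Request(f"{BASE}/tables/{table_id}",
                                 headers={"Authorization": f"Bearer {token}"})
    return json.load(urllib.request.urlopen(req))

def to_rows(payload):
    # Keep the template identifier so downstream mapping knows what each column means.
    template = payload.get("templateId", "unknown")
    return [dict(row, source_template=template) for row in payload.get("rows", [])]

# df = pd.DataFrame(to_rows(fetch_table("adt-123", token)))
# df.to_parquet("staging/adt-123.parquet")  # then load into Foundry, Snowflake, etc.
```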
Looking ahead, what visualization types are next, and how are customer use cases shaping what gets built first?
Philip: Today, table visualizations on ADTs support scatter plots, with configuration defined by the admin. The next step is end-user definition, and bar charts are the next chart type we’re actively working on.
We’re also thinking about bubble charts, which are basically scatter plots with extra settings like size-by and transparency, and comparison charts like histograms, box plots, and violin charts. We’ve also thought about Gantt-style charts for timeline data. Another area is helping people choose: a recommendation approach where, given your data and the chart types we support, you can preview options and pick what fits.
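For intuition, a bubble chart really is just a scatter plot with size and transparency mapped to extra columns, as in this minimal matplotlib sketch with made-up data:

```python
# Minimal bubble chart sketch: marker area and alpha carry the extra dimensions.
import matplotlib.pyplot as plt

x = [1.0, 2.0, 3.0, 4.0]
y = [15.2, 15.8, 16.0, 15.5]
size_by = [30, 120, 260, 80]  # e.g. batch size, mapped to marker area

plt.scatter(x, y, s=size_by, alpha=0.5)  # alpha provides the transparency setting
plt.xlabel("Time (h)")
plt.ylabel("Measured value")
plt.show()
```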
We’re prototyping these, but prioritization depends on real use cases. The model we want to repeat is what we did with Kyle: sit down, review the workflow, get representative scrubbed data, and understand what “good” looks like. Some specific asks, like generative topographic mapping, aren’t on the roadmap right now, but they can still be part of that conversation if there’s a clear use case and we understand the stepping stones.
Can you answer a few practical questions we hear often? Will charts update as new data arrives? Is the search SQL-based? How do ontologies fit in, and can tables be curated through the API?
Philip: Yes, if you have an integration that’s adding data to a table automatically, the graph updates in near real time. You can sit there and watch points come in and the plot update as the data arrives, which is useful if you’re watching for an endpoint or an emerging trend.
On search, it’s not SQL-based. Signals One is a modern cloud SaaS application, and search is Elastic-based. We also provide a robust set of APIs so data can be queried programmatically, including search.
On ontologies, we support ontologies within Signals Notebook and Signals One, and you can choose to use them or not. You can bring your own ontology, or use third-party options such as SciBite. Internally, the push is to reuse consistent definitions, for example, defining “conductivity” and units once and making sure it’s consistent across tables.
And yes, tables can be curated via the API. As a rule of thumb, the API lets you automate what you can do in the user interface. You can also pull table data out via the API, and the payload includes information about the admin-level table templates it came from, which helps you sort and organize it downstream.
Finally, what AI-related capabilities are you bringing into Signals One next, and how do they connect back to the structured data foundation?
Philip: Two AI-focused offerings we’re highlighting are Signals Xynthetica, a model-as-a-service solution for AI-driven molecular design and property prediction, and our partnership with Lilly TuneLab to bring proven AI/ML models into Signals One through Xynthetica integrations.
The connection back to structured data is simple: if you want reliable analytics and models, you need reliable inputs. The more consistently data is captured, the easier it is to retrieve, visualize, and use in downstream AI/ML workflows.
About Philip Skinner and Kyle Mouallem
Philip Skinner is a Spotfire evangelist, specifically in the life science and chemistry communities, and is delighted to bring his experience and passion to consulting for customers of Revvity Signals who are using or considering using Spotfire. Prior to joining Revvity Signals, Philip spent a decade working as a medicinal chemist at a San Diego-based biotech, progressing GPCR ligands into clinical development for metabolic diseases. Philip holds a PhD in Chemistry from the University of Durham and completed postdoctoral studies at ETH Zurich.
Kyle Mouallem is a digitization and connected-labs leader at Merck KGaA, Darmstadt, Germany (EMD Electronics), where he serves as Program Lead for Accelerate R&D, a global initiative transforming Electronics R&D labs into connected, data-driven environments. His work focuses on instrument connectivity, ELN-centered digital workflows, and building FAIR-by-design measurement data foundations that enable faster decision-making and readiness for advanced analytics and AI/ML. Kyle has led multi-site rollouts spanning equipment and metrology data across thin films and analytical laboratories, delivering measurable outcomes including major cycle-time reductions, faster instrument onboarding, and quantified productivity gains through reduced manual work. He holds a B.S. in Chemical Engineering from Arizona State University, is an Engineer-in-Training (Arizona), and is certified as a Foundry Data Scientist, with additional training in AI/ML and Python.

This information has been sourced, reviewed, and adapted from materials provided by Revvity Signals Software Inc.
For more information on this source, please visit Revvity Signals Software Inc.
Disclaimer: The views expressed here are those of the interviewee and do not necessarily represent the views of AZoM.com Limited (T/A) AZoNetwork, the owner and operator of this website. This disclaimer forms part of the Terms and Conditions of use of this website.