Getting Started

The purpose of this tutorial is to teach users about the core features of Reveal Chromatography. Much more can be learned about these features in the User manual and we strongly encourage users to read that document when they wish to begin using this software with their own data. Here, pre-loaded data will be used to demonstrate how Reveal Chromatography’s tools can assist in the process of model calibration.

Launching Reveal Chromatography

The first time the application is launched, it will prompt the user for a license key:

To obtain a license key, please Contact us. Otherwise, type in the license key, and click OK. The application will remember this key until the license expires.

Once the license key has been authenticated, the application will start and display its welcome screen:

The interface comprises four basic components:
1. The central pane,
2. The User Data browser,
3. The Study Data browser,
4. The performance parameters pane.

The User Data browser displays study-independent parameters. It is where users define the properties of the system components they will use to build a study, such as products, resins, and columns.

To complement the User Data browser, the Study Data browser contains study-dependent data elements, simulation tools, and analysis tools. These data elements include study-specific buffers, purification method descriptions, binding models, and transport models, as well as the experiments and simulations built from these elements.

The central pane is where details about elements from the User and Study data browsers will be displayed when double-clicked. Each element will open in its own separate tab. In addition, the central pane is the area where chromatograms and other plots open, and the area where results from the analysis tools (e.g., parameter explorations, optimizers) will be displayed.

Finally, the performance parameters pane will display the performance parameter results for simulations once they have been run (e.g., yield, purity, etc.).

Specifying experimental data

Reveal Chromatography comes pre-loaded with a templated input file containing experimental chromatography data that can be used to calibrate a model for the sample product, called PROD000. That product was analyzed with Cation Exchange high precision chromatography and consists of three isoforms: Acidic 1, Acidic 2, and Native. Users can learn more about using their own data and products by referring to Next steps: Modeling your own protein at the end of this tutorial.

To open the sample data to be used in this tutorial, select Show sample input files... in the Help menu. To make it easier to find this data again, it is recommended that users copy the data files to their Desktop. Input data for Reveal Chromatography is made of a required Excel file (.xlsx) and optionally a set of supporting files (.asc or .csv) containing chromatography-related continuous data exported from chromatography equipment (e.g., AKTA). To illustrate the features of Reveal Chromatography, this tutorial will utilize data contained in Example_Gradient_Elution_Study.xlsx, and its companion AKTA file Example_AKTA_Data.asc.

Note

These sample input files can be found and downloaded again at any time from this link.

Open the file Example_Gradient_Elution_Study.xlsx. This is the template Excel spreadsheet that must be used to specify experimental data. The example input file should first appear like this:

The Excel file contains all the data about the product, chromatography system and column used, the buffers and loads involved, as well as the experimental setup (method) for one or more experiments. At the bottom of each experiment description, the sheet name containing fraction data for that experiment may be specified (if available) as well as the path to the .asc or .csv file containing the experiment’s continuous data (if available). A quick inspection of this sample Excel file shows that this study involves the product PROD000:

The study also contains three experiments (called Run_1, Run_2, and Run_3):

Finally, the bottom of the file indicates that Run_1 is the only experiment for which fraction and continuous data are available:

For more details about the precise structure of this input file, refer to the user manual’s Preparing experiment input files section.

To begin using Reveal Chromatography to analyze the example data, create a new project around the previously referenced sample input file. To do so, select File > New Project from Experiment File... (or ctrl-L).

This will open a file browser. Select the sample input Excel file Example_Gradient_Elution_Study.xlsx, and click Open.

Reveal Chromatography will then analyze the spreadsheet and the AKTA file specified within it, and try to discern the nature of each dataset:

The default dataset types that Reveal Chromatography detects are accurate. For the sake of brevity, we will assume that the default time of origin settings are accurate too. For more on this, refer to the user manual’s Parsing the AKTA files section.

Click OK to finish loading the data into a new project. This should create a new window with the example data loaded into a new study, with the central pane displaying the Run_1 chromatogram and fraction data:

Simulating experimental data

Once experimental data about a product has been loaded, users can simulate the chromatography process and calibrate how the product components transport through the column and bind to the resin beads. This is done in Reveal Chromatography by creating a simulation around the experiment’s method, and assuming a transport model and a binding model. Comparing the simulation’s chromatogram to the chromatogram produced by the experimental data allows users to evaluate the accuracy of the chosen transport and binding models.

Creating transport and binding models

Before creating a simulation, users will need to create and add at least one transport model and one binding model to the study’s data. To do that, right-click on the Transport models entry in the study data, and select Create New...:

A new window will appear, allowing users to configure the new model. Reveal Chromatography currently only supports General Rate Model-type transport models, which describe mass transfer kinetics through the column and resin. These models require initial estimates of the following variables: bead porosity, column porosity, axial dispersion, and three parameters for each product component (film mass transfer, pore diffusion, and surface diffusion):

The process of building realistic estimates of these parameters is beyond the scope of this tutorial, though interested users can refer to the literature for additional guidance. For now, assume that the default values for these parameters are an acceptable starting point and click OK.

A binding model is built in the same way as a transport model (i.e., right-click on the Binding models folder, and select Create New...):

Reveal Chromatography currently supports Multi-component Langmuir models and Steric Mass Action (SMA) models, and the Professional Edition also supports pH-dependent versions of these models. Since the experiments being loaded in this tutorial are gradient elution experiments, where the cations are expected to impact the binding process of the product components, SMA models are recommended over simpler Langmuir models. For the sake of simplicity, we will use a pH-independent SMA model. Again, for the purpose of this tutorial, it is assumed that the SMA model with default values is a reasonable starting point. Click OK:

At this point there should be at least one transport and at least one binding model in the Study Data browser. This can be checked by expanding the Transport models and Binding models sections of the Study Data browser to make sure that they each contain an element. Then, double-click on these models to review their values. If these models are present and correct, then all of the necessary components for building a simulation are present.

Creating a simulation

As the goal of this tutorial is to try to build a simulation that describes the chromatography process as it was done in an experiment, navigate to File > New > New Simulation from Experiment:

A new window will appear, where users can select the experiment from which they would like to base the new simulation. Since there is only data for the Run_1 experiment, select that one:

At this point, users need to specify the method steps with which they would like the simulation to begin and end. It is recommended that users leave the Override initial buffer with field blank. Reveal Chromatography will then take the initial buffer from the experimental method, using the buffer of the step immediately prior to the first simulated step. This will provide the initial conditions for the column. In this tutorial, we will simulate the Run_1 experiment from its Load step to its Strip step:

Next, select the binding and transport models that were just created. The default values can be used for the remaining simulation components (Solver type and Discretization type). Click OK to close the window and create the simulation. Expand the Simulations list in the Study Data browser to make sure that a simulation (called Sim: run_1 by default) is present:

Now double-click on the simulation to view its parameters and components in the central pane. Note that the new simulation has been added to the performance parameters pane. However, all performance metrics will remain blank until the simulation is run.

Note

You can create multiple simulations by selecting multiple experiments from which to create them, as long as they all share the same first and last step, as well as the same binding and transport models.

Running the simulation

Now that the simulation is created, and present inside the Simulations folder, running the CADET solver on the simulation is simple: right-click on the simulation, and select Run Simulation...:

The simulation will take a few seconds to run in the background, and once completed, will update its status in the central pane and trigger an update of the performance parameters pane.

Plotting the simulation

To produce a chromatogram from the newly simulated data, go to the Tools menu, and select Plot Chromatogram(s) (or ctrl-p):

Users may also produce a chromatogram from simulated data by clicking on the graphing icon, located in the upper left hand corner of their screen.

The simulated chromatogram will open in a new tab in the central pane and will display both the experimental and the simulated chromatograms for each product component:

Users can hide the controls on the left hand side of the screen by clicking on the controls icon (pictured below), located in the upper left hand section of the open window.

Users can also hide the legend that appears on the chromatogram by clicking on the legend icon located to the right of the controls icon.

This chromatogram indicates that the simulation is a poor representation of the experimental chromatography process. This is not surprising considering the assumption made above that all default transport and binding model parameters were an accurate description of the PROD000 product. The mismatched chromatograms indicate that this assumption is faulty.

It seems logical to next explore how modifying certain parameters might affect these chromatograms. This can be done most efficiently by creating, running and plotting batches of simulations where one or more parameters vary, which is the topic of the next section.

Exploring parameter impacts

Reveal Chromatography can be used to create multiple simulations to study the effect of modifying one or more parameters at once. It can help users explore the effects of modifying these parameters on chromatography processes, as well as the effects on the resulting chromatograms using the Parameter Explorer.

The Parameter Explorer allows users to create a grid of simulation results around an existing simulation, called the grid’s “center point”. The newly created grid represents simulations where every parameter except one is identical to that of the center point. In this demonstration, the Parameter Explorer will be used to understand the effect of the SMA binding model’s characteristic charge (denoted sma_nu) for the main product component (Acidic 1). sma_nu represents the number of binding sites that the protein component can access, which will have a direct impact on the time at which that component elutes.
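For background on why the characteristic charge influences retention, the equilibrium form of the steric mass action isotherm as it is commonly written in the literature is shown below. This is a reference sketch only: the symbols are the conventional ones, and the exact formulation and parameter scaling used by Reveal Chromatography and the CADET solver are not stated in this tutorial and may differ.

```latex
q_i = K_i \, c_i \left( \frac{\bar{q}_s}{c_s} \right)^{\nu_i},
\qquad
\bar{q}_s = \Lambda - \sum_{j} \left( \nu_j + \sigma_j \right) q_j
```

Here q_i and c_i are the bound and mobile-phase concentrations of component i, c_s is the salt (cation) concentration, K_i is the equilibrium constant (the ratio of the adsorption constant ka to the desorption constant), \nu_i is the characteristic charge, \sigma_j is the steric factor, and \Lambda is the ionic capacity of the resin. Because \nu_i appears as an exponent on the ratio of available sites to salt concentration, changing it changes the salt concentration, and therefore the point in a gradient, at which the component elutes.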

Creating a grid

To create the simulation grid around the study’s only simulation, go to Tools > Parameter explorer:

A new window will open for users to select:

• The name of the simulation grid.
• The “center point simulation”, which is the simulation around which to build the grid. Since there is currently only one simulation in this tutorial’s study, there is only one possible choice.
• One or more parameters for Reveal Chromatography to scan. To begin, click the New parameter scan button at the bottom of the open window:

Then click the white space under the heading Parameter name to open a drop down menu of available parameters that can be explored in the simulation grid:

To scan the sma_nu binding model parameter for the second product component, sma_nu[2] should be selected as the scanned parameter name.

Once the parameter is selected, Reveal Chromatography will automatically display its value from the current simulation. Specify the range of values to be scanned by double-clicking the cells in the Low and High columns. Since the current simulation has sma_nu = 5 as its center value, values between 4.0 and 6.0 seem reasonable.

Additionally, the Num values field specifies how many simulations, each with a different value of the scanned parameter, should be produced. Use the Spacing field to select either a logarithmic or a linear scan. To keep the run time relatively short in this tutorial, make a coarse grid by keeping Num values at 5, and keep the spacing linear since the scanned values are all of the same order of magnitude.
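As a rough illustration of what the Low, High, Num values, and Spacing settings describe, the sketch below computes evenly spaced values for both spacing options. The function name scan_values is hypothetical and is not part of Reveal Chromatography; how the application builds its grid internally is an assumption here.

```python
import math


def scan_values(low, high, num_values, spacing="linear"):
    """Return num_values points between low and high (both included).

    Hypothetical helper for illustration only; not a Reveal Chromatography API.
    """
    if num_values < 2:
        raise ValueError("num_values must be at least 2")
    if spacing == "linear":
        step = (high - low) / (num_values - 1)
        return [low + i * step for i in range(num_values)]
    if spacing == "log":
        if low <= 0 or high <= 0:
            raise ValueError("logarithmic spacing needs positive bounds")
        log_low, log_high = math.log10(low), math.log10(high)
        step = (log_high - log_low) / (num_values - 1)
        return [10 ** (log_low + i * step) for i in range(num_values)]
    raise ValueError(f"unknown spacing: {spacing!r}")


# The scan described above: sma_nu from 4.0 to 6.0, five values, linear spacing.
print(scan_values(4.0, 6.0, 5))  # [4.0, 4.5, 5.0, 5.5, 6.0]
```

A logarithmic scan places the same number of points evenly in the exponent instead, which is why it suits ranges that span several orders of magnitude.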

Press OK. The simulation grid will automatically open in the central pane. The newly created grid can also be located in the Study Data browser, by navigating to Analysis Tools > Simulation grids.

Note

For more detailed information on how grids are built and how to build more complex ones, see Exploring the parameter space: Simulation Groups.

Running the grid

The simulation grid in the central pane will display the number of simulations the grid contains, and the name of the simulation at its center. The central pane also contains a table of performance data for each of the simulations in the grid. The table will eventually display pool concentration, pool volume, and component purities. Upon creation, this table has no values, since the simulations have not yet been run.

To run the CADET solver on all simulations in the grid, click the Run Simulation Grid button in the grid’s view in the central pane. As the simulations run, the grid’s performance data table will begin to populate with updated values.

Once the grid has fully run, the performance table will be full, and the status will go from Running... to Finished running.

Plotting the grid

To visualize the effect of the scanned parameter, users can plot chromatograms for all the simulations of the grid by right-clicking on the simulation grid in the study data browser (under Analysis Tools) and selecting Plot all simulations:

This will produce a plot of all the simulations as well as the original experimental data:

The dashed peaks from right to left correspond to the grid simulations with sma_nu values of 6.0, 5.5, 5.0, 4.5, and 4.0 respectively. The pattern of these peaks confirms that lower values of sma_nu result in shorter elution times, and therefore closer alignment with the experimental data (displayed as a solid line).

Note

As a review, the Parameter Explorer has shown that (assuming all other parameters are correct) a value of sma_nu smaller than 5.0 is needed to account for the experimental data. The plots indicate that values below 4.0 should be explored.

This can be accomplished by running another simulation grid that explores sma_nu values, say from 1.0 to 4.0. This is one possible workflow that Reveal Chromatography offers to calibrate a model: create a grid, run and plot all simulations, compare the resulting chromatograms to the experiment(s), and repeat.

Although much can be learned from this workflow, an additional tool in the software automates it by creating a simulation grid and automatically sorting each of the simulations by how well they match the experimental data. This additional tool, called the Parameter Optimizer, is described in the next and final section of this tutorial.

Calibrate the binding model by optimizing parameters

The Parameter Optimizer is used to scan and calibrate parameters (from the binding model, transport model, operational parameters, etc.) while also comparing the simulated chromatograms to the target experimental data that users would like to calibrate these chromatograms against.

Creating the optimization

The Parameter Optimizer tool allows users to minimize a “cost function”, which measures the alignment between a simulated chromatogram and an experimental chromatogram. Within the Parameter Optimizer, there are currently two types of optimizers that can be selected. The default is the most general and implements the simplest grid search-based algorithm. The idea is to define a set of parameters to scan, create and run all simulations in that grid, and compute the cost (or distance) between each resulting chromatogram and the experimental data. This is the same workflow that was carried out more subjectively with the Parameter Explorer tool, as described above.
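The grid-search idea can be sketched in a few lines. Everything below is an assumption made for illustration: the cost is taken to be a sum of squared differences between two chromatograms sampled at the same times, and simulate is a toy Gaussian peak standing in for a real simulation. The cost function that Reveal Chromatography actually uses is not described in this tutorial.

```python
import math


def simulate(peak_time, times):
    """Toy stand-in for a simulation: a Gaussian peak centered at peak_time."""
    return [math.exp(-((t - peak_time) ** 2) / 2.0) for t in times]


def cost(simulated, experimental):
    """Sum of squared differences between two equally sampled signals."""
    return sum((s - e) ** 2 for s, e in zip(simulated, experimental))


def grid_search(candidates, times, experimental):
    """Run every candidate and return (cost, candidate) pairs, lowest cost first."""
    results = [(cost(simulate(c, times), experimental), c) for c in candidates]
    return sorted(results)


times = [0.5 * i for i in range(41)]   # sample times 0.0 to 20.0
experimental = simulate(8.0, times)    # pretend measurement with a peak at 8.0
ranked = grid_search([6.0, 7.0, 8.0, 9.0, 10.0], times, experimental)
best_cost, best_candidate = ranked[0]
print(best_candidate, best_cost)       # 8.0 0.0
```

Sorting the results by cost mirrors what the optimizer reports: a ranked list of simulations, with the closest match to the experiment first.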

To create a parameter optimization, begin by selecting Parameter optimizer from the Tools menu:

A new window will open to select the experimental data against which the output simulations will be compared. Again, as Run_1 is the only run with data, select Run_1.

Note

It is possible to optimize some parameters across more than one experiment. Select multiple experiments by holding down the Shift key while clicking on each experiment.

Next, specify a starting simulation from the drop down menu. Then, select the New parameter scan button to choose which parameters will be optimized. In the previous sections, it was observed that the binding model’s sma_nu[2] could be explored further. Additionally, the plot generated in Plotting the grid indicated that this parameter might be explored below 4.0, so it makes sense to scan sma_nu[2] between 1.0 and 4.0.

In addition, the binding model’s ka parameter also affects elution time. In this tutorial, users should add sma_ka[2] to the Scanned Parameters list to scan a 2D parameter space. With the center value for sma_ka[2] at 10^-3, this tutorial will explore a range around this value by setting the low to 10^-4 (type 1e-4) and the high to 10^-2 (type 1e-2). As these values span multiple orders of magnitude, select a logarithmic scale to complete the setup:

It is recommended that users leave the Override initial buffer with field blank, just as was done in Creating a simulation, so that the initial buffer is read from the experiment’s method. Click Create to create the optimization. The newly created optimizer will automatically open in the central pane. It is also available in the Study Data browser, under the Analysis Tools > Optimizations section.

Running the optimization

The newly created optimizer’s central pane view displays the target experiment, the list of parameters that will be explored, and the total number of simulations that will be run (25 if scanning two parameters, with five values each). Below these, a table will display the output data collected during the optimization (values of the scanned parameters, and total cost of the resulting simulation).
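The count of 25 follows from combining every value of one parameter with every value of the other. A minimal sketch, assuming five values per parameter and the ranges chosen above (linear for sma_nu[2], logarithmic for sma_ka[2]); the variable names are illustrative only and do not come from the application:

```python
import itertools

num_values = 5

# sma_nu[2]: linear spacing from 1.0 to 4.0
nu_values = [1.0 + i * (4.0 - 1.0) / (num_values - 1) for i in range(num_values)]

# sma_ka[2]: logarithmic spacing from 1e-4 to 1e-2 (exponents -4.0 to -2.0)
ka_values = [10 ** (-4.0 + i * 2.0 / (num_values - 1)) for i in range(num_values)]

# Every (nu, ka) pair corresponds to one simulation in the 2D grid.
grid = list(itertools.product(nu_values, ka_values))
print(len(grid))  # 25
```

Adding a third scanned parameter with five values would multiply the count again, to 125, which is why coarse grids are a sensible starting point.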

To run the Optimizer, click on the Launch optimizer button at the bottom left of the central pane view. The Status will change to Running.... Once the optimizer completes its run, the Status will become Finished running and the output table will be populated with data (please allow about five minutes for this process):

Plotting the optimized solutions

To view optimized solutions, use the study data browser to navigate to Analysis Tools > Optimizations > Optimizer0 > Optimal Solutions. Here users can select their optimal simulations and plot chromatograms from those simulations:

Any of the optimal simulations can be plotted (one at a time). Plotting the best one (sma_ka[2]=0.0001 and sma_nu[2]=4) leads to the following plot:

Have we calibrated a model already?

A few comments on this tutorial must be made. First, it seems that the optimal result is fairly accurate, despite how (relatively) little effort has been invested in this calibration. This is misleading for at least three reasons:

1. A decent match was obtained for one of the product components (Acidic 1), but the other species are still eluting too late. In addition, optimal parameters for each component may influence all other components. Therefore, parameter values for all components should be optimized at once.

2. No time shift was applied to the experimental data when the AKTA file was loaded (refer back to Loading experimental data into a new project). Consequently, the experimental data is displayed with time starting at the pre-equilibration step, whereas simulations are defined starting at the Load step. So the apparent match is completely fortuitous.

3. Calibrating a chromatographic model requires that users be able to fit experimental chromatograms from multiple operational conditions, like a different flow rate, a different gradient elution slope, a different pH, etc. This optimization task has moved towards a model that fits one experiment, but it is highly unlikely to remain predictive when tested under other conditions. This is why Reveal Chromatography’s Optimizer also supports fitting more than one experiment at a time. See the section on Parameter optimizers for more details.
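The time-origin mismatch described in point 2 can be pictured with a small sketch. The numbers and the name load_start are invented for illustration; in practice the offset comes from the experiment’s method and is set when the AKTA file is loaded.

```python
def shift_time_origin(times, offset):
    """Re-express times relative to a new origin at `offset`,
    dropping samples recorded before that origin."""
    return [t - offset for t in times if t >= offset]


# Hypothetical experimental time axis (minutes), starting at pre-equilibration.
experiment_times = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
load_start = 10.0  # assumed time at which the Load step begins

aligned = shift_time_origin(experiment_times, load_start)
print(aligned)  # [0.0, 5.0, 10.0, 15.0]
```

Only after such a shift do the experimental and simulated chromatograms share the same time origin, so that comparing peak positions is meaningful.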

This tutorial illustrated a simple calibration task with the goal of showing users the tools that Reveal Chromatography contains. However, the actual practice of calibrating models requires much more effort. More details about calibration will be published by KBI soon. In the meantime, please Contact us if you have questions.

Next steps: Modeling your own protein

The main features of Reveal Chromatography have been presented, but many topics have been left out. These are covered in the User manual, which is the natural next step for learning more about modeling with Reveal Chromatography. In particular, users who want to start modeling their own proteins and experimental data should read more about the following topics:

Other sections of the user manual are natural next topics to read about as well. In particular, the additional plotting capabilities, other Optimizer options, and full scripting capabilities for implementing any custom analysis are all explained in detail in the user manual:

Or, users may start at the beginning of the User manual.