If you’re looking to try out Lightup on your personal computer to get a feel for its features and capabilities, or aren’t ready to go through infosec approvals in order to try it, then the Personal Edition is right for you. This post will walk you through the steps to get Lightup Lite up-and-running in no time.
To start with, let’s cover the components of Lightup Lite. Lightup Lite runs as a Docker application that can be installed on any Linux or macOS system. This is slightly different from the Premium and Enterprise deployment models, which run on a Kubernetes cluster hosted by either Lightup or the client’s organization. You’ll need to make sure that both Docker and the Docker Compose plugin are installed for the Lightup Lite install process to run smoothly.
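If you’d like to confirm both prerequisites before going further, here is a minimal Python sketch that calls the standard `docker --version` and `docker compose version` commands and reports whether each one succeeds. The function name `check_prerequisites` and the `CHECKS` table are our own for illustration and are not part of Lightup.

```python
import subprocess

# Standard Docker CLI commands that print a version when the tool is installed.
CHECKS = {
    "docker": ["docker", "--version"],
    "docker compose plugin": ["docker", "compose", "version"],
}


def check_prerequisites(run=subprocess.run):
    """Return {name: True/False} for each prerequisite (hypothetical helper)."""
    results = {}
    for name, cmd in CHECKS.items():
        try:
            completed = run(cmd, capture_output=True, text=True)
            # A zero exit code means the command ran and reported a version.
            results[name] = completed.returncode == 0
        except FileNotFoundError:
            # The docker binary itself is not on PATH.
            results[name] = False
    return results
```

Calling `check_prerequisites()` with no arguments runs the real commands; a `False` for either entry means that piece still needs to be installed.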
Now that we’ve got Docker and Docker Compose taken care of, let’s sign up for Lightup Lite. After clicking Start Lightup Lite Free Trial, you’ll be prompted to enter a username or continue with Google Single Sign-on. For security purposes, we use Auth0 for authentication.
It’s important to note that trial signups require approval before Lightup Lite can be downloaded. This only occurs the first time, and you should see that your request is pending.
Once approved, you’ll see the Lightup welcome screen. (If you leave the site, you’ll have to sign back in, but won’t need approval again.)
There are two commands needed to install Lightup. The first will log you in, and the second will download the Docker Compose YAML file to spin up the Lightup Lite containers.
Note: If you already have Docker installed but the daemon is not running, you can start it with “sudo systemctl start docker” on Linux, or by opening Docker Desktop and clicking the start button on Mac and Windows.
All right! We’re up and running, so let’s get started by navigating to http://127.0.0.1:8000 in your browser and signing in with your login credentials.
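The containers can take a moment to start, so the page may not load right away. As a rough sketch, the following Python helper polls the local port until it accepts a TCP connection. The host and port are taken from the URL above; the helper name `wait_for_port` and its timeout defaults are our own assumptions. Note that an open port only shows that something is listening, not that Lightup has finished initializing.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8000, timeout=60.0, interval=1.0):
    """Return True once a TCP connection succeeds, or False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

If `wait_for_port()` returns `False`, checking that the Docker daemon is running is a reasonable first step.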
Let’s start by adding a datasource in the “main” Workspace.
When adding a datasource, the required connection fields will change based on the datasource type selected (e.g. Snowflake, Databricks, Oracle…), so ensure you select the correct one. For more information on data sources, visit our docs.
Once the Datasource is added, you will see it in the list of Datasources, along with other Datasource metadata.
Most of our setup and exploration will be done from the “Explorer” view for our workspace. Clicking on the Explorer tab will show us a tree of all datasources, as well as all schemas, tables, columns, and metrics associated with each datasource. Since we’ve just added this datasource, there won’t be anything else to expand in the tree.
The plan here is to add any schemas we are interested in, then tables, and then columns. Of course, we don’t want to have to do this one by one, so we’re going to use some bulk selecting. (Just keep in mind that you can select individual datasource objects as well.) Also, everything that you see here is accessible via an API and SDK for developers who prefer to do these steps programmatically.
In the Actions menu to the right, select “Manage All Schemas,” and either bulk select or select the specific ones you want.
After submitting, Lightup will run a quick scan of those schemas in order to get metadata information about the underlying tables and columns. Once that scan is complete, we should see the selected schemas underneath our datasource. If you have multiple datasources, repeat this and the schemas will appear under their corresponding datasource.
Let’s add some tables to one of our schemas. With the desired schema selected, click on the Actions menu and select “Manage Tables.” This is a good time to note that any of the menus and information on the right side will be context-sensitive to what is selected in our Explorer view.
Once in the Manage Tables menu, we can bulk select, select specific tables, or even search for tables.
With a desired table(s) selected, we can enable Data Profiling, Metadata Metrics, and other Table Autometrics. Additionally, we can click the name of the table and see what columns exist and even enable column-level Autometrics (Null%, Distribution, and so on).
In this case, let’s start by enabling “Data Profiling.” This is a great way to take a look at what data is there in order to determine what Data Quality Indicators (DQIs) we should build. It’s also handy in configuring a table, since under the covers Lightup is using a timestamp to generate the time series metrics that the DQIs are based on (e.g., data delay).
Note: For more information on profiling data, please see: https://docs.lightup.ai/docs/set-up-data-profiling.
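To make the idea of a timestamp-based metric such as data delay more concrete, here is a small conceptual sketch in Python. It treats data delay as the gap between the current time and the newest timestamp in the table. This is our own simplified illustration, not Lightup’s actual implementation, and the function name `data_delay` and the sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def data_delay(timestamps, now):
    """Return how far the newest timestamp lags behind `now`, or None if there is no data."""
    if not timestamps:
        return None
    return now - max(timestamps)


# Hypothetical sample: three rows, the newest arrived 45 minutes ago.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
rows = [now - timedelta(hours=5), now - timedelta(hours=2), now - timedelta(minutes=45)]
print(data_delay(rows, now))  # 0:45:00
```

Evaluated on a schedule, a value like this becomes a time series that can be monitored, which is why the choice of timestamp column matters so much.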
After a few minutes, you will see a data profile of your table. For configuration, we are interested in timestamp columns. In my table, there’s actually a column named timestamp with a Type of timestamp. With that information, I’ll click on the Actions menu and select “Configure Table.”
In order to set up the configuration, we’ll need to select a timestamp and ensure the data collection options are correct. The Timestamp dropdown will show a list of all fields with a Type of timestamp, date, or datetime. If there is more than one, it’s important to make sure you select the correct one. This will be the one that accurately reflects how you consider the data to be “fresh” (i.e., the time data entered the datasource or the related fiscal date for the data).
Other important considerations are:
- Use indexed timestamp columns for datasources. Using a non-indexed column will result in more resource intensive queries on larger datasets.
- Use partitions wherever applicable (e.g. Databricks, Athena) to limit how much data is scanned with each incremental query.
Pro Tip: Add a partition to help with performance. For more information, see our Partitions documentation.
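To see why an indexed timestamp helps with incremental queries, here is a small sketch that uses Python’s built-in `sqlite3` module as a stand-in for a real datasource. The `events` table, the `idx_events_ts` index, and both helper functions are hypothetical, and the query shape is only an assumption about what a time-windowed scan might look like, not the SQL that Lightup actually issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts TEXT NOT NULL)")
# Index the timestamp column so range filters do not scan the whole table.
conn.execute("CREATE INDEX idx_events_ts ON events (ts)")
conn.executemany(
    "INSERT INTO events (ts) VALUES (?)",
    [(f"2024-01-01T{hour:02d}:00:00",) for hour in range(24)],
)

WINDOW_SQL = "SELECT COUNT(*) FROM events WHERE ts >= ? AND ts < ?"


def count_in_window(conn, start, end):
    """Count rows with start <= ts < end (ISO-8601 strings sort chronologically)."""
    return conn.execute(WINDOW_SQL, (start, end)).fetchone()[0]


def query_plan(conn, start, end):
    """Return SQLite's plan for the windowed query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + WINDOW_SQL, (start, end))
    return " ".join(row[3] for row in rows)


print(count_in_window(conn, "2024-01-01T06:00:00", "2024-01-01T12:00:00"))  # 6
print(query_plan(conn, "2024-01-01T06:00:00", "2024-01-01T12:00:00"))
```

The plan output should mention the index rather than a full table scan. Partitions in engines like Databricks or Athena serve a similar purpose at a coarser grain by letting each incremental query skip data outside the window.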
You’ll be notified that this action is not reversible and be prompted to type “ok.” This is because Lightup uses this timestamp to generate the time series metrics that the Autometric Data Quality Indicators (DQIs) are based on.
Once the table is set up, Autometrics can be turned on without further configuration. These are similar to Lightup’s out-of-the-box Metadata metrics, but scan actual data in the table for deep data checks, instead of just data about the table.
The last thing we’ll want to do in this setup is enable Autometrics for the actual columns in the table. With the table selected in the Explorer view, click on the Actions menu and select “Manage Columns.” Select or bulk select the columns of interest and make the columns active, and then turn on Distribution, Null Percent, etc.
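For a feel of what column-level checks like Null Percent and Distribution measure, here is a conceptual Python sketch. The definitions below (share of null values, and the share of each distinct non-null value) are our own simplifying assumptions for illustration; how Lightup computes these Autometrics is not covered in this post.

```python
from collections import Counter


def null_percent(values):
    """Percentage of entries that are None (0.0 for an empty column)."""
    if not values:
        return 0.0
    nulls = sum(1 for v in values if v is None)
    return 100.0 * nulls / len(values)


def distribution(values):
    """Share of each distinct non-null value, as fractions that sum to 1."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return {}
    counts = Counter(non_null)
    return {value: count / len(non_null) for value, count in counts.items()}


# Hypothetical sample column with one missing value.
column = ["US", "US", None, "DE"]
print(null_percent(column))  # 25.0
print(distribution(column))
```

Tracked over time, a sudden jump in the null percentage or a shift in the distribution is the kind of change these column-level metrics are meant to surface.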
We are now able to select columns under the table in the Explorer view.