Choosing between Live and Extract data – Building and Integrating Data Pipelines

On the top right of your screen, you will see a live or extract option, as shown in Figure 3.14. If you are connecting using the Salesforce connector, that option will be greyed out, as it is not possible to connect to live Salesforce data via the connector.

Figure 3.14: Create Extract

If, instead, you are connecting to the database on which your Salesforce data is stored, you will be able to choose whether to use an extract or live data. But what is the difference between live and extracted data?

If you choose live data, no data is stored locally on your machine. Tableau will only store the connection details and will send a query to your database for each action you perform in Tableau. This has the advantage of not needing storage on your machine, and it gives you the certainty that you are always working with the latest available version of the data, but it requires a fast connection in order to avoid frustration.

An extract, by contrast, is a copy of the data that is stored on your machine and needs to be manually refreshed to stay up to date with the master data. As all the data is stored locally, no time is spent waiting for a query to travel to your database and back, but you need enough storage space for all your data (which may easily run to gigabytes), and creating a large extract can take as long as a few hours. Furthermore, the results of your analysis may be less relevant if your data is stale.
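The trade-off between the two modes can be sketched in a few lines of Python. This is only an illustration, not how Tableau works internally: an in-memory SQLite database stands in for the remote database, a plain list stands in for the local extract, and the table name `opportunity` and the helper functions are hypothetical.

```python
import sqlite3

# Hypothetical stand-in for the remote database that holds the master data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE opportunity (id INTEGER, amount REAL)")
source.executemany(
    "INSERT INTO opportunity VALUES (?, ?)",
    [(1, 100.0), (2, 250.0)],
)


def live_total(conn):
    """Live mode: every action sends a fresh query to the source."""
    return conn.execute("SELECT SUM(amount) FROM opportunity").fetchone()[0]


def create_extract(conn):
    """Extract mode: copy the rows once into local storage (here, a list)."""
    return conn.execute("SELECT id, amount FROM opportunity").fetchall()


def extract_total(rows):
    """Work on the local copy only; the source is never contacted."""
    return sum(amount for _, amount in rows)


extract = create_extract(source)

# The master data changes after the extract was taken.
source.execute("INSERT INTO opportunity VALUES (3, 50.0)")

live_result = live_total(source)       # includes the new row
stale_result = extract_total(extract)  # still reflects the old snapshot

# A manual refresh brings the extract back in line with the source.
extract = create_extract(source)
refreshed_result = extract_total(extract)

print(live_result, stale_result, refreshed_result)
```

The live total picks up the new row immediately, while the extract only does so after it is refreshed, which is the staleness risk described above.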

If you decide to use an extract, there are still options to reduce the size of the data, such as reducing the number of fields or aggregating the data. We will discuss these options later and explain why they may work for you or cause you frustration.
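As a rough illustration of those two size-reduction options, the sketch below again uses an in-memory SQLite table as a stand-in for the source. The table, its columns, and the sample rows are all hypothetical, and in Tableau these choices are made when configuring the extract rather than by writing SQL.

```python
import sqlite3

# Hypothetical source table with more columns than the analysis needs.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE opportunity "
    "(id INTEGER, region TEXT, owner TEXT, notes TEXT, amount REAL)"
)
source.executemany(
    "INSERT INTO opportunity VALUES (?, ?, ?, ?, ?)",
    [
        (1, "EMEA", "Ana", "long free text", 100.0),
        (2, "EMEA", "Ben", "long free text", 250.0),
        (3, "APAC", "Ana", "long free text", 50.0),
    ],
)

# Full extract: every field of every row.
full_extract = source.execute("SELECT * FROM opportunity").fetchall()

# Fewer fields: same number of rows, but only the columns you need.
fewer_fields = source.execute(
    "SELECT region, amount FROM opportunity"
).fetchall()

# Aggregation: one row per region instead of one row per opportunity.
aggregated = source.execute(
    "SELECT region, SUM(amount) FROM opportunity "
    "GROUP BY region ORDER BY region"
).fetchall()

print(len(full_extract), len(fewer_fields), len(aggregated))
```

Note that the aggregated copy no longer contains the `owner` column or the individual rows, so any later question that needs that detail cannot be answered from it.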

Bringing additional tables

One table may be enough to start your analysis, but often, you will need more than one object to retrieve the information you need. You may start with one or two tables, but you will quickly add more as your analysis progresses and you ask yourself more questions.

There are several ways to connect different data sources in Tableau:

  • Joins: These function like traditional SQL joins.
  • Relationships: Introduced in Tableau 2020.2, they are an easier way to combine different data.
  • Blends.
  • Unions.
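Since joins are described as working like traditional SQL joins, a small SQL example may help separate a join from a union. The sketch below uses Python's built-in `sqlite3` module with hypothetical tables (`account`, `opportunity`, `opportunity_archive`); it shows the general SQL behaviour only and does not cover relationships or blends, which are specific to Tableau.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (id INTEGER, name TEXT);
    CREATE TABLE opportunity (id INTEGER, account_id INTEGER, amount REAL);
    CREATE TABLE opportunity_archive (id INTEGER, account_id INTEGER, amount REAL);
    INSERT INTO account VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO opportunity VALUES (10, 1, 100.0), (11, 2, 250.0);
    INSERT INTO opportunity_archive VALUES (5, 1, 75.0);
    """
)

# Join: match rows from two different tables on a key, adding columns.
joined = conn.execute(
    """
    SELECT a.name, o.amount
    FROM opportunity AS o
    INNER JOIN account AS a ON a.id = o.account_id
    ORDER BY o.id
    """
).fetchall()

# Union: stack rows from two tables with the same columns, adding rows.
unioned = conn.execute(
    """
    SELECT id, account_id, amount FROM opportunity
    UNION ALL
    SELECT id, account_id, amount FROM opportunity_archive
    ORDER BY id
    """
).fetchall()

print(joined)
print(unioned)
```

In short, a join widens the data by bringing in fields from a related table, while a union lengthens it by appending rows that share the same structure.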
