Review: Amazon QuickSight covers the BI basics
A promising data analysis tool in the cloud, QuickSight is easy to use but limited in terms of visualization types
-
Amazon QuickSight
When I reviewed self-service exploratory business intelligence (BI) products in 2015, I covered the strengths and weaknesses of Tableau 9.0, Qlik Sense 2.0, and Microsoft Power BI. As I pointed out at the time, these three products offer a range of data access, discovery, and visualization capabilities at a range of prices, with Tableau the most capable and expensive, Qlik Sense in the middle, and Power BI the least capable but a very good value.
A new entry, Amazon QuickSight, runs entirely in the AWS cloud, has good access to Amazon data sources and fair access to other data sources, and offers basic analysis and data manipulation at a basic price. Of the three products I reviewed in 2015, QuickSight most closely resembles Power BI, only without the dependence on a desktop product to create data sets—or the level of analysis power provided by the Power BI Desktop/Service combination.
Like Power BI, Qlik Sense, and Tableau, QuickSight connects to myriad data sources and lets you prepare data sets. Once you have data sets, you can create analyses with one or more visualizations, which you can organize into dashboards and stories. You can share data sets, dashboards, and stories within your organization. As we’ll see, QuickSight makes this process quite easy and straightforward, but it lacks some useful visualization capabilities found in competing tools.
Account creation
To create a QuickSight account without already having an AWS account, browse to quicksight.aws, click the Try It Free button in the upper-right corner, and sign up for AWS and then QuickSight. If you already have an AWS account, sign in and browse to the signup page. You’ll see the QuickSight signup form shown in the figure below.
As you can see in the screenshot, the first user is free forever, and a team trial with four users is free for 60 days. Beyond the trial, additional users cost $9 per month each for the Standard edition or $18 per month for the Enterprise edition.
The first user gets 1GB of SPICE storage, and additional users include 10GB of SPICE. Additional SPICE storage costs 25 cents per gigabyte per month for the Standard edition or 38 cents per gigabyte per month for the Enterprise edition. The Enterprise edition adds secure data encryption at rest and a connection to your organization’s AWS Active Directory.
SPICE, aka Superfast Parallel In-memory optimized Calculation Engine, is QuickSight’s high-performance in-memory data store for visualizations. SPICE is required for data imported from files and optional for data in SQL databases. SPICE tables are limited to 10GB each.
Connect to a data source
When you create your account, you’ll find it populated with a few sample data sets and analyses. It’s worth spending an hour playing with them and watching the video tutorials, shown in the screenshot below, before starting to work with your own data. If at some point in the future you need the small amount of SPICE space used by the sample data sets, you can delete them.
When you’re ready to work with your own data, you can create a new data set from one of the sources shown in the screenshot below. QuickSight can automatically discover Redshift, Athena, and RDS databases, as well as S3 buckets, but you can manually connect to most other supported databases. It seems a little odd that you cannot access Oracle databases outside of RDS; I imagine there’s a licensing issue.
If you’re uploading tabular data, it helps if there’s a title row; if not, you can always create your own column names after ingestion. QuickSight will automatically detect the type of your columns; if it gets one wrong, you can fix it. For example, it detected a Zip Code column in a CSV file I uploaded as numeric; I changed it to a string type to preserve the leading zeros.
To import data files from S3 buckets, first you need to authorize QuickSight to use the buckets, and second you need to create JSON manifest files for the import. You can specify URIs to import specific files, or URI prefixes to import all files in a folder or bucket. Valid formats for files in S3 buckets are CSV, TSV, CLF, and ELF. Uploaded files can additionally be in XLSX format.
Ways to import SQL databases
QuickSight lets you create data sets from SQL databases several ways. In the simplest case you can pick a single table. If you want data from several tables, QuickSight will help you join them, exposing a pick list of the fields for the join in each table, as well as a graphical representation for choosing the type of join: inner, outer, left, or right.
If you are good at writing SQL or have been supplied a working SQL query by a DBA, QuickSight will let you enter the query string. QuickSight does not do autocompletion as you type a SQL query. If you need your hand held, you should use the graphical method for creating the query.
I mentioned earlier that SPICE is an option for data queried from a SQL database. As SPICE is an optimized in-memory database local to QuickSight, it is always faster than querying a remote database over the network, even a fast database stored on solid-state disks such as Aurora. There are several reasons why you might need or want to query the database directly: If the resulting table is greater than the size limit for SPICE, currently 10GB; if your organization is low on SPICE space and lacks the budget for more; and if the query is invoked rarely enough and at low enough urgency that the increase in query time doesn’t impact anyone.
Prepare your data sets
Once you have defined a SQL query or imported a tabular file, you can perform several actions to prepare the data. By default, all fields are used, but you may want to unselect fields (columns) that are irrelevant to your current analysis.
Database field names are often constrained by the rules of the database. You can change the field names to ones that are more readable or meaningful in the QuickSight data set.
You may want to define calculated fields based on the original data, for example defining a Gross Profit field in a Sales table as Price – Cost. Calculated fields can use multiple functions and operators, and they can be based on multiple data fields and/or other calculated fields.
You may also want to filter the rows of data that go into the data set—for example, filtering out sales with 0 items or 0 price. Each filter only applies to a single field, but you can apply multiple filters to each field. You can also filter by calculated fields, which of course can depend on multiple data fields.
You can refresh your SPICE data sets immediately on demand or on an automated schedule. This makes sense when your data sets have been created from live data sources that change over time.
Visualize your data
A live set of visualizations based on one or more data sets is called an analysis, as shown in the screenshot below. For analysis purposes, fields are either dimensions or measures. Dimensions, with blue icons, are items or attributes. Measures, with green icons, are numeric values that you use for measurement, comparison, and aggregation. If we want to view total sales by date, sales needs to be a measure, the aggregate for sales needs to be sum, and date needs to be a dimension. You can change whether a field is used as a dimension or a measure within an analysis.
By default, QuickSight will pick Sum for the aggregation of a measure. You can change that to average, count, max, or min in the field well using the drop-down at the right of the well.
You can create many kinds of visualization from your data and arrange your charts as you wish within your analysis. The 13 kinds of visualization offered by QuickSight might seem like a lot, but some useful visualizations are notably missing—for example, map-based charts, candlestick graphs, Gantt charts, high-low-close charts, and combo charts. This is an area where Power BI, Qlik Sense, and Tableau are all currently superior to QuickSight.
If you looked carefully at the screenshot above, you may have noticed that QuickSight has 14 icons for visualization types, not 13. The lightning-bolt icon is for Autograph mode, meaning that you want QuickSight to choose the visualization based on the dimension and measure fields you have chosen. That’s convenient, especially if you aren’t sure what you want to see. It does not, however, make up for the lack of display options.
Any reader of Edward Tufte’s “The Visual Display of Quantitative Information” is aware that the information actually conveyed by a visualization is highly dependent on the form of the visualization. In other words, I’m not only pointing at an unchecked box when I complain that QuickSight lacks map-based charts. These omissions might be critically important, depending on the nature of your data and the questions you seek to answer.
Consider John Snow’s 1854 cholera outbreak map. The map is key to conveying the fact that most of the reported cases of cholera occurred around the Broad Street pump. (Remember that this was before the germ theory of disease; this case is considered foundational for the field of epidemiology.) I can’t reproduce this map in QuickSight; it has been reproduced very nicely as a story in Tableau.
Share your analyses
Once you have imported data and done an analysis, you can share your work in several ways. A sequence of visualization snapshots that you can share with others is called a story. This narration can convey insights much in the way that the Tableau story I cited earlier does, although the ability to add titles and descriptions to QuickSight is not as nice as the free-format annotation capabilities of Tableau.
A dashboard is a read-only snapshot of an analysis that you can share with other Amazon QuickSight users for reporting purposes. If you have enabled drill-downs, accomplished by adding multiple dimensions to the field wells, dashboards can be interactive even though they are read-only.
You can share your analysis itself with trusted collaborators and share data sets with collaborators. I would prefer to only share a copy of my analysis with collaborators and to be able to control the permissions granted to each collaborator. Unfortunately, neither capability has yet been implemented in QuickSight.
There is a little more control available when sharing data sets. There are two levels of permission: owner and user. A user can’t refresh, edit, delete, or reshare the data set, but can create new analyses from it. An owner can perform all of the above operations.
A good start
For a shop with many data sources hosted on AWS, limited analysis needs, and limited development time, using QuickSight appears to be a no-brainer. QuickSight adds easy analysis and visualization capabilities for a nominal cost.
On the other hand, QuickSight still lacks several capabilities that I consider important and may matter very much for some applications. For example, you wouldn’t want to analyze stock-market data without being able to display high-low-close graphs, and you wouldn’t want to analyze demographic or epidemiological data without being able to display map-based graphs, such as a choropleth map.
If you can program, you can of course create similar displays yourself using libraries such as D3.js, and host those apps on AWS EC2 instances or containers. But that would not help you to use QuickSight per se. It would be very nice if QuickSight had extensibility hooks so that you could, say, add your own display types, data transformation functions, and statistical aggregates to QuickSight that would, for example, take the form of AWS Lambda functions.
If you can’t program, however, and need those other capabilities, your best bet currently would be to use a more capable self-service BI tool—which brings us right back to Power BI, Qlik Sense, and Tableau.
—-
Cost: First user free with 1GB of SPICE; additional users $9 per month (Standard edition, yearly) or $18 per month (Enterprise edition, yearly). Additional users include 10GB of SPICE.
Platform: Amazon Web Services via web browser.
InfoWorld Scorecard |
Analytic power (20%)
|
Data sources (20%)
|
Presentation flexibility (20%)
|
Ease of use (20%)
|
Ease of learning (10%)
|
Value (10%)
|
Overall score (100%)
|
---|---|---|---|---|---|---|---|
Amazon QuickSight | 8 | 8 | 7 | 9 | 9 | 9 |
Copyright © 2017 IDG Communications, Inc.