In previous article, I created two tables in my Redshift Cluster. Now I wan’t to find out the relation between salary of every employee and their working age. Tableau is the best choice for visualizing data analysis (SAS is too expensive and has no trail-version for learning).
First, we connect to Redshift in Tableau, and double-click the “New Custom SQL”. In the popup window, type in our SQL to query first-year-salary of every employee:
data:image/s3,"s3://crabby-images/143bf/143bf806f997bd4224a4f16ef0023fff1d1c1d21" alt=""
Now we have the table “custom sql query”. Drag in table “salary”, and choose “inner join” for employee_id, start_date:
data:image/s3,"s3://crabby-images/43795/43795bbc6656a3e547ab6f1b3d0645677851ca9b" alt=""
Click into the “Sheet 1”. Drag “salary” to “Rows”, “min_start_date” to “Columns”, and “employee_id” to “Color” in “Marks” panel.
data:image/s3,"s3://crabby-images/8e2b5/8e2b55f4ecb78a62e7b8c4791a485eca9ea78d3d" alt=""
Now we can see the “expensive employees” (who have the most high salary in the same first-year) on the top of the graph:
data:image/s3,"s3://crabby-images/7f6f1/7f6f16e9e661d8786a456df9574d22df44acd6b5" alt=""
Instead of adding custom SQL in tableau datasource panel, we can also create view in Redshift, and let tableau show views in “Tables”.
CREATE VIEW ts
AS SELECT employee_id, MIN(start_date) AS min_start_date
FROM salary
GROUP BY employee_id;
CREATE VIEW first_year_salary
AS SELECT ts.employee_id, s.salary, ts.min_start_date
FROM ts
JOIN salary AS s
ON ts.employee_id = s.employee_id AND ts.min_start_date = s.start_date;
Or using “WITH” clause
WITH ts
AS (SELECT employee_id, MIN(start_date) AS min_start_date
FROM salary
GROUP BY employee_id)
SELECT ts.employee_id, s.salary, ts.min_start_date
FROM ts
JOIN salary AS s
ON ts.employee_id = s.employee_id AND ts.min_start_date = s.start_date;
data:image/s3,"s3://crabby-images/be81a/be81ace67e6a9dbee883bbe14db01b99fd5f0462" alt=""