I am developing an sql query to use to grab information from a postgres db. The scenario is as follows: the database I am grabbing information is the backend postgres database of an application (Tableau). My objective is to pull all of the workbooks, views, and datasources (considered objects) that are untagged in the database. I am using 4 tables - taggings (contains all tagged objects related to the object by the objects id), views (contains all views and their ids), workbooks (contains all workbooks and their ids), and datasources (contains all datasources and their ids).
I have already successfully created queries to pull all of the tagged objects from the database (they are the inner select's in the not exists statements), but getting the untagged objects is proving more difficult. The code I currently have works successfully for the datasources, but the workbooks section returns one extra record than it should and the views section returns 19 extra records than it should. I am using not exists to get all of the objects that are not tagged, and union to combine the separate queries for views, datasources, and workbooks. I cannot spot the error, so please let me know if you can.
I am testing the query using pgAdmin.
SELECT distinct w.name as Object,
t.taggable_type as Type
FROM workbooks w, taggings t
WHERE t.taggable_type = 'Workbook'
AND NOT EXISTS
(SELECT distinct w.name
FROM taggings
WHERE taggings.taggable_type = 'Workbook'
AND w.id = taggings.taggable_id)
Union
SELECT distinct v.name as Object,
t.taggable_type as Type
FROM views v, taggings t
WHERE NOT EXISTS
(SELECT v.name
FROM taggings
WHERE taggings.taggable_type = 'View'
AND v.id = taggings.taggable_id)
AND t.taggable_type = 'View'
Union
SELECT distinct d.name as Object,
t.taggable_type as Type
FROM datasources d, taggings t
WHERE t.taggable_type = 'Datasource'
AND NOT EXISTS
(SELECT d.name
FROM taggings
WHERE taggings.taggable_type = 'Datasource'
AND d.id = taggings.taggable_id)
order by type