I'm building a large scale application and was wondering the difference between static counts vs dynamic counts.
Right now, I'm using dynamic counts. For an example:
SELECT *,
(SELECT COUNT(1) FROM CarImages WHERE CarImages.carid = Cars.id) as ImageCount,
(SELECT COUNT(1) FROM CarBuyers WHERE CarBuyers.carid = Cars.id) as BuyerCount,
(SELECT COUNT(1) FROM CarSellers WHERE CarSellers.carid = Cars.id) as SellerCount,
FROM Cars WHERE Cars.CarCondition > 2 AND IsCarHasGoodRecord(Cars.carid);
Inside IsCarHasGoodRecord function:
function IsCarHasGoodRecrd(carid INT) RETURNS BOOLEAN
BEGIN
DECLARE GoodCondition INT; DECLARE CarMaxConditionId INT;
SELECT 1 INTO CarMaxConditionId FROM CarMaxConditions WHERE carid = carid;
SELECT 1 INTO GoodCondition FROM KellyBlueBookConfirm WHERE carid = carid AND carmaxid = CarMaxConditionId;
--If the current "car row" is good condition, return true so that the car will be included in the result set
if GoodCondition = 1 THEN
RETURN TRUE;
ELSE
RETURN FALSE;
END IF;
END$;
What happens is that for each car row, carimage count, car buyer count, car seller count is retrieved. Then the car id is passed to the IsCarHasGoodRecrd to find out if the car has a good record and whether to include it in the result set. As you can tell from the query above, about 3 correlated subqueries are executed in the main query, and 2 other correlate subqueries are returned in the function...totaling 5 correlated subqueries for each row in car table.
Now, if everything is properly INDEXED, on a high-end server (2x dual opteron, 4gigs ram, 2x 250 gig)...with a large concurrent userbase and 20 million rows in all tables......what is performance going to be like? I never dealt with such databases or userbase so I don't know what to expect.
I was thinking of using static counts (where the there is a column for each count in the parent table) but it's too much maintenance and i would have to use transactions everytime i update it to make sure the counts are not off.
Table Schema:
[Car]
id - identity
name - carname
status - active, deleted etc
created
modified
[CarImage]
id - identity
carid - foreign key, points to [car].id
name - image name
status
created
[CarBuyers]
id - identity
carid - foreign key, points to [car].id
buyerid - foreign key, points to [users].id
created
status
[CarSellers]
id - identity
carid - foreign key, points to [car].id
sellerid - foreign key, points to [users].id
created
status
[users]
id - identity
name
created
status
The reason I use a stored function is to keep the SQL syntax simple. There's alot of tables OTHER than [car] table which follows the SAME pattern (like a [parts] table OR [upgrades] table). If I put the logic in a stored procedure...I could just write:
SELECT * FROM ([car] OR [parts] or [upgrades] or whatever that follows the same pattern) WHERE iscarvalid(blahblahblah);
I could do that with left joins but it's very long and tedious to write. I heard left joins maybe faster.
However, if everything is indexed..what performance am I expecting with the server configuration in my above post?