Question

I have five tables in my database. Members, items, comments, votes and countries. I want to get 10 items. I want to get the count of comments and votes for each item. I also want the member that submitted each item, and the country they are from.

After posting here and elsewhere, I started using subselects to get the counts, but this query is taking 10 seconds or more!

SELECT `items_2`.*, 
   (SELECT COUNT(*) 
   FROM `comments` 
   WHERE (comments.Script = items_2.Id) 
   AND (comments.Active = 1)) 
  AS `Comments`, 
   (SELECT COUNT(votes.Member) 
   FROM `votes` 
   WHERE (votes.Script = items_2.Id) 
   AND (votes.Active = 1)) 
  AS `votes`, 
  `countrys`.`Name` AS `Country` 
FROM `items` AS `items_2` 
INNER JOIN `members` ON items_2.Member=members.Id AND members.Active = 1 
INNER JOIN `members` AS `members_2` ON items_2.Member=members.Id 
LEFT JOIN `countrys` ON countrys.Id = members.Country 
GROUP BY `items_2`.`Id` 
ORDER BY `Created` DESC 
LIMIT 10

My question is whether this is the right way to do this, if there's better way to write this statement OR if there's a whole different approach that will be better. Should I run the subselects separately and aggregate the information?

Was it helpful?

Solution

Yes, you can rewrite the subqueries as aggregate joins (see below), but I am almost certain that the slowness is due to missing indices rather than to the query itself. Use EXPLAIN to see what indices you can add to make your query run in a fraction of a second.

For the record, here is the aggregate join equivalent.

SELECT `items_2`.*,
  c.cnt AS `Comments`,
  v.cnt AS `votes`,
  `countrys`.`Name` AS `Country` 
FROM `items` AS `items_2` 
INNER JOIN `members` ON items_2.Member=members.Id AND members.Active = 1 
INNER JOIN `members` AS `members_2` ON items_2.Member=members.Id 
LEFT JOIN (
  SELECT Script, COUNT(*) AS cnt 
   FROM `comments` 
   WHERE Active = 1
   GROUP BY Script
) AS c
ON c.Script = items_2.Id 
LEFT JOIN ( 
  SELECT votes.Script, COUNT(*) AS cnt 
   FROM `votes` 
   WHERE Active = 1
   GROUP BY Script
) AS v
ON v.Script = items_2.Id 
LEFT JOIN `countrys` ON countrys.Id = members.Country 
GROUP BY `items_2`.`Id` 
ORDER BY `Created` DESC 
LIMIT 10

However, because you are using LIMIT 10, you are almost certainly as well off (or better off) with the subqueries that you currently have than with the aggregate join equivalent I provided above for reference.

This is because a bad optimizer (and MySQL's is far from stellar) could, in the case of the aggregate join query, end up performing the COUNT(*) aggregation work for the full contents of the Comments and Votes table before wastefully throwing everything but 10 values (your LIMIT) away, whereas in the case of your original query it will, from the start, only look at the strict minimum as far as the Comments and Votes tables are concerned.

More precisely, using subqueries in the way that your original query does typically results in what is called nested loops with index lookups. Using aggregate joins typically results in merge or hash joins with index scans or table scans. The former (nested loops) are more efficient than the latter (merge and hash joins) when the number of loops is small (10 in your case.) The latter, however, get more efficient when the former would result in too many loops (tens/hundreds of thousands or more), especially on systems with slow disks but lots of memory.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top