help with SQL count and nested SELECT

unoho · **Posted:** Sat May 07, 2011 8:57 pm

Okay..so I just started out SQL so it might be something really easy that im not figuring out. but this is what i want to achieve:

Posted Image, might have been reduced in size. Click Image to view fullscreen.

but so far this is what I have:

code:

SELECT region, name FROM bbc
WHERE region IN (SELECT region FROM bbc WHERE name='Brazil'
OR name='Mexico')

ORDER BY region, name

i got the region and name to display but when i try to implement count, (for eg, SELECT region, count(name), name), it messes up the whole table (displays only south am, 5, argentina).

Any ideas what I should do to get to my desired destination?
Im using SQL Server 2005 and SQL server management studio.

Thanks a lot!

Tony · **Posted:** Sat May 07, 2011 9:08 pm

If you want "Canada", "USA", and "Mexico" to all be in the same row, you would need to figure out how to combine those multiple entries into a single field. Perhaps something like concat? http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat

(more likely CONCAT_WS, as you'd want a separator in there; or whatever the equivalent for your DBMS is.)

unoho · **Posted:** Sat May 07, 2011 9:42 pm

Tony · **Posted:** Sat May 07, 2011 10:06 pm

You would probably select all the country names first, and LEFT JOIN all the other stuff onto them. Keep in mind that this would require you to calculate the number of countries in a region... for each country in that region.

2goto1 · **Posted:** Sat May 07, 2011 11:03 pm

You can use an inner join to join your countries to regions. That takes care of your requirement of returning each country and its region in a separate row. SQL allows you to add any number of correlated subqueries to a query, such as:

select 1, (select count(1) from foo) as total, (select count(1) from foo) as total2, etc.

That should be all you need to do to display the number of countries per region

unoho · **Posted:** Mon May 09, 2011 12:31 pm

thank you, got it to work, but it was kinda slow with 100K + data... took approx 20sec to display the values

Tony · **Posted:** Mon May 09, 2011 12:35 pm

If you had indexes, subqueries / JOINs can't use those. Check with EXPLAIN (or equivalent) to see how bad things actually get internally.

http://dev.mysql.com/doc/refman/5.0/en/explain.html

2goto1 · **Posted:** Mon May 09, 2011 12:44 pm

Yeah, it's not a very scalable type of query because for each row returned, a sub-query is executed. That being said for even 100,000 rows of data, it's surprising to hear that it's taking 20 seconds.

If performance is an issue, then you could try doing some things to improve performance. The most common things that usually improve performance are creating appropriate indexes based on your query columns, and creating surrogate keys for your tables - numeric id columns typically in SQL Server - that join and filter on numeric values rather than natural ids (i.e. RegionId, CountryId, versus RegionName, Name, etc.).

Getting rid of the sub-query for each and every row would definitely improve performance. Sometimes de-normalizing or introducing redundant columns into your data model can help to improve performance where it is important. A "number_of_countries" column in your regions table would eliminate the subquery that you're currently using, for example.

2goto1 · **Posted:** Mon May 09, 2011 12:47 pm

Tony @ Mon May 09, 2011 12:35 pm wrote:

If you had indexes, subqueries / JOINs can't use those. Check with EXPLAIN (or equivalent) to see how bad things actually get internally.

http://dev.mysql.com/doc/refman/5.0/en/explain.html

It sounds like he's using SQL Server 2005. Indexes are supported with subqueries and joins, at least with SQL Server. I'm sure that joins do as well with MySQL, at least with their InnoDB database engine? I'd be surprised if joins didn't support indexes.

SQL Server also has an explain plan type of tool, as well as other tools to evaluate performance.

Tony · **Posted:** Mon May 09, 2011 12:55 pm

I'm not familiar with the SQL flavour of SQL Server, but I figure there are equivalents for most of what MySQL implements.

I thought that in cases where a temporary table is created by the query, it would not have indexes on it. Is this not the case for some engines? (How would that work?)

2goto1 · **Posted:** Mon May 09, 2011 1:50 pm

In SQL Server, temporary objects are almost always created for all but the simplest of queries, if not always created. The query engine goes through a series of execution steps to execute a query. Each step produces an output which is then passed as input to the subsequent step. That sequence of steps is called the Execution Plan. I think that most database engines have the same concept of creating a plan, and then executing the plan in order to perform the query.

Those temporary objects are often decided on per database vendor, since their processing engines contain many proprietary things. Your indexes are still used for joins and subqueries, but just because they're used doesn't mean that the queries will be efficient. I.e. in the scenario where a subquery executes once per row for a query that processes millions of rows, your indexes will get used but performance might be terrible.

MySQL is neat because they have different DB engines that you can plug in, such as a flat file system, or relational DB. SQL Server and Oracle have similar things but only for backward compatibility

unoho · **Posted:** Tue May 10, 2011 7:59 am

ya the problem is that i don't really know much about SQL but i still have to get the work done somehow. LOL

according to the person im working with, he said you could index it somehow to make it much faster but again, it's not required.

	Computer Science Canada Programming C, C++, Java, PHP, Ruby, Turing, VB Username: Password: Register
Wiki Blog Search Turing Chat Room Members