Problem

This question is about the effectiveness of a SQL Server indexing technique. I think it is known as "index intersection".

I'm working with an existing SQL Server (2008) application that has a number of performance and stability issues. The developers did some odd things with indexing. I’ve not been able to get conclusive benchmarks on these issues, nor can I find any really good documentation on the internets.

There are many searchable columns on a table. The developers created a single column index on EACH of the searchable columns. The theory was that SQL Server would be able to combine (intersect) each of these indexes to efficiently access the table in most circumstances. Here is a simplified example (real table has more fields):

CREATE TABLE [dbo].[FatTable](
    [id] [bigint] IDENTITY(1,1) NOT NULL,
    [col1] [nchar](12) NOT NULL,
    [col2] [int] NOT NULL,
    [col3] [varchar](2000) NOT NULL, ...

CREATE NONCLUSTERED INDEX [IndexCol1] ON [dbo].[FatTable]  ( [col1] ASC )
CREATE NONCLUSTERED INDEX [IndexCol2] ON [dbo].[FatTable] ( [col2] ASC )

select * from fattable where col1 = '2004IN' 
select * from fattable where col1 = '2004IN' and col2 = 4

I think multi-column indexes targeted to the actual search criteria are much better, but I may be wrong. I have seen query plans that show SQL Server doing a hash match on two index seeks. Perhaps this makes sense when you don't know in advance how the table will be searched?
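For concreteness, the kind of composite index I have in mind would be something like this (a sketch against the simplified table above, assuming col1 and col2 are the columns most often searched together):

-- Hypothetical composite index covering both predicates of the second query
CREATE NONCLUSTERED INDEX [IndexCol1Col2] ON [dbo].[FatTable] ( [col1] ASC, [col2] ASC )

Thanks.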


Solution

What you need are covering indexes, i.e. indexes that can satisfy a query on their own. But a covering index has one problem: it covers a specific query. So in order to develop a good indexing strategy, you need to understand your workload: which queries hit the database, which ones are critical and which are not, how often each type of query runs, and so on. Then you balance this against the write and update cost of each index, and there you have your indexing strategy. If that sounds complicated, it is because it is complicated.
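As a sketch only (reusing the simplified FatTable from the question, and assuming the critical query filters on col1 and col2 but returns only col3 rather than *, since a covering index must include every column the query touches), a covering index could look like this:

-- Key columns support the WHERE clause; the INCLUDE column lets the query
-- be answered from the index alone, without a lookup into the base table.
CREATE NONCLUSTERED INDEX [IX_FatTable_Col1_Col2] ON [dbo].[FatTable]
    ( [col1] ASC, [col2] ASC )
    INCLUDE ( [col3] )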

However, you can apply some rules of thumb; the MSDN index design guidelines cover the basics quite well.

There is also a myriad of community-contributed articles, e.g. Webcast Recording – DBA Darwin Awards: Index Edition.

And to answer your question specifically: separate indexes on each column can work, provided that each column has high selectivity (many distinct values, each value appearing only a few times in the table). The resulting access plan, using a hash join between two index range scans, usually works quite well. Columns with low selectivity (few distinct values, each value appearing many times in the table) are not worth indexing on their own; the query optimizer will simply ignore them. However, low-selectivity columns often work well in a composite key when paired with a high-selectivity column.
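For example, a rough selectivity check against the example table could be as simple as:

-- A ratio near 1.0 means almost every value is unique (a good candidate for its own index);
-- a ratio near 0 means few distinct values (a poor candidate on its own).
SELECT COUNT(DISTINCT [col1]) * 1.0 / COUNT(*) AS col1_selectivity,
       COUNT(DISTINCT [col2]) * 1.0 / COUNT(*) AS col2_selectivity
FROM [dbo].[FatTable]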
