My database is much faster now because I found that I only need around
150,000 records in a table for any single query. The date selection will
always be in a certain range.
So if I break my tables up into smaller tables based on the maximum date
range, it's much faster.
I don't understand why it's twice as slow once there are more than a certain
number of records in the table. I have tried non-clustered and then clustered
indexes on the date field, and on date + customer ID as the key fields, but it
still gets slower as more records are added to the table. Is this normal?
SELECT DISTINCT A.CustID, B.Discount
FROM Sales A, Discounts B
WHERE A.SaleDate = '20000117'
  AND B.DiscountDate = '20000118'
  AND B.Discount > 0.2
If I have 9.5 million records in Sales, the query takes 2 to 4 seconds.
If I have about 60 tables of 150,000 records each and I select from the
appropriate table, the query takes about half a second or less.
Is this a common practice, to break tables up based on date range, or am I
still doing something wrong?
CREATE TABLE [Sales] (
    [SaleDate] [datetime] NOT NULL,
    [CustID] [varchar](10) NOT NULL,
    [SaleAmt] [decimal](8,3) NOT NULL
) ON [PRIMARY]

CREATE TABLE [Discounts] (
    [DiscountDate] [datetime] NOT NULL,
    [Discount] [decimal](8,3) NOT NULL,
    [Code] [numeric] NOT NULL
) ON [PRIMARY]

Rich
You may want to read the 'Creating a Partitioned View' article in the BOL.
Is the optimizer able to use the indexes defined on the table?
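Roughly, the idea from that article: each member table gets a CHECK
constraint on the date column, and a view UNION ALLs the members together;
with the constraints in place, the optimizer only reads the member that can
satisfy the date filter. A minimal sketch, with invented table names and
only two monthly members shown:

-- Member tables, one per date range, each with a CHECK
-- constraint on the partitioning column:
CREATE TABLE [Sales200001] (
    [SaleDate] [datetime] NOT NULL
        CHECK ([SaleDate] >= '20000101' AND [SaleDate] < '20000201'),
    [CustID] [varchar](10) NOT NULL,
    [SaleAmt] [decimal](8,3) NOT NULL
)
CREATE TABLE [Sales200002] (
    [SaleDate] [datetime] NOT NULL
        CHECK ([SaleDate] >= '20000201' AND [SaleDate] < '20000301'),
    [CustID] [varchar](10) NOT NULL,
    [SaleAmt] [decimal](8,3) NOT NULL
)
GO

-- The view glues the members back together; queries target
-- the view, not the member tables:
CREATE VIEW [SalesAll] AS
SELECT [SaleDate], [CustID], [SaleAmt] FROM [Sales200001]
UNION ALL
SELECT [SaleDate], [CustID], [SaleAmt] FROM [Sales200002]
GO

-- Because of the CHECK constraints, this should touch only
-- Sales200001:
SELECT CustID FROM [SalesAll] WHERE SaleDate = '20000117'

That gives you the same partitioning you built by hand, without having to
pick the right table yourself in every query.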
"Rich" <no@.spam.invalid> wrote in message
news:UekHe.54059$4o.18050@.fed1read06...
> <snip>
|||Hi
Where are your indexes?
Regards
--
Mike Epprecht, Microsoft SQL Server MVP
Zurich, Switzerland
MVP Program: http://www.microsoft.com/mvp
Blog: http://www.msmvps.com/epprecht/
"Rich" wrote:
> <snip>
|||Dear Rich,
When processing a join, the SQL optimizer evaluates all reasonable join
permutations and estimates the total I/O cost of each, in terms of I/O time.
The plan with the lowest estimated I/O time is the plan chosen.
Note that as the number of tables increases, the number of permutations the
optimizer must evaluate grows as the factorial of the number of tables in
the query: 2 tables means 2! = 2 permutations, 3 tables means 3! = 6,
4 tables means 4! = 24, and so on.
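If you want to see which plan the optimizer actually chose, you can ask the
server to return the estimated plan instead of executing the query. A
minimal sketch using the query from the original post:

SET SHOWPLAN_TEXT ON
GO
-- The plan (join order, index usage) is returned instead of
-- the result set; the query itself is not executed.
SELECT DISTINCT A.CustID, B.Discount
FROM Sales A, Discounts B
WHERE A.SaleDate = '20000117'
  AND B.DiscountDate = '20000118'
  AND B.Discount > 0.2
GO
SET SHOWPLAN_TEXT OFF
GO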
In your case you have 2 tables and, as far as I can see, they are not
related (no join in between). One way to solve your problem and optimize
your query processing time is to filter your Discounts and Sales tables
first and then build the join. Another is to build a relation between both
tables on primary keys and then index your criteria fields (i.e. SaleDate,
DiscountDate, Discount).
I would prefer you combine both ways; see the sketch below.
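For example, something along these lines. This is only a sketch: the index
names are invented, and since the posted tables share no common key the
combination below remains a cross join, exactly like the original query:

-- One clustered index per table on the filter columns
-- (index names are made up):
CREATE CLUSTERED INDEX IX_Sales_SaleDate
    ON Sales (SaleDate)
CREATE CLUSTERED INDEX IX_Discounts_Date_Discount
    ON Discounts (DiscountDate, Discount)
GO

-- Filter each table first, then combine the small results:
SELECT DISTINCT S.CustID, D.Discount
FROM (SELECT CustID FROM Sales
      WHERE SaleDate = '20000117') AS S
CROSS JOIN
     (SELECT Discount FROM Discounts
      WHERE DiscountDate = '20000118' AND Discount > 0.2) AS D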
Good Luck
Mario Aoun
"Rich" wrote:
> <snip>
|||I was just curious: what is the approximate criterion
that allows an optimizer to tell that
T3 joined T1 joined T2
is faster than
T1 joined T2 joined T3?
Is it based on the number of columns? It can't be based on
rows, because their number is unknown before the join
and would take too much time to compute...
Any information or pointer on this problem?
Thank you
- Pamela
.NET developer|||Rich,
Why aren't the two tables joined in the query? Do all Customers get all
Discounts? That doesn't seem to make sense (from a data model point of
view).
Assuming you do have some common key (and you join on it in the query),
then this type of query would benefit from a clustered index on the date
range, or from a covering index (see BOL for more details).
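A minimal sketch of the covering-index idea (the index names here are
invented): each index carries every column the query touches in that table,
so the optimizer can answer the query from the index pages alone, without
reaching the table itself.

-- Covers A.SaleDate (filter) and A.CustID (output):
CREATE NONCLUSTERED INDEX IX_Sales_Date_Cust
    ON Sales (SaleDate, CustID)

-- Covers B.DiscountDate (filter) and B.Discount (filter and output):
CREATE NONCLUSTERED INDEX IX_Discounts_Date_Disc
    ON Discounts (DiscountDate, Discount)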
Hope this helps,
Gert-Jan
Rich wrote:
> <snip>
|||"Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
news:42EE6C4F.DD1000E0@.toomuchspamalready.nl...
> Rich,
> Why aren't the two tables joined in the query? Do all Customers get all
> Discounts? That doesn't seem to make sense (from a data model point of
> view).
> Assuming you do have some common key (and you join on it in the query),
> then this type of query would benefit from a clustered index on the date
> range, or from a covering index (see BOL for more details).
> Hope this helps,
> Gert-Jan
That helps, thanks.
Richard