Wednesday, March 7, 2012

Fast look up of long (n)varchar

I have a table containing URLs. I want to be able to look up an URL very
fast, so I used an nvarchar to store the URL, and put an index on it
(maybe naive).

Anyway, I bump into:
"The index entry of length 911 bytes for the index 'UQ__URL__1367E606'
exceeds the maximum length of 900 bytes."

What's the best way to handle this? I want to do the look up fast. The
only thing I could think up was adding an extra column containing a digest
for the URL, and look up all URLs with the same digest, *and* having the
same value (which could give either 1 or 0 results).

I am new to MS SQL, so I might describe a silly solution, basically I want
to look up URLs to ID the fastest way possible.

--
John MexIT: http://johnbokma.com/mexit/
personal page: http://johnbokma.com/
Experienced programmer available: http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.htmlJohn,

use an index on checksum(url), as described here:

http://www.devx.com/dbzone/Article/30786|||>Use an index on checksum(url), as described here:

oh wow. that is brilliant. I have never run into that before.

that is a GREAT idea that I am filing away for futures.

thank you!|||John Bokma <john@.castleamber.com> wrote:

> I have a table containing URLs. I want to be able to look up an URL
> very fast, so I used an nvarchar to store the URL, and put an index on
> it (maybe naive).
> Anyway, I bump into:
> "The index entry of length 911 bytes for the index 'UQ__URL__1367E606'
> exceeds the maximum length of 900 bytes."
> What's the best way to handle this? I want to do the look up fast. The
> only thing I could think up was adding an extra column containing a
> digest for the URL, and look up all URLs with the same digest, *and*
> having the same value (which could give either 1 or 0 results).
> I am new to MS SQL, so I might describe a silly solution, basically I
> want to look up URLs to ID the fastest way possible.

To answer my own question: under CHECKSUM in SQL Server Books Online:

"The checksum index can be used as a hash index, particularly to improve
indexing speed when the column to be indexed is a long character column."

Comes with an example, etc.

--
John MexIT: http://johnbokma.com/mexit/
personal page: http://johnbokma.com/
Experienced programmer available: http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html

No comments:

Post a Comment