Showing posts with label one-to-many. Show all posts
Showing posts with label one-to-many. Show all posts

Wednesday, March 7, 2012

Fast insert and select at the same time?

I have two tables: Account and AccountTransction. Each table contains more
than 20 million records. It one-to-many relationship between account and
AccountTransaction. The new records are constantly loading into each table
through text file by using DTS.
Question:
My team member insists that using cursor to insert record one by one to the
table to avoid affect (lock) the selection on these tables. . There are tons
of indexes on both tables for fast searching. The insertion process is
extremely slow. I recommended batch mode insertion, instead of one by one
using cursor. It is much more faster and efficient in terms of insertion,
but the selection while insertion going on is a little bit slower. What is
your suggestion? How can I achieve the fast insertion and fast selection at
the same time'
Is cursor alway a bad idea in terms of speed and performance?
Thanks a lot,
FlxI would never use a cursor to insert one row of data one by one...
Your colleague is smart to have worries about contention. However... it's
quite easy to do this in a safe manner. My standard technique for manaing a
situation like this is to:
* insert N number of rows per batch through an insert into stmt
* N is tested to ensure
- the insert happens fast enough to have a negligible impact on blocks
for selects
- durtion between batch inserts is long enough to ensure we're not
having a constant impact and quueses aren't growing
- but N is large enough to ensure I can insert enough records fast
enough such that the insert process isn't horrible slow.
I've been able to achieve VERY high insert and select throughput using
techniques like that...
--
Brian Moran
Principal Mentor
Solid Quality Learning
SQL Server MVP
http://www.solidqualitylearning.com
"FLX" <nospam@.hotmail.com> wrote in message
news:e5rEoTiAEHA.3004@.TK2MSFTNGP10.phx.gbl...
> I have two tables: Account and AccountTransction. Each table contains more
> than 20 million records. It one-to-many relationship between account and
> AccountTransaction. The new records are constantly loading into each table
> through text file by using DTS.
> Question:
> My team member insists that using cursor to insert record one by one to
the
> table to avoid affect (lock) the selection on these tables. . There are
tons
> of indexes on both tables for fast searching. The insertion process is
> extremely slow. I recommended batch mode insertion, instead of one by one
> using cursor. It is much more faster and efficient in terms of insertion,
> but the selection while insertion going on is a little bit slower. What is
> your suggestion? How can I achieve the fast insertion and fast selection
at
> the same time'
> Is cursor alway a bad idea in terms of speed and performance?
> Thanks a lot,
> Flx
>
>
>
>
>
>
>
>
>
>
>|||Here's an out-of-the-box idea.
Create new tables, same schema, so you have pairs to tables.
These new tables are for 'todays' data.
Create views to cover the pairs of tables. These are what your application/users look at / use.
The DTS populates the 'today' tables.
Once a day, at some light / quite period, stop the DTS. Copy / Move 'todays' data into the oringal, large table
Don't index the 'todays' tables, as they will (hopefully) be small enough to not need them. Or add only essential indexes.
Or only do this for the AccountTransaction table, and use the current method for the Account table.
If possible, you might want to drop the main table indexes just before you load the data from the 'todays' tables.
It depends, of course, on how quite your quite period will be.

Fast insert and select at the same time?

I have two tables: Account and AccountTransction. Each table contains more
than 20 million records. It one-to-many relationship between account and
AccountTransaction. The new records are constantly loading into each table
through text file by using DTS.
Question:
My team member insists that using cursor to insert record one by one to the
table to avoid affect (lock) the selection on these tables. . There are tons
of indexes on both tables for fast searching. The insertion process is
extremely slow. I recommended batch mode insertion, instead of one by one
using cursor. It is much more faster and efficient in terms of insertion,
but the selection while insertion going on is a little bit slower. What is
your suggestion? How can I achieve the fast insertion and fast selection at
the same time'
Is cursor alway a bad idea in terms of speed and performance?
Thanks a lot,
FlxI would never use a cursor to insert one row of data one by one...
Your colleague is smart to have worries about contention. However... it's
quite easy to do this in a safe manner. My standard technique for manaing a
situation like this is to:
* insert N number of rows per batch through an insert into stmt
* N is tested to ensure
- the insert happens fast enough to have a negligible impact on blocks
for selects
- durtion between batch inserts is long enough to ensure we're not
having a constant impact and quueses aren't growing
- but N is large enough to ensure I can insert enough records fast
enough such that the insert process isn't horrible slow.
I've been able to achieve VERY high insert and select throughput using
techniques like that...
Brian Moran
Principal Mentor
Solid Quality Learning
SQL Server MVP
http://www.solidqualitylearning.com
"FLX" <nospam@.hotmail.com> wrote in message
news:e5rEoTiAEHA.3004@.TK2MSFTNGP10.phx.gbl...
> I have two tables: Account and AccountTransction. Each table contains more
> than 20 million records. It one-to-many relationship between account and
> AccountTransaction. The new records are constantly loading into each table
> through text file by using DTS.
> Question:
> My team member insists that using cursor to insert record one by one to
the
> table to avoid affect (lock) the selection on these tables. . There are
tons
> of indexes on both tables for fast searching. The insertion process is
> extremely slow. I recommended batch mode insertion, instead of one by one
> using cursor. It is much more faster and efficient in terms of insertion,
> but the selection while insertion going on is a little bit slower. What is
> your suggestion? How can I achieve the fast insertion and fast selection
at
> the same time'
> Is cursor alway a bad idea in terms of speed and performance?
> Thanks a lot,
> Flx
>
>
>
>
>
>
>
>
>
>
>|||Here's an out-of-the-box idea.
Create new tables, same schema, so you have pairs to tables.
These new tables are for 'todays' data.
Create views to cover the pairs of tables. These are what your application/u
sers look at / use.
The DTS populates the 'today' tables.
Once a day, at some light / quite period, stop the DTS. Copy / Move 'todays'
data into the oringal, large table
Don't index the 'todays' tables, as they will (hopefully) be small enough to
not need them. Or add only essential indexes.
Or only do this for the AccountTransaction table, and use the current method
for the Account table.
If possible, you might want to drop the main table indexes just before you l
oad the data from the 'todays' tables.
It depends, of course, on how quite your quite period will be.

Sunday, February 26, 2012

Fan Trap or Multiple one-to-many joins

I'm having a problem creating a view on some data that involves two one-to-many joins like this:

tbl1 m--> tbl2 <--n tbl3

and I need to create a view on the data from all three tables without duplicates from tbl1 and tbl3.

The problem is that tbl1 and tbl3 are not related at all, except that they are linked by data in tbl2.

Think of it like this: you have a project, which can have multiple consultants, and multiple stakeholders, and the data must be returned in such a way that each consultant and each stakeholder appears once in the output (the project name must appear multiple times of course). The issue is that the following two datasets are logically distinct but semantically identical:

proj A, consultant A, stakeholder A
proj A, consultant B, stakeholder B
--
proj A, consultant B, stakeholder A
proj A, consultant A, stakeholder B

but what I'm aiming for is:

proj A, consultant A, stakeholder A
proj A, consultant B, stakeholder B

I've heard this described as a fan trap, but usual solutions involve reorganzing the data so that tbl3 joins tbl1 joins tbl2, but in my case there is no link there

Incidently, the intended platform for this is DB2 and/or SQL Server. Any help appreciated, thanks.your examples are not very clear

you start out by diagramming tbl1, tbl2, tbl3, and then immediately switch to projects, consultants, and shareholders, without showing the actual data in these tables, just some apparent cross join query results

you might wish to show a more comprehensive example, because so far, it's hard to understand what you're asking|||I've heard this described as many things, most of which aren't polite to repeat. ;)

You're trying to figure out how to build a join to show the relationship between consultants and shareholders, to produce a one-to-one join between two tables that explicitly have no relationship. If you figure out how to make this happen, please let me know... I'm sure that I'll be fascinated by the explanation!

You've got a clear relationship between project and shareholder, and another relationship between project and consultant. You don't explicitly state that there must be a shareholder or a consultant for any given project, and at some point in the project's life I can guarantee that there will not be one of either. You don't explicitly state that there must be one shareholder for every consultant. If you think about these requirements, unless at least one of these requirements is false, you can't get the output you want... There ain't no way to git there from here.

You need to rethink either the specifications or the requirements. Something has got to give because using the definitions that you've given, the present problem can't be solved.

-PatP|||ah, so that's what that is -- i had never heard that terminology before

this pdf is a pretty good explanation --
http://support.businessobjects.com/documentation/installation_resources/5i/tips_and_tricks/pdf/universe_design/ut001.pdf

i would solve this problem with a UNION query --

proj A, consultant A
proj A, consultant B
proj A, stakeholder A
proj A, stakeholder B