Using COLLECT STATISTICS in Teradata Using COLLECT STATISTICS in Teradata database database

Using COLLECT STATISTICS in Teradata


1>are there any disadvantages to using collect stats?

Yes, collect stats itself is time consuming, it actually locate data from AMPS and insert the stats in dictionary tables.

Suppose you have a table definition like:

ct t1(x1 int,y1 int, z1 int);

The table contains millions of rows and z1 is never used in the ST/Join conditions, then it is not worth to collect stats on z1.

2>When is it appropriate/inappropriate to use collect statistics in your SQL scripting?

Already answered above. If a column is going to be used as ST/Join condition .i.e in where or on clause, you must collect stats, otherwise not needed.

3>What's the performance benefit to collect statistics on a field that's already indexed?

ct t1(x1 int,y1 int) primary index(x1);

for a simple query like sel * from t1 where x1 = 5;

will demonstrate the usefulness of collect stats.

How?

the optimizer can correctly estimate how many rows this query will select and if t1 is going to be joined with say t2, a efficient join will be chosen by optimizer.

4>How long are statistics stored for (table, volatile tables)?

Table : permanently.

volatile tables: till session expires.

5>Any other comments concerning collect statistics would be appreciated.

Nothing has been discussed about multicolumn stats.

Say, the query is like:

sel * from t1 join t2 on y1=y2 and x1=2;

then collecting multi-column stats on (x1,y1) would be quite helpful in optimization.

Also, if table demography has been changed (increased number of rows) you must consider re-collecting the stats