Is using COUNT() or SELECT a good idea?

sql database

1. On count(*) vs. count(something else)

SQL is declarative in that you specify what you want. This is different from specifying how to get what you want. That means the database engine is free to realize your query in whatever way it thinks is the most efficient. Many database optimizers rewrites your query to a less costly alternative (if such a plan is available).

Given the following table:

table(   pk       not null  ,color    not null  ,nullable null  ,unique(pk)  ,index(color));

...all of the following are functionally equivalent (due to the mechanics of count and nulls):

1) select count(*) from table;2) select count(1) from table;3) select count(pk) from table;4) select count(color) from table;

Regardless of which form you use, the optimizer is free to rewrite the query to another form if it is more efficient. (Again, not all optimizers are sophisticated enough to do this). The unique index(pk) would be smaller (bytes occupied) than the entire table. Therefore it would be more efficient to count the number of index entries rather than scanning through the entire table. In Oracle we have bitmap indexes, which also compress repeating strings. If we had used such an index on the color column, it would probably have been the smallest index to scan. Oracle also supports table compression which in some cases makes the physical table smaller than a composite index.

1. TL;DR;Your specific dbms will have its own set of tools that enables different rewriting rules and in turn execution plans. That renders the question somewhat useless (unless we talk about a specific release of a specific dbms). I recommend COUNT(*) in all cases because it requires the least cognitive effort to grasp.

2. On select a,b,c vs. select *

There are very few valid uses of SELECT * in code you write and put into production. Imagine a table which contains Bluray movies (yes, the movies is stored as a blob in this table). So you slapped together your awesomesauce abstraction layer and put SELECT * FROM movies where id = ? in the getMovies(movie_id) method. I will refrain myself from explaining why SELECT name FROM movies will be transported across the network just a tad faster. Of course, in most realistic cases it won't have a noticable impact.

One last point on performance is that when all the referenced columns (selected, filtered) in your query exists as an index (called a covering index), the database need not touch the table at all. It can be fully resolved from scanning the index only. By selecting all columns you remove this option from the optimizer.

Another thing about SELECT * which is far more serious than anything, is that it creates an implicit dependency on a specific physical layout of the table. Let me explain. Consider the following tables:

table T1(name, id)table T2(name, id)

The following statement...

insert into t1 select * from t2;

... will break or produce a different result if any of the following happens:

Any of the tables columns are rearranged for example T1(id, name)
T1 gets an additional not-null column
T2 gets another column

2. TL;DR; When possible, explicitly specify the columns you want (eventually, you'll have to do that anyway). Also, selecting fewer columns are faster than selecting more columns. A possitive side-effect on explicit selects is that it gives greater freedom to the optimizer.

sql database

COUNT(*) is different from COUNT(column1) !
COUNT(*) returns the number of records, and does NOT use more resources, while COUNT(column1) counts the number of records where column1 is non null.

For SELECT, it is different. SELECT * will of course request more data.

sql database

When using count(*) the * doesn't mean "all fields". Using count(field) will count all non-null values in the field, but count(*) will always count all records even if all fields in all records are null, so it doesn't need to check the data in the fields at all.

Using select * means that you almost always return more data than you are going to use, which of course is a waste. However, perhaps more serious is the maintainence problem; if you add fields to a table your query will return these too. That might mean that the record becomes too large to fit in the buffer, resulting in an error message.

CodeHunter

Is using COUNT() or SELECT a good idea?

1. On count(*) vs. count(something else)

2. On select a,b,c vs. select *

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last

Is using COUNT(*) or SELECT * a good idea?

1. On count(*) vs. count(something else)

2. On select a,b,c vs. select *

Recent Posts

Is using COUNT() or SELECT a good idea?