Oracle <> , != , ^= operators Oracle <> , != , ^= operators database database

Oracle <> , != , ^= operators


I have tested the performance of the different syntax for the not equal operator in Oracle. I have tried to eliminate all outside influence to the test.

I am using an 11.2.0.3 database. No other sessions are connected and the database was restarted before commencing the tests.

A schema was created with a single table and a sequence for the primary key

CREATE TABLE loadtest.load_test (  id NUMBER NOT NULL,  a VARCHAR2(1) NOT NULL,  n NUMBER(2) NOT NULL,  t TIMESTAMP NOT NULL);CREATE SEQUENCE loadtest.load_test_seqSTART WITH 0MINVALUE 0;

The table was indexed to improve the performance of the query.

ALTER TABLE loadtest.load_testADD CONSTRAINT pk_load_testPRIMARY KEY (id)USING INDEX;CREATE INDEX loadtest.load_test_i1ON loadtest.load_test (a, n);

Ten million rows were added to the table using the sequence, SYSDATE for the timestamp and random data via DBMS_RANDOM (A-Z) and (0-99) for the other two fields.

SELECT COUNT(*) FROM load_test;COUNT(*)----------100000001 row selected.

The schema was analysed to provide good statistics.

EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'LOADTEST', estimate_percent => NULL, cascade => TRUE);

The three simple queries are:-

SELECT a, COUNT(*) FROM load_test WHERE n <> 5 GROUP BY a ORDER BY a;SELECT a, COUNT(*) FROM load_test WHERE n != 5 GROUP BY a ORDER BY a;SELECT a, COUNT(*) FROM load_test WHERE n ^= 5 GROUP BY a ORDER BY a;

These are exactly the same with the exception of the syntax for the not equals operator (not just <> and != but also ^= )

First each query is run without collecting the result in order to eliminate the effect of caching.

Next timing and autotrace were switched on to gather both the actual run time of the query and the execution plan.

SET TIMING ONSET AUTOTRACE TRACE

Now the queries are run in turn. First up is <>

> SELECT a, COUNT(*) FROM load_test WHERE n <> 5 GROUP BY a ORDER BY a;26 rows selected.Elapsed: 00:00:02.12Execution Plan----------------------------------------------------------Plan hash value: 2978325580--------------------------------------------------------------------------------------| Id  | Operation             | Name         | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------------|   0 | SELECT STATEMENT      |              |    26 |   130 |  6626   (9)| 00:01:20 ||   1 |  SORT GROUP BY        |              |    26 |   130 |  6626   (9)| 00:01:20 ||*  2 |   INDEX FAST FULL SCAN| LOAD_TEST_I1 |  9898K|    47M|  6132   (2)| 00:01:14 |--------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("N"<>5)Statistics----------------------------------------------------------          0  recursive calls          0  db block gets      22376  consistent gets      22353  physical reads          0  redo size        751  bytes sent via SQL*Net to client        459  bytes received via SQL*Net from client          3  SQL*Net roundtrips to/from client          1  sorts (memory)          0  sorts (disk)         26  rows processed

Next !=

> SELECT a, COUNT(*) FROM load_test WHERE n != 5 GROUP BY a ORDER BY a;26 rows selected.Elapsed: 00:00:02.13Execution Plan----------------------------------------------------------Plan hash value: 2978325580--------------------------------------------------------------------------------------| Id  | Operation             | Name         | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------------|   0 | SELECT STATEMENT      |              |    26 |   130 |  6626   (9)| 00:01:20 ||   1 |  SORT GROUP BY        |              |    26 |   130 |  6626   (9)| 00:01:20 ||*  2 |   INDEX FAST FULL SCAN| LOAD_TEST_I1 |  9898K|    47M|  6132   (2)| 00:01:14 |--------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("N"<>5)Statistics----------------------------------------------------------          0  recursive calls          0  db block gets      22376  consistent gets      22353  physical reads          0  redo size        751  bytes sent via SQL*Net to client        459  bytes received via SQL*Net from client          3  SQL*Net roundtrips to/from client          1  sorts (memory)          0  sorts (disk)         26  rows processed

Lastly ^=

> SELECT a, COUNT(*) FROM load_test WHERE n ^= 5 GROUP BY a ORDER BY a;26 rows selected.Elapsed: 00:00:02.10Execution Plan----------------------------------------------------------Plan hash value: 2978325580--------------------------------------------------------------------------------------| Id  | Operation             | Name         | Rows  | Bytes | Cost (%CPU)| Time     |--------------------------------------------------------------------------------------|   0 | SELECT STATEMENT      |              |    26 |   130 |  6626   (9)| 00:01:20 ||   1 |  SORT GROUP BY        |              |    26 |   130 |  6626   (9)| 00:01:20 ||*  2 |   INDEX FAST FULL SCAN| LOAD_TEST_I1 |  9898K|    47M|  6132   (2)| 00:01:14 |--------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("N"<>5)Statistics----------------------------------------------------------          0  recursive calls          0  db block gets      22376  consistent gets      22353  physical reads          0  redo size        751  bytes sent via SQL*Net to client        459  bytes received via SQL*Net from client          3  SQL*Net roundtrips to/from client          1  sorts (memory)          0  sorts (disk)         26  rows processed

The execution plan for the three queries is identical and the timings 2.12, 2.13 and 2.10 seconds.

It should be noted that whichever syntax is used in the query the execution plan always displays <>

The tests were repeated ten times for each operator syntax. These are the timings:-

<>2.092.132.122.102.072.092.102.132.132.10!=2.092.102.122.102.152.102.122.102.102.12^=2.092.162.102.092.072.162.122.122.092.07

Whilst there is some variance of a few hundredths of the second it is not significant. The results for each of the three syntax choices are the same.

The syntax choices are parsed, optimised and are returned with the same effort in the same time. There is therefore no perceivable benefit from using one over another in this test.

"Ah BC", you say, "in my tests I believe there is a real difference and you can not prove it otherwise".

Yes, I say, that is perfectly true. You have not shown your tests, query, data or results. So I have nothing to say about your results. I have shown that, with all other things being equal, it doesn't matter which syntax you use.

"So why do I see that one is better in my tests?"

Good question. There a several possibilities:-

  1. Your testing is flawed (you did not eliminate outside factors -other workload, caching etc You have given no information aboutwhich we can make an informed decision)
  2. Your query is a special case (show me the query and we can discuss it).
  3. Your data is a special case (Perhaps - but how - we don't see that either).
  4. There is some other outside influence.

I have shown via a documented and repeatable process that there is no benefit to using one syntax over another. I believe that <> != and ^= are synonymous.

If you believe otherwise fine, so

a) show a documented example that I can try myself

and

b) use the syntax which you think is best. If I am correct and there is no difference it won't matter. If you are correct then cool, you have an improvement for very little work.

"But Burleson said it was better and I trust him more than you, Faroult, Lewis, Kyte and all those other bums."

Did he say it was better? I don't think so. He didn't provide any definitive example, test or result but only linked to someone saying that != was better and then quoted some of their post.

Show don't tell.


You reference the article on the Burleson site. Did you follow the link to the Oracle-L archive? And did you read the other emails replying to the email Burleson cites?

I don't think you did, otherwise you wouldn't have asked this question. Because there is no fundamental difference between != and <>. The original observation was almost certainly a fluke brought about by ambient conditions in the database. Read the responses from Jonathan Lewis and Stephane Faroult to understand more.


" Respect is not something a programmer need to have, its the basic attitude any human being should have"

Up to a point. When we meet a stranger in the street then of course we should be courteous and treat them with respect.

But if that stranger wants me to design my database application in a specific way to "improve performance" then they should have a convincing explanation and some bulletproof test cases to back it up. An isolated anecdote from some random individual is not enough.


The writer of the article, although a book author and the purveyor of some useful information, does not have a good reputation for accuracy. In this case the article was merely a mention of one persons observations on a well known Oracle mailing list. If you read through the responses you will see the assumptions of the post challenged, but no presumption of accuracy. Here are some excerpts:

Try running your query through explain plan (or autotrace) and see what that says... According to this, "!=" is considered to be the same as "<>"... Jonathan Lewis

Jonathan Lewis is a well respected expert in the Oracle community.

Just out of curiosity... Does the query optimizer generate a different execution plan for the two queries? Regards, Chris

.

Might it be bind variable peeking in action? The certain effect of writing != instead of <> is to force a re-parse. If at the first execution the values for :id were different and if you have an histogram on claws_doc_id it could be a reason. And if you tell me that claws_doc_id is the primary key, then I'll ask you what is the purpose of counting, in particular when the query in the EXISTS clause is uncorrelated with the outer query and will return the same result whatever :id is. Looks like a polling query. The code surrounding it must be interesting.

Stéphane Faroult

.

I'm pretty sure the lexical parse converts either != to <> or <> to !=, but I'm not sure whether that affects whether the sql text will match a stored outline.

.

Do the explain plans look the same? Same costs?

The following response is from the original poster.

Jonathan, Thank you for your answer. We did do an explain plan on both versions of the statement and they were identical, which is what is so puzzling about this. According to the documentation, the two forms of not equal are the same (along with ^= and one other that I can't type), so it makes no sense to me why there is any difference in performance.

Scott Canaan

.

Not an all inclusive little test but it appears at least in 10.1.0.2 it gets pared into a "<>" for either (notice the filter line for each plan)

.

Do you have any Stored Outline ? Stored Outlines do exact (literal) matches so if you have one Stored Outline for, say, the SQL with a "!=" and none for the SQL with a "<>" (or a vice versa), the Stored Outline might be using hints ? (although, come to think of it, your EXPLAIN PLAN should have shown the hints if executing a Stored Outline ?)

.

Have you tried going beyond just explain & autotrace and running a full 10046 level 12 trace to see where the slower version is spending its time? This might shed some light on the subject, plus - be sure to verify that the explain plans are exactly the same in the 10046 trace file (not the ones generated with the EXPLAIN= option), and in v$sqlplan. There are some "features" of autotrace and explain that can cause it to not give you an accurate explain plan.

Regards, Brandon

.

Is the phenomenon totally reproducible ?

Did you check the filter_predicates and access_predicates of the plan, or just the structure. I don't expect any difference, but a change in predicate order can result in a significant change in CPU usage if you are unlucky.

If there is no difference there, then enable rowsource statistics (alter session set "_rowsource_execution_statistics"=true) and run the queries, then grab the execution plan from V$sql_plan and join to v$sql_plan_statistics to see if any of the figures about last_starts, last_XXX_buffer_gets, last_disk_reads, last_elapsed_time give you a clue about where the time went.

If you are on 10gR2 there is a /*+ gather_plan_statistics */ hint you can use instead of the "alter session".

Regards Jonathan Lewis

At this point the thread dies and we see no further posts from the original poster, which leads me to believe that either the OP discovered an assumption they had made that was not true or did no further investigation.

I will also point out that if you do an explain plan or autotrace, you will see that the comparison is always displayed as <>.

Here is some test code. Increase the number of loop iterations if you like. You may see one side or the other get a higher number depending on the other activity on the server activity, but in no way will you see one operator come out consistently better than the other.

DROP TABLE t1;DROP TABLE t2;CREATE TABLE t1 AS (SELECT level c1 FROM dual CONNECT BY level <=144000);CREATE TABLE t2 AS (SELECT level c1 FROM dual CONNECT BY level <=144000);SET SERVEROUTPUT ON FORMAT WRAPPEDDECLARE   vStart  Date;   vTotalA Number(10) := 0;   vTotalB Number(10) := 0;   vResult Number(10);BEGIN      For vLoop In 1..10 Loop      vStart := sysdate;      For vLoop2 In 1..2000 Loop         SELECT count(*) INTO vResult FROM t1 WHERE t1.c1 = 777 AND EXISTS            (SELECT 1 FROM t2 WHERE t2.c1 <> 0);      End Loop;      vTotalA := vTotalA + ((sysdate - vStart)*24*60*60);      vStart := sysdate;      For vLoop2 In 1..2000 Loop         SELECT count(*) INTO vResult FROM t1 WHERE t1.c1 = 777 AND EXISTS            (SELECT 1 FROM t2 WHERE t2.c1 != 0);      End Loop;      vTotalB := vTotalB + ((sysdate - vStart)*24*60*60);      DBMS_Output.Put_Line('Total <>: ' || RPAD(vTotalA,8) || '!=: ' || vTotalB);      vTotalA := 0;      vTotalB := 0;   End Loop;END;