Compare two table Datas

0 comments

from the forum by

John Spencer

thread:HOWTO: Compare two tables

TO FIND RECORDS IN TABLE1 WHICH DO NOT EXIST IN TABLE2

SELECT column_list
FROM table1
WHERE column_list NOT IN (SELECT column_list
FROM table2)

SELECT column_list
FROM table1
MINUS
SELECT column_list
FROM table2;

SELECT column_list
FROM table1
WHERE NOT EXISTS (SELECT 1
FROM table2
WHERE table1.column1 = table2.column1 and
table1.column2 = table2.column2 and
...);

SELECT table1.column_list
FROM table1, table2
WHERE table1.column1 = table2.column1(+) and
table1.column2 = table2.column2(+) and
... and
(table2.column1 IS NULL or
table2.column2 IS NULL or
...)
-- Use and instead of or if you are sure that there is no case where
-- the two column1's will match, but the two column2's will not.
TO FIND RECORDS IN TABLE1 WHICH EXIST IN TABLE2.

SELECT column_list
FROM table1
WHERE column_list IN (SELECT column_list
FROM table2)

SELECT column_list
FROM table1
INTERSECT
SELECT column_list
FROM table2;

SELECT column_list
FROM table1
WHERE EXISTS (SELECT 1
FROM table2
WHERE table1.column1 = table2.column1 and
table1.column2 = table2.column2 and
...);

SELECT table1.column_list
FROM table1, table2
WHERE table1.column1 = table2.column1 and
table1.column2 = table2.column2 and
...
Which is better, depends on a number of factors. Some of the most important factors are:

1. Absloute sizes of the two tables
2. The relative sizes of the two tables
3. Cost Based Optimizer versus Rule Based Optimizer
4. Version of Oracle
5. Number and type columns that are being compared.

Number 5 may need a little clarification. If the question is find all Primary Key's in table1 which exist or do not exist in table2, then using EXISTS or a JOIN is probably best. However, if the question is find entire rows in table1 that do or do not have a corresponding entire row in table2 then INTERSECT or MINUS are likely to be best.

In all of the queries above, column_list represents the columns that you want to find the matches or mismatches for. This can be anything from a single column to * (all columns). column1, column2 ... represent each of the columns in your column list.

If we assume that there are no other selection criteria used against either table (which might affect the choice of indexes), then all methods will require at least one full table scan (at best a fast full index scan), and sometimes two. So, the chief factor affecting performance is the method Oracle uses to get the rows in table1 that do or do not exist in table2. On a conceptual basis, this is how I understand the various approaches operate. This is almost certainly not the exact algorithm that Oracle uses.

IN / NOT IN will do a full scan of table2 to get a list of values, then for each row of table1 scan this LOV to see if it finds a match (IN) or does not (NOT IN). If table2 is absolutely small (e.g. state_code table with 50 records) then this may well be the fastest approach since scanning the LOV should be fast in memory.

INTERSECT/ MINUS does a full scan of both tables, sorts each, then scans the two lists together to identify matches or mismatches. If the two tables are relatively similar in size, then this is likely to be one of the faster methods. The optimizer may choose to use index scans if the columns being compared are all indexed. If you are trying to compare all columns, or a large (unindexed) subset, then this is likely to be the fastest method. It may also be faster if you expect a relatively large number of matched or mis-matched records.

In a NOT EXISTS (or EXISTS), for each row in table 1, the correlated sub-query is executed against table2. If the all of the indentity columns in the sub-query are indexed, then this is pretty quick. The relative sizes of the tables do not matter much, but if both are large, this may be a little slower. Also, it is not as efficient when you need to look at values in several columns. If table2 is not appropriately indexed, and it is large, then this could be incredibly slow since it will require a full scan of table2 for each row in table1.

The [OUTER] JOIN approach uses SORT JOIN - MERGE JOIN or SORT JOIN - NESTED LOOP (depending on the Oracle version and optimizer mode). This is pretty efficient, particularly if the two tables are large, and relativley equally sized, and you expect a relativley small number of mismatches, or a relatively large number of matches (INNER JOIN).

There is also a fifth approach to finding records in table1 that do not exist in table2.

In sqlplus run $ORACLE_HOME/rdbms/admin/utlexcpt.sql then

ALTER TABLE table1
ADD constraint t1_t2_fk
FOREIGN KEY (column_list) REFERENCES table2 (column list)
EXCEPTIONS INTO exceptions
The rowid of any rows that do not pass validation will be stored in a table called exception (created by the script). You can then use the rowids to identify and deal with the rows. NOTE: This requires a unique index on column_list in table2.

Probably not useful for a one-off excercise, but it will prevent it happening on an on-going basis.

Caveat emptor

This posting is the work of the Rogue Moderators. It is posted with the best of intentions but is not guaranteed in any way, shape or form. In particular, the code is presented "as is" and you assume full responsibility for running it on your system.

reference:
http://forums.oracle.com/forums/thread.jspa?threadID=2996

Terminal time for Linux

0 comments

Linux only: Do you spend most of your day working at the command line? Either in a terminal editor or full screen text? Then how about embedding a clock into the top right hand corner of your terminal so you know exactly how long you have been working. There are many ways to achieve this (or similar) effects:

Sergio Gonzalez Duran on Linux.com offers this longish but effective way round:

Save the clock.sh script into your home directory: clock.sh

Chmod clock.sh’s status to 755:

chmod 755 clock.sh

Then in a terminal run:

./clock.sh&

and you’ll have a small clock in the top right hand of your terminal window. It works in xterm and apparently in many other emulators as well. There are also variations on this script at the original article on Linux.com.

For a simpler and more elegant solution if you regularly use full screen text (ie no graphical interface at all) try this on Ubuntu (and many other distros):

sudo -b vcstime&

and you’ll have a simple clock in the right-top corner of your screen.

[Via Linux.com]

http://www.tectonic.co.za/2008/01/terminal-time-for-linux/comment-page-1/#comment-976175