Tuesday, January 27, 2015

Mass data update on SQL table

INTRODUCTION

It is a tedious job to perform a mass update of a huge database by plain UPDATE statement since it will eat a lot of time. It is not advisable, a situation like executing, on production deployment since the deployment time window is minimal.

I was facing the same situation on one of my projects needs huge data update in a SQL table. So here I am providing step by step guide to do mass table update.
Consider a Table (Employee) with Column
emp_id INTEGER,
emp_name VARCHAR(),
emp_address NCLOB(),
emp_mob INTEGER;

ISSUE

The emp_address contains data like 'chennai', 'andhra' which needs to be changed to 'madras', 'telangana' respectively
then, we prefer to go with traditional approach of:
UPDATE employee set emp_address = REPLACE(emp_address, 'chennai', 'madras') where emp_address like '%chennai%';

UPDATE employee set emp_address = REPLACE(emp_address, 'andra', 'telangana') where emp_address like '%andra%';

Though it's a correct one, but it is not suitable updating large volume (1+ crore) of data.

SOLUTION

To overcome this situation,  one of the best approaches would be copying the whole table and do the necessary modification/transform on it.

Step 1

First, alter the session
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 4;

Step 2

Drop the new table if exists. if may throw an error if not exists, but it's fine.
DROP TABLE employee_new;

Step 3

Copy the existing table with necessary Modification.
CREATE TABLE employee_new AS
SELECT emp_id, emp_name,
CASE
 WHEN emp_address LIKE '%chennai%'
 THEN REPLACE(emp_address,'chennai','madras')

 WHEN emp_address LIKE '%andra%'
 THEN REPLACE(emp_address,'andra','telangana')

 ELSE field_values

 END field_values,
emp_mob FROM employee NOLOGGING;

Step 4

Apply necessary post table creation like NOCOMPRESS, NOCACHE, NOPARALLEL, MONITORING
ALTER TABLE employee_new NOCOMPRESS;
/
ALTER TABLE employee_new NOCACHE;
/
ALTER TABLE employee_new NOPARALLEL;
/
ALTER TABLE employee_new MONITORING;
/

6. Step 5

  1. Apply necessary comments for the column if any
  2. Create index if there is anything on 'employee' table
  3. Create Constraints if there is anything on 'employee' table.
ALTER TABLE employee_new LOGGING;
/

Step 6

If required, do necessary Validation(whether the new employee table has correct data copied from employee table)
CREATE OR REPLACE PROCEDURE EMPUPDATE
AS
 chennai_count    NUMBER := 0;
 andra_count    NUMBER := 0;
BEGIN
--if the new table has 0 records for the replaced value, then we can assume replace is the success.
SELECT COUNT(emp_id) INTO chennai_count FROM employee_new WHERE emp_address LIKE '%|chennai|%';
SELECT COUNT(emp_id) INTO chennai_count FROM employee_new WHERE emp_address LIKE '%|andra|%';

Step 7

Drop existing table, only if all the values are replaced successfully. (i.e: chennai_count and andra_count are 0)
IF chennai_count = 0 AND andra_count =0 THEN
 --Rename the actual employee table into employee_old which can be deleted manually once all the update was proper.
 EXECUTE IMMEDIATE 'ALTER TABLE employee RENAME TO employee_old';

 --Rename new employee table to employee table.
 EXECUTE IMMEDIATE 'ALTER TABLE employee_new RENAME TO employee';
ELSE
 raise_application_error (-20002, 'ERROR on Updating employee table. - ' || SQLCODE || ' -ERROR- ' || SQLERRM );
END IF;

EXCEPTION
WHEN OTHERS THEN
 raise_application_error (-20001, 'AN ERROR WAS ENCOUNTERED - ' || SQLCODE || ' -ERROR- ' || SQLERRM );
END EMPUPDATE;
/

EXECUTE EMPUPDATE;

No comments :

Post a Comment