Home » SQL & PL/SQL » SQL & PL/SQL » scrambling the data for a column in a tables (oracle 11g version 2 )
scrambling the data for a column in a tables [message #659224] |
Mon, 09 January 2017 23:09 |
|
raj85844
Messages: 26 Registered: November 2016 Location: chennai
|
Junior Member |
|
|
Hi Friends,
Wishing you all a very happy new year...
I got a requirement in scrambling the data in the table
The below are the columns with the data length as mention for referrence
DATE_OF_BIRTH - Year(last digit),Month and date
SURNAME - length 30
FIRST_NAME - length 20
MIDDLE_NAMES - length 50
HOME_PHONE_NO - length 25
MOBILE_PHONE_NO - length 25
EMAIL_ADDRESS - length 255
POSTCODE - Total length is 8 but first 4 digits should be scramble.
As the data in the columns need to be scrambled with the logic as mentioned below
Char/Varchar : randomly generate the letter from(A-Z).
Number : For number randomly select the no from 0-9.
Month : For Month the no should be generate randomly using 01-12.
Day : For Day the no should be generate randomly using 01-31.
i tried to create a generic stored procedure for scrambling the data's in the table
-->input parameters
i/p as p_table_name
i/p as p_schema_name
required a help in achieving the logic in sql with better performance in the store procedure
since this change is going to be in a larger table with 10 million records
Thanks
Raj
|
|
|
Re: scrambling the data for a column in a tables [message #659226 is a reply to message #659224] |
Mon, 09 January 2017 23:38 |
|
Barbara Boehmer
Messages: 9097 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
It is unclear what you mean by scrambling the data. I am guessing that you want to generate random data. If so, then please see the partial example below. I have used only three columns, one date, one character, and one numeric. For the date, I have subtracted a random number from sysdate, instead of using 1 to 31 days, since some months do not have 31 days. I have generated 10 rows, but you could specify any number of rows you like.
SCOTT@orcl_12.1.0.2.0> CREATE TABLE scrambled
2 (DATE_OF_BIRTH DATE,
3 FIRST_NAME VARCHAR2(20),
4 POSTCODE NUMBER(8))
5 /
Table created.
SCOTT@orcl_12.1.0.2.0> INSERT INTO scrambled (date_of_birth, first_name, postcode)
2 SELECT SYSDATE - DBMS_RANDOM.VALUE (1,366),
3 DBMS_RANDOM.STRING ('U', 20),
4 LPAD (CEIL (DBMS_RANDOM.VALUE (1,99999999)), 8, '0')
5 FROM DUAL
6 CONNECT BY LEVEL <= 10
7 /
10 rows created.
SCOTT@orcl_12.1.0.2.0> SELECT date_of_birth, first_name, postcode
2 FROM scrambled
3 /
DATE_OF_BIRTH FIRST_NAME POSTCODE
--------------- -------------------- ----------
Wed 17-Aug-2016 RVRGBSLRWWJRQJBEGCLV 49870953
Thu 18-Feb-2016 GAUERUCEJVWPVBRYOBKO 57185556
Wed 10-Aug-2016 YCYYYLLAHOLLZJBEWVNS 47283027
Sat 22-Oct-2016 SHIPGRRAGLZFZSPDVKSU 29099732
Sat 12-Mar-2016 YXFNUBUXUHSHSFPPLRMH 75059706
Mon 15-Aug-2016 EOMXOPQUOLDYVDRBOLHY 74212704
Mon 02-May-2016 KOIBUHRMFLKLUUDKWIIX 23236438
Fri 01-Jul-2016 KJGDBTPFZDYMPGEGBVGY 86288842
Tue 09-Feb-2016 SYOOJWJOMEBQRKUKNQPE 95729329
Tue 12-Jan-2016 RZAFGDHEVYMZYQZLHBJB 76658230
10 rows selected.
|
|
|
|
Re: scrambling the data for a column in a tables [message #659243 is a reply to message #659224] |
Tue, 10 January 2017 06:45 |
|
EdStevens
Messages: 1376 Registered: September 2013
|
Senior Member |
|
|
raj85844 wrote on Mon, 09 January 2017 23:09Hi Friends,
Wishing you all a very happy new year...
I got a requirement in scrambling the data in the table
The below are the columns with the data length as mention for referrence
DATE_OF_BIRTH - Year(last digit),Month and date
WRONG! DATE_OF_BIRTH should be defined as a DATE data type. That is an internal, binary format, so your idea of it being 'Year(last digit),Month and date' is incorrect. And if it is not defined as DATE, then that is a fundamental design failure that will come back to haunt you again, and again, and again. And further, if it is not a DATE, and it is as you say, one single digit for the year, you have nicely replicated the Y2K problem. Only instead of a once-in-a-mellinium issue, you have made it an annual issue. So yet another design fail.
|
|
|
|
Re: scrambling the data for a column in a tables [message #659249 is a reply to message #659243] |
Tue, 10 January 2017 08:22 |
cookiemonster
Messages: 13925 Registered: September 2008 Location: Rainy Manchester
|
Senior Member |
|
|
EdStevens wrote on Tue, 10 January 2017 12:45
And further, if it is not a DATE, and it is as you say, one single digit for the year, you have nicely replicated the Y2K problem. Only instead of a once-in-a-mellinium issue, you have made it an annual issue. So yet another design fail.
Reminds me of a thread I saw back in Jan 2010 on the otn forums. OP had date in DDDY format and couldn't work out why oracle had suddenly decided to interpret 8 as 2018 when the previous December it had treated it as 2008. And you do have to wonder what stupidity with dates caused that German bank to go offline on 1st Jan 2010.
|
|
|
|
|
|
Re: scrambling the data for a column in a tables [message #659455 is a reply to message #659260] |
Wed, 18 January 2017 04:29 |
|
raj85844
Messages: 26 Registered: November 2016 Location: chennai
|
Junior Member |
|
|
Hi Team,
Thanks for the inputs as it was really help full to program.
One query in updating the email id column, please find the below attempt of trying the expected result through scrambling.Could you help me to archive the expected result *
Table Name : Customers
Column Name : Cust_Email
CUST_EMAIL VARCHAR2(30)
(DATA)Before Scrambling
---------------------------------
Roger.Mastroianni@CREEPER.COM
* (DATA) Expected result through Scrambling
---------------------------------
Oktud.Cfmcxmrkvps@HJSDFGS.ysu
(DATA) After Scrambling by the below update
---------------------------------
oktud.cfmcxmrkvps@.ysu
Error report -
SQL Error: ORA-12899: value too large for column "OE_TEST"."CUSTOMERS"."CUST_EMAIL" (actual: 32, maximum: 30)
UPDATE customers
SET cust_email =
DBMS_RANDOM.string ('l',
LENGTH (REGEXP_SUBSTR (cust_email,
'[A-Z][a-z]+',
1,
1)))
|| '.'
|| DBMS_RANDOM.string ('l',
LENGTH (REGEXP_SUBSTR (cust_email,
'[A-Z][a-z]+',
1,
2)))
|| '@'
|| DBMS_RANDOM.string ('l',
LENGTH (REGEXP_SUBSTR (cust_email,
'[A-Z][a-z]+',
1,
1)))
|| '.'
||DBMS_RANDOM.string('l',3)
WHERE cust_email IS NOT NULL;
/
|
|
|
|
|
Re: scrambling the data for a column in a tables [message #659472 is a reply to message #659462] |
Wed, 18 January 2017 09:32 |
cookiemonster
Messages: 13925 Registered: September 2008 Location: Rainy Manchester
|
Senior Member |
|
|
source column = target column so if there's a length mismatch it'll be because the code isn't doing a 1-1 replacement.
I would have thought that the occurance parameter for the third regexp_substr call should be set 3 not 1.
Also it'll probably be more efficient to use plain instr to work out the lengths of the different sections.
|
|
|
|
|
Re: scrambling the data for a column in a tables [message #659624 is a reply to message #659623] |
Tue, 24 January 2017 13:28 |
Bill B
Messages: 1971 Registered: December 2004
|
Senior Member |
|
|
If you absolutely have to maintain the case (email doesn't require it) then use the following function
The call to use it for emails would be
SELECT CUST_EMAIL,
SCRAMBLE_STR(CUST_EMAIL,'.@') Altered_email
FROM CUSTOMERS
WHERE CUST_Email is not null;
CREATE OR REPLACE FUNCTION Scramble_str (P_str IN VARCHAR2,
P_filter IN VARCHAR2 DEFAULT NULL)
RETURN VARCHAR2
IS
P_return VARCHAR2 (1000);
Flag VARCHAR2 (1);
BEGIN
P_return := NULL;
IF P_str IS NOT NULL
THEN
Flag := 'N';
FOR Pnt IN 1 .. LENGTH (P_str)
LOOP
IF P_filter IS NOT NULL
THEN
FOR Pnt2 IN 1 .. LENGTH (P_filter)
LOOP
IF SUBSTR (P_str, Pnt, 1) = SUBSTR (P_filter, Pnt2, 1)
THEN
IF P_return IS NULL
THEN
P_return := SUBSTR (P_str, Pnt, 1);
ELSE
P_return := P_return || SUBSTR (P_str, Pnt, 1);
END IF;
Flag := 'Y';
EXIT;
END IF;
END LOOP;
END IF;
IF Flag = 'N'
THEN
IF ASCII (SUBSTR (P_str, Pnt, 1)) NOT BETWEEN 97 AND 122
THEN
IF P_return IS NULL
THEN
P_return := DBMS_RANDOM.String ('U', 1);
ELSE
P_return := P_return || DBMS_RANDOM.String ('U', 1);
END IF;
ELSIF ASCII (SUBSTR (P_str, Pnt, 1)) BETWEEN 97 AND 122
THEN
IF P_return IS NULL
THEN
P_return := DBMS_RANDOM.String ('L', 1);
ELSE
P_return := P_return || DBMS_RANDOM.String ('L', 1);
END IF;
ELSE
IF P_return IS NULL
THEN
P_return := SUBSTR (P_str, Pnt, 1);
ELSE
P_return := P_return || SUBSTR (P_str, Pnt, 1);
END IF;
END IF;
END IF;
Flag := 'N';
END LOOP;
END IF;
RETURN P_return;
END Scramble_str;
[Updated on: Tue, 24 January 2017 13:28] Report message to a moderator
|
|
|
Goto Forum:
Current Time: Sat Jun 29 07:27:24 CDT 2024
|