Click here to Skip to main content
15,884,629 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
I need to insert one row at a time into my target table. Before each insert, I need to scan the target table and see if the value already exists. If it does, then the uniqueID of the inserted row should match the uniqueID of the matched row in the target table. If the uniqueID does not exist, then increment the MAX uniqueID in the target table by 1 and use this as the uniqueID.

I've been told I can do this with loops and variables etc but Amazon Redshift does not support these at the moment.

My source table (tbl_source) has the fields anonID, userID and rowNum.
My target table (tbl_target) has fields anonID, userID and uniqueID

My insert is very simple in essence
SQL
INSERT into tbl_target
    (select anonID, userID, XXXX
     from tbl_source)

The XXX is where I need help. XXX is uniqueID.


SAMPLE DATA

╔════════╦════════╦════════╗
║ rownum ║ anonID ║ userID ║
╠════════╬════════╬════════╣
║      1 ║ A      ║      1 ║
║      2 ║ A      ║      2 ║
║      3 ║ A      ║      3 ║
║      4 ║ B      ║      5 ║
║      5 ║ B      ║      6 ║
║      6 ║ C      ║      7 ║
║      7 ║ D      ║      8 ║
║      8 ║ D      ║      9 ║
║      9 ║ E      ║      1 ║
║     10 ║ E      ║      2 ║
║     11 ║ E      ║      3 ║
║     12 ║ F      ║      9 ║
╚════════╩════════╩════════╝


To show you the logic I need, I will take this table row by row and show you what should be computed;

rowNum 1:

Search for "A" and "1" in the target table -> Neither exist in the target table (since it is empty, this is the first row bring inserted), therefore set uniqueID in the target table to 1

TARGET TABLE

+========+========+==========+
| anonID | userID | uniqueID |
+========+========+==========+
| A      | 1      | 1        |
+========+========+==========+

rowNum 2:

Search for "A" and "2" in the target table -> A exists. Therefore set the uniqueID of the new row to the SAME uniqueID as in the target table = 1
+========+========+==========+
| anonID | userID | uniqueID |
+========+========+==========+
| A      | 1      | 1        |
+--------+--------+----------+
| A      | 2      | 1        |
+========+========+==========+

rowNum 3:

Search for "A" and "3" in the target table -> A exists. Therefore set the uniqueID of the new row to the SAME uniqueID as in the target table = 1.
+========+========+==========+
| anonID | userID | uniqueID |
+========+========+==========+
| A      | 1      | 1        |
+--------+--------+----------+
| A      | 2      | 1        |
+--------+--------+----------+
| A      | 3      | 1        |
+========+========+==========+


rowNum 4:

Search for "B" and "5" in the target table -> neither exist. Therefore find the MAX uniqueID in the target table (1) and increment by 1.

+========+========+==========+
| anonID | userID | uniqueID |
+========+========+==========+
| A      | 1      | 1        |
+--------+--------+----------+
| A      | 2      | 1        |
+--------+--------+----------+
| A      | 3      | 1        |
+--------+--------+----------+
| B      | 5      | 2        |
+========+========+==========+

rowNum 5:

Search for "B" and "6" in the target tale -> "B" exists. Therefore set the uniqueID of the new row to the SAME uniqueID as in the target table = 1

+========+========+==========+
| anonID | userID | uniqueID |
+========+========+==========+
| A      | 1      | 1        |
+--------+--------+----------+
| A      | 2      | 1        |
+--------+--------+----------+
| A      | 3      | 1        |
+--------+--------+----------+
| B      | 5      | 2        |
+--------+--------+----------+
| B      | 6      | 2        |
+========+========+==========+


rowNum 6:

Search for "C" and "7". Neither found


    +========+========+==========+
    | anonID | userID | uniqueID |
    +========+========+==========+
    | A      | 1      | 1        |
    +--------+--------+----------+
    | A      | 2      | 1        |
    +--------+--------+----------+
    | A      | 3      | 1        |
    +--------+--------+----------+
    | B      | 5      | 2        |
    +--------+--------+----------+
    | B      | 6      | 2        |
    +--------+--------+----------+
    | C      | 7      | 3        |
    +========+========+==========+
.....
.....
.....

    +========+========+==========+
    | anonID | userID | uniqueID |
    +========+========+==========+
    | A      | 1      | 1        |
    +--------+--------+----------+
    | A      | 2      | 1        |
    +--------+--------+----------+
    | A      | 3      | 1        |
    +--------+--------+----------+
    | B      | 5      | 2        |
    +--------+--------+----------+
    | B      | 6      | 2        |
    +--------+--------+----------+
    | C      | 7      | 3        |
    +--------+--------+----------+
    | D      | 8      | 4        |
    +--------+--------+----------+
    | D      | 9      | 4        |
    +========+========+==========+


rowNum 9:

Search for "E" and "1" in the target table. "1" already exists! Therefore set the uniqueID to the same uniqueID as the row which already exists with "1" - Which is uniqueID 1.

+========+========+==========+
| anonID | userID | uniqueID |
+========+========+==========+
| A      | 1      | 1        |
+--------+--------+----------+
| A      | 2      | 1        |
+--------+--------+----------+
| A      | 3      | 1        |
+--------+--------+----------+
| B      | 5      | 2        |
+--------+--------+----------+
| B      | 6      | 2        |
+--------+--------+----------+
| C      | 7      | 3        |
+--------+--------+----------+
| D      | 8      | 4        |
+--------+--------+----------+
| D      | 9      | 4        |
+--------+--------+----------+
| E      | 1      | 1        |
+========+========+==========+


RowNum 10:

Search for "E" and "2" in the target table. Both "E" and "2 already exist. In this case just return the uniqueID for the first one it finds (the uniqueID will be the same for either one).

    +========+========+==========+
    | anonID | userID | uniqueID |
    +========+========+==========+
    | A      | 1      | 1        |
    +--------+--------+----------+
    | A      | 2      | 1        |
    +--------+--------+----------+
    | A      | 3      | 1        |
    +--------+--------+----------+
    | B      | 5      | 2        |
    +--------+--------+----------+
    | B      | 6      | 2        |
    +--------+--------+----------+
    | C      | 7      | 3        |
    +--------+--------+----------+
    | D      | 8      | 4        |
    +--------+--------+----------+
    | D      | 9      | 4        |
    +--------+--------+----------+
    | E      | 1      | 1        |
    +--------+--------+----------+
    | E      | 2      | 1        |
    +--------+--------+----------+
....


ROwNum 12

Search for "F" and "9" in the target table -> 9 exists. Therefore set uniqueID for 9 to the same uniqueID where 9 exists -> 4

THE FINAL TABLE then should look like this;

+========+========+==========+
| anonID | userID | uniqueID |
+========+========+==========+
| A      | 1      | 1        |
+--------+--------+----------+
| A      | 2      | 1        |
+--------+--------+----------+
| A      | 3      | 1        |
+--------+--------+----------+
| B      | 5      | 2        |
+--------+--------+----------+
| B      | 6      | 2        |
+--------+--------+----------+
| C      | 7      | 3        |
+--------+--------+----------+
| D      | 8      | 4        |
+--------+--------+----------+
| D      | 9      | 4        |
+--------+--------+----------+
| E      | 1      | 1        |
+--------+--------+----------+
| E      | 2      | 1        |
+--------+--------+----------+
| E      | 3      | 1        |
+--------+--------+----------+
| F      | 9      | 4        |
+========+========+==========+


If you wish to use my data;

SQL
CREATE TABLE tbl_source
    (
       rownum  integer,
       anonid  varchar(8),
       userid  integer
    );

    insert into tbl_source
     values
    (1,'A',1), (2,'A',2), (3,'A',3),
    (4,'B',5), (5,'B',6), (6,'C',7),
    (7,'D',8), (8,'D',9), (9,'E',1), 
    (10,'E',2), (11,'E',3), (12,'F',9)   
    ;
 
    CREATE TABLE tbl_target
    (
       anonid  varchar(8),
       userid  integer,
       uniqueID integer
    );


What I have tried:

I've been told I can do this with loops and variables etc but Amazon Redshift does not support these at the moment.
Posted
Updated 28-Feb-18 1:34am
v2
Comments
CHill60 27-Feb-18 12:43pm    
I would avoid loops but I have some serious doubts about your logic here - are you really saying that if there had been an entry B,1 then just choose the "first" 1 to assign to the E1 row. How do you think you are going to determine that "first" without some ordering. What are you really trying to do or this just a contrived exercise?
Member 10104822 27-Feb-18 14:40pm    
This is not a contrived exercise. The data supplied is just analogous dummy data. All I'm saying is that for what i need this for, if there are conflicts then i dont need to add some additional logic to determine the correct value. It can be ordered first but for simplicities sake its not needed
Jörgen Andersson 28-Feb-18 3:23am    
How would you handle (a,1),(b,1),(b,2),(c,2),(c,3) ?
Shall they all get the same unique ID?
Member 10104822 28-Feb-18 4:46am    
Correct
Jörgen Andersson 28-Feb-18 4:50am    
So, how would you handle this case:
First you add (a,1) it gets UniqueID 1
Then add (b,2), this will get a new UniqueID 2
How do you handle the next row containing (a,2)?

Or is all data supposed to be added as a set in one go?

Assuming you can use temporary tables, then this appears to work:

Firstly get a list of the unique anonIDs
SQL
create table #t1 (id integer identity(1,1), anonid varchar(8))
insert into #t1
SELECT DISTINCT anonid FROM tbl_source
Results:
1	A
2	B
3	C
4	D
5	E
6	F
Then get an intial starting position based purely on anonID
SQL
select B.ID as uniqueID, A.*
INTO #T2
FROM #tbl_source A
INNER JOIN #t1 B ON A.anonid = B.anonid
Results:
1	1	A	1
1	2	A	2
1	3	A	3
2	4	B	5
2	5	B	6
3	6	C	7
4	7	D	8
4	8	D	9
5	9	E	1
5	10	E	2
5	11	E	3
6	12	F	9
Then adjust that position based on whether or not userID has already appeared:
SQL
UPDATE A SET uniqueId= b.uniqueID
FROM #T2 A
inner JOIN #T2 B ON A.userid = B.userid and a.anonid <> b.anonid
where B.uniqueid < A.Uniqueid
Results:
SQL
1	1	A	1
1	2	A	2
1	3	A	3
2	4	B	5
2	5	B	6
3	6	C	7
4	7	D	8
4	8	D	9
1	9	E	1
1	10	E	2
1	11	E	3
4	12	F	9
You can then just query from #t2 to get your target
SQL
insert into tbl_target
	SELECT anonID, userID, uniqueID FROM #t2
 
Share this answer
 
Comments
Maciej Los 28-Feb-18 7:47am    
Good job!
Member 10104822 28-Feb-18 8:09am    
You are a genius sir.

Thank you very much!
INSERT INTO #tbl_target
         SELECT anonid,
                userid,
                COALESCE((SELECT uniqueID FROM #tbl_target WHERE anonid=@anonID),
                        (SELECT MAX(uniqueID)+1 FROM #tbl_target),1)AS uniqueID
           FROM #tbl_source
                       WHERE anonid=@anonID AND userid=@userID;
 
Share this answer
 
v2
Comments
Member 10104822 28-Feb-18 7:00am    
unfortunately this does not work. I get the error;

Invalid operation: operator does not exist: character varying =@ character varying;

Are you using variables? Redshift does not support variables unfortunately
Santosh kumar Pithani 28-Feb-18 7:22am    
If you insert all records at a time in target table then how you compare with source table?
Member 10104822 28-Feb-18 7:26am    
That's the thing. I need to be able to insert one row at a time into the target table and query that table before each new row is inserted.

That is the crux of the problem really

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

  Print Answers RSS
Top Experts
Last 24hrsThis month


CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900