PostgreSQL - UPSERT Operation
In PostgreSQL, the UPSERT operation means either UPDATE or INSERT operation. The UPSERT operation allows us to either insert a row or skip the insert operation if a row already exists and update that row instead.
Suppose you want to insert bulk data from one table to another table that already has some data. In this case, there might be some rows that already exist in the table or some rows with old data which you want to update. You can either skip the insert operation or update some fields if records already exist with the same primary key or update some fields for that rows based on some criteria.
PostgreSQL implements the UPSERT operation using ON CONFLICT
clause of the INSERT statement. The ON CONFLICT
specifies alternative action to be taken in case of conflict occurring during the insert operation.
INSERT INTO <table_name> (column_list)
VALUES(value_list)
[ON CONFLICT <conflict_targe> <conflict_action>,]
[RETURNING * or <column_name>];
In the above INSERT statement, The conflict_target can be
- A column name that is a primary key or has a unique constraint or has a unique index. It cannot be other non-unique columns.
- ON CONSTRAINT constraint_name where constraint_name can be a unique constraint name
- WHERE clause with a predicate condition that returns TRUE or FALSE.
- DO NOTHING: It skips the insert operation and does nothing. If you use the DO NOTHING clause, then conflict_target is not needed.
- DO UPDATE statement: Use the DO UPDATE SET column1 = value1,... WHERE <condition> - to update fields of the table. ON CONFLICT DO UPDATE ensure that either insert or update operation is definitely performed.
To see how the UPSERT operation works, let's create the following employee
table.
CREATE TABLE employee (
emp_id INT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
gender CHAR(1),
birthdate DATE,
email VARCHAR(100) UNIQUE,
salary INT
);
Now let's insert a row into the employee
table using the insert statement.
INSERT INTO employee
VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000);
This will insert a row into the employee
table, as shown below.
Now, if you execute the same INSERT statement again then it will raise an error
The above employee
table has the emp_id
as a Primary key and data already there for emp_id = 1
. If you try to insert a new record with emp_id = 1
then it will raise an error.
ON CONFLICT DO NOTHING
If you use the ON CONFLICT DO NOTHING statement with the INSERT statement then it will not raise any error and will not insert any rows into the employee table, as we are asking to DO NOTHING in case of conflict.
INSERT INTO employee
VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT DO NOTHING;
The above statement would display the following result in pgAdmin:
In the above result, INSERT 0 0
indicates that it has inserted 0 and updated 0 rows.
You can achieve the same functionality as above using the ON CONFLICT(column_name) DO NOTHING
, as shown below. Here the column_name
should be the primary key column or has a unique constraint that ensures all the values are unique.
INSERT INTO employee
VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT(emp_id) DO NOTHING;
Alternatively, you can also specify the primary key or unique constraint name with the ON CONFLICT ON CONSTRAINT
clause, as shown below.
INSERT INTO employee
VALUES(1,'Annie','Smith','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT ON CONSTRAINT employee_pkey DO NOTHING;
In the above example, the primary key constraint name employee_pkey
is used with DO NOTHING which indicates to skip the insert operation if a row violates the primary key constraint.
ON CONFLICT DO UPDATE
The ON CONFLICT DO UPDATE
statement executes the specified update statement if a conflict occurs. For example, the following updates the last_name
column in the existing row if a row with the same emp_id
exists.
INSERT INTO employee
VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT(emp_id) DO UPDATE SET last_name = 'Rizzolo';
The same thing can be achieved using the EXCLUDED
table object instead of specifying explicit value in the update statement, as shown below.
INSERT INTO employee
VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT(emp_id) DO UPDATE SET last_name = EXCLUDED.last_name;
In the above result, INSERT 0 1
indicates 0 rows inserted and 1 row updated. PostgreSQL will try to insert a record, but because the emp_id
column already contains 1, the conflict will occur and it will execute the DO UPDATE SET last_name = EXCLUDED.last_name
statement.
The EXCLUDED
is a table object that points to the reference values of the specified INSERT statement. Here, the EXCLUDED.last_name
returns 'Rizzolo'.
Let's check the data in the table, you can see the last_name
is changed to 'Rizzolo'.
ON CONFLICT WHERE
Use the ON CONFLICT WHERE
clause to skip or do update operation based on some condition.
For example, the following executes the update statement UPDATE SET salary = 0
if a conflict occurs and also if salary > 0
in the existing row. If the salary
is 0 then skip then it will skip the insert or update operation.
INSERT INTO employee
VALUES(1,'Annie','Rizzolo','F', DATE '1988-01-09', '[email protected]',5000)
ON CONFLICT(emp_id) where salary > 0
DO UPDATE SET salary = 0;
The above would update the salary field to 0, as shown below.
Note that you cannot use EXCLUDED
table with the WHERE clause.
Thus, you can perform the UPSERT operation using the INSERT statement and manage the conflicts.