INSERT INTO with LAST_INSERT_ID()

Question

Dani 4,645 The Queen of DaniWeb

5 Years Ago

My code has a query that looks like:

INSERT INTO table
VALUES (...)
ON DUPLICATE KEY UPDATE
    id = LAST_INSERT_ID(id),
    ...

id is a PRIMARY KEY.

My question is that I can't for anything remember why I have id = LAST_INSERT_ID(id). I wrote it years ago and it's not commented. Can anyone think of a reason why I was doing this?

mysql

4 Contributors
31 Replies
1K Views
1 Month Discussion Span
Latest Post 5 Years Ago Latest Post by Dani

Reverend Jim 5,216 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

id = LAST_INSERT_ID(id),

From what I gather, this expression returns the auto-inceremented value of the field, id. And

ON DUPLICATE KEY UPDATE

says that if the insert fails because of a duplicate record, the existing record is updated.

Reverend Jim 5,216 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

Could you just solve this with a lock?

Reverend Jim 5,216 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

Then how does any multi-user database manage to function?

Reverend Jim 5,216 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

I don't understand how that works. From what I read the code says

Get a new id which is one more than the last inserted key and
if the new id already exists then
    just update the existing record
else
    insert a new record

To me that seems, what's the technical term? Oh yeah. Bonkers. Using lastid+1 should guarantee no duplicate key. If multiple people are inserting records and there is a duplicate key then someone's newly inserted record is going to get clobbered. Can you give me a scenario where this behaviour is desirable?

Edited 5 Years Ago by Reverend Jim

Reply to this topic

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.

AndrisP 193 Posting Pro in Training · Answer 1 · 2020-04-04T17:11:18+00:00

Actually it do not affect any changes - on duplicate update ... in your example supress error message. The same can be achieved with insert ignore into table ...

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 2 · 2020-04-06T18:44:24+00:00

I think I figured out what my question actually is.

Does LAST_INSERT_ID() return the last ID inserted for the current MySQLi connection, or overall? Does the MySQL connection resource provide the last insert ID for the current connection?

AndrisP 193 Posting Pro in Training · Answer 3 · 2020-04-06T18:49:43+00:00

Its id who you try to insert as duplicate - result is no changes and supress error message

AndrisP 193 Posting Pro in Training · Answer 4 · 2020-04-06T19:09:13+00:00

E.g.

drop table if exists test_table;
create table test_table(
    id int not null auto_increment primary key
    ,test varchar(30) null
    ,unique key (test)
);
insert into test_table(test) values('1 sample string');
insert into test_table(test) values('2 sample string');
insert into test_table(test) values('3 sample string');

when you try to repeat any of inserts above - it raise error unique key constraint but if you try any of below inserts

insert into test_table values(1,'1 sample string') 
    on duplicate key update id = last_insert_id(id);
insert into test_table values(2,'2 sample string') 
    on duplicate key update id = last_insert_id(id);
insert into test_table values(3,'3 sample string') 
    on duplicate key update id = last_insert_id(id);

then 0 rows affected without errors. Check select last_insert_id(); after each of last 3 insert examples

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 5 · 2020-04-06T20:31:52+00:00

Hi, sorry, I think you’re misunderstanding my question.

My question is how can I get the last insert ID for the specific current MySQL connection? I am having a problem with race conditions when two separate connections are inserting data into the same table simultaneously.

AndrisP 193 Posting Pro in Training · Answer 6 · 2020-04-07T14:30:26+00:00

I'm not sure you can get it exactly. Oracle DB and Postgres have a returning clause e.g. insert into table(id,val_1,val_2) returning id into variable; but MySQL doesn't

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 7 · 2020-04-07T19:09:59+00:00

Dani 4,645 The Queen of DaniWeb

5 Years Ago

Of course MySQL has that ability as well.

Prashant_27 0 Newbie Poster · Answer 8 · 2020-04-09T05:35:19+00:00

Yes MySQL return last_inserted_id() if it is called immediately. If another or same user logged and inserted records on same table. In such case the session will be different so MySQL manage this think and return appropriately to there sessions

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 9 · 2020-04-18T20:26:42+00:00

Rev,

Yes, I know that. :)

The question is whether the built-in MySQL function LAST_INSERT_ID() retrieves the latest auto-increment ID for the current MySQL connection, or globally. Some MySQL functions refer to global values and some refer to per-connection values.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 10 · 2020-04-18T23:43:14+00:00

Not if you don't want only one person to be able to access DaniWeb at a time.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 11 · 2020-04-19T01:46:27+00:00

Dani 4,645 The Queen of DaniWeb

5 Years Ago

With write locks on individual rows.

score 0 · Answer 12 · 2020-04-19T01:56:07+00:00

It's been a while since I did databases and all the ones I set up were large but only had one point of update (my apps). Everyone else was read only. What do multi-user DBs do when inserts are done by many users? It has to be a frequent thing.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 13 · 2020-05-12T00:46:44+00:00

As mentioned in my previous post, row-level locking allows multiple simultaneous updates to the same table. The individual rows are locked for update instead of the entire table.

score 0 · Answer 14 · 2020-05-12T02:45:38+00:00

I was going to ask a question but I realized that because I haven't done database programming for 12 years it would likely just show my ignorance, as opposed to this statement which definitely shows it.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 15 · 2020-05-12T02:52:46+00:00

Dani 4,645 The Queen of DaniWeb

5 Years Ago

Feel free to ask anyways. That's how we learn, right?

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 16 · 2020-05-12T03:03:33+00:00

Also ... not only is it how we learn, but it's the point of DaniWeb. If no one asked a question at DaniWeb because they didn't want to show their ignorance, then ... well.

score 0 · Answer 17 · 2020-05-12T03:17:16+00:00

All righty then. Instead of doing

INSERT INTO table
VALUES (...)
ON DUPLICATE KEY UPDATE
    id = LAST_INSERT_ID(id),

why not just have an auto-incrementing primary key? I have a sneaking feeling I used to know the answer to this.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 18 · 2020-05-12T03:58:35+00:00

There is. The LAST_INSERT_ID() functon actually only works when there is one.

Basically what the SQL query above is attempting to do is deal with both inserting and updating records in one query, in situations when you don't know the primary key of the record you'd like to update.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 19 · 2020-05-12T19:26:16+00:00

The ON DUPLICATE KEY UPDATE portion doesn't require that it be a duplicate primary key.

An example scenario is the DaniWeb members table. There's a record for each user. There's an id column which is the primary key for the table, and it gets autoincremented, assigning each new member a unique ID associated with their record. However, additional columns such as the username column have constraints in that two members cannot have the same username, so a UNIQUE index must be set up on the username field.

Taking all that into consideration, let's take the scenario where a new member signs up, and we're inserting a new record in the database. We do our proper checks to ensure that they select a unique username, and we'll reprompt them to pick out a different username if the check fails. If all looks good, we execute this query which generates a new autoincrement ID for the new user.

However, sometimes DaniWeb is a little slow, and the user gets impatient and hits the Submit button twice in a row, thinking the first time didn't go through. Or their hand slipped, or whatever the case may be. Accidentally double-clicking on a button is not an unheard of concept.

Each time they pressed the button, it shoots off a separate HTTP request, and HTTP requests have no knowledge of each other. We could easily end up with a race condition here because it's likely that neither request would have completed inserting the record into the database at the stage where the other request would be doing its little sanity check for another record with a duplicate username.

Therefore, if we just had on its own:

INSERT INTO table
VALUES (...)

Then the query would succeed for one of the requests, the MySQL server would pass back the new ID # back to the application, and then application logic would continue to do different things such as creating other records for the new user based on their new ID #.

In a situation in which there was no accidental double-click, the page would finish loading, it would send its HTML output back to the browser, cookies associated with the new record would be sent to the browser as well, and the new user would be logged into their new account.

However, whichever request was late to the party would have its MySQL query fail, and an invalid ID # would be sent back to the application. Application logic would continue and bad things would happen. To make matters worse, the second HTTP request would have replaced the first one in the end-user's web browser. This is critical because it means that the end-user would never receive the HTML output or cookies that were associated with that first request. Instead, all they would see is an error message that their new registration failed, and invalid cookies would be sent so they wouldn't be logged in. The cookies would be associated with the latter request, so there would be zero way of the end user's browser to have known that a successful record was created, let alone be able to associate that record to browser cookies.

The experience for the end user would be trying to sign up on DaniWeb, and being presented with an error message. Then going back, trying again, and getting an error message that that username alredy exists. They would then have to either create a new account with a different username, in which case the record with their preferred username would permanently become an orphaned account, or they would have to use the lost password reset, which they wouldn't think to do on a new record creation.

I attempted to solve that problem by doing

ON DUPLICATE KEY UPDATE
id = LAST_INSERT_ID(id)

Basically that means that if the record creation fails because another record already exists with the same username, then instead of creating a new record with a new ID #, fetch the last ID number that we just generated (e.g. the one from the first request), and update that record instead. The ID # created from the first request would be passed back to the application logic of the second request. The application logic will see that a valid ID was returned, and be able to present the user with a Success page and send appropriate cookies to their browser that recognize the user record.

This logic is basically saying that if we can't do the insertion due to UNIQUE field constraints, update the last record we created instead. The assumption would be that it would be low risk to be updating someone else's account because our application logic does a check to ensure that the username has not already been taken before we run this SQL query.

Anyways, that was the code I originally had, and then I long forgot about it. Fast forward a handful of years. I encountered a bug where two new members signed up at the very split second, and one of them overwrote the other's record.

I kid you not, but two new users, from completely different parts of the world, with very different IP addresses, and very different email accounts, tried to create accounts at the exact same fraction of a millisecond, with the same username.

Most likely it was a distributed bot that was creating fake accounts with random email addresses, and various global VPNs to fake their location, but they were selecting from a limited pool of usernames. Who knows. Either way, a race condition in which one user account overwrites the other is pretty bad.

I tried to investigate the code to see what went wrong. Upon seeing it again, at first I didn't remember what I had originally did.

Then I remembered. And I fixed my logic to fail a little more gracefully.

I then asked the question (second post in this thread) if anyone knows if the LAST_INSERT_ID() function returns the last ID inserted for the current MySQLi connection, or overall?

In other words, it's my now-understanding that it's overall. However, I was hoping that there would be a way (maybe a flag that could be passed to the function, or a different function) to return the LAST_INSERT_ID() for the current MySQLi connection. I wanted to be able to use it to help create a more elegant workaround to fix the bug.

score 0 · Answer 20 · 2020-05-12T19:57:31+00:00

The ON DUPLICATE KEY UPDATE portion doesn't require that it be a duplicate primary key.

That seems to be strongly implied by the wording

ON DUPLICATE KEY UPDATE

That's why I said

if the new id already exists (i.e. duplicate key)

I attempted to solve that problem by doing

ON DUPLICATE KEY UPDATE
id = LAST_INSERT_ID(id)

So in this case id is the username rather than the autoincremented userid? I'm still confused then. From what I read, LAST_INSERT_ID(id) returns an incremented id. If id is the username then you can't increment a string and if it is a userid then what stops it from getting bolloxed when several members try to sign up at once? I also don't see how that does anything when someone goes all Woody Woodpecker on the submit button. Why not just disable the submit button on the first click?

According to W3School LAST_INSERT_ID Returns the AUTO_INCREMENT id of the last row that has been inserted or updated in a table:

Of course it also says that LAST_INSERT_ID takes an optional parameter identified as expression and is oh so helpfully documented as an optional expression.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 21 · 2020-05-12T20:03:49+00:00

No, id is the autoincrement ID value. However, the insert failed as a result of a failed username duplicate key, and therefore we trigger the ON DUPLICATE KEY UPDATE portion of the query.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 22 · 2020-05-12T20:04:45+00:00

No, LAST_INSERT_ID(id) does not return an incremented id. It returns the primary key ID of the last successful insert operation.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 23 · 2020-05-12T20:07:37+00:00

Why not just disable the submit button on the first click?

We're talking about race conditions that are faster than javascript would be able to disable the submit button upon it being clicked. Or multiple tabs.

score 0 · Answer 24 · 2020-05-12T20:53:03+00:00

No, LAST_INSERT_ID(id) does not return an incremented id. It returns the primary key ID of the last successful insert operation.

OK. That's where most of the confusion came from. I would have thought that was the case except for what was documented at W2School.

Dani 4,645 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 25 · 2020-05-12T21:19:45+00:00

W3Schools says: The LAST_INSERT_ID() function returns the AUTO_INCREMENT id of the last row that has been inserted or updated in a table.

I said: It runs the primary key id of the last successful insert operation.

What's the difference???