Duplicate key error during migration #331

Closed
opened 2023-01-28 19:00:20 +00:00 by mia · 8 comments
Contributor

When running migration scripts as of 9b76c805ec:

QueryFailedError: could not create unique index "IDX_71d35fceee0d0fa62b2fa8f3b2"
    at PostgresQueryRunner.query (/data/misskey/misskey/node_modules/typeorm/driver/postgres/PostgresQueryRunner.js:211:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async syncOrm1674499888924.up (file:///data/misskey/misskey/packages/backend/migration/1674499888924-sync-orm.js:7:3)
    at async MigrationExecutor.executePendingMigrations (/data/misskey/misskey/node_modules/typeorm/migration/MigrationExecutor.js:178:17)
    at async DataSource.runMigrations (/data/misskey/misskey/node_modules/typeorm/data-source/DataSource.js:248:35)
    at async Object.handler (/data/misskey/misskey/node_modules/typeorm/commands/MigrationRunCommand.js:60:13) {
  query: 'CREATE UNIQUE INDEX "IDX_71d35fceee0d0fa62b2fa8f3b2" ON "note" ("url") ',
  parameters: undefined,
  driverError: error: could not create unique index "IDX_71d35fceee0d0fa62b2fa8f3b2"
      at Parser.parseErrorMessage (/data/misskey/misskey/node_modules/pg-protocol/dist/parser.js:287:98)
      at Parser.handlePacket (/data/misskey/misskey/node_modules/pg-protocol/dist/parser.js:126:29)
      at Parser.parse (/data/misskey/misskey/node_modules/pg-protocol/dist/parser.js:39:38)
      at Socket.<anonymous> (/data/misskey/misskey/node_modules/pg-protocol/dist/index.js:11:42)
      at Socket.emit (node:events:513:28)
      at addChunk (node:internal/streams/readable:324:12)
      at readableAddChunk (node:internal/streams/readable:297:9)
      at Readable.push (node:internal/streams/readable:234:10)
      at TCP.onStreamRead (node:internal/stream_base_commons:190:23) {
    length: 290,
    severity: 'ERROR',
    code: '23505',
    detail: 'Key (url)=(https://blog.piratenpartei-nrw.de/kvbonn/2022/05/17/wir-danken-unseren-waehlern-3/) is duplicated.',
    hint: undefined,
    position: undefined,
    internalPosition: undefined,
    internalQuery: undefined,
    where: undefined,
    schema: 'public',
    table: 'note',
    column: undefined,
    dataType: undefined,
    constraint: 'IDX_71d35fceee0d0fa62b2fa8f3b2',
    file: 'tuplesort.c',
    line: '4297',
    routine: 'comparetup_index_btree'
  },
  length: 290,
  severity: 'ERROR',
  code: '23505',
  detail: 'Key (url)=(https://blog.piratenpartei-nrw.de/kvbonn/2022/05/17/wir-danken-unseren-waehlern-3/) is duplicated.',
  hint: undefined,
  position: undefined,
  internalPosition: undefined,
  internalQuery: undefined,
  where: undefined,
  schema: 'public',
  table: 'note',
  column: undefined,
  dataType: undefined,
  constraint: 'IDX_71d35fceee0d0fa62b2fa8f3b2',
  file: 'tuplesort.c',
  line: '4297',
  routine: 'comparetup_index_btree'
}
When running migration scripts as of 9b76c805ec5d109e5cdfc4097c8c268afa1d3b58: ``` QueryFailedError: could not create unique index "IDX_71d35fceee0d0fa62b2fa8f3b2" at PostgresQueryRunner.query (/data/misskey/misskey/node_modules/typeorm/driver/postgres/PostgresQueryRunner.js:211:19) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async syncOrm1674499888924.up (file:///data/misskey/misskey/packages/backend/migration/1674499888924-sync-orm.js:7:3) at async MigrationExecutor.executePendingMigrations (/data/misskey/misskey/node_modules/typeorm/migration/MigrationExecutor.js:178:17) at async DataSource.runMigrations (/data/misskey/misskey/node_modules/typeorm/data-source/DataSource.js:248:35) at async Object.handler (/data/misskey/misskey/node_modules/typeorm/commands/MigrationRunCommand.js:60:13) { query: 'CREATE UNIQUE INDEX "IDX_71d35fceee0d0fa62b2fa8f3b2" ON "note" ("url") ', parameters: undefined, driverError: error: could not create unique index "IDX_71d35fceee0d0fa62b2fa8f3b2" at Parser.parseErrorMessage (/data/misskey/misskey/node_modules/pg-protocol/dist/parser.js:287:98) at Parser.handlePacket (/data/misskey/misskey/node_modules/pg-protocol/dist/parser.js:126:29) at Parser.parse (/data/misskey/misskey/node_modules/pg-protocol/dist/parser.js:39:38) at Socket.<anonymous> (/data/misskey/misskey/node_modules/pg-protocol/dist/index.js:11:42) at Socket.emit (node:events:513:28) at addChunk (node:internal/streams/readable:324:12) at readableAddChunk (node:internal/streams/readable:297:9) at Readable.push (node:internal/streams/readable:234:10) at TCP.onStreamRead (node:internal/stream_base_commons:190:23) { length: 290, severity: 'ERROR', code: '23505', detail: 'Key (url)=(https://blog.piratenpartei-nrw.de/kvbonn/2022/05/17/wir-danken-unseren-waehlern-3/) is duplicated.', hint: undefined, position: undefined, internalPosition: undefined, internalQuery: undefined, where: undefined, schema: 'public', table: 'note', column: undefined, dataType: undefined, constraint: 'IDX_71d35fceee0d0fa62b2fa8f3b2', file: 'tuplesort.c', line: '4297', routine: 'comparetup_index_btree' }, length: 290, severity: 'ERROR', code: '23505', detail: 'Key (url)=(https://blog.piratenpartei-nrw.de/kvbonn/2022/05/17/wir-danken-unseren-waehlern-3/) is duplicated.', hint: undefined, position: undefined, internalPosition: undefined, internalQuery: undefined, where: undefined, schema: 'public', table: 'note', column: undefined, dataType: undefined, constraint: 'IDX_71d35fceee0d0fa62b2fa8f3b2', file: 'tuplesort.c', line: '4297', routine: 'comparetup_index_btree' } ```
Owner

Hmm, I don't think it should happen that two notes have the same URL. Can you provide the result of the following query please:

SELECT "createdAt", "url", "uri" FROM "note" WHERE "url" IN (SELECT "url" FROM "note" GROUP BY "url" HAVING COUNT("url") > 1);
what this query does

Select the creation date, url (link to human readable representation) and uri (link to machine readable representation) from notes where the url occurs more than once.

Hmm, I don't think it should happen that two notes have the same URL. Can you provide the result of the following query please: ```sql SELECT "createdAt", "url", "uri" FROM "note" WHERE "url" IN (SELECT "url" FROM "note" GROUP BY "url" HAVING COUNT("url") > 1); ``` <details><summary>what this query does</summary> Select the creation date, `url` (link to human readable representation) and `uri` (link to machine readable representation) from notes where the `url` occurs more than once. </details>
Author
Contributor
No description provided.
350 KiB
Owner

Hmm, I think this use of the url field is incorrect. I have raised an issue with Friendica which many of the cases seem to come from, see https://github.com/friendica/friendica/issues/12753. I'm not sure how to best resolve this.

Hmm, I think this use of the `url` field is incorrect. I have raised an issue with Friendica which many of the cases seem to come from, see <https://github.com/friendica/friendica/issues/12753>. I'm not sure how to best resolve this.
Owner
see also https://github.com/LemmyNet/lemmy/issues/2676
Owner

Can you try running this query manually and then rerun the migration:

UPDATE "note" AS "a" SET "url" = NULL WHERE "createdAt" != (SELECT MIN("createdAt") FROM "note" AS "b" WHERE "a"."url" = "b"."url");

This should remove a url (i.e. set it to null) from all but the earliest note where it occurs.

If the migration works afterwards I'll add this to the migration itself.

Can you try running this query manually and then rerun the migration: ```sql UPDATE "note" AS "a" SET "url" = NULL WHERE "createdAt" != (SELECT MIN("createdAt") FROM "note" AS "b" WHERE "a"."url" = "b"."url"); ``` This should remove a `url` (i.e. set it to null) from all but the earliest note where it occurs. If the migration works afterwards I'll add this to the migration itself.
Author
Contributor

That is a very slow query. Been running for almost 10 minutes now…

That is a *very* slow query. Been running for almost 10 minutes now…
Author
Contributor

Yeah this isn’t gonna finish anytime soon. Figure out a better query.

Yeah this isn’t gonna finish anytime soon. Figure out a better query.
Author
Contributor

In the meantime, I’ve run

UPDATE "note" SET "url" = NULL WHERE "url" IN (SELECT "url" FROM "note" GROUP BY "url" HAVING COUNT("url") > 1);

which completed within 2 seconds. The migration that introduces the index was successful.

I don’t think there’s a problem with discarding these URLs altogether 🤷‍♀️

In the meantime, I’ve run ```sql UPDATE "note" SET "url" = NULL WHERE "url" IN (SELECT "url" FROM "note" GROUP BY "url" HAVING COUNT("url") > 1); ``` which completed within 2 seconds. The migration that introduces the index was successful. I don’t think there’s a problem with discarding these URLs altogether 🤷‍♀️
Sign in to join this conversation.
No labels
feature
fix
upkeep
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: FoundKeyGang/FoundKey#331
No description provided.