-
Notifications
You must be signed in to change notification settings - Fork 186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(database): indices for columns used in 'where' and 'order by' clauses #3880
base: main
Are you sure you want to change the base?
chore(database): indices for columns used in 'where' and 'order by' clauses #3880
Conversation
knex doesn't seem to add Indices concurrently knex/knex#3913 |
}) | ||
await knex.schema.alterTable('commits', (table) => { | ||
// table.index('id') // index should already be available due to this being the primary key | ||
table.index('author') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do these really need to be separate indices, instead of composite ones? it was at least the case in MySQL that if you had an index like "x, y, z", then it would be used even if you filter by only "x" or "x" and "y", as long as the order is the same (so it wouldn't work when just filtering by "z")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
important to think about this, cause they do take up space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the best explanation I found on the subject: https://stackoverflow.com/questions/11352056/postgresql-composite-primary-key#comment113601692_11352543
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's even better than what I expected then, I didn't think there would be a speedup on just filtering "y" at all. But my point is that we need to look at what the actual filter conditions are, and if "y" is usually used together with "x", then there's no need to make "y" have its own index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just having X and Y as separate indices could even be worse than having a "X, Y" index, if both are filtered together. It's just going to use the X or the Y index, where "X, Y" would be the better one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getPaginatedProjectModelsBaseQueryFactory
, getModelTreeItemsFilteredBaseQueryFactory
, getUserAuthoredCommitCountsFactory
query on the author
column alone
surprising that so many tables don't have appropriate indices. we need to be more attentive to new table creating migrations, indices should be planned out from the very beginning, if possible |
There's a lot more in the list linked in the Linear issue, but those were much smaller tables (at the moment). I'll do the work of sweeping them up in some follow up PRs. |
…d-95-of-the-time-for-any-table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all the indices, and there are indeed some candidates for being composite indices instead. Will revise.
export async function up(knex: Knex): Promise<void> { | ||
await knex.schema.alterTable('stream_commits', (table) => { | ||
// table.index('streamId') // index should already be available due to this being the first item in the composite primary key | ||
table.index('commitId') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getCommitStreamsFactory & markCommitStreamUpdatedFactory has a 'where' statement on this column alone
}) | ||
await knex.schema.alterTable('branch_commits', (table) => { | ||
// table.index('branchId') // index should already be available due to this being the first item in the composite primary key | ||
table.index('commitId') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
markCommitBranchUpdatedFactory, moveCommitsToBranchFactory, getCommitBranchesFactory, switchCommitBranchFactory has a 'where' statement on this column alone
}) | ||
await knex.schema.alterTable('refresh_tokens', (table) => { | ||
// table.index('id') // index should already be available due to this being the primary key | ||
table.index('userId') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleteExistingAuthTokensFactory has a 'where' statement on this column alone. (Though I'm confused as to what the statement in that factory is actually doing, it appears to query and then not use the result)
await knex.schema.alterTable('commits', (table) => { | ||
// table.index('id') // index should already be available due to this being the primary key | ||
table.index('author') | ||
table.index('sourceApplication') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getPaginatedProjectModelsBaseQueryFactory
, getModelTreeItemsFilteredBaseQueryFactory
, getStreamsSourceAppsFactory
query on the sourceApplication
column alone
}) | ||
await knex.schema.alterTable('commits', (table) => { | ||
// table.index('id') // index should already be available due to this being the primary key | ||
table.index('author') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getPaginatedProjectModelsBaseQueryFactory
, getModelTreeItemsFilteredBaseQueryFactory
, getUserAuthoredCommitCountsFactory
query on the author
column alone
//table.index('id') // index should already be available due to this being the primary key | ||
// table.index('gendoGenerationId') // already explicitly indexed | ||
table.index('versionId') | ||
table.index('createdAt') // used in an ORDER BY clause. https://www.postgresql.org/docs/current/indexes-ordering.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getLatestVersionRenderRequestsFactory
and getVersionRenderRequestFactory
include this column in an ORDER BY
. It is always used alongside versionId
, so could be a composite index.
}) | ||
await knex.schema.alterTable('pwdreset_tokens', (table) => { | ||
// table.index('id') // index should already be available due to this being the primary key | ||
table.index('email') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
baseQueryFactory
includes this column, and is used alone in deleteTokensFactory
await knex.schema.alterTable('pwdreset_tokens', (table) => { | ||
// table.index('id') // index should already be available due to this being the primary key | ||
table.index('email') | ||
table.index('createdAt') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getPendingTokenFactory
includes this column, but alongside email
so could be the secondary in a composite index.
}) | ||
await knex.schema.alterTable('automation_triggers', (table) => { | ||
// table.index('automationRevisionId') // index should already be available due to this being the first item in the composite primary key | ||
table.index('triggerType') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getActiveTriggerDefinitionsFactory
includes this column, alongside triggeringId
getAutomationTriggerDefinitionsFactory
includes this column, alongside automationRevisionId
.
await knex.schema.alterTable('automation_triggers', (table) => { | ||
// table.index('automationRevisionId') // index should already be available due to this being the first item in the composite primary key | ||
table.index('triggerType') | ||
table.index('triggeringId') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getActiveTriggerDefinitionsFactory
includes this column, but only alongside triggerType
…d-95-of-the-time-for-any-table
…d-95-of-the-time-for-any-table
Description & motivation
We analyzed a running database and identified significant tables where indices were involved in a low proportion of queries. This PR adds indices to select table columns to aim to improve the index hit ratio and overall performance.
Existing indices can be found with:
Changes:
To-do before merge:
Screenshots:
Validation of changes:
Checklist:
References