Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(database): indices for columns used in 'where' and 'order by' clauses #3880

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
import { Knex } from 'knex'

export async function up(knex: Knex): Promise<void> {
await knex.schema.alterTable('stream_commits', (table) => {
// table.index('streamId') // index should already be available due to this being the first item in the composite primary key
table.index('commitId')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getCommitStreamsFactory & markCommitStreamUpdatedFactory has a 'where' statement on this column alone

})
await knex.schema.alterTable('branch_commits', (table) => {
// table.index('branchId') // index should already be available due to this being the first item in the composite primary key
table.index('commitId')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markCommitBranchUpdatedFactory, moveCommitsToBranchFactory, getCommitBranchesFactory, switchCommitBranchFactory has a 'where' statement on this column alone

})
await knex.schema.alterTable('refresh_tokens', (table) => {
// table.index('id') // index should already be available due to this being the primary key
table.index('userId')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleteExistingAuthTokensFactory has a 'where' statement on this column alone. (Though I'm confused as to what the statement in that factory is actually doing, it appears to query and then not use the result)

})
await knex.schema.alterTable('commits', (table) => {
// table.index('id') // index should already be available due to this being the primary key
table.index('author')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these really need to be separate indices, instead of composite ones? it was at least the case in MySQL that if you had an index like "x, y, z", then it would be used even if you filter by only "x" or "x" and "y", as long as the order is the same (so it wouldn't work when just filtering by "z")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

important to think about this, cause they do take up space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@fabis94 fabis94 Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's even better than what I expected then, I didn't think there would be a speedup on just filtering "y" at all. But my point is that we need to look at what the actual filter conditions are, and if "y" is usually used together with "x", then there's no need to make "y" have its own index

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just having X and Y as separate indices could even be worse than having a "X, Y" index, if both are filtered together. It's just going to use the X or the Y index, where "X, Y" would be the better one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getPaginatedProjectModelsBaseQueryFactory, getModelTreeItemsFilteredBaseQueryFactory, getUserAuthoredCommitCountsFactory query on the author column alone

table.index('sourceApplication')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getPaginatedProjectModelsBaseQueryFactory, getModelTreeItemsFilteredBaseQueryFactory, getStreamsSourceAppsFactory query on the sourceApplication column alone

table.index('referencedObject')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getObjectCommitsWithStreamIdsFactory queries on this column alone

table.index('createdAt') // used in an ORDER BY clause. https://www.postgresql.org/docs/current/indexes-ordering.html
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getBranchLatestCommitsFactory orders by this column alone.
getPaginatedBranchCommitsItemsFactory orders by this column, but alongside a where on id column

})
await knex.schema.alterTable('file_uploads', (table) => {
// table.index('id') // index should already be available due to this being the primary key
// table.index('streamId') // already explicitly indexed
table.index('convertedStatus')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getStreamFileUploadsFactory includes in where but only in conjunction with streamId column. Probably best to drop the streamId index, and recreate as a composite index of `['streamId', 'convertedStatus'].

table.index('uploadDate')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getStreamFileUploadsFactory includes in where but only in conjunction with streamId column. Probably best to drop the streamId index, and recreate as a composite index of `['streamId', 'uploadDate'].

table.index('branchName')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getStreamPendingModelsFactory, getBranchPendingVersionsFactory includes this column in where but only in conjunction with streamId column.
Probably best to drop the streamId index, and recreate as a composite index of `['streamId', 'branchName'].

})
await knex.schema.alterTable('server_invites', (table) => {
//table.index('id') // index should already be available due to this being the primary key
table.index('updatedAt') // used in an ORDER BY clause. https://www.postgresql.org/docs/current/indexes-ordering.html
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildInvitesBaseQuery as an order by without any where statement

table.index('target')
Copy link
Contributor Author

@iainsproat iainsproat Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

insertInviteAndDeleteOldFactory, deleteServerOnlyInvitesFactory, queryAllUserResourceInvitesFactory, deleteInvitesByTargetFactory includes this column in where statement in conjunction with the resource column. Composite index with target already exists.
findServerInviteFactory, findInviteFactory, deleteAllUserInvitesFactory & findServerInvitesBaseQueryFactory include this column in where statement alone, and this is in dominant position in composite key, so no individual index is required.

table.index('createdAt')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryServerInvitesFactory includes this column in a where statement, but only ever in conjunction with target. The two should be a composite index.

})
await knex.schema.alterTable('gendo_ai_renders', (table) => {
//table.index('id') // index should already be available due to this being the primary key
// table.index('gendoGenerationId') // already explicitly indexed
table.index('versionId')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLatestVersionRenderRequestsFactory includes this column in a where statement, but alongside createdAt in an ORDER BY clause; could be a composite index.
getVersionRenderRequestFactory includes this column in a where statement alongside id column.

table.index('createdAt') // used in an ORDER BY clause. https://www.postgresql.org/docs/current/indexes-ordering.html
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLatestVersionRenderRequestsFactory and getVersionRenderRequestFactory include this column in an ORDER BY. It is always used alongside versionId, so could be a composite index.

})
await knex.schema.alterTable('pwdreset_tokens', (table) => {
// table.index('id') // index should already be available due to this being the primary key
table.index('email')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

baseQueryFactory includes this column, and is used alone in deleteTokensFactory

table.index('createdAt')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getPendingTokenFactory includes this column, but alongside email so could be the secondary in a composite index.

})
await knex.schema.alterTable('automation_triggers', (table) => {
// table.index('automationRevisionId') // index should already be available due to this being the first item in the composite primary key
table.index('triggerType')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getActiveTriggerDefinitionsFactory includes this column, alongside triggeringId
getAutomationTriggerDefinitionsFactory includes this column, alongside automationRevisionId.

table.index('triggeringId')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getActiveTriggerDefinitionsFactory includes this column, but only alongside triggerType

})
}

export async function down(knex: Knex): Promise<void> {
await knex.schema.alterTable('stream_commits', (table) => {
table.dropIndex('commitId')
})
await knex.schema.alterTable('branch_commits', (table) => {
table.dropIndex('commitId')
})
await knex.schema.alterTable('refresh_tokens', (table) => {
table.dropIndex('userId')
})
await knex.schema.alterTable('commits', (table) => {
table.dropIndex('author')
table.dropIndex('sourceApplication')
table.dropIndex('referencedObject')
table.dropIndex('createdAt')
})
await knex.schema.alterTable('file_uploads', (table) => {
table.dropIndex('convertedStatus')
table.dropIndex('uploadDate')
table.dropIndex('branchName')
})
await knex.schema.alterTable('server_invites', (table) => {
table.dropIndex('target')
table.dropIndex('updatedAt')
table.dropIndex('createdAt')
})
await knex.schema.alterTable('gendo_ai_renders', (table) => {
table.dropIndex('versionId')
table.dropIndex('createdAt')
})
await knex.schema.alterTable('pwdreset_tokens', (table) => {
table.dropIndex('email')
table.dropIndex('createdAt')
})
await knex.schema.alterTable('automation_triggers', (table) => {
table.dropIndex('triggerType')
table.dropIndex('triggeringId')
})
}