Skip to content

Commit

Permalink
Merge permission-performance branch
Browse files Browse the repository at this point in the history
This fixes issue2551330:
Add an optional 'filter' function to the Permission objects and the
addPermission method. This is used to optimize search performance by not
checking items returned from a database query one-by-one (using the
check function) but instead offload the permission checks to the
database. For SQL backends this performs the filtering in the database.
  • Loading branch information
schlatterbeck committed Nov 26, 2024
2 parents 28a7dc3 + d485ecc commit 99065f4
Show file tree
Hide file tree
Showing 19 changed files with 574 additions and 134 deletions.
6 changes: 6 additions & 0 deletions CHANGES.txt
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,12 @@ Features:
- issue2551315 - Document use of
RestfulInstance.max_response_row_size to limit data returned
from rest request.
- issue2551330 Add an optional 'filter' function to the Permission
objects and the addPermission method. This is used to optimize search
performance by not checking items returned from a database query
one-by-one (using the check function) but instead offload the
permission checks to the database. For SQL backends this performs the
filtering in the database. (Ralf Schlatterbeck)

2024-07-13 2.4.0

Expand Down
11 changes: 11 additions & 0 deletions doc/design.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1524,6 +1524,7 @@ The security module defines::
- klass (optional)
- properties (optional)
- check function (optional)
- filter function (optional)

The klass may be unset, indicating that this permission is
not locked to a particular hyperdb class. There may be
Expand All @@ -1536,6 +1537,16 @@ The security module defines::
If check function is set, permission is granted only when
the function returns value interpreted as boolean true.
The function is called with arguments db, userid, itemid.

A filter function complements a check function: It is used
when searching for viewable items. The filter function
allows to filter in SQL rather than calling the check
function for each item after a query. It must return a list
of dictionaries containing parameters for the hyperdb.Class.filter
method. An empty list indicates no access. The signature of
the filter function is::

def filter(db, userid, klass):
'''

class Role:
Expand Down
94 changes: 93 additions & 1 deletion doc/reference.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1484,7 +1484,7 @@ When adding a new Permission, you need to:
4. check it in the appropriate hasPermission methods in your tracker's
extensions/detectors/interfaces.py modules

The ``addPermission`` method takes a three optional parameters:
The ``addPermission`` method takes a four optional parameters:

**check**
A function to be executed which returns boolean determining whether
Expand Down Expand Up @@ -1520,6 +1520,98 @@ The ``addPermission`` method takes a three optional parameters:
permission. When searching there is no item defined, so a check
function does not make any sense.

**filter**
This optional function returns parameters for the ``filter`` method
when getting ``Class`` items (users, issues etc.) from the
database. It filters items at the database level (using SQL where
possible). This pre-filters the number of items returned from the
database when displaying results in an ``index`` template. This
filtering is usually faster than calling a ``check`` method (see
previous argument) on *each individual result*.

The ``filter`` method has the signature::

filter(db, userid, klass)

where ``db`` is the database handle, ``userid`` is the user attempting
access and ``klass`` is the ``Class`` in the schema.

The ``filter`` function must return a *list* of dictionaries of
parameters of the
`Class.filter <design.html#:~:text=search_matches>`_ call.
This includes filterspec, retired and exact_match_specs.
Note that sort and group parameters of the filter call should
not be set by filter method (they will be overwritten) and the
parameter search_matches must not be set.

The query executed by an index template is modified by the
parameters computed by the ``filter`` function. An
empty list of filter parameters (``[]``) indicates no access. When
using a filter, a check function is still needed to test each
individual item for visibility. When the filter function is defined
but a check function is not defined, a check function is
manufactured automatically from the ``filter`` function.

Note that the filter option is not supported for the Search
permission. Since the filter function is called *after* the search was
already performed a filter function does not make any sense.

An example ``filter`` function for the ``view_query`` check function
in the query checks above would look like::

def filter_query(db, userid, klass):
return [{'filterspec': {
'private_for': ['-1', userid]
}}]

This would be called by the framework for all queries found when
displaying queries. It filters for all queries where the
``private_for`` field is the userid or empty. This matches the
definition of the ``view_query`` function above where permission is
granted if the ``private_for`` field indicates the query is owned by
the user, or the ``private_for`` field is empty indicating that the
query is public. If we want to modify the check to also allow acess if
the user is the ``creator`` of a query we would change the filter
function to::

def filter_query(db, userid, klass):
f1 = {'filterspec': {'private_for': ['-1', userid]}}
f2 = {'filterspec': {'creator': userid}}
return [f1, f2]

This is an example where we need multiple filter calls to model an
"or" condition, the user has access if either the ``private_for``
check passes *or* the user is the creator of the query.

Consider an example where we have a class structure where both the
``issue`` class and the ``user`` class include a reference to an
``organization`` class. Users are permitted to view only those
issues that are associated with their respective organizations. A
check function or this could look like::

def view_issue(db, userid, itemid):
user = db.user.getnode(userid)
if not user.organisation:
return False
issue = db.issue.getnode(itemid)
if user.organisation == issue.organisation:
return True

The corresponding ``filter`` function::

def filter_issue(db, userid, klass):
user = db.user.getnode(userid)
if not user.organisation:
return []
return [{'filterspec': {
'organisation': user.organisation
}}]

This filters for all issues where the organisation is the same as the
organisation of the user. Note how the filter fails early returning an
empty list (meaning "no access") if the user happens to not have an
organisation.

**properties**
A sequence of property names that are the only properties to apply the
new Permission to (eg. ``... klass='user', properties=('name',
Expand Down
2 changes: 1 addition & 1 deletion roundup/admin.py
Original file line number Diff line number Diff line change
Expand Up @@ -1920,7 +1920,7 @@ def do_security(self, args):
roles.sort()
for _rolename, role in roles:
sys.stdout.write(_('Role "%(name)s":\n') % role.__dict__)
for permission in role.permissions:
for permission in role.permission_list():
d = permission.__dict__
if permission.klass:
if permission.properties:
Expand Down
2 changes: 2 additions & 0 deletions roundup/backends/back_anydbm.py
Original file line number Diff line number Diff line change
Expand Up @@ -408,11 +408,13 @@ def getnode(self, classname, nodeid, db=None, cache=1):
if db is None:
db = self.getclassdb(classname)
if nodeid not in db:
db.close()
raise IndexError("no such %s %s" % (classname, nodeid))

# check the uncommitted, destroyed nodes
if (classname in self.destroyednodes and
nodeid in self.destroyednodes[classname]):
db.close()
raise IndexError("no such %s %s" % (classname, nodeid))

# decode
Expand Down
18 changes: 4 additions & 14 deletions roundup/cgi/actions.py
Original file line number Diff line number Diff line change
Expand Up @@ -1706,13 +1706,9 @@ def fct(arg):
# generate the CSV output
self.client._socket_op(writer.writerow, columns)
# and search
for itemid in klass.filter(matches, filterspec, sort, group):
filter = klass.filter_with_permissions
for itemid in filter(matches, filterspec, sort, group):
row = []
# don't put out a row of [hidden] fields if the user has
# no access to the issue.
if not self.hasPermission(self.permissionType, itemid=itemid,
classname=request.classname):
continue
for name in columns:
# check permission for this property on this item
# TODO: Permission filter doesn't work for the 'user' class
Expand Down Expand Up @@ -1807,15 +1803,9 @@ def handle(self):
self.client._socket_op(writer.writerow, columns)

# and search
for itemid in klass.filter(matches, filterspec, sort, group):
filter = klass.filter_with_permissions
for itemid in filter(matches, filterspec, sort, group):
row = []
# FIXME should this code raise an exception if an item
# is included that can't be accessed? Enabling this
# check will just skip the row for the inaccessible item.
# This makes it act more like the web interface.
# if not self.hasPermission(self.permissionType, itemid=itemid,
# classname=request.classname):
# continue
for name in columns:
# check permission to view this property on this item
if not self.hasPermission(self.permissionType, itemid=itemid,
Expand Down
8 changes: 4 additions & 4 deletions roundup/cgi/templating.py
Original file line number Diff line number Diff line change
Expand Up @@ -3427,7 +3427,7 @@ def batch(self, permission='View'):
return Batch(self.client, [], self.pagesize, self.startwith,
classname=self.classname)

filterspec = self.filterspec
fspec = self.filterspec
sort = self.sort
group = self.group

Expand All @@ -3453,9 +3453,9 @@ def batch(self, permission='View'):
matches = None

# filter for visibility
allowed = [itemid for itemid in klass.filter(matches, filterspec,
sort, group)
if check(permission, userid, self.classname, itemid=itemid)]
allowed = klass.filter_with_permissions(
matches, fspec, sort, group, permission=permission, userid=userid
)

# return the batch object, using IDs only
return Batch(self.client, allowed, self.pagesize, self.startwith,
Expand Down
15 changes: 13 additions & 2 deletions roundup/configuration.py
Original file line number Diff line number Diff line change
Expand Up @@ -1462,6 +1462,17 @@ def str2value(self, value):
("rdbms", (
(DatabaseBackend, 'backend', NODEFAULT,
"Database backend."),
(BooleanOption, "debug_filter", "no",
"Filter debugging: Permissions can define additional filter\n"
"functions that are used when checking permissions on results\n"
"returned by the database. This is done to improve\n"
"performance since the filtering is done in the database\n"
"backend, not in python (at least for the SQL backends). The\n"
"user is responsible for making the filter return the same\n"
"set of results as the check function for a permission. So it\n"
"makes sense to aid in debugging (and performance\n"
"measurements) to allow turning off the usage of filter\n"
"functions using only the check functions."),
(Option, 'name', 'roundup',
"Name of the database to use. For Postgresql, this can\n"
"be database.schema to use a specific schema within\n"
Expand Down Expand Up @@ -1545,8 +1556,8 @@ def str2value(self, value):
"Set the database cursor for filter queries to serverside\n"
"cursor, this avoids caching large amounts of data in the\n"
"client. This option only applies for the postgresql backend."),
), "Settings in this section (except for backend) are used\n"
" by RDBMS backends only.",
), "Most settings in this section (except for backend and debug_filter)\n"
"are used by RDBMS backends only.",
),
("sessiondb", (
(SessiondbBackendOption, "backend", "",
Expand Down
54 changes: 54 additions & 0 deletions roundup/hyperdb.py
Original file line number Diff line number Diff line change
Expand Up @@ -1796,6 +1796,60 @@ def filter(self, search_matches, filterspec, sort=[], group=[],
# anyway).
filter_iter = filter

def filter_with_permissions(self, search_matches, filterspec, sort=[],
group=[], retired=False, exact_match_spec={},
limit=None, offset=None,
permission='View', userid=None):
""" Do the same as filter but return only the items the user is
entitled to see, running the results through security checks.
The userid defaults to the current database user.
"""
if userid is None:
userid = self.db.getuid()
cn = self.classname
sec = self.db.security
filterspec = sec.filterFilterspec(userid, cn, filterspec)
if exact_match_spec:
exact_match_spec = sec.filterFilterspec(userid, cn,
exact_match_spec)
sort = sec.filterSortspec(userid, cn, sort)
group = sec.filterSortspec(userid, cn, group)
item_ids = self.filter(search_matches, filterspec, sort, group,
retired, exact_match_spec, limit, offset)
check = sec.hasPermission
if check(permission, userid, cn, skip_permissions_with_check = True):
allowed = item_ids
else:
debug = self.db.config.RDBMS_DEBUG_FILTER
# Note that is_filterable returns True if no permissions are
# found. This makes it fail early (with an empty allowed list)
# instead of running through all ids with an empty
# permission list.
if not debug and sec.is_filterable(permission, userid, cn):
new_ids = set(item_ids)
confirmed = set()
for perm in sec.filter_iter(permission, userid, cn):
fargs = perm.filter(self.db, userid, self)
for farg in fargs:
farg.update(sort=[], group=[], retired=None)
result = self.filter(list(new_ids), **farg)
new_ids.difference_update(result)
confirmed.update(result)
# all allowed?
if not new_ids:
break
# all allowed?
if not new_ids:
break
# Need to sort again in database
allowed = self.filter(confirmed, {}, sort=sort, group=group,
retired=None)
else: # Last resort: filter in python
allowed = [id for id in item_ids
if check(permission, userid, cn, itemid=id)]
return allowed


def count(self):
"""Get the number of nodes in this class.
Expand Down
9 changes: 4 additions & 5 deletions roundup/rest.py
Original file line number Diff line number Diff line change
Expand Up @@ -953,7 +953,7 @@ def get_collection(self, class_name, input):
kw['limit'] = self.max_response_row_size
if page['index'] is not None and page['index'] > 1:
kw['offset'] = (page['index'] - 1) * page['size']
obj_list = class_obj.filter(None, *l, **kw)
obj_list = class_obj.filter_with_permissions(None, *l, **kw)

# Have we hit the max number of returned rows?
# If so there may be more data that the client
Expand All @@ -973,10 +973,9 @@ def get_collection(self, class_name, input):
result['collection'] = []
for item_id in obj_list:
r = {}
if self.db.security.hasPermission(
'View', uid, class_name, itemid=item_id, property='id',
):
r = {'id': item_id, 'link': class_path + item_id}
# No need to check permission on id here, as we have only
# security-checked results
r = {'id': item_id, 'link': class_path + item_id}
if display_props:
# format_item does the permission checks
r.update(self.format_item(class_obj.getnode(item_id),
Expand Down
Loading

0 comments on commit 99065f4

Please sign in to comment.