Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimisation: Avoid set copies when compiling files #31

Merged

Conversation

leamingrad
Copy link
Contributor

This PR aims to speed up file compilation by avoid unnecessary copying of sets.

Previously, the Scope class had methods of the form:

def names_in_use(self):
	names = self.names
    if self.parent_scope is not None:
        # This create a new set
        result = result | self.parent_scope.names_in_use()

    return names

These sets were used by various other codegen functions to check for membership. With large fluent files, this results in a lot of unnecessary set creation, as there might be many names added to a given scope.

This PR replaces these methods with query equivalents of the form:

def is_name_in_use(self, name: str) -> bool:
    if name in self.names:
        return True

    if self.parent_scope is None:
        return False

    return self.parent_scope.is_name_in_use(name)

By doing this we speed up compiling by about 60%. For the 10K line benchmark added in #30, it reduces the time from ~10s to ~4s.

Benchmark before
❯ ./compiler.py -k 10k --benchmark-warmup=off
================================================================================================= test session starts ==================================================================================================
platform darwin -- Python 3.10.14, pytest-8.2.1, pluggy-1.5.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Volumes/Code/fluent-compiler
configfile: pyproject.toml
plugins: anyio-4.3.0, hypothesis-6.102.6, benchmark-4.0.0
collected 4 items / 3 deselected / 1 selected                                                                                                                                                                          

compiler.py .                                                                                                                                                                                                    [100%]


----------------------------------------------- benchmark: 1 tests -----------------------------------------------
Name (time in s)                Min      Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------
test_file_with_10k_items     9.7863  10.1248  9.9234  0.1377  9.8794  0.2118       1;0  0.1008       5           1
------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
====================================================================================== 1 passed, 3 deselected in 69.96s (0:01:09) ======================================================================================
Benchmark after
❯ ./compiler.py -k 10k --benchmark-warmup=off
================================================================================================= test session starts ==================================================================================================
platform darwin -- Python 3.10.14, pytest-8.2.1, pluggy-1.5.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Volumes/Code/fluent-compiler
configfile: pyproject.toml
plugins: anyio-4.3.0, hypothesis-6.102.6, benchmark-4.0.0
collected 4 items / 3 deselected / 1 selected                                                                                                                                                                          

compiler.py .                                                                                                                                                                                                    [100%]


----------------------------------------------- benchmark: 1 tests ----------------------------------------------
Name (time in s)                Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------
test_file_with_10k_items     3.9957  4.0290  4.0099  0.0127  4.0051  0.0164       2;0  0.2494       5           1
-----------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=========================================================================================== 1 passed, 3 deselected in 29.47s ==========================================================================================

This will make the following changes clearer.
Prior to this change, if we were checking whether a non-builtin name was
allowed, we would expand the set of reserved names to include all
keywords.

This change switches to using the iskeyword function instead, which will
allow us to stop using the set of reserved names altogether in a future
commit.
Prior to this change, code which was checking whether a name was
reserved in a given scope would need to fetch the relevant set of names
and check whether the name was in it directly.

This change adds query functions which allow the checks to be performed
directly. The query functions recurse to the parent scope, rather than
needing to fetch a new set containing this scope's names and the parent
scope's.
Prior to this change, we had replaced all callers with the query
equivalents. This provides a considerable performance improvement when
compiling large files, as we are no longer repeatedly creating new sets.

This change removes the unused methods which generated the sets as they
are no longer needed (and are a footgun).
@leamingrad leamingrad force-pushed the remove-repeated-set-creations branch from b097d44 to a9d5d69 Compare May 26, 2024 18:19
@leamingrad leamingrad marked this pull request as ready for review May 26, 2024 18:26
@spookylukey
Copy link
Contributor

This looks great, thanks!

On my Linux box I'm seeing even bigger speedups, from ~28s to ~6s on the same benchmark.

@spookylukey spookylukey merged commit 40e624c into django-ftl:master May 30, 2024
6 of 11 checks passed
@leamingrad leamingrad deleted the remove-repeated-set-creations branch May 31, 2024 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants