Optimisation: Avoid set copies when compiling files #31

leamingrad · 2024-05-26T18:14:10Z

This PR aims to speed up file compilation by avoid unnecessary copying of sets.

Previously, the Scope class had methods of the form:

def names_in_use(self):
	names = self.names
    if self.parent_scope is not None:
        # This create a new set
        result = result | self.parent_scope.names_in_use()

    return names

These sets were used by various other codegen functions to check for membership. With large fluent files, this results in a lot of unnecessary set creation, as there might be many names added to a given scope.

This PR replaces these methods with query equivalents of the form:

def is_name_in_use(self, name: str) -> bool:
    if name in self.names:
        return True

    if self.parent_scope is None:
        return False

    return self.parent_scope.is_name_in_use(name)

By doing this we speed up compiling by about 60%. For the 10K line benchmark added in #30, it reduces the time from ~10s to ~4s.

Benchmark before

❯ ./compiler.py -k 10k --benchmark-warmup=off
================================================================================================= test session starts ==================================================================================================
platform darwin -- Python 3.10.14, pytest-8.2.1, pluggy-1.5.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Volumes/Code/fluent-compiler
configfile: pyproject.toml
plugins: anyio-4.3.0, hypothesis-6.102.6, benchmark-4.0.0
collected 4 items / 3 deselected / 1 selected                                                                                                                                                                          

compiler.py .                                                                                                                                                                                                    [100%]


----------------------------------------------- benchmark: 1 tests -----------------------------------------------
Name (time in s)                Min      Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------
test_file_with_10k_items     9.7863  10.1248  9.9234  0.1377  9.8794  0.2118       1;0  0.1008       5           1
------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
====================================================================================== 1 passed, 3 deselected in 69.96s (0:01:09) ======================================================================================

Benchmark after

❯ ./compiler.py -k 10k --benchmark-warmup=off
================================================================================================= test session starts ==================================================================================================
platform darwin -- Python 3.10.14, pytest-8.2.1, pluggy-1.5.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Volumes/Code/fluent-compiler
configfile: pyproject.toml
plugins: anyio-4.3.0, hypothesis-6.102.6, benchmark-4.0.0
collected 4 items / 3 deselected / 1 selected                                                                                                                                                                          

compiler.py .                                                                                                                                                                                                    [100%]


----------------------------------------------- benchmark: 1 tests ----------------------------------------------
Name (time in s)                Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------
test_file_with_10k_items     3.9957  4.0290  4.0099  0.0127  4.0051  0.0164       2;0  0.2494       5           1
-----------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=========================================================================================== 1 passed, 3 deselected in 29.47s ==========================================================================================

This will make the following changes clearer.

Prior to this change, if we were checking whether a non-builtin name was allowed, we would expand the set of reserved names to include all keywords. This change switches to using the iskeyword function instead, which will allow us to stop using the set of reserved names altogether in a future commit.

Prior to this change, code which was checking whether a name was reserved in a given scope would need to fetch the relevant set of names and check whether the name was in it directly. This change adds query functions which allow the checks to be performed directly. The query functions recurse to the parent scope, rather than needing to fetch a new set containing this scope's names and the parent scope's.

…function_arg

Prior to this change, we had replaced all callers with the query equivalents. This provides a considerable performance improvement when compiling large files, as we are no longer repeatedly creating new sets. This change removes the unused methods which generated the sets as they are no longer needed (and are a footgun).

spookylukey · 2024-05-30T19:36:08Z

This looks great, thanks!

On my Linux box I'm seeing even bigger speedups, from ~28s to ~6s on the same benchmark.

leamingrad added 7 commits May 26, 2024 19:06

Pull out a helper for checking whether a candidate name is allowed

f51c0f4

This will make the following changes clearer.

Replace uses of all_reserved_names with is_name_reserved

9151edc

Replace callers of function_arg_reserved_names with is_name_reserved_…

af9cd54

…function_arg

Replace all callers of names_in_use with is_names_in_use

72442eb

leamingrad force-pushed the remove-repeated-set-creations branch from b097d44 to a9d5d69 Compare May 26, 2024 18:19

leamingrad mentioned this pull request May 26, 2024

Optimisations for compiling large fluent files #32

Open

leamingrad marked this pull request as ready for review May 26, 2024 18:26

spookylukey merged commit 40e624c into django-ftl:master May 30, 2024
6 of 11 checks passed

leamingrad deleted the remove-repeated-set-creations branch May 31, 2024 16:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimisation: Avoid set copies when compiling files #31

Optimisation: Avoid set copies when compiling files #31

leamingrad commented May 26, 2024

spookylukey commented May 30, 2024

Optimisation: Avoid set copies when compiling files #31

Optimisation: Avoid set copies when compiling files #31

Conversation

leamingrad commented May 26, 2024

spookylukey commented May 30, 2024