Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improving tag performance (and simplicity) #26

Open
mplemay opened this issue Jan 2, 2025 · 3 comments
Open

improving tag performance (and simplicity) #26

mplemay opened this issue Jan 2, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

@mplemay
Copy link
Contributor

mplemay commented Jan 2, 2025

Hi @volfpeter,

First off, I want to say how much I admire your work.

I was taking a closer look at the implementation of Tag and noticed that it’s essentially structured like a dataclass. This got me thinking about what switching it to an actual dataclass might bring to the table in terms of both performance and readability.

Based on my benchmarking and a review of the dataclass approach, it seems like such a change could simplify the implementation while maintaining or even improving efficiency (see the benchmark below). Also, as a side-effect, it would also make the code more Pythonic and easier to maintain.

If you agree this is worth exploring, I’d be happy to assist with the implementation. Let me know what you think!

Results

Benchmarking object creation:
Object creation for ExampleTag: 0.0110 seconds
Object creation for ExampleTagNew: 0.0073 seconds

Benchmarking property access:
Property access for ExampleTag: 0.0021 seconds
Property access for ExampleTagNew: 0.0021 seconds

Benchmarking memory usage:
Memory usage for ExampleTag: Current=7813.41 KB, Peak=7813.48 KB
Memory usage for ExampleTagNew: Current=7032.16 KB, Peak=7032.23 KB

Benchmark

import time
import tracemalloc

from dataclasses import dataclass
from functools import cached_property
from abc import ABC, abstractmethod


class Context:
    pass


class Component:
    pass


class BaseTag(ABC):
    """
    Base tag class.

    Tags are always synchronous.

    If the content of a tag must be calculated asynchronously, then the content can be implemented
    as a separate async component or be resolved in an async parent component. If a property of a
    tag must be calculated asynchronously, then the tag can be wrapped in an async component that
    resolves the async content and then passes the value to the tag.
    """

    __slots__ = ("_htmy_name",)

    def __init__(self) -> None:
        self._htmy_name = self._get_htmy_name()

    @property
    def htmy_name(self) -> str:
        """The tag name."""
        return self._htmy_name

    @abstractmethod
    def htmy(self, context: Context) -> Component:
        """Abstract base component implementation."""
        ...

    def _get_htmy_name(self) -> str:
        return type(self).__name__


@dataclass(frozen=True, slots=True, kw_only=True)
class BaseTagNew(ABC):
    """
    Optimized BaseTag class using dataclass.
    """

    @cached_property
    def htmy_name(self) -> str:
        """The tag name."""
        return type(self).__name__

    @abstractmethod
    def htmy(self, context: Context) -> Component:
        """Abstract base component implementation."""
        ...


class ExampleTag(BaseTag):
    def htmy(self, context: Context) -> Component:
        return Component()


class ExampleTagNew(BaseTagNew):
    def htmy(self, context: Context) -> Component:
        return Component()


def benchmark_creation(cls, iterations=100_000):
    start = time.time()
    for _ in range(iterations):
        obj = cls()
    end = time.time()
    print(f"Object creation for {cls.__name__}: {end - start:.4f} seconds")


def benchmark_property_access(instance, iterations=100_000):
    start = time.time()
    for _ in range(iterations):
        _ = instance.htmy_name
    end = time.time()
    print(f"Property access for {type(instance).__name__}: {end - start:.4f} seconds")


def benchmark_memory(cls, iterations=100_000):
    tracemalloc.start()
    objs = [cls() for _ in range(iterations)]
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(
        f"Memory usage for {cls.__name__}: Current={current / 1024:.2f} KB, Peak={peak / 1024:.2f} KB"
    )


if __name__ == "__main__":
    iterations = 100_000

    print("Benchmarking object creation:")
    benchmark_creation(ExampleTag, iterations)
    benchmark_creation(ExampleTagNew, iterations)

    print("\nBenchmarking property access:")
    obj1 = ExampleTag()
    obj2 = ExampleTagNew()
    benchmark_property_access(obj1, iterations)
    benchmark_property_access(obj2, iterations)

    print("\nBenchmarking memory usage:")
    benchmark_memory(ExampleTag, iterations)
    benchmark_memory(ExampleTagNew, iterations)
@volfpeter volfpeter added the enhancement New feature or request label Jan 2, 2025
@volfpeter
Copy link
Owner

Hi,

Thanks. And thank you for taking the time and effort to look into this!

The creation and memory usage improvements look pretty good. We could even disable some more dataclass features (eq, match_args). There would be one downside, tag_config declaration (breaking change), but I don't think that's a big issue.

To properly benchmark this change, I guess we'd need to re-implement tags and the html module (trivial change), and run comparisons using larger, nested component trees. test_main_components.py and test_renderer_comparison.py may be good starting points for this, but maybe we'd need new tests (or test data) that use components with more properties.

If you're willing to work on a PR, I'd be happy to collaborate.

Cheers,
Peter

@mplemay
Copy link
Contributor Author

mplemay commented Jan 2, 2025

Hey Peter,

I can definitely take a crack at it in the next couple of days.

As for your comment below, do you have any preferences; otherwise, I will probably work towards something like https://docs.pydantic.dev/2.0/usage/model_config/

There would be one downside, tag_config declaration (breaking change), but I don't think that's a big issue.

@volfpeter
Copy link
Owner

Yeah, tag_config currently works pretty much like Pydantic's model_config (well, except there's no inheritance, which is fine, but maybe inheritance would be even better, one for the future.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants