You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
First off, I want to say how much I admire your work.
I was taking a closer look at the implementation of Tag and noticed that it’s essentially structured like a dataclass. This got me thinking about what switching it to an actual dataclass might bring to the table in terms of both performance and readability.
Based on my benchmarking and a review of the dataclass approach, it seems like such a change could simplify the implementation while maintaining or even improving efficiency (see the benchmark below). Also, as a side-effect, it would also make the code more Pythonic and easier to maintain.
If you agree this is worth exploring, I’d be happy to assist with the implementation. Let me know what you think!
Results
Benchmarking object creation:
Object creation for ExampleTag: 0.0110 seconds
Object creation for ExampleTagNew: 0.0073 seconds
Benchmarking property access:
Property access for ExampleTag: 0.0021 seconds
Property access for ExampleTagNew: 0.0021 seconds
Benchmarking memory usage:
Memory usage for ExampleTag: Current=7813.41 KB, Peak=7813.48 KB
Memory usage for ExampleTagNew: Current=7032.16 KB, Peak=7032.23 KB
Benchmark
importtimeimporttracemallocfromdataclassesimportdataclassfromfunctoolsimportcached_propertyfromabcimportABC, abstractmethodclassContext:
passclassComponent:
passclassBaseTag(ABC):
""" Base tag class. Tags are always synchronous. If the content of a tag must be calculated asynchronously, then the content can be implemented as a separate async component or be resolved in an async parent component. If a property of a tag must be calculated asynchronously, then the tag can be wrapped in an async component that resolves the async content and then passes the value to the tag. """__slots__= ("_htmy_name",)
def__init__(self) ->None:
self._htmy_name=self._get_htmy_name()
@propertydefhtmy_name(self) ->str:
"""The tag name."""returnself._htmy_name@abstractmethoddefhtmy(self, context: Context) ->Component:
"""Abstract base component implementation."""
...
def_get_htmy_name(self) ->str:
returntype(self).__name__@dataclass(frozen=True, slots=True, kw_only=True)classBaseTagNew(ABC):
""" Optimized BaseTag class using dataclass. """@cached_propertydefhtmy_name(self) ->str:
"""The tag name."""returntype(self).__name__@abstractmethoddefhtmy(self, context: Context) ->Component:
"""Abstract base component implementation."""
...
classExampleTag(BaseTag):
defhtmy(self, context: Context) ->Component:
returnComponent()
classExampleTagNew(BaseTagNew):
defhtmy(self, context: Context) ->Component:
returnComponent()
defbenchmark_creation(cls, iterations=100_000):
start=time.time()
for_inrange(iterations):
obj=cls()
end=time.time()
print(f"Object creation for {cls.__name__}: {end-start:.4f} seconds")
defbenchmark_property_access(instance, iterations=100_000):
start=time.time()
for_inrange(iterations):
_=instance.htmy_nameend=time.time()
print(f"Property access for {type(instance).__name__}: {end-start:.4f} seconds")
defbenchmark_memory(cls, iterations=100_000):
tracemalloc.start()
objs= [cls() for_inrange(iterations)]
current, peak=tracemalloc.get_traced_memory()
tracemalloc.stop()
print(
f"Memory usage for {cls.__name__}: Current={current/1024:.2f} KB, Peak={peak/1024:.2f} KB"
)
if__name__=="__main__":
iterations=100_000print("Benchmarking object creation:")
benchmark_creation(ExampleTag, iterations)
benchmark_creation(ExampleTagNew, iterations)
print("\nBenchmarking property access:")
obj1=ExampleTag()
obj2=ExampleTagNew()
benchmark_property_access(obj1, iterations)
benchmark_property_access(obj2, iterations)
print("\nBenchmarking memory usage:")
benchmark_memory(ExampleTag, iterations)
benchmark_memory(ExampleTagNew, iterations)
The text was updated successfully, but these errors were encountered:
Thanks. And thank you for taking the time and effort to look into this!
The creation and memory usage improvements look pretty good. We could even disable some more dataclass features (eq, match_args). There would be one downside, tag_config declaration (breaking change), but I don't think that's a big issue.
To properly benchmark this change, I guess we'd need to re-implement tags and the html module (trivial change), and run comparisons using larger, nested component trees. test_main_components.py and test_renderer_comparison.py may be good starting points for this, but maybe we'd need new tests (or test data) that use components with more properties.
If you're willing to work on a PR, I'd be happy to collaborate.
Yeah, tag_config currently works pretty much like Pydantic's model_config (well, except there's no inheritance, which is fine, but maybe inheritance would be even better, one for the future.)
Hi @volfpeter,
First off, I want to say how much I admire your work.
I was taking a closer look at the implementation of
Tag
and noticed that it’s essentially structured like adataclass
. This got me thinking about what switching it to an actualdataclass
might bring to the table in terms of both performance and readability.Based on my benchmarking and a review of the dataclass approach, it seems like such a change could simplify the implementation while maintaining or even improving efficiency (see the benchmark below). Also, as a side-effect, it would also make the code more Pythonic and easier to maintain.
If you agree this is worth exploring, I’d be happy to assist with the implementation. Let me know what you think!
Results
Benchmark
The text was updated successfully, but these errors were encountered: