Releases: domferr/fastflow-python

FastFlow 1.0.0 | Bringing Parallelism to Python

18 Oct 11:17

🚀 First Release of FastFlow. Bringing Parallelism to Python Like Never Before!

We are thrilled to announce the first release of FastFlow, a Python library that enables effortless parallelism by leveraging the high-performance FastFlow C++ library under the hood. With this release, Python developers can now implement advanced parallel patterns and building blocks (such as pipelines, farms, and all-to-all communication) while overcoming the limitations of Python's Global Interpreter Lock (GIL).

✨ Key Features

  • Parallel Building Blocks
    • Farm: Distribute tasks across multiple worker nodes for parallel execution. Ideal for cases like web scraping, data processing, and ML inference, where parallelizing repetitive tasks can significantly boost performance.
    • Pipeline: Chain sequential processing stages that run concurrently, so each stage can work on a different item at the same time. Useful for data pipelines, image processing workflows, and batch data transformation.
    • All-to-All: Build workflows in which every node in one stage can communicate with every node in another stage. Well suited to computational graphs or systems requiring multi-path data flow.
  • Python’s GIL limitations are overcome under the hood with two strategies
    • Multi-processing: Default strategy that spawns multiple processes for parallel execution.
    • Subinterpreters: Use multiple threads, each with its own Python subinterpreter, to achieve parallelism while avoiding GIL constraints. Available only on Python 3.12 and later.
  • Advanced Data Management
    • Flexibly share data between nodes without the burden of performing serialization/deserialization yourself. The library does it for you!
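
To make the shape of a farm concrete, here is a minimal sketch built only on the Python standard library (`threading` and `queue`). It is not the FastFlow API: the names `EOS`, `source`, `worker`, and `run_farm` are hypothetical and exist only for this illustration. It shows a source feeding a set of workers, which in turn feed a collector. Plain threads in CPython still share the GIL, so this sketch illustrates the structure of the pattern only; the library runs nodes in separate processes or subinterpreters to get real parallelism.

```python
import queue
import threading

# Hypothetical end-of-stream marker used only by this sketch.
EOS = object()


def source(out_q, items, n_workers):
    """Emit every item, then one end-of-stream marker per worker."""
    for item in items:
        out_q.put(item)
    for _ in range(n_workers):
        out_q.put(EOS)


def worker(in_q, out_q, fn):
    """Apply fn to each incoming item until end-of-stream arrives."""
    while True:
        item = in_q.get()
        if item is EOS:
            out_q.put(EOS)  # tell the collector this worker is done
            return
        out_q.put(fn(item))


def run_farm(items, fn, n_workers=4):
    """Run source -> workers -> collector and return the collected results.

    Results arrive in completion order, which is not necessarily input order.
    """
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=source, args=(tasks, items, n_workers))]
    threads += [
        threading.Thread(target=worker, args=(tasks, results, fn))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()

    # Collector: gather results until every worker has signalled completion.
    collected, finished = [], 0
    while finished < n_workers:
        r = results.get()
        if r is EOS:
            finished += 1
        else:
            collected.append(r)

    for t in threads:
        t.join()
    return collected


if __name__ == "__main__":
    squares = run_farm(range(10), lambda x: x * x, n_workers=3)
    print(sorted(squares))
```

Because workers finish at different times, the collector sees results out of order; sort or tag them if the original order matters.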

🛠 Additional Features

  • Efficient Communication with ff_send_out
    • Send data to multiple destinations or specific nodes, enabling advanced control over task distribution and result collection.
  • Blocking Mode & Resource Optimization
    • Control whether nodes block while waiting for data (for lower resource usage) or actively check inputs (for higher responsiveness). Also, leverage FastFlow’s mapping feature to pin nodes to specific CPU cores for better performance.
  • On-Demand Scheduling
    • Use on-demand scheduling with the All-to-All building block, allowing nodes to request data only when needed, in contrast to round-robin distribution. This feature optimizes resource usage for workloads with variable execution times.
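
The difference between round-robin and on-demand distribution can be illustrated with a small, self-contained model. This is a simplified simulation, not FastFlow code: the function names and the task durations are made up for the example, and it assumes each task's duration is known up front and that handing a task to a worker costs nothing. Round-robin assigns task `i` to worker `i % n_workers` regardless of load, while on-demand gives the next task to whichever worker becomes free first.

```python
import heapq


def round_robin_makespan(durations, n_workers):
    """Total time when task i is statically assigned to worker i % n_workers."""
    loads = [0] * n_workers
    for i, d in enumerate(durations):
        loads[i % n_workers] += d
    return max(loads)


def on_demand_makespan(durations, n_workers):
    """Total time when each task goes to the worker that frees up earliest."""
    free_at = [0] * n_workers  # min-heap of times at which workers become idle
    heapq.heapify(free_at)
    for d in durations:
        t = heapq.heappop(free_at)       # earliest idle worker requests a task
        heapq.heappush(free_at, t + d)   # it is busy until t + d
    return max(free_at)


if __name__ == "__main__":
    # Hypothetical workload: a few long tasks mixed with many short ones.
    durations = [8, 1, 1, 1, 8, 1, 1, 1]
    print(round_robin_makespan(durations, 2))  # 18
    print(on_demand_makespan(durations, 2))    # 11
```

In this made-up workload round-robin sends both long tasks to the same worker, so the other worker sits idle for most of the run. When all tasks take about the same time, the two policies give similar results.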