switch to micromamba #6

Merged · 7 commits · Nov 16, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -40,3 +40,4 @@ share/python-wheels/
MANIFEST
venv
.history
.conda_env_created
16 changes: 9 additions & 7 deletions Dockerfile.moeflow_worker
@@ -1,9 +1,11 @@
# FROM pytorch/pytorch:2.2.2-cuda12.1-cudnn8-runtime
FROM continuumio/miniconda3
FROM mambaorg/micromamba:2-debian12-slim

RUN apt update \
&& DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata build-essential python3-opencv
COPY ./conda.yaml /tmp/conda.yaml
RUN --mount=type=cache,uid=57439,gid=57439,target=/opt/conda/pkgs micromamba env create --yes --file /tmp/conda.yaml
# RUN apt update \
# && DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata build-essential
COPY --chown=mambauser . /app
Comment on lines +5 to +6
💡 Codebase verification

Keep system packages - they are required for custom compilation

The packages should not be commented out as they are necessary:

  • Found C++ source files (custom_ctc.cc, custom_ctc_cuda_driver.cc) in training/ocr/ that require compilation
  • build-essential is needed for compiling these custom CTC (Connectionist Temporal Classification) implementations
  • tzdata is likely needed for proper timezone handling in the application (multiple Python files use datetime)
  • These packages are not provided by conda.yaml which only includes Python and PyTorch dependencies
  • The micromamba base image (debian12-slim) is minimal and doesn't include these by default
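
If those packages are to be kept on the new base image, one option is to install them as root before dropping back to the unprivileged user — a sketch only; `MAMBA_USER` is the build-arg/user convention of the `mambaorg/micromamba` images, and the exact package set should be confirmed against the custom CTC build:

```dockerfile
FROM mambaorg/micromamba:2-debian12-slim

# Install the system packages the custom CTC compilation and timezone
# handling need, then drop back to the unprivileged image user.
USER root
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC \
       apt-get install -y --no-install-recommends tzdata build-essential \
    && rm -rf /var/lib/apt/lists/*
USER $MAMBA_USER
```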
🔗 Analysis chain

Verify removal of system packages

The commented-out system packages (tzdata, build-essential) might still be required. Please verify:

  1. Are these packages now provided by the micromamba base image?
  2. Are they installed through conda.yaml instead?
  3. Are they no longer needed in the new setup?

Let me gather more information about the base image and conda environment setup.


Let me check one final thing about the base image and any potential custom compilation needs.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if these packages are specified in conda.yaml
echo "Checking conda.yaml for system packages..."
rg -i "tzdata|build-essential" conda.yaml

# Check if any build commands or timezone operations exist
echo "Checking for build or timezone dependencies..."
rg -l "datetime|pytz|compile|build" .

Length of output: 989


Script:

#!/bin/bash
# Check base image and conda environment setup
echo "Checking Dockerfile base image and conda setup..."
head -n 20 Dockerfile.moeflow_worker

echo "Checking conda.yaml dependencies..."
cat conda.yaml

echo "Checking if there are any compile/build steps in Makefile..."
cat Makefile

Length of output: 2020


Script:

#!/bin/bash
# Check if there are any custom compilation needs
echo "Checking for C/C++ files that might need compilation..."
fd -e cc -e cpp -e c

# Check base image details
echo "Checking if base image already includes build tools..."
docker run --rm mambaorg/micromamba:2-debian12-slim dpkg -l | grep -E "build-essential|tzdata"

Length of output: 499

WORKDIR /app
COPY . /app
RUN --mount=type=cache,target=/root/.conda conda env update --prune --file conda.yaml
RUN conda run -n mit-py311 --no-capture-output python3 docker_prepare.py
RUN micromamba run -n mit-py311 python3 -mvenv --system-site-packages venv
RUN venv/bin/pip install -r requirements-moeflow.txt
RUN venv/bin/python docker_prepare.py --models ocr.48px,ocr.48px_ctc,ocr.32px,ocr.mocr
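
The last three `RUN` steps layer pip-only dependencies on top of the conda environment by creating a venv with `--system-site-packages`, so packages micromamba resolved (PyTorch, CUDA) remain importable while `requirements-moeflow.txt` adds the rest. The flag's effect can be sketched with the stdlib `venv` module alone (paths are illustrative):

```python
import os
import tempfile
import venv

# Create a venv that can also see the parent interpreter's site-packages,
# mirroring the `--system-site-packages` flag used in the Dockerfile.
env_dir = tempfile.mkdtemp()
venv.EnvBuilder(system_site_packages=True, with_pip=False).create(env_dir)

# The setting is recorded in pyvenv.cfg inside the new environment.
with open(os.path.join(env_dir, "pyvenv.cfg")) as fh:
    cfg = fh.read()
print("include-system-site-packages = true" in cfg.lower())  # True
```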
18 changes: 15 additions & 3 deletions Makefile
@@ -1,11 +1,11 @@
.PHONY: default

CONDA_ENV = mit-py311
CONDA_YML ?= conda.yaml

default:
@echo Please use other targets

conda-env:
conda env update --prune --file conda.yaml

run-worker:
conda run -n mit-py311 --no-capture-output celery --app moeflow_worker worker --queues mit --loglevel=debug --concurrency=1

@@ -25,3 +25,15 @@ run-web-server:
--use-gpu \
--host=0.0.0.0 \
--port=5003

install-venv-deps:
venv/bin/pip install -r requirements-moeflow.txt

conda-venv: .conda_env_created # alt to `venv/.venv_created` target, but uses conda python to create venv
micromamba run --attach '' -n $(CONDA_ENV) python3 -mvenv --system-site-packages ./venv
touch venv/.venv_created

.conda_env_created: $(CONDA_YML)
# setup conda environment AND env-wise deps
micromamba env create -n $(CONDA_ENV) --yes -f $(CONDA_YML)
touch $@
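
The `.conda_env_created` target is a stamp-file pattern: make compares the marker's mtime against `$(CONDA_YML)` and only re-creates the environment when the YAML is newer. The same idea in plain shell, with illustrative file names:

```shell
#!/bin/sh
# Stamp-file sketch: rebuild only when conda.yaml is newer than the marker.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

touch conda.yaml          # the "source" file
sleep 1
touch .conda_env_created  # the stamp, created after the source

if [ conda.yaml -nt .conda_env_created ]; then
    echo "rebuild"
else
    echo "up to date"
fi
```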
Comment on lines +36 to +39
🛠️ Refactor suggestion

Add environment cleanup and micromamba version check

The target creates a new Conda environment but doesn't handle existing environments. Consider:

  1. Adding cleanup of existing environment
  2. Verifying micromamba version for compatibility
 .conda_env_created: $(CONDA_YML)
+	micromamba --version
+	micromamba env remove -n $(CONDA_ENV) --yes || true
 	# setup conda environment AND env-wise deps
 	micromamba env create -n $(CONDA_ENV) --yes -f $(CONDA_YML)
 	touch $@
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
.conda_env_created: $(CONDA_YML)
# setup conda environment AND env-wise deps
micromamba env create -n $(CONDA_ENV) --yes -f $(CONDA_YML)
touch $@
.conda_env_created: $(CONDA_YML)
micromamba --version
micromamba env remove -n $(CONDA_ENV) --yes || true
# setup conda environment AND env-wise deps
micromamba env create -n $(CONDA_ENV) --yes -f $(CONDA_YML)
touch $@

138 changes: 9 additions & 129 deletions conda.yaml
@@ -1,132 +1,12 @@
name: mit-py311
channels:
- pytorch
- nvidia
- defaults
- conda-forge
- pytorch
- nvidia
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- blas=1.0=mkl
- brotli-python=1.0.9=py311h6a678d5_7
- bzip2=1.0.8=h5eee18b_5
- ca-certificates=2024.3.11=h06a4308_0
- certifi=2024.2.2=py311h06a4308_0
- charset-normalizer=2.0.4=pyhd3eb1b0_0
- cuda-cudart=12.1.105=0
- cuda-cupti=12.1.105=0
- cuda-libraries=12.1.0=0
- cuda-nvrtc=12.1.105=0
- cuda-nvtx=12.1.105=0
- cuda-opencl=12.4.127=0
- cuda-runtime=12.1.0=0
- ffmpeg=4.3=hf484d3e_0
- filelock=3.13.1=py311h06a4308_0
- freetype=2.12.1=h4a9f257_0
- gmp=6.2.1=h295c915_3
- gmpy2=2.1.2=py311hc9b5ff0_0
- gnutls=3.6.15=he1e5248_0
- idna=3.4=py311h06a4308_0
- intel-openmp=2023.1.0=hdb19cb5_46306
- jinja2=3.1.3=py311h06a4308_0
- jpeg=9e=h5eee18b_1
- lame=3.100=h7b6447c_0
- lcms2=2.12=h3be6417_0
- ld_impl_linux-64=2.38=h1181459_1
- lerc=3.0=h295c915_0
- libcublas=12.1.0.26=0
- libcufft=11.0.2.4=0
- libcufile=1.9.1.3=0
- libcurand=10.3.5.147=0
- libcusolver=11.4.4.55=0
- libcusparse=12.0.2.55=0
- libdeflate=1.17=h5eee18b_1
- libffi=3.4.4=h6a678d5_0
- libgcc-ng=11.2.0=h1234567_1
- libgomp=11.2.0=h1234567_1
- libiconv=1.16=h7f8727e_2
- libidn2=2.3.4=h5eee18b_0
- libjpeg-turbo=2.0.0=h9bf148f_0
- libnpp=12.0.2.50=0
- libnvjitlink=12.1.105=0
- libnvjpeg=12.1.1.14=0
- libpng=1.6.39=h5eee18b_0
- libstdcxx-ng=11.2.0=h1234567_1
- libtasn1=4.19.0=h5eee18b_0
- libtiff=4.5.1=h6a678d5_0
- libunistring=0.9.10=h27cfd23_0
- libuuid=1.41.5=h5eee18b_0
- libwebp-base=1.3.2=h5eee18b_0
- llvm-openmp=14.0.6=h9e868ea_0
- lz4-c=1.9.4=h6a678d5_0
- markupsafe=2.1.3=py311h5eee18b_0
- mkl=2023.1.0=h213fc3f_46344
- mkl-service=2.4.0=py311h5eee18b_1
- mkl_fft=1.3.8=py311h5eee18b_0
- mkl_random=1.2.4=py311hdb19cb5_0
- mpc=1.1.0=h10f8cd9_1
- mpfr=4.0.2=hb69a4c5_1
- mpmath=1.3.0=py311h06a4308_0
- ncurses=6.4=h6a678d5_0
- nettle=3.7.3=hbbd107a_1
- networkx=3.1=py311h06a4308_0
- numpy=1.26.4=py311h08b1b3b_0
- numpy-base=1.26.4=py311hf175353_0
- openh264=2.1.1=h4ff587b_0
- openjpeg=2.4.0=h3ad879b_0
- openssl=3.0.13=h7f8727e_0
- pillow=10.2.0=py311h5eee18b_0
- pip=23.3.1=py311h06a4308_0
- pysocks=1.7.1=py311h06a4308_0
- python=3.11.8=h955ad1f_0
- pytorch=2.2.2=py3.11_cuda12.1_cudnn8.9.2_0
- pytorch-cuda=12.1=ha16c6d3_5
- pytorch-mutex=1.0=cuda
- pyyaml=6.0.1=py311h5eee18b_0
- readline=8.2=h5eee18b_0
- requests=2.31.0=py311h06a4308_1
- setuptools=68.2.2=py311h06a4308_0
- sqlite=3.41.2=h5eee18b_0
- sympy=1.12=py311h06a4308_0
- tbb=2021.8.0=hdb19cb5_0
- tk=8.6.12=h1ccaba5_0
- torchaudio=2.2.2=py311_cu121
- torchtriton=2.2.0=py311
- torchvision=0.17.2=py311_cu121
- typing_extensions=4.9.0=py311h06a4308_1
- tzdata=2024a=h04d1e81_0
- urllib3=2.1.0=py311h06a4308_1
- wheel=0.41.2=py311h06a4308_0
- xz=5.4.6=h5eee18b_0
- yaml=0.2.5=h7b6447c_0
- zlib=1.2.13=h5eee18b_0
- zstd=1.5.5=hc292b87_0
- pip:
- pydensecrf@https://github.com/lucasb-eyer/pydensecrf/archive/refs/heads/master.zip
- opencv-python-headless
- numpy==1.26.4
- celery==5.3.6
- celery[redis]==5.3.6
- asgiref
- aiohttp
- omegaconf==2.3.0
- langcodes
- marshmallow
- colorama
- tqdm
- einops
- shapely
- scikit-image
- pyclipper
- python-dotenv
- py3langid
- timm
- redis
- manga-ocr
- pymongo
- pandas
- onnxruntime
- openai==1.35.9
# - httpcore==0.9.1
# - httpx==0.13.3
# - googletrans==4.0.0rc1

- python==3.11
- pytorch==2.2.2
- torchvision==0.17.2
- torchaudio==2.2.2
- pytorch-cuda=12.1
- numpy<2
Comment on lines +7 to +12
🛠️ Refactor suggestion

Review the removal of previously included dependencies

The dependency list has been significantly reduced. While this aligns with the switch to micromamba for a leaner environment, please verify that no essential dependencies are missing.

Consider adding these common essential packages if they're required:

 - python==3.11
 - pytorch==2.2.2
 - torchvision==0.17.2
 - torchaudio==2.2.2
 - pytorch-cuda=12.1
 - numpy<2
+- pip
+- setuptools
+- wheel

Committable suggestion skipped: line range outside the PR's diff.

65 changes: 46 additions & 19 deletions docker_prepare.py
@@ -1,28 +1,55 @@
import asyncio

from argparse import ArgumentParser
from manga_translator.utils import ModelWrapper
from manga_translator.detection import DETECTORS
from manga_translator.ocr import OCRS
from manga_translator.inpainting import INPAINTERS


arg_parser = ArgumentParser()
arg_parser.add_argument("--models", default="")
arg_parser.add_argument("--continue-on-error", action="store_true")
Comment on lines +9 to +11
🛠️ Refactor suggestion

Enhance argument parser implementation

The argument parser needs improvements in documentation and validation:

  1. The --continue-on-error flag is defined but never used in the code
  2. Both arguments lack help text explaining their purpose and format
  3. The --models argument should validate the input format

Apply this diff to improve the implementation:

 arg_parser = ArgumentParser()
-arg_parser.add_argument("--models", default="")
-arg_parser.add_argument("--continue-on-error", action="store_true")
+arg_parser.add_argument(
+    "--models",
+    default="",
+    help="Comma-separated list of models to download (format: 'detector.name,ocr.name,inpaint.name')"
+)
+arg_parser.add_argument(
+    "--continue-on-error",
+    action="store_true",
+    help="Continue downloading other models if one fails"
+)
📝 Committable suggestion


Suggested change
arg_parser = ArgumentParser()
arg_parser.add_argument("--models", default="")
arg_parser.add_argument("--continue-on-error", action="store_true")
arg_parser = ArgumentParser()
arg_parser.add_argument(
"--models",
default="",
help="Comma-separated list of models to download (format: 'detector.name,ocr.name,inpaint.name')"
)
arg_parser.add_argument(
"--continue-on-error",
action="store_true",
help="Continue downloading other models if one fails"
)
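
The `--models` value is consumed as a comma-separated list. A minimal sketch of the parsing, mirroring the `set(filter(None, ...))` expression used in `main()` (the `parse_models` helper name is illustrative, not part of the PR):

```python
from argparse import ArgumentParser


def parse_models(spec: str) -> set[str]:
    """Split a comma-separated --models value, dropping empty entries."""
    return set(filter(None, spec.split(",")))


parser = ArgumentParser()
parser.add_argument(
    "--models",
    default="",
    help="Comma-separated model names, e.g. 'ocr.48px,ocr.mocr'",
)
args = parser.parse_args(["--models", "ocr.48px,ocr.mocr,"])
# Trailing or doubled commas yield empty strings, which filter(None, ...)
# discards, so the result is {'ocr.48px', 'ocr.mocr'}.
print(parse_models(args.models))
```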



cli_args = arg_parser.parse_args()


async def download(dict):
for key, value in dict.items():
if issubclass(value, ModelWrapper):
print(' -- Downloading', key)
try:
inst = value()
await inst.download()
except Exception as e:
print('Failed to download', key, value)
print(e)
""" """
for key, value in dict.items():
if issubclass(value, ModelWrapper):
print(" -- Downloading", key)
try:
inst = value()
await inst.download()
except Exception as e:
print("Failed to download", key, value)
print(e)
if not cli_args.continue_on_error:
raise

Comment on lines +16 to +30
🛠️ Refactor suggestion

Avoid using global variables

The function directly accesses the global cli_args variable, which makes it harder to test and reuse. Consider passing the continue_on_error flag as a parameter.

-async def download(dict):
+async def download(dict, *, continue_on_error: bool = False):
     """ """
     for key, value in dict.items():
         if issubclass(value, ModelWrapper):
             print(" -- Downloading", key)
             try:
                 inst = value()
                 await inst.download()
             except Exception as e:
                 print("Failed to download", key, value)
                 print(e)
-                if not cli_args.continue_on_error:
+                if not continue_on_error:
                     raise
📝 Committable suggestion


Suggested change
async def download(dict):
for key, value in dict.items():
if issubclass(value, ModelWrapper):
print(' -- Downloading', key)
try:
inst = value()
await inst.download()
except Exception as e:
print('Failed to download', key, value)
print(e)
""" """
for key, value in dict.items():
if issubclass(value, ModelWrapper):
print(" -- Downloading", key)
try:
inst = value()
await inst.download()
except Exception as e:
print("Failed to download", key, value)
print(e)
if not cli_args.continue_on_error:
raise
async def download(dict, *, continue_on_error: bool = False):
""" """
for key, value in dict.items():
if issubclass(value, ModelWrapper):
print(" -- Downloading", key)
try:
inst = value()
await inst.download()
except Exception as e:
print("Failed to download", key, value)
print(e)
if not continue_on_error:
raise


async def main():
await download(DETECTORS)
await download(OCRS)
await download({
k: v for k, v in INPAINTERS.items()
if k not in ['sd']
})

if __name__ == '__main__':
asyncio.run(main())
models: set[str] = set(filter(None, cli_args.models.split(",")))
# print("parsed.models", models)
await download(
{
k: v
for k, v in DETECTORS.items()
if (not models) or (f"detector.{k}" in models)
}
)
await download(
{k: v for k, v in OCRS.items() if (not models) or (f"ocr.{k}" in models)}
)
await download(
{
k: v
for k, v in INPAINTERS.items()
if (not models) or (f"inpaint.{k}" in models) and (k not in ["sd"])
}
)


if __name__ == "__main__":
asyncio.run(main())
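
One subtlety worth flagging in the inpainter filter above: in Python, `and` binds tighter than `or`, so the condition groups as `(not models) or ((f"inpaint.{k}" in models) and (k not in ["sd"]))`. With an empty `models` set the left operand alone decides the result, and the `sd` exclusion on the right is never consulted — a quick check:

```python
models: set[str] = set()  # no --models argument given
k = "sd"
# `and` binds tighter than `or`: with an empty set, `not models` is True
# and short-circuits the expression, skipping the "sd" exclusion.
included = (not models) or (f"inpaint.{k}" in models) and (k not in ["sd"])
print(included)  # True — "sd" would be selected in the default run
```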
52 changes: 52 additions & 0 deletions requirements-moeflow.txt
@@ -0,0 +1,52 @@
# vim: ft=requirements
# networkx
# torch
# groq
# torchvision
# torch-summary
einops
Comment on lines +1 to +7
💡 Codebase verification

⚠️ Potential issue

Based on the search results, I can now generate the final response:

Uncomment and maintain PyTorch dependencies

The codebase heavily relies on PyTorch and related ML libraries. Evidence found:

  • Multiple model implementations using PyTorch operations (.cuda, .forward(), .backward())
  • Neural network architectures in OCR, inpainting, and detection modules
  • Extensive use of CUDA and device management
  • Model definitions across various components

Key files requiring these dependencies:

  • manga_translator/ocr/model_*.py files
  • manga_translator/inpainting/ modules
  • manga_translator/detection/ modules
  • manga_translator/colorization/ modules
🔗 Analysis chain

Review core ML dependencies configuration

Several critical ML dependencies are commented out (networkx, torch, torchvision). This could lead to installation issues or version mismatches if these are required by other packages.

Let's check which packages might require these dependencies:


Let me try a different search strategy to verify the dependencies:

I'm looking for evidence of ML operations and model definitions that would require these dependencies, as well as any dependency specifications in build files.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for dependencies that might require torch
rg -l "import torch|from torch" --type py

# Check for dependencies that might require torchvision
rg -l "import torchvision|from torchvision" --type py

# Check for dependencies that might require networkx
rg -l "import networkx|from networkx" --type py

Length of output: 630


Script:

#!/bin/bash
# Search for ML-related imports and function calls in Python files
rg -l "\.cuda|\.to\(device\)|\.to\('cuda'\)|model\." --type py

# Search for specific ML operations that typically require torch
rg -l "\.forward\(|\.backward\(|\.train\(|\.eval\(|optimizer\." --type py

# Search for any files that might contain ML model definitions
rg -l "class.*\(.*Model\)|class.*\(.*Net\)" --type py

# Look for setup.py or pyproject.toml to check dependencies
fd "setup.py|pyproject.toml"

Length of output: 4821

scikit-image
opencv-python
pyclipper
shapely
# requests
# cryptography
# freetype-py
# aiohttp
tqdm
# deepl
# ImageHash
# kornia
# backports.cached-property
# huggingface_hub
# transformers
py3langid==0.2.2
# sentencepiece
# editdistance
# numpy==1.26.4
# tensorboardX
# websockets
# protobuf
# ctranslate2
colorama
# openai==1.35.9
# open_clip_torch
# safetensors
pandas
onnxruntime==1.18.1
timm
omegaconf
python-dotenv
# nest-asyncio
# marshmallow
# cython
# aioshutil
# aiofiles
# arabic-reshaper
# pyhyphen
# langcodes
manga-ocr
💡 Codebase verification

🛠️ Refactor suggestion

Pin manga-ocr version and add missing dependencies

The analysis reveals potential compatibility concerns:

  • numpy is pinned to 1.26.4 elsewhere in the repo; that satisfies manga-ocr's numpy<2 constraint, but the two pins should be kept in sync
  • Several required dependencies are missing: fire, fugashi, jaconv, loguru, pyperclip
  • Pillow>=10.0.0 dependency is not specified

Update requirements-moeflow.txt to include:

  • manga-ocr==1.0.0 (latest stable version)
  • Add missing dependencies with compatible versions
  • Ensure numpy version satisfies both project needs and manga-ocr constraint
🔗 Analysis chain

Verify manga-ocr compatibility

The manga-ocr package might have specific version requirements for its dependencies. We should verify compatibility with other packages.

Consider pinning the manga-ocr version to ensure stability.
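
The numpy pin can be sanity-checked against manga-ocr's `numpy<2` constraint without any tooling — a plain tuple comparison is enough for simple numeric versions (for full PEP 440 semantics, the `packaging` library would be the right tool):

```python
# Verify that the pinned numpy version satisfies `numpy<2`.
pinned = tuple(int(part) for part in "1.26.4".split("."))
assert pinned < (2,), "numpy pin violates manga-ocr's numpy<2 constraint"
print("1.26.4 satisfies numpy<2")  # prints: 1.26.4 satisfies numpy<2
```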

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check manga-ocr's dependencies
curl -s https://pypi.org/pypi/manga-ocr/json | jq -r '.info.requires_dist[]'

Length of output: 226


Script:

#!/bin/bash
# Check for potential conflicts in requirements files
echo "=== Checking requirements-moeflow.txt ==="
cat requirements-moeflow.txt | grep -E "numpy|Pillow|torch|transformers|fire|fugashi|jaconv|loguru|pyperclip"

echo -e "\n=== Checking other requirement files ==="
fd "requirements.*\.txt" --exec cat {} \; | grep -E "numpy|Pillow|torch|transformers|fire|fugashi|jaconv|loguru|pyperclip"

Length of output: 678

# langdetect
# pydensecrf@https://github.com/lucasb-eyer/pydensecrf/archive/refs/heads/master.zip
# accelerate
# bitsandbytes