Documentation and *.heic support

bohdanbobrowski · Nov 15, 2024 · 0e7737d
1 parent ff42c65
commit 0e7737d

Showing 4 changed files with 25 additions and 12 deletions.
diff --git a/BACKLOG.md b/BACKLOG.md
@@ -1,17 +1,11 @@
 # List of features and bugfixes I'm considering to add
 
 ## Known bugs
-- [ ] sometimes images are not correctly scrapped and replaced, like in this post: [modernistyczny-poznan.blogspot.com](https://modernistyczny-poznan.blogspot.com/2021/08/wiepofama-10lat.html)
-- [ ] app is not resistant to http errors, which is embarrassing
+..
 
 ## Scraping in general:
 - [ ] stop with keeping content in RAM - save it as ready to use ebook chapters
-- [ ] use sitemaps.xml for scraping!
-- [ ] replace blog url's in article content to actual chapters in ebook
-- [ ] major refactor of Crawler class:
-  - [ ] use data models
-  - [ ] more common methods in crawler class
-  - [ ] expand crawler abstract
+- [ ] replace blog internal url's in article content to actual chapters in ebook
 - [ ] support for blog categories, tags and pages
 - [ ] manually decide which crawler should be used
 - [ ] blog2epub.yaml - this might be too ambitious, but what if user could compose he's/hers own book, with custom
@@ -25,6 +19,5 @@
 
 ## Additional crawlers:
 - [ ] [nrdblog.cmosnet.eu](https://nrdblog.cmosnet.eu/)
-- [ ] [zeissikonveb.de](zeissikonveb.de)
 - [ ] [scigacz.pl](https://www.scigacz.pl/)
 - [ ] [jednoslad.pl](https://www.jednoslad.pl)
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,19 @@
 # ChangeLog
 
+### [v1.5.0](https://github.com/bohdanbobrowski/blog2epub/releases/tag/v1.5.0) - ?
+- [X] integration testing
+- [X] increase unit test coverage
+- [X] use sitemaps.xml for scraping
+- [X] crawlers refactor
+  - [X] use data models
+  - [X] more common methods in crawler class
+  - [X] expand crawler abstract
+- [X] cli interface refactor
+- [X] greek alphabet support
+- [X] image download and attachment bug solved (ex. modernistyczny-poznan.blogspot.com)
+- [X] improved resistance to http errors
+- [X] dedicated crawler class for zeissikonveb.de
+
 ### [v1.4.0](https://github.com/bohdanbobrowski/blog2epub/releases/tag/v1.4.0) - 2024-11-01
 - [X] custom destination folder
 - [X] UI improvements (better scaling, more rely on KivyMD default features)

diff --git a/README.md b/README.md
@@ -149,10 +149,16 @@ Example:
 ### v1.5.0
 - [X] integration testing
 - [X] increase unit test coverage
+- [X] use sitemaps.xml for scraping
 - [X] crawlers refactor
-- [X] add more crawlers
+  - [X] use data models
+  - [X] more common methods in crawler class
+  - [X] expand crawler abstract
 - [X] cli interface refactor
-- [X] greek alphabet support 
+- [X] greek alphabet support
+- [X] image download and attachment bug solved (ex. modernistyczny-poznan.blogspot.com)
+- [X] improved resistance to http errors
+- [X] dedicated crawler class for zeissikonveb.de
 
 
 [&raquo; Complete Change Log here &laquo;](https://github.com/bohdanbobrowski/blog2epub/blob/master/CHANGELOG.md)

diff --git a/blog2epub/common/downloader.py b/blog2epub/common/downloader.py
@@ -155,7 +155,7 @@ def download_image(self, image_obj: ImageModel) -> bool:
         img_hash = self.get_urlhash(image_obj.url)
         img_type = os.path.splitext(image_obj.url)[1].lower()
         img_type = img_type.split("?")[0]
-        if img_type not in [".jpeg", ".jpg", ".png", ".bmp", ".gif", ".webp"]:
+        if img_type not in [".jpeg", ".jpg", ".png", ".bmp", ".gif", ".webp", ".heic"]:
             return False
         original_fn = os.path.join(self.dirs.originals, img_hash + "." + img_type)
         resized_fn = os.path.join(self.dirs.images, img_hash + ".jpg")
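
Review note on the hunk above: as shown, `os.path.splitext` runs on the whole URL before the query string is stripped, so a dot inside the query (for example `?width=1.5`) would be taken as the extension. `img_type` also keeps its leading dot, so `img_hash + "." + img_type` appears to produce a double dot in `original_fn`. A minimal sketch of one way to avoid both, assuming only the standard library; `ALLOWED_EXTENSIONS`, `image_extension` and `original_filename` are hypothetical names for illustration, not part of blog2epub:

```python
import os
from typing import Optional
from urllib.parse import urlsplit

# Whitelist copied from the updated check in download_image (including ".heic").
ALLOWED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".gif", ".webp", ".heic"}


def image_extension(url: str) -> Optional[str]:
    """Hypothetical helper: return the lowercased extension (with its dot)
    taken from the URL path only, or None if it is not whitelisted."""
    # urlsplit separates the path from the query and fragment, so dots in
    # "?width=1.5" can no longer be mistaken for the file extension.
    path = urlsplit(url).path
    ext = os.path.splitext(path)[1].lower()
    return ext if ext in ALLOWED_EXTENSIONS else None


def original_filename(img_hash: str, ext: str) -> str:
    """Hypothetical helper: ext already starts with a dot, so none is added."""
    return img_hash + ext


if __name__ == "__main__":
    ext = image_extension("https://example.com/a/photo.HEIC?width=1.5")
    if ext is not None:
        print(original_filename("abc123", ext))
```

Accepting `.heic` in the whitelist only covers the download step. Whether the later resize to `.jpg` can decode HEIC depends on the image library in use, which is not visible in this hunk.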