Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix _format_genre and _list2str methods #6

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

JOJ0
Copy link

@JOJ0 JOJ0 commented Jan 25, 2025

Since I patched out _format_tags recently in beets' lastgenre plugin I implement it here.

Also I found that sometimes I get Nonetype genres. I ensured _list2str can handle them.

@JOJ0 JOJ0 changed the title Fix lastgenre format and _list2str methods Fix _format_genre and _list2str methods Jan 25, 2025
Copy link
Owner

@mgoltzsche mgoltzsche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating the PR @JOJ0!
There are some changes needed for this to work, though:

  1. In order to drive the release process, the GH workflow requires the commit messages to follow the conventional commit format, e.g. fix: _format_genre() beets incompatibility. In doubt I can squash-merge the PR and specify the commit message myself but the e2e tests are also failing.
  2. To fix the beets-incompatibility the e2e test runs into, the latest beets version with your changes needs to be installed into the docker container here as follows:
# Install beets patch + yt-dlp upgrade
RUN set -eux; \
	BUILD_DEPS='git'; \
	apk add --update --no-cache $BUILD_DEPS; \
	python3 -m pip install \
		git+https://github.com/beetbox/beets.git@a1c0ebdeef267e227c26f9defc93799c86c8fe54#egg=beets \
		ytmusicapi==1.9.1 \
		yt-dlp==2025.01.26; \
	apk del --purge $BUILD_DEPS
  1. Then, you can see the e2e tests (make clean clean-data beets-container test-e2e) fail with errors like this:
    # ERROR: Expected query 'album:Reggae Jungle Drum and Bass Mix #9 New 2022 Rudy, a message to you' with format '$genre_source | $genre | $genres' to evaluate to 'title | Ragga Drum And Bass | Ragga Drum And Bass, Drum And Bass, Electronic' but was 'title | ragga drum and bass | ragga drum and bass, electronic, drum and bass'
    Apparently that is because in the latest beets version the lastgenre.title_case option is disabled by default and because the order of genres returned by the lastgenre plugin changed.
  2. To fix the title case, specify lastgenre.title_case=true here.
  3. Running the e2e tests again and see them failing with errors like that:
# ERROR: Expected query 'album:Reggae Jungle Drum and Bass Mix #9 New 2022 Rudy, a message to you' with format '$genre_source | $genre | $genres' to evaluate to 'title | Ragga Drum And Bass | Ragga Drum And Bass, Drum And Bass, Electronic' but was 'title | Ragga Drum And Bass | Ragga Drum And Bass, ragga drum and bass, Drum And Bass, Electronic'

Apparently the title case is not applied to the 2nd value within the genres field.

def _list2str(self, list):
return self._separator.join(list)
def _list2str(self, genrelist):
return self._separator.join([str(g) for g in genrelist if g is not None])
Copy link
Owner

@mgoltzsche mgoltzsche Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make the e2e test pass, I think you also need to apply the lastgenre.title_case option here now:

return self._separator.join([self._format_genre(str(g)) for g in genrelist if g is not None])

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what (theoretically) should have changed:

https://github.com/beetbox/beets/pull/4982/files#diff-f85204832b1e3e76cf854983f1f1773be9f7857d94656cbb3a0f2a0ae6431140L102-L111

title_case is not part of it, but there was a lot of shifts of functions. title_case is now happening elsewhere. I'll tell you details where it is later....need to look myself. A lot of things where happening in _resolve_genre, this function does less now....

Copy link
Author

@JOJ0 JOJ0 Feb 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The final filtering and translating from a list of genres to one separator delimited string is now done here: https://github.com/beetbox/beets/pull/4982/files#diff-f85204832b1e3e76cf854983f1f1773be9f7857d94656cbb3a0f2a0ae6431140R167-R177

Should we just try to use that function here? But it also limits the count to what's set in the config. Would it be worth a try?

Copy link
Owner

@mgoltzsche mgoltzsche Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the genre count limit would not be applied with the change I suggested previously.
Given that _to_delimited_genre_string is a private function, how about (re-)implementing the genre limit also within the _list2str function? (I don't really have a strong opinion but the private function may change in the future in a breaking way without being reflected in the beets version, though.)

The title_case option is not applied to the genre list anymore which indicates that the change I suggested is still also needed.
More to the point, while I had to disable the e2e test a while ago because YouTube blocked (and still blocks) traffic from GHA, it fails now as follows on my local machine (make beets-container clean-data test-e2e):

bats -T tests/e2e
1..13
not ok 1 auto-detect song genre on import in 6877ms
# (from function `assertGenre' in file tests/e2e/tests.bats, line 6,
#  in test file tests/e2e/tests.bats, line 23)
#   `assertGenre "$QUERY" 'lastfm | Jazz | Jazz, Blues'' failed
# ytimport: Ignoring 0 known URLs
# ytimport: Downloading 1 song(s) to /data/ytimport
# [youtube] Extracting URL: https://www.youtube.com/watch?v=jft3BVoxqjo
# [youtube] jft3BVoxqjo: Downloading webpage
# [youtube] jft3BVoxqjo: Downloading tv client config
# [youtube] jft3BVoxqjo: Downloading player f3d47b5a
# [youtube] jft3BVoxqjo: Downloading tv player API JSON
# [youtube] jft3BVoxqjo: Downloading ios player API JSON
# [youtube] jft3BVoxqjo: Downloading m3u8 information
# [info] jft3BVoxqjo: Downloading 1 format(s): 251
# [info] Downloading video thumbnail 44 ...
# [info] Writing video thumbnail 44 to: /data/ytimport/singles/Tuba_Skinny_-_Jubilee_Stomp_-_Royal_Street_I [jft3BVoxqjo].webp
# [download] Destination: /data/ytimport/singles/Tuba_Skinny_-_Jubilee_Stomp_-_Royal_Street_I [jft3BVoxqjo].webm
[download] 100% of    2.99MiB in 00:00:00 at 8.42MiB/s:00
# [ExtractAudio] Destination: /data/ytimport/singles/Tuba_Skinny_-_Jubilee_Stomp_-_Royal_Street_I [jft3BVoxqjo].opus
# Deleting original file /data/ytimport/singles/Tuba_Skinny_-_Jubilee_Stomp_-_Royal_Street_I [jft3BVoxqjo].webm (pass -k to keep)
# [MetadataFromField] Parsed meta_comment from '%(webpage_url)s %(title)s': 'https://www.youtube.com/watch?v=jft3BVoxqjo Tuba Skinny - Jubilee Stomp - Royal Street I'
# [MetadataFromField] Parsed artist from '%(uploader)s': 'TheWsm'
# [MetadataFromField] Parsed artist from '%(title)s': 'Tuba Skinny'
# [MetadataFromField] Parsed title from '%(title)s': 'Jubilee Stomp - Royal Street I'
# [MetadataFromField] Parsed album_artist from '%(artist)s': 'Tuba Skinny'
# [MetadataFromField] Parsed meta_yt_id from '%(id)s': 'jft3BVoxqjo'
# [MetadataFromField] Parsed meta_yt_source from '%(webpage_url_domain)s': 'youtube.com'
# [MetadataFromField] Parsed meta_yt_source from '%(webpage_url_domain)s': 'youtube'
# [MetadataFromField] Parsed meta_yt_likes from '%(like_count)s': '184604'
# [MetadataFromField] Parsed meta_yt_dislikes from '%(dislike_count)s': 'NA'
# [MetadataFromField] Parsed meta_yt_views from '%(view_count)s': '8774783'
# [MetadataFromField] Parsed meta_yt_rating from '%(average_rating)s': 'NA'
# [MetadataFromField] Parsed meta_publish_date from '%(release_date>%Y-%m-%d,upload_date>%Y-%m-%d)s': '2018-04-25'
# [Metadata] Adding metadata to "/data/ytimport/singles/Tuba_Skinny_-_Jubilee_Stomp_-_Royal_Street_I [jft3BVoxqjo].opus"
# [ThumbnailsConvertor] Converting thumbnail "/data/ytimport/singles/Tuba_Skinny_-_Jubilee_Stomp_-_Royal_Street_I [jft3BVoxqjo].webp" to png
# [EmbedThumbnail] mutagen: Adding thumbnail to "/data/ytimport/singles/Tuba_Skinny_-_Jubilee_Stomp_-_Royal_Street_I [jft3BVoxqjo].opus"
# [XAttrMetadata] Writing metadata to file's xattrs
# [Rename] Renaming '/data/ytimport/singles/Tuba_Skinny_-_Jubilee_Stomp_-_Royal_Street_I [jft3BVoxqjo].opus' to '/data/ytimport/singles/Tuba Skinny - Jubilee Stomp - Royal Street I [jft3BVoxqjo].opus'
# [SplitChaptersToTracks] Splitting /data/ytimport/singles/Tuba Skinny - Jubilee Stomp - Royal Street I [jft3BVoxqjo].opus
# ytimport: Importing downloaded songs into beets library
# No files imported from /data/ytimport/albums
#
# /data/ytimport/singles/Tuba Skinny - Jubilee Stomp - Royal Street I [jft3BVoxqjo].opus
# Importing as-is.
# ERROR: Expected query 'Jubilee Stomp' with format '$genre_source | $genre | $genres' to evaluate to 'lastfm | Jazz | Jazz, Blues' but was 'lastfm | Jazz | Jazz, Blues, jazz'

Apparently the duplication happens because title_case is not applied consistently (to the parent genres of the primary genre that the autogenre plugin appends to the genre list).

Copy link
Author

@JOJ0 JOJ0 Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The title_case option is not applied to the genre list anymore which indicates that the change I suggested is still also needed. More to the point, while I had to disable the e2e test a while ago because YouTube blocked (and still blocks) traffic from GHA, it fails now as follows on my local machine (make beets-container clean-data test-e2e):

Yes I ran them locally yesterday already. They fail like so:

# ERROR: Expected query 'Jubilee Stomp' with format '$genre_source | $genre | $genres' to evaluate to 'lastfm | Jazz | Jazz, Blues' but was 'lastfm | Jazz | Jazz, Blues, jazz'
not ok 2 auto-detect album genre on import in 31433ms

Also I saw errors like this:

# WARNING: Writing cache to '/.cache/yt-dlp/youtube-nsig/f3d47b5a.json' failed: Traceback (most recent call last):
#   File "/usr/local/lib/python3.12/site-packages/yt_dlp/cache.py", line 41, in store
#     os.makedirs(os.path.dirname(fn), exist_ok=True)
#   File "<frozen os>", line 215, in makedirs
#   File "<frozen os>", line 215, in makedirs
#   File "<frozen os>", line 225, in makedirs
# PermissionError: [Errno 13] Permission denied: '/.cache'

Today, as an experiment I tried this:

    def _list2str(self, genrelist):
        return self._lastgenre._to_delimited_genre_string(genrelist)

which changes the situation to:

# /data/ytimport/singles/Tuba Skinny - Jubilee Stomp - Royal Street I [jft3BVoxqjo].opus
# Importing as-is.
# ERROR: Expected query 'Jubilee Stomp' with format '$genre_source | $genre | $genres' to evaluate to 'lastfm | Jazz | Jazz, Blues' but was 'lastfm | Jazz | Jazz, Blues, Jazz'


# ERROR: Expected query 'Miles Davis Autumn Leaves Walkin' with format '$genre_source | $genre | $genres' to evaluate to 'lastfm | Jazz | Jazz, Blues' but was 'lastfm | Jazz | Jazz, Blues, Jazz'

Reimplementing what _to_delimited_string does here is a good idea yes, I agree, but it seems like this alone wouldn't solve the problem. The deduplication still does not work. Maybe you are appending something later?

I think it's better you take it from here since I don't really understand what the actual outcome should be. I'm sure it's much easier for you. I'm happy to assist with answering what has changed. I think you should be able to push to my branch right?

In case you want to apply another round of deduplication somewhere in your code, lastgenre now uses this function:

from beets.util import unique_list

Copy link
Author

@JOJ0 JOJ0 Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a last step before I suggest you take over, I reimplemented count-applying and title_case in _list2str - see my last commit. While doing that I realised that there is another problem. title_case is not applied since the configuration could not be fetched in _format_tags already, which these debug prints proof:

# /data/ytimport/singles/Tuba Skinny - Jubilee Stomp - Royal Street I [jft3BVoxqjo].opus
# Importing as-is.
# THIS IS WHAT title_case OPTION IS: None
# THIS IS WHAT title_case OPTION IS: None
# THIS IS WHAT title_case OPTION IS: None
# THIS IS WHAT title_case OPTION IS: None
# ERROR: Expected query 'Jubilee Stomp' with format '$genre_source | $genre | $genres' to evaluate to 'lastfm | Jazz | Jazz, Blues' but was 'lastfm | Jazz | Jazz, Blues, jazz'

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for your contribution! I'll look into it when I have time.

@mgoltzsche
Copy link
Owner

Oh, oups, I realize I was running the e2e tests with an old ./data directory. Now that I deleted it (make clean-data), I cannot download the test data anymore from YouTube because apparently it blocks unauthenticated media access now 😭.

@mgoltzsche
Copy link
Owner

mgoltzsche commented Jan 29, 2025

Upgrading yt-dlp and ytmusicapi did the trick - I updated my previous review comment correspondingly.

- Reduce list to configured count
- Use the local _format_genre method to apply title_case
- Join to delim-separated string
- And keep "the None check"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants