- Fix admins' username and status not being parsed correctly in watchlists and users tags
- Fix issue #6
- Users with non-alphanumeric characters in their name are now escaped in URLs
- From suggestion in issue #5
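
The escaping mentioned above can be illustrated with a minimal sketch. The `user_path` helper and the `/user/<name>/` layout are assumptions made for this example and are not taken from the library's code.

```python
from urllib.parse import quote


def user_path(username: str) -> str:
    """Build a relative user URL, percent-encoding unsafe characters (hypothetical helper)."""
    # safe="" also encodes "/", so a name can never add extra path segments
    return f"/user/{quote(username, safe='')}/"


print(user_path("some[user]"))
```

Letters, digits, and the characters `_.-~` are left untouched by `quote`, so ordinary usernames keep their original form.
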
- Fix admins' username and status not being parsed correctly
- Fix issue #6
- Fix ` being removed from usernames
- Fix incorrect user icon URLs when converting BBCode to HTML
- Use pytest ^7.2.0
- Fix CVE-2022-42969 issue
- Submission footers
  - Submission footers are now separated from the submission description and stored in the `Submission.footer` field
  - The BBCode of the footer can be accessed with the `Submission.footer_bbcode` property
- Generate user icon URLs
  - New `generate_user_icon_url()` method added to `UserPartial` and `User` to create the URL for the current user icon
- BBCode to HTML conversion
  - Work-in-progress version of a BBCode converter based on the bbcode library
  - Converter function is located in the `parse` submodule: `faapi.parse.bbcode_to_html()`
  - The majority of HTML fields (submission descriptions, journal contents, comments, etc.) can be converted back and forth between HTML and BBCode without losing information
  - If a submission contains incorrect or very unusual BBCode tags or text, the BBCode to HTML conversion may create artifacts and tags that did not exist in the original content
- Added `Journal.header_bbcode` and `Journal.footer_bbcode` properties to convert `Journal.header` and `Journal.footer` to BBCode
- Return `None` instead of 0 (or `""` for favorites) when reaching the last page with `FAAPI.gallery()`, `FAAPI.scraps()`, `FAAPI.journals()`, `FAAPI.favorites()`, `FAAPI.watchlist_by()`, and `FAAPI.watchlist_to()`
- Added `__hash__` method to `User`, `UserPartial`, `Submission`, `SubmissionPartial`, `Journal`, `JournalPartial`, and `Comment`; the hash value is calculated using the same fields used for equality comparisons
- Improved cleanup of HTML fields by using htmlmin
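
Hashing on the same fields used for equality is what lets equal objects collapse in sets and dictionary keys. The sketch below shows the general pattern; the `Entry` class and its fields are hypothetical stand-ins, not the library's classes.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Entry:
    """Hypothetical stand-in for a parsed object identified by its ID."""
    id: int
    title: str = ""

    def __eq__(self, other: object) -> bool:
        # Equality only looks at the key field
        if isinstance(other, Entry):
            return self.id == other.id
        return NotImplemented

    def __hash__(self) -> int:
        # Hash the same field used by __eq__, so equal objects share a hash
        return hash(self.id)


entries = [Entry(1, "first"), Entry(1, "first, fetched again"), Entry(2, "second")]
unique = set(entries)  # entries with the same ID collapse into one element
print(len(unique))
```
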
- Fur Affinity URLs are now properly converted to relative `[url=<path>]` tags in BBCode
- Unknown tags are converted to `[tag=<name>.<classes>]` in BBCode
- Added `CookieDict(TypedDict)` notation for cookies dictionary (alternative to `CookieJar`) to provide intellisense and type checking information
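
A `TypedDict` gives editors and type checkers the expected keys of a plain dictionary. The sketch below assumes each cookie is a name/value pair; the exact keys of the library's `CookieDict` are an assumption here, and the values are placeholders.

```python
from typing import TypedDict


class CookieDict(TypedDict):
    """Assumed cookie shape for this sketch: one name/value pair per cookie."""
    name: str
    value: str


# Placeholder values; a type checker flags missing or misspelled keys here
cookies: list[CookieDict] = [
    {"name": "a", "value": "<cookie a value>"},
    {"name": "b", "value": "<cookie b value>"},
]

print([cookie["name"] for cookie in cookies])
```

At runtime these are ordinary dictionaries, so the annotation adds no overhead and no validation.
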
- Fix comments being considered equal even if they had different parents but the same ID
- Fix line break tags (`<br/>`) not always being converted to newlines when converting to BBCode
- Fix errors when converting nav links (e.g. `[2,1,3]`) to BBCode
- Fix incorrect detection of last page in `FAAPI.watchlist_by()` and `FAAPI.watchlist_to()`
- Fix errors when converting special characters (e.g. `&`)
- Fix trailing spaces around newlines remaining after converting to BBCode
- Fix horizontal lines not being correctly converted from BBCode if the dashes (`-----` or longer) were not surrounded by newlines
- Added htmlmin ^0.1.12
- Added bbcode ^1.1.0
- Improved HTML extraction for specific tags to avoid encoding issues
- HTML fields are cleaned up (i.e., removed newlines, carriage returns, and extra spaces)
- None of the parsed pages use tags with pre white space rendering, so no information is lost
- Improvements to BBCode conversion
- Do not quote URLs when converting to BBCode
- Support nested quote blocks
- Support non-specific tags (e.g. `div.submission-footer`) and convert them to `[tag.<tag name>.<tag class>][/tag.<tag.name>]`
- Fix incorrect encoding of special characters (`<`, `>`, etc.) in HTML fields
  - Was caused by the previous method of extracting the inner HTML of a tag
- Fix URLs automatically shortened by Fur Affinity being converted to BBCode with the wrong text content
- Fix HTML paragraph tags (`<p>`) sometimes appearing in BBCode-converted content
- Fix BBCode conversion of `:usernameicon:` links (i.e., user icon links without the username)
- Submission user folders
  - Submission folders are now parsed and stored in a dedicated `user_folders` field in the `Submission` object
  - Each folder is stored in a `namedtuple` with fields for `name`, `url`, and `group` (if any)
- BBCode conversion
  - New properties have been added to the `User`, `Submission`, `Journal`, `JournalPartial`, and `Comment` objects to provide BBCode versions of HTML fields
  - The generated BBCode tags follow the Fur Affinity standard found on their support page
- Use lxml ^4.9.1
- Fix CVE-2022-2309 issue
- Fix error when parsing journals folders and journal pages caused by date format set to full on Fur Affinity's site settings
- Requests timeout
  - New `FAAPI.timeout: int | None` variable to set request timeout in seconds
  - Timeout is used for both page requests (e.g. submissions) and file requests
- Fix possible parsing error arising from multiple attributes in one tag
- Frontpage
  - New `FAAPI.frontpage()` method to get submissions from Fur Affinity's front page
- Sorting of `Journal`, `Submission`, and `User` objects
  - All data objects now support greater than, greater or equal, lower than, and lower or equal operations for easy sorting
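
One common way to provide all four ordering operations is `functools.total_ordering`, sketched below. The `Item` class and the choice to order by ID are assumptions for illustration; the fields the library actually sorts on are not stated here.

```python
from dataclasses import dataclass
from functools import total_ordering


@total_ordering
@dataclass(eq=False)
class Item:
    """Hypothetical data object ordered by its numeric ID."""
    id: int

    def __eq__(self, other: object) -> bool:
        if isinstance(other, Item):
            return self.id == other.id
        return NotImplemented

    def __lt__(self, other: object) -> bool:
        # total_ordering derives <=, > and >= from __eq__ and __lt__
        if isinstance(other, Item):
            return self.id < other.id
        return NotImplemented


items = [Item(3), Item(1), Item(2)]
print([item.id for item in sorted(items)])
```
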
- Fix equality comparison between `Journal` and `JournalPartial`
- Fix parsing of usernames from user pages returning the title instead
- Caused by a change in Fur Affinity's DOM
- Journal headers and footers
  - The `Journal` class now contains header and footer fields which are parsed from journal pages (`FAAPI.journal`)
- Submission favorite status and link
  - The `Submission` class now contains a boolean `favorite` field that is set to `True` if the submission is a favorite, and a `favorite_toggle_link` containing the link to toggle the favorite status (`/fav/` or `/unfav/`)
- User watch and block statuses and links
  - The `User` class now contains boolean `watched` and `blocked` fields that are set to `True` if the user is watched or blocked respectively, and `watched_toggle_link` and `blocked_toggle_link` fields containing the links to toggle the watched (`/watch/` or `/unwatch/`) and blocked (`/block/` or `/unblock/`) statuses respectively
- Remove `parse.check_page` function which had no usage in the library anymore
- Remove `parse.parse_search_submissions` function and `FAAPI.search` method
  - They will be reintroduced once Fur Affinity allows scraping search pages again
- Fix an incorrect regular expression that parsed mentions in journals, submissions, and profiles which could cause
non-Fur Affinity links to be matched as valid
- Security issue #3
- Fix `FAAPI.journals` not detecting the next page correctly
  - Caused by a change in Fur Affinity's journals page
- Comments! 💬
  - A new `Comment` object is now used to store comments for submissions and journals
  - The comments are organised in a tree structure, and each one contains references to both its parent object (`Submission` or `Journal`) and, if the comment is a reply, to its parent comment too
  - The auxiliary functions `faapi.comment.flatten_comments` and `faapi.comment.sort_comments` make it possible to flatten the comment tree or reorganise it
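
Flattening a comment tree can be pictured as a depth-first walk. The sketch below uses a hypothetical `Node` class with a `replies` list and a hypothetical `flatten` function; it does not reproduce the signature or ordering of the library's own helpers.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical comment node; `replies` holds the child comments."""
    id: int
    replies: list["Node"] = field(default_factory=list)


def flatten(nodes: list[Node]) -> list[Node]:
    """Return every node in depth-first order, parents before their replies."""
    flat: list[Node] = []
    for node in nodes:
        flat.append(node)
        flat.extend(flatten(node.replies))
    return flat


tree = [Node(1, [Node(2, [Node(3)]), Node(4)]), Node(5)]
print([node.id for node in flatten(tree)])
```
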
- Separate `JournalPartial` and `Journal` objects
  - The new `JournalPartial` class takes the place of the previous `Journal` class, and it is now used only to parse journals from a user's journals folder
  - The new `Journal` class contains the same fields as `JournalPartial` with the addition of comments, and it is only used to parse journal pages
- Comparisons
  - All objects can now be used with the comparison (`==`) operator with other objects of the same type or the type of their key property (`id: int` for submissions and journals, and `name_url: str` for users)
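
Comparing an object against the type of its key property could look like the sketch below. The `Record` class is hypothetical and only mirrors the idea of an integer `id` key.

```python
class Record:
    """Hypothetical object whose key property is an integer ID."""

    def __init__(self, id_: int):
        self.id = id_

    def __eq__(self, other: object) -> bool:
        if isinstance(other, Record):
            return self.id == other.id
        if isinstance(other, int):
            # Allow comparing directly against the key's own type
            return self.id == other
        return NotImplemented


record = Record(42)
print(record == 42, record == Record(42), record == 7)
```

Returning `NotImplemented` for unrelated types lets Python fall back to its default comparison instead of raising an error.
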
- The `cookies` argument of `FAAPI` is now mandatory, and an `Unauthorized` exception is raised if `FAAPI` is initialised with an empty cookies list
- The list of `Submission`/`Journal` objects returned by `FAAPI.gallery`, `FAAPI.scraps`, and `FAAPI.journals` now uses a shared `UserPartial` object in the `author` variable (i.e. changing a property of the author in one object of the list will change it for the others as well)
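
The effect of a shared author object is easiest to see with plain Python references. The `Author` and `Entry` classes below are hypothetical stand-ins used only to show the aliasing behaviour.

```python
import copy
from dataclasses import dataclass


@dataclass
class Author:
    """Hypothetical stand-in for the shared author object."""
    name: str


@dataclass
class Entry:
    """Hypothetical stand-in for a submission or journal."""
    id: int
    author: Author


author = Author("artist")
entries = [Entry(1, author), Entry(2, author)]  # both reference the same object

entries[0].author.name = "renamed"
print(entries[1].author.name)  # the change is visible through every entry

# Copy the author first when an independent object is needed
independent = copy.copy(entries[0].author)
independent.name = "someone else"
```
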
- Fix path checking against robots.txt not working correctly with paths missing a leading forward slash
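
Rules in robots.txt are matched against paths that start with a slash, so a path without one can slip past a `Disallow` rule. A minimal sketch of normalising the path before checking, using the standard library's `urllib.robotparser`, is shown below; the `is_allowed` helper and the example rule are assumptions, not the library's code.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
# Example rules only; the real robots.txt is not reproduced here
parser.parse(["User-agent: *", "Disallow: /fav/"])


def is_allowed(path: str, user_agent: str = "*") -> bool:
    """Check a path against the parsed rules, adding a leading slash if missing."""
    normalised = "/" + path.lstrip("/")
    return parser.can_fetch(user_agent, normalised)


print(is_allowed("fav/123"), is_allowed("view/123"))
```
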
- New `Submission.stats` field for submission statistics stored in a named tuple (`views`, `comments` (count), `favorites`)
  - Pull request #2, thanks to @warpKaiba!
- New `Journal.stats` field for journal statistics stored in a named tuple (`comments` (count))
- Rename `UserStats.favs` to `UserStats.favorites`
- Fix links in PyPi metadata pointing to previous hosting at GitLab
- Better and more resilient robots.txt parsing
- Fix spaces around slash (/) not being preserved for submission categories
- Raise `DisabledAccount` for users pending deletion
- Error messages from server are not lowercase
- Fix rare occurrence of error message not being parsed if inside a `section.notice-message`
- New `NotFound` exception inheriting from `ParsingError`
- Removed `FAAPI.submission_exists`, `FAAPI.journal_exists`, and `FAAPI.user_exists` methods
- Improved reliability of error pages' parser
- Custom exceptions inherit from `Exception` instead of `BaseException`
- No changes to code; migrated repository to GitHub and updated README and PyPi metadata
- Allow empty info/contacts when parsing user profiles
- Fix last page check when parsing galleries
- Use BaseException as base class of custom exceptions
- Use requests ^2.27.1
- Allow submission thumbnail tag to be null
- Use `UserStats` class to hold user statistics instead of namedtuple
- Add watched by and watching stats to `UserStats`
- Safer parsing
- Add docstrings
- Handle robots.txt parsing with `urllib.robotparser.RobotFileParser`
- `User-Agent` header is exposed as `FAAPI.user_agent` property
- `FAAPI.last_get` uses UNIX time
- `FAAPI.check_path` doesn't raise an exception by default
- `FAAPI.login_status` does not raise an exception on unauthorized
- Remove crawl delay error
- Improve download of files
- `FAAPI.get_parsed` checks login status and checks the page for errors directly (both can be manually skipped)
- Add `Unauthorized` exception
- `FAAPI.submission` and `FAAPI.submission_file` support setting the chunk size for the binary file download
- The file downloader uses chunk size instead of speed
- When raising `ServerError` and `NoticeMessage`, the actual messages appearing on the page are used as exception arguments
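
Downloading by chunk size means reading a fixed number of bytes at a time until the stream is exhausted. The sketch below shows that loop on an in-memory stream; `read_chunks` is a hypothetical helper and the default size of 8192 bytes is an arbitrary choice for the example.

```python
from io import BytesIO
from typing import BinaryIO, Iterator


def read_chunks(stream: BinaryIO, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield successive blocks of at most chunk_size bytes until the stream ends."""
    while chunk := stream.read(chunk_size):
        yield chunk


source = BytesIO(b"x" * 20_000)  # stands in for a file response body
data = b"".join(read_chunks(source, chunk_size=8192))
print(len(data))
```

A larger chunk size means fewer reads at the cost of more memory held at once.
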
- Add support for `http.cookiejar.CookieJar` (and inheriting classes, like `requests.cookies.RequestsCookieJar`) for cookies
- Add `FAAPI.me()` method to get the logged-in user
- Add `FAAPI.login_status` property to get the current login status
- Use lxml ^4.7.1
- Fix CVE-2021-43818 issue
- Fix rare error when parsing the info section of a userpage
- Fix a key error in `Submission` when assigning the parsed results
- Upgrade to Python 3.9+
- Update type annotations
- `Submission` parses next and previous submission IDs
- `FAAPI.watchlist_by()` and `FAAPI.watchlist_to()` methods support multiple watchlist pages
- Renamed `FAAPI.get_parse` to `get_parsed`
- Removed get prefix from `FAAPI` methods (e.g. `get_submission` to `submission`) and return a list of `UserPartials` objects instead of `Users`
- Added `__all__` declarations to allow importing exceptions and secondary functions from `connection` and `parse`
- `datetime` fields are not serialised on `__iter__` (e.g. when casting a `Submission` object to `dict`)
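
Casting an object to `dict` works when its `__iter__` yields key/value pairs. The sketch below uses a hypothetical `Entry` class to show a `datetime` value passing through unchanged rather than being turned into a string.

```python
from datetime import datetime


class Entry:
    """Hypothetical object that can be cast to a dict through __iter__."""

    def __init__(self, id_: int, date: datetime):
        self.id = id_
        self.date = date

    def __iter__(self):
        # dict() accepts an iterable of (key, value) pairs
        yield "id", self.id
        yield "date", self.date  # left as a datetime, not converted to a string


as_dict = dict(Entry(1, datetime(2022, 1, 1)))
print(type(as_dict["date"]).__name__)
```

Code that needs JSON output would have to convert such values itself, for example with `datetime.isoformat()`.
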