Skip to content
This repository has been archived by the owner on Dec 22, 2022. It is now read-only.
/ libgenparser Public archive

An easy and advanced libgen.rs scraper. Can search books, filter results, download books etc. Also supports asyncio.

Notifications You must be signed in to change notification settings

BeastImran/libgenparser

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LIBGEN-PARSER

Lets you scrape libgen website easily. Uses libgen.rs as source.

Quick Links

  1. Installation

  2. Requirements

    . Dependencies

    . Missing Module issues

  3. Usage

    . Initial steps

    . Change cache length

    . Feature usage

    . Async code

  4. Supported Features

  5. Examples

    . Initial steps

    . Search examples

    . Download examples

  6. Offline build

  7. Final words

INSTALLATION

Use the below command to install and start using.

$ pip install libgenparser

REQUIREMENTS

Project was built on Python3.9. Pip must be up-to-date.

To update pip, do as follows:

$ python -m pip install -U pip

Dependencies

This package has following dependencies. All of them, probably will be installed automatically.

Missing modules issues?

If not installed automatically, use the below respective commands to install missing modules.

  • All of them at once:

    $ pip install requests beautifulsoup4 lxml async-cache
  • Install individually

    • Install requests

      $ pip install requests
    • Install beautifulsoup4

      $ pip install beautifulsoup4
    • Install lxml parser for html parsing by beautifulsoup4

      $ pip install lxml
    • Install async-cache only if you plan to use async version.

      $ pip install async-cache

USAGE

Start using by importing LibgenParser as follows.

from libgenparser.parser import LibgenParser

libgen = LibgenParser()
libgen.search_title("Clean python")

LibgenParser class contains all required methods. All methods except LibgenParser.resolve_download_link() and LibgenParser.download(), returns parsed list of dictionaries on success else they return None. All methods are cached using functools.lru_cache() for faster results on repeated query. Async methods are available in __future__ package and are cached using async-cache module's cache.AsyncLRU which works similar to in-built functools.lru_cache().

By default, cache_length (amount of objects to hold in memory) is 1000. This might be an issue of out of memory if you are running on very low memory machine. Change it as required by assigning a value (as length) to custom_cache_length in LibgenParser class as shown bellow. Setting it to 0 will store no cache. Number less than 0 will be ignored and cache_length will be set to default value.

Change cache length

from libgenparser.parser import LibgenParser

libgen = LibgenParser(custom_cache_length=100)
libgen.search_title("Clean python")

Feature Usage

  • Pass string of title to libgen.search_title() to search in title field.

  • Pass string of author's name tolibgen.search_author() to search in author field.

  • Provide string value of to libgen.search_year() to search in year field.

  • Pass string of MD5 identifier of ebook to libgen.search_md5() to search in MD5 field.

  • Pass string of publisher's name to libgen.search_publisher() to search in publisher field.

  • Pass string of ISBN identifier of ebook to libgen.search_isbn() to search in ISBN field.

  • Pass string of file extension of ebook to libgen.searcg_extension() to search in file extension field.

  • Pass string of language to libgen.search_language() to search in language field.

  • Pass string of tag to libgen.search_tag() to search in tag field.

  • Pass string of MD5 identifier of ebook to libgen.resolve_download_link() to get a direct download link. Direct download link is scraped from library.lol/main/{MD5} formatted url.

  • Pass string of MD5 identifier of ebook and path to download file to libgen.download() to download the file to provided path.

Async (asyncio)

This module supports asynchronous (asyncio) code under libgenparser.__future__ package.

from libgenparser.__future__.parser import LibgenParser # LibgenParser from __future__ package contains async versions of all methods.

Supported features

  • Download file using md5 identifier.

    Download file using md5 identifier of that specific book by using parser.download() method. It takes two arguments, md5 string and path of file to store. See this example.

  • Generate download link.

    Get direct download link of a specific book by passing the MD5 identifier of that book to parser.resolve_download_link() which returns a direct url of book to download. MD5 identifiers can be obtained on use of search method.

  • Search for following fields

    Can query libgen.rs to search in following fields by using respective method shown below.

    • Title (default)
    • Tag
    • Author name
    • Year
    • MD5
    • ISBN
    • Publisher
    • Language
    • File extension

Examples

Initial steps to get started

For synchronous code:

from libgenparser.parser import LibgenParser
libgen = LibgenParser() # LibgenParser(custom_cache_length=<NUMBER>), NUMBER=0 no cache

For asynchronous code:

from libgenparser.__future__.parser import LibgenParser
libgen = LibgenParser() # LibgenParser(custom_cache_length=<NUMBER>), NUMBER=0 no cache

Any search result will be a list of dictionaries with following fields holding respective data. Some of these fields can be empty on site, which will be specified by None which means the data was not found for that specific book. Guaranteed fields are noted in comments.

column_names = [
    'Thumb', # holds book's thumbnail url.
    'Download_link', # holds book's download site's link. Not to be confused with direct download. Guaranteed Field.
    'MD5', # holds unique MD5 identifier of book. Guaranteed Field.
    'Title', # the title of book. Guaranteed Field.
    'Author', # the author of book
    'Year', # the year of publish
    'Language', # language of book
    'Pages', # number of pages in book
    'ID', # id of book on libgen.rs site. Guaranteed Field.
    'Size', # size of book (kb, mb, KB, MB etc)
    'Extension', # the format of book (.pdf, .djvu, .epub, .chm etc)
]

Search examples

To search a title:

search_result = libgen.search_title("Python")[:5]
output {'Thumb': 'http://libgen.rs/covers/0/aee7239ffcf7871e1d6687ced1215e22-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=AEE7239FFCF7871E1D6687CED1215E22', 'MD5': 'AEE7239FFCF7871E1D6687CED1215E22', 'Title': 'Exploring Python', 'Author': 'Markus Nix', 'Year': '2005', 'Language': 'German', 'Pages': '174', 'ID': '801', 'Size': '2 Mb (1691080)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/1000/8b7f9439ff75aeac89b8748bdbc1e1d3-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=8B7F9439FF75AEAC89B8748BDBC1E1D3', 'MD5': '8B7F9439FF75AEAC89B8748BDBC1E1D3', 'Title': 'Beginning Python', 'Author': 'Peter C. Norton', 'Year': '2005', 'Language': 'English', 'Pages': '679', 'ID': '1464', 'Size': '4 Mb (3670658)', 'Extension': 'pdf'}

{'Thumb': 'http://libgen.rs/covers/1000/8288fdbd7e6d373accbc9d274ff42b29-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=8288FDBD7E6D373ACCBC9D274FF42B29', 'MD5': '8288FDBD7E6D373ACCBC9D274FF42B29', 'Title': 'Python essential reference', 'Author': 'David Beazley', 'Year': '2001', 'Language': 'English', 'Pages': '586', 'ID': '1473', 'Size': '4 Mb (4308155)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/1000/1834e0e4df316dc07956cff0c63bda92-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=1834E0E4DF316DC07956CFF0C63BDA92', 'MD5': '1834E0E4DF316DC07956CFF0C63BDA92', 'Title': 'Foundations of Python network programming', 'Author': 'John Goerzen', 'Year': '2004', 'Language': 'English', 'Pages': '538', 'ID': '1482', 'Size': '3 Mb (3024636)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/1000/31debc732ae60d265ba551a112fbe6bd-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=31DEBC732AE60D265BA551A112FBE6BD', 'MD5': '31DEBC732AE60D265BA551A112FBE6BD', 'Title': 'Making use of Python', 'Author': 'Rashi Gupta', 'Year': '2002', 'Language': 'English', 'Pages': '416', 'ID': '1486', 'Size': '3 Mb (3187656)', 'Extension': 'pdf'}

To search an author:

search_result = libgen.search_author("Markus Nix")[:5]
output {'Thumb': 'http://libgen.rs/covers/0/aee7239ffcf7871e1d6687ced1215e22-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=AEE7239FFCF7871E1D6687CED1215E22', 'MD5': 'AEE7239FFCF7871E1D6687CED1215E22', 'Title': 'Exploring Python', 'Author': 'Markus Nix', 'Year': '2005', 'Language': 'German', 'Pages': '174', 'ID': '801', 'Size': '2 Mb (1691080)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/659000/8f35df56f73b2c9baddf88b57387450c-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=8F35DF56F73B2C9BADDF88B57387450C', 'MD5': '8F35DF56F73B2C9BADDF88B57387450C', 'Title': 'Exploring JavaScript. Von Insidern lernen GERMAN ', 'Author': 'Markus Nix', 'Year': None, 'Language': 'German ', 'Pages': '166', 'ID': '659804', 'Size': '2 Mb (2390174)', 'Extension': 'pdf'}

To search using MD5 identifier:

search_result = libgen.search_md5("AEE7239FFCF7871E1D6687CED1215E22")
output {'Thumb': 'http://libgen.rs/covers/0/aee7239ffcf7871e1d6687ced1215e22-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=AEE7239FFCF7871E1D6687CED1215E22', 'MD5': 'AEE7239FFCF7871E1D6687CED1215E22', 'Title': 'Exploring Python', 'Author': 'Markus Nix', 'Year': '2005', 'Language': 'German', 'Pages': '174', 'ID': '801', 'Size': '2 Mb (1691080)', 'Extension': 'djvu'}

To search using extension:

search_result = libgen.search_extension("chm")[-5:]
output {'Thumb': 'http://libgen.rs/covers/0/4787b628578fa3dc2d29603e369348a8-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=4787B628578FA3DC2D29603E369348A8', 'MD5': '4787B628578FA3DC2D29603E369348A8', 'Title': 'The God Delusion', 'Author': 'Richard Dawkins', 'Year': '2006', 'Language': 'English', 'Pages': None, 'ID': '46', 'Size': '343 Kb (351078)', 'Extension': 'chm'}

{'Thumb': 'http://libgen.rs/covers/0/f5e45a4a88744c863385fdf19c9f40ce-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=F5E45A4A88744C863385FDF19C9F40CE', 'MD5': 'F5E45A4A88744C863385FDF19C9F40CE', 'Title': 'How the Mind Works', 'Author': 'Steven Pinker', 'Year': '1999', 'Language': 'English', 'Pages': None, 'ID': '101', 'Size': '3 Mb (3119362)', 'Extension': 'chm'}

{'Thumb': 'http://libgen.rs/covers/0/5fb1eff06867dd2b197943b46b023f0d-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=5FB1EFF06867DD2B197943B46B023F0D', 'MD5': '5FB1EFF06867DD2B197943B46B023F0D', 'Title': 'The blank slate: the modern denial of human nature', 'Author': 'Steven Pinker', 'Year': '2002', 'Language': 'English', 'Pages': None, 'ID': '102', 'Size': '2 Mb (2291483)', 'Extension': 'chm'}

{'Thumb': 'http://libgen.rs/covers/000/bb5a6040d0465fffc8891ccb73027f16-g.jpg', 'Download_link': 'http://libgen.rs/get?&md5=BB5A6040D0465FFFC8891CCB73027F16', 'MD5': 'BB5A6040D0465FFFC8891CCB73027F16', 'Title': '101 ключевая идея: Эволюция: Происхождение жизни. Равновесие Харди-Вайнберга', 'Author': 'Мортон Дженкинс', 'Year': '2001', 'Language': 'Russian', 'Pages': None, 'ID': '194', 'Size': '212 Kb (217222)', 'Extension': 'chm'}

{'Thumb': 'http://libgen.rs/covers/0/67d566396616fd0982d659b6cfc1b664-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=67D566396616FD0982D659B6CFC1B664', 'MD5': '67D566396616FD0982D659B6CFC1B664', 'Title': 'Наглядная биохимия', 'Author': 'Я.Кольман, К.-Г.Рем. Перевод с немецкого Л.В.Козлова, Е.С.Левиной и П.Д.Решетова Под редакцией П.Д.Решетова и Т.И.Соркиной.', 'Year': '2000', 'Language': 'Russian', 'Pages': '469', 'ID': '222', 'Size': '16 Mb (17291435)', 'Extension': 'chm'}

To search using ISBN of book:

search_result = libgen.search_isbn("9783935042697")
output {'Thumb': 'http://libgen.rs/covers/2945000/4efa9a991b645f75be61871447c559ca-g.jpg', 'Download_link': 'http://libgen.rs/get?&md5=4EFA9A991B645F75BE61871447C559CA', 'MD5': '4EFA9A991B645F75BE61871447C559CA', 'Title': 'Python Basics: A Practical Introduction to Python 3', 'Author': 'David Amos', 'Year': '2021', 'Language': 'English', 'Pages': None, 'ID': '2945746', 'Size': '6 Mb (6774691)', 'Extension': 'pdf'}

To search using langauge of book:

search_result = libgen.search_language("english")
output {'Thumb': 'http://libgen.rs/covers/0/7b2a4d53fde834e801c26a2bab7e0240.jpg', 'Download_link': 'http://libgen.rs/get?&md5=7B2A4D53FDE834E801C26A2BAB7E0240', 'MD5': '7B2A4D53FDE834E801C26A2BAB7E0240', 'Title': 'Handbook of Clinical Drug Data', 'Author': 'Philip Anderson', 'Year': '2001', 'Language': 'English', 'Pages': '1163', 'ID': '1', 'Size': '3 Mb (3627486)', 'Extension': 'pdf'}

{'Thumb': 'http://libgen.rs/covers/0/048ea0496db0444f873139cd705a07af-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=048EA0496DB0444F873139CD705A07AF', 'MD5': '048EA0496DB0444F873139CD705A07AF', 'Title': 'Handbook of Herbs and Spices', 'Author': 'K V Peter', 'Year': '2001', 'Language': 'English', 'Pages': '332', 'ID': '2', 'Size': '1 Mb (1552793)', 'Extension': 'pdf'}

{'Thumb': 'http://libgen.rs/covers/0/411b9300a2f2094800e0e30d439c30fd-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=411B9300A2F2094800E0E30D439C30FD', 'MD5': '411B9300A2F2094800E0E30D439C30FD', 'Title': 'Handbook of Herbs and Spices: Volume 2', 'Author': 'Peter', 'Year': '2004', 'Language': 'English', 'Pages': None, 'ID': '3', 'Size': '2 Mb (2298559)', 'Extension': 'pdf'}

{'Thumb': 'http://libgen.rs/covers/0/372e34d136fbf39dce00460d9e8f1f52.jpg', 'Download_link': 'http://libgen.rs/get?&md5=372E34D136FBF39DCE00460D9E8F1F52', 'MD5': '372E34D136FBF39DCE00460D9E8F1F52', 'Title': 'Medical terminology, an illustrated guide', 'Author': 'Barbara Janson Cohen', 'Year': '2004', 'Language': 'English', 'Pages': '744', 'ID': '4', 'Size': '12 Mb (12287946)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/0/c086e2244ad712fe683c37c0e677b79b-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=C086E2244AD712FE683C37C0E677B79B', 'MD5': 'C086E2244AD712FE683C37C0E677B79B', 'Title': "Patterson's Allergic Diseases", 'Author': 'Leslie C. Grammer', 'Year': '2002', 'Language': 'English', 'Pages': '509', 'ID': '6', 'Size': '13 Mb (13319483)', 'Extension': 'pdf'}

To search using publisher of book:

search_result = libgen.search_publisher("McGraw-Hill")
output {'Thumb': 'http://libgen.rs/covers/0/7b2a4d53fde834e801c26a2bab7e0240.jpg', 'Download_link': 'http://libgen.rs/get?&md5=7B2A4D53FDE834E801C26A2BAB7E0240', 'MD5': '7B2A4D53FDE834E801C26A2BAB7E0240', 'Title': 'Handbook of Clinical Drug Data', 'Author': 'Philip Anderson', 'Year': '2001', 'Language': 'English', 'Pages': '1163', 'ID': '1', 'Size': '3 Mb (3627486)', 'Extension': 'pdf'}

{'Thumb': 'http://libgen.rs/covers/0/5f94ea5f1ebcbf82d9e46ae72bcfaa08-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=5F94EA5F1EBCBF82D9E46AE72BCFAA08', 'MD5': '5F94EA5F1EBCBF82D9E46AE72BCFAA08', 'Title': "Schaum's Immunology", 'Author': 'George Pinchuk', 'Year': '2001', 'Language': 'English', 'Pages': '329', 'ID': '8', 'Size': '4 Mb (4077170)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/0/cff0dece0fbc9780f3c13daf1936dab7-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=CFF0DECE0FBC9780F3C13DAF1936DAB7', 'MD5': 'CFF0DECE0FBC9780F3C13DAF1936DAB7', 'Title': 'Microbiology demystified', 'Author': 'Tom Betsy', 'Year': '2005', 'Language': 'English', 'Pages': '309', 'ID': '27', 'Size': '3 Mb (2958038)', 'Extension': 'pdf'}

{'Thumb': 'http://libgen.rs/covers/0/c924f24882bb6b47d55131f6a373f5c7-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=C924F24882BB6B47D55131F6A373F5C7', 'MD5': 'C924F24882BB6B47D55131F6A373F5C7', 'Title': "Schaum's outline of theory and problems of biochemistry", 'Author': 'Philip Kuchel', 'Year': '1998', 'Language': 'English', 'Pages': '572', 'ID': '73', 'Size': '6 Mb (6639413)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/0/a50f2e8f2963888a976899e2c4675d70.jpg', 'Download_link': 'http://libgen.rs/get?&md5=A50F2E8F2963888A976899E2C4675D70', 'MD5': 'A50F2E8F2963888A976899E2C4675D70', 'Title': 'Physiology demystified', 'Author': 'Layman D.P.', 'Year': '2004', 'Language': 'English', 'Pages': '432', 'ID': '76', 'Size': '9 Mb (9364185)', 'Extension': 'pdf'}

To search using tags of book:

search_result = libgen.search_tag("programming")
output {'Thumb': 'http://libgen.rs/covers/0/6066c1b029ada93730b364c1592fb015-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=6066C1B029ADA93730B364C1592FB015', 'MD5': '6066C1B029ADA93730B364C1592FB015', 'Title': 'Combinatorics on Traces', 'Author': 'Volker Diekert (auth.)', 'Year': '1990', 'Language': 'English', 'Pages': None, 'ID': '751', 'Size': '1 Mb (1270783)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/0/77ea2db71148f51d42628e66cbf612c9-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=77EA2DB71148F51D42628E66CBF612C9', 'MD5': '77EA2DB71148F51D42628E66CBF612C9', 'Title': 'Efficient Graph Rewriting and Its Implementation', 'Author': 'Heiko Dörr (eds.)', 'Year': '1995', 'Language': 'English', 'Pages': None, 'ID': '752', 'Size': '2 Mb (2228318)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/0/8ae5c7bf5dee99bfa7d339b35456e892-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=8AE5C7BF5DEE99BFA7D339B35456E892', 'MD5': '8AE5C7BF5DEE99BFA7D339B35456E892', 'Title': 'Advanced Functional Programming: Second International School Olympia, WA, USA, August 26–30, 1996 Tutorial Text', 'Author': 'Sigbjorn Finne', 'Year': '1996', 'Language': 'English', 'Pages': '244', 'ID': '785', 'Size': '2 Mb (1979590)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/1000/7bec16add82336b72fbd6db17811940a-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=7BEC16ADD82336B72FBD6DB17811940A', 'MD5': '7BEC16ADD82336B72FBD6DB17811940A', 'Title': 'Filtering, Segmentation and Depth', 'Author': 'Mark Nitzberg', 'Year': '1993', 'Language': 'English', 'Pages': None, 'ID': '1342', 'Size': '1 Mb (1418351)', 'Extension': 'djvu'}

{'Thumb': 'http://libgen.rs/covers/1000/064e4f885885efcef70ee17fa51a3b11-d.jpg', 'Download_link': 'http://libgen.rs/get?&md5=064E4F885885EFCEF70EE17FA51A3B11', 'MD5': '064E4F885885EFCEF70EE17FA51A3B11', 'Title': "Implementation of Functional Languages: 8th International Workshop, IFL'96 Bad Godesberg, Germany, September 16–18, 1996 Selected Papers", 'Author': 'Lee Braine', 'Year': '1997', 'Language': 'English', 'Pages': None, 'ID': '1502', 'Size': '3 Mb (2673640)', 'Extension': 'djvu'}

Download stuff

To get direct download link of book, use its MD5 specifier as follows:

download_url = libgen.resolve_download_link(md5="6066C1B029ADA93730B364C1592FB015")
output https://cloudflare-ipfs.com/ipfs/bafykbzaceaovkuyhyugtrvmngqda4zfbxsi4hg7bz6va3odthmdoifaf23glk?filename=%28Lecture%20Notes%20in%20Computer%20Science%20454%29%20Volker%20Diekert%20%28auth.%29%20-%20Combinatorics%20on%20Traces-Springer-Verlag%20Berlin%20Heidelberg%20%281990%29.djvu

To download the book, use its MD5 identifier as follows:

# path can be an absolute or full path.
# Needs to be a valid path and should end with a valid file name with its respective extension.
# Name of file can be determined by you, extension must be taken from the search results.

libgen.download(md5="6066C1B029ADA93730B364C1592FB015", path="./books/6066C1B029ADA93730B364C1592FB015.djvu")
output

Offline build

To do offline build, git clone this repository and use the provided setup.py file to build the module. setuptools module is required and must be up-to-date.

Git clone:

You should now end up with a folder named libgenparser with all contents in this repository.

$ git clone https://github.com/BeastImran/libgenparser.git

Build:

Navigate into libgenparser directory and use the following snippet to build. After a dump of lots of text, you should end up with three directories, bdist directory should have the .whl file with specific version name to do an offline install.

$ python setup.py sdist bdist_wheel

Offline install:

Use the .whl wheel file under bdist directory as shown below to do an offline install. Virtual environment recommended.

$ pip install -i FILEWITH.whl

Final words

Hope this project made your life a little easier. You got any contribution, suggestion, involvement in project, anything else? ping me on telegram.

About

An easy and advanced libgen.rs scraper. Can search books, filter results, download books etc. Also supports asyncio.

Resources

Stars

Watchers

Forks

Languages