DataGorri is an application for extracting data from tables on websites.
Have a look at the quick start guide, DataGorri_Manual_1.1.pdf, or the Wiki page, which cover most topics.
On operating systems such as macOS or Ubuntu, DataGorri can be run by installing Python 3 and running the source code from the console/terminal. The documentation for developers describes how to install Python and the necessary third-party libraries.
An installer is provided for Windows (32-bit and 64-bit). The installation is straightforward and does not need extensive explanation.
Clone the project or download it as a .zip file.
cd {datagorri}
python3 DataGorri.py
Steps to scrape a table:
- Create a page model to define the content you are interested in.
- Collect links of websites that should be scraped.
For some examples, have a look at the samples folder.
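
To make the two steps above more concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not DataGorri's implementation or API: the names `TableExtractor`, `scrape_table`, `PAGE_MODEL`, and `SAMPLE_HTML` are hypothetical, the "page model" is assumed to be nothing more than a list of column headers of interest, and the table contents are made-up placeholder values. An inline HTML string stands in for a downloaded page so that the example runs without network access.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html, page_model):
    """Return one dict per data row, restricted to the columns in page_model.

    page_model is assumed to be a list of header names (hypothetical format).
    The first table row is treated as the header row.
    """
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    wanted = [header.index(name) for name in page_model]
    return [{header[i]: row[i] for i in wanted} for row in body]


# Placeholder page content; the values are invented for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Category</th><th>Price</th></tr>
  <tr><td>Apple</td><td>Fruit</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>Fruit</td><td>0.95</td></tr>
</table>
"""

# Step 1: the "page model" names the columns of interest.
PAGE_MODEL = ["Item", "Price"]

# Step 2 would normally loop over the collected links; here one inline page is used.
records = scrape_table(SAMPLE_HTML, PAGE_MODEL)
print(records)
```

In a real run, the HTML for each collected link would be downloaded first (for example with `urllib.request.urlopen`) and passed to `scrape_table` one page at a time.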
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
For the versions available, see the releases on this repository.
- Julian Hackinger Julian.Hackinger@tum.de
Further, the following persons (in alphabetical order) have contributed to the current or previous versions of this software and agreed to being named as contributors:
- Ivaylo Dimitrov
- Matthias Franze
- Julian Hackinger
- Stefan Hentschel
- Lukas Holzner
- Florian Kreitmair
- Daniel Krieger
- Michael Legenc
- Marc Müller
See also the list of contributors who participated in this project.
This project is licensed under the Citeware License - see the LICENSE.txt file for details.