This repository has been archived by the owner on May 30, 2021. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #83 from sotetsuk/develop-v0.1.0
Develop v0.1.0
Showing 19 changed files with 1,109 additions and 691 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,2 @@ | ||
.idea/* | ||
*.iml | ||
go-scholar |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,103 +1,173 @@ | ||
[![Build Status](https://travis-ci.org/sotetsuk/go-scholar.svg?branch=master)](https://travis-ci.org/sotetsuk/go-scholar) | ||
[![Coverage Status](https://coveralls.io/repos/github/sotetsuk/go-scholar/badge.svg?branch=master)](https://coveralls.io/github/sotetsuk/go-scholar?branch=master) | ||
[![GoDoc](https://godoc.org/github.com/sotetsuk/goscholar?status.svg)](https://godoc.org/github.com/sotetsuk/goscholar) | ||
[![Build Status](https://travis-ci.org/sotetsuk/goscholar.svg?branch=master)](https://travis-ci.org/sotetsuk/goscholar) | ||
[![Coverage Status](https://coveralls.io/repos/github/sotetsuk/goscholar/badge.svg?branch=master)](https://coveralls.io/github/sotetsuk/goscholar?branch=master) | ||
[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)]() | ||
[![GitHub version](https://badge.fury.io/gh/sotetsuk%2Fgo-scholar.svg)](https://badge.fury.io/gh/sotetsuk%2Fgo-scholar) | ||
[![GitHub version](https://badge.fury.io/gh/sotetsuk%2Fgoscholar.svg)](https://badge.fury.io/gh/sotetsuk%2Fgoscholar) | ||
|
||
# go-scholar | ||
**Go**ogle **Scholar** crawler and scraper written in **Go** | ||
# goscholar | ||
**Go**ogle **Scholar** scraper written in **Go** | ||
|
||
|
||
## Install | ||
|
||
Assume that `$GOPATH` is set and `$PATH` includes `$GOPATH/bin`. | ||
|
||
```sh | ||
$ go get github.com/sotetsuk/goscholar | ||
``` | ||
$ go get github.com/sotetsuk/go-scholar | ||
$ go-scholar -h | ||
|
||
For the command line tool: | ||
|
||
```sh | ||
$ go get github.com/sotetsuk/goscholar/cmd/goscholar | ||
$ goscholar -h | ||
``` | ||
|
||
## Feature | ||
|
||
- API for Go | ||
- API for command line | ||
- search by keywords, title, and author | ||
- find by ```<cluster-id>``` | ||
- fetch citing articles of ```<cluster-id>``` | ||
- crawl recursively (**not implemented yet**) | ||
- search the articles citing ```<cluster-id>``` | ||
- JSON output | ||
|
||
## Example | ||
## Go API | ||
|
||
### Example | ||
|
||
```go | ||
// create Query and generate URL | ||
q := goscholar.Query{Keywords: "deep learning", Author: "y bengio"} | ||
url := q.SearchUrl() | ||
|
||
// fetch document sending the request to the URL | ||
doc, err := goscholar.Fetch(url) | ||
if err != nil { | ||
log.Error(err) | ||
return | ||
} | ||
|
||
// parse articles | ||
ch := make(chan *goscholar.Article, 10) | ||
go goscholar.ParseDocument(ch, doc) | ||
|
||
for a := range ch { | ||
fmt.Println(a) | ||
} | ||
``` | ||
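
The `for a := range ch` loop in the example above only terminates if the producer closes the channel once it has sent every article. The following is a minimal, self-contained sketch of that pattern. The `Article` type, `produce`, and `collect` are hypothetical stand-ins for illustration; they are not goscholar's actual types or functions, and whether `goscholar.ParseDocument` closes the channel itself is an assumption here.

```go
package main

import "fmt"

// Article is a hypothetical stand-in for goscholar.Article.
type Article struct {
	Title string
	Year  string
}

// produce sends every article on ch and then closes it, so that a
// consumer ranging over ch stops once all articles are delivered.
// (Assumption: the real parser behaves the same way.)
func produce(ch chan<- *Article, articles []*Article) {
	defer close(ch)
	for _, a := range articles {
		ch <- a
	}
}

// collect drains ch until it is closed and returns the titles it received.
func collect(ch <-chan *Article) []string {
	var titles []string
	for a := range ch {
		titles = append(titles, a.Title)
	}
	return titles
}

func main() {
	// A small buffer lets the producer run ahead of the consumer.
	ch := make(chan *Article, 10)
	go produce(ch, []*Article{
		{Title: "Deep learning", Year: "2015"},
		{Title: "Deep learning via Hessian-free optimization", Year: "2010"},
	})
	for _, t := range collect(ch) {
		fmt.Println(t)
	}
}
```

If the producer never closed the channel, the `range` loop would block forever after the last article, so the close is the part of the contract worth checking when wiring a consumer up.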
$ go-scholar search --title "Deep learning via Hessian-free optimization" --num 1 | python -mjson.tool | ||
|
||
## Command line API | ||
|
||
### Example | ||
|
||
```sh | ||
$ goscholar search --keywords "deep learning nature" --author "y bengio" --after 2015 --num 1 | python -mjson.tool | ||
[ | ||
{ | ||
"ClusterId": "5362332738201102290", | ||
"InfoId": "0qfs6zbVakoJ", | ||
"Link": { | ||
"Format": "PDF", | ||
"Name": "psu.edu", | ||
"Url": "http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.436.894&rep=rep1&type=pdf" | ||
}, | ||
"NumCite": "390", | ||
"NumVer": "7", | ||
"Title": { | ||
"Name": "Deep learning", | ||
"Url": "http://www.nature.com/nature/journal/v521/n7553/abs/nature14539.html" | ||
}, | ||
"Year": "2015" | ||
} | ||
] | ||
``` | ||
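
The JSON printed by the command line tool can be consumed from another Go program. Below is a minimal sketch that decodes output of the shape shown above with `encoding/json`. The struct definitions and the helpers `decodeArticles` and `citeCount` are assumptions written to mirror the JSON keys in the example; they are not taken from the goscholar source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"strconv"
)

// Link, Title and Article mirror the JSON keys in the example output.
// They are hypothetical types for this sketch, not goscholar's own.
type Link struct {
	Format string
	Name   string
	Url    string
}

type Title struct {
	Name string
	Url  string
}

type Article struct {
	ClusterId string
	InfoId    string
	Link      Link
	NumCite   string
	NumVer    string
	Title     Title
	Year      string
}

// decodeArticles parses a JSON array of articles.
func decodeArticles(data []byte) ([]Article, error) {
	var articles []Article
	if err := json.Unmarshal(data, &articles); err != nil {
		return nil, err
	}
	return articles, nil
}

// citeCount converts NumCite to an int. The counts are emitted as
// strings and may be empty, so an empty string is treated as zero.
func citeCount(a Article) (int, error) {
	if a.NumCite == "" {
		return 0, nil
	}
	return strconv.Atoi(a.NumCite)
}

func main() {
	data := []byte(`[{
		"ClusterId": "5362332738201102290",
		"InfoId": "0qfs6zbVakoJ",
		"NumCite": "390",
		"NumVer": "7",
		"Title": {"Name": "Deep learning", "Url": "http://www.nature.com/nature/journal/v521/n7553/abs/nature14539.html"},
		"Year": "2015"
	}]`)

	articles, err := decodeArticles(data)
	if err != nil {
		log.Fatal(err)
	}
	for _, a := range articles {
		n, err := citeCount(a)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(a.Title.Name, a.Year, n)
	}
}
```

Keeping the counts as strings in the struct matches the output format; converting them at the point of use avoids a decode failure when a field such as `NumVer` is empty.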
|
||
```sh | ||
$ goscholar find 15502119379559163003 | python -mjson.tool | ||
[ | ||
{ | ||
"ClusterId": "15502119379559163003", | ||
"InfoId": "e6RSJHGXItcJ", | ||
"NumberOfCitations": "260", | ||
"NumberOfVersions": "9", | ||
"PDFLink": "http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_Martens10.pdf", | ||
"PDFSource": "wustl.edu", | ||
"Title": "Deep learning via Hessian-free optimization", | ||
"URL": "http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_Martens10.pdf", | ||
"Link": { | ||
"Format": "PDF", | ||
"Name": "wustl.edu", | ||
"Url": "http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_Martens10.pdf" | ||
}, | ||
"NumCite": "260", | ||
"NumVer": "", | ||
"Title": { | ||
"Name": "Deep learning via Hessian-free optimization", | ||
"Url": "http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_Martens10.pdf" | ||
}, | ||
"Year": "2010" | ||
} | ||
] | ||
] | ||
``` | ||
|
||
|
||
``` | ||
$ go-scholar find 8108748482885444188 | python -mjson.tool | ||
```sh | ||
$ goscholar cite 15502119379559163003 --num 1 | python -mjson.tool | ||
[ | ||
{ | ||
"ClusterId": "8108748482885444188", | ||
"InfoId": "XOJff8gPiHAJ", | ||
"NumberOfCitations": "376", | ||
"NumberOfVersions": "", | ||
"PDFLink": "", | ||
"PDFSource": "", | ||
"Title": "Learning in science: A comparison of deep and surface approaches", | ||
"URL": "http://onlinelibrary.wiley.com/doi/10.1002/(SICI)1098-2736(200002)37:2%3C109::AID-TEA3%3E3.0.CO;2-7/abstract", | ||
"Year": "2000" | ||
"ClusterId": "3674494786452480182", | ||
"InfoId": "tmCGO4pt_jIJ", | ||
"Link": { | ||
"Format": "PDF", | ||
"Name": "toronto.edu", | ||
"Url": "http://www.cs.toronto.edu/~asamir/papers/SPM_DNN_12.pdf" | ||
}, | ||
"NumCite": "1452", | ||
"NumVer": "27", | ||
"Title": { | ||
"Name": "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups", | ||
"Url": "http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6296526" | ||
}, | ||
"Year": "2012" | ||
} | ||
] | ||
``` | ||
|
||
## Usage | ||
(This article cites cluster 15502119379559163003, "Deep learning via Hessian-free optimization".) | ||
|
||
### Usage | ||
|
||
``` | ||
$ go-scholar -h | ||
go-scholar: Google Scholar crawler and scraper written in Go | ||
Usage: | ||
go-scholar search [--author=<author>] [--title=<title>] [--query=<query>] | ||
go-scholar search [--keywords=<keywords>] [--author=<author>] [--title=<title>] | ||
[--after=<year>] [--before=<year>] [--num=<num>] [--start=<start>] | ||
[--json|--bibtex] | ||
go-scholar find <cluster-id> [--json|--bibtex] | ||
go-scholar cite <cluster-id> [--after=<year>] [--before=<year>] [--num=<num>] [--start=<start>] [--json|--bibtex] | ||
go-scholar find <cluster-id> | ||
go-scholar cite <cluster-id> [--after=<year>] [--before=<year>] [--num=<num>] [--start=<start>] | ||
go-scholar -h | --help | ||
go-scholar --version | ||
Query-options: | ||
<cluster-id> | ||
--keywords=<keywords> | ||
--author=<author> | ||
--title=<title> | ||
--query=<query> | ||
Search-options: | ||
--after=<year> | ||
--before=<year> | ||
--num=<num> | ||
--start=<start> | ||
Output-options: | ||
--json | ||
--bibtex | ||
Others: | ||
-h --help | ||
--version | ||
``` | ||
|
||
## Dependencies | ||
|
||
- [github.com/docopt/docopt-go](https://github.com/docopt/docopt-go) | ||
- [github.com/PuerkitoBio/goquery](https://github.com/PuerkitoBio/goquery) | ||
- [github.com/Sirupsen/logrus](https://github.com/Sirupsen/logrus) | ||
|
||
## Related Work | ||
goscholar is inspired by [scholar.py](https://github.com/ckreibich/scholar.py) | ||
|
||
## Contribute | ||
Contributions are more than welcome! See [Issues](https://github.com/sotetsuk/go-scholar/issues) for what is needed. | ||
Contributions are more than welcome! See [Issues](https://github.com/sotetsuk/goscholar/issues) for what is needed. | ||
|
||
## License | ||
MIT License |