Code and materials for the EDM2016 tutorial "Massively scalable EDM with Spark"
Copyright 2016 Tristan Nixon tristan.m.nixon@gmail.com
All material in this repository is licensed under the Apache License, Version 2.0 (the "License"); you may not use any of the contents except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
- Install Java (on Ubuntu, `sudo apt-get install oracle-java8-installer`)
- Install Scala (on Ubuntu, `sudo dpkg -i` [scala-2.12.0-M5.deb](http://www.scala-lang.org/download/2.12.0-M4.html))
- Install Spark. Download Spark, and extract it to `/usr/local`.
- Download the [data for this tutorial](http://edm16-spark-tutorial.s3-website-us-east-1.amazonaws.com/)
- Run Spark as: `/usr/local/spark-1.6.2/bin/spark-shell`
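
Once the shell prompt appears, a quick sanity check can confirm that Spark is able to run a job. The snippet below is only a sketch and is not part of the tutorial exercises: it assumes you are inside `spark-shell`, where the `SparkContext` is already available as `sc`, and the names `numbers` and `total` are made up for illustration.

```scala
// Paste into the spark-shell prompt. `sc` is the SparkContext that
// spark-shell creates on startup; it is not defined in a plain Scala REPL.
val numbers = sc.parallelize(1 to 100)   // turn a small local range into an RDD
val total = numbers.reduce(_ + _)        // runs a Spark job that adds 1 through 100
println(s"Spark ${sc.version} is running, total = $total")
```

If the installation is working, `total` should be 5050 and the printed version should match the Spark release you extracted.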
Now, work through the exercises!
References:
- [OpenNLP](https://en.wikipedia.org/wiki/OpenNLP)