Important links
1. Get NumPy in a Docker container
- https://stackoverflow.com/questions/57763773/install-numpy-requirement-in-a-dockerfile-results-in-error
2. Docker Compose: ports vs expose
- https://stackoverflow.com/questions/40801772/what-is-the-difference-between-docker-compose-ports-vs-expose
3. Spark Configuration: memory/instance/cores
- https://stackoverflow.com/questions/26645293/spark-configuration-memory-instance-cores
4. Setting up a Scalable Spark Infrastructure with Docker
- https://www.pavanpkulkarni.com/blog/13-spark-on-docker/
5. Number of Cores vs Number of Executors
- https://stackoverflow.com/questions/24622108/apache-spark-the-number-of-cores-vs-the-number-of-executors
6. How to Tune Your Apache Spark Jobs (Part 2)
- https://blog.cloudera.com/how-to-tune-your-apache-spark-jobs-part-2/
7. Spark Executor Tuning
- https://www.youtube.com/watch?v=V9E-bWarMNw
8. Python vs Scala
- https://www.kdnuggets.com/2018/05/apache-spark-python-scala.html
9. Local mode vs Cluster mode
- https://stackoverflow.com/questions/48261344/a-master-url-must-be-set-in-your-configuration-gives-lot-of-confusion
10. Getting Started with Redis, Apache Spark and Python
- https://redislabs.com/blog/getting-started-redis-apache-spark-python/
11. Give Spark a 45x speed boost with Redis
- https://www.infoworld.com/article/3045083/give-spark-a-45x-speed-boost-with-redis.html
12. Simple Redis-based cache for storing results of Python function calls, JSON-encoded strings, or HTML (forked from vivekn/redis-simple-cache for Python 3 support)
- https://github.com/ohanetz/redis-simple-cache-3k
13. Uber leverages Apache Livy to handle Spark job problems such as data source diversity and dependency issues
- https://eng.uber.com/uscs-apache-spark/