# docker-spark

**Repository Path**: astra_zhao/docker-spark

## Basic Information

- **Project Name**: docker-spark
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-08-28
- **Last Updated**: 2020-12-19

## Categories & Tags

- **Categories**: Uncategorized
- **Tags**: None

## README

# spark

A `debian:stretch` based [Spark](http://spark.apache.org) container. Use it in a standalone cluster with the accompanying `docker-compose.yml`, or as a base for more complex recipes.

## docker example

To run `SparkPi`, run the image with Docker:

    docker run --rm -it -p 4040:4040 gettyimages/spark bin/run-example SparkPi 10

To start `spark-shell` with your AWS credentials:

    docker run --rm -it -e "AWS_ACCESS_KEY_ID=YOURKEY" -e "AWS_SECRET_ACCESS_KEY=YOURSECRET" -p 4040:4040 gettyimages/spark bin/spark-shell

To submit a simple PySpark job:

    echo -e "import pyspark\n\nprint(pyspark.SparkContext().parallelize(range(0, 10)).count())" > count.py
    docker run --rm -it -p 4040:4040 -v $(pwd)/count.py:/count.py gettyimages/spark bin/spark-submit /count.py

## docker-compose example

To create a simple standalone cluster with [docker-compose](http://docs.docker.com/compose):

    docker-compose up

The Spark UI will be running at `http://${YOUR_DOCKER_HOST}:8080` with one worker listed.

To run `pyspark`, exec into a container:

    docker exec -it docker-spark_master_1 /bin/bash
    bin/pyspark

To run `SparkPi`, exec into a container:

    docker exec -it docker-spark_master_1 /bin/bash
    bin/run-example SparkPi 10

## license

MIT
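
## appendix: count.py in full

For readability, the `count.py` written by the `echo` one-liner in the PySpark example above is equivalent to the following standalone script (a minimal sketch; it simply mirrors the inline code):

    # count.py -- same job as the inline one-liner above
    import pyspark

    # Start a SparkContext, parallelize the integers 0-9 into an RDD,
    # and print how many elements it contains (expected output: 10).
    sc = pyspark.SparkContext()
    print(sc.parallelize(range(0, 10)).count())
    sc.stop()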