I'm following the DataStax guide from here
Configure these:
- Create the Spark data directories in the install dir:
    alteredcarbon:spark3 neil$ more mkDataDir.sh
    #!/bin/bash
    mkdir cassandra-data; cd cassandra-data
    mkdir data saved_caches commitlog
    mkdir spark
    mkdir spark/rdd spark/tmp
- Edit the data dirs in resources/spark/conf/spark-env.sh:
    export SPARK_TMP_DIR="/Volumes/BACKUP/DEV/TEMP/spark3/cassandra-data/spark/tmp"
    # Directory where RDDs will be cached
    export SPARK_RDD_DIR="/Volumes/BACKUP/DEV/TEMP/spark3/cassandra-data/spark/rdd"
    # The directory for storing master.log and worker.log files
    export SPARK_LOG_DIR="/Volumes/BACKUP/DEV/TEMP/spark3/cassandra-data/spark"
    export SPARK_WORKER_DIR="/Volumes/BACKUP/DEV/TEMP/spark3/cassandra-data/spark/work"
- Spark uses a local Cassandra node, so let's configure that:
- Allocate the NIC (a loopback alias):
    sudo ifconfig lo0 alias 127.0.0.4 up
- Configure JMX in resources/cassandra/conf/cassandra-env.sh:
    JMX_PORT="7188"
- Cassandra endpoints in resources/cassandra/conf/cassandra.yaml:
    listen_address: 127.0.0.4
    # that rely on node auto-discovery.
    rpc_address: 127.0.0.4
- Configure logging:
    vi resources/cassandra/conf/log4j-server.properties
- Configure /etc/hosts:
    127.0.0.4 localhost4 alteredcarbon4 alteredcarbon4.local
- Configure the Cassandra data dirs (remember, Spark runs on Cassandra nodes):
    /Volumes/BACKUP/DEV/TEMP/spark3/cassandra-data$ vi ../resources/cassandra/conf/cassandra.yaml
    # the configured compaction strategy.
    data_file_directories:
      - /Volumes/BACKUP/DEV/TEMP/spark3/cassandra-data/data
    # commit log
    commitlog_directory: /Volumes/BACKUP/DEV/TEMP/spark3/cassandra-data/commitlog
    # saved caches
    saved_caches_directory: /Volumes/BACKUP/DEV/TEMP/spark3/cassandra-data/saved_caches
- Fire up Spark with a Cassandra node:
    $ bin/dse cassandra -f -k
- If all is good then you should see:
    INFO 15:00:04,865 SparkWorker: Starting remoting
    INFO 15:00:05,073 SparkWorker: Remoting started; listening on addresses :[akka.tcp://sparkWorker@127.0.0.4:54942]
    INFO 15:00:05,077 SparkWorker: Remoting now listens on addresses: [akka.tcp://sparkWorker@127.0.0.4:54942]
    INFO 15:00:05,379 SparkWorker: Starting Spark worker 127.0.0.4:54942 with 6 cores, 9.5 GB RAM
    INFO 15:00:05,380 SparkWorker: Spark home: /Volumes/BACKUP/DEV/TEMP/spark3/resources/spark
    INFO 15:00:05,643 SparkWorker: Started Worker web UI at http://192.168.228.1:7081
    INFO 15:00:05,645 SparkWorker: Connecting to master spark://127.0.0.4:7077...
    INFO 15:00:05,950 SparkMaster: Registering worker 127.0.0.4:54942 with 6 cores, 9.5 GB RAM
    INFO 15:00:05,953 SparkMaster: Adding worker 127.0.0.4
    INFO 15:00:06,046 SparkMaster: New Cassandra host /127.0.0.2:9042 added
    INFO 15:00:06,047 SparkMaster: New Cassandra host /127.0.0.1:9042 added
    INFO 15:00:06,047 SparkMaster: Connected to Cassandra cluster: Test Cluster
    INFO 15:00:06,047 SparkMaster: New Cassandra host /127.0.0.4:9042 added
    INFO 15:00:06,105 SparkWorker: Successfully registered with master spark://127.0.0.4:7077

OpsCenter with 1 Analytics node: BOOM!
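
If you would rather not eyeball the console, a small sketch like the one below could scan a captured log for the registration message. The message text comes from the startup output above; the function name spark_worker_registered and the idea of capturing the console output to a file are assumptions for illustration, not something from the DataStax guide.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) if the given log file
# contains the line the Spark worker prints once it has registered
# with the master. $1 is the path to a captured console log.
spark_worker_registered() {
    grep -q "SparkWorker: Successfully registered with master" "$1"
}
```

One way to use it, assuming the console output is captured with tee: run `bin/dse cassandra -f -k 2>&1 | tee dse.log` in one terminal, then `spark_worker_registered dse.log && echo "worker registered"` in another.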

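The directory step can also be written so that it is safe to re-run. The sketch below is a variant of mkDataDir.sh, not the script from the guide: it uses mkdir -p, takes the install dir as an argument, creates commitlog as a single directory to match commitlog_directory in cassandra.yaml, and also creates spark/work on the assumption that the SPARK_WORKER_DIR path should exist up front (Spark may well create it on its own). The function name make_data_dirs is hypothetical.

```shell
#!/bin/bash
# Hypothetical re-runnable variant of mkDataDir.sh.
# $1: install dir (defaults to the current directory).
make_data_dirs() {
    base="${1:-.}/cassandra-data"
    # Cassandra dirs referenced from cassandra.yaml
    mkdir -p "$base/data" "$base/saved_caches" "$base/commitlog"
    # Spark dirs referenced by the SPARK_* exports; spark/work is an assumption
    mkdir -p "$base/spark/rdd" "$base/spark/tmp" "$base/spark/work"
}
```

For the layout used in this post the call would be `make_data_dirs /Volumes/BACKUP/DEV/TEMP/spark3`. Running it a second time leaves existing directories untouched.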