Cloud Computing for Data Analysis
-------------------------------------------------
EXTRA CREDIT EXERCISE - Exercise 11: Support Vector Machine - Instructions

1. Create a Maven project in Eclipse for Spark
   1.1 Open the Eclipse IDE for Scala
   1.2 File -> New -> Project
   1.3 Select Maven -> Maven Project -> Click Next
   1.4 Enable the checkbox "Create a simple project"
   1.5 Enter a Group Id and an Artifact Id, for example: svm6190
   1.6 Click Finish

2. Right-click on the project folder -> Configure -> Add Scala Nature

3. Right-click on src/main/java -> Refactor -> Rename -> scala

4. Add the following dependencies to pom.xml:

   <dependencies>
     <dependency>
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-core_2.11</artifactId>
       <version>2.2.0</version>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-mllib_2.11</artifactId>
       <version>2.2.0</version>
     </dependency>
   </dependencies>

5. Right-click on the project folder -> Build Path -> Configure Build Path -> Scala Compiler -> enable the checkbox "Use Project Settings". Select Scala Installation -> Latest 2.11 bundle (dynamic). Then click Apply -> OK -> OK.

6. Copy the files svmdriver.scala and SVMMultiClass into the src/main/scala folder of the project. The code is available at https://webpages.uncc.edu/aatzache/ITCS6190/Project/SVM.zip (a minimal sketch of such a driver appears after these instructions).

7. Right-click on the project folder -> Run As -> Maven clean

8. Right-click on the project folder -> Run As -> Maven install

9. The .jar file will be generated in the target folder of the project.

10. Get the .jar file.

11. Create a cluster with Hadoop and Spark in AWS and start the cluster. Once the cluster is running, log in to the master node using PuTTY (Windows) or SSH (Mac or Linux).

12. Create a data bucket in AWS S3. Upload the car data file and the .jar file to S3.

13. From the master node, download the .jar using the command:

   aws s3 cp s3://BUCKET_NAME/JAR_NAME.jar .

14. Run the .jar file from your terminal or PuTTY session using the following command:

   spark-submit --class svmdriver --master yarn --deploy-mode client JAR_NAME.jar s3://BUCKET_NAME/data.txt

15. Copy the output - the confusion matrix, and the accuracy, precision, recall, and F-measure metrics - from the terminal to a text file. Name the text file Output.txt. Save the terminal command window text. SUBMIT Output.txt and the terminal command window text file on Canvas.

* Delete/terminate the AWS cluster and delete all files from S3 when finished; otherwise Amazon will charge your credit card.
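
Sketch of a possible SVM driver (for reference only)

The actual svmdriver.scala and SVMMultiClass sources come from the SVM.zip linked in step 6 and are not reproduced here. The code below is only a minimal, hypothetical sketch of the kind of Spark MLlib driver that would print the metrics requested in step 15 (confusion matrix, accuracy, precision, recall, F-measure). It assumes the input path is passed as the first program argument (e.g. s3://BUCKET_NAME/data.txt), that the data is in LIBSVM format with binary labels, and uses an assumed 70/30 train/test split with 100 SGD iterations; the real class may differ, for example by handling more than two classes.

   // Hypothetical sketch; not the actual course code from SVM.zip.
   import org.apache.spark.{SparkConf, SparkContext}
   import org.apache.spark.mllib.classification.SVMWithSGD
   import org.apache.spark.mllib.evaluation.MulticlassMetrics
   import org.apache.spark.mllib.util.MLUtils

   object svmdriver {
     def main(args: Array[String]): Unit = {
       val conf = new SparkConf().setAppName("SVM Exercise")
       val sc = new SparkContext(conf)

       // Load LIBSVM-formatted data from the path given on the command line
       // (assumption: args(0) is something like s3://BUCKET_NAME/data.txt)
       val data = MLUtils.loadLibSVMFile(sc, args(0))

       // Assumed 70/30 split into training and test sets
       val Array(training, test) = data.randomSplit(Array(0.7, 0.3), seed = 11L)
       training.cache()

       // Train a binary linear SVM with stochastic gradient descent
       val model = SVMWithSGD.train(training, 100)

       // Pair each prediction with the true label on the test set
       val predictionAndLabels = test.map { point =>
         (model.predict(point.features), point.label)
       }

       // Metrics requested in step 15
       val metrics = new MulticlassMetrics(predictionAndLabels)
       println("Confusion matrix:\n" + metrics.confusionMatrix)
       println("Accuracy  = " + metrics.accuracy)
       println("Precision = " + metrics.weightedPrecision)
       println("Recall    = " + metrics.weightedRecall)
       println("F-measure = " + metrics.weightedFMeasure)

       sc.stop()
     }
   }

When run with the spark-submit command in step 14, the data path that follows the jar name is forwarded by spark-submit to the main class as args(0).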