# Create a SparkSession (note: the config section is only needed on Windows)
=
# Load up our data and convert it to the format MLlib expects.
=
=
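# A minimal sketch of the parsing half of this step, assuming each input line
# holds "label,feature" as two comma-separated numbers (the actual file layout
# is not shown here). parse_line is a hypothetical helper. With PySpark, the
# feature list would typically be wrapped in pyspark.ml.linalg.Vectors.dense
# before building the DataFrame.

```python
def parse_line(line):
    """Split one 'label,feature' line into (label, [feature]) as floats."""
    fields = line.split(",")
    label = float(fields[0])
    features = [float(fields[1])]  # a one-element feature vector
    return label, features


# Made-up sample lines, only to exercise the helper.
sample = ["1.5,2.0", "-0.3,0.7"]
parsed = [parse_line(s) for s in sample]
print(parsed)
```

# On an RDD of text lines, the same helper could be applied with rdd.map(parse_line).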
# Convert this RDD to a DataFrame
=
=
# Note: in many cases you can skip the RDD-to-DataFrame conversion entirely,
# for example when you import data from a real database or read it through
# Structured Streaming.
# Let's split our data into training data and testing data
=
=
=
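# A pure-Python sketch of what a weighted random split does, assuming 50/50
# weights (the weights used here are not shown). random_split is a hypothetical
# stand-in, not Spark's implementation. In PySpark the equivalent call is
# DataFrame.randomSplit(weights, seed), which normalizes the weights if they do
# not sum to 1 and returns splits whose sizes are only approximately
# proportional to them.

```python
import random


def random_split(rows, weights, seed=None):
    """Assign each row independently to one split, with probability proportional to its weight."""
    rng = random.Random(seed)
    total = float(sum(weights))
    # Cumulative boundaries in (0, 1], e.g. [0.5, 1.0] for weights [0.5, 0.5].
    bounds = []
    running = 0.0
    for w in weights:
        running += w / total
        bounds.append(running)
    splits = [[] for _ in weights]
    for row in rows:
        u = rng.random()
        for i, bound in enumerate(bounds):
            # The last split catches any floating-point rounding at the top end.
            if u < bound or i == len(bounds) - 1:
                splits[i].append(row)
                break
    return splits


rows = list(range(100))
training, testing = random_split(rows, [0.5, 0.5], seed=42)
print(len(training), len(testing))
```

# Every row lands in exactly one split, so the two lists together always
# contain the full data set. Fixing the seed makes the split repeatable.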
# Now create our linear regression model
=
# Train the model using our training data
=
# Now see if we can predict values in our test data.
# Generate predictions with our linear regression model for every feature vector
# in the test DataFrame:
=
# Extract the predictions and the "known" correct labels.
=
=
# Zip them together
=
# Print out the predicted and actual values for each point
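# A plain-Python sketch of the zip-and-print step, using made-up numbers in
# place of real model output. With Spark RDDs, RDD.zip additionally requires
# both RDDs to have the same number of partitions and the same number of
# elements in each partition.

```python
predictions = [2.9, 5.1, 7.0]  # made-up model outputs
labels = [3.0, 5.0, 7.2]       # made-up known values

# Pair each prediction with the label at the same position.
prediction_and_label = list(zip(predictions, labels))

for predicted, actual in prediction_and_label:
    print(predicted, actual)
```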
# Stop the session
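# A hedged end-to-end sketch of the steps described in the comments above. It is
# not the original code. Assumptions: the data is synthetic (make_rows is a
# hypothetical helper that builds points on label = 2*x + 1), the
# hyperparameters maxIter=10 and regParam=0.01 and the 50/50 split are
# illustrative choices, the Windows-only config is omitted, and predictions and
# labels are selected together from one DataFrame rather than zipped from two
# RDDs. The Spark imports live inside run_demo so the helper works without
# Spark installed.

```python
def make_rows(n=20):
    """Build synthetic (label, x) pairs on the line label = 2*x + 1 (made-up data)."""
    return [(2.0 * x + 1.0, float(x)) for x in range(n)]


def run_demo():
    """Train and evaluate a linear regression on the synthetic rows; requires PySpark."""
    from pyspark.sql import SparkSession
    from pyspark.ml.linalg import Vectors
    from pyspark.ml.regression import LinearRegression

    spark = SparkSession.builder.appName("LinearRegressionSketch").getOrCreate()
    try:
        # MLlib's defaults expect a "label" column and a vector-valued "features" column.
        rows = [(label, Vectors.dense([x])) for label, x in make_rows()]
        df = spark.createDataFrame(rows, ["label", "features"])

        # Split into training and testing data.
        training_df, test_df = df.randomSplit([0.5, 0.5], seed=42)

        # Create the model and train it on the training data.
        lir = LinearRegression(maxIter=10, regParam=0.01)
        model = lir.fit(training_df)

        # transform() adds a "prediction" column to the test rows.
        full_predictions = model.transform(test_df).cache()

        # Collect (prediction, label) pairs and print them.
        pairs = [
            (row["prediction"], row["label"])
            for row in full_predictions.select("prediction", "label").collect()
        ]
        for prediction, label in pairs:
            print(prediction, label)
        return pairs
    finally:
        # Stop the session even if a step above fails.
        spark.stop()
```

# To try it, call run_demo() in an environment where PySpark and a Java runtime
# are available. Each printed prediction should sit close to its label, since
# the synthetic data is exactly linear.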