Description: Every Spark application launches a variety of distributed parallel operations on a cluster through a driver program. The driver program contains the application's main function, defines the distributed datasets on the cluster, and applies operations to those datasets.
In the example above, the driver program is the Spark shell itself; you simply type in the operations you want to run.
The driver program accesses Spark through a SparkContext object, which represents a connection to the computing cluster. When the shell starts, it automatically creates a SparkContext for you, stored in a variable called sc.
Once you have a SparkContext, you can use it to create RDDs.
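As a rough sketch of what the shell sets up for you: a standalone application has to construct the SparkContext itself before creating RDDs. The application name and local master below are illustrative assumptions, not values from the original text.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RDDExample {
  def main(args: Array[String]): Unit = {
    // In the Spark shell this object already exists as the variable `sc`;
    // a standalone application builds it from a SparkConf.
    val conf = new SparkConf()
      .setAppName("RDDExample")   // illustrative name
      .setMaster("local[*]")      // illustrative: run locally on all cores
    val sc = new SparkContext(conf)

    // Create an RDD from a local collection and run distributed operations on it.
    val lines = sc.parallelize(Seq("hello spark", "hello world"))
    val counts = lines
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.collect().foreach(println)

    sc.stop()
  }
}
```

In the shell you would skip the setup and start directly from `sc.parallelize(...)`.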
Author: the home of a single teacher
Link: https://www.jianshu.com/p/c6aefad2ba0c
Source: Jianshu
Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please credit the source.
File list (check if you need these files):
Filename | Size | Date
---|---|---
Spark快速大数据分析.pdf | 16901972 | 2017-03-29