- abs(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the absolute value. 
- abs() - Method in class org.apache.spark.sql.types.Decimal
-  
- AbsoluteError - Class in org.apache.spark.mllib.tree.loss
- 
:: DeveloperApi ::
 Class for absolute error loss calculation (for regression). 
- AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
-  
- accessTime() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
-  
- accId() - Method in class org.apache.spark.CleanAccum
-  
- Accumulable<R,T> - Class in org.apache.spark
- 
A data type that can be accumulated, i.e., has a commutative and associative "add" operation,
 but where the result type, R, may be different from the element type being added, T.
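
The split between the result type R and the element type T can be illustrated without Spark on the classpath. The following is a minimal plain-Java sketch, not Spark's implementation: `SetAccumulableSketch` is a hypothetical class whose two methods only mirror the roles of `addAccumulator(R, T)` and `addInPlace(R, R)` listed for `AccumulableParam`, with R assumed to be `Set<String>` and T assumed to be `String`.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in, not Spark code: R = Set<String>, T = String.
public class SetAccumulableSketch {

    // Plays the role of addAccumulator(R, T): fold one element into a partial result.
    public static Set<String> addAccumulator(Set<String> acc, String element) {
        acc.add(element);
        return acc;
    }

    // Plays the role of addInPlace(R, R): merge two partial results.
    public static Set<String> addInPlace(Set<String> left, Set<String> right) {
        left.addAll(right);
        return left;
    }

    public static void main(String[] args) {
        // Two simulated tasks, each building its own partial result.
        Set<String> task1 = new HashSet<>();
        addAccumulator(task1, "a");
        addAccumulator(task1, "b");

        Set<String> task2 = new HashSet<>();
        addAccumulator(task2, "b");
        addAccumulator(task2, "c");

        // Set union is commutative and associative, so merge order does not matter.
        Set<String> merged = addInPlace(task1, task2);
        System.out.println(merged.size()); // prints 3
    }
}
```

Set union is used here because it satisfies the commutative and associative requirement, so the merged result is the same whichever partial result is merged first.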
 
- Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
-  
- Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
-  
- accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulable shared variable of the given type, to which tasks
 can "add" values with  add. 
- accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulable shared variable of the given type, to which tasks
 can "add" values with  add. 
- accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
- 
Create an  Accumulable shared variable, to which tasks can add values
 with  +=. 
- accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
- 
Create an  Accumulable shared variable, with a name for display in the
 Spark UI. 
- accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
- 
Create an accumulator from a "mutable collection" type. 
- AccumulableInfo - Class in org.apache.spark.scheduler
- 
:: DeveloperApi ::
 Information about an  Accumulable modified during a task or stage. 
- AccumulableInfo - Class in org.apache.spark.status.api.v1
-  
- AccumulableParam<R,T> - Interface in org.apache.spark
- 
Helper object defining how to accumulate values of a particular type. 
- accumulables() - Method in class org.apache.spark.scheduler.StageInfo
- 
Terminal values of accumulables updated during this stage. 
- accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
- 
Intermediate updates to accumulables during this task. 
- Accumulator<T> - Class in org.apache.spark
- 
A simpler value of  Accumulable where the result type being accumulated is the same
 as the type of the elements being merged. 
- Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
-  
- Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
-  
- accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulator integer variable, which tasks can "add" values
 to using the  add method. 
- accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulator integer variable, which tasks can "add" values
 to using the  add method. 
- accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulator double variable, which tasks can "add" values
 to using the  add method. 
- accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulator double variable, which tasks can "add" values
 to using the  add method. 
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulator variable of a given type, which tasks can "add"
 values to using the  add method. 
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulator variable of a given type, which tasks can "add"
 values to using the  add method. 
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
- 
Create an  Accumulator variable of a given type, which tasks can "add"
 values to using the  += method. 
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
- 
Create an  Accumulator variable of a given type, with a name for display
 in the Spark UI. 
- AccumulatorParam<T> - Interface in org.apache.spark
- 
A simpler version of  AccumulableParam where the only data type you can add
 in is the same type as the accumulated value. 
- AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark
-  
- AccumulatorParam.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-  
- AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark
-  
- AccumulatorParam.FloatAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-  
- AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark
-  
- AccumulatorParam.IntAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-  
- AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark
-  
- AccumulatorParam.LongAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-  
- accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.StageData
-  
- accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.TaskData
-  
- accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns accuracy 
- acos(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the cosine inverse of the given value; the returned angle is in the range
 0.0 through pi. 
- acos(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the cosine inverse of the given column; the returned angle is in the range
 0.0 through pi. 
- active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-  
- activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- activeTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- ActorHelper - Interface in org.apache.spark.streaming.receiver
- 
:: DeveloperApi ::
 A receiver trait to be mixed in with your Actor to gain access to
 the API for pushing received data into Spark Streaming for processing. 
- actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream with any arbitrary user implemented actor receiver. 
- actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream with any arbitrary user implemented actor receiver. 
- actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream with any arbitrary user implemented actor receiver. 
- actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream with any arbitrary user implemented actor receiver. 
- ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
- 
:: DeveloperApi ::
 A helper with a set of defaults for the supervisor strategy. 
- ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-  
- actorSystem() - Method in class org.apache.spark.SparkEnv
-  
- add(T) - Method in class org.apache.spark.Accumulable
- 
Add more data to this accumulator / accumulable 
- add(org.apache.spark.ml.feature.Instance) - Method in class org.apache.spark.ml.classification.LogisticAggregator
- 
Add a new training instance to this LogisticAggregator, and update the loss and gradient
 of the objective function. 
- add(AFTPoint) - Method in class org.apache.spark.ml.regression.AFTAggregator
-  
- add(org.apache.spark.ml.feature.Instance) - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
- 
Add a new training instance to this LeastSquaresAggregator, and update the loss and gradient
 of the objective function. 
- add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
-  
- add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
- 
Adds a new document. 
- add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
- 
Adds two block matrices together. 
- add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
- 
Add a new sample to this summarizer, and update the statistical summary. 
- add(StructField) - Method in class org.apache.spark.sql.types.StructType
- 
- add(String, DataType) - Method in class org.apache.spark.sql.types.StructType
- 
Creates a new  StructType by adding a new nullable field with no metadata. 
- add(String, DataType, boolean) - Method in class org.apache.spark.sql.types.StructType
- 
Creates a new  StructType by adding a new field with no metadata. 
- add(String, DataType, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType
- 
Creates a new  StructType by adding a new field and specifying metadata. 
- add(String, String) - Method in class org.apache.spark.sql.types.StructType
- 
Creates a new  StructType by adding a new nullable field with no metadata where the
 dataType is specified as a String. 
- add(String, String, boolean) - Method in class org.apache.spark.sql.types.StructType
- 
Creates a new  StructType by adding a new field with no metadata where the
 dataType is specified as a String. 
- add(String, String, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType
- 
Creates a new  StructType by adding a new field and specifying metadata where the
 dataType is specified as a String. 
- add(Vector) - Method in class org.apache.spark.util.Vector
-  
- add_months(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Returns the date that is numMonths after startDate. 
- addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
- 
Add additional data to the accumulator value. 
- addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
-  
- addAppArgs(String...) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Adds command line arguments for the application. 
- addedFiles() - Method in class org.apache.spark.SparkContext
-  
- addedJars() - Method in class org.apache.spark.SparkContext
-  
- addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Add a file to be downloaded with this Spark job on every node. 
- addFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Adds a file to be submitted with the application. 
- addFile(String) - Method in class org.apache.spark.SparkContext
- 
Add a file to be downloaded with this Spark job on every node. 
- addFile(String, boolean) - Method in class org.apache.spark.SparkContext
- 
Add a file to be downloaded with this Spark job on every node. 
- addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
- 
Adds a param with multiple values (overwrites if the input param exists). 
- addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
- 
Adds a double param with multiple values. 
- addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
- 
Adds an int param with multiple values. 
- addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
- 
Adds a float param with multiple values. 
- addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
- 
Adds a long param with multiple values. 
- addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
- 
Adds a boolean param with true and false. 
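
The `addGrid` overloads above each register a set of candidate values for one param, replacing any values registered earlier for the same param. A minimal plain-Java sketch of that behaviour follows. It is not Spark's implementation: `ParamGridSketch` is a hypothetical class, params are identified by plain strings instead of `Param<T>` objects, and the `build` step, which expands the grid into the Cartesian product of all registered values, is an assumption about how such a grid is typically consumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not Spark code: params are keyed by name.
public class ParamGridSketch {
    private final Map<String, List<?>> grid = new LinkedHashMap<>();

    // Registers candidate values for a param; a second call for the
    // same param replaces the values registered earlier.
    public ParamGridSketch addGrid(String param, List<?> values) {
        grid.put(param, values);
        return this;
    }

    // Expands the grid into one map per combination of values.
    public List<Map<String, Object>> build() {
        List<Map<String, Object>> maps = new ArrayList<>();
        maps.add(new LinkedHashMap<>());
        for (Map.Entry<String, List<?>> entry : grid.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> partial : maps) {
                for (Object value : entry.getValue()) {
                    Map<String, Object> copy = new LinkedHashMap<>(partial);
                    copy.put(entry.getKey(), value);
                    next.add(copy);
                }
            }
            maps = next;
        }
        return maps;
    }

    public static void main(String[] args) {
        ParamGridSketch builder = new ParamGridSketch()
            .addGrid("regParam", Arrays.asList(0.1, 0.01))
            .addGrid("maxIter", Arrays.asList(10, 20));
        System.out.println(builder.build().size()); // prints 4
    }
}
```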
- addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
- 
Merge two accumulated values together. 
- addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-  
- addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-  
- addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-  
- addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-  
- addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-  
- addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
-  
- addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
-  
- addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
-  
- addInPlace(Vector) - Method in class org.apache.spark.util.Vector
-  
- addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
-  
- addIntercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
- 
Whether to add intercept (default: false). 
- addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future. 
- addJar(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Adds a jar file to be submitted with the application. 
- addJar(String) - Method in class org.apache.spark.SparkContext
- 
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future. 
- addJar(String) - Method in class org.apache.spark.sql.hive.HiveContext
-  
- addJar(String) - Method in class org.apache.spark.sql.SQLContext
- 
Add a jar to SQLContext 
- addListener(SparkAppHandle.Listener) - Method in interface org.apache.spark.launcher.SparkAppHandle
- 
Adds a listener to be notified of changes to the handle's information. 
- addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
- 
Add Hadoop configuration specific to a single partition and attempt. 
- addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext
- 
Adds a callback function to be executed on task completion. 
- addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- addPyFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Adds a python file / zip / egg to be submitted with the application. 
- address() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-  
- addSparkArg(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Adds a no-value argument to the Spark invocation. 
- addSparkArg(String, String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Adds an argument with a value to the Spark invocation. 
- addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
- 
:: DeveloperApi ::
 Register a listener to receive up-calls from events that happen during execution. 
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
- 
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
- 
Adds a (Java friendly) listener to be executed on task completion. 
- addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext
- 
Adds a listener in the form of a Scala closure to be executed on task completion. 
- addTaskFailureListener(TaskFailureListener) - Method in class org.apache.spark.TaskContext
- 
Adds a listener to be executed on task failure. 
- addTaskFailureListener(Function2<TaskContext, Throwable, BoxedUnit>) - Method in class org.apache.spark.TaskContext
- 
Adds a listener to be executed on task failure. 
- AFTAggregator - Class in org.apache.spark.ml.regression
-  
- AFTAggregator(DenseVector<Object>, boolean) - Constructor for class org.apache.spark.ml.regression.AFTAggregator
-  
- AFTCostFun - Class in org.apache.spark.ml.regression
-  
- AFTCostFun(RDD<AFTPoint>, boolean) - Constructor for class org.apache.spark.ml.regression.AFTCostFun
-  
- AFTSurvivalRegression - Class in org.apache.spark.ml.regression
- 
:: Experimental ::
 Fits a parametric survival regression model, the accelerated failure time (AFT) model
 (https://en.wikipedia.org/wiki/Accelerated_failure_time_model)
 based on the Weibull distribution of the survival time.
 
- AFTSurvivalRegression(String) - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- AFTSurvivalRegression() - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- AFTSurvivalRegressionModel - Class in org.apache.spark.ml.regression
- 
- agg(Column, Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Aggregates on the entire  DataFrame without groups. 
- agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.DataFrame
- 
(Scala-specific) Aggregates on the entire  DataFrame without groups. 
- agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
- 
(Scala-specific) Aggregates on the entire  DataFrame without groups. 
- agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
- 
(Java-specific) Aggregates on the entire  DataFrame without groups. 
- agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Aggregates on the entire  DataFrame without groups. 
- agg(Column, Column...) - Method in class org.apache.spark.sql.GroupedData
- 
Compute aggregates by specifying a series of aggregate columns. 
- agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.GroupedData
- 
(Scala-specific) Compute aggregates by specifying a map from column name to
 aggregate methods. 
- agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
- 
(Scala-specific) Compute aggregates by specifying a map from column name to
 aggregate methods. 
- agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
- 
(Java-specific) Compute aggregates by specifying a map from column name to
 aggregate methods. 
- agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.GroupedData
- 
Compute aggregates by specifying a series of aggregate columns. 
- agg(TypedColumn<V, U1>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Computes the given aggregation, returning a  Dataset of tuples for each unique key
 and the result of computing this aggregation over all elements in the group. 
- agg(TypedColumn<V, U1>, TypedColumn<V, U2>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Computes the given aggregations, returning a  Dataset of tuples for each unique key
 and the result of computing these aggregations over all elements in the group. 
- agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Computes the given aggregations, returning a  Dataset of tuples for each unique key
 and the result of computing these aggregations over all elements in the group. 
- agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>, TypedColumn<V, U4>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Computes the given aggregations, returning a  Dataset of tuples for each unique key
 and the result of computing these aggregations over all elements in the group. 
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Aggregate the elements of each partition, and then the results for all the partitions, using
 given combine functions and a neutral "zero value". 
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Aggregate the elements of each partition, and then the results for all the partitions, using
 given combine functions and a neutral "zero value". 
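
The two-level fold described for `aggregate` can be sketched in plain Java without Spark. This is an illustration under stated assumptions, not the RDD implementation: `AggregateSketch` is a hypothetical class, partitions are modelled as a list of lists, and the parameter names `seqOp` and `combOp` are chosen here for the within-partition and across-partition functions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical stand-in, not Spark code: partitions are plain lists.
public class AggregateSketch {

    public static <T, U> U aggregate(List<List<T>> partitions, U zero,
                                     BiFunction<U, T, U> seqOp,
                                     BinaryOperator<U> combOp) {
        U result = zero;
        for (List<T> partition : partitions) {
            // Fold the elements of one partition, starting from the zero value.
            U partial = zero;
            for (T element : partition) {
                partial = seqOp.apply(partial, element);
            }
            // Combine the per-partition result into the overall result.
            result = combOp.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
            Arrays.asList("a", "bb"),
            Arrays.asList("ccc"));
        // Total length of all strings: T = String, U = Integer.
        int total = aggregate(partitions, 0, (n, s) -> n + s.length(), Integer::sum);
        System.out.println(total); // prints 6
    }
}
```

Because the zero value is used once per partition and once more when combining, it needs to be neutral for both functions (0 for addition here), otherwise the result would depend on the number of partitions.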
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Aggregate the values of each key, using given combine functions and a neutral "zero value". 
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Aggregate the values of each key, using given combine functions and a neutral "zero value". 
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Aggregate the values of each key, using given combine functions and a neutral "zero value". 
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Aggregate the values of each key, using given combine functions and a neutral "zero value". 
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Aggregate the values of each key, using given combine functions and a neutral "zero value". 
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Aggregate the values of each key, using given combine functions and a neutral "zero value". 
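
The per-key variant can be sketched the same way. The following plain-Java example is a hypothetical illustration, not the `PairRDDFunctions` implementation: `AggregateByKeySketch` and its helper `pair` are names introduced here, partitions are modelled as lists of map entries, and partitioner handling is left out entirely.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical stand-in, not Spark code: partitions are lists of key-value entries.
public class AggregateByKeySketch {

    // Convenience helper for building key-value pairs.
    public static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    public static <K, V, U> Map<K, U> aggregateByKey(
            List<List<Map.Entry<K, V>>> partitions, U zero,
            BiFunction<U, V, U> seqOp, BinaryOperator<U> combOp) {
        Map<K, U> merged = new HashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            // Within a partition, fold each key's values starting from the zero value.
            Map<K, U> partial = new HashMap<>();
            for (Map.Entry<K, V> entry : partition) {
                U current = partial.containsKey(entry.getKey())
                    ? partial.get(entry.getKey()) : zero;
                partial.put(entry.getKey(), seqOp.apply(current, entry.getValue()));
            }
            // Across partitions, combine the partial results key by key.
            for (Map.Entry<K, U> entry : partial.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(), combOp);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> partitions = Arrays.asList(
            Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3)),
            Arrays.asList(pair("a", 4)));
        Map<String, Integer> sums =
            aggregateByKey(partitions, 0, (acc, v) -> acc + v, Integer::sum);
        System.out.println(sums.get("a")); // prints 8
    }
}
```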
- AggregatedDialect - Class in org.apache.spark.sql.jdbc
- 
AggregatedDialect can unify multiple dialects into one virtual Dialect. 
- AggregatedDialect(List<JdbcDialect>) - Constructor for class org.apache.spark.sql.jdbc.AggregatedDialect
-  
- aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
- 
Aggregates values from the neighboring edges and vertices of each vertex. 
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
- 
Aggregates vertices in `messages` that have the same ids using `reduceFunc`, returning a
 VertexRDD co-indexed with `this`.
 
- AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
-  
- AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- Aggregator<K,V,C> - Class in org.apache.spark
- 
:: DeveloperApi ::
 A set of functions used to aggregate data. 
- Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
-  
- aggregator() - Method in class org.apache.spark.ShuffleDependency
-  
- Aggregator<I,B,O> - Class in org.apache.spark.sql.expressions
- 
A base class for user-defined aggregations, which can be used in DataFrame and Dataset operations to take all of the elements of a group and reduce them to a single value.
 
- Aggregator() - Constructor for class org.apache.spark.sql.expressions.Aggregator
-  
- aggUntyped(Seq<TypedColumn<?, ?>>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Internal helper function for building typed aggregations that return tuples. 
- Algo - Class in org.apache.spark.mllib.tree.configuration
- 
:: Experimental ::
 Enum to select the algorithm for the decision tree 
- Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
-  
- algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-  
- algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-  
- algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-  
- algorithm() - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-  
- algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
The algorithm to use for updating. 
- algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-  
- alias(String) - Method in class org.apache.spark.sql.Column
- 
Gives the column an alias. 
- alias(String) - Method in class org.apache.spark.sql.DataFrame
- 
- alias(Symbol) - Method in class org.apache.spark.sql.DataFrame
- 
(Scala-specific) Returns a new  DataFrame with an alias set. 
- All - Static variable in class org.apache.spark.graphx.TripletFields
- 
Expose all the fields (source, edge, and destination). 
- alpha() - Method in class org.apache.spark.mllib.random.WeibullGenerator
-  
- AlphaComponent - Annotation Type in org.apache.spark.annotation
- 
A new component of Spark which may have unstable APIs. 
- ALS - Class in org.apache.spark.ml.recommendation
- 
:: Experimental ::
 Alternating Least Squares (ALS) matrix factorization. 
- ALS(String) - Constructor for class org.apache.spark.ml.recommendation.ALS
-  
- ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
-  
- ALS - Class in org.apache.spark.mllib.recommendation
-  
- ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
-  
- ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation
- 
:: DeveloperApi ::
 Rating class for better code readability. 
- ALS.Rating(ID, ID, float) - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating
-  
- ALS.Rating$ - Class in org.apache.spark.ml.recommendation
-  
- ALS.Rating$() - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating$
-  
- ALSModel - Class in org.apache.spark.ml.recommendation
- 
:: Experimental ::
 Model fitted by ALS. 
- AnalysisException - Exception in org.apache.spark.sql
- 
:: DeveloperApi ::
 Thrown when a query fails to analyze, usually because the query itself is invalid. 
- AnalysisException(String, Option<Object>, Option<Object>) - Constructor for exception org.apache.spark.sql.AnalysisException
-  
- analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
- 
Analyzes the given table in the current database to generate statistics, which will be
 used in query optimizations. 
- analyzer() - Method in class org.apache.spark.sql.hive.HiveContext
-  
- analyzer() - Method in class org.apache.spark.sql.SQLContext
-  
- and(Column) - Method in class org.apache.spark.sql.Column
- 
Boolean AND. 
- And - Class in org.apache.spark.sql.sources
- 
A filter that evaluates to true iff both left and right evaluate to true.
 
- And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
-  
- antecedent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
-  
- ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
-  
- anyNull() - Method in interface org.apache.spark.sql.Row
- 
Returns true if there are any NULL values in this row. 
- appAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-  
- appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Returns a new vector with 1.0 (bias) appended to the input vector.
 
- appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-  
- applicationAttemptId() - Method in class org.apache.spark.SparkContext
-  
- ApplicationAttemptInfo - Class in org.apache.spark.status.api.v1
-  
- applicationId() - Method in class org.apache.spark.SparkContext
- 
A unique identifier for the Spark application. 
- ApplicationInfo - Class in org.apache.spark.status.api.v1
-  
- ApplicationStatus - Enum in org.apache.spark.status.api.v1
-  
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
- 
Construct a graph from a collection of vertices and
 edges with attributes. 
- apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
- 
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`. 
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
- 
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`. 
- apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
- 
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices. 
- apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
- 
Execute a Pregel-like iterative vertex-parallel abstraction. 
- apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
- 
Constructs a standalone  VertexRDD (one that is not set up for efficient joins with an
  EdgeRDD) from an RDD of vertex-attribute pairs. 
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
- 
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
 
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
- 
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
 
- apply(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Gets an attribute by its name. 
- apply(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Gets an attribute by its index. 
- apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
- 
Gets the value of the input param or its default value if it does not exist. 
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-  
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Gets the (i, j)-th element. 
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-  
- apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Gets the value of the ith element. 
- apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
- 
Construct a node with nodeIndex, predict, impurity and isLeaf parameters. 
- apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
-  
- apply(long, String, Option<String>, String, boolean) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-  
- apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-  
- apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-  
- apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
-  
- apply(Object) - Method in class org.apache.spark.sql.Column
- 
Extracts a value or values from a complex type. 
- apply(String) - Method in class org.apache.spark.sql.DataFrame
- 
Selects a column based on the column name and returns it as a Column. 
- apply(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
- 
Creates a Column for this UDAF using the given Columns as input arguments.
 
- apply(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
- 
Creates a Column for this UDAF using the given Columns as input arguments.
 
- apply(DataFrame, Seq<Expression>, GroupedData.GroupType) - Static method in class org.apache.spark.sql.GroupedData
-  
- apply(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i. 
- apply(DataType) - Static method in class org.apache.spark.sql.types.ArrayType
- 
Construct an ArrayType object with the given element type. 
- apply(double) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply(long) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply(int) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply(String) - Static method in class org.apache.spark.sql.types.Decimal
-  
- apply() - Static method in class org.apache.spark.sql.types.DecimalType
-  
- apply(Option<PrecisionInfo>) - Static method in class org.apache.spark.sql.types.DecimalType
-  
- apply(DataType, DataType) - Static method in class org.apache.spark.sql.types.MapType
- 
Construct a MapType object with the given key type and value type. 
- apply(String) - Method in class org.apache.spark.sql.types.StructType
- 
- apply(Set<String>) - Method in class org.apache.spark.sql.types.StructType
- 
Returns a StructType containing StructFields of the given names, preserving the
 original order of fields. 
- apply(int) - Method in class org.apache.spark.sql.types.StructType
-  
- apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedFunction
-  
- apply(String) - Static method in class org.apache.spark.storage.BlockId
- 
Converts a BlockId "name" String back into a BlockId. 
- apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId
- 
- apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
-  
- apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
- 
:: DeveloperApi ::
 Create a new StorageLevel object without setting useOffHeap. 
- apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
- 
:: DeveloperApi ::
 Create a new StorageLevel object. 
- apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
- 
:: DeveloperApi ::
 Create a new StorageLevel object from its integer representation. 
- apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
- 
:: DeveloperApi ::
 Read StorageLevel object from ObjectInput stream. 
- apply(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
-  
- apply(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-  
- apply(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-  
- apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
-  
- apply(long) - Static method in class org.apache.spark.streaming.Minutes
-  
- apply(long) - Static method in class org.apache.spark.streaming.Seconds
-  
- apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
- 
Build a StatCounter from a list of values. 
- apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
- 
Build a StatCounter from a list of values passed as variable-length arguments. 
- apply(int) - Method in class org.apache.spark.util.Vector
-  
- applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-  
- applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-  
- applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-  
- applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-  
- applySchemaToPythonRDD(RDD<Object[]>, String) - Method in class org.apache.spark.sql.SQLContext
-  
- applySchemaToPythonRDD(RDD<Object[]>, StructType) - Method in class org.apache.spark.sql.SQLContext
-  
- appName() - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-  
- appName() - Method in class org.apache.spark.SparkContext
-  
- approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the approximate number of distinct items in a group. 
- approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the approximate number of distinct items in a group. 
- approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the approximate number of distinct items in a group. 
- approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the approximate number of distinct items in a group. 
- ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-  
- areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Computes the area under the precision-recall curve. 
- areaUnderROC() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
- 
Computes the area under the receiver operating characteristic (ROC) curve. 
- areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Computes the area under the receiver operating characteristic (ROC) curve. 
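
The index entry above does not say how the area is computed. As a rough illustration only, the following plain-Java sketch applies the trapezoidal rule to a list of (false positive rate, true positive rate) points. The class name `RocAreaSketch`, the method `areaUnderCurve`, and the point layout are hypothetical and are not part of the Spark API; the input is assumed to be ordered by increasing false positive rate.

```java
// Illustrative only: a generic trapezoidal-rule area, not MLlib code.
class RocAreaSketch {
    // points[i] = {falsePositiveRate, truePositiveRate},
    // assumed to be ordered by increasing false positive rate.
    static double areaUnderCurve(double[][] points) {
        double area = 0.0;
        for (int i = 1; i < points.length; i++) {
            double width = points[i][0] - points[i - 1][0];
            double meanHeight = (points[i][1] + points[i - 1][1]) / 2.0;
            area += width * meanHeight; // area of one trapezoid
        }
        return area;
    }

    public static void main(String[] args) {
        // A chance-level curve: the diagonal from (0, 0) to (1, 1).
        double[][] diagonal = { {0.0, 0.0}, {1.0, 1.0} };
        System.out.println(areaUnderCurve(diagonal));
    }
}
```
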
- argmax() - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- argmax() - Method in class org.apache.spark.mllib.linalg.SparseVector
-  
- argmax() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Find the index of a maximal element. 
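
A minimal sketch of what "the index of a maximal element" means, written against a plain `double[]` rather than a Spark `Vector`. The class `ArgmaxSketch` is hypothetical; returning the first index on ties and -1 for an empty array are assumptions of this sketch, not behaviour stated in the entry above.

```java
// Illustrative only: argmax over a plain array, not the MLlib implementation.
class ArgmaxSketch {
    static int argmax(double[] values) {
        if (values.length == 0) {
            return -1; // assumption: no valid index exists for an empty input
        }
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            // Strict comparison keeps the first index when values tie.
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(argmax(new double[] {1.0, 3.0, 2.0}));
    }
}
```
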
- arr() - Method in class org.apache.spark.rdd.PartitionGroup
-  
- array(DataType) - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type array.
 
- array(Column...) - Static method in class org.apache.spark.sql.functions
- 
Creates a new array column. 
- array(String, String...) - Static method in class org.apache.spark.sql.functions
- 
Creates a new array column. 
- array(Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Creates a new array column. 
- array(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
- 
Creates a new array column. 
- array_contains(Column, Object) - Static method in class org.apache.spark.sql.functions
- 
Returns true if the array contains the value. 
- arrayLengthGt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
- 
Check that the array length is greater than lowerBound. 
- ArrayType - Class in org.apache.spark.sql.types
-  
- ArrayType(DataType, boolean) - Constructor for class org.apache.spark.sql.types.ArrayType
-  
- ArrayType() - Constructor for class org.apache.spark.sql.types.ArrayType
- 
No-arg constructor for kryo. 
- as(Encoder<U>) - Method in class org.apache.spark.sql.Column
- 
Provides a type hint about the expected return value of this column. 
- as(String) - Method in class org.apache.spark.sql.Column
- 
Gives the column an alias. 
- as(Seq<String>) - Method in class org.apache.spark.sql.Column
- 
(Scala-specific) Assigns the given aliases to the results of a table generating function. 
- as(String[]) - Method in class org.apache.spark.sql.Column
- 
Assigns the given aliases to the results of a table generating function. 
- as(Symbol) - Method in class org.apache.spark.sql.Column
- 
Gives the column an alias. 
- as(String, Metadata) - Method in class org.apache.spark.sql.Column
- 
Gives the column an alias with metadata. 
- as(Encoder<U>) - Method in class org.apache.spark.sql.DataFrame
- 
:: Experimental ::
 Converts this DataFrame to a strongly-typed Dataset containing objects of the
 specified type, U. 
- as(String) - Method in class org.apache.spark.sql.DataFrame
- 
- as(Symbol) - Method in class org.apache.spark.sql.DataFrame
- 
(Scala-specific) Returns a new DataFrame with an alias set. 
- as(Encoder<U>) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset where each record has been mapped on to the specified type. 
- as(String) - Method in class org.apache.spark.sql.Dataset
- 
Applies a logical alias to this Dataset that can be used to disambiguate columns that have
 the same name after two Datasets have been joined. 
- asc() - Method in class org.apache.spark.sql.Column
- 
Returns an ordering used in sorting. 
- asc(String) - Static method in class org.apache.spark.sql.functions
- 
Returns a sort expression based on ascending order of the column. 
- ascii(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the numeric value of the first character of the string column, and returns the
 result as an int column. 
- asin(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the sine inverse of the given value; the returned angle is in the range
 -pi/2 through pi/2. 
- asin(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the sine inverse of the given column; the returned angle is in the range
 -pi/2 through pi/2. 
- asIntegral() - Method in class org.apache.spark.sql.types.DecimalType
-  
- asIntegral() - Method in class org.apache.spark.sql.types.DoubleType
-  
- asIntegral() - Method in class org.apache.spark.sql.types.FloatType
-  
- asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
- 
Read the elements of this stream through an iterator. 
- asJavaPairRDD() - Method in class org.apache.spark.api.r.PairwiseRRDD
-  
- asJavaRDD() - Method in class org.apache.spark.api.r.RRDD
-  
- asJavaRDD() - Method in class org.apache.spark.api.r.StringRRDD
-  
- asKeyValueIterator() - Method in class org.apache.spark.serializer.DeserializationStream
- 
Read the elements of this stream through an iterator over key-value pairs. 
- AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
-  
- AskPermissionToCommitOutput(int, int, int) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
-  
- askTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-  
- asRDDId() - Method in class org.apache.spark.storage.BlockId
-  
- assertValid() - Method in class org.apache.spark.broadcast.Broadcast
- 
Check if this broadcast is valid. 
- assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-  
- AssociationRules - Class in org.apache.spark.mllib.fpm
- 
:: Experimental :: 
- AssociationRules() - Constructor for class org.apache.spark.mllib.fpm.AssociationRules
- 
Constructs a default instance with default parameters {minConfidence = 0.8}. 
- AssociationRules.Rule<Item> - Class in org.apache.spark.mllib.fpm
- 
:: Experimental :: 
- AsyncRDDActions<T> - Class in org.apache.spark.rdd
- 
A set of asynchronous RDD actions available through an implicit conversion. 
- AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
-  
- atan(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the tangent inverse of the given value. 
- atan(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the tangent inverse of the given column. 
- atan2(Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
 polar coordinates (r, theta). 
- atan2(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
 polar coordinates (r, theta). 
- atan2(String, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
 polar coordinates (r, theta). 
- atan2(String, String) - Static method in class org.apache.spark.sql.functions
- 
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
 polar coordinates (r, theta). 
- atan2(Column, double) - Static method in class org.apache.spark.sql.functions
- 
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
 polar coordinates (r, theta). 
- atan2(String, double) - Static method in class org.apache.spark.sql.functions
- 
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
 polar coordinates (r, theta). 
- atan2(double, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
 polar coordinates (r, theta). 
- atan2(double, String) - Static method in class org.apache.spark.sql.functions
- 
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
 polar coordinates (r, theta). 
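
The rectangular-to-polar conversion that the atan2 entries describe can be sketched with `java.lang.Math` alone. This is plain Java on scalar values, not a Spark Column expression; the class `PolarSketch` and the method `toPolar` are hypothetical names. Note that `Math.atan2` takes the y coordinate first.

```java
// Illustrative only: scalar rectangular-to-polar conversion in plain Java.
class PolarSketch {
    // Returns {r, theta} for the point (x, y); theta lies in (-pi, pi].
    static double[] toPolar(double x, double y) {
        double r = Math.hypot(x, y);     // distance from the origin
        double theta = Math.atan2(y, x); // note the argument order: y, then x
        return new double[] { r, theta };
    }

    public static void main(String[] args) {
        double[] polar = toPolar(0.0, 2.0);
        System.out.println("r = " + polar[0] + ", theta = " + polar[1]);
    }
}
```
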
- attempt() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- attempt() - Method in class org.apache.spark.status.api.v1.TaskData
-  
- attemptId() - Method in class org.apache.spark.scheduler.StageInfo
-  
- attemptId() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-  
- attemptId() - Method in class org.apache.spark.status.api.v1.StageData
-  
- attemptId() - Method in class org.apache.spark.TaskContext
-  
- attemptNumber() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-  
- attemptNumber() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- attemptNumber() - Method in class org.apache.spark.TaskCommitDenied
-  
- attemptNumber() - Method in class org.apache.spark.TaskContext
- 
How many times this task has been attempted. 
- attempts() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-  
- attr() - Method in class org.apache.spark.graphx.Edge
-  
- attr() - Method in class org.apache.spark.graphx.EdgeContext
- 
The attribute associated with the edge. 
- attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- Attribute - Class in org.apache.spark.ml.attribute
- 
:: DeveloperApi ::
 Abstract class for ML attributes. 
- Attribute() - Constructor for class org.apache.spark.ml.attribute.Attribute
-  
- attribute() - Method in class org.apache.spark.sql.sources.EqualNullSafe
-  
- attribute() - Method in class org.apache.spark.sql.sources.EqualTo
-  
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
-  
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-  
- attribute() - Method in class org.apache.spark.sql.sources.In
-  
- attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
-  
- attribute() - Method in class org.apache.spark.sql.sources.IsNull
-  
- attribute() - Method in class org.apache.spark.sql.sources.LessThan
-  
- attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
-  
- attribute() - Method in class org.apache.spark.sql.sources.StringContains
-  
- attribute() - Method in class org.apache.spark.sql.sources.StringEndsWith
-  
- attribute() - Method in class org.apache.spark.sql.sources.StringStartsWith
-  
- AttributeGroup - Class in org.apache.spark.ml.attribute
- 
:: DeveloperApi ::
 Attributes that describe a vector ML column. 
- AttributeGroup(String) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
- 
Creates an attribute group without attribute info. 
- AttributeGroup(String, int) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
- 
Creates an attribute group knowing only the number of attributes. 
- AttributeGroup(String, Attribute[]) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
- 
Creates an attribute group with attributes. 
- attributes() - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Optional array of attributes. 
- AttributeType - Class in org.apache.spark.ml.attribute
- 
:: DeveloperApi ::
 An enum-like type for attribute types: AttributeType$.Numeric, AttributeType$.Nominal,
 and AttributeType$.Binary.
 
- AttributeType(String) - Constructor for class org.apache.spark.ml.attribute.AttributeType
-  
- attrType() - Method in class org.apache.spark.ml.attribute.Attribute
- 
Attribute type. 
- attrType() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
-  
- attrType() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-  
- attrType() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-  
- attrType() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-  
- available() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- avg(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the average of the values in a group. 
- avg(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the average of the values in a group. 
- avg(String...) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the mean value of each numeric column for each group. 
- avg(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the mean value of each numeric column for each group. 
- avgMetrics() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-  
- awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Wait for the execution to stop. 
- awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Deprecated.
As of 1.3.0, replaced by awaitTerminationOrTimeout(Long).
 
 
- awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
- 
Wait for the execution to stop. 
- awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
- 
Deprecated.
As of 1.3.0, replaced by awaitTerminationOrTimeout(Long).
 
 
- awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Wait for the execution to stop. 
- awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext
- 
Wait for the execution to stop. 
- cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Persist this RDD with the default storage level (`MEMORY_ONLY`). 
- cache() - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Persist this RDD with the default storage level (`MEMORY_ONLY`). 
- cache() - Method in class org.apache.spark.api.java.JavaRDD
- 
Persist this RDD with the default storage level (`MEMORY_ONLY`). 
- cache() - Method in class org.apache.spark.graphx.Graph
- 
Caches the vertices and edges associated with this graph at the previously-specified target
 storage levels, which default to MEMORY_ONLY.
 
- cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
- 
Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY. 
- cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
- 
Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY. 
- cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
- 
Caches the underlying RDD. 
- cache() - Method in class org.apache.spark.rdd.RDD
- 
Persist this RDD with the default storage level (`MEMORY_ONLY`). 
- cache() - Method in class org.apache.spark.sql.DataFrame
- 
Persist this DataFrame with the default storage level (MEMORY_AND_DISK). 
- cache() - Method in class org.apache.spark.sql.Dataset
- 
Persist this Dataset with the default storage level (MEMORY_AND_DISK). 
- cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
- 
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER) 
- cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER) 
- cache() - Method in class org.apache.spark.streaming.dstream.DStream
- 
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER) 
- cachedLeafStatuses() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-  
- cacheManager() - Method in class org.apache.spark.SparkEnv
-  
- cacheManager() - Method in class org.apache.spark.sql.SQLContext
-  
- cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
- 
Caches the specified table in-memory. 
- calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.classification.LogisticCostFun
-  
- calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.AFTCostFun
-  
- calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.LeastSquaresCostFun
-  
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
- 
:: DeveloperApi ::
 information calculation for multiclass classification 
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
- 
:: DeveloperApi ::
 variance calculation 
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
- 
:: DeveloperApi ::
 information calculation for multiclass classification 
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
- 
:: DeveloperApi ::
 variance calculation 
- calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
- 
:: DeveloperApi ::
 information calculation for multiclass classification 
- calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
- 
:: DeveloperApi ::
 information calculation for regression 
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
- 
:: DeveloperApi ::
 information calculation for multiclass classification 
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
- 
:: DeveloperApi ::
 variance calculation 
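
For orientation, the textbook formulas behind the impurity measures named in the calculate entries can be sketched as follows. This is plain Java, not the MLlib source: the class `ImpuritySketch` is hypothetical, the parameter meanings (per-class counts with a total; count, sum and sum of squares) are inferred from the signatures, and the base-2 logarithm and the zero result for empty input are assumptions of the sketch.

```java
// Illustrative only: textbook impurity formulas, not the MLlib implementation.
class ImpuritySketch {
    // Gini impurity: 1 - sum over classes of p^2, with p = count / totalCount.
    static double gini(double[] counts, double totalCount) {
        if (totalCount == 0) return 0.0; // assumption: empty node has zero impurity
        double sumSquares = 0.0;
        for (double c : counts) {
            double p = c / totalCount;
            sumSquares += p * p;
        }
        return 1.0 - sumSquares;
    }

    // Entropy: -sum over classes of p * log2(p); empty classes contribute nothing.
    static double entropy(double[] counts, double totalCount) {
        if (totalCount == 0) return 0.0;
        double h = 0.0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / totalCount;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    // Population variance from sufficient statistics: E[x^2] - (E[x])^2.
    static double variance(double count, double sum, double sumSquares) {
        if (count == 0) return 0.0;
        double mean = sum / count;
        return sumSquares / count - mean * mean;
    }

    public static void main(String[] args) {
        double[] counts = {5.0, 5.0};
        System.out.println(gini(counts, 10.0));
        System.out.println(entropy(counts, 10.0));
        System.out.println(variance(4.0, 10.0, 30.0));
    }
}
```
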
- CalendarIntervalType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The data type representing calendar time intervals. 
- CalendarIntervalType - Static variable in class org.apache.spark.sql.types.DataTypes
- 
Gets the CalendarIntervalType object. 
- call(K, Iterator<V1>, Iterator<V2>) - Method in interface org.apache.spark.api.java.function.CoGroupFunction
-  
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
-  
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
-  
- call(T) - Method in interface org.apache.spark.api.java.function.FilterFunction
-  
- call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
-  
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
-  
- call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.FlatMapGroupsFunction
-  
- call(T) - Method in interface org.apache.spark.api.java.function.ForeachFunction
-  
- call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.ForeachPartitionFunction
-  
- call(T1) - Method in interface org.apache.spark.api.java.function.Function
-  
- call() - Method in interface org.apache.spark.api.java.function.Function0
-  
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
-  
- call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
-  
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.api.java.function.Function4
-  
- call(T) - Method in interface org.apache.spark.api.java.function.MapFunction
-  
- call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.MapGroupsFunction
-  
- call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.MapPartitionsFunction
-  
- call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
-  
- call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
-  
- call(T, T) - Method in interface org.apache.spark.api.java.function.ReduceFunction
-  
- call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
-  
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.VoidFunction2
-  
- call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
-  
- call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
-  
- call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
-  
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
-  
- call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
-  
- call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
-  
- call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
-  
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
-  
- callSite() - Method in class org.apache.spark.storage.RDDInfo
-  
- callUDF(String, Column...) - Static method in class org.apache.spark.sql.functions
- 
Call a user-defined function. 
- callUDF(Function0<?>, DataType) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function1<?, ?>, DataType, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function2<?, ?, ?>, DataType, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function3<?, ?, ?, ?>, DataType, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function4<?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function5<?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function6<?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function7<?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(Function10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it's redundant with udf().
              This will be removed in Spark 2.0. 
 
- callUDF(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Call a user-defined function. 
- callUdf(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.5.0, since it was not coherent to have two functions callUdf and callUDF.
             This will be removed in Spark 2.0. 
 
- cancel() - Method in class org.apache.spark.ComplexFutureAction
-  
- cancel() - Method in interface org.apache.spark.FutureAction
- 
Cancels the execution of this action. 
- cancel() - Method in class org.apache.spark.SimpleFutureAction
-  
- cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Cancel all jobs that have been scheduled or are running. 
- cancelAllJobs() - Method in class org.apache.spark.SparkContext
- 
Cancel all jobs that have been scheduled or are running. 
- cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Cancel active jobs for the specified group. 
- cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
- 
Cancel active jobs for the specified group. 
- canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-  
- canEqual(Object) - Method in class org.apache.spark.util.MutablePair
-  
- canHandle(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-  
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-  
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-  
- canHandle(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
- 
Check if this dialect instance can handle a certain jdbc url. 
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-  
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-  
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-  
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-  
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-  
- cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
 elements (a, b) where a is in this and b is in other.
 
- cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
 elements (a, b) where a is in this and b is in other.
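
The pairing described above can be illustrated without Spark. The following is a minimal plain-Java sketch of the semantics on in-memory lists only; the class name CartesianSketch and its method are hypothetical and are not part of the Spark API, and nothing here reflects how Spark distributes the work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Spark.
public class CartesianSketch {

    /** Returns every pair (a, b) with a taken from left and b taken from right. */
    public static <A, B> List<List<Object>> cartesian(List<A> left, List<B> right) {
        List<List<Object>> pairs = new ArrayList<>();
        for (A a : left) {
            for (B b : right) {
                // Each pair is modelled as a two-element list to keep the sketch short.
                pairs.add(Arrays.<Object>asList(a, b));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints [[1, x], [1, y], [2, x], [2, y]]
        System.out.println(cartesian(Arrays.asList(1, 2), Arrays.asList("x", "y")));
    }
}
```

The result has left.size() * right.size() elements, which is why a Cartesian product of two large datasets grows quickly.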
 
- caseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover
- 
Whether to do a case-sensitive comparison over the stop words.
 Default: false 
- cast(DataType) - Method in class org.apache.spark.sql.Column
- 
Casts the column to a different data type. 
- cast(String) - Method in class org.apache.spark.sql.Column
- 
Casts the column to a different data type, using the canonical string representation
 of the type. 
- catalog() - Method in class org.apache.spark.sql.hive.HiveContext
-  
- catalog() - Method in class org.apache.spark.sql.SQLContext
-  
- CatalystScan - Interface in org.apache.spark.sql.sources
- 
::Experimental::
 An interface for experimenting with a more direct connection to the query planner. 
- Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-  
- categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- CategoricalSplit - Class in org.apache.spark.ml.tree
- 
:: DeveloperApi ::
 Split which tests a categorical feature. 
- categories() - Method in class org.apache.spark.mllib.tree.model.Split
-  
- categoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-  
- cbrt(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the cube-root of the given value. 
- cbrt(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the cube-root of the given column. 
- ceil(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the ceiling of the given value. 
- ceil(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the ceiling of the given column. 
- ceil() - Method in class org.apache.spark.sql.types.Decimal
-  
- changePrecision(int, int) - Method in class org.apache.spark.sql.types.Decimal
- 
Update precision and scale while keeping our value the same, and return true if successful. 
- checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Mark this RDD for checkpointing. 
- checkpoint() - Method in class org.apache.spark.graphx.Graph
- 
Mark this Graph for checkpointing. 
- checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
-  
- checkpoint() - Method in class org.apache.spark.rdd.RDD
- 
Mark this RDD for checkpointing. 
- checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Enable periodic checkpointing of RDDs of this DStream. 
- checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Sets the context to periodically checkpoint the DStream operations for master
 fault-tolerance. 
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Enable periodic checkpointing of RDDs of this DStream. 
- checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
- 
Set the context to periodically checkpoint the DStream operations for driver
 fault-tolerance. 
- checkpointData() - Method in class org.apache.spark.rdd.RDD
-  
- checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- checkpointDir() - Method in class org.apache.spark.SparkContext
-  
- checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
-  
- checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
-  
- checkpointFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- checkpointFile(String, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-  
- checkpointInterval() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-  
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- child() - Method in class org.apache.spark.sql.sources.Not
-  
- CHILD_CONNECTION_TIMEOUT - Static variable in class org.apache.spark.launcher.SparkLauncher
- 
Maximum time (in ms) to wait for a child process to connect back to the launcher server
 when using start(). 
- CHILD_PROCESS_LOGGER_NAME - Static variable in class org.apache.spark.launcher.SparkLauncher
- 
Logger name to use when launching a child process. 
- ChiSqSelector - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 Chi-Squared feature selection, which selects categorical features to use for predicting a
 categorical label. 
- ChiSqSelector(String) - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
-  
- ChiSqSelector() - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
-  
- ChiSqSelector - Class in org.apache.spark.mllib.feature
-  
- ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
-  
- ChiSqSelectorModel - Class in org.apache.spark.ml.feature
-  
- ChiSqSelectorModel - Class in org.apache.spark.mllib.feature
- 
Chi Squared selector model. 
- ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
-  
- chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Conduct Pearson's chi-squared goodness of fit test of the observed data against the
 expected distribution. 
- chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform
 distribution, with each category having an expected frequency of 1 / observed.size.
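
As a rough illustration of the uniform-distribution case above, the sketch below computes only the Pearson chi-squared statistic, the sum of (observed - expected)^2 / expected, where every category shares the same expected count. It is plain Java under stated assumptions: the class name ChiSqUniformSketch is hypothetical, it does not compute a p-value, and it is not Spark's implementation.

```java
// Hypothetical helper for illustration; not part of Spark.
public class ChiSqUniformSketch {

    /** Pearson chi-squared statistic of observed counts against a uniform expectation. */
    public static double statistic(double[] observed) {
        double total = 0.0;
        for (double o : observed) {
            total += o;
        }
        // Uniform expectation: each category gets a 1 / observed.length share of the total.
        double expected = total / observed.length;
        double stat = 0.0;
        for (double o : observed) {
            double diff = o - expected;
            stat += diff * diff / expected;
        }
        return stat;
    }

    public static void main(String[] args) {
        // Counts 10, 20, 30 give an expected count of 20 per category: prints 10.0
        System.out.println(statistic(new double[] {10, 20, 30}));
    }
}
```

With three categories the statistic would be compared against a chi-squared distribution with two degrees of freedom.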
 
- chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Conduct Pearson's independence test on the input contingency matrix, which cannot contain
 negative entries or columns or rows that sum up to 0. 
- chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Conduct Pearson's independence test for every feature against the label across the input RDD. 
- chiSqTest(JavaRDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Java-friendly version of chiSqTest().
 
- ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
- 
Object containing the test results for the chi-squared hypothesis test. 
- Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-  
- ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
- 
:: DeveloperApi :: 
- ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
-  
- ClassificationModel - Interface in org.apache.spark.mllib.classification
- 
Represents a classification model that predicts to which of a set of categories an example
 belongs. 
- Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
- 
:: DeveloperApi :: 
- Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
-  
- className() - Method in class org.apache.spark.ExceptionFailure
-  
- classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
-  
- classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-  
- classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
-  
- classTag() - Method in class org.apache.spark.api.java.JavaRDD
-  
- classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
-  
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-  
- classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-  
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-  
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-  
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-  
- clean(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLog
- 
Clean all the records that are older than the threshold time. 
- CleanAccum - Class in org.apache.spark
-  
- CleanAccum(long) - Constructor for class org.apache.spark.CleanAccum
-  
- CleanBroadcast - Class in org.apache.spark
-  
- CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
-  
- CleanCheckpoint - Class in org.apache.spark
-  
- CleanCheckpoint(int) - Constructor for class org.apache.spark.CleanCheckpoint
-  
- CleanRDD - Class in org.apache.spark
-  
- CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
-  
- CleanShuffle - Class in org.apache.spark
-  
- CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
-  
- CleanupTask - Interface in org.apache.spark
- 
Classes that represent cleaning tasks. 
- CleanupTaskWeakReference - Class in org.apache.spark
- 
A WeakReference associated with a CleanupTask. 
- CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
-  
- clear(Param<?>) - Method in interface org.apache.spark.ml.param.Params
-  
- clear() - Method in class org.apache.spark.sql.util.ExecutionListenerManager
- 
- clearActive() - Static method in class org.apache.spark.sql.SQLContext
- 
Clears the active SQLContext for current thread. 
- clearCache() - Method in class org.apache.spark.sql.SQLContext
- 
Removes all cached tables from the in-memory cache. 
- clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Pass-through to SparkContext.clearCallSite. 
- clearCallSite() - Method in class org.apache.spark.SparkContext
- 
Clear the thread-local property for overriding the call sites
 of actions and RDDs. 
- clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-  
- clearDependencies() - Method in class org.apache.spark.rdd.RDD
- 
Clears the dependencies of this RDD. 
- clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-  
- clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-  
- clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Clear the job's list of files added by addFile so that they do not get downloaded to
 any new nodes.
 
- clearFiles() - Method in class org.apache.spark.SparkContext
- 
Clear the job's list of files added by addFile so that they do not get downloaded to
 any new nodes.
 
- clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Clear the job's list of JARs added by addJar so that they do not get downloaded to
 any new nodes.
 
- clearJars() - Method in class org.apache.spark.SparkContext
- 
Clear the job's list of JARs added by addJar so that they do not get downloaded to
 any new nodes.
 
- clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Clear the current thread's job group ID and its description. 
- clearJobGroup() - Method in class org.apache.spark.SparkContext
- 
Clear the current thread's job group ID and its description. 
- clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
- 
Clears the threshold so that predict will output raw prediction scores.
 
- clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
- 
Clears the threshold so that predict will output raw prediction scores.
 
- clone() - Method in class org.apache.spark.SparkConf
- 
Copy this object. 
- clone() - Method in class org.apache.spark.sql.types.Decimal
-  
- clone() - Method in class org.apache.spark.storage.StorageLevel
-  
- clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-  
- clone() - Method in class org.apache.spark.util.random.BernoulliSampler
-  
- clone() - Method in class org.apache.spark.util.random.PoissonSampler
-  
- clone() - Method in interface org.apache.spark.util.random.RandomSampler
- 
Return a copy of the RandomSampler object. 
- cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
- 
Return a sampler that is the complement of the range specified by the current sampler. 
- close() - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- close() - Method in class org.apache.spark.input.PortableDataStream
- 
Closing the PortableDataStream is not needed anymore. 
- close() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
-  
- close() - Method in class org.apache.spark.serializer.DeserializationStream
-  
- close() - Method in class org.apache.spark.serializer.SerializationStream
-  
- close() - Method in class org.apache.spark.sql.sources.OutputWriter
- 
- close() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- close() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
-  
- close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-  
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLog
- 
Close this log and release any resources. 
- closeLogWriter(int) - Method in class org.apache.spark.scheduler.JobLogger
- 
Close the log file and clean the stage relationship in stageIdToJobId. 
- closureSerializer() - Method in class org.apache.spark.SparkEnv
-  
- cls() - Method in class org.apache.spark.util.MethodIdentifier
-  
- clsTag() - Method in interface org.apache.spark.sql.Encoder
- 
A ClassTag that can be used to construct an Array to contain a collection of `T`. 
- cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-  
- clusterCenters() - Method in class org.apache.spark.ml.clustering.KMeansModel
-  
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
- 
Leaf cluster centers. 
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-  
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-  
- clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-  
- cn() - Method in class org.apache.spark.mllib.feature.VocabWord
-  
- coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a new RDD that is reduced into numPartitions partitions.
 
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a new RDD that is reduced into numPartitions partitions.
 
- coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a new RDD that is reduced into numPartitions partitions.
 
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a new RDD that is reduced into numPartitions partitions.
 
- coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return a new RDD that is reduced into numPartitions partitions.
 
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return a new RDD that is reduced into numPartitions partitions.
 
- coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD that is reduced into numPartitions partitions.
 
- coalesce(int) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new DataFrame that has exactly numPartitions partitions. 
- coalesce(int) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset that has exactly numPartitions partitions. 
- coalesce(Column...) - Static method in class org.apache.spark.sql.functions
- 
Returns the first column that is not null, or null if all inputs are null. 
- coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Returns the first column that is not null, or null if all inputs are null. 
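
The "first non-null" rule above applies per row across the given columns. The plain-Java sketch below shows that rule on the values of a single row; the class name CoalesceSketch is hypothetical, and the code only mirrors the described semantics rather than calling or reproducing the Spark SQL function.

```java
// Hypothetical helper for illustration; not part of Spark.
public class CoalesceSketch {

    /** Returns the first argument that is not null, or null if every argument is null. */
    @SafeVarargs
    public static <T> T coalesce(T... values) {
        for (T value : values) {
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // One "row" with three candidate columns: prints b
        System.out.println(coalesce(null, "b", "c"));
        // Every input is null, so the result is null: prints null
        System.out.println(CoalesceSketch.<String>coalesce(null, null));
    }
}
```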
- code() - Method in class org.apache.spark.mllib.feature.VocabWord
-  
- codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
-  
- coefficients() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- coefficients() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- coefficients() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-  
- coefficientStandardErrors() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
Standard error of estimated coefficients and intercept. 
- cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other, return a resulting RDD that contains a tuple with the
 list of values for that key in this as well as other.
 
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other1 or other2, return a resulting RDD that contains a
 tuple with the list of values for that key in this, other1 and other2.
 
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other1 or other2 or other3,
 return a resulting RDD that contains a tuple with the list of values
 for that key in this, other1, other2 and other3.
 
- cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other, return a resulting RDD that contains a tuple with the
 list of values for that key in this as well as other.
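
To make the cogroup description concrete, here is a minimal plain-Java sketch on in-memory lists. It is an illustration under assumptions, not Spark code: the class CogroupSketch, the holder Grouped, and the helpers entry and demo are all hypothetical names, and a TreeMap is used only so that the printed order is predictable.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Spark.
public class CogroupSketch {

    /** Values found for one key on each side; either list may be empty. */
    public static final class Grouped<V, W> {
        public final List<V> fromThis = new ArrayList<>();
        public final List<W> fromOther = new ArrayList<>();

        @Override
        public String toString() {
            return "(" + fromThis + ", " + fromOther + ")";
        }
    }

    public static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /** For each key in either input, collects the values from both inputs. */
    public static <K extends Comparable<K>, V, W> Map<K, Grouped<V, W>> cogroup(
            List<Map.Entry<K, V>> thisPairs, List<Map.Entry<K, W>> otherPairs) {
        Map<K, Grouped<V, W>> result = new TreeMap<>();
        for (Map.Entry<K, V> e : thisPairs) {
            result.computeIfAbsent(e.getKey(), k -> new Grouped<>()).fromThis.add(e.getValue());
        }
        for (Map.Entry<K, W> e : otherPairs) {
            result.computeIfAbsent(e.getKey(), k -> new Grouped<>()).fromOther.add(e.getValue());
        }
        return result;
    }

    /** Small fixed example used by main. */
    public static Map<String, Grouped<Integer, String>> demo() {
        List<Map.Entry<String, Integer>> left =
                Arrays.asList(entry("k1", 1), entry("k1", 2), entry("k2", 3));
        List<Map.Entry<String, String>> right =
                Arrays.asList(entry("k1", "a"), entry("k3", "c"));
        return cogroup(left, right);
    }

    public static void main(String[] args) {
        // prints {k1=([1, 2], [a]), k2=([3], []), k3=([], [c])}
        System.out.println(demo());
    }
}
```

Unlike a join, a key that appears on only one side is still present in the result, paired with an empty list for the other side.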
 
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other1 or other2, return a resulting RDD that contains a
 tuple with the list of values for that key in this, other1 and other2.
 
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other1 or other2 or other3,
 return a resulting RDD that contains a tuple with the list of values
 for that key in this, other1, other2 and other3.
 
- cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other, return a resulting RDD that contains a tuple with the
 list of values for that key in this as well as other.
 
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other1 or other2, return a resulting RDD that contains a
 tuple with the list of values for that key in this, other1 and other2.
 
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
For each key k in this or other1 or other2 or other3,
 return a resulting RDD that contains a tuple with the list of values
 for that key in this, other1, other2 and other3.
 
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
For each key k in this or other1 or other2 or other3,
 return a resulting RDD that contains a tuple with the list of values
 for that key in this, other1, other2 and other3.
 
- cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-  
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-  
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-  
- cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
For each key k in this or other, return a resulting RDD that contains a tuple with the
 list of values for that key in this as well as other.
 
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
For each key k in this or other1 or other2, return a resulting RDD that contains a
 tuple with the list of values for that key in this, other1 and other2.
 
- cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
For each key k in this or other, return a resulting RDD that contains a tuple with the
 list of values for that key in this as well as other.
 
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
For each key k in this or other1 or other2, return a resulting RDD that contains a
 tuple with the list of values for that key in this, other1 and other2.
 
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
For each key k in this or other1 or other2 or other3,
 return a resulting RDD that contains a tuple with the list of values
 for that key in this, other1, other2 and other3.
 
- cogroup(GroupedDataset<K, U>, Function3<K, Iterator<V>, Iterator<U>, TraversableOnce<R>>, Encoder<R>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Applies the given function to each cogrouped data. 
- cogroup(GroupedDataset<K, U>, CoGroupFunction<K, V, U, R>, Encoder<R>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Applies the given function to each cogrouped data. 
- cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
 
- cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
 
- cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
 
- cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
 
- cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
 
- cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
 
- CoGroupedRDD<K> - Class in org.apache.spark.rdd
- 
:: DeveloperApi ::
 An RDD that cogroups its parents. 
- CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner, ClassTag<K>) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
-  
- CoGroupFunction<K,V1,V2,R> - Interface in org.apache.spark.api.java.function
- 
A function that returns zero or more output records from each grouping key and its values from 2
 Datasets. 
- col(String) - Method in class org.apache.spark.sql.DataFrame
- 
Selects a column based on the column name and returns it as a Column. 
- col(String) - Static method in class org.apache.spark.sql.functions
- 
Returns a Column based on the given column name. 
- collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an array that contains all of the elements in this RDD. 
- collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- collect() - Method in class org.apache.spark.rdd.RDD
- 
Return an array that contains all of the elements in this RDD. 
- collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD that contains all matching values by applying f.
 
- collect() - Method in class org.apache.spark.sql.DataFrame
- 
Returns an array that contains all of the Rows in this DataFrame. 
- collect() - Method in class org.apache.spark.sql.Dataset
- 
Returns an array that contains all the elements in this Dataset. 
- collect_list(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns a list of objects with duplicates. 
- collect_list(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns a list of objects with duplicates. 
- collect_set(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns a set of objects with duplicate elements eliminated. 
- collect_set(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns a set of objects with duplicate elements eliminated. 
- collectAsList() - Method in class org.apache.spark.sql.DataFrame
- 
Returns a Java list that contains all of the Rows in this DataFrame. 
- collectAsList() - Method in class org.apache.spark.sql.Dataset
- 
Returns a Java list that contains all the elements in this Dataset. 
- collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return the key-value pairs in this RDD to the master as a Map. 
- collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return the key-value pairs in this RDD to the master as a Map. 
- collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
The asynchronous version of collect, which returns a future for
 retrieving an array containing all of the elements in this RDD.
 
- collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
- 
Returns a future for retrieving all elements of this RDD. 
- collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
- 
Returns an RDD that contains for each vertex v its local edges,
 i.e., the edges that are incident on v, in the user-specified direction. 
- collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
- 
Collect the neighbor vertex ids for each vertex. 
- collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
- 
Collect the neighbor vertex attributes for each vertex. 
- collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an array that contains all of the elements in a specific partition of this RDD. 
- collectToPython() - Method in class org.apache.spark.sql.DataFrame
-  
- colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-  
- colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-  
- colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Computes column-wise summary statistics for the input RDD[Vector]. 
- Column - Class in org.apache.spark.sql
- 
:: Experimental ::
 A column that will be computed based on the data in a DataFrame. 
- Column(Expression) - Constructor for class org.apache.spark.sql.Column
-  
- Column(String) - Constructor for class org.apache.spark.sql.Column
-  
- column(String) - Static method in class org.apache.spark.sql.functions
- 
Returns a Column based on the given column name. 
- ColumnName - Class in org.apache.spark.sql
- 
:: Experimental ::
 A convenient class used for constructing schema. 
- ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
-  
- ColumnPruner - Class in org.apache.spark.ml.feature
- 
Utility transformer for removing temporary columns from a DataFrame. 
- ColumnPruner(Set<String>) - Constructor for class org.apache.spark.ml.feature.ColumnPruner
-  
- columns() - Method in class org.apache.spark.sql.DataFrame
- 
Returns all column names as an array. 
- columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Compute all cosine similarities between columns of this matrix using the brute-force
 approach of computing normalized dot products. 
- columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Compute similarities between columns of this matrix using a sampling approach. 
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Generic function to combine the elements for each key using a custom set of aggregation
 functions. 
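
The three function arguments in the signature above play the roles that the Spark documentation calls createCombiner, mergeValue and mergeCombiners. The plain-Java sketch below imitates that flow on two in-memory "partitions" to compute a per-key average. It is a sketch under assumptions: the class CombineByKeySketch and its helper methods are hypothetical, combiners are assumed to be non-null, and no partitioner, serializer or map-side-combine flag is modelled.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Spark.
public class CombineByKeySketch {

    public static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /** Combines the values of one partition: first value creates a combiner, later ones merge in. */
    public static <K, V, C> Map<K, C> combinePartition(List<Map.Entry<K, V>> partition,
            Function<V, C> createCombiner, BiFunction<C, V, C> mergeValue) {
        Map<K, C> out = new TreeMap<>();
        for (Map.Entry<K, V> e : partition) {
            C current = out.get(e.getKey());
            out.put(e.getKey(), current == null
                    ? createCombiner.apply(e.getValue())
                    : mergeValue.apply(current, e.getValue()));
        }
        return out;
    }

    /** Merges the per-partition combiners of the same key. */
    public static <K, C> Map<K, C> mergePartitions(List<Map<K, C>> partials,
            BinaryOperator<C> mergeCombiners) {
        Map<K, C> out = new TreeMap<>();
        for (Map<K, C> partial : partials) {
            for (Map.Entry<K, C> e : partial.entrySet()) {
                out.merge(e.getKey(), e.getValue(), mergeCombiners);
            }
        }
        return out;
    }

    /** Per-key average over two example partitions; the combiner is {sum, count}. */
    public static Map<String, Double> demo() {
        Function<Integer, int[]> createCombiner = v -> new int[] {v, 1};
        BiFunction<int[], Integer, int[]> mergeValue = (c, v) -> new int[] {c[0] + v, c[1] + 1};
        BinaryOperator<int[]> mergeCombiners = (x, y) -> new int[] {x[0] + y[0], x[1] + y[1]};

        List<Map.Entry<String, Integer>> p1 = Arrays.asList(entry("a", 1), entry("b", 4));
        List<Map.Entry<String, Integer>> p2 = Arrays.asList(entry("a", 3));
        Map<String, int[]> part1 = combinePartition(p1, createCombiner, mergeValue);
        Map<String, int[]> part2 = combinePartition(p2, createCombiner, mergeValue);
        Map<String, int[]> merged = mergePartitions(Arrays.asList(part1, part2), mergeCombiners);

        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, int[]> e : merged.entrySet()) {
            averages.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        // prints {a=2.0, b=4.0}
        System.out.println(demo());
    }
}
```

The combiner type C ({sum, count}) differs from the value type V (a single integer), which is the situation a plain reduce cannot express.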
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Generic function to combine the elements for each key using a custom set of aggregation
 functions. 
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Simplified version of combineByKey that hash-partitions the output RDD and uses map-side
 aggregation. 
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing
 partitioner/parallelism level and using map-side aggregation. 
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Generic function to combine the elements for each key using a custom set of aggregation
 functions. 
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD. 
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-  
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Combine elements of each key in DStream's RDDs using custom functions. 
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Combine elements of each key in DStream's RDDs using custom functions. 
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Combine elements of each key in DStream's RDDs using custom functions. 
- combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
:: Experimental ::
 Generic function to combine the elements for each key using a custom set of aggregation
 functions. 
- combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
:: Experimental ::
 Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD. 
- combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
:: Experimental ::
 Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the
 existing partitioner/parallelism level. 
- combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
-  
- combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
-  
- combinerClassName() - Method in class org.apache.spark.ShuffleDependency
-  
- combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
-  
- combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
-  
- compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- compare(Decimal) - Method in class org.apache.spark.sql.types.Decimal
-  
- compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
-  
- compareTo(SparkShutdownHook) - Method in class org.apache.spark.util.SparkShutdownHook
-  
- completed() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-  
- completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- completedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- completionTime() - Method in class org.apache.spark.scheduler.StageInfo
- 
Time when all tasks in the stage completed or when the stage was cancelled. 
- completionTime() - Method in class org.apache.spark.status.api.v1.JobData
-  
- ComplexFutureAction<T> - Class in org.apache.spark
- 
A FutureAction for actions that could trigger multiple Spark jobs. 
- ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
-  
- compressed() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Returns a vector in either dense or sparse format, whichever uses less storage. 
- compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
-  
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-  
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-  
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-  
- compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
-  
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-  
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-  
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-  
- CompressionCodec - Interface in org.apache.spark.io
- 
:: DeveloperApi ::
 CompressionCodec allows customizing which compression implementation
 is used in block storage. 
- compute(Partition, TaskContext) - Method in class org.apache.spark.api.r.BaseRRDD
-  
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
-  
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
- 
Provides the RDD[(VertexId, VD)] equivalent output.
 
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
- 
Compute the gradient and loss given the features of a single data point. 
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
- 
Compute the gradient and loss given the features of a single data point,
 add the gradient to a provided vector to avoid creating new objects, and return loss. 
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-  
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-  
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
-  
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-  
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-  
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-  
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-  
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
-  
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
-  
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
- 
Compute an updated value for weights given the gradient, stepSize, iteration number and
 regularization parameter. 
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
-  
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
-  
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
-  
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
-  
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
-  
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
- 
:: DeveloperApi ::
 Implemented by subclasses to compute a given partition. 
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
-  
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
-  
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
- 
Generate an RDD for the given duration. 
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Method that generates an RDD for the given Duration. 
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-  
- compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Method that generates an RDD for the given time. 
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-  
- computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Computes column-wise summary statistics. 
- computeCost(DataFrame) - Method in class org.apache.spark.ml.clustering.KMeansModel
- 
Return the K-means cost (sum of squared distances of points to their nearest center) for this
 model on the given data. 
- computeCost(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
- 
Computes the squared distance between the input point and the cluster center it belongs to. 
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
- 
Computes the sum of squared distances between the input points and their corresponding cluster
 centers. 
- computeCost(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
- 
Java-friendly version of computeCost().
 
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
- 
Return the K-means cost (sum of squared distances of points to their nearest center) for this
 model on the given data. 
- computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Computes the covariance matrix, treating each row as an observation. 
- computeError(org.apache.spark.mllib.tree.model.TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
- 
Method to calculate error of the base learner for the gradient boosting calculation. 
- computeError(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss
- 
Method to calculate loss when the predictions are already known. 
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Computes the Gramian matrix A^T A.
 
- computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
- 
:: DeveloperApi ::
 Compute the initial predictions and errors for a dataset for the first
 iteration of gradient boosting. 
- computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
- 
Computes the preferred locations based on input(s) and returns a location-to-block map. 
- computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Computes the top k principal components. 
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Computes singular value decomposition of this matrix. 
- concat(Column...) - Static method in class org.apache.spark.sql.functions
- 
Concatenates multiple input string columns together into a single string column. 
- concat(Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Concatenates multiple input string columns together into a single string column. 
- concat_ws(String, Column...) - Static method in class org.apache.spark.sql.functions
- 
Concatenates multiple input string columns together into a single string column,
 using the given separator. 
- concat_ws(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Concatenates multiple input string columns together into a single string column,
 using the given separator. 
- conf() - Method in class org.apache.spark.SparkEnv
-  
- conf() - Method in class org.apache.spark.sql.hive.HiveContext
-  
- conf() - Method in class org.apache.spark.sql.SQLContext
-  
- conf() - Method in class org.apache.spark.streaming.StreamingContext
-  
- confidence() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
- 
Returns the confidence of the rule. 
- confidence() - Method in class org.apache.spark.partial.BoundedDouble
-  
- configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
-  
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
- 
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456). 
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.NewHadoopRDD
- 
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456). 
- configure() - Method in class org.apache.spark.sql.hive.HiveContext
- 
Overridden by child classes that need to set configuration before the client init. 
- confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns the confusion matrix:
 predicted classes are in columns,
 ordered by ascending class label,
 as in "labels". 
- connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
- 
Compute the connected component membership of each vertex and return a graph with the vertex
 value containing the lowest vertex id in the connected component containing that vertex. 
- ConnectedComponents - Class in org.apache.spark.graphx.lib
- 
Connected components algorithm. 
- ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
-  
- consequent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
-  
- ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
- 
An input stream that always returns the same RDD on each timestep. 
- ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
-  
- contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
- 
Checks whether a parameter is explicitly specified. 
- contains(String) - Method in class org.apache.spark.SparkConf
- 
Does the configuration contain a given parameter? 
- contains(Object) - Method in class org.apache.spark.sql.Column
- 
Contains the other element. 
- contains(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Tests whether this Metadata contains a binding for a key. 
- containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
- 
Return whether the given block is stored in this block manager in O(1) time. 
- containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-  
- containsNull() - Method in class org.apache.spark.sql.types.ArrayType
-  
- context() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
- context() - Method in class org.apache.spark.InterruptibleIterator
-  
- context(SQLContext) - Method in class org.apache.spark.ml.util.MLReader
-  
- context(SQLContext) - Method in class org.apache.spark.ml.util.MLWriter
-  
- context() - Method in class org.apache.spark.rdd.RDD
- 
- context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
- context() - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return the StreamingContext associated with this DStream. 
- Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-  
- ContinuousSplit - Class in org.apache.spark.ml.tree
- 
:: DeveloperApi ::
 Split which tests a continuous feature. 
- conv(Column, int, int) - Static method in class org.apache.spark.sql.functions
- 
Convert a number in a string column from one base to another. 
- CONVERT_CTAS() - Static method in class org.apache.spark.sql.hive.HiveContext
-  
- CONVERT_METASTORE_PARQUET() - Static method in class org.apache.spark.sql.hive.HiveContext
-  
- CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING() - Static method in class org.apache.spark.sql.hive.HiveContext
-  
- convertCTAS() - Method in class org.apache.spark.sql.hive.HiveContext
- 
When true, a table created by a Hive CTAS statement (no USING clause) will be
 converted to a data source table, using the data source set by spark.sql.sources.default. 
- convertMetastoreParquet() - Method in class org.apache.spark.sql.hive.HiveContext
- 
When true, enables an experimental feature where metastore tables that use the parquet SerDe
 are automatically converted to use the Spark SQL parquet table scan, instead of the Hive
 SerDe. 
- convertMetastoreParquetWithSchemaMerging() - Method in class org.apache.spark.sql.hive.HiveContext
- 
When true, also tries to merge possibly different but compatible Parquet schemas in different
 Parquet data files. 
- convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps
- 
Convert bi-directional edges into uni-directional ones. 
- CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
-  
- CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-  
- CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegression
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayes
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRest
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeansModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LDA
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LocalLDAModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.Estimator
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.Evaluator
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Binarizer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Bucketizer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.ColumnPruner
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.HashingTF
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDF
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDFModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.IndexToString
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Interaction
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCA
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCAModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormula
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormulaModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.SQLTransformer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Tokenizer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAssembler
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2VecModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.Model
-  
- copy() - Method in class org.apache.spark.ml.param.ParamMap
- 
Creates a copy of this param map. 
- copy(ParamMap) - Method in interface org.apache.spark.ml.param.Params
- 
Creates a copy of this instance with the same UID and some extra params. 
- copy(ParamMap) - Method in class org.apache.spark.ml.Pipeline
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.PipelineModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.PipelineStage
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.Predictor
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegression
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.Transformer
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-  
- copy(ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
-  
- copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-  
- copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Get a deep copy of the matrix. 
- copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-  
- copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
-  
- copy() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Makes a deep copy of this vector. 
- copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-  
- copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
-  
- copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-  
- copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-  
- copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
- 
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the
 class when applicable for non-locking concurrent usage. 
- copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-  
- copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
-  
- copy() - Method in class org.apache.spark.mllib.random.WeibullGenerator
-  
- copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
- 
Returns a shallow copy of this instance. 
- copy() - Method in interface org.apache.spark.sql.Row
- 
Make a copy of the current Row object. 
- copy() - Method in class org.apache.spark.util.StatCounter
- 
Clone this StatCounter. 
- copyValues(T, ParamMap) - Method in interface org.apache.spark.ml.param.Params
- 
Copies param values from this instance to another instance for params shared by them. 
- coresGranted() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-  
- coresPerExecutor() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-  
- corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Compute the Pearson correlation matrix for the input RDD of Vectors. 
- corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Compute the correlation matrix for the input RDD of Vectors using the specified method. 
- corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Compute the Pearson correlation for the input RDDs. 
- corr(JavaRDD<Double>, JavaRDD<Double>) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Java-friendly version of corr()
 
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Compute the correlation for the input RDDs using the specified method. 
- corr(JavaRDD<Double>, JavaRDD<Double>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
- 
Java-friendly version of corr()
 
- corr(String, String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
Calculates the correlation of two columns of a DataFrame. 
- corr(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
Calculates the Pearson Correlation Coefficient of two columns of a DataFrame. 
- corr(Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the Pearson Correlation Coefficient for two columns. 
- corr(String, String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the Pearson Correlation Coefficient for two columns. 
- cos(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the cosine of the given value. 
- cos(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the cosine of the given column. 
- cosh(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the hyperbolic cosine of the given value. 
- cosh(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the hyperbolic cosine of the given column. 
- count() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return the number of elements in the RDD. 
- count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
- 
The number of edges in the RDD. 
- count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
- 
The number of vertices in the RDD. 
- count() - Method in class org.apache.spark.ml.regression.AFTAggregator
-  
- count() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
-  
- count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
- 
Sample size. 
- count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
- 
Sample size. 
- count() - Method in class org.apache.spark.rdd.RDD
- 
Return the number of elements in the RDD. 
- count() - Method in class org.apache.spark.sql.DataFrame
- 
- count() - Method in class org.apache.spark.sql.Dataset
- 
Returns the number of elements in the Dataset. 
- count(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the number of items in a group. 
- count(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the number of items in a group. 
- count() - Method in class org.apache.spark.sql.GroupedData
- 
Count the number of rows for each group. 
- count() - Method in class org.apache.spark.sql.GroupedDataset
- 
Returns a Dataset that contains a tuple with each key and the number of items present
 for that key. 
- count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD has a single element generated by counting each RDD
 of this DStream. 
- count() - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD has a single element generated by counting each RDD
 of this DStream. 
- count() - Method in class org.apache.spark.streaming.kafka.OffsetRange
- 
Number of messages this OffsetRange refers to. 
- count() - Method in class org.apache.spark.util.StatCounter
-  
- countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Approximate version of count() that returns a potentially incomplete result
 within a timeout, even if not all tasks have finished. 
- countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Approximate version of count() that returns a potentially incomplete result
 within a timeout, even if not all tasks have finished. 
- countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
- 
Approximate version of count() that returns a potentially incomplete result
 within a timeout, even if not all tasks have finished. 
- countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return approximate number of distinct elements in the RDD. 
- countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
- 
Return approximate number of distinct elements in the RDD. 
- countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
- 
Return approximate number of distinct elements in the RDD. 
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return approximate number of distinct values for each key in this RDD. 
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return approximate number of distinct values for each key in this RDD. 
- countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return approximate number of distinct values for each key in this RDD. 
- countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return approximate number of distinct values for each key in this RDD. 
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return approximate number of distinct values for each key in this RDD. 
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return approximate number of distinct values for each key in this RDD. 
- countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return approximate number of distinct values for each key in this RDD. 
- countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
The asynchronous version of count, which returns a
 future for counting the number of elements in this RDD.
 
- countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
- 
Returns a future for counting the number of elements in the RDD. 
- countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Count the number of elements for each key, and return the result to the master as a Map. 
- countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Count the number of elements for each key, collecting the results to a local Map. 
- countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Approximate version of countByKey that can return a partial result if it does
 not finish within a timeout. 
- countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Approximate version of countByKey that can return a partial result if it does
 not finish within a timeout. 
- countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Approximate version of countByKey that can return a partial result if it does
 not finish within a timeout. 
- countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return the count of each unique value in this RDD as a map of (value, count) pairs. 
- countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Return the count of each unique value in this RDD as a local map of (value, count) pairs. 
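
The (value, count) map described above can be pictured with a plain-Java sketch. This is only an illustration of the result shape on an in-memory list; the class and method names are hypothetical, and it does not use or reproduce Spark's distributed implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: counts occurrences of each value in a local list.
final class CountByValueSketch {
    // Returns a map from each distinct value to the number of times it occurs.
    static <T> Map<T, Long> countByValue(List<T> values) {
        Map<T, Long> counts = new HashMap<>();
        for (T v : values) {
            counts.merge(v, 1L, Long::sum); // insert 1 or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countByValue(Arrays.asList("a", "b", "a", "c", "a")));
    }
}
```

Because the whole map is held locally, this shape only suits data with a modest number of distinct values.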
- countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD contains the counts of each distinct value in
 each RDD of this DStream. 
- countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD contains the counts of each distinct value in
 each RDD of this DStream. 
- countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD contains the counts of each distinct value in
 each RDD of this DStream. 
- countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD contains the count of distinct elements in
 RDDs in a sliding window over this DStream. 
- countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD contains the count of distinct elements in
 RDDs in a sliding window over this DStream. 
- countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD contains the count of distinct elements in
 RDDs in a sliding window over this DStream. 
- countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
(Experimental) Approximate version of countByValue(). 
- countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
(Experimental) Approximate version of countByValue(). 
- countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Approximate version of countByValue(). 
- countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD has a single element generated by counting the number
 of elements in a window over this DStream. 
- countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD has a single element generated by counting the number
 of elements in a sliding window over this DStream. 
- countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the number of distinct items in a group. 
- countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the number of distinct items in a group. 
- countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the number of distinct items in a group. 
- countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the number of distinct items in a group. 
- countTowardsTaskFailures() - Method in class org.apache.spark.ExecutorLostFailure
-  
- countTowardsTaskFailures() - Method in class org.apache.spark.TaskCommitDenied
- 
If a task failed because its attempt to commit was denied, do not count this failure
 towards failing the stage. 
- countTowardsTaskFailures() - Method in interface org.apache.spark.TaskFailedReason
- 
Whether this task failure should be counted towards the maximum number of times the task is
 allowed to fail before the stage is aborted. 
- CountVectorizer - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 Extracts a vocabulary from document collections and generates a  CountVectorizerModel. 
- CountVectorizer(String) - Constructor for class org.apache.spark.ml.feature.CountVectorizer
-  
- CountVectorizer() - Constructor for class org.apache.spark.ml.feature.CountVectorizer
-  
- CountVectorizerModel - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 Converts a text document to a sparse vector of token counts. 
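
As a rough picture of "a sparse vector of token counts", the sketch below maps vocabulary indices to counts for one tokenized document. It is plain Java with hypothetical names, assumes the vocabulary is already given as an array, and assumes tokens outside the vocabulary are simply ignored; it is not the CountVectorizerModel implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Spark.
final class TokenCountSketch {
    // Returns index -> count for every vocabulary term that occurs in the document.
    // Terms that do not occur have no entry, which is what makes the result sparse.
    static Map<Integer, Double> toSparseCounts(String[] vocabulary, List<String> tokens) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < vocabulary.length; i++) {
            index.put(vocabulary[i], i);
        }
        Map<Integer, Double> sparse = new TreeMap<>();
        for (String token : tokens) {
            Integer i = index.get(token);
            if (i != null) { // assumption: out-of-vocabulary tokens are skipped
                sparse.merge(i, 1.0, Double::sum);
            }
        }
        return sparse;
    }
}
```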
- CountVectorizerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
-  
- CountVectorizerModel(String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
-  
- cov(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
Calculate the sample covariance of two numerical columns of a DataFrame. 
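
For reference, sample covariance divides the sum of products of deviations from the means by n - 1. The following plain-Java sketch shows that arithmetic on two local arrays; the class and method names are hypothetical and nothing here reflects how Spark evaluates it on a DataFrame.

```java
// Hypothetical helper, not part of Spark.
final class SampleCovarianceSketch {
    // cov(x, y) = sum((x_i - mean(x)) * (y_i - mean(y))) / (n - 1)
    static double sampleCovariance(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two equal-length arrays with at least 2 values");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        return sum / (n - 1); // n - 1 gives the sample (not population) covariance
    }
}
```

For x = {1, 2, 3} and y = {2, 4, 6} the deviation products sum to 4, so the result is 4 / 2 = 2.0.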
- crc32(Column) - Static method in class org.apache.spark.sql.functions
- 
Calculates the cyclic redundancy check value  (CRC32) of a binary column and
 returns the value as a bigint. 
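
A minimal sketch of the same kind of checksum on a local byte array, assuming the standard CRC-32 as computed by java.util.zip.CRC32 from the JDK. The wrapper class is hypothetical; the returned long corresponds to the bigint mentioned above.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical wrapper around the JDK's CRC32 class.
final class Crc32Sketch {
    // Returns the CRC-32 of the bytes as a non-negative long.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] bytes = "123456789".getBytes(StandardCharsets.US_ASCII);
        // 0xCBF43926 is the standard CRC-32 check value for this input.
        System.out.println(Long.toHexString(crc32(bytes))); // prints "cbf43926"
    }
}
```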
- CreatableRelationProvider - Interface in org.apache.spark.sql.sources
-  
- create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
- 
Deprecated. 
- create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
- 
Create a new StorageLevel object. 
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
- 
Create an RDD that executes an SQL query on a JDBC connection and reads results. 
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
- 
Create an RDD that executes an SQL query on a JDBC connection and reads results. 
- create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
- 
Create a PartitionPruningRDD. 
- create(Object...) - Static method in class org.apache.spark.sql.RowFactory
- 
Create a  Row from the given arguments. 
- create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
-  
- create(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
-  
- create(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-  
- create(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-  
- createArrayType(DataType) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates an ArrayType by specifying the data type of elements (elementType).
 
- createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates an ArrayType by specifying the data type of elements (elementType) and
 whether the array contains null values (containsNull).
 
- createCombiner() - Method in class org.apache.spark.Aggregator
-  
- createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataFrame(List<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataFrame(List<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataset(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataset(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
-  
- createDataset(List<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
-  
- createDecimalType(int, int) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates a DecimalType by specifying the precision and scale. 
- createDecimalType() - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates a DecimalType with default precision and scale, which are 10 and 0. 
- createDirectStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that directly pulls messages from Kafka Brokers
 without using any receiver. 
- createDirectStream(StreamingContext, Map<String, String>, Set<String>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that directly pulls messages from Kafka Brokers
 without using any receiver. 
- createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, Map<TopicAndPartition, Long>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that directly pulls messages from Kafka Brokers
 without using any receiver. 
- createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, Set<String>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that directly pulls messages from Kafka Brokers
 without using any receiver. 
- createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext
-  
- createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext
-  
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-  
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-  
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-  
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-  
- createJDBCTable(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().jdbc(). This will be removed in Spark 2.0.
 
 
- createLogDir() - Method in class org.apache.spark.scheduler.JobLogger
- 
Create a folder for log files; the folder's name is the creation time of the jobLogger. 
- createLogWriter(int) - Method in class org.apache.spark.scheduler.JobLogger
- 
Create a log file for one job 
- createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates a MapType by specifying the data type of keys (keyType) and values
 (valueType).
 
- createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates a MapType by specifying the data type of keys (keyType), the data type of
 values (valueType), and whether values contain any null value
 (valueContainsNull).
 
- createModel(Vector, double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-  
- createModel(Vector, double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-  
- createModel(Vector, double) - Method in class org.apache.spark.mllib.classification.SVMWithSGD
-  
- createModel(Vector, double) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
- 
Create a model given the weights and intercept 
- createModel(Vector, double) - Method in class org.apache.spark.mllib.regression.LassoWithSGD
-  
- createModel(Vector, double) - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-  
- createModel(Vector, double) - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-  
- createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent. 
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent. 
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent. 
- createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent. 
- createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent. 
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent. 
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent. 
- createRDD(SparkContext, Map<String, String>, OffsetRange[], ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an RDD from Kafka using offset ranges for each topic and partition. 
- createRDD(SparkContext, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an RDD from Kafka using offset ranges for each topic and partition. 
- createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, OffsetRange[]) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an RDD from Kafka using offset ranges for each topic and partition. 
- createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an RDD from Kafka using offset ranges for each topic and partition. 
- createRDDFromArray(JavaSparkContext, byte[][]) - Static method in class org.apache.spark.api.r.RRDD
- 
Create an RRDD given a sequence of byte arrays. 
- createRDDWithLocalProperties(Time, boolean, Function0<U>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Wrap a body of code such that the call site and operation scope
 information are passed to the RDDs created in this body properly. 
- createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.ml.source.libsvm.DefaultSource
-  
- createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider
- 
Creates a relation with the given parameters based on the contents of the given
 DataFrame. 
- createRelation(SQLContext, String[], Option<StructType>, Option<StructType>, Map<String, String>) - Method in interface org.apache.spark.sql.sources.HadoopFsRelationProvider
- 
Returns a new base relation with the given parameters, a user defined schema, and a list of
 partition columns. 
- createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
- 
Returns a new base relation with the given parameters. 
- createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider
- 
Returns a new base relation with the given parameters and user defined schema. 
- createRWorker(int) - Static method in class org.apache.spark.api.r.RRDD
- 
ProcessBuilder used to launch worker R processes. 
- createSparkContext(String, String, String, String[], Map<Object, Object>, Map<Object, Object>) - Static method in class org.apache.spark.api.r.RRDD
-  
- createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Create an input stream from a Flume source. 
- createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Create an input stream from a Flume source. 
- createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream from a Flume source. 
- createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream from a Flume source. 
- createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
- 
Creates an input stream from a Flume source. 
- createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that pulls messages from Kafka Brokers. 
- createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that pulls messages from Kafka Brokers. 
- createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that pulls messages from Kafka Brokers. 
- createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that pulls messages from Kafka Brokers. 
- createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
- 
Create an input stream that pulls messages from Kafka Brokers. 
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, String, String, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
- 
Create an input stream that pulls messages from a Kinesis stream. 
- createStream(JavaStreamingContext, String, String, String, String, int, Duration, StorageLevel, String, String) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
-  
- createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
- 
Create an input stream that receives messages pushed by an MQTT publisher. 
- createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
- 
Create an input stream that receives messages pushed by an MQTT publisher. 
- createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
- 
Create an input stream that receives messages pushed by an MQTT publisher. 
- createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
- 
Create an input stream that returns tweets received from Twitter. 
- createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
- 
Create an input stream that returns tweets received from Twitter using Twitter4J's default
 OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
 twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
 twitter4j.oauth.accessTokenSecret. 
- createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
- 
Create an input stream that returns tweets received from Twitter using Twitter4J's default
 OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
 twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
 twitter4j.oauth.accessTokenSecret. 
- createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
- 
Create an input stream that returns tweets received from Twitter using Twitter4J's default
 OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
 twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
 twitter4j.oauth.accessTokenSecret. 
- createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
- 
Create an input stream that returns tweets received from Twitter. 
- createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
- 
Create an input stream that returns tweets received from Twitter. 
- createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
- 
Create an input stream that returns tweets received from Twitter. 
- createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
- 
Create an input stream that receives messages pushed by a ZeroMQ publisher. 
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
- 
Create an input stream that receives messages pushed by a ZeroMQ publisher. 
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
- 
Create an input stream that receives messages pushed by a ZeroMQ publisher. 
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
- 
Create an input stream that receives messages pushed by a ZeroMQ publisher. 
- createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates a StructField by specifying the name (name), data type (dataType) and
 whether values of this field can be null values (nullable).
 
- createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates a StructField with empty metadata. 
- createStructType(List<StructField>) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates a StructType with the given list of StructFields (fields).
 
- createStructType(StructField[]) - Static method in class org.apache.spark.sql.types.DataTypes
- 
Creates a StructType with the given StructField array (fields).
 
- createTransformFunc() - Method in class org.apache.spark.ml.feature.DCT
-  
- createTransformFunc() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-  
- createTransformFunc() - Method in class org.apache.spark.ml.feature.NGram
-  
- createTransformFunc() - Method in class org.apache.spark.ml.feature.Normalizer
-  
- createTransformFunc() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-  
- createTransformFunc() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- createTransformFunc() - Method in class org.apache.spark.ml.feature.Tokenizer
-  
- createTransformFunc() - Method in class org.apache.spark.ml.UnaryTransformer
- 
Creates the transform function using the given param map. 
- creationSite() - Method in class org.apache.spark.rdd.RDD
- 
User code that created this RDD (e.g. 
- creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- crosstab(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
Computes a pair-wise frequency table of the given columns. 
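
A pair-wise frequency table counts how often each combination of values from two columns occurs together. The sketch below builds such a table from local two-element rows in plain Java; all names are hypothetical, and unlike a full contingency table it leaves unseen pairs out rather than filling them with zero.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Spark.
final class CrosstabSketch {
    // table.get(a).get(b) is the number of rows with first value a and second value b.
    static Map<String, Map<String, Long>> crosstab(List<String[]> rows) {
        Map<String, Map<String, Long>> table = new TreeMap<>();
        for (String[] row : rows) {
            table.computeIfAbsent(row[0], k -> new TreeMap<String, Long>())
                 .merge(row[1], 1L, Long::sum);
        }
        return table;
    }
}
```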
- CrossValidator - Class in org.apache.spark.ml.tuning
- 
:: Experimental ::
 K-fold cross validation. 
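
In k-fold cross validation the data is split into k folds, and each fold serves once as the validation set while the remaining folds form the training set. The plain-Java sketch below shows one way to derive those index sets; the round-robin assignment and all names are assumptions made for illustration, and it says nothing about how CrossValidator itself partitions a dataset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark.
final class KFoldSketch {
    // Returns k validation folds; fold f holds every index i with i % k == f.
    static List<List<Integer>> folds(int n, int k) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("require 2 <= k <= n");
        }
        List<List<Integer>> result = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            result.add(new ArrayList<Integer>());
        }
        for (int i = 0; i < n; i++) {
            result.get(i % k).add(i);
        }
        return result;
    }

    // Training indices for fold f: every index that is not in validation fold f.
    static List<Integer> trainingIndices(int n, int k, int f) {
        List<Integer> train = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (i % k != f) {
                train.add(i);
            }
        }
        return train;
    }
}
```

With n = 6 and k = 3, fold 0 validates on indices 0 and 3 and trains on 1, 2, 4 and 5.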
- CrossValidator(String) - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-  
- CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-  
- CrossValidatorModel - Class in org.apache.spark.ml.tuning
- 
:: Experimental ::
 Model from k-fold cross validation. 
- cube(Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Create a multi-dimensional cube for the current  DataFrame using the specified columns,
 so we can run aggregation on them. 
- cube(String, String...) - Method in class org.apache.spark.sql.DataFrame
- 
Create a multi-dimensional cube for the current  DataFrame using the specified columns,
 so we can run aggregation on them. 
- cube(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Create a multi-dimensional cube for the current  DataFrame using the specified columns,
 so we can run aggregation on them. 
- cube(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Create a multi-dimensional cube for the current  DataFrame using the specified columns,
 so we can run aggregation on them. 
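
In the usual SQL sense, a cube over n columns aggregates over every subset of those columns, from all of them down to the grand total. The sketch below only enumerates those 2^n grouping sets in plain Java so the shape of the result is visible; the names are hypothetical and no aggregation is performed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark.
final class CubeGroupingSetsSketch {
    // Returns all subsets of the given columns, starting with the full set
    // and ending with the empty set (the grand total).
    static List<List<String>> groupingSets(List<String> columns) {
        int n = columns.size();
        if (n > 20) {
            throw new IllegalArgumentException("too many columns for this sketch");
        }
        List<List<String>> sets = new ArrayList<>();
        for (int mask = (1 << n) - 1; mask >= 0; mask--) {
            List<String> set = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) { // bit i selects column i
                    set.add(columns.get(i));
                }
            }
            sets.add(set);
        }
        return sets;
    }
}
```

For columns a and b this yields four grouping sets: [a, b], [b], [a] and [].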
- cume_dist() - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the cumulative distribution of values within a window partition,
 i.e. 
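
A plain-Java sketch of a cumulative distribution over one local partition, assuming the common SQL definition: for each row, the number of rows whose value is less than or equal to that row's value, divided by the total number of rows. Names are hypothetical and the quadratic loop is for clarity only.

```java
// Hypothetical helper, not part of Spark.
final class CumeDistSketch {
    // result[i] = (count of values <= values[i]) / n
    static double[] cumeDist(double[] values) {
        int n = values.length;
        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            int atOrBelow = 0;
            for (double v : values) {
                if (v <= values[i]) {
                    atOrBelow++;
                }
            }
            result[i] = (double) atOrBelow / n;
        }
        return result;
    }
}
```

For the values 10, 20, 20, 30 this gives 0.25, 0.75, 0.75 and 1.0; tied values share the same result.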
- cumeDist() - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.6.0, replaced by cume_dist. This will be removed in Spark 2.0.
 
 
- current_date() - Static method in class org.apache.spark.sql.functions
- 
Returns the current date as a date column. 
- current_timestamp() - Static method in class org.apache.spark.sql.functions
- 
Returns the current timestamp as a timestamp column. 
- currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
-  
- currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
-  
- currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- databaseTypeDefinition() - Method in class org.apache.spark.sql.jdbc.JdbcType
-  
- dataDistribution() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-  
- DataFrame - Class in org.apache.spark.sql
- 
:: Experimental ::
 A distributed collection of data organized into named columns. 
- DataFrame(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.DataFrame
- 
A constructor that automatically analyzes the logical plan. 
- DataFrameHolder - Class in org.apache.spark.sql
- 
A container for a  DataFrame, used for implicit conversions. 
- DataFrameNaFunctions - Class in org.apache.spark.sql
- 
:: Experimental ::
 Functionality for working with missing data in  DataFrames. 
- DataFrameReader - Class in org.apache.spark.sql
- 
:: Experimental ::
 Interface used to load a  DataFrame from external storage systems (e.g. 
- DataFrameStatFunctions - Class in org.apache.spark.sql
- 
:: Experimental ::
 Statistic functions for  DataFrames. 
- DataFrameWriter - Class in org.apache.spark.sql
- 
:: Experimental ::
 Interface used to write a  DataFrame to external storage systems (e.g. 
- dataSchema() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
- 
Specifies schema of actual data files. 
- Dataset<T> - Class in org.apache.spark.sql
- 
:: Experimental ::
 A  Dataset is a strongly typed collection of objects that can be transformed in parallel
 using functional or relational operations. 
- DatasetHolder<T> - Class in org.apache.spark.sql
- 
A container for a  Dataset, used for implicit conversions. 
- DataSourceRegister - Interface in org.apache.spark.sql.sources
- 
:: DeveloperApi ::
 Data sources should implement this trait so that they can register an alias to their data source. 
- dataStream() - Method in class org.apache.spark.api.r.BaseRRDD
-  
- dataType() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
- 
- DataType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The base type of all Spark SQL data types. 
- DataType() - Constructor for class org.apache.spark.sql.types.DataType
-  
- dataType() - Method in class org.apache.spark.sql.types.StructField
-  
- dataType() - Method in class org.apache.spark.sql.UserDefinedFunction
-  
- DataTypes - Class in org.apache.spark.sql.types
- 
To get or create a specific data type, users should use the singleton objects and factory methods
 provided by this class. 
- DataTypes() - Constructor for class org.apache.spark.sql.types.DataTypes
-  
- DataValidators - Class in org.apache.spark.mllib.util
- 
:: DeveloperApi ::
 A collection of methods used to validate data before applying ML algorithms. 
- DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
-  
- date() - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type date.
 
- DATE() - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for nullable date type. 
- date_add(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Returns the date that is days days after start.
 
- date_format(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Converts a date/timestamp/string to a value of string in the format specified by the date
 format given by the second argument. 
- date_sub(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Returns the date that is days days before start.
 
- datediff(Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the number of days from start to end.
 
- DateType - Static variable in class org.apache.spark.sql.types.DataTypes
- 
Gets the DateType object. 
- DateType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 A date type, supporting "0001-01-01" through "9999-12-31". 
- dayofmonth(Column) - Static method in class org.apache.spark.sql.functions
- 
Extracts the day of the month as an integer from a given date/timestamp/string. 
- dayofyear(Column) - Static method in class org.apache.spark.sql.functions
- 
Extracts the day of the year as an integer from a given date/timestamp/string. 
- DB2Dialect - Class in org.apache.spark.sql.jdbc
-  
- DB2Dialect() - Constructor for class org.apache.spark.sql.jdbc.DB2Dialect
-  
- DCT - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 A feature transformer that takes the 1D discrete cosine transform of a real vector. 
- DCT(String) - Constructor for class org.apache.spark.ml.feature.DCT
-  
- DCT() - Constructor for class org.apache.spark.ml.feature.DCT
-  
- ddlParser() - Method in class org.apache.spark.sql.SQLContext
-  
- decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-  
- decimal() - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type decimal.
 
- decimal(int, int) - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type decimal.
 
- DECIMAL() - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for nullable decimal type. 
- Decimal - Class in org.apache.spark.sql.types
- 
A mutable implementation of BigDecimal that can hold a Long if values are small enough. 
- Decimal() - Constructor for class org.apache.spark.sql.types.Decimal
-  
- DecimalType - Class in org.apache.spark.sql.types
-  
- DecimalType(int, int) - Constructor for class org.apache.spark.sql.types.DecimalType
-  
- DecimalType(int) - Constructor for class org.apache.spark.sql.types.DecimalType
-  
- DecimalType() - Constructor for class org.apache.spark.sql.types.DecimalType
-  
- DecimalType(Option<PrecisionInfo>) - Constructor for class org.apache.spark.sql.types.DecimalType
-  
- DecisionTree - Class in org.apache.spark.mllib.tree
- 
A class which implements a decision tree learning algorithm for classification and regression. 
- DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
-  
- DecisionTreeClassificationModel - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Decision tree model for classification.
 
- DecisionTreeClassifier - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Decision tree learning algorithm
 for classification.
 
- DecisionTreeClassifier(String) - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- DecisionTreeClassifier() - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
- 
Decision tree model for classification or regression. 
- DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
-  
- DecisionTreeRegressionModel - Class in org.apache.spark.ml.regression
- 
:: Experimental ::
 Decision tree model for regression.
 
- DecisionTreeRegressor - Class in org.apache.spark.ml.regression
- 
:: Experimental ::
 Decision tree learning algorithm
 for regression.
 
- DecisionTreeRegressor(String) - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- DecisionTreeRegressor() - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- decode(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Decodes the first argument from binary into a string using the provided character set
 (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'). 
- decodeLabel(Vector) - Static method in class org.apache.spark.ml.classification.LabelConverter
- 
Converts a vector to a label. 
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.BinaryAttribute
- 
The default binary attribute. 
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.NominalAttribute
- 
The default nominal attribute. 
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.NumericAttribute
- 
The default numeric attribute. 
- defaultClassLoader() - Method in class org.apache.spark.serializer.Serializer
- 
Default ClassLoader to use in deserialization. 
- defaultCopy(ParamMap) - Method in interface org.apache.spark.ml.param.Params
- 
Default implementation of copy with extra params. 
- defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Default min number of partitions for Hadoop RDDs when not given by user 
- defaultMinPartitions() - Method in class org.apache.spark.SparkContext
- 
Default min number of partitions for Hadoop RDDs when not given by user.
 Note that math.min is used, so "defaultMinPartitions" cannot be higher than 2. 
- defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
- defaultMinSplits() - Method in class org.apache.spark.SparkContext
- 
Default min number of partitions for Hadoop RDDs when not given by user 
- defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Default level of parallelism to use when not given by user (e.g. 
- defaultParallelism() - Method in class org.apache.spark.SparkContext
- 
Default level of parallelism to use when not given by user (e.g. 
- defaultParamMap() - Method in interface org.apache.spark.ml.param.Params
- 
Internal param map for default values. 
- defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- defaultParams(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
- 
Choose a partitioner to use for a cogroup-like operation between a number of RDDs. 
- defaultSize() - Method in class org.apache.spark.sql.types.ArrayType
- 
The default size of a value of the ArrayType is 100 * the default size of the element type. 
- defaultSize() - Method in class org.apache.spark.sql.types.BinaryType
- 
The default size of a value of the BinaryType is 4096 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.BooleanType
- 
The default size of a value of the BooleanType is 1 byte. 
- defaultSize() - Method in class org.apache.spark.sql.types.ByteType
- 
The default size of a value of the ByteType is 1 byte. 
- defaultSize() - Method in class org.apache.spark.sql.types.CalendarIntervalType
-  
- defaultSize() - Method in class org.apache.spark.sql.types.DataType
- 
The default size of a value of this data type, used internally for size estimation. 
- defaultSize() - Method in class org.apache.spark.sql.types.DateType
- 
The default size of a value of the DateType is 4 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.DecimalType
- 
The default size of a value of the DecimalType is 4096 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.DoubleType
- 
The default size of a value of the DoubleType is 8 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.FloatType
- 
The default size of a value of the FloatType is 4 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.IntegerType
- 
The default size of a value of the IntegerType is 4 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.LongType
- 
The default size of a value of the LongType is 8 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.MapType
- 
The default size of a value of the MapType is
 100 * (the default size of the key type + the default size of the value type). 
- defaultSize() - Method in class org.apache.spark.sql.types.NullType
-  
- defaultSize() - Method in class org.apache.spark.sql.types.ShortType
- 
The default size of a value of the ShortType is 2 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.StringType
- 
The default size of a value of the StringType is 4096 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.StructType
- 
The default size of a value of the StructType is the total default sizes of all field types. 
- defaultSize() - Method in class org.apache.spark.sql.types.TimestampType
- 
The default size of a value of the TimestampType is 8 bytes. 
- defaultSize() - Method in class org.apache.spark.sql.types.UserDefinedType
- 
The default size of a value of the UserDefinedType is 4096 bytes. 
- DefaultSource - Class in org.apache.spark.ml.source.libsvm
- 
libsvm package implements Spark SQL data source API for loading LIBSVM data as DataFrame.
 
- DefaultSource() - Constructor for class org.apache.spark.ml.source.libsvm.DefaultSource
-  
- defaultStategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
- 
- defaultStrategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
- 
- defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-  
- degree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
- 
The polynomial degree to expand, which should be >= 1. 
- degrees() - Method in class org.apache.spark.graphx.GraphOps
- 
The degree of each vertex in the graph. 
- degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-  
- degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
-  
- degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
- 
Returns the degree(s) of freedom of the hypothesis test. 
- delegate() - Method in class org.apache.spark.InterruptibleIterator
-  
- dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
- 
Creates a column-major dense matrix. 
- dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Creates a dense vector from its values. 
- dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Creates a dense vector from its values. 
- dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Creates a dense vector from a double array. 
- dense_rank() - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the rank of rows within a window partition, without any gaps. 
- DenseMatrix - Class in org.apache.spark.mllib.linalg
- 
Column-major dense matrix. 
- DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-  
- DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
- 
Column-major dense matrix. 
- denseRank() - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.6.0, replaced by dense_rank. This will be removed in Spark 2.0.
 
 
- DenseVector - Class in org.apache.spark.mllib.linalg
- 
A dense vector represented by a value array. 
- DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
-  
- dependencies() - Method in class org.apache.spark.rdd.RDD
- 
Get the list of dependencies of this RDD, taking into account whether the
 RDD is checkpointed or not. 
- dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
- 
List of parent DStreams on which this DStream depends. 
- dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
-  
- Dependency<T> - Class in org.apache.spark
- 
:: DeveloperApi ::
 Base class for dependencies. 
- Dependency() - Constructor for class org.apache.spark.Dependency
-  
- depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
- 
Get depth of tree. 
- DerbyDialect - Class in org.apache.spark.sql.jdbc
-  
- DerbyDialect() - Constructor for class org.apache.spark.sql.jdbc.DerbyDialect
-  
- desc() - Method in class org.apache.spark.sql.Column
- 
Returns an ordering used in sorting. 
- desc(String) - Static method in class org.apache.spark.sql.functions
- 
Returns a sort expression based on the descending order of the column. 
- desc() - Method in class org.apache.spark.util.MethodIdentifier
-  
- describe(String...) - Method in class org.apache.spark.sql.DataFrame
- 
Computes statistics for numeric columns, including count, mean, stddev, min, and max. 
- describe(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Computes statistics for numeric columns, including count, mean, stddev, min, and max. 
- describeTopics(int) - Method in class org.apache.spark.ml.clustering.LDAModel
- 
Return the topics described by their top-weighted terms. 
- describeTopics() - Method in class org.apache.spark.ml.clustering.LDAModel
-  
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-  
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LDAModel
- 
Return the topics described by weighted terms. 
- describeTopics() - Method in class org.apache.spark.mllib.clustering.LDAModel
- 
Return the topics described by weighted terms. 
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- description() - Method in class org.apache.spark.ExceptionFailure
-  
- description() - Method in class org.apache.spark.status.api.v1.JobData
-  
- description() - Method in class org.apache.spark.storage.StorageLevel
-  
- description() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-  
- DeserializationStream - Class in org.apache.spark.serializer
- 
:: DeveloperApi ::
 A stream for reading serialized objects. 
- DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
-  
- deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
-  
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
-  
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
-  
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-  
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-  
- deserialize(Object) - Method in class org.apache.spark.sql.types.UserDefinedType
- 
Convert a SQL datum to the user type 
- deserialized() - Method in class org.apache.spark.storage.MemoryEntry
-  
- deserialized() - Method in class org.apache.spark.storage.StorageLevel
-  
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.DummySerializerInstance
-  
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-  
- destroy() - Method in class org.apache.spark.broadcast.Broadcast
- 
Destroy all data and metadata related to this broadcast variable. 
- details() - Method in class org.apache.spark.scheduler.StageInfo
-  
- details() - Method in class org.apache.spark.status.api.v1.StageData
-  
- determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
- 
Determines the bounds for range partitioning from candidates with weights indicating how many
 items each represents. 
- deterministic() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
- 
Returns true iff this function is deterministic, i.e. 
- DeveloperApi - Annotation Type in org.apache.spark.annotation
- 
A lower-level, unstable API intended for developers. 
- devianceResiduals() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
The weighted residuals, the usual residuals rescaled by
 the square root of the instance weights. 
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
- 
Generate a diagonal matrix in DenseMatrix format from the supplied values.
 
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
- 
Generate a diagonal matrix in Matrix format from the supplied values.
 
- dialectClassName() - Method in class org.apache.spark.sql.SQLContext
-  
- diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD
- 
For each vertex present in both this and other, diff returns only those vertices with
 differing values; for values that are different, keeps the values from other.
 
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
- 
For each vertex present in both this and other, diff returns only those vertices with
 differing values; for values that are different, keeps the values from other.
 
- disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
-  
- disconnect() - Method in interface org.apache.spark.launcher.SparkAppHandle
- 
Disconnects the handle from the application, without stopping it. 
- DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-  
- DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-  
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-  
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
-  
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-  
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-  
- diskSize() - Method in class org.apache.spark.storage.BlockStatus
-  
- diskSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
-  
- diskSize() - Method in class org.apache.spark.storage.RDDInfo
-  
- diskUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-  
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-  
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-  
- diskUsed() - Method in class org.apache.spark.storage.StorageStatus
- 
Return the disk space used by this block manager. 
- diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
- 
Return the disk space used by the given RDD in this block manager in O(1) time. 
- dist(Vector) - Method in class org.apache.spark.util.Vector
-  
- distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a new RDD containing the distinct elements in this RDD. 
- distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a new RDD containing the distinct elements in this RDD. 
- distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a new RDD containing the distinct elements in this RDD. 
- distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a new RDD containing the distinct elements in this RDD. 
- distinct() - Method in class org.apache.spark.api.java.JavaRDD
- 
Return a new RDD containing the distinct elements in this RDD. 
- distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return a new RDD containing the distinct elements in this RDD. 
- distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD containing the distinct elements in this RDD. 
- distinct() - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD containing the distinct elements in this RDD. 
- distinct() - Method in class org.apache.spark.sql.DataFrame
- 
- distinct() - Method in class org.apache.spark.sql.Dataset
- 
Returns a new  Dataset that contains only the unique elements of this  Dataset. 
- distinct(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
- 
Creates a Column for this UDAF using the distinct values of the given Columns as input arguments.
 
- distinct(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
- 
Creates a Column for this UDAF using the distinct values of the given Columns as input arguments.
 
- DistributedLDAModel - Class in org.apache.spark.ml.clustering
- 
:: Experimental :: 
- DistributedLDAModel - Class in org.apache.spark.mllib.clustering
-  
- DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
- 
Represents a distributively stored matrix backed by one or more RDDs. 
- div(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- divide(Object) - Method in class org.apache.spark.sql.Column
- 
Divides this expression by another expression. 
- divide(double) - Method in class org.apache.spark.util.Vector
-  
- doc() - Method in class org.apache.spark.ml.param.Param
-  
- docConcentration() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-  
- docConcentration() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-  
- docConcentration() - Method in class org.apache.spark.mllib.clustering.LDAModel
- 
Concentration parameter (commonly named "alpha") for the prior placed on documents'
 distributions over topics ("theta"). 
- docConcentration() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- doDestroy(boolean) - Method in class org.apache.spark.broadcast.Broadcast
- 
Actually destroy all data and metadata related to this broadcast variable. 
- dot(Vector) - Method in class org.apache.spark.util.Vector
-  
- DOUBLE() - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for nullable double type. 
- doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulator double variable, which tasks can "add" values
 to using the  add method. 
- doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Create an  Accumulator double variable, which tasks can "add" values
 to using the  add method. 
- DoubleArrayParam - Class in org.apache.spark.ml.param
- 
:: DeveloperApi ::
 Specialized version of Param[Array[Double]] for Java.
 
- DoubleArrayParam(Params, String, String, Function1<double[], Object>) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
-  
- DoubleArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
-  
- DoubleDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
-  
- DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
- 
A function that returns zero or more records of type Double from each input record. 
- DoubleFunction<T> - Interface in org.apache.spark.api.java.function
- 
A function that returns Doubles, and can be used to construct DoubleRDDs. 
- DoubleParam - Class in org.apache.spark.ml.param
- 
:: DeveloperApi ::
 Specialized version of Param[Double] for Java.
 
- DoubleParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-  
- DoubleParam(String, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
-  
- DoubleParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-  
- DoubleParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
-  
- DoubleRDDFunctions - Class in org.apache.spark.rdd
- 
Extra functions available on RDDs of Doubles through an implicit conversion. 
- DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
-  
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.rdd.RDD
-  
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
-  
- doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
-  
- doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
-  
- DoubleType - Static variable in class org.apache.spark.sql.types.DataTypes
- 
Gets the DoubleType object. 
- DoubleType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The data type representing Double values.
 
- doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
-  
- doUnpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast
- 
Actually unpersist the broadcasted value on the executors. 
- DRIVER_EXTRA_CLASSPATH - Static variable in class org.apache.spark.launcher.SparkLauncher
- 
Configuration key for the driver class path. 
- DRIVER_EXTRA_JAVA_OPTIONS - Static variable in class org.apache.spark.launcher.SparkLauncher
- 
Configuration key for the driver VM options. 
- DRIVER_EXTRA_LIBRARY_PATH - Static variable in class org.apache.spark.launcher.SparkLauncher
- 
Configuration key for the driver native library path. 
- DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
- 
Executor id for the driver. 
- DRIVER_MEMORY - Static variable in class org.apache.spark.launcher.SparkLauncher
- 
Configuration key for the driver memory. 
- driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
-  
- driverLogs() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-  
- drop(String) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with a column dropped. 
- drop(Column) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with a column dropped. 
- drop() - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that drops rows containing any null or NaN values. 
- drop(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that drops rows containing null or NaN values. 
- drop(String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that drops rows containing any null or NaN values
 in the specified columns. 
- drop(Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
(Scala-specific) Returns a new  DataFrame that drops rows containing any null or NaN values
 in the specified columns. 
- drop(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that drops rows containing null or NaN values
 in the specified columns. 
- drop(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
(Scala-specific) Returns a new  DataFrame that drops rows containing null or NaN values
 in the specified columns. 
- drop(int) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that drops rows containing
 fewer than  minNonNulls non-null and non-NaN values. 
- drop(int, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that drops rows containing
 fewer than  minNonNulls non-null and non-NaN values in the specified columns. 
- drop(int, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
(Scala-specific) Returns a new  DataFrame that drops rows containing fewer than
  minNonNulls non-null and non-NaN values in the specified columns. 
- dropDuplicates() - Method in class org.apache.spark.sql.DataFrame
- 
- dropDuplicates(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
(Scala-specific) Returns a new  DataFrame with duplicate rows removed, considering only
 the subset of columns. 
- dropDuplicates(String[]) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with duplicate rows removed, considering only
 the subset of columns. 
- dropLast() - Method in class org.apache.spark.ml.feature.OneHotEncoder
- 
Whether to drop the last category in the encoded vector (default: true) 
- dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
-  
- Dst - Static variable in class org.apache.spark.graphx.TripletFields
- 
Expose the destination and edge fields but not the source field. 
- dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
- 
The vertex attribute of the edge's destination vertex. 
- dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
- 
The destination vertex attribute 
- dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- dstId() - Method in class org.apache.spark.graphx.Edge
-  
- dstId() - Method in class org.apache.spark.graphx.EdgeContext
- 
The vertex id of the edge's destination vertex. 
- dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-  
- dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-  
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-  
- DStream<T> - Class in org.apache.spark.streaming.dstream
- 
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous
 sequence of RDDs (of the same type) representing a continuous stream of data (see
 org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs). 
- DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
-  
- dtypes() - Method in class org.apache.spark.sql.DataFrame
- 
Returns all column names and their data types as an array. 
- DummySerializerInstance - Class in org.apache.spark.serializer
- 
Unfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter. 
- duration() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- Duration - Class in org.apache.spark.streaming
-  
- Duration(long) - Constructor for class org.apache.spark.streaming.Duration
-  
- duration() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
- 
Return the duration of this output operation. 
- Durations - Class in org.apache.spark.streaming
-  
- Durations() - Constructor for class org.apache.spark.streaming.Durations
-  
- f() - Method in class org.apache.spark.sql.UserDefinedFunction
-  
- f1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns document-based f1-measure averaged by the number of documents 
- f1Measure(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns f1-measure for a given label (category) 
- factorial(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the factorial of the given value. 
- failed() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- failedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-  
- failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- failureReason() - Method in class org.apache.spark.scheduler.StageInfo
- 
If the stage failed, the reason why. 
- failureReason() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-  
- FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
-  
- falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns false positive rate for a given label (category) 
- feature() - Method in class org.apache.spark.mllib.tree.model.Split
-  
- featureImportances() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
- 
Estimate of the importance of each feature. 
- featureImportances() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
- 
Estimate of the importance of each feature. 
- featureIndex() - Method in class org.apache.spark.ml.tree.CategoricalSplit
-  
- featureIndex() - Method in class org.apache.spark.ml.tree.ContinuousSplit
-  
- featureIndex() - Method in interface org.apache.spark.ml.tree.Split
- 
Index of feature which this split tests 
- features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-  
- featuresCol() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-  
- featuresCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
- 
Field in "predictions" which gives the features of each instance as a vector. 
- featuresCol() - Method in class org.apache.spark.ml.regression.LinearRegressionTrainingSummary
-  
- featuresDataType() - Method in class org.apache.spark.ml.PredictionModel
- 
Returns the SQL DataType corresponding to the FeaturesType type parameter. 
- FeatureType - Class in org.apache.spark.mllib.tree.configuration
- 
Enum to describe whether a feature is "continuous" or "categorical" 
- FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
-  
- featureType() - Method in class org.apache.spark.mllib.tree.model.Split
-  
- FetchFailed - Class in org.apache.spark
- 
:: DeveloperApi ::
 Task failed to fetch shuffle data from a remote node. 
- FetchFailed(BlockManagerId, int, int, int, String) - Constructor for class org.apache.spark.FetchFailed
-  
- fetchPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
-  
- fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-  
- fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-  
- field() - Method in class org.apache.spark.storage.BroadcastBlockId
-  
- fieldIndex(String) - Method in interface org.apache.spark.sql.Row
- 
Returns the index of a given field name. 
- fieldIndex(String) - Method in class org.apache.spark.sql.types.StructType
- 
Returns the index of a given field. 
- fieldNames() - Method in class org.apache.spark.sql.types.StructType
- 
Returns all field names in an array. 
- fields() - Method in class org.apache.spark.sql.types.StructType
-  
- FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
-  
- files() - Method in class org.apache.spark.SparkContext
-  
- fileStream(String, Class<K>, Class<V>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream that monitors a Hadoop-compatible filesystem
 for new files and reads them using the given key-value types and input format. 
- fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream that monitors a Hadoop-compatible filesystem
 for new files and reads them using the given key-value types and input format. 
- fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream that monitors a Hadoop-compatible filesystem
 for new files and reads them using the given key-value types and input format. 
- fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream that monitors a Hadoop-compatible filesystem
 for new files and reads them using the given key-value types and input format. 
- fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream that monitors a Hadoop-compatible filesystem
 for new files and reads them using the given key-value types and input format. 
- fileStream(String, Function1<Path, Object>, boolean, Configuration, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream that monitors a Hadoop-compatible filesystem
 for new files and reads them using the given key-value types and input format. 
- fill(double) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that replaces null or NaN values in numeric columns with  value. 
- fill(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that replaces null values in string columns with  value. 
- fill(double, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that replaces null or NaN values in specified numeric columns. 
- fill(double, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
(Scala-specific) Returns a new  DataFrame that replaces null or NaN values in specified
 numeric columns. 
- fill(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that replaces null values in specified string columns. 
- fill(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
(Scala-specific) Returns a new  DataFrame that replaces null values in
 specified string columns. 
- fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Returns a new  DataFrame that replaces null values. 
- fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
(Scala-specific) Returns a new  DataFrame that replaces null values. 
- filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a new RDD containing only the elements that satisfy a predicate. 
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a new RDD containing only the elements that satisfy a predicate. 
- filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return a new RDD containing only the elements that satisfy a predicate. 
- filter(Function1<Graph<VD, ED>, Graph<VD2, ED2>>, Function1<EdgeTriplet<VD2, ED2>, Object>, Function2<Object, VD2, Object>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.GraphOps
- 
Filter the graph by computing some values to filter on, and applying the predicates. 
- filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- filter(Function1<Tuple2<Object, VD>, Object>) - Method in class org.apache.spark.graphx.VertexRDD
- 
Restricts the vertex set to the set of vertices satisfying the given predicate. 
- filter(Params) - Method in class org.apache.spark.ml.param.ParamMap
- 
Filters this param map for the given parent. 
- filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD containing only the elements that satisfy a predicate. 
- filter(Column) - Method in class org.apache.spark.sql.DataFrame
- 
Filters rows using the given condition. 
- filter(String) - Method in class org.apache.spark.sql.DataFrame
- 
Filters rows using the given SQL expression. 
- filter(Function1<T, Object>) - Method in class org.apache.spark.sql.Dataset
- 
(Scala-specific)
 Returns a new  Dataset that only contains elements where  func returns  true. 
- filter(FilterFunction<T>) - Method in class org.apache.spark.sql.Dataset
- 
(Java-specific)
 Returns a new  Dataset that only contains elements where  func returns  true. 
- Filter - Class in org.apache.spark.sql.sources
- 
A filter predicate for data sources. 
- Filter() - Constructor for class org.apache.spark.sql.sources.Filter
-  
- filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
- 
Return a new DStream containing only the elements that satisfy a predicate. 
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream containing only the elements that satisfy a predicate. 
- filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream containing only the elements that satisfy a predicate. 
- filterByRange(K, K) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
- 
Returns an RDD containing only the elements in the inclusive range lower to upper.
 
- FilterFunction<T> - Interface in org.apache.spark.api.java.function
- 
Base interface for a function used in Dataset's filter function. 
- filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD
- 
Filters this RDD with p, where p takes an additional parameter of type A. 
- findSplitsBins(RDD<LabeledPoint>, org.apache.spark.mllib.tree.impl.DecisionTreeMetadata) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Returns splits and bins for decision tree calculation. 
- findSynonyms(String, int) - Method in class org.apache.spark.ml.feature.Word2VecModel
- 
Find "num" number of words closest in similarity to the given word. 
- findSynonyms(Vector, int) - Method in class org.apache.spark.ml.feature.Word2VecModel
- 
Find "num" number of words closest in similarity to the given vector representation
 of the word. 
- findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-  
- findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-  
- finish(B) - Method in class org.apache.spark.sql.expressions.Aggregator
- 
Transform the output of the reduction. 
- finished() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- finishTime() - Method in class org.apache.spark.scheduler.TaskInfo
- 
The time when the task has completed successfully (including the time to remotely fetch
 results, if necessary). 
- first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-  
- first() - Method in class org.apache.spark.api.java.JavaPairRDD
-  
- first() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return the first element in this RDD. 
- first() - Method in class org.apache.spark.rdd.RDD
- 
Return the first element in this RDD. 
- first() - Method in class org.apache.spark.sql.DataFrame
- 
Returns the first row. 
- first() - Method in class org.apache.spark.sql.Dataset
- 
Returns the first element in this  Dataset. 
- first(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the first value in a group. 
- first(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the first value of a column in a group. 
- firstParent(ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Returns the first parent RDD 
- fit(DataFrame) - Method in class org.apache.spark.ml.classification.OneVsRest
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.clustering.LDA
-  
- fit(DataFrame, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator
- 
Fits a single model to the input data with optional parameters. 
- fit(DataFrame, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator
- 
Fits a single model to the input data with optional parameters. 
- fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Estimator
- 
Fits a single model to the input data with provided parameter map. 
- fit(DataFrame) - Method in class org.apache.spark.ml.Estimator
- 
Fits a model to the input data. 
- fit(DataFrame, ParamMap[]) - Method in class org.apache.spark.ml.Estimator
- 
Fits multiple models to the input data with multiple sets of parameters. 
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.CountVectorizer
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.IDF
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.PCA
- 
Computes a  PCAModel that contains the principal components of the input vectors. 
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.RFormula
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.StandardScaler
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.StringIndexer
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.VectorIndexer
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.Pipeline
- 
Fits the pipeline to the input dataset with additional parameters. 
- fit(DataFrame) - Method in class org.apache.spark.ml.Predictor
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.tuning.CrossValidator
-  
- fit(DataFrame) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- fit(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-  
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
- 
Computes the inverse document frequency. 
- fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
- 
Computes the inverse document frequency. 
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA
- 
Computes a  PCAModel that contains the principal components of the input vectors. 
- fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA
- 
Java-friendly version of fit()
 
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler
- 
Computes the mean and variance and stores as a model to be used for later scaling. 
- fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by first applying a function to all elements of this
  RDD, and then flattening the results. 
- flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD by first applying a function to all elements of this
  RDD, and then flattening the results. 
- flatMap(Function1<Row, TraversableOnce<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new RDD by first applying a function to all rows of this  DataFrame,
 and then flattening the results. 
- flatMap(Function1<T, TraversableOnce<U>>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
- 
(Scala-specific)
 Returns a new  Dataset by first applying a function to all elements of this  Dataset,
 and then flattening the results. 
- flatMap(FlatMapFunction<T, U>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
- 
(Java-specific)
 Returns a new  Dataset by first applying a function to all elements of this  Dataset,
 and then flattening the results. 
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream by applying a function to all elements of this DStream,
 and then flattening the results 
- flatMap(Function1<T, Traversable<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream by applying a function to all elements of this DStream,
 and then flattening the results 
- FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function
- 
A function that returns zero or more output records from each input record. 
- FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function
- 
A function that takes two inputs and returns zero or more output records. 
- flatMapGroups(Function2<K, Iterator<V>, TraversableOnce<U>>, Encoder<U>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Applies the given function to each group of data. 
- flatMapGroups(FlatMapGroupsFunction<K, V, U>, Encoder<U>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Applies the given function to each group of data. 
- FlatMapGroupsFunction<K,V,R> - Interface in org.apache.spark.api.java.function
- 
A function that returns zero or more output records from each grouping key and its values. 
- flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by first applying a function to all elements of this
  RDD, and then flattening the results. 
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by first applying a function to all elements of this
  RDD, and then flattening the results. 
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream by applying a function to all elements of this DStream,
 and then flattening the results 
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Pass each value in the key-value pair RDD through a flatMap function without changing the
 keys; this also retains the original RDD's partitioning. 
- flatMapValues(Function1<V, TraversableOnce<U>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Pass each value in the key-value pair RDD through a flatMap function without changing the
 keys; this also retains the original RDD's partitioning. 
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying a flatMap function to the value of each key-value pair in
 'this' DStream without changing the key. 
- flatMapValues(Function1<V, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying a flatMap function to the value of each key-value pair in
 'this' DStream without changing the key. 
- flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
FlatMaps f over this RDD, where f takes an additional parameter of type A. 
- FLOAT() - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for nullable float type. 
- FloatDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
-  
- FloatParam - Class in org.apache.spark.ml.param
- 
:: DeveloperApi ::
 Specialized version of Param[Float] for Java.
 
- FloatParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
-  
- FloatParam(String, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
-  
- FloatParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
-  
- FloatParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
-  
- floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
-  
- FloatType - Static variable in class org.apache.spark.sql.types.DataTypes
- 
Gets the FloatType object. 
- FloatType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The data type representing Float values.
 
- floatWritableConverter() - Static method in class org.apache.spark.SparkContext
-  
- floor(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the floor of the given value. 
- floor(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the floor of the given column. 
- floor() - Method in class org.apache.spark.sql.types.Decimal
-  
- floor(Duration) - Method in class org.apache.spark.streaming.Time
-  
- floor(Duration, Time) - Method in class org.apache.spark.streaming.Time
-  
- FlumeUtils - Class in org.apache.spark.streaming.flume
-  
- FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
-  
- flush() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
-  
- flush() - Method in class org.apache.spark.serializer.SerializationStream
-  
- flush() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
-  
- fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns f-measure for a given label (category) 
- fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns f1-measure for a given label (category) 
- fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns f-measure
 (equal to precision and recall, because precision equals recall) 
- fMeasureByThreshold() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
- 
Returns a DataFrame with two fields (threshold, F-Measure): the curve with beta = 1.0. 
- fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Returns the (threshold, F-Measure) curve. 
- fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Returns the (threshold, F-Measure) curve with beta = 1.0. 
- fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Aggregate the elements of each partition, and then the results for all the partitions, using a
 given associative and commutative function and a neutral "zero value". 
- fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
- 
Aggregate the elements of each partition, and then the results for all the partitions, using a
 given associative and commutative function and a neutral "zero value". 
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Merge the values for each key using an associative function and a neutral "zero value" which
 may be added to the result an arbitrary number of times, and must not change the result
 (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication). 
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Merge the values for each key using an associative function and a neutral "zero value" which
 may be added to the result an arbitrary number of times, and must not change the result
 (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication). 
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Merge the values for each key using an associative function and a neutral "zero value"
 which may be added to the result an arbitrary number of times, and must not change the result
 (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication). 
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Merge the values for each key using an associative function and a neutral "zero value" which
 may be added to the result an arbitrary number of times, and must not change the result
 (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication). 
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Merge the values for each key using an associative function and a neutral "zero value" which
 may be added to the result an arbitrary number of times, and must not change the result
 (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication). 
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Merge the values for each key using an associative function and a neutral "zero value" which
 may be added to the result an arbitrary number of times, and must not change the result
 (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication). 
- foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Applies a function f to all elements of this RDD. 
- foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
- 
Applies a function f to all elements of this RDD. 
- foreach(Function1<Row, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame
- 
Applies a function f to all rows.
 
- foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.sql.Dataset
- 
(Scala-specific)
 Runs  func on each element of this  Dataset. 
- foreach(ForeachFunction<T>) - Method in class org.apache.spark.sql.Dataset
- 
(Java-specific)
 Runs  func on each element of this  Dataset. 
- foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Deprecated.
As of release 0.9.0, replaced by foreachRDD 
 
- foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Deprecated.
As of release 0.9.0, replaced by foreachRDD 
 
- foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Deprecated.
As of 0.9.0, replaced by foreachRDD.
 
 
- foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Deprecated.
As of 0.9.0, replaced by foreachRDD.
 
 
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Applies a function f to all the active elements of dense and sparse matrix.
 
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseVector
-  
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Applies a function f to all the active elements of dense and sparse vector.
 
- foreachAsync(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
The asynchronous version of the foreach action, which
 applies a function f to all the elements of this RDD.
 
- foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
- 
Applies a function f to all elements of this RDD. 
- ForeachFunction<T> - Interface in org.apache.spark.api.java.function
- 
Base interface for a function used in Dataset's foreach function. 
- foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Applies a function f to each partition of this RDD. 
- foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
- 
Applies a function f to each partition of this RDD. 
- foreachPartition(Function1<Iterator<Row>, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame
- 
Applies a function f to each partition of this  DataFrame. 
- foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.sql.Dataset
- 
(Scala-specific)
 Runs  func on each partition of this  Dataset. 
- foreachPartition(ForeachPartitionFunction<T>) - Method in class org.apache.spark.sql.Dataset
- 
(Java-specific)
 Runs  func on each partition of this  Dataset. 
- foreachPartitionAsync(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
The asynchronous version of the foreachPartition action, which
 applies a function f to each partition of this RDD.
 
- foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
- 
Applies a function f to each partition of this RDD. 
- ForeachPartitionFunction<T> - Interface in org.apache.spark.api.java.function
- 
Base interface for a function used in Dataset's foreachPartition function. 
- foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Deprecated.
As of release 1.6.0, replaced by foreachRDD(JVoidFunction) 
 
- foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Deprecated.
As of release 1.6.0, replaced by foreachRDD(JVoidFunction2) 
 
- foreachRDD(VoidFunction<R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Apply a function to each RDD in this DStream. 
- foreachRDD(VoidFunction2<R, Time>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Apply a function to each RDD in this DStream. 
- foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Apply a function to each RDD in this DStream. 
- foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Apply a function to each RDD in this DStream. 
- foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
- 
Applies f to each element of this RDD, where f takes an additional parameter of type A. 
- format(String) - Method in class org.apache.spark.sql.DataFrameReader
- 
Specifies the input data source format. 
- format(String) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Specifies the underlying output data source. 
- format_number(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places,
 and returns the result as a string column. 
- format_string(String, Column...) - Static method in class org.apache.spark.sql.functions
- 
Formats the arguments in printf-style and returns the result as a string column. 
- format_string(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Formats the arguments in printf-style and returns the result as a string column. 
- formatVersion() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.classification.SVMModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.feature.Word2VecModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.regression.LassoModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-  
- formatVersion() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-  
- formatVersion() - Method in interface org.apache.spark.mllib.util.Saveable
- 
Current version of model save/load format. 
- formula() - Method in class org.apache.spark.ml.feature.RFormula
- 
R formula parameter. 
- FPGrowth - Class in org.apache.spark.mllib.fpm
- 
A parallel FP-growth algorithm to mine frequent itemsets. 
- FPGrowth() - Constructor for class org.apache.spark.mllib.fpm.FPGrowth
- 
Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same
 as the input data}.
 
- FPGrowth.FreqItemset<Item> - Class in org.apache.spark.mllib.fpm
- 
Frequent itemset. 
- FPGrowth.FreqItemset(Object, long) - Constructor for class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-  
- FPGrowthModel<Item> - Class in org.apache.spark.mllib.fpm
- 
Model trained by FPGrowth, which holds frequent itemsets. 
- FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
-  
- fractional() - Method in class org.apache.spark.sql.types.DecimalType
-  
- fractional() - Method in class org.apache.spark.sql.types.DoubleType
-  
- fractional() - Method in class org.apache.spark.sql.types.FloatType
-  
- freq() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-  
- freq() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
-  
- freqItems(String[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
Finding frequent items for columns, possibly with false positives. 
- freqItems(String[]) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
Finding frequent items for columns, possibly with false positives. 
- freqItems(Seq<String>, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
(Scala-specific) Finding frequent items for columns, possibly with false positives. 
- freqItems(Seq<String>) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
(Scala-specific) Finding frequent items for columns, possibly with false positives. 
- freqItemsets() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
-  
- freqSequences() - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel
-  
- from_unixtime(Column) - Static method in class org.apache.spark.sql.functions
- 
Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string
 representing the timestamp of that moment in the current system time zone in the given
 format. 
- from_unixtime(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string
 representing the timestamp of that moment in the current system time zone in the given
 format. 
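
The conversion described above (epoch seconds to a formatted timestamp string) can be sketched with java.time. This is an illustration only, not Spark's implementation: FromUnixtimeSketch is a hypothetical name, and the time zone is passed explicitly so the output is reproducible, whereas the entry above describes using the current system time zone.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration class; not part of org.apache.spark.sql.functions.
class FromUnixtimeSketch {
    // Converts seconds since 1970-01-01 00:00:00 UTC to a string in the given pattern and zone.
    static String fromUnixtime(long seconds, String pattern, ZoneId zone) {
        return DateTimeFormatter.ofPattern(pattern)
                .withZone(zone)
                .format(Instant.ofEpochSecond(seconds));
    }

    public static void main(String[] args) {
        // Passing ZoneId.systemDefault() would mirror the "current system time zone" wording.
        System.out.println(fromUnixtime(0L, "yyyy-MM-dd HH:mm:ss", ZoneId.of("UTC")));
    }
}
```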
- from_utc_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Assumes given timestamp is UTC and converts to given timezone. 
- fromAttributes(Seq<Attribute>) - Static method in class org.apache.spark.sql.types.StructType
-  
- fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
-  
- fromCaseClassString(String) - Static method in class org.apache.spark.sql.types.DataType
- 
Deprecated.
As of 1.2.0, replaced by DataType.fromJson()
 
 
- fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
- 
Generate a SparseMatrix from Coordinate List (COO) format.
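
As a rough illustration of what a COO-to-sparse-matrix conversion involves, the sketch below turns (row, column, value) triplets into compressed sparse column (CSC) arrays in plain Java. It is not Spark's code: CooToCscSketch and its fields are hypothetical names, and summing duplicate coordinates is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not org.apache.spark.mllib.linalg.SparseMatrix.
class CooToCscSketch {
    final int[] colPtrs;     // colPtrs[c]..colPtrs[c+1] delimit the entries of column c
    final int[] rowIndices;  // row index of each stored entry
    final double[] values;   // value of each stored entry

    CooToCscSketch(int[] colPtrs, int[] rowIndices, double[] values) {
        this.colPtrs = colPtrs;
        this.rowIndices = rowIndices;
        this.values = values;
    }

    static CooToCscSketch fromCOO(int numCols, int[] rows, int[] cols, double[] vals) {
        // Visit entries in column-major order: by column, then by row.
        Integer[] order = new Integer[vals.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.<Integer>comparingInt(i -> cols[i])
                .thenComparingInt(i -> rows[i]));

        int[] colPtrs = new int[numCols + 1];
        List<Integer> outRows = new ArrayList<>();
        List<Double> outVals = new ArrayList<>();
        int prevRow = -1;
        int prevCol = -1;
        for (int idx : order) {
            int r = rows[idx];
            int c = cols[idx];
            if (r == prevRow && c == prevCol) {
                // Assumption of this sketch: duplicate coordinates are summed.
                int last = outVals.size() - 1;
                outVals.set(last, outVals.get(last) + vals[idx]);
            } else {
                outRows.add(r);
                outVals.add(vals[idx]);
                colPtrs[c + 1]++;
                prevRow = r;
                prevCol = c;
            }
        }
        // Turn per-column counts into cumulative offsets.
        for (int c = 0; c < numCols; c++) {
            colPtrs[c + 1] += colPtrs[c];
        }
        return new CooToCscSketch(
                colPtrs,
                outRows.stream().mapToInt(Integer::intValue).toArray(),
                outVals.stream().mapToDouble(Double::doubleValue).toArray());
    }

    public static void main(String[] args) {
        CooToCscSketch m = fromCOO(2,
                new int[] {0, 2, 1, 2}, new int[] {0, 1, 0, 1},
                new double[] {1.0, 4.0, 2.0, 0.5});
        System.out.println(Arrays.toString(m.colPtrs));
        System.out.println(Arrays.toString(m.rowIndices));
        System.out.println(Arrays.toString(m.values));
    }
}
```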
 
- fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
- 
- fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
- 
Create a graph from EdgePartitions, setting referenced vertices to `defaultVertexAttr`. 
- fromEdges(RDD<Edge<ED>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
- 
Creates an EdgeRDD from a set of edges. 
- fromEdges(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
- 
Construct a graph from a collection of edges. 
- fromEdges(EdgeRDD<?>, int, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
- 
Constructs a VertexRDD containing all vertices referred to in edges.
 
- fromEdgeTuples(RDD<Tuple2<Object, Object>>, VD, Option<PartitionStrategy>, StorageLevel, StorageLevel, ClassTag<VD>) - Static method in class org.apache.spark.graphx.Graph
- 
Construct a graph from a collection of edges encoded as vertex id pairs. 
- fromExistingRDDs(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
- 
Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the
 vertices. 
- fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
- 
- fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
- 
- fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-  
- fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD
- 
Convert a JavaRDD of key-value pairs to JavaPairRDD. 
- fromJson(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Parses the JSON representation of a vector into a Vector. 
- fromJson(String) - Static method in class org.apache.spark.sql.types.DataType
-  
- fromJson(String) - Static method in class org.apache.spark.sql.types.Metadata
- 
Creates a Metadata instance from JSON. 
- fromName(String) - Static method in class org.apache.spark.ml.attribute.AttributeType
- 
- fromOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-  
- fromOld(DecisionTreeModel, DecisionTreeClassifier, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
- 
(private[ml]) Convert a model from the old API 
- fromOld(GradientBoostedTreesModel, GBTClassifier, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
- 
(private[ml]) Convert a model from the old API 
- fromOld(RandomForestModel, RandomForestClassifier, Map<Object, Object>, int, int) - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
- 
(private[ml]) Convert a model from the old API 
- fromOld(DecisionTreeModel, DecisionTreeRegressor, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
- 
(private[ml]) Convert a model from the old API 
- fromOld(GradientBoostedTreesModel, GBTRegressor, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
- 
(private[ml]) Convert a model from the old API 
- fromOld(RandomForestModel, RandomForestRegressor, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
- 
(private[ml]) Convert a model from the old API 
- fromOld(Node, Map<Object, Object>) - Static method in class org.apache.spark.ml.tree.Node
- 
Create a new Node from the old Node format, recursively creating child nodes as needed. 
- fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-  
- fromPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions
- 
Implicit conversion from a pair RDD to MLPairRDDFunctions. 
- fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-  
- fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-  
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-  
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RDDFunctions
- 
Implicit conversion from an RDD to RDDFunctions. 
- fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
-  
- fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
- 
- fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
- 
- fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-  
- fromStage(Stage, int, Option<Object>, Seq<Seq<TaskLocation>>) - Static method in class org.apache.spark.scheduler.StageInfo
- 
Construct a StageInfo from a Stage. 
- fromString(String) - Static method in enum org.apache.spark.JobExecutionStatus
-  
- fromString(String) - Static method in class org.apache.spark.mllib.tree.loss.Losses
-  
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus
-  
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.StageStatus
-  
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.TaskSorting
-  
- fromString(String) - Static method in class org.apache.spark.storage.StorageLevel
- 
:: DeveloperApi ::
 Return the StorageLevel object with the specified name. 
- fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Creates an attribute group from a StructField instance.
 
- fullOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a full outer join of this and other.
 
- fullOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a full outer join of this and other.
 
- fullOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a full outer join of this and other.
 
- fullOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a full outer join of this and other.
 
- fullOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a full outer join of this and other.
 
- fullOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a full outer join of this and other.
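
The semantics of a full outer join can be sketched in plain Java: every key from either side appears in the result, and a side with no match is represented by an empty Optional. This is an illustration, not the RDD implementation. FullOuterJoinSketch is a hypothetical name, and the sketch assumes at most one value per key on each side, which is not required of real pair RDDs.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration class; not part of the Spark API.
class FullOuterJoinSketch {
    // Joins two keyed collections, keeping keys that appear on either side.
    static <K extends Comparable<K>, V, W> Map<K, Map.Entry<Optional<V>, Optional<W>>>
            fullOuterJoin(Map<K, V> left, Map<K, W> right) {
        Set<K> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());
        Map<K, Map.Entry<Optional<V>, Optional<W>>> out = new TreeMap<>();
        for (K k : keys) {
            // A missing side becomes Optional.empty() instead of dropping the key.
            out.put(k, new SimpleEntry<>(
                    Optional.ofNullable(left.get(k)),
                    Optional.ofNullable(right.get(k))));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> left = new HashMap<>();
        left.put("a", 1);
        left.put("b", 2);
        Map<String, String> right = new HashMap<>();
        right.put("b", "x");
        right.put("c", "y");
        System.out.println(fullOuterJoin(left, right));
    }
}
```

An inner join would keep only key "b" in this example; the full outer join keeps "a" and "c" as well, each with one empty side.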
 
- fullOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
 
- fullOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
 
- fullOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
 
- fullOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
 
- fullOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
 
- fullOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
 
- fullStackTrace() - Method in class org.apache.spark.ExceptionFailure
-  
- Function<T1,R> - Interface in org.apache.spark.api.java.function
- 
Base interface for functions whose return types do not create special RDDs. 
- function(Function4<Time, KeyType, Option<ValueType>, State<StateType>, Option<MappedType>>) - Static method in class org.apache.spark.streaming.StateSpec
- 
- function(Function3<KeyType, Option<ValueType>, State<StateType>, MappedType>) - Static method in class org.apache.spark.streaming.StateSpec
- 
- function(Function4<Time, KeyType, Optional<ValueType>, State<StateType>, Optional<MappedType>>) - Static method in class org.apache.spark.streaming.StateSpec
- 
- function(Function3<KeyType, Optional<ValueType>, State<StateType>, MappedType>) - Static method in class org.apache.spark.streaming.StateSpec
- 
- Function0<R> - Interface in org.apache.spark.api.java.function
- 
A zero-argument function that returns an R. 
- Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function
- 
A two-argument function that takes arguments of type T1 and T2 and returns an R. 
- Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function
- 
A three-argument function that takes arguments of type T1, T2 and T3 and returns an R. 
- Function4<T1,T2,T3,T4,R> - Interface in org.apache.spark.api.java.function
- 
A four-argument function that takes arguments of type T1, T2, T3 and T4 and returns an R. 
- functionRegistry() - Method in class org.apache.spark.sql.hive.HiveContext
-  
- functionRegistry() - Method in class org.apache.spark.sql.SQLContext
-  
- functions - Class in org.apache.spark.sql
-  
- functions() - Constructor for class org.apache.spark.sql.functions
-  
- FutureAction<T> - Interface in org.apache.spark
- 
A future for the result of an action to support cancellation. 
- futureExecutionContext() - Static method in class org.apache.spark.rdd.AsyncRDDActions
-  
- gain() - Method in class org.apache.spark.ml.tree.InternalNode
-  
- gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-  
- gamma1() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- gamma2() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- gamma6() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- gamma7() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- GammaGenerator - Class in org.apache.spark.mllib.random
- 
:: DeveloperApi ::
 Generates i.i.d. 
- GammaGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.GammaGenerator
-  
- gammaJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- gammaJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- gammaJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- gammaRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input
  shape and scale.
 
- gammaShape() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-  
- gammaShape() - Method in class org.apache.spark.mllib.clustering.LDAModel
- 
Shape parameter for random initialization of variational parameter gamma. 
- gammaShape() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- gammaVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the
 gamma distribution with the input shape and scale.
 
- gaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer
- 
Indicates whether regex splits on gaps (true) or matches tokens (false). 
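
The difference between the two modes can be shown with java.util.regex alone. This is a sketch of the idea, not the RegexTokenizer implementation: GapsSketch and tokenize are hypothetical names, and any other tokenizer parameters are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not org.apache.spark.ml.feature.RegexTokenizer.
class GapsSketch {
    static List<String> tokenize(String text, String regex, boolean gaps) {
        Pattern p = Pattern.compile(regex);
        List<String> tokens = new ArrayList<>();
        if (gaps) {
            // gaps = true: the regex describes the separators between tokens.
            for (String t : p.split(text)) {
                if (!t.isEmpty()) {
                    tokens.add(t);
                }
            }
        } else {
            // gaps = false: the regex describes the tokens themselves.
            Matcher m = p.matcher(text);
            while (m.find()) {
                tokens.add(m.group());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello, big world", "\\W+", true));
        System.out.println(tokenize("hello, big world", "\\w+", false));
    }
}
```

Both calls in main yield the same three tokens; what changes is whether the pattern is written for the separators or for the tokens.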
- GaussianMixture - Class in org.apache.spark.mllib.clustering
- 
This class performs expectation maximization for multivariate Gaussian
 Mixture Models (GMMs). 
- GaussianMixture() - Constructor for class org.apache.spark.mllib.clustering.GaussianMixture
- 
Constructs a default instance. 
- GaussianMixtureModel - Class in org.apache.spark.mllib.clustering
- 
Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points
 are drawn from each Gaussian i=1..k with probability w(i); mu(i) and sigma(i) are
 the respective mean and covariance for each Gaussian distribution i=1..k. 
- GaussianMixtureModel(double[], MultivariateGaussian[]) - Constructor for class org.apache.spark.mllib.clustering.GaussianMixtureModel
-  
- gaussians() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-  
- GBTClassificationModel - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Gradient-Boosted Trees (GBTs) model for classification.
 
- GBTClassificationModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.classification.GBTClassificationModel
- 
Construct a GBTClassificationModel 
- GBTClassifier - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Gradient-Boosted Trees (GBTs) learning algorithm for classification.
 
- GBTClassifier(String) - Constructor for class org.apache.spark.ml.classification.GBTClassifier
-  
- GBTClassifier() - Constructor for class org.apache.spark.ml.classification.GBTClassifier
-  
- GBTRegressionModel - Class in org.apache.spark.ml.regression
- 
:: Experimental :: 
- GBTRegressionModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.regression.GBTRegressionModel
- 
Construct a GBTRegressionModel 
- GBTRegressor - Class in org.apache.spark.ml.regression
- 
:: Experimental ::
 Gradient-Boosted Trees (GBTs) learning algorithm for regression.
 
- GBTRegressor(String) - Constructor for class org.apache.spark.ml.regression.GBTRegressor
-  
- GBTRegressor() - Constructor for class org.apache.spark.ml.regression.GBTRegressor
-  
- GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression
- 
:: DeveloperApi ::
 GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM). 
- GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-  
- GeneralizedLinearModel - Class in org.apache.spark.mllib.regression
- 
:: DeveloperApi ::
 GeneralizedLinearModel (GLM) represents a model trained using
 GeneralizedLinearAlgorithm. 
- GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
-  
- generateAssociationRules(double) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
- 
Generates association rules for the Items in freqItemsets.
 
- generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
- 
Generate an RDD containing test data for KMeans. 
- generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
- 
For compatibility, the generated data without specifying the mean and variance
 will have zero mean and variance of (1.0/3.0) since the original output range is
 [-1, 1] with uniform distribution, and the variance of uniform distribution
 is (b - a)^2 / 12, which will be (1.0/3.0) 
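
The arithmetic in this entry can be checked directly: for a uniform distribution on [a, b] with a = -1 and b = 1, the variance (b - a)^2 / 12 equals 4 / 12 = 1/3. The sketch below evaluates the closed form and compares it with a simple midpoint-rule integration; UniformVarianceSketch is a hypothetical name and is not part of LinearDataGenerator.

```java
// Hypothetical illustration class; not part of org.apache.spark.mllib.util.
class UniformVarianceSketch {
    // Closed-form variance of a uniform distribution on [a, b].
    static double closedForm(double a, double b) {
        return (b - a) * (b - a) / 12.0;
    }

    // Midpoint-rule approximation of the integral of (x - mean)^2 / (b - a) over [a, b].
    static double numeric(double a, double b, int steps) {
        double mean = (a + b) / 2.0;
        double h = (b - a) / steps;
        double sum = 0.0;
        for (int i = 0; i < steps; i++) {
            double x = a + (i + 0.5) * h;
            sum += (x - mean) * (x - mean) * h / (b - a);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(closedForm(-1.0, 1.0));
        System.out.println(numeric(-1.0, 1.0, 1000));
    }
}
```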
- generateLinearInput(double, double[], double[], double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-  
- generateLinearInput(double, double[], double[], double[], int, int, double, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-  
- generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
- 
Return a Java List of synthetic data randomly generated according to a multi
 collinear model. 
- generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
- 
Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso,
 and unregularized variants. 
- generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
- 
Generate an RDD containing test data for LogisticRegression. 
- generateRandomEdges(int, int, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-  
- geq(Object) - Method in class org.apache.spark.sql.Column
- 
Greater than or equal to an expression. 
- get() - Method in interface org.apache.spark.FutureAction
- 
Blocks and returns the result of this job. 
- get(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
- 
Optionally returns the value associated with a param. 
- get(Param<T>) - Method in interface org.apache.spark.ml.param.Params
-  
- get(String) - Method in class org.apache.spark.SparkConf
- 
Get a parameter; throws a NoSuchElementException if it's not set 
- get(String, String) - Method in class org.apache.spark.SparkConf
- 
Get a parameter, falling back to a default if not set 
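
The contrast between the two get overloads above (throwing when a key is unset versus falling back to a default) can be sketched with a plain map. ConfSketch is a hypothetical stand-in written for illustration; it is not org.apache.spark.SparkConf and does not model any of its other behavior.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for illustration; this is not org.apache.spark.SparkConf.
class ConfSketch {
    private final Map<String, String> settings = new HashMap<>();

    ConfSketch set(String key, String value) {
        settings.put(key, value);
        return this;
    }

    // Strict lookup: an unset key is an error.
    String get(String key) {
        String v = settings.get(key);
        if (v == null) {
            throw new NoSuchElementException(key);
        }
        return v;
    }

    // Lenient lookup: an unset key yields the supplied default.
    String get(String key, String defaultValue) {
        String v = settings.get(key);
        return v != null ? v : defaultValue;
    }

    public static void main(String[] args) {
        ConfSketch conf = new ConfSketch().set("spark.app.name", "demo");
        System.out.println(conf.get("spark.app.name"));
        System.out.println(conf.get("spark.master", "local[*]"));
    }
}
```

The strict form suits settings that must be present, while the default form suits optional tuning values.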
- get() - Static method in class org.apache.spark.SparkEnv
- 
Returns the SparkEnv. 
- get(String) - Static method in class org.apache.spark.SparkFiles
- 
Get the absolute path of a file added through SparkContext.addFile().
 
- get(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i. 
- get() - Method in class org.apache.spark.streaming.State
- 
Get the state if it exists, otherwise it will throw java.util.NoSuchElementException.
 
- get() - Static method in class org.apache.spark.TaskContext
- 
Return the currently active TaskContext. 
- get_json_object(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Extracts json object from a json string based on json path specified, and returns json string
 of the extracted json object. 
- getActive() - Static method in class org.apache.spark.streaming.StreamingContext
- 
:: Experimental :: 
- getActiveJobIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
- 
Returns an array containing the ids of all active jobs. 
- getActiveJobIds() - Method in class org.apache.spark.SparkStatusTracker
- 
Returns an array containing the ids of all active jobs. 
- getActiveOrCreate(Function0<StreamingContext>) - Static method in class org.apache.spark.streaming.StreamingContext
- 
:: Experimental :: 
- getActiveOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
- 
:: Experimental :: 
- getActiveStageIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
- 
Returns an array containing the ids of all active stages. 
- getActiveStageIds() - Method in class org.apache.spark.SparkStatusTracker
- 
Returns an array containing the ids of all active stages. 
- getAkkaConf() - Method in class org.apache.spark.SparkConf
- 
Get all akka conf variables set on this SparkConf 
- getAlgo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getAll() - Method in class org.apache.spark.SparkConf
- 
Get all parameters as a list of pairs 
- getAllConfs() - Method in class org.apache.spark.sql.SQLContext
- 
Return all the configuration properties that have been set (i.e. 
- getAllPools() - Method in class org.apache.spark.SparkContext
- 
:: DeveloperApi ::
 Return pools for fair scheduler 
- getAlpha() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Alias for getDocConcentration
 
- getAnyValAs(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value of a given fieldName. 
- getAppId() - Method in interface org.apache.spark.launcher.SparkAppHandle
- 
Returns the application ID, or null if not yet known.
 
- getAppId() - Method in class org.apache.spark.SparkConf
- 
Returns the Spark application id, valid in the Driver after TaskScheduler registration and
 from the start in the Executor. 
- getAs(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i. 
- getAs(String) - Method in interface org.apache.spark.sql.Row
- 
Returns the value of a given fieldName. 
- getAsymmetricAlpha() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Alias for getAsymmetricDocConcentration
 
- getAsymmetricDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Concentration parameter (commonly named "alpha") for the prior placed on documents'
 distributions over topics ("theta"). 
- getAttr(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Gets an attribute by its name. 
- getAttr(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Gets an attribute by its index. 
- getAvroSchema() - Method in class org.apache.spark.SparkConf
- 
Gets all the avro schemas in the configuration used in the generic Avro record serializer 
- getBeta() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Alias for getTopicConcentration
 
- getBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
- 
Return the given block stored in this block manager in O(1) time. 
- getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf
- 
Get a parameter as a boolean, falling back to a default if not set 
- getBoolean(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i as a primitive boolean. 
- getBoolean(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a Boolean. 
- getBooleanArray(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a Boolean array. 
- getByte(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i as a primitive byte. 
- getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
-  
- getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
- 
The three methods below are helpers for accessing the local map, a property of the SparkEnv of
 the local process. 
- getCaseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-  
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-  
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
- 
Get the custom datatype mapping for the given jdbc meta information. 
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-  
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-  
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-  
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-  
- getCategoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- getCheckpointDir() - Method in class org.apache.spark.SparkContext
-  
- getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Gets the name of the file to which this RDD was checkpointed 
- getCheckpointFile() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- getCheckpointFile() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- getCheckpointFile() - Method in class org.apache.spark.rdd.RDD
- 
Gets the name of the directory to which this RDD was checkpointed. 
- getCheckpointFiles() - Method in class org.apache.spark.graphx.Graph
- 
Gets the name of the files to which this Graph was checkpointed. 
- getCheckpointFiles() - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- getCheckpointInterval() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Period (in iterations) between checkpoints. 
- getCheckpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getConf() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Return a copy of this JavaSparkContext's configuration. 
- getConf() - Method in class org.apache.spark.rdd.HadoopRDD
-  
- getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
-  
- getConf() - Method in class org.apache.spark.SparkContext
- 
Return a copy of this SparkContext's configuration. 
- getConf(String) - Method in class org.apache.spark.sql.SQLContext
- 
Return the value of Spark SQL configuration property for the given key. 
- getConf(String, String) - Method in class org.apache.spark.sql.SQLContext
- 
Return the value of Spark SQL configuration property for the given key. 
- getConnection() - Method in interface org.apache.spark.rdd.JdbcRDD.ConnectionFactory
-  
- getConvergenceTol() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Return the largest change in log-likelihood at which convergence is
 considered to have occurred. 
- getDate(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i of date type as java.sql.Date. 
- getDecimal(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i of decimal type as java.math.BigDecimal. 
- getDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params
- 
Gets the default value of a parameter. 
- getDegree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-  
- getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-  
- getDependencies() - Method in class org.apache.spark.rdd.RDD
- 
Implemented by subclasses to return how this RDD depends on parent RDDs. 
- getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-  
- getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-  
- getDeprecatedConfig(String, SparkConf) - Static method in class org.apache.spark.SparkConf
- 
Looks for available deprecated keys for the given config option, and returns the first
 value available.
- getDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Concentration parameter (commonly named "alpha") for the prior placed on documents'
 distributions over topics ("theta"). 
- getDouble(String, double) - Method in class org.apache.spark.SparkConf
- 
Get a parameter as a double, falling back to a default if not set 
- getDouble(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i as a primitive double. 
- getDouble(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a Double. 
- getDoubleArray(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a Double array. 
- getEpsilon() - Method in class org.apache.spark.mllib.clustering.KMeans
- 
The distance threshold within which we consider centers to have converged.
- getExecutorEnv() - Method in class org.apache.spark.SparkConf
- 
Get all executor environment variables set on this SparkConf 
- getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext
- 
Return a map from the slave to the max memory available for caching and the remaining
 memory available for caching. 
- getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext
- 
:: DeveloperApi ::
 Return information about blocks stored in all of the slaves 
- getField(String) - Method in class org.apache.spark.sql.Column
- 
An expression that gets a field by name in a StructType.
 
- getFinalValue() - Method in class org.apache.spark.partial.PartialResult
- 
Blocking method to wait for and return the final value. 
- getFloat(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i as a primitive float. 
- getFormula() - Method in class org.apache.spark.ml.feature.RFormula
-  
- getGaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- getImpurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getIndices() - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- getInitializationMode() - Method in class org.apache.spark.mllib.clustering.KMeans
- 
The initialization algorithm. 
- getInitializationSteps() - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Number of steps for the k-means|| initialization mode 
- getInitialModel() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Return the user supplied initial GMM, if supplied 
- getInitialPositionInStream(int) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
-  
- getInputFormat(JobConf) - Method in class org.apache.spark.rdd.HadoopRDD
-  
- getInt(String, int) - Method in class org.apache.spark.SparkConf
- 
Get a parameter as an integer, falling back to a default if not set 
- getInt(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i as a primitive int. 
- getInverse() - Method in class org.apache.spark.ml.feature.DCT
-  
- getItem(Object) - Method in class org.apache.spark.sql.Column
- 
An expression that gets an item at position ordinal out of an array,
 or gets a value by key key in a MapType.
 
- getJavaMap(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i of map type as a Map.
 
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-  
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-  
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-  
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
- 
Retrieve the jdbc / sql type for a given datatype. 
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-  
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-  
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-  
- getJobConf() - Method in class org.apache.spark.rdd.HadoopRDD
-  
- getJobIdsForGroup(String) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
- 
Return a list of all known jobs in a particular job group. 
- getJobIdsForGroup(String) - Method in class org.apache.spark.SparkStatusTracker
- 
Return a list of all known jobs in a particular job group. 
- getJobInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
- 
Returns job information, or null if the job info could not be found or was garbage collected.
 
- getJobInfo(int) - Method in class org.apache.spark.SparkStatusTracker
- 
Returns job information, or None if the job info could not be found or was garbage collected.
 
- getK() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Gets the desired number of leaf clusters. 
- getK() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Return the number of Gaussians in the mixture model 
- getK() - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Number of clusters to create (k). 
- getK() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Number of topics to infer. 
- getKappa() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
- 
Learning rate: exponential decay rate 
- getLabels() - Method in class org.apache.spark.ml.feature.IndexToString
-  
- getLambda() - Method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- getLDAModel(double[]) - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer
-  
- getLearningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- getLeastGroupHash(String) - Method in class org.apache.spark.rdd.PartitionCoalescer
- 
Sorts and gets the least element of the list associated with key in groupHash.
 The returned PartitionGroup is the least loaded of all groups that represent the machine "key".
- getList(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i of array type as List.
 
- getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Get a local property set in this thread, or null if it is missing. 
- getLocalProperty(String) - Method in class org.apache.spark.SparkContext
- 
Get a local property set in this thread, or null if it is missing. 
- getLong(String, long) - Method in class org.apache.spark.SparkConf
- 
Get a parameter as a long, falling back to a default if not set 
- getLong(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i as a primitive long. 
- getLong(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a Long. 
- getLongArray(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a Long array. 
- getLoss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- getLossType() - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- getLossType() - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- getMap(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i of map type as a Scala Map. 
- getMap() - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Returns the immutable version of this map. 
- getMaxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getMaxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Gets the max number of k-means iterations to split clusters. 
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Return the maximum number of iterations to run 
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Maximum number of iterations to run. 
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Maximum number of iterations for learning. 
- getMaxLocalProjDBSize() - Method in class org.apache.spark.mllib.fpm.PrefixSpan
- 
Gets the maximum number of items allowed in a projected database before local processing. 
- getMaxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getMaxPatternLength() - Method in class org.apache.spark.mllib.fpm.PrefixSpan
- 
Gets the maximal pattern length (i.e. 
- getMessage() - Method in exception org.apache.spark.sql.AnalysisException
-  
- getMetadata(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a Metadata. 
- getMetadataArray(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a Metadata array. 
- getMetricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-  
- getMetricName() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-  
- getMetricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-  
- getMetricsSources(String) - Method in class org.apache.spark.TaskContext
- 
:: DeveloperApi ::
 Returns all metrics sources with the given name which are associated with the instance
 which runs the task. 
- getMinDivisibleClusterSize() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Gets the minimum number of points (if >= 1.0) or the minimum proportion of points
 (if <1.0) of a divisible cluster.
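The two readings of this setting can be sketched in plain Java, without any Spark dependency. The class name, the helper `minPoints`, and the choice to round fractional results up are assumptions made for illustration, as is reading the proportion as a fraction of the total number of points; the sketch is not taken from the BisectingKMeans implementation.

```java
public class MinDivisibleClusterSizeSketch {
    // Hypothetical helper: converts the configured value into a minimum point count.
    // Values >= 1.0 are read as an absolute number of points; values < 1.0 are read
    // as a proportion of totalPoints. Rounding up is an assumption of this sketch.
    public static long minPoints(double minDivisibleClusterSize, long totalPoints) {
        if (minDivisibleClusterSize >= 1.0) {
            return (long) Math.ceil(minDivisibleClusterSize);
        }
        return (long) Math.ceil(minDivisibleClusterSize * totalPoints);
    }

    public static void main(String[] args) {
        System.out.println(minPoints(5.0, 1000));   // absolute count
        System.out.println(minPoints(0.25, 1000));  // proportion of 1000 points
    }
}
```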
 
- getMiniBatchFraction() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
- 
Mini-batch fraction, which sets the fraction of documents sampled and used in each iteration
- getMinInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getMinInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getMinSupport() - Method in class org.apache.spark.mllib.fpm.PrefixSpan
- 
Get the minimal support (i.e. 
- getMinTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- getModel() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-  
- getModel() - Method in class org.apache.spark.ml.clustering.LDAModel
- 
Returns underlying spark.mllib model, which may be local or distributed 
- getModel() - Method in class org.apache.spark.ml.clustering.LocalLDAModel
-  
- getModelType() - Method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- getN() - Method in class org.apache.spark.ml.feature.NGram
-  
- getNames() - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- getNode(int, Node) - Static method in class org.apache.spark.mllib.tree.model.Node
- 
Traces down from a root node to get the node with the given node index. 
- getNumClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getNumFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
-  
- getNumFeatures() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
- 
The dimension of training features. 
- getNumIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- getNumPartitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return the number of partitions in this RDD. 
- getNumPartitions() - Method in class org.apache.spark.rdd.RDD
- 
Returns the number of partitions of this RDD. 
- getNumValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute
- 
Get the number of values, either from numValues or from values.
 
- getOldDataset(DataFrame, String) - Static method in class org.apache.spark.ml.clustering.LDA
- 
Get dataset for spark.mllib LDA 
- getOptimizeDocConcentration() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
- 
Optimize docConcentration, indicates whether docConcentration (Dirichlet parameter for
 document-topic distribution) will be optimized during training. 
- getOptimizer() - Method in class org.apache.spark.mllib.clustering.LDA
- 
:: DeveloperApi :: 
- getOption(String) - Method in class org.apache.spark.SparkConf
- 
Get a parameter as an Option 
- getOption() - Method in class org.apache.spark.streaming.State
- 
Get the state as an Option.
 
- getOrCreate(SparkConf) - Static method in class org.apache.spark.SparkContext
- 
This function may be used to get or instantiate a SparkContext and register it as a
 singleton object. 
- getOrCreate() - Static method in class org.apache.spark.SparkContext
- 
This function may be used to get or instantiate a SparkContext and register it as a
 singleton object. 
- getOrCreate(SparkContext) - Static method in class org.apache.spark.sql.SQLContext
- 
Get the singleton SQLContext if it exists or create a new one using the given SparkContext. 
- getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
 
 
- getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
 
 
- getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
 
 
- getOrCreate(String, Function0<JavaStreamingContext>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. 
- getOrCreate(String, Function0<JavaStreamingContext>, Configuration) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. 
- getOrCreate(String, Function0<JavaStreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. 
- getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
- 
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. 
- getOrDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params
- 
Gets the value of a param in the embedded param map or its default value. 
- getOrElse(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
- 
Returns the value associated with a param or a default value. 
- getP() - Method in class org.apache.spark.ml.feature.Normalizer
-  
- getParam(String) - Method in interface org.apache.spark.ml.param.Params
-  
- getParents(int) - Method in class org.apache.spark.NarrowDependency
- 
Get the parent partitions for a child partition. 
- getParents(int) - Method in class org.apache.spark.OneToOneDependency
-  
- getParents(int) - Method in class org.apache.spark.RangeDependency
-  
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-  
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-  
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-  
- getPartition(long, long, int) - Method in interface org.apache.spark.graphx.PartitionStrategy
- 
Returns the partition number for a given edge. 
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-  
- getPartition(Object) - Method in class org.apache.spark.HashPartitioner
-  
- getPartition(Object) - Method in class org.apache.spark.Partitioner
-  
- getPartition(Object) - Method in class org.apache.spark.RangePartitioner
-  
- getPartitionId() - Static method in class org.apache.spark.TaskContext
- 
Returns the partition id of currently active TaskContext. 
- getPartitions() - Method in class org.apache.spark.api.r.BaseRRDD
-  
- getPartitions() - Method in class org.apache.spark.graphx.EdgeRDD
-  
- getPartitions() - Method in class org.apache.spark.graphx.VertexRDD
-  
- getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
-  
- getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
-  
- getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
-  
- getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
-  
- getPartitions() - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- getPartitions() - Method in class org.apache.spark.rdd.PartitionPruningRDD
-  
- getPartitions() - Method in class org.apache.spark.rdd.RDD
- 
Implemented by subclasses to return the set of partitions in this RDD. 
- getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
-  
- getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
-  
- getPath() - Method in class org.apache.spark.input.PortableDataStream
-  
- getPattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- getPersistentRDDs() - Method in class org.apache.spark.SparkContext
- 
Returns an immutable map of RDDs that have marked themselves as persistent via a cache() call.
- getPoolForName(String) - Method in class org.apache.spark.SparkContext
- 
:: DeveloperApi ::
 Return the pool associated with the given name, if one exists 
- getPreferredLocations(Partition) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
-  
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
-  
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
- 
Optionally overridden by subclasses to specify placement preferences. 
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ShuffledRDD
-  
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
-  
- getQuantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getRDDStorageInfo() - Method in class org.apache.spark.SparkContext
- 
:: DeveloperApi ::
 Return information about which RDDs are cached, whether they are in memory or on disk, how much space
 they take, etc. 
- getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
- 
Gets the receiver object that will be sent to the worker nodes
 to receive data. 
- getRootDirectory() - Static method in class org.apache.spark.SparkFiles
- 
Get the root directory that contains files added through SparkContext.addFile().
 
- getRuns() - Method in class org.apache.spark.mllib.clustering.KMeans
- 
:: Experimental ::
 Number of runs of the algorithm to execute in parallel. 
- getScalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-  
- getSchedulingMode() - Method in class org.apache.spark.SparkContext
- 
Return current scheduling mode 
- getSchema(Class<?>) - Method in class org.apache.spark.sql.SQLContext
-  
- getSeed() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Gets the random seed. 
- getSeed() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Return the random seed 
- getSeed() - Method in class org.apache.spark.mllib.clustering.KMeans
- 
The random seed for cluster initialization. 
- getSeed() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Random seed 
- getSeq(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i of array type as a Scala Seq. 
- getSerializer(Serializer) - Static method in class org.apache.spark.serializer.Serializer
-  
- getSerializer(Option<Serializer>) - Static method in class org.apache.spark.serializer.Serializer
-  
- getShort(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i as a primitive short. 
- getSizeAsBytes(String) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as bytes; throws a NoSuchElementException if it's not set. 
- getSizeAsBytes(String, String) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as bytes, falling back to a default if not set. 
- getSizeAsBytes(String, long) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as bytes, falling back to a default if not set. 
- getSizeAsGb(String) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as Gibibytes; throws a NoSuchElementException if it's not set. 
- getSizeAsGb(String, String) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as Gibibytes, falling back to a default if not set. 
- getSizeAsKb(String) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as Kibibytes; throws a NoSuchElementException if it's not set. 
- getSizeAsKb(String, String) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as Kibibytes, falling back to a default if not set. 
- getSizeAsMb(String) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as Mebibytes; throws a NoSuchElementException if it's not set. 
- getSizeAsMb(String, String) - Method in class org.apache.spark.SparkConf
- 
Get a size parameter as Mebibytes, falling back to a default if not set. 
- getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Get Spark's home location from either a value set through the constructor,
 or the spark.home Java property, or the SPARK_HOME environment variable
 (in that order of preference). 
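The lookup order described above can be sketched as a small standalone Java method. The class and method names are hypothetical, and the three sources are passed in as plain strings so the fallback logic can be seen in isolation; this is an illustration of the stated order of preference, not the JavaSparkContext implementation.

```java
import java.util.Optional;

public class SparkHomeLookupSketch {
    // Hypothetical helper: returns the first non-null value, checking the
    // constructor value first, then the spark.home property, then SPARK_HOME.
    public static Optional<String> resolve(String constructorValue,
                                           String homeProperty,
                                           String homeEnv) {
        if (constructorValue != null) {
            return Optional.of(constructorValue);
        }
        if (homeProperty != null) {
            return Optional.of(homeProperty);
        }
        return Optional.ofNullable(homeEnv);
    }

    public static void main(String[] args) {
        // No constructor value in this sketch, so only the property and the
        // environment variable are consulted.
        Optional<String> home = resolve(null,
                System.getProperty("spark.home"),
                System.getenv("SPARK_HOME"));
        System.out.println(home.orElse("(not set)"));
    }
}
```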
- getSplits() - Method in class org.apache.spark.ml.feature.Bucketizer
-  
- getSQLDialect() - Method in class org.apache.spark.sql.hive.HiveContext
-  
- getSQLDialect() - Method in class org.apache.spark.sql.SQLContext
-  
- getStageInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
- 
Returns stage information, or null if the stage info could not be found or was
 garbage collected.
 
- getStageInfo(int) - Method in class org.apache.spark.SparkStatusTracker
- 
Returns stage information, or None if the stage info could not be found or was
 garbage collected.
 
- getStages() - Method in class org.apache.spark.ml.Pipeline
-  
- getState() - Method in interface org.apache.spark.launcher.SparkAppHandle
- 
Returns the current application state. 
- getState() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
:: DeveloperApi :: 
- getState() - Method in class org.apache.spark.streaming.StreamingContext
- 
:: DeveloperApi :: 
- getStatement() - Method in class org.apache.spark.ml.feature.SQLTransformer
-  
- getStopWords() - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Get the RDD's current storage level, or StorageLevel.NONE if none is set. 
- getStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- getStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- getStorageLevel() - Method in class org.apache.spark.rdd.RDD
- 
Get the RDD's current storage level, or StorageLevel.NONE if none is set. 
- getString(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i as a String object. 
- getString(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a String. 
- getStringArray(String) - Method in class org.apache.spark.sql.types.Metadata
- 
Gets a String array. 
- getStruct(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i of struct type as a Row object.
- getSubsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getTableExistsQuery(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
- 
Get the SQL query that should be used to find if the given table exists. 
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-  
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-  
- getTau0() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
- 
A (positive) learning parameter that downweights early iterations. 
- getThreadLocal() - Static method in class org.apache.spark.SparkEnv
- 
Returns the ThreadLocal SparkEnv. 
- getThreshold() - Method in class org.apache.spark.ml.classification.LogisticRegression
-  
- getThreshold() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- getThreshold() - Method in class org.apache.spark.ml.feature.Binarizer
-  
- getThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
- 
Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions. 
- getThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
- 
Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions. 
- getThresholds() - Method in class org.apache.spark.ml.classification.LogisticRegression
-  
- getThresholds() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- getTimeAsMs(String) - Method in class org.apache.spark.SparkConf
- 
Get a time parameter as milliseconds; throws a NoSuchElementException if it's not set. 
- getTimeAsMs(String, String) - Method in class org.apache.spark.SparkConf
- 
Get a time parameter as milliseconds, falling back to a default if not set. 
- getTimeAsSeconds(String) - Method in class org.apache.spark.SparkConf
- 
Get a time parameter as seconds; throws a NoSuchElementException if it's not set. 
- getTimeAsSeconds(String, String) - Method in class org.apache.spark.SparkConf
- 
Get a time parameter as seconds, falling back to a default if not set. 
- getTimestamp(int) - Method in interface org.apache.spark.sql.Row
- 
Returns the value at position i of timestamp type as java.sql.Timestamp.
- gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo
- 
The time when the task started remotely getting the result. 
- getToLowercase() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- getTopicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
- 
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
 distributions over terms. 
- getTreeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- getUseNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- getValidationTol() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- getValue() - Method in class org.apache.spark.broadcast.Broadcast
- 
Actually get the broadcasted value. 
- getValue(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute
- 
Gets a value given its index. 
- getValuesMap(Seq<String>) - Method in interface org.apache.spark.sql.Row
- 
Returns a Map(name -> value) for the requested fieldNames
 For primitive types if value is null it returns 'zero value' specific for primitive
 ie. 
- getVectors() - Method in class org.apache.spark.ml.feature.Word2VecModel
- 
Returns a dataframe with two fields, "word" and "vector", with "word" being a String
 and the vector the DenseVector that it is mapped to.
- getVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel
-  
- Gini - Class in org.apache.spark.mllib.tree.impurity
- 
:: Experimental ::
 Class for calculating the
 Gini impurity during binary classification.
 
- Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
-  
- globalTopicTotals() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-  
- globalTopicTotals() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
- 
Aggregate distributions over topics from all term vertices. 
- glom() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an RDD created by coalescing all elements within each partition into an array. 
- glom() - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD created by coalescing all elements within each partition into an array. 
- glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
 this DStream. 
- glom() - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
 this DStream. 
- gradient() - Method in class org.apache.spark.ml.classification.LogisticAggregator
-  
- gradient() - Method in class org.apache.spark.ml.regression.AFTAggregator
-  
- gradient() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
-  
- Gradient - Class in org.apache.spark.mllib.optimization
- 
:: DeveloperApi ::
 Class used to compute the gradient for a loss function, given a single data point. 
- Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
-  
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
- 
Method to calculate the gradients for the gradient boosting calculation for least
 absolute error calculation. 
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
- 
Method to calculate the loss gradients for the gradient boosting calculation for binary
 classification
 The gradient with respect to F(x) is: - 4 y / (1 + exp(2 y F(x))) 
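As a worked illustration of the formula above, the following standalone Java sketch evaluates -4 y / (1 + exp(2 y F(x))) for a few inputs. The class name, the parameter order (prediction first, then label), and the encoding of labels as -1.0 or +1.0 are assumptions made for this example; it does not call the Spark LogLoss class.

```java
public class LogLossGradientSketch {
    // Evaluates -4 y / (1 + exp(2 y F(x))), where prediction plays the role of F(x)
    // and label plays the role of y. Assumption: label is -1.0 or +1.0.
    public static double gradient(double prediction, double label) {
        return -4.0 * label / (1.0 + Math.exp(2.0 * label * prediction));
    }

    public static void main(String[] args) {
        // At F(x) = 0 the denominator is 2, so the magnitude is 2 for either label.
        System.out.println(gradient(0.0, 1.0));
        System.out.println(gradient(0.0, -1.0));
        // A confident, correct prediction gives a gradient close to zero.
        System.out.println(gradient(5.0, 1.0));
    }
}
```

The sign of the result is opposite to the label, and the magnitude shrinks as the prediction agrees more strongly with the label.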
- gradient(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss
- 
Method to calculate the gradients for the gradient boosting calculation. 
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
- 
Method to calculate the gradients for the gradient boosting calculation for least
 squares error calculation. 
- GradientBoostedTrees - Class in org.apache.spark.mllib.tree
- 
A class that implements
 Stochastic Gradient Boosting for regression and binary classification.
 
- GradientBoostedTrees(BoostingStrategy) - Constructor for class org.apache.spark.mllib.tree.GradientBoostedTrees
-  
- GradientBoostedTreesModel - Class in org.apache.spark.mllib.tree.model
- 
Represents a gradient boosted trees model. 
- GradientBoostedTreesModel(Enumeration.Value, DecisionTreeModel[], double[]) - Constructor for class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-  
- GradientDescent - Class in org.apache.spark.mllib.optimization
- 
Class used to solve an optimization problem using Gradient Descent. 
- Graph<VD,ED> - Class in org.apache.spark.graphx
- 
The Graph abstractly represents a graph with arbitrary objects
 associated with vertices and edges. 
- Graph(ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.Graph
-  
- graph() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-  
- graph() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
- 
The following fields will only be initialized through the initialize() method 
- graph() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- graph() - Method in class org.apache.spark.streaming.StreamingContext
-  
- GraphGenerators - Class in org.apache.spark.graphx.util
- 
A collection of graph generating functions. 
- GraphGenerators() - Constructor for class org.apache.spark.graphx.util.GraphGenerators
-  
- GraphImpl<VD,ED> - Class in org.apache.spark.graphx.impl
- 
An implementation of  Graph to support computation on graphs. 
- GraphImpl(VertexRDD<VD>, ReplicatedVertexView<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.impl.GraphImpl
-  
- GraphImpl(ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.impl.GraphImpl
- 
Default constructor is provided to support serialization 
- GraphKryoRegistrator - Class in org.apache.spark.graphx
- 
Registers GraphX classes with Kryo for improved performance. 
- GraphKryoRegistrator() - Constructor for class org.apache.spark.graphx.GraphKryoRegistrator
-  
- GraphLoader - Class in org.apache.spark.graphx
- 
Provides utilities for loading  Graphs from files. 
- GraphLoader() - Constructor for class org.apache.spark.graphx.GraphLoader
-  
- GraphOps<VD,ED> - Class in org.apache.spark.graphx
- 
Contains additional functionality for  Graph. 
- GraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.GraphOps
-  
- graphToGraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
- 
Implicitly extracts the  GraphOps member from a graph. 
- GraphXUtils - Class in org.apache.spark.graphx
-  
- GraphXUtils() - Constructor for class org.apache.spark.graphx.GraphXUtils
-  
- greater(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- greater(Time) - Method in class org.apache.spark.streaming.Time
-  
- greaterEq(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- greaterEq(Time) - Method in class org.apache.spark.streaming.Time
-  
- GreaterThan - Class in org.apache.spark.sql.sources
- 
A filter that evaluates to true iff the attribute evaluates to a value
 greater than value.
 
- GreaterThan(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThan
-  
- GreaterThanOrEqual - Class in org.apache.spark.sql.sources
- 
A filter that evaluates to true iff the attribute evaluates to a value
 greater than or equal to value.
 
- GreaterThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThanOrEqual
-  
- greatest(Column...) - Static method in class org.apache.spark.sql.functions
- 
Returns the greatest value of the list of values, skipping null values. 
- greatest(String, String...) - Static method in class org.apache.spark.sql.functions
- 
Returns the greatest value of the list of column names, skipping null values. 
- greatest(Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Returns the greatest value of the list of values, skipping null values. 
- greatest(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
- 
Returns the greatest value of the list of column names, skipping null values. 
- gridGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
- 
Create rows by cols grid graph with each vertex connected to its
 row+1 and col+1 neighbors.
 
- groupArr() - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- groupBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an RDD of grouped elements. 
- groupBy(Function<T, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an RDD of grouped elements. 
- groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD of grouped items. 
- groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD of grouped elements. 
- groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD of grouped items. 
- groupBy(Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Groups the  DataFrame using the specified columns, so we can run aggregation on them. 
- groupBy(String, String...) - Method in class org.apache.spark.sql.DataFrame
- 
Groups the  DataFrame using the specified columns, so we can run aggregation on them. 
- groupBy(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Groups the  DataFrame using the specified columns, so we can run aggregation on them. 
- groupBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Groups the  DataFrame using the specified columns, so we can run aggregation on them. 
- groupBy(Column...) - Method in class org.apache.spark.sql.Dataset
- 
- groupBy(Function1<T, K>, Encoder<K>) - Method in class org.apache.spark.sql.Dataset
- 
(Scala-specific)
 Returns a  GroupedDataset where the data is grouped by the given key  func. 
- groupBy(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
- 
- groupBy(MapFunction<T, K>, Encoder<K>) - Method in class org.apache.spark.sql.Dataset
- 
(Java-specific)
 Returns a  GroupedDataset where the data is grouped by the given key  func. 
- groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Group the values for each key in the RDD into a single sequence. 
- groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Group the values for each key in the RDD into a single sequence. 
- groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Group the values for each key in the RDD into a single sequence. 
- groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Group the values for each key in the RDD into a single sequence. 
- groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Group the values for each key in the RDD into a single sequence. 
- groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Group the values for each key in the RDD into a single sequence. 
- groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying groupByKey to each RDD.
 
- groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying groupByKey to each RDD.
 
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying groupByKey on each RDD of this DStream.
 
- groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying groupByKey to each RDD.
 
- groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying groupByKey to each RDD.
 
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying groupByKey on each RDD.
 
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying groupByKey over a sliding window.
 
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying groupByKey over a sliding window.
 
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying groupByKey over a sliding window on this DStream.
 
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying groupByKey over a sliding window on this DStream.
 
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying groupByKey over a sliding window.
 
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying groupByKey over a sliding window.
 
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying groupByKey over a sliding window on this DStream.
 
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Create a new DStream by applying groupByKey over a sliding window on this DStream.
 
- GroupedData - Class in org.apache.spark.sql
- 
:: Experimental ::
 A set of methods for aggregations on a  DataFrame, created by  DataFrame.groupBy. 
- GroupedData(DataFrame, Seq<Expression>, GroupedData.GroupType) - Constructor for class org.apache.spark.sql.GroupedData
-  
- GroupedDataset<K,V> - Class in org.apache.spark.sql
- 
:: Experimental ::
 A  Dataset has been logically grouped by a user specified grouping key. 
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.Graph
- 
Merges multiple edges between two vertices into a single edge. 
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- groupHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Alias for cogroup. 
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Alias for cogroup. 
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Alias for cogroup. 
- groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Alias for cogroup. 
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Alias for cogroup. 
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Alias for cogroup. 
- gt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
- 
Check if value > lowerBound 
- gt(Object) - Method in class org.apache.spark.sql.Column
- 
Greater than. 
- gtEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators
- 
Check if value >= lowerBound 
- L1Updater - Class in org.apache.spark.mllib.optimization
- 
:: DeveloperApi ::
 Updater for L1 regularized problems. 
- L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
-  
- label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-  
- labelCol() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-  
- labelCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
- 
Field in "predictions" which gives the true label of each instance. 
- labelCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-  
- LabelConverter - Class in org.apache.spark.ml.classification
- 
Label to vector converter. 
- LabelConverter() - Constructor for class org.apache.spark.ml.classification.LabelConverter
-  
- LabeledPoint - Class in org.apache.spark.mllib.regression
- 
Class that represents the features and labels of a data point. 
- LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
-  
- LabelPropagation - Class in org.apache.spark.graphx.lib
- 
Label Propagation algorithm. 
- LabelPropagation() - Constructor for class org.apache.spark.graphx.lib.LabelPropagation
-  
- labels() - Method in class org.apache.spark.ml.feature.IndexToString
-  
- labels() - Method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns the sequence of labels in ascending order 
- labels() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns the sequence of labels in ascending order 
- lag(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.
 
- lag(String, int) - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.
 
- lag(String, int, Object) - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
 
- lag(Column, int, Object) - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
 
- LassoModel - Class in org.apache.spark.mllib.regression
- 
Regression model trained using Lasso. 
- LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
-  
- LassoWithSGD - Class in org.apache.spark.mllib.regression
- 
Train a regression model with L1-regularization using Stochastic Gradient Descent. 
- LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD
- 
Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100,
 regParam: 0.01, miniBatchFraction: 1.0}. 
- last(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the last value in a group. 
- last(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the last value of the column in a group. 
- last_day(Column) - Static method in class org.apache.spark.sql.functions
- 
Given a date column, returns the last day of the month which the given date belongs to. 
- lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-  
- lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-  
- lastErrorTime() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-  
- lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
-  
- latestModel() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Return the latest model. 
- latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
Return the latest model. 
- launch() - Method in class org.apache.spark.launcher.SparkLauncher
- 
Launches a sub-process that will start the configured Spark application. 
- launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- launchTime() - Method in class org.apache.spark.status.api.v1.TaskData
-  
- layers() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
-  
- LBFGS - Class in org.apache.spark.mllib.optimization
- 
:: DeveloperApi ::
 Class used to solve an optimization problem using Limited-memory BFGS. 
- LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
-  
- LDA - Class in org.apache.spark.ml.clustering
- 
:: Experimental :: 
- LDA(String) - Constructor for class org.apache.spark.ml.clustering.LDA
-  
- LDA() - Constructor for class org.apache.spark.ml.clustering.LDA
-  
- LDA - Class in org.apache.spark.mllib.clustering
- 
Latent Dirichlet Allocation (LDA), a topic model designed for text documents. 
- LDA() - Constructor for class org.apache.spark.mllib.clustering.LDA
- 
Constructs an LDA instance with default parameters. 
- LDAModel - Class in org.apache.spark.ml.clustering
- 
:: Experimental ::
 Model fitted by  LDA. 
- LDAModel - Class in org.apache.spark.mllib.clustering
- 
Latent Dirichlet Allocation (LDA) model. 
- LDAOptimizer - Interface in org.apache.spark.mllib.clustering
- 
:: DeveloperApi :: 
- lead(String, int) - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.
 
- lead(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.
 
- lead(String, int, Object) - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.
 
- lead(Column, int, Object) - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.
 
- LeafNode - Class in org.apache.spark.ml.tree
- 
:: DeveloperApi ::
 Decision tree leaf node. 
- learningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- least(Column...) - Static method in class org.apache.spark.sql.functions
- 
Returns the least value of the list of values, skipping null values. 
- least(String, String...) - Static method in class org.apache.spark.sql.functions
- 
Returns the least value of the list of column names, skipping null values. 
- least(Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Returns the least value of the list of values, skipping null values. 
- least(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
- 
Returns the least value of the list of column names, skipping null values. 
- LeastSquaresAggregator - Class in org.apache.spark.ml.regression
- 
LeastSquaresAggregator computes the gradient and loss for a Least-squared loss function,
 as used in linear regression for samples in sparse or dense vector in an online fashion. 
- LeastSquaresAggregator(Vector, double, double, boolean, double[], double[]) - Constructor for class org.apache.spark.ml.regression.LeastSquaresAggregator
-  
- LeastSquaresCostFun - Class in org.apache.spark.ml.regression
- 
LeastSquaresCostFun implements Breeze's DiffFunction[T] for Least Squares cost. 
- LeastSquaresCostFun(RDD<org.apache.spark.ml.feature.Instance>, double, double, boolean, boolean, double[], double[], double) - Constructor for class org.apache.spark.ml.regression.LeastSquaresCostFun
-  
- LeastSquaresGradient - Class in org.apache.spark.mllib.optimization
- 
:: DeveloperApi ::
 Compute gradient and loss for a Least-squared loss function, as used in linear regression. 
- LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
-  
- left() - Method in class org.apache.spark.sql.sources.And
-  
- left() - Method in class org.apache.spark.sql.sources.Or
-  
- leftCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit
- 
Get sorted categories which split to the left 
- leftChild() - Method in class org.apache.spark.ml.tree.InternalNode
-  
- leftChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
- 
Return the index of the left child of this node. 
- leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-  
- leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
- 
Left joins this VertexRDD with an RDD containing vertex attribute pairs. 
- leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
-  
- leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a left outer join of this and other.
 
- leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a left outer join of this and other.
 
- leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a left outer join of this and other.
 
- leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a left outer join of this and other.
 
- leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a left outer join of this and other.
 
- leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a left outer join of this and other.
 
- leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
 
- leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
 
- leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
 
- leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
 
- leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
 
- leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
 
- leftPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-  
- leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
- 
Left joins this RDD with another VertexRDD with the same index. 
- LEGACY_DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
- 
Legacy version of DRIVER_IDENTIFIER, retained for backwards-compatibility. 
- length() - Method in class org.apache.spark.scheduler.SplitInfo
-  
- length(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the length of a given string or binary column. 
- length() - Method in interface org.apache.spark.sql.Row
- 
Number of elements in the Row. 
- length() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
-  
- length() - Method in class org.apache.spark.sql.types.StructType
-  
- length() - Method in class org.apache.spark.util.Vector
-  
- leq(Object) - Method in class org.apache.spark.sql.Column
- 
Less than or equal to. 
- less(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- less(Time) - Method in class org.apache.spark.streaming.Time
-  
- lessEq(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- lessEq(Time) - Method in class org.apache.spark.streaming.Time
-  
- LessThan - Class in org.apache.spark.sql.sources
- 
A filter that evaluates to true iff the attribute evaluates to a value
 less than value.
 
- LessThan(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThan
-  
- LessThanOrEqual - Class in org.apache.spark.sql.sources
- 
A filter that evaluates to true iff the attribute evaluates to a value
 less than or equal to value.
 
- LessThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThanOrEqual
-  
- levenshtein(Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the Levenshtein distance of the two given string columns. 
- like(String) - Method in class org.apache.spark.sql.Column
- 
SQL like expression. 
- limit(int) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame by taking the first  n rows. 
- line() - Method in exception org.apache.spark.sql.AnalysisException
-  
- LinearDataGenerator - Class in org.apache.spark.mllib.util
- 
:: DeveloperApi ::
 Generate sample data used for Linear Data. 
- LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
-  
- LinearRegression - Class in org.apache.spark.ml.regression
- 
:: Experimental ::
 Linear regression. 
- LinearRegression(String) - Constructor for class org.apache.spark.ml.regression.LinearRegression
-  
- LinearRegression() - Constructor for class org.apache.spark.ml.regression.LinearRegression
-  
- LinearRegressionModel - Class in org.apache.spark.ml.regression
- 
- LinearRegressionModel - Class in org.apache.spark.mllib.regression
- 
Regression model trained using LinearRegression. 
- LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
-  
- LinearRegressionSummary - Class in org.apache.spark.ml.regression
-  
- LinearRegressionTrainingSummary - Class in org.apache.spark.ml.regression
-  
- LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
- 
Train a linear regression model with no regularization using Stochastic Gradient Descent. 
- LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
- 
Construct a LinearRegression object with default parameters: {stepSize: 1.0,
 numIterations: 100, miniBatchFraction: 1.0}. 
- listener() - Method in class org.apache.spark.sql.SQLContext
-  
- listenerBus() - Method in class org.apache.spark.SparkContext
-  
- listenerManager() - Method in class org.apache.spark.sql.SQLContext
-  
- listLeafFiles(FileSystem, FileStatus) - Static method in class org.apache.spark.sql.sources.HadoopFsRelation
-  
- listLeafFilesInParallel(String[], Configuration, SparkContext) - Static method in class org.apache.spark.sql.sources.HadoopFsRelation
-  
- lit(Object) - Static method in class org.apache.spark.sql.functions
- 
Creates a  Column of literal value. 
- load(String) - Static method in class org.apache.spark.ml.classification.LogisticRegression
-  
- load(String) - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- load(String) - Static method in class org.apache.spark.ml.classification.NaiveBayes
-  
- load(String) - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
-  
- load(String) - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
-  
- load(String) - Static method in class org.apache.spark.ml.clustering.KMeans
-  
- load(String) - Static method in class org.apache.spark.ml.clustering.KMeansModel
-  
- load(String) - Static method in class org.apache.spark.ml.clustering.LDA
-  
- load(String) - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
-  
- load(String) - Static method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-  
- load(String) - Static method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-  
- load(String) - Static method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-  
- load(String) - Static method in class org.apache.spark.ml.feature.Binarizer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.Bucketizer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.ChiSqSelector
-  
- load(String) - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- load(String) - Static method in class org.apache.spark.ml.feature.CountVectorizer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
-  
- load(String) - Static method in class org.apache.spark.ml.feature.DCT
-  
- load(String) - Static method in class org.apache.spark.ml.feature.HashingTF
-  
- load(String) - Static method in class org.apache.spark.ml.feature.IDF
-  
- load(String) - Static method in class org.apache.spark.ml.feature.IDFModel
-  
- load(String) - Static method in class org.apache.spark.ml.feature.IndexToString
-  
- load(String) - Static method in class org.apache.spark.ml.feature.Interaction
-  
- load(String) - Static method in class org.apache.spark.ml.feature.MinMaxScaler
-  
- load(String) - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- load(String) - Static method in class org.apache.spark.ml.feature.NGram
-  
- load(String) - Static method in class org.apache.spark.ml.feature.Normalizer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.OneHotEncoder
-  
- load(String) - Static method in class org.apache.spark.ml.feature.PCA
-  
- load(String) - Static method in class org.apache.spark.ml.feature.PCAModel
-  
- load(String) - Static method in class org.apache.spark.ml.feature.PolynomialExpansion
-  
- load(String) - Static method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.SQLTransformer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.StandardScaler
-  
- load(String) - Static method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- load(String) - Static method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- load(String) - Static method in class org.apache.spark.ml.feature.StringIndexer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- load(String) - Static method in class org.apache.spark.ml.feature.Tokenizer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.VectorAssembler
-  
- load(String) - Static method in class org.apache.spark.ml.feature.VectorIndexer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
-  
- load(String) - Static method in class org.apache.spark.ml.feature.VectorSlicer
-  
- load(String) - Static method in class org.apache.spark.ml.feature.Word2Vec
-  
- load(String) - Static method in class org.apache.spark.ml.feature.Word2VecModel
-  
- load(String) - Static method in class org.apache.spark.ml.Pipeline
-  
- load(String) - Static method in class org.apache.spark.ml.PipelineModel
-  
- load(String) - Static method in class org.apache.spark.ml.recommendation.ALS
-  
- load(String) - Static method in class org.apache.spark.ml.recommendation.ALSModel
-  
- load(String) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- load(String) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- load(String) - Static method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- load(String) - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-  
- load(String) - Static method in class org.apache.spark.ml.regression.LinearRegression
-  
- load(String) - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
-  
- load(String) - Static method in class org.apache.spark.ml.tuning.CrossValidator
-  
- load(String) - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
-  
- load(String) - Method in interface org.apache.spark.ml.util.MLReadable
- 
Reads an ML instance from the input path; a shortcut for read.load(path).
 
- load(String) - Method in class org.apache.spark.ml.util.MLReader
- 
Loads the ML component from the input path. 
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.SVMModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.KMeansModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.Word2VecModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Load a model from the given path. 
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LassoModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LinearRegressionModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-  
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.RandomForestModel
-  
- load(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Loader
- 
Load a model from the given path. 
- load(String...) - Method in class org.apache.spark.sql.DataFrameReader
- 
Loads input in as a DataFrame, for data sources that support multiple paths. 
- load(String) - Method in class org.apache.spark.sql.DataFrameReader
- 
Loads input in as a DataFrame, for data sources that require a path (e.g. 
- load() - Method in class org.apache.spark.sql.DataFrameReader
- 
Loads input in as a DataFrame, for data sources that don't require a path (e.g. 
- load(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
- 
Loads input in as a DataFrame, for data sources that support multiple paths. 
- load(String) - Method in class org.apache.spark.sql.SQLContext
- 
Deprecated.
As of 1.4.0, replaced by read().load(path). This will be removed in Spark 2.0.
 
 
- load(String, String) - Method in class org.apache.spark.sql.SQLContext
- 
Deprecated.
As of 1.4.0, replaced by read().format(source).load(path). This will be removed in Spark 2.0.
 
 
- load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
- 
Deprecated.
As of 1.4.0, replaced by read().format(source).options(options).load(). This will be removed in Spark 2.0.
 
 
- load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
- 
Deprecated.
As of 1.4.0, replaced by read().format(source).options(options).load().
 
 
- load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
- 
Deprecated.
As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
 
 
- load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
- 
Deprecated.
As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
 
 
- Loader<M extends Saveable> - Interface in org.apache.spark.mllib.util
- 
:: DeveloperApi :: 
- loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
- loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
 
- loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of
 partitions.
 
- loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint]. 
- loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-  
- loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of
 partitions. 
- loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-  
- loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils
-  
- loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of
 features determined automatically and the default number of partitions. 
- loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Loads vectors saved using RDD[Vector].saveAsTextFile.
 
- loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
 
- LOCAL_CLUSTER_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-  
- LOCAL_N_FAILURES_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-  
- LOCAL_N_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-  
- localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-  
- localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-  
- localCheckpoint() - Method in class org.apache.spark.rdd.RDD
- 
Mark this RDD for local checkpointing using Spark's existing caching layer. 
- LocalLDAModel - Class in org.apache.spark.ml.clustering
- 
:: Experimental :: 
- LocalLDAModel - Class in org.apache.spark.mllib.clustering
- 
Local LDA model. 
- localProperties() - Method in class org.apache.spark.SparkContext
-  
- localSeqToDataFrameHolder(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLImplicits
- 
Creates a DataFrame from a local Seq of Product. 
- localSeqToDatasetHolder(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLImplicits
- 
Creates a Dataset from a local Seq. 
- localValue() - Method in class org.apache.spark.Accumulable
- 
Get the current value of this accumulator from within a task. 
- locate(String, Column) - Static method in class org.apache.spark.sql.functions
- 
Locate the position of the first occurrence of substr. 
- locate(String, Column, int) - Static method in class org.apache.spark.sql.functions
- 
Locate the position of the first occurrence of substr in a string column, after position pos. 
- location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-  
- log() - Method in interface org.apache.spark.Logging
-  
- log(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the natural logarithm of the given value. 
- log(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the natural logarithm of the given column. 
- log(double, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the logarithm of the second argument, using the first argument as the base. 
- log(double, String) - Static method in class org.apache.spark.sql.functions
- 
Returns the logarithm of the second argument, using the first argument as the base. 
- log10(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the logarithm of the given value in base 10. 
- log10(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the logarithm of the given value in base 10. 
- log1p(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the natural logarithm of the given value plus one. 
- log1p(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the natural logarithm of the given column plus one. 
- log2(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the logarithm of the given column in base 2. 
- log2(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the logarithm of the given value in base 2. 
- log_() - Method in interface org.apache.spark.Logging
-  
- logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
-  
- logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-  
- logDeprecationWarning(String) - Static method in class org.apache.spark.SparkConf
- 
Logs a warning message if the given config key is deprecated. 
- logDirName() - Method in class org.apache.spark.scheduler.JobLogger
-  
- logError(Function0<String>) - Method in interface org.apache.spark.Logging
-  
- logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-  
- Logging - Interface in org.apache.spark
- 
Utility trait for classes that want to log data. 
- logicalPlan() - Method in class org.apache.spark.sql.DataFrame
-  
- logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
-  
- logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-  
- LogisticAggregator - Class in org.apache.spark.ml.classification
- 
LogisticAggregator computes the gradient and loss for the binary logistic loss function, as used
 in binary classification, for instances given as sparse or dense vectors, in an online fashion. 
- LogisticAggregator(Vector, int, boolean, double[], double[]) - Constructor for class org.apache.spark.ml.classification.LogisticAggregator
-  
- LogisticCostFun - Class in org.apache.spark.ml.classification
- 
LogisticCostFun implements Breeze's DiffFunction[T] for a multinomial logistic loss function,
 as used in multi-class classification (it is also used in binary logistic regression). 
- LogisticCostFun(RDD<org.apache.spark.ml.feature.Instance>, int, boolean, boolean, double[], double[], double) - Constructor for class org.apache.spark.ml.classification.LogisticCostFun
-  
- LogisticGradient - Class in org.apache.spark.mllib.optimization
- 
:: DeveloperApi ::
 Compute gradient and loss for a multinomial logistic loss function, as used
 in multi-class classification (it is also used in binary logistic regression). 
- LogisticGradient(int) - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
-  
- LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
-  
- LogisticRegression - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Logistic regression. 
- LogisticRegression(String) - Constructor for class org.apache.spark.ml.classification.LogisticRegression
-  
- LogisticRegression() - Constructor for class org.apache.spark.ml.classification.LogisticRegression
-  
- LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util
- 
:: DeveloperApi ::
 Generate test data for LogisticRegression. 
- LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-  
- LogisticRegressionModel - Class in org.apache.spark.ml.classification
- 
- LogisticRegressionModel - Class in org.apache.spark.mllib.classification
- 
Classification model trained using Multinomial/Binary Logistic Regression. 
- LogisticRegressionModel(Vector, double, int, int) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
-  
- LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
- 
- LogisticRegressionSummary - Interface in org.apache.spark.ml.classification
- 
Abstraction for Logistic Regression Results for a given model. 
- LogisticRegressionTrainingSummary - Interface in org.apache.spark.ml.classification
- 
Abstraction for multinomial Logistic Regression Training results. 
- LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification
- 
Train a classification model for Multinomial/Binary Logistic Regression using
 Limited-memory BFGS. 
- LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-  
- LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
- 
Train a classification model for Binary Logistic Regression
 using Stochastic Gradient Descent. 
- LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
- 
Construct a LogisticRegression object with default parameters: {stepSize: 1.0,
 numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}. 
- logLikelihood(DataFrame) - Method in class org.apache.spark.ml.clustering.LDAModel
- 
Calculates a lower bound on the log likelihood of the entire corpus. 
- logLikelihood() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
Log likelihood of the observed tokens in the training set,
 given the current parameter estimates:
  log P(docs | topics, topic distributions for docs, alpha, eta) 
- logLikelihood() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-  
- logLikelihood(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
- 
Calculates a lower bound on the log likelihood of the entire corpus. 
- logLikelihood(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
- 
Java-friendly version of logLikelihood
 
- LogLoss - Class in org.apache.spark.mllib.tree.loss
- 
:: DeveloperApi ::
 Class for log loss calculation (for classification). 
- LogLoss() - Constructor for class org.apache.spark.mllib.tree.loss.LogLoss
-  
- logName() - Method in interface org.apache.spark.Logging
-  
- LogNormalGenerator - Class in org.apache.spark.mllib.random
- 
:: DeveloperApi ::
 Generates i.i.d. 
- LogNormalGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.LogNormalGenerator
-  
- logNormalGraph(SparkContext, int, int, double, double, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
- 
Generate a graph whose vertex out degree distribution is log normal. 
- logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- logNormalJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- logNormalJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- logNormalRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input
  mean and standard deviation.
 
- logNormalVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a
 log normal distribution.
 
- logpdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
- 
Returns the log-density of this multivariate Gaussian at the given point x. 
- logPerplexity(DataFrame) - Method in class org.apache.spark.ml.clustering.LDAModel
- 
Calculate an upper bound on perplexity. 
- logPerplexity(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
- 
Calculate an upper bound on perplexity. 
- logPerplexity(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
- 
Java-friendly version of logPerplexity
 
- logPrior() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
- 
Log probability of the current parameter estimate:
 log P(topics, topic distributions for docs | Dirichlet hyperparameters) 
- logPrior() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
Log probability of the current parameter estimate:
 log P(topics, topic distributions for docs | alpha, eta) 
- logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
-  
- logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-  
- logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-  
- logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
-  
- logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-  
- LONG() - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for nullable long type. 
- LongDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
-  
- LongParam - Class in org.apache.spark.ml.param
- 
:: DeveloperApi ::
 Specialized version of Param[Long] for Java.
 
- LongParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
-  
- LongParam(String, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
-  
- LongParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
-  
- LongParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
-  
- longRddToDataFrameHolder(RDD<Object>) - Method in class org.apache.spark.sql.SQLImplicits
- 
Creates a single-column DataFrame from an RDD[Long]. 
- longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
-  
- LongType - Static variable in class org.apache.spark.sql.types.DataTypes
- 
Gets the LongType object. 
- LongType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The data type representing Long values.
 
- longWritableConverter() - Static method in class org.apache.spark.SparkContext
-  
- lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return the list of values in the RDD for the given key.
 
- lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return the list of values in the RDD for the given key.
 
- lookupTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-  
- loss() - Method in class org.apache.spark.ml.classification.LogisticAggregator
-  
- loss() - Method in class org.apache.spark.ml.regression.AFTAggregator
-  
- loss() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
-  
- loss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- Loss - Interface in org.apache.spark.mllib.tree.loss
- 
:: DeveloperApi ::
 Trait for adding "pluggable" loss functions for the gradient boosting algorithm. 
- Losses - Class in org.apache.spark.mllib.tree.loss
-  
- Losses() - Constructor for class org.apache.spark.mllib.tree.loss.Losses
-  
- lossType() - Method in class org.apache.spark.ml.classification.GBTClassifier
- 
Loss function which GBT tries to minimize. 
- lossType() - Method in class org.apache.spark.ml.regression.GBTRegressor
- 
Loss function which GBT tries to minimize. 
- low() - Method in class org.apache.spark.partial.BoundedDouble
-  
- lower(Column) - Static method in class org.apache.spark.sql.functions
- 
Converts a string column to lower case. 
- lpad(Column, int, String) - Static method in class org.apache.spark.sql.functions
- 
Left-pad the string column with 
- lt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
- 
Check if value < upperBound 
- lt(Object) - Method in class org.apache.spark.sql.Column
- 
Less than. 
- ltEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators
- 
Check if value <= upperBound 
- ltrim(Column) - Static method in class org.apache.spark.sql.functions
- 
Trim the spaces from the left end of the specified string value. 
- LZ4CompressionCodec - Class in org.apache.spark.io
- 
- LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
-  
- LZFCompressionCodec - Class in org.apache.spark.io
- 
- LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
-  
- main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-  
- main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-  
- main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-  
- main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
-  
- main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
-  
- makeDriverRef(String, SparkConf, org.apache.spark.rpc.RpcEnv) - Static method in class org.apache.spark.util.RpcUtils
- 
Retrieve an RpcEndpointRef which is located in the driver via its name.
 
- makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
- 
Distribute a local Scala collection to form an RDD. 
- makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
- 
Distribute a local Scala collection to form an RDD, with one or more
 location preferences (hostnames of Spark nodes) for each object. 
- map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to all elements of this RDD. 
- map(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Map the values of this matrix using a function. 
- map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult
- 
Transform this PartialResult into a PartialResult of type T. 
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD by applying a function to all elements of this RDD. 
- map(DataType, DataType) - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type map.
 
- map(MapType) - Method in class org.apache.spark.sql.ColumnName
-  
- map(Function1<Row, R>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new RDD by applying a function to all rows of this DataFrame. 
- map(Function1<T, U>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
- 
(Scala-specific)
 Returns a new Dataset that contains the result of applying func to each element. 
- map(MapFunction<T, U>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
- 
(Java-specific)
 Returns a new Dataset that contains the result of applying func to each element. 
- map() - Method in class org.apache.spark.sql.types.Metadata
-  
- map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream by applying a function to all elements of this DStream. 
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream by applying a function to all elements of this DStream. 
- mapEdgePartitions(Function2<Object, EdgePartition<ED, VD>, EdgePartition<ED2, VD2>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- mapEdges(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
- 
Transforms each edge attribute in the graph using the map function. 
- mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
- 
Transforms each edge attribute using the map function, passing it a whole partition at a
 time. 
- mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- MapFunction<T,U> - Interface in org.apache.spark.api.java.function
- 
Base interface for a map function used in Dataset's map function. 
- mapGroups(Function2<K, Iterator<V>, U>, Encoder<U>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Applies the given function to each group of data. 
- mapGroups(MapGroupsFunction<K, V, U>, Encoder<U>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Applies the given function to each group of data. 
- MapGroupsFunction<K,V,R> - Interface in org.apache.spark.api.java.function
- 
Base interface for a map function used in GroupedDataset's mapGroups function. 
- mapId() - Method in class org.apache.spark.FetchFailed
-  
- mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
-  
- mapId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-  
- mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-  
- mapOutputTracker() - Method in class org.apache.spark.SparkEnv
-  
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to each partition of this RDD. 
- mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to each partition of this RDD. 
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD by applying a function to each partition of this RDD. 
- mapPartitions(Function1<Iterator<Row>, Iterator<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new RDD by applying a function to each partition of this DataFrame. 
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
- 
(Scala-specific)
 Returns a new Dataset that contains the result of applying func to each partition. 
- mapPartitions(MapPartitionsFunction<T, U>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
- 
(Java-specific)
 Returns a new Dataset that contains the result of applying func to each partition. 
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
 of this DStream. 
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
 of this DStream. 
- MapPartitionsFunction<T,U> - Interface in org.apache.spark.api.java.function
- 
Base interface for function used in Dataset's mapPartitions. 
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to each partition of this RDD. 
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to each partition of this RDD. 
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to each partition of this RDD. 
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to each partition of this RDD. 
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
 of this DStream. 
- mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
:: DeveloperApi ::
 Return a new RDD by applying a function to each partition of this RDD. 
- mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
 of the original partition. 
- mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
 of the original partition. 
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD
- 
Maps over a partition, providing the InputSplit that was used as the base of the partition. 
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
- 
Maps over a partition, providing the InputSplit that was used as the base of the partition. 
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.HadoopRDD
- 
Maps over a partition, providing the InputSplit that was used as the base of the partition. 
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.NewHadoopRDD
- 
Maps over a partition, providing the InputSplit that was used as the base of the partition. 
- mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
 of the original partition. 
- mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-  
- mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-  
- mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
- 
Aggregates values from the neighboring edges and vertices of each vertex. 
- mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
-  
- mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to all elements of this RDD. 
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return a new RDD by applying a function to all elements of this RDD. 
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream by applying a function to all elements of this DStream. 
- mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
- 
Transforms each edge attribute using the map function, passing it the adjacent vertex
 attributes as well. 
- mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
- 
Transforms each edge attribute using the map function, passing it the adjacent vertex
 attributes as well. 
- mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
- 
Transforms each edge attribute a partition at a time using the map function, passing it the
 adjacent vertex attributes as well. 
- mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- MapType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The data type for Maps. 
- MapType(DataType, DataType, boolean) - Constructor for class org.apache.spark.sql.types.MapType
-  
- MapType() - Constructor for class org.apache.spark.sql.types.MapType
- 
No-arg constructor for kryo. 
- mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Pass each value in the key-value pair RDD through a map function without changing the keys;
 this also retains the original RDD's partitioning. 
- mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.EdgeRDD
- 
Map the values in an edge partitioning, preserving the structure but changing the values. 
- mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
- 
Maps each vertex attribute, preserving the index. 
- mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
- 
Maps each vertex attribute, additionally supplying the vertex ID. 
- mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Pass each value in the key-value pair RDD through a map function without changing the keys;
 this also retains the original RDD's partitioning. 
- mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying a map function to the value of each key-value pair in
 'this' DStream without changing the key. 
- mapValues(Function1<V, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying a map function to the value of each key-value pair in
 'this' DStream without changing the key. 
- mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph
- 
Transforms each vertex attribute in the graph using the map function. 
- mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Maps f over this RDD, where f takes an additional parameter of type A. 
- mapWithState(StateSpec<K, V, StateType, MappedType>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
:: Experimental ::
 Return a JavaMapWithStateDStream by applying a function to every key-value element of
 this stream, while maintaining some state data for each unique key. 
- mapWithState(StateSpec<K, V, StateType, MappedType>, ClassTag<StateType>, ClassTag<MappedType>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
:: Experimental ::
 Return a MapWithStateDStream by applying a function to every key-value element of
 this stream, while maintaining some state data for each unique key. 
- MapWithStateDStream<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming.dstream
- 
:: Experimental ::
 DStream representing the stream of data generated by the mapWithState operation on a
 pair DStream. 
- MapWithStateDStream(StreamingContext, ClassTag<MappedType>) - Constructor for class org.apache.spark.streaming.dstream.MapWithStateDStream
-  
- mark(int) - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- markSupported() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
- 
Restricts the graph to only the vertices and edges that are also in other, but keeps the
 attributes from this graph.
 
- mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- master() - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- master() - Method in class org.apache.spark.SparkContext
-  
- Matrices - Class in org.apache.spark.mllib.linalg
- 
- Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
-  
- Matrix - Interface in org.apache.spark.mllib.linalg
- 
Trait for a local matrix. 
- MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed
- 
Represents an entry in a distributed matrix. 
- MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-  
- MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation
- 
Model representing the result of matrix factorization. 
- MatrixFactorizationModel(int, RDD<Tuple2<Object, double[]>>, RDD<Tuple2<Object, double[]>>) - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-  
- max() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Returns the maximum element from this RDD as defined by
 the default comparator natural order. 
- max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Returns the maximum element from this RDD as defined by the specified
 Comparator[T]. 
- max() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-  
- max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
- 
Maximum value of each dimension. 
- max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
- 
Maximum value of each column. 
- max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Returns the max of this RDD as defined by the implicit Ordering[T]. 
- max(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the maximum value of the expression in a group. 
- max(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the maximum value of the column in a group. 
- max(String...) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the max value for each numeric column for each group. 
- max(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the max value for each numeric column for each group. 
- max(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- max(Time) - Method in class org.apache.spark.streaming.Time
-  
- max() - Method in class org.apache.spark.util.StatCounter
-  
- MAX_HASH_NNZ() - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Max number of nonzero entries used in computing hash code. 
- MAX_LONG_DIGITS() - Static method in class org.apache.spark.sql.types.Decimal
- 
Maximum number of decimal digits a Long can represent. 
- MAX_PRECISION() - Static method in class org.apache.spark.sql.types.DecimalType
-  
- MAX_SCALE() - Static method in class org.apache.spark.sql.types.DecimalType
-  
- maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- maxBufferSizeMb() - Method in class org.apache.spark.serializer.KryoSerializer
-  
- maxCores() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-  
- maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- maxIters() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-  
- maxMem() - Method in class org.apache.spark.storage.StorageStatus
-  
- maxMemory() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- maxNodesInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
- 
Return the maximum number of nodes which can be in the given level of the tree. 
- maxVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- md5(Column) - Static method in class org.apache.spark.sql.functions
- 
Calculates the MD5 digest of a binary column and returns the value
 as a 32 character hex string. 
- mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Compute the mean of this RDD's elements. 
- mean() - Method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-  
- mean() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-  
- mean() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-  
- mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-  
- mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
- 
Sample mean of each dimension. 
- mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
- 
Sample mean vector. 
- mean() - Method in class org.apache.spark.partial.BoundedDouble
-  
- mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
- 
Compute the mean of this RDD's elements. 
- mean(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the average of the values in a group. 
- mean(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the average of the values in a group. 
- mean(String...) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the average value for each numeric column for each group. 
- mean(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the average value for each numeric column for each group. 
- mean() - Method in class org.apache.spark.util.StatCounter
-  
- meanAbsoluteError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
Returns the mean absolute error, which is a risk function corresponding to the
 expected value of the absolute error loss or l1-norm loss. 
- meanAbsoluteError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
- 
Returns the mean absolute error, which is a risk function corresponding to the
 expected value of the absolute error loss or l1-norm loss. 
- meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return the approximate mean of the elements in this RDD. 
- meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Approximate operation to return the mean within a timeout. 
- meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
- 
Approximate operation to return the mean within a timeout. 
- meanAveragePrecision() - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
- 
Returns the mean average precision (MAP) of all the queries. 
- means() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-  
- meanSquaredError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
Returns the mean squared error, which is a risk function corresponding to the
 expected value of the squared error loss or quadratic loss. 
- meanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
- 
Returns the mean squared error, which is a risk function corresponding to the
 expected value of the squared error loss or quadratic loss. 
- MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
-  
- MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
-  
- MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
-  
- MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-  
- MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-  
- MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-  
- MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
-  
- MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-  
- MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-  
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-  
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
-  
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-  
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-  
- MemoryEntry - Class in org.apache.spark.storage
-  
- MemoryEntry(Object, long, boolean) - Constructor for class org.apache.spark.storage.MemoryEntry
-  
- memoryManager() - Method in class org.apache.spark.SparkEnv
-  
- memoryPerExecutorMB() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-  
- memoryRemaining() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-  
- memoryUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-  
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-  
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-  
- memRemaining() - Method in class org.apache.spark.storage.StorageStatus
- 
Return the memory remaining in this block manager. 
- memSize() - Method in class org.apache.spark.storage.BlockStatus
-  
- memSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
-  
- memSize() - Method in class org.apache.spark.storage.RDDInfo
-  
- memUsed() - Method in class org.apache.spark.storage.StorageStatus
- 
Return the memory used by this block manager. 
- memUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
- 
Return the memory used by the given RDD in this block manager in O(1) time. 
- merge(R) - Method in class org.apache.spark.Accumulable
- 
Merge two accumulable objects together. 
- merge(LogisticAggregator) - Method in class org.apache.spark.ml.classification.LogisticAggregator
- 
Merge another LogisticAggregator, and update the loss and gradient
 of the objective function. 
- merge(AFTAggregator) - Method in class org.apache.spark.ml.regression.AFTAggregator
-  
- merge(LeastSquaresAggregator) - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
- 
Merge another LeastSquaresAggregator, and update the loss and gradient
 of the objective function. 
- merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
- 
Merges another. 
- merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
- 
Merge another MultivariateOnlineSummarizer, and update the statistical summary. 
- merge(B, B) - Method in class org.apache.spark.sql.expressions.Aggregator
- 
Merge two intermediate values. 
- merge(MutableAggregationBuffer, Row) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
- 
Merges two aggregation buffers and stores the updated buffer values back to buffer1.
 
- merge(double) - Method in class org.apache.spark.util.StatCounter
- 
Add a value into this StatCounter, updating the internal statistics. 
- merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter
- 
Add multiple values into this StatCounter, updating the internal statistics. 
- merge(StatCounter) - Method in class org.apache.spark.util.StatCounter
- 
Merge another StatCounter into this one, adding up the internal statistics. 
- mergeCombiners() - Method in class org.apache.spark.Aggregator
-  
- mergeValue() - Method in class org.apache.spark.Aggregator
-  
- MESOS_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-  
- message() - Method in class org.apache.spark.FetchFailed
-  
- message() - Method in exception org.apache.spark.sql.AnalysisException
-  
- Metadata - Class in org.apache.spark.sql.types
- 
:: DeveloperApi :: 
- Metadata() - Constructor for class org.apache.spark.sql.types.Metadata
- 
No-arg constructor for kryo. 
- metadata() - Method in class org.apache.spark.sql.types.StructField
-  
- metadata() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
-  
- METADATA_KEY_DESCRIPTION() - Static method in class org.apache.spark.streaming.scheduler.StreamInputInfo
- 
The key for description in StreamInputInfo.metadata.
 
- MetadataBuilder - Class in org.apache.spark.sql.types
- 
:: DeveloperApi :: 
- MetadataBuilder() - Constructor for class org.apache.spark.sql.types.MetadataBuilder
-  
- metadataDescription() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
-  
- metadataHive() - Method in class org.apache.spark.sql.hive.HiveContext
- 
The copy of the Hive client that is used to retrieve metadata from the Hive MetaStore. 
- method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-  
- MethodIdentifier<T> - Class in org.apache.spark.util
- 
Helper class to identify a method. 
- MethodIdentifier(Class<T>, String, String) - Constructor for class org.apache.spark.util.MethodIdentifier
-  
- metricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
- 
Param for metric name in evaluation.
 Default: areaUnderROC. 
- metricName() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
- 
Param for metric name in evaluation (supports "f1" (default), "precision", "recall", "weightedPrecision", "weightedRecall").
 
- metricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
- 
Param for metric name in evaluation. 
- metrics() - Method in class org.apache.spark.ExceptionFailure
-  
- metricsSystem() - Method in class org.apache.spark.SparkContext
-  
- metricsSystem() - Method in class org.apache.spark.SparkEnv
-  
- MFDataGenerator - Class in org.apache.spark.mllib.util
- 
:: DeveloperApi ::
 Generate RDD(s) containing data for Matrix Factorization. 
- MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
-  
- microF1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns micro-averaged label-based f1-measure
 (equal to micro-averaged document-based f1-measure). 
- microPrecision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns micro-averaged label-based precision
 (equal to micro-averaged document-based precision). 
- microRecall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns micro-averaged label-based recall
 (equal to micro-averaged document-based recall). 
- milliseconds() - Method in class org.apache.spark.streaming.Duration
-  
- milliseconds(long) - Static method in class org.apache.spark.streaming.Durations
-  
- Milliseconds - Class in org.apache.spark.streaming
- 
Helper object that creates an instance of Duration representing
 a given number of milliseconds. 
- Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
-  
- milliseconds() - Method in class org.apache.spark.streaming.Time
-  
- millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener
- 
Reformat a time interval in milliseconds to a prettier format for output. 
- min() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Returns the minimum element from this RDD as defined by
 the default comparator natural order. 
- min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Returns the minimum element from this RDD as defined by the specified
 Comparator[T]. 
- min() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-  
- min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
- 
Minimum value of each dimension. 
- min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
- 
Minimum value of each column. 
- min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Returns the min of this RDD as defined by the implicit Ordering[T]. 
- min(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the minimum value of the expression in a group. 
- min(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the minimum value of the column in a group. 
- min(String...) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the min value for each numeric column for each group. 
- min(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the min value for each numeric column for each group. 
- min(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- min(Time) - Method in class org.apache.spark.streaming.Time
-  
- min() - Method in class org.apache.spark.util.StatCounter
-  
- minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-  
- minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF
-  
- minInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-  
- MinMaxScaler - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 Rescale each feature individually to a common range [min, max] linearly using column summary
 statistics, which is also known as min-max normalization or rescaling. 
- MinMaxScaler(String) - Constructor for class org.apache.spark.ml.feature.MinMaxScaler
-  
- MinMaxScaler() - Constructor for class org.apache.spark.ml.feature.MinMaxScaler
-  
- MinMaxScalerModel - Class in org.apache.spark.ml.feature
-  
- minSamplesRequired() - Static method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- minTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer
- 
Minimum token length, >= 0. 
- minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD
- 
For each VertexId present in both this and other, minus will act as a set difference
 operation returning only those unique VertexIds present in this.
 
- minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
- 
For each VertexId present in both this and other, minus will act as a set difference
 operation returning only those unique VertexIds present in this.
 
- minus(Object) - Method in class org.apache.spark.sql.Column
- 
Subtraction. 
- minus(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- minus(Time) - Method in class org.apache.spark.streaming.Time
-  
- minus(Duration) - Method in class org.apache.spark.streaming.Time
-  
- minute(Column) - Static method in class org.apache.spark.sql.functions
- 
Extracts the minutes as an integer from a given date/timestamp/string. 
- minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- minutes(long) - Static method in class org.apache.spark.streaming.Durations
-  
- Minutes - Class in org.apache.spark.streaming
- 
Helper object that creates an instance of Duration representing
 a given number of minutes. 
- Minutes() - Constructor for class org.apache.spark.streaming.Minutes
-  
- minVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- mkString() - Method in interface org.apache.spark.sql.Row
- 
Displays all elements of this sequence in a string (without a separator). 
- mkString(String) - Method in interface org.apache.spark.sql.Row
- 
Displays all elements of this sequence in a string using a separator string. 
- mkString(String, String, String) - Method in interface org.apache.spark.sql.Row
- 
Displays all elements of this traversable or iterator in a string using
 start, end, and separator strings. 
- MLPairRDDFunctions<K,V> - Class in org.apache.spark.mllib.rdd
- 
Machine learning specific Pair RDD functions. 
- MLPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.mllib.rdd.MLPairRDDFunctions
-  
- MLReadable<T> - Interface in org.apache.spark.ml.util
- 
Trait for objects that provide MLReader. 
- MLReader<T> - Class in org.apache.spark.ml.util
- 
Abstract class for utility classes that can load ML instances. 
- MLReader() - Constructor for class org.apache.spark.ml.util.MLReader
-  
- MLUtils - Class in org.apache.spark.mllib.util
- 
Helper methods to load, save and pre-process data used in MLlib. 
- MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
-  
- MLWritable - Interface in org.apache.spark.ml.util
- 
Trait for classes that provide MLWriter. 
- MLWriter - Class in org.apache.spark.ml.util
- 
Abstract class for utility classes that can save ML instances. 
- MLWriter() - Constructor for class org.apache.spark.ml.util.MLWriter
-  
- mod(Object) - Method in class org.apache.spark.sql.Column
- 
Modulo (a.k.a. remainder) expression. 
- mode(SaveMode) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Specifies the behavior when data or table already exists. 
- mode(String) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Specifies the behavior when data or table already exists. 
- Model<M extends Model<M>> - Class in org.apache.spark.ml
- 
- Model() - Constructor for class org.apache.spark.ml.Model
-  
- model() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-  
- model() - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-  
- model() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-  
- model() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
The model to be updated and used for prediction. 
- model() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-  
- models() - Method in class org.apache.spark.ml.classification.OneVsRestModel
-  
- modelType() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- modificationTime() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
-  
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.Rating$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.SparkContext.FloatAccumulatorParam$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.SparkContext.IntAccumulatorParam$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.SparkContext.LongAccumulatorParam$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus$
- 
Static reference to the singleton instance of this Scala object. 
- MODULE$ - Static variable in class org.apache.spark.util.Vector.VectorAccumParam$
- 
Static reference to the singleton instance of this Scala object. 
- monotonically_increasing_id() - Static method in class org.apache.spark.sql.functions
- 
A column expression that generates monotonically increasing 64-bit integers. 
- monotonicallyIncreasingId() - Static method in class org.apache.spark.sql.functions
- 
A column expression that generates monotonically increasing 64-bit integers. 
- month(Column) - Static method in class org.apache.spark.sql.functions
- 
Extracts the month as an integer from a given date/timestamp/string. 
- months_between(Column, Column) - Static method in class org.apache.spark.sql.functions
-  
- MQTTUtils - Class in org.apache.spark.streaming.mqtt
-  
- MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
-  
- MsSqlServerDialect - Class in org.apache.spark.sql.jdbc
-  
- MsSqlServerDialect() - Constructor for class org.apache.spark.sql.jdbc.MsSqlServerDialect
-  
- mu() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-  
- MulticlassClassificationEvaluator - Class in org.apache.spark.ml.evaluation
- 
:: Experimental ::
 Evaluator for multiclass classification, which expects two input columns: score and label. 
- MulticlassClassificationEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-  
- MulticlassClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-  
- MulticlassMetrics - Class in org.apache.spark.mllib.evaluation
- 
:: Experimental ::
 Evaluator for multiclass classification. 
- MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
-  
- MultilabelMetrics - Class in org.apache.spark.mllib.evaluation
- 
Evaluator for multilabel classification. 
- MultilabelMetrics(RDD<Tuple2<double[], double[]>>) - Constructor for class org.apache.spark.mllib.evaluation.MultilabelMetrics
-  
- multiLabelValidator(int) - Static method in class org.apache.spark.mllib.util.DataValidators
- 
Function to check if labels used for k class multi-label classification are
 in the range of {0, 1, ..., k - 1}. 
- MultilayerPerceptronClassificationModel - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Classification model based on the Multilayer Perceptron. 
- MultilayerPerceptronClassifier - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Classifier trainer based on the Multilayer Perceptron. 
- MultilayerPerceptronClassifier(String) - Constructor for class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-  
- MultilayerPerceptronClassifier() - Constructor for class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-  
- Multinomial() - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- multiply(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
- 
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Multiply this matrix by a local matrix on the right. 
- multiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Convenience method for `Matrix`-`DenseMatrix` multiplication. 
- multiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Convenience method for `Matrix`-`DenseVector` multiplication. 
- multiply(Vector) - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Convenience method for `Matrix`-`Vector` multiplication. 
- multiply(Object) - Method in class org.apache.spark.sql.Column
- 
Multiplication of this expression and another expression. 
- multiply(double) - Method in class org.apache.spark.util.Vector
-  
- MultivariateGaussian - Class in org.apache.spark.mllib.stat.distribution
- 
:: DeveloperApi ::
 This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution. 
- MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-  
- MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat
- 
:: DeveloperApi ::
 MultivariateOnlineSummarizer implements  MultivariateStatisticalSummary to compute the mean,
 variance, minimum, maximum, counts, and nonzero counts for instances in sparse or dense vector
 format in an online fashion. 
- MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-  
- MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat
- 
Trait for multivariate statistical summary of a data matrix. 
- mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- MutableAggregationBuffer - Class in org.apache.spark.sql.expressions
- 
:: Experimental ::
 A Row representing a mutable aggregation buffer.
 
- MutableAggregationBuffer() - Constructor for class org.apache.spark.sql.expressions.MutableAggregationBuffer
-  
- MutablePair<T1,T2> - Class in org.apache.spark.util
- 
:: DeveloperApi ::
 A tuple of 2 elements. 
- MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
-  
- MutablePair() - Constructor for class org.apache.spark.util.MutablePair
- 
No-arg constructor for serialization 
- myName() - Method in class org.apache.spark.util.InnerClosureFinder
-  
- MySQLDialect - Class in org.apache.spark.sql.jdbc
-  
- MySQLDialect() - Constructor for class org.apache.spark.sql.jdbc.MySQLDialect
-  
- p() - Method in class org.apache.spark.ml.feature.Normalizer
- 
Normalization in L^p space. 
- pageRank(double, double) - Method in class org.apache.spark.graphx.GraphOps
- 
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
 PageRank and edge attributes containing the normalized edge weight. 
- PageRank - Class in org.apache.spark.graphx.lib
- 
PageRank algorithm implementation. 
- PageRank() - Constructor for class org.apache.spark.graphx.lib.PageRank
-  
- PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream
- 
Extra functions available on DStream of (key, value) pairs through an implicit conversion. 
- PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
-  
- PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function
- 
A function that returns zero or more key-value pair records from each input record. 
- PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function
- 
A function that returns key-value pairs (Tuple2<K, V>), and can be used to
 construct PairRDDs. 
- PairRDDFunctions<K,V> - Class in org.apache.spark.rdd
- 
Extra functions available on RDDs of (key, value) pairs through an implicit conversion. 
- PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
-  
- PairwiseRRDD<T> - Class in org.apache.spark.api.r
- 
Form an RDD[(Int, Array[Byte])] from key-value pairs returned from R. 
- PairwiseRRDD(RDD<T>, int, byte[], String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.PairwiseRRDD
-  
- parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Distribute a local Scala collection to form an RDD. 
- parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Distribute a local Scala collection to form an RDD. 
- parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
- 
Distribute a local Scala collection to form an RDD. 
- parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Distribute a local Scala collection to form an RDD. 
- parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Distribute a local Scala collection to form an RDD. 
- parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Distribute a local Scala collection to form an RDD. 
- parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Distribute a local Scala collection to form an RDD. 
- Param<T> - Class in org.apache.spark.ml.param
- 
:: DeveloperApi ::
 A param with self-contained documentation and optionally default value. 
- Param(String, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
-  
- Param(Identifiable, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
-  
- Param(String, String, String) - Constructor for class org.apache.spark.ml.param.Param
-  
- Param(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.Param
-  
- param() - Method in class org.apache.spark.ml.param.ParamPair
-  
- ParamGridBuilder - Class in org.apache.spark.ml.tuning
- 
:: Experimental ::
 Builder for a param grid used in grid search-based model selection. 
- ParamGridBuilder() - Constructor for class org.apache.spark.ml.tuning.ParamGridBuilder
-  
- ParamMap - Class in org.apache.spark.ml.param
- 
:: Experimental ::
 A param to value map. 
- ParamMap() - Constructor for class org.apache.spark.ml.param.ParamMap
- 
Creates an empty param map. 
- paramMap() - Method in interface org.apache.spark.ml.param.Params
- 
Internal param map for user-supplied values. 
- ParamPair<T> - Class in org.apache.spark.ml.param
- 
:: Experimental ::
 A param and its value. 
- ParamPair(Param<T>, T) - Constructor for class org.apache.spark.ml.param.ParamPair
-  
- Params - Interface in org.apache.spark.ml.param
-  
- params() - Method in interface org.apache.spark.ml.param.Params
-  
- ParamValidators - Class in org.apache.spark.ml.param
- 
:: DeveloperApi ::
 Factory methods for common validation functions for Param.isValid.
 
- ParamValidators() - Constructor for class org.apache.spark.ml.param.ParamValidators
-  
- parent() - Method in class org.apache.spark.ml.Model
- 
The parent estimator that produced this model. 
- parent() - Method in class org.apache.spark.ml.param.Param
-  
- parent(int, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Returns the jth parent RDD: e.g. 
- parentIds() - Method in class org.apache.spark.scheduler.StageInfo
-  
- parentIds() - Method in class org.apache.spark.storage.RDDInfo
-  
- parentIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
- 
Get the parent index of the given node, or 0 if it is the root. 
- parquet(String...) - Method in class org.apache.spark.sql.DataFrameReader
- 
Loads a Parquet file, returning the result as a  DataFrame. 
- parquet(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
- 
Loads a Parquet file, returning the result as a  DataFrame. 
- parquet(String) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Saves the content of the  DataFrame in Parquet format at the specified path. 
- parquetFile(String...) - Method in class org.apache.spark.sql.SQLContext
- 
Deprecated.
As of 1.4.0, replaced by read().parquet(). This will be removed in Spark 2.0.
 
 
- parquetFile(Seq<String>) - Method in class org.apache.spark.sql.SQLContext
-  
- parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Parses a string produced by  Vector.toString into a  Vector. 
- parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
-  
- parseDataType(String) - Method in class org.apache.spark.sql.SQLContext
-  
- parseIgnoreCase(Class<E>, String) - Static method in class org.apache.spark.util.EnumUtil
-  
- parseSql(String) - Method in class org.apache.spark.sql.hive.HiveContext
-  
- parseSql(String) - Method in class org.apache.spark.sql.SQLContext
-  
- PartialResult<R> - Class in org.apache.spark.partial
-  
- PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
-  
- Partition - Interface in org.apache.spark
- 
An identifier for a partition in an RDD. 
- partition() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-  
- partition() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-  
- partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a copy of the RDD partitioned using the specified partitioner. 
- partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.Graph
- 
Repartitions the edges in the graph according to partitionStrategy.
 
- partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.Graph
- 
Repartitions the edges in the graph according to partitionStrategy.
 
- partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return a copy of the RDD partitioned using the specified partitioner. 
- partitionBy(String...) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Partitions the output by the given columns on the file system. 
- partitionBy(Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Partitions the output by the given columns on the file system. 
- partitionBy(String, String...) - Static method in class org.apache.spark.sql.expressions.Window
- 
Creates a  WindowSpec with the partitioning defined. 
- partitionBy(Column...) - Static method in class org.apache.spark.sql.expressions.Window
- 
Creates a  WindowSpec with the partitioning defined. 
- partitionBy(String, Seq<String>) - Static method in class org.apache.spark.sql.expressions.Window
- 
Creates a  WindowSpec with the partitioning defined. 
- partitionBy(Seq<Column>) - Static method in class org.apache.spark.sql.expressions.Window
- 
Creates a  WindowSpec with the partitioning defined. 
- partitionBy(String, String...) - Method in class org.apache.spark.sql.expressions.WindowSpec
- 
- partitionBy(Column...) - Method in class org.apache.spark.sql.expressions.WindowSpec
- 
- partitionBy(String, Seq<String>) - Method in class org.apache.spark.sql.expressions.WindowSpec
- 
- partitionBy(Seq<Column>) - Method in class org.apache.spark.sql.expressions.WindowSpec
- 
- PartitionCoalescer - Class in org.apache.spark.rdd
- 
Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of
 this RDD computes one or more of the parent ones.
 
- PartitionCoalescer(int, RDD<?>, double) - Constructor for class org.apache.spark.rdd.PartitionCoalescer
-  
- PartitionCoalescer.LocationIterator - Class in org.apache.spark.rdd
-  
- PartitionCoalescer.LocationIterator(RDD<?>) - Constructor for class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
-  
- partitionColumns() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
- 
Partition columns. 
- partitioner() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
The partitioner of this RDD. 
- partitioner() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
- 
If partitionsRDD already has a partitioner, use it.
 
- partitioner() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- Partitioner - Class in org.apache.spark
- 
An object that defines how the elements in a key-value pair RDD are partitioned by key. 
- Partitioner() - Constructor for class org.apache.spark.Partitioner
-  
- partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
-  
- partitioner() - Method in class org.apache.spark.rdd.RDD
- 
Optionally overridden by subclasses to specify how they are partitioned. 
- partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
-  
- partitioner() - Method in class org.apache.spark.ShuffleDependency
-  
- partitioner(Partitioner) - Method in class org.apache.spark.streaming.StateSpec
- 
Set the partitioner by which the state RDDs generated by mapWithState will be
 partitioned.
 
- PartitionGroup - Class in org.apache.spark.rdd
-  
- PartitionGroup(Option<String>) - Constructor for class org.apache.spark.rdd.PartitionGroup
-  
- partitionID() - Method in class org.apache.spark.TaskCommitDenied
-  
- partitionId() - Method in class org.apache.spark.TaskContext
- 
The ID of the RDD partition that is computed by this task. 
- PartitionPruningRDD<T> - Class in org.apache.spark.rdd
- 
:: DeveloperApi ::
 An RDD used to prune RDD partitions so we can avoid launching tasks on
 all partitions. 
- PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
-  
- partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Set of partitions in this RDD. 
- partitions() - Method in class org.apache.spark.rdd.RDD
- 
Get the array of partitions of this RDD, taking into account whether the
 RDD is checkpointed or not. 
- partitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-  
- partitionsRDD() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- partitionsRDD() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- PartitionStrategy - Interface in org.apache.spark.graphx
- 
Represents the way edges are assigned to edge partitions based on their source and destination
 vertex IDs. 
- PartitionStrategy.CanonicalRandomVertexCut$ - Class in org.apache.spark.graphx
- 
Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical
 direction, resulting in a random vertex cut that colocates all edges between two vertices,
 regardless of direction. 
- PartitionStrategy.CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-  
- PartitionStrategy.EdgePartition1D$ - Class in org.apache.spark.graphx
- 
Assigns edges to partitions using only the source vertex ID, colocating edges with the same
 source. 
- PartitionStrategy.EdgePartition1D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-  
- PartitionStrategy.EdgePartition2D$ - Class in org.apache.spark.graphx
- 
Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix,
 guaranteeing a 2 * sqrt(numParts) bound on vertex replication.
 
- PartitionStrategy.EdgePartition2D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-  
- PartitionStrategy.RandomVertexCut$ - Class in org.apache.spark.graphx
- 
Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a
 random vertex cut that colocates all same-direction edges between two vertices. 
- PartitionStrategy.RandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-  
- path() - Method in class org.apache.spark.scheduler.InputFormatInfo
-  
- path() - Method in class org.apache.spark.scheduler.SplitInfo
-  
- path() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
-  
- paths() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
- 
Paths of this relation. 
- pattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer
- 
Regex pattern used to match delimiters if gaps is true or tokens if gaps is false.
 
- pc() - Method in class org.apache.spark.ml.feature.PCAModel
-  
- pc() - Method in class org.apache.spark.mllib.feature.PCAModel
-  
- PCA - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 PCA trains a model to project vectors to a low-dimensional space using PCA. 
- PCA(String) - Constructor for class org.apache.spark.ml.feature.PCA
-  
- PCA() - Constructor for class org.apache.spark.ml.feature.PCA
-  
- PCA - Class in org.apache.spark.mllib.feature
- 
A feature transformer that projects vectors to a low-dimensional space using PCA. 
- PCA(int) - Constructor for class org.apache.spark.mllib.feature.PCA
-  
- PCAModel - Class in org.apache.spark.ml.feature
-  
- PCAModel - Class in org.apache.spark.mllib.feature
- 
Model fitted by  PCA that can project vectors to a low-dimensional space using PCA. 
- pdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
- 
Returns the density of this multivariate Gaussian at the given point x. 
- pendingStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- percent_rank() - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the relative rank (i.e. 
- percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- percentRank() - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.6.0, replaced by percent_rank. This will be removed in Spark 2.0.
 
 
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Set this RDD's storage level to persist its values across operations after the first time
 it is computed. 
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Set this RDD's storage level to persist its values across operations after the first time
 it is computed. 
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD
- 
Set this RDD's storage level to persist its values across operations after the first time
 it is computed. 
- persist(StorageLevel) - Method in class org.apache.spark.graphx.Graph
- 
Caches the vertices and edges associated with this graph at the specified storage level,
 ignoring any target storage levels previously set. 
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
- 
Persists the edge partitions at the specified storage level, ignoring any existing target
 storage level. 
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
- 
Persists the vertex partitions at the specified storage level, ignoring any existing target
 storage level. 
- persist(StorageLevel) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
- 
Persists the underlying RDD with the specified storage level. 
- persist(StorageLevel) - Method in class org.apache.spark.rdd.HadoopRDD
-  
- persist(StorageLevel) - Method in class org.apache.spark.rdd.NewHadoopRDD
-  
- persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD
- 
Set this RDD's storage level to persist its values across operations after the first time
 it is computed. 
- persist() - Method in class org.apache.spark.rdd.RDD
- 
Persist this RDD with the default storage level (`MEMORY_ONLY`). 
- persist() - Method in class org.apache.spark.sql.DataFrame
- 
Persist this  DataFrame with the default storage level ( MEMORY_AND_DISK). 
- persist(StorageLevel) - Method in class org.apache.spark.sql.DataFrame
- 
Persist this  DataFrame with the given storage level. 
- persist() - Method in class org.apache.spark.sql.Dataset
- 
Persist this  Dataset with the default storage level ( MEMORY_AND_DISK). 
- persist(StorageLevel) - Method in class org.apache.spark.sql.Dataset
- 
Persist this  Dataset with the given storage level. 
- persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream
- 
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER) 
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream
- 
Persist the RDDs of this DStream with the given storage level 
- persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER) 
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Persist the RDDs of this DStream with the given storage level 
- persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Persist the RDDs of this DStream with the given storage level 
- persist() - Method in class org.apache.spark.streaming.dstream.DStream
- 
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER) 
- persistentRdds() - Method in class org.apache.spark.SparkContext
-  
- personalizedPageRank(long, double, double) - Method in class org.apache.spark.graphx.GraphOps
- 
Run personalized PageRank for a given vertex, such that all random walks
 are started relative to the source node. 
- pi() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-  
- pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- pickBin(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
- 
Takes a parent RDD partition and decides which of the partition groups to put it in.
 Takes locality into account, but also uses power of 2 choices to load balance.
 It strikes a balance between the two using the balanceSlack variable. 
- pickRandomVertex() - Method in class org.apache.spark.graphx.GraphOps
- 
Picks a random vertex from the graph and returns its ID. 
- pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an RDD created by piping elements to a forked external process. 
- pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an RDD created by piping elements to a forked external process. 
- pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an RDD created by piping elements to a forked external process. 
- pipe(String) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD created by piping elements to a forked external process. 
- pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD created by piping elements to a forked external process. 
- pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD created by piping elements to a forked external process. 
- Pipeline - Class in org.apache.spark.ml
- 
:: Experimental ::
 A simple pipeline, which acts as an estimator. 
- Pipeline(String) - Constructor for class org.apache.spark.ml.Pipeline
-  
- Pipeline() - Constructor for class org.apache.spark.ml.Pipeline
-  
- PipelineModel - Class in org.apache.spark.ml
- 
:: Experimental ::
 Represents a fitted pipeline. 
- PipelineStage - Class in org.apache.spark.ml
- 
- PipelineStage() - Constructor for class org.apache.spark.ml.PipelineStage
-  
- pivot(String) - Method in class org.apache.spark.sql.GroupedData
- 
Pivots a column of the current  DataFrame and performs the specified aggregation. 
- pivot(String, Seq<Object>) - Method in class org.apache.spark.sql.GroupedData
- 
Pivots a column of the current  DataFrame and performs the specified aggregation. 
- pivot(String, List<Object>) - Method in class org.apache.spark.sql.GroupedData
- 
Pivots a column of the current  DataFrame and performs the specified aggregation. 
- planner() - Method in class org.apache.spark.sql.hive.HiveContext
-  
- planner() - Method in class org.apache.spark.sql.SQLContext
-  
- plus(Object) - Method in class org.apache.spark.sql.Column
- 
Sum of this expression and another expression. 
- plus(Duration) - Method in class org.apache.spark.streaming.Duration
-  
- plus(Duration) - Method in class org.apache.spark.streaming.Time
-  
- plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector
- 
Returns (this + plus) dot other, but without creating any intermediate storage. 
- PMMLExportable - Interface in org.apache.spark.mllib.pmml
- 
:: DeveloperApi ::
 Export model to the PMML format.
 Predictive Model Markup Language (PMML) is an XML-based file format
 developed by the Data Mining Group (www.dmg.org). 
- pmod(Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the positive value of dividend mod divisor. 
- point() - Method in class org.apache.spark.mllib.feature.VocabWord
-  
- POINTS() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
-  
- PoissonGenerator - Class in org.apache.spark.mllib.random
- 
:: DeveloperApi ::
 Generates i.i.d. 
- PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
-  
- poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input
 mean.
 
- PoissonSampler<T> - Class in org.apache.spark.util.random
- 
:: DeveloperApi ::
 A sampler for sampling with replacement, based on values drawn from Poisson distribution. 
- PoissonSampler(double, boolean, ClassTag<T>) - Constructor for class org.apache.spark.util.random.PoissonSampler
-  
- PoissonSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.PoissonSampler
-  
- poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the
 Poisson distribution with the input mean.
 
- PolynomialExpansion - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 Perform feature expansion in a polynomial space. 
- PolynomialExpansion(String) - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
-  
- PolynomialExpansion() - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
-  
- poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- port() - Method in class org.apache.spark.storage.BlockManagerId
-  
- port() - Method in class org.apache.spark.streaming.kafka.Broker
- 
Broker's port 
- PortableDataStream - Class in org.apache.spark.input
- 
A class that allows DataStreams to be serialized and moved around by not creating them
 until they need to be read 
- PortableDataStream(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.PortableDataStream
-  
- PostgresDialect - Class in org.apache.spark.sql.jdbc
-  
- PostgresDialect() - Constructor for class org.apache.spark.sql.jdbc.PostgresDialect
-  
- pow(Column, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the first argument raised to the power of the second argument. 
- pow(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the first argument raised to the power of the second argument. 
- pow(String, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the first argument raised to the power of the second argument. 
- pow(String, String) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the first argument raised to the power of the second argument. 
- pow(Column, double) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the first argument raised to the power of the second argument. 
- pow(String, double) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the first argument raised to the power of the second argument. 
- pow(double, Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the first argument raised to the power of the second argument. 
- pow(double, String) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the first argument raised to the power of the second argument. 
- PowerIterationClustering - Class in org.apache.spark.mllib.clustering
-  
- PowerIterationClustering() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering
-  
- PowerIterationClustering.Assignment - Class in org.apache.spark.mllib.clustering
- 
Cluster assignment. 
- PowerIterationClustering.Assignment(long, int) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-  
- PowerIterationClustering.Assignment$ - Class in org.apache.spark.mllib.clustering
-  
- PowerIterationClustering.Assignment$() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
-  
- PowerIterationClusteringModel - Class in org.apache.spark.mllib.clustering
- 
- PowerIterationClusteringModel(int, RDD<PowerIterationClustering.Assignment>) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-  
- pr() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
- 
Returns the precision-recall curve, which is a DataFrame containing
 two fields (recall, precision), with (0.0, 1.0) prepended to it. 
- pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Returns the precision-recall curve, which is an RDD of (recall, precision),
 NOT (precision, recall), with (0.0, 1.0) prepended to it. 
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns precision for a given label (category) 
- precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns precision 
- precision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns document-based precision averaged by the number of documents 
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns precision for a given label (category) 
- precision() - Method in class org.apache.spark.sql.types.Decimal
-  
- precision() - Method in class org.apache.spark.sql.types.DecimalType
-  
- precision() - Method in class org.apache.spark.sql.types.PrecisionInfo
-  
- precisionAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
- 
Compute the average precision of all the queries, truncated at ranking position k. 
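As a rough illustration of the entry above, the following plain-Java sketch computes precision truncated at position k for one query and averages it over several queries. It does not use the Spark API; the class name `PrecisionAtKSketch`, both method names, and the choice to divide by k even when fewer than k predictions exist are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Spark: illustrates precision at k.
class PrecisionAtKSketch {

    /** Fraction of the first k predicted items that appear in the relevant set. */
    public static double precisionAtK(int[] predicted, int[] relevant, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        Set<Integer> relevantSet = new HashSet<>();
        for (int r : relevant) {
            relevantSet.add(r);
        }
        int n = Math.min(k, predicted.length);
        int hits = 0;
        for (int i = 0; i < n; i++) {
            if (relevantSet.contains(predicted[i])) {
                hits++;
            }
        }
        // Assumption: divide by k, so missing predictions count as misses.
        return (double) hits / k;
    }

    /** Average of precisionAtK over all queries (one row per query). */
    public static double meanPrecisionAtK(int[][] predicted, int[][] relevant, int k) {
        if (predicted.length == 0 || predicted.length != relevant.length) {
            throw new IllegalArgumentException("need one relevant set per query");
        }
        double sum = 0.0;
        for (int q = 0; q < predicted.length; q++) {
            sum += precisionAtK(predicted[q], relevant[q], k);
        }
        return sum / predicted.length;
    }

    public static void main(String[] args) {
        int[][] predicted = {{1, 2, 3}, {5, 6}};
        int[][] relevant = {{1, 3}, {7}};
        System.out.println(meanPrecisionAtK(predicted, relevant, 3));
    }
}
```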
- precisionByThreshold() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
- 
Returns the (threshold, precision) curve as a DataFrame with two fields. 
- precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Returns the (threshold, precision) curve. 
- precisionInfo() - Method in class org.apache.spark.sql.types.DecimalType
-  
- PrecisionInfo - Class in org.apache.spark.sql.types
- 
Precision parameters for a Decimal 
- PrecisionInfo(int, int) - Constructor for class org.apache.spark.sql.types.PrecisionInfo
-  
- predict(FeaturesType) - Method in class org.apache.spark.ml.classification.ClassificationModel
- 
Predict label for the given features. 
- predict(Vector) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-  
- predict(Vector) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-  
- predict(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
- 
Predict label for the given feature vector. 
- predict(Vector) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
- 
Predict label for the given features. 
- predict(FeaturesType) - Method in class org.apache.spark.ml.PredictionModel
- 
Predict label for the given features. 
- predict(Vector) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- predict(Vector) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-  
- predict(Vector) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-  
- predict(Vector) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-  
- predict(Vector) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-  
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
- 
Predict values for the given data set using the model trained. 
- predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
- 
Predict values for a single data point using the model trained. 
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
- 
Predict values for examples stored in a JavaRDD. 
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
- 
Predicts the index of the cluster that the input point belongs to. 
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
- 
Predicts the indices of the clusters that the input points belong to. 
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
- 
Java-friendly version of predict().
 
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
- 
Maps given points to their cluster indices. 
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
- 
Maps given point to its cluster index. 
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
- 
Java-friendly version of predict()
 
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel
- 
Returns the cluster index that a given point belongs to. 
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
- 
Maps given points to their cluster indices. 
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
- 
Maps given points to their cluster indices. 
- predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Predict the rating of one user for one product. 
- predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Predict the rating of many users for many products. 
- predict(JavaPairRDD<Integer, Integer>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Java-friendly version of MatrixFactorizationModel.predict.
 
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
- 
Predict values for the given data set using the model trained. 
- predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
- 
Predict values for a single data point using the model trained. 
- predict(RDD<Object>) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
- 
Predict labels for provided features. 
- predict(JavaDoubleRDD) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
- 
Predict labels for provided features. 
- predict(double) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
- 
Predict a single label. 
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
- 
Predict values for the given data set using the model trained. 
- predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel
- 
Predict values for a single data point using the model trained. 
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
- 
Predict values for examples stored in a JavaRDD. 
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
- 
Predict values for a single data point using the model trained. 
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
- 
Predict values for the given data set using the model trained. 
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
- 
Predict values for the given data set using the model trained. 
- predict() - Method in class org.apache.spark.mllib.tree.model.Node
-  
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node
- 
Predicts the value if the node is not a leaf. 
- Predict - Class in org.apache.spark.mllib.tree.model
- 
Predicted value for a node
 param:  predict predicted value
 param:  prob probability of the label (classification only) 
- Predict(double, double) - Constructor for class org.apache.spark.mllib.tree.model.Predict
-  
- predict() - Method in class org.apache.spark.mllib.tree.model.Predict
-  
- prediction() - Method in class org.apache.spark.ml.tree.InternalNode
-  
- prediction() - Method in class org.apache.spark.ml.tree.LeafNode
-  
- prediction() - Method in class org.apache.spark.ml.tree.Node
- 
Prediction a leaf node makes, or which an internal node would make if it were a leaf node 
- predictionCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-  
- PredictionModel<FeaturesType,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml
- 
:: DeveloperApi ::
 Abstraction for a model for prediction tasks (regression and classification). 
- PredictionModel() - Constructor for class org.apache.spark.ml.PredictionModel
-  
- predictions() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-  
- predictions() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
- 
DataFrame output by the model's `transform` method. 
- predictions() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
- 
Predictions associated with the boundaries at the same index, monotone because of isotonic
 regression. 
- predictions() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-  
- predictions() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-  
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Use the clustering model to make predictions on batches of data from a DStream. 
- predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Java-friendly version of predictOn.
 
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
Use the model to make predictions on batches of data from a DStream 
- predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
Java-friendly version of predictOn.
 
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Use the model to make predictions on the values of a DStream and carry over its keys. 
- predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Java-friendly version of predictOnValues.
 
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
Use the model to make predictions on the values of a DStream and carry over its keys. 
- predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
Java-friendly version of predictOnValues.
 
- Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml
- 
:: DeveloperApi ::
 Abstraction for prediction problems (regression and classification). 
- Predictor() - Constructor for class org.apache.spark.ml.Predictor
-  
- predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-  
- predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.classification.SVMModel
-  
- predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
- 
Predict the result given a data point and the weights learned. 
- predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.regression.LassoModel
-  
- predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
-  
- predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-  
- predictProbabilities(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
- 
Predict values for the given data set using the model trained. 
- predictProbabilities(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
- 
Predict posterior class probabilities for a single data point using the model trained. 
- predictProbability(FeaturesType) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
- 
Predict the probability of each class given the features. 
- predictQuantiles(Vector) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- predictRaw(FeaturesType) - Method in class org.apache.spark.ml.classification.ClassificationModel
- 
Raw prediction for each possible label. 
- predictRaw(Vector) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-  
- predictRaw(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- predictRaw(Vector) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-  
- predictRaw(Vector) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-  
- predictSoft(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
- 
Given the input vectors, return the membership value of each vector
 to all mixture components. 
- predictSoft(Vector) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
- 
Given the input vector, return the membership values to all mixture components. 
- preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Override this to specify a preferred location (hostname). 
- preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
- 
Get the preferred locations of a partition, taking into account whether the
 RDD is checkpointed. 
- PrefixSpan - Class in org.apache.spark.mllib.fpm
- 
:: Experimental :: 
- PrefixSpan() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan
- 
Constructs a default instance with default parameters
 {minSupport: 0.1, maxPatternLength: 10, maxLocalProjDBSize: 32000000L}.
 
- PrefixSpan.FreqSequence<Item> - Class in org.apache.spark.mllib.fpm
- 
Represents a frequent sequence. 
- PrefixSpan.FreqSequence(Object[], long) - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
-  
- PrefixSpanModel<Item> - Class in org.apache.spark.mllib.fpm
- 
Model fitted by  PrefixSpan
 param:  freqSequences frequent sequences 
- PrefixSpanModel(RDD<PrefixSpan.FreqSequence<Item>>) - Constructor for class org.apache.spark.mllib.fpm.PrefixSpanModel
-  
- prefLoc() - Method in class org.apache.spark.rdd.PartitionGroup
-  
- pregel(A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<A>) - Method in class org.apache.spark.graphx.GraphOps
- 
Execute a Pregel-like iterative vertex-parallel abstraction. 
- Pregel - Class in org.apache.spark.graphx
- 
Implements a Pregel-like bulk-synchronous message-passing API. 
- Pregel() - Constructor for class org.apache.spark.graphx.Pregel
-  
- prepareForExecution() - Method in class org.apache.spark.sql.SQLContext
-  
- prepareJobForWrite(Job) - Method in class org.apache.spark.sql.sources.HadoopFsRelation
- 
- prettyJson() - Method in class org.apache.spark.sql.types.DataType
- 
The pretty (i.e. 
- prettyPrint() - Method in class org.apache.spark.streaming.Duration
-  
- prev() - Method in class org.apache.spark.rdd.ShuffledRDD
-  
- prevHandler() - Method in class org.apache.spark.util.SignalLoggerHandler
-  
- primitiveTypes() - Static method in class org.apache.spark.sql.hive.HiveContext
-  
- print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Print the first ten elements of each RDD generated in this DStream. 
- print(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Print the first num elements of each RDD generated in this DStream. 
- print() - Method in class org.apache.spark.streaming.dstream.DStream
- 
Print the first ten elements of each RDD generated in this DStream. 
- print(int) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Print the first num elements of each RDD generated in this DStream. 
- printSchema() - Method in class org.apache.spark.sql.DataFrame
- 
Prints the schema to the console in a nice tree format. 
- printSchema() - Method in class org.apache.spark.sql.Dataset
- 
Prints the schema of the underlying  Dataset to the console in a nice tree format. 
- printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-  
- printTreeString() - Method in class org.apache.spark.sql.types.StructType
-  
- Private - Annotation Type in org.apache.spark.annotation
- 
A class that is considered private to the internals of Spark; there is a high likelihood
 it will be changed in future versions of Spark. 
- prob() - Method in class org.apache.spark.mllib.tree.model.Predict
-  
- ProbabilisticClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
- 
:: DeveloperApi :: 
- ProbabilisticClassificationModel() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-  
- ProbabilisticClassifier<FeaturesType,E extends ProbabilisticClassifier<FeaturesType,E,M>,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
- 
:: DeveloperApi :: 
- ProbabilisticClassifier() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassifier
-  
- probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- probability2prediction(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- probability2prediction(Vector) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
- 
Given a vector of class conditional probabilities, select the predicted label. 
- probabilityCol() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-  
- probabilityCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
- 
Field in "predictions" which gives the calibrated probability of each instance as a vector. 
- PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-  
- processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
- 
Time taken for all jobs of this batch to finish processing, measured from the time they
 started processing. 
- processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-  
- processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-  
- product() - Method in class org.apache.spark.mllib.recommendation.Rating
-  
- productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-  
- progressListener() - Method in class org.apache.spark.streaming.StreamingContext
-  
- properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-  
- properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-  
- PrunedFilteredScan - Interface in org.apache.spark.sql.sources
- 
::DeveloperApi::
 A BaseRelation that can eliminate unneeded columns and filter using selected
 predicates before producing an RDD containing all matching tuples as Row objects. 
- PrunedScan - Interface in org.apache.spark.sql.sources
- 
::DeveloperApi::
 A BaseRelation that can eliminate unneeded columns before producing an RDD
 containing all of its tuples as Row objects. 
- Pseudorandom - Interface in org.apache.spark.util.random
- 
:: DeveloperApi ::
 A class with pseudorandom behavior. 
- put(ParamPair<?>...) - Method in class org.apache.spark.ml.param.ParamMap
- 
Puts a list of param pairs (overwrites if the input params exist). 
- put(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
- 
Puts a (param, value) pair (overwrites if the input param exists). 
- put(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.param.ParamMap
- 
Puts a list of param pairs (overwrites if the input params exist). 
- putBoolean(String, boolean) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a Boolean. 
- putBooleanArray(String, boolean[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a Boolean array. 
- putDouble(String, double) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a Double. 
- putDoubleArray(String, double[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a Double array. 
- putLong(String, long) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a Long. 
- putLongArray(String, long[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a Long array. 
- putMetadata(String, Metadata) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
- putMetadataArray(String, Metadata[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
- putNull(String) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a null. 
- putString(String, String) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a String. 
- putStringArray(String, String[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
- 
Puts a String array. 
- pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-  
- pValue() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
-  
- pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult
- 
The probability of obtaining a test statistic result at least as extreme as the one that was
 actually observed, assuming that the null hypothesis is true. 
- pValues() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
Two-sided p-value of estimated coefficients and intercept. 
- pyUDT() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-  
- pyUDT() - Method in class org.apache.spark.sql.types.UserDefinedType
- 
Paired Python UDT class, if it exists. 
- R() - Method in class org.apache.spark.mllib.linalg.QRDecomposition
-  
- r2() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
Returns R², the coefficient of determination. 
- r2() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
- 
Returns R², the unadjusted coefficient of determination. 
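For readers unfamiliar with the metric, here is a minimal plain-Java sketch of the usual textbook definition of the unadjusted coefficient of determination, 1 - SS_res / SS_tot. It is not taken from the Spark source; the class name `R2Sketch` and the method signature are assumptions for illustration only.

```java
// Hypothetical helper, not part of Spark: textbook unadjusted R-squared.
class R2Sketch {

    /** Returns 1 - SS_res / SS_tot for paired observed and predicted values. */
    public static double r2(double[] observed, double[] predicted) {
        if (observed.length == 0 || observed.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double mean = 0.0;
        for (double y : observed) {
            mean += y;
        }
        mean /= observed.length;

        double ssRes = 0.0; // residual sum of squares
        double ssTot = 0.0; // total sum of squares around the mean
        for (int i = 0; i < observed.length; i++) {
            double residual = observed[i] - predicted[i];
            double deviation = observed[i] - mean;
            ssRes += residual * residual;
            ssTot += deviation * deviation;
        }
        // Note: if all observed values are equal, ssTot is 0 and the result is not finite.
        return 1.0 - ssRes / ssTot;
    }

    public static void main(String[] args) {
        double[] observed = {1.0, 2.0, 3.0};
        double[] predicted = {1.0, 2.0, 4.0};
        System.out.println(r2(observed, predicted));
    }
}
```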
- RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-  
- rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
- 
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
 
- rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
- 
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
 
- rand(long) - Static method in class org.apache.spark.sql.functions
- 
Generate a random column with i.i.d. 
- rand() - Static method in class org.apache.spark.sql.functions
- 
Generate a random column with i.i.d. 
- randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
- 
Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
 
- randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
- 
Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
 
- randn(long) - Static method in class org.apache.spark.sql.functions
- 
Generate a column with i.i.d. 
- randn() - Static method in class org.apache.spark.sql.functions
- 
Generate a column with i.i.d. 
- RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
-  
- random(int, Random) - Static method in class org.apache.spark.util.Vector
- 
Creates this  Vector of given length containing random numbers
 between 0.0 and 1.0. 
- RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random
- 
:: DeveloperApi ::
 Trait for random data generators that generate i.i.d. 
- RandomForest - Class in org.apache.spark.mllib.tree
- 
A class that implements a Random Forest learning algorithm for classification and regression.
 
- RandomForest(Strategy, int, String, int) - Constructor for class org.apache.spark.mllib.tree.RandomForest
-  
- RandomForestClassificationModel - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Random Forest model for classification.
 
- RandomForestClassifier - Class in org.apache.spark.ml.classification
- 
:: Experimental ::
 Random Forest learning algorithm for
 classification.
 
- RandomForestClassifier(String) - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
-  
- RandomForestClassifier() - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
-  
- RandomForestModel - Class in org.apache.spark.mllib.tree.model
- 
Represents a random forest model. 
- RandomForestModel(Enumeration.Value, DecisionTreeModel[]) - Constructor for class org.apache.spark.mllib.tree.model.RandomForestModel
-  
- RandomForestRegressionModel - Class in org.apache.spark.ml.regression
- 
:: Experimental ::
 Random Forest model for regression.
 
- RandomForestRegressor - Class in org.apache.spark.ml.regression
- 
:: Experimental ::
 Random Forest learning algorithm for regression.
 
- RandomForestRegressor(String) - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
-  
- RandomForestRegressor() - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
-  
- randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
:: DeveloperApi ::
 Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
 
- randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
- randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
:: DeveloperApi ::
 Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
 
- RandomRDDs - Class in org.apache.spark.mllib.random
- 
Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
 
- RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
-  
- RandomSampler<T,U> - Interface in org.apache.spark.util.random
- 
:: DeveloperApi ::
 A pseudorandom sampler. 
- randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD
- 
Randomly splits this RDD with the provided weights. 
- randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD
- 
Randomly splits this RDD with the provided weights. 
- randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD
- 
Randomly splits this RDD with the provided weights. 
- randomSplit(double[], long) - Method in class org.apache.spark.sql.DataFrame
- 
Randomly splits this  DataFrame with the provided weights. 
- randomSplit(double[]) - Method in class org.apache.spark.sql.DataFrame
- 
Randomly splits this  DataFrame with the provided weights. 
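One common way to implement a weighted random split, sketched here in plain Java on an in-memory list rather than an RDD or DataFrame, is to normalize the weights into cumulative boundaries on [0, 1] and assign each element to the cell that contains a uniform random draw. The class `RandomSplitSketch` and its methods are hypothetical, and this is not presented as Spark's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Spark: weighted random split of a list.
class RandomSplitSketch {

    /** Normalizes weights to sum to 1 and returns cumulative bounds starting at 0. */
    public static double[] cumulativeBounds(double[] weights) {
        double sum = 0.0;
        for (double w : weights) {
            if (w < 0.0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            sum += w;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        double[] bounds = new double[weights.length + 1];
        double acc = 0.0;
        for (int i = 0; i < weights.length; i++) {
            acc += weights[i];
            bounds[i + 1] = acc / sum;
        }
        bounds[weights.length] = 1.0; // guard against floating-point rounding
        return bounds;
    }

    /** Assigns each item to exactly one split, chosen with probability proportional to its weight. */
    public static <T> List<List<T>> randomSplit(List<T> items, double[] weights, long seed) {
        double[] bounds = cumulativeBounds(weights);
        List<List<T>> splits = new ArrayList<>();
        for (int i = 0; i < weights.length; i++) {
            splits.add(new ArrayList<T>());
        }
        Random rng = new Random(seed);
        for (T item : items) {
            double x = rng.nextDouble(); // uniform in [0, 1)
            for (int i = 0; i < weights.length; i++) {
                if (x >= bounds[i] && x < bounds[i + 1]) {
                    splits.get(i).add(item);
                    break;
                }
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            items.add(i);
        }
        List<List<Integer>> parts = randomSplit(items, new double[]{3.0, 1.0}, 42L);
        System.out.println(parts.get(0).size() + " / " + parts.get(1).size());
    }
}
```

Passing the same seed reproduces the same split, which is the role the `long` argument plays in the seeded overloads listed above.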
- randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
- 
:: DeveloperApi ::
 Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the
 input RandomDataGenerator.
 
- range(long, long, long, int) - Method in class org.apache.spark.SparkContext
- 
Creates a new RDD[Long] containing elements from start to end (exclusive), increased by step every element.
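The semantics described above (start inclusive, end exclusive, advancing by step) can be sketched on a local list in plain Java. This does not call Spark and ignores partitioning; the class `RangeSketch` is hypothetical, and the handling of negative steps and the rejection of a zero step are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: local illustration of range semantics.
class RangeSketch {

    /** Returns start, start + step, ... stopping before end is reached. */
    public static List<Long> range(long start, long end, long step) {
        if (step == 0) {
            throw new IllegalArgumentException("step cannot be 0");
        }
        List<Long> out = new ArrayList<>();
        if (step > 0) {
            for (long v = start; v < end; v += step) {
                out.add(v);
            }
        } else {
            // Assumption: a negative step counts down while v is still above end.
            for (long v = start; v > end; v += step) {
                out.add(v);
            }
        }
        // Note: values near Long.MAX_VALUE or Long.MIN_VALUE could overflow; not handled here.
        return out;
    }

    public static void main(String[] args) {
        System.out.println(range(0L, 10L, 3L));
    }
}
```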
 
- range(long) - Method in class org.apache.spark.sql.SQLContext
-  
- range(long, long) - Method in class org.apache.spark.sql.SQLContext
-  
- range(long, long, long, int) - Method in class org.apache.spark.sql.SQLContext
-  
- rangeBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec
- 
Defines the frame boundaries, from start (inclusive) to end (inclusive).
 
- RangeDependency<T> - Class in org.apache.spark
- 
:: DeveloperApi ::
 Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs. 
- RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
-  
- RangePartitioner<K,V> - Class in org.apache.spark
- 
A  Partitioner that partitions sortable records by range into roughly
 equal ranges. 
- RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
-  
- rank() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- rank() - Method in class org.apache.spark.ml.recommendation.ALSModel
-  
- rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-  
- rank() - Static method in class org.apache.spark.sql.functions
- 
Window function: returns the rank of rows within a window partition. 
- RankingMetrics<T> - Class in org.apache.spark.mllib.evaluation
- 
::Experimental::
 Evaluator for ranking algorithms. 
- RankingMetrics(RDD<Tuple2<Object, Object>>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.evaluation.RankingMetrics
-  
- rateController() - Method in class org.apache.spark.streaming.dstream.InputDStream
-  
- rateController() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
- 
Asynchronously maintains and sends new rate limits to the receiver through the receiver tracker. 
- rating() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
-  
- Rating - Class in org.apache.spark.mllib.recommendation
- 
A more compact class to represent a rating than Tuple3[Int, Int, Double]. 
- Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
-  
- rating() - Method in class org.apache.spark.mllib.recommendation.Rating
-  
- raw2prediction(Vector) - Method in class org.apache.spark.ml.classification.ClassificationModel
- 
Given a vector of raw predictions, select the predicted label. 
- raw2prediction(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- raw2prediction(Vector) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-  
- raw2probability(Vector) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
- 
Non-in-place version of raw2probabilityInPlace()
 
- raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-  
- raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-  
- raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
- 
Estimate the probability of each class given the raw prediction,
 doing the computation in-place. 
- raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-  
- rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream from network source hostname:port, where data is received
 as serialized blocks (serialized using Spark's serializer) that can be directly
 pushed into the block manager without deserializing them. 
- rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream from network source hostname:port, where data is received
 as serialized blocks (serialized using Spark's serializer) that can be directly
 pushed into the block manager without deserializing them. 
- rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream from network source hostname:port, where data is received
 as serialized blocks (serialized using Spark's serializer) that can be directly
 pushed into the block manager without deserializing them. 
- rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-  
- rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
-  
- rdd() - Method in class org.apache.spark.api.java.JavaRDD
-  
- rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
-  
- rdd() - Method in class org.apache.spark.Dependency
-  
- rdd() - Method in class org.apache.spark.NarrowDependency
-  
- RDD<T> - Class in org.apache.spark.rdd
- 
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark. 
- RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-  
- RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
- 
Construct an RDD with just a one-to-one dependency on one parent 
- rdd() - Method in class org.apache.spark.ShuffleDependency
-  
- rdd() - Method in class org.apache.spark.sql.DataFrame
- 
- rdd() - Method in class org.apache.spark.sql.Dataset
- 
- RDD() - Static method in class org.apache.spark.storage.BlockId
-  
- RDD_SCOPE_KEY() - Static method in class org.apache.spark.SparkContext
-  
- RDD_SCOPE_NO_OVERRIDE_KEY() - Static method in class org.apache.spark.SparkContext
-  
- RDDBlockId - Class in org.apache.spark.storage
-  
- RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
-  
- rddBlocks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- rddBlocks() - Method in class org.apache.spark.storage.StorageStatus
- 
Return the RDD blocks stored in this block manager. 
- rddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus
- 
Return the blocks that belong to the given RDD stored in this block manager. 
- RDDDataDistribution - Class in org.apache.spark.status.api.v1
-  
- RDDFunctions<T> - Class in org.apache.spark.mllib.rdd
- 
Machine learning specific RDD functions. 
- RDDFunctions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RDDFunctions
-  
- rddId() - Method in class org.apache.spark.CleanCheckpoint
-  
- rddId() - Method in class org.apache.spark.CleanRDD
-  
- rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-  
- rddId() - Method in class org.apache.spark.storage.RDDBlockId
-  
- RDDInfo - Class in org.apache.spark.storage
-  
- RDDInfo(int, String, int, StorageLevel, Seq<Object>, String, Option<org.apache.spark.rdd.RDDOperationScope>) - Constructor for class org.apache.spark.storage.RDDInfo
-  
- rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener
- 
Filter RDD info to include only those with cached partitions 
- rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
-  
- RDDPartitionInfo - Class in org.apache.spark.status.api.v1
-  
- rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
-  
- rdds() - Method in class org.apache.spark.rdd.UnionRDD
-  
- RDDStorageInfo - Class in org.apache.spark.status.api.v1
-  
- rddStorageLevel(int) - Method in class org.apache.spark.storage.StorageStatus
- 
Return the storage level, if any, used by the given RDD in this block manager. 
- rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.RDD
-  
- rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
-  
- rddToDataFrameHolder(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLImplicits
- 
Creates a DataFrame from an RDD of Product. 
- rddToDatasetHolder(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLImplicits
- 
- rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.rdd.RDD
-  
- rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
-  
- rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.rdd.RDD
-  
- rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
-  
- rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, <any>, <any>) - Static method in class org.apache.spark.rdd.RDD
-  
- rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
-  
- read() - Method in class org.apache.spark.api.r.BaseRRDD
-  
- read() - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- read() - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
-  
- read() - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
-  
- read() - Static method in class org.apache.spark.ml.clustering.KMeansModel
-  
- read() - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
-  
- read() - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- read() - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
-  
- read() - Static method in class org.apache.spark.ml.feature.IDFModel
-  
- read() - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- read() - Static method in class org.apache.spark.ml.feature.PCAModel
-  
- read() - Static method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- read() - Static method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- read() - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
-  
- read() - Static method in class org.apache.spark.ml.feature.Word2VecModel
-  
- read() - Static method in class org.apache.spark.ml.Pipeline
-  
- read() - Static method in class org.apache.spark.ml.PipelineModel
-  
- read() - Static method in class org.apache.spark.ml.recommendation.ALSModel
-  
- read() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- read() - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-  
- read() - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
-  
- read() - Static method in class org.apache.spark.ml.tuning.CrossValidator
-  
- read() - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
-  
- read() - Method in interface org.apache.spark.ml.util.MLReadable
- 
Returns an  MLReader instance for this class. 
- read(Kryo, Input, Class<Iterable<?>>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
-  
- read() - Method in class org.apache.spark.sql.SQLContext
-  
- read() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- read(byte[]) - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- read(byte[], int, int) - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- read(WriteAheadLogRecordHandle) - Method in class org.apache.spark.streaming.util.WriteAheadLog
- 
Read a written record based on the given record handle. 
- readAll() - Method in class org.apache.spark.streaming.util.WriteAheadLog
- 
Read and return an iterator of all the records that have been written but not yet cleaned up. 
- readBytes() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-  
- readData(int) - Method in class org.apache.spark.api.r.BaseRRDD
-  
- readData(int) - Method in class org.apache.spark.api.r.PairwiseRRDD
-  
- readData(int) - Method in class org.apache.spark.api.r.RRDD
-  
- readData(int) - Method in class org.apache.spark.api.r.StringRRDD
-  
- readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
-  
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
-  
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
-  
- readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
-  
- readKey(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
- 
Reads the object representing the key of a key-value pair. 
- readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
- 
The most general-purpose method to read an object. 
- readRecords() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-  
- readValue(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
- 
Reads the object representing the value of a key-value pair. 
- ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-  
- ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
- 
Blocks until this action completes. 
- ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-  
- reason() - Method in class org.apache.spark.ExecutorLostFailure
-  
- reason() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-  
- reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-  
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns recall for a given label (category) 
- recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns recall
 (equal to precision for a multiclass classifier
 because the sum of all false positives is equal to the sum
 of all false negatives). 
- recall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns document-based recall averaged by the number of documents 
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns recall for a given label (category) 
- recallByThreshold() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
- 
Returns a DataFrame with two fields (threshold, recall) representing the curve. 
- recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Returns the (threshold, recall) curve. 
- Receiver<T> - Class in org.apache.spark.streaming.receiver
- 
:: DeveloperApi ::
 Abstract class of a receiver that can be run on worker nodes to receive external data. 
- Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
-  
- ReceiverInfo - Class in org.apache.spark.streaming.scheduler
- 
:: DeveloperApi ::
 Class having information about a receiver 
- ReceiverInfo(int, String, boolean, String, String, String, String, long) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
-  
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-  
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-  
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-  
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-  
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-  
- ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream
- 
Abstract class for defining any  InputDStream
 that has to start a receiver on worker nodes to receive external data. 
- ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
-  
- receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream with any arbitrary user-implemented receiver. 
- receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream with any arbitrary user-implemented receiver. 
- recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Recommends products to a user. 
- recommendProductsForUsers(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Recommends topK products for all users. 
- recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Recommends users to a product. 
- recommendUsersForProducts(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Recommends topK users for all products. 
- recordJobProperties(int, Properties) - Method in class org.apache.spark.scheduler.JobLogger
- 
Record job properties into job log file 
- RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.HadoopRDD
- 
Update the input bytes read metric each time this number of records has been read 
- RECORDS_BETWEEN_BYTES_WRITTEN_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.PairRDDFunctions
-  
- recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetricDistributions
-  
- recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetrics
-  
- recordsRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-  
- recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetricDistributions
-  
- recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetrics
-  
- recordsWritten() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
-  
- recordTaskMetrics(int, String, TaskInfo, TaskMetrics) - Method in class org.apache.spark.scheduler.JobLogger
- 
Record task metrics into job log files, including execution info and shuffle metrics 
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Reduces the elements of this RDD using the specified commutative and associative binary
 operator. 
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
- 
Reduces the elements of this RDD using the specified commutative and
 associative binary operator. 
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.sql.Dataset
- 
(Scala-specific)
 Reduces the elements of this  Dataset using the specified binary function. 
- reduce(ReduceFunction<T>) - Method in class org.apache.spark.sql.Dataset
- 
(Java-specific)
 Reduces the elements of this Dataset using the specified binary function. 
- reduce(B, I) - Method in class org.apache.spark.sql.expressions.Aggregator
- 
Combine two values to produce a new value. 
- reduce(Function2<V, V, V>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Reduces the elements of each group of data using the specified binary function. 
- reduce(ReduceFunction<V>) - Method in class org.apache.spark.sql.GroupedDataset
- 
Reduces the elements of each group of data using the specified binary function. 
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD has a single element generated by reducing each RDD
 of this DStream. 
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD has a single element generated by reducing each RDD
 of this DStream. 
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Merge the values for each key using an associative reduce function. 
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Merge the values for each key using an associative reduce function. 
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Merge the values for each key using an associative reduce function. 
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Merge the values for each key using an associative reduce function. 
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Merge the values for each key using an associative reduce function. 
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Merge the values for each key using an associative reduce function. 
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying reduceByKey to each RDD.
 
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying reduceByKey to each RDD.
 
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying reduceByKey to each RDD.
 
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying reduceByKey to each RDD.
 
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying reduceByKey to each RDD.
 
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying reduceByKey to each RDD.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Create a new DStream by applying reduceByKey over a sliding window on this DStream.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by reducing over a sliding window using incremental computation. 
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying incremental reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying incremental reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying reduceByKey over a sliding window on this DStream.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying incremental reduceByKey over a sliding window.
 
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying incremental reduceByKey over a sliding window.
 
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Merge the values for each key using an associative reduce function, but return the results
 immediately to the master as a Map. 
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Merge the values for each key using an associative reduce function, but return the results
 immediately to the master as a Map. 
- reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Alias for reduceByKeyLocally 
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Deprecated.
This API is not Java compatible. 
 
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD has a single element generated by reducing all
 elements in a sliding window over this DStream. 
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD has a single element generated by reducing all
 elements in a sliding window over this DStream. 
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD has a single element generated by reducing all
 elements in a sliding window over this DStream. 
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD has a single element generated by reducing all
 elements in a sliding window over this DStream. 
- ReduceFunction<T> - Interface in org.apache.spark.api.java.function
- 
Base interface for the function used in Dataset's reduce. 
- reduceId() - Method in class org.apache.spark.FetchFailed
-  
- reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
-  
- reduceId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-  
- reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-  
- refreshTable(String) - Method in class org.apache.spark.sql.hive.HiveContext
- 
Invalidate and refresh all the cached metadata of the given table. 
- regexp_extract(Column, String, int) - Static method in class org.apache.spark.sql.functions
- 
Extract a specific group (idx) identified by a Java regex from the specified string column. 
- regexp_replace(Column, String, String) - Static method in class org.apache.spark.sql.functions
- 
Replace all substrings of the specified string value that match regexp with rep. 
- RegexTokenizer - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 A regex based tokenizer that extracts tokens either by using the provided regex pattern to split
 the text (default) or repeatedly matching the regex (if gaps is false).
 
- RegexTokenizer(String) - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
-  
- RegexTokenizer() - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
-  
- register(String, UserDefinedAggregateFunction) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined aggregate function (UDAF). 
- register(String, Function0<RT>, TypeTags.TypeTag<RT>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 0 arguments as user-defined function (UDF). 
- register(String, Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 1 argument as user-defined function (UDF). 
- register(String, Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 2 arguments as user-defined function (UDF). 
- register(String, Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 3 arguments as user-defined function (UDF). 
- register(String, Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 4 arguments as user-defined function (UDF). 
- register(String, Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 5 arguments as user-defined function (UDF). 
- register(String, Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 6 arguments as user-defined function (UDF). 
- register(String, Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 7 arguments as user-defined function (UDF). 
- register(String, Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 8 arguments as user-defined function (UDF). 
- register(String, Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 9 arguments as user-defined function (UDF). 
- register(String, Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 10 arguments as user-defined function (UDF). 
- register(String, Function11<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 11 arguments as user-defined function (UDF). 
- register(String, Function12<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 12 arguments as user-defined function (UDF). 
- register(String, Function13<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 13 arguments as user-defined function (UDF). 
- register(String, Function14<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 14 arguments as user-defined function (UDF). 
- register(String, Function15<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 15 arguments as user-defined function (UDF). 
- register(String, Function16<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 16 arguments as user-defined function (UDF). 
- register(String, Function17<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 17 arguments as a user-defined function (UDF). 
- register(String, Function18<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 18 arguments as a user-defined function (UDF). 
- register(String, Function19<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 19 arguments as a user-defined function (UDF). 
- register(String, Function20<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 20 arguments as a user-defined function (UDF). 
- register(String, Function21<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 21 arguments as a user-defined function (UDF). 
- register(String, Function22<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>, TypeTags.TypeTag<A22>) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a Scala closure of 22 arguments as a user-defined function (UDF). 
- register(String, UDF1<?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 1 argument. 
- register(String, UDF2<?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 2 arguments. 
- register(String, UDF3<?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 3 arguments. 
- register(String, UDF4<?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 4 arguments. 
- register(String, UDF5<?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 5 arguments. 
- register(String, UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 6 arguments. 
- register(String, UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 7 arguments. 
- register(String, UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 8 arguments. 
- register(String, UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 9 arguments. 
- register(String, UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 10 arguments. 
- register(String, UDF11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 11 arguments. 
- register(String, UDF12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 12 arguments. 
- register(String, UDF13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 13 arguments. 
- register(String, UDF14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 14 arguments. 
- register(String, UDF15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 15 arguments. 
- register(String, UDF16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 16 arguments. 
- register(String, UDF17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 17 arguments. 
- register(String, UDF18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 18 arguments. 
- register(String, UDF19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 19 arguments. 
- register(String, UDF20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 20 arguments. 
- register(String, UDF21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 21 arguments. 
- register(String, UDF22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
- 
Register a user-defined function with 22 arguments. 
- register(QueryExecutionListener) - Method in class org.apache.spark.sql.util.ExecutionListenerManager
- 
- registerAvroSchemas(Seq<Schema>) - Method in class org.apache.spark.SparkConf
- 
Use Kryo serialization and register the given set of Avro schemas so that the generic
 record serializer can decrease network IO. 
- registerClasses(Kryo) - Method in class org.apache.spark.graphx.GraphKryoRegistrator
-  
- registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
-  
- registerDialect(JdbcDialect) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects
- 
Register a dialect for use on all new matching JDBC DataFrames. 
- registerKryoClasses(SparkConf) - Static method in class org.apache.spark.graphx.GraphXUtils
- 
Registers classes that GraphX uses with Kryo. 
- registerKryoClasses(Class<?>[]) - Method in class org.apache.spark.SparkConf
- 
Use Kryo serialization and register the given set of classes with Kryo. 
- registerPython(String, UserDefinedPythonFunction) - Method in class org.apache.spark.sql.UDFRegistration
-  
- registerStream(DStream<BinarySample>) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-  
- registerStream(JavaDStream<BinarySample>) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-  
- registerTempTable(String) - Method in class org.apache.spark.sql.DataFrame
- 
Registers this DataFrame as a temporary table using the given name. 
- Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-  
- RegressionEvaluator - Class in org.apache.spark.ml.evaluation
- 
:: Experimental ::
 Evaluator for regression, which expects two input columns: prediction and label. 
- RegressionEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
-  
- RegressionEvaluator() - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
-  
- RegressionMetrics - Class in org.apache.spark.mllib.evaluation
- 
Evaluator for regression. 
- RegressionMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
-  
- RegressionModel<FeaturesType,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression
- 
:: DeveloperApi :: 
- RegressionModel() - Constructor for class org.apache.spark.ml.regression.RegressionModel
-  
- RegressionModel - Interface in org.apache.spark.mllib.regression
-  
- reindex() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- reindex() - Method in class org.apache.spark.graphx.VertexRDD
- 
Construct a new VertexRDD that is indexed by only the visible vertices. 
- RelationProvider - Interface in org.apache.spark.sql.sources
- 
::DeveloperApi::
 Implemented by objects that produce relations for a specific kind of data source. 
- relativeDirection(long) - Method in class org.apache.spark.graphx.Edge
- 
Return the relative direction of the edge to the corresponding
 vertex. 
- remainder(Decimal) - Method in class org.apache.spark.sql.types.Decimal
-  
- remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Sets each DStream in this context to remember the RDDs it generated in the last given duration. 
- remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext
- 
Sets each DStream in this context to remember the RDDs it generated in the last given duration. 
- rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-  
- remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-  
- remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-  
- remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-  
- remove(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
- 
Removes a key from this map and returns its previously associated value as an option. 
- remove(String) - Method in class org.apache.spark.SparkConf
- 
Remove a parameter from the configuration. 
- remove() - Method in class org.apache.spark.streaming.State
- 
Remove the state if it exists. 
- repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a new RDD that has exactly numPartitions partitions. 
- repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a new RDD that has exactly numPartitions partitions. 
- repartition(int) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return a new RDD that has exactly numPartitions partitions. 
- repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Return a new RDD that has exactly numPartitions partitions. 
- repartition(int, Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new DataFrame partitioned by the given partitioning expressions into
 numPartitions partitions. 
- repartition(Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new DataFrame partitioned by the given partitioning expressions preserving
 the existing number of partitions. 
- repartition(int) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new DataFrame that has exactly numPartitions partitions. 
- repartition(int, Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new DataFrame partitioned by the given partitioning expressions into
 numPartitions partitions. 
- repartition(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new DataFrame partitioned by the given partitioning expressions preserving
 the existing number of partitions. 
- repartition(int) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset that has exactly numPartitions partitions. 
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream
- 
Return a new DStream with an increased or decreased level of parallelism. 
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream with an increased or decreased level of parallelism. 
- repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream with an increased or decreased level of parallelism. 
- repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Repartition the RDD according to the given partitioner and, within each resulting partition,
 sort records by their keys. 
- repartitionAndSortWithinPartitions(Partitioner, Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Repartition the RDD according to the given partitioner and, within each resulting partition,
 sort records by their keys. 
- repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
- 
Repartition the RDD according to the given partitioner and, within each resulting partition,
 sort records by their keys. 
- repeat(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Repeats a string column n times, and returns it as a new string column. 
- replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Replaces values matching keys in the replacement map with the corresponding values.
 
- replace(String[], Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
Replaces values matching keys in the replacement map with the corresponding values.
 
- replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
(Scala-specific) Replaces values matching keys in the replacement map.
 
- replace(Seq<String>, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
- 
(Scala-specific) Replaces values matching keys in the replacement map.
 
- replicatedVertexView() - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- replication() - Method in class org.apache.spark.storage.StorageLevel
-  
- reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Report exceptions in receiving data. 
- requestExecutors(int) - Method in class org.apache.spark.SparkContext
- 
:: DeveloperApi ::
 Request an additional number of executors from the cluster manager. 
- reset() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- resetIterator() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
-  
- residuals() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
Residuals (label - predicted value) 
- resolve(String) - Method in class org.apache.spark.sql.DataFrame
-  
- resolvedTEncoder() - Method in class org.apache.spark.sql.Dataset
- 
The encoder for this Dataset that has been resolved to its output schema. 
- restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Restart the receiver. 
- restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Restart the receiver. 
- restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Restart the receiver. 
- Resubmitted - Class in org.apache.spark
- 
:: DeveloperApi ::
 A ShuffleMapTask that completed successfully earlier, but we
 lost the executor before the stage completed.
 
- Resubmitted() - Constructor for class org.apache.spark.Resubmitted
-  
- result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-  
- result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
- 
Awaits and returns the result (of type T) of this action. 
- result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-  
- resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-  
- resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-  
- resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
-  
- resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-  
- resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-  
- retainedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- retryWaitMs(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
- 
Returns the configured number of milliseconds to wait on each retry. 
- ReturnStatementFinder - Class in org.apache.spark.util
-  
- ReturnStatementFinder() - Constructor for class org.apache.spark.util.ReturnStatementFinder
-  
- reverse() - Method in class org.apache.spark.graphx.EdgeDirection
- 
Reverse the direction of an edge. 
- reverse() - Method in class org.apache.spark.graphx.EdgeRDD
- 
Reverse all the edges in this RDD. 
- reverse() - Method in class org.apache.spark.graphx.Graph
- 
Reverses all edges in the graph. 
- reverse() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- reverse() - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- reverse(Column) - Static method in class org.apache.spark.sql.functions
- 
Reverses the string column and returns it as a new string column. 
- reverseRoutingTables() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- reverseRoutingTables() - Method in class org.apache.spark.graphx.VertexRDD
- 
Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding
 EdgeRDD. 
- ReviveOffers - Class in org.apache.spark.scheduler.local
-  
- ReviveOffers() - Constructor for class org.apache.spark.scheduler.local.ReviveOffers
-  
- RFormula - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 Implements the transforms required for fitting a dataset against an R model formula. 
- RFormula(String) - Constructor for class org.apache.spark.ml.feature.RFormula
-  
- RFormula() - Constructor for class org.apache.spark.ml.feature.RFormula
-  
- RFormulaModel - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 A fitted RFormula. 
- RidgeRegressionModel - Class in org.apache.spark.mllib.regression
- 
Regression model trained using RidgeRegression. 
- RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
-  
- RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression
- 
Train a regression model with L2-regularization using Stochastic Gradient Descent. 
- RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
- 
Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100,
 regParam: 0.01, miniBatchFraction: 1.0}. 
- right() - Method in class org.apache.spark.sql.sources.And
-  
- right() - Method in class org.apache.spark.sql.sources.Or
-  
- rightCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit
- 
Get sorted categories which split to the right 
- rightChild() - Method in class org.apache.spark.ml.tree.InternalNode
-  
- rightChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
- 
Return the index of the right child of this node. 
- rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-  
- rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
-  
- rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a right outer join of this and other.
 
- rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a right outer join of this and other.
 
- rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Perform a right outer join of this and other.
 
- rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a right outer join of this and other.
 
- rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a right outer join of this and other.
 
- rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Perform a right outer join of this and other.
 
- rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
 
- rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
 
- rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
 
- rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
 
- rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
 
- rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
 
- rightPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-  
- rint(Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the double value that is closest in value to the argument and
 is equal to a mathematical integer. 
- rint(String) - Static method in class org.apache.spark.sql.functions
- 
Returns the double value that is closest in value to the argument and
 is equal to a mathematical integer. 
- rlike(String) - Method in class org.apache.spark.sql.Column
- 
SQL RLIKE expression (LIKE with Regex). 
- RMATa() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-  
- RMATb() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-  
- RMATc() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-  
- RMATd() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-  
- rmatGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
- 
A random graph generator using the R-MAT model, proposed in
 "R-MAT: A Recursive Model for Graph Mining" by Chakrabarti et al. 
- rnd() - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- roc() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
- 
Returns the receiver operating characteristic (ROC) curve,
 which is a DataFrame having two fields (FPR, TPR)
 with (0.0, 0.0) prepended and (1.0, 1.0) appended to it. 
- roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Returns the receiver operating characteristic (ROC) curve,
 which is an RDD of (false positive rate, true positive rate)
 with (0.0, 0.0) prepended and (1.0, 1.0) appended to it. 
- rollup(Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Create a multi-dimensional rollup for the current DataFrame using the specified columns,
 so we can run aggregation on them. 
- rollup(String, String...) - Method in class org.apache.spark.sql.DataFrame
- 
Create a multi-dimensional rollup for the current DataFrame using the specified columns,
 so we can run aggregation on them. 
- rollup(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Create a multi-dimensional rollup for the current DataFrame using the specified columns,
 so we can run aggregation on them. 
- rollup(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Create a multi-dimensional rollup for the current DataFrame using the specified columns,
 so we can run aggregation on them. 
- root() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-  
- rootMeanSquaredError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
Returns the root mean squared error, which is defined as the square root of
 the mean squared error. 
- rootMeanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
- 
Returns the root mean squared error, which is defined as the square root of
 the mean squared error. 
- rootNode() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-  
- rootNode() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-  
- round(Column) - Static method in class org.apache.spark.sql.functions
- 
Returns the value of the column e rounded to 0 decimal places.
 
- round(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Round the value of e to scale decimal places if scale >= 0
 or at integral part when scale < 0.
 
- ROUND_CEILING() - Static method in class org.apache.spark.sql.types.Decimal
-  
- ROUND_FLOOR() - Static method in class org.apache.spark.sql.types.Decimal
-  
- ROUND_HALF_UP() - Static method in class org.apache.spark.sql.types.Decimal
-  
- Row - Interface in org.apache.spark.sql
- 
Represents one row of output from a relational operator. 
- row_number() - Static method in class org.apache.spark.sql.functions
- 
Window function: returns a sequential number starting at 1 within a window partition. 
- RowFactory - Class in org.apache.spark.sql
- 
A factory class used to construct Row objects. 
- RowFactory() - Constructor for class org.apache.spark.sql.RowFactory
-  
- rowIndices() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-  
- RowMatrix - Class in org.apache.spark.mllib.linalg.distributed
- 
Represents a row-oriented distributed Matrix with no meaningful row indices. 
- RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-  
- RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
Alternative constructor leaving matrix dimensions to be determined automatically. 
- rowNumber() - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.6.0, replaced by row_number. This will be removed in Spark 2.0.
 
 
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-  
- rowsBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec
- 
Defines the frame boundaries, from start (inclusive) to end (inclusive).
 
- rowsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-  
- rpad(Column, int, String) - Static method in class org.apache.spark.sql.functions
- 
Right-pads the string column with pad to a length of len. 
- rpcEnv() - Method in class org.apache.spark.SparkEnv
-  
- RpcUtils - Class in org.apache.spark.util
-  
- RpcUtils() - Constructor for class org.apache.spark.util.RpcUtils
-  
- RRDD<T> - Class in org.apache.spark.api.r
- 
An RDD that stores serialized R objects as Array[Byte]. 
- RRDD(RDD<T>, byte[], String, String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.RRDD
-  
- rtrim(Column) - Static method in class org.apache.spark.sql.functions
- 
Trim the spaces from right end for the specified string value. 
- run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
- 
Executes some action enclosed in the closure. 
- run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents
- 
Compute the connected component membership of each vertex and return a graph with the vertex
 value containing the lowest vertex id in the connected component containing that vertex. 
- run(Graph<VD, ED>, int, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.LabelPropagation
- 
Run static Label Propagation for detecting communities in networks. 
- run(Graph<VD, ED>, int, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
- 
Run PageRank for a fixed number of iterations returning a graph
 with vertex attributes containing the PageRank and edge
 attributes the normalized edge weight. 
- run(Graph<VD, ED>, Seq<Object>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ShortestPaths
- 
Computes shortest paths to the given set of landmark vertices. 
- run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.StronglyConnectedComponents
- 
Compute the strongly connected component (SCC) of each vertex and return a graph with the
 vertex value containing the lowest vertex id in the SCC containing that vertex. 
- run(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
- 
Implement SVD++ based on "Factorization Meets the Neighborhood:
 a Multifaceted Collaborative Filtering Model",
 available at http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf.
 
- run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
-  
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Runs the bisecting k-means algorithm. 
- run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Java-friendly version of run().
 
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Perform expectation maximization. 
- run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Java-friendly version of run().
 
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Train a K-means model on the given set of points; data should be cached for high
 performance, because this is an iterative algorithm.
 
- run(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Learn an LDA model using the given dataset. 
- run(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Java-friendly version of run()
 
- run(Graph<Object, Object>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
- 
Run the PIC algorithm on Graph. 
- run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
- 
Run the PIC algorithm. 
- run(JavaRDD<Tuple3<Long, Long, Double>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
- 
A Java-friendly version of PowerIterationClustering.run.
 
- run(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.AssociationRules
- 
Computes the association rules with confidence above minConfidence.
 
- run(JavaRDD<FPGrowth.FreqItemset<Item>>) - Method in class org.apache.spark.mllib.fpm.AssociationRules
- 
Java-friendly version of run.
 
- run(RDD<Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
- 
Computes an FP-Growth model that contains frequent itemsets. 
- run(JavaRDD<Basket>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
- 
Java-friendly version of run.
 
- run(RDD<Object[]>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
- 
Finds the complete set of frequent sequential patterns in the input sequences of itemsets. 
- run(JavaRDD<Sequence>) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
- 
A Java-friendly version of run() that reads sequences from a JavaRDD and returns
 frequent sequences in a PrefixSpanModel. 
- run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- run(JavaRDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
- 
Run the algorithm with the configured parameters on an input
 RDD of LabeledPoint entries. 
- run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
- 
Run the algorithm with the configured parameters on an input RDD
 of LabeledPoint entries starting from the initial weights provided. 
- run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-  
- run(JavaRDD<Tuple3<Double, Double, Double>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-  
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
- 
Method to train a decision tree model over an RDD 
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
- 
Method to train a gradient boosting model 
- run(JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
- 
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.run.
 
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.RandomForest
- 
Method to train a decision tree model over an RDD 
- run() - Method in class org.apache.spark.rdd.PartitionCoalescer
- 
Runs the packing algorithm and returns an array of PartitionGroups that if possible are
 load balanced and grouped by locality 
- run() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread
-  
- run() - Method in class org.apache.spark.util.SparkShutdownHook
-  
- runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, <any>, long) - Method in class org.apache.spark.SparkContext
- 
:: DeveloperApi ::
 Run a job that can return approximate results. 
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction
- 
Runs a Spark job. 
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a function on a given set of partitions in an RDD and pass the results to the given
 handler function. 
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a function on a given set of partitions in an RDD and return the results as an array. 
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a job on a given set of partitions of an RDD, but take a function of type
 Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
 
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a function on a given set of partitions in an RDD and pass the results to the given
 handler function. 
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a function on a given set of partitions in an RDD and return the results as an array. 
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a job on a given set of partitions of an RDD, but take a function of type
 Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
 
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a job on all partitions in an RDD and return the results in an array. 
- runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a job on all partitions in an RDD and return the results in an array. 
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a job on all partitions in an RDD and pass the results to a handler function. 
- runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
- 
Run a job on all partitions in an RDD and pass the results to a handler function. 
- runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS
- 
Run Limited-memory BFGS (L-BFGS) in parallel. 
- runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector, double) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
- 
Run stochastic gradient descent (SGD) in parallel using mini batches. 
- runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
- 
Alias of runMiniBatchSGD with convergenceTol set to default value of 0.001.
 
- running() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- runningLocally() - Method in class org.apache.spark.TaskContext
-  
- runSqlHive(String) - Method in class org.apache.spark.sql.hive.HiveContext
-  
- runSVDPlusPlus(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
- 
This method is now replaced by the updated version of run() and returns exactly
 the same result.
 
- RuntimePercentage - Class in org.apache.spark.scheduler
-  
- RuntimePercentage(double, Option<Object>, double) - Constructor for class org.apache.spark.scheduler.RuntimePercentage
-  
- runUntilConvergence(Graph<VD, ED>, double, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
- 
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
 PageRank and edge attributes containing the normalized edge weight. 
- runUntilConvergenceWithOptions(Graph<VD, ED>, double, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
- 
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
 PageRank and edge attributes containing the normalized edge weight. 
- runWithOptions(Graph<VD, ED>, int, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
- 
Run PageRank for a fixed number of iterations returning a graph
 with vertex attributes containing the PageRank and edge
 attributes containing the normalized edge weight. 
- runWithValidation(RDD<LabeledPoint>, RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
- 
Method to validate a gradient boosting model 
- runWithValidation(JavaRDD<LabeledPoint>, JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
- 
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.runWithValidation.
 
- s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
-  
- sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a sampled subset of this RDD. 
- sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a sampled subset of this RDD. 
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a sampled subset of this RDD. 
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a sampled subset of this RDD. 
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return a sampled subset of this RDD. 
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return a sampled subset of this RDD. 
- sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD
- 
Return a sampled subset of this RDD. 
- sample(boolean, double, long) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new DataFrame by sampling a fraction of rows. 
- sample(boolean, double) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new DataFrame by sampling a fraction of rows, using a random seed. 
- sample(boolean, double, long) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset by sampling a fraction of records. 
- sample(boolean, double) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset by sampling a fraction of records, using a random seed. 
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliCellSampler
-  
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
-  
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
-  
- sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler
- 
take a random sample 
- sampleBy(String, Map<T, Object>, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
Returns a stratified sample without replacement based on the fraction given on each stratum. 
- sampleBy(String, Map<T, Double>, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions
- 
Returns a stratified sample without replacement based on the fraction given on each stratum. 
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a subset of this RDD sampled by key (via stratified sampling). 
- sampleByKey(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a subset of this RDD sampled by key (via stratified sampling). 
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return a subset of this RDD sampled by key (via stratified sampling). 
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
 math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key). 
- sampleByKeyExact(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
 math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key). 
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
 math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key). 
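As a rough illustration of the per-stratum size quoted in the sampleByKeyExact entries above, the plain-Java sketch below evaluates ceil(numItems * samplingRate) for each key. It does not call Spark; the class name ExactStratumSizes, the method exactSizes, and the example counts and rates are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark: it shows only the arithmetic
// ceil(numItems * samplingRate) applied to each stratum (key).
public class ExactStratumSizes {

    // counts: number of items per key; fractions: sampling rate per key.
    public static Map<String, Long> exactSizes(Map<String, Long> counts,
                                               Map<String, Double> fractions) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            // Assumption of this sketch: a key with no fraction gets rate 0.0.
            double rate = fractions.getOrDefault(e.getKey(), 0.0);
            sizes.put(e.getKey(), (long) Math.ceil(e.getValue() * rate));
        }
        return sizes;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("a", 10L);
        counts.put("b", 4L);

        Map<String, Double> fractions = new LinkedHashMap<>();
        fractions.put("a", 0.25);
        fractions.put("b", 0.5);

        // ceil(10 * 0.25) = 3 and ceil(4 * 0.5) = 2
        System.out.println(exactSizes(counts, fractions)); // prints {a=3, b=2}
    }
}
```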
- sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
 estimating the standard deviation by dividing by N-1 instead of N). 
- sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
- 
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
 estimating the standard deviation by dividing by N-1 instead of N). 
- sampleStdev() - Method in class org.apache.spark.util.StatCounter
- 
Return the sample standard deviation of the values, which corrects for bias in estimating the
 variance by dividing by N-1 instead of N. 
- sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Compute the sample variance of this RDD's elements (which corrects for bias in
 estimating the variance by dividing by N-1 instead of N). 
- sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
- 
Compute the sample variance of this RDD's elements (which corrects for bias in
 estimating the variance by dividing by N-1 instead of N). 
- sampleVariance() - Method in class org.apache.spark.util.StatCounter
- 
Return the sample variance, which corrects for bias in estimating the variance by dividing
 by N-1 instead of N. 
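The difference between dividing by N and dividing by N-1, as described in the sampleStdev and sampleVariance entries above, can be sketched in plain Java. This is not Spark's implementation; the class SampleVarianceSketch and its method names are hypothetical, and the sample methods assume at least two values.

```java
// Hypothetical illustration, not Spark code: contrasts dividing by N
// (population variance) with dividing by N-1 (sample variance).
public class SampleVarianceSketch {

    public static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Sum of squared deviations from the mean, shared by both estimators.
    public static double sumSquaredDeviations(double[] xs) {
        double m = mean(xs);
        double total = 0.0;
        for (double x : xs) {
            total += (x - m) * (x - m);
        }
        return total;
    }

    // Divides by N.
    public static double populationVariance(double[] xs) {
        return sumSquaredDeviations(xs) / xs.length;
    }

    // Divides by N-1; requires xs.length >= 2.
    public static double sampleVariance(double[] xs) {
        return sumSquaredDeviations(xs) / (xs.length - 1);
    }

    // Square root of the N-1 variance; requires xs.length >= 2.
    public static double sampleStdev(double[] xs) {
        return Math.sqrt(sampleVariance(xs));
    }

    public static void main(String[] args) {
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("population variance = " + populationVariance(xs));
        System.out.println("sample variance     = " + sampleVariance(xs));
        System.out.println("sample stdev        = " + sampleStdev(xs));
    }
}
```

For the eight values in main, the squared deviations sum to 32, so the population variance is 32/8 = 4 while the sample variance is 32/7, which is slightly larger.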
- save(String) - Method in interface org.apache.spark.ml.util.MLWritable
- 
Saves this ML instance to the input path, a shortcut of write.save(path).
 
- save(String) - Method in class org.apache.spark.ml.util.MLWriter
- 
Saves the ML instances to the input path. 
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.SVMModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
Java-friendly version of topicDistributions
 
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
- 
Save this model to the given path. 
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LassoModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-  
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-  
- save(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Saveable
- 
Save this model to the given path. 
- save(String) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().save(path). This will be removed in Spark 2.0.
 
 
- save(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().mode(mode).save(path). This will be removed in Spark 2.0.
 
 
- save(String, String) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().format(source).save(path). This will be removed in Spark 2.0.
 
 
- save(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).save(path). This will be removed in Spark 2.0.
 
 
- save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).save(path). This will be removed in Spark 2.0.
 
 
- save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).save(path). This will be removed in Spark 2.0.
 
 
- save(String) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Saves the content of the DataFrame at the specified path. 
- save() - Method in class org.apache.spark.sql.DataFrameWriter
- 
Saves the content of the DataFrame as the specified table. 
- Saveable - Interface in org.apache.spark.mllib.util
- 
:: DeveloperApi :: 
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
 that storage system. 
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
 that storage system. 
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Output the RDD to any Hadoop-supported file system. 
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Output the RDD to any Hadoop-supported file system. 
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Output the RDD to any Hadoop-supported file system, compressing with the supplied codec. 
- saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
 supporting the key and value types K and V in this RDD.
 
- saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
 supporting the key and value types K and V in this RDD.
 
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
 supporting the key and value types K and V in this RDD.
 
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
 supporting the key and value types K and V in this RDD.
 
- saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
Save labeled data in LIBSVM format. 
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Output the RDD to any Hadoop-supported storage system, using
 a Configuration object for that storage system. 
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop
 Configuration object for that storage system. 
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Output the RDD to any Hadoop-supported file system. 
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Output the RDD to any Hadoop-supported file system. 
- saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
 
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
 
- saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
- 
Save each RDD in this DStream as a Hadoop file.
 
- saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Save this RDD as a SequenceFile of serialized objects. 
- saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD
- 
Save this RDD as a SequenceFile of serialized objects. 
- saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Save each RDD in this DStream as a Sequence file of serialized objects. 
- saveAsParquetFile(String) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().parquet(). This will be removed in Spark 2.0.
 
 
- saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions
- 
Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key
 and value types. 
- saveAsTable(String) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().saveAsTable(tableName). This will be removed in Spark 2.0.
 
 
- saveAsTable(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().mode(mode).saveAsTable(tableName). This will be removed in Spark 2.0.
 
 
- saveAsTable(String, String) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().format(source).saveAsTable(tableName). This will be removed in Spark 2.0.
 
 
- saveAsTable(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).saveAsTable(tableName). This will be removed in Spark 2.0.
 
 
- saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).saveAsTable(tableName). This will be removed in Spark 2.0.
 
 
- saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).saveAsTable(tableName). This will be removed in Spark 2.0.
 
 
- saveAsTable(String) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Saves the content of the DataFrame as the specified table. 
- saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Save this RDD as a text file, using string representations of elements. 
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Save this RDD as a compressed text file, using string representations of elements. 
- saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD
- 
Save this RDD as a text file, using string representations of elements. 
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD
- 
Save this RDD as a compressed text file, using string representations of elements. 
- saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Save each RDD in this DStream as a text file, using string representation
 of elements. 
- saveImpl(String) - Method in class org.apache.spark.ml.util.MLWriter
- 
save() handles overwriting and then calls this method.
 
- saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
- 
- SaveMode - Enum in org.apache.spark.sql
- 
SaveMode is used to specify the expected behavior of saving a DataFrame to a data source. 
- sc() - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- sc() - Method in class org.apache.spark.sql.SQLContext.implicits$.StringToColumn
-  
- sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Deprecated.
As of 0.9.0, replaced by sparkContext
 
 
- sc() - Method in class org.apache.spark.streaming.StreamingContext
-  
- scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-  
- scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-  
- scale() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- scale() - Method in class org.apache.spark.mllib.random.GammaGenerator
-  
- scale() - Method in class org.apache.spark.sql.types.Decimal
-  
- scale() - Method in class org.apache.spark.sql.types.DecimalType
-  
- scale() - Method in class org.apache.spark.sql.types.PrecisionInfo
-  
- scalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
- 
the vector to multiply with input vectors 
- scalingVec() - Method in class org.apache.spark.mllib.feature.ElementwiseProduct
-  
- scheduler() - Method in class org.apache.spark.streaming.StreamingContext
-  
- schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
- 
Time taken for the first job of this batch to start processing from the time this batch
 was submitted to the streaming scheduler. 
- SchedulingMode - Class in org.apache.spark.scheduler
- 
"FAIR" and "FIFO" determines which policy is used
    to order tasks amongst a Schedulable's sub-queues.
  "NONE" is used when a Schedulable has no sub-queues. 
- SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
-  
- schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- schedulingPool() - Method in class org.apache.spark.status.api.v1.StageData
-  
- schema() - Method in class org.apache.spark.sql.DataFrame
- 
- schema(StructType) - Method in class org.apache.spark.sql.DataFrameReader
- 
Specifies the input schema. 
- schema() - Method in class org.apache.spark.sql.Dataset
- 
Returns the schema of the encoded form of the objects in this Dataset. 
- schema() - Method in interface org.apache.spark.sql.Encoder
- 
Returns the schema of encoding this type of object as a Row. 
- schema() - Method in interface org.apache.spark.sql.Row
- 
Schema for the row. 
- schema() - Method in class org.apache.spark.sql.sources.BaseRelation
-  
- schema() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
- 
Schema of this relation. 
- SchemaRelationProvider - Interface in org.apache.spark.sql.sources
- 
:: DeveloperApi ::
 Implemented by objects that produce relations for a specific kind of data source
 with a given schema. 
- scope() - Method in class org.apache.spark.rdd.RDD
- 
The scope associated with the operation that created this RDD. 
- scope() - Method in class org.apache.spark.storage.RDDInfo
-  
- scoreAndLabels() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-  
- ScriptTransformationWriterThread - Class in org.apache.spark.sql.hive.execution
-  
- ScriptTransformationWriterThread(Iterator<InternalRow>, Seq<DataType>, org.apache.spark.sql.catalyst.expressions.Projection, AbstractSerDe, ObjectInspector, HiveScriptIOSchema, OutputStream, Process, org.apache.spark.util.CircularBuffer, TaskContext, Configuration) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread
-  
- second(Column) - Static method in class org.apache.spark.sql.functions
- 
Extracts the seconds as an integer from a given date/timestamp/string. 
- seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- seconds(long) - Static method in class org.apache.spark.streaming.Durations
-  
- Seconds - Class in org.apache.spark.streaming
- 
Helper object that creates an instance of Duration representing
 a given number of seconds. 
- Seconds() - Constructor for class org.apache.spark.streaming.Seconds
-  
- securityManager() - Method in class org.apache.spark.SparkEnv
-  
- select(Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Selects a set of column based expressions. 
- select(String, String...) - Method in class org.apache.spark.sql.DataFrame
- 
Selects a set of columns. 
- select(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Selects a set of column based expressions. 
- select(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Selects a set of columns. 
- select(Column...) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new DataFrame by selecting a set of column based expressions. 
- select(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new DataFrame by selecting a set of column based expressions. 
- select(TypedColumn<T, U1>, Encoder<U1>) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset by computing the given Column expression for each element. 
- select(TypedColumn<T, U1>, TypedColumn<T, U2>) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset by computing the given Column expressions for each element. 
- select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset by computing the given Column expressions for each element. 
- select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>, TypedColumn<T, U4>) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset by computing the given Column expressions for each element. 
- select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>, TypedColumn<T, U4>, TypedColumn<T, U5>) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new Dataset by computing the given Column expressions for each element. 
- selectedFeatures() - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- selectedFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-  
- selectExpr(String...) - Method in class org.apache.spark.sql.DataFrame
- 
Selects a set of SQL expressions. 
- selectExpr(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Selects a set of SQL expressions. 
- selectUntyped(Seq<TypedColumn<?, ?>>) - Method in class org.apache.spark.sql.Dataset
- 
Internal helper function for building typed selects that return tuples. 
- sendToDst(A) - Method in class org.apache.spark.graphx.EdgeContext
- 
Sends a message to the destination vertex. 
- sendToDst(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- sendToSrc(A) - Method in class org.apache.spark.graphx.EdgeContext
- 
Sends a message to the source vertex. 
- sendToSrc(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- sequence() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
-  
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Get an RDD for a Hadoop SequenceFile with given key and value types. 
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Get an RDD for a Hadoop SequenceFile. 
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
- 
Get an RDD for a Hadoop SequenceFile with given key and value types. 
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
- 
Get an RDD for a Hadoop SequenceFile with given key and value types. 
- sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext
- 
Version of sequenceFile() for types implicitly convertible to Writables through a
 WritableConverter. 
- SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd
- 
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile,
 through an implicit conversion. 
- SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Class<? extends Writable>, Class<? extends Writable>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
-  
- SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
-  
- SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
-  
- SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
-  
- SerializationStream - Class in org.apache.spark.serializer
- 
:: DeveloperApi ::
 A stream for writing serialized objects. 
- SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
-  
- serialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
-  
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
-  
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-  
- serialize(Object) - Method in class org.apache.spark.sql.types.UserDefinedType
- 
Convert the user type to a SQL datum. 
- serializedData() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-  
- serializedPyClass() - Method in class org.apache.spark.sql.types.UserDefinedType
- 
Serialized Python UDT class, if exists. 
- Serializer - Class in org.apache.spark.serializer
- 
:: DeveloperApi ::
 A serializer. 
- Serializer() - Constructor for class org.apache.spark.serializer.Serializer
-  
- serializer() - Method in class org.apache.spark.ShuffleDependency
-  
- serializer() - Method in class org.apache.spark.SparkEnv
-  
- SerializerInstance - Class in org.apache.spark.serializer
- 
:: DeveloperApi ::
 An instance of a serializer, for use by one thread at a time. 
- SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
-  
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.DummySerializerInstance
-  
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-  
- set(Edge<ED>) - Method in class org.apache.spark.graphx.EdgeTriplet
- 
Set the edge properties of this triplet. 
- set(long, long, int, int, VD, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- set(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
-  
- set(String, Object) - Method in interface org.apache.spark.ml.param.Params
-  
- set(ParamPair<?>) - Method in interface org.apache.spark.ml.param.Params
-  
- set(String, String) - Method in class org.apache.spark.SparkConf
- 
Set a configuration variable. 
- set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
-  
- set(long) - Method in class org.apache.spark.sql.types.Decimal
- 
Set this Decimal to the given Long. 
- set(int) - Method in class org.apache.spark.sql.types.Decimal
- 
Set this Decimal to the given Int. 
- set(long, int, int) - Method in class org.apache.spark.sql.types.Decimal
- 
Set this Decimal to the given unscaled Long, with a given precision and scale. 
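To make the "unscaled Long" wording concrete: the numeric value is the unscaled long multiplied by 10 to the power of minus the scale. The sketch below illustrates that relationship with the JDK's java.math.BigDecimal rather than Spark's Decimal. The class name UnscaledDecimalSketch and the helper fromUnscaled are hypothetical, and the precision check is an assumption about what a caller might want, not a statement of how Spark validates its arguments.

```java
import java.math.BigDecimal;

// Illustration only: uses java.math.BigDecimal, not org.apache.spark.sql.types.Decimal.
class UnscaledDecimalSketch {

    // Hypothetical helper: interpret `unscaled` as unscaled * 10^(-scale) and
    // reject values whose number of digits exceeds `precision`.
    static BigDecimal fromUnscaled(long unscaled, int precision, int scale) {
        BigDecimal value = BigDecimal.valueOf(unscaled, scale);
        if (value.precision() > precision) {
            throw new ArithmeticException(
                "value " + value + " does not fit in precision " + precision);
        }
        return value;
    }

    public static void main(String[] args) {
        // Unscaled 12345 with scale 2 represents 123.45.
        System.out.println(fromUnscaled(12345L, 5, 2)); // prints 123.45
    }
}
```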
- set(BigDecimal, int, int) - Method in class org.apache.spark.sql.types.Decimal
- 
Set this Decimal to the given BigDecimal value, with a given precision and scale. 
- set(BigDecimal) - Method in class org.apache.spark.sql.types.Decimal
- 
Set this Decimal to the given BigDecimal value, inheriting its precision and scale. 
- set(Decimal) - Method in class org.apache.spark.sql.types.Decimal
- 
Set this Decimal to the given Decimal value. 
- setActive(SQLContext) - Static method in class org.apache.spark.sql.SQLContext
- 
Changes the SQLContext that will be returned in this thread and its children when
 SQLContext.getOrCreate() is called. 
- setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD
- 
Set aggregator for RDD's shuffle. 
- setAlgo(String) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
- 
Sets Algorithm using a String. 
- setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
- 
Set multiple parameters together. 
- setAlpha(double) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setAlpha(Vector) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Alias for setDocConcentration()
 
- setAlpha(double) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Alias for setDocConcentration()
 
- setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setAppName(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Set the application name. 
- setAppName(String) - Method in class org.apache.spark.SparkConf
- 
Set a name for your application. 
- setAppResource(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Set the main application resource. 
- setBandwidth(double) - Method in class org.apache.spark.mllib.stat.KernelDensity
- 
Sets the bandwidth (standard deviation) of the Gaussian kernel (default: 1.0).
 
- setBeta(double) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Alias for setTopicConcentration()
 
- setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setBlockSize(int) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-  
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Pass-through to SparkContext.setCallSite. 
- setCallSite(String) - Method in class org.apache.spark.SparkContext
- 
Set the thread-local property for overriding the call sites
 of actions and RDDs. 
- setCaseSensitive(boolean) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- setCategoricalFeaturesInfo(Map<Integer, Integer>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
- 
Sets categoricalFeaturesInfo using a Java Map. 
- setCensorCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Set the directory under which RDDs are going to be checkpointed. 
- setCheckpointDir(String) - Method in class org.apache.spark.SparkContext
- 
Set the directory under which RDDs are going to be checkpointed. 
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Period (in iterations) between checkpoints (default = 10). 
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setClassifier(Classifier<?, ?, ?>) - Method in class org.apache.spark.ml.classification.OneVsRest
-  
- setConf(String, String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Set a single configuration value for the application. 
- setConf(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
-  
- setConf(Properties) - Method in class org.apache.spark.sql.SQLContext
- 
Set Spark SQL configuration properties. 
- setConf(String, String) - Method in class org.apache.spark.sql.SQLContext
- 
Set the given Spark SQL configuration property. 
- setConfig(String, String) - Static method in class org.apache.spark.launcher.SparkLauncher
- 
Set a configuration value for the launcher library. 
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Set the largest change in log-likelihood at which convergence is
 considered to have occurred. 
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
- 
Set the convergence tolerance. 
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
- 
Set the convergence tolerance of iterations for L-BFGS. 
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
- 
Set the convergence tolerance. 
- setDecayFactor(double) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Set the decay factor directly (for forgetful algorithms). 
- setDefault(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
- 
Sets a default value for a param. 
- setDefault(Seq<ParamPair<?>>) - Method in interface org.apache.spark.ml.param.Params
- 
Sets default values for a list of params. 
- setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer
- 
Sets a class loader for the serializer to use in deserialization. 
- setDegree(int) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-  
- setDeployMode(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Set the deploy mode for the application. 
- setDocConcentration(double[]) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setDocConcentration(double) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setDocConcentration(Vector) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Concentration parameter (commonly named "alpha") for the prior placed on documents'
 distributions over topics ("theta"). 
- setDocConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Replicates a Double docConcentration to create a symmetric prior.
 
- setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-  
- setElasticNetParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
- 
Set the ElasticNet mixing parameter. 
- setElasticNetParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression
- 
Set the ElasticNet mixing parameter. 
- setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Set the distance threshold within which we consider centers to have converged. 
- setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
-  
- setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.CrossValidator
-  
- setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.CrossValidator
-  
- setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf
- 
Set an environment variable to be used when launching executors for this application. 
- setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
- 
Set multiple environment variables to be used when launching executors. 
- setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf
- 
Set multiple environment variables to be used when launching executors. 
- setFeatureIndex(int) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- setFeatureIndex(int) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.LDA
- 
The features for LDA should be a Vector representing the word counts in a document.
 
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.LDAModel
- 
The features for LDA should be a Vector representing the word counts in a document.
 
- setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.RFormula
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.PredictionModel
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.Predictor
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-  
- setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setFinalRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression
- 
Whether to fit an intercept term. 
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
- 
Set whether to fit the intercept.
 Default is true. 
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.LinearRegression
- 
Set whether to fit the intercept.
 Default is true. 
- setFormula(String) - Method in class org.apache.spark.ml.feature.RFormula
- 
Sets the formula to use for this transformer. 
- setGaps(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent
- 
Set the gradient function (of the loss function of one single data example)
 to be used for SGD. 
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS
- 
Set the gradient function (of the loss function of one single data example)
 to be used for L-BFGS. 
- setHalfLife(double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Set the half life and time unit ("batches" or "points") for forgetful algorithms. 
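A half life is commonly related to a per-step decay factor a by a = exp(ln(0.5) / halfLife), so that the weight given to old data is halved after halfLife time units. The stand-alone sketch below shows only that arithmetic. The class HalfLifeSketch and its methods are hypothetical and do not call Spark; whether StreamingKMeans applies exactly this formula should be confirmed against its own documentation.

```java
// Illustration only: plain arithmetic, no Spark classes involved.
class HalfLifeSketch {

    // Decay factor a such that a^halfLife == 0.5.
    static double decayFactor(double halfLife) {
        if (halfLife <= 0.0) {
            throw new IllegalArgumentException("halfLife must be positive");
        }
        return Math.exp(Math.log(0.5) / halfLife);
    }

    // Weight remaining on old data after `elapsed` time units (batches or points).
    static double remainingWeight(double halfLife, double elapsed) {
        return Math.pow(decayFactor(halfLife), elapsed);
    }

    public static void main(String[] args) {
        // With a half life of 5 units, about half the weight remains after 5 units
        // and about a quarter after 10.
        System.out.println(remainingWeight(5.0, 5.0));
        System.out.println(remainingWeight(5.0, 10.0));
    }
}
```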
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-  
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- setIfMissing(String, String) - Method in class org.apache.spark.SparkConf
- 
Set a parameter if it isn't already configured. 
- setImplicitPrefs(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setImpurity(String) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setImpurity(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
- 
The impurity setting is ignored for GBT models. 
- setImpurity(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setImpurity(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setImpurity(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
- 
The impurity setting is ignored for GBT models. 
- setImpurity(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setImpurity(Impurity) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setIndices(int[]) - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- setInitialCenters(Vector[], double[]) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Specify initial centers directly. 
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Set the initialization algorithm. 
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
- 
Set the initialization mode. 
- setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Set the number of steps for the k-means|| initialization mode. 
- setInitialModel(GaussianMixtureModel) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Set the initial GMM starting point, bypassing the random initialization. 
- setInitialModel(KMeansModel) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Set the initial starting point, bypassing the random initialization or k-means||.
 The condition model.k == this.k must be met; failure results
 in an IllegalArgumentException. 
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
- 
Set the initial weights. 
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
- 
Set the initial weights. 
- setInitMode(String) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- setInitSteps(int) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizer
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.IDF
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.IndexToString
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.PCA
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.PCAModel
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
-  
- setInputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
-  
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.Interaction
-  
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.VectorAssembler
-  
- setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
- 
Set whether the algorithm should add an intercept. 
- setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setInverse(boolean) - Method in class org.apache.spark.ml.feature.DCT
-  
- setIsotonic(boolean) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- setIsotonic(boolean) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-  
- setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-  
- setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setJars(Seq<String>) - Method in class org.apache.spark.SparkConf
- 
Set JAR files to distribute to the cluster. 
- setJars(String[]) - Method in class org.apache.spark.SparkConf
- 
Set JAR files to distribute to the cluster. 
- setJavaHome(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Set a custom JAVA_HOME for launching the Spark application. 
- setJobDescription(String) - Method in class org.apache.spark.SparkContext
- 
Set a human readable description of the current job. 
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
 different value or cleared. 
- setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
 different value or cleared. 
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext
- 
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
 different value or cleared. 
- setK(int) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- setK(int) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setK(int) - Method in class org.apache.spark.ml.feature.PCA
-  
- setK(int) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Sets the desired number of leaf clusters (default: 4). 
- setK(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Set the number of Gaussians in the mixture model. 
- setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Set the number of clusters to create (k). 
- setK(int) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Number of topics to infer. 
- setK(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-  
- setK(int) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Set the number of clusters. 
- setKappa(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
- 
Learning rate: exponential decay rate. Should be in the interval
 (0.5, 1.0] to guarantee asymptotic convergence. 
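As a rough illustration of what an exponential decay rate controls, the sketch below uses the step-size schedule rho_t = (tau0 + t)^(-kappa) from the online variational Bayes literature. That schedule, the parameter tau0, and the class KappaSketch are assumptions introduced here for illustration. The range check mirrors the recommendation in the description rather than any validation Spark is known to perform.

```java
// Illustration only: the schedule rho_t = (tau0 + t)^(-kappa) is an assumption
// taken from the online variational Bayes literature, not read from Spark's code.
class KappaSketch {

    // Step size at a given iteration; larger kappa makes the step size shrink faster.
    static double stepSize(double tau0, double kappa, long iteration) {
        if (!(kappa > 0.5 && kappa <= 1.0)) {
            throw new IllegalArgumentException("kappa should be in (0.5, 1.0]");
        }
        if (tau0 <= 0.0) {
            throw new IllegalArgumentException("tau0 must be positive");
        }
        return Math.pow(tau0 + iteration, -kappa);
    }

    public static void main(String[] args) {
        // Step sizes decrease as the iteration count grows.
        for (long t = 0; t <= 1000; t += 250) {
            System.out.println(t + " -> " + stepSize(1024.0, 0.51, t));
        }
    }
}
```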
- setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD
- 
Set key ordering for RDD's shuffle. 
- setLabelCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.feature.RFormula
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.Predictor
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- setLabelCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- setLabels(String[]) - Method in class org.apache.spark.ml.feature.IndexToString
-  
- setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setLayers(int[]) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-  
- setLearningDecay(double) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setLearningOffset(double) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- setLearningRate(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Set a local property that affects jobs submitted from this thread, such as the
 Spark fair scheduler pool. 
- setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext
- 
Set a local property that affects jobs submitted from this thread, such as the
 Spark fair scheduler pool. 
- setLogLevel(String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Control our logLevel. 
- setLogLevel(String) - Method in class org.apache.spark.SparkContext
- 
Control our logLevel. 
- setLoss(Loss) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- setLossType(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setLossType(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setMainClass(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Sets the application class name for Java/Scala applications. 
- setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD
- 
Set mapSideCombine flag for RDD's shuffle. 
- setMaster(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Set the Spark master for the application. 
- setMaster(String) - Method in class org.apache.spark.SparkConf
- 
The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to
 run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster. 
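The three master URL shapes named above can be told apart mechanically. The sketch below is a hypothetical classifier written against only those three shapes; it is not Spark's own parsing logic, the class MasterUrlSketch and method describe are invented for illustration, and any other forms Spark may accept are deliberately left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: a hypothetical classifier, not Spark's master-URL parsing.
class MasterUrlSketch {

    private static final Pattern LOCAL_N = Pattern.compile("local\\[(\\d+)\\]");
    private static final Pattern STANDALONE = Pattern.compile("spark://([^:/]+):(\\d+)");

    // Returns a short description for the three URL shapes in the description above.
    static String describe(String master) {
        if (master.equals("local")) {
            return "local, 1 thread";
        }
        Matcher localN = LOCAL_N.matcher(master);
        if (localN.matches()) {
            return "local, " + localN.group(1) + " cores";
        }
        Matcher standalone = STANDALONE.matcher(master);
        if (standalone.matches()) {
            return "standalone cluster at " + standalone.group(1) + ":" + standalone.group(2);
        }
        return "unrecognized by this sketch";
    }

    public static void main(String[] args) {
        for (String m : new String[] {"local", "local[4]", "spark://master:7077"}) {
            System.out.println(m + " -> " + describe(m));
        }
    }
}
```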
- setMax(double) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-  
- setMax(double) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setMaxBins(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setMaxCategories(int) - Method in class org.apache.spark.ml.feature.VectorIndexer
-  
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setMaxDepth(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.LogisticRegression
- 
Set the maximum number of iterations. 
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
- 
Set the maximum number of iterations. 
- setMaxIter(int) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- setMaxIter(int) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setMaxIter(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setMaxIter(int) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
- 
Set the maximum number of iterations. 
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.LinearRegression
- 
Set the maximum number of iterations. 
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Sets the max number of k-means iterations to split clusters (default: 20). 
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Set the maximum number of iterations to run. 
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Set maximum number of iterations to run. 
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Maximum number of iterations for learning. 
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-  
- setMaxLocalProjDBSize(long) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
- 
Sets the maximum number of items (including delimiters used in the internal storage format)
 allowed in a projected database before local processing (default: 32000000L).
 
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setMaxMemoryInMB(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
- 
- setMaxPatternLength(int) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
- 
Sets maximal pattern length (default: 10).
 
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-  
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-  
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-  
- setMin(double) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-  
- setMin(double) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- setMinConfidence(double) - Method in class org.apache.spark.mllib.fpm.AssociationRules
- 
Sets the minimal confidence (default: 0.8).
 
- setMinCount(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setMinCount(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- setMinDF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
-  
- setMinDivisibleClusterSize(double) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Sets the minimum number of points (if >= 1.0) or the minimum proportion of points
 (if <1.0) of a divisible cluster (default: 1).
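
The two readings of this threshold can be sketched in plain Java. This is only an illustration of the rule stated above, not Spark's implementation: the class name, the method name, and the choice to round up are assumptions of the sketch.

```java
// Hypothetical helper, not part of Spark: turns a minDivisibleClusterSize
// setting into a minimum point count for a cluster to be divisible.
final class MinDivisibleSizeSketch {
    static long minPoints(double minDivisibleClusterSize, long totalPoints) {
        if (minDivisibleClusterSize >= 1.0) {
            // Values >= 1.0 are read as an absolute number of points.
            return (long) Math.ceil(minDivisibleClusterSize);
        }
        // Values < 1.0 are read as a proportion of all points.
        // Rounding up is an assumption of this sketch.
        return (long) Math.ceil(minDivisibleClusterSize * totalPoints);
    }

    public static void main(String[] args) {
        System.out.println(minPoints(5.0, 1000L));   // absolute count
        System.out.println(minPoints(0.25, 1000L));  // proportion of 1000 points
    }
}
```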
 
- setMinDocFreq(int) - Method in class org.apache.spark.ml.feature.IDF
-  
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
- 
Set the fraction of each batch to use for updates. 
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
- 
Mini-batch fraction in (0, 1], which sets the fraction of documents sampled and used in
 each iteration. 
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
- 
:: Experimental ::
 Set fraction of data to be used for each SGD iteration. 
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
- 
Set the fraction of each batch to use for updates. 
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setMinInfoGain(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setMinInstancesPerNode(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.FPGrowth
- 
Sets the minimal support level (default: 0.3).
 
- setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
- 
Sets the minimal support level (default: 0.1).
 
- setMinTF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
-  
- setMinTF(double) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-  
- setMinTokenLength(int) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- setModelType(String) - Method in class org.apache.spark.ml.classification.NaiveBayes
- 
Set the model type using a string (case-sensitive). 
- setModelType(String) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- setN(int) - Method in class org.apache.spark.ml.feature.NGram
-  
- setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Assign a name to this RDD. 
- setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Assign a name to this RDD. 
- setName(String) - Method in class org.apache.spark.api.java.JavaRDD
- 
Assign a name to this RDD. 
- setName(String) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- setName(String) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- setName(String) - Method in class org.apache.spark.rdd.RDD
- 
Assign a name to this RDD. 
- setNames(String[]) - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- setNonnegative(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setNumBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
- 
Sets both numUserBlocks and numItemBlocks to the specific value. 
- setNumBuckets(int) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- setNumClasses(int) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
- 
Set the number of possible outcomes for a k-class classification problem in
 Multinomial Logistic Regression. 
- setNumClasses(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
- 
Set the number of corrections used in the LBFGS update. 
- setNumFeatures(int) - Method in class org.apache.spark.ml.feature.HashingTF
-  
- setNumFolds(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
-  
- setNumItemBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setNumIterations(int) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
- 
Set the number of iterations of gradient descent to run per update. 
- setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent
- 
Set the number of iterations for SGD. 
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
- 
Set the maximal number of iterations for L-BFGS. 
- setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
- 
Set the number of iterations of gradient descent to run per update. 
- setNumIterations(int) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- setNumPartitions(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- setNumPartitions(int) - Method in class org.apache.spark.mllib.fpm.FPGrowth
- 
Sets the number of partitions used by parallel FP-growth (default: same as input data). 
- setNumTopFeatures(int) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-  
- setNumTrees(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setNumTrees(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setNumUserBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setOptimizeDocConcentration(boolean) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setOptimizeDocConcentration(boolean) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
- 
Sets whether to optimize docConcentration parameter during training. 
- setOptimizer(String) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setOptimizer(LDAOptimizer) - Method in class org.apache.spark.mllib.clustering.LDA
- 
:: DeveloperApi :: 
- setOptimizer(String) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Set the LDAOptimizer used to perform the actual calculation by algorithm name. 
- setOrNull(long, int, int) - Method in class org.apache.spark.sql.types.Decimal
- 
Set this Decimal to the given unscaled Long, with a given precision and scale,
 and return it, or return null if it cannot be set due to overflow. 
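
To make the "unscaled Long, precision, scale" wording concrete, here is a minimal plain-Java sketch using java.math.BigDecimal. It does not use Spark's Decimal class. The helper name and the overflow test (more significant digits than the precision allows) are assumptions of the sketch, since the index does not say how overflow is detected.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of Spark: builds a decimal from an unscaled
// long and a scale, and returns null when it does not fit the precision.
final class SetOrNullSketch {
    static BigDecimal fromUnscaledOrNull(long unscaled, int precision, int scale) {
        // valueOf(unscaled, scale) represents unscaled * 10^(-scale).
        BigDecimal value = BigDecimal.valueOf(unscaled, scale);
        // Assumed overflow rule: too many significant digits for the precision.
        return value.precision() > precision ? null : value;
    }

    public static void main(String[] args) {
        System.out.println(fromUnscaledOrNull(12345L, 5, 2)); // fits in 5 digits
        System.out.println(fromUnscaledOrNull(12345L, 4, 2)); // does not fit in 4 digits
    }
}
```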
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizer
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDF
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.IndexToString
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Interaction
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.PCA
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.PCAModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorAssembler
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
-  
- setOutputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
-  
- setP(double) - Method in class org.apache.spark.ml.feature.Normalizer
-  
- setParent(Estimator<M>) - Method in class org.apache.spark.ml.Model
- 
Sets the parent of this model (Java API). 
- setPattern(String) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- setPeacePeriod(int) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.PredictionModel
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.Predictor
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- setPredictionCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-  
- setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-  
- setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
-  
- setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setPropertiesFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Set a custom properties file with Spark configuration for the application. 
- setQuantileCalculationStrategy(Enumeration.Value) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setQuantileProbabilities(double[]) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- setQuantileProbabilities(double[]) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- setQuantilesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- setQuantilesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- setRandomCenters(int, double, long) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Initialize random centers, requiring only the number of dimensions. 
- setRank(int) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setRatingCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.ClassificationModel
-  
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.Classifier
-  
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-  
- setRegParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
- 
Set the regularization parameter. 
- setRegParam(double) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setRegParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression
- 
Set the regularization parameter. 
- setRegParam(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
- 
Set the regularization parameter. 
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
- 
Set the regularization parameter. 
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
- 
Set the regularization parameter. 
- setRest(long, int, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
:: Experimental ::
 Set the number of runs of the algorithm to execute in parallel. 
- setSample(RDD<Object>) - Method in class org.apache.spark.mllib.stat.KernelDensity
- 
Sets the sample to use for density estimation. 
- setSample(JavaRDD<Double>) - Method in class org.apache.spark.mllib.stat.KernelDensity
- 
Sets the sample to use for density estimation (for Java users). 
- setScalingVec(Vector) - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-  
- setScoreCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
- 
Deprecated.
Use setRawPredictionCol() instead.
 
 
- setSeed(long) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- setSeed(long) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setSeed(long) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
- 
Set the seed for weights initialization. 
- setSeed(long) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setSeed(long) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- setSeed(long) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setSeed(long) - Method in class org.apache.spark.ml.clustering.LDAModel
-  
- setSeed(long) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- setSeed(long) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setSeed(long) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setSeed(long) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- setSeed(long) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setSeed(long) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
- 
Sets the random seed (default: hash value of the class name). 
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Set the random seed. 
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.KMeans
- 
Set the random seed for cluster initialization. 
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Random seed 
- setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- setSeed(long) - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-  
- setSeed(long) - Method in class org.apache.spark.mllib.random.GammaGenerator
-  
- setSeed(long) - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-  
- setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
-  
- setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-  
- setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
-  
- setSeed(long) - Method in class org.apache.spark.mllib.random.WeibullGenerator
-  
- setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliCellSampler
-  
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
-  
- setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
-  
- setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom
- 
Set random seed. 
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
- 
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer). 
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
- 
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer). 
- setSmoothing(double) - Method in class org.apache.spark.ml.classification.NaiveBayes
- 
Set the smoothing parameter. 
- setSolver(String) - Method in class org.apache.spark.ml.regression.LinearRegression
- 
Set the solver algorithm used for optimization. 
- setSparkHome(String) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Set a custom Spark installation location for the application. 
- setSparkHome(String) - Method in class org.apache.spark.SparkConf
- 
Set the location where Spark is installed on worker nodes. 
- setSplits(double[]) - Method in class org.apache.spark.ml.feature.Bucketizer
-  
- setSrcOnly(long, int, VD) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- setStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline
-  
- setStandardization(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression
- 
Whether to standardize the training features before fitting the model. 
- setStandardization(boolean) - Method in class org.apache.spark.ml.regression.LinearRegression
- 
Whether to standardize the training features before fitting the model. 
- setStatement(String) - Method in class org.apache.spark.ml.feature.SQLTransformer
-  
- setStepSize(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setStepSize(double) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setStepSize(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setStepSize(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
- 
Set the step size for gradient descent. 
- setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
- 
Set the initial step size of SGD for the first step. 
- setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
- 
Set the step size for gradient descent. 
- setStopWords(String[]) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- setSubsamplingRate(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setTaskContext(TaskContext) - Static method in class org.apache.spark.TaskContext
- 
Set the thread local TaskContext. 
- setTau0(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
- 
A (positive) learning parameter that downweights early iterations. 
- setTestMethod(String) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-  
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-  
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- setThreshold(double) - Method in class org.apache.spark.ml.feature.Binarizer
-  
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
- 
Sets the threshold that separates positive predictions from negative predictions
 in Binary Logistic Regression. 
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel
- 
Sets the threshold that separates positive predictions from negative predictions. 
- setThresholds(double[]) - Method in class org.apache.spark.ml.classification.LogisticRegression
-  
- setThresholds(double[]) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-  
- setThresholds(double[]) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-  
- setThresholds(double[]) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
-  
- setTol(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
- 
Set the convergence tolerance of iterations. 
- setTol(double) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
- 
Set the convergence tolerance of iterations. 
- setTol(double) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- setTol(double) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
- 
Set the convergence tolerance of iterations. 
- setTol(double) - Method in class org.apache.spark.ml.regression.LinearRegression
- 
Set the convergence tolerance of iterations. 
- setToLowercase(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-  
- setTopicConcentration(double) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setTopicConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
- 
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
 distributions over terms. 
- setTopicDistributionCol(String) - Method in class org.apache.spark.ml.clustering.LDA
-  
- setTrainRatio(double) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- setTreeStrategy(Strategy) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent
- 
Set the updater function to actually perform a gradient step in a given direction. 
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS
- 
Set the updater function to actually perform a gradient step in a given direction. 
- setupGroups(int) - Method in class org.apache.spark.rdd.PartitionCoalescer
- 
Initializes targetLen partition groups and assigns a preferredLocation.
 This uses the coupon collector estimate of how many preferredLocations it must rotate through
 until it has seen most of the preferred locations (2 * n log(n)). 
- setUseNodeIdCache(boolean) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-  
- setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-  
- setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
- 
Set if the algorithm should validate data before training. 
- setValidationTol(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- setValue(R) - Method in class org.apache.spark.Accumulable
- 
Set the accumulator's value; only allowed on master. 
- setVectorSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- setVerbose(boolean) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Enables verbose reporting for SparkSubmit. 
- setVocabSize(int) - Method in class org.apache.spark.ml.feature.CountVectorizer
-  
- setWeightCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegression
- 
Whether to over-/under-sample training instances according to the given weights in weightCol. 
- setWeightCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- setWeightCol(String) - Method in class org.apache.spark.ml.regression.LinearRegression
- 
Whether to over-/under-sample training instances according to the given weights in weightCol. 
- setWindowSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- setWindowSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-  
- setWindowSize(int) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-  
- setWithMean(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
-  
- setWithMean(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-  
- setWithStd(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
-  
- setWithStd(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-  
- sha1(Column) - Static method in class org.apache.spark.sql.functions
- 
Calculates the SHA-1 digest of a binary column and returns the value
 as a 40 character hex string. 
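
As a rough illustration of the 40-character hex format, the following plain-Java sketch computes a SHA-1 digest with the JDK's MessageDigest. It does not call Spark; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Spark: shows what a SHA-1 digest of
// binary input looks like when rendered as a hex string.
final class Sha1HexSketch {
    static String sha1Hex(byte[] input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(input); // SHA-1 yields 20 bytes
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b)); // two hex characters per byte
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hex = sha1Hex("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(hex + " (" + hex.length() + " characters)");
    }
}
```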
- sha2(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Calculates the SHA-2 family of hash functions of a binary column and
 returns the value as a hex string. 
- shape() - Method in class org.apache.spark.mllib.random.GammaGenerator
-  
- shiftLeft(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Shift the given value numBits left. 
- shiftRight(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Shift the given value numBits right. 
- shiftRightUnsigned(Column, int) - Static method in class org.apache.spark.sql.functions
- 
Unsigned shift the given value numBits right. 
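
The difference between the three shift functions can be sketched with Java's own shift operators on a 32-bit int. This assumes the column functions follow the same semantics as the `<<`, `>>` and `>>>` operators, which the index does not state. The class below is hypothetical and does not use Spark.

```java
// Hypothetical illustration, not part of Spark: the three shift flavours
// applied to a plain 32-bit int.
final class ShiftSketch {
    static int shiftLeft(int value, int numBits) {
        return value << numBits;   // zeros enter from the right
    }

    static int shiftRight(int value, int numBits) {
        return value >> numBits;   // sign bit is copied in from the left
    }

    static int shiftRightUnsigned(int value, int numBits) {
        return value >>> numBits;  // zeros enter from the left
    }

    public static void main(String[] args) {
        System.out.println(shiftLeft(1, 3));            // 8
        System.out.println(shiftRight(-8, 1));          // -4
        System.out.println(shiftRightUnsigned(-8, 1));  // 2147483644
    }
}
```

The signed and unsigned right shifts differ only for negative inputs, where the unsigned form produces a large positive number.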
- SHORT() - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for nullable short type. 
- ShortDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
-  
- ShortestPaths - Class in org.apache.spark.graphx.lib
- 
Computes shortest paths to the given set of landmark vertices, returning a graph where each
 vertex attribute is a map containing the shortest-path distance to each reachable landmark. 
- ShortestPaths() - Constructor for class org.apache.spark.graphx.lib.ShortestPaths
-  
- shortName() - Method in class org.apache.spark.ml.source.libsvm.DefaultSource
-  
- shortName() - Method in interface org.apache.spark.sql.sources.DataSourceRegister
- 
The string that represents the format that this data source provider uses. 
- ShortType - Static variable in class org.apache.spark.sql.types.DataTypes
- 
Gets the ShortType object. 
- ShortType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The data type representing Short values.
 
- shouldDistributeGaussians(int, int) - Static method in class org.apache.spark.mllib.clustering.GaussianMixture
- 
Heuristic to distribute the computation of the MultivariateGaussians, approximately when
 d > 25 except for when k is very small.
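One formula consistent with this description is ((k - 1) / k) * d > 25, where k is the number of Gaussians and d the number of features. It is shown here as an assumption, not as the library's confirmed implementation; the class and method names are hypothetical.

```java
// Hypothetical sketch of a heuristic matching the description above.
final class DistributeGaussiansSketch {
    static boolean shouldDistribute(int k, int d) {
        // (k - 1) / k approaches 1 as k grows, so the test is roughly d > 25,
        // while a very small k scales d down and keeps the work local.
        return ((k - 1.0) / k) * d > 25;
    }

    public static void main(String[] args) {
        System.out.println(shouldDistribute(1, 100));
        System.out.println(shouldDistribute(10, 30));
    }
}
```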
 
- shouldFilterOut(String) - Static method in class org.apache.spark.sql.sources.HadoopFsRelation
- 
Checks if we should filter out this path name. 
- shouldGoLeft(Vector) - Method in interface org.apache.spark.ml.tree.Split
- 
Return true (split to left) or false (split to right). 
- shouldGoLeft(int, Split[]) - Method in interface org.apache.spark.ml.tree.Split
- 
Return true (split to left) or false (split to right). 
- shouldOverwrite() - Method in class org.apache.spark.ml.util.MLWriter
-  
- shouldOwn(Param<?>) - Method in interface org.apache.spark.ml.param.Params
- 
Validates that the input param belongs to this instance. 
- show(int) - Method in class org.apache.spark.sql.DataFrame
- 
- show() - Method in class org.apache.spark.sql.DataFrame
- 
Displays the top 20 rows of  DataFrame in a tabular form. 
- show(boolean) - Method in class org.apache.spark.sql.DataFrame
- 
Displays the top 20 rows of  DataFrame in a tabular form. 
- show(int, boolean) - Method in class org.apache.spark.sql.DataFrame
- 
- show(int) - Method in class org.apache.spark.sql.Dataset
- 
Displays the content of this  Dataset in a tabular form. 
- show() - Method in class org.apache.spark.sql.Dataset
- 
Displays the top 20 rows of  Dataset in a tabular form. 
- show(boolean) - Method in class org.apache.spark.sql.Dataset
- 
Displays the top 20 rows of  Dataset in a tabular form. 
- show(int, boolean) - Method in class org.apache.spark.sql.Dataset
- 
Displays the  Dataset in a tabular form. 
- showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-  
- showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-  
- SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
-  
- SHUFFLE_DATA() - Static method in class org.apache.spark.storage.BlockId
-  
- SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
-  
- ShuffleBlockId - Class in org.apache.spark.storage
-  
- ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
-  
- ShuffleDataBlockId - Class in org.apache.spark.storage
-  
- ShuffleDataBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleDataBlockId
-  
- ShuffleDependency<K,V,C> - Class in org.apache.spark
- 
:: DeveloperApi ::
 Represents a dependency on the output of a shuffle stage. 
- ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Option<Serializer>, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.ShuffleDependency
-  
- ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd
- 
:: DeveloperApi ::
 The resulting RDD from a shuffle (e.g. 
- ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.rdd.ShuffledRDD
-  
- shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
-  
- shuffleId() - Method in class org.apache.spark.CleanShuffle
-  
- shuffleId() - Method in class org.apache.spark.FetchFailed
-  
- shuffleId() - Method in class org.apache.spark.ShuffleDependency
-  
- shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
-  
- shuffleId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-  
- shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-  
- ShuffleIndexBlockId - Class in org.apache.spark.storage
-  
- ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
-  
- shuffleManager() - Method in class org.apache.spark.SparkEnv
-  
- shuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-  
- shuffleReadBytes() - Method in class org.apache.spark.status.api.v1.StageData
-  
- ShuffleReadMetricDistributions - Class in org.apache.spark.status.api.v1
-  
- ShuffleReadMetrics - Class in org.apache.spark.status.api.v1
-  
- shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-  
- shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-  
- shuffleReadRecords() - Method in class org.apache.spark.status.api.v1.StageData
-  
- shuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-  
- shuffleWriteBytes() - Method in class org.apache.spark.status.api.v1.StageData
-  
- ShuffleWriteMetricDistributions - Class in org.apache.spark.status.api.v1
-  
- ShuffleWriteMetrics - Class in org.apache.spark.status.api.v1
-  
- shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-  
- shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-  
- shuffleWriteRecords() - Method in class org.apache.spark.status.api.v1.StageData
-  
- sigma() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-  
- sigmas() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-  
- SignalLoggerHandler - Class in org.apache.spark.util
-  
- SignalLoggerHandler(String, Logger) - Constructor for class org.apache.spark.util.SignalLoggerHandler
-  
- signum(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the signum of the given value. 
- signum(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the signum of the given column. 
- SimpleFutureAction<T> - Class in org.apache.spark
- 
A  FutureAction holding the result of an action that triggers a single job. 
- simpleString() - Method in class org.apache.spark.sql.hive.HiveContext.QueryExecution
-  
- simpleString() - Method in class org.apache.spark.sql.types.ArrayType
-  
- simpleString() - Method in class org.apache.spark.sql.types.ByteType
-  
- simpleString() - Method in class org.apache.spark.sql.types.DataType
- 
Readable string representation for the type. 
- simpleString() - Method in class org.apache.spark.sql.types.DecimalType
-  
- simpleString() - Method in class org.apache.spark.sql.types.IntegerType
-  
- simpleString() - Method in class org.apache.spark.sql.types.LongType
-  
- simpleString() - Method in class org.apache.spark.sql.types.MapType
-  
- simpleString() - Method in class org.apache.spark.sql.types.ShortType
-  
- simpleString() - Method in class org.apache.spark.sql.types.StructType
-  
- SimpleUpdater - Class in org.apache.spark.mllib.optimization
- 
:: DeveloperApi ::
 A simple updater for gradient descent *without* any regularization. 
- SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
-  
- SIMR_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-  
- sin(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the sine of the given value. 
- sin(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the sine of the given column. 
- SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg
- 
Represents singular value decomposition (SVD) factors. 
- SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
-  
- sinh(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the hyperbolic sine of the given value. 
- sinh(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the hyperbolic sine of the given column. 
- size() - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Size of the attribute group. 
- size() - Method in class org.apache.spark.ml.param.ParamMap
- 
Number of param pairs in this map. 
- size() - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- size() - Method in class org.apache.spark.mllib.linalg.SparseVector
-  
- size() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Size of the vector. 
- size() - Method in class org.apache.spark.rdd.PartitionGroup
-  
- size(Column) - Static method in class org.apache.spark.sql.functions
- 
Returns length of array or map. 
- size() - Method in interface org.apache.spark.sql.Row
- 
Number of elements in the Row. 
- size() - Method in class org.apache.spark.storage.MemoryEntry
-  
- SizeEstimator - Class in org.apache.spark.util
- 
:: DeveloperApi ::
 Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in
 memory-aware caches. 
- SizeEstimator() - Constructor for class org.apache.spark.util.SizeEstimator
-  
- sizeInBytes() - Method in class org.apache.spark.sql.sources.BaseRelation
- 
Returns an estimated size of this relation in bytes. 
- sizeInBytes() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-  
- sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
- 
Sketches the input RDD via reservoir sampling on each partition. 
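As background, reservoir sampling draws a uniform sample of k items from a stream of unknown length in a single pass. The following is a minimal single-machine sketch of the classic algorithm, not the code RangePartitioner runs; ReservoirSketch and sample are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical single-pass reservoir sampler (classic "Algorithm R").
final class ReservoirSketch {
    static <T> List<T> sample(Iterator<T> input, int k, Random rng) {
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        while (input.hasNext()) {
            T item = input.next();
            seen++;
            if (reservoir.size() < k) {
                reservoir.add(item); // fill the reservoir first
            } else {
                // Keep the new item with probability k / seen,
                // replacing a uniformly chosen slot.
                long j = (long) (rng.nextDouble() * seen);
                if (j < k) {
                    reservoir.set((int) j, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            data.add(i);
        }
        System.out.println(sample(data.iterator(), 5, new Random(42)));
    }
}
```

If the input holds fewer than k items, the sketch simply returns all of them.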
- skewness(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the skewness of the values in a group. 
- skewness(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the skewness of the values in a group. 
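For reference, one common definition of skewness is the third central moment divided by the second central moment raised to the power 1.5 (the population form). The sketch below computes that quantity over a plain array; whether Spark uses exactly this variant is an assumption, and the names are hypothetical.

```java
// Hypothetical sketch: population skewness m3 / m2^1.5 of a sample.
final class SkewnessSketch {
    static double skewness(double[] xs) {
        int n = xs.length;
        double mean = 0.0;
        for (double x : xs) {
            mean += x;
        }
        mean /= n;

        double m2 = 0.0; // second central moment
        double m3 = 0.0; // third central moment
        for (double x : xs) {
            double d = x - mean;
            m2 += d * d;
            m3 += d * d * d;
        }
        m2 /= n;
        m3 /= n;
        // Yields NaN for empty or constant input, where skewness is undefined.
        return m3 / Math.pow(m2, 1.5);
    }

    public static void main(String[] args) {
        System.out.println(skewness(new double[] {1, 2, 3, 4, 5}));
        System.out.println(skewness(new double[] {1, 1, 1, 10}));
    }
}
```

Symmetric data gives a value of zero, and a long right tail gives a positive value.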
- skip(long) - Method in class org.apache.spark.storage.BufferReleasingInputStream
-  
- skippedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- slack() - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return all the RDDs between 'fromDuration' and 'toDuration' (both included) 
- slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return all the RDDs defined by the Interval object (both end times included) 
- slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return all the RDDs between 'fromTime' and 'toTime' (both included) 
- slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream
- 
Time interval after which the DStream generates an RDD 
- slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
-  
- sliding(int, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
- 
Returns an RDD from grouping items of its parent RDD in fixed size blocks by passing a sliding
 window over them. 
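To make the window and step parameters concrete, here is a small sketch of the same idea over an in-memory list. It assumes that only full windows are emitted, and it is not the distributed implementation; SlidingSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory analogue of a sliding window with a step.
final class SlidingSketch {
    static <T> List<List<T>> sliding(List<T> items, int windowSize, int step) {
        if (windowSize < 1 || step < 1) {
            throw new IllegalArgumentException("windowSize and step must be positive");
        }
        List<List<T>> windows = new ArrayList<>();
        // Assumption: trailing partial windows are dropped.
        for (int start = 0; start + windowSize <= items.size(); start += step) {
            windows.add(new ArrayList<>(items.subList(start, start + windowSize)));
        }
        return windows;
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sliding(items, 3, 1));
        System.out.println(sliding(items, 2, 2));
    }
}
```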
- sliding(int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
- 
sliding(Int, Int) with step = 1.
 
- SnappyCompressionCodec - Class in org.apache.spark.io
- 
- SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
-  
- SnappyOutputStreamWrapper - Class in org.apache.spark.io
- 
Wrapper over SnappyOutputStream which guards against write-after-close and double-close
 issues.
 
- SnappyOutputStreamWrapper(SnappyOutputStream) - Constructor for class org.apache.spark.io.SnappyOutputStreamWrapper
-  
- socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream from network source hostname:port. 
- socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream from TCP source hostname:port. 
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream from network source hostname:port. 
- socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream from network source hostname:port. 
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream from TCP source hostname:port. 
- Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-  
- sort(String, String...) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame sorted by the specified column, all in ascending order. 
- sort(Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame sorted by the given expressions. 
- sort(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame sorted by the specified column, all in ascending order. 
- sort(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame sorted by the given expressions. 
- sort_array(Column) - Static method in class org.apache.spark.sql.functions
- 
Sorts the input array for the given column in ascending order,
 according to the natural ordering of the array elements. 
- sort_array(Column, boolean) - Static method in class org.apache.spark.sql.functions
- 
Sorts the input array for the given column in ascending / descending order,
 according to the natural ordering of the array elements. 
- sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return this RDD sorted by the given key function. 
- sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
- 
Return this RDD sorted by the given key function. 
- sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Sort the RDD by key, so that each partition contains a sorted range of the elements in
 ascending order. 
- sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Sort the RDD by key, so that each partition contains a sorted range of the elements. 
- sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Sort the RDD by key, so that each partition contains a sorted range of the elements. 
- sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Sort the RDD by key, so that each partition contains a sorted range of the elements. 
- sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Sort the RDD by key, so that each partition contains a sorted range of the elements. 
- sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Sort the RDD by key, so that each partition contains a sorted range of the elements. 
- sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
- 
Sort the RDD by key, so that each partition contains a sorted range of the elements. 
- sortWithinPartitions(String, String...) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with each partition sorted by the given expressions. 
- sortWithinPartitions(Column...) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with each partition sorted by the given expressions. 
- sortWithinPartitions(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with each partition sorted by the given expressions. 
- sortWithinPartitions(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with each partition sorted by the given expressions. 
- soundex(Column) - Static method in class org.apache.spark.sql.functions
- 
Return the soundex code for the specified expression. 
- SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
-  
- SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
-  
- SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
-  
- SPARK_MASTER - Static variable in class org.apache.spark.launcher.SparkLauncher
- 
The Spark master. 
- spark_partition_id() - Static method in class org.apache.spark.sql.functions
- 
Partition ID of the Spark task. 
- SPARK_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-  
- SparkAppHandle - Interface in org.apache.spark.launcher
- 
A handle to a running Spark application. 
- SparkAppHandle.Listener - Interface in org.apache.spark.launcher
- 
Listener for updates to a handle's state. 
- SparkAppHandle.State - Enum in org.apache.spark.launcher
- 
Represents the application's state. 
- SparkConf - Class in org.apache.spark
- 
Configuration for a Spark application. 
- SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
-  
- SparkConf() - Constructor for class org.apache.spark.SparkConf
- 
Create a SparkConf that loads defaults from system properties and the classpath 
- sparkContext() - Method in class org.apache.spark.rdd.RDD
- 
The SparkContext that created this RDD. 
- SparkContext - Class in org.apache.spark
- 
Main entry point for Spark functionality. 
- SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
-  
- SparkContext() - Constructor for class org.apache.spark.SparkContext
- 
Create a SparkContext that loads settings from system properties (for instance, when
 launching with ./bin/spark-submit). 
- SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
- 
:: DeveloperApi ::
 Alternative constructor for setting preferred locations where Spark will create executors. 
- SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext
- 
Alternative constructor that allows setting common Spark properties directly 
- SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
- 
Alternative constructor that allows setting common Spark properties directly 
- sparkContext() - Method in class org.apache.spark.sql.SQLContext
-  
- sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
The underlying SparkContext 
- sparkContext() - Method in class org.apache.spark.streaming.StreamingContext
- 
Return the associated Spark context 
- SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
-  
- SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-  
- SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
-  
- SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
-  
- SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
-  
- SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
-  
- SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
-  
- SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
-  
- SparkEnv - Class in org.apache.spark
- 
:: DeveloperApi ::
 Holds all the runtime environment objects for a running Spark instance (either master or worker),
 including the serializer, Akka actor system, block manager, map output tracker, etc. 
- SparkEnv(String, org.apache.spark.rpc.RpcEnv, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleManager, org.apache.spark.broadcast.BroadcastManager, BlockTransferService, org.apache.spark.storage.BlockManager, SecurityManager, String, org.apache.spark.metrics.MetricsSystem, MemoryManager, org.apache.spark.scheduler.OutputCommitCoordinator, SparkConf) - Constructor for class org.apache.spark.SparkEnv
-  
- SparkException - Exception in org.apache.spark
-  
- SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
-  
- SparkException(String) - Constructor for exception org.apache.spark.SparkException
-  
- SparkFiles - Class in org.apache.spark
- 
Resolves paths to files added through SparkContext.addFile().
 
- SparkFiles() - Constructor for class org.apache.spark.SparkFiles
-  
- sparkFilesDir() - Method in class org.apache.spark.SparkEnv
-  
- SparkFirehoseListener - Class in org.apache.spark
- 
Class that allows users to receive all SparkListener events. 
- SparkFirehoseListener() - Constructor for class org.apache.spark.SparkFirehoseListener
-  
- SparkFlumeEvent - Class in org.apache.spark.streaming.flume
- 
A wrapper class for AvroFlumeEvents with a custom serialization format. 
- SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
-  
- SparkJobInfo - Interface in org.apache.spark
- 
Exposes information about Spark Jobs. 
- SparkJobInfoImpl - Class in org.apache.spark
-  
- SparkJobInfoImpl(int, int[], JobExecutionStatus) - Constructor for class org.apache.spark.SparkJobInfoImpl
-  
- SparkLauncher - Class in org.apache.spark.launcher
- 
Launcher for Spark applications. 
- SparkLauncher() - Constructor for class org.apache.spark.launcher.SparkLauncher
-  
- SparkLauncher(Map<String, String>) - Constructor for class org.apache.spark.launcher.SparkLauncher
- 
Creates a launcher that will set the given environment variables in the child. 
- SparkListener - Interface in org.apache.spark.scheduler
- 
:: DeveloperApi ::
 Interface for listening to events from the Spark scheduler. 
- SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
-  
- SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
-  
- SparkListenerApplicationStart - Class in org.apache.spark.scheduler
-  
- SparkListenerApplicationStart(String, Option<String>, long, String, Option<String>, Option<Map<String, String>>) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
-  
- SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
-  
- SparkListenerBlockManagerAdded(long, BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-  
- SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
-  
- SparkListenerBlockManagerRemoved(long, BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-  
- SparkListenerBlockUpdated - Class in org.apache.spark.scheduler
-  
- SparkListenerBlockUpdated(BlockUpdatedInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockUpdated
-  
- SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
-  
- SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
-  
- SparkListenerEvent - Interface in org.apache.spark.scheduler
-  
- SparkListenerExecutorAdded - Class in org.apache.spark.scheduler
-  
- SparkListenerExecutorAdded(long, String, ExecutorInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorAdded
-  
- SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler
- 
Periodic updates from executors. 
- SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, TaskMetrics>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-  
- SparkListenerExecutorRemoved - Class in org.apache.spark.scheduler
-  
- SparkListenerExecutorRemoved(long, String, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-  
- SparkListenerJobEnd - Class in org.apache.spark.scheduler
-  
- SparkListenerJobEnd(int, long, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
-  
- SparkListenerJobStart - Class in org.apache.spark.scheduler
-  
- SparkListenerJobStart(int, long, Seq<StageInfo>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
-  
- SparkListenerStageCompleted - Class in org.apache.spark.scheduler
-  
- SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
-  
- SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
-  
- SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
-  
- SparkListenerTaskEnd - Class in org.apache.spark.scheduler
-  
- SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
-  
- SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
-  
- SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-  
- SparkListenerTaskStart - Class in org.apache.spark.scheduler
-  
- SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
-  
- SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
-  
- SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-  
- SparkMasterRegex - Class in org.apache.spark
- 
A collection of regexes for extracting information from the master string. 
- SparkMasterRegex() - Constructor for class org.apache.spark.SparkMasterRegex
-  
- sparkPartitionId() - Static method in class org.apache.spark.sql.functions
- 
Deprecated.
As of 1.6.0, replaced by spark_partition_id. This will be removed in Spark 2.0.
 
 
- sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
-  
- SparkShutdownHook - Class in org.apache.spark.util
-  
- SparkShutdownHook(int, Function0<BoxedUnit>) - Constructor for class org.apache.spark.util.SparkShutdownHook
-  
- SparkStageInfo - Interface in org.apache.spark
- 
Exposes information about Spark Stages. 
- SparkStageInfoImpl - Class in org.apache.spark
-  
- SparkStageInfoImpl(int, int, long, String, int, int, int, int) - Constructor for class org.apache.spark.SparkStageInfoImpl
-  
- SparkStatusTracker - Class in org.apache.spark
- 
Low-level status reporting APIs for monitoring job and stage progress. 
- sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-  
- sparkUser() - Method in class org.apache.spark.SparkContext
-  
- sparkUser() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-  
- sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
- 
Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format. 
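The five arguments are, in order, the row count, the column count, the column pointers, the row indices and the non-zero values. As a sketch of how those arrays encode a matrix, the following plain-Java example expands a CSC triple into a dense array; it does not use Spark, and the names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: expand Compressed Sparse Column arrays to a dense matrix.
final class CscSketch {
    static double[][] toDense(int numRows, int numCols,
                              int[] colPtrs, int[] rowIndices, double[] values) {
        double[][] dense = new double[numRows][numCols];
        for (int col = 0; col < numCols; col++) {
            // Entries of column col live in positions colPtrs[col] .. colPtrs[col + 1] - 1.
            for (int k = colPtrs[col]; k < colPtrs[col + 1]; k++) {
                dense[rowIndices[k]][col] = values[k];
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        // 3 x 2 matrix with rows (9, 0), (0, 8), (0, 6).
        double[][] m = toDense(3, 2,
            new int[] {0, 1, 3}, new int[] {0, 2, 1}, new double[] {9, 6, 8});
        System.out.println(Arrays.deepToString(m));
    }
}
```

Note that colPtrs has one more entry than there are columns, and its last entry equals the number of stored values.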
- sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Creates a sparse vector providing its index array and value array. 
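In this representation the int argument is the vector size, and the two arrays hold the positions and values of the non-zero entries. A minimal sketch of the equivalent dense vector, written without Spark and using hypothetical names:

```java
import java.util.Arrays;

// Hypothetical sketch: expand (size, indices, values) to a dense array.
final class SparseVectorSketch {
    static double[] toDense(int size, int[] indices, double[] values) {
        if (indices.length != values.length) {
            throw new IllegalArgumentException("indices and values must have equal length");
        }
        double[] dense = new double[size]; // all entries start at 0.0
        for (int i = 0; i < indices.length; i++) {
            dense[indices[i]] = values[i];
        }
        return dense;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
            toDense(5, new int[] {1, 3}, new double[] {2.0, 4.0})));
    }
}
```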
- sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Creates a sparse vector using unordered (index, value) pairs. 
- sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Creates a sparse vector using unordered (index, value) pairs in a Java friendly way. 
- SparseMatrix - Class in org.apache.spark.mllib.linalg
- 
Column-major sparse matrix. 
- SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
-  
- SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
- 
Column-major sparse matrix. 
- SparseVector - Class in org.apache.spark.mllib.linalg
- 
A sparse vector represented by an index array and a value array. 
- SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
-  
- sparsity() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-  
- spdiag(Vector) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
- 
Generate a diagonal matrix in SparseMatrix format from the supplied values.
 
- SpecialLengths - Class in org.apache.spark.api.r
-  
- SpecialLengths() - Constructor for class org.apache.spark.api.r.SpecialLengths
-  
- speculative() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- speculative() - Method in class org.apache.spark.status.api.v1.TaskData
-  
- speye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices
- 
Generate a sparse Identity Matrix in Matrix format.
 
- speye(int) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
- 
Generate an Identity Matrix in SparseMatrix format.
 
- SpillListener - Class in org.apache.spark
- 
A SparkListener that detects whether spills have occurred in Spark jobs.
 
- SpillListener() - Constructor for class org.apache.spark.SpillListener
-  
- split() - Method in class org.apache.spark.ml.tree.InternalNode
-  
- Split - Interface in org.apache.spark.ml.tree
- 
:: DeveloperApi ::
 Interface for a "Split," which specifies a test made at a decision tree node
 to choose the left or right path. 
- split() - Method in class org.apache.spark.mllib.tree.model.Node
-  
- Split - Class in org.apache.spark.mllib.tree.model
- 
:: DeveloperApi ::
 Split applied to a feature
 param:  feature feature index
 param:  threshold Threshold for continuous feature. 
- Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
-  
- split(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Splits str around pattern (pattern is a regular expression). 
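Because the pattern is a regular expression, characters such as "." or "|" need escaping when they are meant literally. The sketch below shows the analogous behaviour of String.split in plain Java; the use of limit -1 (keep trailing empty strings) is an assumption of this sketch, and SplitSketch is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical sketch of regex-based splitting using the JDK only.
final class SplitSketch {
    static String[] split(String str, String pattern) {
        // Assumption: limit -1, so trailing empty strings are kept.
        return str.split(pattern, -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("one,two;three", "[,;]")));
        System.out.println(Arrays.toString(split("a.b.c", "\\."))); // escaped literal dot
    }
}
```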
- SPLIT_INFO_REFLECTIONS() - Static method in class org.apache.spark.rdd.HadoopRDD
-  
- splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
-  
- SplitInfo - Class in org.apache.spark.scheduler
-  
- SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
-  
- splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
-  
- splits() - Method in class org.apache.spark.ml.feature.Bucketizer
- 
Parameter for mapping continuous features into buckets. 
- sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
- 
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
 
- sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
- 
Generate a SparseMatrix consisting of i.i.d.
 
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
- 
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
 
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
- 
Generate a SparseMatrix consisting of i.i.d.
 
- sqdist(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors
- 
Returns the squared distance between two Vectors. 
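The squared distance described above is the sum of squared component-wise differences. A minimal plain-Java sketch follows; it assumes dense arrays of equal length, and the class name SquaredDistanceSketch is hypothetical, not part of Spark or of org.apache.spark.mllib.linalg.Vectors.

```java
// Hypothetical helper for illustration only; not Spark's implementation.
final class SquaredDistanceSketch {
    private SquaredDistanceSketch() {}

    /** Returns the squared Euclidean distance between two dense vectors of equal length. */
    static double sqdist(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Vector sizes differ: " + a.length + " vs " + b.length);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d; // accumulate the squared difference of each component
        }
        return sum;
    }
}
```

For the vectors (1, 2, 3) and (4, 6, 3), the differences are 3, 4 and 0, so the sketch returns 25.0. No square root is taken.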
- sql(String) - Method in class org.apache.spark.sql.SQLContext
-  
- sqlContext() - Method in class org.apache.spark.ml.clustering.LDAModel
-  
- sqlContext() - Method in class org.apache.spark.sql.DataFrame
-  
- sqlContext() - Method in class org.apache.spark.sql.Dataset
-  
- sqlContext() - Method in class org.apache.spark.sql.sources.BaseRelation
-  
- SQLContext - Class in org.apache.spark.sql
- 
The entry point for working with structured data (rows and columns) in Spark. 
- SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-  
- SQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-  
- SQLContext.implicits$ - Class in org.apache.spark.sql
- 
:: Experimental ::
 (Scala-specific) Implicit methods available in Scala for converting
 common Scala objects into  DataFrames. 
- SQLContext.implicits$() - Constructor for class org.apache.spark.sql.SQLContext.implicits$
-  
- SQLContext.implicits$.StringToColumn - Class in org.apache.spark.sql
- 
Converts $"col name" into a Column. 
- SQLContext.implicits$.StringToColumn(StringContext) - Constructor for class org.apache.spark.sql.SQLContext.implicits$.StringToColumn
-  
- SQLContext.QueryExecution - Class in org.apache.spark.sql
-  
- SQLContext.QueryExecution(LogicalPlan) - Constructor for class org.apache.spark.sql.SQLContext.QueryExecution
-  
- SQLContext.SparkPlanner - Class in org.apache.spark.sql
-  
- SQLContext.SparkPlanner() - Constructor for class org.apache.spark.sql.SQLContext.SparkPlanner
-  
- SQLImplicits - Class in org.apache.spark.sql
- 
A collection of implicit methods for converting common Scala objects into  DataFrames. 
- SQLImplicits() - Constructor for class org.apache.spark.sql.SQLImplicits
-  
- sqlParser() - Method in class org.apache.spark.sql.SQLContext
-  
- SQLTransformer - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 Implements the transformations which are defined by SQL statement. 
- SQLTransformer(String) - Constructor for class org.apache.spark.ml.feature.SQLTransformer
-  
- SQLTransformer() - Constructor for class org.apache.spark.ml.feature.SQLTransformer
-  
- sqlType() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-  
- sqlType() - Method in class org.apache.spark.sql.types.UserDefinedType
- 
Underlying storage type for this UDT 
- SQLUserDefinedType - Annotation Type in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 A user-defined type which can be automatically recognized by a SQLContext and registered. 
- sqrt(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the square root of the specified float value. 
- sqrt(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the square root of the specified float value. 
- squaredDist(Vector) - Method in class org.apache.spark.util.Vector
-  
- SquaredError - Class in org.apache.spark.mllib.tree.loss
- 
:: DeveloperApi ::
 Class for squared error loss calculation. 
- SquaredError() - Constructor for class org.apache.spark.mllib.tree.loss.SquaredError
-  
- SquaredL2Updater - Class in org.apache.spark.mllib.optimization
- 
:: DeveloperApi ::
 Updater for L2 regularized problems. 
- SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
-  
- Src - Static variable in class org.apache.spark.graphx.TripletFields
- 
Expose the source and edge fields but not the destination field. 
- srcAttr() - Method in class org.apache.spark.graphx.EdgeContext
- 
The vertex attribute of the edge's source vertex. 
- srcAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
- 
The source vertex attribute 
- srcAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- srcId() - Method in class org.apache.spark.graphx.Edge
-  
- srcId() - Method in class org.apache.spark.graphx.EdgeContext
- 
The vertex id of the edge's source vertex. 
- srcId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-  
- srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-  
- ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-  
- ssc() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- stackTrace() - Method in class org.apache.spark.ExceptionFailure
-  
- stage() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-  
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-  
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-  
- StageData - Class in org.apache.spark.status.api.v1
-  
- stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
-  
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-  
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-  
- stageId() - Method in class org.apache.spark.scheduler.StageInfo
-  
- stageId() - Method in interface org.apache.spark.SparkStageInfo
-  
- stageId() - Method in class org.apache.spark.SparkStageInfoImpl
-  
- stageId() - Method in class org.apache.spark.status.api.v1.StageData
-  
- stageId() - Method in class org.apache.spark.TaskContext
- 
The ID of the stage that this task belongs to. 
- stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-  
- stageIds() - Method in interface org.apache.spark.SparkJobInfo
-  
- stageIds() - Method in class org.apache.spark.SparkJobInfoImpl
-  
- stageIds() - Method in class org.apache.spark.status.api.v1.JobData
-  
- stageIdToActiveJobIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- stageIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- stageIdToInfo() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
-  
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-  
- StageInfo - Class in org.apache.spark.scheduler
- 
:: DeveloperApi ::
 Stores information about a stage to pass from the scheduler to SparkListeners. 
- StageInfo(int, int, String, int, Seq<RDDInfo>, Seq<Object>, String, Seq<Seq<TaskLocation>>) - Constructor for class org.apache.spark.scheduler.StageInfo
-  
- stageInfos() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-  
- stageLogInfo(int, String, boolean) - Method in class org.apache.spark.scheduler.JobLogger
- 
Write info into log file 
- stages() - Method in class org.apache.spark.ml.Pipeline
- 
param for pipeline stages 
- stages() - Method in class org.apache.spark.ml.PipelineModel
-  
- StageStatus - Enum in org.apache.spark.status.api.v1
-  
- StandardNormalGenerator - Class in org.apache.spark.mllib.random
- 
:: DeveloperApi ::
 Generates i.i.d. 
- StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
-  
- StandardScaler - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 Standardizes features by removing the mean and scaling to unit variance using column summary
 statistics on the samples in the training set. 
- StandardScaler(String) - Constructor for class org.apache.spark.ml.feature.StandardScaler
-  
- StandardScaler() - Constructor for class org.apache.spark.ml.feature.StandardScaler
-  
- StandardScaler - Class in org.apache.spark.mllib.feature
- 
Standardizes features by removing the mean and scaling to unit std using column summary
 statistics on the samples in the training set. 
- StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-  
- StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-  
- StandardScalerModel - Class in org.apache.spark.ml.feature
-  
- StandardScalerModel - Class in org.apache.spark.mllib.feature
- 
Represents a StandardScaler model that can transform vectors. 
- StandardScalerModel(Vector, Vector, boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-  
- StandardScalerModel(Vector, Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-  
- StandardScalerModel(Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-  
- starGraph(SparkContext, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
- 
Create a star graph with vertex 0 being the center. 
- start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Start the execution of the streams. 
- start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-  
- start() - Method in class org.apache.spark.streaming.dstream.InputDStream
- 
Method called to start receiving data. 
- start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-  
- start() - Method in class org.apache.spark.streaming.StreamingContext
- 
Start the execution of the streams. 
- startApplication(SparkAppHandle.Listener...) - Method in class org.apache.spark.launcher.SparkLauncher
- 
Starts a Spark application. 
- startIndexInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
- 
Return the index of the first node in the given level. 
- startPosition() - Method in exception org.apache.spark.sql.AnalysisException
-  
- startsWith(Column) - Method in class org.apache.spark.sql.Column
- 
String starts with. 
- startsWith(String) - Method in class org.apache.spark.sql.Column
- 
String starts with another string literal. 
- startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- startTime() - Method in class org.apache.spark.SparkContext
-  
- startTime() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-  
- startTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-  
- startTime() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-  
- stat() - Method in class org.apache.spark.sql.DataFrame
- 
- StatCounter - Class in org.apache.spark.util
- 
A class for tracking the statistics of a set of numbers (count, mean and variance) in a
 numerically robust way. 
- StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
-  
- StatCounter() - Constructor for class org.apache.spark.util.StatCounter
- 
Initialize the StatCounter with no values. 
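One common way to track count, mean and variance in a numerically robust way is Welford's online update, which avoids subtracting two large sums. The sketch below is an assumption about the general technique, not a copy of StatCounter; the class RunningStatsSketch and its methods are hypothetical names.

```java
// Hypothetical illustration of Welford's online algorithm; not Spark's StatCounter.
final class RunningStatsSketch {
    private long n = 0;        // number of values seen so far
    private double mean = 0.0; // running mean
    private double m2 = 0.0;   // running sum of squared deviations from the mean

    /** Folds one value into the running statistics. */
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean); // uses the updated mean, which keeps the update stable
    }

    long count() { return n; }

    double mean() { return mean; }

    /** Population variance (divides by n); NaN when no values were added. */
    double variance() { return n == 0 ? Double.NaN : m2 / n; }

    /** Population standard deviation. */
    double stdev() { return Math.sqrt(variance()); }
}
```

Adding the values 2, 4, 4, 4, 5, 5, 7, 9 one at a time gives a count of 8, a mean of approximately 5, and a standard deviation of approximately 2.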
- state() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-  
- State<S> - Class in org.apache.spark.streaming
- 
:: Experimental ::
 Abstract class for getting and updating the state in the mapping function used in the mapWithState
 operation of a pair DStream (Scala)
 or a JavaPairDStream (Java). 
- State() - Constructor for class org.apache.spark.streaming.State
-  
- stateChanged(SparkAppHandle) - Method in interface org.apache.spark.launcher.SparkAppHandle.Listener
- 
Callback for changes in the handle's state. 
- statement() - Method in class org.apache.spark.ml.feature.SQLTransformer
- 
SQL statement parameter. 
- stateSnapshots() - Method in class org.apache.spark.streaming.api.java.JavaMapWithStateDStream
-  
- stateSnapshots() - Method in class org.apache.spark.streaming.dstream.MapWithStateDStream
- 
Return a pair DStream where each RDD is the snapshot of the state of all the keys. 
- StateSpec<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming
- 
:: Experimental ::
 Abstract class representing all the specifications of the DStream transformation
  mapWithState operation of a
  pair DStream (Scala) or a
  JavaPairDStream (Java). 
- StateSpec() - Constructor for class org.apache.spark.streaming.StateSpec
-  
- staticPageRank(int, double) - Method in class org.apache.spark.graphx.GraphOps
- 
Run PageRank for a fixed number of iterations returning a graph with vertex attributes
 containing the PageRank and edge attributes the normalized edge weight. 
- staticPersonalizedPageRank(long, int, double) - Method in class org.apache.spark.graphx.GraphOps
- 
Run Personalized PageRank for a fixed number of iterations, with
 all iterations originating at the source node,
 returning a graph with vertex attributes
 containing the PageRank and edge attributes the normalized edge weight. 
- statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-  
- statistic() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
-  
- statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult
- 
Test statistic. 
- Statistics - Class in org.apache.spark.mllib.stat
-  
- Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
-  
- Statistics - Class in org.apache.spark.streaming.receiver
- 
:: DeveloperApi ::
 Statistics for querying the supervisor about state of workers. 
- Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
-  
- stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return a  StatCounter object that captures the mean, variance and
 count of the RDD's elements in one operation. 
- stats() - Method in class org.apache.spark.mllib.tree.model.Node
-  
- stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
- 
Return a  StatCounter object that captures the mean, variance and
 count of the RDD's elements in one operation. 
- StatsReportListener - Class in org.apache.spark.scheduler
- 
:: DeveloperApi ::
 Simple SparkListener that logs a few summary statistics when each stage completes. 
- StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
-  
- StatsReportListener - Class in org.apache.spark.streaming.scheduler
- 
:: DeveloperApi ::
 A simple StreamingListener that logs summary statistics across Spark Streaming batches
 param:  numBatchInfos Number of last batches to consider for generating statistics (default: 10) 
- StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
-  
- status() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- status() - Method in interface org.apache.spark.SparkJobInfo
-  
- status() - Method in class org.apache.spark.SparkJobInfoImpl
-  
- status() - Method in class org.apache.spark.status.api.v1.JobData
-  
- status() - Method in class org.apache.spark.status.api.v1.StageData
-  
- statusTracker() - Method in class org.apache.spark.api.java.JavaSparkContext
-  
- statusTracker() - Method in class org.apache.spark.SparkContext
-  
- StatusUpdate - Class in org.apache.spark.scheduler.local
-  
- StatusUpdate(long, Enumeration.Value, ByteBuffer) - Constructor for class org.apache.spark.scheduler.local.StatusUpdate
-  
- std() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-  
- std() - Method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- std() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-  
- std() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-  
- stddev(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: alias for stddev_samp.
 
- stddev(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: alias for stddev_samp.
 
- stddev_pop(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the population standard deviation of
 the expression in a group. 
- stddev_pop(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the population standard deviation of
 the expression in a group. 
- stddev_samp(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the sample standard deviation of
 the expression in a group. 
- stddev_samp(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the sample standard deviation of
 the expression in a group. 
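The population and sample standard deviations listed above differ only in the divisor: n for the population form, n - 1 for the sample form. A minimal plain-Java sketch of that difference follows. The class StdDevSketch and its methods are hypothetical names, and returning NaN for too few values is a choice made for this sketch, not a statement about Spark's behavior.

```java
// Hypothetical helper for illustration only; not part of org.apache.spark.sql.functions.
final class StdDevSketch {
    private StdDevSketch() {}

    /** Sum of squared deviations from the arithmetic mean. */
    static double sumSquaredDeviations(double[] xs) {
        double mean = 0.0;
        for (double x : xs) {
            mean += x;
        }
        mean /= xs.length;
        double ss = 0.0;
        for (double x : xs) {
            double d = x - mean;
            ss += d * d;
        }
        return ss;
    }

    /** Population standard deviation: divides by n. */
    static double populationStdDev(double[] xs) {
        if (xs.length == 0) return Double.NaN; // sketch-level choice for empty input
        return Math.sqrt(sumSquaredDeviations(xs) / xs.length);
    }

    /** Sample standard deviation: divides by n - 1. */
    static double sampleStdDev(double[] xs) {
        if (xs.length < 2) return Double.NaN; // sketch-level choice for fewer than two values
        return Math.sqrt(sumSquaredDeviations(xs) / (xs.length - 1));
    }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the sum of squared deviations is 32, so the population form gives sqrt(32 / 8) = 2.0 and the sample form gives sqrt(32 / 7), roughly 2.138.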
- stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Compute the standard deviation of this RDD's elements. 
- stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
- 
Compute the standard deviation of this RDD's elements. 
- stdev() - Method in class org.apache.spark.util.StatCounter
- 
Return the standard deviation of the values. 
- stop() - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Shut down the SparkContext. 
- stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
-  
- stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
-  
- stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
-  
- stop() - Method in interface org.apache.spark.launcher.SparkAppHandle
- 
Asks the application to stop. 
- stop() - Method in class org.apache.spark.SparkContext
-  
- stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Stop the execution of the streams. 
- stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Stop the execution of the streams. 
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Stop the execution of the streams. 
- stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-  
- stop() - Method in class org.apache.spark.streaming.dstream.InputDStream
- 
Method called to stop receiving data. 
- stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-  
- stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Stop the receiver completely. 
- stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Stop the receiver completely due to an exception. 
- stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext
- 
Stop the execution of the streams immediately (does not wait for all received data
 to be processed). 
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext
- 
Stop the execution of the streams, with option of ensuring all received data
 has been processed. 
- StopCoordinator - Class in org.apache.spark.scheduler
-  
- StopCoordinator() - Constructor for class org.apache.spark.scheduler.StopCoordinator
-  
- StopExecutor - Class in org.apache.spark.scheduler.local
-  
- StopExecutor() - Constructor for class org.apache.spark.scheduler.local.StopExecutor
-  
- stopped() - Method in class org.apache.spark.SparkContext
-  
- stopWords() - Method in class org.apache.spark.ml.feature.StopWordsRemover
- 
the stop words set to be filtered out
 Default: StopWords.English
 
- StopWordsRemover - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 A feature transformer that filters out stop words from input. 
- StopWordsRemover(String) - Constructor for class org.apache.spark.ml.feature.StopWordsRemover
-  
- StopWordsRemover() - Constructor for class org.apache.spark.ml.feature.StopWordsRemover
-  
- storageLevel() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-  
- storageLevel() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-  
- storageLevel() - Method in class org.apache.spark.storage.BlockStatus
-  
- storageLevel() - Method in class org.apache.spark.storage.BlockUpdatedInfo
-  
- storageLevel() - Method in class org.apache.spark.storage.RDDInfo
-  
- StorageLevel - Class in org.apache.spark.storage
- 
:: DeveloperApi ::
 Flags for controlling the storage of an RDD. 
- StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
-  
- storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
-  
- storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
-  
- storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel
- 
:: DeveloperApi ::
 Read StorageLevel object from ObjectInput stream. 
- StorageLevels - Class in org.apache.spark.api.java
- 
Expose some commonly useful storage level constants. 
- StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
-  
- StorageListener - Class in org.apache.spark.ui.storage
- 
:: DeveloperApi ::
 A SparkListener that prepares information to be displayed on the BlockManagerUI. 
- StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
-  
- StorageStatus - Class in org.apache.spark.storage
- 
:: DeveloperApi ::
 Storage information for each BlockManager. 
- StorageStatus(BlockManagerId, long) - Constructor for class org.apache.spark.storage.StorageStatus
-  
- StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus
- 
Create a storage status with an initial set of blocks, leaving the source unmodified. 
- storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
-  
- storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
-  
- storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
-  
- StorageStatusListener - Class in org.apache.spark.storage
- 
:: DeveloperApi ::
 A SparkListener that maintains executor storage status. 
- StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
-  
- store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
- 
Store an iterator of received data as a data block into Spark's memory. 
- store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
- 
Store the bytes of received data as a data block into Spark's memory. 
- store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
- 
Store a single item of received data to Spark's memory. 
- store(T) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store a single item of received data to Spark's memory. 
- store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store an ArrayBuffer of received data as a data block into Spark's memory. 
- store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store an ArrayBuffer of received data as a data block into Spark's memory. 
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store an iterator of received data as a data block into Spark's memory. 
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store an iterator of received data as a data block into Spark's memory. 
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store an iterator of received data as a data block into Spark's memory. 
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store an iterator of received data as a data block into Spark's memory. 
- store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store the bytes of received data as a data block into Spark's memory. 
- store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Store the bytes of received data as a data block into Spark's memory. 
- Strategy - Class in org.apache.spark.mllib.tree.configuration
- 
Stores all the configuration options for tree construction
 param:  algo  Learning goal. 
- Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int, double, int, double, boolean, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-  
- Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
- 
- STREAM() - Static method in class org.apache.spark.storage.BlockId
-  
- StreamBlockId - Class in org.apache.spark.storage
-  
- StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
-  
- streamId() - Method in class org.apache.spark.storage.StreamBlockId
-  
- streamId() - Method in class org.apache.spark.streaming.receiver.Receiver
- 
Get the unique identifier of the receiver input stream that this
 receiver is associated with. 
- streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-  
- streamIdToInputInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-  
- streamIdToNumRecords() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-  
- StreamingContext - Class in org.apache.spark.streaming
- 
Main entry point for Spark Streaming functionality. 
- StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
- 
Create a StreamingContext using an existing SparkContext. 
- StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
- 
Create a StreamingContext by providing the configuration necessary for a new SparkContext. 
- StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext
- 
Create a StreamingContext by providing the details necessary for creating a new SparkContext. 
- StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext
- 
Recreate a StreamingContext from a checkpoint file. 
- StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext
- 
Recreate a StreamingContext from a checkpoint file. 
- StreamingContext(String, SparkContext) - Constructor for class org.apache.spark.streaming.StreamingContext
- 
Recreate a StreamingContext from a checkpoint file using an existing SparkContext. 
- StreamingContextPythonHelper - Class in org.apache.spark.streaming
-  
- StreamingContextPythonHelper() - Constructor for class org.apache.spark.streaming.StreamingContextPythonHelper
-  
- StreamingContextState - Enum in org.apache.spark.streaming
- 
:: DeveloperApi ::
 Represents the state of a StreamingContext. 
- StreamingKMeans - Class in org.apache.spark.mllib.clustering
- 
StreamingKMeans provides methods for configuring a
 streaming k-means analysis, training the model on streaming data,
 and using the model to make predictions on streaming data. 
- StreamingKMeans(int, double, String) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
-  
- StreamingKMeans() - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
-  
- StreamingKMeansModel - Class in org.apache.spark.mllib.clustering
- 
StreamingKMeansModel extends MLlib's KMeansModel for streaming
 algorithms, so it can keep track of a continuously updated weight
 associated with each cluster, and also update the model by
 doing a single iteration of the standard k-means algorithm. 
- StreamingKMeansModel(Vector[], double[]) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeansModel
-  
- StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression
- 
:: DeveloperApi ::
 StreamingLinearAlgorithm implements methods for continuously
 training a generalized linear model on streaming data,
 and using it for prediction on (possibly different) streaming data. 
- StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-  
- StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
- 
Train or predict a linear regression model on streaming data. 
- StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
- 
Construct a StreamingLinearRegression object with default parameters:
 {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}. 
- StreamingListener - Interface in org.apache.spark.streaming.scheduler
- 
:: DeveloperApi ::
 A listener interface for receiving information about an ongoing streaming
 computation. 
- StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
-  
- StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
-  
- StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
-  
- StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
-  
- StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
-  
- StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
-  
- StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler
- 
:: DeveloperApi ::
 Base trait for events related to StreamingListener 
- StreamingListenerOutputOperationCompleted - Class in org.apache.spark.streaming.scheduler
-  
- StreamingListenerOutputOperationCompleted(OutputOperationInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
-  
- StreamingListenerOutputOperationStarted - Class in org.apache.spark.streaming.scheduler
-  
- StreamingListenerOutputOperationStarted(OutputOperationInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
-  
- StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
-  
- StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-  
- StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
-  
- StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-  
- StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
-  
- StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-  
- StreamingLogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
- 
Train or predict a logistic regression model on streaming data. 
- StreamingLogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
- 
Construct a StreamingLogisticRegression object with default parameters:
 {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}. 
- StreamingTest - Class in org.apache.spark.mllib.stat.test
-  
- StreamingTest() - Constructor for class org.apache.spark.mllib.stat.test.StreamingTest
-  
- StreamInputInfo - Class in org.apache.spark.streaming.scheduler
- 
:: DeveloperApi ::
 Track the information of input stream at specified batch time. 
- StreamInputInfo(int, long, Map<String, Object>) - Constructor for class org.apache.spark.streaming.scheduler.StreamInputInfo
-  
- string() - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type string.
 
- STRING() - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for nullable string type. 
- StringArrayParam - Class in org.apache.spark.ml.param
- 
:: DeveloperApi ::
 Specialized version of Param[Array[String]] for Java.
 
- StringArrayParam(Params, String, String, Function1<String[], Object>) - Constructor for class org.apache.spark.ml.param.StringArrayParam
-  
- StringArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.StringArrayParam
-  
- StringContains - Class in org.apache.spark.sql.sources
- 
A filter that evaluates to true iff the attribute evaluates to
 a string that contains the string value. 
- StringContains(String, String) - Constructor for class org.apache.spark.sql.sources.StringContains
-  
- StringEndsWith - Class in org.apache.spark.sql.sources
- 
A filter that evaluates to true iff the attribute evaluates to
 a string that ends with value. 
- StringEndsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringEndsWith
-  
- StringIndexer - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 A label indexer that maps a string column of labels to an ML column of label indices. 
- StringIndexer(String) - Constructor for class org.apache.spark.ml.feature.StringIndexer
-  
- StringIndexer() - Constructor for class org.apache.spark.ml.feature.StringIndexer
-  
- StringIndexerModel - Class in org.apache.spark.ml.feature
- 
- StringIndexerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.StringIndexerModel
-  
- StringIndexerModel(String[]) - Constructor for class org.apache.spark.ml.feature.StringIndexerModel
-  
- stringRddToDataFrameHolder(RDD<String>) - Method in class org.apache.spark.sql.SQLImplicits
- 
Creates a single column DataFrame from an RDD[String]. 
- stringResult() - Method in class org.apache.spark.sql.hive.HiveContext.QueryExecution
- 
Returns the result as a hive compatible sequence of strings. 
- StringRRDD<T> - Class in org.apache.spark.api.r
- 
An RDD that stores R objects as Array[String]. 
- StringRRDD(RDD<T>, byte[], String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.StringRRDD
-  
- StringStartsWith - Class in org.apache.spark.sql.sources
- 
A filter that evaluates to true iff the attribute evaluates to
 a string that starts with value. 
- StringStartsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringStartsWith
-  
- stringToText(String) - Static method in class org.apache.spark.SparkContext
-  
- StringType - Static variable in class org.apache.spark.sql.types.DataTypes
- 
Gets the StringType object. 
- StringType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The data type representing String values. 
- stringWritableConverter() - Static method in class org.apache.spark.SparkContext
-  
- stronglyConnectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps
- 
Compute the strongly connected component (SCC) of each vertex and return a graph with the
 vertex value containing the lowest vertex id in the SCC containing that vertex. 
- StronglyConnectedComponents - Class in org.apache.spark.graphx.lib
- 
Strongly connected components algorithm implementation. 
- StronglyConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.StronglyConnectedComponents
-  
- struct(Seq<StructField>) - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type struct. 
- struct(StructType) - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type struct. 
- struct(Column...) - Static method in class org.apache.spark.sql.functions
- 
Creates a new struct column. 
- struct(String, String...) - Static method in class org.apache.spark.sql.functions
- 
Creates a new struct column that composes multiple input columns. 
- struct(Seq<Column>) - Static method in class org.apache.spark.sql.functions
- 
Creates a new struct column. 
- struct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
- 
Creates a new struct column that composes multiple input columns. 
- StructField - Class in org.apache.spark.sql.types
- 
A field inside a StructType. 
- StructField(String, DataType, boolean, Metadata) - Constructor for class org.apache.spark.sql.types.StructField
-  
- StructField() - Constructor for class org.apache.spark.sql.types.StructField
- 
No-arg constructor for kryo. 
- StructType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 A  StructType object can be constructed by 
- StructType(StructField[]) - Constructor for class org.apache.spark.sql.types.StructType
-  
- StructType() - Constructor for class org.apache.spark.sql.types.StructType
- 
No-arg constructor for kryo. 
- subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.Graph
- 
Restricts the graph to only the vertices and edges satisfying the predicates. 
- subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-  
- submissionTime() - Method in class org.apache.spark.scheduler.StageInfo
- 
When this stage was submitted from the DAGScheduler to a TaskScheduler. 
- submissionTime() - Method in interface org.apache.spark.SparkStageInfo
-  
- submissionTime() - Method in class org.apache.spark.SparkStageInfoImpl
-  
- submissionTime() - Method in class org.apache.spark.status.api.v1.JobData
-  
- submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-  
- submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext
- 
Submit a job for execution and return a FutureJob holding the result. 
- subsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-  
- subsetAccuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
- 
Returns subset accuracy
 (for equal sets of labels) 
- substitutor() - Method in class org.apache.spark.sql.hive.HiveContext
-  
- substr(Column, Column) - Method in class org.apache.spark.sql.Column
- 
An expression that returns a substring. 
- substr(int, int) - Method in class org.apache.spark.sql.Column
- 
An expression that returns a substring. 
- substring(Column, int, int) - Static method in class org.apache.spark.sql.functions
- 
Substring starts at pos and is of length len when str is String type, or
 returns the slice of byte array that starts at pos in byte and is of length len when str is Binary type. 
- substring_index(Column, String, int) - Static method in class org.apache.spark.sql.functions
- 
Returns the substring from string str before count occurrences of the delimiter delim. 
- subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Return an RDD with the elements from `this` that are not in `other`. 
- subtract(Dataset<T>) - Method in class org.apache.spark.sql.Dataset
- 
Returns a new  Dataset where any elements present in  other have been removed. 
- subtract(Vector) - Method in class org.apache.spark.util.Vector
-  
- subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return an RDD with the pairs from `this` whose keys are not in `other`. 
- subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return an RDD with the pairs from `this` whose keys are not in `other`. 
- subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
- 
Return an RDD with the pairs from `this` whose keys are not in `other`. 
- subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return an RDD with the pairs from `this` whose keys are not in `other`. 
- subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return an RDD with the pairs from `this` whose keys are not in `other`. 
- subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
- 
Return an RDD with the pairs from `this` whose keys are not in `other`. 
- succeededTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-  
- Success - Class in org.apache.spark
- 
:: DeveloperApi ::
 Task succeeded. 
- Success() - Constructor for class org.apache.spark.Success
-  
- successful() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Add up the elements in this RDD. 
- sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
- 
Add up the elements in this RDD. 
- sum(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the sum of all values in the expression. 
- sum(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the sum of all values in the given column. 
- sum(String...) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the sum for each numeric column for each group. 
- sum(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
- 
Compute the sum for each numeric column for each group. 
- sum() - Method in class org.apache.spark.util.StatCounter
-  
- sum() - Method in class org.apache.spark.util.Vector
-  
- sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Approximate operation to return the sum within a timeout. 
- sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
- 
Approximate operation to return the sum within a timeout. 
- sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
- 
Approximate operation to return the sum within a timeout. 
- sumDistinct(Column) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the sum of distinct values in the expression. 
- sumDistinct(String) - Static method in class org.apache.spark.sql.functions
- 
Aggregate function: returns the sum of distinct values in the expression. 
- summary() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
- 
Gets summary of model on training set. 
- summary() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
- 
Gets summary (e.g. 
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
- 
Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2 
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
- 
Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2 
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.mllib.tree.RandomForest
- 
List of supported feature subset sampling strategies. 
- supportedImpurities() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
- 
Accessor for supported impurities: entropy, gini 
- supportedImpurities() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
- 
Accessor for supported impurity settings: entropy, gini 
- supportedImpurities() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
- 
Accessor for supported impurities: variance 
- supportedImpurities() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
- 
Accessor for supported impurity settings: variance 
- supportedLossTypes() - Static method in class org.apache.spark.ml.classification.GBTClassifier
- 
Accessor for supported loss settings: logistic 
- supportedLossTypes() - Static method in class org.apache.spark.ml.regression.GBTRegressor
- 
Accessor for supported loss settings: squared (L2), absolute (L1) 
- supportedModelTypes() - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- supportsRelocationOfSerializedObjects() - Method in class org.apache.spark.serializer.KryoSerializer
-  
- SVDPlusPlus - Class in org.apache.spark.graphx.lib
- 
Implementation of SVD++ algorithm. 
- SVDPlusPlus() - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus
-  
- SVDPlusPlus.Conf - Class in org.apache.spark.graphx.lib
- 
Configuration parameters for SVDPlusPlus. 
- SVDPlusPlus.Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-  
- SVMDataGenerator - Class in org.apache.spark.mllib.util
- 
:: DeveloperApi ::
 Generate sample data used for SVM. 
- SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
-  
- SVMModel - Class in org.apache.spark.mllib.classification
- 
Model for Support Vector Machines (SVMs). 
- SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
-  
- SVMWithSGD - Class in org.apache.spark.mllib.classification
- 
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent. 
- SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD
- 
Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100,
 regParam: 0.01, miniBatchFraction: 1.0}. 
- symbolToColumn(Symbol) - Method in class org.apache.spark.sql.SQLImplicits
- 
An implicit conversion that turns a Scala  Symbol into a  Column. 
- SYSTEM_DEFAULT() - Static method in class org.apache.spark.sql.types.DecimalType
-  
- systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
-  
- t() - Method in class org.apache.spark.SerializableWritable
-  
- table(String) - Method in class org.apache.spark.sql.DataFrameReader
- 
- table(String) - Method in class org.apache.spark.sql.SQLContext
-  
- tableNames() - Method in class org.apache.spark.sql.SQLContext
-  
- tableNames(String) - Method in class org.apache.spark.sql.SQLContext
-  
- tables() - Method in class org.apache.spark.sql.SQLContext
-  
- tables(String) - Method in class org.apache.spark.sql.SQLContext
-  
- TableScan - Interface in org.apache.spark.sql.sources
- 
::DeveloperApi::
 A BaseRelation that can produce all of its tuples as an RDD of Row objects. 
- tachyonFolderName() - Method in class org.apache.spark.SparkContext
-  
- tag() - Method in class org.apache.spark.sql.types.BinaryType
-  
- tag() - Method in class org.apache.spark.sql.types.BooleanType
-  
- tag() - Method in class org.apache.spark.sql.types.ByteType
-  
- tag() - Method in class org.apache.spark.sql.types.DateType
-  
- tag() - Method in class org.apache.spark.sql.types.DecimalType
-  
- tag() - Method in class org.apache.spark.sql.types.DoubleType
-  
- tag() - Method in class org.apache.spark.sql.types.FloatType
-  
- tag() - Method in class org.apache.spark.sql.types.IntegerType
-  
- tag() - Method in class org.apache.spark.sql.types.LongType
-  
- tag() - Method in class org.apache.spark.sql.types.ShortType
-  
- tag() - Method in class org.apache.spark.sql.types.StringType
-  
- tag() - Method in class org.apache.spark.sql.types.TimestampType
-  
- take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Take the first num elements of the RDD. 
- take(int) - Method in class org.apache.spark.rdd.RDD
- 
Take the first num elements of the RDD. 
- take(int) - Method in class org.apache.spark.sql.DataFrame
- 
- take(int) - Method in class org.apache.spark.sql.Dataset
- 
Returns the first  num elements of this  Dataset as an array. 
- takeAsList(int) - Method in class org.apache.spark.sql.DataFrame
- 
Returns the first  n rows in the  DataFrame as a list. 
- takeAsList(int) - Method in class org.apache.spark.sql.Dataset
- 
Returns the first  num elements of this  Dataset as a list. 
- takeAsync(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
The asynchronous version of the take action, which returns a
 future for retrieving the first num elements of this RDD. 
- takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions
- 
Returns a future for retrieving the first num elements of the RDD. 
- takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Returns the first k (smallest) elements from this RDD as defined by
 the specified Comparator[T] and maintains the order. 
- takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Returns the first k (smallest) elements from this RDD using the
 natural ordering for T while maintaining the order. 
- takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
- 
Returns the first k (smallest) elements from this RDD as defined by the specified
 implicit Ordering[T] and maintains the ordering. 
- takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-  
- takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-  
- takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
- 
Return a fixed-size sampled subset of this RDD in an array 
- tallSkinnyQR(boolean) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
- 
- tan(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the tangent of the given value. 
- tan(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the tangent of the given column. 
- tanh(Column) - Static method in class org.apache.spark.sql.functions
- 
Computes the hyperbolic tangent of the given value. 
- tanh(String) - Static method in class org.apache.spark.sql.functions
- 
Computes the hyperbolic tangent of the given column. 
- targetStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-  
- targetStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-  
- task() - Method in class org.apache.spark.CleanupTaskWeakReference
-  
- taskAttemptId() - Method in class org.apache.spark.TaskContext
- 
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts
 will share the same attempt ID). 
- TaskCommitDenied - Class in org.apache.spark
- 
:: DeveloperApi ::
 Task requested the driver to commit, but was denied. 
- TaskCommitDenied(int, int, int) - Constructor for class org.apache.spark.TaskCommitDenied
-  
- TaskCompletionListener - Interface in org.apache.spark.util
- 
:: DeveloperApi :: 
- TaskContext - Class in org.apache.spark
- 
Contextual information about a task which can be read or mutated during
 execution. 
- TaskContext() - Constructor for class org.apache.spark.TaskContext
-  
- TaskData - Class in org.apache.spark.status.api.v1
-  
- TaskEndReason - Interface in org.apache.spark
- 
:: DeveloperApi ::
 Various possible reasons why a task ended. 
- TaskFailedReason - Interface in org.apache.spark
- 
:: DeveloperApi ::
 Various possible reasons why a task failed. 
- TaskFailureListener - Interface in org.apache.spark.util
- 
:: DeveloperApi :: 
- taskId() - Method in class org.apache.spark.scheduler.local.KillTask
-  
- taskId() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-  
- taskId() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- taskId() - Method in class org.apache.spark.status.api.v1.TaskData
-  
- taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
-  
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-  
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-  
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-  
- TaskInfo - Class in org.apache.spark.scheduler
- 
:: DeveloperApi ::
 Information about a running task attempt inside a TaskSet. 
- TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
-  
- TaskKilled - Class in org.apache.spark
- 
:: DeveloperApi ::
 Task was killed intentionally and needs to be rescheduled. 
- TaskKilled() - Constructor for class org.apache.spark.TaskKilled
-  
- TaskKilledException - Exception in org.apache.spark
- 
:: DeveloperApi ::
 Exception thrown when a task is explicitly killed (i.e., task failure is expected). 
- TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
-  
- taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
-  
- TaskLocality - Class in org.apache.spark.scheduler
-  
- TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
-  
- taskLocality() - Method in class org.apache.spark.status.api.v1.TaskData
-  
- taskLocalityPreferences() - Method in class org.apache.spark.scheduler.StageInfo
-  
- TaskMetricDistributions - Class in org.apache.spark.status.api.v1
-  
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-  
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-  
- taskMetrics() - Method in class org.apache.spark.status.api.v1.TaskData
-  
- TaskMetrics - Class in org.apache.spark.status.api.v1
-  
- taskMetrics() - Method in class org.apache.spark.TaskContext
- 
::DeveloperApi:: 
- TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
-  
- TaskResultBlockId - Class in org.apache.spark.storage
-  
- TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
-  
- TaskResultLost - Class in org.apache.spark
- 
:: DeveloperApi ::
 The task finished successfully, but the result was lost from the executor's block manager before
 it was fetched. 
- TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
-  
- tasks() - Method in class org.apache.spark.status.api.v1.StageData
-  
- TaskSorting - Enum in org.apache.spark.status.api.v1
-  
- taskTime() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-  
- taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-  
- TEST() - Static method in class org.apache.spark.storage.BlockId
-  
- TestResult<DF> - Interface in org.apache.spark.mllib.stat.test
- 
Trait for hypothesis test results. 
- text(String...) - Method in class org.apache.spark.sql.DataFrameReader
- 
Loads a text file and returns a  DataFrame with a single string column named "value". 
- text(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
- 
Loads a text file and returns a  DataFrame with a single string column named "value". 
- text(String) - Method in class org.apache.spark.sql.DataFrameWriter
- 
Saves the content of the  DataFrame in a text file at the specified path. 
- textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Read a text file from HDFS, a local file system (available on all nodes), or any
 Hadoop-supported file system URI, and return it as an RDD of Strings. 
- textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
- 
Read a text file from HDFS, a local file system (available on all nodes), or any
 Hadoop-supported file system URI, and return it as an RDD of Strings. 
- textFile(String, int) - Method in class org.apache.spark.SparkContext
- 
Read a text file from HDFS, a local file system (available on all nodes), or any
 Hadoop-supported file system URI, and return it as an RDD of Strings. 
- textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create an input stream that monitors a Hadoop-compatible filesystem
 for new files and reads them as text files (using key as LongWritable, value
 as Text and input format as TextInputFormat). 
- textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create an input stream that monitors a Hadoop-compatible filesystem
 for new files and reads them as text files (using key as LongWritable, value
 as Text and input format as TextInputFormat). 
- theta() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-  
- theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-  
- threshold() - Method in class org.apache.spark.ml.feature.Binarizer
- 
Param for threshold used to binarize continuous features. 
- threshold() - Method in class org.apache.spark.ml.tree.ContinuousSplit
-  
- threshold() - Method in class org.apache.spark.mllib.tree.model.Split
-  
- thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
- 
Returns thresholds in descending order. 
- throwBalls() - Method in class org.apache.spark.rdd.PartitionCoalescer
-  
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
-  
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-  
- time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-  
- time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-  
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
-  
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-  
- time() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
-  
- time() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-  
- Time - Class in org.apache.spark.streaming
- 
This is a simple class that represents an absolute instant of time. 
- Time(long) - Constructor for class org.apache.spark.streaming.Time
-  
- timeout(Duration) - Method in class org.apache.spark.streaming.StateSpec
- 
Set the duration after which the state of an idle key will be removed. 
- times(int) - Method in class org.apache.spark.streaming.Duration
-  
- timestamp() - Method in class org.apache.spark.sql.ColumnName
- 
Creates a new StructField of type timestamp. 
- TIMESTAMP() - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for nullable timestamp type. 
- TimestampType - Static variable in class org.apache.spark.sql.types.DataTypes
- 
Gets the TimestampType object. 
- TimestampType - Class in org.apache.spark.sql.types
- 
:: DeveloperApi ::
 The data type representing java.sql.Timestamp values. 
- TimeTrackingOutputStream - Class in org.apache.spark.storage
- 
Intercepts write calls and tracks total time spent writing in order to update shuffle write
 metrics. 
- TimeTrackingOutputStream(ShuffleWriteMetrics, OutputStream) - Constructor for class org.apache.spark.storage.TimeTrackingOutputStream
-  
- timeUnit() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-  
- TIMING_DATA() - Static method in class org.apache.spark.api.r.SpecialLengths
-  
- to(Time, Duration) - Method in class org.apache.spark.streaming.Time
-  
- to_date(Column) - Static method in class org.apache.spark.sql.functions
- 
Converts the column into DateType. 
- to_utc_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Assumes given timestamp is in given timezone and converts to UTC. 
- toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
- toArray() - Method in class org.apache.spark.input.PortableDataStream
- 
Read the file as a byte array 
- toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Converts to a dense array in column major. 
- toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
-  
- toArray() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Converts the instance to a double array. 
- toArray() - Method in class org.apache.spark.rdd.RDD
- 
Return an array that contains all of the elements in this RDD. 
- toAttributes() - Method in class org.apache.spark.sql.types.StructType
-  
- toBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
-  
- toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-  
- toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-  
- toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
- 
Collects data and assembles a local dense breeze matrix (for test only). 
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Converts to a breeze matrix. 
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Converts the instance to a breeze vector. 
- toByte() - Method in class org.apache.spark.sql.types.Decimal
-  
- toColumn(Encoder<B>, Encoder<O>) - Method in class org.apache.spark.sql.expressions.Aggregator
- 
Returns this Aggregator as a TypedColumn that can be used in Dataset or DataFrame operations. 
- toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
- 
Converts to CoordinateMatrix. 
- toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
A description of this RDD and its recursive dependencies for debugging. 
- toDebugString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
- 
Print the full model to a string. 
- toDebugString() - Method in class org.apache.spark.rdd.RDD
- 
A description of this RDD and its recursive dependencies for debugging. 
- toDebugString() - Method in class org.apache.spark.SparkConf
- 
Return a string listing all keys and values, one per line. 
- toDebugString() - Method in class org.apache.spark.sql.types.Decimal
-  
- toDegrees(Column) - Static method in class org.apache.spark.sql.functions
- 
Converts an angle measured in radians to an approximately equivalent angle measured in degrees. 
- toDegrees(String) - Static method in class org.apache.spark.sql.functions
- 
Converts an angle measured in radians to an approximately equivalent angle measured in degrees. 
- toDense() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
- 
Generate a DenseMatrix from the given SparseMatrix. 
- toDense() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Converts this vector to a dense vector. 
- toDF(String...) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with columns renamed. 
- toDF() - Method in class org.apache.spark.sql.DataFrame
- 
Returns the object itself. 
- toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
- 
Returns a new  DataFrame with columns renamed. 
- toDF() - Method in class org.apache.spark.sql.DataFrameHolder
-  
- toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrameHolder
-  
- toDF() - Method in class org.apache.spark.sql.Dataset
- 
Converts this strongly typed collection of data to a generic DataFrame. 
- toDouble() - Method in class org.apache.spark.sql.types.Decimal
-  
- toDS() - Method in class org.apache.spark.sql.Dataset
- 
- toDS() - Method in class org.apache.spark.sql.DatasetHolder
-  
- toEdgeTriplet() - Method in class org.apache.spark.graphx.EdgeContext
- 
Converts the edge and vertex properties into an EdgeTriplet for convenience. 
- toErrorString() - Method in class org.apache.spark.ExceptionFailure
-  
- toErrorString() - Method in class org.apache.spark.ExecutorLostFailure
-  
- toErrorString() - Method in class org.apache.spark.FetchFailed
-  
- toErrorString() - Static method in class org.apache.spark.Resubmitted
-  
- toErrorString() - Method in class org.apache.spark.TaskCommitDenied
-  
- toErrorString() - Method in interface org.apache.spark.TaskFailedReason
- 
Error message displayed in the web UI. 
- toErrorString() - Static method in class org.apache.spark.TaskKilled
-  
- toErrorString() - Static method in class org.apache.spark.TaskResultLost
-  
- toErrorString() - Static method in class org.apache.spark.UnknownReason
-  
- toFloat() - Method in class org.apache.spark.sql.types.Decimal
-  
- toFormattedString() - Method in class org.apache.spark.streaming.Duration
-  
- toHiveString(Tuple2<Object, DataType>) - Static method in class org.apache.spark.sql.hive.HiveContext
-  
- toHiveStructString(Tuple2<Object, DataType>) - Static method in class org.apache.spark.sql.hive.HiveContext
- 
Hive outputs fields of structs slightly differently than top level attributes. 
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
- 
Converts to IndexedRowMatrix. 
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-  
- toInt() - Method in class org.apache.spark.sql.types.Decimal
-  
- toInt() - Method in class org.apache.spark.storage.StorageLevel
-  
- toJavaBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
-  
- toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
- 
Convert to a JavaDStream 
- toJavaRDD() - Method in class org.apache.spark.rdd.RDD
-  
- toJavaRDD() - Method in class org.apache.spark.sql.DataFrame
- 
- toJson() - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- toJson() - Method in class org.apache.spark.mllib.linalg.SparseVector
-  
- toJson() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Converts the vector to a JSON string. 
- toJSON() - Method in class org.apache.spark.sql.DataFrame
- 
Returns the content of the DataFrame as an RDD of JSON strings. 
- Tokenizer - Class in org.apache.spark.ml.feature
- 
:: Experimental ::
 A tokenizer that converts the input string to lowercase and then splits it by white spaces. 
- Tokenizer(String) - Constructor for class org.apache.spark.ml.feature.Tokenizer
-  
- Tokenizer() - Constructor for class org.apache.spark.ml.feature.Tokenizer
-  
- toLocal() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
- 
Convert this distributed model to a local representation. 
- toLocal() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
Convert model to a local model. 
- toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Return an iterator that contains all of the elements in this RDD. 
- toLocalIterator() - Method in class org.apache.spark.rdd.RDD
- 
Return an iterator that contains all of the elements in this RDD. 
- toLocalMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
- 
Collect the distributed matrix on the driver as a `DenseMatrix`. 
- toLong() - Method in class org.apache.spark.sql.types.Decimal
-  
- toLowercase() - Method in class org.apache.spark.ml.feature.RegexTokenizer
- 
Indicates whether to convert all characters to lowercase before tokenizing. 
- toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute
- 
Converts to ML metadata with some existing metadata. 
- toMetadata() - Method in class org.apache.spark.ml.attribute.Attribute
- 
Converts to ML metadata 
- toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Converts to ML metadata with some existing metadata. 
- toMetadata() - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Converts to ML metadata 
- toOld() - Method in interface org.apache.spark.ml.tree.Split
- 
Convert to old Split format 
- top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Returns the top k (largest) elements from this RDD as defined by
 the specified Comparator[T] and maintains the order. 
- top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Returns the top k (largest) elements from this RDD using the
 natural ordering for T and maintains the order. 
- top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-  
- toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.dstream.DStream
-  
- toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
- 
Deprecated.
As of 1.3.0, replaced by implicit functions in the DStream companion object.
 This is kept here only for backward compatibility. 
 
- topByKey(int, Ordering<V>) - Method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions
- 
Returns the top k (largest) elements for each key from this RDD as defined by the specified
 implicit Ordering[T]. 
- topDocumentsPerTopic(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
Return the top documents for each topic 
- topic() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-  
- topicAndPartition() - Method in class org.apache.spark.streaming.kafka.OffsetRange
- 
Kafka TopicAndPartition object, for convenience 
- topicAssignments() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
Return the top topic for each (doc, term) pair. 
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-  
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-  
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.LDAModel
- 
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
 distributions over terms. 
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- topicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
For each document in the training set, return the distribution over topics for that document
 ("theta_doc"). 
- topicDistributions(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
- 
Predicts the topic mixture distribution for each document (often called "theta" in the
 literature). 
- topicDistributions(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
- 
Java-friendly version of topicDistributions
 
- topics() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- topicsMatrix() - Method in class org.apache.spark.ml.clustering.LDAModel
- 
Inferred topics, where each topic is represented by a distribution over terms. 
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
Inferred topics, where each topic is represented by a distribution over terms. 
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LDAModel
- 
Inferred topics, where each topic is represented by a distribution over terms. 
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-  
- toPMML(StreamResult) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
- 
Export the model to the stream result in PMML format 
- toPMML(String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
- 
:: Experimental ::
 Export the model to a local file in PMML format 
- toPMML(SparkContext, String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
- 
:: Experimental ::
 Export the model to a directory on a distributed file system in PMML format 
- toPMML(OutputStream) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
- 
:: Experimental ::
 Export the model to the OutputStream in PMML format 
- toPMML() - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
- 
:: Experimental ::
 Export the model to a String in PMML format 
- topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-  
- topTopicsPerDocument(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
- 
For each document, return the top k weighted topics for that document and their weights. 
- toRadians(Column) - Static method in class org.apache.spark.sql.functions
- 
Converts an angle measured in degrees to an approximately equivalent angle measured in radians. 
- toRadians(String) - Static method in class org.apache.spark.sql.functions
- 
Converts an angle measured in degrees to an approximately equivalent angle measured in radians. 
- toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-  
- toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-  
- toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-  
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-  
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-  
- TorrentBroadcastFactory - Class in org.apache.spark.broadcast
- 
A Broadcast implementation that uses a BitTorrent-like
 protocol to do a distributed transfer of the broadcasted data to the executors. 
- TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
-  
- toSchemaRDD() - Method in class org.apache.spark.sql.DataFrame
- 
Deprecated.
As of 1.3.0, replaced by toDF(). This will be removed in Spark 2.0.
 
 
- toSeq() - Method in class org.apache.spark.ml.param.ParamMap
- 
Converts this param map to a sequence of param pairs. 
- toSeq() - Method in interface org.apache.spark.sql.Row
- 
Return a Scala Seq representing the row. 
- toShort() - Method in class org.apache.spark.sql.types.Decimal
-  
- toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-  
- toSparse() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
- 
Generate a SparseMatrix from the given DenseMatrix.
 
- toSparse() - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- toSparse() - Method in class org.apache.spark.mllib.linalg.SparseVector
-  
- toSparse() - Method in interface org.apache.spark.mllib.linalg.Vector
- 
Converts this vector to a sparse vector with all explicit zeros removed. 
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-  
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-  
- toString() - Method in class org.apache.spark.Accumulable
-  
- toString() - Method in class org.apache.spark.api.java.JavaRDD
-  
- toString() - Method in class org.apache.spark.broadcast.Broadcast
-  
- toString() - Method in class org.apache.spark.graphx.EdgeDirection
-  
- toString() - Method in class org.apache.spark.graphx.EdgeTriplet
-  
- toString() - Method in class org.apache.spark.ml.attribute.Attribute
-  
- toString() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-  
- toString() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-  
- toString() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-  
- toString() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-  
- toString() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-  
- toString() - Method in class org.apache.spark.ml.feature.RFormula
-  
- toString() - Method in class org.apache.spark.ml.feature.RFormulaModel
-  
- toString() - Method in class org.apache.spark.ml.param.Param
-  
- toString() - Method in class org.apache.spark.ml.param.ParamMap
-  
- toString() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-  
- toString() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-  
- toString() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-  
- toString() - Method in class org.apache.spark.ml.tree.InternalNode
-  
- toString() - Method in class org.apache.spark.ml.tree.LeafNode
-  
- toString() - Method in interface org.apache.spark.ml.util.Identifiable
-  
- toString() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-  
- toString() - Method in class org.apache.spark.mllib.classification.SVMModel
-  
- toString() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
-  
- toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
-  
- toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
A human readable representation of the matrix 
- toString(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
A human readable representation of the matrix with maximum lines and width 
- toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
-  
- toString() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
- 
Print a summary of the model. 
- toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-  
- toString() - Method in class org.apache.spark.mllib.stat.test.BinarySample
-  
- toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-  
- toString() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
-  
- toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult
- 
String explaining the hypothesis test result. 
- toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
- 
Print a summary of the model. 
- toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-  
- toString() - Method in class org.apache.spark.mllib.tree.model.Node
-  
- toString() - Method in class org.apache.spark.mllib.tree.model.Predict
-  
- toString() - Method in class org.apache.spark.mllib.tree.model.Split
-  
- toString() - Method in class org.apache.spark.partial.BoundedDouble
-  
- toString() - Method in class org.apache.spark.partial.PartialResult
-  
- toString() - Method in class org.apache.spark.rdd.RDD
-  
- toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
-  
- toString() - Method in class org.apache.spark.scheduler.SplitInfo
-  
- toString() - Method in class org.apache.spark.SerializableWritable
-  
- toString() - Method in class org.apache.spark.sql.Column
-  
- toString() - Method in interface org.apache.spark.sql.Row
-  
- toString() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-  
- toString() - Method in class org.apache.spark.sql.types.Decimal
-  
- toString() - Method in class org.apache.spark.sql.types.DecimalType
-  
- toString() - Method in class org.apache.spark.sql.types.Metadata
-  
- toString() - Method in class org.apache.spark.sql.types.StructField
-  
- toString() - Method in class org.apache.spark.storage.BlockId
-  
- toString() - Method in class org.apache.spark.storage.BlockManagerId
-  
- toString() - Method in class org.apache.spark.storage.RDDInfo
-  
- toString() - Method in class org.apache.spark.storage.StorageLevel
-  
- toString() - Method in class org.apache.spark.streaming.Duration
-  
- toString() - Method in class org.apache.spark.streaming.kafka.Broker
-  
- toString() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-  
- toString() - Method in class org.apache.spark.streaming.State
-  
- toString() - Method in class org.apache.spark.streaming.Time
-  
- toString() - Method in class org.apache.spark.util.MutablePair
-  
- toString() - Method in class org.apache.spark.util.StatCounter
-  
- toString() - Method in class org.apache.spark.util.Vector
-  
- toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute
- 
Converts to a StructField with some existing metadata.
 
- toStructField() - Method in class org.apache.spark.ml.attribute.Attribute
- 
Converts to a StructField.
 
- toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Converts to a StructField with some existing metadata. 
- toStructField() - Method in class org.apache.spark.ml.attribute.AttributeGroup
- 
Converts to a StructField. 
- totalBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-  
- totalBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-  
- totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-  
- totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
- 
Time taken for all the jobs of this batch to finish processing from the time they
 were submitted. 
- totalDuration() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- totalInputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- totalIterations() - Method in interface org.apache.spark.ml.classification.LogisticRegressionTrainingSummary
- 
Number of training iterations until termination 
- totalIterations() - Method in class org.apache.spark.ml.regression.LinearRegressionTrainingSummary
-  
- totalShuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- totalShuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- totalTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-  
- toTuple() - Method in class org.apache.spark.graphx.EdgeTriplet
-  
- toUnscaledLong() - Method in class org.apache.spark.sql.types.Decimal
-  
- train(DataFrame) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-  
- train(DataFrame) - Method in class org.apache.spark.ml.classification.GBTClassifier
-  
- train(DataFrame) - Method in class org.apache.spark.ml.classification.LogisticRegression
-  
- train(DataFrame) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
- 
Train a model using the given dataset and parameters. 
- train(DataFrame) - Method in class org.apache.spark.ml.classification.NaiveBayes
-  
- train(DataFrame) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-  
- train(DataFrame) - Method in class org.apache.spark.ml.Predictor
- 
Train a model using the given dataset and parameters. 
- train(RDD<ALS.Rating<ID>>, int, int, int, int, double, boolean, double, boolean, StorageLevel, StorageLevel, int, long, ClassTag<ID>, Ordering<ID>) - Static method in class org.apache.spark.ml.recommendation.ALS
-  
- train(DataFrame) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-  
- train(DataFrame) - Method in class org.apache.spark.ml.regression.GBTRegressor
-  
- train(DataFrame) - Method in class org.apache.spark.ml.regression.LinearRegression
-  
- train(DataFrame) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-  
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
- 
Train a logistic regression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
- 
Train a logistic regression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
- 
Train a logistic regression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
- 
Train a logistic regression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- train(RDD<LabeledPoint>, double, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-  
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
- 
Train an SVM model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
- 
Train an SVM model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
- 
Train an SVM model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
- 
Train an SVM model given an RDD of (label, features) pairs. 
- train(RDD<Vector>, int, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans
- 
Trains a k-means model using the given set of parameters. 
- train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
- 
Trains a k-means model using the given set of parameters. 
- train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
- 
Trains a k-means model using specified parameters and the default values for unspecified. 
- train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
- 
Trains a k-means model using specified parameters and the default values for unspecified. 
- train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-  
- train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-  
- train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-  
- train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-  
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
- 
Train a Lasso model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
- 
Train a Lasso model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
- 
Train a Lasso model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
- 
Train a Lasso model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
- 
Train a LinearRegression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
- 
Train a LinearRegression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
- 
Train a LinearRegression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
- 
Train a LinearRegression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
- 
Train a RidgeRegression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
- 
Train a RidgeRegression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
- 
Train a RidgeRegression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
- 
Train a RidgeRegression model given an RDD of (label, features) pairs. 
- train(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Method to train a decision tree model. 
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Method to train a decision tree model. 
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Method to train a decision tree model. 
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Method to train a decision tree model. 
- train(RDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
- 
Method to train a gradient boosting model. 
- train(JavaRDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
- 
Java-friendly API for GradientBoostedTrees$.train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.BoostingStrategy)
 
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Method to train a decision tree model for binary or multiclass classification. 
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Java-friendly API for DecisionTree$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
 
- trainClassifier(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
- 
Method to train a random forest model for binary or multiclass classification. 
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
- 
Method to train a random forest model for binary or multiclass classification. 
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
- 
Java-friendly API for RandomForest$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
 
- trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-  
- trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-  
- trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-  
- trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-  
- trainingLogLikelihood() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
- 
Log likelihood of the observed tokens in the training set,
 given the current parameter estimates:
  log P(docs | topics, topic distributions for docs, Dirichlet hyperparameters) 
- trainOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Update the clustering model by training on batches of data from a DStream. 
- trainOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
- 
Java-friendly version of trainOn.
 
- trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
Update the model by training on batches of data from a DStream. 
- trainOn(JavaDStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
- 
Java-friendly version of trainOn.
 
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Method to train a decision tree model for regression. 
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
- 
Java-friendly API for DecisionTree$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
 
- trainRegressor(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
- 
Method to train a random forest model for regression. 
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
- 
Method to train a random forest model for regression. 
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
- 
Java-friendly API for RandomForest$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
 
- TrainValidationSplit - Class in org.apache.spark.ml.tuning
- 
:: Experimental ::
 Validation for hyper-parameter tuning. 
- TrainValidationSplit(String) - Constructor for class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- TrainValidationSplit() - Constructor for class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- TrainValidationSplitModel - Class in org.apache.spark.ml.tuning
- 
:: Experimental ::
 Model from train validation split. 
- transform(DataFrame) - Method in class org.apache.spark.ml.classification.ClassificationModel
- 
Transforms dataset by reading from featuresCol, and appending new columns as specified by
 parameters:
  - predicted labels as predictionCol of type Double
  - raw predictions (confidences) as rawPredictionCol of type Vector.
 
- transform(DataFrame) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
- 
Transforms dataset by reading from featuresCol, and appending new columns as specified by
 parameters:
  - predicted labels as predictionCol of type Double
  - raw predictions (confidences) as rawPredictionCol of type Vector
  - probability of each class as probabilityCol of type Vector.
 
- transform(DataFrame) - Method in class org.apache.spark.ml.clustering.KMeansModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.clustering.LDAModel
- 
Transforms the input dataset. 
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.Binarizer
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.Bucketizer
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.ColumnPruner
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.HashingTF
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.IDFModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.IndexToString
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.Interaction
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.PCAModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.RFormulaModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.SQLTransformer
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorAssembler
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.Word2VecModel
- 
Transform a sentence column to a vector column to represent the whole sentence. 
- transform(DataFrame) - Method in class org.apache.spark.ml.PipelineModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.PredictionModel
- 
Transforms dataset by reading from featuresCol, calling predict(), and storing
 the predictions as a new column predictionCol.
 
- transform(DataFrame) - Method in class org.apache.spark.ml.recommendation.ALSModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-  
- transform(DataFrame, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer
- 
Transforms the dataset with optional parameters. 
- transform(DataFrame, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer
- 
Transforms the dataset with optional parameters. 
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Transformer
- 
Transforms the dataset with provided parameter map as additional parameters. 
- transform(DataFrame) - Method in class org.apache.spark.ml.Transformer
- 
Transforms the input dataset. 
- transform(DataFrame) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-  
- transform(DataFrame) - Method in class org.apache.spark.ml.UnaryTransformer
-  
- transform(Vector) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
- 
Applies transformation on a vector. 
- transform(Vector) - Method in class org.apache.spark.mllib.feature.ElementwiseProduct
- 
Does the Hadamard product transformation. 
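
The Hadamard product multiplies two vectors component by component. The following is a minimal plain-Java sketch of that operation, not Spark's implementation; the class name HadamardSketch and the use of double[] in place of Spark's Vector type are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: illustrates an element-wise product.
class HadamardSketch {
    /** Multiplies each component of v by the matching component of scalingVec. */
    static double[] hadamard(double[] scalingVec, double[] v) {
        if (scalingVec.length != v.length) {
            throw new IllegalArgumentException("vector sizes differ");
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = scalingVec[i] * v[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] scaled = hadamard(new double[]{0.0, 1.0, 2.0}, new double[]{1.0, 2.0, 3.0});
        System.out.println(Arrays.toString(scaled)); // prints [0.0, 2.0, 6.0]
    }
}
```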
- transform(Iterable<Object>) - Method in class org.apache.spark.mllib.feature.HashingTF
- 
Transforms the input document into a sparse term frequency vector. 
- transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF
- 
Transforms the input document into a sparse term frequency vector (Java version). 
- transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
- 
Transforms the input document to term frequency vectors. 
- transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
- 
Transforms the input document to term frequency vectors (Java version). 
- transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
- 
Transforms term frequency (TF) vectors to TF-IDF vectors. 
- transform(Vector) - Method in class org.apache.spark.mllib.feature.IDFModel
- 
Transforms a term frequency (TF) vector to a TF-IDF vector. 
- transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
- 
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version). 
- transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer
- 
Applies unit length normalization on a vector. 
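
Unit length normalization divides a vector by its norm so that the result has norm 1. The sketch below assumes the Euclidean (L2) norm and uses double[] in place of Spark's Vector type; the class name UnitNormSketch and the choice to return zero vectors unchanged are illustrative assumptions, not Spark's code.

```java
// Hypothetical helper, not part of Spark: scales a vector to unit L2 norm.
class UnitNormSketch {
    /** Returns v divided by its Euclidean norm; a zero vector is returned as a copy. */
    static double[] normalize(double[] v) {
        double sumSquares = 0.0;
        for (double x : v) {
            sumSquares += x * x;
        }
        double norm = Math.sqrt(sumSquares);
        if (norm == 0.0) {
            return v.clone(); // assumption: leave the zero vector untouched
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] unit = normalize(new double[]{3.0, 4.0});
        System.out.println(unit[0] + ", " + unit[1]);
    }
}
```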
- transform(Vector) - Method in class org.apache.spark.mllib.feature.PCAModel
- 
Transform a vector by computed Principal Components. 
- transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
- 
Applies standardization transformation on a vector. 
- transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
- 
Applies transformation on a vector. 
- transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
- 
Applies transformation on an RDD[Vector]. 
- transform(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
- 
Applies transformation on a JavaRDD[Vector]. 
- transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-  
- transform(Function1<DataFrame, DataFrame>) - Method in class org.apache.spark.sql.DataFrame
- 
Concise syntax for chaining custom transformations. 
- transform(Function1<Dataset<T>, Dataset<U>>) - Method in class org.apache.spark.sql.Dataset
- 
Concise syntax for chaining custom transformations. 
- transform(Function<R, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream. 
- transform(Function2<R, Time, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream. 
- transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create a new DStream in which each RDD is generated by applying a function on RDDs of
 the DStreams. 
- transform(Function1<RDD<T>, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream. 
- transform(Function2<RDD<T>, Time, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream. 
- transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
- 
Create a new DStream in which each RDD is generated by applying a function on RDDs of
 the DStreams. 
- Transformer - Class in org.apache.spark.ml
- 
:: DeveloperApi ::
 Abstract class for transformers that transform one dataset into another. 
- Transformer() - Constructor for class org.apache.spark.ml.Transformer
-  
- transformImpl(DataFrame) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-  
- transformImpl(DataFrame) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-  
- transformImpl(DataFrame) - Method in class org.apache.spark.ml.PredictionModel
-  
- transformImpl(DataFrame) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-  
- transformImpl(DataFrame) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRest
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.KMeans
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.KMeansModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.LDA
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.LDAModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Binarizer
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Bucketizer
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ColumnPruner
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.CountVectorizer
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.HashingTF
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDF
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDFModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IndexToString
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Interaction
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.PCA
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.PCAModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.RFormula
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.RFormulaModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.SQLTransformer
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScaler
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexer
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAssembler
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexer
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorSlicer
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2Vec
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2VecModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.Pipeline
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineStage
- 
:: DeveloperApi :: 
- transformSchema(StructType, boolean) - Method in class org.apache.spark.ml.PipelineStage
- 
:: DeveloperApi :: 
- transformSchema(StructType) - Method in class org.apache.spark.ml.PredictionModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.Predictor
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALS
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALSModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidator
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-  
- transformSchema(StructType) - Method in class org.apache.spark.ml.UnaryTransformer
-  
- transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream. 
- transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream. 
- transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
- 
Create a new DStream in which each RDD is generated by applying a function on RDDs of
 the DStreams. 
- transformWith(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream and 'other' DStream. 
- transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream and 'other' DStream. 
- transformWith(DStream<U>, Function2<RDD<T>, RDD<U>, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream and 'other' DStream. 
- transformWith(DStream<U>, Function3<RDD<T>, RDD<U>, Time, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream and 'other' DStream. 
- transformWithToPair(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream and 'other' DStream. 
- transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
- 
Return a new DStream in which each RDD is generated by applying a function
 on each RDD of 'this' DStream and 'other' DStream. 
- translate(Column, String, String) - Static method in class org.apache.spark.sql.functions
- 
Translate any character in the src by a character in replaceString. 
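
As a rough illustration of character translation on a plain string (rather than a Column), the sketch below maps each character found in a matching string to the character at the same position in a replacement string. The class name TranslateSketch, the parameter names, and the rule that characters with no counterpart in the replacement string are dropped are assumptions of this sketch, not a statement of Spark's behavior.

```java
// Hypothetical helper, not part of Spark: per-character translation of a string.
class TranslateSketch {
    static String translate(String src, String matching, String replace) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            int idx = matching.indexOf(c);
            if (idx < 0) {
                sb.append(c);                   // not in the matching set: keep as-is
            } else if (idx < replace.length()) {
                sb.append(replace.charAt(idx)); // replace by the character at the same position
            }
            // Assumption: a matched character without a counterpart is dropped.
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("abc", "ab", "xy")); // prints xyc
    }
}
```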
- transpose() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-  
- transpose() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
- 
Transpose this BlockMatrix.
 
- transpose() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-  
- transpose() - Method in interface org.apache.spark.mllib.linalg.Matrix
- 
Transpose the Matrix. 
- transpose() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-  
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Aggregates the elements of this RDD in a multi-level tree pattern. 
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
- 
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
- 
Aggregates the elements of this RDD in a multi-level tree pattern. 
- treeReduce(Function2<T, T, T>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
Reduces the elements of this RDD in a multi-level tree pattern. 
- treeReduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
- 
- treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
- 
- treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.rdd.RDD
- 
Reduces the elements of this RDD in a multi-level tree pattern. 
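
To illustrate what reducing in a multi-level tree pattern means, the sketch below combines a list of per-partition partial results pairwise, level by level, until one value remains. It is a simplified local model: the class name TreeReduceSketch is hypothetical, the fan-in of two is an assumption, and it does not reproduce how Spark uses the depth argument or distributes the work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical helper, not part of Spark: pairwise, level-by-level reduction.
class TreeReduceSketch {
    static <T> T treeReduce(List<T> partials, BinaryOperator<T> f) {
        if (partials.isEmpty()) {
            throw new IllegalArgumentException("nothing to reduce");
        }
        List<T> level = new ArrayList<>(partials);
        while (level.size() > 1) {
            List<T> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                if (i + 1 < level.size()) {
                    next.add(f.apply(level.get(i), level.get(i + 1))); // combine a pair
                } else {
                    next.add(level.get(i)); // odd element carries over to the next level
                }
            }
            level = next;
        }
        return level.get(0);
    }

    public static void main(String[] args) {
        System.out.println(treeReduce(List.of(1, 2, 3, 4, 5), Integer::sum)); // prints 15
    }
}
```

As with any reduce, the sketch gives a well-defined answer only when the combining function is associative; the tree shape changes the order of combination, not the operands.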
- trees() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-  
- trees() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-  
- trees() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-  
- trees() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-  
- trees() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-  
- trees() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-  
- treeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-  
- treeString() - Method in class org.apache.spark.sql.types.StructType
-  
- treeWeights() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-  
- treeWeights() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-  
- treeWeights() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-  
- treeWeights() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-  
- treeWeights() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-  
- triangleCount() - Method in class org.apache.spark.graphx.GraphOps
- 
Compute the number of triangles passing through each vertex. 
- TriangleCount - Class in org.apache.spark.graphx.lib
- 
Compute the number of triangles passing through each vertex. 
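
The per-vertex triangle count can be pictured with a small local sketch: for each vertex, count the pairs of its neighbors that are themselves connected. The class name TriangleCountSketch and the int[][] edge-list input are assumptions for illustration; this is not the GraphX implementation and ignores distribution entirely.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of GraphX: counts triangles through each vertex.
class TriangleCountSketch {
    static Map<Integer, Integer> triangleCount(int[][] edges) {
        // Build undirected adjacency sets, skipping self-loops.
        Map<Integer, Set<Integer>> nbrs = new HashMap<>();
        for (int[] e : edges) {
            if (e[0] == e[1]) {
                continue;
            }
            nbrs.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            nbrs.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        Map<Integer, Integer> counts = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : nbrs.entrySet()) {
            int v = entry.getKey();
            Set<Integer> vNbrs = entry.getValue();
            int twice = 0; // each triangle at v is seen once per ordering of its other two vertices
            for (int u : vNbrs) {
                for (int w : nbrs.get(u)) {
                    if (w != v && vNbrs.contains(w)) {
                        twice++;
                    }
                }
            }
            counts.put(v, twice / 2);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {2, 0}, {2, 3}};
        System.out.println(triangleCount(edges).get(2)); // prints 1
    }
}
```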
- TriangleCount() - Constructor for class org.apache.spark.graphx.lib.TriangleCount
-  
- trim(Column) - Static method in class org.apache.spark.sql.functions
- 
Trim the spaces from both ends for the specified string column. 
- TripletFields - Class in org.apache.spark.graphx
- 
Represents a subset of the fields of an EdgeTriplet or EdgeContext. 
- TripletFields() - Constructor for class org.apache.spark.graphx.TripletFields
- 
Constructs a default TripletFields in which all fields are included. 
- TripletFields(boolean, boolean, boolean) - Constructor for class org.apache.spark.graphx.TripletFields
-  
- triplets() - Method in class org.apache.spark.graphx.Graph
- 
An RDD containing the edge triplets, which are edges along with the vertex data associated with
 the adjacent vertices. 
- triplets() - Method in class org.apache.spark.graphx.impl.GraphImpl
- 
Return an RDD that brings edges together with their source and destination vertices. 
- truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
- 
Returns true positive rate for a given label (category). 
- trunc(Column, String) - Static method in class org.apache.spark.sql.functions
- 
Returns date truncated to the unit specified by the format. 
- tryRecoverFromCheckpoint(String) - Method in class org.apache.spark.streaming.StreamingContextPythonHelper
- 
This is a private method only for Python to implement getOrCreate.
 
- tuple(Encoder<T1>, Encoder<T2>) - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for 2-ary tuples. 
- tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>) - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for 3-ary tuples. 
- tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>, Encoder<T4>) - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for 4-ary tuples. 
- tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>, Encoder<T4>, Encoder<T5>) - Static method in class org.apache.spark.sql.Encoders
- 
An encoder for 5-ary tuples. 
- tValues() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
- 
T-statistic of estimated coefficients and intercept. 
- TwitterUtils - Class in org.apache.spark.streaming.twitter
-  
- TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
-  
- TypedColumn<T,U> - Class in org.apache.spark.sql
- 
A Column where an Encoder has been given for the expected input and return type. 
- TypedColumn(Expression, ExpressionEncoder<U>) - Constructor for class org.apache.spark.sql.TypedColumn
-  
- typeName() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-  
- typeName() - Method in class org.apache.spark.sql.types.DataType
- 
Name of the type used in JSON serialization. 
- typeName() - Method in class org.apache.spark.sql.types.DecimalType
-