DataFrame.DropDuplicates Method

Definition

Overloads

DropDuplicates(String, String[])

Returns a new DataFrame with duplicate rows removed, considering only the specified subset of columns.

DropDuplicates()

Returns a new DataFrame that contains only the unique rows from this DataFrame. This is an alias for Distinct().

DropDuplicates(String, String[])

Returns a new DataFrame with duplicate rows removed, considering only the specified subset of columns.

public Microsoft.Spark.Sql.DataFrame DropDuplicates(string col, params string[] cols);

Parameters

col
String

Name of the first column to compare when identifying duplicate rows

cols
String[]

Names of any additional columns to compare

Returns

A new DataFrame object with duplicate rows removed, compared on the given columns only
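
Examples

The following is a minimal sketch of the subset overload, not an official sample. It assumes a working .NET for Apache Spark environment; the application name, the inline table, and the column names (name, age, city) are hypothetical and chosen only for illustration.

```csharp
using Microsoft.Spark.Sql;

class DropDuplicatesSubsetExample
{
    static void Main()
    {
        // Hypothetical application name; any name works.
        SparkSession spark = SparkSession
            .Builder()
            .AppName("drop-duplicates-subset")
            .GetOrCreate();

        // Illustrative data: the two 'alice' rows match on (name, age)
        // but differ in city.
        DataFrame people = spark.Sql(
            "SELECT * FROM VALUES " +
            "('alice', 30, 'nyc'), ('alice', 30, 'sf'), ('bob', 25, 'nyc') " +
            "AS t(name, age, city)");

        // Only name and age are compared, so one 'alice' row is dropped.
        DataFrame deduped = people.DropDuplicates("name", "age");

        // All columns, including city, are still present in the result.
        deduped.Show();

        spark.Stop();
    }
}
```

The result keeps one row for each distinct (name, age) combination. Which of the matching rows survives is not guaranteed, so the city value shown for alice may vary between runs.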

Applies to

Product Versions
Microsoft.Spark latest

DropDuplicates()

Returns a new DataFrame that contains only the unique rows from this DataFrame. This is an alias for Distinct().

public Microsoft.Spark.Sql.DataFrame DropDuplicates();

Returns

DataFrame
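
Examples

As a hedged illustration of the alias relationship with Distinct(), the sketch below compares the two calls on the same data. It assumes a working .NET for Apache Spark environment; the application name and the inline table are hypothetical.

```csharp
using System;
using Microsoft.Spark.Sql;

class DropDuplicatesAllColumnsExample
{
    static void Main()
    {
        // Hypothetical application name; any name works.
        SparkSession spark = SparkSession
            .Builder()
            .AppName("drop-duplicates-all-columns")
            .GetOrCreate();

        // Illustrative data: the first two rows are identical in every column.
        DataFrame people = spark.Sql(
            "SELECT * FROM VALUES " +
            "('alice', 30), ('alice', 30), ('bob', 25) " +
            "AS t(name, age)");

        // With no arguments, rows are compared on all columns.
        long viaDropDuplicates = people.DropDuplicates().Count();
        long viaDistinct = people.Distinct().Count();

        // Both calls remove the repeated row, so the counts match.
        Console.WriteLine($"DropDuplicates: {viaDropDuplicates}, Distinct: {viaDistinct}");

        spark.Stop();
    }
}
```

Because rows are compared on every column here, two rows that differ in any single column are both kept.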

Applies to

Product Versions
Microsoft.Spark latest