[SPARK-16772][PYTHON][DOCS] Fix API doc references to UDFRegistration + Update "important classes"

## Proposed Changes

* Update the list of "important classes" in `pyspark.sql` to match 2.0.
* Fix references to `UDFRegistration` so that the class shows up in the docs; it currently [doesn't](http://spark.apache.org/docs/latest/api/python/pyspark.sql.html). (A short usage sketch follows this list.)
* Remove some unnecessary whitespace in the Python RST doc files.
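
For context, `UDFRegistration` is the helper object that `SQLContext.udf` and `SparkSession.udf` return, which is why it is worth surfacing in the API docs. A minimal sketch of how users typically reach it (assuming a local PySpark 2.0+ install; the `plus_one` name and lambda are made up for illustration):

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.appName("udf-doc-example").getOrCreate()

# spark.udf is a UDFRegistration; register() exposes a Python function
# to SQL under the given name.
spark.udf.register("plus_one", lambda x: x + 1, IntegerType())
spark.sql("SELECT plus_one(41) AS answer").show()
```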

I reused the [existing JIRA](https://issues.apache.org/jira/browse/SPARK-16772) I created last week for similar API doc fixes.

## How was this patch tested?

* I ran `lint-python` successfully.
* I ran `make clean build` on the Python docs and confirmed the results are as expected locally in my browser.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #14496 from nchammas/SPARK-16772-UDFRegistration.
Authored by Nicholas Chammas on 2016-08-06 05:02:59 +01:00; committed by Sean Owen.
Commit 2dd0388617 (parent 14dba45208): 3 changed files with 5 additions and 9 deletions.

python/docs/index.rst:

@@ -50,4 +50,3 @@ Indices and tables
 ==================
 
 * :ref:`search`
-

python/docs/pyspark.sql.rst:

@@ -8,14 +8,12 @@ Module Context
     :members:
     :undoc-members:
 
-
 pyspark.sql.types module
 ------------------------
 .. automodule:: pyspark.sql.types
     :members:
     :undoc-members:
 
-
 pyspark.sql.functions module
 ----------------------------
 .. automodule:: pyspark.sql.functions

python/pyspark/sql/__init__.py:

@@ -18,7 +18,7 @@
 """
 Important classes of Spark SQL and DataFrames:
 
-    - :class:`pyspark.sql.SQLContext`
+    - :class:`pyspark.sql.SparkSession`
       Main entry point for :class:`DataFrame` and SQL functionality.
     - :class:`pyspark.sql.DataFrame`
       A distributed collection of data grouped into named columns.
@@ -26,8 +26,6 @@ Important classes of Spark SQL and DataFrames:
       A column expression in a :class:`DataFrame`.
     - :class:`pyspark.sql.Row`
      A row of data in a :class:`DataFrame`.
-    - :class:`pyspark.sql.HiveContext`
-      Main entry point for accessing data stored in Apache Hive.
     - :class:`pyspark.sql.GroupedData`
       Aggregation methods, returned by :func:`DataFrame.groupBy`.
     - :class:`pyspark.sql.DataFrameNaFunctions`
@@ -45,7 +43,7 @@ from __future__ import absolute_import
 
 
 from pyspark.sql.types import Row
-from pyspark.sql.context import SQLContext, HiveContext
+from pyspark.sql.context import SQLContext, HiveContext, UDFRegistration
 from pyspark.sql.session import SparkSession
 from pyspark.sql.column import Column
 from pyspark.sql.dataframe import DataFrame, DataFrameNaFunctions, DataFrameStatFunctions
@@ -55,7 +53,8 @@ from pyspark.sql.window import Window, WindowSpec
 
 
 __all__ = [
-    'SparkSession', 'SQLContext', 'HiveContext', 'DataFrame', 'GroupedData', 'Column',
-    'Row', 'DataFrameNaFunctions', 'DataFrameStatFunctions', 'Window', 'WindowSpec',
+    'SparkSession', 'SQLContext', 'HiveContext', 'UDFRegistration',
+    'DataFrame', 'GroupedData', 'Column', 'Row',
+    'DataFrameNaFunctions', 'DataFrameStatFunctions', 'Window', 'WindowSpec',
     'DataFrameReader', 'DataFrameWriter'
 ]
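
With the `__init__.py` change above, `UDFRegistration` is importable directly from `pyspark.sql` and listed in `__all__`, which is what lets the `automodule` directive in `pyspark.sql.rst` pick it up. A quick sanity check, as a sketch (assumes PySpark built from this commit and an active `SparkSession`):

```python
from pyspark.sql import SparkSession, UDFRegistration

spark = SparkSession.builder.getOrCreate()

# Both SparkSession.udf and SQLContext.udf hand back the UDFRegistration
# helper that the generated API docs should now include.
assert isinstance(spark.udf, UDFRegistration)
```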