161 commits

Author SHA1 Message Date
MechCoder ffa05c84fe [SPARK-9828] [PYSPARK] Mutable values should not be default arguments
Author: MechCoder <manojkumarsivaraj334@gmail.com>

Closes #8110 from MechCoder/spark-9828.
2015-08-14 12:46:05 -07:00
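
The pitfall named in the commit title above can be illustrated in plain Python. This is a minimal sketch, not code from the PySpark change itself; `append_shared` and `append_fresh` are hypothetical names chosen for illustration.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` mutates the same object.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Using None as a sentinel gives each call its own new list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

With the first form, state leaks between calls that rely on the default; with the second, each such call starts from an empty list.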
Yijie Shen 8c320e45b5 [SPARK-6591] [SQL] Python data source load options should auto convert common types into strings
JIRA: https://issues.apache.org/jira/browse/SPARK-6591

Author: Yijie Shen <henry.yijieshen@gmail.com>

Closes #7926 from yjshen/py_dsload_opt and squashes the following commits:

b207832 [Yijie Shen] fix style
efdf834 [Yijie Shen] resolve comment
7a8f6a2 [Yijie Shen] lowercase
822e769 [Yijie Shen] convert load opts to string
2015-08-05 17:28:23 -07:00
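
One plausible shape for the conversion described above is sketched below. It assumes, based only on the squashed commit titles ("convert load opts to string", "lowercase"), that booleans become lowercase strings and other values go through `str`; the helper name `to_option_string` and the choice to pass `None` through unchanged are assumptions, not taken from the commit.

```python
def to_option_string(value):
    """Convert a common Python value into a string-valued option (sketch)."""
    if isinstance(value, bool):
        # Assumption: booleans are rendered as "true"/"false".
        return str(value).lower()
    if value is None:
        # Assumption: None is passed through rather than turned into "None".
        return value
    return str(value)


# Hypothetical option dictionary, converted value by value.
options = {"header": True, "samplingRatio": 0.5, "path": "/data/in"}
converted = {key: to_option_string(val) for key, val in options.items()}
```

The boolean check comes first because `bool` is a subclass of `int` in Python, so a generic `str(value)` would otherwise yield `"True"` rather than `"true"`.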
Cheng Lian d38c5029a2 [SPARK-9100] [SQL] Adds DataFrame reader/writer shortcut methods for ORC
This PR adds DataFrame reader/writer shortcut methods for ORC in both Scala and Python.

Author: Cheng Lian <lian@databricks.com>

Closes #7444 from liancheng/spark-9100 and squashes the following commits:

284d043 [Cheng Lian] Fixes PySpark test cases and addresses PR comments
e0b09fb [Cheng Lian] Adds DataFrame reader/writer shortcut methods for ORC
2015-07-21 15:08:44 +08:00
Reynold Xin 660c6cec75 [SPARK-8698] partitionBy in Python DataFrame reader/writer interface should not default to empty tuple.
Author: Reynold Xin <rxin@databricks.com>

Closes #7079 from rxin/SPARK-8698 and squashes the following commits:

8513e1c [Reynold Xin] [SPARK-8698] partitionBy in Python DataFrame reader/writer interface should not default to empty tuple.
2015-06-29 00:22:44 -07:00
Cheolsoo Park ac2e17b01c [SPARK-8355] [SQL] Python DataFrameReader/Writer should mirror Scala
I compared PySpark DataFrameReader/Writer against Scala ones. `Option` function is missing in both reader and writer, but the rest seems to all match.

I added `Option` to reader and writer and updated the `pyspark-sql` test.

Author: Cheolsoo Park <cheolsoop@netflix.com>

Closes #7078 from piaozhexiu/SPARK-8355 and squashes the following commits:

c63d419 [Cheolsoo Park] Fix version
524e0aa [Cheolsoo Park] Add option function to df reader and writer
2015-06-29 00:13:39 -07:00
Yin Huai 5ab9fcfb01 [SPARK-8532] [SQL] In Python's DataFrameWriter, save/saveAsTable/json/parquet/jdbc always override mode
https://issues.apache.org/jira/browse/SPARK-8532

This PR has two changes. First, it fixes the bug that save actions (i.e. `save/saveAsTable/json/parquet/jdbc`) always override mode. Second, it adds input argument `partitionBy` to `save/saveAsTable/parquet`.

Author: Yin Huai <yhuai@databricks.com>

Closes #6937 from yhuai/SPARK-8532 and squashes the following commits:

f972d5d [Yin Huai] davies's comment.
d37abd2 [Yin Huai] style.
d21290a [Yin Huai] Python doc.
889eb25 [Yin Huai] Minor refactoring and add partitionBy to save, saveAsTable, and parquet.
7fbc24b [Yin Huai] Use None instead of "error" as the default value of mode since JVM-side already uses "error" as the default value.
d696dff [Yin Huai] Python style.
88eb6c4 [Yin Huai] If mode is "error", do not call mode method.
c40c461 [Yin Huai] Regression test.
2015-06-22 13:51:23 -07:00
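
The mode-override bug described above can be sketched with a small stand-in class. `SketchWriter`, `save_buggy`, and `save_fixed` are hypothetical names, and the class only imitates a builder-style writer; it is not the PySpark `DataFrameWriter`. The sketch assumes the underlying writer already defaults to "error", as the squashed commit message says of the JVM side.

```python
class SketchWriter:
    """Toy builder-style writer used only to illustrate the default-mode bug."""

    def __init__(self):
        # Stands in for the default that the underlying writer already has.
        self.current_mode = "error"

    def mode(self, save_mode):
        self.current_mode = save_mode
        return self

    def save_buggy(self, path, mode="error"):
        # Always calls mode(), so an earlier .mode("append") is overwritten.
        self.mode(mode)
        return (path, self.current_mode)

    def save_fixed(self, path, mode=None):
        # Only touches the mode when the caller passed one explicitly.
        if mode is not None:
            self.mode(mode)
        return (path, self.current_mode)
```

With `None` as the default, a chained call such as `SketchWriter().mode("append").save_fixed("/tmp/out")` keeps the mode set earlier, while an explicit `mode=` argument still takes effect.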
Reynold Xin ce320cb2db [SPARK-8060] Improve DataFrame Python test coverage and documentation.
Author: Reynold Xin <rxin@databricks.com>

Closes #6601 from rxin/python-read-write-test-and-doc and squashes the following commits:

baa8ad5 [Reynold Xin] Code review feedback.
f081d47 [Reynold Xin] More documentation updates.
c9902fa [Reynold Xin] [SPARK-8060] Improve DataFrame Python reader/writer interface doc and testing.
2015-06-03 00:23:34 -07:00
Davies Liu 445647a1a3 [SPARK-8021] [SQL] [PYSPARK] make Python read/write API consistent with Scala
Add schema()/format()/options() for the reader, and mode()/format()/options()/partitionBy() for the writer.

cc rxin yhuai pwendell

Author: Davies Liu <davies@databricks.com>

Closes #6578 from davies/readwrite and squashes the following commits:

720d293 [Davies Liu] address comments
b65dfa2 [Davies Liu] Update readwriter.py
1299ab6 [Davies Liu] make Python API consistent with Scala
2015-06-02 08:37:18 -07:00
Davies Liu be47af1bdb [SPARK-7840] add insertInto() to Writer
Add tests later.

Author: Davies Liu <davies@databricks.com>

Closes #6375 from davies/insertInto and squashes the following commits:

826423e [Davies Liu] add insertInto() to Writer
2015-05-23 09:07:14 -07:00
Davies Liu 8ddcb25b39 [SPARK-7606] [SQL] [PySpark] add version to Python SQL API docs
Add version info for public Python SQL API.

cc rxin

Author: Davies Liu <davies@databricks.com>

Closes #6295 from davies/versions and squashes the following commits:

cfd91e6 [Davies Liu] add more version for DataFrame API
600834d [Davies Liu] add version to SQL API docs
2015-05-20 23:05:54 -07:00
Davies Liu 4de74d2602 [SPARK-7738] [SQL] [PySpark] add reader and writer API in Python
cc rxin, please take a quick look, I'm working on tests.

Author: Davies Liu <davies@databricks.com>

Closes #6238 from davies/readwrite and squashes the following commits:

c7200eb [Davies Liu] update tests
9cbf01b [Davies Liu] Merge branch 'master' of github.com:apache/spark into readwrite
f0c5a04 [Davies Liu] use sqlContext.read.load
5f68bc8 [Davies Liu] update tests
6437e9a [Davies Liu] Merge branch 'master' of github.com:apache/spark into readwrite
bcc6668 [Davies Liu] add reader and writer API in Python
2015-05-19 14:23:28 -07:00