spark-instrumented-optimizer/dev
hyukjinkwon 02a4386aec [SPARK-20978][SQL] Bump up Univocity version to 2.5.4
## What changes were proposed in this pull request?

There was a bug in Univocity Parser that causes the issue in SPARK-20978. This was fixed as below:

```scala
val df = spark.read.schema("a string, b string, unparsed string").option("columnNameOfCorruptRecord", "unparsed").csv(Seq("a").toDS())
df.show()
```

**Before**

```
java.lang.NullPointerException
	at scala.collection.immutable.StringLike$class.stripLineEnd(StringLike.scala:89)
	at scala.collection.immutable.StringOps.stripLineEnd(StringOps.scala:29)
	at org.apache.spark.sql.execution.datasources.csv.UnivocityParser.org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$getCurrentInput(UnivocityParser.scala:56)
	at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anonfun$org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$convert$1.apply(UnivocityParser.scala:207)
	at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anonfun$org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$convert$1.apply(UnivocityParser.scala:207)
...
```

**After**

```
+---+----+--------+
|  a|   b|unparsed|
+---+----+--------+
|  a|null|       a|
+---+----+--------+
```

It was fixed in 2.5.0 and 2.5.4 was released. I guess it'd be safe to upgrade this.

## How was this patch tested?

Unit test added in `CSVSuite.scala`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19113 from HyukjinKwon/bump-up-univocity.
2017-09-05 23:21:43 +08:00
..
create-release [SPARK-14280][BUILD][WIP] Update change-version.sh and pom.xml to add Scala 2.12 profiles and enable 2.12 compilation 2017-09-01 19:21:21 +01:00
deps [SPARK-20978][SQL] Bump up Univocity version to 2.5.4 2017-09-05 23:21:43 +08:00
sparktestsupport [SPARK-20974][BUILD] we should run REPL tests if SQL module has code changes 2017-06-02 21:59:52 -07:00
tests [SPARK-10359] Enumerate dependencies in a file and diff against it for new pull requests 2015-12-30 12:47:42 -08:00
.gitignore [SPARK-6219] Reuse pep8.py 2015-04-18 16:46:28 -07:00
.rat-excludes [SPARK-20434][YARN][CORE] Move Hadoop delegation token code from yarn to core 2017-06-15 11:46:00 -07:00
appveyor-guide.md [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate building and testing on Windows (currently SparkR only) 2016-09-08 08:26:59 -07:00
appveyor-install-dependencies.ps1 [MINOR][BUILD] Download RAT and R version info over HTTPS; use RAT 0.12 2017-08-12 14:31:05 +09:00
change-scala-version.sh [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10 2017-07-13 17:06:24 +08:00
check-license [MINOR][BUILD] Download RAT and R version info over HTTPS; use RAT 0.12 2017-08-12 14:31:05 +09:00
checkstyle-suppressions.xml [MINOR][BUILD] Fix lint-java breaks. 2017-05-10 13:56:34 +01:00
checkstyle.xml [SPARK-18073][DOCS][WIP] Migrate wiki to spark.apache.org web site 2016-11-23 11:25:47 +00:00
github_jira_sync.py [SPARK-19002][BUILD][PYTHON] Check pep8 against all Python scripts 2017-01-02 15:23:19 +00:00
lint-java [SPARK-16967] move mesos to module 2016-08-26 12:25:22 -07:00
lint-python [MINOR][PYTHON] Ignore pep8 on test scripts generated in tests in work directory 2017-06-02 14:25:38 +01:00
lint-r [SPARK-10328] [SPARKR] Fix generic for na.omit 2015-08-28 00:37:50 -07:00
lint-r.R [SPARK-14074][SPARKR] Specify commit sha1 ID when using install_github to install intr package. 2016-03-23 07:57:03 -07:00
lint-scala [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
make-distribution.sh [SPARK-20123][BUILD] SPARK_HOME variable might have spaces in it(e.g. $SPARK… 2017-04-02 15:31:13 +01:00
merge_spark_pr.py [MINOR] Minor comment fixes in merge_spark_pr.py script 2017-07-31 10:07:33 +09:00
mima [SPARK-21709][BUILD] sbt 0.13.16 and some plugin updates 2017-08-12 20:01:20 +01:00
pip-sanity-check.py [SPARK-19064][PYSPARK] Fix pip installing of sub components 2017-01-25 14:43:39 -08:00
README.md Merge pull request #565 from pwendell/dev-scripts. Closes #565. 2014-02-08 23:13:34 -08:00
requirements.txt [SPARK-19064][PYSPARK] Fix pip installing of sub components 2017-01-25 14:43:39 -08:00
run-pip-tests Revert "[SPARK-13534][PYSPARK] Using Apache Arrow to increase performance of DataFrame.toPandas" 2017-06-28 14:28:40 +08:00
run-tests [SPARK-5161] Parallelize Python test execution 2015-06-29 21:32:40 -07:00
run-tests-jenkins [SPARK-19955][PYSPARK] Jenkins Python Conda based test. 2017-03-29 11:41:17 -07:00
run-tests-jenkins.py [SPARK-21189][INFRA] Handle unknown error codes in Jenkins rather then leaving incomplete comment in PRs 2017-06-24 10:14:31 +01:00
run-tests.py [SPARK-20974][BUILD] we should run REPL tests if SQL module has code changes 2017-06-02 21:59:52 -07:00
scalastyle [SPARK-16967] move mesos to module 2016-08-26 12:25:22 -07:00
test-dependencies.sh [SPARK-19550][BUILD][CORE][WIP] Remove Java 7 support 2017-02-16 12:32:45 +00:00
tox.ini [MINOR][PYTHON] Ignore pep8 on test scripts generated in tests in work directory 2017-06-02 14:25:38 +01:00

Spark Developer Scripts

This directory contains scripts useful to developers when packaging, testing, or committing to Spark.

Many of these scripts require Apache credentials to work correctly.