hyukjinkwon 02a4386aec	2017-09-05 23:21:43 +08:00

[SPARK-20978][SQL] Bump up Univocity version to 2.5.4

## What changes were proposed in this pull request?

There was a bug in Univocity Parser that causes the issue in SPARK-20978. This was fixed as below:

```scala
val df = spark.read.schema("a string, b string, unparsed string").option("columnNameOfCorruptRecord", "unparsed").csv(Seq("a").toDS())
df.show()
```

Before

```
java.lang.NullPointerException
	at scala.collection.immutable.StringLike$class.stripLineEnd(StringLike.scala:89)
	at scala.collection.immutable.StringOps.stripLineEnd(StringOps.scala:29)
	at org.apache.spark.sql.execution.datasources.csv.UnivocityParser.org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$getCurrentInput(UnivocityParser.scala:56)
	at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anonfun$org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$convert$1.apply(UnivocityParser.scala:207)
	at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anonfun$org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$convert$1.apply(UnivocityParser.scala:207)
	...
```

After

```
+---+----+--------+
|  a|   b|unparsed|
+---+----+--------+
|  a|null|       a|
+---+----+--------+
```

It was fixed in 2.5.0 and 2.5.4 was released. I guess it'd be safe to upgrade this.

## How was this patch tested?

Unit test added in `CSVSuite.scala`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19113 from HyukjinKwon/bump-up-univocity.
create-release	[SPARK-14280][BUILD][WIP] Update change-version.sh and pom.xml to add Scala 2.12 profiles and enable 2.12 compilation	2017-09-01 19:21:21 +01:00
deps	[SPARK-20978][SQL] Bump up Univocity version to 2.5.4	2017-09-05 23:21:43 +08:00
sparktestsupport	[SPARK-20974][BUILD] we should run REPL tests if SQL module has code changes	2017-06-02 21:59:52 -07:00
tests	[SPARK-10359] Enumerate dependencies in a file and diff against it for new pull requests	2015-12-30 12:47:42 -08:00
.gitignore	[SPARK-6219] Reuse pep8.py	2015-04-18 16:46:28 -07:00
.rat-excludes	[SPARK-20434][YARN][CORE] Move Hadoop delegation token code from yarn to core	2017-06-15 11:46:00 -07:00
appveyor-guide.md	[SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate building and testing on Windows (currently SparkR only)	2016-09-08 08:26:59 -07:00
appveyor-install-dependencies.ps1	[MINOR][BUILD] Download RAT and R version info over HTTPS; use RAT 0.12	2017-08-12 14:31:05 +09:00
change-scala-version.sh	[SPARK-19810][BUILD][CORE] Remove support for Scala 2.10	2017-07-13 17:06:24 +08:00
check-license	[MINOR][BUILD] Download RAT and R version info over HTTPS; use RAT 0.12	2017-08-12 14:31:05 +09:00
checkstyle-suppressions.xml	[MINOR][BUILD] Fix lint-java breaks.	2017-05-10 13:56:34 +01:00
checkstyle.xml	[SPARK-18073][DOCS][WIP] Migrate wiki to spark.apache.org web site	2016-11-23 11:25:47 +00:00
github_jira_sync.py	[SPARK-19002][BUILD][PYTHON] Check pep8 against all Python scripts	2017-01-02 15:23:19 +00:00
lint-java	[SPARK-16967] move mesos to module	2016-08-26 12:25:22 -07:00
lint-python	[MINOR][PYTHON] Ignore pep8 on test scripts generated in tests in work directory	2017-06-02 14:25:38 +01:00
lint-r	[SPARK-10328] [SPARKR] Fix generic for na.omit	2015-08-28 00:37:50 -07:00
lint-r.R	[SPARK-14074][SPARKR] Specify commit sha1 ID when using install_github to install intr package.	2016-03-23 07:57:03 -07:00
lint-scala	[SPARK-2627] [PySpark] have the build enforce PEP 8 automatically	2014-08-06 12:58:24 -07:00
make-distribution.sh	[SPARK-20123][BUILD] SPARK_HOME variable might have spaces in it(e.g. $SPARK…	2017-04-02 15:31:13 +01:00
merge_spark_pr.py	[MINOR] Minor comment fixes in merge_spark_pr.py script	2017-07-31 10:07:33 +09:00
mima	[SPARK-21709][BUILD] sbt 0.13.16 and some plugin updates	2017-08-12 20:01:20 +01:00
pip-sanity-check.py	[SPARK-19064][PYSPARK] Fix pip installing of sub components	2017-01-25 14:43:39 -08:00
README.md	Merge pull request #565 from pwendell/dev-scripts. Closes #565 .	2014-02-08 23:13:34 -08:00
requirements.txt	[SPARK-19064][PYSPARK] Fix pip installing of sub components	2017-01-25 14:43:39 -08:00
run-pip-tests	Revert "[SPARK-13534][PYSPARK] Using Apache Arrow to increase performance of DataFrame.toPandas"	2017-06-28 14:28:40 +08:00
run-tests	[SPARK-5161] Parallelize Python test execution	2015-06-29 21:32:40 -07:00
run-tests-jenkins	[SPARK-19955][PYSPARK] Jenkins Python Conda based test.	2017-03-29 11:41:17 -07:00
run-tests-jenkins.py	[SPARK-21189][INFRA] Handle unknown error codes in Jenkins rather then leaving incomplete comment in PRs	2017-06-24 10:14:31 +01:00
run-tests.py	[SPARK-20974][BUILD] we should run REPL tests if SQL module has code changes	2017-06-02 21:59:52 -07:00
scalastyle	[SPARK-16967] move mesos to module	2016-08-26 12:25:22 -07:00
test-dependencies.sh	[SPARK-19550][BUILD][CORE][WIP] Remove Java 7 support	2017-02-16 12:32:45 +00:00
tox.ini	[MINOR][PYTHON] Ignore pep8 on test scripts generated in tests in work directory	2017-06-02 14:25:38 +01:00

README.md

Spark Developer Scripts

This directory contains scripts useful to developers when packaging, testing, or committing to Spark.

Many of these scripts require Apache credentials to work correctly.