Author: Reynold Xin <rxin@apache.org>
Closes #1334 from rxin/sqlConfThreadSafetuy and squashes the following commits:
c1e0a5a [Reynold Xin] Fixed the duplicate comment.
7614372 [Reynold Xin] [SPARK-2409] Make SQLConf thread safe.
According to
```scala
private val maxNumExecutorFailures = sparkConf.getInt("spark.yarn.max.executor.failures",
sparkConf.getInt("spark.yarn.max.worker.failures", math.max(args.numExecutors * 2, 3)))
```
the default value should be `numExecutors * 2`, with a minimum of 3, and the same applies to the config
`spark.yarn.max.worker.failures`
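The default can be sketched as (helper name hypothetical):

```scala
// Hypothetical helper mirroring the computed default: twice the executor
// count, but never less than 3.
def defaultMaxExecutorFailures(numExecutors: Int): Int =
  math.max(numExecutors * 2, 3)
```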
Author: CrazyJvm <crazyjvm@gmail.com>
Closes #1282 from CrazyJvm/yarn-doc and squashes the following commits:
1a5f25b [CrazyJvm] remove deprecated config
c438aec [CrazyJvm] fix style
86effa6 [CrazyJvm] change expression
211f130 [CrazyJvm] fix html tag
2900d23 [CrazyJvm] fix style
a4b2e27 [CrazyJvm] fix configuration spark.yarn.max.executor.failures
https://issues.apache.org/jira/browse/SPARK-2403
Spark hangs for us whenever we forget to register a class with Kryo. This should be a simple fix for that. But let me know if you have a better suggestion.
I did not write a new test for this. It would be pretty complicated and I'm not sure it's worthwhile for such a simple change. Let me know if you disagree.
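A minimal sketch of the idea, assuming a caller-supplied serializer (not the actual DAGScheduler code):

```scala
import scala.util.control.NonFatal

// Attempt serialization up front and surface the failure instead of hanging;
// only non-fatal exceptions are caught.
def trySerialize[T](value: T, serialize: T => Array[Byte]): Either[Throwable, Array[Byte]] =
  try Right(serialize(value))
  catch { case NonFatal(e) => Left(e) }
```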
Author: Daniel Darabos <darabos.daniel@gmail.com>
Closes #1329 from darabos/spark-2403 and squashes the following commits:
3aceaad [Daniel Darabos] Print full stack trace for miscellaneous exceptions during serialization.
52c22ba [Daniel Darabos] Only catch NonFatal exceptions.
361e962 [Daniel Darabos] Catch all errors during serialization in DAGScheduler.
Author: Michael Armbrust <michael@databricks.com>
Closes #1325 from marmbrus/slowLike and squashes the following commits:
023c3eb [Michael Armbrust] add comment.
8b421c2 [Michael Armbrust] Handle the case where the final % is actually escaped.
d34d37e [Michael Armbrust] add periods.
3bbf35f [Michael Armbrust] Roll back changes to SparkBuild
53894b1 [Michael Armbrust] Fix grammar.
4094462 [Michael Armbrust] Fix grammar.
6d3d0a0 [Michael Armbrust] Optimize common LIKE patterns.
Right now I have to open it manually
Author: Andrew Or <andrewor14@gmail.com>
Closes #1296 from andrewor14/hist-serv-port and squashes the following commits:
8895a1f [Andrew Or] Add default history server port to ec2 script
Using Spark's take can result in an entire in-memory partition being shipped just to retrieve a single row.
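A rough sketch of the incremental approach, pulling rows partition by partition until the limit is reached (simplified; not the actual implementation):

```scala
// Collect up to n rows, consuming one partition at a time instead of
// shipping every partition's full contents.
def takeRows[T](partitions: Iterator[Iterator[T]], n: Int): Seq[T] = {
  val buf = scala.collection.mutable.ArrayBuffer.empty[T]
  while (buf.size < n && partitions.hasNext)
    buf ++= partitions.next().take(n - buf.size)
  buf.toSeq
}
```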
Author: Michael Armbrust <michael@databricks.com>
Closes #1318 from marmbrus/takeLimit and squashes the following commits:
77289a5 [Michael Armbrust] Update scala doc
32f0674 [Michael Armbrust] Custom take implementation for LIMIT queries.
Author: witgo <witgo@qq.com>
Closes #1153 from witgo/expectResult and squashes the following commits:
97541d8 [witgo] merge master
ead26e7 [witgo] Resolve sbt warnings during build
Made sure that readers know the random-number-generator seed argument of the `takeSample` method is optional.
Author: Rishi Verma <riverma@apache.org>
Closes #1324 from riverma/patch-1 and squashes the following commits:
4699676 [Rishi Verma] Updated programming-guide.md
Hi all,
I want to submit a basic operator, Intersect.
For example, in SQL:
select * from table1
intersect
select * from table2
I want this operator to support that functionality in Spark SQL.
It will return the intersection of the RDDs of the SparkPlan's child tables.
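Conceptually (a simplified sketch over plain collections, not the SparkPlan operator itself):

```scala
// INTERSECT keeps each distinct row of the left child that also appears
// in the right child.
def intersect[T](left: Seq[T], right: Seq[T]): Seq[T] =
  left.distinct.filter(right.toSet)
```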
JIRA: https://issues.apache.org/jira/browse/SPARK-2235
Author: Yanjie Gao <gaoyanjie55@163.com>
Author: YanjieGao <396154235@qq.com>
Closes #1150 from YanjieGao/patch-5 and squashes the following commits:
4629afe [YanjieGao] reformat the code
bdc2ac0 [YanjieGao] reformat the code as Michael's suggestion
3b29ad6 [YanjieGao] Merge remote branch 'upstream/master' into patch-5
1cfbfe6 [YanjieGao] refomat some files
ea78f33 [YanjieGao] resolve conflict and add annotation on basicOperator and remove HiveQl
0c7cca5 [YanjieGao] modify format problem
a802ca8 [YanjieGao] Merge remote branch 'upstream/master' into patch-5
5e374c7 [YanjieGao] resolve conflict in SparkStrategies and basicOperator
f7961f6 [Yanjie Gao] update the line less than
bdc4a05 [Yanjie Gao] Update basicOperators.scala
0b49837 [Yanjie Gao] delete the annotation
f1288b4 [Yanjie Gao] delete annotation
e2b64be [Yanjie Gao] Update basicOperators.scala
4dd453e [Yanjie Gao] Update SQLQuerySuite.scala
790765d [Yanjie Gao] Update SparkStrategies.scala
ac73e60 [Yanjie Gao] Update basicOperators.scala
d4ac5e5 [Yanjie Gao] Update HiveQl.scala
61e88e7 [Yanjie Gao] Update SqlParser.scala
469f099 [Yanjie Gao] Update basicOperators.scala
e5bff61 [Yanjie Gao] Spark SQL basicOperator add Intersect operator
For example, for
```
{"array": [{"field":214748364700}, {"field":1}]}
```
the type of `field` is resolved as `IntType`, whereas for
```
{"array": [{"field":1}, {"field":214748364700}]}
```
the type of `field` is resolved as `LongType`.
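The fix makes resolution order-insensitive; a simplified sketch of the widening rule (type names abbreviated):

```scala
sealed trait DType
case object IntType extends DType
case object LongType extends DType

// Widen to LongType if any value falls outside the Int range, regardless of
// the order in which values are seen.
def resolveFieldType(values: Seq[Long]): DType =
  if (values.forall(v => v >= Int.MinValue && v <= Int.MaxValue)) IntType else LongType
```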
JIRA: https://issues.apache.org/jira/browse/SPARK-2375
Author: Yin Huai <huaiyin.thu@gmail.com>
Closes #1308 from yhuai/SPARK-2375 and squashes the following commits:
3e2e312 [Yin Huai] Update unit test.
1b2ff9f [Yin Huai] Merge remote-tracking branch 'upstream/master' into SPARK-2375
10794eb [Yin Huai] Correctly resolve the type of a field inside an array of structs.
When executing `saveAsParquetFile` with a non-primitive type, `RowWriteSupport` uses the wrong type `Int` for `ByteType` and `ShortType`.
Author: Takuya UESHIN <ueshin@happy-camper.st>
Closes #1315 from ueshin/issues/SPARK-2386 and squashes the following commits:
20d89ec [Takuya UESHIN] Use None instead of null.
bd88741 [Takuya UESHIN] Add a test.
323d1d2 [Takuya UESHIN] Modify RowWriteSupport to use the exact types to cast.
Reported by http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Join-throws-exception-td8599.html
After we get the table from the catalog, because the table has an alias, we will temporarily insert a Subquery. Then, we convert the table alias to lower case regardless of whether the parser is case sensitive.
To see the issue ...
```
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD
case class Person(name: String, age: Int)
val people = sc.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
people.registerAsTable("people")
sqlContext.sql("select PEOPLE.name from people PEOPLE")
```
The plan is ...
```
== Query Plan ==
Project ['PEOPLE.name]
ExistingRdd [name#0,age#1], MapPartitionsRDD[4] at mapPartitions at basicOperators.scala:176
```
You can find that `PEOPLE.name` is not resolved.
This PR introduces three changes.
1. If a table has an alias, the catalog will not lowercase the alias. If a lowercase alias is needed, the analyzer will do the work.
2. A catalog has a new val caseSensitive that indicates if this catalog is case sensitive or not. For example, a SimpleCatalog is case sensitive, but
3. Corresponding unit tests.
With this PR, case sensitivity of database names and table names is handled by the catalog. Case sensitivity of other identifiers is handled by the analyzer.
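A simplified sketch of the catalog side of this split (class and method names hypothetical, heavily reduced):

```scala
// A catalog that normalizes table names only when it is case insensitive;
// alias lowercasing is left to the analyzer.
class SketchCatalog(val caseSensitive: Boolean) {
  private val tables = scala.collection.mutable.Map.empty[String, String]
  private def key(name: String): String = if (caseSensitive) name else name.toLowerCase
  def register(name: String, plan: String): Unit = tables(key(name)) = plan
  def lookup(name: String): Option[String] = tables.get(key(name))
}
```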
JIRA: https://issues.apache.org/jira/browse/SPARK-2339
Author: Yin Huai <huai@cse.ohio-state.edu>
Closes #1317 from yhuai/SPARK-2339 and squashes the following commits:
12d8006 [Yin Huai] Handling case sensitivity correctly. This patch introduces three changes. 1. If a table has an alias, the catalog will not lowercase the alias. If a lowercase alias is needed, the analyzer will do the work. 2. A catalog has a new val caseSensitive that indicates if this catalog is case sensitive or not. For example, a SimpleCatalog is case sensitive, but 3. Corresponding unit tests. With this patch, case sensitivity of database names and table names is handled by the catalog. Case sensitivity of other identifiers is handled by the analyzer.
Author: Neville Li <neville@spotify.com>
Closes #1319 from nevillelyh/gh/SPARK-1977 and squashes the following commits:
1f0a355 [Neville Li] [SPARK-1977][MLLIB] register mutable BitSet in MovieLenseALS
Fix nullabilities of `Join`/`Generate`/`Aggregate` because:
- Output attributes of opposite side of `OuterJoin` should be nullable.
- Output attributes of the generator side of `Generate` should be nullable if `join` is `true` and `outer` is `true`.
- `AttributeReference` of `computedAggregates` of `Aggregate` should be the same as `aggregateExpression`'s.
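The outer-join rule can be sketched over a minimal attribute model (hypothetical types, not the Catalyst classes):

```scala
case class Attr(name: String, nullable: Boolean)

// Attributes from the opposite side of an outer join become nullable,
// because that side may be padded with nulls for non-matching rows.
def outerJoinOutput(left: Seq[Attr], right: Seq[Attr], joinType: String): Seq[Attr] =
  joinType match {
    case "left_outer"  => left ++ right.map(_.copy(nullable = true))
    case "right_outer" => left.map(_.copy(nullable = true)) ++ right
    case "full_outer"  => (left ++ right).map(_.copy(nullable = true))
    case _             => left ++ right
  }
```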
Author: Takuya UESHIN <ueshin@happy-camper.st>
Closes #1266 from ueshin/issues/SPARK-2327 and squashes the following commits:
3ace83a [Takuya UESHIN] Add withNullability to Attribute and use it to change nullabilities.
df1ae53 [Takuya UESHIN] Modify nullabilize to leave attribute if not resolved.
799ce56 [Takuya UESHIN] Add nullabilization to Generate of SparkPlan.
a0fc9bc [Takuya UESHIN] Fix scalastyle errors.
0e31e37 [Takuya UESHIN] Fix Aggregate resultAttribute nullabilities.
09532ec [Takuya UESHIN] Fix Generate output nullabilities.
f20f196 [Takuya UESHIN] Fix Join output nullabilities.
The right side of a `LeftSemi` join needs only the columns used in the join condition.
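In sketch form (column names as plain strings, helper name hypothetical):

```scala
// Keep only the right-side columns that the join condition references.
def pruneRightSide(rightColumns: Seq[String], conditionRefs: Set[String]): Seq[String] =
  rightColumns.filter(conditionRefs)
```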
Author: Takuya UESHIN <ueshin@happy-camper.st>
Closes #1301 from ueshin/issues/SPARK-2366 and squashes the following commits:
7677a39 [Takuya UESHIN] Update comments.
786d3a0 [Takuya UESHIN] Rename method name.
e0957b1 [Takuya UESHIN] Add column pruning for the right side of LeftSemi join.
Because `BoundedPriorityQueue` is not registered with the Kryo serializer, operations that depend on it throw exceptions. One such instance is using `top` along with Kryo serialization.
Fixed the issue by registering `BoundedPriorityQueue` with the Kryo serializer.
Author: ankit.bhardwaj <ankit.bhardwaj@guavus.com>
Closes #1299 from AnkitBhardwaj12/BoundedPriorityQueueWithKryoIssue and squashes the following commits:
a4ae8ed [ankit.bhardwaj] [SPARK-2306]:BoundedPriorityQueue is private and not registered with Kryo
Author: Michael Armbrust <michael@databricks.com>
Closes #1305 from marmbrus/usePrunerPartitions and squashes the following commits:
744aa20 [Michael Armbrust] Use getAllPartitionsForPruner instead of getPartitions, which avoids retrieving auth data
This was omitted in #1260. @aarondav
Author: Reynold Xin <rxin@apache.org>
Closes #1300 from rxin/historyServer and squashes the following commits:
af720a3 [Reynold Xin] Added SignalLogger to HistoryServer.
If the docs are built after a Maven build has finished, the intermediate
state somehow causes a compiler bug during sbt compilation. This just
does a clean before attempting to build the docs.
This replaces #1263 with a test case.
Author: Reynold Xin <rxin@apache.org>
Author: Michael Armbrust <michael@databricks.com>
Closes #1265 from rxin/sql-analysis-error and squashes the following commits:
a639e01 [Reynold Xin] Added a test case for unresolved attribute analysis.
7371e1b [Reynold Xin] Merge pull request #1263 from marmbrus/analysisChecks
448c088 [Michael Armbrust] Add analysis checks
This is an alternate solution to #1176.
Author: Prashant Sharma <prashant.s@imaginea.com>
Closes #1179 from ScrapCodes/SPARK-1199/repl-fix-second-approach and squashes the following commits:
820b34b [Prashant Sharma] Here we generate two kinds of import wrappers based on whether it is a class or not.
This is a fix for the problem revealed by PR #1265.
Currently `HiveComparisonSuite` ignores output of `ExplainCommand` since the Catalyst query plan is quite different from the Hive query plan. But exceptions thrown from `CheckResolution` still break test cases. This PR catches any `TreeNodeException` and reports it as part of the query explanation.
After merging this PR, PR #1265 can also be merged safely.
For a normal query:
```
scala> hql("explain select key from src").foreach(println)
...
[Physical execution plan:]
[HiveTableScan [key#9], (MetastoreRelation default, src, None), None]
```
For a wrong query with unresolved attribute(s):
```
scala> hql("explain select kay from src").foreach(println)
...
[Error occurred during query planning: ]
[Unresolved attributes: 'kay, tree:]
[Project ['kay]]
[ LowerCaseSchema ]
[ MetastoreRelation default, src, None]
```
Author: Cheng Lian <lian.cs.zju@gmail.com>
Closes #1294 from liancheng/safe-explain and squashes the following commits:
4318911 [Cheng Lian] Don't throw TreeNodeException in `execution.ExplainCommand`
JIRA: https://issues.apache.org/jira/browse/SPARK-2282
This issue is caused by a buildup of sockets in the TIME_WAIT stage of TCP, which is a stage that lasts for some period of time after the communication closes.
This solution simply allows us to reuse sockets that are in TIME_WAIT, to avoid issues with the buildup of the rapid creation of these sockets.
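The standard knob for this is `SO_REUSEADDR`; for example, on a plain `ServerSocket` (illustrative, not the PySpark accumulator code):

```scala
import java.net.{InetSocketAddress, ServerSocket}

// Enable address reuse before binding so a lingering TIME_WAIT socket on the
// same port does not make bind() fail.
val server = new ServerSocket()
server.setReuseAddress(true) // must be set before bind
server.bind(new InetSocketAddress(0)) // ephemeral port, for illustration only
```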
Author: Aaron Davidson <aaron@databricks.com>
Closes #1220 from aarondav/SPARK-2282 and squashes the following commits:
2e5cab3 [Aaron Davidson] SPARK-2282: Reuse PySpark Accumulator sockets to avoid crashing Spark
**Problem.** The existing code in `ExecutorPage.scala` requires a linear scan through all the blocks to filter out the uncached ones. Every refresh could be expensive if there are many blocks and many executors.
**Solution.** The proper semantics should be the following: `StorageStatusListener` should contain only block statuses that are cached. This means as soon as a block is unpersisted by any means, its status should be removed. This is reflected in the changes made in `StorageStatusListener.scala`.
Further, the `StorageTab` must stop relying on the `StorageStatusListener` changing a dropped block's status to `StorageLevel.NONE` (which no longer happens). This is reflected in the changes made in `StorageTab.scala` and `StorageUtils.scala`.
----------
If you have been following this chain of PRs like pwendell, you will quickly notice that this reverts the changes in #1249, which reverts the changes in #1080. In other words, we are adding back the changes from #1080, and fixing SPARK-2307 on top of those changes. Please ask questions if you are confused.
Author: Andrew Or <andrewor14@gmail.com>
Closes #1255 from andrewor14/storage-ui-fix-reprise and squashes the following commits:
45416fa [Andrew Or] Merge branch 'master' of github.com:apache/spark into storage-ui-fix-reprise
a82ea25 [Andrew Or] Add tests for StorageStatusListener
8773b01 [Andrew Or] Update comment / minor changes
3afde3f [Andrew Or] Correctly report the number of blocks on SparkUI
Prior to this change, we could throw an NPE if we launch a driver while another one is waiting, because removing an element while iterating over the collection is not safe.
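The safe pattern is to remove through the iterator itself; for example, on a Java collection (illustrative only, not the scheduler's actual data structure):

```scala
import java.util.ArrayList

val waiting = new ArrayList[String]()
waiting.add("driver-1")
waiting.add("driver-2")

// Removing via the iterator's own remove() is safe; mutating the list
// directly inside the loop would not be.
val it = waiting.iterator()
while (it.hasNext) {
  if (it.next() == "driver-1") it.remove()
}
```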
Author: Aaron Davidson <aaron@databricks.com>
Closes #1289 from aarondav/master-fail and squashes the following commits:
1cf1cf4 [Aaron Davidson] SPARK-2350: Don't NPE while launching drivers
Workaround Hadoop conf ConcurrentModification issue
Author: Raymond Liu <raymond.liu@intel.com>
Closes #1273 from colorant/hadoopRDD and squashes the following commits:
994e98b [Raymond Liu] Address comments
e2cda3d [Raymond Liu] Workaround Hadoop conf ConcurrentModification issue
Fix a bad Java code sample and a broken link in the streaming programming guide.
Author: Clément MATHIEU <clement@unportant.info>
Closes #1286 from cykl/streaming-programming-guide-typos and squashes the following commits:
b0908cb [Clément MATHIEU] Fix broken URL
9d3c535 [Clément MATHIEU] Spark streaming requires at least two working threads (scala version was OK)
Trivial fix.
Author: Prashant Sharma <prashant.s@imaginea.com>
Closes #1050 from ScrapCodes/SPARK-2109/pyspark-script-bug and squashes the following commits:
77072b9 [Prashant Sharma] Changed echos to redirect to STDERR.
13f48a0 [Prashant Sharma] [SPARK-2109] Setting SPARK_MEM for bin/pyspark does not work.
The `cast` function doesn't conform to the intention of the comment: "Those expressions are supposed to be in the same data type, and also the return type."
Author: Yijie Shen <henry.yijieshen@gmail.com>
Closes #1283 from yijieshen/master and squashes the following commits:
c7aaa4b [Yijie Shen] [SPARK-2342] Evaluation helper's output type doesn't conform to input type
Just closing out this small JIRA, resolving with a comment change.
Author: Sean Owen <sowen@cloudera.com>
Closes #1171 from srowen/SPARK-1675 and squashes the following commits:
45ee9b7 [Sean Owen] Add simple note that data need not be centered for computePrincipalComponents
It did not handle null keys very gracefully before.
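The fix boils down to a null-safe key hash; roughly (helper name hypothetical):

```scala
// Treat a null key as its own bucket instead of calling hashCode on null,
// which would throw an NPE.
def keyHash(key: Any): Int = if (key == null) 0 else key.hashCode()
```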
Author: Andrew Or <andrewor14@gmail.com>
Closes #1288 from andrewor14/fix-external and squashes the following commits:
312b8d8 [Andrew Or] Abstract key hash code
ed5adf9 [Andrew Or] Fix NPE for ExternalAppendOnlyMap
spark.local.dir is configured as a list of multiple paths, e.g. /data1/sparkenv/local,/data2/sparkenv/local. If the data2 disk of the driver node has an error, the application exits, since DiskBlockManager exits directly in createLocalDirs. If the data2 disk of a worker node has an error, the executor exits as well.
DiskBlockManager should not exit directly in createLocalDirs if one of the spark.local.dir paths has an error. Since spark.local.dir has multiple paths, a problem with one of them should not affect the overall situation.
I think DiskBlockManager could ignore the bad directory in createLocalDirs.
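A minimal sketch of that behavior (hypothetical helper, not the actual DiskBlockManager code):

```scala
import java.io.File

// Keep only the configured local dirs that exist or can be created,
// instead of exiting on the first failure.
def usableLocalDirs(paths: Seq[String]): Seq[File] =
  paths.map(new File(_)).filter(dir => dir.isDirectory || dir.mkdirs())
```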
Author: yantangzhai <tyz0303@163.com>
Closes #1274 from YanTangZhai/SPARK-2324 and squashes the following commits:
609bf48 [yantangzhai] [SPARK-2324] SparkContext should not exit directly when spark.local.dir is a list of multiple paths and one of them has error
df08673 [yantangzhai] [SPARK-2324] SparkContext should not exit directly when spark.local.dir is a list of multiple paths and one of them has error
Author: Takuya UESHIN <ueshin@happy-camper.st>
Closes #1226 from ueshin/issues/SPARK-2287 and squashes the following commits:
32ef7c3 [Takuya UESHIN] Add execution of `SHOW TABLES` before `TestHive.reset()`.
541dc8d [Takuya UESHIN] Merge branch 'master' into issues/SPARK-2287
fac5fae [Takuya UESHIN] Remove unnecessary method receiver.
d306e60 [Takuya UESHIN] Merge branch 'master' into issues/SPARK-2287
7de5706 [Takuya UESHIN] Make ScalaReflection be able to handle Generic case classes.
Unfortunately, when `PruningSuite` is executed first among the Hive tests, `TestHive.reset()` breaks the test environment.
To prevent this, we must run a query before calling reset for the first time.
Author: Takuya UESHIN <ueshin@happy-camper.st>
Closes #1268 from ueshin/issues/SPARK-2328 and squashes the following commits:
043ceac [Takuya UESHIN] Add execution of `SHOW TABLES` before `TestHive.reset()`.
**Description** This patch enables using the `.select()` function in SchemaRDD with functions such as `Sum`, `Count` and others.
**Testing** Unit tests added.
Author: Ximo Guanter Gonzalbez <ximo@tid.es>
Closes #1211 from edrevo/add-expression-support-in-select and squashes the following commits:
fe4a1e1 [Ximo Guanter Gonzalbez] Extend SQL DSL to functions
e1d344a [Ximo Guanter Gonzalbez] SPARK-2186: Spark SQL DSL support for simple aggregations such as SUM and AVG
SqlParser has been case-insensitive since dab5439a08 was merged
Author: CodingCat <zhunansjtu@gmail.com>
Closes #1275 from CodingCat/master and squashes the following commits:
17931cd [CodingCat] update the comments in SqlParser
This functionality was added in an earlier commit but shortly
after was removed due to a bad git merge (totally my fault).
Author: Kay Ousterhout <kayousterhout@gmail.com>
Closes #1149 from kayousterhout/warning_bug and squashes the following commits:
3f1bb00 [Kay Ousterhout] Fixed Json tests
462a664 [Kay Ousterhout] Removed task set name from warning message
e89b2f6 [Kay Ousterhout] Fixed Json tests.
7af424c [Kay Ousterhout] Emit warning when task size exceeds a threshold.
Fix for class of test suite failures in jenkins
Author: Peter MacKinnon <pmackinn@redhat.com>
Closes #1271 from pdmack/master and squashes the following commits:
cfe59fd [Peter MacKinnon] exclude servlet-api in hadoop-client for sbt
6f39fec [Peter MacKinnon] add exclusion for old servlet-api on hadoop-client in core
This is the only occurrence of this pattern in the examples that needs to be replaced. It only addresses the example change.
Author: Sean Owen <sowen@cloudera.com>
Closes #1250 from srowen/SPARK-2293 and squashes the following commits:
6b1b28c [Sean Owen] Compute prediction-and-label RDD directly rather than by zipping, for efficiency
Author: Reynold Xin <rxin@apache.org>
Closes #1260 from rxin/signalhandler1 and squashes the following commits:
8e73552 [Reynold Xin] Uh add Logging back in ApplicationMaster.
0402ba8 [Reynold Xin] Synchronize SignalLogger.register.
dc70705 [Reynold Xin] Added SignalLogger to YARN ApplicationMaster.
79a21b4 [Reynold Xin] Added license header.
0da052c [Reynold Xin] Added the SignalLogger itself.
e587d2e [Reynold Xin] [SPARK-2318] When exiting on a signal, print the signal name first.
This should go into 1.0.1.
Author: Reynold Xin <rxin@apache.org>
Closes #1264 from rxin/SPARK-2322 and squashes the following commits:
c77c07f [Reynold Xin] Added comment to SparkDriverExecutionException and a test case for accumulator.
5d8d920 [Reynold Xin] [SPARK-2322] Exception in resultHandler could crash DAGScheduler and shutdown SparkContext.
I could settle for this being at the debug level too, if we provided an example of how to turn it on in `log4j.properties`
https://issues.apache.org/jira/browse/SPARK-2077
Author: Andrew Ash <andrew@andrewash.com>
Closes #1017 from ash211/SPARK-2077 and squashes the following commits:
580f680 [Andrew Ash] Drop to debug
0266415 [Andrew Ash] SPARK-2077 Log serializer that actually ends up being used
These commits cause `ClosureCleaner.clean` to attempt to serialize the cleaned closure with the default closure serializer and throw a `SparkException` if doing so fails. This behavior is enabled by default but can be disabled at individual callsites of `SparkContext.clean`.
Commit 98e01ae8 fixes some no-op assertions in `GraphSuite` that this work exposed; I'm happy to put that in a separate PR if that would be more appropriate.
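The core of the check can be sketched with plain Java serialization (simplified; the actual code uses the configured closure serializer):

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Proactively serialize the cleaned closure and fail fast with a clear error,
// instead of letting the task fail later at scheduling time.
def ensureSerializable(closure: AnyRef): Unit =
  try new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(closure)
  catch {
    case e: NotSerializableException =>
      throw new RuntimeException("Task not serializable", e)
  }
```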
Author: William Benton <willb@redhat.com>
Closes #143 from willb/spark-897 and squashes the following commits:
bceab8a [William Benton] Commented DStream corner cases for serializability checking.
64d04d2 [William Benton] FailureSuite now checks both messages and causes.
3b3f74a [William Benton] Stylistic and doc cleanups.
b215dea [William Benton] Fixed spurious failures in ImplicitOrderingSuite
be1ecd6 [William Benton] Don't check serializability of DStream transforms.
abe816b [William Benton] Make proactive serializability checking optional.
5bfff24 [William Benton] Adds proactive closure-serializablilty checking
ed2ccf0 [William Benton] Test cases for SPARK-897.
Details can be seen in [SPARK-2104](https://issues.apache.org/jira/browse/SPARK-2104). This work is based on Reynold's work, adding some unit tests to validate the issue.
@rxin , would you please take a look at this PR, thanks a lot.
Author: jerryshao <saisai.shao@intel.com>
Closes #1245 from jerryshao/SPARK-2104 and squashes the following commits:
c8ee362 [jerryshao] Make field partitions transient
2b41917 [jerryshao] Minor changes
47d763c [jerryshao] Fix task serializing issue when sort with Java non serializable class
This commit adds a new metric in TaskMetrics to record
the input data size and displays this information in the UI.
An earlier version of this commit also added the read time,
which can be useful for diagnosing straggler problems,
but unfortunately that change introduced a significant performance
regression for jobs that don't do much computation. In order to
track read time, we'll need to do sampling.
The screenshots below show the UI with the new "Input" field,
which I added to the stage summary page, the executor summary page,
and the per-stage page.
![image](https://cloud.githubusercontent.com/assets/1108612/3167930/2627f92a-eb77-11e3-861c-98ea5bb7a1a2.png)
![image](https://cloud.githubusercontent.com/assets/1108612/3167936/475a889c-eb77-11e3-9706-f11c48751f17.png)
![image](https://cloud.githubusercontent.com/assets/1108612/3167948/80ebcf12-eb77-11e3-87ed-349fce6a770c.png)
Author: Kay Ousterhout <kayousterhout@gmail.com>
Closes #962 from kayousterhout/read_metrics and squashes the following commits:
f13b67d [Kay Ousterhout] Correctly format input bytes on executor page
8b70cde [Kay Ousterhout] Added comment about potential inaccuracy of bytesRead
d1016e8 [Kay Ousterhout] Udated SparkListenerSuite test
8461492 [Kay Ousterhout] Miniscule style fix
ae04d99 [Kay Ousterhout] Remove input metrics for parallel collections
719f19d [Kay Ousterhout] Style fixes
bb6ec62 [Kay Ousterhout] Small fixes
869ac7b [Kay Ousterhout] Updated Json tests
44a0301 [Kay Ousterhout] Fixed accidentally added line
4bd0568 [Kay Ousterhout] Added input source, renamed Hdfs to Hadoop.
f27e535 [Kay Ousterhout] Updates based on review comments and to fix rebase
bf41029 [Kay Ousterhout] Updated Json tests to pass
0fc33e0 [Kay Ousterhout] Added explicit backward compatibility test
4e52925 [Kay Ousterhout] Added Json output and associated tests.
365400b [Kay Ousterhout] [SPARK-1683] Track task read metrics.
Author: Reynold Xin <rxin@apache.org>
Closes #1261 from rxin/ui-pre-size and squashes the following commits:
7ab1a69 [Reynold Xin] [SPARK-2320] Reduce exception/code block font size in web ui