ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Shivaram Venkataraman	fd137bd7c6	Address Reynold's comments. Also use a builder pattern to construct the regression classes.	2013-07-05 11:13:45 -07:00
Shivaram Venkataraman	48770419bd	Add random data used for LR testing. Verified that results match with glm in R	2013-07-05 11:13:45 -07:00
Shivaram Venkataraman	282c8ed788	Add LogisticRegression using StochasticGradientDescent. Also refactor RidgeRegression and LogisticRegression to re-use code and update the test as well	2013-07-05 11:13:45 -07:00
Shivaram Venkataraman	b9d9b6f981	Add a unit test for Ridge Regression	2013-07-05 11:13:45 -07:00
Shivaram Venkataraman	4dc13bf5be	Revert back to closed form CV error	2013-07-05 11:13:45 -07:00
Shivaram Venkataraman	c8169c0a33	Add LPSA data set. Data from http://www-stat.stanford.edu/~tibs/ElemStatLearn/datasets/prostate.data	2013-07-05 11:13:45 -07:00
Shivaram Venkataraman	c070decb8e	Add methods to normalize the data before training Also update model after training based appropriately.	2013-07-05 11:13:45 -07:00
Reynold Xin	6a9a9a364c	Minor clean up of the RidgeRegression code. I am not even sure why I did this :s.	2013-07-05 11:13:45 -07:00
Matei Zaharia	729e463f64	Import RidgeRegression example Conflicts: run	2013-07-05 11:13:41 -07:00
Matei Zaharia	6ad85d0918	Merge pull request #677 from jerryshao/fix_stage_clean Clean StageToInfos periodically when spark.cleaner.ttl is enabled	2013-07-04 21:32:29 -07:00
jerryshao	e4ff544a8d	Clean StageToInfos periodically when spark.cleaner.ttl is enabled	2013-07-05 10:34:45 +08:00
Konstantin Boudnik	7687ed5292	Use standard ASF published avro module instead of a proprietory built one	2013-07-04 13:48:33 -07:00
Lian Cheng	c0c3155c3c	Bug fix: SPARK-789 https://spark-project.atlassian.net/browse/SPARK-789	2013-07-05 00:54:10 +08:00
Andrew xia	6ccfb73ca9	Add fair scheduler config template file	2013-07-04 19:19:44 +08:00
Holden Karau	0f06d6217d	s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning	2013-07-04 01:05:39 -07:00
Mingfei	04567a1771	update guava version from 11.0.1 to 14.0.1	2013-07-03 17:43:37 +08:00
Y.CORP.YAHOO.COM\tgraves	923cf92900	Rework from pull request. Removed --user option from Spark on Yarn Client, made the user of JAVA_HOME environment variable conditional on if its set, and created addCredentials in each of the SparkHadoopUtil classes to only add the credentials when the profile is hadoop2-yarn.	2013-07-02 21:18:59 -05:00
Patrick Wendell	39e2325675	Removing dead code	2013-07-02 16:28:40 -07:00
Patrick Wendell	8ca1cc1786	Adding truncation for log files	2013-07-02 16:10:50 -07:00
Matei Zaharia	6d60fe571a	Merge pull request #666 from c0s/master hbase dependency is missed in hadoop2-yarn profile of examples module	2013-07-01 18:24:03 -07:00
Konstantin Boudnik	6fdbc68f2c	Fixing missed hbase dependency in examples hadoop2-yarn profile	2013-07-01 17:45:07 -07:00
Patrick Wendell	9a42d04efa	Throw exception for missing resource	2013-07-01 14:43:13 -07:00
Patrick Wendell	1025d7d1ef	Package refactoring	2013-07-01 14:40:53 -07:00
Patrick Wendell	30b9034241	Fixing bug where logs aren't shown	2013-07-01 13:48:01 -07:00
Patrick Wendell	8688689387	Various formatting changes	2013-07-01 13:40:12 -07:00
Patrick Wendell	735c951a09	Adding test script	2013-07-01 09:33:22 -07:00
Patrick Wendell	5de326db7d	Print exception message	2013-07-01 09:19:45 -07:00
root	7cd490ef5b	Clarify that PySpark is not supported on Windows	2013-07-01 06:26:43 +00:00
root	ec31e68d5d	Fixed PySpark perf regression by not using socket.makefile(), and improved debuggability by letting "print" statements show up in the executor's stderr Conflicts: core/src/main/scala/spark/api/python/PythonRDD.scala	2013-07-01 06:26:31 +00:00
root	3296d132b6	Fix performance bug with new Python code not using buffered streams	2013-07-01 06:25:43 +00:00
Matei Zaharia	39ae073b5c	Increase SLF4j version in Maven too	2013-06-30 17:11:14 -07:00
Matei Zaharia	5bbd0eec84	Update docs on SCALA_LIBRARY_PATH	2013-06-30 17:00:40 -07:00
Matei Zaharia	03d0b858c8	Made use of spark.executor.memory setting consistent and documented it Conflicts: core/src/main/scala/spark/SparkContext.scala	2013-06-30 15:46:46 -07:00
Matei Zaharia	ccfe953a4d	Merge pull request #577 from skumargithub/master Example of cumulative counting using updateStateByKey	2013-06-29 17:57:53 -07:00
Matei Zaharia	5cfcd3c336	Remove Twitter4J specific repo since it's in Maven central	2013-06-29 15:37:27 -07:00
Matei Zaharia	4358acfe07	Initialize Twitter4J OAuth from system properties instead of prompting	2013-06-29 15:25:06 -07:00
Matei Zaharia	1667158544	Merge remote-tracking branch 'mrpotes/master'	2013-06-29 14:36:09 -07:00
Patrick Wendell	e721ff7e5a	Allowing details for failed stages	2013-06-29 11:26:30 -07:00
Patrick Wendell	473961d82e	Styling for progress bar	2013-06-29 08:38:04 -07:00
Patrick Wendell	249f0e54ba	Minor changes from Matei's review	2013-06-28 13:25:26 -07:00
Matei Zaharia	50ca17635a	Merge pull request #664 from pwendell/test-fix Removing incorrect test statement	2013-06-27 22:24:52 -07:00
Matei Zaharia	4974b658ed	Look at JAVA_HOME before PATH to determine Java executable	2013-06-27 22:16:40 -07:00
Patrick Wendell	c537e869f3	Missing logo file	2013-06-27 22:02:03 -07:00
Patrick Wendell	c767e74370	Removing incorrect test statement	2013-06-27 21:48:58 -07:00
Patrick Wendell	62c2c6b856	Forcing Jetty to run as daemon	2013-06-27 21:47:22 -07:00
Patrick Wendell	a55190d314	Adding better tabs for UI headers.	2013-06-27 19:14:51 -07:00
Patrick Wendell	362d996c81	Handful of changes based on matei's review - Avoid exception when no tasks have finished for a stage - Adding DOCTYPE so css renders properly - Adding progress slider	2013-06-27 19:14:28 -07:00
Patrick Wendell	92a4c2a5f6	Fixing bug in local scheduler time recording	2013-06-27 12:33:06 -07:00
Matei Zaharia	aea727f68d	Simplify Python docs a little to do substring search	2013-06-26 21:15:09 -07:00
Matei Zaharia	03906f7f0a	Fixes to compute-classpath on Windows	2013-06-26 17:40:22 -07:00

3119 commits