Commit graph

77 commits

Author SHA1 Message Date
Karen Feng cfb6447ac4 Fixed for nonexistent bytes, added unit tests, changed stdout-page to stdout 2013-07-10 11:47:57 -07:00
Karen Feng 620a6974c6 Allows for larger files, refactors lastNBytes, removes old Log column, fixes imports, uses map 2013-07-10 10:20:53 -07:00
Karen Feng b6072b58bf Fixes style, makes "std__-page" consistent, reads only parts of files 2013-07-09 17:25:10 -07:00
Matei Zaharia 1ffadb2d9e Merge remote-tracking branch 'pwendell/ui-updates'
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
2013-07-06 15:51:41 -07:00
Patrick Wendell 39e2325675 Removing dead code 2013-07-02 16:28:40 -07:00
Patrick Wendell 8ca1cc1786 Adding truncation for log files 2013-07-02 16:10:50 -07:00
Patrick Wendell 8688689387 Various formatting changes 2013-07-01 13:40:12 -07:00
Patrick Wendell 362d996c81 Handful of changes based on matei's review
- Avoid exception when no tasks have finished for a stage
- Adding DOCTYPE so css renders properly
- Adding progress slider
2013-06-27 19:14:28 -07:00
Matei Zaharia 6c8d1b2ca6 Fix computation of classpath when we launch java directly
The previous version assumed that a CLASSPATH environment variable was
set by the "run" script when launching the process that starts the
ExecutorRunner, but unfortunately this is not true in tests. Instead, we
factor the classpath calculation into an extenral script and call that.

NOTE: This includes a Windows version but hasn't yet been tested there.
2013-06-25 18:21:00 -04:00
Matei Zaharia 15b00914c5 Some fixes to the launch-java-directly change:
- Split SPARK_JAVA_OPTS into multiple command-line arguments if it
  contains spaces; this splitting follows quoting rules in bash
- Add the Scala JARs to the classpath if they're not in the CLASSPATH
  variable because the ExecutorRunner is launched with "scala" (this can
  happen when using local-cluster URLs in spark-shell)
2013-06-25 17:17:27 -04:00
Matei Zaharia 1ef5d0d2c9 Merge pull request #644 from shimingfei/joblogger
add Joblogger to Spark (on new Spark code)
2013-06-22 09:35:57 -07:00
Mingfei 4b9862ac9c small format modification 2013-06-21 17:55:32 +08:00
Mingfei 5240795154 edit according to comments 2013-06-21 17:38:23 +08:00
Reynold Xin be3c406edf Fixed the typo pointed out by Matei. 2013-06-17 17:07:51 -04:00
Reynold Xin 1450296797 SPARK-781: Log the temp directory path when Spark says "Failed to create
temp directory".
2013-06-17 16:58:23 -04:00
Mingfei 1a4d93c025 modify to pass job annotation by localProperties and use daeamon thread to do joblogger's work 2013-06-08 14:23:39 +08:00
Reynold Xin bed1b08169 Do not create symlink for local add file. Instead, copy the file.
This prevents Spark from changing the original file's permission, and
also allow add file to work on non-posix operating systems.
2013-05-30 16:21:49 -07:00
Stephen Haberman 4fe1fbdd51 Remove unused addIfNoPort. 2013-05-28 16:26:32 -05:00
Mridul Muralidharan dfde9ce9dd comment out debug versions of checkHost, etc from Utils - which were used to test 2013-05-02 07:41:33 +05:30
Mridul Muralidharan 609a817f52 Integrate review comments on pull request 2013-05-02 06:44:33 +05:30
Mridul Muralidharan d960e7e0f8 a) Add support for hyper local scheduling - specific to a host + port - before trying host local scheduling.
b) Add some fixes to test code to ensure it passes (and fixes some other issues).

c) Fix bug in task scheduling which incorrectly used availableCores instead of all cores on the node.
2013-05-01 20:24:00 +05:30
Mridul Muralidharan 7acab3ab45 Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo 2013-04-22 08:01:13 +05:30
Mridul Muralidharan 5ee2f5c483 Cache pattern, add (commented out) alternatives for check* apis 2013-04-17 23:13:34 +05:30
Mridul Muralidharan d90d2af103 Checkpoint commit - compiles and passes a lot of tests - not all though, looking into FileSuite issues 2013-04-15 18:12:11 +05:30
Charles Reiss 092c631fa8 Pull detection of being in a shutdown hook into utility function. 2013-02-19 17:49:55 -08:00
Matei Zaharia 8b3041c723 Reduced the memory usage of reduce and similar operations
These operations used to wait for all the results to be available in an
array on the driver program before merging them. They now merge values
incrementally as they arrive.
2013-02-01 15:38:42 -08:00
Matei Zaharia f03d9760fd Clean up BlockManagerUI a little (make it not be an object, merge with
Directives, and bind to a random port)
2013-01-27 23:56:14 -08:00
Matei Zaharia 4d77d554e1 Merge pull request #394 from JoshRosen/add_file_fix
Add SparkFiles.get() API to access files added through addFile().
2013-01-23 12:16:30 -08:00
Josh Rosen 551a47a620 Refactor daemon thread pool creation. 2013-01-21 23:31:00 -08:00
Josh Rosen ef711902c1 Don't download files to master's working directory.
This should avoid exceptions caused by existing
files with different contents.

I also removed some unused code.
2013-01-21 17:34:17 -08:00
Matei Zaharia a88b44ed3b Only bind to IPv4 addresses when trying to auto-detect external IP 2013-01-21 11:59:21 -08:00
Matei Zaharia 86057ec7c8 Merge branch 'master' into streaming
Conflicts:
	core/src/main/scala/spark/api/python/PythonRDD.scala
2013-01-20 12:47:55 -08:00
Matei Zaharia 54c0f9f185 Fix code that assumed spark.local.dir is only a single directory 2013-01-17 17:40:55 -08:00
Tathagata Das d34dba25c2 Merge branch 'mesos' into dev-merge 2013-01-01 15:48:39 -08:00
Josh Rosen 397e67103c Change Utils.fetchFile() warning to SparkException. 2012-12-28 17:37:13 -08:00
Josh Rosen f1bf4f0385 Skip deletion of files in clearFiles().
This fixes an issue where Spark could delete
original files in the current working directory
that were added to the job using addFile().

There was also the potential for addFile() to
overwrite local files, which is addressed by
changing Utils.fetchFile() to log a warning
instead of overwriting a file with new contents.

This is a short-term fix; a better long-term
solution would be to remove the dependence on
storing files in the current working directory,
since we can't change the cwd from Java.
2012-12-28 17:00:57 -08:00
Reynold Xin eac566a7f4 Merge branch 'master' of github.com:mesos/spark into dev
Conflicts:
	core/src/main/scala/spark/MapOutputTracker.scala
	core/src/main/scala/spark/PairRDDFunctions.scala
	core/src/main/scala/spark/ParallelCollection.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/CartesianRDD.scala
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/rdd/CoalescedRDD.scala
	core/src/main/scala/spark/rdd/FilteredRDD.scala
	core/src/main/scala/spark/rdd/FlatMappedRDD.scala
	core/src/main/scala/spark/rdd/GlommedRDD.scala
	core/src/main/scala/spark/rdd/HadoopRDD.scala
	core/src/main/scala/spark/rdd/MapPartitionsRDD.scala
	core/src/main/scala/spark/rdd/MapPartitionsWithSplitRDD.scala
	core/src/main/scala/spark/rdd/MappedRDD.scala
	core/src/main/scala/spark/rdd/PipedRDD.scala
	core/src/main/scala/spark/rdd/SampledRDD.scala
	core/src/main/scala/spark/rdd/ShuffledRDD.scala
	core/src/main/scala/spark/rdd/UnionRDD.scala
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/BlockManagerId.scala
	core/src/main/scala/spark/storage/BlockManagerMaster.scala
	core/src/main/scala/spark/storage/StorageLevel.scala
	core/src/main/scala/spark/util/MetadataCleaner.scala
	core/src/main/scala/spark/util/TimeStampedHashMap.scala
	core/src/test/scala/spark/storage/BlockManagerSuite.scala
	run
2012-12-20 14:53:40 -08:00
Matei Zaharia e1d7cd2276 Search for a non-loopback address in Utils.getLocalIpAddress 2012-12-08 00:33:11 -08:00
Tathagata Das 0fe2fc4d5e Merged branch mesos/master to branch dev. 2012-11-26 13:16:59 -08:00
mbautin 00f4e3ff9c Addressing Matei's comment: SPARK_LOCAL_IP environment variable 2012-11-19 11:52:10 -08:00
mbautin 1f5a7e0e64 SPARK-624: make the default local IP customizable 2012-11-15 13:57:47 -08:00
Matei Zaharia 863a55ae42 Merge remote-tracking branch 'public/master' into dev
Conflicts:
	core/src/main/scala/spark/BlockStoreShuffleFetcher.scala
	core/src/main/scala/spark/KryoSerializer.scala
	core/src/main/scala/spark/MapOutputTracker.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/executor/Executor.scala
	core/src/main/scala/spark/network/Connection.scala
	core/src/main/scala/spark/network/ConnectionManagerTest.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/NewHadoopRDD.scala
	core/src/main/scala/spark/scheduler/ShuffleMapTask.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/BlockMessage.scala
	core/src/main/scala/spark/storage/BlockStore.scala
	core/src/main/scala/spark/storage/StorageLevel.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	project/SparkBuild.scala
	run
2012-10-24 23:21:00 -07:00
Reynold Xin f66c0e9561 Changed the println to logInfo in Utils.fetchFile. 2012-10-07 01:53:24 -07:00
Reynold Xin 80f59e17e2 Fixed a bug in addFile that if the file is specified as "file:///", the
symlink is created wrong for local mode.
2012-10-07 00:54:38 -07:00
Matei Zaharia 1d44644f4f Logging tweaks 2012-09-28 23:28:16 -07:00
Matei Zaharia ae8c7d6cfa Made disk store use multiple directories, deleted ShuffleManager 2012-09-28 18:28:13 -07:00
Matei Zaharia 3d7267999d Print and track user call sites in more places in Spark 2012-09-28 17:42:00 -07:00
Matei Zaharia 051785c7e6 Several fixes to sampling issues pointed out by Henry Milner:
- takeSample was biased towards earlier partitions
- There were some range errors in takeSample
- SampledRDDs with replacement didn't produce appropriate counts
  across partitions (we took exactly frac of each one)
2012-09-25 21:46:58 -07:00
Denny ca64d16a2d When a file is downloaded, make it executable. That's neccsary for scripts (e.g. in Shark) 2012-09-17 10:08:37 -07:00
Denny b864c36a30 Dynamically adding jar files and caching fileSets. 2012-09-10 12:49:09 -07:00