Matei Zaharia
b0110de5b6
Warn about user programs that try to set spark.cache.class
2012-10-06 17:27:14 -07:00
Matei Zaharia
65113b7e1b
Only group elements ten at a time into SequenceFile records in
saveAsObjectFile
2012-10-06 17:14:41 -07:00
Matei Zaharia
716e10ca32
Minor formatting fixes
2012-10-05 22:03:06 -07:00
Matei Zaharia
70f02fa912
Merge branch 'dev' of github.com:mesos/spark into dev
2012-10-05 22:00:22 -07:00
Andy Konwinski
a242cdd0a6
Factor subclasses of RDD out of RDD.scala into their own classes
in the rdd package.
2012-10-05 19:53:54 -07:00
Andy Konwinski
d7363a6b8a
Moves all files in core/src/main/scala/ that have RDD in their name
from that directory to a new core/src/main/scala/rdd directory.
2012-10-05 19:23:45 -07:00
Andy Konwinski
e0067da082
Moves all files in core/src/main/scala/ that have RDD in them from
package spark to package spark.rdd and updates all references to them.
2012-10-05 19:23:45 -07:00
Matei Zaharia
69588baf65
Cleaning up code slightly
2012-10-05 19:16:09 -07:00
root
f52bc09a34
Reduce some overly aggressive logging in connection manager
2012-10-06 01:54:39 +00:00
Matei Zaharia
e3ae98b54e
Merge pull request #247 from squito/dev
Dev
2012-10-05 10:27:18 -07:00
Imran Rashid
e0698f8f26
change tests to show utility of localValue
2012-10-04 23:05:42 -07:00
Imran Rashid
82a3327862
make accumulator.localValue public, add tests
Conflicts:
core/src/test/scala/spark/AccumulatorSuite.scala
2012-10-04 23:05:01 -07:00
Matei Zaharia
8c82f43db3
Scaladoc documentation for some core Spark functionality
2012-10-04 22:59:36 -07:00
Reynold Xin
45f4b7cc7e
Made Serializer and JavaSerializer non private.
2012-10-03 10:20:59 -07:00
Matei Zaharia
833f1d0c86
Made StorageLevel public
2012-10-03 08:27:25 -07:00
Matei Zaharia
6cf5dffc72
Make more stuff private[spark]
2012-10-02 22:28:55 -07:00
Matei Zaharia
626f701931
Merge pull request #240 from dennybritz/private_classes
Package-Private Classes
2012-10-02 21:24:32 -07:00
Denny
0361353a70
Make Java API abstract wrapped functions private
2012-10-02 20:02:53 -07:00
Denny
b9badcd5bd
accidentally removed trait
2012-10-02 19:35:07 -07:00
Denny
18a1faedf6
Stylistic changes and Public Accumulable and Broadcast
2012-10-02 19:28:37 -07:00
Denny
b7a913e1fa
Make dependency classes public - used by spark
2012-10-02 19:04:23 -07:00
Denny
4d9f4b01af
Make classes package private
2012-10-02 19:00:19 -07:00
Matei Zaharia
97cbd699d7
Merge branch 'dev' of github.com:mesos/spark into dev
2012-10-02 17:31:01 -07:00
Matei Zaharia
6098f7e87a
Fixed cache replacement behavior of BlockManager:
- Partitions that get dropped to disk will now be loaded back into RAM
  after they're accessed again
- Same-RDD rule for cache replacement is now implemented (don't drop
  partitions from an RDD to make room for other partitions from itself)
- Items stored as MEMORY_AND_DISK go into memory only first, instead of
  being eagerly written out to disk
- MemoryStore.ensureFreeSpace is called within a lock on the writer
  thread to prevent race conditions (this can still be optimized to
  allow multiple concurrent calls to it but it's a start)
- MemoryStore does not accept blocks larger than its limit
2012-10-02 17:25:38 -07:00
Reynold Xin
7997585616
Added a check to make sure SPARK_MEM <= memoryPerSlave for local cluster
mode.
2012-10-02 15:45:25 -07:00
Reynold Xin
0898a21b95
Merge branch 'dev' of https://github.com/mesos/spark into dev
2012-10-02 13:08:01 -07:00
Matei Zaharia
22684653a5
Revert "Place Spray repo ahead of Cloudera in Maven search path"
This reverts commit 42e0a68082.
2012-10-02 12:01:32 -07:00
Reynold Xin
b8cd681169
Allow whitespaces in cluster URL configuration for local cluster.
2012-10-02 11:52:12 -07:00
Matei Zaharia
42e0a68082
Place Spray repo ahead of Cloudera in Maven search path
2012-10-02 11:37:19 -07:00
Matei Zaharia
b9fb8d6463
Include date in folder name for Spark local dir.
2012-10-01 15:55:16 -07:00
Matei Zaharia
bc881e4798
Merge branch 'dev' of github.com:mesos/spark into dev
2012-10-01 15:21:56 -07:00
Matei Zaharia
802aa8aef9
Some bug fixes and logging fixes for broadcast.
2012-10-01 15:20:42 -07:00
Reynold Xin
f264153162
Fixed #232: DirectBuffer's cleaner was empty and Spark tried to invoke
clean on it.
2012-10-01 14:07:34 -07:00
Matei Zaharia
3b348f909d
Improve log messages from BlockManager
2012-10-01 12:01:38 -07:00
Matei Zaharia
53f90d0f0e
Use underscores instead of colons in RDD IDs
2012-10-01 10:48:53 -07:00
Matei Zaharia
2314132d57
Added a (failing) test for LRU with MEMORY_AND_DISK.
2012-09-30 22:52:16 -07:00
Matei Zaharia
3128c57f90
Simplified Class / ClassLoader test
2012-09-30 21:48:27 -07:00
Matei Zaharia
83143f9a5f
Fixed several bugs that caused weird behavior with files in spark-shell:
- SizeEstimator was following through a ClassLoader field of Hadoop
  JobConfs, which referenced the whole interpreter, Scala compiler, etc.
  Chaos ensued, giving an estimated size in the tens of gigabytes.
- Broadcast variables in local mode were only stored as MEMORY_ONLY and
  never made accessible over a server, so they fell out of the cache when
  they were deemed too large and couldn't be reloaded.
2012-09-30 21:19:39 -07:00
Matei Zaharia
fd0374b9de
Comment
2012-09-29 21:43:06 -07:00
Matei Zaharia
5718cef2a4
Removed Logging trait from CoalescedRDD since we don't log anything
2012-09-29 21:40:43 -07:00
Matei Zaharia
143ef4f90d
Added a CoalescedRDD class for reducing the number of partitions in an RDD.
2012-09-29 21:30:52 -07:00
Matei Zaharia
ebd52347b5
Merge branch 'dev' of github.com:mesos/spark into dev
2012-09-29 20:22:31 -07:00
Matei Zaharia
9b326d01e9
Made BlockManager unmap memory-mapped files when necessary to reduce the
number of open files. Also optimized sending of disk-based blocks.
2012-09-29 20:21:54 -07:00
Matei Zaharia
2f11e3c285
Merge pull request #227 from JoshRosen/fix/distinct_numsplits
Allow controlling number of splits in distinct().
2012-09-28 23:57:24 -07:00
Josh Rosen
8654165e69
Use null as dummy value in distinct().
2012-09-28 23:55:17 -07:00
Josh Rosen
37c199bbb0
Allow controlling number of splits in distinct().
2012-09-28 23:44:19 -07:00
Matei Zaharia
56dcad5936
Don't create a Cache in SparkEnv because we don't use it
2012-09-28 23:40:56 -07:00
Matei Zaharia
1d44644f4f
Logging tweaks
2012-09-28 23:28:16 -07:00
Matei Zaharia
815d6bd69a
Renamed subdirs option
2012-09-28 19:02:41 -07:00
Matei Zaharia
e54e1d7043
Made subdirs per local dir configurable, and reduced lock usage a bit
2012-09-28 19:00:50 -07:00