FDB Notes and Lazy Transaction Notes

Oliver Kennedy 2019-09-04 23:46:54 -04:00
parent f19dc629ed
commit beb379d998
Signed by: okennedy
GPG key ID: 3E5F9B3ABD3FDB60
2 changed files with 78 additions and 0 deletions


@@ -4,6 +4,16 @@ title: Functional Data Structures
date: Sept. 3
---
<!-- 2019 by OK
Most of the deck works, but some of the do-it-yourself examples need work.
- The set example didn't happen... can probably cut that.
- Example 3 needs more lead-up. In particular, it would be good to emphasize how lazy evaluation works (i.e., lazy references).
- It might be helpful to rephrase example 3 in terms of REPLACE rather than APPEND.
- It might be helpful to have a summary takeaway --- Lazy evaluation allows limited mutability, BUT in a very controlled, and **monotone** way.
Parts of the amortized analysis fell a bit flat. It might be useful to start off with a bit more of a motivating example. In particular, a timeline of 'time' for each operation might be helpful, as well as an emphasis on computing "Average" rather than "Worst-case" behavior.
-->
<section>
<section>
<p><span class="fragment highlight-red">Mutable</span> vs Immutable Data</p>


@@ -0,0 +1,68 @@
---
xtemplate: templates/cse662_2019_slides.erb
title: Lazy Transactions
date: Sept. 5
---
Questions
- Why are timestamps needed?
- How does the system figure out the read/write sets?
- The Now phase is also required to determine the write set of the transaction: Which values get written to may be determined by the current state of the database.
- Strengths/weaknesses of lazy transactions on bursty workloads: where do they work, and where don't they?
- How to create spatial locality in transactions scheduled adjacently
- Read latency: read costs can be extremely bursty. How to avoid this?
- Normal distribution microbenchmark: Creating synthetic spatial locality as well.
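A toy sketch of the second question above: the write set can depend on the current database state, so the system may have to (partially) run the transaction during the now phase just to learn which keys it writes. All names and the schema here are invented for illustration.

```python
# Hypothetical sketch: a transaction whose write set depends on DB state.
# 'db' is a plain dict standing in for the store; names are illustrative.

def transfer_to_cheapest(db, amount):
    # The key we write to is only known after reading current prices,
    # so the write set cannot be computed without partially executing
    # the transaction -- work the "now" phase must do.
    cheapest = min(db["prices"], key=db["prices"].get)
    db["balances"][cheapest] = db["balances"].get(cheapest, 0) + amount
    return cheapest

db = {"prices": {"a": 3, "b": 1, "c": 2}, "balances": {}}
target = transfer_to_cheapest(db, 10)
# The write set {("balances", target)} was state-dependent.
```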
- Overview
- Transactions
- Declarative vs pl/X
- Read/Write Sets + Code
- Object here = record
- Challenges
- Parallel Execution
- Recap:
- OCC
- Locking
- Guarantees required by Commit
- Xact guaranteed not to abort
- Deadlock / OCC conflict
- Constraint violation (Uniqueness, Domain, FK, etc...)
- Model as code that reads a set of values and returns Y/N
- Xact's effects are guaranteed to be visible to subsequent xacts
- Simple Model: Sequential Execution
- How to improve:
- Split commit check vs actual execution (Now vs Later)
- Collapse commit checks, execute in background
- Except what if we need to read a value?
- Naive: Force execution of the log up to that point
- Graph model: Dependencies
- Total order vs Partial order
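A minimal sketch of the now/later split described above, with invented names: the "now" phase runs a cheap commit check and appends the deferred effect to a log; a background "later" phase drains the log in commit order (a total order here; the graph model relaxes this to a partial order).

```python
# Sketch (invented names): split commit checks from execution.
# "Now" phase: decide commit/abort immediately, defer the effect.
# "Later" phase: execute logged transactions in the background, in order.

log = []          # unexecuted, committed transactions, in commit order
state = {"x": 0}  # the database

def submit(check, effect):
    """Now phase. In the real system, a check that reads a lazily
    written value would force execution up to that point first."""
    if not check(state):   # e.g., a constraint check -> Y/N
        return False       # abort: never enters the log
    log.append(effect)     # committed, but not yet executed
    return True

def run_later():
    """Later phase: drain the log in order (total order)."""
    while log:
        log.pop(0)(state)

submit(lambda s: True, lambda s: s.update(x=s["x"] + 1))
submit(lambda s: s["x"] >= 0, lambda s: s.update(x=s["x"] * 2))
run_later()
```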
- Implementing
- List of Unexecuted Transactions
- Stored w/ Read Set + Write Set
- [[[ How do you get the Read/Write sets out? ]]]
- Might need to generate the read/write sets during the now phase.
- Key/value map / Versioned [[[ why? ]]]
- Value can be a reference to a transaction
- Like lazy execution: lazy value overwritten after transaction executes.
- Forcing a read (e.g., during the now phase) forces execution of everything before
- [[[ Why is this a problem? ]]]
- Burstiness! Some reads might take absurdly long
- [[[ How do we fix this? ]]]
- Trade-off: Limit length of read chains
- [[[ How do we do this cheaply? ]]]
- Don't trace the graph to compute depth (that's O(N) in the maximum graph depth)
- Keep a "depth" counter with each written value.
- [[[ How do you allocate worker threads ]]]
- Single-threaded sticky: Already working with lots of highly contended data, with lots of very fast operations
- Multi-threaded workers: Natural parallelism exposed through read/write sets
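A sketch (invented names, single-threaded for clarity) of the versioned map above: a slot holds either a concrete value or a "sticky" reference to the unexecuted transaction that will produce it. Reading forces the chain of deferred transactions behind the value, and a per-value depth counter caps chain length in O(1) instead of tracing the graph.

```python
# Sketch: lazy "sticky" values with a depth counter to limit read chains.
MAX_DEPTH = 3  # assumed bound on read-chain length

class Sticky:
    def __init__(self, txn, depth):
        self.txn = txn      # deferred transaction that produces this value
        self.depth = depth  # length of the dependency chain behind it

store = {}

def write_lazy(key, txn, dep_keys):
    # depth = 1 + deepest still-lazy value this txn reads (O(1) per dep)
    d = 1 + max((store[k].depth for k in dep_keys
                 if isinstance(store.get(k), Sticky)), default=0)
    if d > MAX_DEPTH:
        txn()                       # chain too long: execute eagerly
    else:
        store[key] = Sticky(txn, d) # defer; value is a txn reference

def read(key):
    v = store.get(key)
    if isinstance(v, Sticky):
        v.txn()  # force the deferred txn; it overwrites key (and forces
                 # everything it reads, recursively -- hence burstiness)
    return store[key]

write_lazy("a", lambda: store.__setitem__("a", 1), [])
write_lazy("b", lambda: store.__setitem__("b", read("a") + 1), ["a"])
result = read("b")  # forces a's producer, then b's
```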
- Minimizing work during "Now" phase
- Optimize Uniqueness Checks by only consulting index (not actual record) --- does the key exist in the map?
- Minimizing overhead
- Move fast, high-contention operations to the now phase
- Spatial locality
- [[[ The authors claim that the system can absorb bursty workloads. When is this true / not true? ]]]
- Graph review:
- Microbenchmark: Normal distribution creates spatial locality
- External read latency CDF: 10s reads vs 1ms!!!