Big Todos for CIDR:
- Clone the main Spark Optimizer Rules
- PushProjectionThroughUnion,
- PushProjectionThroughLimit,
- ReorderJoin,
- EliminateOuterJoin,
- PushDownPredicates,
- PushDownLeftSemiAntiJoin,
- PushLeftSemiLeftAntiThroughJoin,
- LimitPushDown,
- LimitPushDownThroughWindow,
- ColumnPruning,
- GenerateOptimization,
- CollapseRepartition,
- CollapseProject,
- OptimizeWindowFunctions,
- CollapseWindow,
- EliminateOffsets,
- EliminateLimits,
- CombineUnions,
- OptimizeRepartition,
- TransposeWindow,
- NullPropagation,
- NullDownPropagation,
- ConstantPropagation,
- FoldablePropagation,
- OptimizeIn,
- OptimizeRand,
- ConstantFolding,
- EliminateAggregateFilter,
- ReorderAssociativeOperator,
- LikeSimplification,
- BooleanSimplification,
- SimplifyConditionals,
- PushFoldableIntoBranches,
- RemoveDispensableExpressions,
- SimplifyBinaryComparison,
- ReplaceNullWithFalseInPredicate,
- PruneFilters,
- SimplifyCasts,
- SimplifyCaseConversionExpressions,
- RewriteCorrelatedScalarSubquery,
- RewriteLateralSubquery,
- EliminateSerialization,
- RemoveRedundantAliases,
- RemoveRedundantAggregates,
- UnwrapCastInBinaryComparison,
- RemoveNoopOperators,
- OptimizeUpdateFields,
- SimplifyExtractValueOps,
- OptimizeCsvJsonExprs,
- CombineConcats,
- PushdownPredicatesAndPruneColumnsForCTEDef
- Codegen: Output a Spark-like optimizer based on the above rules
- Logic generation
- Compilation pipeline to streamline testing
- Test: Do these rules generate the same results as Spark?
- Apply the rule merge optimization
- Generate a merged decision tree
- Codegen: Generate a rule for the joint BDD
- Test: Do these rules generate the same results as Spark?
- Test: Can the generated rule be plugged back into Spark?
- Apply the view maintenance optimization
- Recursive Pattern-Matching Logic
- Codegen: Generate a new optimizer
- Test: Do these rules generate the same results as Spark?
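
The recurring "same results as Spark?" test could be set up as a differential test: run a reference optimizer and a generated optimizer on the same plans and compare the output plans. The sketch below is only an illustration in plain Python, not Spark code. The `Plan` type, the two toy rules (loosely modelled on CollapseProject and on combining adjacent limits) and the dictionary dispatch standing in for a merged decision tree are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical toy logical plan node (not Spark's LogicalPlan)."""
    op: str
    args: Tuple = ()
    children: Tuple["Plan", ...] = ()


def scan(name):
    return Plan("Scan", (name,))


def project(cols, child):
    return Plan("Project", (tuple(cols),), (child,))


def limit(n, child):
    return Plan("Limit", (n,), (child,))


# A rule returns a rewritten node, or None when it does not fire.
Rule = Callable[[Plan], Optional[Plan]]


def collapse_project(p):
    # Toy stand-in: Project(a, Project(b, c)) -> Project(a, c).
    # Assumes the outer columns are plain references available in c.
    if p.op == "Project" and p.children[0].op == "Project":
        return Plan("Project", p.args, p.children[0].children)
    return None


def combine_limits(p):
    # Toy stand-in: Limit(n, Limit(m, c)) -> Limit(min(n, m), c).
    if p.op == "Limit" and p.children[0].op == "Limit":
        inner = p.children[0]
        return Plan("Limit", (min(p.args[0], inner.args[0]),), inner.children)
    return None


RULES = [collapse_project, combine_limits]


def transform_up(plan, rule):
    """Rewrite children first, then try the rule once at this node."""
    node = Plan(plan.op, plan.args,
                tuple(transform_up(c, rule) for c in plan.children))
    out = rule(node)
    return node if out is None else out


def reference_optimize(plan, rules, max_iters=100):
    """One traversal per rule, repeated until a fixed point is reached."""
    for _ in range(max_iters):
        new = plan
        for rule in rules:
            new = transform_up(new, rule)
        if new == plan:
            break
        plan = new
    return plan


def merged_rule(node):
    # Stand-in for a merged decision tree: dispatch on the root operator
    # once instead of trying every rule in turn.
    dispatch = {"Project": collapse_project, "Limit": combine_limits}
    rule = dispatch.get(node.op)
    return rule(node) if rule else None


def merged_optimize(plan, max_iters=100):
    """Single traversal with the merged rule, repeated to a fixed point."""
    for _ in range(max_iters):
        new = transform_up(plan, merged_rule)
        if new == plan:
            break
        plan = new
    return plan


def same_results(plans):
    """Differential test: both optimizers must agree on every plan."""
    return all(reference_optimize(p, RULES) == merged_optimize(p)
               for p in plans)
```

For a real run, the reference side would be Spark's own optimizer and the plans would come from a test corpus; here `same_results([limit(5, limit(3, project(["a"], project(["a", "b"], scan("t")))))])` compares the two toy optimizers on one nested plan.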