astral-compiler/TODOs.md

Big TODOs for CIDR:

  • Clone the main Spark Optimizer Rules
    • PushProjectionThroughUnion
    • PushProjectionThroughLimit
    • ReorderJoin
    • EliminateOuterJoin
    • PushDownPredicates
    • PushDownLeftSemiAntiJoin
    • PushLeftSemiLeftAntiThroughJoin
    • LimitPushDown
    • LimitPushDownThroughWindow
    • ColumnPruning
    • GenerateOptimization
    • CollapseRepartition
    • CollapseProject
    • OptimizeWindowFunctions
    • CollapseWindow
    • EliminateOffsets
    • EliminateLimits
    • CombineUnions
    • OptimizeRepartition
    • TransposeWindow
    • NullPropagation
    • NullDownPropagation
    • ConstantPropagation
    • FoldablePropagation
    • OptimizeIn
    • OptimizeRand
    • ConstantFolding
    • EliminateAggregateFilter
    • ReorderAssociativeOperator
    • LikeSimplification
    • BooleanSimplification
    • SimplifyConditionals
    • PushFoldableIntoBranches
    • RemoveDispensableExpressions
    • SimplifyBinaryComparison
    • ReplaceNullWithFalseInPredicate
    • PruneFilters
    • SimplifyCasts
    • SimplifyCaseConversionExpressions
    • RewriteCorrelatedScalarSubquery
    • RewriteLateralSubquery
    • EliminateSerialization
    • RemoveRedundantAliases
    • RemoveRedundantAggregates
    • UnwrapCastInBinaryComparison
    • RemoveNoopOperators
    • OptimizeUpdateFields
    • SimplifyExtractValueOps
    • OptimizeCsvJsonExprs
    • CombineConcats
    • PushdownPredicatesAndPruneColumnsForCTEDef
  • Codegen: Output a Spark-like optimizer based on the above rules
    • Logic generation
    • Compilation pipeline to streamline testing
    • Test: Do these rules generate the same results as Spark?
  • Apply the rule merge optimization
    • Generate a merged decision tree
    • Codegen: Generate a rule for the joint BDD
    • Test: Do these rules generate the same results as Spark?
    • Test: Can the generated rule be plugged back into Spark?
  • Apply the view maintenance optimization
    • Recursive pattern-matching logic
    • Codegen: Generate a new optimizer
    • Test: Do these rules generate the same results as Spark?
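
Rough sketch of what the recurring "same results as Spark?" test could look like, together with a very simplified picture of rule merging. Everything here is an assumption made for illustration: the toy `Lit`/`Col`/`Add`/`And` tree stands in for a Catalyst plan, the two rules are hand-written miniatures of `ConstantFolding` and `BooleanSimplification`, and the "merged" optimizer is a plain dispatch table rather than a real BDD. The point is only the shape of the check: run a reference optimizer and a generated one over the same plans and report the first plan on which they disagree.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


# Toy expression tree standing in for a Catalyst plan (hypothetical types).
@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Col, Add, And]
Rule = Callable[[Expr], Optional[Expr]]  # a rule returns None when it does not fire


def is_lit(e: Expr, v: object) -> bool:
    # Identity check so that Lit(1) is not mistaken for Lit(True).
    return isinstance(e, Lit) and e.value is v


def constant_folding(e: Expr) -> Optional[Expr]:
    # Miniature of ConstantFolding: Add of two literals becomes one literal.
    if isinstance(e, Add) and isinstance(e.left, Lit) and isinstance(e.right, Lit):
        return Lit(e.left.value + e.right.value)
    return None


def boolean_simplification(e: Expr) -> Optional[Expr]:
    # Miniature of BooleanSimplification: TRUE AND x = x, FALSE AND x = FALSE.
    if isinstance(e, And):
        if is_lit(e.left, True):
            return e.right
        if is_lit(e.right, True):
            return e.left
        if is_lit(e.left, False) or is_lit(e.right, False):
            return Lit(False)
    return None


def rewrite_children(e: Expr, f: Callable[[Expr], Expr]) -> Expr:
    if isinstance(e, (Add, And)):
        return type(e)(f(e.left), f(e.right))
    return e


def sequential_optimizer(e: Expr) -> Expr:
    # Reference behaviour: bottom-up, try every rule in order, repeat until none fires.
    e = rewrite_children(e, sequential_optimizer)
    for rule in (constant_folding, boolean_simplification):
        out = rule(e)
        if out is not None:
            return sequential_optimizer(out)
    return e


# "Merged" variant: one lookup on the node type picks the only rule that can fire.
MERGED = {Add: constant_folding, And: boolean_simplification}


def merged_optimizer(e: Expr) -> Expr:
    e = rewrite_children(e, merged_optimizer)
    rule = MERGED.get(type(e))
    out = rule(e) if rule else None
    return merged_optimizer(out) if out is not None else e


def first_disagreement(ref, cand, plans):
    # Differential test: return the first plan the two optimizers rewrite differently.
    for p in plans:
        if ref(p) != cand(p):
            return p
    return None


PLANS = [
    Add(Lit(1), Add(Lit(2), Lit(3))),
    And(Lit(True), Col("a")),
    And(Col("a"), And(Lit(False), Col("b"))),
    Add(Col("x"), Add(Lit(1), Lit(1))),
]

if __name__ == "__main__":
    bad = first_disagreement(sequential_optimizer, merged_optimizer, PLANS)
    print("optimizers agree" if bad is None else f"disagree on: {bad}")
```

For the real tests the reference side would be Spark's own optimizer and the plans would come from an actual query corpus; comparing optimized plans structurally is only one option, and comparing query results is another.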