spark-instrumented-optimizer/python/pyspark/pandas/_typing.py
Takuya UESHIN a98c8ae57d [SPARK-35944][PYTHON] Introduce Name and Label type aliases
### What changes were proposed in this pull request?

Introduce `Name` and `Label` type aliases to make clear in annotations what is expected, instead of the generic `Any` or `Union[Any, Tuple]`.

- `Label`: `Tuple[Any, ...]`
  Internal representation of name-like metadata, such as `index_names`, `column_labels`, and `column_label_names` in `InternalFrame`, and similar internal structures.
- `Name`: `Union[Any, Label]`
  External, user-facing representation of names, which can be scalar values or tuples (see the sketch below).
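
As a minimal sketch of how the two aliases are meant to be read (not part of this PR; `to_label` is a hypothetical helper), `Name` is what users pass in and `Label` is the normalized tuple form used internally:

```python
from typing import Any, Tuple, Union

Label = Tuple[Any, ...]   # internal form: always a tuple, e.g. ("a",) or ("a", "b")
Name = Union[Any, Label]  # external form: a scalar like "a" or a tuple like ("a", "b")


def to_label(name: Name) -> Label:
    """Normalize a user-facing name into the internal tuple form."""
    return name if isinstance(name, tuple) else (name,)


assert to_label("a") == ("a",)
assert to_label(("a", "b")) == ("a", "b")
```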

### Why are the changes needed?

Currently `Any` or `Union[Any, Tuple]` is used for name-like values, so annotations do not distinguish which form is expected. Dedicated type aliases make clear whether the internal tuple form (`Label`) or the user-facing form (`Name`) is meant.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes #33159 from ueshin/issues/SPARK-35944/name_and_label.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-07-01 09:40:07 +09:00

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import datetime
import decimal
from typing import Any, Tuple, TypeVar, Union, TYPE_CHECKING

import numpy as np
from pandas.api.extensions import ExtensionDtype

if TYPE_CHECKING:
    from pyspark.pandas.base import IndexOpsMixin  # noqa: F401 (SPARK-34943)
    from pyspark.pandas.frame import DataFrame  # noqa: F401 (SPARK-34943)
    from pyspark.pandas.generic import Frame  # noqa: F401 (SPARK-34943)
    from pyspark.pandas.indexes.base import Index  # noqa: F401 (SPARK-34943)
    from pyspark.pandas.series import Series  # noqa: F401 (SPARK-34943)


# TypeVars
T = TypeVar("T")

FrameLike = TypeVar("FrameLike", bound="Frame")
IndexOpsLike = TypeVar("IndexOpsLike", bound="IndexOpsMixin")

# Type aliases
Scalar = Union[
    int, float, bool, str, bytes, decimal.Decimal, datetime.date, datetime.datetime, None
]

# TODO: use the actual type parameters.
Label = Tuple[Any, ...]
Name = Union[Any, Label]

Axis = Union[int, str]
Dtype = Union[np.dtype, ExtensionDtype]

DataFrameOrSeries = Union["DataFrame", "Series"]
SeriesOrIndex = Union["Series", "Index"]