spark-instrumented-optimizer/.github/workflows
Kent Yao 16e2faadac [SPARK-34944][SQL][TESTS] Replace bigint with int for web_returns and store_returns in TPCDS tests to employ correct data type
### What changes were proposed in this pull request?

According to  http://www.tpc.org/tpc_documents_current_versions/pdf/tpc-ds_v2.9.0.pdf

```
 2.2.2 Datatype
2.2.2.1 Each column employs one of the following datatypes:
a) Identifier means that the column shall be able to hold any key value generated for that column.
b) Integer means that the column shall be able to exactly represent integer values (i.e., values in increments of
1) in the range of at least ( − 2n − 1) to (2n − 1 − 1), where n is 64.
c) Decimal(d, f) means that the column shall be able to represent decimal values up to and including d digits,
of which f shall occur to the right of the decimal place; the values can be either represented exactly or
interpreted to be in this range.
d) Char(N) means that the column shall be able to hold any string of characters of a fixed length of N.
Comment: If the string that a column of datatype char(N) holds is shorter than N characters, then trailing
spaces shall be stored in the database or the database shall automatically pad with spaces upon retrieval such
that a CHAR_LENGTH() function will return N.
e) Varchar(N) means that the column shall be able to hold any string of characters of a variable length with a
maximum length of N. Columns defined as "varchar(N)" may optionally be implemented as "char(N)".
f) Date means that the column shall be able to express any calendar day between January 1, 1900 and
December 31, 2199.
2.2.2.2 The datatypes do not correspond to any specific SQL-standard datatype. The definitions are provided to
highlight the properties that are required for a particular column. The benchmark implementer may employ any internal representation or SQL datatype that meets those requirements.
```

This PR proposes that we use int for identifiers instead of bigint to reach a compromise with TPC-DS Standard Specification.

After this PR, the field schemas are now consistent with those DDLs in the `tpcds.sql` from tpc-ds tool kit, see https://gist.github.com/yaooqinn/b9978a77bbf4f871a95d6a9103019907

### Why are the changes needed?

reach a compromise with TPC-DS Standard Specification

### Does this PR introduce _any_ user-facing change?

no test only

### How was this patch tested?

test only

Closes #32037 from yaooqinn/SPARK-34944.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
2021-04-13 11:28:35 +08:00
..
benchmark.yml [SPARK-35002][INFRA][FOLLOW-UP] Use localhost instead of 127.0.0.1 at SPARK_LOCAL_IP in GA builds 2021-04-09 16:39:20 +08:00
build_and_test.yml [SPARK-34944][SQL][TESTS] Replace bigint with int for web_returns and store_returns in TPCDS tests to employ correct data type 2021-04-13 11:28:35 +08:00
cancel_duplicate_workflow_runs.yml [SPARK-34053][INFRA][FOLLOW-UP] Disables canceling push/schedule workflows 2021-01-12 23:10:20 +09:00
labeler.yml [SPARK-33282] Migrate from deprecated probot autolabeler to GitHub labeler action 2020-11-05 16:10:52 +09:00
publish_snapshot.yml [BUILD][MINOR] Do not publish snapshots from forks 2020-12-23 00:22:42 +09:00
stale.yml [SPARK-30173] Tweak stale PR message 2020-01-07 08:34:59 -06:00
test_report.yml [SPARK-34874][INFRA] Recover test reports for failed GA builds 2021-03-26 18:12:28 +09:00