[MINOR][DOCS] Use proper html tag in markdown
### What changes were proposed in this pull request?

This PR fixes the docs to use proper HTML tags (e.g. `<code>` instead of markdown backticks) inside raw HTML blocks.

### Why are the changes needed?

To fix documentation formatting errors: markdown backticks are not rendered as code inside raw HTML tables, so they showed up literally in the generated docs.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

N/A

Closes #26302 from uncleGen/minor-doc.

Authored-by: uncleGen <hustyugm@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
commit 5f1ef544f3
parent 1bf65d97ac
```diff
@@ -88,17 +88,17 @@ creating table, you can create a table using storage handler at Hive side, and u
   <tr>
     <td><code>inputFormat, outputFormat</code></td>
     <td>
-      These 2 options specify the name of a corresponding `InputFormat` and `OutputFormat` class as a string literal,
-      e.g. `org.apache.hadoop.hive.ql.io.orc.OrcInputFormat`. These 2 options must be appeared in a pair, and you can not
-      specify them if you already specified the `fileFormat` option.
+      These 2 options specify the name of a corresponding <code>InputFormat</code> and <code>OutputFormat</code> class as a string literal,
+      e.g. <code>org.apache.hadoop.hive.ql.io.orc.OrcInputFormat</code>. These 2 options must be appeared in a pair, and you can not
+      specify them if you already specified the <code>fileFormat</code> option.
     </td>
   </tr>

   <tr>
     <td><code>serde</code></td>
     <td>
-      This option specifies the name of a serde class. When the `fileFormat` option is specified, do not specify this option
-      if the given `fileFormat` already include the information of serde. Currently "sequencefile", "textfile" and "rcfile"
+      This option specifies the name of a serde class. When the <code>fileFormat</code> option is specified, do not specify this option
+      if the given <code>fileFormat</code> already include the information of serde. Currently "sequencefile", "textfile" and "rcfile"
       don't include the serde information and you can use this option with these 3 fileFormats.
     </td>
   </tr>
```
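For context, the options in this hunk configure Hive table storage when a table is created through Spark SQL. A minimal sketch, assuming a Hive-enabled session; the table names and the serde class are illustrative choices, not taken from the patch:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hive-storage-options-sketch")
  .enableHiveSupport()
  .getOrCreate()

// inputFormat and outputFormat must be given as a pair, and cannot be
// combined with the fileFormat option.
spark.sql("""
  CREATE TABLE orc_records(key INT, value STRING)
  USING hive
  OPTIONS(
    inputFormat 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat',
    outputFormat 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
  )
""")

// "textfile" carries no serde information, so a serde class may be
// supplied alongside it.
spark.sql("""
  CREATE TABLE text_records(key INT, value STRING)
  USING hive
  OPTIONS(
    fileFormat 'textfile',
    serde 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
  )
""")
```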
```diff
@@ -60,7 +60,7 @@ the following case-insensitive options:
     The JDBC table that should be read from or written into. Note that when using it in the read
     path anything that is valid in a <code>FROM</code> clause of a SQL query can be used.
     For example, instead of a full table you could also use a subquery in parentheses. It is not
-    allowed to specify `dbtable` and `query` options at the same time.
+    allowed to specify <code>dbtable</code> and <code>query</code> options at the same time.
   </td>
 </tr>
 <tr>
```
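To illustrate the `dbtable` behaviour this hunk describes, a sketch of a read where a parenthesized subquery stands in for a table name; the URL, credentials, and schema are placeholders:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("jdbc-dbtable-sketch").getOrCreate()

// Anything valid in a FROM clause works as dbtable, so a subquery in
// parentheses (with an alias) can replace a full table name.
val people = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://dbserver:5432/mydb")
  .option("dbtable", "(SELECT id, name FROM people WHERE active = true) AS t")
  .option("user", "username")
  .option("password", "password")
  .load()
```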
```diff
@@ -72,10 +72,10 @@ the following case-insensitive options:
     <code> SELECT <columns> FROM (<user_specified_query>) spark_gen_alias</code><br><br>
     Below are a couple of restrictions while using this option.<br>
     <ol>
-      <li> It is not allowed to specify `dbtable` and `query` options at the same time. </li>
-      <li> It is not allowed to specify `query` and `partitionColumn` options at the same time. When specifying
-        `partitionColumn` option is required, the subquery can be specified using `dbtable` option instead and
-        partition columns can be qualified using the subquery alias provided as part of `dbtable`. <br>
+      <li> It is not allowed to specify <code>dbtable</code> and <code>query</code> options at the same time. </li>
+      <li> It is not allowed to specify <code>query</code> and <code>partitionColumn</code> options at the same time. When specifying
+        <code>partitionColumn</code> option is required, the subquery can be specified using <code>dbtable</code> option instead and
+        partition columns can be qualified using the subquery alias provided as part of <code>dbtable</code>. <br>
         Example:<br>
         <code>
           spark.read.format("jdbc")<br>
```
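The second restriction above comes with the workaround the text mentions: move the subquery into `dbtable` and qualify `partitionColumn` with the subquery alias. A sketch with placeholder connection values:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("jdbc-partitioned-sketch").getOrCreate()

// partitionColumn cannot be combined with the query option, so the
// subquery goes into dbtable instead, and the partition column is
// qualified with the subquery alias ("s").
val sales = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://dbserver:5432/mydb")
  .option("dbtable", "(SELECT id, amount FROM sales) AS s")
  .option("partitionColumn", "s.id")
  .option("lowerBound", "1")
  .option("upperBound", "1000000")
  .option("numPartitions", "8")
  .option("user", "username")
  .option("password", "password")
  .load()
```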
```diff
@@ -280,12 +280,12 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
   <td><code>spark.sql.parquet.compression.codec</code></td>
   <td>snappy</td>
   <td>
-    Sets the compression codec used when writing Parquet files. If either `compression` or
-    `parquet.compression` is specified in the table-specific options/properties, the precedence would be
-    `compression`, `parquet.compression`, `spark.sql.parquet.compression.codec`. Acceptable values include:
+    Sets the compression codec used when writing Parquet files. If either <code>compression</code> or
+    <code>parquet.compression</code> is specified in the table-specific options/properties, the precedence would be
+    <code>compression</code>, <code>parquet.compression</code>, <code>spark.sql.parquet.compression.codec</code>. Acceptable values include:
     none, uncompressed, snappy, gzip, lzo, brotli, lz4, zstd.
-    Note that `zstd` requires `ZStandardCodec` to be installed before Hadoop 2.9.0, `brotli` requires
-    `BrotliCodec` to be installed.
+    Note that <code>zstd</code> requires <code>ZStandardCodec</code> to be installed before Hadoop 2.9.0, <code>brotli</code> requires
+    <code>BrotliCodec</code> to be installed.
   </td>
 </tr>
 <tr>
```
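A short sketch of the precedence order described in this hunk; the output path is a placeholder:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("parquet-codec-sketch").getOrCreate()

// Session-wide default: lowest precedence of the three settings.
spark.conf.set("spark.sql.parquet.compression.codec", "snappy")

val df = spark.range(1000).toDF("id")

// The write-level compression option takes the highest precedence,
// so this file is written with gzip despite the session default.
df.write.option("compression", "gzip").parquet("/tmp/parquet-codec-demo")
```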
```diff
@@ -473,8 +473,8 @@ The following configurations are optional:
   <td>Desired minimum number of partitions to read from Kafka.
   By default, Spark has a 1-1 mapping of topicPartitions to Spark partitions consuming from Kafka.
   If you set this option to a value greater than your topicPartitions, Spark will divvy up large
-  Kafka partitions to smaller pieces. Please note that this configuration is like a `hint`: the
-  number of Spark tasks will be **approximately** `minPartitions`. It can be less or more depending on
+  Kafka partitions to smaller pieces. Please note that this configuration is like a <code>hint</code>: the
+  number of Spark tasks will be <strong>approximately</strong> <code>minPartitions</code>. It can be less or more depending on
   rounding errors or Kafka partitions that didn't receive any new data.</td>
 </tr>
 <tr>
```
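For illustration, how `minPartitions` is passed on a streaming read; the broker address and topic are placeholders, and the spark-sql-kafka package must be on the classpath:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("kafka-min-partitions-sketch").getOrCreate()

// minPartitions is only a hint: the task count is approximately this
// value, and may differ due to rounding or idle Kafka partitions.
val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:9092")
  .option("subscribe", "events")
  .option("minPartitions", "12")
  .load()
```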
```diff
@@ -482,7 +482,7 @@ The following configurations are optional:
   <td>string</td>
   <td>spark-kafka-source</td>
   <td>streaming and batch</td>
-  <td>Prefix of consumer group identifiers (`group.id`) that are generated by structured streaming
+  <td>Prefix of consumer group identifiers (<code>group.id</code>) that are generated by structured streaming
   queries. If "kafka.group.id" is set, this option will be ignored.</td>
 </tr>
 <tr>
```
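And the `groupIdPrefix` option in use, with the same placeholder broker and topic:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("kafka-group-id-prefix-sketch").getOrCreate()

// Generated consumer group ids will start with "my-app" instead of the
// default "spark-kafka-source"; ignored if kafka.group.id is set.
val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:9092")
  .option("subscribe", "events")
  .option("groupIdPrefix", "my-app")
  .load()
```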
```diff
@@ -1717,7 +1717,7 @@ Here is the compatibility matrix.
   <td style="vertical-align: middle;">Append, Update, Complete</td>
   <td>
     Append mode uses watermark to drop old aggregation state. But the output of a
-    windowed aggregation is delayed the late threshold specified in `withWatermark()` as by
+    windowed aggregation is delayed the late threshold specified in <code>withWatermark()</code> as by
     the modes semantics, rows can be added to the Result Table only once after they are
     finalized (i.e. after watermark is crossed). See the
     <a href="#handling-late-data-and-watermarking">Late Data</a> section for more details.
```
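To ground the `withWatermark()` reference, a minimal windowed aggregation; the rate source is used only so the sketch is self-contained:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.window

val spark = SparkSession.builder().appName("watermark-sketch").getOrCreate()
import spark.implicits._

// The rate source emits rows with `timestamp` and `value` columns.
val events = spark.readStream.format("rate").option("rowsPerSecond", "10").load()

// In Append mode a window's row is emitted only once, after the watermark
// passes the window end plus the 10-minute late threshold.
val counts = events
  .withWatermark("timestamp", "10 minutes")
  .groupBy(window($"timestamp", "5 minutes"))
  .count()
```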
```diff
@@ -2324,7 +2324,7 @@ Here are the different kinds of triggers that are supported.
   <tr>
     <td><b>One-time micro-batch</b></td>
     <td>
-      The query will execute *only one* micro-batch to process all the available data and then
+      The query will execute <strong>only one</strong> micro-batch to process all the available data and then
       stop on its own. This is useful in scenarios you want to periodically spin up a cluster,
       process everything that is available since the last period, and then shutdown the
       cluster. In some case, this may lead to significant cost savings.
```
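A sketch of the one-time micro-batch trigger; the source and paths are placeholders:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder().appName("trigger-once-sketch").getOrCreate()

val input = spark.readStream.format("rate").load()

// Trigger.Once() processes all available data in a single micro-batch,
// after which the query stops on its own.
val query = input.writeStream
  .format("parquet")
  .option("path", "/tmp/one-time-output")
  .option("checkpointLocation", "/tmp/one-time-checkpoint")
  .trigger(Trigger.Once())
  .start()

query.awaitTermination()
```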