How to rewrite a query which uses the ROW_NUMBER() window function in versions 5.7 or earlier before window functions were supported
e.g.,
SELECT ROW_NUMBER() OVER (PARTITION BY fieldA) AS rownum, myTable.* FROM myTable;
index 用不上的
Solution
Assuming the table has a unique or primary key field named 'id', this query produces equivalent results:
SELECT COUNT(t.id) + 1 AS rownum, myTable.*
FROM myTable
LEFT JOIN myTable AS t ON
myTable.fieldA = t.fieldA
AND myTable.id > t.id
GROUP BY
myTable.id
;
---------Oracle 不适合
-----order by 1 不同于 select 中的order by
Applies to:
Oracle Database - Enterprise Edition - Version 19.1.0.0.0 and later
Information in this document applies to any platform.
Symptoms
A query with row_number() over (order by 1) runs very slow with a bad plan.
The optimizer mode for the session is "all_rows." 10053 trace shows selectivity which is "not sane," for the switched optimizer mode "First K Rows".
SINGLE TABLE ACCESS PATH (First K Rows)
Single Table Cardinality Estimation for T[T]
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
kkecdn: Single Table Predicate:"T"."DT">=SYSDATE@!
The computed sel: -1.5072e+00 is not sane. <<<--------
Using density: 0.010000 of col #1 as selectivity of pred does not have a sane value. The wrong sel was: -1.5072e+00
Table: T Alias: T
Card: Original: 101.000000 Rounded: 1 Computed: 1.010000 Non Adjusted: 1.010000
Changes
Cause
Incorrect syntax.
ROW_NUMBER () OVER (ORDER BY 1) is the same as ROW_NUMBER () OVER (ORDER BY NULL)
ORDER BY <NUMBER> in a window function is not the same as in a regular ORDER BY clause where the constant would mean a column number for ordering.
Solution
Replace the number with the correct column name.
row_number() over ( order by <NUMBER>) ===> row_number() over ( order by <COLUMN_NAME>)
---------ORDER BY b, c, d 要是一个unique key才能保证结果每次一样
Symptoms
When SQL statements use analytic functions ROW_NUMBER, FIRST_VALUE or LAST_VALUE it is sometimes possible that inconsistent results are produced.
The same SQL executed repeatedly on the same unchanging table data can produce different results.
This can be mistaken for an intermittent wrong results bug when in fact it could be a SQL coding issue.
Changes
This type of problem can occur when coding new SQL statements using the above analytic functions.
Cause
Consider a table T having columns A,B,C,D used in a query like the following:
SELECT a, b, c, d ROW_NUMBER() OVER (PARTITION BY a ORDER BY b) AS rn
FROM t;
The analytic clause (the OVER clause) uses the columns A and B of this table: the rows in T are grouped in partitions with the same value for column A and within each such group they are ordered by column B.
After this grouping and ordering is done, the analytic function assigns a row number (aliased as RN) starting from 1 and increasing to each row within each group. The columns A, B and RN are then returned together with the remaining columns C and D as the result.
It can sometimes occur that the result of the query is inconsistent across executions i.e. the value for RN can be assigned differently to rows with particular values for A, B, C and D.
Here is an example of why this can happen: the table T has more columns than the ones which appear in the OVER clause, i.e. columns C and D. So it is possible that rows exist with the same values for A and B which have different values for C and D. Such rows can be considered duplicates as far as the OVER clause is concerned. The ROW_NUMBER function will assign row numbers (RN) to all of them but could do so differently from execution to execution as there is no condition in the OVER clause which enforces a particular assignment. The only ordering is on column B therefore the rows with various values for C and D could be assigned row numbers in no particular way. (This is similar in concept to the lack of ordering when rows are fetched in a query but no ORDER BY clause has been speficied: rows can be returned in any order whatsoever.)
Solution
To solve this problem, the SQL needs to be coded so that a specific ordering is enforced in the OVER clause.
In this example, the OVER clause was written as follows:
SELECT a, b, c, d ROW_NUMBER() OVER (PARTITION BY a ORDER BY b, c, d) AS rn
FROM t;
i.e. the remaining columns C and D are included in the ORDER BY of the OVER clause.
In fact it is not necessary to include all the columns of the table in the ORDER BY clause. It is sufficient to include the columns of a unique (or primary) key as there can be no duplicate rows in this case.
The example in this article used ROW_NUMBER as the analytic function, however the issue also applies to FIRST_VALUE and LAST_VALUE. The difference is that instead of a row number being assigned to "duplicate" rows non-deterministically, a different row may be chosen as first or last value from those "duplicates".