標籤:

自動化賦值的 SQL 語句

不明就裡的朋友,看上一篇就明白了,欠大家一份自動化賦值語句:

黃贇:Cross Apply 與 Inner Join 的對抗?

zhuanlan.zhihu.com圖標

其文中提到了三張表的建表語句,如下:

CREATE TABLE dbo.DimUser (

UserID INT IDENTITY(1, 1)

,UserName NVARCHAR(200)

)

GO

CREATE TABLE dbo.FctOrderHeader (

OrderID BIGINT IDENTITY(1, 1)

,OrderDate DATETIME

,OrderAmount BIGINT

)

GO

CREATE TABLE dbo.FctOrderUser (

OrderID BIGINT

,UserID INT

)

GO

我們嘗試著給這三張表賦值,DimUser 表附上 10000 條數據, FctOrderHeader 附上3650000 行數據, 而 FctOrderUser 則是將這些訂單均分給這些用戶。

一般大家的做法肯定是循環處理,使用 for..each, while 等等手段,這裡介紹一種不一樣的 set 手法。

1. 使用 tally table 產生一定數量的記錄

原型是這樣的, 利用 Join 可以產生笛卡爾積,以 3 為底數,2^5 為指數,共產生 power(3,32) 次方的數據行,再用 Row_Number() 函數給每一行排序,得到連續的 n 行指定數據:

;WITH LO AS (

SELECT c

FROM (

VALUES (1),

(2),

(3)

) AS d(c)

),

L1 AS (

SELECT a.c,

b.c AS bc

FROM LO AS a

CROSS JOIN LO AS b

),

L2 AS (

SELECT a.c,

b.c AS bc

FROM L1 AS a

CROSS JOIN L1 AS b

),

L3 AS (

SELECT a.c,

b.c AS bc

FROM L2 AS a

CROSS JOIN L2 AS b

),

L4 AS (

SELECT a.c,

b.c AS bc

FROM L3 AS a

CROSS JOIN L3 AS b

),

L5 AS (

SELECT a.c,

b.c AS bc

FROM L4 AS a

CROSS JOIN L4 AS b

)

SELECT ROW_NUMBER() OVER (

ORDER BY (

SELECT NULL

)

) AS RNK

FROM L5

ORDER BY RNK

OFFSET 0 ROWS

FETCH FIRST 1000 ROWS ONLY;

在正式的環境中,自然要將它擴展為可服用的單元代碼,所以封裝為表值函數是最好的:

IF EXISTS (

SELECT TOP 1 1

FROM sys.objects WITH (NOLOCK)

WHERE UPPER(name) = UPPER(GetSeqNum)

AND type_desc = SQL_INLINE_TABLE_VALUED_FUNCTION

)

BEGIN

DROP FUNCTION dbo.GetSeqNum;

END

GO

CREATE FUNCTION dbo.GetSeqNum (

@BEGINNUM BIGINT,

@ENDNUM BIGINT

)

RETURNS TABLE

AS

RETURN

WITH LO AS (

SELECT c

FROM (

VALUES (1),

(2),

(3)

) AS d(c)

),

L1 AS (

SELECT a.c,

b.c AS bc

FROM LO AS a

CROSS JOIN LO AS b

),

L2 AS (

SELECT a.c,

b.c AS bc

FROM L1 AS a

CROSS JOIN L1 AS b

),

L3 AS (

SELECT a.c,

b.c AS bc

FROM L2 AS a

CROSS JOIN L2 AS b

),

L4 AS (

SELECT a.c,

b.c AS bc

FROM L3 AS a

CROSS JOIN L3 AS b

),

L5 AS (

SELECT a.c,

b.c AS bc

FROM L4 AS a

CROSS JOIN L4 AS b

)

SELECT ROW_NUMBER() OVER (

ORDER BY (

SELECT NULL

)

) AS RNK

FROM L5

ORDER BY RNK

OFFSET(@BEGINNUM - 1) ROWS

FETCH FIRST(@ENDNUM - @BEGINNUM) + 1 ROWS ONLY;

GO

在以上兩段 SQL 中,我們都使用了 SQL Server 2012 以來的新語法糖, offset ... fetch first rows only 來分頁

2. 給用戶表附上一定數量(比如 10000)的記錄

DECLARE @LOW INT = 1;

DECLARE @HIGH INT = 10000;

WITH LO

AS (

SELECT c

FROM (

VALUES (1),

(2),

(3)

) AS d(c)

),

L1

AS (

SELECT a.c,

b.c AS bc

FROM LO AS a

CROSS JOIN LO AS b

),

L2

AS (

SELECT a.c,

b.c AS bc

FROM L1 AS a

CROSS JOIN L1 AS b

),

L3

AS (

SELECT a.c,

b.c AS bc

FROM L2 AS a

CROSS JOIN L2 AS b

),

L4

AS (

SELECT a.c,

b.c AS bc

FROM L3 AS a

CROSS JOIN L3 AS b

),

L5

AS (

SELECT a.c,

b.c AS bc

FROM L4 AS a

CROSS JOIN L4 AS b

)

INSERT INTO dbo.DimUser (UserName)

SELECT WX_USER + LTRIM(RTRIM(CONVERT(NVARCHAR, ROW_NUMBER() OVER (

ORDER BY (

SELECT NULL

)

))))

FROM L5

ORDER BY ROW_NUMBER() OVER (

ORDER BY (

SELECT NULL

)

)

OFFSET(@LOW - 1) ROWS

FETCH FIRST(@HIGH - @LOW) + 1 ROWS ONLY

GO

3. 給訂單表附上一定數量的記錄,並按照設定日期,隨機配上訂單的數量(比如10年,每天 1000 筆訂單)

DECLARE @BEGINDATE DATETIME = 2008-05-01,

@ENDDATE DATETIME = 2018-05-01;

DECLARE @LOW INT = 1;

DECLARE @HIGH INT = 1000;

WITH LO

AS (

SELECT c

FROM (

VALUES (1),

(2),

(3)

) AS d(c)

),

L1

AS (

SELECT a.c,

b.c AS bc

FROM LO AS a

CROSS JOIN LO AS b

),

L2

AS (

SELECT a.c,

b.c AS bc

FROM L1 AS a

CROSS JOIN L1 AS b

),

L3

AS (

SELECT a.c,

b.c AS bc

FROM L2 AS a

CROSS JOIN L2 AS b

),

L4

AS (

SELECT a.c,

b.c AS bc

FROM L3 AS a

CROSS JOIN L3 AS b

),

L5

AS (

SELECT a.c,

b.c AS bc

FROM L4 AS a

CROSS JOIN L4 AS b

)

INSERT INTO dbo.FctOrderHeader (

OrderDate,

OrderAmount

)

SELECT DATEADD(DAY, DT.RNK - 1, @BEGINDATE) AS OrderDate,

RAND() * DT2.RNK * 100000

FROM dbo.GetSeqNum(1, DATEDIFF(DAY, @BEGINDATE, @ENDDATE)) DT

CROSS APPLY (

SELECT ROW_NUMBER() OVER (

ORDER BY (

SELECT NULL

)

) AS RNK

FROM L5

ORDER BY RNK ASC

OFFSET(@LOW - 1) ROWS

FETCH FIRST(@HIGH - @LOW) + 1 ROWS ONLY

) DT2

ORDER BY OrderDate,

DT2.RNK

GO

4. 按照模運算,把訂單分配給用戶,以便測試用

INSERT INTO dbo.FctOrderUser (

OrderID,

UserID

)

SELECT ord.OrderID,

Usr.UserID

FROM dbo.FctOrderHeader ord

INNER JOIN dbo.DimUser Usr ON ord.OrderID % 10000 = Usr.UserID

GO

Tally Table 作為特別有用的一個輔助手段,可以在多種場景中復用,比如數據孤島問題,即求解非連續數字或者填補連續數字等。


推薦閱讀:

如何評價cmu-db的peloton資料庫?
SQL SERVER性能優化綜述
從MapReduce的執行來看如何優化MaxCompute(原ODPS) SQL
Mysql學習筆記:索引
MySQL入門(二):基礎練習

TAG:SQL |