自動化賦值的 SQL 語句
不明就裡的朋友,看上一篇就明白了,欠大家一份自動化賦值語句:
黃贇:Cross Apply 與 Inner Join 的對抗其文中提到了三張表的建表語句,如下:
CREATE TABLE dbo.DimUser (
UserID INT IDENTITY(1, 1)
,UserName NVARCHAR(200)
)
GO
CREATE TABLE dbo.FctOrderHeader (
OrderID BIGINT IDENTITY(1, 1)
,OrderDate DATETIME
,OrderAmount BIGINT
)
GO
CREATE TABLE dbo.FctOrderUser (
OrderID BIGINT
,UserID INT
)
GO
我們嘗試著給這三張表賦值,DimUser 表附上 10000 條數據, FctOrderHeader 附上3650000 行數據, 而 FctOrderUser 則是將這些訂單均分給這些用戶。
一般大家的做法肯定是循環處理,使用 for..each, while 等等手段,這裡介紹一種不一樣的 set 手法。
1. 使用 tally table 產生一定數量的記錄
原型是這樣的, 利用 Join 可以產生笛卡爾積,以 3 為底數,2^5 為指數,共產生 power(3,32) 次方的數據行,再用 Row_Number() 函數給每一行排序,得到連續的 n 行指定數據:
;WITH LO AS (
SELECT c
FROM (
VALUES (1),
(2),
(3)
) AS d(c)
),
L1 AS (
SELECT a.c,
b.c AS bc
FROM LO AS a
CROSS JOIN LO AS b
),
L2 AS (
SELECT a.c,
b.c AS bc
FROM L1 AS a
CROSS JOIN L1 AS b
),
L3 AS (
SELECT a.c,
b.c AS bc
FROM L2 AS a
CROSS JOIN L2 AS b
),
L4 AS (
SELECT a.c,
b.c AS bc
FROM L3 AS a
CROSS JOIN L3 AS b
),
L5 AS (
SELECT a.c,
b.c AS bc
FROM L4 AS a
CROSS JOIN L4 AS b
)
SELECT ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
) AS RNK
FROM L5
ORDER BY RNK
OFFSET 0 ROWS
FETCH FIRST 1000 ROWS ONLY;
在正式的環境中,自然要將它擴展為可服用的單元代碼,所以封裝為表值函數是最好的:
IF EXISTS (
SELECT TOP 1 1
FROM sys.objects WITH (NOLOCK)
WHERE UPPER(name) = UPPER(GetSeqNum)
AND type_desc = SQL_INLINE_TABLE_VALUED_FUNCTION
)
BEGIN
DROP FUNCTION dbo.GetSeqNum;
END
GO
CREATE FUNCTION dbo.GetSeqNum (
@BEGINNUM BIGINT,
@ENDNUM BIGINT
)
RETURNS TABLE
AS
RETURN
WITH LO AS (
SELECT c
FROM (
VALUES (1),
(2),
(3)
) AS d(c)
),
L1 AS (
SELECT a.c,
b.c AS bc
FROM LO AS a
CROSS JOIN LO AS b
),
L2 AS (
SELECT a.c,
b.c AS bc
FROM L1 AS a
CROSS JOIN L1 AS b
),
L3 AS (
SELECT a.c,
b.c AS bc
FROM L2 AS a
CROSS JOIN L2 AS b
),
L4 AS (
SELECT a.c,
b.c AS bc
FROM L3 AS a
CROSS JOIN L3 AS b
),
L5 AS (
SELECT a.c,
b.c AS bc
FROM L4 AS a
CROSS JOIN L4 AS b
)
SELECT ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
) AS RNK
FROM L5
ORDER BY RNK
OFFSET(@BEGINNUM - 1) ROWS
FETCH FIRST(@ENDNUM - @BEGINNUM) + 1 ROWS ONLY;
GO
在以上兩段 SQL 中,我們都使用了 SQL Server 2012 以來的新語法糖, offset ... fetch first rows only 來分頁2. 給用戶表附上一定數量(比如 10000)的記錄
DECLARE @LOW INT = 1;
DECLARE @HIGH INT = 10000;
WITH LO
AS (
SELECT c
FROM (
VALUES (1),
(2),
(3)
) AS d(c)
),
L1
AS (
SELECT a.c,
b.c AS bc
FROM LO AS a
CROSS JOIN LO AS b
),
L2
AS (
SELECT a.c,
b.c AS bc
FROM L1 AS a
CROSS JOIN L1 AS b
),
L3
AS (
SELECT a.c,
b.c AS bc
FROM L2 AS a
CROSS JOIN L2 AS b
),
L4
AS (
SELECT a.c,
b.c AS bc
FROM L3 AS a
CROSS JOIN L3 AS b
),
L5
AS (
SELECT a.c,
b.c AS bc
FROM L4 AS a
CROSS JOIN L4 AS b
)
INSERT INTO dbo.DimUser (UserName)
SELECT WX_USER + LTRIM(RTRIM(CONVERT(NVARCHAR, ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
))))
FROM L5
ORDER BY ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
)
OFFSET(@LOW - 1) ROWS
FETCH FIRST(@HIGH - @LOW) + 1 ROWS ONLY
GO
3. 給訂單表附上一定數量的記錄,並按照設定日期,隨機配上訂單的數量(比如10年,每天 1000 筆訂單)
DECLARE @BEGINDATE DATETIME = 2008-05-01,
@ENDDATE DATETIME = 2018-05-01;
DECLARE @LOW INT = 1;
DECLARE @HIGH INT = 1000;
WITH LO
AS (
SELECT c
FROM (
VALUES (1),
(2),
(3)
) AS d(c)
),
L1
AS (
SELECT a.c,
b.c AS bc
FROM LO AS a
CROSS JOIN LO AS b
),
L2
AS (
SELECT a.c,
b.c AS bc
FROM L1 AS a
CROSS JOIN L1 AS b
),
L3
AS (
SELECT a.c,
b.c AS bc
FROM L2 AS a
CROSS JOIN L2 AS b
),
L4
AS (
SELECT a.c,
b.c AS bc
FROM L3 AS a
CROSS JOIN L3 AS b
),
L5
AS (
SELECT a.c,
b.c AS bc
FROM L4 AS a
CROSS JOIN L4 AS b
)
INSERT INTO dbo.FctOrderHeader (
OrderDate,
OrderAmount
)
SELECT DATEADD(DAY, DT.RNK - 1, @BEGINDATE) AS OrderDate,
RAND() * DT2.RNK * 100000
FROM dbo.GetSeqNum(1, DATEDIFF(DAY, @BEGINDATE, @ENDDATE)) DT
CROSS APPLY (
SELECT ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
) AS RNK
FROM L5
ORDER BY RNK ASC
OFFSET(@LOW - 1) ROWS
FETCH FIRST(@HIGH - @LOW) + 1 ROWS ONLY
) DT2
ORDER BY OrderDate,
DT2.RNK
GO
4. 按照模運算,把訂單分配給用戶,以便測試用
INSERT INTO dbo.FctOrderUser (
OrderID,
UserID
)
SELECT ord.OrderID,
Usr.UserID
FROM dbo.FctOrderHeader ord
INNER JOIN dbo.DimUser Usr ON ord.OrderID % 10000 = Usr.UserID
GO
Tally Table 作為特別有用的一個輔助手段,可以在多種場景中復用,比如數據孤島問題,即求解非連續數字或者填補連續數字等。
推薦閱讀:
※如何評價cmu-db的peloton資料庫?
※SQL SERVER性能優化綜述
※從MapReduce的執行來看如何優化MaxCompute(原ODPS) SQL
※Mysql學習筆記:索引
※MySQL入門(二):基礎練習
TAG:SQL |