CTR Prediction Series: Overview

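  • This post serves as the table of contents, to help readers quickly find the chapters they want to read;
  • Thank you for your continued support and help; comments and suggestions are welcome, either as comments or by private message;
  • CTR prediction is a resilient and steadily evolving field, so corrections and pointers from readers are very welcome.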

About the CTR prediction series:

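This series grew out of a set of lecture notes prepared for external talks. The articles are organized with the following considerations in mind: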

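  • In our work we have found that many people are not very familiar with some details of the basic models, and these details can affect how a CTR strategy is implemented and tuned. Our introductions to the mainstream algorithms are therefore interleaved with explanations and derivations of the underlying fundamentals, along with their meaning and scope;
  • Some topics have little bearing on applying CTR prediction but help in understanding the problem as a whole. These more loosely related topics are covered in separate "Aside" posts;
  • CTR prediction is itself a developing field with many innovations. For the relatively mature subfields we try to summarize and give an outline; some of these summaries and outlines emphasize the overall flow and, for ease of understanding, leave out some details;
  • In some subfields the published views disagree. Here we list the differing views and, for the more important points, give the author's own experimental results.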

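The series is expected to run to 40~60 posts, covering CTR prediction in relative detail from the angles of models, features, feature engineering, evaluation, engineering & parallelization, monitoring, and issue tracking.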

Framework and table of contents of the CTR series:

0. Problem description and main solutions

  • CTR Prediction [1]: Problem Description and Main Solution

1. Models

[Figure: model-side overview diagram]

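  • Logistic Regression
    • Naive LR and the relationship between LR and statistics
    • Regularization of LR
    • LR's bias and its use
    • Model extension of LR: MLR
  • Factorization Machine
    • FM: theory (margin, objective) and practice
    • Model extensions of FM: FFM/BFM/SFM (to be written)
  • GBDT
    • GBDT: Preliminary - bagging & boosting, bias & variance
    • GBDT: Preliminary - parameter-space optimization and function-space optimization
    • GBM and XGBoost
    • Aside: Random Forest
  • GBDT Encoder
    • GBDT Encoder
  • Deep CTR (to be written)
  • Online Learning (to be written)
  • Reinforcement Learning (to be written)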

2. Feature engineering

(to be written)

3. Features

(to be written)

4. Evaluation

(to be written)

5. Model Debug, Monitor and Online Predicting

(to be written)

References (work in progress)

  • Papers
    • [LR-CTR] Predicting Clicks: Estimating the Click-Through Rate for New Ads (Microsoft, WWW 2007)
    • [MLR] Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction
    • [FM] Factorization Machines
    • [FM-FTRL] Factorization Machines with Follow-The-Regularized-Leader for CTR prediction in Display Advertising
    • [FM-FFM] Field-aware Factorization Machines for CTR Prediction
    • [FM-FFM] Field-aware Factorization Machines in a Real-world Online Advertising System
    • [FM-BFM] Bayesian Factorization Machines
    • [FM-SFM] Sparse Factorization Machines for Click-through Rate Prediction
    • [GBDT Encoder] Practical Lessons from Predicting Clicks on Ads at Facebook
    • [FE] Position-Normalized Click Prediction in Search Advertising
    • [FE] Click Through Rate Estimation for Rare Events in Online Advertising
    • [FE] SFP-Rank: Significant Frequent Pattern Analysis for Effective Ranking
    • [GBDT-GBM] Greedy Function Approximation: A Gradient Boosting Machine
    • [GBDT-XGBoost] XGBoost: A Scalable Tree Boosting System
    • [GBDT-fastRGF] Learning Nonlinear Functions Using Regularized Greedy Forest
    • [DNN-Deep CTR] Deep CTR Prediction in Display Advertising
    • [DNN-Deep FM] DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
    • [DNN-WnD] Wide & Deep Learning for Recommender Systems
    • [DNN-FNN] Deep Learning over Multi-field Categorical Data
    • [Feature] Image Feature Learning for Cold Start Problem in Display Advertising
    • [Feature] Multimedia Features for Click Prediction of New Ads in Display Advertising
    • [Feature] The Impact of Visual Appearance on User Response in Online Display Advertising
    • [Feature] Color Harmonization
    • [Feature] Measuring colourfulness in natural images
    • [Feature] Natural color image enhancement and evaluation algorithm based on human visual system, 2006
  • Blogs
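    • Lazy Sparse Stochastic Gradient Descent for Regularized Multinomial Logistic Regression
    • Regularized Regression: A Bayesian Point of View
    • Logistic Regression and Odds Ratio (Logistic回歸分析和比值比)
    • FM: FM lecture by CMU
    • Field-aware Factorization Machines
    • An In-depth Look at FFM: Principles and Practice (深入FFM原理與實踐)
    • Click-Through Rate Prediction in Programmatic Ad Trading (程序化廣告交易中的點擊率預估)
    • Data Cleaning and Feature Processing in Machine Learning (機器學習中的數據清洗與特徵處理綜)
    • A Deep Learning Model for Predicting Users' Online Ad Click Behavior (用戶在線廣告點擊行為預測的深度學習模型)
    • Deep Learning over Multi-field Categorical Data
    • Chen Yuqiang, co-founder of 4Paradigm: New Thinking on Machine Learning in Industrial Applications (第四範式聯合創始人陳雨強:機器學習在工業應用中的新思考)
    • kaggle-2014-criteo Idiot's
    • A GBDT and LR Fusion Scheme for CTR Prediction (CTR預估中GBDT與LR融合方案)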
  • Books
    • The Elements of Statistical Learning
