A Simple Markdown Converter in Python
From the column 茄子的 Python 學習之路
On a whim today, I wrote a Markdown converter.
import os, re, webbrowser

# Sample Markdown used to exercise every rule handled below
text = '''
# TextHeader
## Header1
List
- 1
- 2
- 3
> **quote**
》 quote2
## Header2
1. *italic*
2. [@以茄之名](https://www.zhihu.com/people/e4f87c3476a926c1e2ef51b4fcd18fa3)
3、 ![](https://pic4.zhimg.com/v2-8560440c136c746730a63813ed701f52_is.jpg)
## Header3
`*[article link](https://zhuanlan.zhihu.com/p/39742445)*`
·**code1**·
- [x]Liked?
'''
The program starts with the inline syntax (code, strong, i and so on), which can be replaced directly with regular expressions:
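# Inline code, delimited by a backtick or a middle dot
text = re.sub(r'([`·])([^`·]+)[`·]', r'<code>\2</code>', text)
# **strong**
text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
# *italic*; group 1 is the character before the opening star and is written back
text = re.sub(r'([^*])\*([^*]+)\*', r'\1<i>\2</i>', text)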
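To make the effect of these substitutions concrete, here is a minimal sketch that wraps them in a helper and runs them on a short string. The function name `convert_inline` and the sample input are my own assumptions for illustration; the patterns follow the ones described above, with the escapes written out as I understand them.

```python
import re

def convert_inline(text):
    """Apply the inline rules in order: code, strong, italic."""
    # `code` or ·code· (both delimiters are accepted)
    text = re.sub(r'[`·]([^`·]+)[`·]', r'<code>\1</code>', text)
    # **strong**
    text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
    # *italic*: ([^*]) needs one non-star character before the opening star,
    # so an italic span at the very start of the text is not matched
    text = re.sub(r'([^*])\*([^*]+)\*', r'\1<i>\2</i>', text)
    return text

print(convert_inline('use `re.sub` for **bold** and *italic* text'))
```

Because the italic pattern consumes the character in front of the star, the sample text in this article conveniently begins with a newline, so the limitation does not show up there.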
Next come images and links, which are slightly more involved:
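# Links: ([^!]) makes sure the bracket is not the start of an image
text = re.sub(r'([^!])\[([^\]]+)\]\(([^)]+)\)', r'\1<a href="\3" target="_blank">\2</a>', text)
# Images: the alt text may be empty
text = re.sub(r'!\[([^\]]*)\]\(([^)]+)\)', r'<img src="\2" >', text)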
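The link rule relies on `([^!])` to avoid rewriting the `[...](...)` part of an image, which also means a link needs at least one character in front of it. As a sketch of an alternative (my own assumption, not the code from this article), a negative lookbehind `(?<!!)` expresses the same guard without consuming a character. The helper name `convert_links` and the example URLs are hypothetical.

```python
import re

def convert_links(text):
    """Turn Markdown links and images into HTML tags."""
    # Links: (?<!!) rejects a '[' that directly follows '!', i.e. an image
    text = re.sub(r'(?<!!)\[([^\]]+)\]\(([^)]+)\)',
                  r'<a href="\2" target="_blank">\1</a>', text)
    # Images: the alt text may be empty, hence * instead of +
    text = re.sub(r'!\[([^\]]*)\]\(([^)]+)\)', r'<img src="\2" >', text)
    return text

print(convert_links('[home](https://example.com) ![](https://example.com/a.png)'))
```

With the lookbehind, a link at the very beginning of the input is converted as well, and the group numbers shift down by one because there is no leading capture group.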
Then the remaining syntax is handled. First, split the text into lines:
lines = text.split('\n')
html = ''        # accumulated HTML output
list_flag = ''   # '', 'ul' or 'ol': which list tag is currently open
Handle lists and to-do items:
for line in lines:
    line = line.strip()
    # To-do item: "- [ ]" or "- [x]"
    if re.match(r'- \[[ x]\]', line):
        print('matched')
        p_html = ''
        if re.match(r'- \[x\]', line):
            p_html = 'checked="checked"'
        line = re.sub(r'- \[[ x]\]', '', line)
        html += '<label class="cssCheckbox"> <input type="checkbox" %s /> <span></span>%s </label>' % (p_html, line)
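Ordered and unordered lists differ only in the ol or ul tag that wraps them, so a list_flag variable keeps track of which one is currently open: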
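    # Unordered list item: "+ ", "- " or "* "
    elif re.match(r'[+\-*] ', line):
        if list_flag == '':
            html += '<ul>\n'
            list_flag = 'ul'
        line = re.sub(r'[+\-*] ', '', line)
        html += '<li>%s</li>\n' % (line)
    # Ordered list item: digits followed by "." or "、"
    elif re.match(r'[\d]+[.、] ', line):
        if list_flag == '':
            list_flag = 'ol'
            html += '<ol>\n'
        line = re.sub(r'[\d]+[.、] ', '', line)
        html += '<li>%s</li>\n' % (line)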
Once lists are done, handle the rest of the syntax:
    else:
        # Any non-list line closes the list that is still open
        if list_flag != '':
            html += '</%s>\n' % list_flag
            list_flag = ''
        if re.match(r'#+', line):
            # The number of leading # characters is the header level
            well = re.match(r'#+', line).group().count('#')
            line = re.sub(r'#+', '', line)
            html += '<h%i>%s</h%i>\n' % (well, line, well)
        elif re.match(r'[>》 ]', line):
            line = re.sub(r'^\s*[>》 ]', '', line)
            html += '<blockquote>%s</blockquote>\n' % (line)
        else:
            html += line
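I made one small change here so that both > and 》 are converted to a blockquote, mainly because switching between Chinese and English punctuation is such a hassle.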
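Pulling the line-by-line pass together, the following is a minimal sketch of the list_flag idea as a standalone function. The function name `convert_blocks` is hypothetical, and it adds one thing the walkthrough above does not do: it closes a list that is still open when the input ends. To-do items are left out to keep it short.

```python
import re

def convert_blocks(text):
    """Convert headers, lists and blockquotes line by line."""
    html = ''
    list_flag = ''  # '', 'ul' or 'ol': which list tag is currently open

    for line in text.split('\n'):
        line = line.strip()
        bullet = re.match(r'[+\-*] ', line)
        number = re.match(r'\d+[.、] ', line)
        if bullet or number:
            if list_flag == '':
                list_flag = 'ul' if bullet else 'ol'
                html += '<%s>\n' % list_flag
            # Drop only the marker at the start of the line
            html += '<li>%s</li>\n' % line[(bullet or number).end():]
            continue
        # Any non-list line ends the open list
        if list_flag != '':
            html += '</%s>\n' % list_flag
            list_flag = ''
        header = re.match(r'#+', line)
        if header:
            level = len(header.group())
            html += '<h%i>%s</h%i>\n' % (level, line[header.end():].strip(), level)
        elif re.match(r'[>》]', line):
            html += '<blockquote>%s</blockquote>\n' % line[1:].strip()
        else:
            html += line
    # Assumption: close a list that runs to the end of the input
    if list_flag != '':
        html += '</%s>\n' % list_flag
    return html

print(convert_blocks('## Title\n- a\n- b\n> quote\n1. x\n2、 y'))
```

As in the loop above, a single flag means a ul immediately followed by an ol stays inside the first list; separating them would need a check that the marker type matches the open tag.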
Then add the CSS. I borrowed parts of it from 馬克飛象 (Maxiang), because its blockquotes look great:
with open('markdown.html', 'w', encoding='utf-8') as f:
    # Raw string so that CSS escapes such as \a0 and \221a are written unchanged
    f.write(r'''<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style>
body { margin: 0 auto; font-family: "ubuntu", "Tahoma", "Microsoft YaHei", arial, sans-serif; color: #444444; line-height: 1; padding: 30px; }
input[type=checkbox]+span::before {
  content: "\a0"; /* non-breaking space */
  display: inline-block; vertical-align: 0.2em; width: 0.8em; height: 0.8em;
  margin-right: .2em; border-radius: .2em;
  background: silver; /* checkbox background color */
  text-indent: 0.15em; line-height: 0.65;
}
input[type=checkbox] {
  /* Hide the real checkbox. display:none is not used because it would also
     remove the element from the keyboard Tab focus order. */
  position: absolute; clip: rect(0,0,0,0);
}
input[type=checkbox]:checked+span::before {
  content: "\221a"; /* Unicode check mark */
  background: yellowgreen; /* background when checked */
}
img { max-width: 100%; }
@media screen and (min-width: 1000px) { body { width: 842px; margin: 10px auto; } }
h1, h2, h3, h4 { color: #111111; font-weight: 400; margin-top: 1em; }
h1, h2, h3, h4, h5 { font-family: Georgia, Palatino, serif; }
h1, h2, h3, h4, h5, dl { margin-bottom: 16px; padding: 0; }
p { margin-top: 8px; margin-bottom: 3px; }
h1 { font-size: 48px; line-height: 54px; }
h2 { font-size: 36px; line-height: 42px; }
h1, h2 { border-bottom: 1px solid #EFEAEA; padding-bottom: 10px; }
h3 { font-size: 24px; line-height: 30px; }
h4 { font-size: 21px; line-height: 26px; }
h5 { font-size: 18px; line-height: 23px; }
a { color: #0099ff; margin: 0 2px; padding: 0; vertical-align: baseline; text-decoration: none; }
a:hover { text-decoration: none; color: #ff6600; }
a:visited { /*color: purple;*/ }
ul, ol { padding: 0; padding-left: 18px; margin: 0; }
li { line-height: 24px; }
p, ul, ol { font-size: 16px; line-height: 24px; }
ol ol, ul ol { list-style-type: lower-roman; }
code, pre { font-family: Consolas, Monaco, Andale Mono, monospace; background-color: #f7f7f7; color: inherit; }
code { font-family: Consolas, Monaco, Andale Mono, monospace; margin: 0 2px; }
pre { font-family: Consolas, Monaco, Andale Mono, monospace; line-height: 1.7em; overflow: auto; padding: 6px 10px; border-left: 5px solid #6CE26C; }
pre > code { font-family: Consolas, Monaco, Andale Mono, monospace; border: 0; display: inline; max-width: initial; padding: 0; margin: 0; overflow: initial; line-height: 1.6em; font-size: .95em; white-space: pre; background: 0 0; }
code { color: #666555; }
aside { display: block; float: right; width: 390px; }
blockquote { border-left-width: 10px; background-color: rgba(102,128,153,0.05); border-top-right-radius: 5px; border-bottom-right-radius: 5px; padding: 15px 20px; }
blockquote cite { font-size: 14px; line-height: 20px; color: #bfbfbf; }
blockquote cite:before { content: "\2014 \00A0"; }
blockquote p { color: #666; }
hr { text-align: left; color: #999; height: 2px; padding: 0; margin: 16px 0; background-color: #e7e7e7; border: 0 none; }
dl { padding: 0; }
dl dt { padding: 10px 0; margin-top: 16px; font-size: 1em; font-style: italic; font-weight: bold; }
dl dd { padding: 0 16px; margin-bottom: 16px; }
dd { margin-left: 0; }
table { *border-collapse: collapse; /* IE7 and lower */ border-spacing: 0; width: 100%; }
table { border: solid #ccc 1px; }
table thead { background: #f7f7f7; }
table thead tr:hover { background: #f7f7f7 }
table tr:hover { background: #fbf8e9; -o-transition: all 0.1s ease-in-out; -webkit-transition: all 0.1s ease-in-out; -moz-transition: all 0.1s ease-in-out; -ms-transition: all 0.1s ease-in-out; transition: all 0.1s ease-in-out; }
table td, .table th { border-left: 1px solid #ccc; border-top: 1px solid #ccc; padding: 10px; text-align: left; }
table th { border-top: none; text-shadow: 0 1px 0 rgba(255,255,255,.5); padding: 5px; border-left: 1px solid #ccc; }
table td:first-child, table th:first-child { border-left: none; }
</style></head>''')
    f.write(html)
    f.write('</html>')
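One detail worth checking when CSS lives inside a Python string: CSS escapes such as `\221a` (the check mark) or `\a0` (the non-breaking space) start with a backslash, which Python also treats as an escape character in ordinary string literals. The sketch below is my own illustration, not code from this article, and shows the difference a raw string makes.

```python
# In an ordinary literal Python interprets the backslash sequence itself:
# '\221' is read as an octal escape, so the CSS escape is lost.
plain = 'content: "\221a";'
# In a raw literal the backslash is kept, so the CSS escape survives.
raw = r'content: "\221a";'

print(len(plain), len(raw))
print('\\221a' in plain, '\\221a' in raw)
```

Writing the actual character (√) into the string works too, as long as the file is saved as UTF-8 and the page declares that charset, which the meta tag above does.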
Open the page in Chrome:
# %s is replaced by the URL when the browser command is launched
webbrowser.get('C:/Program Files (x86)/CentBrowser/Application/chrome.exe %s').open('file:///' + os.getcwd() + '/markdown.html')
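This step turned out to be a pitfall as well. The Edge browser that ships with the system kept failing to open the page, and registering Chrome through the browser registration mechanism did not work either. In the end I found the solution on an overseas site.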
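For readers who hit the same problem, here is a hedged sketch of two ways to open the generated file. It assumes the output is named markdown.html as above; the helper names `file_url` and `open_page` are hypothetical, and the browser command is whatever executable exists on your machine. `pathlib.Path.as_uri()` builds the file:// URL, `webbrowser.open()` uses the system default browser, and `webbrowser.get()` with a command line containing %s launches a specific one.

```python
import webbrowser
from pathlib import Path

def file_url(filename):
    """Return a file:// URL for a path relative to the current directory."""
    return Path(filename).resolve().as_uri()

def open_page(filename, browser_cmd=None):
    """Open filename in the default browser, or in browser_cmd if given.

    browser_cmd is a command line with %s where the URL goes, for example
    the chrome.exe path used in this article followed by ' %s'.
    """
    url = file_url(filename)
    if browser_cmd is None:
        return webbrowser.open(url)
    return webbrowser.get(browser_cmd).open(url)

print(file_url('markdown.html'))
```

Calling `open_page('markdown.html')` is the simplest option; passing a command line is only needed when the default browser is not the one you want.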
The final result: