Algorithm Roundup (7) | Practical Code | Tips for Using and Configuring Google Colab
For more AI news, follow the WeChat official account 九三智能控.
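Google Colaboratory (Colab) is a hosted Jupyter notebook environment that can be used free of charge for up to 12 hours per session (after a restart you can keep going). It lets you test Python code and is a very handy tool for anyone doing machine learning or data science research.
Today we introduce some methods and tips for using and configuring Google Colab.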
Configuring and connecting Google Drive
# Create drive folder
!mkdir -p drive
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa > /dev/null 2>&1
!apt-get update -qq > /dev/null 2>&1
!apt-get -y install -qq google-drive-ocamlfuse fuse

# Authorize instance to use Google Drive
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

# Connect drive to folder
!google-drive-ocamlfuse drive
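Colab also provides a built-in helper, `google.colab.drive`, that mounts Drive without installing a FUSE client. The sketch below is an alternative to the steps above, not the method the snippet above uses. The wrapper name `mount_drive` and the default mount point are assumptions made for illustration; outside Colab the function simply returns `None`.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab (hypothetical wrapper).

    Returns the mount point on success, or None when google.colab is
    unavailable (for example, when run on a local machine).
    """
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return None
    drive.mount(mount_point)  # asks for authorization in the notebook
    return mount_point


result = mount_drive()
print("Drive mounted at:" if result else "Not running in Colab:", result)
```

Returning `None` instead of raising keeps the same notebook usable both in Colab and on a local Jupyter server.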
Uploading and downloading files from the instance
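from google.colab import files

def upload(path):
    # files.upload() returns a dict mapping each file name to its bytes
    uploaded = files.upload()
    with open(path, 'wb') as fp:
        # dict.keys() is not indexable in Python 3, so take the first value
        fp.write(next(iter(uploaded.values())))

def download(path):
    files.download(path)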
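The `upload` helper above keeps only the first file. Since `files.upload()` returns a dict of file name to bytes, a variation that saves every uploaded file might look like the sketch below. The function name `save_uploaded` and the sample dict are illustrative assumptions; the dict stands in for the real return value so the code runs outside Colab too.

```python
import os
import tempfile


def save_uploaded(uploaded, dest_dir):
    """Write every {filename: bytes} entry to dest_dir and return the paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for name, content in uploaded.items():
        # basename() guards against names that contain directory parts
        path = os.path.join(dest_dir, os.path.basename(name))
        with open(path, "wb") as fp:
            fp.write(content)
        saved.append(path)
    return saved


# Stand-in for the dict that files.upload() returns in Colab (made-up data)
fake_upload = {"train.csv": b"a,b\n1,2\n", "notes.txt": b"hello"}
target = tempfile.mkdtemp()
paths = save_uploaded(fake_upload, target)
print(sorted(os.path.basename(p) for p in paths))
```

In Colab you would call `save_uploaded(files.upload(), "/content/data")`, where the destination directory is again only an example.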
Using Facets
Facets source code: https://github.com/PAIR-code/facets
import os
import sys
import shutil
import base64

# Start from a clean checkout of Facets
if os.path.exists("./facets"):
    shutil.rmtree("./facets")
!git clone https://github.com/PAIR-code/facets
!jupyter nbextension install facets/facets-dist/

# Make the Facets Overview Python helpers importable
sys.path.append(os.path.abspath("./facets/facets_overview/python/"))
from generic_feature_statistics_generator import GenericFeatureStatisticsGenerator

class FacetsOverview(object):
    def __init__(self, df_train, df_test):
        gfsg = GenericFeatureStatisticsGenerator()
        self._proto = gfsg.ProtoFromDataFrames([{"name": "train", "table": df_train},
                                                {"name": "test", "table": df_test}])

    def _repr_html_(self):
        protostr = base64.b64encode(self._proto.SerializeToString()).decode("utf-8")
        HTML_TEMPLATE = """<link rel="import" href="/nbextensions/facets-dist/facets-jupyter.html" >
        <facets-overview id="elem"></facets-overview>
        <script>
          document.querySelector("#elem").protoInput = "{protostr}";
        </script>"""
        html = HTML_TEMPLATE.format(protostr=protostr)
        return html

class FacetsDive(object):
    def __init__(self, data):
        self._data = data
        self.height = 1000

    def _repr_html_(self):
        HTML_TEMPLATE = """<link rel="import" href="/nbextensions/facets-dist/facets-jupyter.html" >
        <facets-dive id="elem" height="{height}"></facets-dive>
        <script>
          document.querySelector("#elem").data = {data};
        </script>"""
        html = HTML_TEMPLATE.format(data=self._data.to_json(orient="records"), height=self.height)
        return html
The classes defined above are used as follows:
FacetsOverview(df_train, df_test)
FacetsDive(df_train.head(500))
Running TensorBoard from code
LOG_DIR = "/tmp"
# Start TensorBoard in the background
get_ipython().system_raw(
    "tensorboard --logdir {} --host 0.0.0.0 --port 6006 &".format(LOG_DIR)
)
# Download ngrok and open an HTTP tunnel to the TensorBoard port
! wget -c -nc https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
! unzip -o ngrok-stable-linux-amd64.zip
get_ipython().system_raw("./ngrok http 6006 &")
# Print the public URL of the tunnel
! curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"
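The last command above reads the public URL out of the JSON served by the local ngrok API. The same extraction can be written as a small Python function, which also copes with the case where no tunnel is up yet. This is a minimal sketch: the function name `public_urls` and the sample reply are assumptions for illustration, and only the `tunnels` and `public_url` fields used by the command above are relied on.

```python
import json


def public_urls(tunnels_json):
    """Return the public_url of every tunnel in an ngrok /api/tunnels reply."""
    payload = json.loads(tunnels_json)
    return [t["public_url"] for t in payload.get("tunnels", [])]


# Made-up sample reply; a real one comes from http://localhost:4040/api/tunnels
sample = '{"tunnels": [{"public_url": "https://example.ngrok.io"}]}'
urls = public_urls(sample)
print(urls[0] if urls else "no tunnel is up yet")
```

Returning a list rather than indexing `[0]` directly avoids an `IndexError` when the query runs before ngrok has finished starting.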
Connecting to the instance over SSH
#Generate root password
import random, string
password = "".join(random.choice(string.ascii_letters + string.digits) for i in range(20))

#Download ngrok
! wget -q -c -nc https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
! unzip -qq -n ngrok-stable-linux-amd64.zip

#Setup sshd
! apt-get install -qq -o=Dpkg::Use-Pty=0 openssh-server pwgen > /dev/null

#Set root password
! echo root:$password | chpasswd
! mkdir -p /var/run/sshd
! echo "PermitRootLogin yes" >> /etc/ssh/sshd_config
! echo "PasswordAuthentication yes" >> /etc/ssh/sshd_config
! echo "LD_LIBRARY_PATH=/usr/lib64-nvidia" >> /root/.bashrc
! echo "export LD_LIBRARY_PATH" >> /root/.bashrc

#Run sshd
get_ipython().system_raw("/usr/sbin/sshd -D &")

#Ask token
print("Copy authtoken from https://dashboard.ngrok.com/auth")
import getpass
authtoken = getpass.getpass()

#Create tunnel
get_ipython().system_raw("./ngrok authtoken $authtoken && ./ngrok tcp 22 &")

#Print root password
print("Root password: {}".format(password))

#Get public address
! curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"
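Two small refinements to the script above can be sketched in plain Python. First, the password can be drawn from the standard `secrets` module instead of `random`; this is a swapped-in technique, not what the script above does. Second, the printed tunnel address can be turned into a ready-to-paste `ssh` command. The function names and the example address are hypothetical, and the sketch assumes the address has the form `tcp://host:port`.

```python
import secrets
import string
from urllib.parse import urlparse


def make_password(length=20):
    """Random letters-and-digits password drawn from the OS CSPRNG."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def ssh_command(public_url, user="root"):
    """Turn a tunnel address such as tcp://host:port into an ssh command."""
    parsed = urlparse(public_url)
    if parsed.scheme != "tcp" or parsed.hostname is None or parsed.port is None:
        raise ValueError("expected an address of the form tcp://host:port")
    return "ssh {}@{} -p {}".format(user, parsed.hostname, parsed.port)


password = make_password()
# Hypothetical address, for illustration only
print(ssh_command("tcp://0.tcp.example.com:12345"))
```

Because the tunnel exposes root login with a password to the public internet, a generator meant for secrets is a reasonable default here.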
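Your data is in the /content/ directory.
At present, a free ngrok account does not support two tunnels at the same time. If ngrok is already running for TensorBoard, you can terminate it as follows.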
!kill $(ps aux | grep './ngrok' | awk '{print $2}')
Exchanging data between Google Colab and Kaggle
To upload and download data between Colab and Kaggle, you need to install the Kaggle API library, available at: https://github.com/Kaggle/kaggle-api
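The Kaggle API client reads its credentials from a `kaggle.json` file in a `.kaggle` directory under the home directory. A minimal sketch of creating that file from Python is shown below. The function name `write_kaggle_credentials` and the placeholder credentials are assumptions for illustration, and the demo writes to a temporary directory rather than the real home directory.

```python
import json
import os
import stat
import tempfile


def write_kaggle_credentials(username, key, home_dir):
    """Create <home_dir>/.kaggle/kaggle.json readable only by the owner."""
    config_dir = os.path.join(home_dir, ".kaggle")
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "kaggle.json")
    with open(path, "w") as fp:
        json.dump({"username": username, "key": key}, fp)
    # Restrict the file to mode 0o600 so other users cannot read the key
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path


# Placeholder credentials written to a temporary directory for illustration
demo_home = tempfile.mkdtemp()
cred_path = write_kaggle_credentials("your-username", "your-api-key", demo_home)
print(cred_path)
```

In Colab you would pass `os.path.expanduser("~")` as `home_dir`, install the client with `!pip install kaggle`, and then fetch data with a command such as `!kaggle competitions download -c <competition-name>`, where the competition name is whatever you are working on.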