Experience Automated Machine Learning with a New Framework That Can Save You $20 an Hour


What Is Automated Machine Learning?

Over the past two years, a new direction has emerged in artificial intelligence: automated machine learning (AutoML), which lets people build models without programming. As we all know, the workflow for developing a machine learning model is time-consuming and technically demanding, covering data preparation, feature selection, model or technique selection, training, and tuning. AutoML was born in this context: it draws on many different statistical and deep learning techniques with the goal of automating the entire machine learning development workflow.

In short, automated machine learning can take care of the following tasks for us (a minimal sketch of the hyperparameter step follows this list):

  • Preprocessing and cleaning the data
  • Selecting and constructing appropriate features
  • Selecting an appropriate family of models
  • Optimizing model hyperparameters
  • Post-processing machine learning models
  • Rigorously analyzing the results obtained
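
To make the hyperparameter-optimization step concrete, here is a minimal sketch of an automated search using scikit-learn's RandomizedSearchCV. This is our own illustration, not part of Auto-Keras or Google AutoML: the search samples random configurations from a declared space and keeps the best one, with no manual tuning involved.

from scipy.stats import randint
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_digits(return_X_y=True)

# The search space: distributions from which hyperparameters are sampled.
param_dist = {
    'n_estimators': randint(10, 200),
    'max_depth': randint(2, 16),
}

# Try 20 random configurations with 3-fold cross-validation and keep the best.
search = RandomizedSearchCV(RandomForestClassifier(), param_dist,
                            n_iter=20, cv=3, random_state=42)
search.fit(X, y)
print(search.best_params_, search.best_score_)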

As you can see, AutoML can dramatically shorten the time it takes to build a machine learning model and reduce the workload; a tool that automatically searches for deep learning architectures and tunes their parameters is something data scientists have long wished for. It also lets business users without a solid programming background develop machine learning models, which is an important part of democratizing AI tools. Automated machine learning is therefore shaping up to be a major trend.

So AI giant Google set its sights on this market and launched the Google AutoML platform in January of this year, letting people who know nothing about machine learning train a customized model of their own, greatly lowering the barrier to machine learning development. Google AutoML has many virtues, however, and one flaw: it costs money, and a lot of it, at $20 per hour. It practically qualifies as a "toy for the rich."

Google's decision to charge for AutoML drew plenty of criticism from the AI research community, and this month a new automated machine learning framework was released to challenge Google AutoML head-on. That framework is Auto-Keras, developed by Xia Hu, an assistant professor at Texas A&M University, together with two of his PhD students, Haifeng Jin and Qingquan Song. Built on the popular machine learning tool Keras, it is fully open source and free, offers functionality very similar to Google AutoML's, and can automatically search for architectures and hyperparameters of deep learning models.

Installing Auto-Keras is simple; a single command is enough:

pip install autokeras

Practical Use

We'll walk through the example used in the official Auto-Keras documentation. First, though, let's see what the same task looks like with current machine learning frameworks such as TensorFlow and PyTorch, and then switch to Auto-Keras for comparison. We use MNIST, the dataset that almost every machine learning beginner starts with.

MNIST is a very simple computer vision dataset consisting of 28×28 grayscale images of handwritten digits.

Each image comes with a label telling us which digit it shows.
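
If you'd like a quick look at the data yourself, a couple of lines with Keras's built-in loader will do (the shapes below are the standard MNIST split):

from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape)  # (60000, 28, 28): 60,000 training images, 28x28 pixels each
print(y_train[:10])   # digit labels for the first ten training images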

Let's first build a handwritten digit recognizer with TensorFlow:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import time

# pylint: disable=g-bad-import-order
from absl import app as absl_app
from absl import flags
import tensorflow as tf
# pylint: enable=g-bad-import-order

from official.mnist import dataset as mnist_dataset
from official.mnist import mnist
from official.utils.flags import core as flags_core
from official.utils.misc import model_helpers

tfe = tf.contrib.eager


def loss(logits, labels):
  return tf.reduce_mean(
      tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits=logits, labels=labels))


def compute_accuracy(logits, labels):
  predictions = tf.argmax(logits, axis=1, output_type=tf.int64)
  labels = tf.cast(labels, tf.int64)
  batch_size = int(logits.shape[0])
  return tf.reduce_sum(
      tf.cast(tf.equal(predictions, labels), dtype=tf.float32)) / batch_size


def train(model, optimizer, dataset, step_counter, log_interval=None):
  """Trains model on `dataset` using `optimizer`."""
  start = time.time()
  for (batch, (images, labels)) in enumerate(dataset):
    with tf.contrib.summary.record_summaries_every_n_global_steps(
        10, global_step=step_counter):
      # Record the operations used to compute the loss given the input,
      # so the gradient of the loss with respect to the variables
      # can be computed.
      with tf.GradientTape() as tape:
        logits = model(images, training=True)
        loss_value = loss(logits, labels)
        tf.contrib.summary.scalar('loss', loss_value)
        tf.contrib.summary.scalar('accuracy', compute_accuracy(logits, labels))
      grads = tape.gradient(loss_value, model.variables)
      optimizer.apply_gradients(
          zip(grads, model.variables), global_step=step_counter)
      if log_interval and batch % log_interval == 0:
        rate = log_interval / (time.time() - start)
        print('Step #%d\tLoss: %.6f (%d steps/sec)' % (batch, loss_value, rate))
        start = time.time()


def test(model, dataset):
  """Perform an evaluation of `model` on the examples from `dataset`."""
  avg_loss = tfe.metrics.Mean('loss', dtype=tf.float32)
  accuracy = tfe.metrics.Accuracy('accuracy', dtype=tf.float32)

  for (images, labels) in dataset:
    logits = model(images, training=False)
    avg_loss(loss(logits, labels))
    accuracy(
        tf.argmax(logits, axis=1, output_type=tf.int64),
        tf.cast(labels, tf.int64))
  print('Test set: Average loss: %.4f, Accuracy: %4f%%\n' %
        (avg_loss.result(), 100 * accuracy.result()))
  with tf.contrib.summary.always_record_summaries():
    tf.contrib.summary.scalar('loss', avg_loss.result())
    tf.contrib.summary.scalar('accuracy', accuracy.result())


def run_mnist_eager(flags_obj):
  """Run MNIST training and eval loop in eager mode.

  Args:
    flags_obj: An object containing parsed flag values.
  """
  tf.enable_eager_execution()
  model_helpers.apply_clean(flags.FLAGS)

  # Automatically determine device and data_format
  (device, data_format) = ('/gpu:0', 'channels_first')
  if flags_obj.no_gpu or not tf.test.is_gpu_available():
    (device, data_format) = ('/cpu:0', 'channels_last')
  # If data_format is defined in FLAGS, overwrite the automatically set value.
  if flags_obj.data_format is not None:
    data_format = flags_obj.data_format
  print('Using device %s, and data format %s.' % (device, data_format))

  # Load the datasets
  train_ds = mnist_dataset.train(flags_obj.data_dir).shuffle(60000).batch(
      flags_obj.batch_size)
  test_ds = mnist_dataset.test(flags_obj.data_dir).batch(
      flags_obj.batch_size)

  # Create the model and optimizer
  model = mnist.create_model(data_format)
  optimizer = tf.train.MomentumOptimizer(flags_obj.lr, flags_obj.momentum)

  # Create file writers for writing TensorBoard summaries.
  if flags_obj.output_dir:
    # Create directories to which summaries will be written;
    # tensorboard --logdir=<output_dir>
    # can then be used to see the recorded summaries.
    train_dir = os.path.join(flags_obj.output_dir, 'train')
    test_dir = os.path.join(flags_obj.output_dir, 'eval')
    tf.gfile.MakeDirs(flags_obj.output_dir)
  else:
    train_dir = None
    test_dir = None
  summary_writer = tf.contrib.summary.create_file_writer(
      train_dir, flush_millis=10000)
  test_summary_writer = tf.contrib.summary.create_file_writer(
      test_dir, flush_millis=10000, name='test')

  # Create and restore the checkpoint (if one exists on the path)
  checkpoint_prefix = os.path.join(flags_obj.model_dir, 'ckpt')
  step_counter = tf.train.get_or_create_global_step()
  checkpoint = tf.train.Checkpoint(
      model=model, optimizer=optimizer, step_counter=step_counter)
  # Restore variables on creation if a checkpoint exists.
  checkpoint.restore(tf.train.latest_checkpoint(flags_obj.model_dir))

  # Train and evaluate for a set number of epochs.
  with tf.device(device):
    for _ in range(flags_obj.train_epochs):
      start = time.time()
      with summary_writer.as_default():
        train(model, optimizer, train_ds, step_counter,
              flags_obj.log_interval)
      end = time.time()
      print('\nTrain time for epoch #%d (%d total steps): %f' %
            (checkpoint.save_counter.numpy() + 1,
             step_counter.numpy(),
             end - start))
      with test_summary_writer.as_default():
        test(model, test_ds)
      checkpoint.save(checkpoint_prefix)


def define_mnist_eager_flags():
  """Defined flags and defaults for MNIST in eager mode."""
  flags_core.define_base_eager()
  flags_core.define_image()
  flags.adopt_module_key_flags(flags_core)

  flags.DEFINE_integer(
      name='log_interval', short_name='li', default=10,
      help=flags_core.help_wrap('batches between logging training status'))

  flags.DEFINE_string(
      name='output_dir', short_name='od', default=None,
      help=flags_core.help_wrap('Directory to write TensorBoard summaries'))

  flags.DEFINE_float(name='learning_rate', short_name='lr', default=0.01,
                     help=flags_core.help_wrap('Learning rate.'))

  flags.DEFINE_float(name='momentum', short_name='m', default=0.5,
                     help=flags_core.help_wrap('SGD momentum.'))

  flags.DEFINE_bool(name='no_gpu', short_name='nogpu', default=False,
                    help=flags_core.help_wrap(
                        'disables GPU usage even if a GPU is available'))

  flags_core.set_defaults(
      data_dir='/tmp/tensorflow/mnist/input_data',
      model_dir='/tmp/tensorflow/mnist/checkpoints/',
      batch_size=100,
      train_epochs=10,
  )


def main(_):
  run_mnist_eager(flags.FLAGS)


if __name__ == '__main__':
  define_mnist_eager_flags()
  absl_app.run(main=main)

Not exactly easy. TensorFlow may not be the most user-friendly tool for building deep learning models, but it runs fast and is very reliable.

Source:

github.com/tensorflow/m

Now let's switch to PyTorch:

from __future__ import print_function
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


def train(args, model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))


def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.max(1, keepdim=True)[1]  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))


def main():
    # Training settings
    parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
    parser.add_argument('--batch-size', type=int, default=64, metavar='N',
                        help='input batch size for training (default: 64)')
    parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',
                        help='input batch size for testing (default: 1000)')
    parser.add_argument('--epochs', type=int, default=10, metavar='N',
                        help='number of epochs to train (default: 10)')
    parser.add_argument('--lr', type=float, default=0.01, metavar='LR',
                        help='learning rate (default: 0.01)')
    parser.add_argument('--momentum', type=float, default=0.5, metavar='M',
                        help='SGD momentum (default: 0.5)')
    parser.add_argument('--no-cuda', action='store_true', default=False,
                        help='disables CUDA training')
    parser.add_argument('--seed', type=int, default=1, metavar='S',
                        help='random seed (default: 1)')
    parser.add_argument('--log-interval', type=int, default=10, metavar='N',
                        help='how many batches to wait before logging training status')
    args = parser.parse_args()
    use_cuda = not args.no_cuda and torch.cuda.is_available()

    torch.manual_seed(args.seed)

    device = torch.device("cuda" if use_cuda else "cpu")

    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('../data', train=True, download=True,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args.batch_size, shuffle=True, **kwargs)
    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('../data', train=False,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args.test_batch_size, shuffle=True, **kwargs)

    model = Net().to(device)
    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)

    for epoch in range(1, args.epochs + 1):
        train(args, model, device, train_loader, optimizer, epoch)
        test(args, model, device, test_loader)


if __name__ == '__main__':
    main()

Source:

github.com/pytorch/exam

And then in Keras:

'''Trains a simple convnet on the MNIST dataset.

Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Source:

github.com/keras-team/k

As you can see, Keras is the simplest of these packages for the MNIST case. It has many excellent features that let us build a model from scratch very quickly.

Alright, time for the main course. Let's see how the same task goes with Auto-Keras:

from keras.datasets import mnist
from autokeras.classifier import ImageClassifier

if __name__ == '__main__':
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train.reshape(x_train.shape + (1,))
    x_test = x_test.reshape(x_test.shape + (1,))

    clf = ImageClassifier(verbose=True, augment=False)
    clf.fit(x_train, y_train, time_limit=12 * 60 * 60)
    clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
    y = clf.evaluate(x_test, y_test)
    print(y * 100)

That's right, that's all of it: those few lines of code do the whole job. All we need is an ImageClassifier; then we fit the data and evaluate. We also get a final_fit, which performs a final round of training once the best model architecture has been found. A lighter-weight variation is sketched below.
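
If a twelve-hour search is more than you want to commit to on a first run, the same calls accept a smaller budget. The sketch below is our own variation on the snippet above, using only the arguments already shown there: time_limit appears to be given in seconds (hence 12 * 60 * 60 above), and augment=True is simply the other value of the flag; we have not benchmarked either setting.

from keras.datasets import mnist
from autokeras.classifier import ImageClassifier

if __name__ == '__main__':
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train.reshape(x_train.shape + (1,))
    x_test = x_test.reshape(x_test.shape + (1,))

    # A 30-minute smoke-test search instead of the 12-hour budget above;
    # augment=True enables data augmentation during the search.
    clf = ImageClassifier(verbose=True, augment=True)
    clf.fit(x_train, y_train, time_limit=30 * 60)
    clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
    print(clf.evaluate(x_test, y_test) * 100)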

Because all the code is open source, you can even dig into its parameters if you want truly deep customization. And since it is built on Keras, the code is not complicated; it helps developers build models quickly and accurately, and lets researchers study architecture search in depth.

You could say Auto-Keras has everything a good open-source project should have: quick to install, simple to run, rich in features, and easy to modify. Most importantly, because it is open source, it gives anyone who needs automated machine learning an excellent free alternative to Google AutoML, with no more gritting your teeth over $20 an hour.

Of course, since Auto-Keras has only just been released, it still has plenty of room to improve: it currently supports only Python 3.6, and without extra configuration it will readily saturate your CPU or GPU. But every open-source project matures gradually, and we look forward to Auto-Keras getting better and better.

Auto-Keras website:

autokeras.com/

GitHub repository:

github.com/jhfjhfj1/aut

Research paper:

arxiv.org/abs/1806.1028

Reference:

towardsdatascience.com/


