Magenta Modification Notes 1: Reading MIDI Files

10-14


Previous post: Magenta Modification Notes 0: First Look at Magenta

GitHub repository for this project:

https://github.com/lukewys/Magenta-Modification

It contains all the text and code.


Magenta version: 0.3.6, TensorFlow version: 1.9.0

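Magenta accepts MIDI (.mid/.midi), MusicXML (.xml/.mxl), and ABC (http://abcnotation.com; I have not tested it) files as training data. Building a dataset usually takes two stages: first convert the raw files into a single TFRecord file, then clean and process that data in whatever way each model requires.

This post focuses on the first stage, converting MIDI/MusicXML files directly into a TFRecord file. The corresponding instructions on GitHub are at https://github.com/tensorflow/magenta/tree/master/magenta/scripts#building-your-dataset

The command line given at that link is: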

INPUT_DIRECTORY=<folder containing MIDI and/or MusicXML files. can have child folders.>

# TFRecord file that will contain NoteSequence protocol buffers.
SEQUENCES_TFRECORD=/tmp/notesequences.tfrecord

convert_dir_to_note_sequences \
  --input_dir=$INPUT_DIRECTORY \
  --output_file=$SEQUENCES_TFRECORD \
  --recursive

The bazel command line for this step (taken from a comment in the source code) is:

Example usage:
  $ bazel build magenta/scripts:convert_dir_to_note_sequences
  $ ./bazel-bin/magenta/scripts/convert_dir_to_note_sequences \
    --input_dir=/path/to/input/dir \
    --output_file=/path/to/tfrecord/file \
    --num_threads=4 \
    --log=INFO

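The two command lines use different sets of flags, which suggests that the Magenta team has not kept its documentation and API descriptions carefully in sync.

Modification 1.0:

Next, let's see how to change the parameters of this preprocessing step.

The script that runs in this step is located at: https://github.com/tensorflow/magenta/blob/master/magenta/scripts/convert_dir_to_note_sequences.py

Opening the source, we see that the program starts by defining a series of flags with tf.app.flags: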

FLAGS = tf.app.flags.FLAGS

# Input path.
tf.app.flags.DEFINE_string(
    'input_dir', None,
    'Directory containing files to convert.')
# Output path.
tf.app.flags.DEFINE_string(
    'output_file', None,
    'Path to output TFRecord file. Will be overwritten if it already exists.')
# Whether to search subdirectories recursively.
tf.app.flags.DEFINE_bool(
    'recursive', False,
    'Whether or not to recurse into subdirectories.')
# Number of threads. If there are many data files and the CPU can handle it,
# a relatively large value is recommended.
tf.app.flags.DEFINE_integer(
    'num_threads', 1,
    'Number of worker threads to run in parallel.')
# Which kinds of messages are shown.
tf.app.flags.DEFINE_string(
    'log', 'INFO',
    'The threshold for what messages will be logged DEBUG, INFO, WARN, ERROR, or FATAL.')

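These are TensorFlow's flags for passing parameters from the command line; in this TensorFlow version they are implemented on top of absl.flags (note the absl.flags section in the help output further down). A flag that is not given on the command line takes the default value written in the program. The script therefore has five flags in total. Neither of the two command lines above sets all of them, yet both run, because only the input path and the output path lack a usable default (their default is None). Running

python convert_dir_to_note_sequences.py -h

prints the script's documentation along with every flag and its details. To customize the flags, we can pass them on the command line: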

python convert_dir_to_note_sequences.py --input_dir=E:\MagentaDataset\raw\bach --output_file=E:\MagentaDataset\pre\bach.tfrecord --recursive=True --num_threads=4

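Alternatively, we can treat the flag definitions shown earlier as hyperparameter declarations, edit the default value (the second argument) directly in the script, and then run it.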
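To make the default-value behavior concrete, here is a minimal sketch that mirrors the five flags using the standard library's argparse. It is only an analogy and not Magenta's code: the real script uses tf.app.flags, and the helper name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the five flags of the conversion script."""
    parser = argparse.ArgumentParser(
        description='Illustrative stand-in for convert_dir_to_note_sequences.')
    # The two flags whose default is None have to be supplied by the caller.
    parser.add_argument('--input_dir', default=None)
    parser.add_argument('--output_file', default=None)
    # The remaining three fall back to their defaults when omitted.
    parser.add_argument('--recursive', action='store_true')
    parser.add_argument('--num_threads', type=int, default=1)
    parser.add_argument('--log', default='INFO')
    return parser


args = build_parser().parse_args(
    ['--input_dir=/path/to/input/dir',
     '--output_file=/tmp/notesequences.tfrecord'])
print(args.recursive, args.num_threads, args.log)  # False 1 INFO
```

Only the two paths are given, and the other three values come from the defaults, which is why both command lines earlier in this post work even though neither lists every flag.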

Modification 2.0

Magenta version: 0.3.6

Next, let's look at how this step works in detail and at the data type the file stores.

Source code: https://github.com/tensorflow/magenta/blob/master/magenta/scripts/convert_dir_to_note_sequences.py

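Roughly, the program runs in three steps:
1. Scan the input directory (and its subdirectories) for all supported files and build a list of file paths.
2. Process the files in that list with multiple threads.
3. Save the results to a .tfrecord file.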

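Step 1 corresponds to the function queue_conversions(root_dir, sub_dir, pool, recursive=False), which I will not go into here.

Step 2 corresponds to the two functions convert_midi(root_dir, sub_dir, full_file_path) and convert_musicxml(root_dir, sub_dir, full_file_path). As the names suggest, they handle MIDI and XML files respectively (I do not know which function handles the ABC data mentioned at the start). Their parameters and return values are described in detail in the function docstrings. In short, they take the root directory, the directory that contains the file, and the full file path, and they return a NoteSequence proto, the data type Magenta uses to represent a sequence of notes.

Step 3 corresponds to convert_directory(root_dir, output_file, num_threads, recursive=False), the top-level function that drives the whole process.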
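The three steps can be sketched with the standard library alone. This is an illustration of the general pattern and not Magenta's implementation: find_files, convert_file, and convert_directory_sketch are hypothetical names, the converter is a placeholder that returns a small dict instead of a NoteSequence, and writing the TFRecord file (step 3) is left out.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# File extensions mentioned in this post.
SUPPORTED = ('.mid', '.midi', '.xml', '.mxl')


def find_files(root_dir, recursive=False):
    """Step 1: collect the paths of files with a supported extension."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if name.lower().endswith(SUPPORTED):
                paths.append(os.path.join(dirpath, name))
        if not recursive:
            break  # os.walk yields root_dir first, so stop after it
    return paths


def convert_file(path):
    """Step 2 placeholder: a real converter would return a NoteSequence."""
    return {'filename': path, 'size': os.path.getsize(path)}


def convert_directory_sketch(root_dir, num_threads=1, recursive=False):
    """Run steps 1 and 2 and return the converted results as a list."""
    paths = find_files(root_dir, recursive)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(convert_file, paths))
```

Calling convert_directory_sketch(some_dir, num_threads=4, recursive=True) returns one result per supported file; a full version would then serialize each result into the output file.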

First, we can import this file as a module:

import tensorflow as tf
import magenta as mgt
import magenta.scripts.convert_dir_to_note_sequences as cvrt

After importing it, we can inspect its flags through the module's FLAGS attribute:

print(cvrt.FLAGS)

magenta.scripts.convert_dir_to_note_sequences:
  --input_dir: Directory containing files to convert.
  --log: The threshold for what messages will be logged DEBUG, INFO, WARN, ERROR, or FATAL.
    (default: 'INFO')
  --num_threads: Number of worker threads to run in parallel.
    (default: '1')
    (an integer)
  --output_file: Path to output TFRecord file. Will be overwritten if it already exists.
  --[no]recursive: Whether or not to recurse into subdirectories.
    (default: 'false')

absl.flags:
  --flagfile: Insert flag definitions from the given file into the command line.
    (default: '')
  --undefok: comma-separated list of flag names that it is okay to specify on the command line even if the program does not define a flag with that name. IMPORTANT: flags in this list that have arguments MUST use the --flag=value format.
    (default: '')

# This line is needed because Jupyter notebook has a bug with tf.app.flags.FLAGS.
# See https://github.com/tensorflow/tensorflow/issues/17702
tf.app.flags.DEFINE_string('f', '', 'kernel')

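So we can also run the program by setting attributes on FLAGS:

The code here and below uses absolute paths, so change them before running it yourself. The files are included in the GitHub repository.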

cvrt.FLAGS.input_dir = r'E:\MagentaDataset\xml_raw'
cvrt.FLAGS.output_file = r'E:\MagentaDataset\pre\bach.tfrecord'
cvrt.FLAGS.num_threads = 4
cvrt.FLAGS.recursive = True
cvrt.FLAGS.log = 'INFO'
tf.app.run(cvrt.main)

INFO:tensorflow:Converting files in E:\MagentaDataset\xml_raw.
INFO:tensorflow:0 files converted.
INFO:tensorflow:Converted MusicXML file E:\MagentaDataset\xml_raw\bwv1.6.mxl.
INFO:tensorflow:Converted MusicXML file E:\MagentaDataset\xml_raw\bwv2.6.mxl.
An exception has occurred, use %tb to see the full traceback.

SystemExit

D:\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py:2918: UserWarning: To exit: use exit, quit, or Ctrl-D.
  warn("To exit: use exit, quit, or Ctrl-D.", stacklevel=1)
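A side note on the Windows paths used in this post: they are meant to be raw strings (the r prefix). The short check below, which uses only a fragment of the path above, shows what happens without the prefix: Python reads "\b" as a single backspace character, so the path silently changes.

```python
# Without the r prefix, "\b" is interpreted as a backspace character.
plain = 'xml_raw\bwv1.6.mxl'
# With the r prefix, the backslash and the "b" are kept as two characters.
raw = r'xml_raw\bwv1.6.mxl'

print(len(plain), len(raw))        # 17 18
print('\\' in plain, '\\' in raw)  # False True
```

Only the raw string still contains a backslash, so only it refers to the file bwv1.6.mxl inside xml_raw.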

As mentioned above, this file contains the two functions convert_midi(root_dir, sub_dir, full_file_path) and convert_musicxml(root_dir, sub_dir, full_file_path).

Let's run each conversion function in turn and look at what it returns.

full_file_path_xml = r'E:\MagentaDataset\xml_raw\bwv1.6.mxl'
root_dir_xml = r'E:\MagentaDataset\xml_raw'
sub_dir_xml = r'E:\MagentaDataset\xml_raw'
sequence_xml = cvrt.convert_musicxml(root_dir_xml, sub_dir_xml, full_file_path_xml)

INFO:tensorflow:Converted MusicXML file E:\MagentaDataset\xml_raw\bwv1.6.mxl.

We can see that sequence_xml is a data type built on Google protobuf.

print(type(sequence_xml))
<class 'magenta.protobuf.music_pb2.NoteSequence'>

#print(sequence_xml)

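The full printout (the commented-out print(sequence_xml) above) shows that the message contains the file path, an id, and the contents of the XML file. Most of it appears to be converted directly from the XML, but it is stored in structured fields.

So we can also access its fields directly: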

print(sequence_xml.id)
print(sequence_xml.filename)
/id/musicxml/xml_raw/efab6353dd5ebf096204bc6316d1d2f003b156c7
E:\MagentaDataset\xml_raw\bwv1.6.mxl

print(type(sequence_xml.notes))
<class 'google.protobuf.pyext._message.RepeatedCompositeContainer'>

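The notes field of sequence_xml holds the main content: it records every note. It supports indexing, and each note consists of a pitch, a timbre (instrument and program), a start time, an end time, and so on.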

print(sequence_xml.notes[0])
pitch: 65
velocity: 64
end_time: 0.5
numerator: 1
denominator: 4
instrument: 7
program: 1
voice: 1
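The pitch value is a MIDI note number. As a small aid for reading the printout, the sketch below names a pitch under the common convention that 60 is C4 (other conventions label it C3 or C5); pitch_name is a hypothetical helper and not part of Magenta.

```python
NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']


def pitch_name(midi_pitch):
    """Name a MIDI pitch number, assuming the convention that 60 is C4."""
    octave = midi_pitch // 12 - 1
    return '%s%d' % (NOTE_NAMES[midi_pitch % 12], octave)


print(pitch_name(65))  # F4, the first note printed above
```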

I exported the XML to MIDI with MuseScore 2 and then ran the following test:

full_file_path_midi = r'E:\MagentaMidi\bwv1.6.mid'
root_dir_midi = r'E:\MagentaMidi'
sub_dir_midi = r'E:\MagentaMidi'
sequence_midi = cvrt.convert_midi(root_dir_midi, sub_dir_midi, full_file_path_midi)

INFO:tensorflow:Converted MIDI file E:\MagentaMidi\bwv1.6.mid.
D:\Anaconda3\envs\tensorflow\lib\site-packages\pretty_midi\pretty_midi.py:100: RuntimeWarning: Tempo, Key or Time signature change events found on non-zero tracks. This is not a valid type 0 or type 1 MIDI file. Tempo, Key or Time Signature may be wrong.
  RuntimeWarning)

We can see that the MIDI file is stored in much the same format as the XML one, but the start and end times look messy.

#print(sequence_midi)
print(type(sequence_midi.notes))
<class 'google.protobuf.pyext._message.RepeatedCompositeContainer'>

print(sequence_midi.notes[0])
print(sequence_midi.notes[1])
print(sequence_midi.notes[288])
pitch: 65
velocity: 80
end_time: 0.7878292625

pitch: 67
velocity: 80
start_time: 0.789474
end_time: 1.1825662625

pitch: 67
velocity: 80
start_time: 51.31581
end_time: 52.1036392625
instrument: 2

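We can see that the notes read from the MIDI file carry comparatively little information, and the format does not look very regular either. This may be because my conversion method was not good enough.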

full_file_path_midi = r'E:\MagentaMidi\Bwv0525 Sonate en trio n1.mid'
root_dir_midi = r'E:\MagentaMidi'
sub_dir_midi = r'E:\MagentaMidi'
sequence_midi = cvrt.convert_midi(root_dir_midi, sub_dir_midi, full_file_path_midi)

INFO:tensorflow:Converted MIDI file E:\MagentaMidi\Bwv0525 Sonate en trio n1.mid.

#print(sequence_midi)
print(sequence_midi.notes[0])
print(sequence_midi.notes[1])
print(sequence_midi.notes[2880])
pitch: 70
velocity: 92
start_time: 6.4
end_time: 6.800000000000001

pitch: 74
velocity: 92
start_time: 6.800000000000001
end_time: 7.2

pitch: 69
velocity: 97
start_time: 139.8
end_time: 140.0
instrument: 1
program: 19

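I switched to a MIDI file taken from a MIDI dataset, and the result seems to be the same. I will look for the exact cause and analyze it in the next part.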
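One plausible reason why some fields seem to be missing, offered here as an assumption and not as a confirmed finding: if NoteSequence is a proto3 message, its text printout leaves out any numeric field whose value equals the default (0 or 0.0). Under that assumption, a note with no start_time line simply starts at 0.0. The sketch below parses a printed note and fills the omitted fields back in; parse_note and NOTE_DEFAULTS are hypothetical helpers and not Magenta API.

```python
# Fields seen in this post's printouts, with the numeric defaults assumed above.
NOTE_DEFAULTS = {'pitch': 0, 'velocity': 0, 'start_time': 0.0,
                 'end_time': 0.0, 'instrument': 0, 'program': 0}


def parse_note(text):
    """Parse 'name: value' lines and fill omitted fields with defaults."""
    note = dict(NOTE_DEFAULTS)
    for line in text.strip().splitlines():
        name, value = line.split(':', 1)
        name = name.strip()
        # Time fields are floats; every other field here is an integer.
        note[name] = float(value) if name.endswith('_time') else int(value)
    return note


first = parse_note('pitch: 65\nvelocity: 80\nend_time: 0.7878292625')
print(first['start_time'], first['instrument'])  # 0.0 0
duration = first['end_time'] - first['start_time']
```

Read this way, the first MIDI note above starts at time 0.0 on instrument 0, and its duration equals its end_time.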

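Summary

We have walked through how Magenta consolidates its raw data and looked at the functions that read MIDI and XML files.

If you want to work on a project of your own, using Magenta's data processing functions directly is a good option.

There are other ways to read and process the data as well, such as the magenta.pipelines module and the pretty_midi library, which are both good choices. I will not go into them here; they may come up in later tutorials.

I will explain the MIDI and XML data that was read here in the next part.

Finally, Magenta stores the data in the generated tfrecord as serialized binary strings, which makes reading it back a little involved; I will cover that in detail when there is a chance.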
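As a preview of what reading it back involves, here is a minimal sketch that assumes the standard TFRecord framing: each record is an 8-byte little-endian length, a 4-byte checksum of that length, the payload bytes, and a 4-byte checksum of the payload. The function names are hypothetical, the checksums are skipped instead of verified, and the demo runs on made-up bytes, not on a real Magenta file.

```python
import io
import struct


def read_tfrecord_payloads(stream):
    """Yield raw record payloads from a TFRecord byte stream.

    Checksums are skipped, not verified, to keep the sketch short.
    """
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream
        (length,) = struct.unpack('<Q', header)  # little-endian uint64 length
        stream.read(4)                           # checksum of the length (ignored)
        payload = stream.read(length)
        stream.read(4)                           # checksum of the payload (ignored)
        yield payload


def fake_record(payload):
    """Frame a payload like a record, with zeroed checksums (demo data only)."""
    return struct.pack('<Q', len(payload)) + b'\0' * 4 + payload + b'\0' * 4


stream = io.BytesIO(fake_record(b'first') + fake_record(b'second'))
print(list(read_tfrecord_payloads(stream)))  # [b'first', b'second']
```

In the file produced by the conversion script, each payload would be one serialized NoteSequence protocol buffer, which still has to be parsed with the protobuf class before its fields can be read.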