PROJECT: MUSICAL MACHINE LEARNING
Challenge: As part of the 2018 launch of ML.NET, the new machine learning framework for .NET, I developed all of the “Getting Started” materials. I crafted all documentation, sample, tutorials, blog posts, and press releases. I also was in charge of creating an interesting demo that could show the power of ML.NET in an unique setting. I decided to combine my passion for music with technology to create a machine learning model that could predict missing notes in a piece of music.
Process:
The process that I followed to create the model can be broken into five steps:
Step 1: Find a data set
One of the biggest hurdles to creating a machine learning model is finding or gathering the right set of data to be able to train with. More data can help improve accuracy of your models, but is not always easy to come by. Luckily, UC Irvine has a repository of machine learning datasets that are available for public use. I was able to find a dataset that contained Bach chorales to use as the basis for my model.
Step 2: Choose the machine learning task
There are different types of machine learning tasks (e.g. regression, classification, etc.) that can be used depending on the scenario. My scenario was about predicting missing notes in a piece of music. At the time of this project, ML.NET did not support time series regression and so I choose to turn this into a multi-classification problem. My model was built to answer the question of “based on all of the other notes inside of this measure, what is the likely note that is missing?”.
Step 3: Transform the data
It’s rare to find data in the exact format that you would like it in to be able to train a machine learning model. Sometimes, there are extra parameters that aren’t related to the outcome you are trying to pick, or sometimes there are categorical variables that need to be encoded for an algorithm to process them. In my case, the incoming data had to be transformed quite significantly:
Step 4: Choose a learning algorithm
Depending on the machine learning task that you choose, every machine learning framework has a certain set of learning algorithms that are available for training your model. Since this was a classification problem, I chose a classification algorithm called Stochastic Dual Coordinate Ascent Classifier (SDCA). Learning algorithms can also significantly change the accuracy of your model, so I tried a few before landing on this as the optimal one for this scenario.
Step 5: Train and evaluate the model
The last step was to train and evaluate the model, through unit tests to make sure it could accurately predict the missing note in a measure. As you can see from the picture, the micro accuracy based on the data and algorithm I chose was only .34 (on a scale of 0-1). However, in music there are many “correct” patterns of notes that belong together in a measure, and so I capitalizing on this forgiving nature and moved forward to test this out in a real scenario.
Results: Finally, it was time to test the model out and see if it could make predictions on a new piece of music outside of the data set I had used to train the model. I created a fake melody according to music theory principles, and removed some notes. I then used the model to predict how to fill in those notes, and the results were astounding!
Check out these videos to see the process and results in action: