How to Use Bootleg Score: A Low Resource Optical Music Recognition System

by SamuGC, July 23rd, 2024


DISCLAIMER: The code included in this tutorial belongs to TJ Tsai, the creator of Bootleg Score. This tutorial is designed to facilitate the initial approach for new researchers. If you have any questions, do not hesitate to contact him: [email protected].




Without Optical Music Recognition (OMR), computers cannot interpret a music score, and many tasks in Music Information Retrieval cannot be carried out. Put simply, OMR transforms the data in a music score into a format that machines can understand, such as the well-known MIDI, MusicXML, or piano roll representations.


In this short tutorial, I aim to introduce you to Bootleg Score, a mid-level music representation that facilitates music sheet processing without requiring significant computational power or extensive libraries and dependencies. Since I have been working with this representation lately, I think its possibilities should be known by more people.


This representation was created by TJ Tsai and his lab at Harvey Mudd College. It represents the positions of filled noteheads on a staff using zeros and ones, explicitly encoding the rules of Western music notation for piano sheet music. Some of their articles showing applications of the Bootleg Score representation, and the results obtained with it, include the following:


  1. MIDI Passage Retrieval Using Cell Phone Pictures Of Sheet Music
  2. Composer Style Classification Of Piano Sheet Music Images Using Language Model Pretraining
  3. Camera-Based Piano Sheet Music Identification
  4. MIDI-Sheet Music Alignment Using Bootleg Score Synthesis


To make the first approach to this representation easier, all the files needed to obtain the Bootleg Score representation of a sheet music PDF are available in the following GitHub repository: How To Use Bootleg Score


INSTALLATION

All requirements are listed in the requirements.txt file of the repository. I recommend working on a Linux system such as Ubuntu. Once you have created a Conda environment, installing all the dependencies is as easy as running the following command in the terminal.


pip install -r requirements.txt


Some of the most important libraries are OpenCV, frequently used for computer vision, and Scikit-Learn, used for data analysis.


HOW BOOTLEG SCORE WORKS


To build this low-resource optical music recognition system, it is necessary to detect three things in the sheet music: filled noteheads, staff lines, and bar lines. All the functions mentioned in the next paragraphs are included in the ExtractBootlegFeatures1.py file from the repository.


NOTEHEAD DETECTION


Filled noteheads are located with a simple blob detector, and the candidates are compared against a notehead template. The function adaptiveNoteheadDetect() performs the adaptive detection of noteheads in the image. With the resulting list of notes (nhlocs), we can obtain the required information about every detected notehead. Another function takes a list of bounding boxes representing noteheads, calculates the center coordinates of each one, and provides an average estimate of the width and height of the noteheads in that list.
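To make the last step concrete, here is a minimal sketch of turning bounding boxes into center coordinates and an average notehead size. It is not the repository's code: the function name notehead_centers_and_size() and the (row_min, col_min, row_max, col_max) box layout are assumptions made for illustration.

```python
from statistics import mean


def notehead_centers_and_size(bboxes):
    """Return (centers, avg_height, avg_width) for a list of bounding boxes.

    Assumption: each box is (row_min, col_min, row_max, col_max) in pixels.
    """
    if not bboxes:
        raise ValueError("need at least one bounding box")
    # Center of each box as (row, col)
    centers = [((r0 + r1) / 2, (c0 + c1) / 2) for r0, c0, r1, c1 in bboxes]
    # Average box size, used as the estimated notehead size
    avg_height = mean(r1 - r0 for r0, c0, r1, c1 in bboxes)
    avg_width = mean(c1 - c0 for r0, c0, r1, c1 in bboxes)
    return centers, avg_height, avg_width


boxes = [(10, 20, 20, 34), (40, 50, 52, 62)]
centers, avg_h, avg_w = notehead_centers_and_size(boxes)
print(centers, avg_h, avg_w)
```

The average size is useful later because the spacing between staff lines is closely related to the size of a notehead.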





STAFF LINE DETECTION


The location and spacing of the staff lines are estimated using the noteheads detected in the previous step. The function isolateStaffLines() uses morphological operations to isolate the staff lines in an image: first the horizontal lines are identified, and then the note bars are removed so that only the staff lines remain. The getEstStaffLineLocs() function returns a list of predictions, where each prediction is a tuple containing the initial and final position of a staff line, the associated column and row, and the corresponding index in the featmap. The next step consists of assigning each notehead to the staff it belongs to. The final result will be similar to what is shown in the following image.
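The idea of isolating horizontal lines can be sketched without any library. The example below keeps only the ink pixels that belong to a long horizontal run, which is what a morphological opening with a one-pixel-high, wide kernel does. It is an illustrative sketch, not the implementation of isolateStaffLines(); the function name keep_long_horizontal_runs() and the min_len parameter are hypothetical, and the image is assumed to be a binary grid where 1 means ink.

```python
def keep_long_horizontal_runs(image, min_len):
    """Keep ink pixels that lie in a horizontal run of at least min_len pixels.

    Assumption: image is a list of rows holding 0 (background) or 1 (ink).
    """
    out = [[0] * len(row) for row in image]
    for r, row in enumerate(image):
        start = None
        # The trailing 0 is a sentinel that closes a run ending at the border
        for c, value in enumerate(list(row) + [0]):
            if value and start is None:
                start = c
            elif not value and start is not None:
                if c - start >= min_len:
                    for k in range(start, c):
                        out[r][k] = 1
                start = None
    return out


page = [
    [0, 0, 1, 0, 0, 0, 0, 0],  # part of a vertical stem
    [1, 1, 1, 1, 1, 1, 1, 1],  # a staff line
    [0, 0, 1, 0, 0, 1, 1, 0],  # stem plus a short blob
]
lines_only = keep_long_horizontal_runs(page, min_len=5)
for row in lines_only:
    print(row)
```

Stems and noteheads are much narrower than a staff line, so they disappear while the long horizontal run survives.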





BAR LINE DETECTION


Bar line detection helps to cluster the staves into grand staff systems. It is important to remember that the Bootleg Score representation is designed for piano sheet music. Bounding boxes are predicted around the bar lines in the image, and these are used to group the staves into grand staff systems, each one containing a staff for the right hand and another for the left hand.
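One simple way to picture this grouping: in piano music a bar line runs through both staves of a grand staff, so two consecutive staves covered by the same bar line box belong together. The sketch below follows that idea under stated assumptions. The function name group_into_grand_staves() and the (top_row, bottom_row) layout for staves and bar lines are hypothetical, and the repository may cluster the staves differently.

```python
def group_into_grand_staves(staves, barlines):
    """Pair consecutive staves that share a bar line.

    Assumption: staves and barlines are lists of (top_row, bottom_row) tuples.
    """
    staves = sorted(staves)
    pairs = []
    i = 0
    while i < len(staves) - 1:
        upper, lower = staves[i], staves[i + 1]
        # A bar line joins the two staves if it covers the gap between them
        joined = any(top <= upper[1] and bottom >= lower[0]
                     for top, bottom in barlines)
        if joined:
            pairs.append((upper, lower))  # (right hand, left hand)
            i += 2
        else:
            i += 1
    return pairs


staves = [(100, 140), (180, 220), (300, 340), (380, 420)]
barlines = [(100, 220), (300, 420)]
pairs = group_into_grand_staves(staves, barlines)
print(pairs)
```

In each pair, the upper staff is taken to be the right hand and the lower staff the left hand.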





BOOTLEG SCORE GENERATION


Using the information from the noteheads, staff lines, and bar lines, the Bootleg Score is generated as an array of 0s and 1s. The vertical axis represents the positions on the right-hand and left-hand staves, while the note events are laid out along the horizontal axis. For each line of music, collapseSimultaneousEvents() groups the noteheads that are close to each other in the same column, assuming that these notes sound simultaneously. The function constructBootlegScore() then encodes this information into the Bootleg Score representation. The result is a matrix with dimensions 62×N, where 62 is the number of possible staff positions and N is the number of simultaneous events.
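The two steps described above can be sketched in a few lines. This is a simplified illustration, not the code of collapseSimultaneousEvents() or constructBootlegScore(): the function names collapse_events() and build_bootleg(), the max_gap parameter, and the assumption that every notehead already carries a staff position index from 0 to 61 are all hypothetical.

```python
def collapse_events(notes, max_gap):
    """Group noteheads whose columns are within max_gap pixels of each other.

    Assumption: notes is a list of (column_px, position), position in 0..61.
    """
    events = []
    last_col = None
    for col, pos in sorted(notes):
        # Start a new event when the horizontal gap is too large
        if last_col is None or col - last_col > max_gap:
            events.append(set())
        events[-1].add(pos)
        last_col = col
    return events


def build_bootleg(events, n_positions=62):
    """Encode the events as an n_positions x N matrix of 0s and 1s."""
    score = [[0] * len(events) for _ in range(n_positions)]
    for n, positions in enumerate(events):
        for pos in positions:
            if not 0 <= pos < n_positions:
                raise ValueError(f"position {pos} is out of range")
            score[pos][n] = 1
    return score


notes = [(100, 30), (102, 34), (101, 10), (160, 32), (220, 35), (223, 12)]
events = collapse_events(notes, max_gap=10)
bootleg = build_bootleg(events)
print(len(bootleg), len(bootleg[0]))
```

Each column of the matrix is one simultaneous event, and each 1 marks a filled notehead at that staff position.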


An example of how a Bootleg Score can look is shown in the next image:





Once the final representation is ready, you can try different music information retrieval tasks and take advantage of the possibilities that a low-resource representation like this one offers.


A sheet music PDF is included in the repository for testing. Once everything is correctly configured, the only thing you need to do is execute the Test_Bootleg.py file. Do not hesitate to take a close look at the code; it is a good opportunity to learn new methods for analyzing sheet music, and I found it a lot of fun.



CONCLUSIONS


In conclusion, TJ Tsai’s Bootleg Score representation is a significant step forward in Optical Music Recognition, a field where computational cost and long processing times have been a problem for many years. The articles mentioned in the previous sections show that the Bootleg Score can achieve good results in a variety of tasks, so I encourage you to give it a try.