One of the tasks of musicology researchers is to develop and validate hypotheses about music by studying historical documents along with all the other information available on a topic. Many documents have already been digitized, and being able to access them by computer makes researchers' work much easier. However, basic search engines operate at the level of exact string matching and often fail to capture the meaning of the content.
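To make the limitation of exact string matching concrete, here is a minimal sketch. It is not the method used in the study: the documents, the alias table, and the function names are all hypothetical, chosen only to show how a literal search misses a spelling variant that a slightly more tolerant lookup would find.

```python
import unicodedata

# Toy documents, invented for illustration only.
DOCS = [
    "Josquin des Prez worked in Milan and later in Rome.",
    "The composer Josquin Desprez is recorded in Ferrara.",
    "Tomás Luis de Victoria studied in Rome.",
]

# Hypothetical alias table mapping a normalized name to its known spellings.
ALIASES = {
    "josquin des prez": {"josquin des prez", "josquin desprez"},
}


def normalize(text):
    """Lowercase the text and strip diacritics (e.g. 'Tomás' -> 'tomas')."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def exact_search(query, docs):
    """Return indices of documents containing the query verbatim."""
    return [i for i, doc in enumerate(docs) if query in doc]


def alias_search(query, docs):
    """Return indices of documents containing any known variant of the query."""
    key = normalize(query)
    variants = ALIASES.get(key, {key})
    return [
        i for i, doc in enumerate(docs)
        if any(variant in normalize(doc) for variant in variants)
    ]


if __name__ == "__main__":
    print(exact_search("Josquin des Prez", DOCS))
    print(alias_search("Josquin des Prez", DOCS))
```

In this toy setting the exact search finds only the first document, while the alias-aware search also finds the second. Real systems go much further, for example by linking mentions to entries in a knowledge base, but the gap between the two functions is the gap the article describes.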
In a newly published article, music data science researchers led by Sergio Oramas tested a natural language processing approach that could find things in historical documents that human beings have overlooked, which would help scholars to discover new hypotheses and identify interesting patterns in the available data.
“As a musicologist, I want to explore the contents of large encyclopedias, such as The New Grove Dictionary or Wikipedia,” says Oramas, adding: “There is a lot of content to read and very little time in life, but computers can help us.”
Oramas and colleagues applied natural language processing to large collections of music-related texts in order to discover new facts that are implicit in them, and then to measure the potential of machine learning in musicological research. The study used a variety of sources, including Wikipedia, DBpedia, and MusicBrainz, and focused on flamenco, Renaissance music, and popular music.
Using natural language processing, a computational method for analyzing human speech and writing, the researchers were able to identify interesting patterns in the history of music. “We extracted the most influential flamenco and Renaissance artists directly from the data, and we discovered migratory trends of composers among European cities in the 15th and 16th centuries,” said Oramas.
The researchers also analyzed the text of reviews posted on Amazon and found interesting facts about the evolution of popular music, such as unusually positive language in reviews from 2008. Interestingly, musical genres traditionally associated with various communities, such as jazz and Latin music, showed remarkable improvements in positive public perception, while other genres (in other countries) did not show this result.
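As a rough illustration of how positivity in review language might be tracked over time, the sketch below scores each review with a tiny word list and averages the scores per year. This is an assumption-laden toy, not the study's method: the word lists, the reviews, and the function names are invented, and real sentiment analysis relies on far richer lexicons or trained models.

```python
from collections import defaultdict

# Hypothetical mini-lexicons; real sentiment lexicons are far larger.
POSITIVE = {"great", "love", "wonderful", "brilliant"}
NEGATIVE = {"boring", "bad", "dull", "awful"}

# Invented (year, review text) pairs, for illustration only.
REVIEWS = [
    (2007, "A dull and boring record"),
    (2007, "Great playing but bad sound"),
    (2008, "I love this wonderful album"),
    (2008, "Brilliant, great songs"),
]


def polarity(text):
    """Score a text in [-1, 1]: (positive - negative) / (positive + negative)."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    total = positive + negative
    return 0.0 if total == 0 else (positive - negative) / total


def mean_polarity_by_year(reviews):
    """Average the polarity of all reviews written in each year."""
    scores = defaultdict(list)
    for year, text in reviews:
        scores[year].append(polarity(text))
    return {year: sum(vals) / len(vals) for year, vals in sorted(scores.items())}


if __name__ == "__main__":
    for year, score in mean_polarity_by_year(REVIEWS).items():
        print(year, score)
```

Plotting such yearly averages per genre is one simple way a spike in positive language, like the one reported for 2008, could become visible.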
The study also found a strong correlation between the opinions users expressed in their reviews and the popularity of albums released in certain decades within particular genres, such as pop in the sixties or reggae in the early 1980s. The golden age of reggae is apparently due to the popularity of Bob Marley's albums, which seem to have contributed to this effect.
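A correlation of this kind can be quantified with, for example, the Pearson coefficient. The article does not say which measure the study used, so the following is only a sketch under that assumption; the function name and the numbers are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    # Invented values: mean review sentiment and a popularity score per period.
    sentiment = [0.1, 0.3, 0.5, 0.4]
    popularity = [10, 25, 40, 38]
    print(round(pearson(sentiment, popularity), 3))
```

A value near 1 would indicate that periods with more positive reviews also tend to have more popular albums; it would not, on its own, say which one drives the other.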
The work of Oramas and colleagues shows that analyzing music criticism written during particular periods could help musicologists learn more about the evolution of those genres and identify key moments in history. “Ultimately, our greatest discovery is the demonstration that natural language processing can help to discover new musicological hypotheses and so give a clearer view of what the data tell us,” concludes Oramas.
In the future, the researchers plan to extend this research to other types of content, such as audio, images, and data collected in Pandora's Music Genome Project, the most sophisticated taxonomy of musical information ever collected.