Results for ""
Have you ever wondered what your dog's bark means? AI can help you find out.
University of Michigan researchers are exploring the possibilities of AI, developing tools that can identify whether a dog's bark conveys playfulness or aggression. Conducted in collaboration with Mexico's National Institute of Astrophysics, Optics and Electronics (INAOE) in Puebla, the study finds that AI models originally trained on human speech can be used as a starting point to train new systems that target animal communication.
The same models can also glean other information from animal vocalizations, such as the animal's age, breed and sex. The results were presented at the Joint International Conference on Computational Linguistics, Language Resources and Evaluation.
Rada Mihalcea, the Janice M. Jenkins Collegiate Professor of Computer Science and Engineering and director of U-M's AI Laboratory, said the research uses speech processing models initially trained on human speech. "This opens a new window into how we can leverage what we have built so far in speech processing to start understanding the nuances of dog barks," she said.
"We know little about the animals sharing this world with us. Advances in AI can revolutionize our understanding of animal communication, and our findings suggest that we may not have to start from scratch," Mihalcea added.
The field still faces numerous obstacles, however. Chief among them is the lack of publicly available data needed to develop AI models that can analyze animal vocalizations: while there are many resources and opportunities for recording human speech, collecting such data from animals is far more difficult.
Artem Abzaliev, lead author and U-M doctoral student in computer science and engineering, remarked that animal vocalizations are logistically much harder to solicit and record. They must be passively recorded in the wild or, in the case of domestic pets, with the owners' permission.
Because of this dearth of usable data, techniques for analyzing dog vocalizations have proven challenging to develop, and those that do exist are limited by a lack of training material. The researchers overcame these challenges by repurposing a model originally designed to analyze human speech.
The researchers tapped into robust models that form the backbone of the various voice-enabled technologies we use today, including voice-to-text and language translation. These models are trained to distinguish nuances in human speech, such as tone, pitch and accent, and to convert this information into a format that a computer can use to identify the words being said, recognize the individual speaking, and more.
According to Abzaliev, these models can learn and encode complex human language and speech patterns. The researchers wanted to see whether they could leverage this ability to discern and interpret dog barks. They used a dataset of vocalizations recorded from 74 dogs of varying breeds, ages, and sexes in various contexts. Abzaliev then used the recordings to modify a machine-learning model, a computer algorithm that identifies patterns in large data sets. The team chose a speech representation model called Wav2Vec2, which was initially trained on human speech data.
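For readers curious what this approach looks like in practice, here is a minimal sketch of adapting a pretrained Wav2Vec2 checkpoint to a bark-classification task with the Hugging Face transformers library. It is not the authors' actual code: the checkpoint name, label set, and audio file are illustrative assumptions, and the study's training details are not reproduced here.

```python
# Minimal sketch (not the study's pipeline): classify a dog bark with a
# pretrained Wav2Vec2 model. Checkpoint, labels, and file name are
# illustrative assumptions.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

labels = ["playful", "aggressive"]  # hypothetical label set

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base", num_labels=len(labels)
)
# Note: the classification head above is freshly initialized; it would need
# fine-tuning on labeled bark recordings before its predictions mean anything.

waveform, sample_rate = torchaudio.load("bark.wav")  # placeholder recording
# Wav2Vec2 expects 16 kHz audio, so resample before feature extraction.
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = extractor(
    waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt"
)
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(dim=-1).item()])
```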
With this model, the researchers could generate representations of the acoustic data collected from the dogs and interpret these representations. They found that Wav2Vec2 succeeded at four classification tasks and outperformed other models trained specifically on dog bark data, achieving accuracy of up to 70%.
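To illustrate the representation step, the sketch below (again an assumption about tooling, not the study's pipeline) pulls frame-level embeddings from the same pretrained encoder and mean-pools them into one fixed-size vector per recording. Vectors like these could feed a lightweight classifier for tasks such as predicting a dog's breed, age, or sex.

```python
# Sketch: extract acoustic representations from a pretrained Wav2Vec2
# encoder. Checkpoint and file names are illustrative assumptions.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

waveform, sample_rate = torchaudio.load("bark.wav")  # placeholder recording
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = extractor(
    waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt"
)
with torch.no_grad():
    # last_hidden_state has shape (batch, frames, 768) for the base model.
    frames = encoder(**inputs).last_hidden_state

# Mean-pool over time to get one embedding for the whole recording.
embedding = frames.mean(dim=1)
print(embedding.shape)  # torch.Size([1, 768])
```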
Their results show that the sounds and patterns derived from human speech can serve as a foundation for analyzing and understanding the acoustic patterns of other sounds, such as animal vocalizations.