Tags:
audio, motif discovery, discovery, linkmedia, multimedia
Owner:
gabriel.sargent@irisa.fr
RADI.sh: Repeated Audio motif DIscovery within audio files.
This service finds repeating patterns within speech and audio streams without prior knowledge.
RADI.sh discovers and collects occurrences of repeating spoken/audio motifs within the input audio stream. It is language- and topic-independent: the approach is unsupervised and relies on neither prior acoustic or linguistic knowledge nor training material. It handles large audio streams in a reasonable amount of time.
This service proceeds as follows. First, the audio is converted into a sequence of feature vectors, i.e., either a sequence of MFCCs or a posteriorgram calculated from them. Second, the sequence of feature vectors is analyzed progressively using a sliding window to detect repeated motifs and to store a prototypical pattern representing each of them in a library. The analysis window consists of two portions: the first is the small pattern to be matched, called the "seed", and the second is its recent future. The seed is considered a potential fragment of a motif if it partly matches an element of the library, or if it is repeated within its recent future. In this case, the matching patterns are extended to search for the complete motif occurrences. If the similarity between the matching motifs is above a given threshold, either a new occurrence of a motif from the library is detected, or a new motif is detected and added to the library. Otherwise, the seed is discarded.
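The seed test described above can be illustrated with a toy Python sketch. It is not the Modis implementation: it swaps in a rigid frame-by-frame Euclidean comparison for simplicity (see [1] for the matching actually used), it ignores the library lookup, and the function names, window sizes and threshold are all hypothetical.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def segment_distance(features, i, j, length):
    """Mean frame-wise distance between two equal-length segments."""
    total = sum(frame_distance(features[i + k], features[j + k]) for k in range(length))
    return total / length

def find_seed_repetition(features, seed_start, seed_len, future_len, threshold):
    """Return the start of the best match of the seed in its recent future, or None."""
    search_start = seed_start + seed_len
    # The recent future is truncated when the stream ends early.
    search_end = min(search_start + future_len, len(features)) - seed_len
    best_pos, best_dist = None, threshold
    for pos in range(search_start, search_end + 1):
        d = segment_distance(features, seed_start, pos, seed_len)
        if d <= best_dist:
            best_pos, best_dist = pos, d
    return best_pos

def extend_match(features, a, b, seed_len, threshold):
    """Grow both matched segments to the right while their frames stay similar."""
    length = seed_len
    while (b + length < len(features) and a + length < b
           and frame_distance(features[a + length], features[b + length]) <= threshold):
        length += 1
    return length

# Made-up 1-D "features": the pattern 0, 1, 2, 3 occurs at frames 0 and 7.
features = [[0.0], [1.0], [2.0], [3.0], [9.0], [7.0], [8.0],
            [0.0], [1.0], [2.0], [3.0], [5.0]]
match = find_seed_repetition(features, seed_start=0, seed_len=2, future_len=8, threshold=0.1)
if match is not None:
    print(match, extend_match(features, 0, match, seed_len=2, threshold=0.1))
```

When no position in the recent future is close enough, `find_seed_repetition` returns `None`, which corresponds to discarding the seed.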
In the current version of this service, the recent future lasts 90 seconds or less according to the remaining duration of the stream. When a new occurrence of an existing motif is detected, its reference motif in the library is updated: it is replaced by the median occurrence according to a dynamic time warping (DTW)-based score. More information can be found in [1].
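As a rough illustration of the library update, the sketch below computes a plain DTW cost between occurrences and picks the one with the smallest total cost to all others. Reading "median occurrence" as this medoid is an assumption, as are the unnormalized DTW cost, the 1-D toy sequences and the function names; the exact score is defined in [1].

```python
def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two 1-D sequences, with |a - b| as local cost."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            # Allowed steps: insertion, deletion, or diagonal match.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def median_occurrence(occurrences):
    """Index of the occurrence with the smallest total DTW cost to all others."""
    totals = [sum(dtw_distance(o, other) for other in occurrences) for o in occurrences]
    return min(range(len(occurrences)), key=totals.__getitem__)

# Made-up occurrences of one motif; the second is a time-stretched copy of the first.
occurrences = [[1, 2, 3], [1, 2, 2, 3], [5, 6, 7]]
reference = occurrences[median_occurrence(occurrences)]
print(reference)
```

DTW lets occurrences of different lengths be compared, which matters because repetitions of a spoken motif are rarely uttered at exactly the same speed.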
[1] Muscariello A., Bimbot F. and Gravier G., "Unsupervised Motif Acquisition in Speech via Seeded Discovery and Template Matching", IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 7, pp. 2031-2044, 2012.
RADI.sh takes an audio file as input and outputs a text file and a JSON file containing the position of the motifs discovered. A JSON file produced by other multimedia services from A||go can be provided as input to be completed with this description.
The text file has the following format:
<start_time>\t<end_time>\t<motif_index>\n
<start_time>\t<end_time>\t<motif_index>\n
...
Each line describes a single motif occurrence. Occurrences of the same motif share the same index.
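The tab-separated format above can be read back with a few lines of Python. This is a minimal sketch: the function name and the sample values are made up, and it assumes that times are written as decimal numbers.

```python
from collections import defaultdict

def parse_motif_lines(text):
    """Group (start, end) occurrences by motif index from tab-separated lines."""
    motifs = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline
        start, end, index = line.split("\t")
        motifs[index].append((float(start), float(end)))
    return dict(motifs)

# Made-up sample: motif 0 occurs twice, motif 1 once.
sample = "1.50\t3.20\t0\n10.00\t11.75\t0\n20.00\t22.00\t1\n"
for index, occurrences in parse_motif_lines(sample).items():
    print(index, occurrences)
```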
The JSON file has the following structure:
{
  "general_info":{
    "src":"<input_file_name>",
    "audio":{
      "duration":"<time_in_hh:mm:ss_format>",
      "start":"<temporal_offset_in_seconds>",
      "format":"<bit_coding_format>",
      "sampling_rate":"<frequency> Hz",
      "nb_channels":"<n> channels",
      "bit_rate":"<bit_rate> kb/s"
    }
  },
  "radish":{
    "annotation_type":"repeated motives",
    "system":"radish",
    "parameters":"<input_parameters>",
    "modality":"audio",
    "time_unit":"seconds",
    "events":[
      { "start":<seg_start_time>, "end":<seg_end_time>, "type":"<motif_index>" },
      { "start":<seg_start_time>, "end":<seg_end_time>, "type":"<motif_index>" },
      ...
      { "start":<seg_start_time>, "end":<seg_end_time>, "type":"<motif_index>" }
    ]
  }
}
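The "radish" block of the JSON output can be consumed with the standard json module. The sketch below groups events by motif index; the function name and the sample document are made up, and only the keys shown in the structure above are assumed to exist.

```python
import json

def motif_occurrences(json_text):
    """Map each motif index to the list of (start, end) times of its occurrences."""
    data = json.loads(json_text)
    by_motif = {}
    for event in data["radish"]["events"]:
        by_motif.setdefault(event["type"], []).append((event["start"], event["end"]))
    return by_motif

# Made-up document holding only the keys this sketch reads.
sample = """
{
  "radish": {
    "time_unit": "seconds",
    "events": [
      {"start": 1.5, "end": 3.2, "type": "0"},
      {"start": 10.0, "end": 11.75, "type": "0"},
      {"start": 20.0, "end": 22.0, "type": "1"}
    ]
  }
}
"""
for index, occurrences in motif_occurrences(sample).items():
    print(index, len(occurrences), occurrences)
```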
RADI.sh is the online version of Modis: an audio MOtif DIScovery software. It incorporates Spro 5.0 for the extraction of MFCCs and Audioseg for the calculation of the posteriorgram. Spro was developed by Guillaume Gravier. Modis is a free speech and audio motif discovery software created and developed by (in alphabetical order) Frédéric Bimbot, Laurence Catanese, Guillaume Gravier, Armando Muscariello and Nathan Souviraà-Labastie. It is the property of IRISA, CNRS, INRIA and the University of Rennes.
Input:
Output:
10/08/2017: Version 1.0
The ID of this app is 87.
The following curl command creates a job and returns the job URL along with the average execution time. The files and dataset arguments are optional; remove them if they are not needed.
curl -H 'Authorization: Token token=<your_private_token>' -X POST \
  -F job[webapp_id]=87 \
  -F job[param]="" \
  -F job[queue]=standard \
  -F files[0]=@test.txt \
  -F files[1]=@test2.csv \
  -F job[file_url]=<my_file_url> \
  -F job[dataset]=<my_dataset_name> \
  https://allgo.inria.fr/api/v1/jobs
Then, check your job to get the URLs of its files:
curl -H 'Authorization: Token token=<your_private_token>' -X GET https://allgo.inria.fr/api/v1/jobs/<job_id>