WO2007132285A1 - A customizable user interface - Google Patents

A customizable user interface

Info

Publication number
WO2007132285A1
Authority
WO
WIPO (PCT)
Prior art keywords
segment
user interface
audio sample
segments
user
Application number
PCT/IB2006/001926
Other languages
French (fr)
Other versions
WO2007132285A8 (en)
Inventor
Antti Eronen
Kai Havukainen
Jukka Holm
Timo Kosonen
Original Assignee
Nokia Corporation
Application filed by Nokia Corporation filed Critical Nokia Corporation
Priority to PCT/IB2006/001926 priority Critical patent/WO2007132285A1/en
Publication of WO2007132285A1 publication Critical patent/WO2007132285A1/en
Publication of WO2007132285A8 publication Critical patent/WO2007132285A8/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M19/00 Current supply arrangements for telephone systems
    • H04M19/02 Current supply arrangements for telephone systems providing ringing current or supervisory tones, e.g. dialling tone or busy tone
    • H04M19/04 Current supply arrangements for telephone systems providing ringing current or supervisory tones, e.g. dialling tone or busy tone, the ringing-current being generated at the substations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448 User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions

Abstract

A method including: enabling selection of an audio sample for segmentation; automatically segmenting the selected audio sample into a plurality of segments; presenting at least one candidate segment; and enabling assignment of a presented candidate segment to a user interface event.

Description

TITLE
A customizable user interface.
FIELD OF THE INVENTION
Embodiments of the present invention relate to a customizable user interface. In particular, they relate to a method, device, user interface and computer program for providing for the customization of a user interface using audio.
BACKGROUND TO THE INVENTION
Audio effects have been integrated into electronic devices since the advent of the piezo-electric buzzer.
At the present time, many electronic devices are capable of playing music. This capability may be a dedicated functionality or one function in a multifunctional device.
The market for digital music and, in particular, digital music that is downloadable from the Internet is increasing. Users are becoming accustomed to customizing their electronic devices with their own collection of individual music samples (i.e. music tracks) rather than relying upon albums of music tracks distributed as a collection via CD-ROM.
It would be desirable to enable further customization of an electronic device using music.
BRIEF DESCRIPTION OF THE INVENTION
According to one embodiment of the invention there is provided a method comprising: enabling selection of an audio sample for segmentation; automatically segmenting the selected audio sample into a plurality of segments; presenting at least one candidate segment; and enabling assignment of a presented candidate segment to a user interface event.
According to another embodiment of the invention there is provided an electronic device comprising: a user input device for enabling user selection of an audio sample for segmentation and for enabling user assignment of a presented candidate segment to a user interface event; a processor for automatically segmenting the user selected audio sample into a plurality of segments; and a display for presenting at least one candidate segment for user assignment to a user interface event.
According to another embodiment of the invention there is provided a user interface for an electronic device comprising: a plurality of graphical items representing candidate audio segments obtained from a user selected audio sample; and a first option for assigning one of the plurality of candidate audio segments with a user interface event of the electronic device.
According to another embodiment of the invention there is provided a computer program comprising computer program instructions that enable: selection of an audio sample for segmentation; automatic segmentation of the selected audio sample into a plurality of segments; presentation of at least one candidate segment; and assignment of a presented candidate segment to a user interface event.
Thus, according to at least some embodiments of the invention, a user can assign his favorite parts of a song as user interface sounds. The user interface of the device is therefore user customizable. An audio sample is a sample or portion of an audio waveform. A waveform is sampled to convert it from analogue to digital samples. An audio sample comprises multiple digital samples. An example of an audio sample is a music sample, which may be a portion of a music track.
BRIEF DESCRIPTION OF THE DRAWINGS
For a better understanding of the present invention reference will now be made by way of example only to the accompanying drawings in which:
Fig. 1 schematically illustrates an electronic device;
Fig. 2 schematically illustrates a method for segmenting a music sample and associating a created music segment with a user interface event; and
Fig. 3 is a schematic illustration of a user interface.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
Fig. 1 schematically illustrates an electronic device 10 comprising: a processor 12, an audio output device 14, a user input device 16, a display 18 and a memory 20 storing a computer program 22 and a plurality of audio files 24, for example, music files that contain music tracks or other music samples.
The device 10 may also have an input port 26 via which the device 10 receives audio files for storage in the memory 20. The input port 26 may be, for example, a serial interface or a radio transceiver or similar. The input port 26 is optional when the memory 20 is a replaceable memory such as a memory card or similar.
The processor 12 is arranged to write to and read from the memory 20. It provides output command signals to the audio output device 14 and to the display 18. The processor receives input commands from the user input device 16 and data from the input port 26. Although a single processor 12 is illustrated, the functionality of the processor may be performed by one or more devices.
The memory 20 may be a monolithic memory or a collection of memory devices. The memory may be wholly or partly integrated within the device 10 or may be wholly or partly portable allowing it to be removed from the device 10.
The memory 20 stores computer program instructions 22 that control the operation of the electronic device 10 when loaded into the processor 12. The computer program instructions 22 provide the logic and routines that enable the electronic device to perform the methods illustrated in Figs. 2 and 3.
The computer program instructions may arrive at the electronic device 10 via an electromagnetic carrier signal or be copied from a physical entity 2 such as a computer program product, a memory device or a record medium such as a CD-ROM or DVD.
The audio output device 14 may be any device interface that is suitable for producing sound waves or for providing signals to another device for producing sound waves. For example, the audio output device 14 may be a loudspeaker, headphones, a jack or a wireless interface for headphones etc.
The user input device 16 is a mechanism that is actuable by a user to provide command signals to the processor 12. The device is typically actuated by touch but it may be actuated by speech or by other user action. A touch-actuated user input device may include one or more of, for example, keys, joysticks, dials, rollers, computer mice, trackballs, touch sensitive screens etc.
The display 18 is any suitable display. Popular displays at the present time are LCD displays and TFT displays. However, any suitable mechanism for presenting information in a visual form may be used. The electronic device 10 may be any electronic device that is arranged to process music. It may, for example, be a personal computer, a portable computer, a mobile cellular telephone, a personal digital assistant, a personal music player etc.
A user of the device 10 is able to use the device 10 to select an audio file 24 or a set of files 24 that are then automatically analyzed by the processor 12, under the control of the program 22, to find appropriate segments for association with a user interface event or user interface events. The association may be manual, semi-automatic or automatic. Alternatively, the user may give a general criterion according to which audio files for analysis are automatically suggested to the user. The user is then able to select an appropriate audio file.
This functionality allows the user to customize the device 10 according to personal tastes. The user is able to use music from their own music collection 24 to create sounds used with user interface events by the device 10.
A user interface event is any event that occurs at the device of which the user is aware. It may, for example, be a user initiated event that occurs as a consequence of a user's interaction with the device. It may, for example, be a system initiated event that occurs for some other reason. A user interface event may be, for example, an alarm activation, an incoming message, ring tone activation, low battery level alarm, system startup, an error indicator, a download complete indicator, the device switching on, the device switching off, accessing or navigating certain menus, actuation of the user input device 16 etc.
Fig. 2 schematically illustrates a method for segmenting a music sample and associating a created music segment with a user interface event. The method is performed by the processor 12 under the control of the computer program 22. User input is provided via the user input device 16.
At block 30, a music sample is selected for analysis.
The selection may occur as a result of the user starting an application specifically for assigning music segments to particular UI events. The application is provided by the computer program 22. Such an application may be started via a profile menu, which is a menu accessed by actuating a shortcut key, e.g. by tapping an on/off key of the device 10. Within the application, the user is able to browse the collection of music samples 24 and select one for analysis.
The selection may also occur as a result of the device noticing that, e.g., there is no audio assigned to a particular UI event. This might be the case, for example, when new software, applications etc. are downloaded to the device and there are new kinds of UI events.
The process then moves to the analysis block 32, where the selected music sample is partitioned into meaningful music segments. This process may occur automatically. Automatic in this sense means without user intervention.
A segment may be, for example, a unit of sound between abrupt changes in a quality of the music such as loudness and/or pitch and/or timbre. Alternatively, a segment may be a unit of sound between any arbitrary number of beats of the music. Alternatively, a segment may be a unit of sound that is often repeated in the music sample such as at least a portion of the chorus or verse of a musical sample, a repeated melody, a repeated chord, a repetitively sung phrase etc. Repetition in this sense means that it occurs at least twice but most likely more often. Rap music in particular may be segmented into short clips containing a word or two sung by the singer. Alternatively, a segment may be any distinctive section or non-repeating section of the music file, such as the intro, bridge, outro, solo, or part of any distinctive section (such as the first two measures of the verse).
There has been previous published research into how a music sample may be segmented. The analysis block 32 may, for example, use any one or more of the following methods for music segmentation.
"Event Synchronous Music Analysis/Synthesis" by Tristan Jehan, Proc 7th lnt Conf Digital Audio Effects, October 5-8, 2004, describes a method for segmenting music into short segments either based on using an onset detector or a beat tracker.
The onset detection algorithm applies a Fourier transform to the whole music sample to obtain a spectrogram. The power spectrum is then calculated. The resulting bins are grouped and converted into a plurality of bands. Further masking processing may then be applied.
The resulting spectrogram is converted to an event detection function by, for example, first calculating the first-order difference function for each spectral band, and then by summing these envelopes across the channels. The resulting signal contains peaks which correspond to onset transitions. The onset transients are found by identifying local maxima.
A loudness function may be obtained from the resulting spectrogram as well. If it is assumed that an onset is associated with an increase in loudness, then the local minimum in loudness that directly precedes the local maximum of the event detection function is found. Then a zero-crossing (in a defined direction) of the music waveform that is closest to the found local minimum in loudness is determined and is set as the start of a music segment.
According to this algorithm, the start of a music segment is therefore where there is an abrupt change in loudness and pitch. Of course, it would be possible to modify the algorithm so that the start of a music segment is where there is an abrupt change in loudness or pitch.
When an onset detector is used, the segment might be, for example, a chord or an orchestra hit.
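For concreteness, the following is a minimal numpy sketch of the onset-detection pipeline just described (power spectrogram, banding, first-order difference, peak picking, loudness dip, zero-crossing); the frame size, band count and peak threshold k are illustrative assumptions rather than values from Jehan's paper, and x is assumed to be a mono numpy array.

```python
import numpy as np

def onset_starts(x, sr, frame=1024, hop=512, n_bands=8, k=1.5):
    """Candidate segment starts: spectral-flux peaks, each traced back to the
    preceding loudness dip and then to the nearest rising zero-crossing."""
    n = (len(x) - frame) // hop
    win = np.hanning(frame)
    spec = np.array([np.abs(np.fft.rfft(win * x[i * hop:i * hop + frame])) ** 2
                     for i in range(n)])                        # power spectrogram
    band_e = np.stack([b.sum(axis=1)                            # group bins into bands
                       for b in np.array_split(spec, n_bands, axis=1)], axis=1)
    flux = np.maximum(np.diff(band_e, axis=0), 0).sum(axis=1)   # first-order difference, summed
    loud = band_e.sum(axis=1)                                   # crude loudness envelope
    starts = []
    for t in range(1, len(flux) - 1):
        if flux[t] > flux[t - 1] and flux[t] >= flux[t + 1] and flux[t] > k * flux.mean():
            m = t + 1                                           # frame of the onset peak
            while m > 0 and loud[m - 1] < loud[m]:
                m -= 1                                          # walk back to the loudness minimum
            s = m * hop
            while s + 1 < len(x) and not (x[s] <= 0.0 < x[s + 1]):
                s += 1                                          # rising zero-crossing of the waveform
            starts.append(s)
    return starts
```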
The rhythm or meter of the music sample can be analyzed e.g. with the method presented by Klapuri, Anssi P., et al.: Analysis of the Meter of Acoustic Musical Signals, IEEE Transactions on Audio, Speech and Language Processing, Vol. 14, No. 1, pp. 342-355, January 2006. This method detects in a music sample beats that define a tempo. This method also detects the measure (or bar), which is a level of rhythm above the beat. For example, in music with a 4/4 time signature each measure consists of 4 beats. Music with a 3/4 time signature has 3 beats per measure, and so on. The method also detects the "tatum", or temporal atom, which is the shortest durational value present in the music. The tatum can correspond, for example, to the duration of the 1/8th note or the 1/16th note. In the method, the degree of musical accent as a function of time is measured at four different frequency ranges. This is followed by a bank of comb filter resonators which extract features for estimating the periods and phases of the measure, beat, and tatum pulses. A probabilistic model representing primitive musicological knowledge is used to perform joint estimation of the measure, beat, and tatum pulses based on these features. The probabilistic model takes into account the temporal dependencies between successive estimates and the prior likelihoods of different periods, and models the most likely relations of the periods (e.g. the frequent relation that a measure pulse may consist of 4 beat pulses).
The period of a pulse means the time duration between successive beats, and the phase means the time when the beat occurs with respect to the beginning of the piece. Based on the periods and phases of the measure, beat, and tatum pulses, their times can be identified. Based on these times a segment consisting of a measure of the music signal can be segmented. When a meter analyzer is used to segment an audio file, the obtained segments have a duration of one beat or a multiple of beats. For example, four beats form a measure in music with a 4/4 time signature, if one beat corresponds to a quarter note. The segments may also have a duration of one tatum or multiple tatums. For example, the duration of a segment may be one measure, one beat, and two tatums. Thus, the measure, beat, and tatum times form a hierarchical structure of temporal points, between which the signal can be segmented. Usually, each measure consists of an integer number of beats, and each beat consists of an integer number of tatums.
"A Chorus-Section Detecting Method for musical audio signals", ICASSP 2003 Procs, pp. V-437-440, April 2003 by Masataka Goto describes a method for detecting a chorus section in a music sample. The chorus is typically a section of the music sample that is repeated throughout the music sample, perhaps with different arrangement of accompaniments or melody lines. The paper describes how to create a measure that can be used to judge the similarity of different sections of the music sample, and a criterion for judging which sections are repeated.
This kind of analysis could be used to segment the chorus section (or the first few measures of it). The analysis method also returns repetitive sections other than the chorus, e.g. the verse. The segment can also be a portion of the music that is not found to be repetitive by this analysis. A non-repetitive section might be e.g. the intro or outro, or a solo section or some other non-repeating section within the music file.
The segments returned by Masataka Goto's method are not necessarily an integer number of measures or beats in duration. Thus, the music rhythm information consisting of the measures, beats, and/or tatums can be used to take a portion of the repetitive section that is an integer number of measures or beats in duration. For example, if the sound sample is going to be looped, it is desirable to make its duration an integer number of measures or beats. This way the sound sample sounds nice when it is looped.
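As a sketch of how the rhythm information can be used, the function below assumes beat times (in seconds) have already been produced by a meter analyzer such as Klapuri's, and cuts a clip spanning an integer number of beats so that it loops cleanly; the names are illustrative.

```python
def beat_aligned_clip(x, sr, beat_times, start_beat, n_beats):
    """Cut a clip that is exactly n_beats beats long, so it loops cleanly.
    beat_times: ascending beat instants in seconds from a meter analyzer."""
    t0 = beat_times[start_beat]
    t1 = beat_times[start_beat + n_beats]
    return x[int(t0 * sr):int(t1 * sr)]

# e.g. one 4/4 measure starting at the eighth detected beat:
# clip = beat_aligned_clip(x, sr, beats, start_beat=8, n_beats=4)
```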
"Automated Extraction of Music Snippets" by Lie Lu et al. discloses the extraction of a music segment (snippet) from a music sample. The music segment is usually a part of a repeated melody, main theme or chorus. The most salient segment of the music is first detected based on its occurrence frequency and energy information. Meanwhile, the boundaries of musical phrases are also detected based on the estimated phrase length and phrase boundary confidence of each frame. These boundaries are used to ensure that an extracted snippet does not break musical phrases. Finally, the musical phrases including the most salient segment are extracted as music segments.
In addition to doing the actual signal analysis on a mobile terminal, information to help or perform the segmentation may also be included with a music file as metadata (e.g. MPEG-7 is used for media content description). The metadata may, for example, give the lyrics of a song as text. This text could be processed to identify repetitive lyrics within the song and the segments of the music sample where they occur. The metadata might also contain e.g. the rhythm information or segment boundaries of a music file.
The invention is not limited to music content only. The sound file that is segmented can also be a movie sound track, or a sound sample recorded by the user e.g. with the built-in microphone of the device. In movie sound tracks and recordings of natural sounds from the environment there is often no repetition, so looking for repetitive parts cannot always be used as a hint for the analysis. Instead, an onset detector can be used to find onsets from the sound sample, and then distinctive samples following a change in pitch, loudness, or timbre can be picked and presented as potential user interface sounds. Especially if the sound sample is not music, a sound segmentation method using timbre or spectrum information may be suitable. One such method is presented in Wu et al., "Multiple Change-Point Audio Segmentation and Classification Using an MDL-Based Gaussian Model", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 14, No. 2, March 2006, pp. 647-657. The method segments the audio file into segments consisting of speech, music, speech with music background, speech with noise background, or noise. These segments might be returned by the analysis. The method can also be used to return a list of audio change points. The portion of the audio sample following a change point is a good candidate for an audio segment.
The process then moves to block 34 where potential candidate segments are identified and presented to the user. The candidate segments may be all the segments of the music sample or a sub-set of the segments of the music sample.
The sub-set may be obtained by applying a filter to the segments of the music sample to identify appropriate candidate segments. The segments may be automatically analyzed to identify sound clips with certain characteristics (such as a metallic or wooden tone). For example, the system might have a model for the desired spectrum of the sound. The model might be, for example, the mean and covariance of the mel-frequency cepstral coefficients (MFCCs) of a desired template sound. The segment could then be selected as the one having the MFCC vectors closest to the model of the desired template sound. The distance could be calculated e.g. as the average Mahalanobis distance of the segment feature vectors to the desired sound template parameterized by the mean and covariance.
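A minimal sketch of this MFCC template matching, assuming the librosa library is available for MFCC extraction; the small ridge added to the covariance before inversion is an assumption to keep the sketch numerically stable.

```python
import numpy as np
import librosa  # assumed available for MFCC extraction

def template_distance(segment, sr, mu, inv_cov, n_mfcc=13):
    """Average Mahalanobis distance of a segment's MFCC frames to a template
    sound parameterized by its MFCC mean mu and (inverse) covariance."""
    mfcc = librosa.feature.mfcc(y=segment, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    d = mfcc - mu[:, None]
    return np.mean(np.sqrt(np.einsum('if,ij,jf->f', d, inv_cov, d)))

def best_match(segments, sr, template, n_mfcc=13):
    """Pick the segment whose MFCC vectors are closest to the template sound."""
    t = librosa.feature.mfcc(y=template, sr=sr, n_mfcc=n_mfcc)
    mu = t.mean(axis=1)
    inv_cov = np.linalg.inv(np.cov(t) + 1e-6 * np.eye(n_mfcc))    # ridge for stability
    return min(segments, key=lambda s: template_distance(s, sr, mu, inv_cov))
```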
The most appropriate segment may also be identified, e.g., as the most often repeating noise sample. If it is desirable to return only a single candidate sound sample, the postprocessing might also involve finding the segment that is most distinct from the other sound segments, e.g., with respect to its sound timbre or spectrum. This could be done, for example, by calculating the mean and covariance of the mel-frequency cepstral coefficients of the audio sample. The mean and covariance of the MFCCs would also be calculated for each of the candidate sound segments separately. The mean and covariance could then be vectorized such that the whole audio sample is represented with a single feature vector consisting of the mean values and the vectorized values of the covariance matrix. A similar feature vector would represent each candidate sound segment. A distance measure, such as the Euclidean distance, could then be calculated between the feature vector modelling the whole audio sample and the feature vectors of all candidate segments. The audio segment with the largest distance to the feature vector of the whole sample could then be selected as the most distinctive audio segment.
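The most-distinctive-segment selection above might be sketched as follows, again assuming librosa for MFCCs; each sound is reduced to its MFCC means concatenated with the vectorized covariance, exactly as the text describes.

```python
import numpy as np
import librosa  # assumed available for MFCC extraction

def feature_vector(y, sr, n_mfcc=13):
    """Mean MFCCs concatenated with the vectorized MFCC covariance matrix."""
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([m.mean(axis=1), np.cov(m).ravel()])

def most_distinctive(segments, whole_sample, sr):
    """Segment whose feature vector lies furthest (Euclidean distance)
    from the feature vector of the whole audio sample."""
    ref = feature_vector(whole_sample, sr)
    return max(segments, key=lambda s: np.linalg.norm(feature_vector(s, sr) - ref))
```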
If the user interface event to which a segment is to be assigned has already been selected, then the characteristics may be ones that are associated with that user interface event. For example, music segments with increasing pitch may be appropriate if the user interface event is advancing while navigating in a menu, whereas music segments with decreasing pitch may be appropriate for going back while navigating in a menu. The pitch of the audio segment can be estimated, e.g., with the method presented in A. de Cheveigne and H. Kawahara, "YIN, a fundamental frequency estimator for speech and music", J. Acoust. Soc. Am., vol. 111, pp. 1917-1930, April 2002. If there are multiple fundamental frequencies present in the audio segment then the method, e.g., in Matti P. Ryynanen and Anssi Klapuri: "Polyphonic Music Transcription Using Note Event Modeling", Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct. 16-19, 2005, New Paltz, New York may be more suitable. As another example, short segments may be appropriate if the user interface event is selecting a key of the user input device 16, and longer segments, such as the first 4 bars of the verse section, may be appropriate if the user interface event is a background sound while the device is idle. As yet another example, if the event is a ring tone then, e.g., the chorus section may be appropriate. Furthermore, if it is desired that the ring tone is approximately 20 seconds long, a suitable number of bars may be taken from the beginning of the chorus section to make the length of the section close to 20 seconds.
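As an illustration of the pitch-based filtering, the sketch below uses librosa's YIN implementation (the de Cheveigne-Kawahara method cited above) and fits a line to the f0 track; the frequency range and the use of the slope sign as the rising/falling criterion are assumptions.

```python
import numpy as np
import librosa  # librosa.yin implements the YIN fundamental frequency estimator

def pitch_trend(segment, sr):
    """Slope of a line fit to the f0 track: > 0 rising pitch, < 0 falling."""
    f0 = librosa.yin(segment, fmin=80, fmax=1000, sr=sr)
    return np.polyfit(np.arange(len(f0)), f0, 1)[0]

# e.g. candidates for 'menu forward' vs. 'menu back' events:
# forward = [s for s in segments if pitch_trend(s, sr) > 0]
# back    = [s for s in segments if pitch_trend(s, sr) < 0]
```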
The process then moves to block 36 where a candidate segment is associated with a user interface event such that, when the user interface event subsequently occurs, the associated music segment is played.
Fig. 3 is a schematic illustration of a user interface 60. The figure presents the content of the display 18 at this stage. The content of the display is an aspect of a user interface of the device, other aspects including the control options that are available to a user.
Graphic items 40A, 40B, 40C ... are presented as a list 42 to the left of the display 18. The graphic items represent candidate segments. One or more graphic items 50 are presented to the right of the display 18. The graphic item 50 represents a user interface event. A cursor 60 may be moved in the display 18 using the user input device 16. A particular music segment may be presented aurally, i.e. played, by placing the cursor over the graphic item 40 that represents the music segment and double-selecting the user input device 16.
A particular music segment may be associated with a user interface event by placing the cursor over the graphic item 40 that represents the music segment, selecting the user input device 16, dragging the selected graphic item 40 to the graphic item 50 that represents the desired user interface event, and deselecting the user input device, which drops the selected graphic item 40 onto the graphic item 50.
The segments represented by the graphical items 40 may be played one after the other until the user selects one for the desired user interface event. The user interface events may be events initiated by a user that cause the state of the device 10 to change. For example, a user interface event may be which level of a menu the device is currently at. It may be possible to associate a different music segment with each menu level in this example. As another example, a user interface event may be the actuation of the user input device 16 e.g. the 'click' sound of a key press may be replaced with a music segment. As another example, a music segment may be used to add a sound effect when the user moves from one menu level to another. As another example, a music segment may be used to add a sound effect when the user starts an application. As another example, a music segment may be used when the device is switched on. As another example, a music segment may be used as an alert when an incoming message is received.
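A minimal sketch of the resulting assignment, with hypothetical names (UiSoundMap, audio_out): a registry maps user interface events to segments and plays the associated segment when an event fires, mirroring the drag-and-drop of Fig. 3.

```python
class UiSoundMap:
    """Hypothetical registry mapping user interface events to audio segments."""

    def __init__(self, play_fn):
        self._segments = {}   # event name -> audio segment (e.g. a numpy array)
        self._play = play_fn  # device-specific playback callback

    def assign(self, event, segment):
        self._segments[event] = segment

    def on_event(self, event):
        # not every user interface event needs a segment assigned to it
        if event in self._segments:
            self._play(self._segments[event])

# sounds = UiSoundMap(play_fn=audio_out.play)   # audio_out is hypothetical
# sounds.assign('incoming_message', chorus_clip)
# sounds.on_event('incoming_message')           # plays the assigned clip
```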
It should of course be appreciated that different segments of the same or different music samples may be associated with different user interface events. Furthermore, not all user interface events need to have a music segment associated with them.
The assignment of a music segment to a user interface event may also be accomplished (semi-) automatically. For example, the candidate music segments may be automatically analyzed to find segments with certain characteristics (such as metallic or wooden tone) that are known to be suitable for certain user interface events and are then automatically associated with those user interface events or are associated with those user interface events at the option of the user. A chorus section, or another representative section of the song or part of it might automatically be associated as the ring tone.
The process then moves to block 38 where the associated music segment may be modified. Although this block is positioned in the figure after block 36, it may occur before block 36 in some embodiments, i.e. the modification of the music segment may occur before or after it is associated with a user interface event.
This block may enable a user to add re-mixing effects such as enabling/disabling fade-in and out of the music segment and/or the looping of the music segment. The user may also be able to concatenate several music segments into a larger segment.
This block may, in some embodiments, automatically apply an effect, such as pitch shift or low pass filter to the music segment to get a sound more closely matched to a desired characteristic of the user interface event to which that music segment is associated.
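The re-mixing operations described for block 38 reduce to simple array manipulations; a numpy sketch, with an illustrative 30 ms fade length:

```python
import numpy as np

def fade(seg, sr, ms=30):
    """Linear fade-in and fade-out, to avoid clicks at the segment edges."""
    n = min(int(sr * ms / 1000), len(seg) // 2)
    out = seg.astype(float).copy()
    if n == 0:
        return out
    ramp = np.linspace(0.0, 1.0, n)
    out[:n] *= ramp
    out[-n:] *= ramp[::-1]
    return out

def loop(seg, times):
    """Repeat a (beat-aligned) segment end to end."""
    return np.tile(seg, times)

def concatenate(segments):
    """Join several music segments into one larger segment."""
    return np.concatenate(segments)
```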
In the above described embodiment, the candidate music segments may be presented at block 34 by playing the whole music sample. The user then selects a segment for association by making a selection while the music is playing. The device identifies the music segment that corresponds in time with the user selection and provides that music segment for association with a user interface event.
In the above described embodiments, the application for segmenting a music sample and enabling the association of a music segment with a user interface event was a dedicated stand-alone application. It may, however, be an application that is integrated within another application or called from another application. Such another application may be a media player application. According to this embodiment, while a user listens to a music sample that is being played, an option is available for the user to select the playing music sample as in block 30 of Fig. 2 and start the method illustrated in Fig. 2.
In addition, the device may be arranged to identify the portion of the music sample at which the user selected the option and then present as a candidate at least the segment that coincides with that portion of the music sample. In a similar manner, when the user is watching a video or digital television broadcast, the user may select the corresponding portion of the sound track to be used for segmentation. For example, the system may take the portion of the sound track 30 seconds before and 30 seconds after the point where the user initiated the analysis and find the candidate segments from this portion of the sound track.
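A sketch of the windowing arithmetic, clamped to the track boundaries; t_select is the instant (in seconds) at which the user triggered the analysis, and the 30-second margins are the values from the example.

```python
def analysis_window(track, sr, t_select, before=30.0, after=30.0):
    """Portion of the sound track around the user-selected instant."""
    lo = max(0, int((t_select - before) * sr))
    hi = min(len(track), int((t_select + after) * sr))
    return track[lo:hi]
```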
Although embodiments of the present invention have been described in the preceding paragraphs with reference to various examples, it should be appreciated that modifications to the examples given can be made without departing from the scope of the invention as claimed.
Whilst endeavoring in the foregoing specification to draw attention to those features of the invention believed to be of particular importance it should be understood that the Applicant claims protection in respect of any patentable feature or combination of features hereinbefore referred to and/or shown in the drawings whether or not particular emphasis has been placed thereon.

Claims

1. A method comprising: enabling selection of an audio sample for segmentation; automatically segmenting the selected audio sample into a plurality of segments; presenting at least one candidate segment; and enabling assignment of a presented candidate segment to a user interface event.
2. A method as claimed in claim 1, wherein, when the user interface event subsequently occurs, the associated segment is played.
3. A method as claimed in claim 1 or 2, wherein at least one segment is a unit of sound between abrupt changes in a quality of the audio sample.
4. A method as claimed in claim 3, wherein the quality of the audio sample is loudness and/or pitch and/or timbre.
5. A method as claimed in any preceding claim, wherein at least one segment is a unit of sound between beats of music.
6. A method as claimed in any preceding claim, wherein at least one segment is a unit of sound that is repeated.
7. A method as claimed in any preceding claim, wherein at least one segment is a section of a few words of Rap music.
8. A method as claimed in any preceding claim, further comprising modifying a segment.
9. A method as claimed in any preceding claim, further comprising applying re-mixing effects to one or more segments.
10. A method as claimed in any preceding claim, further comprising applying an effect to a segment.
11. A method as claimed in claim 10, wherein the applied effect modifies a segment so that it more closely matches a predefined characteristic of a user interface event.
12. A method as claimed in any preceding claim, further comprising pre-filtering a plurality of segments to identify appropriate candidate segments.
13. A method as claimed in any preceding claim, further comprising automatically analyzing the plurality of segments to identify segments with characteristics that match a predefined characteristic of a user interface event to which one of the segments is to be assigned.
14. A method as claimed in any preceding claim, further comprising presenting a plurality of candidate segments as a list wherein the list of candidate segments is a sub-set of the segments of the audio sample.
15. A method as claimed in any preceding claim, wherein selection of an audio sample involves browsing a user's collection of music samples.
16. A method as claimed in any preceding claim, wherein selection of an audio sample involves making a selection while an audio sample is playing, the method further comprising identifying the segment that corresponds in time with the user selection and providing that segment for association with a user interface event.
17. A method as claimed in any preceding claim, wherein a segment is associated with a user interface event by dragging and dropping a graphical item representing the segment onto a graphical item representing the user interface event.
18. A method as claimed in any preceding claim, wherein segmentation of an audio sample involves onset detection.
19. A method as claimed in claim 18, wherein onset detection comprises: converting the audio sample from the time domain to the frequency domain to produce a spectrogram; converting the spectrogram to an event detection function in the frequency domain; and identifying local maxima in the event detection function.
20. A method as claimed in any preceding claim, wherein segmentation of an audio sample involves the analysis of metadata relating to the audio sample.
21. An electronic device comprising: a user input device for enabling user selection of an audio sample for segmentation and for enabling user assignment of a presented candidate segment to a user interface event; a processor for automatically segmenting the user selected audio sample into a plurality of segments; and a display for presenting at least one candidate segment for user assignment to a user interface event.
22. A user interface for an electronic device comprising: a plurality of graphical items representing candidate audio segments obtained from a user selected audio sample; and a first option for assigning one of the plurality of candidate audio segments with a user interface event of the electronic device.
23. A user interface as claimed in claim 22, further comprising a second option for selecting the user interface event.
24. A user interface as claimed in claim 22 or 23, wherein the first option is available while an audio sample is playing.
25. A user interface as claimed in claim 24, wherein selection of the first option while the audio sample is playing provides a candidate segment that corresponds to the portion of the audio sample at which the first option was selected.
26. A user interface as claimed in any one of claims 22 to 25, further comprising an option for modifying an audio segment.
27. A user interface as claimed in any one of claims 22 to 26, wherein a plurality of user interface events are represented by respective graphical items and the first option is an option to drag a graphical item representing an audio segment onto a graphical item representing a user interface event.
28. A computer program which when loaded into a processor provides the methods of any one of claims 1 to 20 or the user interface of any one of claims 22 to 27.
29. A computer program comprising computer program instructions that enable: selection of an audio sample for segmentation; automatic segmentation of the selected audio sample into a plurality of segments; presentation of at least one candidate segment; and assignment of a presented candidate segment to a user interface event.
30. A computer program as claimed in claim 29, which provides a dedicated stand-alone application.
31. A computer program as claimed in claim 29, which is integrated within a music player application such that, while an audio sample plays, an option is available for the user to access the computer program, the computer program being arranged to identify the portion of the audio sample at which the user selected the option and to enable the presentation, as a candidate, of at least the segment that coincides with the identified portion of the audio sample.
32. A computer program as claimed in claim 29, 30 or 31, arranged to enable the modification of an audio segment.
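Claim 19 recites onset-based segmentation as a spectrogram, an event detection function and local-maxima picking. A minimal sketch of one such pipeline, using spectral flux as the detection function; the claim does not fix the detection function, frame sizes or thresholding, so those are assumptions here:

```python
# Illustrative sketch only: onset detection via spectrogram -> detection
# function -> local maxima, per the outline in claim 19.
import numpy as np
from scipy.signal import stft, argrelmax

SAMPLE_RATE = 44100  # assumed sampling rate

def onset_times(audio: np.ndarray) -> np.ndarray:
    """Return estimated onset times (seconds) for a mono signal."""
    # Time domain -> frequency domain: magnitude spectrogram.
    _, t, Z = stft(audio, fs=SAMPLE_RATE, nperseg=1024, noverlap=512)
    mag = np.abs(Z)
    # Detection function: positive spectral flux summed over frequency.
    flux = np.maximum(mag[:, 1:] - mag[:, :-1], 0.0).sum(axis=0)
    # Local maxima above a simple adaptive threshold are taken as onsets.
    peaks = argrelmax(flux, order=3)[0]
    peaks = peaks[flux[peaks] > flux.mean() + flux.std()]
    return t[1:][peaks]
```

Consecutive onset times then delimit the candidate segments.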

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
PCT/IB2006/001926 (WO2007132285A1) | 2006-05-12 | 2006-05-12 | A customizable user interface

Publications (2)

Publication Number | Publication Date
WO2007132285A1 | 2007-11-22
WO2007132285A8 | 2008-03-06

Family ID: 38693590

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
PCT/IB2006/001926 (WO2007132285A1) | A customizable user interface | 2006-05-12 | 2006-05-12

Country Status (1)

Country | Link
WO | WO2007132285A1 (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
EP0851649A2 (en) * | 1996-12-30 | 1998-07-01 | Nokia Mobile Phones Ltd. | Programming of a telephone's ringing tone
US20060079217A1 * | 2002-07-23 | 2006-04-13 | Harris Scott C | Compressed audio information
US20050114800A1 * | 2003-11-21 | 2005-05-26 | Sumita Rao | System and method for arranging and playing a media presentation
KR20060017043A * | 2004-08-19 | 2006-02-23 | LG Electronics Inc. | Bell service method using mp3 music of mobile phone
JP2006091680A * | 2004-09-27 | 2006-04-06 | Casio Hitachi Mobile Communications Co Ltd | Mobile communication terminal and program

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE WPI Week 200627, Derwent World Patents Index; Class P86, AN 2006-258602, XP003008417 *
DATABASE WPI Week 200676, Derwent World Patents Index; Class W01, AN 2006-739752, XP003008418 *
LIU C.-C. ET AL.: "Automatic summarization of MP3 music objects", 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, vol. 5, pages 921 - 924, XP010719080 *

Legal Events

Code | Description
121 | EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 06779856; Country of ref document: EP; Kind code of ref document: A1)
NENP | Non-entry into the national phase (Ref country code: DE)
122 | EP: PCT application non-entry in European phase (Ref document number: 06779856; Country of ref document: EP; Kind code of ref document: A1)