CN104157286A - Idiomatic phrase acquisition method and device - Google Patents


Publication number
CN104157286A
Authority
CN
China
Prior art keywords
voice
byte
threshold value
voice byte
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410374537.4A
Other languages
Chinese (zh)
Other versions
CN104157286B (en)
Inventor
卢存洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Microphone Holdings Co Ltd
Original Assignee
Shenzhen Jinli Communication Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jinli Communication Equipment Co Ltd filed Critical Shenzhen Jinli Communication Equipment Co Ltd
Priority to CN201410374537.4A priority Critical patent/CN104157286B/en
Publication of CN104157286A publication Critical patent/CN104157286A/en
Application granted granted Critical
Publication of CN104157286B publication Critical patent/CN104157286B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The embodiment of the invention discloses an idiomatic phrase acquisition method, which comprises the following steps: if a voice signal from a user is detected, speech data corresponding to the voice signal is acquired; according to a preset voice byte threshold, a number of target voice bytes corresponding to the voice byte threshold are selected from the speech data; and the target voice bytes are parsed to obtain an analysis result that includes the user's idiomatic phrases. The embodiment of the invention also discloses an idiomatic phrase acquisition device. By adopting the method and device, the idiomatic phrases of a given user can be obtained in a targeted way.

Description

Idiomatic phrase acquisition method and device
Technical field
The present invention relates to the field of media technology, and in particular to an idiomatic phrase acquisition method and device.
Background technology
In daily life, people inevitably communicate with others. In conversation, however, everyone has speech habits of their own, and may therefore carry certain idiomatic phrases into an exchange. Some of these habits, such as uncivil expressions, can spoil the atmosphere of a conversation: on a relatively formal occasion, unconsciously blurting out a few uncultivated pet phrases may disrupt the harmony of the exchange, reflect badly on the speaker, and even cause a real loss. It is therefore important for speakers to keep abreast of their own speech habits in good time. In the prior art, however, there is no way to analyze a user's speech habits, nor can the speech habits of a given user be obtained through current means of communication.
Summary of the invention
The embodiments of the present invention provide an idiomatic phrase acquisition method and device, which can obtain the idiomatic phrases of a given user in a targeted way.
An embodiment of the present invention provides an idiomatic phrase acquisition method, comprising:
if a voice signal sent by a user is detected, acquiring speech data corresponding to the voice signal;
according to a preset voice byte threshold, screening out from the speech data target voice bytes of the number corresponding to the voice byte threshold;
parsing the target voice bytes, and obtaining an analysis result that includes the user's idiomatic phrases.
Correspondingly, an embodiment of the present invention also provides an idiomatic phrase acquisition device, comprising:
a first acquiring unit, configured to acquire speech data corresponding to a voice signal if the voice signal sent by a user is detected;
a screening unit, configured to screen out, according to a preset voice byte threshold, target voice bytes of the number corresponding to the voice byte threshold from the speech data obtained by the first acquiring unit;
a second acquiring unit, configured to parse the target voice bytes screened out by the screening unit, and obtain an analysis result that includes the user's idiomatic phrases.
When a voice signal sent by a user is detected, the embodiments of the present invention acquire the corresponding speech data and analyze the target voice bytes screened out of it, thereby obtaining the current user's idiomatic phrases. The idiomatic phrases of a given user can thus be obtained in a targeted way, with greater flexibility.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art may obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an idiomatic phrase acquisition method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another idiomatic phrase acquisition method according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for obtaining target voice bytes according to an embodiment of the present invention;
Fig. 4 is an interaction diagram of an idiomatic phrase acquisition method according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of yet another idiomatic phrase acquisition method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an idiomatic phrase acquisition device according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of another idiomatic phrase acquisition device according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of yet another idiomatic phrase acquisition device according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a terminal according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a server according to an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of an idiomatic phrase acquisition system according to an embodiment of the present invention.
Embodiment
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, a schematic flowchart of an idiomatic phrase acquisition method according to an embodiment of the present invention. The method may be applied in terminal devices such as mobile phones, tablet computers and wearable devices, or in a server; the embodiment of the present invention places no limitation on this. Specifically, the method comprises:
S101: if a voice signal sent by a user is detected, acquire speech data corresponding to the voice signal.
In a specific embodiment, whether a voice signal sent by a user is currently present may be detected, and when a voice signal is detected, acquisition of the corresponding speech data is triggered, for example by recording it.
Further, before the speech data is acquired, it may also be checked whether the user currently sending the voice signal is a legitimate user of the current terminal, for example by matching against a preset speech sample. The speech sample is a sound clip of the legitimate user, and may be recorded by the legitimate user in advance.
S102: according to a preset voice byte threshold, screen out from the speech data target voice bytes of the number corresponding to the voice byte threshold.
In a specific embodiment, a voice byte threshold may be set in advance, and target voice bytes are extracted from the acquired speech data according to this threshold. In general, each word the user speaks corresponds to one voice byte; for example, a user saying "how do you do" (a three-character phrase in the original Chinese) produces three voice bytes.
Optionally, the acquired speech data may be a single sentence, and according to the preset voice byte threshold, the corresponding number of voice bytes may be extracted as target voice bytes from a specific position of the sentence, such as its beginning and/or ending. That is, the target voice byte screening operation may be performed each time a sentence is acquired, for example each time a sentence is recorded, so that a certain number of target voice bytes are obtained. Individual sentences can be distinguished by a preset pause interval.
Further optionally, the acquired speech data may also be a passage made up of multiple sentences. The acquired speech data may be segmented according to the preset pause interval to obtain multiple speech fragments (each fragment may correspond to one sentence). Correspondingly, if the voice byte threshold is set to 5, five voice bytes may be extracted as target voice bytes from a specific position of each speech fragment, for example its first five and/or last five bytes, thereby obtaining multiple target voice bytes.
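The screening step above can be sketched as follows — a minimal illustration assuming the speech data has already been segmented and each fragment transcribed into a list of syllables ("voice bytes"); the function and variable names are illustrative, not part of the patent:

```python
# Sketch of S102: take the first k and/or last k "voice bytes" of each
# fragment as target voice bytes (k is the preset voice byte threshold).
def extract_targets(segments, k):
    targets = []
    for seg in segments:
        if len(seg) >= k:                    # skip fragments shorter than the threshold
            targets.append(tuple(seg[:k]))   # leading k voice bytes
            targets.append(tuple(seg[-k:]))  # trailing k voice bytes
    return targets

segments = [["this", "class", "is", "about", "to", "begin"],
            ["then", "open", "your", "book"]]
print(extract_targets(segments, 3))
```

Each fragment contributes both its head span and its tail span, matching the "beginning and/or ending" positions described in the text.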
S103: parse the target voice bytes, and obtain an analysis result that includes the user's idiomatic phrases.
Specifically, if parsing finds identical target voice bytes — that is, some target voice byte repeats — the number of occurrences of that voice byte, i.e. its repetition count, is calculated, and when the repetition count exceeds a preset count threshold, for example 5, the corresponding target voice byte is stored as an idiomatic phrase of the user.
Further, the user's idiomatic phrases obtained by parsing, together with their repetition counts, may be pushed to the current terminal.
Further, when a voice signal sent by the user is subsequently detected and its corresponding speech data matches an idiomatic phrase obtained by this parsing, a message reminder may be sent to draw the user's attention to the relevant words.
By implementing this embodiment of the present invention, the corresponding speech data can be acquired when a voice signal sent by a user is detected, and the target voice bytes screened out of the speech data can be analyzed, thereby obtaining the current user's idiomatic phrases. The idiomatic phrases of a given user can thus be obtained in a targeted way, with greater flexibility.
Referring to Fig. 2, a schematic flowchart of another idiomatic phrase acquisition method according to an embodiment of the present invention. Specifically, the method comprises:
S201: if a voice signal sent by a user is detected, obtain voice attributes corresponding to the voice signal.
S202: judge whether the voice attributes corresponding to the voice signal match the voice attributes corresponding to a preset speech sample.
In a specific embodiment, a speech sample may be set in advance. The speech sample is a sound clip of the legitimate user, and may be recorded by the current legitimate user.
S203: if they match, acquire speech data corresponding to the voice signal.
Specifically, when a voice signal sent by a user is detected, i.e. someone is detected speaking, the voice attributes of the signal may be matched against the voice attributes of the speech sample, for example by judging whether their timbre and frequency match, so as to determine the legitimacy of the current user's identity. When the judgment result is a match, i.e. the current user's identity is legitimate, acquisition of the speech data corresponding to the voice signal is triggered. The voice attributes may include speech rate, intonation, timbre, frequency, and so on.
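A hedged sketch of this matching step (S201–S203): compare measurable attributes of the incoming signal against the stored sample within tolerances. The attribute names and tolerance values below are illustrative assumptions, not values from the patent:

```python
# Sketch: the current user is treated as legitimate only when every compared
# voice attribute deviates from the preset sample by no more than its tolerance.
def attributes_match(signal_attrs, sample_attrs, tolerances):
    return all(
        abs(signal_attrs[name] - sample_attrs[name]) <= tol
        for name, tol in tolerances.items()
    )

sample   = {"rate_wpm": 140.0, "pitch_hz": 190.0}   # preset speech sample
incoming = {"rate_wpm": 148.0, "pitch_hz": 185.0}   # detected voice signal
tolerances = {"rate_wpm": 20.0, "pitch_hz": 15.0}
print(attributes_match(incoming, sample, tolerances))  # a match: trigger acquisition
```

A real implementation would derive these attributes from the audio itself (e.g. pitch tracking); the dictionary comparison only illustrates the match/no-match decision.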
S204: segment the speech data according to a preset pause interval to obtain speech fragments.
When the user currently sending the voice signal is determined to be a legitimate user, the corresponding speech data may be acquired, for example by recording it. Specifically, the speech data may be one whole passage of speech containing multiple speech fragments, and may be segmented in a preset way, for example according to a preset pause interval between voice bytes, such as 200 ms, to obtain speech fragments (each fragment may correspond to one sentence). Further, if the speech data currently recorded is only a single sentence, that sentence may be taken as one speech fragment; each recorded sentence may be taken as a fragment in turn, until a preset threshold number of fragments is obtained.
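The 200 ms pause segmentation can be sketched as follows, assuming (purely for illustration) that the recognizer yields each voice byte with start and end timestamps in milliseconds:

```python
# Sketch of S204: start a new speech fragment whenever the silence between
# consecutive voice bytes reaches the preset pause interval (200 ms).
def split_by_pause(voice_bytes, pause_ms=200):
    fragments, current, prev_end = [], [], None
    for syllable, start, end in voice_bytes:
        if prev_end is not None and start - prev_end >= pause_ms:
            fragments.append(current)        # pause detected: close the fragment
            current = []
        current.append(syllable)
        prev_end = end
    if current:
        fragments.append(current)
    return fragments

stream = [("hello", 0, 150), ("there", 180, 320), ("then", 700, 850), ("go", 880, 990)]
print(split_by_pause(stream))  # two fragments, split at the 380 ms gap
```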
S205: according to a preset voice byte threshold, extract the corresponding number of voice bytes from the beginning or the ending of each speech fragment as target voice bytes.
In a specific embodiment, a voice byte threshold may likewise be set in advance, and target voice bytes are extracted according to this threshold from a specific position of each divided speech fragment, such as its beginning and/or ending. For example, if the voice byte threshold is set to 5, the first five bytes and the last five bytes of a fragment may both be extracted as target voice bytes, thereby obtaining multiple target voice bytes.
Further, the voice byte threshold may be set to decrement step by step, for example from 5 down through 4, 3, 2 and 1, with the extraction of the corresponding number of target voice bytes from the beginning and ending of each speech fragment repeated until the threshold reaches 0. In this way 5, 4, 3, 2 and 1 voice bytes are extracted in turn from the beginning and ending of each fragment as target voice bytes, so that target voice bytes of different lengths are acquired.
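A sketch of this decrementing pass, under the same illustrative representation of fragments as lists of voice bytes:

```python
# Sketch of S205 with a decrementing threshold: for k = start_k, ..., 2, 1,
# extract the first k and last k voice bytes of each fragment, yielding
# target voice bytes of every length down to 1.
def decrementing_targets(fragments, start_k=5):
    targets = []
    for k in range(start_k, 0, -1):          # decrement until the threshold hits 0
        for frag in fragments:
            if len(frag) >= k:
                targets.append(tuple(frag[:k]))   # head span of length k
                targets.append(tuple(frag[-k:]))  # tail span of length k
    return targets

frag = ["then", "open", "your", "book", "now", "please"]
print(decrementing_targets([frag], start_k=3))
```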
S206: calculate the repetition count of the target voice bytes, and record the repetition count.
S207: if the repetition count is detected to reach a second preset count threshold, take the target voice byte as an idiomatic phrase of the user, and save the idiomatic phrase.
Specifically, if parsing finds identical target voice bytes, the number of occurrences of that voice byte, i.e. its repetition count, is calculated, and when the repetition count exceeds the preset count threshold, for example 5, the corresponding target voice byte is stored as an idiomatic phrase of the user, so that the user may query the analysis result, or the analysis result containing the user's idiomatic phrases may be pushed to the user directly.
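Steps S206–S207 amount to counting repeats among the targets and keeping those at or above the count threshold; a minimal sketch (the threshold of 5 is the text's own example):

```python
# Sketch of S206–S207: count the repetition of each target voice byte sequence
# and keep those whose count reaches the preset threshold as idiomatic phrases.
from collections import Counter

def idiomatic_phrases(targets, count_threshold=5):
    counts = Counter(targets)
    return {phrase: n for phrase, n in counts.items() if n >= count_threshold}

targets = [("then",)] * 5 + [("this",)] * 2
print(idiomatic_phrases(targets))  # only ("then",) reaches the threshold
```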
Optionally, a reminder time may be preset, for example nine o'clock every evening, and when the reminder time arrives, the acquired analysis result — the user's idiomatic phrases and their corresponding repetition counts — is pushed to the current terminal as a message.
In a specific embodiment, a prohibited phrase bank may also be set in advance, preset with speech fragments carrying a prohibition instruction, i.e. commonly used uncivil expressions (the source text lists several Chinese expletives as examples, rendered literally in translation as "leaning on", "behaviour" and "your younger sister"). Optionally, if a parsed idiomatic phrase is determined to be a voice byte that should be prohibited, such as an uncivil expression, a prohibition instruction may be generated, and the idiomatic phrase carrying the prohibition instruction is added to the prohibited phrase bank as a prohibited speech fragment.
Further, if the speech data corresponding to a detected voice signal sent by the user matches any speech fragment in the prohibited phrase bank, a message reminder may be sent to draw the user's attention to the relevant words. Specifically, the message reminder may take the form of a text message, a ringtone or a vibration; the embodiment of the present invention places no limitation on this.
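A sketch of the prohibited phrase bank logic: a parsed idiomatic phrase judged uncivil is added to the bank, and later utterances matching any banned fragment trigger a reminder. The phrase strings and the `send_reminder` hook are illustrative placeholders:

```python
# Sketch: maintain a prohibited phrase bank and check each new speech fragment
# against it, sending a reminder (text, ringtone or vibration) on a match.
class ProhibitedBank:
    def __init__(self, uncivil_phrases):
        self.uncivil = set(uncivil_phrases)  # phrases known to be uncivil
        self.banned = set()                  # the prohibited phrase bank

    def maybe_ban(self, idiomatic_phrase):
        """Add a parsed idiomatic phrase to the bank if it is uncivil."""
        if idiomatic_phrase in self.uncivil:
            self.banned.add(idiomatic_phrase)

    def check(self, fragments, send_reminder):
        """Call the reminder hook when any fragment matches a banned phrase."""
        if any(frag in self.banned for frag in fragments):
            send_reminder()

bank = ProhibitedBank({"darn"})
bank.maybe_ban("darn")                     # parsed pet phrase turns out uncivil
bank.check(["well", "darn"], lambda: print("reminder: mind your words"))
```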
By implementing this embodiment of the present invention, acquisition of the corresponding speech data can be triggered upon detecting that the identity of the user currently sending a voice signal is legitimate. By segmenting the speech data into speech fragments and screening out the more representative words from the beginning and/or ending of each fragment, the current user's idiomatic phrases are analyzed and obtained, and pushed to the relevant user in a targeted way. Further, the user may also be reminded upon subsequently being detected to say such a pet phrase, for example an uncivil expression.
Referring to Fig. 3, a schematic flowchart of a method for obtaining target voice bytes according to an embodiment of the present invention. Specifically, the method comprises:
S301: screen out from the speech fragments the target speech fragments whose number of voice bytes is greater than or equal to a preset voice byte threshold.
S302: if the number of target speech fragments screened out is not less than a first preset count threshold, extract the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes.
For example, if the voice byte threshold is set to 5 and the count threshold for speech fragments is set to 6, speech fragments with 5 or more voice bytes may be screened out, and when 6 such fragments have been accumulated, extraction of the first five and/or last five voice bytes of those 6 fragments as target voice bytes is triggered.
Optionally, among the fragments obtained by segmentation, a fragment whose number of voice bytes is less than the preset voice byte threshold may be treated as a conjectured imminent occurrence of an uncivil expression. Such a fragment may be compared against each speech fragment in the preset prohibited phrase bank; if a match is detected, the fragment is taken as an uncivil expression, and the expression and its occurrence count are saved, so that the user may query them later or they may be pushed to the current user.
S303: decrement the voice byte threshold step by step, and judge whether the decremented threshold is zero.
Further, the voice byte threshold may be set to decrement step by step, for example from 5 down through 4, 3, 2 and 1, with step S302 executed repeatedly until the threshold reaches 0, so that 5, 4, 3, 2 and 1 voice bytes are extracted in turn from the beginning and/or ending of each screened-out target speech fragment as target voice bytes.
S304: obtain the target voice bytes.
When the voice byte threshold reaches 0, the extraction of target voice bytes can end, target voice bytes of different lengths having been acquired.
For example, suppose the following speech fragments are screened out:
1. This class is about to begin at once.
2. Then, classmates, quickly recall what was said in class.
3. Those who have finished, don't look yet.
4. Then open your book and turn to page 55.
5. Then have a look at the hint content there.
6. This class has started.
Here the count threshold for speech fragments is 6 and the voice byte threshold is set to 5: six consecutively accumulated sentences form one comparison unit, and every sentence satisfies the condition of having 5 or more voice bytes.
For the six sentences above, according to the voice byte threshold of 5, the voice bytes corresponding to the first five and last five characters of each sentence (in the original Chinese) can be extracted from its beginning and ending as target voice bytes — e.g. "this class at once" and "about to begin", "then classmates" and "said in class" — and each extracted target voice byte is parsed.
In a specific embodiment, the target voice bytes can be parsed by comparing the voice bytes extracted from the beginnings of the sentences with one another, and likewise those extracted from the endings. For example, comparing the six head spans (the beginnings of the sentences), none is found to be identical to another; further comparing the six tail spans (the endings), none is identical either, so the voice byte threshold can be decremented from 5 to 4.
According to the voice byte threshold of 4, the six head spans are compared and none is found identical; further, the six tail spans are compared and none is identical, so the voice byte threshold can be decremented from 4 to 3, and so on.
This continues until the voice byte threshold is decremented to 2, when "then" is found to appear three times among the head spans of the six sentences; the corresponding voice byte is saved, and its repetition count of 3 is recorded.
Finally the voice byte threshold is decremented from 2 to 1. "This" is found to appear twice among the head spans; the corresponding voice byte is saved and its repetition count of 2 is recorded. The single byte rendered as "so" appears three times, and its repetition count of 3 is recorded. Among the tail spans, a sentence-final particle (left empty by the translation) is found to appear four times; its corresponding voice byte is saved with a repetition count of 4. Further, since the repetition count of "so" is the same as that of the two-byte "then" found at threshold 2 — both 3, so not higher — and "then" contains "so", the record for "so" can be discarded directly; otherwise "so" and its repetition count would be recorded.
In summary, the analysis finds that the user's idiomatic phrases, i.e. pet phrases, include "this", "then" and the sentence-final particle. Further, if the count threshold for the repetition count is set to 3, "then" and the particle can be stored as the user's idiomatic phrases.
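The worked example above can be sketched end to end on toy data: decrement k, count repeated sentence-initial and sentence-final k-byte spans, and discard a shorter span when a longer saved span beginning with it has an equal or higher count (the "then" subsuming "so" rule, as interpreted here). The tokens and thresholds are illustrative:

```python
# Sketch of the full Fig. 3 pass: head/tail spans per decrementing k, repetition
# counts, and subsumption of shorter spans by longer ones they prefix.
from collections import Counter

def find_pet_phrases(sentences, start_k=2, count_threshold=3):
    kept = {}
    for k in range(start_k, 0, -1):
        spans = []
        for s in sentences:
            if len(s) >= k:
                spans.append(tuple(s[:k]))   # head span
                spans.append(tuple(s[-k:]))  # tail span
        for span, n in Counter(spans).items():
            subsumed = any(
                longer[:len(span)] == span and m >= n
                for longer, m in kept.items() if len(longer) > len(span)
            )
            if n >= count_threshold and not subsumed:
                kept[span] = n
    return kept

sentences = [
    ["then", "we", "go", "now"],
    ["then", "we", "stop", "here"],
    ["then", "we", "rest", "okay"],
    ["this", "is", "fine", "okay"],
]
print(find_pet_phrases(sentences))  # ("then", "we") kept; ("then",) subsumed
```

The subsumption test only checks prefix containment, which covers the patent's "then"/"so" case; a fuller implementation might also check suffix containment for tail spans.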
Further, the above parsing can be applied to subsequent groups of six sentences whose voice byte count is 5 or more, obtaining analysis results that contain the user's pet phrases. If a newly monitored pet phrase is consistent with one found earlier, its occurrence count can be accumulated, and when it exceeds a certain count within a preset time — for example more than 20 occurrences within 3 hours — it is marked as a severe warning, and a message is sent to notify the current user.
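The time-window escalation described here (more than 20 occurrences in 3 hours → severe warning) can be sketched with a sliding window of timestamps; the class and its names are illustrative:

```python
# Sketch: record each occurrence of a known pet phrase with a timestamp (seconds)
# and flag a severe warning once occurrences within the window exceed the limit.
from collections import deque

class PhraseMonitor:
    def __init__(self, window_s=3 * 3600, limit=20):
        self.window_s, self.limit = window_s, limit
        self.times = deque()

    def record(self, t):
        """Record one occurrence at time t; return True when the limit is exceeded."""
        self.times.append(t)
        while self.times and t - self.times[0] > self.window_s:
            self.times.popleft()              # drop occurrences outside the window
        return len(self.times) > self.limit

monitor = PhraseMonitor(window_s=100, limit=3)
print([monitor.record(t) for t in range(5)])  # limit exceeded from the 4th on
```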
By implementing this embodiment of the present invention, speech fragments exceeding a certain number of voice bytes can be screened out, and target voice bytes of the corresponding number extracted from the beginning and ending of each fragment in descending order of the preset byte count. Parsing whether any bytes repeat among the target voice bytes then yields the current user's idiomatic phrases, with stronger specificity.
Referring to Fig. 4, an interaction diagram of an idiomatic phrase acquisition method according to an embodiment of the present invention. The method comprises:
S401: if a terminal detects a voice signal sent by a user, the terminal acquires speech data corresponding to the voice signal.
In a specific embodiment, whether a voice signal sent by a user is currently present may be detected, and when a voice signal is detected, acquisition of the corresponding speech data is triggered, for example by recording it.
Optionally, the terminal acquiring the speech data corresponding to a detected voice signal may specifically comprise: if the terminal detects a voice signal sent by a user, the terminal obtains voice attributes corresponding to the voice signal; the terminal judges whether these voice attributes match the voice attributes corresponding to a preset speech sample, the speech sample being a sound clip of the legitimate user, and the voice attributes comprising one or more of speech rate, intonation, timbre and frequency; if the terminal's judgment result is a match, i.e. the current user is detected to be a legitimate user, the terminal triggers acquisition of the speech data corresponding to the voice signal.
S402: the terminal sends the speech data to a server.
S403: the server receives the speech data sent by the terminal and, according to a preset voice byte threshold, screens out from the speech data target voice bytes of the number corresponding to the voice byte threshold.
In a specific embodiment, a voice byte threshold may be set in advance, and target voice bytes are extracted from the received speech data according to this threshold.
Optionally, the server screening out the target voice bytes according to the preset voice byte threshold may specifically comprise: the server segments the speech data according to a preset pause interval to obtain speech fragments, the speech data comprising at least one speech fragment; and the server extracts, according to the preset voice byte threshold, the corresponding number of voice bytes from the beginning or the ending of each speech fragment as target voice bytes.
It should be noted that the step of obtaining the target voice bytes in S403 may also be performed by the terminal: the terminal may screen out from the speech data the target voice bytes of the number corresponding to the preset voice byte threshold, and then send the obtained target voice bytes to the server, so that the server parses them.
S404: the server parses the target voice bytes, and obtains an analysis result that includes the user's idiomatic phrases.
Optionally, the server parsing the target voice bytes and obtaining the analysis result may specifically comprise: the server calculates the repetition count of a target voice byte and records it; if the server detects that the repetition count reaches a preset count threshold, the server takes the target voice byte as an idiomatic phrase of the user and saves it.
Specifically, if parsing finds identical target voice bytes — that is, some target voice byte repeats — the number of occurrences of that voice byte, i.e. its repetition count, is calculated, and when the repetition count exceeds the preset count threshold, for example 5, the corresponding target voice byte is stored as an idiomatic phrase of the user.
S405: the server pushes the analysis result to the terminal.
By implementing this embodiment of the present invention, the corresponding speech data can be acquired when a voice signal sent by a user is detected, and the target voice bytes screened out of the speech data can be analyzed, thereby obtaining the current user's idiomatic phrases. The idiomatic phrases of a given user can thus be obtained in a targeted way, with greater flexibility.
Further, the server may also push the analysis result obtained by parsing — the user's idiomatic phrases and their repetition counts — to the current terminal.
Referring to Fig. 5, a schematic flowchart of yet another idiomatic phrase acquisition method according to an embodiment of the present invention. The method may be applied in a server. Specifically, the method comprises:
S501: the server receives speech data sent by a terminal and, according to a preset voice byte threshold, screens out from the speech data target voice bytes of the number corresponding to the voice byte threshold.
Here, the speech data is the speech data corresponding to a voice signal, acquired by the terminal when the voice signal sent by a user was detected.
In a specific embodiment, the server may segment the speech data according to a preset pause interval to obtain speech fragments, and then, according to a preset voice byte threshold, extract the corresponding number of voice bytes from the beginning or the ending of each fragment as target voice bytes. The speech data comprises at least one speech fragment.
Specifically, the server may set a voice byte threshold in advance and extract target voice bytes according to this threshold from a specific position of each divided speech fragment, such as its beginning and/or ending. For example, if the voice byte threshold is set to 5, the first five and last five bytes of a fragment may both be extracted as target voice bytes, thereby obtaining multiple target voice bytes.
Further, the voice byte threshold may be set to decrement step by step, for example from 5 down through 4, 3, 2 and 1, with the extraction of the corresponding number of target voice bytes from the beginning and ending of each speech fragment repeated until the threshold reaches 0. In this way 5, 4, 3, 2 and 1 voice bytes are extracted in turn from the beginning and ending of each fragment as target voice bytes, so that target voice bytes of different lengths are acquired.
S502: The server parses the target voice bytes and obtains an analysis result comprising the user's idiomatic phrases.
In a specific embodiment, the server may calculate the repetition count of a target voice byte and record the repetition count; if the server detects that the repetition count reaches a preset quantity threshold, the server takes the target voice byte as an idiomatic phrase of the user and stores the idiomatic phrase.
Specifically, if identical target voice bytes are found during parsing, the number of occurrences of that voice byte, i.e. its repetition count, is calculated; when the repetition count exceeds the preset quantity threshold, for example 5, the corresponding target voice byte is stored as an idiomatic phrase of the user, so that the user can query the analysis result, or the analysis result containing the user's idiomatic phrases can be pushed to the user directly.
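A minimal sketch of the repetition counting just described, under the assumption that target voice bytes are represented as hashable sequences; the function name `find_idioms` and the use of `>=` against the quantity threshold are illustrative choices.

```python
from collections import Counter

def find_idioms(target_bytes, quantity_threshold=5):
    """Count how often each target voice byte recurs and keep those
    whose repetition count reaches the preset quantity threshold."""
    counts = Counter(target_bytes)
    return {phrase: n for phrase, n in counts.items()
            if n >= quantity_threshold}
```

The returned mapping pairs each stored idiomatic phrase with its repetition count, which matches the push/query behavior described above.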
By implementing this embodiment of the present invention, upon receiving speech data sent by a terminal, the server can analyze the target voice bytes filtered out of the speech data to obtain the current user's idiomatic phrases, so that the idiomatic phrases of a given user can be obtained in a targeted way with greater flexibility.
Referring to Fig. 6, which is a structural diagram of an idiomatic phrase acquisition device according to an embodiment of the present invention, the device may be arranged in a terminal device such as a mobile phone, tablet computer or wearable device, or in a server; this is not limited in the embodiments of the present invention. Specifically, the device comprises a first acquiring unit 11, a screening unit 12 and a second acquiring unit 13.
The first acquiring unit 11 is configured to obtain, if a voice signal uttered by a user is detected, the speech data corresponding to the voice signal.
In a specific embodiment, the first acquiring unit 11 may detect whether a voice signal uttered by a user is currently present and, when a voice signal is detected, trigger acquisition of the corresponding speech data, for example by recording it.
The screening unit 12 is configured to filter out, according to a preset voice byte threshold, a number of target voice bytes corresponding to the voice byte threshold from the speech data obtained by the first acquiring unit 11.
In a specific embodiment, a voice byte threshold may be preset, and the screening unit 12 may extract target voice bytes from the obtained speech data according to this threshold. In general, each word the user speaks corresponds to one voice byte; for example, the user saying "how do you do" corresponds to three voice bytes.
Optionally, the speech data obtained by the first acquiring unit 11 may be a single sentence, and the screening unit 12 may extract, according to the preset threshold, the corresponding number of voice bytes from specific positions of the sentence, such as its beginning and/or ending, as target voice bytes. That is, each time a sentence is acquired, for example each time a sentence is recorded, the screening unit 12 may perform the target voice byte screening operation, thereby screening out a certain number of target voice bytes.
Further optionally, the speech data obtained by the first acquiring unit 11 may also be a passage of speech (made up of several sentences). The screening unit 12 may segment the recorded speech data according to a preset pause time interval to obtain multiple speech segments (one speech segment may correspond to one sentence). Correspondingly, if the voice byte threshold is set to 5, the screening unit 12 may extract 5 voice bytes from specific positions of each speech segment as target voice bytes, for example the first 5 bytes and/or the last 5 bytes of the segment, thereby obtaining multiple target voice bytes.
The second acquiring unit 13 is configured to parse the target voice bytes filtered out by the screening unit 12 and obtain an analysis result comprising the user's idiomatic phrases.
Specifically, if the second acquiring unit 13 finds during parsing that some target voice bytes are identical, i.e. a certain target voice byte recurs, it may calculate the number of occurrences of that voice byte, i.e. its repetition count, and when the repetition count exceeds a preset quantity threshold, for example 5, store the corresponding target voice byte as an idiomatic phrase of the user.
Further, the second acquiring unit 13 may also push the parsed idiomatic phrases and their repetition counts to the current terminal.
By implementing this embodiment of the present invention, when a voice signal uttered by a user is detected, the corresponding speech data can be recorded, and the target voice bytes filtered out of the recorded speech data analyzed to obtain the current user's idiomatic phrases, so that the idiomatic phrases of a given user can be obtained in a targeted way with greater flexibility.
Referring to Fig. 7, which is a structural diagram of another idiomatic phrase acquisition device according to an embodiment of the present invention, the device comprises the first acquiring unit 11, screening unit 12 and second acquiring unit 13 of the above idiomatic phrase acquisition device. Further, in this embodiment of the present invention, the first acquiring unit 11 may comprise:
an information acquiring unit 111, configured to obtain, if a voice signal uttered by a user is detected, the voice attributes corresponding to the voice signal;
a judging unit 112, configured to judge whether the voice attributes corresponding to the voice signal obtained by the information acquiring unit 111 match the voice attributes corresponding to a preset speech sample.
The voice attributes comprise any one or more of speech rate, intonation, timbre and frequency.
In a specific embodiment, a speech sample may be preset; the speech sample is a sound clip of a legitimate user and may specifically be recorded by the current legitimate user.
a data acquiring unit 113, configured to obtain the speech data corresponding to the voice signal when the judging result of the judging unit 112 is a match.
Specifically, when the information acquiring unit 111 detects a voice signal uttered by a user, i.e. detects that someone is speaking, it may obtain the voice attributes corresponding to the voice signal, and the judging unit 112 may compare these attributes against the voice attributes of the speech sample, for example judging whether the timbre and frequency of the two match, thereby determining the legitimacy of the current user's identity. When the judging result is a match, i.e. the current user's identity is legitimate, the data acquiring unit 113 obtains the speech data corresponding to the voice signal.
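The attribute comparison could be sketched as below; the attribute dictionary, the relative tolerance, and the function name `attributes_match` are assumptions for illustration, since the disclosure does not fix a concrete matching rule.

```python
def attributes_match(signal_attrs, sample_attrs, tolerance=0.1):
    """Treat the speaker as legitimate only if every attribute of the
    stored sample (e.g. timbre, frequency) is present in the incoming
    signal and lies within the relative tolerance."""
    for name, sample_value in sample_attrs.items():
        value = signal_attrs.get(name)
        if value is None:
            return False  # attribute missing from the signal
        if abs(value - sample_value) > tolerance * abs(sample_value):
            return False  # outside the allowed deviation
    return True
```

A real implementation would derive numeric attribute values (pitch, spectral features, etc.) from the audio itself; here they are taken as given.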
Further, in this embodiment of the present invention, the screening unit 12 may comprise:
a data segmenting unit 121, configured to segment the speech data according to a preset pause time interval to obtain speech segments.
The speech data comprises at least one speech segment.
When the judging result of the judging unit 112 is a match, i.e. the user currently uttering the voice signal is a legitimate user, the data acquiring unit 113 may obtain the corresponding speech data, for example by recording it. Specifically, the speech data may be a whole passage of speech comprising multiple speech segments, and the data segmenting unit 121 may segment it in a preset manner, for example according to the pause time interval between voice bytes in the speech data, such as 200 ms, to obtain speech segments (one speech segment may correspond to one sentence). Further, if the speech data recorded by the first acquiring unit 11 is only a single sentence, the data segmenting unit 121 may take that sentence as one speech segment; each time the first acquiring unit 11 records a sentence, the data segmenting unit 121 may take it as one speech segment, thereby obtaining a preset threshold number of speech segments.
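The pause-based segmentation could be sketched as follows, assuming the speech data arrives as (voice byte, timestamp in milliseconds) pairs; this representation is an assumption, while the 200 ms default mirrors the example above.

```python
def split_into_segments(timed_bytes, pause_ms=200):
    """Start a new speech segment whenever the silence between two
    consecutive voice bytes reaches the preset pause interval."""
    segments, current, last_t = [], [], None
    for byte, t in timed_bytes:
        if last_t is not None and t - last_t >= pause_ms:
            segments.append(current)  # pause detected: close the segment
            current = []
        current.append(byte)
        last_t = t
    if current:
        segments.append(current)  # flush the trailing segment
    return segments
```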
a data extracting unit 122, configured to extract, according to a preset voice byte threshold, the corresponding number of voice bytes from the beginning or the ending of each speech segment divided by the data segmenting unit 121 as target voice bytes.
In a specific embodiment, the data extracting unit 122 may extract target voice bytes from specific positions of each divided speech segment, such as its beginning and/or ending, according to the preset voice byte threshold. For instance, if the voice byte threshold is set to 5, the data extracting unit 122 may extract both the first 5 bytes and the last 5 bytes of a speech segment as target voice bytes, thereby obtaining multiple target voice bytes.
Optionally, the data extracting unit 122 may be specifically configured to:
filter out, from the speech segments, target speech fragments whose voice byte count is greater than or equal to the preset voice byte threshold; and, if the number of target speech fragments filtered out is not less than a preset first quantity threshold, extract the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes.
For instance, if the voice byte threshold is set to 5 and the quantity threshold for speech fragments is set to 6, the data extracting unit 122 may filter out, from the speech segments, those with 5 or more voice bytes and, once 6 such fragments have been collected, extract via the voice acquiring subunit 1222 the first 5 and/or last 5 voice bytes of these 6 fragments as target voice bytes.
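The two-stage filtering in this example can be sketched as follows; the function name, the return shape, and the decision to return an empty list when too few fragments have accumulated are illustrative assumptions.

```python
def filter_and_extract(segments, byte_threshold=5, min_fragments=6):
    """Keep only segments long enough to supply byte_threshold voice
    bytes; extract leading/trailing bytes only once enough such target
    fragments have accumulated."""
    fragments = [s for s in segments if len(s) >= byte_threshold]
    if len(fragments) < min_fragments:
        return []  # not enough material collected yet
    return [(tuple(s[:byte_threshold]), tuple(s[-byte_threshold:]))
            for s in fragments]
```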
Further, in this embodiment of the present invention, the device may also comprise:
a control unit 14, configured to control the voice byte threshold to be decremented step by step, and to notify the data extracting unit 122 to extract the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes, until the voice byte threshold is zero.
Further, the control unit 14 may decrement the voice byte threshold step by step, for example from 5 down to 4, 3, 2 and 1, and notify the data extracting unit 122 to extract the corresponding number of target voice bytes from the beginning and ending of each speech segment until the threshold reaches 0; that is, the data extracting unit 122 is notified to extract 5, 4, 3, 2 and 1 voice bytes in turn from the beginning and ending of each speech segment as target voice bytes, thereby obtaining target voice bytes of different lengths.
Further, in this embodiment of the present invention, the second acquiring unit 13 may comprise:
a calculating unit 131, configured to calculate the repetition count of a target voice byte and record the repetition count;
an information storing unit 132, configured to take the target voice byte as an idiomatic phrase of the user and store the idiomatic phrase if it is detected that the repetition count reaches a preset second quantity threshold.
Specifically, if identical target voice bytes are found during parsing, the calculating unit 131 may calculate the number of occurrences of that voice byte, i.e. its repetition count, and when the repetition count exceeds the preset quantity threshold, for example 5, the information storing unit 132 stores the corresponding target voice byte as an idiomatic phrase of the user, so that the user can query the analysis result, or the analysis result containing the user's idiomatic phrases can be pushed to the user directly.
By implementing this embodiment of the present invention, acquisition of the corresponding speech data can be triggered upon detecting that the identity of the user currently uttering the voice signal is legitimate; the speech data is segmented to obtain speech segments, and more representative expressions are filtered out from the beginning and ending of each segment, so that the current user's idiomatic phrases are obtained by analysis and pushed to the relevant user in a targeted way.
Referring to Fig. 8, which is a structural diagram of yet another idiomatic phrase acquisition device according to an embodiment of the present invention, the device may specifically be arranged in a server. Specifically, the device comprises a screening unit 21 and an acquiring unit 22.
The screening unit 21 is configured to filter out, according to a preset voice byte threshold, a number of target voice bytes corresponding to the voice byte threshold from speech data sent by a terminal.
The speech data is the speech data corresponding to the voice signal, obtained by the terminal when it detects a voice signal uttered by the user.
In a specific embodiment, a voice byte threshold may be preset, and the screening unit 21 may extract target voice bytes from the obtained speech data according to this threshold.
The acquiring unit 22 is configured to parse the target voice bytes filtered out by the screening unit 21 and obtain an analysis result comprising the user's idiomatic phrases.
Further, in this embodiment of the present invention, the screening unit 21 may comprise:
a data segmenting unit 211, configured to segment the speech data according to a preset pause time interval to obtain speech segments, the speech data comprising at least one speech segment;
a data extracting unit 212, configured to extract, according to a preset voice byte threshold, the corresponding number of voice bytes from the beginning or the ending of each speech segment divided by the data segmenting unit 211 as target voice bytes.
In a specific embodiment, the data extracting unit 212 may extract target voice bytes from specific positions of each speech segment divided by the data segmenting unit 211, such as its beginning and/or ending, according to the preset voice byte threshold. For instance, if the voice byte threshold is set to 5, the data extracting unit 212 may extract both the first 5 bytes and the last 5 bytes of a speech segment as target voice bytes, thereby obtaining multiple target voice bytes.
Optionally, the data extracting unit 212 may be specifically configured to:
filter out, from the speech segments, target speech fragments whose voice byte count is greater than or equal to the preset voice byte threshold; and, if the number of target speech fragments filtered out is not less than a preset first quantity threshold, extract the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes.
Further, in this embodiment of the present invention, the acquiring unit 22 may comprise:
a calculating unit 221, configured to calculate the repetition count of a target voice byte and record the repetition count;
an information storing unit 222, configured to take the target voice byte as an idiomatic phrase of the user and store the idiomatic phrase if it is detected that the repetition count reaches a preset quantity threshold.
Specifically, if identical target voice bytes are found during parsing, the calculating unit 221 may calculate the number of occurrences of that voice byte, i.e. its repetition count, and when the repetition count exceeds the preset quantity threshold, for example 5, the information storing unit 222 stores the corresponding target voice byte as an idiomatic phrase of the user, so that the user can query the analysis result, or the analysis result containing the user's idiomatic phrases can be pushed to the user directly.
By implementing this embodiment of the present invention, upon receiving speech data sent by a terminal, the server can analyze the target voice bytes filtered out of the speech data to obtain the current user's idiomatic phrases, so that the idiomatic phrases of a given user can be obtained in a targeted way with greater flexibility.
Further, referring to Fig. 9, which is a structural diagram of a terminal according to an embodiment of the present invention. As shown in Fig. 9, the terminal comprises: at least one processor 100, for example a CPU; at least one user interface 300; a memory 400; and at least one communication bus 200. The communication bus 200 is used to implement connection and communication between these components. The user interface 300 may comprise a display and a keyboard, and optionally may further comprise a standard wired interface and wireless interface. The memory 400 may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory; optionally, the memory 400 may also be at least one storage device located away from the processor 100. In conjunction with the idiomatic phrase acquisition devices described with reference to Fig. 6 and Fig. 7, a set of program code is stored in the memory 400, and the processor 100 calls the program code stored in the memory 400 to perform the following operations:
if a voice signal uttered by a user is detected, obtaining the speech data corresponding to the voice signal;
according to a preset voice byte threshold, filtering out from the speech data a number of target voice bytes corresponding to the voice byte threshold;
parsing the target voice bytes and obtaining an analysis result comprising the user's idiomatic phrases.
In an optional embodiment, the processor 100 calling the program code stored in the memory 400 to obtain, when a voice signal uttered by a user is detected, the speech data corresponding to the voice signal may specifically comprise:
if a voice signal uttered by a user is detected, obtaining the voice attributes corresponding to the voice signal;
judging whether the voice attributes corresponding to the voice signal match the voice attributes corresponding to a preset speech sample, the speech sample being recorded by a legitimate user, and the voice attributes comprising any one or more of speech rate, intonation, timbre and frequency;
if they match, obtaining the speech data corresponding to the voice signal.
Further optionally, the processor 100 calling the program code stored in the memory 400 to filter out, according to a preset voice byte threshold, a number of target voice bytes corresponding to the voice byte threshold from the speech data may specifically comprise:
segmenting the speech data according to a preset pause time interval to obtain speech segments, the speech data comprising at least one speech segment;
according to the preset voice byte threshold, extracting the corresponding number of voice bytes from the beginning or the ending of each speech segment as target voice bytes.
In an optional embodiment, the processor 100 calling the program code stored in the memory 400 to extract, according to the preset voice byte threshold, the corresponding number of voice bytes from the beginning or the ending of each speech segment as target voice bytes may specifically comprise:
filtering out, from the speech segments, target speech fragments whose voice byte count is greater than or equal to the preset voice byte threshold;
if the number of target speech fragments filtered out is not less than a preset first quantity threshold, extracting the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes.
In an optional embodiment, the processor 100 may also perform the following steps:
decrementing the voice byte threshold step by step;
repeating the step of extracting the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes, until the voice byte threshold is zero.
In an optional embodiment, the processor 100 calling the program code stored in the memory 400 to parse the target voice bytes and obtain an analysis result comprising the user's idiomatic phrases may specifically comprise:
calculating the repetition count of a target voice byte and recording the repetition count;
if it is detected that the repetition count reaches a preset second quantity threshold, taking the target voice byte as an idiomatic phrase of the user and storing the idiomatic phrase.
Specifically, the terminal introduced in this embodiment may be used to implement some or all of the flows in the embodiments of the idiomatic phrase acquisition method of the present invention described with reference to Fig. 1 to Fig. 4.
Further, referring to Fig. 10, which is a structural diagram of a server according to an embodiment of the present invention. As shown in Fig. 10, the server comprises: at least one processor 500, for example a CPU; at least one user interface 700; a memory 800; and at least one communication bus 600. The communication bus 600 is used to implement connection and communication between these components. The user interface 700 may comprise a standard wired interface and wireless interface. The memory 800 may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory; optionally, the memory 800 may also be at least one storage device located away from the processor 500. In conjunction with the idiomatic phrase acquisition devices described with reference to Fig. 6 and Fig. 7, a set of program code is stored in the memory 800, and the processor 500 calls the program code stored in the memory 800 to perform the following operations:
if a voice signal uttered by a user is detected, obtaining the speech data corresponding to the voice signal;
according to a preset voice byte threshold, filtering out from the speech data a number of target voice bytes corresponding to the voice byte threshold;
parsing the target voice bytes and obtaining an analysis result comprising the user's idiomatic phrases.
In an optional embodiment, the processor 500 calling the program code stored in the memory 800 to obtain, when a voice signal uttered by a user is detected, the speech data corresponding to the voice signal may specifically comprise:
if a voice signal uttered by a user is detected, obtaining the voice attributes corresponding to the voice signal;
judging whether the voice attributes corresponding to the voice signal match the voice attributes corresponding to a preset speech sample, the speech sample being recorded by a legitimate user, and the voice attributes comprising any one or more of speech rate, intonation, timbre and frequency;
if they match, obtaining the speech data corresponding to the voice signal.
Further optionally, the processor 500 calling the program code stored in the memory 800 to filter out, according to a preset voice byte threshold, a number of target voice bytes corresponding to the voice byte threshold from the speech data may specifically comprise:
segmenting the speech data according to a preset pause time interval to obtain speech segments, the speech data comprising at least one speech segment;
according to the preset voice byte threshold, extracting the corresponding number of voice bytes from the beginning or the ending of each speech segment as target voice bytes.
In an optional embodiment, the processor 500 calling the program code stored in the memory 800 to extract, according to the preset voice byte threshold, the corresponding number of voice bytes from the beginning or the ending of each speech segment as target voice bytes may specifically comprise:
filtering out, from the speech segments, target speech fragments whose voice byte count is greater than or equal to the preset voice byte threshold;
if the number of target speech fragments filtered out is not less than a preset first quantity threshold, extracting the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes.
In an optional embodiment, the processor 500 may also perform the following steps:
decrementing the voice byte threshold step by step;
repeating the step of extracting the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes, until the voice byte threshold is zero.
In an optional embodiment, the processor 500 calling the program code stored in the memory 800 to parse the target voice bytes and obtain an analysis result comprising the user's idiomatic phrases may specifically comprise:
calculating the repetition count of a target voice byte and recording the repetition count;
if it is detected that the repetition count reaches a preset second quantity threshold, taking the target voice byte as an idiomatic phrase of the user and storing the idiomatic phrase.
Specifically, the server introduced in this embodiment may be used to implement some or all of the flows in the embodiments of the idiomatic phrase acquisition method of the present invention described with reference to Fig. 1 to Fig. 4.
Further, referring to Fig. 11, which is a structural diagram of an idiomatic phrase acquisition system according to an embodiment of the present invention, the system comprises a terminal 1 and a server 2.
The terminal 1 is configured to obtain, if a voice signal uttered by a user is detected, the speech data corresponding to the voice signal, and to send the speech data to the server 2.
The server 2 is configured to receive the speech data sent by the terminal 1, filter out from the speech data, according to a preset voice byte threshold, a number of target voice bytes corresponding to the voice byte threshold, parse the target voice bytes, and obtain an analysis result comprising the user's idiomatic phrases.
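The server side of the terminal/server system could be sketched end to end as below; the data representation as timed voice bytes and the small thresholds are illustrative assumptions chosen to keep the example compact, not values fixed by the disclosure.

```python
from collections import Counter

def analyze_speech(timed_bytes, pause_ms=200, byte_threshold=2,
                   repeat_threshold=3):
    """Segment on pauses, take the leading and trailing bytes of each
    segment, and report sequences repeated often enough to count as
    the user's idiomatic phrases."""
    # 1. segment the incoming stream on pauses
    segments, current, last_t = [], [], None
    for byte, t in timed_bytes:
        if last_t is not None and t - last_t >= pause_ms:
            segments.append(current)
            current = []
        current.append(byte)
        last_t = t
    if current:
        segments.append(current)
    # 2. extract target voice bytes from each segment's edges
    targets = []
    for seg in segments:
        if len(seg) >= byte_threshold:
            targets.append(tuple(seg[:byte_threshold]))
            targets.append(tuple(seg[-byte_threshold:]))
    # 3. count repetitions against the quantity threshold
    counts = Counter(targets)
    return {t: n for t, n in counts.items() if n >= repeat_threshold}
```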
In an optional embodiment, the terminal 1 may also be configured to: if a voice signal uttered by a user is detected, obtain the voice attributes corresponding to the voice signal; judge whether the voice attributes corresponding to the voice signal match the voice attributes corresponding to a preset speech sample, the speech sample being a sound clip of a legitimate user and the voice attributes comprising any one or more of speech rate, intonation, timbre and frequency; and, if they match, obtain the speech data corresponding to the voice signal.
In an optional embodiment, the server 2 may also be configured to segment the speech data according to a preset pause time interval to obtain speech segments, the speech data comprising at least one speech segment, and to extract, according to a preset voice byte threshold, the corresponding number of voice bytes from the beginning or the ending of each speech segment as target voice bytes.
Specifically, the server 2 may filter out, from the speech segments, target speech fragments whose voice byte count is greater than or equal to the preset voice byte threshold and, when the number of target speech fragments filtered out is not less than a preset first quantity threshold, for example 6, extract the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes.
Further, the server 2 may control the voice byte threshold to be decremented step by step and repeat the step of extracting the corresponding number of voice bytes from the beginning or the ending of each target speech fragment as target voice bytes, until the voice byte threshold is zero, thereby obtaining target voice bytes of multiple different lengths.
In an optional embodiment, the server 2 may also be configured to calculate the repetition count of a target voice byte and record the repetition count; if it is detected that the repetition count reaches a preset second quantity threshold, the server 2 takes the target voice byte as an idiomatic phrase of the user and stores the idiomatic phrase.
By implementing this embodiment of the present invention, when a voice signal uttered by a user is detected, the corresponding speech data can be obtained, and the target voice bytes filtered out of the speech data analyzed to obtain the current user's idiomatic phrases, so that the idiomatic phrases of a given user can be obtained in a targeted way with greater flexibility.
Those of ordinary skill in the art will appreciate that all or part of the flows in the above method embodiments may be completed by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may comprise the flows of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM) or the like.
It should be noted that, in the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and units involved are not necessarily required by the present invention.
The steps in the methods of the embodiments of the present invention may be reordered, combined and deleted according to actual needs.
The modules or units in the devices of the embodiments of the present invention may be combined, divided and deleted according to actual needs.
The modules or units described in the embodiments of the present invention may be implemented by a general-purpose integrated circuit, for example a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).
The idiomatic phrase acquisition method and device provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to set forth the principles and implementations of the present invention, and the descriptions of the above embodiments are only intended to help understand the method of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in specific implementations and application scope according to the idea of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (10)

1. An idiomatic phrase acquisition method, characterized by comprising:
if a voice signal uttered by a user is detected, obtaining speech data corresponding to the voice signal;
according to a preset voice byte threshold, filtering out from the speech data a number of target voice bytes corresponding to the voice byte threshold; and
parsing the target voice bytes and obtaining an analysis result comprising the user's idiomatic phrases.
2. the method for claim 1, is characterized in that, if the voice signal that the described user of detecting sends obtains the speech data that described voice signal is corresponding, comprising:
If the voice signal that user sends detected, obtain the voice attribute that described voice signal is corresponding;
Whether the voice attribute corresponding with preset speech samples matches to judge the voice attribute that described voice signal is corresponding, described speech samples is the sound clip of validated user, and described voice attribute comprises any one in word speed, intonation, tone color and frequency or multinomial;
If coupling, obtains the speech data that described voice signal is corresponding.
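The attribute comparison in claim 2 can be sketched as a tolerance check against the preset sample. This is an illustrative sketch, not the patent's implementation; the attribute keys and tolerance values are assumptions:

```python
def attributes_match(signal_attrs, sample_attrs, tolerances):
    """Claim 2 sketch: treat the signal as coming from the legitimate
    user only when every compared attribute (e.g. speech rate, pitch)
    lies within a tolerance of the preset voice sample's attribute."""
    return all(
        abs(signal_attrs[key] - sample_attrs[key]) <= tolerances[key]
        for key in sample_attrs
    )
```

Only when this returns true would the device proceed to acquire the voice data for idiom analysis.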
3. the method for claim 1, is characterized in that, the voice byte threshold value that described basis is default filters out the target voice byte of described voice byte threshold value corresponding number from described speech data, comprising:
According to default dead time interval, described speech data is carried out to segmentation, obtain sound bite, described speech data comprises at least one sound bite;
According to default voice byte threshold value, the voice byte that extracts described voice byte threshold value corresponding number from beginning or the ending of described sound bite is respectively as target voice byte.
4. The method according to claim 3, characterized by further comprising:
decrementing the voice byte threshold successively; and
repeating the step of extracting, from the beginning or the ending of each voice segment respectively, voice bytes of a number corresponding to the voice byte threshold as target voice bytes, until the voice byte threshold is zero.
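The segmentation and decreasing-threshold extraction of claims 3 and 4 can be sketched as follows, assuming the voice data has already been transcribed into timed tokens; all function names, the pause interval, and the token representation are illustrative assumptions, not from the patent:

```python
def segment_by_pause(tokens_with_times, pause_interval=0.5):
    """Claim 3 sketch: split a transcribed token stream into voice
    segments wherever the silence between consecutive tokens exceeds
    the preset pause interval (in seconds)."""
    segments, current, prev_end = [], [], None
    for token, start, end in tokens_with_times:
        if prev_end is not None and start - prev_end > pause_interval:
            segments.append(current)
            current = []
        current.append(token)
        prev_end = end
    if current:
        segments.append(current)
    return segments

def candidate_phrases(segments, byte_threshold):
    """Claims 3-4 sketch: take `byte_threshold` tokens from the
    beginning and the ending of each segment, then repeat with the
    threshold decremented successively until it reaches zero."""
    candidates = []
    for n in range(byte_threshold, 0, -1):  # decrement per claim 4
        for seg in segments:
            if len(seg) >= n:
                candidates.append(tuple(seg[:n]))   # segment beginning
                candidates.append(tuple(seg[-n:]))  # segment ending
    return candidates
```

Extracting from both segment boundaries at every threshold value captures habitual openers ("you know, ...") and trailing fillers ("..., right") of any length up to the initial threshold.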
5. the method for claim 1, is characterized in that, described described target voice byte is resolved, and obtains the phrasal analysis result that comprises described user, comprising:
Calculate the multiplicity of described target voice byte, and record described multiplicity;
Obtain described multiplicity if detect and reach the second default amount threshold, the idiom using described target voice byte as described user, and preserve described idiom.
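The repetition counting of claim 5 amounts to tallying the extracted candidates and keeping those that reach the quantity threshold. A minimal sketch, with an assumed threshold value and illustrative names:

```python
from collections import Counter

def find_idioms(candidates, quantity_threshold=3):
    """Claim 5 sketch: count how often each target voice-byte sequence
    repeats, and keep those reaching the preset quantity threshold as
    the user's idiomatic phrases."""
    counts = Counter(candidates)
    return {phrase: n for phrase, n in counts.items()
            if n >= quantity_threshold}
```

The returned mapping would then be saved as the analysis result containing the user's idiomatic phrases.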
6. An idiomatic phrase acquisition device, characterized by comprising:
a first acquiring unit, configured to, if a voice signal sent by a user is detected, acquire voice data corresponding to the voice signal;
a screening unit, configured to screen out, from the voice data acquired by the first acquiring unit, target voice bytes of a number corresponding to a preset voice byte threshold; and
a second acquiring unit, configured to parse the target voice bytes screened out by the screening unit, and obtain an analysis result comprising idiomatic phrases of the user.
7. The device according to claim 6, characterized in that the first acquiring unit comprises:
an information acquiring unit, configured to, if a voice signal sent by a user is detected, acquire a voice attribute corresponding to the voice signal;
a judging unit, configured to judge whether the voice attribute, corresponding to the voice signal, acquired by the information acquiring unit matches a voice attribute corresponding to a preset voice sample, wherein the voice sample is a sound clip of a legitimate user, and the voice attribute comprises any one or more of speech rate, intonation, timbre, and frequency; and
a data acquiring unit, configured to acquire the voice data corresponding to the voice signal when the judging result of the judging unit is a match.
8. The device according to claim 6, characterized in that the screening unit comprises:
a data segmenting unit, configured to segment the voice data according to a preset pause interval to obtain voice segments, wherein the voice data comprises at least one voice segment; and
a data extracting unit, configured to, according to the preset voice byte threshold, extract, from the beginning or the ending of each voice segment divided by the data segmenting unit respectively, voice bytes of a number corresponding to the voice byte threshold as target voice bytes.
9. The device according to claim 8, characterized by further comprising:
a control unit, configured to control the voice byte threshold to be decremented successively, and to notify the data extracting unit to extract, from the beginning or the ending of each voice segment respectively, voice bytes of a number corresponding to the voice byte threshold as target voice bytes, until the voice byte threshold is zero.
10. The device according to claim 6, characterized in that the second acquiring unit comprises:
a calculating unit, configured to calculate a repetition count of the target voice bytes and record the repetition count; and
an information storage unit, configured to, if it is detected that the repetition count reaches a preset second quantity threshold, take the target voice bytes as an idiomatic phrase of the user, and save the idiomatic phrase.
CN201410374537.4A 2014-07-31 2014-07-31 Idiomatic phrase acquisition method and device Active CN104157286B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410374537.4A CN104157286B (en) 2014-07-31 2014-07-31 Idiomatic phrase acquisition method and device


Publications (2)

Publication Number Publication Date
CN104157286A true CN104157286A (en) 2014-11-19
CN104157286B CN104157286B (en) 2017-12-29

Family

ID=51882769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410374537.4A Active CN104157286B (en) 2014-07-31 2014-07-31 Idiomatic phrase acquisition method and device

Country Status (1)

Country Link
CN (1) CN104157286B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105100711A (en) * 2015-07-08 2015-11-25 小米科技有限责任公司 Information-transmitting method and device
CN105553828A (en) * 2015-12-18 2016-05-04 合肥寰景信息技术有限公司 Intelligent voice warning method of network community voice service
CN105895088A (en) * 2016-05-27 2016-08-24 京东方科技集团股份有限公司 Intelligent wearable device and voice error correction system
CN107451437A (en) * 2016-05-31 2017-12-08 百度在线网络技术(北京)有限公司 The locking means and device of a kind of mobile terminal
CN107481737A (en) * 2017-08-28 2017-12-15 广东小天才科技有限公司 The method, apparatus and terminal device of a kind of voice monitoring
CN109119076A (en) * 2018-08-02 2019-01-01 重庆柚瓣家科技有限公司 A kind of old man user exchanges the collection system and method for habit
CN109285544A (en) * 2018-10-25 2019-01-29 江海洋 Speech monitoring system
CN110070858A (en) * 2019-05-05 2019-07-30 广东小天才科技有限公司 A kind of civilization term based reminding method, device and mobile device
CN110310623A (en) * 2017-09-20 2019-10-08 Oppo广东移动通信有限公司 Sample generating method, model training method, device, medium and electronic equipment
CN110473519A (en) * 2018-05-11 2019-11-19 北京国双科技有限公司 A kind of method of speech processing and device

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61250769A (en) * 1985-04-30 1986-11-07 Casio Comput Co Ltd System for registering dictionary information
US4831529A (en) * 1986-03-04 1989-05-16 Kabushiki Kaisha Toshiba Machine translation system
CN1713644A (en) * 2004-06-24 2005-12-28 乐金电子(中国)研究开发中心有限公司 Dialog context filtering method and apparatus
US20060106611A1 (en) * 2004-11-12 2006-05-18 Sophia Krasikov Devices and methods providing automated assistance for verbal communication
CN101552000A (en) * 2009-02-25 2009-10-07 北京派瑞根科技开发有限公司 Music similarity processing method
CN101794576A (en) * 2010-02-02 2010-08-04 重庆大学 Dirty word detection aid and using method thereof
CN102316200A (en) * 2010-07-07 2012-01-11 英业达股份有限公司 Ring tone regulation method for handheld electronic device and handheld electronic device using same
CN102592592A (en) * 2011-12-30 2012-07-18 深圳市车音网科技有限公司 Voice data extraction method and device
CN102915730A (en) * 2012-10-19 2013-02-06 东莞宇龙通信科技有限公司 Voice processing method and system
CN103065625A (en) * 2012-12-25 2013-04-24 广东欧珀移动通信有限公司 Method and device for adding digital voice tag
CN103389979A (en) * 2012-05-08 2013-11-13 腾讯科技(深圳)有限公司 System, device and method for recommending classification lexicon in input method
CN103516915A (en) * 2012-06-27 2014-01-15 百度在线网络技术(北京)有限公司 Method, system and device for replacing sensitive words in call process of mobile terminal
CN103678684A (en) * 2013-12-25 2014-03-26 沈阳美行科技有限公司 Chinese word segmentation method based on navigation information retrieval
CN103778110A (en) * 2012-10-25 2014-05-07 三星电子(中国)研发中心 Method and system for converting simplified Chinese characters into traditional Chinese characters
CN103903621A (en) * 2012-12-26 2014-07-02 联想(北京)有限公司 Method for voice recognition and electronic equipment


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LOUIS-PHILIPPE MORENCY ET AL.: "Utterance-Level Multimodal Sentiment Analysis", Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics *
LI Jie: "Pet Phrases: Categories, Mechanisms, and Functions", China Doctoral Dissertations Full-text Database, Philosophy and Humanities Volume *


Also Published As

Publication number Publication date
CN104157286B (en) 2017-12-29

Similar Documents

Publication Publication Date Title
CN104157286A (en) Idiomatic phrase acquisition method and device
CN104134439A (en) Method, device and system for obtaining idioms
US10637674B2 (en) System and method for real-time decoding and monitoring for encrypted instant messaging and other information exchange applications
Grant TXT 4N6: method, consistency, and distinctiveness in the analysis of SMS text messages
US10270736B2 (en) Account adding method, terminal, server, and computer storage medium
JP2020112778A (en) Wake-up method, device, facility and storage medium for voice interaction facility
CN105550298B (en) Keyword fuzzy matching method and device
CN107516526B (en) Sound source tracking and positioning method, device, equipment and computer readable storage medium
CN102970402A (en) Method and device for updating contact information of mobile terminal address book
CN105469789A (en) Voice information processing method and voice information processing terminal
CN106570370A (en) User identity identification method and device
CN105679357A (en) Mobile terminal and voiceprint identification-based recording method thereof
CN109491570B (en) Shorthand method, mobile terminal and storage medium
CN102902766A (en) Method and device for detecting words
CN113849852A (en) Privacy authority detection method and device, electronic equipment and storage medium
KR102166102B1 (en) Device and storage medium for protecting privacy information
CN114155860A (en) Abstract recording method and device, computer equipment and storage medium
CN106156022B (en) Information processing method and electronic equipment
CN110705282A (en) Keyword extraction method and device, storage medium and electronic equipment
CN112435671B (en) Intelligent voice control method and system for accurately recognizing Chinese
CN106657316B (en) Message withdrawing method and device
CN111785259A (en) Information processing method and device and electronic equipment
CN107566629A Information display method, terminal, and computer-readable medium
KR101565821B1 (en) Method of filtering message, user terminal performing the same and storage media storing the same
US9754582B2 (en) Identifying a contact

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210218

Address after: 518057 Desai Science and Technology Building, 9789 Shennan Avenue, Yuehai Street, Nanshan District, Shenzhen City, Guangdong Province, 17th Floor (15th Floor of Natural Floor) 1702-1703

Patentee after: Shenzhen Microphone Holdings Co.,Ltd.

Address before: 518040 21 floor, east block, Times Technology Building, 7028 Shennan Road, Futian District, Shenzhen, Guangdong.

Patentee before: DONGGUAN GOLDEX COMMUNICATION TECHNOLOGY Co.,Ltd.

TR01 Transfer of patent right