CN103780625A - Method and device for discovering interest of users - Google Patents

Method and device for discovering interest of users

Publication number
CN103780625A
Authority
CN
China
Prior art keywords
access
network
behavioral data
user
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410038066.XA
Other languages
Chinese (zh)
Other versions
CN103780625B (en)
Inventor
汤传喜
郭奇
崔华
居胜峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN201410038066.XA priority Critical patent/CN103780625B/en
Publication of CN103780625A publication Critical patent/CN103780625A/en
Application granted granted Critical
Publication of CN103780625B publication Critical patent/CN103780625B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a method and device for discovering the interest of users. The method mainly includes the following steps: network access behavior data of a user are acquired; the field to which the network access behavior data belong is determined according to the entity words contained in the network access behavior data and the multiple preset entity words corresponding to each field; weighted values of the network access behavior data are calculated according to multi-dimensional attribute information corresponding to the network access behavior data; the attention of the user to the field to which the network access behavior data belong is determined according to the weighted values of the user's network access behavior data; and the interest of the user is recognized according to the attention to that field and a preset interest threshold value corresponding to the field, wherein the interest threshold value corresponding to a field is set according to the network access behavior data produced by multiple users in the network when accessing the field. With this technical scheme, the interest of users can be determined more accurately.

Description

User interest discovery method and device
Technical field
The present invention relates to the field of network access technology, and in particular to a user interest discovery method and a corresponding user interest discovery device.
Background art
Personalized information recommendation technology enables the network side to deliver information that matches a user's interests to that user, and can therefore effectively increase the click volume and reading volume of network resources. For this reason, personalized information recommendation is being applied more and more widely in network access.
In personalized information recommendation, discovering user interest accurately and promptly is a very important step.
Existing approaches to user interest discovery fall mainly into two kinds. One guides the user to actively inform the network side of his or her interests. The other discovers user interest automatically, that is, from the user's behavior information (the user's network access behavior data). This behavior information may include: information about web pages the user has browsed, keywords the user has searched, information about microblog posts the user has published, information about blog posts the user has published, goods the user has purchased, and so on.
At present, discovering user interest from behavior information is generally implemented as follows: when the user reads a document, a web page, or similar content, the field to which the content belongs is determined, and that field is taken as the user's interest. The multiple fields the user has touched may also be compared with one another, and the one or two fields the user has visited most are taken as the user's interest.
In the course of making the present invention, the inventors found that existing implementations of user interest discovery are prone to misjudgment. Two specific examples follow.
In the first example, the user's reading of certain content is sometimes affected by interfering factors, and an interest inferred under such interference is probably not the user's real interest. For example, if a field is a popular field, the user tends to have more opportunities to read content in it, but this does not mean that the user is really interested in the field. As another example, a pop-up window, a pushed message, or a misleading title may lead the user to browse some content, but that content does not express where the user's real interest lies.
In the second example, the user's reading may reflect only a shallow, temporary interest, and identifying that shallow, temporary interest as the user's real interest on the basis of the content read produces a misjudgment. For example, while watching a TV series, a user sometimes searches for an actor in the series and reads some introductory information about that actor. This reading behavior usually shows neither a high reading volume nor sustained recurrence, so identifying the user as interested in the actor on this basis, and pushing information about the actor to the user, is clearly inappropriate.
Summary of the invention
The object of the present invention is to overcome the technical problem of existing user interest discovery approaches by providing a user interest discovery method and a corresponding user interest discovery device. The technical problem to be solved is to determine user interest more accurately.
The object of the present invention, and the solution to its technical problem, can be achieved by the following technical scheme.
A user interest discovery method proposed according to the present invention comprises: collecting a user's network access behavior data; determining the field to which the network access behavior data belong according to the entity words contained in the network access behavior data and the multiple preset entity words corresponding to each field; calculating the weighted value of the network access behavior data according to the attribute information in multiple dimensions corresponding to the network access behavior data; determining the user's attention to the field to which the network access behavior data belong according to the weighted values of the user's network access behavior data; and identifying the user's interest according to the user's attention to that field and a preset interest threshold value corresponding to the field, wherein the interest threshold value corresponding to a field is set according to the network access behavior data produced by multiple users in the network when accessing the field.
A user interest discovery device provided according to an embodiment of the present invention comprises: an acquisition module, for collecting a user's network access behavior data; a field determination module, for determining the field to which the network access behavior data belong according to the entity words contained in the network access behavior data and the multiple preset entity words corresponding to each field; a weighted value module, for calculating the weighted value of the network access behavior data according to the attribute information in multiple dimensions corresponding to the network access behavior data; an attention module, for determining the user's attention to the field to which the network access behavior data belong according to the weighted values of the user's network access behavior data; and an interest identification module, for identifying the user's interest according to the user's attention to that field and a preset interest threshold value corresponding to the field, wherein the interest threshold value corresponding to a field is set according to the network access behavior data produced by multiple users in the network when accessing the field.
With the above technical scheme, the user interest discovery method and device provided by the present invention have at least the following advantages and beneficial effects. The embodiments of the present invention set the interest threshold value of a field from the network access behavior data produced by multiple users in the network when accessing that field, so that the threshold is based on the distribution of access behavior formed by many users accessing the field and is therefore a reasonable threshold. Measuring a single user's attention to a field against the interest threshold value set in this way avoids, as far as possible, the misjudgment that arises when user interest is determined only by comparing a single user's own network access behaviors with one another. As a result, the present invention can determine a user's interest in a field more accurately, and can more accurately deliver to the user content in which he or she is really interested.
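The threshold idea described above can be illustrated with a minimal sketch. The specification does not state how the threshold is derived from the multi-user data, so the use of the mean here is purely an assumption for illustration, as are the function names `field_threshold` and `interested_users`.

```python
from statistics import mean
from typing import Dict, Set


def field_threshold(attention_by_user: Dict[str, float]) -> float:
    """Derive one field's interest threshold from many users' attention values.

    Assumption: the mean is used as the threshold; the patent only says the
    threshold is set from multi-user access data for the field.
    """
    return mean(attention_by_user.values())


def interested_users(attention_by_user: Dict[str, float]) -> Set[str]:
    """Return the users whose attention to the field reaches the threshold."""
    threshold = field_threshold(attention_by_user)
    return {user for user, value in attention_by_user.items() if value >= threshold}
```

Because the threshold comes from the population rather than from one user's own history, a user who reads a popular field only occasionally stays below it even if that field happens to be the one he or she touches most.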
The above is only an overview of the technical scheme of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, preferred embodiments are described in detail below.
Brief description of the drawings
To illustrate the technical schemes of the embodiments of the present invention or of the prior art more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative work.
Fig. 1 is a flow chart of the user interest discovery method provided by an embodiment of the present invention;
Fig. 2 is a schematic framework diagram of the user interest discovery method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the user interest discovery device provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical schemes in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The embodiments described in the specification are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative work fall within the scope of protection of the present invention.
Embodiment 1: user interest discovery method. The flow and schematic of the method are shown in Fig. 1 and Fig. 2.
In Fig. 1, S100: collect the user's network access behavior data.
Specifically, the user's network access behavior data in this embodiment include: information about web pages the user has browsed, keywords the user has searched, information about microblog posts the user has published (such as at least one keyword extracted from a microblog post), information about blog posts the user has published (such as at least one keyword extracted from a blog post), information about goods the user has purchased, and so on. The network access behavior data may also include time information about the user's network access behavior, such as the time the user started the browser client, the time the user closed the browser client, the time the user logged in to the network, the time the user browsed a web page, the time the user searched a keyword, the time the user published a microblog post, the time the user published a blog post, and the time the user purchased goods. This time information can be used later to calculate the access frequency, the access interval, and the like.
This embodiment can use the browser client in the user's network terminal device to collect the user's network access behavior data. In one specific example, the browser client in the user's network terminal device can easily obtain information about the network access operations the user performs, that is, the user's network access behavior data. The browser client can then transmit the collected data to the corresponding network device (such as the network device hosting the browser server, or another device) according to a network device address preset in the client, so that the network device can conveniently collect the user's network access behavior data. It should be noted that when transmitting the user's network access behavior data, the browser client should also transmit its own identification information to the network device together with the data. The network device can then determine, from the browser client's identification information, which user the received data correspond to. In other words, in this embodiment a user can be represented by the identification information of a browser client.
The browser client can transmit the collected network access behavior data to the corresponding network device in real time, or it can transmit them on a schedule or at irregular times. For example, on every hour, the browser client transmits to the network device the locally stored network access behavior data produced by the user's network access operations during the previous hour and, after a successful transmission, deletes the locally stored data that were successfully transmitted. As another example, when the network access behavior data collected and stored locally by the browser client reach a predetermined amount (for example, when the storage space they occupy reaches a predetermined size), the browser client transmits all locally stored network access behavior data to the network device and, after a successful transmission, deletes the locally stored data that were successfully transmitted.
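The size-triggered upload described above could be sketched as follows. This is only an illustrative sketch: the class `BehaviorBuffer`, the `send` callback, and the record-count limit are hypothetical names and choices, not part of the specification, which also allows hourly and real-time transmission.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BehaviorBuffer:
    """Client-side buffer that uploads records once a size limit is reached."""
    client_id: str                             # identifies the browser client (stands in for the user)
    send: Callable[[str, List[dict]], bool]    # hypothetical uploader; returns True on success
    max_records: int = 3                       # assumed limit, counted in records rather than bytes
    records: List[dict] = field(default_factory=list)

    def add(self, record: dict) -> None:
        """Store a record locally and upload when the limit is reached."""
        self.records.append(record)
        if len(self.records) >= self.max_records:
            self.flush()

    def flush(self) -> None:
        """Upload all stored records; delete them only after a successful upload."""
        if not self.records:
            return
        batch = list(self.records)
        if self.send(self.client_id, batch):
            del self.records[:len(batch)]
```

An hourly upload would call `flush()` from a timer instead of from `add()`. In both cases the client identifier travels with the batch, and local data are kept if the transmission fails.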
This embodiment can also use an API (Application Programming Interface) to collect the user's network access behavior data from the network side. Collecting through an API lets this embodiment obtain more of the user's network access behavior data. For example, the API can be used to obtain data that were produced by the user's network access, and stored on the network side, before the browser client began reporting network access behavior data to the network device. In other words, the network access behavior data corresponding to operations the user performed with the browser client before the client was configured to collect and report such data can be collected through the API.
A first specific example of collecting the user's network access behavior data through an API is as follows. When the network device (such as the network device hosting the browser server) receives information transmitted by a browser client through its network terminal device, it immediately checks whether the received information contains login information for a microblog, a blog, or the like. If it does, the network device obtains the login account of the logged-in user from the login information and uses the API to obtain, from the corresponding server, the content the logged-in user has published with that account (such as blog posts or microblog posts). The network device then processes the obtained content, for example by extracting keywords, and thereby collects the network access behavior data of the user (that is, the user represented by the browser client identifier). The content obtained through the API is not limited to what the logged-in user publishes in the current session; it may also include content the logged-in user published with that account during a preceding period (such as the month before the current time).
A second specific example of collecting the user's network access behavior data through an API is as follows. At a preset time (such as the early morning of each day), the network device performs a centralized analysis of all information it received within a predetermined interval (such as 24 hours) from all browser clients through their network terminal devices, in order to identify the information that contains login information for a microblog, a blog, or the like. Then, according to the login information of the logged-in users contained in the identified information, the network device uses the API to obtain, from the corresponding servers (such as the servers of the microblog or blog services), the content each logged-in user has published with his or her account (such as blog posts or microblog posts). The network device then processes the obtained content, for example by extracting keywords, and thereby collects the network access behavior data of the user (that is, the user represented by the browser client identifier). The content obtained through the API is not limited to what the logged-in user publishes in the current session; it may also include content the logged-in user published with that account during a preceding period (such as the month before the current time).
It should be noted that, in the first and second specific examples above, if a network terminal device is used by several people, the information from the browser client in that device may contain the login information of several different logged-in users. In this case, this embodiment can treat the keywords in the content corresponding to all of those logged-in users as the network access behavior data of a single user (the user represented by the browser client identifier); that is, the logged-in users are not distinguished. Alternatively, this embodiment can take the keywords in the content corresponding to only one of those logged-in users as the network access behavior data of the user represented by the browser client; that is, the logged-in users are distinguished. For example, this embodiment can take the keywords in the content corresponding to the logged-in user with the most logins as the network access behavior data of the user represented by the browser client, and skip obtaining content and extracting keywords for the other logged-in users. In this way the logged-in user with the most logins is associated with the user represented by the browser client identifier.
Besides the browser client and API collection modes given above as examples, this embodiment can also obtain the user's network access behavior data in other ways. In addition, the user's network terminal device in this embodiment can be any device capable of network access, such as the user's computer, smartphone, or tablet.
S110: determine the field to which the network access behavior data collected in the preceding step belong, according to the entity words contained in the network access behavior data and the multiple preset entity words corresponding to each field.
Specifically, this embodiment can represent each field in advance as a vector composed of a series of entity words. For a piece of network access behavior data received by the network device, the network device can first compute a vector from the entity words contained in the data (one or more entity words) using a predetermined algorithm. It then measures the distance between this vector and the vector of each field using a predetermined distance function, and finally determines the field to which the received data belong from the measured distances (for example, by taking the nearest field as the field to which the data belong).
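The vector comparison described above could be sketched as follows. The specification does not name the vectorization algorithm or the distance function, so the bag-of-words counts and cosine distance used here are assumptions, and the field vocabularies in `FIELD_ENTITY_WORDS` are made-up examples.

```python
import math
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical entity-word lists; real fields would have far larger vocabularies.
FIELD_ENTITY_WORDS: Dict[str, List[str]] = {
    "sports": ["NBA", "basketball", "playoffs", "football"],
    "film_tv": ["actor", "drama", "episode", "director"],
}


def to_vector(words: List[str], vocabulary: List[str]) -> List[float]:
    """Count how often each vocabulary word occurs in `words`."""
    counts = Counter(words)
    return [float(counts[w]) for w in vocabulary]


def cosine_distance(a: List[float], b: List[float]) -> float:
    """1 - cosine similarity; 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def classify_field(entity_words: List[str],
                   fields: Dict[str, List[str]] = FIELD_ENTITY_WORDS) -> Optional[str]:
    """Return the nearest field, or None if no entity word is known."""
    vocabulary = sorted({w for ws in fields.values() for w in ws})
    record_vec = to_vector(entity_words, vocabulary)
    if not any(record_vec):
        return None
    distances = {name: cosine_distance(record_vec, to_vector(ws, vocabulary))
                 for name, ws in fields.items()}
    return min(distances, key=distances.get)
```

For instance, a record containing the entity words `NBA` and `playoffs` is nearest to the `sports` vector, while a record with no known entity word is left unclassified.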
This embodiment can also determine the field to which the collected network access behavior data belong in other ways, which are not enumerated here.
S120: calculate the weighted value of the network access behavior data according to the attribute information in multiple dimensions corresponding to the network access behavior data.
Specifically, the network access behavior data in this embodiment correspond to multiple dimensions (which may also be called statistical dimensions), and in each dimension there is corresponding attribute information. This attribute information does not represent an attribute that the network access behavior data inherently have in that dimension, but rather a temporary attribute that the data acquire in that dimension as a result of the user's access behavior.
In one specific example, the attribute information in multiple dimensions corresponding to the network access behavior data may include: the reach count of the field to which the data belong, the access frequency of the field to which the data belong, the access mode that produced the data, and the information quality of the content resource corresponding to the data.
In another specific example, the attribute information in multiple dimensions corresponding to the network access behavior data may include: the reach count of the field to which the data belong, the access interval of the field to which the data belong, the access mode that produced the data, and the information quality of the content resource corresponding to the data.
In yet another specific example, the attribute information in multiple dimensions corresponding to the network access behavior data may include: the reach count of the field to which the data belong, the access frequency of the field to which the data belong, the access interval of the field to which the data belong, the access mode that produced the data, and the information quality of the content resource corresponding to the data.
Here, the reach count of the field to which the network access behavior data belong represents the number of times the user has reached that field. That is, within one field, if all network access behavior data reaching the field are counted in sequence, the sequence count corresponding to a given piece of data is the reach count of the field for that piece of data. The reach count can be set by the network device.
The access frequency of the field to which the network access behavior data belong represents how frequently the user accesses that field. That is, within one field, if each piece of network access behavior data in the field is treated as one access by the user to the field, the frequency value obtained when a given piece of data is included, in real time, in the calculation of the access frequency for the field can be used as the access frequency of the field for that piece of data. The access frequency can be calculated and set by the network device. The reach count and the access frequency are related: the higher the reach count within a period, the higher the access frequency. In one specific example, if a user often reads NBA news, the reach count of the entity word NBA will be large, and the access frequency that the entity word NBA shows along the time dimension will also be high.
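A minimal sketch of how these two attributes might be computed when a new record arrives is shown below. The seven-day sliding window and the function name `reach_and_frequency` are assumptions; the specification does not say how the frequency is measured.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def reach_and_frequency(previous_visits: List[datetime],
                        now: datetime,
                        window: timedelta = timedelta(days=7)) -> Tuple[int, float]:
    """Return (reach count, accesses per day) for a record arriving at `now`.

    Assumption: frequency is measured over a sliding window ending at `now`.
    """
    # The new record is the next one in the sequence for this field.
    reach_count = len(previous_visits) + 1
    # Count earlier accesses inside the window, plus the current one.
    recent = sum(1 for t in previous_visits if now - t <= window) + 1
    frequency = recent / (window.total_seconds() / 86400)
    return reach_count, frequency
```

The reach count grows with every access, while the frequency falls again once older accesses drop out of the window, which is one way to reflect the relation between the two described above.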
The access mode that produced the network access behavior data refers to the specific way in which the user performed the network access that produced the data, for example whether the data were produced by active access (such as actively opening the browser client and entering a URL in the address bar to browse a web page, or actively searching a keyword and browsing the results) or by the user clicking content in a pop-up window or a pushed web page. The access mode can be set by the browser client and transmitted to the network device together with the network access behavior data.
The information quality of the content resource corresponding to the network access behavior data expresses, to some extent, how professional the content resource is. It can be determined from the way at least one high-end user of the field to which the content resource belongs has accessed that content resource. A high-end user here can be a user who has been determined to have an interest in the field (the field to which the received network access behavior data belong); such a user may also be called a senior user of the field. In one specific example, this embodiment can decide the value of the content resource on the information quality dimension from information such as whether the content resource has been accessed by one or more high-end users of the field and/or how many times it has been accessed by all high-end users of the field. The information quality can be set by the network device. Alternatively, a high-end user can be a user who has not only been determined to have an interest in the field but whose interest has also reached an enthusiast level. For example, if a user's attention to the field to which the content resource belongs reaches not only the corresponding interest threshold value but also a predetermined threshold that is higher than the interest threshold value of the field, the user is determined to be a high-end user of the field. As another example, if a user's attention to the field to which the content resource belongs reaches the corresponding interest threshold value and the user has also accessed a predetermined website, the user can be determined to be a high-end user; the predetermined website is generally a highly professional website.
The access interval of the field to which the network access behavior data belong represents the user's access interval for that field: across the user's multiple online sessions, it is the number of online sessions between the user's previous access to a field and the user's next access to that field. Online sessions can be counted by day (all of the user's sessions within one day count as one session), or in other units, for example by the number of times the user opens the browser client. The access interval can be calculated and set by the network device. In one specific example, a user went online on January 7 and accessed a content resource in the sports field; the user then did not go online again until January 10, when the user again accessed a content resource in the sports field. The access interval of the field for the corresponding network access behavior data can then be set to 1, rather than to the number of days between January 7 and January 10.
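The session-based interval described above can be sketched as follows, counting sessions by day as in the January example. The function name `access_interval` and its inputs are hypothetical.

```python
from datetime import date
from typing import List, Optional


def access_interval(online_days: List[date],
                    field_visit_days: List[date]) -> Optional[int]:
    """Online sessions (counted by day) between the last two visits to a field.

    Returns None when the field has been visited on fewer than two days.
    """
    visits = sorted(set(field_visit_days))
    if len(visits) < 2:
        return None
    # Every visit day is by definition also an online day.
    sessions = sorted(set(online_days) | set(visits))
    previous, latest = visits[-2], visits[-1]
    return sessions.index(latest) - sessions.index(previous)
```

With online days January 7 and January 10 and sports visits on both days, the function returns 1, matching the example above. Days on which the user was offline do not lengthen the interval, so a user who rarely goes online is not mistaken for one who has lost interest.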
This embodiment may set corresponding coefficients in advance for the different attribute information in all or some of the dimensions. For example, the coefficient set for active access is higher than the coefficient set for passive access; as another example, the coefficient set for the information quality of a content resource accessed by high-end users is higher than that set for the information quality of a content resource not accessed by high-end users. In this way, after determining the attribute information in the multiple dimensions corresponding to the network access behavior data, this embodiment can use each item of attribute information and its coefficient to calculate the weighted value of the network access behavior data. The calculation method may be chosen according to actual conditions and is not described in detail here.
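Since the calculation method is left open, the following is only one possible sketch. It covers just the two dimensions named in the examples above (access mode and information quality) and assumes, purely for illustration, that the weighted value is the product of the two coefficients. The coefficient values and all names are hypothetical.

```python
# Hypothetical coefficients: active access outranks passive access, and content
# accessed by high-end users outranks content that was not.
ACCESS_MODE_COEFF = {"active": 1.0, "passive": 0.4}
QUALITY_COEFF = {True: 1.0, False: 0.5}  # keyed by "accessed by a high-end user"

def weighted_value(access_mode, accessed_by_high_end_user):
    """Combine per-dimension coefficients into one weighted value (assumed: product)."""
    return ACCESS_MODE_COEFF[access_mode] * QUALITY_COEFF[accessed_by_high_end_user]

print(weighted_value("active", True) > weighted_value("passive", True))   # True
print(weighted_value("active", True) > weighted_value("active", False))  # True
```

Other dimensions such as reach count, access frequency and access interval could be folded in the same way, or a weighted sum could be used instead of a product.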
This embodiment may calculate the weighted value immediately when one piece of network access behavior data is received, or when several pieces are received at the same time, and store the calculated weighted value locally together with the network access behavior data and its attribute information in each dimension. This embodiment may also process received network access behavior data at regular or irregular intervals. For example, at every full hour the network device calculates weighted values for all received and locally stored network access behavior data whose weighted values have not yet been calculated and, after the calculation, stores each weighted value together with the corresponding network access behavior data and its attribute information in each dimension. As another example, when the locally stored network access behavior data reaches a predetermined quantity (for example, when the storage space it occupies reaches a predetermined size), the network device calculates weighted values for all locally stored network access behavior data whose weighted values have not yet been calculated and, after the calculation, stores each weighted value together with the corresponding network access behavior data and its attribute information in each dimension.
A user's network access behavior data, the attribute information in the multiple dimensions corresponding to that data, the calculated weighted values and so on may be stored together in the user's feature database (as shown in Fig. 2).
This embodiment may calculate the weighted value of network access behavior data in various ways; the specific implementation may be chosen according to the actual application and is not described in detail here.
S130, determine the user's attention to the field to which the network access behavior data belongs according to the weighted values of the user's network access behavior data.
Specifically, this embodiment may calculate the user's attention to the field in real time. That is, whenever the network device receives one piece of network access behavior data, or receives several pieces at the same time, it can immediately perform the attention calculation for that data and use the newly calculated attention to revise the user's attention to the field to which the data belongs (see "online processing" in Fig. 2, whose result is used to revise the information stored in the "feature database").
This embodiment may also calculate the user's attention to the field in a non-real-time (offline) manner. For example, in the early morning of each day the attention calculation is performed on the user's network access behavior data received the previous day; after the calculation, the newly calculated attention is used to revise the user's attention to the field to which each piece of network access behavior data belongs (see "offline processing" in Fig. 2, whose result is used to revise the information stored in the "feature database").
This embodiment may use the weighted values of the user's network access behavior data in various ways to calculate the user's attention to the field to which the data belongs; the specific implementation may be chosen according to actual conditions and is not described in detail here.
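The online and offline modes can share one update routine, as the following sketch suggests. The revision rule itself is not specified above; the sketch assumes, as one simple possibility, that a field's attention is the running sum of the weighted values of the records belonging to it. The names `revise_attention` and `attention` are hypothetical.

```python
from collections import defaultdict

def revise_attention(attention, records):
    """Revise per-field attention with new (field, weighted_value) records.

    Assumed rule (not from the source): attention is a running sum of weights.
    """
    for field, weight in records:
        attention[field] += weight
    return attention

attention = defaultdict(float)  # stands in for one user's entry in the feature database

# Online mode: revise as soon as a single record arrives.
revise_attention(attention, [("sports", 0.5)])

# Offline mode: revise once with the whole batch collected the previous day.
revise_attention(attention, [("sports", 1.0), ("pet fish", 2.0)])

print(dict(attention))  # {'sports': 1.5, 'pet fish': 2.0}
```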
S140, identify the user's interest according to the user's attention to the field to which the network access behavior data belongs and the preset interest threshold corresponding to that field.
Specifically, in this embodiment the preset interest threshold corresponding to a field is set according to the network access behavior data generated when multiple users in the network (for example, all users of the network) access content resources belonging to that field.
Because the way multiple users (for example, all users of the network) access a field reflects how differently that field is followed by different users, an interest threshold set from their access to the field reflects more accurately how users who are interested in the field actually access it. By using such an interest threshold to judge whether a user is interested in the field, this embodiment can make the judgment more accurate.
As a specific example, suppose there are two fields: the first field is one that everyone comes into contact with often (such as the NBA), and the second field is one that people do not come into contact with often (such as pet fish). User A's number of accesses to the first field often far exceeds user A's number of accesses to the second field, but this does not accurately indicate that the first field is where user A's interest lies. That is, if user A's interest were determined to be the first field by comparing user A's access counts for the two fields, the interest so determined would very likely not be user A's real interest. In practice, because multiple users (for example, all users of the network) have many opportunities to come into contact with the first field and few opportunities to come into contact with the second field, the interest threshold set for the first field according to those users' access to the two fields should be higher than the interest threshold set for the second field.
As a more specific example, the sports news field has a large volume of content updates and user A reads an average of 10 sports news articles per day, whereas the pet fish field has a small volume of content updates and user A reads an average of 2 pet fish articles per day. Judging from the access behavior of all users of the network, a user must read 20 sports news articles per day to count as interested in the sports news field, whereas a user who reads 2 pet fish articles per day already counts as interested in the pet fish field. User A's reading volume therefore reaches the threshold for the pet fish field but not the threshold for the sports news field.
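The comparison in this example can be written out as a short sketch. For illustration only, it uses the daily reading volume as a stand-in for the attention value and applies the "reaches or exceeds" rule of S140. The function name `interested_fields` is hypothetical.

```python
def interested_fields(attention, thresholds):
    """Return the fields whose attention reaches or exceeds the field's own threshold."""
    return sorted(f for f, a in attention.items() if a >= thresholds[f])

# User A's average daily reading volume, and thresholds derived from all users.
attention_a = {"sports news": 10, "pet fish": 2}
thresholds = {"sports news": 20, "pet fish": 2}

print(interested_fields(attention_a, thresholds))  # ['pet fish']
```

A direct comparison of 10 against 2 would have picked sports news; comparing each field against its own threshold picks pet fish instead.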
The distribution of network access by different users across different fields, and by different users within the same field, is shown in Table 1 and Table 2 below.
Table 1
In Table 1, "Total User 14560" is the number of users included in this statistic, Info "**" denotes the field **, User_num is the number of users who have accessed content resources in the field, and User_prop is the proportion of the users included in this statistic who have accessed content resources in the field. As a specific example, for the "Internet" field, User_prop = 13095/14560 = 0.899.
As Table 1 shows, differences in the amount of information (that is, the volume of information updates), in whether a field is a popular one, and in many other factors cause users' access to different fields to have different characteristics. It is therefore unreasonable to determine the fields a user is interested in by comparing that same user's access to different fields.
Table 2
Table 2 further details the "Internet" field of Table 1. User_num is the number of users who have accessed content resources in this field, User_prop is the proportion of the users included in this statistic who have accessed content resources in this field, Days is the number of days on which a user accessed the "Internet" field, pv is the user's reach count for the "Internet" field, and entity_num is the number of entity words contained in the content resources of the "Internet" field that the user accessed.
The data in Table 2 show that different users who have accessed the "Internet" field differ in their reach count for the field, in the number of entity words they reached, and in their access frequency for the field.
In this embodiment, a specific example of setting the interest threshold for each field in advance is as follows. The network access behavior data of multiple users in the network (for example, all users of the network) is collected at regular or irregular intervals (that is, obtained offline, as shown in the "offline processing" box in Fig. 2). For each piece of network access behavior data obtained, the entity words it contains are determined, and the field to which it belongs is determined from those entity words and the multiple entity words preset for each field. The weighted value of each piece of network access behavior data is then calculated from the attribute information in the multiple dimensions corresponding to it (the attribute information and the weighted value calculation are as described in S120 above). Then, for each field, the interest threshold is set according to the distribution of the weighted values of all network access behavior data in that field. For example, for one field, the weighted values of all network access behavior data belonging to the field can be plotted in a coordinate system, each weighted value being one point, and the points can be connected to form a polyline. The weighted values of users who are not interested in the field usually cluster in a relatively flat interval of the polyline, while the weighted values of users who are interested in the field usually cluster in another interval, which typically rises sharply relative to the flat interval. This embodiment can therefore determine the interest threshold for the field by finding the corresponding inflection point in the polyline, and can take the weighted value at that inflection point as the interest threshold for the field. A specific example of determining the inflection point is as follows: provided that a certain proportion of the weighted values has been covered, if the difference between the slopes of two adjacent line segments reaches a certain threshold, the intersection of those two segments can be determined to be the inflection point. This embodiment also allows the chosen inflection point to be adjusted manually.
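The inflection-point search described above can be sketched as follows. The sketch assumes that the weighted values are sorted in ascending order and plotted at unit spacing, so that the slope of each segment is simply the difference between neighboring values. The parameter names `min_coverage` and `slope_jump` and their default values are hypothetical.

```python
def find_interest_threshold(weights, min_coverage=0.5, slope_jump=1.0):
    """Return the weighted value at the first inflection point of the sorted polyline.

    An interior point qualifies once at least `min_coverage` of the values lie
    at or before it and the slope after it exceeds the slope before it by at
    least `slope_jump`. Returns None if no such point exists.
    """
    ys = sorted(weights)
    n = len(ys)
    for i in range(1, n - 1):
        if (i + 1) / n < min_coverage:
            continue  # not enough of the weighted values covered yet
        slope_before = ys[i] - ys[i - 1]
        slope_after = ys[i + 1] - ys[i]
        if slope_after - slope_before >= slope_jump:
            return ys[i]  # intersection of the two adjacent segments
    return None

# A flat run of weighted values followed by a sharp rise.
weights = [1.0, 1.1, 1.2, 1.3, 1.4, 3.0, 5.0, 7.0]
print(find_interest_threshold(weights))  # 1.4
```

A returned value of `None` signals that no sharp rise was found, in which case the threshold could be chosen manually, as the text permits.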
The interest threshold calculated for each field may be stored in the field distribution library shown in Fig. 2.
When this embodiment judges that the user's attention to the field to which the network access behavior data belongs reaches or exceeds the preset interest threshold for that field, it takes the field as one of the user's interests and recommends content resources matching that interest to the user. As shown in Fig. 2, the data stored in the "feature database" and the "field distribution library" serve as the input to the "personalized engine", which can then output content resources matching the user's interests, so that this embodiment can deliver content resources of interest to the user.
When a user has passive browsing habits, the user typically browses various headline news and content pushed by real-time pop-up windows. Such passive browsing causes the user to generate considerable network access across multiple fields. However, because these accesses are impromptu and random, the user's attention to the fields they touch is unlikely to reach the interest thresholds of those fields. The per-field interest thresholds that this embodiment sets on the basis of multiple users therefore prevent a field whose content the user browsed impromptu and at random from being identified as a field the user is interested in.
With the technical solution provided by this embodiment, the fields a user is interested in can be determined more accurately. Further, this embodiment can determine at a finer granularity the entity words a user is interested in. For example, the attribute information in the multiple dimensions corresponding to the network access behavior data may also include: the reach count, within the field to which the data belongs, of the entity words the data contains; the access frequency of those entity words within that field; and the access interval of those entity words within that field. These three items of attribute information relate to the entity words contained in the network access behavior data of a field, not to the field itself. As a specific example, the feature database shown in Fig. 2 records not only the user's many pieces of network access behavior data but also the attribute information for the field in the reach count, access frequency, access interval, access mode and information quality dimensions, together with the reach count, access frequency and access interval for the entity words in the field.
Based on this attribute information for entity words, when setting the interest threshold for a field this embodiment may further set an interest threshold for each entity word in the field. In this way, the interest thresholds of the entity words make it possible to determine more specific, finer-grained content within a field the user is interested in; moreover, even within a field the user is not interested in, it is possible to determine by comparison which content the user pays more attention to.
The interest threshold of an entity word is set in essentially the same way as the interest threshold of a field described above, and is not described in detail here. The interest thresholds set for the entity words in a field may also be stored in the field distribution library shown in Fig. 2.
It should be noted that, where interest thresholds have been set for entity words in advance, this embodiment, when setting the interest threshold for a field, should consider not only multiple users' attention to the field but may also take the interest thresholds of the entity words in the field as a reference factor. In addition, the access mode that generated the network access behavior data and the information quality of the corresponding content resource may be used both in setting interest thresholds for entity words and in identifying the entity words a user is interested in. That is, the access mode that generated the network access behavior data can serve as the access mode of the entity words the data contains, and the information quality of the corresponding content resource can serve as the information quality of those entity words.
After determining the fields and entity words the user is interested in, this embodiment may, when pushing content resources of interest to the user, refer to the entity words of interest within the fields of interest, and can thus deliver content resources that match the user's finer-grained interests.
When recommending content resources to a user, this embodiment may also take into account the value of each content resource in the candidate set in the information quality dimension. For example, when recommending content resources in a field the user is interested in, it may recommend those with a higher value in the information quality dimension. As a specific example, if a user is interested in the pet fish field (that is, the user is a senior user of that field), content resources with a higher value in the information quality dimension should be recommended when pet fish content is recommended to this user; this avoids recommending content that does not meet the user's actual needs, such as basic knowledge of ornamental fish keeping.
This embodiment may also deliver content resources to the user according to the user's current access scenario. As a specific example, after the network device receives a user's network access behavior data collected and transmitted by the browser client, it extracts entity words from the data and uses them to determine the field to which the data belongs. If the information stored in the user's feature database and in the field distribution library shows that the user is not interested in this field, the network device can look up in the feature database the attention corresponding to every entity word under this field, select the content resource (that is, the information source) corresponding to the entity word with the highest attention, and deliver that content resource to the user. Alternatively, when it determines that the user is not interested in this field, this embodiment may recommend content resources in the field that have a lower value in the information quality dimension. As a specific example, if a user is not interested in the pet fish field, then when pet fish content is recommended to this user according to the current access scenario, content resources with a lower value in the information quality dimension should be recommended, such as basic knowledge of ornamental fish keeping and introductory guides.
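For the case in which the user is not interested in the current field, selecting the entity word with the highest attention can be sketched as below. The dictionary layout and the names `pick_entity_word`, `content_for_uninterested_user` and `content_by_entity` are hypothetical; the feature database is represented simply as a mapping from entity word to attention.

```python
def pick_entity_word(entity_attention):
    """Return the entity word with the highest attention, or None if there is none."""
    if not entity_attention:
        return None
    return max(entity_attention, key=entity_attention.get)

def content_for_uninterested_user(entity_attention, content_by_entity):
    """Choose the content resource tied to the user's most-followed entity word."""
    word = pick_entity_word(entity_attention)
    return content_by_entity.get(word)

# Hypothetical per-entity attention for one user within the pet fish field.
entity_attention = {"goldfish": 0.3, "koi": 0.9}
content_by_entity = {"goldfish": "goldfish care basics", "koi": "koi pond guide"}
print(content_for_uninterested_user(entity_attention, content_by_entity))  # koi pond guide
```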
In addition, this embodiment may deliver corresponding content to the user according to the entity words contained in the user's network access behavior data.
Embodiment 2: a user interest discovery device, as shown in Fig. 3.
In Fig. 3, the device mainly comprises an acquisition module 300, a field determination module 310, a weighted value module 320, an attention module 330 and an interest identification module 340. The device may further comprise a threshold setting module 350 and a delivery module 360.
The acquisition module 300 is connected to the field determination module 310 and to the weighted value module 320. The acquisition module 300 is mainly used to collect the user's network access behavior data.
Specifically, the network access behavior data collected by the acquisition module 300 includes: information on web pages the user has browsed, keywords the user has searched for, information on microblog posts the user has published (for example, at least one keyword extracted from a microblog post), information on blog posts the user has published (for example, at least one keyword extracted from a blog post), information on goods the user has purchased, and so on. The data may also include time information on when the user performed the network access behavior, which can be used later to calculate the access frequency, the access interval and so on.
The acquisition module 300 may collect the user's network access behavior data through the browser client on the user's network terminal device, or through an API. When an API is used, the acquisition module 300 can obtain more of the user's network access behavior data. The acquisition module 300 may also obtain the data in ways other than the two examples above (browser client and API). Details are as described in the method embodiment above and are not repeated here.
The field determination module 310 is also connected to the attention module 330. The field determination module 310 is mainly used to determine the field to which the network access behavior data belongs according to the entity words contained in the data and the multiple entity words preset for each field.
Specifically, the field determination module 310 may represent each field in advance as a vector composed of a series of entity words. For network access behavior data received by the network device, the field determination module 310 first computes a vector from the entity words contained in the data (one or more entity words) using a predetermined algorithm. It then measures, using a predetermined distance function, the distance between the vector for the data and the vector for each field. Finally, it determines the field to which the data belongs from the measured distances (for example, by taking the nearest field as the field to which the data belongs).
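The predetermined algorithm and the predetermined distance function are not specified. As an illustration only, the sketch below assumes a bag-of-words count vector for the "predetermined algorithm" and cosine distance for the "predetermined distance function". The field vocabularies and all names are hypothetical.

```python
import math
from collections import Counter

def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors (1.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def nearest_field(entity_words, field_vectors):
    """Return the field whose entity-word vector is closest to the record's vector."""
    query = Counter(entity_words)
    return min(field_vectors, key=lambda f: cosine_distance(query, field_vectors[f]))

# Hypothetical preset entity words for two fields.
FIELD_VECTORS = {
    "sports": Counter(["NBA", "basketball", "playoffs"]),
    "pet fish": Counter(["goldfish", "aquarium", "koi"]),
}
print(nearest_field(["NBA", "playoffs"], FIELD_VECTORS))  # sports
```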
The weighted value module 320 is also connected to the attention module 330. The weighted value module 320 is mainly used to calculate the weighted value of the network access behavior data according to the attribute information in the multiple dimensions corresponding to that data.
Specifically, the network access behavior data in this embodiment corresponds to multiple dimensions (which may also be called statistical dimensions), and each dimension has corresponding attribute information. This attribute information is not an attribute intrinsic to the network access behavior data in that dimension, but a temporary attribute that the data acquires in that dimension as a result of the user's access behavior.
The specific parameters included in the attribute information in the multiple dimensions corresponding to the network access behavior data in this embodiment, and the meaning of each parameter, are as described in the method embodiment above.
The weighted value module 320 may set corresponding coefficients in advance for the different attribute information in all or some of the dimensions. For example, the coefficient set for active access is higher than the coefficient set for passive access; as another example, the coefficient set for the information quality of a content resource accessed by high-end users is higher than that set for the information quality of a content resource not accessed by high-end users. In this way, after determining the attribute information in the multiple dimensions corresponding to the network access behavior data, the weighted value module 320 can use each item of attribute information and its coefficient to calculate the weighted value of the network access behavior data. The weighted value module 320 may choose the calculation method according to actual conditions; the methods are not described in detail here.
The weighted value module 320 may calculate the weighted value immediately when the acquisition module 300 receives one piece of network access behavior data, or receives several pieces at the same time, and store the calculated weighted value locally together with the network access behavior data and its attribute information in each dimension. The weighted value module 320 may also process received network access behavior data at regular or irregular intervals. For example, at every full hour the weighted value module 320 calculates weighted values for all network access behavior data that the acquisition module 300 has received and stored locally and whose weighted values have not yet been calculated and, after the calculation, stores each weighted value together with the corresponding network access behavior data and its attribute information in each dimension. As another example, when the locally stored network access behavior data reaches a predetermined quantity (for example, when the storage space it occupies reaches a predetermined size), the weighted value module 320 calculates weighted values for all locally stored network access behavior data whose weighted values have not yet been calculated and, after the calculation, stores each weighted value together with the corresponding network access behavior data and its attribute information in each dimension.
The weighted value module 320 may calculate the weighted value of network access behavior data in various ways; the specific implementation may be chosen according to the actual application and is not described in detail here.
The attention module 330 is also connected to the interest identification module 340. The attention module 330 is mainly used to determine the user's attention to the field to which the network access behavior data belongs according to the weighted values of the user's network access behavior data.
Specifically, the attention module 330 may calculate the user's attention to the field in real time. That is, whenever the acquisition module 300 receives one piece of network access behavior data, or receives several pieces at the same time, the attention module 330 can immediately perform the attention calculation for that data and use the newly calculated attention to revise the user's attention to the field to which the data belongs.
The attention module 330 may also calculate the user's attention to the field in a non-real-time (offline) manner. For example, in the early morning of each day the attention module 330 performs the attention calculation on the user's network access behavior data that the acquisition module 300 collected the previous day; after the calculation, the attention module 330 uses the newly calculated attention to revise the user's attention to the field to which each piece of network access behavior data belongs.
The attention module 330 may use the weighted values of the user's network access behavior data in various ways to calculate the user's attention to the field to which the data belongs; the specific implementation may be chosen according to actual conditions and is not described in detail here.
The interest identification module 340 is also connected to the threshold setting module 350 and to the delivery module 360. The interest identification module 340 is mainly used to identify the user's interest according to the user's attention to the field to which the network access behavior data belongs and the preset interest threshold corresponding to that field.
Specifically, when the interest identification module 340 judges that the user's attention to the field to which the network access behavior data belongs reaches or exceeds the preset interest threshold for that field, it takes the field as one of the user's interests and causes the delivery module 360 to recommend content resources matching that interest to the user.
The threshold setting module 350 is mainly used to set the interest threshold for each field according to the distribution of the weighted values of the network access behavior data in that field.
Specifically, one example of how the threshold setting module 350 presets the interest threshold corresponding to a field is as follows. The acquisition module 300 collects, periodically or aperiodically, the network access behavior data of multiple users in the network (for example, all users of the network); this data may be obtained offline. For each piece of network access behavior data obtained, the field determination module 310 determines the entity words it contains, and then determines the field to which it belongs according to those entity words and the multiple preset entity words corresponding to each field. Next, the weighted value module 320 calculates the weighted value of each piece of network access behavior data according to its attribute information in multiple dimensions (the attribute information and the calculation of the weighted value are as described in S120 above). Then, for each field, the threshold setting module 350 sets the interest threshold corresponding to that field according to the distribution of the weighted values of all network access behavior data in the field. For example, for one field, the threshold setting module 350 may plot the weighted values of all network access behavior data belonging to the field in a coordinate system, with each weighted value as one point, and connect the points to form a polyline; it may then find the corresponding inflection point in the polyline and use the weighted value at that inflection point as the interest threshold corresponding to the field.
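The inflection-point search described above can be illustrated with a minimal sketch. The text does not state how the inflection point is located, so the rule used here is an assumption: the weighted values are sorted in descending order, and the point farthest from the straight line joining the first and last points of the polyline is taken as the inflection point. The function name `knee_threshold` is hypothetical.

```python
def knee_threshold(weights):
    """Return the weighted value at the 'inflection point' of the polyline.

    Assumption (not specified in the source): the inflection point is the
    point farthest from the chord joining the first and last points of the
    polyline formed by the weighted values sorted in descending order.
    """
    ys = sorted(weights, reverse=True)
    n = len(ys)
    if n < 3:
        # Too few points to form a bend; fall back to the smallest value.
        return ys[-1] if ys else None
    x0, y0, x1, y1 = 0, ys[0], n - 1, ys[-1]
    dx, dy = x1 - x0, y1 - y0
    best_i, best_d = 0, -1.0
    for i, y in enumerate(ys):
        # Unnormalised perpendicular distance from point (i, y) to the chord.
        d = abs(dy * (i - x0) - dx * (y - y0))
        if d > best_d:
            best_i, best_d = i, d
    return ys[best_i]


# Hypothetical weighted values of all access records in one field.
field_weights = [0.5, 0.6, 0.7, 0.8, 1.0, 5.0, 9.0]
print(knee_threshold(field_weights))
```

Other inflection-point rules (for example, the largest change in slope between adjacent segments) would fit the description equally well; the choice here is only for illustration.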
The issuing module 360 may also be used to issue corresponding content to the user according to the attention corresponding to each entity word in the field to which the user's network access behavior data, collected in real time, belongs.
Specifically, the interest identification module 340 may also determine, at a finer granularity, the entity words in which the user is interested. For example, the attribute information in multiple dimensions corresponding to the network access behavior data in this embodiment may further include: the reach count, within the field to which the network access behavior data belongs, of the entity words contained in the data; the access frequency of those entity words within that field; and the access interval of those entity words within that field. All three attributes are defined for the entity words contained in the network access behavior data in the field, rather than for the field to which the network access behavior data belongs.
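As a minimal sketch of the three entity-word attributes named above, the following computes them from one user's access records in a single field. The precise definitions are assumptions, since this passage does not give them: the reach count is taken as the number of accesses, the access frequency as accesses per day over an observation window, and the access interval as the number of days since the most recent access. The function name and record layout are hypothetical.

```python
from collections import defaultdict


def entity_word_attributes(records, now, window_days):
    """Compute per-entity-word attributes for one user in one field.

    records: iterable of (entity_word, day) pairs, where day is an integer
             day index (hypothetical layout).
    now: current day index.
    window_days: length of the observation window in days.
    """
    days_by_word = defaultdict(list)
    for word, day in records:
        days_by_word[word].append(day)

    attributes = {}
    for word, days in days_by_word.items():
        attributes[word] = {
            "reach_count": len(days),            # assumed: number of accesses
            "frequency": len(days) / window_days,  # assumed: accesses per day
            "interval": now - max(days),         # assumed: days since last access
        }
    return attributes


records = [("goldfish", 1), ("goldfish", 8), ("koi", 3)]
print(entity_word_attributes(records, now=10, window_days=10))
```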
Based on the above attribute information for entity words, when setting the interest threshold corresponding to a field, the threshold setting module 350 may further set an interest threshold for each entity word in the field. In this way, the interest identification module 340 can use the interest thresholds of the entity words to determine the more specific content that interests the user within a field of interest; moreover, even in a field in which the user is not interested, the issuing module 360 can determine by comparison which content the user pays relatively more attention to.
The threshold setting module 350 sets the interest threshold of an entity word in essentially the same way as it sets the interest threshold corresponding to a field, so the details are not repeated here.
After the user's fields of interest and entity words of interest have been determined, the issuing module 360, when pushing content resources of interest to the user, may refer to the entity words of interest within the user's fields of interest, and can thus issue content resources that match the user's finer-grained interests.
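The entity-word refinement described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the comparison rule (attention greater than or equal to the threshold), the function names, and the resource layout with `field` and `entity_words` keys are all hypothetical.

```python
def interested_entity_words(word_attention, word_thresholds):
    """Return the entity words whose attention reaches their interest threshold.

    Assumption: 'reaches' is implemented as >=; the source does not say
    whether the comparison is strict.
    """
    return {
        word
        for word, attention in word_attention.items()
        if word in word_thresholds and attention >= word_thresholds[word]
    }


def select_content(resources, interested_fields, interested_words):
    """Keep resources in an interested field that mention an interested entity word."""
    selected = []
    for res in resources:
        in_field = res["field"] in interested_fields
        has_word = bool(interested_words & set(res["entity_words"]))
        if in_field and has_word:
            selected.append(res["title"])
    return selected


words = interested_entity_words(
    {"goldfish": 7.0, "koi": 1.0},   # user's attention per entity word
    {"goldfish": 5.0, "koi": 3.0},   # preset thresholds per entity word
)
resources = [
    {"title": "Goldfish care", "field": "pet_fish", "entity_words": ["goldfish"]},
    {"title": "Koi ponds", "field": "pet_fish", "entity_words": ["koi"]},
    {"title": "Goldfish crackers", "field": "snacks", "entity_words": ["goldfish"]},
]
print(select_content(resources, {"pet_fish"}, words))
```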
When recommending content resources to the user, the issuing module 360 may also consider the value, in the information quality dimension, of each content resource in the set of content resources to be recommended. For example, when recommending content resources in a field in which the user is interested, the issuing module 360 may recommend content resources in that field that have a higher value in the information quality dimension.
Issue module 360 and can also issue corresponding content resource to user according to the current access scenario of user, a concrete example, acquisition module 300 receives browser client collection and transmits after the user's who comes access to netwoks behavioral data, determine that field module 310 extracts entity word from this access to netwoks behavioral data, and utilize the entity word extracting to judge the field that this access to netwoks behavioral data is affiliated, and then, in the time that interest identification module 340 is determined this user and is lost interest in this field according to canned data in feature database and field distribution library, issue module 360 and can in feature database, search attention rate corresponding to all entity words under this field, then, issue module 360 and choose the corresponding content resource of entity word (being information source) that attention rate is the highest, and this content resource is handed down to user, certainly, issue module 360 and also can, in the time that interest identification module 340 is determined this user and lost interest in this field, recommend the lower content resource of value in information quality dimension in some these fields to this user, a concrete example, if user loses interest in to pet fish field, issue module 360 in the time recommending the corresponding content resource in pet fish according to the current access scenario of user to this user, issue module 360 Ying Xiangqi and be recommended in the lower content resource of value in information quality dimension, as issue module 360 and recommend to user rudimentary knowledge and the related content such as introduction guidance etc. of the culture of ornamental fish.Issue the entity word that module 360 also can comprise according to user's access to netwoks behavioral data and issue corresponding content to user.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, because the device and system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, see the corresponding description of the method embodiments. The device and system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The user interest discovery method and device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, those of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementation and scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
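As a recap, the overall flow described above (determine the field from entity words, calculate a weighted value, accumulate attention, compare with the field's interest threshold) can be sketched end to end. Several details are assumptions made only for illustration: the field is chosen by the largest overlap with the preset entity words, the weighted value is a linear combination of attribute values with hypothetical coefficients, the attention is the sum of weighted values, and the comparison is >=. The calculation described in S120 may differ.

```python
def classify_field(entity_words, field_vocab):
    """Return the field whose preset entity words overlap most, or None."""
    best_field, best_hits = None, 0
    for field, vocab in field_vocab.items():
        hits = len(set(entity_words) & vocab)
        if hits > best_hits:
            best_field, best_hits = field, hits
    return best_field


def record_weight(attrs, coeffs):
    """Assumed weighting: linear combination of the attribute values."""
    return sum(coeffs[name] * attrs[name] for name in coeffs)


def discover_interests(records, field_vocab, coeffs, thresholds):
    """Return the set of fields whose attention reaches the interest threshold."""
    attention = {}
    for rec in records:
        field = classify_field(rec["entity_words"], field_vocab)
        if field is None:
            continue  # no preset entity word matched this record
        # Assumed aggregation: attention is the sum of weighted values.
        attention[field] = attention.get(field, 0.0) + record_weight(rec["attrs"], coeffs)
    return {f for f, a in attention.items() if a >= thresholds.get(f, float("inf"))}


field_vocab = {"pet_fish": {"goldfish", "koi"}, "cars": {"sedan", "suv"}}
coeffs = {"reach_count": 1.0, "quality": 2.0}      # hypothetical coefficients
thresholds = {"pet_fish": 8.0, "cars": 4.0}        # hypothetical thresholds
records = [
    {"entity_words": ["goldfish"], "attrs": {"reach_count": 3, "quality": 1}},
    {"entity_words": ["koi", "tank"], "attrs": {"reach_count": 1, "quality": 2}},
    {"entity_words": ["suv"], "attrs": {"reach_count": 1, "quality": 0}},
    {"entity_words": ["weather"], "attrs": {"reach_count": 9, "quality": 9}},
]
print(discover_interests(records, field_vocab, coeffs, thresholds))
```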

Claims (22)

1. A user interest discovery method, characterized by comprising:
collecting network access behavior data of a user;
determining the field to which the network access behavior data belongs according to the entity words contained in the network access behavior data and multiple preset entity words corresponding to each field;
calculating a weighted value of the network access behavior data according to attribute information in multiple dimensions corresponding to the network access behavior data;
determining the user's attention to the field to which the network access behavior data belongs according to the weighted value of the user's network access behavior data;
identifying the user's interest according to the user's attention to the field to which the network access behavior data belongs and a preset interest threshold corresponding to that field, wherein the interest threshold corresponding to the field is set according to the network access behavior data generated by multiple users in the network accessing the field.
2. the method for claim 1, is characterized in that, described collection user's access to netwoks behavioral data comprises:
The network-termination device that receives user transmits the user's who gathers by browser client who comes access to netwoks behavioral data; And/or
Gather user's access to netwoks behavioral data from network side by application programming interfaces API.
3. the method for claim 1, is characterized in that:
Attribute information in multiple dimensions corresponding to described access to netwoks behavioral data includes but not limited to: under access to netwoks behavioral data, field tactile reaches the information quality of visiting frequency, the access mode that produces described access to netwoks behavioral data and the corresponding content resource of access to netwoks behavioral data in field under number of times, access to netwoks behavioral data; Or
Attribute information in multiple dimensions corresponding to described access to netwoks behavioral data includes but not limited to: under access to netwoks behavioral data, field tactile reaches the information quality of access interval, the access mode that produces described access to netwoks behavioral data and the corresponding content resource of access to netwoks behavioral data in field under number of times, access to netwoks behavioral data; Or
Attribute information in multiple dimensions corresponding to described access to netwoks behavioral data includes but not limited to: under access to netwoks behavioral data, field tactile reaches the information quality of access interval, the access mode that produces described access to netwoks behavioral data and the corresponding content resource of access to netwoks behavioral data in field under the visiting frequency, access to netwoks behavioral data in field under number of times, access to netwoks behavioral data;
Wherein, described access mode comprises: initiatively access and propelling movement access.
4. The method of claim 3, characterized in that:
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs; or
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs and whose attention to the field to which the network resource belongs reaches a predetermined threshold, wherein the predetermined threshold is higher than the interest threshold corresponding to the field to which the network resource belongs; or
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs and who have accessed a predetermined website in the field to which the network resource belongs.
5. the method for claim 1, is characterized in that, interest threshold value corresponding to described field arranges by following manner:
Gather multiple users' access to netwoks behavioral data;
The affiliated field of each access to netwoks behavioral data difference is determined in the entity word comprising according to described multiple users' access to netwoks behavioral data and predefined each field respectively multiple entity words of correspondence;
Calculate the weighted value of each access to netwoks behavioral data according to the attribute information in multiple dimensions corresponding to described access to netwoks behavioral data;
According to the distribution of the weighted value of the access to netwoks behavioral data in each field, interest threshold value corresponding to each field difference is set.
6. The method of claim 5, characterized in that the step of setting the interest threshold corresponding to each field according to the distribution of the weighted values of the network access behavior data in each field further comprises:
for one field, determining a weighted value inflection point in the distribution of the weighted values of all network access behavior data belonging to the field, and setting the weighted value inflection point as the interest threshold corresponding to the field.
7. The method of claim 3, characterized in that the attribute information in multiple dimensions corresponding to the network access behavior data further comprises:
the reach count, within the field to which the network access behavior data belongs, of the entity words contained in the network access behavior data;
the access frequency, within the field to which the network access behavior data belongs, of the entity words contained in the network access behavior data;
the access interval, within the field to which the network access behavior data belongs, of the entity words contained in the network access behavior data.
8. The method of claim 7, characterized in that the method further comprises:
calculating the weighted value of the entity words in the network access behavior data according to the attribute information in multiple dimensions corresponding to the network access behavior data;
determining the user's attention to the entity words in the field to which the network access behavior data belongs according to the weighted value of the entity words in the network access behavior data;
identifying the user's interest according to the user's attention to the entity words in the field to which the network access behavior data belongs and the preset interest thresholds corresponding to the entity words in that field.
9. The method of claim 8, characterized in that the method further comprises:
issuing corresponding content to the user according to the attention corresponding to each entity word in the field to which the user's network access behavior data, collected in real time, belongs.
10. The method of any one of claims 1 to 9, characterized in that the method further comprises:
issuing corresponding content to the user according to the entity words contained in the user's network access behavior data.
11. The method of any one of claims 1 to 9, characterized in that:
the user's attention to the field to which the network access behavior data belongs and the user's interest are updated in real time according to the user's network access behavior data collected in real time; or
the user's attention to the field to which the network access behavior data belongs and the user's interest are updated periodically according to the collected network access behavior data of the user.
12. A user interest discovery device, characterized in that the device comprises:
an acquisition module, configured to collect network access behavior data of a user;
a field determination module, configured to determine the field to which the network access behavior data belongs according to the entity words contained in the network access behavior data and multiple preset entity words corresponding to each field;
a weighted value module, configured to calculate a weighted value of the network access behavior data according to attribute information in multiple dimensions corresponding to the network access behavior data;
an attention module, configured to determine the user's attention to the field to which the network access behavior data belongs according to the weighted value of the user's network access behavior data;
an interest identification module, configured to identify the user's interest according to the user's attention to the field to which the network access behavior data belongs and a preset interest threshold corresponding to that field;
wherein the interest threshold corresponding to the field is set according to the network access behavior data generated by multiple users in the network accessing the field.
13. The device of claim 12, characterized in that collecting the network access behavior data of the user comprises:
receiving the user's network access behavior data that is collected by a browser client and transmitted by the user's network terminal device; and/or
collecting the user's network access behavior data from the network side through an application programming interface (API).
14. The device of claim 12, characterized in that:
the attribute information in multiple dimensions corresponding to the network access behavior data includes but is not limited to: the reach count of the field to which the network access behavior data belongs, the access frequency of the field to which the network access behavior data belongs, the access mode that produced the network access behavior data, and the information quality of the content resource corresponding to the network access behavior data; or
the attribute information in multiple dimensions corresponding to the network access behavior data includes but is not limited to: the reach count of the field to which the network access behavior data belongs, the access interval of the field to which the network access behavior data belongs, the access mode that produced the network access behavior data, and the information quality of the content resource corresponding to the network access behavior data; or
the attribute information in multiple dimensions corresponding to the network access behavior data includes but is not limited to: the reach count of the field to which the network access behavior data belongs, the access frequency of the field to which the network access behavior data belongs, the access interval of the field to which the network access behavior data belongs, the access mode that produced the network access behavior data, and the information quality of the content resource corresponding to the network access behavior data;
wherein the access mode comprises: active access and push access.
15. The device of claim 14, characterized in that:
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs; or
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs and whose attention to the field to which the network resource belongs reaches a predetermined threshold, wherein the predetermined threshold is higher than the interest threshold corresponding to the field to which the network resource belongs; or
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs and who have accessed a predetermined website in the field to which the network resource belongs.
16. The device of claim 12, characterized in that the interest threshold corresponding to the field is set in the following manner:
collecting network access behavior data of multiple users;
determining the field to which each piece of network access behavior data belongs according to the entity words contained in the network access behavior data of the multiple users and the multiple preset entity words corresponding to each field;
calculating the weighted value of each piece of network access behavior data according to the attribute information in multiple dimensions corresponding to each piece of network access behavior data;
and the device further comprises:
a threshold setting module, configured to set the interest threshold corresponding to each field according to the distribution of the weighted values of the network access behavior data in each field.
17. The device of claim 16, characterized in that the step of setting the interest threshold corresponding to each field according to the distribution of the weighted values of the network access behavior data in each field comprises:
for one field, determining a weighted value inflection point in the distribution of the weighted values of all network access behavior data belonging to the field, and setting the weighted value inflection point as the interest threshold corresponding to the field.
18. The device of claim 14, characterized in that the attribute information in multiple dimensions corresponding to the network access behavior data further comprises:
the reach count, within the field to which the network access behavior data belongs, of the entity words contained in the network access behavior data;
the access frequency, within the field to which the network access behavior data belongs, of the entity words contained in the network access behavior data;
the access interval, within the field to which the network access behavior data belongs, of the entity words contained in the network access behavior data.
19. The device of claim 18, characterized in that:
the weighted value module is further configured to calculate the weighted value of the entity words in the network access behavior data according to the attribute information in multiple dimensions corresponding to the network access behavior data;
the attention module is further configured to determine the user's attention to the entity words in the field to which the network access behavior data belongs according to the weighted value of the entity words in the network access behavior data;
the interest identification module is further configured to identify the user's interest according to the user's attention to the entity words in the field to which the network access behavior data belongs and the preset interest thresholds corresponding to the entity words in that field.
20. The device of claim 19, characterized in that the device further comprises:
an issuing module, configured to issue corresponding content to the user according to the attention corresponding to each entity word in the field to which the user's network access behavior data, collected in real time, belongs.
21. The device of any one of claims 12 to 20, characterized in that the device further comprises:
an issuing module, configured to issue corresponding content to the user according to the entity words contained in the user's network access behavior data.
22. The device of any one of claims 12 to 20, characterized in that:
the user's attention to the field to which the network access behavior data belongs and the user's interest are updated in real time according to the user's network access behavior data collected in real time; or
the user's attention to the field to which the network access behavior data belongs and the user's interest are updated periodically according to the collected network access behavior data of the user.
CN201410038066.XA 2014-01-26 2014-01-26 User interest discovery method and apparatus Active CN103780625B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410038066.XA CN103780625B (en) 2014-01-26 2014-01-26 User interest discovery method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410038066.XA CN103780625B (en) 2014-01-26 2014-01-26 User interest discovery method and apparatus

Publications (2)

Publication Number Publication Date
CN103780625A true CN103780625A (en) 2014-05-07
CN103780625B CN103780625B (en) 2017-07-04

Family

ID=50572455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410038066.XA Active CN103780625B (en) 2014-01-26 2014-01-26 User interest discovery method and apparatus

Country Status (1)

Country Link
CN (1) CN103780625B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361063A (en) * 2014-11-04 2015-02-18 北京字节跳动网络技术有限公司 User interest discovering method and device
CN104991935A (en) * 2015-07-06 2015-10-21 无锡天脉聚源传媒科技有限公司 Website attention processing method and apparatus
CN105893407A (en) * 2015-11-12 2016-08-24 乐视云计算有限公司 Individual user portraying method and system
CN106202502A (en) * 2016-07-20 2016-12-07 福州大学 In music information network, user interest finds method
CN107358447A (en) * 2017-06-29 2017-11-17 安徽大学 A kind of personalized service recommendation method and system centered on service quality
WO2018152995A1 (en) * 2017-02-21 2018-08-30 中兴通讯股份有限公司 History management method and device
CN108769809A (en) * 2018-05-28 2018-11-06 成都市极米科技有限公司 Domestic consumer's behavioral data acquisition method, device and computer readable storage medium based on smart television

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6385619B1 (en) * 1999-01-08 2002-05-07 International Business Machines Corporation Automatic user interest profile generation from structured document access information
US20060224552A1 (en) * 2005-03-31 2006-10-05 Palo Alto Research Center Inc. Systems and methods for determining user interests
US20070239535A1 (en) * 2006-03-29 2007-10-11 Koran Joshua M Behavioral targeting system that generates user profiles for target objectives
CN101866341A (en) * 2009-04-17 2010-10-20 华为技术有限公司 Information push method, device and system
CN102402766A (en) * 2011-12-27 2012-04-04 纽海信息技术(上海)有限公司 User interest modeling method based on web page browsing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Tian Yunyan: "Intelligent Modeling of User Interest Based on Improved Hybrid Clustering Technology", China Master's Theses Full-text Database *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361063A (en) * 2014-11-04 2015-02-18 北京字节跳动网络技术有限公司 User interest discovering method and device
CN104361063B (en) * 2014-11-04 2018-03-16 北京字节跳动网络技术有限公司 user interest discovery method and device
CN104991935A (en) * 2015-07-06 2015-10-21 无锡天脉聚源传媒科技有限公司 Website attention processing method and apparatus
CN104991935B (en) * 2015-07-06 2019-03-12 无锡天脉聚源传媒科技有限公司 A kind for the treatment of method and apparatus of website attention rate
CN105893407A (en) * 2015-11-12 2016-08-24 乐视云计算有限公司 Individual user portraying method and system
CN106202502A (en) * 2016-07-20 2016-12-07 福州大学 In music information network, user interest finds method
CN106202502B (en) * 2016-07-20 2020-02-07 福州大学 User interest discovery method in music information network
WO2018152995A1 (en) * 2017-02-21 2018-08-30 中兴通讯股份有限公司 History management method and device
CN107358447A (en) * 2017-06-29 2017-11-17 安徽大学 A kind of personalized service recommendation method and system centered on service quality
CN107358447B (en) * 2017-06-29 2021-01-29 安徽大学 Personalized service recommendation method and system with service quality as center
CN108769809A (en) * 2018-05-28 2018-11-06 成都市极米科技有限公司 Domestic consumer's behavioral data acquisition method, device and computer readable storage medium based on smart television
CN108769809B (en) * 2018-05-28 2021-06-29 成都极米科技股份有限公司 Smart television-based home user behavior data acquisition method and device and computer-readable storage medium

Also Published As

Publication number Publication date
CN103780625B (en) 2017-07-04

Similar Documents

Publication Publication Date Title
CN103780625A (en) Method and device for discovering interest of users
US10572565B2 (en) User behavior models based on source domain
US11263217B2 (en) Method of and system for determining user-specific proportions of content for recommendation
CN103886090B (en) Content recommendation method and device based on user preferences
US20170013072A1 (en) Webpage pre-reading method, apparatus and smart terminal device
CN108363815B (en) Webpage pre-reading method and device and intelligent terminal equipment
CN102708174B (en) Method and device for displaying rich media information in browser
CN104462573A (en) Method and device for displaying video retrieval results
CN107222566A (en) Information-pushing method, device and server
US10402479B2 (en) Method, server, browser, and system for recommending text information
CN104216921B (en) A kind of addition reminding method, apparatus and system for realizing quick links in browser
CN104850546B (en) Display method and system of mobile media information
CN110163703B (en) Classification model establishing method, file pushing method and server
CN103888466A (en) User interest discovering method and device
CN105976161A (en) Time axis-based intelligent recommendation calendar and user-based presentation method
CN102340514A (en) Network information push method and system
CN112052387B (en) Content recommendation method, device and computer readable storage medium
CN110543598A (en) information recommendation method and device and terminal
CN105915956A (en) Video content recommendation method, device, server and system
CN103324645A (en) Method and device for recommending webpage
CN103455524A (en) Method and device for displaying and acquiring entry information
CN107295361A (en) A kind of content delivery method
CN104199872A (en) Information recommendation method and device
US20140214621A1 (en) Method and device for pushing information
US20200092611A1 (en) Method and system for determining a relevancy parameter for content item

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant