• / 37
  • 下载费用:10 金币  

stanford大学-大数据挖掘-PageRank-13.ppt

关 键 词:
stanford大学-大数据挖掘-PageRank-13.ppt
资源描述:
CS345 Data Mining,Link Analysis Algorithms Page Rank,Anand Rajaraman, Jeffrey D. Ullman,Link Analysis Algorithms,Page Rank Hubs and Authorities Topic-Specific Page Rank Spam Detection Algorithms Other interesting topics we won’t cover Detecting duplicates and mirrors Mining for communities,Ranking web pages,Web pages are not equally “important” www.joe-schmoe.com v www.stanford.edu Inlinks as votes www.stanford.edu has 23,400 inlinks www.joe-schmoe.com has 1 inlink Are all inlinks equal? Recursive question!,Simple recursive formulation,Each link’s vote is proportional to the importance of its source page If page P with importance x has n outlinks, each link gets x/n votes Page P’s own importance is the sum of the votes on its inlinks,Simple “flow” model,The web in 1839,y,a,m,y/2,y/2,a/2,a/2,m,y = y /2 + a /2 a = y /2 + m m = a /2,Solving the flow equations,3 equations, 3 unknowns, no constants No unique solution All solutions equivalent modulo scale factor Additional constraint forces uniqueness y+a+m = 1 y = 2/5, a = 2/5, m = 1/5 Gaussian elimination method works for small examples, but we need a better method for large graphs,Matrix formulation,Matrix M has one row and one column for each web page Suppose page j has n outlinks If j ! i, then Mij=1/n Else Mij=0 M is a column stochastic matrix Columns sum to 1 Suppose r is a vector with one entry per web page ri is the importance score of page i Call it the rank vector |r| = 1,Example,Suppose page j links to 3 pages, including i,r,Eigenvector formulation,The flow equations can be written r = Mr So the rank vector is an eigenvector of the stochastic web matrix In fact, its first or principal eigenvector, with corresponding eigenvalue 1,Example,,y 1/2 1/2 0 a 1/2 0 1 m 0 1/2 0,y a m,y = y /2 + a /2 a = y /2 + m m = a /2,Power Iteration method,Simple iterative scheme (aka relaxation) Suppose there are N web pages Initialize: r0 = [1/N,….,1/N]T Iterate: rk+1 = Mrk Stop when |rk+1 - rk|1  |x|1 = 1≤i≤N|xi| is t
展开阅读全文
  微传网所有资源均是用户自行上传分享,仅供网友学习交流,未经上传用户书面授权,请勿作他用。
0条评论

还可以输入200字符

暂无评论,赶快抢占沙发吧。

关于本文
本文标题:stanford大学-大数据挖掘-PageRank-13.ppt
链接地址:https://www.weizhuannet.com/p-6521618.html
微传网是一个办公文档、学习资料下载的在线文档分享平台!

微传网博客

网站资源均来自网络,如有侵权,请联系客服删除!

 网站客服QQ:80879498  会员QQ群:727456886

copyright@ 2018-2028 微传网络工作室版权所有

     经营许可证编号:冀ICP备18006529号-1 ,公安局备案号:13028102000124

收起
展开