Videos tagged with Web Mining
Lecture slides: A Crash Course in Minimal–Interval Semantics Why Minimal–Interval Semantics? How to compute it? You Can Do It Quickly! Can We Get Out Of The Boolean Tunnel? Wanna Try? Author: Sebastiano Vigna, University of Milano
Boosting Performance of Web Search Engines Using Query Logs
Lecture slides: Introduction What is a Query Log Our Logs What Does it Contain? How We Exploited Caching Policies Probability Driven Caching Static Dynamic Caching SDC Sections SDC and Prefetching Hit Ratio Throughput Collection Partitioning and Selection Innovations Contingency Matrix Co-clustering Results Experiments Precision Better Term Partitioning Why Term Partitioning? Encouraging Figure...
Ranking Web Sites with Real User Traffic
Lecture slides: Outline Sources for Ranking Data: The Link Graph Sources for Ranking Data: Dynamic Sources Sources for Ranking Data: Packet Inspection Data Collection Host graphs Structural properties: Degree Caveat: Sampling Bias Structural properties: Strength (Site Traffic) Structural properties: Weights (Link Traffic) Behavioral patterns (HUMAN) Ratios are stable Validation of PageRank Kend...
Graph Fibrations, graph isomorphism and PageRank
Lecture slides: Things related to PageRank Covering projections in algebraic topology Covering projections in modern mathematics From covering projections to fibrations My own personal relation with fibrations A graph is a graph is a graph... Graph morphisms Graph fibration A graph fibration is... A basic ingredient: universal total graph Basic property of universal total graphs Minimum base Ma...
Current Approaches to Personalized Web Search
Lecture slides: The Big Picture User Profile Elicitation Biased PageRank Direct Extensions to Biased PageRank Personalized PageRank Merging Personalized and Topic-Sensitive PageRank Output Filtering Re-Ranking Techniques Newer & Different Approaches The Future of Web Search Personalization Author: Paul-Alexandre Chirita, University of Hannover
Theoretical analysis of Link Analysis Ranking
Lecture slides: Link Analysis Ranking Why theoretical analysis of Link Analysis Ranking? Link Analysis Ranking algorithm Popular LAR algorithms Properties of Interest Distance between LAR vectors Stability: graph distance Stability Stability: Results Perturbations of PageRank Instability of PageRank Singular Value Decomposition Instability of HITS Stability of HITS Similarity Similarity: Result...
Using Rank Propagation and Probabilistic Counting for Link-based Spam Detection
Lecture slides: Content What is on the Web? Web spam (keywords + links) Web spam (mostly keywords) Search engine? Fake search engine Problem: “normal” pages that are spam Link farms Motivation Metrics Test collection Degree-based measures Degree Edge reciprocity Assortativity Automatic classifier PageRank Maximum PageRank in the Host Variance of PageRank Variance of PageRank of in-n...
Mixture Models and Collaborative Filtering Algorithms
Lecture slides: The Wisdom of Crowds Francis Galton visits a Country Fair (1906) Wisdom of Crowds on the Web Wisdom of Crowds on the Web: Google Mining the Internet People and Music Collaborative filtering Mixture model Biclustering Genes Bi-clustering algorithms Collaborative Filtering in Mixture Models A Portfolio of Iterative Biclustering Algorithms Tests on generated data Results for the dis...
Applications of Query Mining
Lecture slides: European Yahoo! Research Lab Yahoo! World Yahoo! Numbers Crawled Data Produced data Observed Data The power of social media Fight Spam My Motivations for Web Mining Mining Queries for ... Web Queries Relevance of the Context Context Using the Context Context in Web Queries User Goals Features Clustering Queries Our Approach Clusters Examples Query Recommendation Simple Query Rec...
Efficient and Decentralized PageRank Approximation in a P2P Web Search Network
Lecture slides: Outline Motivation Related Work JXP Algorithm World Node The Algorithm Example Peer Selection Strategy MIPs MIPs Example Mathematical Analysis Setup Overall performance comparison JXP in P2P Search Results Conclusions and Ongoing Work Author: Josiane Parreira, Max Planck Institute