Videos tagged with Kernel Methods
For large-scale classification problems, the training samples can be clustered beforehand as a downsampling pre-processing step, so that only the resulting clusters are used for training. Motivated by this idea, we propose a classification algorithm, the Support Cluster Machine (SCM), within the learning framework introduced by Vapnik. For the SCM, a compatible kernel is adopted such that a similari...
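The clustering pre-process described in this abstract can be illustrated with a minimal sketch. It is not the SCM itself: the SCM's kernel and training procedure are not given in the abstract, so the code below only covers the downsampling step, using plain k-means run separately on each class. The function names `kmeans` and `downsample_by_class`, the per-class cluster count, and the choice of k-means are all assumptions made for illustration.

```python
import random


def _nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])),
    )


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Returns (centroids, sizes), where sizes[j] is the number of points
    assigned to centroids[j] in the final assignment.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # requires k <= len(points)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[_nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    sizes = [0] * k
    for p in points:
        sizes[_nearest(p, centroids)] += 1
    return centroids, sizes


def downsample_by_class(X, y, k_per_class):
    """Replace the samples of each class by k_per_class cluster summaries.

    Returns a list of (centroid, cluster_size, label) triples that a
    downstream classifier could be trained on instead of the raw samples.
    """
    clusters = []
    for label in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == label]
        centroids, sizes = kmeans(members, k_per_class)
        clusters.extend((c, s, label) for c, s in zip(centroids, sizes))
    return clusters
```

The cluster sizes are kept alongside the centroids because a classifier trained on cluster summaries may want to weight each cluster by how many original samples it represents; whether and how the SCM does this is not stated in the abstract.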
Graph kernels and applications in chemoinformatics
Several problems in chemistry can be formulated as classification or regression problems over molecules which, when represented by their planar structure, can be seen as labeled graphs. Several approaches have been proposed recently to define positive definite kernels over labeled graphs, paving the way to the use of powerful kernel methods in chemoinformatics. In this talk I will review some o...
Completion of biological networks : the output kernel trees approach
Elucidating biological networks is now one of the most important challenges in systems biology. Given the availability of various sources of data, machine learning has a major role to play here, with a spectrum of tools ranging from generative models to concept learning methods. In this work the focus is narrowed to the completion of biological interactions...
Learning and Charting Chemical Space with Strings and Graphs: Challenges and Opportunities for AI and Machine Learning
Informatics methods and computers have not yet become as pervasive in chemistry as they have in physics and biology. Drawing analogies from bioinformatics, key ingredients for progress in chemoinformatics are the availability of large, annotated databases of compounds and reactions, data structures and algorithms to efficiently search these databases, and computational methods to predict the ph...
Textual Entailment as Syntactic Graph Distance: a rule based and a SVM based approach
Lecture slides: Textual Entailment as Syntactic Graph Distance: a rule based and a SVM based approach Classifying Textual Entailment (TE) Recognizing Textual Entailment (TE) Graph Matching (GM) Textual Entailment as Graph Matching (GM) What’s next Extended Dependency Graph (XDG) GM on XDG: definitions Finding the bijective function and evaluating the measure Constituent Similarity Depende...
Large Scale Genomic Sequence Support Vector Machines
Lecture slides: Large Scale Genomic Sequence SVM Classifiers Roadmap Large Scale Problems Formally Support Vector Machine Multiple Kernel Learning Speeding up SVM training Derivation I Derivation II Technical Remark Algorithm Parallelization Multiple Kernel Learning Roadmap A Real World Large Scale Dataset Biology: Detection of Splice Sites Approach: Weighted Degree Kernel + SVM Efficient compu...
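The slides name a weighted degree kernel for splice site detection without defining it in this outline. As a hedged illustration, the sketch below implements one commonly used form of that kernel: it counts substrings of length 1 up to a maximum degree that match at the same position in two equal-length sequences, and weights each length d by beta_d = 2(D - d + 1) / (D(D + 1)). The function name, the default degree, and this particular weighting are assumptions, not details taken from the slides, and the direct loop shown here is not the efficient computation the talk discusses.

```python
def weighted_degree_kernel(x, y, degree=3):
    """Weighted degree kernel between two equal-length sequences.

    For each substring length d = 1..degree, counts the positions at which
    the length-d substrings of x and y are identical, and sums the counts
    with weights beta_d = 2 * (degree - d + 1) / (degree * (degree + 1)).
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    length = len(x)
    total = 0.0
    for d in range(1, degree + 1):
        beta = 2.0 * (degree - d + 1) / (degree * (degree + 1))
        matches = sum(
            1 for i in range(length - d + 1) if x[i:i + d] == y[i:i + d]
        )
        total += beta * matches
    return total
```

Because only substrings at identical positions are compared, the kernel is sensitive to where a motif occurs, which suits signals located at a fixed offset from a reference position in the sequence window.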
Some Aspects of Learning Rates for SVMs
We present some learning rates for support vector machine classification. In particular, we discuss a recently proposed geometric noise assumption that allows the approximation error for Gaussian RKHSs to be bounded. Furthermore, we show how a noise assumption proposed by Tsybakov can be used to obtain learning rates between 1/sqrt(n) and 1/n. Finally, we describe the influence of the approximation e...
Implementing SVM in an RDBMS: Improved Scalability and Usability
Lecture slides: Overview Data Mining in RDBMS Database Infrastructure SVM in the Database SVM and the ODM Infrastructure SVM Data Preparation Support SVM Scalability Issues Implemented Scalability Additional Implemented Working Set Selection Who to Retain? Who to Add? Stratified Sampling Small Model Generation Build Scalability Results Scoring Scalability Results SVM Scoring as a SQL Oper...
Robustness properties of support vector machines and related methods
The talk brings together methods from two disciplines: machine learning theory and robust statistics. We argue that robustness is an important aspect, and we show that many existing machine learning methods based on convex risk minimization have, besides other good properties, the advantage of being robust if the kernel and the loss function are chosen appropriately. Our results cover cla...
Large-scale parallel implementations of SVMs
Lecture slides: Related Work The Problem: Regular, plain SVM The Algorithm Engineering Engineering in practice Vector-type optimized Kernels Sorting the data set by labels Multi-threading on multi-processor machines Parallelization: Spread-Kernel: full data [2 nodes] Parallelization: Spread-Kernel: full data [p nodes] The network max( WorkingSet ) [p nodes] Parallelization: Spread-Kernel: split da...