Research Publications
buntine | 30 result(s)
Twitter: the world of 140 characters poses serious challenges to the efficacy of topic models on short, messy text. While topic models such as Latent Dirichlet Allocation (LDA) have a long history of successful application to news articles and ...

We present a new hierarchical Bayesian model for unsupervised topic segmentation. This new model integrates a point-wise boundary sampling algorithm used in Bayesian segmentation into a structured topic model that can capture a simple hierarchical ...

We extend the Bayesian skill rating system of TrueSkill to accommodate score-based match outcomes. TrueSkill has proven to be a very effective algorithm for matchmaking (the process of pairing competitors based on similar skill level) in ...

Topic models are increasingly being used for text analysis tasks, oftentimes replacing earlier semantic techniques such as latent semantic analysis. In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both ...

We develop dependent hierarchical normalized random measures and apply them to dynamic topic modeling. The dependency arises via superposition, subsampling and point transition on the underlying Poisson processes of these measures. ...

The two-parameter Poisson-Dirichlet Process (PDP), a generalisation of the Dirichlet Process, is increasingly being used for probabilistic modelling in discrete areas such as language technology, bioinformatics, and image analysis. There is a rich ...

Topic models have the potential to improve search and browsing by extracting useful semantic themes from web pages and other text documents. When learned topics are coherent and interpretable, they can be valuable for faceted browsing, results set ...

Hierarchical modeling and reasoning are fundamental in machine intelligence, and for this the two-parameter Poisson-Dirichlet Process (PDP) plays an important role. The most popular MCMC sampling algorithm for the hierarchical PDP and hierarchical ...

Understanding how topics within a document evolve over its structure is an interesting and important problem. In this paper, we address this problem by presenting a novel variant of Latent Dirichlet Allocation (LDA): Sequential LDA (SeqLDA). This ...

We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side information in the distribution over words. This results in a variety of new capabilities, such as improved estimates for infrequently occurring words, as well as ...

Linear regression is an instance of the regression problem, an approach to modelling a functional relationship between input variables x and an output/response variable y. In linear regression, a linear function of the input variables is used, ...

Regression is a fundamental problem in statistics and machine learning. In regression studies, we are typically interested in inferring a real-valued function (called a regression function) whose values correspond to the mean of a dependent (or ...

The two most important concepts used in Bayesian modelling are probability and utility. Probabilities are used to model our belief about the state of the world and utilities are used to model the value to us of different outcomes, thus to model costs ...

Exact Bayesian network inference exists for Gaussian and discrete distributions. For other kinds of distributions, approximations or restrictions on the kind of inference done are needed. In this paper we show how, using the two-parameter ...

Documents come naturally with structure: a section contains paragraphs, which themselves contain sentences; a blog page contains a sequence of comments and links to related blogs. Structure, of course, implies something about shared topics. In this paper ...

Ideally, one would like to perform image search using an intuitive and friendly approach. Many existing image search engines, however, present users with sets of images arranged in some default order on the screen, typically the relevance to a query, ...

Recently there has been considerable interest in topic models based on the bag-of-features representation of images. The strong independence assumption inherent in the bag-of-features representation is not realistic, however: patches often ...

Quantile regression refers to the process of estimating the quantiles of a conditional distribution and has many important applications within econometrics and data mining, among other domains. In this paper, we show how to estimate these ...

Topic models are a discrete analogue to principal component analysis and independent component analysis that model topic at the word level within a document. They have many variants such as NMF, PLSI and LDA, and are used in many fields such as ...

Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2009, Proceedings Part I
The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) will take place in Bled, Slovenia, from September 7th to 11th, 2009. This event builds upon a very successful series of 19 ECML and 12 ...

Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2009, Proceedings Part II
The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) will take place in Bled, Slovenia, from September 7th to 11th, 2009. This event builds upon a very successful series of 19 ECML and 12 ...

The goal of this paper is to evaluate and compare models and methods for learning to recognize basic entities in images in an unsupervised setting. In other words, we want to discover the objects present in the images by analyzing unlabeled data and ...

Topic models are a discrete analogue to principal component analysis and independent component analysis that model topic at the word level within a document. They have many variants such as NMF, PLSI and LDA, and are used in many fields such as genetics ...

This report is the background theory for Discrete Component Analysis software called DCA. Currently the software is run in stand-alone mode, and scavenges data streaming libraries and Dirichlet utilities from the older MPCA system. The software itself ...
