Research Publications

 
Twitter: the world of 140 characters poses serious challenges to the efficacy of topic models on short, messy text. While topic models such as Latent Dirichlet Allocation (LDA) have a long history of successful application to news articles and ...
The 36th Annual ACM SIGIR Conference - July 2013
Lan Du, Wray Buntine, Mark Johnson
We present a new hierarchical Bayesian model for unsupervised topic segmentation. This new model integrates a point-wise boundary sampling algorithm used in Bayesian segmentation into a structured topic model that can capture a simple hierarchical ...
2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - June 2013
Shengbo Guo, Scott Sanner, Thore Graepel, Wray Buntine
We extend the Bayesian skill rating system of TrueSkill to accommodate score-based match outcomes. TrueSkill has proven to be a very effective algorithm for matchmaking (the process of pairing competitors based on similar skill level) in ...
European Conference on Machine Learning - September 2012
Lan Du, Wray Buntine, Huidong Jin
Topic models are increasingly being used for text analysis tasks, often replacing earlier semantic techniques such as latent semantic analysis. In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both ...
Empirical Methods in Natural Language Processing (EMNLP) - July 2012
Changyou Chen, Nan Ding, Wray Buntine
We develop dependent hierarchical normalized random measures and apply them to dynamic topic modeling. The dependency arises via superposition, subsampling and point transition on the underlying Poisson processes of these measures. ...
International Conference on Machine Learning (ICML) - June 2012
The two parameter Poisson-Dirichlet Process (PDP), a generalisation of the Dirichlet Process, is increasingly being used for probabilistic modelling in discrete areas such as language technology, bioinformatics, and image analysis. There is a rich ...
ARXIV - February 2012
Topic models have the potential to improve search and browsing by extracting useful semantic themes from web pages and other text documents. When learned topics are coherent and interpretable, they can be valuable for faceted browsing, results set ...
Neural Information Processing Systems (NIPS) - December 2011
Changyou Chen, Lan Du, Wray Buntine
Hierarchical modeling and reasoning are fundamental in machine intelligence, and for this the two-parameter Poisson-Dirichlet Process (PDP) plays an important role. The most popular MCMC sampling algorithm for the hierarchical PDP and hierarchical ...
European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD) - September 2011
Lan Du, Wray Buntine, Huidong Jin
Understanding how topics within a document evolve over its structure is an interesting and important problem. In this paper, we address this problem by presenting a novel variant of Latent Dirichlet Allocation (LDA): Sequential LDA (SeqLDA). This ...
IEEE International Conference on Data Mining (ICDM) - December 2010
James Petterson, Alex Smola, Tiberio Caetano, Wray Buntine, Shravan Narayanamurthy
We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side information in the distribution over words. This results in a variety of new capabilities, such as improved estimates for infrequently occurring words, as well as ...
Neural Information Processing Systems - December 2010
Novi Quadrianto, Wray Buntine
Linear Regression is an instance of the Regression problem which is an approach to modelling a functional relationship between input variables x and an output/response variable y. In linear regression, a linear function of the input variables is used,...
Encyclopedia of Machine Learning - December 2010
Novi Quadrianto, Wray Buntine
Regression is a fundamental problem in statistics and machine learning. In regression studies, we are typically interested in inferring a real-valued function (called a regression function) whose values correspond to the mean of a dependent (or ...
Encyclopedia of Machine Learning - December 2010
The two most important concepts used in Bayesian modelling are probability and utility. Probabilities are used to model our belief about the state of the world and utilities are used to model the value to us of different outcomes, thus to model costs ...
Encyclopedia of Machine Learning - December 2010
Wray Buntine, Lan Du, Petteri Nurmi
Exact Bayesian network inference exists for Gaussian and discrete distributions. For other kinds of distributions, approximations or restrictions on the kind of inference done are needed. In this paper we show how, using the two-parameter ...
The Fifth European Workshop on Probabilistic Graphical Models - September 2010
Lan Du, Wray Buntine, Huidong Jin
Documents come naturally with structure: a section contains paragraphs, which themselves contain sentences; a blog page contains a sequence of comments and links to related blogs. Structure, of course, implies something about shared topics. In this paper ...
Machine Learning Journal - July 2010
Novi Quadrianto, Kristian Kersting, Tinne Tuytelaars, Wray Buntine
Ideally, one would like to perform image search using an intuitive and friendly approach. Many existing image search engines, however, present users with sets of images arranged in some default order on the screen, typically the relevance to a query, ...
11th ACM SIGMM International Conference on Multimedia Information Retrieval - March 2010
Jukka Perkio, Tinne Tuytelaars, Wray Buntine
Recently there has been considerable interest in topic models based on the bag-of-features representation of images. The strong independence assumption inherent in the bag-of-features representation is not realistic, however: patches often ...
Eighth International Conference on Machine Learning and Applications (ICMLA'09) - December 2009
Novi Quadrianto, Kristian Kersting, Mark Reid, Tiberio Caetano, Wray Buntine
Quantile regression refers to the process of estimating the quantiles of a conditional distribution and has many important applications within econometrics and data mining, among other domains. In this paper, we show how to estimate these ...
IEEE International Conference on Data Mining (ICDM) - December 2009
Topic models are a discrete analogue to principal component analysis and independent component analysis that model topic at the word level within a document. They have many variants such as NMF, PLSI and LDA, and are used in many fields such as ...
Asian Conference on Machine Learning - November 2009
Wray Buntine, Marko Grobelnik, Dunja Mladenic, John Shawe-Taylor
The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) will take place in Bled, Slovenia, from September 7th to 11th, 2009. This event builds upon a very successful series of 19 ECML and 12 ...
- September 2009
Tinne Tuytelaars, Christoph Lampert, Matthew Blaschko, Wray Buntine
The goal of this paper is to evaluate and compare models and methods for learning to recognize basic entities in images in an unsupervised setting. In other words, we want to discover the objects present in the images by analyzing unlabeled data and ...
- June 2010
Topic models are a discrete analogue to principal component analysis and independent component analysis that model topic at the word level within a document. They have many variants such as NMF, PLSI and LDA, and are used in many fields such as genetics ...
NICTA - July 2009
This report is the background theory for Discrete Component Analysis software called DCA. Currently the software is run in stand-alone mode, and scavenges data streaming libraries and Dirichlet utilities from the older MPCA system. The software itself ...
NICTA - July 2009