Monday, November 29, 2010

Carl Smith : Dec 1

"This week I plan to present topics from the first half (chapters 1-5) of Wainwright and Jordan's "Graphical Models, Exponential Families, and Variational Inference". I will emphasize the ideas of 1) conjugate duality between partition function and negative entropy, and 2) nonconvexity in mean field approaches to inference. I will present the following week on some combination of ideas from the second half of the same paper, related papers by Wainwright, and related stuff Liam and I have been working on, depending on time and what people are interested in after the first hour."

Friday, November 19, 2010

Getting started with Git

If you have a large shared software project and want an easy way to manage collaboration, branch off different versions of the project and generally keep things organized, you need a version control system. Git is the one I know (Mercurial and Subversion are also popular choices) and the one other people in the group are using, so it's the one I'll go over here.

Download Git from git-scm.com.
Set up an SSH public key and register an account at github.com (apologies to non-Mac users).
Enter the following at the terminal (with obvious substitutions):

export PATH=/usr/local/git/bin:$PATH   # adjust if Git was installed elsewhere
git config --global user.name "Your Name"
git config --global user.email your_email@example.com

To pull a repository called "repo" from user "person":

mkdir repo (or what have you)
cd repo
git init
git remote add origin git@github.com:person/repo.git
git pull origin master

In particular, every call of "git pull origin master" pulls the current master branch off github. Be sure you're working with the current version before trying to push local changes, or you might get merge conflicts. To commit changes, either call

git add files_that_changed
git commit

or

git commit -a

which commits all changed files. Committing brings up a text editor where you describe the changes you made. Be aware that if you leave the message empty, the commit will be aborted. Then to push your changes from your local machine to github, just type
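If you'd rather not deal with the editor at all, you can supply the commit message inline with the -m flag. Here's a sketch in a throwaway directory; the file name and message are made up for illustration:

```shell
cd "$(mktemp -d)"                # scratch directory so nothing real is touched
git init -q demo && cd demo
git config user.name "Your Name"             # local identity just for this demo repo
git config user.email your_email@example.com
echo "results" > analysis.txt
git add analysis.txt
git commit -m "add analysis results"   # -m gives the message inline, no editor opens
```

Because the message is supplied on the command line, the empty-message abort described above can't happen.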

git push

and that's it! Things get messier when working with branches, checkouts, and merges, but for 90% of what I do the above suffices. The web abounds with tutorials, and a quick reference sheet can be found here.
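To give a flavor of the messier branch-and-merge case, here is a minimal sketch in a throwaway repo. The branch name, file name, and messages are all invented for the example; the commands themselves are standard Git:

```shell
cd "$(mktemp -d)"                # scratch directory for the demo
git init -q demo && cd demo
git config user.name "Your Name"             # local identity for this demo repo only
git config user.email your_email@example.com
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "initial commit"
base=$(git symbolic-ref --short HEAD)  # usually "master"; newer Git may say "main"
git checkout -q -b experiment          # create a branch and switch to it
echo "new idea" >> notes.txt
git commit -q -a -m "try something on a branch"
git checkout -q "$base"                # back to the main line
git merge -q experiment                # fold the experiment back in
```

If the main line hasn't moved in the meantime, the merge is a simple fast-forward; if it has, Git will either merge the histories automatically or ask you to resolve conflicts by hand.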

Tuesday, November 16, 2010

David Pfau : Nov 17th


I'm presenting joint work with Frank Wood and Nicholas Bartlett on learning simple models for discrete sequence prediction.  We describe a novel Bayesian framework for learning probabilistic deterministic finite automata (PDFA), which are a class of simple generative models for sequences from a discrete alphabet.  We first define a prior over PDFA with a fixed number of states, and then show that this prior has a well-defined limit as the number of states becomes unbounded, a model we call a Probabilistic Deterministic Infinite Automaton (PDIA).  Inference is tractable with MCMC, and we show results from experiments with synthetic grammars, DNA, and natural language.  In particular, we find on complex data that averaging predictions over many MCMC samples leads to improved performance, and that the learned models perform as well as 3rd-order Markov models with about 1/10th as many states.  For the curious, a write-up of my work can be found here.

Also, following the talk I'm going to give a brief tutorial on git, a free version control system used in the software community for maintaining large collaborative code bases.  I'd like to set up a git repository for the Paninski group so we can avoid too much code duplication and build on each other's work, and I promise it's actually pretty easy once you learn the basics.

Monday, November 15, 2010

Scalable inference on regularized Dirichlets

Hi all,

I'm giving a brief (30-minute) talk about my recent work this Wednesday at noon in room 903 SSW. The abstract is below. I'll presumably give a fuller talk on the same material in our group meeting eventually, but in case you're looking for lunchtime entertainment...


P.S. This is a one-hour slot with two half-hour presenters, so if you come, you could wind up watching someone else first, or only!

Title: Tractable inference on regularized Dirichlet distributions: a scalable class of HMMs

Abstract: There is substantial interest in tractable inference on distributions over distributions, which are naturally confined to a simplex. Regularizing the Dirichlet distribution without compromising the tractability of inference would be useful for encoding prior knowledge of interactions among the components, for instance in topic models. I will present a class of regularized Dirichlet distributions that are in fact especially scalable hidden Markov models. The same framework also allows tractable exact inference on certain loopy graphs of the same type.

Wednesday, November 10, 2010

Kamiar Rahnama Rad and Chaitu Ekanadham: Nov 10

Chaitu will describe the major results and proofs that have followed the 2005 paper by Candès, which gives sufficient conditions for stable recovery of sparse signals from incomplete measurements.  Kamiar will pick up where he left off in his last presentation.