Tuesday, December 21, 2010

fast kalman filtering draft

hi all, any comments on this draft we just finished would be very welcome.

Wednesday, December 15, 2010

Carl Smith : Dec 15

This Wednesday I'll pick up where we left off last week when we covered graphical models, exponential families, and the basic ideas behind variational inference. This week I will go over variational inference in greater depth, and then describe some approximations to the variational problem that render it tractable: sum-product and the Bethe entropy approximation; mean field methods (time permitting); and convex approximations, in particular tree-reweighted belief propagation. The material is drawn from chapters 3, 4, 5, and 7 of the same paper.

Tuesday, December 7, 2010

Yashar Ahmadian : Dec 8

Designing optimal stimuli to control neuronal spike timing

We develop fast methods for optimal control of spike times by stimulating neurons. We adopt an approach based on models which describe how a stimulating agent (such as an injected electrical current, or a laser light interacting with caged neurotransmitters or photosensitive ion channels) affects the spiking activity of neurons. Based on these models, we solve the reverse problem of finding the best time-dependent modulation of the input, subject to hardware limitations as well as physiologically inspired safety measures, that makes the neuron emit a spike train which with highest probability will be close to a target spike train. We adopt fast convex constrained optimization methods to solve this problem. Our methods can potentially be implemented in real time and are also generalizable to the case of many cells, suitable for neural prosthesis applications. Using biologically sensible parameters and constraints, our method finds stimulation patterns that generate very precise spike trains in simulated experiments. We also tested the intracellular current injection method on pyramidal cells in mouse cortical slices, achieving sub-millisecond spike timing precision and high reliability with constrained currents.

Wednesday, December 1, 2010

cosyne abstracts

hey, our submitted cosyne abstracts are here - any feedback on any of these would be very welcome.

Monday, November 29, 2010

Carl Smith : Dec 1

"This week I plan to present topics from the first half (chapters 1-5) of Wainwright and Jordan's "Graphical Models, Exponential Families, and Variational Inference". I will emphasize the ideas of 1) conjugate duality between partition function and negative entropy, and 2) nonconvexity in mean field approaches to inference. I will present the following week on some combination of ideas from the second half of the same paper, related papers by Wainwright, and related stuff Liam and I have been working on, depending on time and what people are interested in after the first hour."

Friday, November 19, 2010

Getting started with Git

If you have a large shared software project and want an easy way to manage collaboration, branch off different versions of the project and generally keep things organized, you need a version control system. Git is the one I know (Mercurial and Subversion are also popular choices) and the one other people in the group are using, so it's the one I'll go over here.

Download git here: http://git-scm.com/download
Set up an ssh public key and register an account at github.com: http://help.github.com/mac-key-setup/ (apologies to non-Mac users)
Enter the following at the terminal (with obvious substitutions):

export PATH
git config --global user.name "Your Name"
git config --global user.email your.email@something.com

To pull a repository called "repo" from user "person":

mkdir repo (or what have you)
cd repo
git init
git remote add origin git@github.com:person/repo.git
git pull origin master

In particular, every call of "git pull origin master" pulls the current master version off github. Be sure you're working with the current version before trying to push local changes, or you might get conflicts. To commit changes, either call

git add files_that_changed
git commit

or

git commit -a

which commits all changed files. Committing brings up a text editor where you describe any changes made. Be aware, if you don't write anything the commit will be aborted. Then to push your changes from the local machine to github just type

git push

and that's it! Things get messier when working with branches, checkouts and merges, but for 90% of what I do the above suffices. The web abounds with tutorials and a quick reference sheet can be found here.

Tuesday, November 16, 2010

David Pfau : Nov 17th


I'm presenting joint work with Frank Wood and Nicholas Bartlett on learning simple models for discrete sequence prediction. We describe a novel Bayesian framework for learning probabilistic deterministic finite automata (PDFA), which are a class of simple generative models for sequences from a discrete alphabet. We first define a prior over PDFA with a fixed number of states, and then, by taking the limit as the number of states becomes unbounded, we show that the prior has a well-defined limit, a model we call a Probabilistic Deterministic Infinite Automata (PDIA). Inference is tractable with MCMC, and we show results from experiments with synthetic grammars, DNA and natural language. In particular, we find on complex data that averaging predictions over many MCMC samples leads to improved performance, and that the learned models perform as well as 3rd-order Markov models with about 1/10th as many states. For the curious, a write-up of my work can be found here.

Also, following the talk I'm going to give a brief tutorial on git, a free version control system used in the software community for maintaining large collaborative code bases. I'd like to set up a git repository for the Paninski group so we can avoid too much code duplication and build on each other's work, and I promise it's actually pretty easy once you learn the basics.

Monday, November 15, 2010

Scalable inference on regularized Dirichlets

Hi all,

I'm giving a brief (30-minute) talk about my recent work this Wednesday at noon in room 903 SSW. The abstract is below. I'll presumably be giving a fuller talk on the same eventually in our group meeting, but in case you're looking for lunchtime entertainment...


P.S. This is a one-hour situation with two half-hour presenters. So if you come, you could wind up watching someone else first, or only!

Title: Tractable inference on regularized Dirichlet distributions: a scalable class of HMM

Abstract: There is substantial interest in tractable inference on distributions of distributions, confined obviously to a simplex. Regularization of the Dirichlet distribution of random variables, without compromising tractability of inference, would be useful for encoding prior knowledge of interactions among the components, for instance in topic models. I will present a class of regularized Dirichlet distributions that are in fact especially scalable hidden Markov models. The same framework allows for tractable exact inference on certain loopy graphs of the same type.

Wednesday, November 10, 2010

Kamiar Rahnama Rad and Chaitu Ekanadham: Nov 10

Chaitu will describe the major results and proofs since the 2005 paper by Candes which gives sufficient conditions for stable recovery of sparse signals from incomplete measurements. Kamiar will continue from where he left off at his last presentation.

Thursday, October 21, 2010

Micky Vidne: October 27th. s(MC)^2 or Hesitant Particle Filter.

In my talk I will describe a recent extension of the Sequential Monte Carlo (SMC) method. SMCs (particle filters) are a commonly used method to estimate a latent dynamical process from sequential noise-contaminated observations. SMCs are extremely powerful but suffer from sample impoverishment, a situation in which very few different particles represent the distribution of interest. I will describe our attempt to circumvent this fundamental problem by adding an extra MCMC step in the SMC algorithm. I will illustrate the usefulness of this algorithm by considering a toy neuroscience example.
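To make the sample impoverishment problem concrete, here is a minimal sketch of a standard bootstrap particle filter. It is not the s(MC)^2 algorithm from the talk. The random-walk model, the noise levels q and r, and all function names are assumptions chosen for illustration. The effective sample size and the number of distinct particles surviving each resampling step are two common ways to see how few particles end up carrying the distribution.

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights; ranges from 1 to N."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def bootstrap_filter(observations, n_particles=500, q=0.1, r=0.5, seed=0):
    """Bootstrap filter for the assumed toy model
    x_t = x_{t-1} + N(0, q^2),  y_t = x_t + N(0, r^2)."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, size=n_particles)
    means, ess_trace, unique_trace = [], [], []
    for y in observations:
        # Propagate each particle through the latent dynamics.
        particles = particles + rng.normal(0.0, q, size=n_particles)
        # Weight by the observation likelihood (log domain for stability).
        log_w = -0.5 * ((y - particles) / r) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means.append(np.sum(w * particles))
        ess_trace.append(effective_sample_size(w))
        # Multinomial resampling duplicates heavy particles and drops light ones.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
        unique_trace.append(np.unique(idx).size)
    return np.array(means), np.array(ess_trace), np.array(unique_trace)

# Synthetic data from the same assumed model.
rng = np.random.default_rng(1)
true_x = np.cumsum(rng.normal(0.0, 0.1, size=50))
obs = true_x + rng.normal(0.0, 0.5, size=50)
means, ess, n_unique = bootstrap_filter(obs)
```

Plotting n_unique against time shows how many distinct ancestors survive each resampling step; an extra MCMC move after resampling, as described above, is one way to restore diversity among the duplicated particles.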

Wednesday, October 20, 2010

Kolia Sadeghi : Oct 20th

I'll be going over this paper on Deep Boltzmann Machines and adaptive MCMC starting with some background on Restricted Boltzmann Machines.  In passing I'll give a quick overview of related architectures used to learn temporal sequences.

Thursday, September 30, 2010

Parallel computing : matlab on the HPC cluster

I've improved the codes for the parallel computing, which I talked about at the seminar a month ago - it should be really simple to use now :). Also, the problem with the atomic operation, at least under Linux, is solved as well now. I've written up a description of the codes, commented them and compiled two examples: one to be run on a single machine with several copies of Matlab running in parallel and another is for the HPC cluster. Everything can be found here: http://neurotheory.columbia.edu/~max/codes/ParallelComputation.zip

If you have comments or suggestions, I will be happy to hear them! I will also be glad to help resolve problems, if they arise, or to explain the code, if needed. Also, if you start using the code, please let me know - it's always encouraging to know that the work goes to masses :).

Monday, September 27, 2010

Synapses with short-term plasticity are optimal estimators of presynaptic membrane potentials

possibly of interest - Synapses with short-term plasticity are optimal estimators of presynaptic membrane potentials

Chaitu Ekanadham : Sept. 29

Recovery of sparse transformation-invariant signals with continuous basis pursuit

We study the problem of signal decomposition where the signal is a noisy superposition of template features. Each template can occur multiple times in the signal, and associated with each instance is an unknown amount of transformation that the template undergoes. The templates and transformation types are assumed to be known, but the number of instances and the amount of transformation associated with each must be recovered from the signal. In this setting, current methods construct a dictionary containing several transformed copies of each template and employ approximate methods to solve a sparse linear inverse problem. We propose to use a set of basis functions that can interpolate the template under any small amount of transformation(s). Both the amplitude of the feature and the amount of transformation are encoded in the basis coefficients in a way depending on the interpolation scheme used. We construct a dictionary containing transformed copies of these basis functions, where the copies are spaced as far apart as the interpolation is accurate. The coefficients are obtained by solving a constrained sparse linear inverse problem where the sparsity penalty is applied across, but not within, these groups. We compare our method with standard basis pursuit on a sparse deconvolution task. We find that our method outperforms standard basis pursuit in that it yields sparser solutions while still having lower reconstruction error.
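As a rough illustration of how basis coefficients can encode both amplitude and transformation, here is a minimal sketch assuming a first-order Taylor interpolator for a time shift, f(t - tau) ≈ f(t) - tau f'(t). The Gaussian template, the single-instance setup, and the plain least-squares fit are assumptions for illustration only; the constrained sparse solver described above is not reproduced here.

```python
import numpy as np

t = np.linspace(-5.0, 5.0, 1001)

def template(t):
    """Hypothetical template: a unit-width Gaussian bump."""
    return np.exp(-0.5 * t ** 2)

def template_derivative(t):
    """Analytic time derivative of the template."""
    return -t * np.exp(-0.5 * t ** 2)

# One shifted, scaled instance of the template (noise-free for clarity).
true_amplitude, true_shift = 2.0, 0.1
signal = true_amplitude * template(t - true_shift)

# Fit the signal with the two interpolating basis functions f and f'.
basis = np.column_stack([template(t), template_derivative(t)])
coeffs, *_ = np.linalg.lstsq(basis, signal, rcond=None)

# Under the Taylor scheme, a*f(t - tau) ≈ a*f(t) - a*tau*f'(t),
# so the coefficients are approximately (a, -a*tau).
est_amplitude = coeffs[0]
est_shift = -coeffs[1] / coeffs[0]
```

The approximation degrades as the shift grows relative to the template width, which is why the dictionary copies can only be spaced as far apart as the interpolation stays accurate.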

Monday, September 20, 2010

Eizaburo Doi : Sept. 22

Title: Testing efficient coding for a complete and inhomogeneous neural population

The theory of efficient coding under the linear Gaussian model, originally formulated by Linsker (1989), Atick & Redlich (1990), and van Hateren (1992), is quite well-known.  However, its direct test with physiological data (a complete population of receptive fields) has been hampered in the past twenty years for two reasons:  a) There is no physiological data available.  b) The earlier models are too simplistic to compare with physiological data.

We resolve these two issues, and furthermore, we develop two novel methods to assess how the structures of the retinal transform match those of the theoretically derived, optimal transform.  The main conclusion of this study is that the retinal transform is at least 80% optimal, when evaluated with the linear-Gaussian model.

We also clarify the characteristics of the retinal transform that are and are not explained by the proposed model, and discuss the future directions and preliminary results along these lines.

This is joint work with Jeff Gauthier, Greg Field, Alexander Sher, John Shlens, Martin Greschner, Tim Machado, Keith Mathieson, Deborah Gunning, Alan Litke, Liam Paninski, EJ Chichilnisky, and Eero Simoncelli.

Monday, September 13, 2010

Ana Calabrese: Sept. 15

This Wednesday Ana will discuss a recent paper by Tkacik et al. on population coding by noisy spiking neurons, using maximum entropy models.

Here's a copy of the paper:


Sunday, August 22, 2010

Daniel Soudry : Sept. 1

The neuron as a population of ion channels - 
the emergence of stochastic and history dependent behavior.


The classic view of a neuron as a point element, combining a large number of small synaptic currents, and comparing the sum to a fixed threshold, is becoming more difficult to sustain given the plethora of non-linear regenerative processes known to take place in the soma, axon and even the dendritic tree. Since a common source for the complexity in the input, soma and output is the behavior of ionic channels, we propose a view of a neuron as a population of channels.

Analyzing the stochastic nature of ion channels using a recently developed mathematical model, we provide a rather general characterization of the input-output relation of the neuron, which admits a surprising level of analytic tractability.

The view developed provides a clear quantitative explanation of history-dependent effects in neurons and of the observed irregularity in firing. Interestingly, the present explanation of firing irregularity does not require a globally balanced state, but, rather, results from the intrinsic properties of a single neuron.

Saturday, August 21, 2010

Yashar Ahmadian : August 25th

Yashar will be continuing where he left off: Feynman diagrams and other goodies.

Thursday, August 19, 2010

Submit jobs to the HPC cluster from matlab

While we are talking about tools for using the HPC cluster, here's an ad for a tool of my own.

I have been using agricola to submit jobs to the HPC cluster from within matlab.  It is a very simple tool:  Instead of launching a calculation on your local machine by typing in the matlab prompt:

my_result = my_function( some_parameters ) ;

one types:

sow( 'my_result' , @()my_function( some_parameters ) ) ;

This will copy all the .m files in your current directory into a folder on the HPC submit machine, generate a submit file there, and launch the calculation on the cluster. Then some time later, when you suspect the job is done, you type:


which makes the variable  my_result  appear in your matlab workspace.  reap itself returns all the .out, .log, and .err files for you to look at from within matlab.

Unlike Max's code, agricola does not aim to parallelize your code; it just handles sending files back and forth with ssh and job submission.

Tuesday, August 17, 2010

Max Nikitchenko: Lab Meeting, Aug 18, 2010

I will try to cover two topics: multithreading with Matlab on HPC and new numeric methods for the density forward propagation.

In the first part (~15-20min), I'll briefly present Matlab code which allows easy and flexible multithreading for loops whose internal blocks are independent across different values of the loop variables. It should be useful in many computationally expensive optimization problems. The main problem here was to devise a method for locking a JobSubmit file which is used for communication between the main programs and the threads. Unfortunately, I have just discovered that the method I implemented is not 100% reliable. Still, the code works in most cases, and in the rare situations when the file-locking method fails it simply leads to duplicate computations.
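For readers unfamiliar with the file-locking problem, here is a minimal sketch of one common approach: creating a lock file with an exclusive-create flag, so that only one process can succeed. This is written in Python rather than Matlab and is not the method implemented in the code discussed above. The lock file name is hypothetical, and exclusive creation is generally atomic on local file systems but may not be on every network file system.

```python
import os
import tempfile

def try_acquire_lock(lock_path):
    """Return True if we created the lock file, False if someone else holds it."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so at most one
        # process can get past this call while the lock file is present.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True

def release_lock(lock_path):
    """Remove the lock file so another worker can acquire it."""
    os.remove(lock_path)

# Demonstration in a temporary directory (hypothetical lock file name).
lock_dir = tempfile.mkdtemp()
lock_path = os.path.join(lock_dir, "JobSubmit.lock")
first = try_acquire_lock(lock_path)    # lock is free
second = try_acquire_lock(lock_path)   # lock is already held
release_lock(lock_path)
third = try_acquire_lock(lock_path)    # lock is free again
release_lock(lock_path)
```

A worker that fails to acquire the lock would typically wait briefly and retry before reading or updating the shared job file.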

The second part will be on numeric methods for the forward propagation. In recent years a number of articles have been published which focus on methods for solving the Fokker-Planck equation for the associated stochastic integrate-and-fire model. We develop a new method for the numerical estimation of the forward propagation density by computing it via direct quadratic convolution on a dynamic adaptive grid. This method allows us to significantly improve the accuracy of the computations, to avoid treating the extreme cases as such, and to improve (or at least preserve) the speed of the computation in comparison to other methods. We also found that below some value of the time step of the numeric propagation the solution becomes unstable. By treating the density as distributed across the bins rather than concentrated at the bin centers, we derive a simple condition for the stability of the method. Interestingly, the condition we derive binds linearly the temporal and spatial resolutions - contrary to the well-known Courant stability condition for the Fokker-Planck equation. We further improve the speed of the method by combining it with the fast Gauss transform.

Tuesday, August 10, 2010

Yashar Ahmadian : August 11th

Yashar will be presenting preliminary work on applying random matrix theory to the study of transient dynamics in a non-normal linear neural network. 


The project is a collaboration with Ken Miller, and is motivated by his work on non-normal dynamics and transient amplification due to non-normality. I will give a brief background on this work (see this paper: Balanced amplification: a new mechanism of selective amplification of neural activity patterns by B.K. Murphy and K.D. Miller), and then give an exposition of the diagrammatic method for calculating averages over a random (Hermitian N x N) matrix ensemble in the large N limit.

As an example, I will present how to derive the semi-circular law for Gaussian Hermitian matrices. 
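As a numerical companion to the derivation, here is a minimal sketch that samples one Gaussian Hermitian matrix and compares its eigenvalue histogram to the semicircle density sqrt(4 - x^2) / (2 pi) on [-2, 2]. The normalization (unit-variance entries, eigenvalues scaled by 1/sqrt(N)), the matrix size, and the function names are choices made here for illustration and are not taken from the talk.

```python
import numpy as np

def sample_gue_eigenvalues(n, seed=0):
    """Eigenvalues of H / sqrt(n), where H is Hermitian with E|H_ij|^2 = 1."""
    rng = np.random.default_rng(seed)
    a = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2)
    h = (a + a.conj().T) / np.sqrt(2)   # Hermitian by construction
    return np.linalg.eigvalsh(h / np.sqrt(n))

def semicircle_density(x):
    """Wigner semicircle density supported on [-2, 2]."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.clip(4.0 - x ** 2, 0.0, None)) / (2.0 * np.pi)

eigs = sample_gue_eigenvalues(400)

# Empirical density on a fixed grid versus the semicircle prediction.
hist, edges = np.histogram(eigs, bins=20, range=(-2.2, 2.2), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_abs_error = np.max(np.abs(hist - semicircle_density(centers)))

# The semicircle moments are Catalan numbers: E[x^2] = 1 and E[x^4] = 2.
second_moment = np.mean(eigs ** 2)
fourth_moment = np.mean(eigs ** 4)
```

The even moments of the semicircle are the Catalan numbers, which is the quantity the diagrammatic expansion counts at leading order in 1/N.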

Finally, I will discuss how one can extend the method to cover the non-normal case, and I will derive a formula for the spectral density in the large N limit.

Monday, August 9, 2010

Optimal experimental design for sampling voltage on dendritic trees

Here is a link to a draft of the paper that came out of my research with Liam this summer:

We are looking for feedback, so if the abstract below piques your interest please take a look at the paper and let us know what you think.

Due to the limitations of current voltage sensing techniques, optimal filtering of noisy, undersampled voltage signals on dendritic trees is a key problem in computational cellular neuroscience. These limitations lead to two sources of difficulty: 1) voltage data is incomplete (in the sense of only capturing a small portion of the full spatiotemporal signal) and 2) these data are available in only limited quantities for a single neuron. In this paper we use a Kalman filtering framework to develop optimal experimental design for voltage sampling. Our approach is to use a simple greedy algorithm with lazy evaluation to minimize the expected mean-square error of the estimated spatiotemporal voltage signal. We take advantage of some particular features of the dendritic filtering problem to efficiently calculate the estimator covariance by approximating it as a low-rank perturbation to the steady-state (zero-SNR) solution. We test our framework with simulations of real dendritic branching structures and compare the quality of both time-invariant and time-varying sampling schemes. The lazy evaluation proved critical to making the optimization tractable. In the time-invariant case improvements ranged from 30-100% over simpler methods, with larger gains for smaller numbers of observations. Allowing for time-dependent sampling produced up to an additional 30% improvement.
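To show what lazy evaluation buys in a greedy selection, here is a minimal sketch on a toy weighted-coverage objective. The objective is only a stand-in for the Kalman-based mean-square error in the paper, and the candidate sets, weights, and function names are made up for illustration. The idea is that, when marginal gains can only shrink as the selection grows, a stale gain is an upper bound, so most candidates never need to be re-evaluated.

```python
import heapq

def coverage_gain(covered, candidate_cover, weights):
    """Marginal gain of a candidate: total weight of newly covered items."""
    return sum(weights[i] for i in candidate_cover - covered)

def lazy_greedy(candidates, weights, k):
    """Greedy selection of k candidates with lazy re-evaluation of gains.

    candidates: dict mapping name -> set of covered item indices.
    Returns the chosen names, the covered items, and the evaluation count.
    """
    covered, chosen = set(), []
    # Max-heap via negated gains; the third field records how many items had
    # been chosen when the gain was computed.
    heap = [(-coverage_gain(covered, c, weights), name, 0)
            for name, c in candidates.items()]
    heapq.heapify(heap)
    evaluations = len(heap)
    while heap and len(chosen) < k:
        neg_gain, name, stamp = heapq.heappop(heap)
        if stamp == len(chosen):
            # The gain is up to date and beats every (stale) upper bound.
            chosen.append(name)
            covered |= candidates[name]
        else:
            # Stale gain: recompute it and push the candidate back.
            gain = coverage_gain(covered, candidates[name], weights)
            evaluations += 1
            heapq.heappush(heap, (-gain, name, len(chosen)))
    return chosen, covered, evaluations

weights = {0: 5.0, 1: 1.0, 2: 3.0, 3: 2.0, 4: 4.0, 5: 1.5}
candidates = {"a": {0, 1}, "b": {0, 2, 3}, "c": {4}, "d": {3, 4, 5}, "e": {1, 5}}
chosen, covered, evaluations = lazy_greedy(candidates, weights, k=2)
```

A plain greedy pass would re-evaluate every remaining candidate in each round; the lazy version only re-evaluates candidates that reach the top of the heap with an out-of-date gain.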

Thursday, July 29, 2010

Kolia Sadeghi : August 4


I will attempt to take the latest paper on the deterministic particle flow filter discussed in a previous blog post, and strip it down to the essentials.  The authors present a more general, stable and improved version of their previous deterministic particle flow filter, supposedly. This paper is rife with ideas and peculiarly written; for a gentler introduction, please refer to the papers linked to in the previous blog post. Here is the paper:

Exact particle flow for nonlinear filters by Fred Daum, Jim Huang and Arjang Noushin,
Numerical experiments for nonlinear filters with exact particle flow induced by log-homotopy (companion paper)

Friday, July 23, 2010

Deterministic particle filtering

No resampling, rejection, or importance sampling is used. Particles are propagated through time by numerically integrating an ODE. The method is very similar in spirit to Jascha Sohl-Dickstein, Peter Battaglino and Mike DeWeese's Minimum probability flow learning, but applied to nonlinear filtering.

The authors report orders of magnitude speedups for higher dimensional state spaces where sampling rejection would be a problem.

Particle flow for nonlinear filters with log-homotopy by Fred Daum & Jim Huang

There are a couple of companion papers to this one:
Nonlinear filters with particle flow induced by log-homotopy
Seventeen dubious methods to approximate the gradient for nonlinear filters with particle flow

As you may see, the authors have a very peculiar writing style.

However, one very recent paper by Lingji Chen and Raman Mehra points out some flaws in the approach:
A study of nonlinear filters with particle flow induced by log-homotopy
(but see the group meeting announcement above for Fred Daum and Jim Huang's recent answer to this).

Kolia Sadeghi : July 28

I will present work done with Liam, Jeff Gauthier and others in EJ Chichilnisky's lab on locating retinal cones from multiple ganglion cell recordings. We write down a single hierarchical model where ganglion cell responses are modeled as independent GLMs with space-time-color separable filters and no spike history. Assuming the stimulus was Gaussian ensures that the ganglion cell spike-triggered averages are sufficient statistics. The spatial component is then assumed to be a weighted sum of non-overlapping and appropriately placed archetypical cone receptive fields. With a benign approximation, we can integrate out the weights and focus on doing MCMC in the space of cone locations and colors only. As it turns out, this likelihood landscape has many nasty local maxima; we use parallel tempering and a few techniques specific to this problem to ensure ergodicity of the Markov chain.

Doing a Google Scholar search on parallel tempering, also known as replica exchange, or just exchange Monte Carlo, will bring up many papers on this simple technique. Here is a review:
Parallel tempering: Theory, applications, and new perspectives
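For a quick feel for the technique, here is a minimal sketch of parallel tempering on a one-dimensional two-mode target. The target density, the temperature ladder, and the step size are all assumptions for illustration and have nothing to do with the cone-location model. Each replica runs Metropolis on the target raised to the power beta, and adjacent replicas occasionally propose to exchange states.

```python
import numpy as np

def log_target(x):
    """Unnormalized log density of a two-mode mixture with peaks at -3 and +3."""
    return np.logaddexp(-0.5 * ((x + 3.0) / 0.5) ** 2,
                        -0.5 * ((x - 3.0) / 0.5) ** 2)

def parallel_tempering(n_steps=20000, betas=(1.0, 0.5, 0.25, 0.1), step=1.0, seed=0):
    """Return the samples of the beta = betas[0] replica (the cold chain)."""
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas, dtype=float)
    x = np.full(len(betas), -3.0)          # every replica starts in the left mode
    cold_samples = np.empty(n_steps)
    for t in range(n_steps):
        # Within-replica random-walk Metropolis update targeting pi(x)^beta.
        for k, beta in enumerate(betas):
            prop = x[k] + step * rng.normal()
            if np.log(rng.random()) < beta * (log_target(prop) - log_target(x[k])):
                x[k] = prop
        # Propose exchanging the states of one random adjacent pair.
        k = rng.integers(len(betas) - 1)
        log_accept = (betas[k] - betas[k + 1]) * (log_target(x[k + 1]) - log_target(x[k]))
        if np.log(rng.random()) < log_accept:
            x[k], x[k + 1] = x[k + 1], x[k]
        cold_samples[t] = x[0]
    return cold_samples

samples = parallel_tempering()
fraction_right = np.mean(samples > 0.0)
```

The hot replicas see a flattened landscape and cross between modes easily; the exchange moves pass those crossings down to the cold chain, which on its own would only rarely leave the mode it started in.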

Thursday, July 22, 2010

Some classic stats papers

These are a bit more old-school, but still of interest:

Some interesting papers from AISTATS 2010

Here are a few potentially interesting papers from AISTATS this year. All pdf's available from

by Botond Cseke, Tom Heskes

by Lauren Hannah, David Blei, Warren Powell

by Jun Li, Dacheng Tao

by Mark Schmidt, Kevin Murphy

by Sajid Siddiqi, Byron Boots, Geoffrey Gordon

by Aarti Singh, Robert Nowak, Robert Calderbank

by Nikolai Slavov

by Bharath Sriperumbudur, Kenji Fukumizu, Gert Lanckriet

by Ryan Turner, Marc Deisenroth, Carl Rasmussen

by James Martens, Ilya Sutskever

by Jimmy Olsson, Jonas Strojby

by Steve Hanneke, Liu Yang

Jittering spike trains carefully

Presented in lab meeting by Alex Ramirez on July 14th, 2010.

by Matthew Harrison and Stuart Geman

In it the authors describe an algorithm that takes a spike train and jitters the spike times to create a new spike train which is maximally random while preserving the firing rate and recent spike-history of the original train.

Sunday, July 18, 2010

Logan Grosenick : 3pm July 21

Logan Grosenick: Center for Mind, Brain, and Computation & Department of
Bioengineering, Stanford University.

TITLE: Fast classification, regression, and multivariate methods for
sparse but structured data with applications to whole-brain fMRI and
volumetric calcium imaging

ABSTRACT: Modern neuroimaging methods allow the rapid collection of
large (> 100,000 voxel) volumetric time-series. Consequently there has
been a growing interest in applying supervised (classification,
regression) and unsupervised (factor analytic) machine learning
methods to uncover interesting patterns in these rich data.

However, as classically formulated, such approaches are difficult to
interpret when fit to correlated, multivariate data in the presence of
noise. In such cases, these models may suffer from coefficient
instability and sensitivity to outliers, and typically return dense
rather than parsimonious solutions. Furthermore, on large data they
can take an unreasonably long time to compute.

I will discuss ongoing research in the area of sparse but structured
methods for classification, regression, and factor analysis that aim
to produce interpretable solutions and to incorporate realistic
physical priors in the face of large, spatially and temporally
correlated data. Two examples--whole-brain classification of
spatiotemporal fMRI data and nonnegative sparse PCA applied to 3D
calcium imaging--will be presented.

Saturday, July 10, 2010

Alex Ramirez : July 14

I'll be presenting a paper from Matt Harrison. In it the authors describe an algorithm that takes a spike train and jitters the spike times to create a new spike train which is maximally random while preserving the firing rate and recent spike-history of the original train.

Thursday, July 1, 2010

Carl Smith : July 8

We have developed a simple model neuron for inference on noisy spike trains. In particular, we intend to use this model for computationally tractable quantification of information loss due to spike-time jitter. I will introduce the model, and in particular its favorable scaling properties. I'll display some results from inference done on synthetic data. Lastly, I'll describe an efficient scheme we devised for inference with a particular class of priors on the stimulus space that could be interesting outside the context of this model.