Monday, December 19, 2011

David Pfau: Dec. 20th

David will be giving a fly-by view of a number of cool papers from NIPS.

First is Empirical Models of Spiking in Neural Populations by Macke, Büsing, Cunningham, Yu, Shenoy and Sahani, where they evaluate the relative merits of GLMs with pairwise coupling and state-space models on multielectrode recordings from motor cortex.

Next, Quasi-Newton Methods for Markov Chain Monte Carlo by Zhang and Sutton looks at how to use approximate second-order methods like L-BFGS for MCMC while still preserving detailed balance.

Then, Demixed Principal Component Analysis is an extension of PCA which demixes the dependence of different latent dimensions on different observed parameters, and is used to analyze neural data from PFC.

Finally, Learning to Learn with Compound Hierarchical-Deep Models combines a deep neural network for learning visual features with a hierarchical nonparametric Bayesian model for learning object categories, yielding one cool-looking demo.

Wednesday, December 7, 2011

Previous Group Meetings (for archival purposes)

Universal MAP Estimation in Compressed Sensing, by Baron and Duarte
Quantifying Statistical Interdependence by Message Passing on Graphs, by Dauwels, Vialatte, Weber and Cichocki. Part I and Part II

Ari Pakman: Dec. 13th

"Rescaling, thinning or complementing? On goodness-of-fit procedures for point process models and Generalized Linear Models" by Gerhard and Gerstner (NIPS 2010).

The abstract reads:

"Generalized Linear Models (GLMs) are an increasingly popular framework for modeling neural spike trains. They have been linked to the theory of stochastic point processes and researchers have used this relation to assess goodness-of-fit using methods from point-process theory, e.g. the time-rescaling theorem. However, high neural firing rates or coarse discretization lead to a breakdown of the assumptions necessary for this connection. Here, we show how goodness-of-fit tests from point-process theory can still be applied to GLMs by constructing equivalent surrogate point processes out of time-series observations. Furthermore, two additional tests based on thinning and complementing point processes are introduced. They augment the instruments available for checking model adequacy of point processes as well as discretized models."

Sunday, September 11, 2011

Kolia Sadeghi : Sept. 20

This week, I'll be giving a fly-by overview of a string of recent papers on exact sparse signal recovery that do better than LASSO by solving a sequence of L1 or L2 penalized problems.  Here is a basic narrative:

LASSO uses a penalty weighted by the same lambda for all coefficients. What happens if you assign a different lambda to each coefficient, and update these lambdas iteratively? Candès, Wakin and Boyd do this in Enhancing sparsity by reweighted L1 minimization.

You can obtain sparsity by iterative reweighting even for L2-penalized problems: if some of the lambdas become infinite, the corresponding coefficients become exactly zero. Chartrand and Yin find a particularly good L2 reweighting scheme in Iteratively reweighted algorithms for compressive sensing.

All of the above methods reweight each lambda based only on the value of its corresponding coefficient: they are separable. In Iterative reweighted l1 and l2 methods for finding sparse solutions, Wipf considers non-separable reweighting schemes that come out of Sparse Bayesian Learning (SBL), which you might also know by the name of Relevance Vector Machine or Automatic Relevance Determination.
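To make the basic recipe concrete, here is a minimal numpy sketch of the reweighted-L1 loop in the spirit of Candès et al.; the ISTA inner solver, the lambda, and the epsilon are illustrative choices of mine, not the settings used in any of these papers:

```python
import numpy as np

def ista_weighted_l1(X, y, lam, w, n_iter=500):
    # Proximal gradient (ISTA) for 0.5*||y - X b||^2 + lam * sum_i w_i |b_i|
    L = np.linalg.norm(X, 2) ** 2                  # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = b - X.T @ (X @ b - y) / L              # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)  # weighted soft threshold
    return b

def reweighted_l1(X, y, lam=0.05, eps=1e-3, n_outer=5):
    # Outer loop: give each coefficient its own lambda via w_i = 1 / (|b_i| + eps)
    w = np.ones(X.shape[1])
    for _ in range(n_outer):
        b = ista_weighted_l1(X, y, lam, w)
        w = 1.0 / (np.abs(b) + eps)
    return b

# Toy usage: recover a 5-sparse vector from 40 noisy measurements
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 100))
b_true = np.zeros(100)
b_true[:5] = rng.standard_normal(5)
y = X @ b_true + 0.01 * rng.standard_normal(40)
b_hat = reweighted_l1(X, y)
```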

Tuesday, July 5, 2011

Alex Ramirez: July 6th

Alex will continue with the presentation of the recent paper by Agarwal et al. (see previous post).

Sunday, June 12, 2011

Alex Ramirez: June 21st

Alex will be presenting a short version of this paper. In it, the authors consider loss functions, for many estimators, that obey certain smoothness and convexity requirements, and prove a global, geometric (i.e., fast) rate of convergence for Nesterov's gradient descent method, down to the level of statistical precision.
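For orientation, here is a minimal sketch of the accelerated gradient iteration being referred to; the objective, its gradient, and the Lipschitz constant are placeholders, and this is the generic smooth-case method rather than the paper's composite/projected variant:

```python
import numpy as np

def nesterov_agd(grad, x0, L, n_iter=200):
    # Nesterov's accelerated gradient method for a smooth convex objective
    # whose gradient is L-Lipschitz; the momentum term gives the fast rate.
    x, x_prev = x0.astype(float), x0.astype(float)
    for k in range(1, n_iter + 1):
        y = x + (k - 1.0) / (k + 2.0) * (x - x_prev)   # look-ahead (momentum) point
        x_prev = x
        x = y - grad(y) / L                            # gradient step at y
    return x

# Toy usage on least squares: grad(b) = X.T @ (X @ b - y), L = ||X||_2^2
rng = np.random.default_rng(0)
X, y = rng.standard_normal((50, 10)), rng.standard_normal(50)
b_hat = nesterov_agd(lambda b: X.T @ (X @ b - y), np.zeros(10),
                     L=np.linalg.norm(X, 2) ** 2)
```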

There will be no meeting on June 14th.

Monday, June 6, 2011

Kamiar Rahnama Rad: June 7th

Information Rates and Optimal Decoding in Large Populations

Many fundamental questions in theoretical neuroscience involve optimal decoding and the computation of Shannon information rates in populations of spiking neurons. In this paper, we apply methods from the asymptotic theory of statistical inference to obtain a clearer analytical understanding of these quantities. We find that for large neural populations carrying a finite total amount of information, the full spiking population response is asymptotically as informative as a single observation from a Gaussian process whose mean and covariance can be characterized explicitly in terms of network and single neuron properties. The Gaussian form of this asymptotic sufficient statistic allows us in certain cases to perform optimal Bayesian decoding by simple linear transformations, and to obtain closed-form expressions of the Shannon information carried by the network. One technical advantage of the theory is that it may be applied easily even to non-Poisson point process network models; for example, we find that under some conditions, neural populations with strong history-dependent (non-Poisson) effects carry exactly the same information as do simpler equivalent populations of non-interacting Poisson neurons with matched firing rates. We argue that our findings help to clarify some results from the recent literature on neural decoding and neuroprosthetic design.

Monday, May 30, 2011

Eric Shea-Brown : May 31st

Eric Shea-Brown, who has come all the way from the University of Washington, will be speaking about:

A mechanistic approach to multi-spike patterns in neural circuits:
There is a combinatorial explosion in the number of possible activity patterns in neural circuits of increasing size, enabling an enormous complexity in which patterns occur and how this depends on incoming stimuli.  However, recent experiments show that this complexity is not always accessed -- the activity of many neural populations is remarkably well captured by simpler descriptions that rely only on the activity of single neurons and neuron pairs.  

What is especially intriguing is that these pairwise descriptions succeed even in cases where circuit architecture seems likely to create a far more complex set of outputs.  We seek a mechanistic understanding of this phenomenon -- and predictions for when it will break down -- based on simple models of spike generation, circuit connectivity, and stimuli.  This also offers a chance to explore how much (and how little) beyond-pairwise spike patterns can matter to coding in different circuits.

As a specific application, we consider the empirical success of pairwise models in capturing the activity of ON-parasol retinal ganglion cells. We first use intracellular recordings to fully constrain a model of the underlying circuit dynamics.  Our theory then provides an explanation for experimental findings based on ON-parasol stimulus filtering and spike generation properties.  

This is joint work with Andrea Barreiro, Julijana Gjorgjieva, and Fred Rieke.

Monday, May 9, 2011

Max Nikitchenko: May 10

This Tuesday, on 2011/05/10, I will discuss methods for accelerating algorithms, such as EM, that converge only linearly near a fixed point and are notoriously slow there. Two approaches are possible: modify the iterative algorithm itself (PX-EM, via parameter expansion; ECM, via maximizing over each parameter in turn while keeping the others fixed; etc.), or use the recent history of the iterations to extrapolate closer to the fixed point (in which case you keep all your machinery intact and only plug in an auxiliary function that extrapolates the already-computed iteration steps). I will talk about the second class of accelerators.

I will start with a method I derived myself, which is visual but powerful at the same time. I will then focus on two papers that seem to be becoming the gold standard among acceleration techniques: Varadhan, R. & Roland, C., "Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm" (dx.doi.org/10.1111/j.1467-9469.2007.00585.x) from 2008, and Zhou, H., Alexander, D. & Lange, K., "A quasi-Newton acceleration for high-dimensional optimization algorithms" (dx.doi.org/10.1007/s11222-009-9166-3) from 2011. I have just found out about the second paper, and it seems to overlap heavily with the method I derived. I hope we will clear this question up!
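For concreteness, here is a minimal sketch of one extrapolation step of the kind Varadhan and Roland propose, as I read their simplest step-length scheme; the monotonicity safeguards of the full globally convergent algorithm are omitted, and the fixed-point map F stands in for a complete EM update:

```python
import numpy as np

def squarem_step(F, theta):
    # One SQUAREM-style extrapolation step: run two plain fixed-point (EM)
    # updates, form first and second differences, and extrapolate.
    theta1 = F(theta)
    theta2 = F(theta1)
    r = theta1 - theta                  # first difference
    v = (theta2 - theta1) - r           # second difference
    alpha = -np.linalg.norm(r) / max(np.linalg.norm(v), 1e-12)
    theta_new = theta - 2.0 * alpha * r + alpha ** 2 * v
    return F(theta_new)                 # one stabilizing plain update
```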

Sunday, May 1, 2011

class presentations may 3 at 3:00

hi all - this tuesday we won't have normal group meeting. instead, the
students in my class will be giving presentations about the projects they
have been working on this semester. talk titles are here:
http://www.stat.columbia.edu/~liam/teaching/neurostat-spr11/talks.txt

presentations will begin at 3, and each one should last 15 min or so.
everyone's welcome to attend - hope to see you there.
L

Monday, April 25, 2011

Jianing Shi : April 26th

I will discuss Nesterov's optimal gradient method at the group meeting. I will talk about Nesterov's method for minimizing composite objective functions, together with its implications for L1 minimization.

There is unfortunately no short story on Nesterov's method; however, you can find his work at

Jianing's nicely done slides can be found here.
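To give a flavor of the composite setting, here is a minimal FISTA-style sketch of Nesterov's accelerated proximal gradient method applied to L1-penalized least squares; the design matrix, response, and regularization weight are placeholders, not anything from the talk:

```python
import numpy as np

def fista_l1(X, y, lam, n_iter=300):
    # Accelerated proximal gradient (FISTA) for the composite objective
    # 0.5 * ||y - X b||^2 + lam * ||b||_1
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the smooth part
    b = z = np.zeros(X.shape[1])
    t = 1.0
    for _ in range(n_iter):
        b_old = b
        w = z - X.T @ (X @ z - y) / L                           # gradient step at z
        b = np.sign(w) * np.maximum(np.abs(w) - lam / L, 0.0)   # soft threshold (prox of L1)
        t_old, t = t, (1 + np.sqrt(1 + 4 * t ** 2)) / 2
        z = b + (t_old - 1) / t * (b - b_old)                   # momentum step
    return b
```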

Friday, April 8, 2011

Jonathan Huggins : April 19

Submodularity part II, starting at 5:45pm.

After a brief review of two weeks ago, I will describe Queyranne's efficient and fully combinatorial algorithm for minimizing symmetric submodular functions. Next, I will give the details of the convex Lovász extension of submodular functions, including a sketch of the proof of how to efficiently calculate the extension. Finally, I'll discuss portions of a recent paper on decomposable submodular functions by Stobbe and Krause, emphasizing its application to Markov Random Fields and the connections to the Lovász extension and concave functions.
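As a small aside, the Lovász extension itself is easy to evaluate once you sort the coordinates; here is a minimal sketch, assuming the set function f is normalized so that it is 0 on the empty set (the example function at the end is my own toy choice):

```python
import numpy as np

def lovasz_extension(f, w):
    # Lovasz extension of a normalized set function f, evaluated at w in R^n:
    # sort coordinates in decreasing order and sum w_i times the marginal gain
    # of adding element i to the growing level set.
    order = np.argsort(-w)
    value, prev_set, prev_f = 0.0, set(), 0.0
    for i in order:
        cur_set = prev_set | {int(i)}
        cur_f = f(cur_set)
        value += w[i] * (cur_f - prev_f)
        prev_set, prev_f = cur_set, cur_f
    return value

# Example: f(S) = min(|S|, 2) is submodular; at an indicator vector the
# extension recovers f itself.
f = lambda S: min(len(S), 2)
print(lovasz_extension(f, np.array([1.0, 1.0, 0.0, 1.0])))   # -> 2.0
```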

Tim Machado : April 12

Learning Dictionaries of Stable Autoregressive Models for Audio Scene Analysis by Youngmin Cho and Lawrence K. Saul

Tuesday, April 5, 2011

Jonathan Huggins: April 5

This week I will be talking about submodular set functions, which possess a useful and intuitive diminishing-returns property. I will begin with the definition and give a variety of examples of situations in which submodular functions arise. I'll discuss some connections to convex and concave functions, as well as strategies for minimization and maximization. I will mainly draw from the classic paper "Submodular functions and convexity" by Lovász, as well as a recent paper by Stobbe and Krause, which provides a nice summary of key results and a discussion of many of the advances since the Lovász paper. If the talk sparks your interest, I highly recommend the tutorial (complete with hours of video!) by Krause and Guestrin.
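To illustrate the diminishing-returns property, here is a tiny brute-force check on a set-cover function (the particular sets below are arbitrary): adding an element to a smaller collection never helps less than adding it to a larger one.

```python
from itertools import combinations

def coverage(sets, S):
    # f(S) = number of ground elements covered by the union of the chosen sets
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
ground = range(len(sets))
ok = True
# Check f(A + {e}) - f(A) >= f(B + {e}) - f(B) for all A subset of B, e outside B
for B in (set(c) for r in range(len(sets) + 1) for c in combinations(ground, r)):
    for A in (set(c) for r in range(len(B) + 1) for c in combinations(B, r)):
        for e in set(ground) - B:
            gain_A = coverage(sets, A | {e}) - coverage(sets, A)
            gain_B = coverage(sets, B | {e}) - coverage(sets, B)
            ok &= gain_A >= gain_B
print("coverage function is submodular:", ok)
```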

Sunday, March 20, 2011

Carl Smith : March 22

This week in group meeting I will be presenting a somewhat recent paper from Josh Tenenbaum's group entitled "Modelling Relational Data using Bayesian Clustered Tensor Factorization", in which a model for relational data is proposed, explored, and argued to strike a happy balance between the pros and cons of clustering methods and factorization models. I plan to present the model itself, some issues it addresses, and some of the results described in the paper.

Monday, March 14, 2011

Eizaburo Doi : March 15

I will discuss the details of the following paper:
B. G. Borghuis, C. P. Ratliff, R. G. Smith, P. Sterling, and V. Balasubramanian. Design of a neuronal array. Journal of Neuroscience, 28:3178–3189, 2008.

I'll also mention a couple of related papers, including those cited in:
T. E. Holy. "Yes! We're all individuals!": redundancy in neuronal circuits. Nature Neuroscience, 13:1306-1307, 2010.

Basically I plan to lead a discussion of efficient coding, population coding, redundancies in neural populations, and retinal coding.  This is partly because we're finishing a journal draft on this topic.  It would be great if you could bring any other papers that you'd like to discuss.

Monday, March 7, 2011

Kolia Sadeghi : March 8

At COSYNE, Cadieu and Koepsell had an interesting poster on joint models of amplitude and phase couplings between LFPs of different areas.  There is a paper out on experimental findings [pdf] [supplement], and older papers on estimating models of joint phase couplings [pdf], both of which are interesting.  The model including amplitudes is poster only for now, so I'll go over those papers quickly first.

Fritz Sommer's Adaptive Compressive Sensing is good to have seen at least once, so I'll go over it quickly as well if time allows.

Thursday, March 3, 2011

Adaptive Compressive Sensing

Fritz Sommer gave a COSYNE 2011 workshop presentation of seemingly magical results coauthored by Guy Isely and Christopher Hillar.

Suppose an area of the brain deals in a signal which is sparse in some underlying unknown dictionary. This area subsamples the signal with, say, a random measurement matrix, and sends the subsampled signal to another area. The receiving area doesn't know what the original signals were, what the underlying sparsifying dictionary was, or what the measurement matrix was; all it knows are the subsampled measurements it has received. If the receiving area learns a dictionary in which the subsampled signals it received are sparse, can this sparse representation also be used to linearly represent the original signal? The answer is yes.

To restore normality and disprove magic, read their NIPS paper.  Apparently a longer paper with proofs is due to come out soon.
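Out of curiosity, here is a toy numerical sketch of the setup described above using off-the-shelf dictionary learning; it is emphatically not the authors' construction or a substitute for their proofs, and every dimension, sparsity level, and solver choice below is an arbitrary assumption of mine:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
n, m, k, n_sig, sparsity = 64, 24, 48, 400, 3

# Signals that are sparse in an unknown dictionary Psi
Psi = rng.standard_normal((n, k))
A = np.zeros((k, n_sig))
for j in range(n_sig):
    idx = rng.choice(k, sparsity, replace=False)
    A[idx, j] = rng.standard_normal(sparsity)
X = Psi @ A                      # original signals (hidden from the receiver)

# Random subsampling: all the receiving area ever sees
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
Y = Phi @ X

# Receiver learns its own dictionary and sparse codes for the measurements
dl = DictionaryLearning(n_components=k, transform_algorithm='omp',
                        transform_n_nonzero_coefs=sparsity, max_iter=50)
B = dl.fit_transform(Y.T).T      # sparse codes of the measurements

# Can one linear map send those codes back to the original signals?
W, *_ = np.linalg.lstsq(B.T, X.T, rcond=None)
rel_err = np.linalg.norm(X - W.T @ B) / np.linalg.norm(X)
print("relative reconstruction error:", rel_err)
```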

Monday, February 21, 2011

David Pfau : Feb 22nd

This week I'll be presenting a machine learning classic, Lee and Seung's "Algorithms for Non-negative Matrix Factorization".  It's a short paper, so you won't be too distracted from your CoSyNe preparations.  If I have time I'll also present some parts of "Online Learning for Matrix Factorization and Sparse Coding" by Mairal et al, so if you're so inclined please peruse that as well.
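For reference, a minimal sketch of the multiplicative updates for the squared-error version of NMF in the spirit of the Lee and Seung paper; the rank, iteration count, and initialization here are arbitrary choices:

```python
import numpy as np

def nmf_multiplicative(V, r, n_iter=200, eps=1e-10):
    # Multiplicative updates for V ~ W @ H under squared (Frobenius) error;
    # V, W, H are all entrywise nonnegative.
    n, m = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((n, r))
    H = rng.random((r, m))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

# Toy usage on a random nonnegative matrix
V = np.abs(np.random.default_rng(1).standard_normal((30, 20)))
W, H = nmf_multiplicative(V, r=5)
```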

Wednesday, February 16, 2011

student seminar next wednesday

for anyone interested, peter crosta from CUIT will be talking about the new cluster scheduler (Torque/Maui) in the student seminar on wed, feb 23 - https://sites.google.com/site/colss1011/

Monday, February 7, 2011

Alex Ramirez : Feb. 8

Alex will be talking about: "Spike patterns in retinal ganglion cells required for decoding - a progress report."

Monday, January 31, 2011

Eftychios: Feb. 1

I'll be going over the following paper by Machens: Demixing population activity in higher cortical areas. Time permitting, I'll get into the more technical details of their approach described in this paper.