Title: Stan: A Statistical Modeling Language and Compiler

Abstract: Stan (http://mc-stan.org/) is a modeling language derived loosely from BUGS/JAGS. Rather than using Gibbs sampling, Stan uses (adaptive) Hamiltonian Monte Carlo sampling, which is implemented using automatic differentiation. Stan is also compiled rather than interpreted and uses a more flexible, imperative language instead of BUGS's declarative graph-specification language. I'll start with the variable declarations and blocks used by Stan, with an emphasis on why they are organized as they are and how the modeling language constructs are converted to code for compilation. Variable types include constrained and unconstrained scalars, vectors, matrices, and arrays; blocks include (transformed) data, (transformed) parameters, model, and generated quantities blocks, corresponding to how variables are used in Bayesian modeling. I'll then discuss how Stan's implemented with automatic differentiation to automatically convert models from constrained to unconstrained parameter spaces and how we use template metaprogramming to implement vectorized probability functions up to a constant. I'll also mention some future applications of our optimization features: maximum marginal likelihood point estimates and stochastic variational approximate Bayesian inference. If anyone's interested, I can also talk about what we've learned about open-source project management as the team and scope grow.
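
As a rough illustration of the constrained-to-unconstrained conversion mentioned in the abstract, the sketch below takes a single positive parameter, moves it to the log scale, and adds the log-Jacobian term so that the density stays properly normalized. It is plain Python, not Stan or its C++ internals; the function names and the Exponential(1) target are assumptions chosen purely for illustration.

```python
import math

def log_density_constrained(sigma):
    """Log density of an Exponential(1) target, defined only for sigma > 0."""
    if sigma <= 0:
        return -math.inf
    return -sigma

def log_density_unconstrained(u):
    """Log density over u = log(sigma), valid for any real u.

    Change of variables: sigma = exp(u) and |d sigma / d u| = exp(u),
    so the log-Jacobian adjustment is simply u.
    """
    sigma = math.exp(u)
    return log_density_constrained(sigma) + u

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# With the Jacobian term included, the unconstrained density still
# integrates to (approximately) 1 over the real line.
total = integrate(lambda u: math.exp(log_density_unconstrained(u)), -30.0, 10.0)
print(total)
```

A sampler can then move freely over all real values of `u` and map each draw back with `exp(u)`; dropping the `+ u` term would target a different distribution.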

## Friday, December 13, 2013

## Friday, November 15, 2013

### Prof. Yves Atchade (Michigan/Columbia): November 20th

Title: Bayesian inference for doubly intractable distributions.

Abstract: The talk will review some of the recent developments in computational statistics to deal with statistical models with intractable likelihoods (viz. intractable normalizing constants). I will describe in particular our recent methodology that exploits the exact-approximate MCMC framework combined with a Russian roulette trick.
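
For readers unfamiliar with the term, one common form of a Russian roulette estimator is an unbiased estimate of an infinite series obtained by truncating it at a random point and reweighting the surviving terms. The sketch below is a minimal illustration of that general idea, not necessarily the scheme presented in the talk; the series (the Taylor expansion of exp(x)), the continuation probability `q`, and all function names are assumptions.

```python
import math
import random

def roulette_estimate(term, q, rng):
    """One unbiased estimate of sum_{k>=0} term(k) by random truncation.

    After each term we continue with probability q, so term k is reached
    with probability q**k and is reweighted by 1 / q**k.
    """
    total, weight, k = 0.0, 1.0, 0
    while True:
        total += term(k) / weight
        if rng.random() >= q:  # stop with probability 1 - q
            return total
        weight *= q
        k += 1

x = 0.5
rng = random.Random(0)
draws = [
    roulette_estimate(lambda k: x ** k / math.factorial(k), 0.6, rng)
    for _ in range(20000)
]
mean = sum(draws) / len(draws)
print(mean, math.exp(x))  # the average should be close to exp(0.5)
```

Each individual draw uses only finitely many terms, yet the average over draws targets the full infinite sum, which is what makes such estimators attractive inside exact-approximate MCMC.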

## Sunday, November 10, 2013

### Prof. Dana Pe'er: November 13th

Title: Revealing tumor heterogeneity between and within tumors

Abstract: Systematic characterization of cancer genomes has revealed a staggering complexity and heterogeneity of aberrations among individuals. More recently, it has been appreciated that intra-tumor heterogeneity is of critical importance, with each tumor harboring sub-populations that vary in clinically important phenotypes such as drug sensitivity. A major challenge involves the development of analysis methods to integrate the flood of high-throughput data on tumors towards a path of personalized care. We will elaborate on two computational approaches on this path: (1) Integration of genetic and genomic data to identify genetic determinants of cancer. (2) Single-cell analysis of signaling based on mass cytometry, a novel technology that can accurately measure more than forty signaling molecules simultaneously in single cells.

## Sunday, November 3, 2013

### Prof. Wei Ji Ma (NYU): November 6th

Abstract: My lab just arrived at NYU (www.cns.nyu.edu/malab). We do human psychophysics, behavioral modeling, and neural modeling. Today, I will be telling three short stories that are still in development:

1) Do humans aspire to optimality or favor simple heuristics in their decision-making? Both notions are prominent in different domains, but it is rare that they can be pitted directly against each other. We do so in a simple, new visual search task.

2) Confidence ratings are widely used in psychophysics, but rarely fitted. In a working memory task in which stimulus estimates and confidence ratings were collected, we tested different mappings from precision to confidence. It seems this mapping is logarithmic.

3) Using forward models of fMRI activity, we are trying to not just decode the stimulus but also uncertainty. A big problem is how to estimate the covariance matrix. I will discuss where we are currently stuck.

## Sunday, October 27, 2013

### Mijung Park: Oct 30th

Title: Bayesian learning methods for neural coding.

Abstract: A primary goal in systems neuroscience is to understand how neural spike responses encode information about the external world. A popular approach to this problem is to build an explicit probabilistic model that characterizes the encoding relationship in terms of a cascade of stages: (1) linear dimensionality reduction of a high-dimensional stimulus space using a bank of filters or receptive fields (RFs); (2) a nonlinear function from filter outputs to spike rate; and (3) a stochastic spiking process with recurrent feedback. These models have described single- and multi-neuron spike responses in a wide variety of brain areas.

In this talk, I will present my Ph.D. work that focuses on developing Bayesian methods to efficiently estimate the linear and non-linear stages of the cascade encoding model. First, I will describe a novel Bayesian receptive field estimator based on a hierarchical prior that flexibly incorporates knowledge about the shapes of neural receptive fields. This estimator achieves error rates several times lower than existing methods, and can be applied to a variety of other neural inference problems such as extracting structure in fMRI data. Furthermore, I will present a novel low-rank description of the high-dimensional receptive field, combined with a hierarchical prior for more efficient receptive field estimation. Second, I will describe new models for neural nonlinearities using Gaussian processes (GPs) and Bayesian active learning algorithms in "closed-loop" neurophysiology experiments to rapidly estimate neural nonlinearities. These approaches significantly improve the efficiency of neurophysiology experiments, where data are often limited by the difficulty of maintaining stable recordings from a neuron or neural population.

## Saturday, October 19, 2013

### Prof. Tian Zheng: Oct 16th

Title: Latent Space Model for Aggregated Relational Data

Abstract: Aggregated Relational Data (ARD) are indirect network data collected using survey questions of the form "how many X's do you know?" ARD are most often used to estimate the size of populations that are difficult to count directly, and they allow researchers to choose specific subpopulations of interest without sampling or surveying members of these subpopulations directly. What has been under-utilized is the indirect information on social structure captured by ARD. In this talk, I present a latent space model and Bayesian computation framework for inference and estimation of social structures using ARD from non-network samples in social networks, the variation of social structures in subnetworks, and the relations between (hard-to-reach) subpopulations.

## Sunday, October 6, 2013

### Prof. Rahul Mazumder: Oct 9th

Title: Low-rank Matrix Regularization: Statistical Models and Large Scale Algorithms

Abstract: Low-rank matrix regularization is an important area of research in statistics and machine learning with a wide range of applications --- the task is to estimate X, under a low rank constraint and possibly additional affine (or more general convex) constraints on X. In practice, the matrix dimensions frequently range from hundreds of thousands to even a million --- leading to severe computational challenges. In this talk, I will describe computationally tractable models and scalable (convex) optimization based algorithms for a class of low-rank regularized problems. Exploiting problem-specific statistical insights, problem structure and using novel tools for large scale SVD computations play important roles in this task. I will describe how we can develop a unified, tractable convex optimization framework for general exponential family models, incorporating meta-features on the rows/columns.

## Friday, September 27, 2013

### Dean Eckles (Facebook): October 2nd

Title: Design and analysis of experiments in networks

Abstract: Random assignment of individuals to treatments is often used to predict what will happen if the treatment is applied to everyone, but resulting estimates can suffer substantial bias in the presence of peer effects (i.e., interference, spillovers, social interactions). We describe experimental designs that reduce this bias by producing treatment assignments that are correlated in the network. For example, we can use graph partitioning methods to construct clusters of individuals who are then assigned to treatment or control together. This clustered assignment alone can substantially reduce bias, as can incorporating information about peers' treatment assignments or behaviors into the analysis. Simulation results show how this bias reduction varies with network structure and the size of direct and peer effects. We illustrate this method with real experiments, including a large experiment on Thanksgiving Day 2012.
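
The clustered assignment described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the six-node graph, the hand-written two-cluster partition (standing in for the output of a graph partitioning method), and the function names are all hypothetical and are not taken from the talk.

```python
import random

def cluster_assign(clusters, rng, p_treat=0.5):
    """Assign whole clusters to treatment (1) or control (0)."""
    assignment = {}
    for members in clusters:
        arm = 1 if rng.random() < p_treat else 0
        for node in members:
            assignment[node] = arm
    return assignment

def same_arm_fraction(edges, assignment):
    """Fraction of edges whose two endpoints received the same arm."""
    same = sum(1 for a, b in edges if assignment[a] == assignment[b])
    return same / len(edges)

# Toy network: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters = [[0, 1, 2], [3, 4, 5]]  # assumed output of a partitioning step

rng = random.Random(0)
assignment = cluster_assign(clusters, rng)
frac = same_arm_fraction(edges, assignment)
print(assignment, frac)
```

Because every within-cluster edge connects two units in the same arm, only the bridge edge can cross arms here, so most of each unit's peers share its treatment. Independent per-node assignment would not give that guarantee.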

## Sunday, September 22, 2013

### Donald Pianto: September 25th

Title: Dealing with monotone likelihood in a model for speckled data

Abstract: In this paper we study maximum likelihood estimation (MLE) of the roughness parameter of the G_{A}^{0} distribution for speckled imagery (Frery et al., 1997). We discover that when a certain criterion is satisfied by the sample moments, the likelihood function is monotone and MLE estimates are infinite, implying an extremely homogeneous region. We implement three corrected estimators in an attempt to obtain finite parameter estimates. Two of the estimators are taken from the literature on monotone likelihood (Firth, 1993; Jeffreys, 1946) and one, based on resampling, is proposed by the authors. We perform Monte Carlo experiments to compare the three estimators. We find the estimator based on the Jeffreys prior to be the worst. The choice between Firth’s estimator and the Bootstrap estimator depends on the value of the number of looks (which is given before estimation) and the specific needs of the user. We also apply the estimators to real data obtained from synthetic aperture radar (SAR). These results corroborate the Monte Carlo findings.

## Sunday, September 15, 2013

### Prof. John Paisley: September 18th

Title: Variational Inference and Big Data

Abstract: I will discuss a scalable algorithm for approximating posterior distributions called stochastic variational inference. Stochastic variational inference lets one apply complex Bayesian models to massive data sets. This technique applies to a large class of probabilistic models and outperforms traditional batch variational inference, which can only handle small data sets. Stochastic inference is a simple modification to the batch approach, so a significant part of the discussion will focus on reviewing this traditional batch inference method.

## Friday, September 6, 2013

### David Carlson: September 11th

Title: Real-Time Inference for a Gamma Process Model of Neural Spiking

Abstract: With simultaneous measurements from ever increasing populations of neurons, there is a growing need for sophisticated tools to recover signals from individual neurons. In electrophysiology experiments, this classically proceeds in a two-step process: (i) threshold the waveforms to detect putative spikes and (ii) cluster the waveforms into single units (neurons). We extend previous Bayesian nonparametric models of neural spiking to jointly detect and cluster neurons using a Gamma process model. We develop an online approximate inference scheme enabling real-time analysis, with performance exceeding the previous state-of-the-art. Via exploratory data analysis we find several features of our model collectively contribute to our improved performance including: (i) accounting for colored noise, (ii) detecting overlapping spikes, (iii) tracking waveform dynamics, and (iv) using multiple channels.

In my talk, I will give a brief overview of the Bayesian nonparametric structures that have been used in the spike-sorting problem. From there, I will give details on how we've taken the spike sorting model and integrated it with a Poisson process to improve the noisy detection problem, and give details on learning the model using real-time online methods. Additionally, I will discuss extensions to evolving waveform dynamics and multiple channels, and present results from a tetrode as well as from novel 3-channel and 8-channel multi-electrode arrays where action potentials may appear on some but not all of the channels.

## Tuesday, July 23, 2013

### Prof. Qi Wang (Biomedical Engineering, Columbia): July 24th

Title: Reading and Writing the Neural Code: Initial Steps toward Engineered Sensory Percepts

Abstract: The transformation of sensory signals into spatiotemporal patterns of neural activity in the brain is critical in forming our perception of the external world. Physical signals, such as light, sound, and force, are transduced to neural electrical impulses, or spikes, at the periphery, and these spikes are subsequently transmitted to the brain through various stages of the sensory pathways, ultimately forming the representation of the sensory world. Deciphering the information conveyed in the spike trains is often referred to as “reading the neural code”. On the other hand, prosthetic devices designed to restore lost sensory function, such as cochlear implants, rely primarily on the principle of artificially activating neural circuits to induce a desired perception, which we might refer to as “writing the neural code”. This poses significant challenges not only in biomaterials and interfaces, but also in knowing precisely what to tell the brain to do.

My talk will focus on three topics. First, I will talk about the control of peripheral tactile sensations. Specifically, I will discuss the synthesis of virtual tactile sensations using a custom-built, high spatiotemporal resolution tactile display, a device we designed to create high fidelity, computer-controlled tactile sensations on the fingertip similar to those arising naturally. Second, I will utilize a decoding paradigm to discuss the neural representations of tactile sensations and how they are encoded and transformed across early stages of processing in the somatosensory pathway. Finally, I will discuss the design of sub-cortical microstimulation to control cortical activation, using downstream cortical measurements as a benchmark of the fidelity of the surrogate signaling. Taken together, an understanding of how to read and write the neural code is essential not only for the development of technologies for translating thoughts into actions (motor prostheses), but also for the development of technologies for creating artificial sensory percepts (sensory prostheses).

## Tuesday, July 9, 2013

### Carl Smith: July 9th

Title: Low-rank graphical models and Bayesian inference in the statistical analysis of noisy neural data

Abstract: We develop new methods of Bayesian inference, largely in the context of analysis of neuroscience data. The work is broken into several parts. In the first part, we introduce a novel class of joint probability distributions in which exact inference is tractable. Previously it has been difficult to find general constructions for models in which efficient exact inference is possible, outside of certain classical cases. We identify a class of such models that are tractable owing to a certain “low-rank” structure in the potentials that couple neighboring variables. In the second part we develop methods to quantify and measure information loss in analysis of neuronal spike train data due to two types of noise, making use of the ideas developed in the first part. Information about neuronal identity or temporal resolution may be lost during spike detection and sorting, or precision of spike times may be corrupted by various effects. We quantify the information lost due to these effects for the relatively simple but sufficiently broad class of Markovian model neurons. We find that decoders that model the probability distribution of spike-neuron assignments significantly outperform decoders that use only the most likely spike assignments. We also apply the ideas of the low-rank models from the first section to defining a class of prior distributions over the space of stimuli (or other covariate) which, by conjugacy, preserve the tractability of inference. In the third part, we treat Bayesian methods for the estimation of sparse signals, with application to the locating of synapses in a dendritic tree. We develop a compartmentalized model of the dendritic tree. Building on previous work that applied and generalized ideas of least angle regression to obtain a fast Bayesian solution to the resulting estimation problem, we describe two other approaches to the same problem, one employing a horseshoe prior and the other using various spike-and-slab priors. 
In the last part, we revisit the low-rank models of the first section and apply them to the problem of inferring orientation selectivity maps from noisy observations of orientation preference. The relevant low-rank model exploits the self-conjugacy of the von Mises distribution on the circle. Because the orientation map model is loopy, we cannot do exact inference on the low-rank model by the forward-backward algorithm, but block-wise Gibbs sampling by the forward-backward algorithm speeds mixing. We explore another von Mises coupling potential Gibbs sampler that proves to effectively smooth noisily observed orientation maps.
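
One way to read the "self-conjugacy" mentioned above is the closure property that the product of two von Mises densities in the same angle is again proportional to a von Mises density. The sketch below checks that identity numerically; it is an illustration of this standard property under that assumed reading, and the function name and example parameter values are hypothetical.

```python
import cmath
import math

def combine_von_mises(mu1, kappa1, mu2, kappa2):
    """Parameters (mu, kappa) of the von Mises density proportional to
    the product of vM(mu1, kappa1) and vM(mu2, kappa2).

    Uses kappa * exp(i * mu) = kappa1 * exp(i * mu1) + kappa2 * exp(i * mu2).
    """
    z = kappa1 * cmath.exp(1j * mu1) + kappa2 * cmath.exp(1j * mu2)
    return cmath.phase(z), abs(z)

mu, kappa = combine_von_mises(0.3, 2.0, 1.1, 0.5)

# The exponents of the two forms agree at any angle theta.
theta = 0.7
lhs = 2.0 * math.cos(theta - 0.3) + 0.5 * math.cos(theta - 1.1)
rhs = kappa * math.cos(theta - mu)
print(lhs, rhs)
```

Since the combined parameters come from a single complex addition, conditional updates of this kind are cheap to compute, which is the sort of structure a Gibbs sampler can exploit.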

## Sunday, June 30, 2013

### Tim Machado: July 3rd

Title: Functional organization of motor neurons during fictive locomotor behavior revealed by large-scale optical imaging

Abstract: The isolated neonatal mouse spinal cord is capable of generating sustained rhythmic network activity, termed fictive locomotion (Kiehn and Kjaerulff 1996, Markin et al. 2012). However, the spatiotemporal pattern of motor neuron activity during fictive locomotion has not been measured at single-cell resolution, nor has the variation across a motor pool been quantified. We have measured the activity of thousands of retrogradely labeled motor neurons using large-scale, cellular resolution calcium imaging. Spike inference methods (Vogelstein et al. 2010) have been used to estimate peak firing phase. This approach was validated in each experiment using antidromic stimulation of ventral roots to generate data where spike timing information is known. Our imaging approach has revealed that neurons within the same motor pool fire synchronously. In contrast, neurons innervating muscles that have slightly different phase tunings during walking also showed slightly offset burst times during fictive locomotion. Neurons innervating antagonist muscles reliably fired 180° out of phase with one another. Finally, groups of motor neurons that fired asynchronously were found at each lumbar spinal segment, suggesting that the recruitment of motor neurons during fictive locomotion is determined by pool identity, rather than by segmental position. These spatiotemporal patterns were each highly reproducible between preparations. Our approach has revealed complexity and specificity in the patterns of motor neuron recruitment during locomotor-like network activity. We are currently analyzing the relationship between the activity of genetically defined pre-motor interneurons and the activity of identified motor neuron pools.

## Monday, June 24, 2013

### José Miguel Hernández Lobato: June 27th

Title: Gaussian Process Vine Copulas for Multivariate Dependence

Abstract: Copulas allow marginal distributions to be learned separately from the multivariate dependence structure (the copula) that links them together into a density function. Vine factorizations ease the learning of high-dimensional copulas by constructing a hierarchy of conditional bivariate copulas. However, to simplify inference, it is common to assume that each of these conditional bivariate copulas is independent of its conditioning variables. In this work, we relax this assumption by discovering the latent functions that specify the shape of a conditional copula given its conditioning variables. We learn these functions by following a Bayesian approach based on sparse Gaussian processes with expectation propagation for scalable, approximate inference. Experiments on real-world datasets show that, when modeling all conditional dependencies, we obtain better estimates of the underlying copula of the data.

Special location: Mudd 210.

### Roy Fox: June 26th

Title: KL-regularized reinforcement-learning problems

Abstract: Of the many justifications for regularizing reinforcement-learning problems with KL-divergence terms, perhaps the most obviously compelling is when it leads to efficient algorithms. This is the case under the assumptions of full observability and controllability, as in Emo Todorov's work on Linearly-Solvable Markov Decision Processes. In this talk I will present these ideas, mostly introduced in these two papers:

http://homes.cs.washington.edu/~todorov/papers/MDP.pdf

http://homes.cs.washington.edu/~todorov/papers/duality.pdf

Then I will share insights and challenges in applying similar approaches to partially observable and controllable MDPs.
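
To make the "efficient algorithms" claim concrete, here is a minimal sketch of the first-exit linearly-solvable MDP from the papers linked above. It assumes a hypothetical four-state chain; the names `solve_lmdp` and `optimal_transitions`, the passive dynamics, and the state costs are illustrative choices, not taken from the talk. With a KL control cost, the desirability z = exp(-v) satisfies a linear fixed-point equation, z(s) = exp(-q(s)) Σ p(t|s) z(t), and the optimal controlled transitions are proportional to p(t|s) z(t).

```python
import math

def solve_lmdp(passive, cost, terminal, iters=500):
    """Iterate z <- exp(-q) * (P z) on non-terminal states (first-exit LMDP)."""
    n = len(cost)
    # Terminal states keep z = exp(-q); others start from an arbitrary guess.
    z = [math.exp(-cost[s]) if s in terminal else 1.0 for s in range(n)]
    for _ in range(iters):
        new_z = list(z)
        for s in range(n):
            if s in terminal:
                continue
            expected = sum(passive[s][t] * z[t] for t in range(n))
            new_z[s] = math.exp(-cost[s]) * expected
        z = new_z
    return z

def optimal_transitions(passive, z, s):
    """Optimal controlled dynamics: u*(t|s) proportional to p(t|s) * z(t)."""
    weights = [passive[s][t] * z[t] for t in range(len(z))]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical chain: a random walk on states 0..3, with state 3 as the goal.
passive = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
cost = [1.0, 1.0, 1.0, 0.0]   # state cost q(s); assumed values
terminal = {3}

z = solve_lmdp(passive, cost, terminal)
value = [-math.log(zi) for zi in z]          # v(s) = -log z(s)
policy_at_1 = optimal_transitions(passive, z, 1)
```

In this sketch the value decreases toward the goal, and the controlled walk at state 1 leans toward state 2 rather than state 0. No maximization over actions is needed, which is the source of the efficiency.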

## Friday, June 14, 2013

### Ari Pakman: June 19th

Title: Exact Hamiltonian Monte Carlo for Binary Distributions

Abstract: I will present a new approach to sample from generic binary distributions, based on an exact Hamiltonian Monte Carlo algorithm applied to a piecewise continuous augmentation of the binary distribution of interest. An extension of this idea to distributions over mixtures of binary and continuous variables permits sampling from posteriors of linear and probit regression models with spike-and-slab priors and truncated parameters.

## Sunday, June 9, 2013

### Prof. Michael Shadlen: June 12th

Title: Firing rate autocorrelation as a signature of noisy evidence accumulation

Abstract: I plan to discuss insights about the nature of neural noise and computation. I will also address neural mechanisms involved in decision making and bring the two topics together by introducing new tools that reveal noisy evidence accumulation (e.g., drift-diffusion) in the spike-trains of single neurons on single trials during decision formation.

## Friday, May 24, 2013

### Daryl Hochman: May 29th

Title: Optical Imaging Data Acquired From the Human Brain

Abstract: The amount of light absorbed and scattered by brain tissue is altered by neuronal activity. Imaging of “intrinsic optical signals” (ImIOS) is the technique of mapping these dynamic optical changes with high spatial and temporal resolution. ImIOS of the exposed brains of awake patients, performed during their neurosurgical treatment for intractable epilepsy, has unique advantages for studying certain aspects of the human brain. Better methods for the analysis and visualization of ImIOS data are motivated by at least two reasons. First, ImIOS can be used for investigating basic biological questions concerning the regulation of blood flow in the human brain during normal and epileptic activity. Second, optical imaging has the potential to be a practical clinical tool for localizing functional and epileptic brain regions in the operating room. My talk will focus on explaining the types of questions that can be investigated with optical imaging of the human brain, and illustrating the spatial and temporal features of these types of data that could benefit from better methods for their visualization and analysis.

A couple of relevant references:

1) https://www.ncbi.nlm.nih.gov/pubmed/21640137

2) https://www.ncbi.nlm.nih.gov/pubmed/1495561

## Wednesday, May 15, 2013

### Lars Buesing: May 22nd

Title: Dynamical System Models for Characterizing Multi-Electrode Recordings of Cortical Population Activity

Abstract: Multi-electrode techniques now make it possible to record from up to hundreds of cortical neurons simultaneously, and thus open the door to unprecedented insights into cortical neural population activity and the associated computations. However, in order to exploit this potential we need computationally tractable statistical methods that are able to see beyond signal and variability in individual neurons to structured activity that underlies reliable population computation. Such methods will very likely depend on analyzing the activity of the ensemble as a whole, rather than on simple single-neuron or pairwise analysis. In this talk I will argue that Dynamical System models, and more specifically Linear Dynamical Systems with Poisson observations (PLDS), meet these desiderata, while at the same time providing a parsimonious, statistically accurate description of the data. I will present a fast, robust algorithm for fitting PLDS models, which is based on spectral subspace methods. This algorithm substantially improves over standard approximate Expectation-Maximization for PLDS models in terms of both computational efficiency and quality of estimated parameters, hence greatly facilitating the application of these models to real multi-electrode recordings. Finally, I will show how Dynamical System models can be used to characterize fundamental dynamical properties of multi-electrode recordings from motor areas of awake, behaving macaque monkeys. This analysis reveals that different epochs of task-relevant behavior manifest themselves in different dynamics of the recorded neural population.

## Saturday, May 11, 2013

### David Pfau: May 15th

Title: Robust Learning of Low-Dimensional Dynamics from Large Neural Ensembles

Abstract: Progress in neural recording technology has made it possible to record spikes from ever larger populations of neurons. To cope with this deluge, a common strategy is to reduce the dimensionality of the data, most commonly by principal component analysis (PCA). In recent years a number of extensions to PCA have been introduced in the neuroscience literature, including jPCA and demixed principal component analysis (dPCA). A downside of these methods is that they do not treat either the discrete nature of spike data or the positivity of firing rates in a statistically principled way. In fact it is common practice to smooth the data substantially or average over many trials, losing information about fine temporal structure and inter-trial variability.

A more principled approach is to fit a state space model directly from spike data, where the latent state is low dimensional. Such models can account for the discreteness of spikes by using point-process models for the observations, and can incorporate temporal dependencies into the latent state model. State space models can include complex interactions such as switching linear dynamics and direct coupling between neurons. These methods have drawbacks too: they are typically fit by approximate EM or other methods that are prone to local minima, the number of latent dimensions must be chosen ahead of time (though nonparametric Bayesian models could avoid this issue) and a certain class of possible dynamics must be chosen before doing dimensionality reduction.

We attempt to combine the computational tractability of PCA and related methods with the statistical richness of state space models. Our approach is convex and based on recent advances in system identification using nuclear norm minimization, a relaxation of matrix rank minimization. Our contribution is threefold. 1) Low-dimensional subspaces can be accurately recovered, even when the dynamics are unknown and nonstationary. 2) Spectral methods can faithfully recover the parameters of state space models when applied to data projected into the recovered subspace. 3) Low-dimensional common inputs can be separated from sparse local interactions, suggesting that these techniques could be useful for inferring synaptic connectivity.

## Thursday, May 2, 2013

Title: Inferring neural connectivity

Abstract: Advances in large-scale multineuronal recordings have made it possible to study the simultaneous activity of complete ensembles of neurons. These techniques in principle provide the opportunity to discern the architecture of neuronal networks. However, current technologies can sample only a small fraction of the underlying circuitry, so unmeasured neurons probably have a large collective impact on network dynamics and coding properties. For example, it is well understood that common input plays an essential role in the interpretation of pairwise cross-correlograms. Inferring the correct connectivity and computations in the circuit requires modelling tools that account for unrecorded neurons. We develop a model for fast inference of neural connectivity under the constraint that we only observe a subset of neurons in the population at a time.

## Monday, April 29, 2013

### Ben Shababo: May 1st

Title: Optimal Sequential Stimulation of Neural Populations For Inferring Functional Connectivity

Abstract: In this talk, we will review ongoing work in which we use methods from Bayesian experimental design, a subset of Active Learning, to guide a circuit-mapping experiment. Specifically, the experimental paradigm we assume includes the recording of some output from a single cell, such as membrane voltage or current, and the ability to stimulate some subset of nearby neurons. The goal of the experiment is to learn the vector of weights that describe the influence of the cells we can stimulate on the cell we are recording from. In Bayesian experimental design the objective is to maximize the mutual information between the data and the parameters one wishes to learn, which in turn entails a probabilistic model. For our model, we use a spike-and-slab prior on the weights with a linear-Gaussian likelihood. Furthermore, since this algorithm must perform in an online setting, we speed it up by approximating the optimization with a greedy version of the algorithm and by using online Bayesian updating of the posterior during stimulus selection. We will present results that show that within a specific regime our procedure outperforms random stimulation. We will also present some ideas we are currently incorporating into our model to make it more robust and applicable for the current state of experimental technology.

## Thursday, April 18, 2013

Title: Accurate Optical AP Detection During ‘Natural’ Behavior: Two Inference Problems

Abstract: Two-photon calcium imaging can detect single action potentials in populations of spatially resolved neurons in vivo, but using it to quantitatively compare spiking and behavior requires solving several problems of analysis and experimental technique. This talk will focus on two such problems: accurately inferring spike counts from fluorescence signals, and measuring visual input in freely moving animals. For optical action potential detection, several algorithms exist along with a growing corpus of ground truth datasets. I will describe these as well as some current work to develop algorithms that are effective on a wide range of in vivo data, to develop metrics for testing spike inference, and to create a public database of ground truth measurements. In the second half of my talk, I will describe a system for eye tracking in freely moving rats compatible with two-photon imaging through optical fibers. I will also briefly describe some insights into the activity of cortical populations and rodent visual behavior provided by these methods.

## Monday, April 15, 2013

Title: A simple proof for the Marchenko-Pastur law

Abstract: I will review the proof of the Wigner semicircle law for Wigner matrices using the Stieltjes transform method described here. Using this method, I will present a new, simple proof of the old Marchenko-Pastur law, which describes the asymptotic behavior of singular values of large sample covariance matrices.
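
For readers unfamiliar with the objects named in this abstract, the standard statements can be written out as follows. This is a reference sketch in a common textbook convention, not the speaker's notation: it assumes an n × p data matrix X with i.i.d. entries of mean zero and unit variance, the sample covariance S = XᵀX / n, and an aspect ratio p/n → γ in (0, 1].

```latex
% Stieltjes transform of a probability measure \mu on the real line
m_\mu(z) = \int_{\mathbb{R}} \frac{\mu(dx)}{x - z},
\qquad z \in \mathbb{C} \setminus \mathbb{R}.

% Marchenko-Pastur density: limiting eigenvalue density of S = X^\top X / n
\rho_{\mathrm{MP}}(x) = \frac{\sqrt{(b - x)(x - a)}}{2 \pi \gamma x}
\quad \text{for } x \in [a, b],
\qquad a = (1 - \sqrt{\gamma})^2, \quad b = (1 + \sqrt{\gamma})^2.

% Its Stieltjes transform m(z) solves a quadratic self-consistency equation
\gamma z \, m(z)^2 + (z + \gamma - 1) \, m(z) + 1 = 0.
```

In the Stieltjes transform method, one shows that the transform of the empirical eigenvalue distribution approximately satisfies the quadratic equation, then recovers the density from the limit of the imaginary part of m(x + iε) / π as ε → 0.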

## Tuesday, April 2, 2013

### Attila Losonczy: April 10th

Title: Functional imaging hippocampal microcircuits in behaving mice.

Abstract: In my talk I will introduce recently developed methods for functional two-photon imaging of genetically and anatomically defined cellular and subcellular components of the hippocampal CA1 microcircuit in awake, behaving mice. I will review some advantages of this approach as well as current experimental and analytical challenges in dissecting the role of identified presynaptic and postsynaptic circuit motifs in hippocampal memory behaviors.

## Monday, April 1, 2013

### Eftychios Pnevmatikakis: April 3rd

Title: A brief introduction to determinantal point processes

Abstract: A brief introduction to determinantal point processes, a class of probabilistic models that capture global negative interactions yet allow for tractable inference. Material will be drawn from http://arxiv.org/abs/1207.6083 (chapters 1-4). Time permitting, we will also discuss the approach of this paper: http://books.nips.cc/papers/
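
As a small illustration of "negative interactions with tractable inference", the sketch below evaluates an L-ensemble determinantal point process on three items, where P(Y) = det(L_Y) / det(L + I). The kernel values and the helper names `det`, `submatrix`, and `dpp_probability` are assumptions made for the example, not material from the talk.

```python
from itertools import combinations

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0  # determinant of the empty matrix is 1
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def submatrix(L, subset):
    """Principal submatrix of L indexed by the items in `subset`."""
    return [[L[i][j] for j in subset] for i in subset]

def dpp_probability(L, subset):
    """L-ensemble probability P(Y = subset) = det(L_subset) / det(L + I)."""
    n = len(L)
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return det(submatrix(L, subset)) / det(L_plus_I)

# Hypothetical similarity kernel: items 0 and 1 are near-duplicates,
# item 2 is unrelated to both.
L = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

all_subsets = [s for k in range(4) for s in combinations(range(3), k)]
total = sum(dpp_probability(L, s) for s in all_subsets)
p_similar = dpp_probability(L, (0, 1))
p_dissimilar = dpp_probability(L, (0, 2))
```

Here `total` comes out to 1, since the normalizer is a single determinant, and the near-duplicate pair (0, 1) is much less likely to be selected together than the dissimilar pair (0, 2).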

## Friday, March 22, 2013

### Volker Pernice: March 27th

Title: Correlations and connectivity in populations of neurons

Abstract: Due to their ubiquity in experiments and their importance for neural network dynamics and function, covariances between neural spike trains have been studied extensively. Their origin and their relation to the structure of the network of synaptic connections can be understood in the framework of linearly interacting point processes. This model also approximately describes the covariances in networks of leaky integrate-and-fire neurons. Because direct as well as indirect connections generate covariances, the solution to the inverse problem of inferring network structure from covariances is not unique. However, the ambiguity can partly be resolved under the assumption of a sparse network.

## Wednesday, March 20, 2013

### Josh Merel: March 20

Josh will talk about single index models and multiple index models. Some references:

- example single index model estimation - http://snowbird.djvuzone.org/2008/abstracts/183.pdf

- newer (better?) single index model estimation - http://arxiv.org/abs/1104.2018

- a video-lecture on this stuff and applications - http://videolectures.net/nipsworkshops2012_ravikumar_single/

- previous well-known work employing Bregman divergence (for reference) - http://jmlr.csail.mit.edu/papers/volume6/banerjee05b/banerjee05b.pdf

If there is time he will also review spike train kernels and propose a (new?) kernel - looking for feedback on this. Some references:

- easier version - http://arxiv.org/abs/1302.5964

- antecedent work - http://www.magicbroom.info/Papers/ShpigelmanSiPaVa05.pdf

## Tuesday, March 12, 2013

### Arian Maleki: March 13

Title: Minimax image denoising via anisotropic nonlocal means

Abstract: Image denoising is a fundamental primitive in image processing and computer vision. Denoising algorithms have evolved from the classical linear and median filters to more modern schemes like total variation denoising, wavelet thresholding, and bilateral filters. A particularly successful denoising scheme is the nonlocal means (NLM) algorithm, which estimates each pixel value as a weighted average of other, similar noisy pixels. I start my talk by proving that the popular nonlocal means (NLM) denoising algorithm does not "optimally" denoise images with sharp edges. Its weakness lies in the isotropic nature of the neighborhoods it uses in order to set its smoothing weights. In response, I introduce the anisotropic nonlocal means (ANLM) algorithm and prove that it is near minimax optimal for edge-dominated images from the Horizon class. On real-world test images, an ANLM algorithm that adapts to the underlying image gradients outperforms NLM by a significant margin, up to 2dB in mean square error.
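
For reference, the baseline that the talk improves on can be sketched in a few lines. The following is a one-dimensional toy version of the standard isotropic NLM estimator, not the anisotropic ANLM algorithm from the abstract. The function name `nonlocal_means_1d`, the bandwidth `h`, the patch radius, and the test signal are all illustrative assumptions.

```python
import math

def nonlocal_means_1d(signal, patch_radius=1, h=0.5):
    """Estimate each sample as a weighted average of all samples,
    weighted by the similarity of their surrounding patches."""
    n = len(signal)

    def patch(i):
        # Clamp indices at the boundaries so every patch has the same length.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    patches = [patch(i) for i in range(n)]
    denoised = []
    for i in range(n):
        weights = []
        for j in range(n):
            dist2 = sum((a - b) ** 2 for a, b in zip(patches[i], patches[j]))
            weights.append(math.exp(-dist2 / (h * h)))
        total = sum(weights)
        denoised.append(sum(w * x for w, x in zip(weights, signal)) / total)
    return denoised

# A step edge with small, hand-picked perturbations standing in for noise.
clean = [0.0] * 5 + [1.0] * 5
noisy = [0.1, -0.1, 0.05, -0.05, 0.0, 1.0, 1.05, 0.95, 1.1, 0.9]
denoised = nonlocal_means_1d(noisy)

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

mse_noisy = mse(noisy, clean)
mse_denoised = mse(denoised, clean)
```

Samples far from the step are averaged with many similar patches, while the two samples adjacent to the step find few good matches because their patches straddle the edge. In two dimensions this is the weakness of isotropic neighborhoods that the abstract describes.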

## Tuesday, March 5, 2013

## Tuesday, February 26, 2013

### David Blei: Feb 27th

Title: Stochastic Variational Inference

Abstract: We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can handle the full data, and outperforms traditional variational inference on a subset. (Further, we show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to very large data sets.

You can read the paper here.

## Wednesday, February 20, 2013

### Carl Smith and Ari Pakman: Feb. 20

We review and present new results on spike-and-slab priors to impose sparsity in regression problems.

Outline:

- Why spike-and-slab?

- Variational Bayes approximation to the posterior

- Computing hyperparameters using Empirical Bayes.

- Singular and non-singular Markov Chains for MCMC.

- Gibbs sampler for the posterior sparsity variables.

- Extension to regression with positive coefficients.

- Example application: finding synaptic weights in a dendritic tree

## Friday, February 8, 2013

### Garud Iyengar: Feb 13

Title: Fast first-order augmented Lagrangian algorithms for sparse optimization problems

Abstract:

In this talk we will survey recent work on fast first-order algorithms for solving optimization problems with non-trivial conic constraints. These algorithms are augmented Lagrangian algorithms; however, unlike traditional augmented Lagrangian algorithms, we update the penalty multiplier during the course of the algorithm. The algorithm iterates are epsilon-feasible and epsilon-optimal in O(log(1/epsilon)) multiplier update steps, with an overall complexity of O(1/epsilon). We will discuss the key steps in the algorithm development and show numerical results for basis pursuit, principal component pursuit and stable principal component pursuit.

Joint work with N. Serhat Aybat (Penn State)
