Accelerating Monte Carlo methods for Bayesian inference in dynamical models
Title | Accelerating Monte Carlo methods for Bayesian inference in dynamical models |
Author | Johan Dahlin |
Publisher | Linköping University Electronic Press |
Pages | 139 |
Release | 2016-03-22 |
Genre | |
ISBN | 9176857972 |
Making decisions and predictions from noisy observations are two important and challenging problems in many areas of society. Some examples of applications are recommendation systems for online shopping and streaming services, connecting genes with certain diseases, and modelling climate change. In this thesis, we make use of Bayesian statistics to construct probabilistic models given prior information and historical data, which can be used for decision support and predictions. The main obstacle with this approach is that it often results in mathematical problems lacking analytical solutions. To cope with this, we make use of statistical simulation algorithms known as Monte Carlo methods to approximate the intractable solution. These methods enjoy well-understood statistical properties but are often computationally prohibitive to employ. The main contribution of this thesis is the exploration of different strategies for accelerating inference methods based on sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC), that is, strategies for reducing the computational effort while maintaining or improving the accuracy. A major part of the thesis is devoted to proposing such strategies for the MCMC method known as the particle Metropolis-Hastings (PMH) algorithm. We investigate two strategies: (i) introducing estimates of the gradient and Hessian of the target to better tailor the algorithm to the problem and (ii) introducing a positive correlation between the point-wise estimates of the target. Furthermore, we propose an algorithm based on the combination of SMC and Gaussian process optimisation, which can provide reasonable estimates of the posterior but with a significant decrease in computational effort compared with PMH. Moreover, we explore the use of sparseness priors for approximate inference in over-parametrised mixed effects models and autoregressive processes. This can potentially be a practical strategy for inference in the big data era. Finally, we propose a general method for increasing the accuracy of the parameter estimates in non-linear state space models by applying a designed input signal.
Should the Riksbank raise or lower the repo rate at its next meeting in order to reach the inflation target? Which genes are associated with a certain disease? How can Netflix and Spotify know which films and music I will want to watch and listen to next? These three problems are examples of questions where statistical models can be useful for providing guidance and support for decisions. Statistical models combine theoretical knowledge about, for example, the Swedish economic system with historical data to produce forecasts of future events. These forecasts can then be used to evaluate, for instance, what would happen to inflation in Sweden if unemployment falls, or how the value of my pension savings changes when the Stockholm stock exchange crashes. Applications such as these, and many others, make statistical models important in many parts of society. One way of developing statistical models is to continuously update a model as more information is collected. This approach is called Bayesian statistics and is particularly useful when one has good prior insights into the model, or has access to only a small amount of historical data with which to build it. A drawback of Bayesian statistics is that the computations required to update the model with the new information are often very complicated.
In such situations, one can instead simulate the outcomes of millions of variants of the model and then compare these against the historical observations at hand. One can then average over the variants that gave the best results to arrive at a final model. It can therefore sometimes take days or weeks to produce a model. The problem becomes particularly severe when one uses more advanced models that could give better forecasts but take too long to build. In this thesis, we use a number of different strategies to facilitate or improve these simulations. For example, we propose taking more insights about the system into account, thereby reducing the number of model variants that need to be examined, since some models can be ruled out from the start when we have a good idea of roughly what a good model should look like. We can also modify the simulation so that it moves more easily between different types of models, so that the space of all possible models is explored more efficiently. We propose a number of combinations and modifications of existing methods to speed up the fitting of the model to the observations, and we show that the computation time can in some cases be reduced from a few days to about an hour. Hopefully, this will in the future make it possible to use more advanced models in practice, which in turn would lead to better forecasts and decisions.
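As a rough illustration of the PMH idea described in this abstract, the sketch below runs a bootstrap particle filter inside a Metropolis-Hastings loop for a toy scalar state-space model. The model, the random-walk proposal, the flat prior, and all tuning constants are illustrative assumptions, not the thesis's setup.

```python
import numpy as np

# Toy particle Metropolis-Hastings (PMH) sketch for the scalar model
#   x_t = phi * x_{t-1} + v_t,  v_t ~ N(0, 1)
#   y_t = x_t + e_t,            e_t ~ N(0, 0.5^2)
rng = np.random.default_rng(0)

def log_likelihood_estimate(phi, y, num_particles=200):
    """Bootstrap particle filter estimate of log p(y | phi)."""
    particles = rng.normal(0.0, 1.0, num_particles)
    log_like = 0.0
    for t in range(len(y)):
        particles = phi * particles + rng.normal(0.0, 1.0, num_particles)
        log_w = -0.5 * ((y[t] - particles) / 0.5) ** 2   # Gaussian log-weights (up to a constant)
        w = np.exp(log_w - log_w.max())
        log_like += log_w.max() + np.log(w.mean())
        idx = rng.choice(num_particles, num_particles, p=w / w.sum())  # multinomial resampling
        particles = particles[idx]
    return log_like

# Simulate data from phi = 0.7, then run PMH with a random-walk proposal.
T, phi_true = 100, 0.7
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi_true * x[t - 1] + rng.normal()
y = x + rng.normal(0.0, 0.5, T)

phi, ll = 0.5, log_likelihood_estimate(0.5, y)
chain = []
for _ in range(2000):
    phi_prop = phi + 0.05 * rng.normal()        # random-walk proposal
    ll_prop = log_likelihood_estimate(phi_prop, y)
    # Flat prior assumed, so only the (estimated) likelihood ratio appears.
    if np.log(rng.uniform()) < ll_prop - ll:
        phi, ll = phi_prop, ll_prop
    chain.append(phi)
print("posterior mean of phi:", np.mean(chain[500:]))
```

The thesis's acceleration strategies would enter exactly here: gradient and Hessian estimates reshape the proposal, and correlating the noise used in successive likelihood estimates reduces the variance of the acceptance ratio.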
Machine learning using approximate inference
Title | Machine learning using approximate inference |
Author | Christian Andersson Naesseth |
Publisher | Linköping University Electronic Press |
Pages | 62 |
Release | 2018-11-27 |
Genre | |
ISBN | 9176851613 |
Automatic decision making and pattern recognition under uncertainty are difficult tasks that are ubiquitous in our everyday life. The systems we design, and the technology we develop, require us to coherently represent and work with uncertainty in data. Probabilistic models and probabilistic inference give us a powerful framework for solving this problem. Using this framework, while enticing, results in difficult-to-compute integrals and probabilities when conditioning on the observed data. This means we need approximate inference: methods that solve the problem approximately using a systematic approach. In this thesis we develop new methods for efficient approximate inference in probabilistic models. There are generally two approaches to approximate inference: variational methods and Monte Carlo methods. In Monte Carlo methods we use a large number of random samples to approximate the integral of interest. With variational methods, on the other hand, we turn the integration problem into an optimization problem. We develop algorithms of both types and bridge the gap between them. First, we present a self-contained tutorial on the popular sequential Monte Carlo (SMC) class of methods. Next, we propose new algorithms and applications based on SMC for approximate inference in probabilistic graphical models. We derive nested sequential Monte Carlo, a new algorithm particularly well suited for inference in a large class of high-dimensional probabilistic models. Then, inspired by similar ideas, we derive interacting particle Markov chain Monte Carlo to make use of parallelization to speed up approximate inference for universal probabilistic programming languages. After that, we show how the rejection sampling process used when generating gamma-distributed random variables can be exploited to speed up variational inference. Finally, we bridge the gap between SMC and variational methods by developing variational sequential Monte Carlo, a new flexible family of variational approximations.
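The rejection-sampling idea for gamma variables mentioned above builds on samplers of the Marsaglia-Tsang type; the sketch below shows such a sampler on its own, with the shape parameter and acceptance test written out. How the thesis couples this process to variational inference is not reproduced here.

```python
import numpy as np

# Marsaglia-Tsang rejection sampler for Gamma(alpha, 1), valid for alpha >= 1.
# The Gaussian proposal noise `eps` is the "reparameterizable" ingredient that
# makes this sampler useful for gradient-based variational inference.
rng = np.random.default_rng(1)

def sample_gamma(alpha):
    d = alpha - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)
    while True:
        eps = rng.normal()                 # proposal noise
        v = (1.0 + c * eps) ** 3           # smooth transformation h(eps) = d*v
        if v <= 0.0:
            continue                       # transformation invalid; reject
        u = rng.uniform()
        # Acceptance test from Marsaglia & Tsang (2000)
        if np.log(u) < 0.5 * eps**2 + d - d * v + d * np.log(v):
            return d * v

samples = np.array([sample_gamma(3.0) for _ in range(10000)])
print(samples.mean())  # should be close to alpha = 3
```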
Controllability of Complex Networks at Minimum Cost
Title | Controllability of Complex Networks at Minimum Cost |
Author | Gustav Lindmark |
Publisher | Linköping University Electronic Press |
Pages | 38 |
Release | 2020-04-30 |
Genre | Electronic books |
ISBN | 9179298478 |
The control-theoretic notion of controllability captures the ability to guide a system toward a desired state with a suitable choice of inputs. Controllability of complex networks such as traffic networks, gene regulatory networks, and power grids can, for instance, enable efficient operation or entirely new applications. However, when control theory is applied to complex networks like these, several challenges arise. This thesis considers some of them; in particular, we investigate how a given network can be rendered controllable at minimum cost by placing control inputs or by growing the network with additional edges between its nodes. As cost function we take either the number of control inputs that are needed or the energy that they must exert. A control input is called unilateral if it can assume either positive or negative values, but not both. Motivated by the many applications where unilateral controls are common, we reformulate classical controllability results for this particular case into a more computationally efficient form that enables large-scale analysis. Assuming that each control input targets only one node (called a driver node), we show that the unilateral controllability problem is to a high degree structural: from topological properties of the network we derive theoretical lower bounds for the minimal number of unilateral control inputs, bounds similar to those that have already been established for the minimal number of unconstrained control inputs (i.e., inputs that can assume both positive and negative values). With a constructive algorithm for unilateral control input placement, we also show that the theoretical bounds can often be achieved. A network may be controllable in theory but not in practice if, for instance, unreasonable amounts of control energy are required to steer it in some direction. For the case with unconstrained control inputs, we show that the control energy depends on the time constants of the modes of the network: the longer they are, the less energy is required for control. We also present different strategies for the problem of placing driver nodes such that the control energy requirements are reduced (assuming that theoretical controllability is not an issue). For the most general class of networks we consider, directed networks with arbitrary eigenvalues (and thereby arbitrary time constants), we suggest strategies based on a novel characterization of network non-normality as an imbalance in the distribution of energy over the network. Our formulation allows us to quantify network non-normality at the node level as a combination of two different centrality metrics. The first metric quantifies the influence that each node has on the rest of the network, while the second describes the ability to control a node indirectly from the other nodes. Selecting the nodes that maximize the network non-normality as driver nodes significantly reduces the energy needed for control. Growing a network, i.e., adding more edges to it, is a promising alternative for reducing the energy needed to control it. We approach this by deriving a sensitivity function that makes it possible to quantify the impact of an edge modification in terms of the H2 and H∞ norms, which in turn can be used to design edge additions that improve commonly used control energy metrics.
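A minimal sketch of the two standard notions behind the discussion above: the Kalman rank condition for controllability and the finite-horizon controllability Gramian, whose inverse gives the minimum control energy. The three-node chain network, the single driver node, and the horizon are toy assumptions; the thesis's unilateral-input and non-normality results are not reproduced here.

```python
import numpy as np

# Toy discrete-time network x_{k+1} = A x_k + B u_k with one driver node.
A = np.array([[0.9, 0.0, 0.0],
              [0.5, 0.8, 0.0],
              [0.0, 0.5, 0.7]])      # chain network with slow modes
B = np.array([[1.0], [0.0], [0.0]])  # single driver node: node 1

n = A.shape[0]
# Kalman rank condition: [B, AB, A^2 B, ...] must have rank n.
C = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
print("controllable:", np.linalg.matrix_rank(C) == n)

# Finite-horizon controllability Gramian W_T = sum_k A^k B B^T (A^k)^T.
T = 50
W = sum(np.linalg.matrix_power(A, k) @ B @ B.T @ np.linalg.matrix_power(A, k).T
        for k in range(T))

# Minimum energy to steer from the origin to x_f in T steps: x_f^T W^{-1} x_f.
# Modes with longer time constants (eigenvalues closer to 1) inflate W and
# thus lower this energy, matching the observation in the abstract.
x_f = np.array([0.0, 0.0, 1.0])
print("minimum control energy:", x_f @ np.linalg.solve(W, x_f))
```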
Sensor Management for Target Tracking Applications
Title | Sensor Management for Target Tracking Applications |
Author | Per Boström-Rost |
Publisher | Linköping University Electronic Press |
Pages | 61 |
Release | 2021-04-12 |
Genre | |
ISBN | 9179296726 |
Many practical applications, such as search and rescue operations and environmental monitoring, involve the use of mobile sensor platforms. The workload of the sensor operators is becoming overwhelming, as both the number of sensors and their complexity are increasing. This thesis addresses the problem of automating sensor systems to support the operators. This is often referred to as sensor management. By planning trajectories for the sensor platforms and exploiting sensor characteristics, the accuracy of the resulting state estimates can be improved. The considered sensor management problems are formulated in the framework of stochastic optimal control, where prior knowledge, sensor models, and environment models can be incorporated. The core challenge lies in making decisions based on the predicted utility of future measurements. In the special case of linear Gaussian measurement and motion models, the estimation performance is independent of the actual measurements. This reduces the problem of computing sensing trajectories to a deterministic optimal control problem, for which standard numerical optimization techniques can be applied. A theorem is formulated that makes it possible to reformulate a class of nonconvex optimization problems with matrix-valued variables as convex optimization problems. This theorem is then used to prove that globally optimal sensing trajectories can be computed using off-the-shelf optimization tools. As in many other fields, nonlinearities make sensor management problems more complicated. Two approaches are derived to handle the randomness inherent in the nonlinear problem of tracking a maneuvering target using a mobile range-bearing sensor with limited field of view. The first approach uses deterministic sampling to predict several candidates of future target trajectories that are taken into account when planning the sensing trajectory. This significantly increases the tracking performance compared to a conventional approach that neglects the uncertainty in the future target trajectory. The second approach is a method to find the optimal range between the sensor and the target. Given the size of the sensor's field of view and an assumption of the maximum acceleration of the target, the optimal range is determined as the one that minimizes the tracking error while satisfying a user-defined constraint on the probability of losing track of the target. While optimization for tracking of a single target may be difficult, planning for jointly maintaining track of discovered targets and searching for yet undetected targets is even more challenging. Conventional approaches are typically based on a traditional tracking method with separate handling of undetected targets. Here, it is shown that the Poisson multi-Bernoulli mixture (PMBM) filter provides a theoretical foundation for a unified search and track method, as it not only provides state estimates of discovered targets, but also maintains an explicit representation of where undetected targets may be located. Furthermore, in an effort to decrease the computational complexity, a version of the PMBM filter which uses a grid-based intensity to represent undetected targets is derived.
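A sketch of the linear-Gaussian observation made above: because the Kalman covariance recursion does not depend on the measured values, candidate sensing positions can be ranked before any measurement is collected. The target model, the range-dependent noise scaling, and the greedy one-step planner are all illustrative assumptions rather than the thesis's formulation.

```python
import numpy as np

# One-step sensor placement for tracking a (nearly) stationary 2-D target.
A = np.eye(2)                     # target state: position only
Q = 0.01 * np.eye(2)              # process noise covariance
P = np.eye(2)                     # current estimate covariance
target_est = np.array([5.0, 5.0]) # current position estimate

def predicted_covariance(P, sensor_pos):
    """Kalman predict/update; measurement noise grows with sensor-target range."""
    P_pred = A @ P @ A.T + Q
    H = np.eye(2)                               # direct position measurement
    r = np.linalg.norm(target_est - sensor_pos)
    R = (0.1 + 0.05 * r**2) * np.eye(2)         # range-dependent noise (assumed)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    return (np.eye(2) - K @ H) @ P_pred

# Rank candidate sensor positions by predicted estimation accuracy: the
# covariance (hence the trace criterion) needs no actual measurement values.
candidates = [np.array([0.0, 0.0]), np.array([4.0, 4.0]), np.array([8.0, 8.0])]
best = min(candidates, key=lambda s: np.trace(predicted_covariance(P, s)))
print("best next sensor position:", best)
```

Chaining such evaluations over a horizon turns trajectory planning into the deterministic optimal control problem mentioned in the abstract; the nonlinear range-bearing case loses this measurement independence, which is what the two sampling-based approaches address.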
Gaussian Processes for Positioning Using Radio Signal Strength Measurements
Title | Gaussian Processes for Positioning Using Radio Signal Strength Measurements |
Author | Yuxin Zhao |
Publisher | Linköping University Electronic Press |
Pages | 74 |
Release | 2019-02-27 |
Genre | |
ISBN | 9176851621 |
Estimation of unknown parameters is one of the major research areas in statistical signal processing. In recent decades, estimation-theoretic approaches have become more and more attractive in practical applications. Examples of such applications include, but are not limited to, positioning using various measurable radio signals in indoor environments, self-navigation for autonomous cars, image processing, and radar tracking. One issue that is usually encountered when solving an estimation problem is identifying a good system model, which can have a great impact on the estimation performance. In this thesis, we are interested in studying estimation problems, particularly in inferring unknown positions from noisy radio signal measurements. In addition, the modeling of the system is studied by investigating the relationship between positions and radio signal strength measurements. One of the main contributions of this thesis is a novel indoor positioning framework based on proximity measurements, which are obtained by quantizing the received signal strength measurements. Sequential Monte Carlo methods, more specifically the particle filter and smoother, are utilized for estimating unknown positions from proximity measurements. The Cramér-Rao bounds for proximity-based positioning are further derived as a benchmark for the positioning accuracy in this framework. Secondly, to improve the estimation performance, Bayesian non-parametric modeling, namely Gaussian processes, has been adopted to provide more accurate and flexible models for both dynamic motion and radio signal strength measurements. Then, the Cramér-Rao bounds for Gaussian process based system models are derived and evaluated in an indoor positioning scenario. In addition, we estimate the positions of stationary devices by comparing individual signal strength measurements with a pre-constructed fingerprinting database. The positioning accuracy is further compared to the case where a moving device is positioned using a time series of radio signal strength measurements. Moreover, Gaussian processes have been applied to sports analytics, where trajectory modeling for athletes is studied. The proposed framework can be further utilized for, for instance, performance prediction and analysis and health condition monitoring. Finally, a grey-box model is proposed to analyze the forces, particularly in cross-country skiing races, by combining a deterministic kinetic model with a Gaussian process.
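A sketch of the kind of Gaussian-process model of received signal strength (RSS) over position that the fingerprinting discussion above refers to. The training layout, kernel, hyperparameters, and the simple nearest-fit positioning step are illustrative assumptions; with a single access point the RSS only constrains the range, so a real system would combine several such maps.

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf_kernel(X1, X2, length=2.0, var=25.0):
    """Squared-exponential kernel between two sets of 2-D positions."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length**2)

# Fingerprint database: RSS (dBm) measured at 50 known positions, generated
# here from a synthetic log-distance path-loss model around an AP at (5, 5).
X_train = rng.uniform(0, 10, (50, 2))
rss_train = (-40.0
             - 20.0 * np.log10(1.0 + np.linalg.norm(X_train - np.array([5.0, 5.0]), axis=1))
             + rng.normal(0, 1.0, 50))

# GP regression: posterior mean of RSS at query positions.
noise_var = 1.0
K = rbf_kernel(X_train, X_train) + noise_var * np.eye(50)
alpha = np.linalg.solve(K, rss_train - rss_train.mean())

def predict_rss(X_query):
    return rss_train.mean() + rbf_kernel(X_query, X_train) @ alpha

# Positioning: given one new RSS reading, pick the grid point whose GP-predicted
# RSS matches best (a deliberately simple nearest-fit, for illustration only).
grid = np.array([[i, j] for i in np.linspace(0, 10, 41) for j in np.linspace(0, 10, 41)])
rss_observed = -55.0
estimate = grid[np.argmin((predict_rss(grid) - rss_observed) ** 2)]
print("estimated position:", estimate)
```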
Flight Test System Identification
Title | Flight Test System Identification |
Author | Roger Larsson |
Publisher | Linköping University Electronic Press |
Pages | 326 |
Release | 2019-05-15 |
Genre | Science |
ISBN | 9176850706 |
With the demand for more advanced fighter aircraft, which rely on unstable flight-mechanical characteristics to gain flight performance, more focus has been put on model-based systems engineering to support the design work. The flight control system design is one important part that relies on this modeling. It has therefore become more important to develop flight-mechanical models that are highly accurate over the whole flight envelope. For today's modern fighter aircraft, the basic flight-mechanical characteristics change between linear and nonlinear, as well as stable and unstable, as an effect of the desired capability of advanced maneuvering at subsonic, transonic and supersonic speeds. This thesis combines the subject of system identification, the art of building mathematical models of dynamical systems from measurements, with aeronautical engineering in order to find methods for identifying flight-mechanical characteristics. Here, some challenging aeronautical identification problems, in which model parameters are estimated from flight tests, are treated. Two aspects are considered. The first is online identification during flight tests, with the intent of aiding the engineers in the analysis process when examining the flight-mechanical characteristics. This also ensures that enough information is available in the resulting test data for post-flight analysis. Here, a frequency-domain method is used. An existing method has been developed further by including an instrumental-variable approach to handle noisy data, including atmospheric turbulence, and by a sensor-fusion step to handle varying excitation during an experiment. The method treats linear systems that can be both stable and unstable and that operate under feedback control. An experiment has been performed on a radio-controlled demonstrator aircraft. For this, multisine input signals have been designed, and the results show that it is possible to perform more time-efficient flight testing compared with standard input signals. The other aspect is post-flight identification of nonlinear characteristics. Here, the properties of a parameterized observer approach, using a prediction-error method, are investigated. This approach is compared with four other methods on several test cases. It is shown that the parameterized observer approach is the most robust with respect to noise disturbances and initial offsets. Another attractive property is that no user parameters have to be tuned by the engineers in order to get the best performance. All methods in this thesis have been validated on simulated data, where the system is known, and have also been tested on real flight test data. Both of the investigated approaches show promising results.
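A sketch of two ingredients mentioned above: a multisine excitation signal and a one-step-ahead prediction-error fit, here for a first-order ARX model where the fit reduces to ordinary least squares. The model order, frequencies, and noise levels are illustrative assumptions, not the thesis's flight-test setup.

```python
import numpy as np

rng = np.random.default_rng(3)
fs, T = 100.0, 10.0                 # sample rate (Hz) and experiment duration (s)
t = np.arange(0, T, 1 / fs)

# Multisine input: a sum of sinusoids at chosen frequencies with random phases,
# exciting several frequencies simultaneously in a single experiment — the
# property that makes flight testing more time-efficient than one-tone sweeps.
freqs = [0.5, 1.0, 2.0, 4.0]
u = sum(np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi)) for f in freqs)

# Simulate a first-order system y_k = a*y_{k-1} + b*u_{k-1} + e_k.
a_true, b_true = 0.95, 0.1
y = np.zeros_like(u)
for k in range(1, len(u)):
    y[k] = a_true * y[k - 1] + b_true * u[k - 1] + 0.01 * rng.normal()

# Prediction-error estimate: for this ARX structure, minimizing the sum of
# squared one-step-ahead prediction errors is a linear least-squares problem.
Phi = np.column_stack([y[:-1], u[:-1]])
a_hat, b_hat = np.linalg.lstsq(Phi, y[1:], rcond=None)[0]
print(f"a = {a_hat:.3f} (true {a_true}), b = {b_hat:.3f} (true {b_true})")
```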
Time of Flight Estimation for Radio Network Positioning
Title | Time of Flight Estimation for Radio Network Positioning |
Author | Kamiar Radnosrati |
Publisher | Linköping University Electronic Press |
Pages | 103 |
Release | 2020-02-17 |
Genre | |
ISBN | 9179298842 |
Trilateration is the mathematical theory of computing the intersection of circles. These circles may be obtained from time of flight (ToF) measurements in radio systems, as well as in laser, radar and sonar systems. A first purpose of this thesis is to survey recent efforts in the area and their potential for localization. The rest of the thesis then concerns selected problems in new cellular radio standards as well as fundamental challenges caused by propagation delays in the ToF measurements, since radio signals cannot travel faster than the speed of light. We refer to the measurement uncertainty stemming from propagation delays as positive noise, and develop a general theory with optimal estimators for selected distributions, which can be applied to trilateration but also to a much wider class of estimation problems. The first contribution concerns a narrow-band mode in the long-term evolution (LTE) standard intended for internet of things (IoT) devices. This LTE standard includes a special positioning reference signal sent synchronously by all base stations (BS) to all IoT devices. Each device can then compute several pair-wise time differences, each of which defines a hyperbola. The simulation-based performance evaluation indicates that decent position accuracy can be achieved despite the narrow bandwidth of the channel. The second contribution is a study of how timing measurements in LTE can be combined. Round trip time (RTT) to the serving BS and time difference of arrival (TDOA) to the neighboring BSs are used as measurements. We propose a filtering framework to deal with the existing uncertainty in the solution and evaluate it on both simulated and experimental test data. The results indicate that the position accuracy is better than 40 meters 95% of the time. The third contribution is a comprehensive theory of how to estimate a signal observed in positive noise, that is, in noise described by random variables with positive support. It is well known from the literature that order statistics can give an order of magnitude lower estimation variance compared to the best linear unbiased estimator (BLUE). We provide a systematic survey of some common distributions with positive support, and provide derivations and summaries of estimators based on order statistics, including the BLUE for comparison. An iterative global navigation satellite system (GNSS) localization algorithm, based on the derived estimators, is introduced to jointly estimate the receiver's position and clock bias. The fourth contribution extends the third to a particular approach for utilizing positive noise in nonlinear models: order statistics are employed to derive estimators for a generic nonlinear model with positive noise. The proposed method further enables the estimation of the hyperparameters of the underlying noise distribution. The performance of the proposed estimator is then compared with the maximum likelihood estimator when the underlying noise follows either a uniform or an exponential distribution.
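A sketch of trilateration as described above: ranges obtained from ToF measurements define circles around known anchors, and subtracting one circle equation from the others turns the intersection problem into a linear least-squares problem. The anchor layout and the exponential (positive-support) noise are illustrative assumptions; the thesis's order-statistics estimators are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
p_true = np.array([3.0, 6.0])

# Ranges (ToF times the speed of light) with positive noise: propagation
# delays only ever lengthen the ToF, so the error has positive support.
ranges = np.linalg.norm(anchors - p_true, axis=1) + rng.exponential(0.1, 4)

# Subtracting the first circle equation from the others removes the ||p||^2
# term:  2 (a_i - a_1)^T p = ||a_i||^2 - ||a_1||^2 - r_i^2 + r_1^2.
A = 2 * (anchors[1:] - anchors[0])
b = (np.sum(anchors[1:] ** 2, axis=1) - np.sum(anchors[0] ** 2)
     - ranges[1:] ** 2 + ranges[0] ** 2)
p_hat = np.linalg.lstsq(A, b, rcond=None)[0]
print("estimated position:", p_hat)
```

Because the exponential delays bias every range upward, a plain least-squares fit like this is systematically pulled away from the anchors; accounting for that asymmetry is precisely what the positive-noise estimators in the thesis are designed for.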