-
Let's Cover Homology (pt 1)
Topology: what is it? Topology is essentially the study of proximity, which is a close analogue for shape. The specifics of how that’s done are not really the point of this post. That’s all for this section. How is it related to analyzing data? Most data analysis techniques are already about quantifying some geometric attribute. For instance, when you fit a linear model, you’re really just imposing an affine model on the data and computing the appropriate parameters (slope and intercept). But these sorts of things work best if you have an idea of what an appropriate shape would be. That’s one area where topology might be able to help. But to hear more about that, you’ll have to wait for my next post. There is a notion of equivalence (called isomorphism) of topological spaces (bi-...
-
Decomposition of Autoregressive Models
Autoregression background: This post will be a formal introduction to some of the theory of autoregressive models. Specifically, we’ll tackle how to decompose an AR(p) model into a collection of AR(1) models, and discuss how to interpret them. This post will develop the subject using what seems to be an atypical approach, but one that I find very elegant. The traditional way: Let $x_t$ be an AR(p) process, so $x_t = \sum\limits_{i=1}^pa_ix_{t-i} + w_t$. We can express $x_t$ thusly: \[\begin{align*} x_t =& \sum\limits_{i=1}^pa_ix_{t-i} + w_t \\ x_t - \sum\limits_{i=1}^pa_ix_{t-i} =& w_t \\ \left(1 - \sum\limits_{i=1}^pa_iL^i\right)x_t =& w_t \end{align*}\] where $L$ is the lag operator. We define the AR polynomial $\Phi$ as $\Phi(L)...
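The decomposition the excerpt builds toward can be illustrated numerically: the roots of the AR polynomial $\Phi$ give the coefficients of the AR(1) factors. A minimal sketch assuming NumPy, with illustrative AR(2) coefficients of my own choosing (not from the post):

```python
import numpy as np

# Illustrative AR(2) coefficients a_1, a_2 (chosen for the example)
a = [1.1, -0.3]

# AR polynomial Phi(z) = 1 - a_1 z - a_2 z^2; np.roots wants the
# highest-degree coefficient first
roots = np.roots([-a[1], -a[0], 1.0])

# The reciprocals of the roots of Phi are the coefficients of the
# AR(1) factors: Phi(L) = (1 - 0.5 L)(1 - 0.6 L) for this example
ar1_coeffs = sorted((1.0 / roots).real)
print(ar1_coeffs)  # → [0.5, 0.6]
```

Stationarity corresponds to all roots of $\Phi$ lying outside the unit circle, i.e. every AR(1) coefficient having magnitude less than one.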
-
Markov Chain Monte Carlo
$\text{???} = (MC)^2$ Background: What if we know the relative likelihood, but want the probability distribution? \[\mathbb{P}(X=x) = \frac{f(x)}{\int_{-\infty}^\infty f(x)dx}\] But what if $\int f(x)dx$ is hard, or you can’t sample from $f$ directly? This is the problem we will be trying to solve. First approach: If the space is bounded (the integral is over $(a,b)$), we can use Monte Carlo to estimate $\int\limits_a^b f(x)dx$: pick $\alpha \in (a,b)$ uniformly at random; compute $f(\alpha)\cdot(b-a)$; repeat as necessary; average the computed values. The big bag of nope: What if we can’t sample from $f(x)$ but can only determine likelihood ratios? \[\frac{f(x)}{f(y)}\] Enter Markov Chain Monte Carlo. The obligatory basics: A Markov Chain is a stochastic process (a collection of i...
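The bounded-interval Monte Carlo steps above can be sketched in a few lines; the test integrand $f(x) = x^2$ on $(0, 1)$ is my own choice, picked because the true value $1/3$ is easy to check:

```python
import random

def mc_integral(f, a, b, n=100_000):
    """Estimate the integral of f over (a, b): average f at uniformly
    sampled points, then scale by the interval length (b - a)."""
    total = sum(f(random.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

random.seed(0)  # for reproducibility
est = mc_integral(lambda x: x * x, 0.0, 1.0)  # true value is 1/3
```

Note the scaling: since $\mathbb{E}[f(\alpha)] = \frac{1}{b-a}\int_a^b f(x)dx$ for uniform $\alpha$, the average must be multiplied by $(b-a)$ to recover the integral.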
-
Linear Regression -- The Basics
The basics. Yeah, it’s not a good sign if I’m starting out already repeating myself. But that’s how things seem to be with linear regression, so I guess it’s fitting. It seems like every day one of my professors will talk about linear regression, and it’s not due to laziness or lack of coordination. Indeed, it’s an intentional part of the curriculum here at New College of Florida because of how ubiquitous linear regression is. Not only is it an extremely simple yet expressive formulation, it’s also the theoretical basis of a whole slew of other techniques. Let’s just get right into it, shall we? Linear regression: Let’s say you have some data from the real world (and hence riddled with real-world error). A basic example for us to start with is this one: There’s clearly a linear tren...
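The "data with a linear trend" setup described above can be sketched concretely; the synthetic slope, intercept, and noise level here are illustrative stand-ins for the post's example, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "real world" data: a linear trend plus noise (illustrative)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Least-squares fit of the two parameters: slope and intercept
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)  # close to the true 2.0 and 1.0
```

With modest noise, the recovered parameters land near the values used to generate the data, which is the basic promise of least squares.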
-
Probability -- A Measure Theoretic Approach
Probability using Measure Theory: a mathematically rigorous definition of probability, and some examples therein. The Traditional Definition: Consider a set $\Omega$ (called the sample space), and a function $X:\Omega\rightarrow\mathbb{R}$ (called a random variable). If $\Omega$ is countable (or finite), a function $\mathbb{P}:\Omega\rightarrow\mathbb{R}$ is called a probability distribution if it satisfies the following 2 conditions: For each $x \in \Omega$, $\mathbb{P}(x) \geq 0$. If $A_i\cap A_j = \emptyset$ whenever $i\neq j$, then $\mathbb{P}\left(\bigcup\limits_{i=0}^\infty A_i\right) = \sum\limits_{i=0}^\infty\mathbb{P}(A_i)$. And if $\Omega$ is uncountable, a function $F:\mathbb{R}\rightarrow\mathbb{R}$ is called a probability distribution or a cumulative distribution function if it satisf...
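The two countable-case conditions can be sanity-checked on a small example; the fair die below is my own illustration, not one from the post:

```python
from fractions import Fraction

# Finite sample space: a fair six-sided die
omega = {1, 2, 3, 4, 5, 6}
P_point = {x: Fraction(1, 6) for x in omega}  # P on individual points

def P(A):
    """Extend P from points to subsets of omega by additivity."""
    return sum(P_point[x] for x in A)

# Condition 1: non-negativity at every point
assert all(P_point[x] >= 0 for x in omega)

# Condition 2: additivity on disjoint events
A, B = {1, 2}, {5}
assert P(A | B) == P(A) + P(B)

# Total mass is 1, so P is a genuine probability distribution
assert P(omega) == 1
```

Using `Fraction` keeps the arithmetic exact, so the additivity check is an equality rather than a floating-point approximation.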
-
A dirty little ditty on Finite Automata
This post builds on the previous post about Formal Languages. Some formal definitions: A Deterministic Finite Automaton (DFA) is: a set $\mathcal{Q}$ called “states”; a set $\Sigma$ called “symbols” or the “alphabet”; a function $\delta:\mathcal{Q}\times\Sigma \to \mathcal{Q}$; a designated state $q_0\in\mathcal{Q}$ called the start state; and a subset $F\subseteq\mathcal{Q}$ called the “accepting states”. The DFA is then often referred to as the ordered quintuple $A=(\mathcal{Q},\Sigma,\delta,q_0,F)$. Defining how strings act on DFAs: Given a DFA $A=(\mathcal{Q}, \Sigma, \delta, q_0, F)$, a state $q_i\in\mathcal{Q}$, and a string $w\in\Sigma^*$, we can define $\delta(q_i,w)...
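The quintuple and the extension of $\delta$ from symbols to strings can be sketched directly; the even-number-of-1s machine below is my own example, not one from the post:

```python
# A small DFA over {"0", "1"} accepting strings with an even number of 1s
Q = {"even", "odd"}
Sigma = {"0", "1"}
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
q0, F = "even", {"even"}

def delta_star(q, w):
    """Extend delta to strings by applying it one symbol at a time."""
    for symbol in w:
        q = delta[(q, symbol)]
    return q

def accepts(w):
    """A string is accepted iff it drives q0 into an accepting state."""
    return delta_star(q0, w) in F

accepts("1011")  # three 1s -> ends in "odd" -> rejected
```

The empty string is accepted here because $q_0 \in F$, which matches the convention that $\delta(q, \varepsilon) = q$.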
-
What's in a language?
Languages in abstraction: This post is about languages from a mathematical and abstract linguistics point of view. Not much more to say about that, so let’s get right to it! The Rigorous definition: Let $\Sigma$ be an alphabet and let $\Sigma^k$ be the set of all strings of length $k$ over that alphabet. Then, we define $\Sigma^*$ to be $\bigcup\limits_{k\in\mathbb{N}}\Sigma^k$ (the union of $\Sigma^k$ over all natural numbers $k$). If $L\subseteq\Sigma^*$, we call $L$ a language. The Intuition Behind the Definition: Consider an alphabet (some finite set of characters); for example, we can consider the letters of the English language, the ASCII symbols, the symbols $\{0, 1\}$ (otherwise known as binary), or the symbols $\{1, 2, 3, 4, 5, 6, 7, 8, 9, 0, +, \times, =\}$. We can then c...
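The definition of $\Sigma^k$ and the union forming $\Sigma^*$ can be made concrete; the binary alphabet is from the excerpt, while truncating the union at length 2 is my own device, since $\Sigma^*$ itself is infinite:

```python
from itertools import product

Sigma = ["0", "1"]  # the binary alphabet

def sigma_k(k):
    """The set Sigma^k: all strings of length k over Sigma."""
    return {"".join(s) for s in product(Sigma, repeat=k)}

# Sigma^* is the union of Sigma^k over all natural numbers k;
# here we truncate the union at k <= 2
strings_up_to_2 = set().union(*(sigma_k(k) for k in range(3)))
print(sorted(strings_up_to_2, key=len))
# → ['', '0', '1', '00', '01', '10', '11']
```

Note that $\Sigma^0$ contains exactly the empty string, so every $\Sigma^*$ contains $\varepsilon$, and any subset of these strings is a language.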
-
First Post, An Explication
This is my first blog post, so I guess I should start off by explaining my motives for writing what will probably be a sporadically updated blog. Basically, as I learn things that I find pretty difficult to find online, I’ll try to explain them as best I can here. Also, since I enjoy learning math, I’m going to try to keep up a (semi-)regular stream of math posts. If you have any questions, feel free to contact me. My up-to-date contact info can be found on my website: https://aaron.niskin.org