This is my place for putting loose ends and stuck projects. My hope is that, by doing so, I can connect with people who may have ideas or are interested in picking up the reins. Please contact me if you are interested in pursuing any of these directions, or have an idea related to one of them.
This work extends the paper “Quantifying Causal Influence” by Janzing et al. In that work, the authors created a causal variant of the mutual information, giving a single number summarizing the effect of one edge in a causal model. In this work, I extended their definition to work on arbitrary paths. The main application is to compute the total causal influence of one variable on another, taking into account both its direct effect as well as its effect via intermediaries.
While the core of the research is done, this work is paused due to lacking motivation. (According to my would-be collaborator Caroline Uhler, statisticians would not care about this.)
If you observe some code which runs the line “mystery(‘azxy’)”, and then a little later ‘azxy’ appears on the screen, you can infer that the function “mystery” had something to do with it, without having to read its source code. The goal of this project is to get a computer to do the same.
My approach was to attempt to show that the quantified causal influence (using the Janzing et al definition) between two program variables was positive. I do this using a small amount of local static analysis to constrain the causal structure, followed by empirically bounding the causal influence, an approach that goes under the tagline “Local Analysis + Data = Global Analysis.”
This approach has little prospect of scaling beyond toy programs for multiple reasons. A chief one is that it requires estimating the mutual information between parts of programs.
These slides explain the idea, and show the toy example that it actually works on:
If the documentation for a function says that an argument is the “number of milliseconds,” that means it’s always positive, which can be checked by a static analyzer. Our goal was to learn these specs from the documentation.
How would it do this? We would use a static analyzer to check usages of an existing library, extract status analysis facts, and infer these specs. These specs become training data for an NLP algorithm, training a model to predict specs from documentation. We then use this model on the documentation of a new library, one lacking an existing corpus of uses.
I implemented an early version as a class project with Tej Chajed in 2015. Afterwards, I begin looking for a compelling story. After 2 months of searching, we failed to find any compelling bugs that this approach would catch.
In late 2015, I proposed to my advisor the idea of building a framework . He suggested doing it in Java, as it would be “easier to use.” In this project, you could write an interpreter for a language, automatically obtain a CFG-generator for that language by executing that interpreter with different parameters, and then reuse this code to get an interpreter and CFG-generator for a different language which shares some nodes. Although nodes may be mixed-and-matched between languages, the type system guarantees that each term only has nodes allowed in that language. Sketches were in the works for how to also get a compiler and various static analyzers, all by running the interpreter with different parameters, using no symbolic manipulation.
But the most useful ideas came from my struggles to do ultra-generic programming in the confines of Java’s type system (losing hope of being “easier to use”). xI prepared an early writeup as I went away for the summer, telling my advisor “I think there are 3 papers worth of ideas in here,” not counting the framework itself, whose main goal was to combine existing ideas.
The “monads without lambdas” idea blossomed into my ICFP 2018 paper on thermometer continuations. The “parametric interpretation” idea got scooped by David Darais, and became his ICFP 2017 paper “Abstracting Definitional Interpreters.” I still think there’s a third paper to be had here.
I developed several new techniques for doing higher-kinded programming in Java. The techniques and goals were similar to the HighJ project, but I surpassed HighJ in several ways. Most of these fell under the umbrella of using proof terms to give type-level evidence of arbitrary conditions, especially to reify Java’s built-in “extends” constraint and surpass Java’s limitations on uses of this constraint. There was also a hack to get first-class (and hence higher-ranked) F-bounded polymorphism.
This avenue was promising to make it possible to encode arbitrary first-order type systems into Java. The application I find most exciting is adding linear types to Java using the known encoding of linear logic into first-order propositional logic.
Upon returning from the summer, I realized the potential of the ideas that would become Cubix, and switched to it full-time, yielding my first truly successful research project. I am much less excited by this idea than by my current research. I am not currently interested in pursing this, but am open to collaboration.