Regression Trees & Bagging (more functional R)

I recently ran across this excellent article explaining gradient boosting in the context of regression trees. The article concludes by describing how the technique implements a gradient-descent process, but what I find most fascinating is the concept of “functional modeling”: building machine learning models with other models as the building blocks. This post explores that idea by implementing regression trees in base R (with a little visualization help from ggplot2, dplyr, and tidyr) using functional programming concepts, including a technique called bootstrap aggregating.

Automatic Differentiation & Functional Operators in R

I’ve been studying up on deep learning recently (I know, trendy), and I learned something along the way that I think is just incredible. First, a little background: deep learning models are artificial neural networks, represented as potentially thousands of nodes with millions of weighted connections between them. Input numbers are fed into some nodes on one side, and output numbers pop out of some nodes on the other side, after winding through the nodes and weighted connections.

Bio/Recursion: CS and Bioinformatics in R

This book is based on a workshop I taught a few times at OSU, meant to introduce computer science theory to biologists. A digital version is available online at Leanpub, and a print edition is available at Amazon. It introduces topics in programming, computer science, and bioinformatics via examples in the R programming language. While R is often associated with statistics, Bio/Recursion employs R’s algorithmic capabilities to implement and visualize several fascinating methods, ranging from DNA alignment to drawing fractal trees.

A Primer for Computational Biology

Read online at Open Oregon State (Open-Access & Free!), or order a copy from Amazon or OSU Press. (Scribble in the margins!) This Open-Access textbook was a collaboration with Open Oregon State and OSU Press. It aims to provide life scientists and students with the skills necessary for research in a data-rich world. The text covers accessing and using remote servers via the command line and writing programs and pipelines for data analysis, and it provides useful vocabulary for interdisciplinary work.

3D Printing & Modeling, Online Tools

I’ve been teaching an evening class at the Oregon State University Craft Center in “3D modeling and printing.” I don’t know a ton about this topic, to be honest. But I know a little more than I did a year ago, and it was hard-earned knowledge. Why not share it? At OSU it seems like I can’t walk out the door without bumping into a 3D printer (and students and faculty get access to them), but I found very little in-person information about how to do basic 3D design.

Rstackdeque

I’m (usually) a fan of the R programming language, though in a few ways it lacks features computer scientists expect. For example, R lacks many data structures provided by high-level languages, such as trees, queues, and stacks. This is somewhat complicated by R’s functional nature: R programmers expect data structures to be “persistent,” such that previous versions of the structure are available even after insertions or deletions. Fortunately, data structures in purely functional languages (generally not R) are a topic of past and ongoing research, spurred initially by Chris Okasaki’s excellent Purely Functional Data Structures.

Stained Glass

These were made between 2012 and 2015, some at a studio in Marquette, MI, and some at the Notre Dame Glass Club.

Functional Parameterized L-System

There’s a pretty cool app called Codea, which is a programming IDE similar to processing.org, but in Lua, on the iPad. Lua supports first-class functions, so I played around with designing a parameterized L-System which builds sentences as lists of closures.
