Snorkel for Deep Learning

Snorkel is an open-source software platform that generates training data for deep learning models. Current approaches for building predictive models require large, structured, labeled datasets for training. These gold-standard datasets are difficult to come by, particularly in biomedicine, which limits our ability to make predictions from our data.

Snorkel was created in response to this challenge. It constructs knowledge bases from “dark data,” that is, unstructured data such as scientific articles or clinical notes. Unlike other approaches, which require precisely labeled data to train and build the models, Snorkel can work with just a set of user-written rules, and models trained this way perform as well as or better than models trained on gold-standard datasets. It has been used in a wide variety of applications, from paleobiology to crime fighting to biomedicine.
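
To make the idea of labeling with user-written rules concrete, here is a minimal sketch in plain Python. It does not use Snorkel's own API: the rule names, the keyword heuristics, and the example sentences are all hypothetical, and the simple majority vote stands in for the statistical model Snorkel uses to weigh rules against one another. Each rule either votes for a label or abstains, and the votes are combined into one noisy training label per sentence.

```python
from collections import Counter

# Label values (illustrative convention: a rule may abstain).
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


# Hypothetical rules for "does this sentence assert a causal link?".
def rule_mentions_causes(sentence):
    return POSITIVE if "causes" in sentence.lower() else ABSTAIN


def rule_negated(sentence):
    text = sentence.lower()
    return NEGATIVE if "does not" in text or "no evidence" in text else ABSTAIN


def rule_associated(sentence):
    return POSITIVE if "associated with" in sentence.lower() else ABSTAIN


RULES = [rule_mentions_causes, rule_negated, rule_associated]


def majority_vote(sentence, rules=RULES):
    """Combine rule votes into one label; abstain on no votes or a tie."""
    votes = [v for v in (rule(sentence) for rule in rules) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # rules disagree evenly, so emit no label
    return ranked[0][0]


if __name__ == "__main__":
    examples = [
        "Mutation X causes disease Y.",
        "There is no evidence that gene X is linked to disease Y.",
        "The patient was discharged on day 3.",
    ]
    for sentence in examples:
        print(majority_vote(sentence), sentence)
```

Sentences that receive a label can then serve as training examples for a downstream model, while sentences on which every rule abstains are simply left out. No single rule needs to be accurate on its own, since the labels come from combining many imperfect rules.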