Robert L. Logan IV

Ph.D. Student at UC Irvine

I am a Ph.D. student at the University of California, Irvine, studying machine learning and natural language processing under Padhraic Smyth and Sameer Singh. My research primarily focuses on the interplay between language modeling and information extraction. In particular, I am interested in using structured knowledge to improve the quality of natural language generation systems and language representation models. Before coming to Irvine, I received BAs in Mathematics and Economics from the University of California, Santa Cruz. I have also conducted machine learning research as an intern at Diffbot and worked as a research analyst at Prologis.

Research

Knowledge Enhanced Contextual Word Representations

2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Matthew E. Peters - Mark Neumann - Robert L. Logan IV - Roy Schwartz - Vidur Joshi - Sameer Singh - Noah A. Smith
Paper
Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real-world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large-scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we first use an integrated entity linker to retrieve relevant entity embeddings, then update contextual word representations via a form of word-to-entity attention. In contrast to previous approaches, the entity linkers and self-supervised language modeling objective are jointly trained end-to-end in a multitask setting that combines a small amount of entity linking supervision with a large amount of raw text. After integrating WordNet and a subset of Wikipedia into BERT, the knowledge enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task, and downstream performance on relationship extraction, entity typing, and word sense disambiguation. KnowBert’s runtime is comparable to BERT’s and it scales to large KBs.
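
As a rough illustration of the word-to-entity attention step described above, here is a minimal, untrained sketch. The random vectors stand in for learned contextual representations and entity embeddings, and the function is a simplification of my own; the full model adds learned projections, entity-linking scores, and a gated residual update.

```python
import numpy as np

def word_to_entity_attention(word_reprs, entity_embs):
    """Update word representations by attending over candidate entity
    embeddings (an illustrative sketch, not the trained KnowBert layer).

    word_reprs:  (num_words, dim) contextual word representations
    entity_embs: (num_entities, dim) embeddings of linked KB entities
    """
    scores = word_reprs @ entity_embs.T            # (num_words, num_entities)
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over entities
    knowledge = attn @ entity_embs                 # entity info per word
    return word_reprs + knowledge                  # residual update

words = np.random.randn(5, 64)     # e.g., a 5-token sentence
entities = np.random.randn(3, 64)  # e.g., 3 candidate KB entities
print(word_to_entity_attention(words, entities).shape)  # (5, 64)
```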

Barack's Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling

The 57th Annual Meeting of the Association for Computational Linguistics (ACL)
Robert L. Logan IV - Nelson F. Liu - Matthew E. Peters - Matt Gardner - Sameer Singh
Paper Dataset
Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge. However, traditional language models are only capable of remembering facts seen at training time, and often have difficulty recalling them. To address this, we introduce the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context. These mechanisms enable the model to render information it has never seen before, as well as generate out-of-vocabulary tokens. We also introduce the Linked WikiText-2 dataset, a corpus of annotated text aligned to the Wikidata knowledge graph whose contents (roughly) match the popular WikiText-2 benchmark. In experiments, we demonstrate that the KGLM achieves significantly better performance than a strong baseline language model. We additionally compare different language models' ability to complete sentences requiring factual knowledge, showing that the KGLM outperforms even very large language models in generating facts.
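
The core generation step can be sketched as a choice between emitting an ordinary vocabulary token and copying a fact from the graph. The toy graph, fixed copy probability, and function below are illustrative placeholders, not the paper's implementation; in the KGLM these decisions come from learned distributions conditioned on the model's hidden state.

```python
import random

# Toy knowledge graph: entity -> {relation: object entity}.
knowledge_graph = {
    "Barack Obama": {"spouse": "Michelle Obama", "birthplace": "Honolulu"},
}

def generation_step(context_entity, p_copy=0.5):
    if context_entity in knowledge_graph and random.random() < p_copy:
        # Copy step: select a relation and render the object entity,
        # even if its name would be out-of-vocabulary for the LM.
        relation = random.choice(list(knowledge_graph[context_entity]))
        return knowledge_graph[context_entity][relation]
    return "<vocab-token>"  # standard language-model step

random.seed(0)
print(generation_step("Barack Obama"))
```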

Bayesian Evaluation of Black-Box Classifiers

ICML 2019 Workshop on Uncertainty & Robustness in Deep Learning
Disi Ji - Robert L. Logan IV - Padhraic Smyth - Mark Steyvers
Paper
There is an increasing need for accurate quantitative assessment of the performance of prediction models (such as deep neural networks) on out-of-sample data, e.g., in new environments after they have been trained. In this context, we propose a Bayesian framework for assessing performance characteristics of black-box classifiers, performing inference on quantities such as accuracy and calibration bias. We demonstrate the approach using three deep neural networks applied to large real-world data sets, performing inference and active learning to assess class-specific performance.
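
For a concrete, hypothetical example of the kind of inference involved: with a uniform Beta(1, 1) prior over a classifier's accuracy and k correct predictions out of n labeled examples, the posterior over accuracy is Beta(1 + k, 1 + n - k). The counts below are made up.

```python
from scipy import stats

# Beta-Binomial posterior over a black-box classifier's accuracy.
k, n = 87, 100                        # hypothetical evaluation counts
posterior = stats.beta(1 + k, 1 + n - k)
lo, hi = posterior.ppf([0.025, 0.975])
print(f"posterior mean accuracy: {posterior.mean():.3f}")
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
```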

PoMo: Generating Entity-Specific Post-Modifiers in Context

2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)
Jun Seok Kang - Robert L. Logan IV - Zewei Chu - Yang Chen - Dheeru Dua - Kevin Gimpel - Sameer Singh - Niranjan Balasubramanian
Paper Dataset
We introduce entity post-modifier generation as an instance of a collaborative writing task. Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity. For example, for the sentence, "Barack Obama, _______, supported the #MeToo movement.", the phrase "a father of two girls" is a contextually relevant post-modifier. To this end, we build PoMo, a post-modifier dataset created automatically from news articles reflecting a journalistic need for incorporating entity information that is relevant to a particular news event. PoMo consists of more than 231K sentences with post-modifiers and associated facts extracted from Wikidata for around 57K unique entities. We use crowdsourcing to show that modeling contextual relevance is necessary for accurate post-modifier generation. We adapt a number of existing generation approaches as baselines for this dataset. Our results show there is large room for improvement in terms of both identifying relevant facts to include (knowing which claims are relevant gives a >20% improvement in BLEU score), and generating appropriate post-modifier text for the context (providing relevant claims is not sufficient for accurate generation). We conduct an error analysis that suggests promising directions for future research.
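
To make the task format concrete, here is a hypothetical sketch of a single instance; the field names are illustrative and do not reflect the dataset's actual schema.

```python
# Hypothetical PoMo-style instance: a sentence with a blank, plus
# candidate facts about the entity pulled from Wikidata.
instance = {
    "sentence": "Barack Obama, _______, supported the #MeToo movement.",
    "entity": "Barack Obama",
    "claims": [
        ("child", "Malia Obama"),
        ("child", "Sasha Obama"),
        ("position held", "President of the United States"),
    ],
    "target": "a father of two girls",
}

# The task couples two steps: select the relevant claims (here the
# two "child" facts) and realize them as a post-modifier phrase.
relevant = [obj for rel, obj in instance["claims"] if rel == "child"]
print(f"a father of {len(relevant)} girls")
```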

Multimodal Attribute Extraction

6th Workshop on Automated Knowledge Base Construction (AKBC) 2017
Robert L. Logan IV - Samuel Humeau - Sameer Singh
Paper Poster Code Dataset
The broad goal of information extraction is to derive structured information from unstructured data. However, most existing methods focus solely on text, ignoring other types of unstructured data such as images, video, and audio, which comprise an increasing portion of the information on the web. To address this shortcoming, we propose the task of multimodal attribute extraction. Given a collection of unstructured and semi-structured contextual information about an entity (such as a textual description or visual depictions), the task is to extract the entity's underlying attributes. In this paper, we provide a dataset containing mixed-media data for over 2 million product items, along with 7 million attribute-value pairs describing the items, which can be used to train attribute extractors in a weakly supervised manner. We provide a variety of baselines that demonstrate the relative effectiveness of the individual modes of information for solving the task, and we also study human performance.
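
As a sketch of how the individual modes can be combined, the toy baseline below concatenates text and image features for a product item and scores candidate attribute values by similarity. The random projection stands in for learned encoders and is an assumption of this sketch, not one of the paper's baselines.

```python
import numpy as np

def score_values(text_feat, image_feat, value_embs):
    """Score candidate attribute values from fused text+image features
    (an illustrative multimodal fusion sketch with random weights)."""
    fused = np.concatenate([text_feat, image_feat])       # (2 * dim,)
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((fused.size, value_embs.shape[1]))
    query = fused @ proj                # project into value-embedding space
    return value_embs @ query           # one similarity score per value

dim = 32
text_feat = np.random.randn(dim)      # e.g., encoded product description
image_feat = np.random.randn(dim)     # e.g., encoded product image
value_embs = np.random.randn(5, dim)  # e.g., 5 candidate attribute values
print(score_values(text_feat, image_feat, value_embs).shape)  # (5,)
```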

Contact