Robert L. Logan IV

Ph.D. Student at UC Irvine

I am a Ph.D. student at the University of California, Irvine, studying machine learning and natural language processing under Padhraic Smyth and Sameer Singh. My research primarily focuses on the interplay between language modeling and information extraction. In particular, I am interested in using structured knowledge to improve the quality of natural language generation systems and language representation models. Before coming to Irvine, I received BAs in Mathematics and Economics from the University of California, Santa Cruz. I have also conducted machine learning research as an intern at Diffbot and worked as a research analyst at Prologis.


Eliciting Knowledge from Language Models Using Automatically Generated Prompts

2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Taylor Shin* - Yasaman Razeghi* - Robert L. Logan IV* - Eric Wallace - Sameer Singh
Determining the knowledge captured by pretrained language models is an important challenge, and is commonly tackled by probing model representations using classifiers. However, it is difficult to design probes for semantic knowledge such as facts or textual entailment. Reformulating these semantic tasks as cloze tests (i.e., fill-in-the-blank problems) is a promising method for probing such knowledge, but it requires manual crafting of textual prompts to elicit this knowledge, limiting its use. In this paper, we develop an automated, task-agnostic method to create cloze prompts for any classification task, based on a gradient-guided search. We find prompts that demonstrate masked language models (MLMs) have an inherent capability to perform sentiment analysis and natural language inference, and without any finetuning, sometimes achieve performance on par with recent state-of-the-art supervised models. We also show that our prompts elicit more accurate factual knowledge from MLMs compared to manual prompts, and further, that MLMs can be used as relation extractors out of the box, when given suitable prompts, more effectively than recent supervised RE models.

Detecting COVID-19 Misinformation on Social Media

EMNLP 2020 Workshop NLP-COVID
Tamanna Hossain* - Robert L. Logan IV* - Arjuna Ugarte* - Yoshitomo Matsubara* - Sean Young - Sameer Singh
The ongoing pandemic has heightened the need for developing tools to flag COVID-19-related misinformation on the internet, specifically on social media such as Twitter. However, due to novel language and the rapid change of information, existing misinformation detection datasets are not effective in evaluating systems designed to detect misinformation on this topic. Misinformation detection can be subdivided into two sub-tasks: retrieval of misconceptions relevant to posts being checked for veracity, and stance detection to identify whether the posts agree, disagree, or express no stance towards the retrieved misconceptions. To facilitate research on this task, we release COVID-Lies, a dataset of 5K expert-annotated tweets to evaluate the performance of misinformation detection systems on 86 different pieces of COVID-19-related misinformation. We evaluate existing NLP systems on this dataset, providing initial benchmarks and identifying key challenges for future models to improve upon.

On Importance Sampling-Based Evaluation of Latent Language Models

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL)
Robert L. Logan IV - Matt Gardner - Sameer Singh
Language models that use additional latent structures (e.g., syntax trees, coreference chains, knowledge graph links) provide several advantages over traditional language models. However, likelihood-based evaluation of these models is often intractable as it requires marginalizing over the latent space. Existing works avoid this issue by using importance sampling. Although this approach has asymptotic guarantees, analysis is rarely conducted on the effect of decisions such as sample size and choice of proposal distribution on the reported estimates. In this paper, we carry out this analysis for three models: RNNG, EntityNLM, and KGLM. In addition, we elucidate subtle differences in how importance sampling is applied in these works that can have substantial effects on the final estimates, as well as provide theoretical results which reinforce the validity of this technique.
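To make the estimator discussed in this abstract concrete, here is a minimal sketch of importance sampling for a marginal likelihood, p(x) = E_q[p(x, z) / q(z)]. It is a toy illustration, not the setup used in the paper: the three-state latent variable, the `joint` table, and both proposal distributions are hypothetical, chosen only so the exact answer can be computed for comparison.

```python
import math
import random


def exact_marginal(joint):
    """Sum the joint p(x, z) over every latent value z (tractable only in toy cases)."""
    return sum(joint.values())


def importance_estimate(joint, proposal, num_samples, rng):
    """Estimate p(x) as the average of p(x, z) / q(z) over samples z ~ q."""
    values = list(proposal)
    probs = [proposal[z] for z in values]
    samples = rng.choices(values, weights=probs, k=num_samples)
    weights = [joint[z] / proposal[z] for z in samples]
    return sum(weights) / num_samples


# Hypothetical joint p(x, z) for one fixed observation x and three latent states.
joint = {"a": 0.02, "b": 0.10, "c": 0.08}
p_x = exact_marginal(joint)

# Two assumed proposals: a uniform one, and the exact posterior p(z | x).
uniform = {z: 1 / 3 for z in joint}
posterior = {z: p / p_x for z, p in joint.items()}

rng = random.Random(0)
for k in (10, 100, 10000):
    est_uniform = importance_estimate(joint, uniform, k, rng)
    est_posterior = importance_estimate(joint, posterior, k, rng)
    # Log-likelihoods are what is usually reported for language models.
    print(k, math.log(est_uniform), math.log(est_posterior), math.log(p_x))
```

With the posterior as the proposal, every importance weight equals p(x), so the estimate matches the exact value at any sample size. With the uniform proposal, the estimate fluctuates at small sample sizes, which is the kind of sensitivity to sample size and proposal choice the abstract refers to.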

Detecting Conversation Topics in Primary Care Office Visits from Transcripts of Patient-Provider Interactions

Journal of the American Medical Informatics Association, Volume 26, Issue 12, December 2019
Jihyun Park - Dimitrios Kotzias - Patty Kuo - Robert L. Logan IV - Kritzia Merced - Sameer Singh - Michael Tanana - Efi Karra Taniskidou - Jennifer Elston Lafata - David C Atkins - Ming Tai-Seale - Zac E Imel - Padhraic Smyth
Amid electronic health records, laboratory tests, and other technology, office-based patient and provider communication is still the heart of primary medical care. Patients typically present multiple complaints, requiring physicians to decide how to balance competing demands. How this time is allocated has implications for patient satisfaction, payments, and quality of care. We investigate the effectiveness of machine learning methods for automated annotation of medical topics in patient-provider dialog transcripts. We used dialog transcripts from 279 primary care visits to predict talk-turn topic labels. Different machine learning models were trained to operate on single or multiple local talk-turns (logistic classifiers, support vector machines, gated recurrent units) as well as sequential models that integrate information across talk-turn sequences (conditional random fields, hidden Markov models, and hierarchical gated recurrent units). Evaluation was performed using cross-validation to measure 1) classification accuracy for talk-turns and 2) precision, recall, and F1 scores at the visit level. Experimental results showed that sequential models had higher classification accuracy at the talk-turn level and higher precision at the visit level. Independent models had higher recall scores at the visit level compared with sequential models. Incorporating sequential information across talk-turns improves the accuracy of topic prediction in patient-provider dialog by smoothing out noisy information from talk-turns. Although the results are promising, more advanced prediction techniques and larger labeled datasets will likely be required to achieve prediction performance appropriate for real-world clinical applications.

Knowledge Enhanced Contextual Word Representations

2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Matthew E. Peters - Mark Neumann - Robert L. Logan IV - Roy Schwartz - Vidur Joshi - Sameer Singh - Noah A. Smith
Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we first use an integrated entity linker to retrieve relevant entity embeddings, then update contextual word representations via a form of word-to-entity attention. In contrast to previous approaches, the entity linkers and self-supervised language modeling objective are jointly trained end-to-end in a multitask setting that combines a small amount of entity linking supervision with a large amount of raw text. After integrating WordNet and a subset of Wikipedia into BERT, the knowledge enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task, and downstream performance on relationship extraction, entity typing, and word sense disambiguation. KnowBert’s runtime is comparable to BERT’s and it scales to large KBs.

Barack's Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling

The 57th Annual Meeting of the Association for Computational Linguistics (ACL)
Robert L. Logan IV - Nelson F. Liu - Matthew E. Peters - Matt Gardner - Sameer Singh
Paper Dataset
Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge. However, traditional language models are only capable of remembering facts seen at training time, and often have difficulty recalling them. To address this, we introduce the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context. These mechanisms enable the model to render information it has never seen before, as well as generate out-of-vocabulary tokens. We also introduce the Linked WikiText-2 dataset, a corpus of annotated text aligned to the Wikidata knowledge graph whose contents (roughly) match the popular WikiText-2 benchmark. In experiments, we demonstrate that the KGLM achieves significantly better performance than a strong baseline language model. We additionally compare different language models' ability to complete sentences requiring factual knowledge, showing that the KGLM outperforms even very large language models in generating facts.

Bayesian Evaluation of Black-Box Classifiers

ICML 2019 Workshop on Uncertainty & Robustness in Deep Learning
Disi Ji* - Robert L. Logan IV* - Padhraic Smyth - Mark Steyvers
There is an increasing need for accurate quantitative assessment of the performance of prediction models (such as deep neural networks), out-of-sample, e.g., in new environments after they have been trained. In this context we propose a Bayesian framework for assessing performance characteristics of black-box classifiers, performing inference on quantities such as accuracy and calibration bias. We demonstrate the approach using three deep neural networks applied to large real-world data sets, performing inference and active learning to assess class-specific performance.
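As a rough illustration of what Bayesian inference over a classifier's accuracy can look like, the sketch below uses a Beta-Bernoulli model: each prediction is treated as a correct/incorrect coin flip, and a Beta prior on accuracy is updated with the observed counts. This is an assumption made for illustration, since the abstract does not specify the model. The function names, the uniform Beta(1, 1) prior, and the counts in the usage lines are all hypothetical.

```python
import random


def posterior_params(num_correct, num_total, prior_a=1.0, prior_b=1.0):
    """Update a Beta(prior_a, prior_b) prior on accuracy with observed counts."""
    return prior_a + num_correct, prior_b + (num_total - num_correct)


def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def credible_interval(a, b, level=0.95, num_draws=20000, seed=0):
    """Approximate a central credible interval by sampling from Beta(a, b)."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(num_draws))
    lo_idx = int((1 - level) / 2 * num_draws)
    hi_idx = int((1 + level) / 2 * num_draws) - 1
    return draws[lo_idx], draws[hi_idx]


# Hypothetical audit: 45 of 50 labeled out-of-sample predictions were correct.
a, b = posterior_params(num_correct=45, num_total=50)
low, high = credible_interval(a, b)
print(f"posterior mean accuracy: {posterior_mean(a, b):.3f}")
print(f"95% credible interval: ({low:.3f}, {high:.3f})")
```

Running the same update separately for each class gives per-class posteriors, and the width of each interval indicates where additional labels would be most informative.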

PoMo: Generating Entity-Specific Post-Modifiers in Context

2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)
Jun Seok Kang - Robert L. Logan IV - Zewei Chu - Yang Chen - Dheeru Dua - Kevin Gimpel - Sameer Singh - Niranjan Balasubramanian
Paper Dataset
We introduce entity post-modifier generation as an instance of a collaborative writing task. Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity. For example, for the sentence, "Barack Obama, _______, supported the #MeToo movement.", the phrase "a father of two girls" is a contextually relevant post-modifier. To this end, we build PoMo, a post-modifier dataset created automatically from news articles reflecting a journalistic need for incorporating entity information that is relevant to a particular news event. PoMo consists of more than 231K sentences with post-modifiers and associated facts extracted from Wikidata for around 57K unique entities. We use crowdsourcing to show that modeling contextual relevance is necessary for accurate post-modifier generation. We adapt a number of existing generation approaches as baselines for this dataset. Our results show there is substantial room for improvement in terms of both identifying relevant facts to include (knowing which claims are relevant gives a >20% improvement in BLEU score), and generating appropriate post-modifier text for the context (providing relevant claims is not sufficient for accurate generation). We conduct an error analysis that suggests promising directions for future research.

Multimodal Attribute Extraction

6th Workshop on Automated Knowledge Base Construction (AKBC) 2017
Robert L. Logan IV - Samuel Humeau - Sameer Singh
Paper Poster Code Dataset
The broad goal of information extraction is to derive structured information from unstructured data. However, most existing methods focus solely on text, ignoring other types of unstructured data such as images, video, and audio, which comprise an increasing portion of the information on the web. To address this shortcoming, we propose the task of multimodal attribute extraction. Given a collection of unstructured and semi-structured contextual information about an entity (such as a textual description or visual depictions), the task is to extract the entity's underlying attributes. In this paper, we provide a dataset containing mixed-media data for over 2 million product items along with 7 million attribute-value pairs describing the items, which can be used to train attribute extractors in a weakly supervised manner. We provide a variety of baselines that demonstrate the relative effectiveness of the individual modes of information towards solving the task, as well as a study of human performance.