NLP Highlights

Synopsis

Discussing recent and interesting work related to natural language processing. Matt Gardner and Waleed Ammar, research scientists at the Allen Institute for Artificial Intelligence, give short discussions of papers, mostly in interviews with authors about their work.

Episodes

  • 84 - Large Teams Develop, Small Groups Disrupt, with Lingfei Wu

    26/03/2019 Duration: 38min

    In a recent Nature paper, Lingfei Wu (Ling) suggests that smaller teams of scientists tend to do more disruptive work. In this episode, we invite Ling to discuss their results, how they define disruption, and possible reasons why smaller teams may be better positioned to do disruptive work. We also touch on the robustness of the disruption metric, differences between research disciplines, and sleeping beauties in science. Lingfei Wu’s homepage: https://www.knowledgelab.org/people/detail/lingfei_wu/ Paper: https://www.nature.com/articles/s41586-019-0941-9 Note: Lingfei is on the job market for faculty positions at the intersection of social science, computer science, and communication.

  • 83 - Knowledge Base Construction, with Sebastian Riedel

    13/03/2019 Duration: 38min

    In this episode, we invite Sebastian Riedel to talk about knowledge base construction (KBC). Why is it an important research area? What are the tradeoffs between using an open vs. closed schema? What are popular methods currently used, and what challenges prevent the adoption of KBC methods? We also briefly discuss the AKBC workshop and its graduation into a conference in 2019. Sebastian Riedel's homepage: http://www.riedelcastro.org/ AKBC conference: http://www.akbc.ws/2019/

  • 82 - Visual Reasoning, with Yoav Artzi

    06/03/2019 Duration: 42min

    In this episode, Yoav Artzi joins us to talk about visual reasoning. We start by defining what visual reasoning is, then discuss the pros and cons of different tasks and datasets. We discuss some of the models used for visual reasoning and how they perform, before ending with open questions in this young, exciting research area. Yoav Artzi: https://yoavartzi.com/ NLVR: https://github.com/clic-lab/nlvr/tree/master/nlvr NLVR2: https://github.com/clic-lab/nlvr/tree/master/nlvr2 CLEVR dataset: https://cs.stanford.edu/people/jcjohns/clevr/ VQA: https://visualqa.org/ GQA: https://cs.stanford.edu/people/dorarad/gqa/index.html Neural module networks: https://arxiv.org/abs/1511.02799

  • 81 - BlackboxNLP, with Afra Alishahi and Tal Linzen

    06/02/2019 Duration: 31min

    Neural models have recently produced large performance improvements on various NLP problems, but our understanding of what and how these models learn remains fairly limited. In this episode, Tal Linzen and Afra Alishahi talk to us about BlackboxNLP, an EMNLP’18 workshop dedicated to the analysis and interpretation of neural networks for NLP. In the workshop, computer scientists and cognitive scientists joined forces to probe and analyze neural NLP models. BlackboxNLP 2018 website: https://blackboxnlp.github.io/2018/ BlackboxNLP 2018 proceedings: https://aclanthology.info/events/ws-2018#W18-54 BlackboxNLP 2019 website: https://blackboxnlp.github.io/

  • 80 - Leaderboards and Science, with Siva Reddy

    29/01/2019 Duration: 29min

    Originally used to spur fierce competition in arcade games, leaderboards have recently made their way into NLP research circles. Leaderboards can help mitigate some of the problems in how researchers run experiments and share results (e.g., accidentally overfitting models on a test set), but they also introduce new problems (e.g., breaking author anonymity in peer review). In this episode, Siva Reddy joins us to talk about the good, the bad, and the ugly of using leaderboards in science. We also discuss potential solutions to some of the outstanding problems with existing leaderboard frameworks. Software platforms for leaderboards: http://codalab.org/ https://leaderboard.allenai.org/

  • 79 - The glass ceiling in NLP, with Natalie Schluter

    21/01/2019 Duration: 26min

    In this episode, Natalie Schluter talks to us about a data-driven analysis of the career progression of male vs. female researchers in NLP, through the lens of mentor-mentee networks built from ~20K papers in the ACL anthology. A directed edge in the network describes a mentorship relation from the last author on a paper to the first author, and author names were annotated for gender when possible. Interesting observations include an increase in the percentage of mentors (regardless of gender), and a widening gap since the early 2000s between the fractions of mentors who are male and female. By analyzing the number of years between a researcher’s first publication and the year at which they achieve mentorship status at threshold T, defined as publishing T or more papers as a last author, Natalie also found that female researchers tend to take much longer to become mentors. Another interesting finding is that in-gender mentorship is a strong predictor of mentees’ success in becoming mentors themselves.
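
    As a rough illustration, here is a minimal Python sketch of the threshold-T mentorship metric described above; the data layout, the helper name, and the choice T=2 in the example are assumptions for illustration, not Natalie's actual code.

    ```python
    from collections import defaultdict

    def years_to_mentorship(papers, T=3):
        """papers: iterable of (year, first_author, last_author) tuples."""
        first_pub = {}                         # author -> year of first publication
        last_author_years = defaultdict(list)  # author -> years of last-author papers

        for year, first, last in papers:
            for author in (first, last):
                first_pub[author] = min(first_pub.get(author, year), year)
            last_author_years[last].append(year)

        result = {}
        for author, years in last_author_years.items():
            years.sort()
            if len(years) >= T:
                # years[T - 1] is the year of the T-th last-author paper
                result[author] = years[T - 1] - first_pub[author]
        return result

    # Toy data: author "b" publishes as last author in 2001 and 2003
    print(years_to_mentorship([(2001, "a", "b"), (2003, "c", "b")], T=2))  # {'b': 2}
    ```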

  • 78 - Where do corpora come from?, with Matt Honnibal and Ines Montani

    15/01/2019 Duration: 30min

    Most NLP projects rely crucially on the quality of annotations used for training and evaluating models. In this episode, Matt and Ines of Explosion AI tell us how Prodigy can improve data annotation and model development workflows. Prodigy is an annotation tool implemented as a Python library, and it comes with a web application and a command line interface. A developer can define input data streams and design simple annotation interfaces. Prodigy can help break down complex annotation decisions into a series of binary decisions, and it provides easy integration with spaCy models. Developers can specify how models should be updated as new annotations come in, enabling an active learning workflow; a tool-agnostic sketch of this pattern follows below. Prodigy: https://prodi.gy Prodigy recipe scripts: https://github.com/explosion/prodigy-recipes Twitter: https://twitter.com/_inesmontani https://twitter.com/honnibal
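
    The sketch below shows the model-in-the-loop binary annotation pattern in generic Python. This is not Prodigy's actual API: `model.predict_proba`, `model.update`, and `ask_human` are hypothetical interfaces standing in for the annotation tool and the model being trained.

    ```python
    def uncertainty_sample(model, examples, k=10):
        # Surface the examples the model is least sure about (probability
        # closest to 0.5). model.predict_proba is a hypothetical interface
        # returning a single probability per example.
        return sorted(examples, key=lambda x: abs(model.predict_proba(x) - 0.5))[:k]

    def annotation_loop(model, stream, ask_human, rounds=5):
        labeled = []
        for _ in range(rounds):
            for example in uncertainty_sample(model, stream):
                decision = ask_human(example)  # one binary accept/reject decision
                labeled.append((example, decision))
            model.update(labeled)              # update the model as annotations come in
        return labeled
    ```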

  • 77 - On Writing Quality Peer Reviews, with Noah A. Smith

    07/01/2019 Duration: 38min

    It's not uncommon for authors to be frustrated with the quality of peer reviews they receive at (NLP) conferences. In this episode, Noah A. Smith shares his advice on how to write good peer reviews. The structure Noah recommends for a peer review starts with a dispassionate summary of what the paper has to offer, followed by the strongest reasons it may be accepted, then the strongest reasons it may be rejected, and concludes with a list of minor, easy-to-fix problems (e.g., typos) that can be addressed in the camera-ready version. Noah stresses the importance of thinking about how the reviews we write could demoralize (junior) researchers, and of being precise and detailed when discussing a paper's weaknesses to help the authors see the path forward. Other questions we discuss in this episode include: How should you read a paper for reviewing purposes? How long does it take to review a paper, and how many papers should you review? What types of mistakes should you be on the lookout for while reviewing?

  • 76 - Increasing In-Class Similarity by Retrofitting Embeddings with Demographics, with Dirk Hovy

    27/11/2018 Duration: 29min

    EMNLP 2018 paper by Dirk Hovy and Tommaso Fornaciari. https://www.semanticscholar.org/paper/Improving-Author-Attribute-Prediction-by-Linguistic-Hovy-Fornaciari/71aad8919c864f73108aafd8e926d44e9df51615 In this episode, Dirk Hovy talks about natural language as a social phenomenon which can provide insights about those who generate it. For example, this paper uses retrofitted embeddings to improve on two tasks: predicting the gender and age group of a person based on their online reviews. In this approach, author embeddings are first generated using Doc2Vec, then retrofitted such that authors with similar attributes are closer in the vector space. To estimate retrofitted vectors for authors with unknown attributes, a linear transformation is learned which maps Doc2Vec vectors to the retrofitted vectors (see the sketch below). Dirk also used a similar approach to encode geographic information to model regional linguistic variation, in another EMNLP 2018 paper with Christoph Purschke titled “Capturing Regional Variation with Distributed Place Representations and Geographic Retrofitting”.
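
    A minimal sketch of the mapping step described above, with random placeholder data and an assumed dimensionality of 300: fit a linear map from Doc2Vec author vectors to their retrofitted counterparts by least squares, then apply it to an unseen author.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    D = rng.normal(size=(500, 300))   # Doc2Vec vectors for known-attribute authors
    R = rng.normal(size=(500, 300))   # their retrofitted counterparts

    # Least-squares fit of a linear map W such that D @ W ≈ R
    W, *_ = np.linalg.lstsq(D, R, rcond=None)

    d_new = rng.normal(size=(1, 300))  # Doc2Vec vector of an unseen author
    r_est = d_new @ W                  # estimated retrofitted vector
    ```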

  • 75 - Reinforcement / Imitation Learning in NLP, with Hal Daumé III

    21/11/2018 Duration: 43min

    In this episode, we invite Hal Daumé to continue the discussion on reinforcement learning, focusing on how it has been used in NLP. We discuss how to reduce NLP problems to the reinforcement learning framework, and the circumstances where it may or may not be useful. We discuss imitation learning, roll-in and roll-out, and how to approximate an expert with a reference policy; a schematic sketch of the DAgger-style loop follows below. DAgger: https://www.semanticscholar.org/paper/A-Reduction-of-Imitation-Learning-and-Structured-to-Ross-Gordon/17eddf33b513ae1134abadab728bdbf6abab2a05?navId=citing-papers RESLOPE: http://legacydirs.umiacs.umd.edu/~hal/docs/daume18reslope.pdf
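
    As a schematic illustration of the roll-in / expert-relabeling idea, here is a DAgger-style loop in Python; `env`, `expert`, and `policy` are hypothetical interfaces for illustration, not code from the papers linked above.

    ```python
    def dagger(env, expert, policy, iterations=10, horizon=50):
        dataset = []
        for _ in range(iterations):
            state = env.reset()
            for _ in range(horizon):
                action = policy.act(state)                  # roll-in with the learned policy
                dataset.append((state, expert.act(state)))  # expert relabels each visited state
                state, done = env.step(action)              # hypothetical (state, done) interface
                if done:
                    break
            policy.train(dataset)                           # retrain on the aggregated dataset
        return policy
    ```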

  • 74 - Deep Reinforcement Learning Doesn't Work Yet, with Alex Irpan

    16/11/2018 Duration: 40min

    Blog post by Alex Irpan titled "Deep Reinforcement Learning Doesn't Work Yet": https://www.alexirpan.com/2018/02/14/rl-hard.html In this episode, Alex Irpan talks about the limitations of current deep reinforcement learning methods and why we have a long way to go before they go mainstream. We discuss sample inefficiency, instability, the difficulty of designing reward functions, and overfitting to the environment. Alex concludes with a list of recommendations he has found useful when training models with deep reinforcement learning.

  • 73 - Supersense Disambiguation of English Prepositions and Possessives, with Nathan Schneider

    13/11/2018 Duration: 52min

    ACL 2018 paper by Nathan Schneider, Jena D. Hwang, Vivek Srikumar, Jakob Prange, Austin Blodgett, Sarah R. Moeller, Aviram Stern, Adi Bitan, Omri Abend. In this episode, Nathan discusses how the meaning of prepositions varies, proposes a hierarchy for classifying the semantics of function words (e.g., comparison, temporal, purpose), and describes empirical results using the provided dataset for disambiguating preposition semantics. Along the way, we talk about lexicon-based semantics, multilinguality and pragmatics. https://www.semanticscholar.org/paper/Comprehensive-Supersense-Disambiguation-of-English-Schneider-Hwang/8310213af102913b9e74e7dfe6864f3aa62a5a5e

  • 72 - The Anatomy of a Question Answering Task, with Jordan Boyd-Graber

    16/10/2018 Duration: 43min

    Our first episode in a new format: broader surveys of areas, instead of discussions of individual papers. In this episode, we talk with Jordan Boyd-Graber about question answering. Matt starts the discussion by giving five axes on which question answering tasks vary: (1) how complex is the language in the question, (2) what is the genre of the question / nature of the question semantics, (3) what is the context or knowledge source used to answer the question, (4) how much "reasoning" is required to answer the question, and (5) what is the format of the answer? We talk about each of these in detail, giving examples from Jordan's and others' work. In the end, we conclude that "question answering" is a format for studying a particular phenomenon, not a phenomenon in itself. Sometimes it's useful to pose a phenomenon you want to study as a question answering task, and sometimes it's not. During the conversation, Jordan mentioned the QANTA competition; you can find that here: http://qanta.o

  • 71 - DuoRC: Complex Language Understanding with Paraphrased Reading Comprehension, with Amrita Saha

    12/10/2018 Duration: 33min

    ACL 2018 paper by Amrita Saha, Rahul Aralikatte, Mitesh M. Khapra, and Karthik Sankaranarayanan. Amrita and colleagues at IBM Research introduced a harder dataset for "reading comprehension", where you have to answer questions about a given passage of text. Amrita joins us on the podcast to talk about why a new dataset is necessary, what makes this one unique and interesting, and how well initial baseline systems perform on it. Along the way, we talk about the problems with using BLEU or ROUGE as evaluation metrics for question answering systems (a toy illustration follows below). https://www.semanticscholar.org/paper/DuoRC%3A-Towards-Complex-Language-Understanding-with-Saha-Aralikatte/1e70a4830840d48486ecfbc6c89b774cdd0b6399
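
    A toy illustration of the metric problem: unigram-overlap F1 (the core ingredient of BLEU/ROUGE-style scores) can rank a wrong answer above a correct but paraphrased one. The example strings below are made up.

    ```python
    def unigram_f1(pred, gold):
        # F1 over unigram overlap between predicted and gold answer strings
        p, g = pred.lower().split(), gold.lower().split()
        overlap = len(set(p) & set(g))
        if overlap == 0:
            return 0.0
        precision, recall = overlap / len(p), overlap / len(g)
        return 2 * precision * recall / (precision + recall)

    # The correct but paraphrased answer scores LOWER than the wrong one:
    print(unigram_f1("passed away in 1999", "died in 1999"))  # ~0.57 (correct)
    print(unigram_f1("born in 1999", "died in 1999"))         # ~0.67 (wrong)
    ```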

  • 70 - Measuring the Evolution of a Scientific Field through Citation Frames, with David Jurgens

    18/09/2018 Duration: 40min

    TACL 2018 paper (presented at ACL 2018) by David Jurgens, Srijan Kumar, Raine Hoover, Daniel A. McFarland, and Daniel Jurafsky. David comes on the podcast to talk to us about citation frames. We discuss the dataset they created by painstakingly annotating the "citation type" for all of the citations in a large collection of papers (around 2000 citations in total), then training a classifier on that data to annotate the rest of the ACL anthology (a schematic sketch of this bootstrap step follows below). This process itself is interesting, including how exactly the citations are classified, and we talk about this for a bit. The second half of the podcast covers the analysis that David and colleagues did using the (automatically) annotated ACL anthology, trying to gauge how the field has changed over time. https://www.semanticscholar.org/paper/Measuring-the-Evolution-of-a-Scientific-Field-Jurgens-Kumar/65118f3a7463f54bdf9b9e5cdd655953a2488c2f
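
    A schematic sketch of the annotate-then-bootstrap step, with made-up citation contexts and hypothetical labels; this is not the paper's actual classifier or label inventory.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hand-annotated citation contexts (toy examples, hypothetical frame labels)
    contexts = ["Following [CIT], we use their parser.",
                "Unlike [CIT], our model needs no supervision."]
    frames = ["uses", "contrasts"]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(contexts, frames)

    # ...then label the remaining, unannotated citations automatically
    print(clf.predict(["We extend the method of [CIT]."]))
    ```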

  • 69 - Second language acquisition modeling, with Burr Settles

    10/09/2018 Duration: 34min

    A shared task held in conjunction with a NAACL 2018 workshop, organized by Burr Settles and collaborators at Duolingo. Burr tells us about the shared task, whose goal was to predict the errors that a language learner would make when doing exercises on Duolingo. We talk about the details of the data, why this particular data is interesting to study for second language acquisition, what could be better about it, and what systems people used to approach this task. We also talk a bit about what you could do with a system that can predict these kinds of errors to build better language learning systems. https://www.semanticscholar.org/paper/Second-Language-Acquisition-Modeling-Settles-Brust/10685728fab1dfe9d1cf0cd4240ed687dd601ac6

  • 68 - Neural models of factuality, with Rachel Rudinger

    04/09/2018 Duration: 36min

    NAACL 2018 paper by Rachel Rudinger, Aaron Steven White, and Benjamin Van Durme. Rachel comes on the podcast, telling us about what factuality is (did an event happen?), what datasets exist for the task (a few; they made a new, bigger one), and how to build models to predict factuality (it turns out a vanilla biLSTM does quite well; a minimal sketch follows below). Along the way, we have interesting discussions about how you decide what an "event" is, how you label factuality (whether something happened) on inherently uncertain text (like "I probably failed the test"), and how you might use a system that predicts factuality in some end task. https://www.semanticscholar.org/paper/Neural-models-of-factuality-Rudinger-White/4d62a1e7819f9e3f8c837832c66659db5a6d9b37
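
    A minimal PyTorch sketch of the kind of vanilla biLSTM factuality model mentioned above: encode the sentence, then regress a factuality score from the biLSTM state at the predicate token. The hyperparameters and exact architecture here are assumptions, not the paper's model.

    ```python
    import torch
    import torch.nn as nn

    class FactualityRegressor(nn.Module):
        def __init__(self, vocab_size, embed_dim=100, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                                  bidirectional=True)
            self.scorer = nn.Linear(2 * hidden_dim, 1)

        def forward(self, token_ids, predicate_index):
            # token_ids: (batch, seq); predicate_index: (batch,)
            states, _ = self.bilstm(self.embed(token_ids))  # (batch, seq, 2*hidden)
            predicate_state = states[torch.arange(states.size(0)), predicate_index]
            return self.scorer(predicate_state).squeeze(-1)  # one score per example
    ```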

  • 67 - GLUE: A Multi-Task Benchmark and Analysis Platform, with Sam Bowman

    27/08/2018 Duration: 39min

    Paper by Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. Sam comes on to tell us about GLUE. We talk about the motivation behind setting up a benchmark framework for natural language understanding, how the authors defined "NLU" and chose the tasks for this benchmark, a very nice diagnostic dataset that was constructed for GLUE, and what insight they gained from the experiments they've run so far. We also have some musings about the utility of general-purpose sentence vectors, and about leaderboards. https://www.semanticscholar.org/paper/GLUE%3A-A-Multi-Task-Benchmark-and-Analysis-Platform-Wang-Singh/a2054eff8b4efe0f1f53d88c08446f9492ae07c1

  • 66 - Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods, with Jieyu Zhao

    20/08/2018 Duration: 26min

    NAACL 2018 paper by Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, and Kai-Wei Chang. Jieyu comes on the podcast to talk about bias in coreference resolution models. This bias makes models rely disproportionately on gender when deciding whether "she" refers to a noun like "secretary" or "physician". Jieyu and her co-authors show that coreference systems do not actually exhibit much bias in standard evaluation settings (OntoNotes), perhaps because the broad document context aids in making coreference decisions. But they then construct a really nice diagnostic dataset that isolates simple coreference decisions and evaluates whether the model is using common sense, grammar, or gender bias to make them. This dataset shows that current models are quite biased, particularly when it comes to common sense, using gender to make incorrect coreference decisions. Jieyu then tells us about some simple methods to correct the bias without much of a drop in overall accuracy.

  • 65 - Event Representations with Tensor-based Compositions, with Niranjan Balasubramanian

    13/08/2018 Duration: 38min

    AAAI 2018 paper by Noah Weber, Niranjan Balasubramanian, and Nathanael Chambers. Niranjan joins us on the podcast to tell us about his latest contribution in a line of work going back to Schank's scripts. This work tries to model sequences of events to learn coherent narrative schemas, mined from large collections of text. For example, given an event like "She threw a football", you might expect future events involving catching, running, scoring, and so on. But if the event is instead "She threw a bomb", you would expect future events to involve things like explosions, damage, arrests, or other related things. We spend much of our conversation talking about why these scripts are interesting to study and the general outline of how one might learn them from text, and spend a little time on the particular contribution of this paper: a better model that captures interactions among all of the arguments to an event (a rough sketch of the idea follows below). https://www.semanticscholar.org/paper/Event-Representations-
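
    A rough numpy sketch of the tensor-composition idea: let the event arguments interact multiplicatively through a third-order tensor instead of just summing their embeddings. The shapes and the specific contraction here are illustrative assumptions, not the paper's exact model.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    dim = 50
    W = rng.normal(size=(dim, dim, dim))   # third-order composition tensor

    subj = rng.normal(size=dim)   # embedding for "she"
    pred = rng.normal(size=dim)   # embedding for "threw"
    obj = rng.normal(size=dim)    # embedding for "football" (vs. "bomb")

    # interaction[k] = sum_ij subj[i] * obj[j] * W[i, j, k]: subject and
    # object interact multiplicatively rather than additively
    interaction = np.einsum('i,j,ijk->k', subj, obj, W)
    event = np.tanh(interaction + pred)    # one simple way to fold in the predicate
    ```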
