I'm a PhD candidate at the Blavatnik School of Computer Science at Tel-Aviv University, working at Bar-Ilan University's NLP lab.
My research is in the field of Natural Language Processing and is done under the supervision of Prof. Ido Dagan, Prof. Eytan Ruppin and Prof. Shimon Edelman. I am interested in machine learning algorithms that learn semantic relations between predicates from large corpora in a minimally supervised manner.
New
Asher Stern, Amnon Lotan, Shachar Mirkin, Eyal Shnarch, Lili Kotlerman, Jonathan Berant and Ido Dagan
Knowledge and Tree-Edits in Learnable Entailment Proofs. Proceedings of TAC 2011.
This paper describes BIUTEE, the Bar-Ilan University Textual Entailment Engine. BIUTEE is a natural language inference system in which the hypothesis is proven by the text, based on linguistic and world-knowledge resources, as well as syntactically motivated tree transformations. The main progress in BIUTEE in the last year is a new confidence model that estimates the validity of the proof found by BIUTEE.
New
Jonathan Berant, Ido Dagan and Jacob Goldberger
Learning Entailment Relations by Global Graph Structure Optimization. Long paper in The Journal of Computational Linguistics 38(1).
Identifying entailment relations between predicates is an important part of applied semantic inference. In this article we propose a global inference algorithm that learns such entailment rules. First, we define a graph structure over predicates that represents entailment relations as directed edges. Then, we use a global transitivity constraint on the graph to learn the optimal set of edges, formulating the optimization problem as an Integer Linear Program. The algorithm is applied in a setting where, given a target concept, the algorithm learns on-the-fly all entailment rules between predicates that co-occur with this concept. Results show that our global algorithm improves performance over baseline algorithms by more than 10%.
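As a rough illustration of the idea in this abstract, the toy sketch below picks the highest-scoring set of directed edges subject to a transitivity constraint. It is not the paper's method: it replaces the Integer Linear Program with exhaustive search (feasible only for a handful of nodes), and the predicate names, the local scores, and the default score of -1.0 for unlisted edges are all made-up assumptions.

```python
from itertools import combinations, permutations


def is_transitive(edges):
    """True if for every i->j and j->k (with i != k) the edge i->k is present."""
    for (i, j) in edges:
        for (j2, k) in edges:
            if j == j2 and i != k and (i, k) not in edges:
                return False
    return True


def best_transitive_graph(nodes, score, default=-1.0):
    """Exhaustively find the transitive edge set with the maximal total score.

    Brute-force stand-in for an ILP solver; the empty graph (score 0) is the
    starting point, so edges are only added when they raise the total.
    """
    candidates = list(permutations(nodes, 2))  # all ordered pairs of distinct nodes
    best_edges, best_total = frozenset(), 0.0
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            edges = frozenset(subset)
            if not is_transitive(edges):
                continue
            total = sum(score.get(e, default) for e in edges)
            if total > best_total:
                best_edges, best_total = edges, total
    return best_edges, best_total


# Hypothetical local scores: positive means a local classifier favors the edge.
nodes = ["buy", "acquire", "own"]
score = {
    ("buy", "acquire"): 1.0,
    ("acquire", "own"): 1.0,
    ("buy", "own"): -0.5,  # locally disfavored, but required by transitivity
}
edges, total = best_transitive_graph(nodes, score)
```

In this toy example the locally disfavored edge buy -> own is still selected, because keeping both positively scored edges requires it under transitivity; that interaction is what a purely local, edge-by-edge decision would miss.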
Jonathan Berant, Ido Dagan and Jacob Goldberger
Global Learning of Typed Entailment Rules. Long paper in the proceedings of ACL 2011 (best student paper).
Extensive knowledge bases of entailment rules between predicates are crucial for applied semantic inference. In this paper we propose an algorithm that utilizes transitivity constraints to learn a globally-optimal set of entailment rules for typed predicates. We model the task as a graph learning problem and suggest methods that scale the algorithm to larger graphs. We apply the algorithm over a large data set
of extracted predicate instances, from which a resource of typed entailment rules has been recently released (Schoenmackers et al., 2010).
Our results show that using global transitivity information substantially improves performance over this resource and several baselines, and that our scaling methods allow us to increase the scope of global learning of entailment-rule graphs.
Catherine L. Caldwell-Harris, Jonathan Berant and Shimon Edelman
Measuring Mental Entrenchment of Phrases with Perceptual Identification, Familiarity Ratings, and Corpus Frequency Statistics. To appear in S. T. Gries and D. Divjak (eds.), Frequency effects in cognitive linguistics (Vol. 1): Statistical effects in learnability, processing and change, The Hague, The Netherlands: De Gruyter Mouton (2011).
Asher Stern, Eyal Shnarch, Amnon Lotan, Shachar Mirkin, Lili Kotlerman, Naomi Zeichner, Jonathan Berant and Ido Dagan
Rule Chaining and Approximate Match in Textual Inference. Text Analysis Conference 2010 (RTE-6).
This paper describes the participation of Bar-Ilan University in the sixth RTE challenge. Our textual-entailment engine, BiuTee, was enhanced with new components that introduce chaining of lexical-entailment rules and tackle the problem of approximately matching the text and the hypothesis after all available knowledge of entailment rules has been utilized. We have also re-engineered our system, aiming at an open-source, open architecture. BiuTee's performance is better than the median of all submissions, and it significantly outperforms an IR-oriented baseline.
Shachar Mirkin, Jonathan Berant, Ido Dagan and Eyal Shnarch
Recognising Entailment within Discourse. Proceedings of COLING, 2010.
Texts are commonly interpreted based on the entire discourse in which they are situated. Discourse processing has been shown to be useful for inference-based applications; yet, most systems for textual entailment, a popular paradigm for applied inference, have only addressed discourse considerations via off-the-shelf coreference resolvers. In this paper we explore various discourse aspects in entailment inference, suggest initial solutions for them and investigate their impact on entailment performance. Our experiments suggest that discourse provides useful information, which significantly improves entailment inference, and should be better addressed by future entailment systems.
Jonathan Berant, Ido Dagan and Jacob Goldberger
Global Learning of Focused Entailment Graphs. Long paper in the proceedings of ACL, 2010.
We propose a global algorithm for learning entailment relations between predicates. We define a graph structure over predicates that represents entailment relations as directed edges, and use a global transitivity constraint on the graph to learn the optimal set of edges, by formulating the optimization problem as an Integer Linear Program. We motivate this graph
with an application that provides a hierarchical summary for a set of propositions that focus on a target concept, and show that our global algorithm improves performance by more than 10% over baseline algorithms.
Shachar Mirkin, Roy Bar-Haim, Jonathan Berant, Ido Dagan, Eyal Shnarch, Asher Stern and Idan Szpektor
Addressing Discourse and Document Structure in the RTE Search Task. Proceedings of TAC, 2009.
This paper describes Bar-Ilan University's submissions to RTE-5. This year we focused on the Search pilot, enhancing our entailment system to address two main issues introduced by this new setting: scalability and, primarily, document-level discourse. Our system achieved the highest score on the Search task amongst participating groups, and proposes first steps towards addressing this challenging setting.
Roy Bar-Haim, Jonathan Berant and Ido Dagan
A Compact Forest for Scalable Inference over Entailment and Paraphrase Rules. Proceedings of EMNLP, 2009.
A large body of recent research has been investigating the acquisition and application of applied inference knowledge. Such knowledge may be typically captured as entailment rules, applied over syntactic representations. Efficient inference with such knowledge then becomes a fundamental problem. Starting out from a formalism for entailment-rule application we
present a novel packed data-structure and a corresponding algorithm for its scalable implementation. We proved the validity of
the new algorithm and established its efficiency analytically and empirically.
Roy Bar-Haim, Jonathan Berant, Ido Dagan, Iddo Greental, Shachar Mirkin, Eyal Shnarch and Idan Szpektor
Efficient Semantic Deduction and Approximate Matching over Compact Parse Forests. Proceedings of TAC, 2008.
Semantic inference is often modeled as application of entailment rules, which specify generation of entailed sentences from a source sentence. Efficient generation and representation of entailed consequents is a fundamental problem common to such inference methods. We present a new data structure, termed compact forest, which allows efficient generation and representation of entailed consequents, each represented as a parse tree. Rule-based inference is complemented with a new approximate matching measure inspired by tree kernels, which is computed efficiently over compact forests. Our system also makes use of novel large-scale entailment rule bases, derived from Wikipedia as well as from information about predicates and their argument mapping, gathered from available lexicons and complemented by unsupervised learning.
Jonathan Berant, Catherine Caldwell-Harris and Shimon Edelman
Tracks in the Mind: Differential Entrenchment of Common and Rare Liturgical and Everyday Multiword Phrases in Religious and Secular Hebrew Speakers. Proceedings of CogSci, 2008.
We tested the hypothesis that more frequent exposure to multiword phrases results in deeper entrenchment of their representations, by examining the performance of subjects of different religiosity in the recognition of briefly presented liturgical and secular phrases drawn from several frequency classes. Three of the sources were prayer texts that religious Jews are required to recite on a daily, weekly, and annual basis, respectively; two others were common and rare expressions encountered in the general secular Israeli culture. As expected, linear dependence of recognition score on frequency was found for the religious subjects (being most pronounced for men, who are usually more observant than women); both religious and secular subjects performed better on common than on rare general culture items. Our results support the notion of graded entrenchment introduced by Langacker and shared by several cognitive linguistic theories of language comprehension and production.
Jonathan Berant, Yaron Gross, Matan Mussel, Ben Sandbank, Eytan Ruppin and Shimon Edelman
Boosting Unsupervised Grammar Induction by Splitting Complex Sentences on Function Words. Proceedings of BUCLD, 2007.
Global Learning of Entailment Graphs, NYU, Columbia, MIT and UIUC seminars, January 2011.
Global Learning of Focused Entailment Graphs, University of Washington AI seminar, Seattle, October 2010.
An Entailment-based Ontology for Domain-Specific Relations, ITCH workshop, Trento, September 2009.
Standard and Non-standard Parse Trees Equally Improve Grammar Induction, ISCOL, Ramat-Gan, September 2008.
Short presentation about the argument from the poverty of the stimulus.