-
Joint Extraction of Events and Entities within a Document Context
B. Yang and T. Mitchell.
In Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2016
[PDF]
[abstract]
[bib]
Events and entities are closely related; entities are often actors or participants in events and events
without entities are uncommon. The interpretation of events and entities is highly contextually dependent.
Existing work in information extraction typically models events separately from entities, and performs
inference at the sentence level, ignoring the rest of the document. In this paper, we propose a novel
approach that models the dependencies among variables of events, entities, and their relations, and
performs joint inference of these variables across a document. The goal is to enable access to document-level
contextual information and facilitate context-aware predictions. We demonstrate that our approach substantially
outperforms the state-of-the-art methods for event extraction as well as a strong baseline for entity extraction.
@inproceedings{bishan2016event,
author = {Yang, Bishan and Mitchell, Tom},
title = {Joint Extraction of Events and Entities within a Document Context},
booktitle = {Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)},
year = {2016}}
-
Mapping Verbs in Different Languages to Knowledge Base Relations using Web Text as Interlingua
D. T. Wijaya and T. Mitchell.
In Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2016
[PDF]
[abstract]
[bib]
In recent years many knowledge bases (KBs) have been constructed, yet there is no verb resource
that maps to these growing KB resources. A resource that maps verbs in different languages to KB relations
would be useful for extracting facts from text into the KBs, and to aid alignment and integration of
knowledge across different KBs and languages. Such a multi-lingual verb resource would also be useful
for tasks such as machine translation and machine reading. In this paper, we present a scalable approach
to automatically construct such a verb resource using a very large web text corpus as a kind of interlingua
to relate verb phrases to KB relations. Given a text corpus in any language and any KB, it can produce
a mapping of that language’s verb phrases to the KB relations. Experiments with the English NELL KB
and ClueWeb corpus show that the learned English verb-to-relation mapping is effective for extracting
relation instances from English text. When applied to a Portuguese NELL KB and a Portuguese text corpus,
the same method automatically constructs a verb resource in Portuguese that is effective for extracting
relation instances from Portuguese text.
@inproceedings{wijaya2016mapping,
author = {Wijaya, Derry Tanti and Mitchell, Tom },
title = {Mapping Verbs In Different Languages to Knowledge Base Relations using Web Text as Interlingua},
booktitle = {Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)},
year = {2016}}
-
Translation Invariant Word Embeddings
M. Gardner, K. Huang, E. Papalexakis, X. Fu, P. Talukdar, C. Faloutsos, N. Sidiropoulos, T. Mitchell.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015
[PDF]
[abstract]
[bib]
This work focuses on the task of finding latent vector representations of the words in a corpus.
In particular, we address the issue of what to do when there are multiple languages in the corpus.
Prior work has, among other techniques, used canonical correlation analysis to project pre-trained vectors in two languages into a common space.
We propose a simple and scalable method that is inspired by the notion that the learned vector representations should be invariant to translation between languages.
We show empirically that our method outperforms prior work on multilingual tasks, matches the performance of prior work on monolingual tasks, and scales linearly with the size of the input data (and thus the number of languages being embedded).
@inproceedings{gardner2015translation,
author = {Gardner, Matt and Huang, Kejun and Papalexakis, Evangelos and Fu, Xiao and Talukdar, Partha and Faloutsos, Christos and Sidiropoulos, Nicholas and Mitchell, Tom},
title = {Translation Invariant Word Embeddings},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year = {2015}}
-
AskWorld: Budget-Sensitive Query Evaluation for Knowledge-on-Demand
M. Samadi, P. Talukdar, M. Veloso, T. Mitchell.
In International Joint Conference on Artificial Intelligence (IJCAI), 2015.
[PDF]
[abstract]
[bib]
Recently, several Web-scale knowledge harvesting systems have been built, each of which is competent at extracting information from certain types of data (e.g., unstructured text, structured tables on the web, etc.).
In order to determine the response to a new query posed to such systems (e.g., is sugar a healthy food?), it is useful to integrate opinions from multiple systems.
If a response is desired within a specific time budget (e.g., in less than 2 seconds), then maybe only a subset of these resources can be queried.
In this paper, we address the problem of knowledge integration for on-demand time-budgeted query answering.
We propose a new method, AskWorld, which learns a policy that chooses which queries to send to which resources, by accommodating varying budget constraints that are available only at query (test) time.
Through extensive experiments on real world datasets, we demonstrate AskWorld's capability in selecting the most informative resources to query within test-time constraints, resulting in improved performance compared to competitive baselines.
@inproceedings{samadi2015askworld,
title={AskWorld: Budget-Sensitive Query Evaluation for Knowledge-on-Demand},
author={Samadi, Mehdi and Talukdar, Partha and Veloso, Manuela and Mitchell, Tom},
year={2015},
booktitle={International Joint Conference on Artificial Intelligence (IJCAI)}}
-
Automatic Gloss Finding for a Knowledge Base using Ontological Constraints
B. Dalvi, E. Minkov, P. P. Talukdar, W. W. Cohen.
In Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), 2015.
[PDF]
[abstract]
[bib]
While there has been much research on automatically constructing structured Knowledge Bases (KBs), most of it has focused on generating facts to populate a KB.
However, a useful KB must go beyond facts.
For example, glosses (short natural language definitions) have been found to be very useful in tasks such as Word Sense Disambiguation.
However, the important problem of Automatic Gloss Finding, i.e., assigning glosses to entities in an initially gloss-free KB, is relatively unexplored.
We address that gap in this paper.
In particular, we propose GLOFIN, a hierarchical semi-supervised learning algorithm for this problem which makes effective use of limited amounts of supervision and available ontological constraints.
To the best of our knowledge, GLOFIN is the first system for this task.
Through extensive experiments on real-world datasets, we demonstrate GLOFIN's effectiveness.
It is encouraging to see that GLOFIN outperforms other state-of-the-art SSL algorithms, especially in low supervision settings.
We also demonstrate GLOFIN's robustness to noise, on KBs ranging from user-contributed (e.g., Freebase) to automatically constructed (e.g., NELL).
To facilitate further research in this area, we have already made datasets and code used in this paper publicly available.
@inproceedings{dalvi2015automatic,
title={Automatic gloss finding for a knowledge base using ontological constraints},
author={Dalvi, Bhavana and Minkov, Einat and Talukdar, Partha P and Cohen, William W},
booktitle={Proceedings of the Eighth ACM International Conference on Web Search and Data Mining},
year={2015}}
-
A Compositional and Interpretable Semantic Space
A. Fyshe, L. Wehbe, P. Talukdar, B. Murphy, T. Mitchell
In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2015).
[PDF]
[abstract]
[bib]
Vector Space Models (VSMs) of Semantics are useful tools for exploring the semantics of single words, and the composition of words to make phrasal meaning.
While many methods can estimate the meaning (i.e. vector) of a phrase, few do so in an interpretable way.
We introduce a new method (CNNSE) that allows word and phrase vectors to adapt to the notion of composition.
Our method learns a VSM that is both tailored to support a chosen semantic composition operation, and whose resulting features have an intuitive interpretation.
Interpretability allows for the exploration of phrasal semantics, which we leverage to analyze performance on a behavioral task.
@inproceedings{fyshe2015compositional,
title={A Compositional and Interpretable Semantic Space},
author={Fyshe, Alona and Wehbe, Leila and Talukdar, Partha P and Murphy, Brian and Mitchell, Tom M},
booktitle = {Proceedings of NAACL-HLT},
year={2015}}
-
"A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce": Learning State Changing Verbs from Wikipedia Revision History
D. T. Wijaya, N. Nakashole, T. Mitchell.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.
[abstract]
[bib]
Learning to determine when the time-varying facts of a Knowledge Base (KB) have to
be updated is a challenging task. We propose to learn state changing verbs from
Wikipedia edit history. When a state-changing event, such as a marriage or death,
happens to an entity, the infobox on the entity’s Wikipedia page usually gets updated.
At the same time, the article text may be updated with verbs either being added or
deleted to reflect the changes made to the infobox. We use Wikipedia edit history
to distantly supervise a method for automatically learning verbs and state changes.
Additionally, our method uses constraints to effectively map verbs to infobox changes.
We observe in our experiments that when state-changing verbs are added or deleted from
an entity’s Wikipedia page text, we can predict the entity’s infobox updates with
88% precision and 76% recall. One compelling application of our verbs is to
incorporate them as triggers in methods for updating existing KBs, which are
currently mostly static.
@inproceedings{wijaya2015statechangingverbs,
author = {Wijaya, Derry Tanti and Nakashole, Ndapa and Mitchell, Tom M},
title = {{"A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce": Learning State Changing Verbs from Wikipedia Revision History}},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year = {2015}}
-
Efficient and Expressive Knowledge Base Completion Using Subgraph Feature Extraction.
M. Gardner, T. Mitchell.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.
[abstract]
[bib]
We explore some of the practicalities
of using random walk inference methods,
such as the Path Ranking Algorithm
(PRA), for the task of knowledge base
completion. We show that the random
walk probabilities computed (at great expense)
by PRA provide no discernible
benefit to performance on this task, and so
they can safely be dropped. This result allows
us to define a simpler algorithm for
generating feature matrices from graphs,
which we call subgraph feature extraction
(SFE). In addition to being conceptually
simpler than PRA, SFE is much more efficient,
reducing computation by an order
of magnitude, and more expressive, allowing
for much richer features than just paths
between two nodes in a graph. We show
experimentally that this technique gives
substantially better performance than PRA
and its variants, improving mean average
precision from .432 to .528 on a knowledge
base completion task using the NELL
knowledge base.
@inproceedings{gardner2015sfe,
author = {Gardner, Matt and Mitchell, Tom M},
title = {{Efficient and Expressive Knowledge Base Completion Using Subgraph Feature Extraction}},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year = {2015}}
-
A Knowledge-Intensive Model for Prepositional Phrase Attachment.
N. Nakashole, T. Mitchell.
In Proceedings of
the 53rd Annual Meeting of the Association for Computational Linguistics (ACL), 2015.
[PDF]
[abstract]
[bib]
Prepositional phrases (PPs) express crucial
information that knowledge base construction
methods need to extract. However,
PPs are a major source of syntactic
ambiguity and still pose problems in parsing.
We present a method for resolving
ambiguities arising from PPs, making extensive
use of semantic knowledge from
various resources. As training data, we use
both labeled and unlabeled data, utilizing
an expectation maximization algorithm for
parameter estimation. Experiments show
that our method yields improvements over
existing methods including a state of the
art dependency parser.
@inproceedings{nakashole2015ppa,
author = {Nakashole, Ndapandula and Mitchell, Tom M},
title = {{A Knowledge-Intensive Model for Prepositional Phrase Attachment}},
booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational
Linguistics (ACL)},
year = {2015},
pages = {365--375}}
-
Learning a Compositional Semantics for Freebase with an Open Predicate Vocabulary.
Jayant Krishnamurthy and Tom M. Mitchell. In Transactions of the Association for Computational Linguistics, Volume 3, 2015.
[PDF]
-
Weakly Supervised Extraction of Computer Security Events from Twitter.
Alan Ritter, Evan Wright, William Casey and Tom M. Mitchell.
In Proceedings of the 24th International Conference on World Wide Web,
(WWW), 2015
[PDF]
[abstract]
[bib]
Twitter contains a wealth of timely information.
@inproceedings{ritter2015secevent,
author = {Alan Ritter and
Evan Wright and
William Casey and
Tom M. Mitchell},
title = {{Weakly Supervised Extraction of Computer Security Events from Twitter}},
booktitle = {Proceedings of the 24th International Conference on World Wide Web,
{WWW}},
year = {2015},
pages = {896--905}}
-
Never-Ending Learning.
T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner,
B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios,
A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves,
J. Welling.
In Proceedings of the Conference on Artificial Intelligence (AAAI), 2015.
[PDF]
[abstract]
[bib]
Whereas people learn many different types of knowledge from diverse experiences over many
years, most current machine learning systems acquire just a single function or data model
from just a single data set. We propose a never-ending learning paradigm for machine
learning, to better reflect the more ambitious and encompassing type of learning performed
by humans. As a case study, we describe the Never-Ending Language Learner (NELL), which
achieves some of the desired properties of a never-ending learner, and we discuss lessons
learned. NELL has been learning to read the web 24 hours/day since January 2010, and so far
has acquired a knowledge base with over 80 million confidence-weighted beliefs (e.g.,
servedWith(tea, biscuits)). NELL has also learned millions of features and parameters that
enable it to read these beliefs from the web. Additionally, it has learned to reason over
these beliefs to infer new beliefs, and is able to extend its ontology by synthesizing new
relational predicates. NELL can be tracked online at http://rtw.ml.cmu.edu, and followed on
Twitter at @CMUNELL.
@inproceedings{NELL-aaai15,
Title = {Never-Ending Learning},
Author = {T. Mitchell and W. Cohen and E. Hruschka and P. Talukdar and J. Betteridge and
A. Carlson and B. Dalvi and M. Gardner and B. Kisiel and J. Krishnamurthy and N. Lao and
K. Mazaitis and T. Mohamed and N. Nakashole and E. Platanios and A. Ritter and M. Samadi and
B. Settles and R. Wang and D. Wijaya and A. Gupta and X. Chen and A. Saparov and M. Greaves and J. Welling},
Booktitle = {Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15)},
Year = {2015}}
-
Joint Syntactic and Semantic Parsing with Combinatory Categorial Grammar.
J. Krishnamurthy, T. Mitchell.
In Proceedings of
the 52nd Annual Meeting of the Association for Computational Linguistics (ACL), 2014.
[PDF]
[abstract]
[bib]
We present an approach to training a joint syntactic and semantic parser
that combines syntactic training information from CCGbank with semantic
training information from a knowledge base via distant supervision.
The trained parser produces a full syntactic parse of any sentence,
while simultaneously producing logical forms for portions of the sentence
that have a semantic representation within the parser's predicate vocabulary.
We demonstrate our approach by training a parser whose semantic
representation contains 130 predicates from the NELL ontology.
A semantic evaluation demonstrates that this parser produces logical forms
better than both comparable prior work and a pipelined syntax-then-semantics approach.
A syntactic evaluation on CCGbank demonstrates that the parser's dependency Fscore is
within 2.5% of state-of-the-art.
@inproceedings{krishnamurthy2014jointccg,
author = {Krishnamurthy, Jayant and Mitchell, Tom M},
title = {{Joint Syntactic and Semantic Parsing with Combinatory Categorial Grammar}},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational
Linguistics (ACL)},
year = {2014},
pages = {1188--1198}}
-
Assuming Facts Are Expressed More Than Once.
J. Betteridge, A. Ritter and T. Mitchell
In Proceedings of the 27th International Florida Artificial Intelligence Research Society Conference (FLAIRS-27), 2014.
[PDF]
[abstract]
[bib]
Distant supervision (DS) is a method for training
sentence-level information extraction models using
only an unlabeled corpus and a knowledge base (KB).
Fundamental to many DS approaches is the assumption
that KB facts are expressed at least once (EALO)
in the text corpus. Often, however, KB facts are actually
expressed in the corpus many times, in which cases
EALO-based systems underuse the available training
data. To address this problem, we introduce "expressed
at least α percent" (EALA) assumption, which
asserts that expressions of KB facts account for up to
α% of the corresponding mentions. We show that for
the same level of precision as the EALO approach, the
EALA approach achieves up to 66% higher recall on
category recognition and 53% higher recall on relation
recognition.
@inproceedings{betteridge2014assuming,
Author = {Betteridge, Justin and Ritter, Alan and Mitchell, Tom},
Booktitle = {The Twenty-Seventh International FLAIRS Conference},
Title = {Assuming Facts Are Expressed More Than Once},
Year = {2014}}
-
Estimating Accuracy from Unlabeled Data.
E. A. Platanios, A. Blum, T. Mitchell.
In Uncertainty in Artificial Intelligence (UAI), 2014.
[PDF]
[abstract]
[bib]
We consider the question of how unlabeled data can be used to estimate
the true accuracy of learned classifiers. This is an important question
for any autonomous learning system that must estimate its accuracy
without supervision, and also when classifiers trained from one data
distribution must be applied to a new distribution (e.g., document
classifiers trained on one text corpus are to be applied to a second
corpus). We first show how to estimate error rates exactly from
unlabeled data when given a collection of competing classifiers that
make independent errors, based on the agreement rates between subsets
of these classifiers. We further show that even when the competing
classifiers do not make independent errors, both their accuracies and
error dependencies can be estimated by making certain relaxed
assumptions. Experiments on two real-world data sets produce
estimates within a few percent of the true accuracy, using solely
unlabeled data. These results are of practical significance in
situations where labeled data is scarce and shed light on the more
general question of how the consistency among multiple functions is
related to their true accuracies.
@inproceedings{Platanios:2014ti,
author = {Platanios, Emmanouil Antonios and Blum, Avrim and Mitchell, Tom M},
title = {{Estimating Accuracy from Unlabeled Data}},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2014},
pages = {1--10}}
-
CTPs: Contextual Temporal Profiles for Time Scoping Facts via Entity State Change Detection.
D.T. Wijaya, N. Nakashole and T.M. Mitchell.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.
[PDF]
[abstract]
[bib]
Temporal scope adds a time dimension to facts in Knowledge Bases (KBs).
Existing methods for temporal scope inference and extraction still
suffer from low accuracy. In this paper, we present a novel method that
leverages temporal profiles augmented with context -- Contextual Temporal
Profiles (CTPs) of entities. Through change patterns in an entity's CTP,
we model the entity's state change brought about by real world events
that happen to the entity (e.g., hired, fired, divorced, etc.). This leads
to a new formulation of the temporal scoping problem as a state change detection
problem. Our experiments show that this formulation and the resulting
solution are highly effective for inferring the temporal scope of facts.
@InProceedings{wijaya-nakashole-mitchell:2014:EMNLP,
author = {Wijaya, Derry and Nakashole, Ndapa and Mitchell, Tom},
title = {CTPs: Contextual Temporal Profiles for Time Scoping Facts via Entity State Change Detection},
booktitle = {Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2014},
address = {Doha, Qatar},
publisher = {Association for Computational Linguistics}}
-
Incorporating Vector Space Similarity in Random Walk Inference over Knowledge Bases.
M. Gardner, P. Talukdar, J. Krishnamurthy and T.M. Mitchell.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.
[PDF]
[abstract]
[bib]
Much work in recent years has gone into the construction of large knowledge
bases (KBs), such as Freebase, DBPedia, NELL, and YAGO. While these KBs are
very large, they are still very incomplete, necessitating the use of inference
to fill in gaps. Prior work has shown how to make use of a large text corpus
to augment random walk inference over KBs. We present two improvements to the
use of such large corpora to augment KB inference. First, we present a new
technique for combining KB relations and surface text into a single graph
representation that is much more compact than graphs used in prior work.
Second, we describe how to incorporate vector space similarity into random walk
inference over KBs, reducing the feature sparsity inherent in using surface
text. This allows us to combine distributional similarity with symbolic
logical inference in novel and effective ways. With experiments on many
relations from two separate KBs, we show that our methods significantly
outperform prior work on KB inference, both in the size of problem our methods
can handle and in the quality of predictions made.
@InProceedings{gardner2014incorporating,
author = {Gardner, Matt and Talukdar, Partha and Krishnamurthy, Jayant and Mitchell, Tom},
title = {Incorporating Vector Space Similarity in Random Walk Inference over Knowledge Bases},
booktitle = {Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
month = {October},
year = {2014},
address = {Doha, Qatar},
publisher = {Association for Computational Linguistics}}
-
Language-Aware Truth Assessment of Fact Candidates.
N. Nakashole, T. Mitchell.
In Proceedings of
the 52nd Annual Meeting of the Association for Computational Linguistics (ACL), 2014.
[PDF]
[abstract]
[bib]
This paper introduces FactChecker, a language-aware approach to truth-finding.
FactChecker differs from prior approaches
in that it does not rely on iterative peer voting;
instead, it leverages language to infer the believability of fact candidates.
In particular, FactChecker makes use of linguistic features to detect
if a given source objectively states facts or is speculative and opinionated.
To ensure that fact candidates mentioned in similar sources have similar believability,
FactChecker augments objectivity with a co-mention score to compute the overall believability score of a
fact candidate. Our experiments on various datasets show that FactChecker yields higher accuracy than existing approaches.
@inproceedings{nakashole2014truth,
author = {Nakashole, Ndapandula and Mitchell, Tom M},
title = {{Language-Aware Truth Assessment of Fact Candidates}},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational
Linguistics (ACL)},
year = {2014},
pages = {1009--1019}}
-
Scaling Graph-based Semi Supervised Learning to Large Number of Labels Using Count-Min Sketch
P. P. Talukdar, and W. Cohen
In 17th International Conference on Artificial Intelligence and Statistics (AISTATS), 2014.
[PDF]
[abstract]
[bib]
Graph-based Semi-supervised learning (SSL) algorithms have been successfully used in a large number of applications. These methods classify initially unlabeled nodes by propagating label information over the structure of the graph starting from seed nodes. Graph-based SSL algorithms usually scale linearly with the number of distinct labels (m), and require O(m) space on each node. Unfortunately, there exist many applications of practical significance with very large m over large graphs, demanding better space and time complexity. In this paper, we propose MAD-Sketch, a novel graph-based SSL algorithm which compactly stores label distribution on each node using Count-min Sketch, a randomized data structure. We present theoretical analysis showing that under mild conditions, MAD-Sketch can reduce space complexity at each node from O(m) to O(log m), and achieve similar savings in time complexity as well. We support our analysis through experiments on multiple real world datasets. We observe that MAD-Sketch achieves similar performance as existing state-of-the-art graph-based SSL algorithms, while requiring a smaller memory footprint and at the same time achieving up to 10x speedup. We find that MAD-Sketch is able to scale to datasets with one million labels, which is beyond the scope of existing graph-based SSL algorithms.
@inproceedings{talukdar2014scaling,
title={Scaling Graph-based Semi Supervised Learning to Large Number of Labels Using Count-Min Sketch},
author={Talukdar, Partha P and Cohen, William W},
booktitle={17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014)},
year={2014},
address = {Reykjavik, Iceland}}
-
Programming with Personalized PageRank: A Locally Groundable First-Order Probabilistic Logic.
W.Y. Wang, K. Mazaitis and W.W. Cohen.
In Proceedings of the Conference on Information and Knowledge Management (CIKM), 2013.
[PDF]
[abstract]
[bib]
Many information-management tasks (including classification, retrieval,
information extraction, and information integration) can be formalized as
inference in an appropriate probabilistic first-order logic. However, most
probabilistic first-order logics are not efficient enough for
realistically-sized instances of these tasks. One key problem is that
queries are typically answered by "grounding" the query---i.e., mapping it
to a propositional representation, and then performing propositional
inference---and with a large database of facts, groundings can be very
large, making inference and learning computationally expensive. Here we
present a first-order probabilistic language which is well-suited to
approximate "local" grounding: in particular, every query Q can be
approximately grounded with a small graph. The language is an extension of
stochastic logic programs where inference is performed by a variant of
personalized PageRank. Experimentally, we show that the approach performs
well on an entity resolution task, a classification task, and a joint
inference task; that the cost of inference is independent of database size;
and that speedup in learning is possible by multi-threading.
@inproceedings{wangprogramming2013,
title={Programming with Personalized PageRank: A Locally Groundable First-Order Probabilistic Logic},
author={Wang, William Yang and Mazaitis, Kathryn and Cohen, William W},
booktitle={Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM 2013)},
year={2013}}
-
Improving Learning and Inference in a Large Knowledge-base using Latent Syntactic Cues.
Matt Gardner, Partha Pratim Talukdar, Bryan Kisiel, and Tom Mitchell.
In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), 2013.
[PDF]
[abstract]
[bib]
Automatically constructed Knowledge Bases (KBs) are often incomplete and there
is a genuine need to improve their coverage. Path Ranking Algorithm (PRA) is a
recently proposed method which aims to improve KB coverage by performing
inference directly over the KB graph. For the first time, we demonstrate that
addition of edges labeled with latent features mined from a large dependency
parsed corpus of 500 million Web documents can significantly outperform
previous PRA-based approaches on the KB inference task. We present extensive
experimental results validating this finding. The resources presented in this
paper are publicly available.
@inproceedings{gardnerpra2013,
title={Improving Learning and Inference in a Large Knowledge-base using Latent Syntactic Cues},
author={Gardner, Matt and Talukdar, Partha Pratim and Kisiel, Bryan and Mitchell, Tom},
booktitle={Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013)},
year={2013}}
-
PIDGIN: Ontology Alignment using Web Text as Interlingua.
D.T. Wijaya, P.P. Talukdar and T.M. Mitchell.
In Proceedings of the Conference on Information and Knowledge Management (CIKM), 2013.
[PDF]
[abstract]
[bib]
The problem of aligning ontologies and database schemas across different knowledge
bases and databases is fundamental to knowledge management problems, including
the problem of integrating the disparate knowledge sources that form the semantic
web's Linked Data. We present a novel approach to this ontology alignment problem
that employs a very large natural language text corpus as an interlingua to relate
different knowledge bases (KBs). The result is a scalable and robust method (PIDGIN)
that aligns relations and categories across different KBs by analyzing both
(1) shared relation instances across these KBs, and (2) the verb phrases in
the text instantiations of these relation instances. Experiments with PIDGIN demonstrate
its superior performance when aligning ontologies across large existing KBs including NELL,
Yago and Freebase. Furthermore, we show that in addition to aligning ontologies,
PIDGIN can automatically learn from text the verb phrases that identify relations,
and can also type the arguments of relations across different KBs.
@InProceedings{wijaya:2013:PIDGIN,
author = {Wijaya, Derry and Talukdar, Partha Pratim and Mitchell, Tom},
title = {PIDGIN: Ontology Alignment using Web Text as Interlingua},
booktitle = {Proceedings of the Conference on Information and Knowledge Management (CIKM 2013)},
month = {October},
year = {2013},
address = {San Francisco, USA},
publisher = {Association for Computing Machinery}}
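One of PIDGIN's two signals, shared relation instances across KBs, can be illustrated with a minimal sketch; the relation names and instances below are toy examples, and the full system additionally exploits verb phrases from a large text corpus and graph-based inference:

```python
def jaccard(a, b):
    """Jaccard similarity of two sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def align_relations(kb1, kb2):
    """Rank kb2 relations against each kb1 relation by the overlap of
    their (subject, object) instance pairs."""
    return {r1: sorted(((jaccard(i1, i2), r2) for r2, i2 in kb2.items()),
                       reverse=True)
            for r1, i1 in kb1.items()}

nell = {"cityInCountry": {("paris", "france"), ("tokyo", "japan")}}
yago = {"isLocatedIn": {("paris", "france"), ("tokyo", "japan"),
                        ("lyon", "france")},
        "hasCapital": {("france", "paris")}}

best_score, best_rel = align_relations(nell, yago)["cityInCountry"][0]
print(best_rel)  # isLocatedIn
```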
-
Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World.
Jayant Krishnamurthy and Thomas Kollar. in Transactions of the Association for Computational Linguistics, Volume 1, 2013.
[PDF]
[Data and Online Appendix]
-
Vector Space Semantic Parsing: A Framework for Compositional Vector Space Models.
Jayant Krishnamurthy and Tom M. Mitchell. in
Proceedings of the ACL 2013 Workshop on Continuous Vector Space Models and their Compositionality, 2013.
[PDF]
-
Classifying Entities into an Incomplete Ontology.
Bhavana Dalvi,
William W. Cohen, and Jamie Callan, in AKBC, 2013,
the 3rd Knowledge Extraction Workshop at CIKM 2013.
[Draft]
-
Exploratory Learning.
Bhavana Dalvi,
William W. Cohen, and Jamie Callan, in Proceedings of the European Conference on Machine Learning (ECML/PKDD), 2013.
[PDF]
[bib]
@inproceedings{dalvi_ecml13,
author = {Dalvi, Bhavana and Cohen, William W. and Callan, Jamie},
title = {Exploratory Learning},
booktitle = {Proceedings of the 2013 European conference on Machine Learning and Knowledge Discovery in Databases},
series = {ECML PKDD'13},
year = {2013},
location = {Prague, Czech Republic},
publisher = {Springer-Verlag},
}
-
From Topic Models to Semi-Supervised Learning: Biasing Mixed-membership Models to Exploit Topic-Indicative Features in Entity Clustering.
Ramnath Balasubramanyan, Bhavana Dalvi and William W. Cohen, in Proceedings of the European Conference on Machine Learning (ECML/PKDD), 2013.
[PDF]
[bib]
@inproceedings{rbalasub_dalvi_ecml13,
author = {Balasubramanyan, Ramnath and Dalvi, Bhavana and Cohen, William W.},
title = {From Topic Models to Semi-Supervised Learning: Biasing Mixed-membership Models to Exploit Topic-Indicative Features in Entity Clustering},
booktitle = {Proceedings of the 2013 European conference on Machine Learning and Knowledge Discovery in Databases},
series = {ECML PKDD'13},
year = {2013},
location = {Prague, Czech Republic},
publisher = {Springer-Verlag},
}
-
Very Fast Similarity Queries on Semi-Structured Data from the Web.
Bhavana Dalvi,
William W. Cohen, and Jamie Callan, in SDM, 2013.
[PDF]
-
Conversing Learning: active learning and active social interaction for human supervision in never-ending learning systems.
S. D. S. Pedro and E. R. Hruschka Jr.
In Proceedings of the 13th Ibero-American Conference on AI (IBERAMIA), 2012.
[PDF]
[abstract]
[bib]
The Machine Learning community has been introduced to NELL (Never-Ending Language Learning), a system able to learn from the web and to use its own knowledge to keep learning better each day. The idea of continuously learning from the web raises concerns about reliability and accuracy, especially when the learning process uses its own knowledge to improve its learning capabilities. Because its knowledge base keeps growing forever, such a system requires self-supervision as well as self-reflection. The increased use of the Internet that allowed NELL's creation also brought a new source of online information: social media becomes more popular every day, and the AI community can now develop research to take advantage of this information, aiming to turn it into knowledge. This work follows this lead and proposes a new machine learning approach, called Conversing Learning, which uses collective knowledge from web community users to provide self-supervision and self-reflection to intelligent machines, so that they can improve their own learning capabilities. The Conversing Learning approach draws on concepts from Active Learning as well as Question Answering, with the goal of showing what can be done toward autonomous human-computer interaction that automatically improves machine learning performance.
@incollection{pedro2012conversing,
title={Conversing learning: Active learning and active social interaction for human supervision in never-ending learning systems},
author={Pedro, Saulo DS and {Hruschka Jr}, Estevam R},
booktitle={Advances in Artificial Intelligence--IBERAMIA 2012},
pages={231--240},
year={2012},
publisher={Springer}
}
-
Coupled Bayesian Sets Algorithm for Semi-supervised Learning and Information Extraction.
S. Verma and E. R. Hruschka Jr.
In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2012.
[PDF]
[abstract]
[bib]
[slides]
Our inspiration comes from NELL (Never-Ending Language Learning), a computer program running at Carnegie Mellon University to extract structured information from unstructured web pages. We consider a semi-supervised learning approach to extracting category instances (e.g. country(USA), city(New York)) from web pages, starting with a handful of labeled training examples of each category or relation, plus hundreds of millions of unlabeled web documents. Semi-supervised approaches using a small number of labeled examples together with many unlabeled examples are often unreliable, as they frequently produce an internally consistent, but nevertheless incorrect, set of extractions. We believe that this problem can be overcome by simultaneously learning independent classifiers in a new approach, named the Coupled Bayesian Sets algorithm and based on Bayesian Sets, for many different categories and relations, in the presence of an ontology defining constraints that couple the training of these classifiers. Experimental results show that simultaneously learning a coupled collection of classifiers for 11 randomly chosen categories resulted in much more accurate extractions than training classifiers through the original Bayesian Sets algorithm, Naive Bayes, BaS-all, and the Coupled Pattern Learner (the category extractor used in NELL).
@InProceedings{verma:2012:cbs,
author = {Verma, Saurabh and {Hruschka Jr.}, Estevam Rafael},
title = {Coupled Bayesian Sets Algorithm for Semi-supervised Learning and Information Extraction},
booktitle = {Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2012)},
month = {September},
year = {2012},
address = {Bristol, UK},
publisher = {Association for Computing Machinery}}
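A minimal sketch of the underlying Bayesian Sets score (Heller & Ghahramani, 2005), plus a toy mutual-exclusion coupling check; the binary feature vectors, category names, and the specific coupling rule below are illustrative assumptions, not the paper's implementation:

```python
import math

def bayesian_sets_score(x, seeds, alpha=0.5, beta=0.5):
    """Bayesian Sets score log p(x | seeds) / p(x) for a binary feature
    vector x under independent Beta-Bernoulli feature models."""
    n = len(seeds)
    score = 0.0
    for j in range(len(x)):
        nj = sum(s[j] for s in seeds)            # seeds with feature j active
        a_post, b_post = alpha + nj, beta + n - nj
        score += math.log((alpha + beta) / (alpha + beta + n))
        score += x[j] * math.log(a_post / alpha)
        score += (1 - x[j]) * math.log(b_post / beta)
    return score

def coupled_extract(candidate, seeds_by_category, exclusive_pairs):
    """Toy coupling: accept the best-scoring category only if no mutually
    exclusive category scores at least as well."""
    scores = {c: bayesian_sets_score(candidate, s)
              for c, s in seeds_by_category.items()}
    best = max(scores, key=scores.get)
    for a, b in exclusive_pairs:
        other = b if best == a else a if best == b else None
        if other is not None and scores[best] <= scores[other]:
            return None
    return best

seeds = {"city": [(1, 1, 0), (1, 0, 0)],       # toy binary context features
         "country": [(0, 1, 1), (0, 0, 1)]}
print(coupled_extract((1, 1, 0), seeds, [("city", "country")]))  # city
```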
-
Acquiring Temporal Constraints between Relations.
P.P. Talukdar, D.T. Wijaya and T.M. Mitchell.
In Proceedings of the Conference on Information and Knowledge Management (CIKM), 2012.
[PDF]
[abstract]
[bib]
We consider the problem of automatically acquiring knowledge
about the typical temporal orderings among relations
(e.g., actedIn(person, film) typically occurs before wonPrize
(film, award)), given only a database of known facts (relation
instances) without time information, and a large document
collection. Our approach is based on the conjecture
that the narrative order of verb mentions within documents
correlates with the temporal order of the relations they represent.
We propose a family of algorithms based on this conjecture,
utilizing a corpus of 890 million dependency-parsed sentences
to obtain verbs that represent relations of interest,
and utilizing Wikipedia documents to gather statistics on
narrative order of verb mentions. Our proposed algorithm,
GraphOrder, is a novel and scalable graph-based label propagation
algorithm that takes transitivity of temporal order
into account, as well as these statistics on narrative order of
verb mentions. This algorithm achieves as high as 38.4% absolute
improvement in F1 over a random baseline. Finally,
we demonstrate the utility of this learned general knowledge
about typical temporal orderings among relations, by showing
that these temporal constraints can be successfully used
by a joint inference framework to assign specific temporal
scopes to individual facts.
@InProceedings{talukdar:2012:temporal,
author = {Talukdar, Partha Pratim and Wijaya, Derry and Mitchell, Tom},
title = {Acquiring Temporal Constraints between Relations},
booktitle = {Proceedings of the Conference on Information and Knowledge Management (CIKM 2012)},
month = {October},
year = {2012},
address = {Hawaii, USA},
publisher = {Association for Computing Machinery}}
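The narrative-order conjecture can be illustrated by counting, over toy documents, how often one verb of interest is first mentioned before another; the real system gathers such statistics over Wikipedia and then runs the GraphOrder label propagation, which this sketch omits:

```python
from collections import Counter
from itertools import combinations

def narrative_order_counts(documents, verbs):
    """Count, over documents, how often one verb of interest is first
    mentioned before another; the conjecture is that this narrative order
    correlates with the temporal order of the underlying relations."""
    before = Counter()
    for doc in documents:
        positions = {}
        for i, token in enumerate(doc):
            if token in verbs and token not in positions:
                positions[token] = i          # keep first mention only
        for v1, v2 in combinations(positions, 2):
            if positions[v1] < positions[v2]:
                before[(v1, v2)] += 1
            else:
                before[(v2, v1)] += 1
    return before

docs = [
    "she acted in the film and later won an award".split(),
    "he acted brilliantly and won praise".split(),
    "the film won awards after she acted in it".split(),
]
counts = narrative_order_counts(docs, {"acted", "won"})
print(counts[("acted", "won")], counts[("won", "acted")])  # 2 1
```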
-
Weakly Supervised Training of Semantic Parsers.
J. Krishnamurthy and T.M. Mitchell. In
Proceedings of the 2012 Conference on Empirical Methods in
Natural Language Processing and Computational Natural Language
Learning (EMNLP-CoNLL), 2012.
[PDF]
[abstract]
[bib]
We present a method for training a semantic
parser using only a knowledge base and an unlabeled text corpus,
without any individually annotated sentences. Our key observation
is that multiple forms of weak supervision can be combined to
train an accurate semantic parser: semantic supervision from a
knowledge base, and syntactic supervision from dependency-parsed
sentences. We apply our approach to train a semantic parser that
uses 77 relations from Freebase in its knowledge
representation. This semantic parser extracts instances of binary
relations with state-of-the-art accuracy, while simultaneously
recovering much richer semantic structures, such as conjunctions
of multiple relations with partially shared arguments. We
demonstrate recovery of this richer structure by extracting
logical forms from natural language queries against Freebase. On
this task, the trained semantic parser achieves 80% precision and
56% recall, despite never having seen an annotated logical form.
@InProceedings{krishnamurthy:2012,
author = {Krishnamurthy, Jayant and Mitchell, Tom M.},
title = {Weakly Supervised Training of Semantic Parsers},
booktitle = {Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
month = {July},
year = {2012},
publisher = {Association for Computational Linguistics}}
-
Collectively Representing Semi-Structured Data from the Web.
Bhavana Dalvi,
William W. Cohen, and Jamie Callan, in AKBC-2012, 2012.
[PDF]
-
Bootstrapping Biomedical Ontologies for Scientific Text using NELL.
Dana Movshovitz-Attias and William W. Cohen, in BioNLP-2012, 2012.
[PDF]
-
WebSets: Extracting Sets of Entities from the Web Using Unsupervised Information Extraction.
Bhavana Dalvi,
William W. Cohen, and Jamie Callan, in WSDM-2012, 2012.
[PDF]
[bib]
@InProceedings{dalvi:wsdm:2012,
author = {Dalvi, Bhavana and Cohen, William W. and Callan, Jamie},
title = {WebSets: Extracting Sets of Entities from the Web Using Unsupervised Information Extraction},
booktitle = {Proceedings of the Fifth ACM International Conference on Web Search and Data Mining (WSDM)},
month = {February},
year = {2012},
address = {Seattle, Washington, USA},
publisher = {Association for Computing Machinery}}
-
Coupled Temporal Scoping of Relational Facts.
P.P. Talukdar, D.T. Wijaya and T.M. Mitchell.
In Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), 2012.
[PDF]
[abstract]
[bib]
Recent research has made significant advances in automatically
constructing knowledge bases by extracting relational
facts (e.g., Bill Clinton-presidentOf-US) from large text corpora.
Temporally scoping such relational facts in the knowledge
base (i.e., determining that Bill Clinton-presidentOf-US
is true only during the period 1993 - 2001) is an important,
but relatively unexplored problem. In this paper,
we propose a joint inference framework for this task,
which leverages fact-specific temporal constraints, and weak
supervision in the form of a few labeled examples. Our
proposed framework, CoTS (Coupled Temporal Scoping),
exploits temporal containment, alignment, succession, and
mutual exclusion constraints among facts from within and
across relations. Our contribution is multi-fold. Firstly,
while most previous research has focused on micro-reading
approaches for temporal scoping, we pose it in a macro-reading
fashion, as a change detection in a time series of
facts' features computed from a large number of documents.
Secondly, to the best of our knowledge, there is no other
work that has used joint inference for temporal scoping. We
show that joint inference is effective compared to doing temporal
scoping of individual facts independently. We conduct
our experiments on large scale open-domain publicly
available time-stamped datasets, such as English Gigaword
Corpus and Google Books Ngrams, demonstrating CoTS's
effectiveness.
@InProceedings{talukdar:2012:coupled,
author = {Talukdar, Partha Pratim and Wijaya, Derry and Mitchell, Tom},
title = {Coupled Temporal Scoping of Relational Facts},
booktitle = {Proceedings of the Fifth ACM International Conference on Web Search and Data Mining (WSDM)},
month = {February},
year = {2012},
address = {Seattle, Washington, USA},
publisher = {Association for Computing Machinery}}
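The macro-reading view, temporal scoping as change detection in a time series of fact features, can be sketched with a toy single change-point detector; the actual features, datasets, constraints, and joint inference in CoTS are far richer, and the mention counts below are invented:

```python
def change_point(series, min_gap=1):
    """Find the split index that maximizes the difference between the mean
    of the series before it and after it, a toy stand-in for the change
    detection CoTS performs on time series of fact features computed from
    time-stamped corpora."""
    best_t, best_gap = None, 0.0
    for t in range(min_gap, len(series) - min_gap + 1):
        left = sum(series[:t]) / t
        right = sum(series[t:]) / (len(series) - t)
        if abs(right - left) > best_gap:
            best_t, best_gap = t, abs(right - left)
    return best_t

# Toy yearly mention counts of a fact's features: low before the fact
# holds, high while it holds, low again afterwards.
mentions = [1, 2, 40, 45, 50, 48, 52, 47, 49, 51, 46, 3, 2]
print(change_point(mentions))  # 2: the index where the level shifts up
```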
-
Closing the Loop: Fast, Interactive Semi-supervised Annotation With Queries on Features and Instances.
B. Settles.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2011.
[PDF]
[abstract]
[bib]
[software homepage]
This paper describes DUALIST, an active learning annotation paradigm which
solicits and learns from labels on both features (e.g., words) and instances
(e.g., documents). We present a novel semi-supervised training algorithm
developed for this setting, which is (1) fast enough to support real-time
interactive speeds, and (2) at least as accurate as pre-existing methods
for learning with mixed feature and instance labels. Human annotators in
user studies were able to produce near-state-of-the-art classifiers on
several corpora in a variety of application domains with only a few minutes
of effort.
@InProceedings{settles:2011:EMNLP,
author = {Settles, Burr},
title = {Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
year = {2011},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {1467--1478},
url = {http://www.aclweb.org/anthology/D11-1136}}
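A minimal sketch of the core idea of learning from labeled features as well as labeled instances: a labeled word contributes a large pseudo-count to its class in a multinomial Naive Bayes model. The boost value, vocabulary, and documents are toy assumptions, and DUALIST's EM step over unlabeled documents is omitted:

```python
import math
from collections import defaultdict

def train_mnb(labeled_docs, labeled_features, vocab, smooth=1.0, boost=50.0):
    """Multinomial Naive Bayes where a labeled *feature* (a word tagged
    with a class) adds a large pseudo-count to that class, alongside
    ordinary labeled documents."""
    counts = defaultdict(lambda: defaultdict(float))
    for doc, label in labeled_docs:
        for w in doc:
            counts[label][w] += 1
    for w, label in labeled_features:
        counts[label][w] += boost
    models = {}
    for label, wc in counts.items():
        total = sum(wc.values()) + smooth * len(vocab)
        models[label] = {w: math.log((wc.get(w, 0) + smooth) / total)
                         for w in vocab}
    return models

def classify(doc, models):
    return max(models, key=lambda c: sum(models[c][w] for w in doc
                                         if w in models[c]))

vocab = {"goal", "match", "vote", "election"}
model = train_mnb([(["goal", "match"], "sports")],
                  [("election", "politics"), ("vote", "politics")], vocab)
print(classify(["vote", "election"], model))  # politics
```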
-
Discovering Relations between Noun Categories.
T. Mohamed, E.R. Hruschka Jr. and T.M. Mitchell.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2011.
[PDF]
[abstract]
[bib]
Traditional approaches to Relation Extraction from text
require manually defining the relations to be extracted. We propose
here an approach to automatically discovering relevant relations,
given a large text corpus plus an initial ontology defining hundreds
of noun categories (e.g., Athlete, Musician, Instrument). Our approach
discovers frequently stated relations between pairs of these
categories, using a two step process. For each pair of categories
(e.g., Musician and Instrument) it first co-clusters the text contexts
that connect known instances of the two categories, generating a
candidate relation for each resulting cluster. It then applies a
trained classifier to determine which of these candidate relations is
semantically valid. Our experiments apply this to a text corpus
containing approximately 200 million web pages and an ontology
containing 122 categories from the NELL system, producing a set of 781
proposed candidate relations, approximately half of which are
semantically valid. We conclude this is a useful approach to
semi-automatic extension of the ontology for large-scale information
extraction systems such as NELL.
@InProceedings{mohamed-hruschka-mitchell:2011:EMNLP,
author = {Mohamed, Thahir and Hruschka, Estevam and Mitchell, Tom},
title = {Discovering Relations between Noun Categories},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
year = {2011},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {1447--1455},
url = {http://www.aclweb.org/anthology/D11-1134}}
-
Random Walk Inference and Learning in A Large Scale Knowledge Base.
N. Lao, T.M. Mitchell, W.W. Cohen
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2011.
[PDF]
[abstract]
[bib]
We consider the problem of performing learning and inference in a large scale knowledge base containing imperfect knowledge with incomplete
coverage. We show that a soft inference procedure based on a combination of constrained, weighted, random walks through the knowledge base
graph can be used to reliably infer new beliefs for the knowledge base. More specifically, we show that the system can learn to infer
different target relations by tuning the weights associated with random walks that follow different paths through the graph, using a version
of the Path Ranking Algorithm (Lao & Cohen, 2010). We apply this approach to a knowledge base of approximately 500,000 beliefs extracted
imperfectly from the web by NELL, a never-ending language learner (Carlson et al., 2010). This new system improves significantly over NELL's
earlier Horn-clause learning and inference method: it obtains nearly double the precision at rank 100, and the new learning method is also
applicable to many more inference tasks.
@InProceedings{lao-mitchell-cohen:2011:EMNLP,
author = {Lao, Ni and Mitchell, Tom and Cohen, William W.},
title = {Random Walk Inference and Learning in A Large Scale Knowledge Base},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
year = {2011},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {529--539},
url = {http://www.aclweb.org/anthology/D11-1049}}
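The random-walk features at the heart of this approach can be sketched as path-constrained walk probabilities over a toy KB graph; PRA then learns per-target-relation weights over many such path features, which this sketch omits, and the graph below is invented:

```python
from collections import defaultdict

def path_feature(graph, start, path):
    """Distribution over end nodes of a random walk from `start`
    constrained to follow the given relation path, the feature PRA
    computes per (node pair, path). `graph[node][relation]` lists
    neighbors reachable via that relation."""
    dist = {start: 1.0}
    for relation in path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            neighbors = graph.get(node, {}).get(relation, [])
            for nb in neighbors:
                nxt[nb] += p / len(neighbors)   # uniform over out-edges
        dist = dict(nxt)
    return dist

# Toy KB: where does alice's team play? Follow (playsFor, homeCity).
kb = {"alice": {"playsFor": ["steelers"]},
      "steelers": {"homeCity": ["pittsburgh"]}}
print(path_feature(kb, "alice", ["playsFor", "homeCity"]))
```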
-
Which Noun Phrases Denote Which Concepts?
J. Krishnamurthy, T.M. Mitchell. In Proceedings of
the 49th Annual Meeting of the Association for Computational Linguistics (ACL), 2011.
[PDF]
[abstract]
[bib]
Resolving polysemy and synonymy is required for high-quality
information extraction. We present ConceptResolver, a component for
the Never-Ending Language Learner (NELL) that
handles both phenomena by identifying the latent concepts that noun
phrases refer to. ConceptResolver performs both word sense induction
and synonym resolution on relations extracted from text using an
ontology and a small amount of labeled data. Domain knowledge (the ontology)
guides concept creation by defining a set of possible
semantic types for concepts. Word sense induction is performed by
inferring a set of semantic types for each noun phrase. Synonym
detection exploits redundant information to train several
domain-specific synonym classifiers in a semi-supervised fashion.
When ConceptResolver is run on NELL's knowledge base, 87% of the word
senses it creates correspond to real-world concepts, and 85% of noun
phrases that it suggests refer to the same concept are indeed synonyms.
@inproceedings{krishnamurthy-acl,
Title = {Which Noun Phrases Denote Which Concepts?},
Author = {Jayant Krishnamurthy and Tom M. Mitchell},
Booktitle = {Proceedings of the Forty Ninth Annual Meeting of the Association for Computational Linguistics},
Year = {2011}}
-
Adaptation of Graph-Based Semi-Supervised Methods to Large-Scale Text Data.
Frank Lin and William W. Cohen, in MLG-2011 , 2011.
[PDF]
-
Understanding Semantic Change of Words Over Centuries.
D.T. Wijaya and R. Yeniterzi.
In Workshop on Detecting and Exploiting Cultural Diversity on the Social Web (DETECT), 2011 at CIKM 2011.
[PDF]
[abstract]
In this paper, we propose to model and analyze changes
that occur to an entity in terms of changes in the words
that co-occur with the entity over time. We propose to do
an in-depth analysis of how this co-occurrence changes over
time, how the change influences the state (semantic, role)
of the entity, and how the change may correspond to events
occurring in the same period of time. We propose to identify
clusters of topics surrounding the entity over time using
Topics-Over-Time (TOT) and k-means clustering. We
conduct this analysis on the Google Books Ngram dataset. We
show how clustering words that co-occur with an entity of
interest in 5-grams can shed some light on the nature of
change that occurs to the entity and identify the period in
which the change occurs. We find that the period identified
by our model precisely coincides with events in the same
period that correspond to the change that occurs.
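A much simpler stand-in for the paper's Topics-Over-Time and k-means analysis conveys the idea: compare an entity's co-occurring words across adjacent time slices and report the pair with the lowest overlap as the likely change period (all decades and context words below are invented):

```python
def change_period(contexts_by_decade):
    """Report the pair of adjacent time slices whose co-occurring word
    sets overlap least, a crude proxy for the period in which an
    entity's meaning changed."""
    decades = sorted(contexts_by_decade)
    def overlap(a, b):
        return len(a & b) / len(a | b)
    gaps = [(overlap(contexts_by_decade[d1], contexts_by_decade[d2]), d1, d2)
            for d1, d2 in zip(decades, decades[1:])]
    return min(gaps)[1:]

contexts = {1900: {"carriage", "horse", "road"},
            1910: {"carriage", "horse", "engine"},
            1920: {"engine", "motor", "road"}}
print(change_period(contexts))  # (1910, 1920)
```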
-
"Nut Case: What does It Mean?": Understanding Semantic Relationship between Nouns in Noun Compounds through Paraphrasing and Ranking the Paraphrases.
D.T. Wijaya and P. Gianfortoni.
In Workshop on Search and Mining Entity-Relationship Data (SMER), 2011 at CIKM 2011.
[PDF]
[abstract]
A noun compound (NC) is a sequence of two or more nouns
(entities) acting as a single noun entity that encodes an implicit
semantic relation between its noun constituents. Given an NC
such as 'headache pills' and possible paraphrases such as 'pills
that induce headache' or 'pills that relieve headache', can we learn
to choose which verb, 'induce' or 'relieve', best describes the
semantic relation encoded in 'headache pills'? In this paper, we
describe our approaches to ranking human-proposed paraphrasing
verbs of NCs. Our contribution is a novel approach that uses a two-step
process of clustering similar NCs and then labeling the best
paraphrasing verb as the most prototypical verb in the cluster. The
approach performs the best with an average Spearman's rank
correlation of 0.55. This approach, while being computationally
simpler, gives a better ranking than the current state of the art. The
result shows the potential of our approach for finding implicit
relations between entities, especially when the relations are not
explicit in the context in which the entities appear but are instead
implicit in the relationship between their constituents.
-
Toward an Architecture for Never-Ending Language Learning.
A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E.R. Hruschka Jr. and T.M. Mitchell.
In Proceedings of the Conference on Artificial Intelligence (AAAI), 2010.
[PDF]
[abstract]
[bib]
[supplementary materials]
We consider here the problem of building a never-ending language learner; that
is, an intelligent computer agent that runs forever and that each day must (1)
extract, or read, information from the web to populate a growing structured
knowledge base, and (2) learn to perform this task better than on the
previous day. In particular, we propose an approach and a set of design
principles for such an agent, describe a partial implementation of such a
system that has already learned to extract a knowledge base containing over
242,000 beliefs with an estimated precision of 74%, and discuss lessons
learned from this preliminary attempt to build a never-ending learning agent.
@inproceedings{carlson-aaai,
Title = {Toward an Architecture for Never-Ending Language Learning},
Author = {Andrew Carlson and Justin Betteridge and Bryan Kisiel and Burr Settles and Estevam R. Hruschka Jr. and Tom M. Mitchell},
Booktitle = {Proceedings of the Twenty-Fourth Conference on Artificial Intelligence (AAAI 2010)},
Year = {2010}}
-
Coupled Semi-Supervised Learning for Information Extraction.
A. Carlson, J. Betteridge, R.C. Wang, E.R. Hruschka Jr. and T.M. Mitchell.
In Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), 2010.
[PDF]
[abstract]
[bib]
[supplementary materials]
We consider the problem of semi-supervised learning to extract categories (e.g., academicFields, athletes) and relations (e.g., PlaysSport(athlete, sport)) from web pages, starting with a handful of labeled training examples of each category or relation, plus hundreds of millions of unlabeled web documents. Semi-supervised training using only a few labeled examples is typically unreliable because the learning task is underconstrained. This paper pursues the thesis that much greater accuracy can be achieved by further constraining the learning task, by coupling the semi-supervised training of many extractors for different categories and relations. We characterize several ways in which the training of category and relation extractors can be coupled, and present experimental results demonstrating significantly improved accuracy as a result.
@inproceedings{carlson-wsdm,
Title = {Coupled Semi-Supervised Learning for Information Extraction},
Author = {Andrew Carlson and Justin Betteridge and Richard C. Wang and Estevam R. Hruschka Jr. and Tom M. Mitchell},
Booktitle = {Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM 2010)},
Year = {2010}}
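One of the coupling types the paper characterizes, mutual exclusion between categories, can be sketched as a filter over candidate extractions; the category names and instances below are toy examples, and the real system couples the training of the extractors themselves rather than just filtering their outputs:

```python
def couple_filter(candidates, mutex_pairs):
    """Drop any instance proposed for two mutually exclusive categories,
    a toy version of one coupling constraint among the several the paper
    characterizes."""
    kept = {c: set(insts) for c, insts in candidates.items()}
    for a, b in mutex_pairs:
        conflict = kept.get(a, set()) & kept.get(b, set())
        if a in kept:
            kept[a] -= conflict
        if b in kept:
            kept[b] -= conflict
    return kept

cands = {"city": {"pittsburgh", "jordan"},
         "athlete": {"jordan", "serena"}}
filtered = couple_filter(cands, [("city", "athlete")])
print(filtered["city"], filtered["athlete"])  # {'pittsburgh'} {'serena'}
```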
-
Populating the Semantic Web by Macro-Reading Internet Text.
T.M. Mitchell, J.Betteridge, A. Carlson, E.R. Hruschka Jr. and R.C. Wang.
Invited Paper, In Proceedings of the International Semantic Web Conference (ISWC), 2009.
[PDF]
[abstract]
[bib]
A key question regarding the future of the semantic web is "how will we acquire structured information to populate the semantic web on a vast scale?" One approach is to enter this information manually. A second approach is to take advantage of pre-existing databases, and to develop common ontologies, publishing standards, and reward systems to make this data widely accessible. We consider here a third approach: developing software that automatically extracts structured information from unstructured text present on the web. We also describe preliminary results demonstrating that machine learning algorithms can learn to extract tens of thousands of facts to populate a diverse ontology, with imperfect but reasonably good accuracy.
@inproceedings{mitchell-iswc09,
Title = {Populating the Semantic Web by Macro-Reading Internet Text},
Author = {Tom M. Mitchell and Justin Betteridge and Andrew Carlson and Estevam R. Hruschka Jr. and Richard C. Wang},
Booktitle = {Proceedings of the 8th International Semantic Web Conference (ISWC 2009)},
Year = {2009}}
-
Coupling Semi-Supervised Learning of Categories and Relations.
A. Carlson, J. Betteridge, E.R. Hruschka Jr. and T.M. Mitchell.
In Proceedings of the NAACL HLT Workshop on Semi-supervised Learning for Natural Language Processing, 2009.
[PDF]
[abstract]
[bib]
We consider semi-supervised learning of information extraction methods, especially for extracting instances of noun categories (e.g., athlete, team) and relations (e.g., playsForTeam(athlete,team)). Semi-supervised approaches using a small number of labeled examples together with many unlabeled examples are often unreliable as they frequently produce an internally consistent, but nevertheless incorrect set of extractions. We propose that this problem can be overcome by simultaneously learning classifiers for many different categories and relations in the presence of an ontology defining constraints that couple the training of these classifiers. Experimental results show that simultaneously learning a coupled collection of classifiers for 30 categories and relations results in much more accurate extractions than training classifiers individually.
@inproceedings{carlson-sslnlp09,
Title = {Coupling Semi-Supervised Learning of Categories and Relations},
Author = {Andrew Carlson and Justin Betteridge and Estevam R. Hruschka Jr. and Tom M. Mitchell},
Booktitle = {Proceedings of the NAACL HLT 2009 Workshop on Semi-supervised Learning for Natural Language Processing},
Year = {2009}}