Good decision-making depends on comprehensive, accurate knowledge. But the information relevant to many important decisions in areas such as business, government, medicine and scientific research is massive, and growing at an accelerating pace. Relevant raw data is widely available on the web and in other data sources, but to be useful it usually must be gathered, extracted, organized, and normalized into a knowledge base.

Hand-built knowledge bases such as Wikipedia have made us all better decision-makers. However, more than human editing will be necessary to create a wide variety of domain-specific, deeply comprehensive, more highly structured knowledge bases.

A variety of automated methods have begun to reach levels of accuracy and scalability that make them applicable to automatically constructing useful knowledge bases from text and other sources. These capabilities have been enabled by research in areas including natural language processing, information extraction, information integration, databases, search and machine learning. There are substantial scientific and engineering challenges in advancing and integrating such relevant methodologies.

This workshop gathered researchers in a variety of fields that contribute to the automated construction of knowledge bases.

There has recently been a tremendous amount of new work in this area, some of it in traditionally disconnected communities. In this workshop the organizers aim to bring these communities together.

Topics of interest include:

  • information extraction; open information extraction; named entity extraction; entity resolution; relation extraction.
  • information integration; schema alignment; ontology alignment; ontology construction.
  • monolingual alignment, alignment between knowledge bases and text.
  • joint inference between text interpretation and knowledge bases.
  • pattern analysis, semantic analysis of natural language, reading the web, learning by reading.
  • databases; distributed information systems; probabilistic databases.
  • scalable computation; distributed computation.
  • information retrieval; search on mixtures of structured and unstructured data; querying under uncertainty.
  • machine learning; unsupervised, lightly-supervised and distantly-supervised learning; learning from naturally-available data.
  • human-computer collaboration in knowledge base construction; automated population of wikis.
  • dynamic data, online/on-the-fly adaptation of knowledge.
  • inference; scalable approximate inference.
  • languages, toolkits and systems for automated knowledge base construction.
  • demonstrations of existing automatically-built knowledge bases.