ESTC 2008 - Impressions

Brought to you by:

STI International

Supported by:

Austrian Computer Society
ARK Group
BMWA
BMVIT
ffg

Sponsored by:

theseus
W3C
STI Innsbruck
iSOCO
Ontotext
matrixware
ontos
metatomix
NEON-Project
DERI Galway
Active Project
Seekda
Semantic Exchange
Franz Inc
Talis
Service Web 3.0
Innoraise
soa4all
larck
AWS

Keynote Presentations

Extraction and resolution capabilities for entities, events and facts

Peter Jackson, Thomson Reuters Professional
Barak Pridor, ClearForest - a Thomson Reuters Markets company

"We discuss some current thinking around the use of machine extraction and resolution to deliver what we call 'Intelligent Information'. Such information is tailored to a user's underlying task, and must be derived automatically from underlying content sources. We describe a platform called ClearForest Tags, which performs state-of-the-art entity, fact and event extraction, and illustrate its utility with some commercial use cases. We also describe an entity resolution toolkit called Dexter/Concord, which is capable of both mapping named entities to authority files and merging or updating database records. We supply some technical details of these fully implemented and deployed systems, as well as describing plans to more tightly integrate them."

Peter Jackson - Chief Scientist, Thomson Reuters Professional

Peter Jackson is chief scientist at the Professional Division of Thomson Reuters with responsibility for corporate research and development. He is a seasoned technology executive with a background that includes 29 years' experience in R&D, the last 13 of which have been spent in the online information business. He has a proven track record of working with new product development and strategic marketing functions to deliver state-of-the-art applications that generate new revenues. His specialty lies in the design and delivery of innovative platforms and novel solution components for document retrieval, text categorization and information extraction technologies. Peter Jackson studied Human Psychology at the University of Aston in Birmingham and completed his PhD in Artificial Intelligence at the University of Leeds (UK).

Barak Pridor - CEO of ClearForest, a Thomson Reuters Markets company

Mr. Pridor brings extensive management experience in technology companies. At ClearForest he has overall responsibility for the company's strategic business direction, corporate finances and executive staffing. Prior to joining ClearForest in January 2000, Mr. Pridor spent 9 years with Tecnomatix Technologies in Israel (NASDAQ: TCNO), where he held numerous management positions in technology, business development and marketing, most recently serving as General Manager. Mr. Pridor has a B.Sc. in Mathematics and Computer Science from Tel-Aviv University in Israel and an MBA from the INSEAD International Business School in France.

 
Internet of Services

Orestis Terzidis, SAP Research Centre CEC

"We are moving towards a services economy where more and more value in an economy is created through services. The Internet of Services addresses the challenge of transforming services into tradable goods that are offered, sold, executed and consumed via the internet. More specialized services will be offered as organizations need to focus on core competences to be able to compete on the global market. Semantic technologies are key enablers of the Internet of Services, allowing for the aggregation, composition and coordination of such specialized services, which are potentially provided by different service providers, into value-added services. The German-funded TEXO project runs under the umbrella of the THESEUS research programme and addresses the challenges for an Internet of Services. The interdisciplinary TEXO consortium is coordinated by SAP and includes a number of industrial and academic partners with technical, economic and legal competencies."

Orestis Terzidis - Director, SAP Research Centre CEC

Orestis Terzidis graduated in theoretical physics. From 1998 he worked as an applied developer at SAP Labs Sophia Antipolis. From 2000 to 2003 he worked as assistant to Henning Kagermann, a member of the SAP AG executive board. Since January 2004 Terzidis has been director of CEC Karlsruhe, a leading centre of SAP Research. Within SAP he plays a leading role in projects on strategic development.

 
Getting at the Semantics of Texts

Hans Uszkoreit, German Research Center for Artificial Intelligence (DFKI) and DFKI Language Technology Lab

"As semantic technologies keep evolving and maturing, there is growing concern about the gigantic wealth of knowledge encoded in so-called unstructured data. Actually, the bulk of human knowledge on the web (and in books) is represented in texts. Not even the most optimistic proponents of semantic representation standards expect that this information will be rewritten or extensively complemented by semantic meta-data through intellectual labour. On the other hand, there is a discipline of science and technology called computational linguistics that has been concerned for several decades with the automatic analysis of human language. One of the original goals of this field was the automatic understanding of texts by translating them into a knowledge representation language that machines could use for reasoning. However, after sobering experience with the complexity of this task, most applied computational linguists turned to easier challenges. There is now a wide variety of human language technologies, many of which have enabled new types of products. Among these applications are text classification, email response systems, text-to-speech software, grammar checking and statistical machine translation.

In this presentation, however, the state of the art and recent achievements in two strands of language technology will be explained and illustrated by examples. One of them is the automatic extraction of semantic relations, or more precisely of relation instances, from large volumes of texts. Such relation instances could be events, properties of objects, or opinions on products. Using results from our own research, I will demonstrate how machine learning techniques were combined with existing advanced language analysis methods to improve such an analysis beyond the best results achievable by either of these approaches alone. I will also show how semantic domain models can be utilized to improve the performance of relation extraction.

The second strand of research to be presented is the deep syntactic and semantic analysis of human language. While most computational linguists had turned away from this fundamental challenge in favour of lower-hanging fruit, a few groups continued the quest for text understanding. Because of the size of the problem and the desire to develop techniques that would work for more than one language, several of them teamed up in international collaborations. I will briefly describe the two largest international collaborations in this area, the DELPH-IN initiative dedicated to deep language processing with HPSG and the PARGRAM initiative pursuing the same goal with LFG. HPSG and LFG are two advanced formal models of linguistic description developed in the seventies and eighties of the last century. The PARGRAM initiative was led by PARC, and its results are among the central assets of the search technology company Powerset, which was recently acquired by Microsoft. The results of the DELPH-IN initiative are collected in a growing open-source repository of research resources. I will explain the significance of recent advances by these two consortia and related research activities.

In the conclusion of the talk I will argue that a combination of the machine-learning approach to relation extraction with the advances of the deep linguistic processing research will open the way to an exploitation of large volumes of unstructured textual data by semantic technologies."

Hans Uszkoreit - Scientific Director at the German Research Center for Artificial Intelligence (DFKI) and Head of DFKI Language Technology Lab

Hans Uszkoreit is Professor of Computational Linguistics at Saarland University. He also serves as Scientific Director at the German Research Center for Artificial Intelligence (DFKI), where he heads the DFKI Language Technology Lab. Uszkoreit studied Linguistics and Computer Science at the Technical University of Berlin and the University of Texas at Austin. While he was in Austin, he also worked as a research associate in a large machine translation project at the Linguistics Research Center. After he received his Ph.D. in linguistics from the University of Texas, he worked as a computer scientist at the Artificial Intelligence Center and was affiliated with the Center for the Study of Language and Information at Stanford University.

 
Improving Search with Semantic Technologies: Current Research Directions

Hugo Zaragoza, Yahoo! Research Barcelona

Search engines play a major role in the success and growth of the WWW. In doing so they in turn help shape the web: they create new business models, modify content creation and consumption practices, support new forms of user interaction, etc. Semantic Web technologies have the potential to greatly improve search; if they succeed, this will in turn speed up the growth and impact of the semantic web initiatives. Yahoo! has already announced important steps towards integrating semantic technologies to improve and open up its search engine to content publishers, consumers and advertisers. In my talk I will briefly discuss some of these initiatives.

However, the history of Information Retrieval has taught us that fundamentally improving search through the use of semantics is a hard scientific problem. So far, semantic technologies have been successful at improving search only in closed-domain areas, with controlled vocabularies and small ontologies. It remains unclear how we may transfer these technologies to more open domains (or to the WWW at large). Even harder is the challenge of improving search through the use of semantic information automatically extracted from text. At Yahoo! Research, experts in computational linguistics, semantic web and information retrieval work together to better understand this problem and go beyond the current state of the art. During my talk, I will review several of our research projects in these areas, drawing on examples in computational advertising, entity ranking, question answering and query suggestion.

Hugo Zaragoza - Yahoo! Research Barcelona

Hugo Zaragoza is a researcher working on Information Retrieval at Yahoo! Research Barcelona. From 2001 to 2006, Hugo worked at Microsoft Research (Cambridge, UK) with Stephen Robertson, mostly on probabilistic ranking methods for corporate and web search, but also on document classification, expert finding, relevance feedback, and dialogue generation for games. He also spent a considerable amount of time collaborating with Microsoft product groups such as MSN-Search and SharePoint Portal Server. Prior to Microsoft Research, Hugo taught computer science and completed a Ph.D. at the LIP6 (U. Paris 6) on the application of dynamic probabilistic models to a wide range of Information Access problems.

 


Copyright ©2008 STI International, All rights reserved.