One on One with Daniel Tunkelang of Endeca
Daniel Tunkelang is the Chief Scientist and a co-founder of enterprise search vendor Endeca. He is an advocate of dialog-oriented approaches to information retrieval, and has organized annual workshops on Human Computer Information Retrieval (HCIR), in collaboration with researchers at MIT, IBM and Microsoft. Tunkelang also publishes The Noisy Channel. We asked him about his company, the state of enterprise search and his thoughts on some enterprise search trends:
FCM: What is the differentiator for Endeca search?
DT: In web search, we expect to type a query and get an excellent response in the top handful of results. In enterprise search, that is rarely our experience. There are a variety of reasons for this breakdown, but the consequence is that we must get away from the paradigm of the search engine as a mind reader, and instead promote bi-directional communication so that users can effectively articulate their information needs and the system can satisfy them. The approach is known as human computer information retrieval (HCIR).
Endeca combines a set-oriented retrieval approach with user interaction to create an interactive dialogue, offering next steps or refinements to help guide users to the results most relevant for their unique needs. An Endeca-powered application responds to a query with not just relevant results, but with an overview of the user's current context and an organized set of options for incremental exploration.
We often use a concierge analogy to help illustrate the difference between Endeca's solution and conventional search. What happens when you ask a hotel concierge for a restaurant recommendation? Rather than suggesting one place or handing you a list of all the restaurants in the area, the concierge asks you follow-up questions: "What kind of cuisine do you like? Do you want one in walking distance? Is this a special occasion? What kind of atmosphere are you looking for?" This process helps you better understand your options while helping the concierge better understand what you are looking for. As a result, the concierge can give you an answer that meets your unique needs and preferences.
This bi-directional communication between the user and the system addresses the inherent limitations of today's best-match approaches to enterprise search.
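To make the idea of set-oriented retrieval with refinements concrete, here is a minimal sketch in Python. It is not Endeca's implementation or API. The records, the field names (`department`, `type`) and the `search` function are all hypothetical, invented for illustration. The point is only that a query returns the whole matching set plus counts of the ways that set could be narrowed, rather than a ranked list alone.

```python
from collections import Counter

# Hypothetical toy corpus; titles, fields and values are invented for illustration.
RECORDS = [
    {"title": "Q2 sales report", "department": "Finance", "type": "report"},
    {"title": "Sales onboarding guide", "department": "HR", "type": "guide"},
    {"title": "Sales forecast model", "department": "Finance", "type": "spreadsheet"},
    {"title": "Travel policy", "department": "HR", "type": "policy"},
]


def search(records, text="", filters=None):
    """Return the full matching set plus counts of the refinements still available."""
    filters = filters or {}
    matches = [
        r for r in records
        if text.lower() in r["title"].lower()
        and all(r.get(field) == value for field, value in filters.items())
    ]
    # Summarize the matching set: for every field not yet filtered on,
    # count how many matching records carry each value.
    refinements = {}
    for r in matches:
        for field, value in r.items():
            if field == "title" or field in filters:
                continue
            refinements.setdefault(field, Counter())[value] += 1
    return matches, refinements


# First turn of the dialogue: a broad query and the options it opens up.
matches, refinements = search(RECORDS, text="sales")

# Second turn: the user picks one refinement and the system narrows the set.
narrowed, remaining = search(RECORDS, text="sales", filters={"department": "Finance"})
```

Each call plays the role of one exchange with the concierge: the user states a need, and the system answers with both results and a summary of the next questions worth asking.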
FCM: Many enterprise search users want a Google experience. Why do you think this is, and how can you battle that?
DT: On one hand, Google gets resounding reviews for web search. On the other hand, it gets, at best, mixed reviews in the enterprise--even within Google itself! How can the "Google experience" be good for the web and, yet, bad for the enterprise?
The answer is multi-faceted. In the enterprise, we lack the redundant and highly social structure of the web that is critical for PageRank and related approaches to succeed. We also have more sophisticated information needs. Specifically, we tend to ask the kinds of informational queries that web search serves poorly--the exception being when there is a Wikipedia page that addresses our particular need. Finally, web search benefits from the fact that the most popular web sites are portals or destinations, designed to help a user shop, research specialized information, communicate with other people, etc. When a web search takes a user to a page on such a site, the site takes on the responsibility for contextualizing the user's experience.
In contrast, enterprise content often consists of a heterogeneous collection of content that has a sparse link structure and whose organization is, at best, implicit in its physical and logical arrangement. Departments within an enterprise may build user-centered portals, but it's rare to see the sort of symbiosis that occurs between web search engines and the sites they index.
So a "Google experience" in the enterprise is a misleading aspiration, since even Google is unable to transfer its success on the web to a much more demanding environment. Instead, the enterprise calls for an HCIR approach, which is the foundation for Endeca's offering.
FCM: Carl Frappaolo of AIIM has said that what makes enterprise search so difficult is what he calls the "digital landfill" of information, data that is spread out across repositories in the enterprise. How does Endeca search get at the information that is locked away in a variety of repositories?
DT: Carl is right that enterprise users expect information to be consolidated and made available through a single interface. Endeca has always provided connectors to standard enterprise repositories, as well as an extensible framework to connect to custom repositories. More importantly, Endeca's flexible data model accommodates complex schemas without extensive modeling, allowing each record and document to maintain its own unique structure, similar to XML. This flexibility is essential for accommodating the heterogeneity of enterprise content without reducing it to a lowest common denominator of unstructured text.
Another consideration is that, while enterprises may seek out generic enterprise search solutions, what they often need are search applications that solve specific business problems. The flexibility of Endeca's APIs and tools makes it easy to build such applications on top of our information access platform.
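As a rough illustration of a data model in which each record keeps its own structure, the following Python sketch indexes records from different repositories without forcing them into a shared schema. It is an assumption-laden toy, not Endeca's data model or connector framework. The record contents, the source names and the `build_index` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical records from different repositories; no shared schema is imposed.
records = [
    {"id": "doc-1", "source": "wiki", "title": "VPN setup", "tags": ["network", "howto"]},
    {"id": "crm-7", "source": "crm", "customer": "Acme", "region": "EMEA"},
    {"id": "mail-3", "source": "email", "subject": "VPN outage", "sender": "ops"},
]


def build_index(records):
    """Map each (attribute, value) pair to the ids of the records that carry it."""
    index = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            if attribute == "id":
                continue
            # Multi-valued attributes (lists) are indexed one value at a time.
            values = value if isinstance(value, list) else [value]
            for v in values:
                index[(attribute, v)].add(record["id"])
    return index


index = build_index(records)

# The set of attributes is discovered from the data rather than declared up front.
attributes_seen = sorted({attribute for attribute, _ in index})
```

Because the index is keyed on whatever attributes each record happens to have, a CRM entry and an email can sit side by side without either being flattened into plain text.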
FCM: What are some of the areas, in your view, that need improvement in enterprise search?
DT: Many people have raised the prospect of social search in the enterprise--specifically, the idea that people will tag content within the enterprise and benefit from each other's tagging. The reality of social search, however, has not lived up to the vision.
In order for social search to succeed, enterprise workers need to supply their proprietary knowledge in a process that is not only as painless as possible, but also demonstrates the return on investment. We believe that our work at Endeca on bootstrapping knowledge bases can help bring about effective social search in the enterprise.
The other major area that comes to mind is federation. As much as an enterprise may value its internal content, much of the content that its workers need resides outside the enterprise. An effective enterprise search tool needs to facilitate users' access to all of these content sources while preserving the value and context of each.
FCM: What impact will semantic search have on enterprise search and what are you exploring in that area?
DT: Semantic search means different things to different people, but broadly falls into two categories: using linguistic and statistical approaches to derive meaning from unstructured text, and using semantic web approaches to represent meaning in content and query structure. Endeca embraces both of these aspects of semantic search.
From early on, we have developed an extensible framework for enriching content through linguistic and statistical information extraction. We have developed some groundbreaking tools ourselves, but have achieved even better results by combining other vendors' document analysis tools with our unique ability to improve their results through corpus analysis.
The growing prevalence of structured data (e.g., RDF) with well-formed ontologies (e.g., OWL) is very valuable to Endeca, since our flexible data model is ideal for incorporating heterogeneous, semi-structured content. We have done this in major applications for the financial industry, media/publishing, and the federal government.
It is also important to note that semantic search is not just about the data. In the popular conception of semantic search, the computer is wholly responsible for deriving meaning from the unstructured input. Endeca's philosophy, as per the HCIR vision, is that humans determine meaning, and that our job is to give them clues using all of the structure we can provide.
© 2010 FierceMarkets. All rights reserved.