One on One with Sid Probstein of Attivio
Sid Probstein is Chief Technology Officer at Attivio, responsible for technology strategy and innovation. He has over 15 years of experience leading engineering projects, including a stint as vice president of technology at Fast Search & Transfer, where he developed next-generation search, text mining and multimedia capabilities, and as VP of Engineering at Northern Light Technology, where he produced the very first enterprise version of the award-winning search engine. We asked Probstein about what he's doing at Attivio and what the search market looks like for the coming year.
FCM: Your company takes an unusual approach to search. Can you describe it for us?
SP: Attivio's Active Intelligence Engine (AIE) is powering today's critical business solutions with a completely new approach to unifying information access. AIE supports querying with the precision of SQL and the fuzziness of full-text search. Our patent-applied-for query-side JOIN() operator allows relational data to be manipulated as a database would, but in combination with full-text operations like fuzzy search, fielded search, Boolean search, etc. Finally, our ability to save any query as an alert, and thereafter have new data trigger a workflow that may notify a user or update another system, brings a sorely needed “active” component to information access.
By extending enterprise search capabilities across documents, data and media, AIE brings deeper insight to business applications and websites. AIE’s flexible design enables business and technology leaders to speed innovation through rapid prototyping and deployment, which dramatically lowers risk--an important consideration in today’s economy. Systems integrators, independent software vendors, corporations and government agencies partner with Attivio to automate information-driven processes and gain competitive advantage.
FCM: When customers want to talk to you, what's the most common business problem they want you to solve?
SP: The top problem they want to solve is improving their ability to access their information. Our customers, whether they are enterprises or independent software vendors, want the information they need to be easier to find, absolutely up to date, in context, and retrievable with one query--regardless of how or where or in what format that information is stored. Frequently customers are frustrated with the limits of their current solutions, and we do find that some customers are skeptical about the promise of unified information access. They need to see proof that they really can use one simple, free-form query to get aggregated information, such as all the news and database statistics a reporter might need to write a story about the impact of unemployment and housing prices on a city’s tax base. You could include a chart that lets a user change any of the factors to see how a single change ripples through the financial ecosystem.
FCM: How has social networking affected your customers and your products?
SP: Mostly what we’ve seen is that social networking opens up new marketing possibilities. Social networking allows rapidly spreading viral marketing and referrals. For example, a well-regarded blogger recently wrote about one of our white papers, which gave us some exposure we wouldn’t have had otherwise for that paper. We’ve also been able to use our own blogs to contribute to the ongoing market education about the benefits of unified information access and to publish focused information about some of our less-obvious capabilities, such as rapid prototyping and development.
FCM: What's your take on how search will fare in the coming year in light of the economic situation?
SP: We’re very optimistic, as are the major analyst firms. Our optimism comes from both our continuing traction in the market and our understanding that information access remains critical in economic downturns. In fact, companies that can improve the flow of information in their business processes are more competitive and better able to weather workforce disruptions. With one query to aggregate information across data and content stores, they are more efficient and responsive. With workflow and alerts, they can automate processes. For example, information access can bring improvements in customer retention by providing a 360-degree view of a customer that includes details about that customer retrieved from databases, emails, logs and relevant news articles. That aggregation lets any employee deal effectively with any service issue or question from a customer.
FCM: What impact do you think semantic search will have on enterprise search?
SP: The semantic web offers much potential capability for enterprise search. However, the potential has mostly failed to materialize for a variety of reasons.
First, although there is a set of core semantic web technologies like RDF and OWL, they are so broad and flexible that figuring out how to use them can be very challenging. Well-defined standards for using these frameworks are needed before broader adoption will be possible. These standards need to include the schema or model that will be used to expose data, and guidelines (preferably rigid ones) about what can be placed in the model.
Second, it can be very challenging to understand and justify the value of implementing semantic web capabilities. In simplest form the semantic web is about embedding and publishing “facts” that can be used by machines. For example, that X is the author of document Y. It is useful to have a web standard around this type of information, but significant publishers already have well-defined metadata standards that accomplish the same thing. What’s the value, for them, to switch to RDF to accomplish the same thing? It’s likely low.
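As an illustration of the kind of “fact” described above, here is a minimal sketch of “X is the author of document Y” written as a single RDF statement (subject, predicate, object) and serialized as one N-Triples line. The document URI and the author name are hypothetical placeholders; the predicate is the Dublin Core “creator” element. The `Triple` type and `to_ntriples` helper are made up for this example and are not part of any product discussed in the interview.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject and predicate are URIs, obj is a plain literal."""
    subject: str
    predicate: str
    obj: str


# Dublin Core "creator" element, commonly used to state authorship.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


def to_ntriples(t: Triple) -> str:
    """Serialize a triple with a plain-literal object as one N-Triples line."""
    # Escape backslashes and double quotes inside the literal.
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'


# Hypothetical document URI and author name, for illustration only.
fact = Triple("http://example.org/documents/Y", DC_CREATOR, "X")
line = to_ntriples(fact)
print(line)
```

A publisher that already stores an author field in its own metadata schema holds the same information; the triple only changes how that information is exposed to other machines.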
Of course the value to an enterprise search engine might be a different story! Broad adoption of any standard around metadata would help all the vendors in the space. However, it is very hard to drive adoption of standards from the vendor side.
Finally, there is the general problem of trying to enable computers to understand relationships in text. For example, the “X is the author of Y” problem noted above, or “X has title Y and works at organization Z”. Solving these problems would facilitate the adoption of any metadata standard (semantic web or other) as it would eliminate much of the overhead required to hand-code the information. Unfortunately, the current state of the art in relationship extraction provides excellent precision but poor recall. In other words, it works well when it works, almost regardless of the approach, but doesn’t work on enough content to be useful because of the endless ways humans express ideas.
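To make “excellent precision but poor recall” concrete, the following sketch computes both measures for a relationship extractor. The numbers are invented purely for illustration (ten true authorship relations, of which the extractor finds two, both correct) and do not describe any real system.

```python
from typing import Set, Tuple

Relation = Tuple[str, str, str]  # (relation type, entity, entity)


def precision_recall(extracted: Set[Relation], gold: Set[Relation]) -> Tuple[float, float]:
    """Return (precision, recall) of extracted relations against a gold set."""
    true_positives = len(extracted & gold)
    # Precision: what share of the extracted relations are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: what share of the true relations were found at all.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented data: ten relations that are actually present in the text.
gold = {("author_of", f"person{i}", f"doc{i}") for i in range(10)}

# The extractor only recognizes two of them, but gets both right.
extracted = {
    ("author_of", "person0", "doc0"),
    ("author_of", "person1", "doc1"),
}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=1.00 recall=0.20
```

Everything the extractor returns is right, yet eight of the ten relations are missed, which is the situation described above.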
Attivio’s general approach to making use of the semantic web is to be able to consume information in the RDF format, and also to publish search results (including alerts) in RDF format. We combine support for these interfaces with a range of linguistic and statistical approaches around entity identification, classification, use of synonyms and acronyms, etc. These things together can provide many of the benefits of the semantic web without the overhead involved in representing data in semantic web formats or the extreme challenge of automatically identifying the same useful relationships.