Federated Search Symposium wrap-up

Warning: long post!

I’ve just spent the last day and a half at a federated search symposium sponsored by The Alberta Library. I went in with some preconceived ideas about the state of federated search, and while they weren’t totally dispelled, I feel a little more confident about the future of this idea. But not the present.

Wikipedia has a pretty good definition of federated search.

Yesterday Roy Tennant talked to us for more than an hour about where we are and where he thinks we want to be with federated search. His slides should be available in the next little while, and I’ll update with a link to them when they are. I believe his talk, those of the panelists, and possibly that of Cathy Gordon from Google were all recorded and should also be available soon.

Roy and his team have spent over a year tweaking the version of MetaLib that they purchased for the California Digital Library. They still haven’t released it to their public. Roy has a lot of techies working on this project. Though I don’t know exactly how many, I’ll bet it’s more than your library can afford to devote to a project for that long! That tells you something about the state of federated search right there. Based on his experience, Roy offers the following list of questions we need to ask our vendors when considering such a tool:

  • Exactly how difficult is it to customize your interface?  Show me.
  • Will we need to redo our customizations with system upgrades?
  • Do you have an API?  If so, please show me the documentation.
  • What resources are available for metasearching?
  • And through what types of connections?  For each, do you bring back actual records, or only a hit count?

As an aside, one of the nice little nuggets from Roy’s talk is the fact that all the assessment activity that the California Digital Library carries out is available online.

Roy also had this list for final advice if you’re considering a federated search tool:

  • Review your user needs
  • Determine your goals
  • Survey your options and decide which pain(s) you wish to endure (important to note there will be pain somewhere)
  • Be prepared to spend more money and time than you plan, for less result than you hope for.  Do not over-expect.  (that one’s a little depressing!)

After Roy’s talk came a panel discussion with four librarians who have implemented a federated search tool. The products addressed were Central Search (Serials Solutions), AGent Portal (Auto-Graphics), MetaLib (ExLibris) and WebFeat (Consolidated Searching). The panel discussion piqued my interest for the rest of the symposium, as every single panelist mentioned the specific pains they had in implementing their product. I’m sure punches were pulled, but some were thrown as well, and I was interested to hear how the vendors in attendance would respond the next day when we got a chance to see their demos.

The lunchtime talk was delivered by Cathy Gordon, Director of Business Development for Google Scholar.  Cathy has an MLS and extensive experience working for the likes of Dialog and Lexis/Nexis.  I think.  I didn’t write that down, sorry.  Her talk was about how Google considers itself a switchboard connecting content consumers and content owners.  One of the Google mantras stands in pretty stark contrast to Roy’s experience spending time tweaking MetaLib.  Google says launch fast, refine later. 

She then told us the story of how Google Scholar came to be, and how they’re continuing to work on it. In response to one of the biggest complaints from librarians about Schoogle, she says, "It’s too much work to show a list of all our sources – it’s always changing." Still sounds like a cop-out to me! (I wonder where that phrase comes from…) We do know that Elsevier is not one of the participating publishers. I spoke with Cathy afterwards and she says they’re still negotiating with Elsevier and many other publishers. My colleague Diana asked Cathy if they were working on a way to limit results to either books or journals, and it seemed that they hadn’t been thinking of such a limit, as it’s only recently that so many book citations have been showing up in Schoogle. But Cathy took note of the good idea, so watch for that feature soon 🙂

The rest of the day was devoted to brainstorming group work / claims analysis.

Day two was spent moving between vendor presentations. There were five vendors present, but only time to hear three presentations, so I ended up hearing about AGent Portal, SingleSearch, and Central Search. Of the three I was personally most disappointed with SingleSearch – it’s a SIRSI product and we’re a SIRSI user here at the U of C. It just didn’t seem to do much. Couldn’t tell from the initial results screen whether you could get full text on the next click. Couldn’t choose, from within SingleSearch, to search the native interface of any database. It worked, but so did all the other products. It just didn’t have anything extra, IMHO. AGent was OK, but obviously not marketed to academic libraries, as aside from consortia they have not a single academic library customer. Must be some reason.

Of the three I saw, I was most impressed with the Serials Solutions presentation on Central Search. This was at least 40% due to the excellent presentation style of their product manager, whose name now completely escapes me. It was really refreshing to hear him say “I’m working on this”, not “we’re working on this”, because he’s the one who’s responsible for the product. But I also got the feeling that Central Search was by far the most innovative and nimble of the three products I saw. One of the neat things to look forward to later this year is the integration of Vivisimo clustering into the results. That makes a lot of sense; every one of the vendors mentioned how difficult relevancy-ranking is, so this at least offers a useful way to help slice and dice lots of results. He was also an advocate of popping a simple Google-style search box in multiple places all over a website instead of forcing people to come to The Portal Page. Why not include a search box on your Education pathfinder page that defaults to a search of your 5 best education databases? And another box on your Psychology subject page, or why not one on a non-library page in your School of Business? Again, just a really sound idea that nobody else mentioned. Sure, they’re not the be-all and end-all, but keep an eye on this product; I think you’ll be hearing more about it.
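
To make the distributed search box idea a bit more concrete, here is a rough Python sketch of what such a box might do behind the scenes: look up the default database set for the page it sits on, send the query to each database at the same time, then merge the results and drop duplicates. Everything in it is an assumption made for illustration. The database contents, the page names, and the functions `search_one` and `federated_search` are all invented, and none of this reflects how Central Search or any other vendor's product actually works.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "databases" standing in for real targets.
# A real connector would talk to a vendor API or a Z39.50 server instead.
FAKE_DATABASES = {
    "ERIC": ["Teaching Reading in Rural Schools", "Assessment in Higher Education"],
    "Education Abstracts": ["Assessment in higher education", "Classroom Management Basics"],
    "PsycINFO": ["Memory and Learning", "Classroom Management Basics"],
}

# Each page's search box gets its own default set of databases (names invented).
DEFAULT_SETS = {
    "education-pathfinder": ["ERIC", "Education Abstracts"],
    "psychology-subject-page": ["PsycINFO"],
}


def search_one(db_name, query):
    """Return (db_name, title) pairs whose title contains the query text."""
    q = query.lower()
    return [(db_name, title) for title in FAKE_DATABASES[db_name] if q in title.lower()]


def federated_search(page, query):
    """Search the page's default databases concurrently, then merge and de-duplicate."""
    db_names = DEFAULT_SETS[page]
    with ThreadPoolExecutor(max_workers=len(db_names)) as pool:
        # pool.map returns results in the same order as db_names.
        per_db = list(pool.map(lambda name: search_one(name, query), db_names))

    merged, seen = [], set()
    for hits in per_db:
        for db_name, title in hits:
            key = title.lower().strip()  # crude de-dup key: case-insensitive title
            if key not in seen:
                seen.add(key)
                merged.append({"title": title, "source": db_name})
    return merged


if __name__ == "__main__":
    for record in federated_search("education-pathfinder", "assessment"):
        print(record["source"], "-", record["title"])
```

The de-duplication here is deliberately naive (a lowercased title match), and there is no ranking at all, which is exactly the part the vendors said was hard.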

OK, going to wrap this one up with a list of links to the vendors represented.  As mentioned, I’ll be sure to update this post if/when slideshows and audio recordings are made available.


Comments

22 Responses to “Federated Search Symposium wrap-up”

  1. So if I may ask — is the MARC record still alive?

  2. Randal, I don’t think MARC records were discussed at all, though all of the products do operate with Z39.50, mostly to enable access to catalogue records. I’d have to say it’s still alive 🙂

  3. David P. Moore

    Hi Paul,
    We at the University of Alabama in Huntsville must be Auto-Graphics’ first academic customer, then. We purchased the system in 2005 and have been very happy with it. The company has been very responsive to our needs.
    David P. Moore
    mooredp@uah.edu

  4. Good to know, David – their sales rep will have to update his knowledge 🙂

  5. Kerry Anderson

    I also was in attendance at the Symposium last week. I found your comment on AGent very interesting because I had similar concerns about some of the other products – but from a public library perspective (I work at a regional library system serving public libraries). It seems to me that a lot of effort is being put into creating a “one-size-fits-all” federated searching solution for the library market. But, the needs of public and academic (and school and corporate) libraries don’t appear to mesh well when it comes to meta-search, even with all the customization options.

  6. Paul,
    Thank you for covering this symposium. Unfortunately, due to a scheduling conflict, there were no direct representatives from WebFeat in attendance to provide a true perspective on the present state of federated searching. Many of your readers interested in federated searching should know that:
    WebFeat is widely recognized as a leader in federated searching. WebFeat is used by over 3,000 public, academic, government and special libraries, including over half of the top 20 U.S. public libraries, 8 statewide systems, and 1 out of 10 ARL institutions.
    To our knowledge, WebFeat is the only federated search engine that is 100% compatible with all searchable databases. All WebFeat systems are customized to client specifications. Plus, WebFeat hosts all of its search systems and provides a fast, no-hassle implementation that requires no work from library staff to implement or maintain. For these reasons, WebFeat has successfully implemented many fully customized search systems (with 100% database compatibility) for major academic and public libraries in a matter of months, not years.
    Sadly, Mr. Tennant’s experiences with federated searching (now repeated and memorialized here for all to read) are not representative of what WebFeat clients experience. In fact, just 12 days prior to this symposium, WebFeat held an academic library panel at ALA featuring librarians from Princeton University, Brigham Young University, GALILEO (University System of Georgia), the University of Pittsburgh, University of Illinois Chicago, and Westminster College, all showcasing their state-of-the-art federated search systems. Two of these institutions switched to WebFeat in the past year after having poor experiences with some of the products that were featured at the symposium.
    WebFeat encourages all libraries considering federated searching to carefully research all federated search vendors.
    Eddie Neuwirth
    Marketing Communications Manager
    WebFeat

  7. Eddie, thanks for the comments; it’s really too bad you guys weren’t represented at the symposium, as your claims do sound impressive. Rest assured that your comments are now memorialized along with Roy’s 🙂 Folks, WebFeat’s website is http://www.webfeat.org/, and I’d like to stress that my “report” of the Symposium is of course just one person’s impression of what went on. Roy was invited as a Federated Search expert though, and I don’t think he really said anything disparaging about any one product; he was reporting what he and his team have learned over the past few years with their trials and tribulations, and I’ve got to say the suggested questions still seem quite sound to me.

  8. A few things:
    1) My slides are now available at http://www.cdlib.org/inside/news/presentations/rtennant/2006tal/
    2) Whether any particular metasearch product is “state-of-the-art” or not depends on what you wish to do with it. So far I have yet to find any product (including WebFeat) that does what we at CDL want to do with it, but this does not surprise me. We want to do things with a metasearch product that few institutions yet wish to do. And much of what I said (subsequently “memorialized” here and elsewhere) is but common sense advice regarding how to deal with vendor claims for this type of product, no matter how you wish to deploy it.
    3) Yes, Randal, the MARC record is still alive, but not due to anything I’ve done. 😉

  9. Stephen Cramond

    Don’t know if it was just modesty, but Eddie Neuwirth does not mention that Serials Solutions’ Central Search, favourably reviewed by Paul, also uses WebFeat search technology.

  10. I have reviewed Roy Tennant’s slide deck and need to comment on two areas: one involves a significant inaccuracy in the presentation; the other is an observation regarding several bullet points that carry a common theme.
    The inaccuracy pertains to the screen prints of WebFeat systems used in the presentation. Screen prints from two of WebFeat’s major public library clients are presented, Boston Public Library and Los Angeles Public Library. The Boston Public Library prints appear to be over a year old and, as such, do not include the features offered in their current generation WebFeat 3 system. Boston Public demonstrated their production WebFeat 3 system at last year’s ALA MidWinter 2005 conference, held in Boston. WebFeat 3 introduced a number of new capabilities to the field of federated searching, including:
    • Dynamic merge/sort/de-duping for date, title and author, as well as relevancy
    • Structured/parsed citations for all databases (not just the relative few that provide structured cites)
    • Seamless citation export to bibliographic citation management packages as well as any OpenURL link resolver
    • Granular/personalized profiling of database menus/subject categories and user interfaces
    • Alerts
    • A proprietary read-write proxy technology capable of displaying database full text records in their native UIs with all native functionality preserved intact
    • The ability to track native database usage as well as federated
    Since the launch of WebFeat 3 in 2004, many of these features have not been duplicated by other federated search engines. These features were built on other as-yet unduplicated WebFeat capabilities, most notably its ability to include virtually any searchable database as part of the federated search, regardless of whether the content provider supports standards or offers an API. Of the some 6,200 databases currently supported by WebFeat, the vast majority do not offer standards or API-based means of searching their content. Moreover, those that do usually sacrifice their native web UI functionality as a price for standards or API support.
    I have great respect for Roy as a pioneer in federated searching. I am an enthusiastic proponent of “Tennant’s Law”, noted in Roy’s presentation. So it was with disappointment that I found that Roy had chosen not to present current generation WebFeat systems in his presentation. We would have been more than happy to provide screen prints from the many hundreds of WebFeat 3 systems in operation, including those from some two dozen ARL sites. I think that had he been exposed to some of these systems, it would not only have made for a more objective and informative presentation, but would have also raised his awareness of WebFeat technology and capability. Roy writes that CDL wants to “do things with a metasearch product that few institutions yet wish to do.” Again, with due respect for CDL and its groundbreaking work, I suggest that WebFeat ARL libraries like Princeton University, University of Georgia/GALILEO, and University of Pittsburgh are far from ordinary and the advanced federated search applications developed for these and many other clients are hardly pedestrian. These systems are carefully customized applications developed to exacting specifications. They each address specific needs. And they are successes, as evidenced by this comment from one of our ARL clients:
    “We are thrilled with the WebFeat solution…we didn’t think we would ever find one tool that would be compatible with all of our databases, be able to handle simultaneous searching of so many resources and return quality results with such a fast response time. WebFeat has far exceeded our expectations of what a search system could be.”
    I think these systems provide considerable evidence that WebFeat is fully capable of standing up to Roy’s sensible definition of “state-of-the-art.”
    I have assembled screen shots from several of these WebFeat applications, which can be found at: http://webfeat.org/features/WebFeat_Federated_Searching_State_of_the_Art2.mht
    My other comments regarding Roy’s presentation are observations on several bullets that appear to carry a similar theme:
    • Exactly how difficult is it to customize your interface? (show me)
    • How much work are we prepared to do to customize the interface?
    • Survey your options and decide which pain(s) you wish to endure
    • Be prepared to spend more money and time than you plan, for less result than you hope for
    • It’s hard, but it can be worth it!
    I see keywords like “difficult,” “work,” and “pain.” I’m confused as to why a library would want to commit to any product that demanded this kind of sacrifice. Maybe I’m old-fashioned, but I believe that when you fork over money to a vendor, the vendor should actually do the work for you. I think this is why our clients typically use words like “easy,” “fast,” and “convenient” to describe the implementation process with their WebFeat systems. A case study from one of our clients who transitioned from a painful experience with their former federated search provider to a painless one with WebFeat can be found at http://www.webfeat.org/features/westminster_case_study.pdf.
    Do WebFeat client libraries receive lesser systems because they choose not to sacrifice? No. Their systems offer the greatest functionality and database compatibility available. Most importantly, these systems are deliverable – they are not in perpetual beta. I believe that if Google has taught us anything, it’s that there is a real world out there with real demands. Users have real needs – right now. The timeline for a project intended to help a student can’t exceed their academic lifespan. This does not necessitate compromise, though I think it does demand professional solutions and management.
    I agree with Roy on the notion of dealing with competitive claims, though, given the millions of dollars in library money committed to non-working federated search offerings to date, I would not recommend taking any vendor’s word at face value (including mine). What’s the best advice? Try it before you buy it. While the industry seems to be drawn to the wilder vendor claims, it’s the bedrock functionality that most engines seem to trip on: can the metasearch engine actually search the library’s databases? Pretty basic. If the engine can’t search the databases, then what’s the point of evaluating the hot bells and whistles? For the past 5 years, WebFeat has offered a pilot program to prospective clients evaluating multiple vendors’ products. We actually build systems that demonstrate compatibility with the library’s own database subscriptions and authentication prior to purchase. It’s a tremendous amount of work for WebFeat staff, but it’s the right thing for any vendor to do. And I think a comparative test drive is just plain common sense for any library committed to making a fully informed decision regarding where and how to spend their money.

  11. ContentAt last!

    OK, I’ve finally taken the time to write. But what to write about? Well, I’d like to comment (rant?) about a common idea/theme from The Alberta Library’s Federated Search Symposium last week. There have been postings about the symp…

  12. I’d like to expand on a few points on this thread.
    First, thank you very much, Paul, for the kind words about Central Search. It is true that Serials Solutions, and I personally, are deeply committed to Central Search and federated search in general. I believe that we have a great foundation in place which, coupled with new approaches in development, will take federated searching to the next level. Vivisimo clustering and distributed search boxes are just a couple of ideas that should bring increased use and usability to this space.
    I’d like to second much of what Todd Miller has talked about in his post. A key question that needs to be asked is: “Is this application hosted by the vendor, or locally by my IT staff?” If it’s the latter, the inevitable result will be “work” which will turn into the “memorialized” “pain” and “frustration.” If the application is hosted by the vendor, almost the entirety of that work (and associated pain) is handled by the vendor. This is part of what you’re paying for. And I agree with Todd, there’s no correlation between pain endured and quality of application. In fact, it’s probably the reverse.
    I’d also like to clarify the question about Serials Solutions licensing WebFeat software. Serials Solutions licenses from WebFeat only the “connectors” to databases – in addition to building many of our own. The rest of the application, the part patrons and librarians see, has been built entirely from the ground up by Serials Solutions. What does this mean in practice? Libraries should expect similar levels of connections from both WebFeat and Serials Solutions. The features, functionalities, user interfaces, etc., are each developed independently, and this is where you can expect to see differences.
    And finally, I’d like to mention what a great time I had at the symposium and express my thanks to all who participated and attended my sessions.

  13. JR Jenkins! That was the Serials Solutions product manager whose name I forgot. Thanks for the comments, JR, and thanks also to Todd and Eddie for providing additional information on their products. I had hoped to include the data sheets that each of the vendors had supplied to the symposium participants, but never got around to tracking down each of the reps to ask for permission. Second best, it appears there’s now some solid info in this post and its comments for folks who are considering federated search.

  14. info from Federated Search Symposium

    There was a Federated Search Symposium at the University of Calgary sponsored by The Alberta Library. Info via Library Boy. Distant Librarian has a good Federated Search Symposium wrap-up. Roy Tennant of the California Digital Library did a presentatio…

  16. Thank you very much for covering the symposium and for providing a wonderful opportunity to discuss federated search. Sergio really enjoyed the opportunity to share with the group a little more about our federated search capabilities. I’d like to take this opportunity to address some of the posts on this thread and clarify a few misconceptions in the market.
    Auto-Graphics does in fact have customers in the academic market for our federated search product. Thank you very much, David, for pointing that out; we greatly enjoy our relationship with University of Alabama in Huntsville and with our other academic clients and we are actively pursuing a number of other academic institutions as a focus for federated search this year. To that end, we have developed some enhancements to our federated search product that we will be unveiling shortly that are designed specifically with the academic institution in mind. Our newest version of our federated search product with increased functionality is being re-branded AGent Search, and will be available for a demo, along with our other products, at PLA in Boston in a few weeks.
    While I am very excited about the new functionality built into our AGent Search product, many people don’t realize that our current federated search solution has been implemented in more libraries than any other provider on the market, well in excess of 3,000 libraries, including public, academic, and special libraries. Our federated search product is part of our Web-based AGent platform and can also be fully integrated into our other two main products: AGent VERSO, our ILS solution; and AGent Resource Sharing, our ILL product. Rather than taking a “one size fits all” approach, our federated search product is highly scalable and customizable, serving the diverse needs of small academic institutions like Chaffey Community College, library consortia like the Tennessee Board of Regents Libraries, individual four-year academics like the University of Alabama in Huntsville among others, large public libraries like Toronto Public Library, as well as complex multi-tier statewide implementations whereby multiple levels of authentication such as SIP2 and NCIP, referring URLs, cookies, and user names and passwords are required.
    We would like to take this opportunity to update WebFeat’s knowledge base – our federated search product is also 100% compatible with all searchable databases. While we do offer a robust hosted solution for our clients, we’ve learned that this set-up doesn’t necessarily fit the need of our entire client base. It should be noted that Auto-Graphics was a pioneer of hosted solutions and has provided library-specific hosted solutions for over twenty-five years. Additionally, we were the first library automation vendor to provide Internet-based hosted solutions in the early 1990s. In addition to our hosted solution we also offer a unique hosted license solution, allowing libraries to make an initial capital investment while avoiding the IT and administrative support required to run the system locally, and a licensed model geared towards larger libraries that can afford an initial capital investment, have the IT and administrative staff to run the system themselves, and want that control.
    Below is a brief overview of our AGent Search product:
    • Fully compatible with ALL available databases
    • Displays full text in true native interfaces (no screen scraping)
    • Single login via multiple authentication methodologies (Proxy server, IP, Referring URL, Cookies, User ID and password, Custom authentication coding)
    • Parse citations for all databases
    • Intelligent results management featuring de-duplication, sorting, and configurable displays
    • Complete statistics package (In-library usage, Remote usage, Staff and patron tracking, Honed to multiple search types: basic/advanced/qualified, Utilities to assist with collection development and electronic resource usage, Extensive Web traffic stats, in both tabular and graphical format)
    • No concurrent user limits
    • No hidden throttles
    • Flexible delivery options: hosted, hosted license, and licensed solutions
    • Robust Web services integration to existing applications
    More details on the enhanced features of AGent Search will be available in the coming weeks on our website, http://www.auto-graphics.com. Thank you for the opportunity to explain a little more about Auto-Graphics’ federated search product and clarify a few threads throughout the post.

  17. Thank you Paul, for the extra information.

  18. Susan Beatty

    From The Alberta Library
    I am pleased to announce that presentations, vendor links and a compilation
    of the group work have been posted on The Alberta Library’s website.
    Here is the direct link:
    http://www.thealbertalibrary.ab.ca/content/fssindex.html
    The Alberta Library sincerely appreciates the participation of our
    speakers, vendors and delegates. Recommendations are being reviewed by TAL
    and your comments will be considered.

  19. The URL for the presentations must have changed. They are now at:
    http://www.thealbertalibrary.ab.ca/viewPosting.asp?postingID=182
    Thank you for the extensive notes. And thanks to everyone who left comments. It is all very helpful.

  20. Ahh site redesigns. Thanks very much for the updated URL, Jill.

  21. Paul (and others),
    Did anyone at the Symposium talk about using federated search software with digitized materials? I’m working with a group that is thinking of using a federated search solution to search across digitized collections, including collections housed in CONTENTdm. (BTW I can imagine that the software would also search other databases and catalogues, but the real need is with digitized materials.) Assuming that each collection has metadata, then I don’t envision a problem searching (of course, I could be wrong). I would wonder, though, about displaying thumbnails. At any rate, I’m interested in hearing what experiences others have had.
    Thanks!

  22. Jill, I honestly don’t recall anyone speaking on that topic, but if your federated search product included OAIster, might that not do the trick? I can’t find anywhere that they explain whether or not CONTENTdm is included. I do see the ability to limit a search to type=picture, but I don’t see any thumbnails in the results. Bet it would be pretty easy to build a Firefox extension to display them though. Anyone? 🙂