Sunday, January 18, 2009

WebCenter Native API Development: Advanced Search Query explained (and applied to WebSite Search)

In his article "Dot Com Portals: Smart Searching", Jordan Rose already explained very well how to implement an efficient and accurate website-like search with WebCenter Interaction (enter a search term, and expect the results to contain either website pages or documents within these pages).

To summarize the challenge:

The very powerful WebCenter search component will index everything the user has access to (e.g. web content items, documents, crawled third-party websites) regardless of whether it actually appears on the website pages. For example, the search results would present web content items instead of the website pages where these items are displayed through a portlet.

To summarize the solution:

Using the WebCenter Interaction “web crawler” capability (a Google-like spider that follows the links on a page and indexes its content for future searches) coupled with experience definition features (to hide the parts of the page that we don’t want the crawler to index, like the top/left navigation, the banner, etc.), it is easy to implement a website-like search with accurate portal page results (refer to Jordan's blog post: Dot Com Portals: Smart Searching).

But one question the solution did not answer yet was: how can the crawler navigate from page to page if the navigation is hidden? What we did at first (we did not have time to do better) was to manually create an HTML file containing links to all the pages of the website, and direct the web crawler to that file instead of to the root URL of the public portal website.

This works OK, but requires a manual update of the file each time you create a new page... not super practical. I finally took the time to improve it, and created this "Page Listing" code that renders a list of the pages located within a specific folder. You simply create a request with "topfolderid", "includesubfolders", and "openerhost" (http:////PageLinkListing?topfolderid=123&includesubfolders=true&openerhost=yourdomainhost) and the .NET page renders all the portal page links that correspond to these values.
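
As a rough illustration of the request described above, here is a minimal sketch that assembles the query string from the three parameters. The helper name PageLinkListingUrl.Build and the baseUrl argument are hypothetical (they are not part of the portal API); the base address is whatever location the listing page is deployed at.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of the portal API.
public static class PageLinkListingUrl
{
    // baseUrl: address of the deployed PageLinkListing page (supplied by the caller).
    // topFolderId: ID of the folder whose pages should be listed.
    // includeSubfolders: whether pages in subfolders are listed too.
    // openerHost: host name passed through as the "openerhost" parameter.
    public static string Build(string baseUrl, int topFolderId, bool includeSubfolders, string openerHost)
    {
        if (string.IsNullOrEmpty(baseUrl)) throw new ArgumentException("baseUrl is required", "baseUrl");
        if (string.IsNullOrEmpty(openerHost)) throw new ArgumentException("openerHost is required", "openerHost");

        return baseUrl
            + "?topfolderid=" + topFolderId.ToString(CultureInfo.InvariantCulture)
            + "&includesubfolders=" + (includeSubfolders ? "true" : "false")
            // Escape the host so unexpected characters cannot break the query string.
            + "&openerhost=" + Uri.EscapeDataString(openerHost);
    }
}
```

The resulting address is the one you would give to the web crawler as its starting point, in place of the hand-maintained HTML file.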

In this article I'd like to use this example to focus on the native search API (refer to my previous article about the native API), because the .NET front end itself is pretty simple. The pieces involved are:

  • .NET front-end page
  • Native portal API
  • WebCenter search API to query the pages

As always, it all starts with the creation of the native session. Once you have it, you can create the search request object:

IPTSearchRequest req = m_ptSession.GetSearchRequest();

From there, the IPTSearchRequest object lets you set all sorts of settings that define the search you want to make. Simply call the SetSettings method, which takes a setting ID and a value (a string, an int, or an array of objects). The main problem is the lack of documentation (the native API is undocumented), but luckily the setting IDs are all available through the PT_SEARCH_SETTING class, and each name is relatively straightforward (not always, though). The example below sets the properties to return, disables best bets and spell check, and sets the maximum number of results to bring back:




    // Properties to return for each result
    int[] arPropIDs = { PT_INTRINSICS.PT_PROPERTY_OBJECTID, PT_INTRINSICS.PT_PROPERTY_OBJECTNAME, PT_INTRINSICS.PT_PROPERTY_OBJECTSUMMARY };
    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_RET_PROPS, arPropIDs);
    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_INCLUDE_USUAL_FIELDS, false);
    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_KWIC, false);
    // Do not execute best bets or spell check
    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_BESTBETS, false);
    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_SPELLCHECK, false);
    // Skip no results, and bring back at most 10000
    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_SKIPRESULTS, 0);
    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_MAXRESULTS, 10000);



You can also specify the admin folders (or KD folders) within which the search should be performed:




    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_ADMINFOLDERS, new int[] { adminfolderid });



and the object types the search should deal with (here we want to search only community pages but, much like the object type checkboxes in the advanced search interface, you could pick several object types to search for):




    req.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_OBJTYPES, new int[] { PT_CLASSIDS.PT_PAGE_ID });



Finally, you can create all sorts of filter statements and add them to the search request. It works very similarly to the snapshot query interface: a filter can contain several “Filter Clauses” and each clause can contain several “Filter Statements”. Clauses and statements can be combined using “OR” or “AND” operations.
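
To make the filter / clause / statement nesting concrete, here is a small stand-alone model of it. All the types below (BoolOp, PageInfo, Statement, Clause, Filter) are hypothetical and written only for this illustration; they are NOT the portal's IPTFilter classes, and the model evaluates items in memory rather than sending anything to the search server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only; NOT the portal's IPTFilter API.
public enum BoolOp { And, Or }

public sealed class PageInfo
{
    public int ObjectId;
    public string Name;
}

// A statement is a single test on one item (for example "ObjectId > 230").
public sealed class Statement
{
    private readonly Func<PageInfo, bool> _test;
    public Statement(Func<PageInfo, bool> test) { _test = test; }
    public bool Matches(PageInfo item) { return _test(item); }
}

// A clause combines its statements with AND or OR.
// Note: an empty AND clause matches everything, an empty OR clause matches nothing.
public sealed class Clause
{
    public BoolOp Operator = BoolOp.And;
    public readonly List<Statement> Statements = new List<Statement>();

    public bool Matches(PageInfo item)
    {
        return Operator == BoolOp.And
            ? Statements.All(s => s.Matches(item))
            : Statements.Any(s => s.Matches(item));
    }
}

// A filter combines its clauses with AND or OR, in the same way.
public sealed class Filter
{
    public BoolOp Operator = BoolOp.And;
    public readonly List<Clause> Clauses = new List<Clause>();

    public bool Matches(PageInfo item)
    {
        return Operator == BoolOp.And
            ? Clauses.All(c => c.Matches(item))
            : Clauses.Any(c => c.Matches(item));
    }
}
```

With one AND clause holding the two statements "ObjectId > 230" and "Name contains Test", a page with ID 231 named "Test page" matches, while a page with ID 100 and the same name does not. Switching the clause operator to OR would let both through.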


For this exercise, we will look for objects with an ID greater than 230 and a name containing “Test” (a kind of useless query, but that’s not the point here):




    // Create a filter for the search request which will "AND" together each filter clause.
    IPTFilter ptFilter = PortalObjectsFactory.CreateSearchFilter();
    ptFilter.SetOperator(PT_BOOLOPS.PT_BOOLOP_AND);

    // Create the clause that will contain the statements we need for the query.
    IPTPropertyFilterClauses ptFilterClause = (IPTPropertyFilterClauses) ptFilter.GetNewFilterItem(PT_FILTER_ITEM_TYPES.PT_FILTER_ITEM_CLAUSES);
    // The filter clause should "AND" each of the statements.
    ptFilterClause.SetOperator(PT_BOOLOPS.PT_BOOLOP_AND);

    // Statement 1: ObjectID > 230
    IPTPropertyFilterStatement statement1 = (IPTPropertyFilterStatement) ptFilter.GetNewFilterItem(PT_FILTER_ITEM_TYPES.PT_FILTER_ITEM_STATEMENT);
    statement1.SetOperand(PT_INTRINSICS.PT_PROPERTY_OBJECTID);
    statement1.SetOperator(PT_FILTEROPS.PT_FILTEROP_GT);
    statement1.SetValue(230);

    // Statement 2: object name contains the text "Test"
    IPTPropertyFilterStatement statement2 = (IPTPropertyFilterStatement) ptFilter.GetNewFilterItem(PT_FILTER_ITEM_TYPES.PT_FILTER_ITEM_STATEMENT);
    // Search on the name property.
    statement2.SetOperand(PT_INTRINSICS.PT_PROPERTY_OBJECTNAME);
    statement2.SetOperator(PT_FILTEROPS.PT_FILTEROP_CONTAINS);
    statement2.SetValue("Test");

    // Add the statements to the clause.
    ptFilterClause.AddItem(statement1, ptFilterClause.GetCount());
    ptFilterClause.AddItem(statement2, ptFilterClause.GetCount());

    // Add the clause to the filter.
    ptFilter.SetPropertyFilter(ptFilterClause);



As you can see, it is very powerful and straightforward, and lets you perform all sorts of searches to fit your needs.


Finally, when you are done with the search parameters and filters, you simply need to execute the query and get the results back:




    // Build the query from the filter created above, then run it.
    IPTSearchQuery query = req.CreateAdvancedQuery(ptFilter);
    IPTSearchResponse ptPagesResponse = req.Search(query);
    int nResultCount = ptPagesResponse.GetResultsReturned();
    for (int nIndex = 0; nIndex < nResultCount; nIndex++) {
        // Read the returned properties of this result, then do something with the data...
        int nObjectID = ptPagesResponse.GetFieldsAsInt(nIndex, PT_INTRINSICS.PT_PROPERTY_OBJECTID);
        string strName = ptPagesResponse.GetFieldsAsString(nIndex, PT_INTRINSICS.PT_PROPERTY_OBJECTNAME);
        string strSummary = ptPagesResponse.GetFieldsAsString(nIndex, PT_INTRINSICS.PT_PROPERTY_OBJECTSUMMARY);
    }
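
For the page listing use case, the "do something with the data" step boils down to writing one link per result so the crawler has something to follow. The sketch below shows one possible way to do that; PageLinkRenderer and its urlTemplate parameter are hypothetical, and the actual portal page URL format is not shown here, so the caller supplies a format string in which {0} stands for the page's object ID.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Net;
using System.Text;

// Hypothetical helper for illustration; not part of the portal API.
public static class PageLinkRenderer
{
    // pages: (object ID, page name) pairs, e.g. collected from the search response.
    // urlTemplate: assumed format string where {0} is replaced by the object ID.
    public static string Render(IEnumerable<KeyValuePair<int, string>> pages, string urlTemplate)
    {
        var html = new StringBuilder();
        html.Append("<ul>\n");
        foreach (var page in pages)
        {
            string href = string.Format(CultureInfo.InvariantCulture, urlTemplate, page.Key);
            // HTML-encode both the link and the name so that characters such as
            // '&' or '<' in a page name cannot break the markup.
            html.Append("<li><a href=\"")
                .Append(WebUtility.HtmlEncode(href))
                .Append("\">")
                .Append(WebUtility.HtmlEncode(page.Value))
                .Append("</a></li>\n");
        }
        html.Append("</ul>\n");
        return html.ToString();
    }
}
```

Inside the loop above you would add each ID and name to a list, then write the rendered string to the response once the loop is done.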



There it is. I hope you see the endless possibilities you now have using the native search API in your various native API utilities (portlet, console application, etc.). I will soon post the entirety of this code, plus many other extras, on the ALUI Toolbox Google project. Stay tuned, and Happy New Year! :)