Wednesday, July 23, 2008

Inside ALUI Grid Search: Redundancy Bug (6.1 on Windows at least)

With ALUI 6.1, BEA introduced a completely revamped search component for ALUI, allowing for better redundancy and better throughput: Grid Search. The main advantages of that new search component are:

  • Multiple search nodes to provide redundancy for serving search requests.
  • The search index can be split into multiple partitions, each attached to various search nodes, to increase throughput.

All nodes on the same partition automatically replicate their search index locally to guarantee redundancy and performance.

Although multiple nodes can be deployed for redundancy, all of them need to access a central "cluster" data repository located somewhere on the network (through a file share). By default it is located at <ALUI_HOME>/ptsearchserver/6.1/cluster. The usual setup is to share that folder (a simple network share if you are on Windows) and configure the other nodes to use that share as their cluster repository. This cluster repository holds the cluster information (nodes and partitions info) and the multiple search checkpoints that allow for search index backup.

The main problem I personally experienced with that design is that this cluster repository represents a single point of failure... if the cluster share suddenly becomes unavailable (hard disk, server, or network failure), none of the nodes can talk to the cluster, and problems are likely to follow.

And actually, a huge problem occurs in that case: if the cluster share is not available, all the nodes suddenly hit an "Out Of Memory" exception and shut down abruptly. Thus, although you deployed multiple nodes and partitions, if the cluster share is down, your search architecture is...down.

It is pretty easy to test (at least I successfully reproduced the bug on ALUI 6.1 MP1 Patch 1 on Windows Server 2003): have all your nodes running, and simply remove the share from your cluster folder...all your nodes will go down (apart from the one that accesses the cluster folder locally, if the cluster share is hosted on the same server as one of the nodes).

Two options from there:

  • Make sure the share is never down (Windows clustering, redundant NAS cluster, or PolyServe technologies).
  • Install the critical fix from BEA that addresses this bug.

If you don't have an infrastructure that provides the first (expensive) option, you might want to look seriously into the second one...and contact your sales rep ASAP. Basically, the critical fix allows the nodes to continue serving requests even if the cluster share is no longer available. All the nodes switch automatically to read-only mode without the "Out Of Memory" exception that was occurring before.

Although this is much better, some problems remain with that critical fix. First, when in read-only mode, the nodes no longer index new content...your search index is frozen at the point in time when the cluster share went down, and any new object or document will not show up in search until the cluster share is restored. Second, the nodes will NOT automatically switch back to read/write mode when the cluster share becomes available again. That requires a manual restart.

But compared to a total shutdown of search, these problems are much less serious!

I am not 100% sure this fix has been pushed to ALUI 6.5, but I sure hope so. And by fix, I mean a total fix, including an automatic switch back to "normal" mode when the share is available again, or even TOTAL continuity of service when this share goes down...

Please let me know (leave a comment) if you have that information on 6.5, or if you reproduce this with other versions of the portal.

Tuesday, July 15, 2008

ALUI Tool: URL (or text) Migration within Publisher Items

Following my previous article "ALUI Administration Tool for Environment Refresh: String Replacing for URLS" about migration between environments, here is an extra piece that you might find very useful (I certainly use it all the time).

Basically, as explained in the previous article, it is common to have different DNS aliases set up per environment, e.g. for the publisher remote server.

Similarly, the publish browsing URL is not an exception to this rule:

  • http://publisher-content.domain.com/publish for production
  • http://publisher-content-stg.domain.com/publish for staging
  • http://publisher-content-dev.domain.com/publish for development

When you add an image or a link in the free text editor of content items in publisher, it will most of the time create an absolute URL to that resource...thus you can imagine that there will be a lot of DNS aliases within a lot of publisher items throughout the environment.

    What happens when you migrate the publisher DB from one environment to another? Well, you will have a lot of DEV DNS aliases within your staging environment (in the case of a DEV promotion to staging), or a lot of production DNS aliases within your DEV environment (in the case of a production refresh to DEV).

    In my previous article "ALUI Administration Tool for Environment Refresh: String Replacing for URLS", I mostly talked about migrating URLs within portal objects, but said nothing really about migrating URLs within publisher items.

    Thus, I created some DB scripts (SQL Server only for now) that do just that...

    1. puburls-PTCSDIRECTORY-nvarchar-replace.sql: Script to change a particular string within the PUBLISHEDTRANSFERURL and PUBLISHEDURL columns of PCSDIRECTORY (which are mapped in the DB to columns of type VARCHAR)
    2. puburls-PTCSVALUE-ntext-replace.sql and puburls-PTCSVALUEA-ntext-replace.sql: Scripts to change a particular string within the "long text" property of a publisher item (which is mapped in the DB to a column of type TEXT):
      1. PCSVALUES.LONGVALUE (hosting the long text of the currently published item)
      2. PCSVALUESA.LONGVALUE (hosting the long text values of all the previous versions of the item)

    For the first script, the PUBLISHEDTRANSFERURL and PUBLISHEDURL columns are of type VARCHAR, so it is easy to replace a string within those columns using the SQL Server REPLACE function. A simple SQL statement is enough here.

    The main challenge was really with the second set of scripts...indeed, within a column of type TEXT, the SQL REPLACE function cannot be used...The workaround is to use the PATINDEX function and the UPDATETEXT statement within a Transact-SQL (T-SQL) script. To give credit to the right person, I adapted a script that I found at ASP FAQ - How do I handle REPLACE() within an NTEXT column in SQL Server?

    DISCLAIMER: ALTHOUGH I PERSONALLY USE THIS SCRIPT ALL THE TIME, THERE IS NO GUARANTEE; SO USE THIS TOOL AT YOUR OWN RISK blah blah blah AND USE IT ONLY IF YOU ARE PROFICIENT ENOUGH WITH ALUI PORTAL TECHNOLOGIES.

    Attached is the zip file package that contains the 3 scripts:

    Don't forget to change the string to look for, and the string to replace it with.

    puburls-PTCSDIRECTORY-nvarchar-replace.sql

    UPDATE [dbo].[PCSDIRECTORY]
    SET
    [PUBLISHEDTRANSFERURL]=REPLACE([PUBLISHEDTRANSFERURL],'-DEV.DOMAIN.COM','-TST.DOMAIN.COM'),
    [PUBLISHEDURL]=REPLACE([PUBLISHEDURL],'-DEV.DOMAIN.COM','-TST.DOMAIN.COM')
    WHERE
    publishedtransferurl like '%-DEV.DOMAIN.COM%'
    or publishedurl like '%-DEV.DOMAIN.COM%'
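
    Before running the UPDATE, it can be reassuring to preview which rows would change. The query below is only a sketch that reuses the same table, columns, and example strings as the statement above; the column aliases (current_url, new_url, and so on) are just illustrative names.

    ```sql
    -- Preview only: nothing is modified by this SELECT.
    -- Same filter as the UPDATE above, showing each value before and after REPLACE.
    SELECT
        [PUBLISHEDTRANSFERURL] AS current_transfer_url,
        REPLACE([PUBLISHEDTRANSFERURL], '-DEV.DOMAIN.COM', '-TST.DOMAIN.COM') AS new_transfer_url,
        [PUBLISHEDURL] AS current_url,
        REPLACE([PUBLISHEDURL], '-DEV.DOMAIN.COM', '-TST.DOMAIN.COM') AS new_url
    FROM [dbo].[PCSDIRECTORY]
    WHERE [PUBLISHEDTRANSFERURL] LIKE '%-DEV.DOMAIN.COM%'
       OR [PUBLISHEDURL] LIKE '%-DEV.DOMAIN.COM%';
    ```

    Whether the match is case-sensitive depends on the collation of your database, so check the preview against a few items you know contain the old alias.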



    puburls-PTCSVALUE-ntext-replace.sql and puburls-PTCSVALUEA-ntext-replace.sql



    SET @oldString = N'por-pubcontent-dev.domain.com'; -- remove N 
    SET @newString = N'por-pubcontent-tst.domain.com'; -- remove N
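
    To give a rough idea of the PATINDEX / UPDATETEXT approach described earlier, here is a minimal sketch. It is not the packaged script itself: it assumes LONGVALUE on PCSVALUES is an NTEXT column (as the script names suggest), the variable sizes are arbitrary, and it replaces one occurrence per pass until none is left. Try it on a copy of the database first.

    ```sql
    -- Sketch only, under the assumptions stated above.
    DECLARE @oldString NVARCHAR(200), @newString NVARCHAR(200);
    SET @oldString = N'por-pubcontent-dev.domain.com';
    SET @newString = N'por-pubcontent-tst.domain.com';

    DECLARE @pattern NVARCHAR(202), @oldLen INT, @ptr BINARY(16), @pos INT;
    SET @pattern = N'%' + @oldString + N'%';  -- PATINDEX needs the wildcards
    SET @oldLen  = LEN(@oldString);

    -- Caution: this loops forever if @newString itself contains @oldString.
    -- Caution: characters such as _ [ % in @oldString act as PATINDEX wildcards.
    WHILE 1 = 1
    BEGIN
        SET @ptr = NULL;

        -- Pick one row that still contains the old string;
        -- UPDATETEXT expects a zero-based offset, hence the - 1.
        SELECT TOP 1
            @ptr = TEXTPTR(LONGVALUE),
            @pos = PATINDEX(@pattern, LONGVALUE) - 1
        FROM dbo.PCSVALUES
        WHERE PATINDEX(@pattern, LONGVALUE) > 0;

        IF @ptr IS NULL BREAK;  -- nothing left to replace

        -- Delete @oldLen characters at @pos and insert @newString in their place.
        UPDATETEXT dbo.PCSVALUES.LONGVALUE @ptr @pos @oldLen @newString;
    END
    ```

    The same loop would apply to PCSVALUESA.LONGVALUE by swapping the table name in both the SELECT and the UPDATETEXT statement.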


    That's it! Let me know if you find it as useful as I do! Enjoy!!