Admin: RPI Selection Rule, Audience, or Interaction running slowly

Why are my RPI campaigns taking so long to complete?

The most common reasons that a Selection Rule, Audience, or Interaction takes a very long time to run are:

If the underlying SQL queries associated with the respective operation are taking too long to complete
If the query execution is being blocked by another long-running query

A long running selection rule or audience execution does not necessarily indicate an issue or limitation with Redpoint Interaction (RPI). In most cases, RPI is simply waiting for the results from the database server in order to proceed.

When a performance issue is observed, what should you do?

Review the underperforming campaign, including:
- Review if any recent changes were made
- Confirm updated joins are joining on indexed fields in tables
- Review if any other campaigns running at the time that could cause a conflict
Work with your DBA team to help identify any issues and areas of improvement, including:
- Identifying and tuning long running and/or blocking queries on the database server
- Verifying appropriate indexes are in place

Using the SQL Trace Log from within the RPI client

RPI maintains a SQL Trace log of all completed and failed T-SQL queries associated with selection rules, audiences, and interactions. From within the Interaction client navigate to Operations>Query Trace Log where you will see the ability to filter on:

Timeframe
Status (All, Completed, Failed)
Activity ID
Workflow Instance ID

You can also sort the entire result set by the “Duration” column as well as export the displayed Query Trace Log.

From these results you should clearly be able to see which queries are in need of tuning, indexes, and or statistics. Please remember that on long running queries, Interaction as an application is waiting for the result set from the database engine. These results should be provided to your DBA team.

Learn more: Gathering Logs From RPI

Identifying long running and blocking queries directly on SQL Server

The following section provides a basic example of how to identify blocking and long running queries on "Microsoft SQL Server".

Run the following from SQL Management studio to list out information about all running processes on the database instance:
- sp_who2
Use the information from sp_who2 to identify long running queries or blocked/blocking queries that are affecting your RPI operations. The output sp_who2 contains a lot of useful information, among which you will see columns:
- SPID: the SQL process id.
- HostName: the hostname of the client server that has established a connection to the SQL Server.
- BlkBy: the SPID number of the process that is blocking this process. If this is blank, then the process is not blocked.
- DBName: the database to which the client server is connecting.
- Command: a short description of the type of SQL Statement being executed.
- LastBatch: the date/time of the last batch executed by the connection identified in the sp_who2 record.

Identifying a blocking situation:

Review the HostName column for processes originating from the Redpoint interaction server. Identify if any queries are blocked by reviewing the BlkBy column. For any records that contain a SPID in this column, review the information from the blocking SPID. If the blocking SPID is blocked as well, trace that block back up to the SPID that is blocking the blocker. You may need to trace the block up several levels until you identify the lead blocker. Once you have identified the lead blocker, proceed to step 3. to reveal the SQL statement associated to the blocking SPID.

Identifying a long running query:

It's also possible that there is no blocking, but there is:

A query that is taking an inordinate amount of time to complete
Is possibly hung because of a poor execution plan due to missing indexes
Resource contention
SQL Server instance is simply undersized for the amount of data and complexity of the queries being issued against the database instance

In any case, check for sp_who2 records where the status is RUNNABLE and the LastBatch time has not been updated in a long while. If you identify a connection where the command has not been updated in a long time then make note of the SPID and proceed to step 3 below.

Identifying the blocking or long running SQL statement using the queries below, substituting @@SPID with the SPID identified in step 2 above.

SQL

DECLARE @Handle varbinary(64);
SELECT @Handle = sql_handle
FROM sys.dm_exec_requests
WHERE session_id = @@SPID and request_id = 0;
SELECT * FROM sys.fn_get_sql(@Handle);
GO
DBCC INPUTBUFFER(@@SPID)

Optionally, terminate the long running or blocking process to free up locks and resources on the SQL Server. For example, if you've identified a lead blocking SPID that has not had an update in hours, you can issue the KILL command to end the process and allow the blocked processes to complete.
- KILL @@SPID
Once the pressure and blocking has been released, you can proceed to tune the long-running queries identified above. One option is to run the SQL Server Database Engine Tuning Advisor analysis against the offending query and applying the index and statistics recommendations.

Additional forensic scripts for your DBA Team

Index Statistics

Use the SQL below to look at indexes on specific tables. This query has two parts the first being the index stats and the second being the last restart date of the server. Stats are cleared upon a restart of SQL service.

SQL

SELECT 
    t.name AS tblNm, 
    i.name AS idxNm, 
    'DROP INDEX ' + i.name + ' ON ' + t.name + ';' AS DropStmt, 
    ius.*
FROM 
    sys.dm_db_index_usage_stats ius
INNER JOIN 
    sys.tables t ON ius.object_id = t.object_id
INNER JOIN 
    sys.indexes i ON ius.index_id = i.index_id
    AND ius.object_id = i.object_id
WHERE 
    1 = 1
    -- Uncomment the following conditions as needed
    -- AND t.name = 'person'
    -- AND user_seeks < 1
    -- AND user_scans < 1
    AND i.name NOT LIKE 'pk[_]%'
    AND database_id = DB_ID()
    -- AND t.name LIKE 'sfdc_opp%'
ORDER BY 
    t.name, i.name;
-- Retrieves the SQL Server start time
SELECT 
    sqlserver_start_time 
FROM 
    sys.dm_os_sys_info;

Offer History

Use the OH-Research.sql script below to check OfferHistory size and performance.

OH-Research.sql

General RPI Forensics

Use these RPI Forensic scripts below for performance statistics on all workflow query activities.

RPIForensics.zip

How can I ensure my campaigns continue to run at peak performance?

Collaboration with your DBA and IT team is critical. As you continue further development and expansion of your RPI solution, here are some things you should keep in mind:

As campaigns are added or modified it is incumbent upon the RPI users to continually communicate index requirements to their DBA team.
If the indexes are not created for new joins table scanning will occur. Table scans are the most common culprit of performance degradation we see, so this level of collaboration is critical.

Check for Data Type Mismatches in Joins:

Columns joins that contains data type mismatches can cause extra table scans and indexes not being used correctly by SQL Server. You can use the RPI Joins Audit Report to identify these instances.