Data is no longer replicating between servers. You may see one or more of the following symptoms:
Job shows as in progress for more than 48hrs
Symantec Management Platform 7.0 SP5 and later
Symantec Management Platform 7.1 and later
Symantec Management Platform 7.5 and later
Symantec Management Platform 8.x
When a failure occurs during a replication job, the normal cleanup measures are unable to be performed.
NOTE: The Task attached to this article should be disabled/deleted from your servers once this issue has been resolved in a released version of SMP.
Also, if the ITEM table in the database is too large then the cleanup Task may fail to run, because there are too many rows to deal with at one time. If the Task continually fails to run then edit the SQL in the task ‘Replication Data Cleanup’ and at the very top line put the following statement:
SET ROWCOUNT 5000
This will cause the Task to only take out 5,000 rows. This may also require that the Task be run multiple times before everything is cleaned up.
Cleanup steps post 8.x:
/*
This query tells the current state of all replication jobs, past and present and includes other information such as duration and transfer sizes.
*/
select distinct
j.JobId,
fs1.Name as [Source],
fs2.Name as [Destination],
case when j.State = 1 then 'Not Started'
when j.State = 2 then 'Pending'
when j.State = 3 then 'Running'
when j.State = 4 then 'Complete'
when j.State = 5 then 'Aborted'
when j.State = 6 then 'Error'
end as [Job Status],
j.StartDate, j.FinishDate, j.RunCount, o.cnt as Operations,hs.RunningTime as [Duration], hs.TransferSize
from ReplicationJob j
join ForwardServer fs1 on fs1.Guid = j.[Source]
join ForwardServer fs2 on fs2.Guid = j.Destination
join ReplicationJobOperationState hs on hs.JobID = j.JobID
join
(select distinct JobId, cnt=COUNT(*)
from ReplicationJobOperationState
group by JobId
)o on o.JobId = j.JobID
where 1 = 1
-- and hs.OperationId = '{9317d360-bfa1-4070-95a0-0acc5e74c917}'
-- and DATEDIFF(hh, j.Startdate, GETDATE()) <= 24
order by j.StartDate desc
/*
The following query will show list of hung or stalled jobs currently not complete. The default is anything older than 8 hours.
It can be changed by modifying the value of the @hours variable on line 2.
*/
declare @hours float
set @hours = .34 -- approximately 8 hours
select j.StartDate
,isnull(jsi.Name,rs.Name) as Name
,cast(rs.State as XML).value('(//source/name/text())[1]','nvarchar(100)') sourceNS
,cast(rs.State as XML).value('(//destination/name/text())[1]','nvarchar(100)') DestinationNS
,isnull(LastAction,j.CreatedDate) as LastAction
,j.RunCount
,isnull(o.Type,'')+' '+isnull(o.Task,'') CurrentOperation
,m.ManifestCount, m.Pending,m.VerificationInProgress,m.[ReplicaitonInProgress],m.[Failed],m.[Replicated]
,j.JobID,ja.SourceJobID
from ReplicationJob j
left join Item rs on rs.Guid=j.ItemGuid
LEFT JOIN Evt_NS_Hierarchy_Job_Targets rt on rt.ReplicationJobID=j.JobID
LEFT JOIN Evt_NS_Hierarchy_Job_Start js on js.JobID=rt.JobID
LEFT JOIN Item jsi on jsi.Guid=js.CauseID
left join ReplicationJobAssociation ja on j.JobID=ja.DestinationJobID
left Join (
select o.JobID
,sum(case when m.State in (1,3,5,7,9,11,13) then 1 else 0 end) as [Pending]
,sum(case when m.State in (2,6) then 1 else 0 end) as [VerificationInProgress]
,sum(case when m.State in (4,8,10,12,14) then 1 else 0 end) as [ReplicaitonInProgress]
,sum(case when m.State=16 then 1 else 0 end) as [Failed]
,sum(case when m.State=15 then 1 else 0 end) as [Replicated]
,COUNT(m.State) ManifestCount
,max(case when m.ModifiedDate>o.ModifiedDate then m.ModifiedDate else o.ModifiedDate end) LastAction
from ReplicationOperation o
left join ReplicationManifest m on m.OperationID=o.OperationID
group by o.JobID
) m on m.JobID=j.JobID
left join ReplicationOperation o on o.JobID=j.JobID and o.State=2
where j.State<4
and isnull(LastAction,j.CreatedDate)<GETDATE()-@hours --No Action for over 8 Hrs
order by j.StartDate