CTRL characters break XML formatting in Odata
search cancel

CTRL characters break XML formatting in Odata

book

Article ID: 366886

calendar_today

Updated On:

Products

Clarity PPM SaaS

Issue/Introduction

CTRL characters break XML formatting in Odata

Steps to Reproduce: 
1- Update a Project text or RIch text field with any of the following characters
\2B7F    https://symbl.cc/en/2B7F/    ⭿    Vertical Tab Key
\0002    https://symbl.cc/en/0002/    STX Start of Text
\000B    https://symbl.cc/en/000B/    VT     Vertical Tabulation
These are used in Microsoft Word and other Rich text editors.
2- Copy paste in characters into text field and save (use Modern UX)
3- Run Load Data Warehouse
4- DWH is successful
5- Make an ODATA call to that object, the call will fail because the character is invalid for XML
For example:
hexadecimal value 0x0B is an invalid character

Expected Results: DWH job filter out XML breaking characters

Actual Results: XML breaking characters are loaded into the DWH

Workaround: 
Search for bad row after the fact, for example
Using Oracle:
SELECT * FROM DWH_INV_INVESTMENTS
WHERE 
DESCRIPTION LIKE '%'||UNISTR('\0002')||'%'
OR DESCRIPTION LIKE '%'||UNISTR('\2B7F')||'%'
OR DESCRIPTION LIKE  LIKE '%'||UNISTR('\000B')||'%'

Using Postgres:
select * from dwh_inv_investments 
where description ~ '\x2B7F|\x000B|\x0002'

Once found, remove that special character or do an update statement, for example:

UPDATE INV_INVESTMENTS
SET DESCRIPTION = REGEXP_REPLACE(description, '^[:alnum:]', ' ')
WHERE...

Alternatively, if the odata client supports headers use:
header('Content-Type: application/json')

Environment

Clarity SaaS GCP and Rego Hosted:  16.1.3, 16.2, 16.2.1

Cause

Certain Unicode Control characters are invalid for XML.  Odata returns data in XML format by default

Resolution

This is being investigated at Broadcom under: DE80636