Converting from EBCDIC to a format that can be read (preferably to ASCII) in DevTest

Products

CA Application Test, CA Continuous Application Insight (PathFinder), Service Virtualization

Issue/Introduction

There are several ways to accomplish this. As you go through the options, please note that you are not limited to what DevTest and its documentation provide out of the box.

With the Java execution step, you can solve complex testing problems by combining other solutions with DevTest.

The Software Development Kit (SDK) allows even greater integration with DevTest.

Environment

All supported DevTest releases.

Cause

N/A

Resolution

Option 1

Actually, there is a simple answer that only took me about a minute on Google to find.

You'll need a JavaScript step. The important part is this:

byte[] bytes = <somehow get the byte contents of the MQ message>;
String result = new String(bytes, "Cp1047");

"Cp1047" is the name of the EBCDIC-compatible character encoding built into Java. This parses EBCDIC data into a regular Java String.

Getting the byte contents of the MQ message is a little trickier. You can try unchecking the 'Extract Payload as Response' checkbox and extracting it manually in the JavaScript step:

 Object mqm = testExec.getStateObject("lisa.<step name>.rsp");
 byte[] bytes = new byte[mqm.getDataLength()];
 mqm.readFully(bytes);
 String result = new String(bytes, "Cp1047");

Disclaimer: This option had not actually been tested at the time this KB was written, and it is not verified as 100% complete.

As described by the submitter, it has not been tried; it is simply a possibility to follow. That exercise is left to the reader.
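
The decoding half of this option can be tried without an MQ message. The following is a minimal sketch, assuming the JVM ships the extended charsets that include Cp1047; the class name, the method name, and the four-byte payload (the letters "This" in EBCDIC) are made up for illustration and are not part of DevTest.

```java
import java.nio.charset.Charset;

/** Sketch: decode raw EBCDIC bytes with the Cp1047 charset. */
class Cp1047DecodeSketch {

    /** Decodes raw EBCDIC bytes into a Java String using code page 1047. */
    static String decode(byte[] ebcdic) {
        // Charset.forName throws UnsupportedCharsetException if this JVM
        // does not provide the Cp1047 charset.
        return new String(ebcdic, Charset.forName("Cp1047"));
    }

    public static void main(String[] args) {
        // Hypothetical payload standing in for the MQ message bytes:
        // the letters "This" encoded in EBCDIC.
        byte[] payload = {(byte) 0xE3, (byte) 0x88, (byte) 0x89, (byte) 0xA2};
        System.out.println(decode(payload));
    }
}
```

In a script step, only the body of `decode` would be needed once the message bytes are in hand.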

Option 2

In this option, the mysterious man named "Cam" is referred to as stating this is an LEK. Generally, this implies that Professional Services might be required to accomplish this type of extension of LISA. Professional Services are available and are included in most initial engagement purchases. If this is your case, please do not hesitate to contact your Solution Architect or Sales Person to get a quote on ramping you up quickly. Most of us can take an example of how to do it right and extrapolate that solution across many other areas.

Cam is correct: it's definitely LEK territory. However, it's a fairly simple thing to do, assuming that Java supports the specific EBCDIC character set in use. From a quick web search, there are easily more than 50 of them. A sample of those that Java does support:

charset IBM037 supported: true
charset IBM277 supported: true
charset IBM278 supported: true
charset IBM280 supported: true
charset IBM285 supported: true
charset IBM297 supported: true
charset IBM420 supported: true
charset IBM424 supported: true
charset IBM500 supported: true
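
The list above comes from checking a fixed set of names. As a rough alternative, the sketch below scans every charset installed in the JVM and keeps those that encode the letter 'A' as the single byte 0xC1, which is where EBCDIC places it. This is a heuristic chosen for illustration, not an official way to identify EBCDIC code pages, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

/** Sketch: find installed charsets that look like single-byte EBCDIC. */
class EbcdicCharsetFinder {

    /** Returns canonical names of charsets that encode 'A' as the byte 0xC1. */
    static List<String> findEbcdicCharsets() {
        List<String> names = new ArrayList<>();
        for (Charset cs : Charset.availableCharsets().values()) {
            // Some charsets are decode-only; skip them.
            if (!cs.canEncode()) {
                continue;
            }
            byte[] encoded = "A".getBytes(cs);
            if (encoded.length == 1 && (encoded[0] & 0xFF) == 0xC1) {
                names.add(cs.name());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : findEbcdicCharsets()) {
            System.out.println(name);
        }
    }
}
```

The result depends on the JVM in use, so the printed list may differ between installations.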

The Java Reader and Writer classes allow the use of a charset. For example, using a charset on a reader of EBCDIC data will convert from EBCDIC to Java's internal format. I've attached a quick sample of how this is done.

Original: This is a TEST of the emergency broadcast system! 

ASCII bytes: 0x54,0x68,0x69,0x73,0x20,0x69,0x73,0x20,0x61,0x20,0x54,0x45,0x53,0x54,0x20,0x6F,0x66,0x20,0x74,0x68,0x65,0x20,
0x65,0x6D,0x65,0x72,0x67,0x65,0x6E,0x63,0x79,0x20,0x62,0x72,0x6F,0x61,0x64,0x63,0x61,0x73,0x74,0x20,0x73,0x79,
0x73,0x74,0x65,0x6D,0x21, 

EBCDIC bytes: 0xE3,0x88,0x89,0xA2,0x40,0x89,0xA2,0x40,0x81,0x40,0xE3,0xC5,0xE2,0xE3,0x40,0x96,0x86,0x40,0xA3,0x88,0x85,0x40,
0x85,0x94,0x85,0x99,0x87,0x85,0x95,0x83,0xA8,0x40,0x82,0x99,0x96,0x81,0x84,0x83,0x81,0xA2,0xA3,0x40,0xA2,0xA8,
0xA2,0xA3,0x85,0x94,0x5A, 

Converted: This is a TEST of the emergency broadcast system!

Here is the code


 package com.itko.lisa.report.services;

 import java.io.BufferedReader;
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
 import java.io.InputStreamReader;
 import java.io.OutputStreamWriter;
 import java.nio.charset.Charset;

 public class CharsetTest {
     public static void main(String[] args) throws Exception {
         // report which of these EBCDIC charsets this JVM supports
         String[] names = {
             "IBM037", "IBM038", "IBM274", "IBM275", "IBM277",
             "IBM278", "IBM280", "IBM281", "IBM285", "IBM290",
             "IBM297", "IBM420", "IBM423", "IBM424", "IBM500"
         };

         for (String charset : names) {
             System.out.println("charset " + charset + " supported: " + Charset.isSupported(charset));
         }
         System.out.println("");

         final String convertThis = "This is a TEST of the emergency broadcast system!";

         // write out the original data
         System.out.println("Original: " + convertThis);

         // print out the ASCII bytes in hex to verify
         // (name the charset explicitly so the platform default is not used)
         byte[] bytes = convertThis.getBytes(Charset.forName("US-ASCII"));
         System.out.print("ASCII bytes: ");
         for (int i = 0; i < bytes.length; i++) {
             System.out.print("0x");
             System.out.print(Integer.toString(bytes[i] & 0xff, 16).toUpperCase());
             System.out.print(",");
         }
         System.out.println("");

         // convert it to EBCDIC charset IBM037
         ByteArrayOutputStream array = new ByteArrayOutputStream(convertThis.length());
         OutputStreamWriter out = new OutputStreamWriter(array, Charset.forName("IBM037"));
         out.write(convertThis);
         out.close();

         // print out the EBCDIC bytes in hex to verify
         bytes = array.toByteArray();
         System.out.print("EBCDIC bytes: ");
         for (int i = 0; i < bytes.length; i++) {
             System.out.print("0x");
             System.out.print(Integer.toString(bytes[i] & 0xff, 16).toUpperCase());
             System.out.print(",");
         }
         System.out.println("");

         // convert the EBCDIC data back to a Java String
         ByteArrayInputStream byteIn = new ByteArrayInputStream(bytes);
         InputStreamReader in = new InputStreamReader(byteIn, Charset.forName("IBM037"));
         BufferedReader bufIn = new BufferedReader(in);
         String line = null;
         while ((line = bufIn.readLine()) != null) {
             System.out.println("Converted: " + line);
         }

         bufIn.close();
     }
 }

Note: The source code is attached to this page.

Option 3

Here is another way to look at this exercise. In the following option, you are shown how to do just a snippet of the entire message. Another reference point outside of LISA is pointed out as "Wikipedia". Remember from Option 1 that Google is also a prime source of information.

Or, if what you want converted is small-ish, instead of all the fancy stream stuff, you could do:

String orig = "This is only a test!";
byte[] bytesEBCDIC = orig.getBytes("Cp037"); 

String converted = new String(bytesEBCDIC, "Cp037");
System.out.println(converted); //internally a string is UTF-16 which makes this easy. 

byte[] bytesASCII = converted.getBytes("ascii");
String roundTrip = new String(bytesASCII, "ascii");
System.out.println(roundTrip);

Where "Cp037" is the java.io & java.lang equivalent of IBM037. Similarly, Cp277 for IBM277, Cp1047 for IBM1047, etc. Seems to me the most common are Cp037 and Cp1047, but that is anecdotal. Note that both of those support all the Latin-1 characters. Also, any of the numbers listed in the first column on this page: http://en.wikipedia.org/wiki/EBCDIC_8859 should support the full Latin-1 character set, so would probably work as well. If you have to deal with control codes and other funky characters besides a-z A-Z 0-9 and punctuation, you'd probably be best served finding out which code page the original data is encoded in.

If you already have the data in an array of bytes, all you need are the 3rd and 4th lines above and you'll have a human readable string.
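
To see how the Cp-style names relate to the IBM-style names, the sketch below asks Java for the canonical name behind each alias. It also prints how each code page encodes square brackets, one place where EBCDIC code pages can disagree, which is why knowing the original code page matters. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

/** Sketch: resolve Cp-style aliases and compare two EBCDIC code pages. */
class EbcdicAliasSketch {

    /** Returns the canonical java.nio name for a charset alias. */
    static String canonicalName(String alias) {
        return Charset.forName(alias).name();
    }

    /** Encodes text with the named charset and returns the bytes as hex. */
    static String encodeHex(String text, String charsetName) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(Charset.forName(charsetName))) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        return hex.toString().trim();
    }

    public static void main(String[] args) {
        for (String alias : new String[] {"Cp037", "Cp1047"}) {
            System.out.println(alias + " -> " + canonicalName(alias));
        }
        // Letters and digits match across these two code pages,
        // but some punctuation does not.
        System.out.println("[] in Cp037:  " + encodeHex("[]", "Cp037"));
        System.out.println("[] in Cp1047: " + encodeHex("[]", "Cp1047"));
    }
}
```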

Option 4

Here is a brief interchange from that mysterious man named "Cam" and Rajeev. This option would be a Professional Services engagement or a Development process on your company's part.

Have a great idea? Submit an enhancement request!

Please note that from this exchange of ideas, an enhancement request was submitted on Rajeev's part to allow Development to look at the merits of this request.

I think we should modify our Read a file step to take Charset as an argument. That's what a client of ours did using LEK.

Rajeev

Cameron Bromley wrote:

Yes, it's a LEK thing. Apparently a client of ours did it a while back. Rajeev knows more.

From the above information, LISA is demonstrated as a very extensible product.
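
Rajeev's idea of a file read that takes a charset as an argument can be illustrated in plain Java. The sketch below is not the DevTest Read a File step and does not use any LEK API; the class name, method names, and temporary file are assumptions made only so the example can run on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: read a whole file as text using a caller-supplied charset. */
class CharsetFileReaderSketch {

    /** Reads the file and decodes its bytes with the named charset. */
    static String readFile(Path file, String charsetName) {
        try {
            byte[] raw = Files.readAllBytes(file);
            return new String(raw, Charset.forName(charsetName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a small EBCDIC file to a temp location and reads it back. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("ebcdic-sample", ".dat");
            try {
                Files.write(tmp, "This is only a test!".getBytes(Charset.forName("IBM037")));
                return readFile(tmp, "IBM037");
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Reading the whole file into memory keeps the sketch short; for large files, an InputStreamReader with the same charset would be the streaming equivalent.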

Option 5

Originally I tried _Option 1_; whilst it worked, my client eventually provided some custom code where they change certain characters around. I've yet to find the encoding they are using that matches. However, that doesn't detract from the example of how to handle a scenario such as this. The code is attached on this page. Should you need to use it, replace the lookup tables with the ones your customer might have 'tweaked'. Run it with two arguments: the first is your EBCDIC file and the second is the output file. I used this purely during my analysis phase when I started with a whole bunch of copybook requests given to me. - Vic.

 package com.itko.vic;

 import java.io.BufferedWriter;
 import java.io.FileInputStream;
 import java.io.FileWriter;

 public class EBDICConverter {

 static int [] ASCII_EBCIDIC_TABLE =
 {
 0x00, 0x01, 0x02, 0x03, 0x37, 0x2D, 0x2E, 0x2F, 0x16, 0x05, 0x25, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
 0x10, 0x11, 0x12, 0x13, 0x3C, 0x3D, 0x32, 0x26, 0x18, 0x19, 0x3F, 0x27, 0x1C, 0x1D, 0x1E, 0x1F,
 0x40, 0x5A, 0x7F, 0x7B, 0x5B, 0x6C, 0x50, 0x7D, 0x4D, 0x5D, 0x5C, 0x4E, 0x6B, 0x60, 0x4B, 0x61,
 0xF0, 0xF1, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6, 0xF7, 0xF8, 0xF9, 0x7A, 0x5E, 0x4C, 0x7E, 0x6E, 0x6F,
 0x7C, 0xC1, 0xC2, 0xC3, 0xC4, 0xC5, 0xC6, 0xC7, 0xC8, 0xC9, 0xD1, 0xD2, 0xD3, 0xD4, 0xD5, 0xD6,
 0xD7, 0xD8, 0xD9, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6, 0xE7, 0xE8, 0xE9, 0xAD, 0xE0, 0xBD, 0x5F, 0x6D,
 0x79, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, 0x88, 0x89, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96,
 0x97, 0x98, 0x99, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7, 0xA8, 0xA9, 0xC0, 0x4F, 0xD0, 0xA1, 0x07,
 0x20, 0x21, 0x22, 0x23, 0x24, 0x15, 0x06, 0x17, 0x28, 0x29, 0x2A, 0x2B, 0x2C, 0x09, 0x0A, 0x1B,
 0x30, 0x31, 0x1A, 0x33, 0x34, 0x35, 0x36, 0x08, 0x38, 0x39, 0x3A, 0x3B, 0x04, 0x14, 0x3E, 0xE1,
 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48, 0x49, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57,
 0x58, 0x59, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68, 0x69, 0x70, 0x71, 0x72, 0x73, 0x74, 0x75,
 0x76, 0x77, 0x78, 0x80, 0x8A, 0x8B, 0x8C, 0x8D, 0x8E, 0x8F, 0x90, 0x9A, 0x9B, 0x9C, 0x9D, 0x9E,
 0x9F, 0xA0, 0xAA, 0xAB, 0xAC, 0x4A, 0xAE, 0xAF, 0xB0, 0xB1, 0xB2, 0xB3, 0xB4, 0xB5, 0xB6, 0xB7,
 0xB8, 0xB9, 0xBA, 0xBB, 0xBC, 0x6A, 0xBE, 0xBF, 0xCA, 0xCB, 0xCC, 0xCD, 0xCE, 0xCF, 0xDA, 0xDB,
 0xDC, 0xDD, 0xDE, 0xDF, 0xEA, 0xEB, 0xEC, 0xED, 0xEE, 0xEF, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
 };

 	/*       
 EBCDIC_ASCII_TABLE[74] 0x2E ====> 0x5B . ===> [ 
 EBCDIC_ASCII_TABLE[79] 0x7C ====> 0x21 | ===> ! 
 EBCDIC_ASCII_TABLE[90] 0x21 ====> 0x5D ! ===> ] 
 EBCDIC_ASCII_TABLE[121] 0x2E ====> 0x60 . ===> ~ 
 EBCDIC_ASCII_TABLE[173] 0x5B ====> 0x2E [ ===> . 
 */ 
 static int [] EBCDIC_ASCII_TABLE =
 {
 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18,
 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28,
 0x29, 0x2A, 0x2B, 0x2C, 0x2D, 0x2E, 0x2F,
 0x2E, 0x2E, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38,
 0x39, 0x3A, 0x3B, 0x3C, 0x3D, 0x2E, 0x3F,
 0x20, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x2E, 0x5B, 0x2E, 0x3C, 0x28, 0x2B, 0x21,
 0x26, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x2E, 0x5D, 0x24, 0x2A, 0x29, 0x3B, 0x5E,
 0x2D, 0x2F, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x2E, 0x7C, 0x2C, 0x25, 0x5F, 0x3E, 0x3F,
 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x60, 0x3A, 0x23, 0x40, 0x27, 0x3D, 0x22,
 0x2E, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68,
 0x69, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x2E, 0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F, 0x70, 0x71,
 0x72, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x2E, 0x7E, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78, 0x79,
 0x7A, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x2E, 0x2E, 0x2E, 0x2E, 0x5D, 0x2E, 0x2E,
 0x7B, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48,
 0x49, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x7D, 0x4A, 0x4B, 0x4C, 0x4D, 0x4E, 0x4F, 0x50, 0x51,
 0x52, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x5C, 0x2E, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58, 0x59,
 0x5A, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E,
 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38,
 0x39, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E, 0x2E
 };

     // write the converted text to the output file
     void saveToFile(final String filename, final String content) {
         try {
             BufferedWriter out = new BufferedWriter(new FileWriter(filename));
             out.write(content);
             out.close();
         } catch (Exception exception) {
             exception.printStackTrace();
         }
     }

     // read the input file as unsigned byte values (0-255)
     int [] readFile(final String filename) {
         try {
             FileInputStream fileInputStream = new FileInputStream(filename);
             int chararray[] = new int[fileInputStream.available()];
             int inChar;
             int index = 0;
             while ((inChar = fileInputStream.read()) != -1) {
                 chararray[index++] = inChar;
             }
             fileInputStream.close();
             return chararray;
         } catch (Exception exception) {
             exception.printStackTrace();
         }
         return new int[0];
     }

     // map every EBCDIC byte through the lookup table
     String convertEBDICtoASCII(final String fileName) {
         int [] rawArray = readFile(fileName);
         char [] convertedArray = new char[rawArray.length];
         for (int index = 0; index < rawArray.length; index++) {
             convertedArray[index] = (char) EBCDIC_ASCII_TABLE[rawArray[index]];
         }
         return new String(convertedArray);
     }

     public static void main(String [] args) {
         String file = args[0];
         String output = args[1];
         EBDICConverter converter = new EBDICConverter();
         String result = converter.convertEBDICtoASCII(file);
         converter.saveToFile(output, result);
     }
 }
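
If the customer's mapping is a standard code page with only a few characters changed, a possible alternative to maintaining a full hand-written table is to generate the table from a Java charset and then override the tweaked entries. The following is a minimal sketch under that assumption. The class name and the choice of IBM1047 as the base are hypothetical, and the single override shown (byte 0x4A mapped to '[') is taken from the comment in Vic's code above.

```java
import java.nio.charset.Charset;

/** Sketch: build an EBCDIC-to-char table from a charset, then patch it. */
class PatchedEbcdicTable {

    /** Builds a 256-entry byte-to-char table from a single-byte charset. */
    static char[] buildTable(String charsetName) {
        Charset cs = Charset.forName(charsetName);
        char[] table = new char[256];
        for (int i = 0; i < 256; i++) {
            // Decode each possible byte value on its own.
            table[i] = new String(new byte[] {(byte) i}, cs).charAt(0);
        }
        return table;
    }

    /** Converts EBCDIC bytes to a String by table lookup. */
    static String convert(byte[] ebcdic, char[] table) {
        char[] out = new char[ebcdic.length];
        for (int i = 0; i < ebcdic.length; i++) {
            out[i] = table[ebcdic[i] & 0xFF];
        }
        return new String(out);
    }

    public static void main(String[] args) {
        char[] table = buildTable("IBM1047");
        // Customer-specific tweak: treat byte 0x4A as '['.
        table[0x4A] = '[';
        // "TEST" in EBCDIC followed by the tweaked byte.
        byte[] data = {(byte) 0xE3, (byte) 0xC5, (byte) 0xE2, (byte) 0xE3, (byte) 0x4A};
        System.out.println(convert(data, table));
    }
}
```

Keeping the overrides separate from the generated table makes it easier to see exactly which characters the customer changed.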

Additional Information

Refer to section "Using the SDK" in the documentation of the DevTest release you are running.