Input XML file:

<ARS>

<tag1>one</tag1>

<tag2>two</tag2>

<tag3>three<AltError Code=123 Description=456789/></tag3>

<tag4>four</tag4>

</ARS>

<ARS>

<tag1>ABCD</tag1>

<tag2>ABCD</tag2>

<tag3>ddsdsds<AltError Code=123 Description=456789/></tag3>

<tag4>EFGH<AltError Code=abc Description=defg/></tag4>

</ARS>

Expected Output:

tag1|tag2|tag3|tag4|code|Description

one|two|three|four|123|456789

ABCD|ABCD|ddsdsds|EFGH|123|456789

ABCD|ABCD|ddsdsds|EFGH|abc|defg


The script should read this file and create a pipe-delimited file. Each ARS record in the XML file should produce one line in the target. The values enclosed within the tags are the column values.

Also, if an <AltError> tag occurs within an ARS record, the script should take the Code and Description values of the AltError tag and append them to the end of that ARS record. If two AltError tags occur within one ARS record, the target should contain two lines for that record, one per AltError.

Can someone please help me out?


Looks like you need an XML parser or a regex evaluator of some kind.

Linux probably has a good regex library; look there.

Can't it be done with only a shell script using sed and awk? I'm not supposed to use an XSLT parser.

Since this is homework, I will show an example. You follow up with the rest.

awk '/<tag1>/ {
         gsub(/<\/?tag1>/, "")   # strip the opening and closing tag1 markers
         t1 = t1 $0 "|"          # collect each tag1 value, pipe-terminated
     }
     END { print t1 }' "file"
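
For anyone following up, here is a rough sketch of how the single-tag example above might be extended to whole records. It is not a definitive solution and it rests on several assumptions taken from the sample: each element sits on its own line, attribute values contain no spaces, and a line holding only an ARS tag marks a record boundary (the script toggles on both `<ARS>` and `</ARS>`, so it copes with either form of closing tag). The function name `ars_to_psv` and the choice to print empty Code and Description columns for records without an AltError are my own assumptions, not part of the question.

```shell
# ars_to_psv FILE: hypothetical helper name, not from the thread.
# Prints one pipe-delimited line per ARS record (one per AltError if any).
ars_to_psv() {
    awk '
    # Return the value of attribute "key" inside tag text s ("" if absent).
    # Assumes values are unquoted or double-quoted and contain no spaces.
    function attrval(s, key,    v) {
        if (!match(s, key "=\"?[^\" />]*")) return ""
        v = substr(s, RSTART + length(key) + 1, RLENGTH - length(key) - 1)
        gsub(/"/, "", v)
        return v
    }
    # Print the collected record, then reset the per-record counters.
    function emit(    i, row, hdr) {
        if (nf == 0) return
        for (i = 1; i <= nf; i++) {
            row = (i > 1 ? row "|" : "") val[i]
            hdr = (i > 1 ? hdr "|" : "") tag[i]
        }
        # Header is derived from the tag names of the first record.
        if (!header_done) { print hdr "|code|Description"; header_done = 1 }
        # Assumption: a record without AltError gets two empty columns.
        if (ne == 0) print row "||"
        for (i = 1; i <= ne; i++) print row "|" code[i] "|" desc[i]
        nf = 0; ne = 0
    }
    # A line holding only <ARS> or </ARS> opens or closes a record.
    /^[ \t]*<\/?ARS>[ \t]*$/ {
        if (inrec) emit()
        inrec = !inrec
        next
    }
    # Any other tagged line inside a record is one column.
    inrec && match($0, /<[A-Za-z][A-Za-z0-9_]*/) {
        tag[++nf] = substr($0, RSTART + 1, RLENGTH - 1)
        line = $0
        # Cut out each <AltError .../> and remember its attributes.
        while (match(line, /<AltError[^>]*>/)) {
            err = substr(line, RSTART, RLENGTH)
            line = substr(line, 1, RSTART - 1) substr(line, RSTART + RLENGTH)
            ne++
            code[ne] = attrval(err, "Code")
            desc[ne] = attrval(err, "Description")
        }
        gsub(/<[^>]*>/, "", line)           # strip the remaining tags
        gsub(/^[ \t]+|[ \t]+$/, "", line)   # trim surrounding blanks
        val[nf] = line
    }
    END { if (inrec) emit() }
    ' "$1"
}
```

It could be called as `ars_to_psv input.xml > output.txt` (both file names are placeholders). Since it is regex-based, it will break on XML that spreads an element over several lines or nests records, which is where a real XML parser would be the safer choice.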