Thread: Cut and paste XML into new XML

  1. #1
    Junior Member
    Join Date
    Feb 2014
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default Cut and paste XML into new XML

    First time poster so please direct me if I am in the wrong place.

    I am trying to automate something I have been doing repeatedly with a manual cut and paste. I receive an XML file from my client containing every active "Activity" in his system, which comes to about 45 activities. Of those, only 8 need to be processed by my team. Since everything arrives as a single XML file, I end up processing about 37 sets of irrelevant data. So what I have done in the past is search for an ActivityId within the XML. When I find it, I collapse the enclosing <Activity> tag and copy that collapsed tag to a new XML document. I repeat this until I have all of the ActivityIds I need in the new XML. I then process the new XML and move on. I have to believe that this is a task which is best automated. Here is a snippet of the XML:

    <Activity>
        <RetailFormat>ABC</RetailFormat>
        <FeedDate>2014-02-06 21:01:10</FeedDate>
        <ActivityId>665507</ActivityId>
        <ActivityTitle>ABC 3.9.14 Hawaii </ActivityTitle>
        <StartDate>2014-03-09</StartDate>
        <EndDate>2014-03-15</EndDate>
        <StartTime>00:00:00</StartTime>
        <EndTime>23:59:59</EndTime>
        <JANumber>0</JANumber>
        <PlanItemNo>0</PlanItemNo>
        <ChannelType>Circular</ChannelType>
        <Version>
        </Version>
    </Activity>

    So I would search for 665507, go to the <Activity> tag, collapse it, copy it and paste it into a new XML file, then rinse and repeat. I know the snippet makes the file look small, so why should I be concerned? But the file is 70 MB and each activity is between 3,000 and 15,000 lines. The last file had 1.5 million lines. I would rather deal with 100K lines.

    I have seen postings on various parsers within Java including DOM, SAX and XSLT. I am not sure that I need to parse out specific data but rather create a new subset of the existing data in a new XML. Also, many of the tutorials have the resulting data set in some text format where I need it in XML and that data set could be 15000 lines.

    I would appreciate any assistance.


  2. #2
    Forum VIP
    Join Date
    Jul 2010
    Posts
    1,676
    Thanks
    25
    Thanked 329 Times in 305 Posts

    Default Re: Cut and paste XML into new XML

    My recommendation would be to use an XML parser and create an Activity object for each Activity item you want to pull out. Store all those Activity objects in memory until you are done parsing the file, then write a new file by converting the Activity objects back into XML.

    Basic DOM or SAX parsers should do most of the work for you. If you need a bit of code to make your life easier (as far as parsing the document goes, or printing back out to XML), I can give you an XML utilities class I created for use on my personal projects (just don't modify the class's @author tag).
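As a rough illustration of the DOM route, here is a minimal sketch. It swaps the intermediate Activity objects for a direct copy of each matching <Activity> subtree via Document.importNode, which avoids rebuilding thousands of lines by hand. The class name ActivityFilter and the method filter are hypothetical names made up for this example, and it assumes that <Activity> elements are not nested inside one another and that each one contains an <ActivityId>.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Set;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of any library.
public class ActivityFilter {

    /** Returns XML containing only the activities whose ActivityId is in wantedIds. */
    public static String filter(String xml, Set<String> wantedIds) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document source = builder.parse(new InputSource(new StringReader(xml)));

        // New, empty document that reuses the root tag name of the source
        Document target = builder.newDocument();
        Element root = target.createElement(source.getDocumentElement().getTagName());
        target.appendChild(root);

        // Assumption: Activity elements are never nested inside each other
        NodeList activities = source.getElementsByTagName("Activity");
        for (int i = 0; i < activities.getLength(); i++) {
            Element activity = (Element) activities.item(i);
            NodeList ids = activity.getElementsByTagName("ActivityId");
            if (ids.getLength() > 0
                    && wantedIds.contains(ids.item(0).getTextContent().trim())) {
                // Deep-copy the whole Activity subtree into the new document
                root.appendChild(target.importNode(activity, true));
            }
        }

        // Serialize the new document back to XML text
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(target), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Activities>"
                + "<Activity><ActivityId>665507</ActivityId></Activity>"
                + "<Activity><ActivityId>665508</ActivityId></Activity>"
                + "</Activities>";
        System.out.println(filter(xml, Set.of("665507")));
    }
}
```

For a real file you would probably pass a java.io.File to builder.parse and a file-backed StreamResult to the transformer instead of strings. Keep in mind that DOM holds the whole document in memory, so a 70 MB input may need a larger heap.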
    NOTE TO NEW PEOPLE LOOKING FOR HELP ON FORUM:

    When asking for help, please follow these guidelines to receive better and more prompt help:
    1. Put your code in Java Tags. To do this, put [highlight=java] before your code and [/highlight] after your code.
    2. Give full details of errors and provide us with as much information about the situation as possible.
    3. Give us an example of what the output should look like when done correctly.

    Join the Airline Management Simulation Game to manage your own airline against other users in a virtual recreation of the United States Airline Industry. For more details, visit: http://airlinegame.orgfree.com/

  3. #3
    Junior Member
    Join Date
    Feb 2014
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default Re: Cut and paste XML into new XML

    Please send me the XML utilities class that would work for my project. I will not modify the author tag.

  4. #4
    Forum VIP
    Join Date
    Jul 2010
    Posts
    1,676
    Thanks
    25
    Thanked 329 Times in 305 Posts

    Default Re: Cut and paste XML into new XML

    /**
     * XMLUtilities.java
     * 
     * Mar 22, 2013
     */
     
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import java.io.UnsupportedEncodingException;
    import java.net.URL;
    import java.util.Properties;
     
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerConfigurationException;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
     
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
     
    /**
     * @author Adrian McGrath, APMSoftware
     *	
     */
    public class XMLUtilities {
     
    	static {
    		Properties systemProperties = System.getProperties();
    		systemProperties.remove("javax.xml.parsers.DocumentBuilderFactory");
    		System.setProperties(systemProperties);
    	}
     
    	/**
    	 * Creates an Element with the given name whose text content is the given value
    	 * @param document The Document to use to create the element
    	 * @param name The tag name of the element
    	 * @param value The text content of the element
    	 * @return The created Element
    	 */
    	public static Element createNode(Document document, String name, String value) {
    		Element node = document.createElement(name);
    		node.setTextContent(value);
    		return node;
    	}
     
    	/**
    	 * Gets the String value of a node
    	 * @param node The parent node
    	 * @param key The key for the child node whose value is desired
    	 * @return The String value of a node
    	 */
    	public static String getValueOfNode(Node node, String key) {
    		Element element = getChildElement(node,key);
    		if(element!=null)
    			return element.getTextContent();
    		return null;
    	}
     
    	/**
    	 * Gets the value of an attribute of a node
    	 * @param node The node to look in
    	 * @param key The attribute key
    	 * @return The attribute value, or an empty String if the node is not an Element or has no such attribute
    	 */
    	public static String getAttributeValue(Node node, String key) {
    		if(node instanceof Element) {
    			Element element = (Element)node;
    			return element.getAttribute(key);
    		}
    		return "";
    	}
     
    	/**
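    	 * Gets the first Element with the given tag name beneath the node.
    	 * Note that the whole subtree is searched, not just the direct children.
    	 * @param node The parent node (an Element or a Document)
    	 * @param key The tag name to look for
    	 * @return The first matching Element, or null if there is none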
    	 */
    	public static Element getChildElement(Node node, String key) {
    		if(node.getNodeType() == Node.ELEMENT_NODE) {
    			Element element = (Element)node;
    			NodeList list = element.getElementsByTagName(key);
    			if(list.getLength()==0)
    				return null;
    			Node toReturn = list.item(0);
    			if(toReturn.getNodeType() == Node.ELEMENT_NODE) {
    				return (Element)toReturn;
    			}
    		}
    		else if(node.getNodeType() == Node.DOCUMENT_NODE) {
    			Document document = (Document)node;
    			NodeList list = document.getElementsByTagName(key);
    			if(list.getLength()==0)
    				return null;
    			Node toReturn = list.item(0);
    			if(toReturn.getNodeType() == Node.ELEMENT_NODE) {
    				return (Element)toReturn;
    			}
    		}
    		return null;
    	}
     
    	/**
    	 * Gets the Node in the NodeList that has the desired key and value
    	 * @param list The node list to search through
    	 * @param key The name of the child node
    	 * @param value The value of the child node
    	 * @return The node being searched for
    	 */
    	public static Node getElementWithChild(NodeList list, String key, String value) {
    		for(int i=0;i<list.getLength();i++) {
    			Node item = list.item(i);
    			String child = getValueOfNode(item,key);
    			if(value.equals(child))
    				return item;
    		}
    		return null;
    	}
     
    	/**
    	 * Method that simply prints the document for debugging purposes
    	 * @param doc The document to print
    	 * @param out The output stream to print to
    	 * @return <b>true</b> if successful, <b>false</b> otherwise
    	 */
    	public static boolean printDocument(Document doc, OutputStream out) {
    	    TransformerFactory tf = TransformerFactory.newInstance();
    	    Transformer transformer = null;
    		try {
    			transformer = tf.newTransformer();
    		} catch (TransformerConfigurationException e) {
    			e.printStackTrace();
    			return false;
    		}
    	    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
    	    transformer.setOutputProperty(OutputKeys.METHOD, "xml");
    	    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    	    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    	    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
     
    	    try {
    			transformer.transform(new DOMSource(doc),new StreamResult(new OutputStreamWriter(out, "UTF-8")));
    		} catch (UnsupportedEncodingException e) {
    			e.printStackTrace();
    			return false;
    		} catch (TransformerException e) {
    			e.printStackTrace();
    			return false;
    		}
    	    return true;
    	}
     
    	/**
    	 * Prints the Node for debugging purposes
    	 * @param doc The Node to print
    	 * @param out The output stream to print to
    	 */
    	public static void printDocument(Node doc, OutputStream out) {
    	    TransformerFactory tf = TransformerFactory.newInstance();
    	    Transformer transformer = null;
    		try {
    			transformer = tf.newTransformer();
    		} catch (TransformerConfigurationException e) {
    			e.printStackTrace();
    			return;
    		}
    	    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
    	    transformer.setOutputProperty(OutputKeys.METHOD, "xml");
    	    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    	    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    	    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
     
    	    try {
    			transformer.transform(new DOMSource(doc), 
    			     new StreamResult(new OutputStreamWriter(out, "UTF-8")));
    		} catch (UnsupportedEncodingException e) {
    			e.printStackTrace();
    		} catch (TransformerException e) {
    			e.printStackTrace();
    		}
    	}
     
    	/**
    	 * Returns a Document from the given URL
    	 * @param xml The url which contains the xml
    	 * @return A document of the URL
    	 * @throws Exception Any problems loading the document
    	 */
    	public static Document loadXMLFromURL(URL xml) throws Exception {
    		System.setProperty("javax.xml.parsers.DocumentBuilderFactory", "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl"); 
    		DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
     
    	    factory.setNamespaceAware(true);
    	    DocumentBuilder builder = factory.newDocumentBuilder();
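    	    // Close the stream once parsing finishes so the connection is not leaked
    	    try (java.io.InputStream in = xml.openStream()) {
    	        return builder.parse(in);
    	    }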
    	}
    }

    It provides methods for everything from reading (from a URL) to writing (to an output stream), as well as some common parsing methods.
    Let's say we have this document:
    <Activities>
        <Activity>
            <RetailFormat>ABC</RetailFormat>
            <FeedDate>2014-02-06 21:01:10</FeedDate>
            <ActivityId>665507</ActivityId>
            <ActivityTitle>ABC 3.9.14 Hawaii </ActivityTitle>
            <StartDate>2014-03-09</StartDate>
            <EndDate>2014-03-15</EndDate>
            <StartTime>00:00:00</StartTime>
            <EndTime>23:59:59</EndTime>
            <JANumber>0</JANumber>
            <PlanItemNo>0</PlanItemNo>
            <ChannelType>Circular</ChannelType>
            <Version>
            </Version>
        </Activity>
    </Activities>
    When you read it into a document, you can get a list of the Activity elements by calling the Document's getElementsByTagName("Activity") method. Once you have the list of elements, you can loop through them and read each activity's ID with:
    // element is the Activity node you get by looping through the list of activity nodes
    String activityId = XMLUtilities.getValueOfNode(element,"ActivityId");
    Then, later when you want to create a new Document, you just say:
    // document is the document object you are creating
    Element activityIdElement = XMLUtilities.createNode(document,"ActivityId","665507");
    // then add the activityIdElement where you want to in the document
    ...
    // when you are ready to output the document, create an output stream and call:
    boolean success = XMLUtilities.printDocument(document,outputStream);

    I can expand on anything that is not clear. The class is just meant to simplify the use of the existing Java XML classes.
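To make the lookup step above concrete, here is a small sketch of looping over the Activity elements and collecting their IDs using only the standard DOM API (Document.getElementsByTagName and Node.getTextContent). The class name ActivityIdLister and its two methods are hypothetical names invented for this example, and it assumes each <Activity> holds at most one <ActivityId>.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of XMLUtilities.
public class ActivityIdLister {

    /** Parses an XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Collects the text of the first ActivityId found under each Activity element. */
    public static List<String> listActivityIds(Document document) {
        List<String> ids = new ArrayList<>();
        NodeList activities = document.getElementsByTagName("Activity");
        for (int i = 0; i < activities.getLength(); i++) {
            Element activity = (Element) activities.item(i);
            NodeList idNodes = activity.getElementsByTagName("ActivityId");
            if (idNodes.getLength() > 0) {
                // trim() guards against stray whitespace around the ID
                ids.add(idNodes.item(0).getTextContent().trim());
            }
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<Activities>"
                + "<Activity><ActivityId>665507</ActivityId></Activity>"
                + "<Activity><ActivityId>665508</ActivityId></Activity>"
                + "</Activities>");
        System.out.println(listActivityIds(doc));
    }
}
```

Printing the IDs first is a cheap way to confirm the file parses and the tag names match before you start building the new document.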
