The Document Object Model (DOM) is exposed in Java as a collection of interfaces that make it easy to access and modify the structure and content of an XML file. In this tutorial, we will learn how to read an XML file using DOM in Java.
First, I’ll list some of the interfaces which we use when working with DOM:
- org.w3c.dom.Document
- org.w3c.dom.Node
- org.w3c.dom.Element
- org.w3c.dom.Attr
- org.w3c.dom.Text
Each of these interfaces is implemented by classes that the XML parser provides; which of them we meet depends on the structure of the XML content we need to process.
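To make the roles of these interfaces a bit more concrete, here is a minimal sketch that parses a one-student fragment from an in-memory string and picks out one node of each kind. The class name InterfaceTour, the parse helper, and the inline XML are assumptions made for illustration only; they are not part of the project described here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class InterfaceTour {

    // Hypothetical helper: parses XML held in a String instead of a file.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<student n0='1'><name>John</name></student>");

        Element student = doc.getDocumentElement();       // Element: the <student> tag
        Attr n0 = student.getAttributeNode("n0");         // Attr: the n0='1' attribute
        Element name = (Element) student.getFirstChild(); // Element: the <name> tag
        Text text = (Text) name.getFirstChild();          // Text: the characters inside <name>

        // Document, Element, Attr and Text all extend Node,
        // so they can be handled uniformly through that interface.
        Node[] nodes = {doc, student, n0, name, text};
        for (Node node : nodes) {
            System.out.println(node.getNodeName() + " -> node type " + node.getNodeType());
        }
    }
}
```

The casts work here only because the sample string has no whitespace between tags; with indented XML the first child of an element is usually a whitespace Text node.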
Next, I will create and work with a Maven project to illustrate reading an XML file using the DOM.
I have the following project:
The XML file we need to read is students.xml with the following contents:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <students>
      <student n0='1'>
        <name>John</name>
        <code>12345</code>
        <age>19</age>
      </student>
      <student n0='2'>
        <name>Marry</name>
        <code>23456</code>
        <age>24</age>
      </student>
    </students>
This file contains information about two students, including the name, code, and age of each student. Each student is numbered through the n0 attribute.
OK, now we start reading this file!
First, to be able to read the students.xml file, we need to create a File object for it:
    File f = new File("students.xml");
Next, we need to parse this File object into a Document object, which holds the content of the XML file:
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(f);
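If you want to try these steps without setting up the project first, the following sketch may help. It writes a small sample to a temporary file and parses it the same way, and it also shows that DocumentBuilder.parse accepts an InputStream as well as a File. The class name DocumentLoading, the helpers fromFile and fromStream, and the shortened sample XML are assumptions of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class DocumentLoading {

    // The same factory -> builder -> parse steps, wrapped in a hypothetical helper.
    static Document fromFile(File file) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse " + file, e);
        }
    }

    // DocumentBuilder.parse is overloaded; this variant reads from a stream.
    static Document fromStream(InputStream in) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse stream", e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<students><student n0='1'/></students>".getBytes(StandardCharsets.UTF_8);

        // Write the sample to a temporary file so the File-based path can run anywhere.
        Path tmp = Files.createTempFile("students", ".xml");
        Files.write(tmp, xml);
        try {
            Document fromDisk = fromFile(tmp.toFile());
            Document fromMemory = fromStream(new ByteArrayInputStream(xml));
            // Both documents have the same root tag name.
            System.out.println(fromDisk.getDocumentElement().getTagName());
            System.out.println(fromMemory.getDocumentElement().getTagName());
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Wrapping the checked exceptions in IllegalStateException is just one possible choice for a demo; the tutorial's own code declares them on main instead.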
To read the XML file with DOM, we read its tags from the outside in, following their hierarchical order.
In the students.xml file, the <students> tag is the outermost tag, also known as the root element. To read this tag we use the following method:
    Element students = doc.getDocumentElement();
In DOM, a tag is represented as an element.
Now that we have the <students> element, we can use it to retrieve the <student> elements it contains:
    NodeList studentList = students.getElementsByTagName("student");
The NodeList object contains the two <student> elements found in the XML file. Now we will go through each of them to read its information:
    for (int i = 0; i < studentList.getLength(); i++) {
        Node node = studentList.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element student = (Element) node;
        }
    }
A Node is any node in our DOM tree. We need to cast it to Element before we can use element-specific methods on it.
OK, now we have the object containing the <student> tag information. The task now is to read it.
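The node-type check in the loop is mostly defensive, because getElementsByTagName only returns elements. It becomes essential when you walk getChildNodes() instead, since with a default parser the whitespace between tags shows up as Text nodes. Here is a small sketch of that difference; the class name ChildNodeKinds, its helpers, and the shortened inline XML are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ChildNodeKinds {

    // Hypothetical helper: parses XML held in a String instead of a file.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Counts only those direct children of parent that are elements.
    static int countElementChildren(Element parent) {
        int count = 0;
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<students>\n"
                + "  <student n0='1'/>\n"
                + "  <student n0='2'/>\n"
                + "</students>";
        Element students = parse(xml).getDocumentElement();

        // getChildNodes() includes the whitespace Text nodes between the tags.
        System.out.println("all child nodes:  " + students.getChildNodes().getLength());
        // Filtering by node type leaves only the <student> elements.
        System.out.println("element children: " + countElementChildren(students));
        // getElementsByTagName never returns Text nodes in the first place.
        System.out.println("by tag name:      "
                + students.getElementsByTagName("student").getLength());
    }
}
```

Note also that getElementsByTagName searches all descendants, not just direct children, which does not matter for students.xml but can for more deeply nested files.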
To read the “n0” attribute, we simply do the following:
    student.getAttribute("n0");
where n0 is the name of the attribute.
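One detail worth knowing: getAttribute returns an empty string, not null, when the attribute is absent, so hasAttribute is the way to tell the two cases apart. The sketch below shows one possible way to read n0 as a number. The class name AttributeReading, the studentNumber helper, and the choice of OptionalInt are assumptions of this example, not something the project requires.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.OptionalInt;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class AttributeReading {

    // Hypothetical helper: parses XML held in a String instead of a file.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the n0 attribute as a number, or empty when the attribute is absent.
    // A non-numeric value would still make Integer.parseInt throw NumberFormatException.
    static OptionalInt studentNumber(Element student) {
        if (!student.hasAttribute("n0")) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(student.getAttribute("n0")));
    }

    public static void main(String[] args) {
        Element numbered = parse("<student n0='1'/>").getDocumentElement();
        Element unnumbered = parse("<student/>").getDocumentElement();

        System.out.println(studentNumber(numbered).isPresent());   // prints "true"
        System.out.println(studentNumber(unnumbered).isPresent()); // prints "false"
        // A missing attribute reads as "", never as null.
        System.out.println(unnumbered.getAttribute("n0").isEmpty()); // prints "true"
    }
}
```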
Because every <student> tag has exactly one child tag each for the student's name, code, and age, we can read this information as follows:
- <name> tag:
    student.getElementsByTagName("name").item(0).getTextContent();
- <code> tag:
    student.getElementsByTagName("code").item(0).getTextContent();
- <age> tag:
    student.getElementsByTagName("age").item(0).getTextContent();
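The three calls above rely on each child tag being present: item(0) returns null when nothing matches, and the chained getTextContent() would then throw a NullPointerException. If your XML might omit a tag, a small helper like the following sketch can guard against that. The class name ChildText and the childText helper are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ChildText {

    // Hypothetical helper: parses XML held in a String instead of a file.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Text of the first descendant element named tag, or empty when there is none.
    static Optional<String> childText(Element parent, String tag) {
        Node first = parent.getElementsByTagName(tag).item(0); // null when no match
        return Optional.ofNullable(first)
                .map(Node::getTextContent)
                .map(String::trim);
    }

    public static void main(String[] args) {
        // This sample deliberately leaves out the <code> tag.
        Element student = parse(
                "<student n0='1'><name>John</name><age>19</age></student>")
                .getDocumentElement();

        System.out.println(childText(student, "name").orElse("(missing)")); // prints "John"
        System.out.println(childText(student, "code").orElse("(missing)")); // prints "(missing)"
    }
}
```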
OK, we have finished reading the students.xml file. You can refer to the full code below:
    package com.huongdanjava.dom;

    import java.io.File;
    import java.io.IOException;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;

    public class DOMExample {

        public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
            File f = new File("students.xml");

            // Parse the file into a DOM Document
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(f);

            // Root element: <students>
            Element students = doc.getDocumentElement();

            // All <student> elements under the root
            NodeList studentList = students.getElementsByTagName("student");

            for (int i = 0; i < studentList.getLength(); i++) {
                Node node = studentList.item(i);

                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    Element student = (Element) node;

                    System.out.println("n0: " + student.getAttribute("n0"));
                    System.out.println("name: " + student.getElementsByTagName("name").item(0).getTextContent());
                    System.out.println("code: " + student.getElementsByTagName("code").item(0).getTextContent());
                    System.out.println("age: " + student.getElementsByTagName("age").item(0).getTextContent());
                    System.out.println("\n");
                }
            }
        }
    }
Result: