To check whether the tags and attributes in an XML file match your requirements, you can use an XSD Schema file. What is it? Let's find out in this tutorial.
Suppose you are working on an application that processes a simple XML file with the following structure:
<student>
    <id>1</id>
    <name>Khanh</name>
    <address>Viet Nam</address>
</student>
In particular, the <id> tag must exist and its value must be of type int, and the value of the <address> tag must be at most 20 characters long.
So how can you ensure that the XML files your program handles meet these requirements? Just use an XSD Schema file :D
An XSD Schema file is itself an XML file, and it does the following:
- Defines the tags and attributes allowed in an XML file.
- Specifies which tags and attributes are required and which are optional.
- Specifies whether a tag must have a value.
- Defines the data types of tags and attributes.
- Defines default values for attributes.
Thanks to these features, we can use an XSD Schema file to validate whether the content of an XML file is exactly what we expect.
Now, I will show you how to build an XSD Schema file to satisfy the needs of the above example.
First, I create a Maven project with two files, student.xml and student.xsd, in the /src/main/resources/ directory.
The content of the student.xml file is the same as shown above.
Now I’m going to build an XSD Schema file to validate our student.xml file.
First, we need to declare the outermost tag of the schema file, named <schema>. This is the root tag, and it takes a few attributes, as follows:
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://huongdanjava.com/student/"
    xmlns="http://www.w3.org/2001/XMLSchema">
</schema>
Here:
- targetNamespace: states that the elements declared at the top level of this schema belong to the “http://huongdanjava.com/student/” namespace.
- xmlns: sets “http://www.w3.org/2001/XMLSchema” as the default namespace of this schema file, so XSD tags such as <element> can be written without a prefix.
Next, we will start defining the element for the <student> tag.
There are two types of element declarations in an XSD Schema file: Simple Element and Complex Element.
Simple Element is used to define tags that hold a value of a simple data type such as string, integer, boolean, and so on. They do not contain child tags or attributes.
Commonly used built-in simple data types include:
- string: text
- boolean: true / false
- decimal: decimal numbers
- integer: whole numbers of any size
- int: 32-bit whole numbers
- dateTime: date and time
- time: time of day
- base64Binary, hexBinary: binary data
- anyURI: URI strings
For example, if our XML file only has the following content:
<name>Khanh</name>
you can declare it as a Simple Element.
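To make the idea concrete, here is a minimal Java sketch that validates a few one-tag documents against simple element declarations. It relies on the javax.xml.validation API that ships with the JDK; the class name SimpleElementDemo, the helper isValid, and the small two-element schema are assumptions made up for this illustration, not part of the tutorial's project.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class SimpleElementDemo {

    // Hypothetical schema with two simple elements and no target namespace.
    // Single quotes are used for XML attributes to avoid escaping in Java.
    static final String XSD =
          "<schema xmlns='http://www.w3.org/2001/XMLSchema'>"
        + "  <element name='name' type='string'/>"
        + "  <element name='id' type='int'/>"
        + "</schema>";

    /** Returns true when the XML text validates against the XSD text. */
    static boolean isValid(String xsd, String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document is invalid or not well-formed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A simple element holding plain text.
        System.out.println(isValid(XSD, "<name>Khanh</name>"));
        // A simple element may not carry attributes.
        System.out.println(isValid(XSD, "<name id='1'>Khanh</name>"));
        // The int type accepts whole numbers only.
        System.out.println(isValid(XSD, "<id>1</id>"));
        System.out.println(isValid(XSD, "<id>one</id>"));
    }
}
```

The second and fourth documents are rejected: the first because a simple element cannot have attributes, the second because "one" is not an int.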
Complex Element is used to define tags that contain child tags or attributes. There are four kinds:
- An empty tag that has only attributes.
For example:
<name id="1"/>
- A tag that contains only child tags.
For example:
<student id="1">
    <name>Quan</name>
    <address>456</address>
</student>
- A tag that contains only text, together with attributes.
For example:
<name id="2">Quan</name>
- A tag that contains both child tags and text.
For example:
<student id="1">
    Name of student is: <name>Quan</name>
</student>
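The four kinds above can each be declared in XSD. The following Java sketch shows one possible declaration per kind and checks a sample document against it. The element names emptyName, plainStudent, textName and mixedStudent are hypothetical (a schema cannot declare two top-level elements with the same name), and the constructs <attribute>, <simpleContent>, <extension> and mixed="true" are standard XSD features that this tutorial does not cover further.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class ComplexKindsDemo {

    // Hypothetical element names; one declaration per kind of complex element.
    static final String XSD =
          "<schema xmlns='http://www.w3.org/2001/XMLSchema'>"
        // 1. Empty tag with only an attribute.
        + "<element name='emptyName'><complexType>"
        + "  <attribute name='id' type='int'/>"
        + "</complexType></element>"
        // 2. Child tags only.
        + "<element name='plainStudent'><complexType>"
        + "  <sequence><element name='name' type='string'/></sequence>"
        + "  <attribute name='id' type='int'/>"
        + "</complexType></element>"
        // 3. Text only, plus an attribute.
        + "<element name='textName'><complexType><simpleContent>"
        + "  <extension base='string'><attribute name='id' type='int'/></extension>"
        + "</simpleContent></complexType></element>"
        // 4. Child tags mixed with text.
        + "<element name='mixedStudent'><complexType mixed='true'>"
        + "  <sequence><element name='name' type='string'/></sequence>"
        + "  <attribute name='id' type='int'/>"
        + "</complexType></element>"
        + "</schema>";

    /** Returns true when the XML text validates against the XSD text. */
    static boolean isValid(String xsd, String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document is invalid or not well-formed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String mixedBody = "Name of student is: <name>Quan</name>";
        System.out.println(isValid(XSD, "<emptyName id='1'/>"));
        System.out.println(isValid(XSD, "<plainStudent id='1'><name>Quan</name></plainStudent>"));
        System.out.println(isValid(XSD, "<textName id='2'>Quan</textName>"));
        System.out.println(isValid(XSD, "<mixedStudent id='1'>" + mixedBody + "</mixedStudent>"));
        // Text between child tags is rejected unless the type is declared mixed.
        System.out.println(isValid(XSD, "<plainStudent id='1'>" + mixedBody + "</plainStudent>"));
    }
}
```

Only the last document fails, because plainStudent is not declared with mixed="true" and therefore may not contain text next to its child tags.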
In the example of this tutorial, the <student> tag contains child tags, so we declare the student element as a Complex Element as follows:
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://huongdanjava.com/student/"
    xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="student">
        <complexType>
            <sequence>
            </sequence>
        </complexType>
    </element>
</schema>
As you can see, I have declared the student element using the <complexType> tag and the <sequence> tag. Inside the <sequence> tag go the declarations of the child tags of <student>, in the order in which they must appear.
To declare an element for the <id> tag within the <student> tag, recall the two requirements: the tag must be present and its value must be of type int.
To force a tag to appear in an XML file, you can use the element’s minOccurs attribute with a value of 1 (this is also the default when minOccurs is omitted). To declare a data type for a tag, we use the element’s type attribute. Combining these two requirements, I declare the following for the element of the <id> tag:
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://huongdanjava.com/student/"
    xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="student">
        <complexType>
            <sequence>
                <element name="id" type="int" minOccurs="1"></element>
            </sequence>
        </complexType>
    </element>
</schema>
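As a quick check of what minOccurs does, the Java sketch below compiles the schema above twice, once with minOccurs set to 1 and once with 0, and validates a student with and without an <id> tag. The class name RequiredIdDemo and its helper methods are assumptions for illustration; the validation itself uses the JDK's javax.xml.validation API, with the schema and documents held in memory as strings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class RequiredIdDemo {

    static final String NS = "http://huongdanjava.com/student/";

    // Sample documents: only the root tag is prefixed, as in this tutorial.
    static final String WITH_ID = "<st:student xmlns:st='" + NS + "'><id>1</id></st:student>";
    static final String WITHOUT_ID = "<st:student xmlns:st='" + NS + "'/>";

    /** Builds the schema from the tutorial with a configurable minOccurs on <id>. */
    static String schemaWithIdMinOccurs(int minOccurs) {
        return "<schema targetNamespace='" + NS + "'"
             + "        xmlns='http://www.w3.org/2001/XMLSchema'>"
             + "  <element name='student'><complexType><sequence>"
             + "    <element name='id' type='int' minOccurs='" + minOccurs + "'/>"
             + "  </sequence></complexType></element>"
             + "</schema>";
    }

    /** Returns true when the XML text validates against the XSD text. */
    static boolean isValid(String xsd, String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document is invalid or not well-formed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // minOccurs=1: the <id> tag is mandatory.
        System.out.println(isValid(schemaWithIdMinOccurs(1), WITH_ID));
        System.out.println(isValid(schemaWithIdMinOccurs(1), WITHOUT_ID));
        // minOccurs=0: the <id> tag becomes optional.
        System.out.println(isValid(schemaWithIdMinOccurs(0), WITHOUT_ID));
    }
}
```

With minOccurs set to 1 the document without <id> is rejected, while with 0 the same document is accepted.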
For the <name> tag, the only requirement is the string data type, so we declare the element for this tag as follows:
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://huongdanjava.com/student/"
    xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="student">
        <complexType>
            <sequence>
                <element name="id" type="int" minOccurs="1"></element>
                <element name="name" type="string"></element>
            </sequence>
        </complexType>
    </element>
</schema>
As for the <address> tag, its value must be limited to a maximum length of 20 characters.
For this requirement, when declaring the element for the <address> tag, define a <simpleType> that uses the <restriction> tag with the <maxLength> tag as follows:
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://huongdanjava.com/student/"
    xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="student">
        <complexType>
            <sequence>
                <element name="id" type="int" minOccurs="1"></element>
                <element name="name" type="string"></element>
                <element name="address">
                    <simpleType>
                        <restriction base="string">
                            <maxLength value="20"></maxLength>
                        </restriction>
                    </simpleType>
                </element>
            </sequence>
        </complexType>
    </element>
</schema>
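To see exactly where the maxLength limit falls, here is a small Java sketch that isolates the <address> declaration in a schema of its own and validates values of different lengths. The class name AddressLengthDemo and the stand-alone, namespace-free schema are assumptions made for this illustration; only the <restriction> and <maxLength> part is taken from the schema above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class AddressLengthDemo {

    // Hypothetical stand-alone schema containing only the address restriction.
    static final String XSD =
          "<schema xmlns='http://www.w3.org/2001/XMLSchema'>"
        + "  <element name='address'><simpleType>"
        + "    <restriction base='string'><maxLength value='20'/></restriction>"
        + "  </simpleType></element>"
        + "</schema>";

    /** Wraps a value in an <address> tag. */
    static String address(String value) {
        return "<address>" + value + "</address>";
    }

    /** Returns true when the XML text validates against the XSD text. */
    static boolean isValid(String xsd, String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document is invalid or not well-formed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String twenty = "12345678901234567890";      // exactly 20 characters
        String twentyOne = twenty + "1";              // 21 characters
        System.out.println(isValid(XSD, address("Viet Nam")));
        System.out.println(isValid(XSD, address(twenty)));
        System.out.println(isValid(XSD, address(twentyOne)));
    }
}
```

A value of exactly 20 characters still passes; the first rejected length is 21, because maxLength is an inclusive upper bound.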
At this point, we have finished building the XSD Schema file to validate the contents of the student.xml file.
So far, the student.xml file does not reference student.xsd, so nothing is validated and the file shows no problems.
But if we reference student.xsd from student.xml, by editing student.xml a bit to add the namespace and the location of the XSD Schema file, as follows:
<st:student xmlns:st="http://huongdanjava.com/student/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://huongdanjava.com/student/ student.xsd">
    <id>1</id>
    <name>Khanh</name>
    <address>Viet Nam</address>
</st:student>
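then the file is checked against the schema, and you will see an error whenever its content breaks one of the rules above, for example when the value of <id> is not an integer or when the value of <address> is longer than 20 characters.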
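In the student.xml file above, I used the xsi:schemaLocation attribute to point to the student.xsd file: its value pairs the namespace URI with the location of the schema that describes that namespace. Only the root tag carries the st: prefix because, by default, elements declared locally inside a complex type are not namespace-qualified.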
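In a Java program, the same check can be done with the javax.xml.validation API that ships with the JDK. The sketch below is only an illustration under a few assumptions: the class name StudentValidationDemo and its helpers are made up, the schema and the documents are kept in memory as strings instead of being read from /src/main/resources/, and the schema is handed to the validator directly, so the xsi:schemaLocation attribute is left out of the sample documents.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class StudentValidationDemo {

    static final String NS = "http://huongdanjava.com/student/";

    // The complete student.xsd from this tutorial, written compactly.
    static final String XSD =
          "<schema targetNamespace='" + NS + "'"
        + "        xmlns='http://www.w3.org/2001/XMLSchema'>"
        + "  <element name='student'><complexType><sequence>"
        + "    <element name='id' type='int' minOccurs='1'/>"
        + "    <element name='name' type='string'/>"
        + "    <element name='address'><simpleType><restriction base='string'>"
        + "      <maxLength value='20'/>"
        + "    </restriction></simpleType></element>"
        + "  </sequence></complexType></element>"
        + "</schema>";

    /** Builds a student document; rootPrefix and childPrefix are "" or "st:". */
    static String student(String rootPrefix, String childPrefix, String id) {
        return "<" + rootPrefix + "student xmlns:st='" + NS + "'>"
             + "<" + childPrefix + "id>" + id + "</" + childPrefix + "id>"
             + "<" + childPrefix + "name>Khanh</" + childPrefix + "name>"
             + "<" + childPrefix + "address>Viet Nam</" + childPrefix + "address>"
             + "</" + rootPrefix + "student>";
    }

    /** Returns true when the XML text validates against the XSD text. */
    static boolean isValid(String xsd, String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document is invalid or not well-formed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prefixed root, unprefixed children: the shape used in this tutorial.
        System.out.println(isValid(XSD, student("st:", "", "1")));
        // A non-integer id breaks the int rule.
        System.out.println(isValid(XSD, student("st:", "", "abc")));
        // A root without the namespace is unknown to the schema.
        System.out.println(isValid(XSD, student("", "", "1")));
        // Prefixed children do not match the locally declared elements.
        System.out.println(isValid(XSD, student("st:", "st:", "1")));
    }
}
```

Only the first document is accepted. The last two cases show why the namespace prefix belongs on the root tag alone with this schema.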