XXE in Java
Vulnerability: XML External Entities (XXE)
Vulnerable Code:
// Default factory: DOCTYPE declarations and external entities are processed.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new File("input.xml"));
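Reason for vulnerability: The default DocumentBuilderFactory accepts DOCTYPE declarations and resolves external entities. An attacker who controls input.xml can declare an entity that points at a local file or an internal URL, which lets them read arbitrary files or perform server-side request forgery (SSRF).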
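To make the attack concrete, here is a minimal sketch of an XXE payload and a small Python helper that lists the external entities a document declares. It relies only on the standard library's xml.parsers.expat, which reports entity declarations without fetching them when no external-entity handler is installed. The payload and the helper name external_entities are illustrative assumptions, not part of the original code.

```python
import xml.parsers.expat

# Illustrative payload: the DTD declares an entity whose value is a local file.
# A parser that resolves external entities would splice that file into <name>.
XXE_PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE user [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<user><name>&xxe;</name></user>
"""


def external_entities(xml_text):
    """Return (name, system_id) pairs for external entities declared in the DTD."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # External entities carry a system identifier instead of an inline value.
        if system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return found
```

Swapping the file URI for an http:// address pointing at an internal service turns the same payload into an SSRF probe.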
Fixed Code:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Reject any document that contains a DOCTYPE declaration.
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new File("input.xml"));
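Reason for fix: With disallow-doctype-decl enabled, the parser throws an exception as soon as it encounters a DOCTYPE declaration. Without a DOCTYPE, no entity (internal or external) can be declared, which mitigates XXE vulnerabilities.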
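The same reject-any-DOCTYPE policy can be sketched in Python with the standard library. This is only an analogue of the Java feature flag, not how the Java parser implements it; the names DoctypeForbidden and parse_without_doctype are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an untrusted document contains a DOCTYPE declaration."""


def parse_without_doctype(xml_text):
    """Parse xml_text, refusing any document that carries a DOCTYPE."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Raising here aborts the parse before any entity is declared.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    # First pass: a bare expat parser whose only job is to reject DOCTYPEs.
    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = on_doctype
    checker.Parse(xml_text, True)

    # Second pass: the document is known to be DTD-free, so build the tree.
    return ET.fromstring(xml_text)
```

The trade-off is the same as in Java: documents that legitimately need a DTD are rejected too, so this approach fits inputs where DTDs are never expected.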
Example 2: Python
Vulnerable Code:
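from lxml import etree as ET  # assumed: the fix below uses an lxml-only keyword

tree = ET.parse('file.xml')
Reason for vulnerability: ET is assumed here to be lxml.etree, because the resolve_entities argument used in the fix exists only on lxml's XMLParser. Older lxml releases resolve external entities by default (the default was tightened in lxml 5.0), allowing XXE attacks. The standard library's xml.etree.ElementTree does not fetch external entities.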
Fixed Code:
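from lxml import etree as ET

parser = ET.XMLParser(resolve_entities=False)
tree = ET.parse('file.xml', parser=parser)
Reason for fix: Passing resolve_entities=False tells lxml to leave entity references unexpanded, so external entities are never fetched and XXE attacks are prevented.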
Java Example
Vulnerable Code:
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XMLParser {
    public Document parseXML(String xmlString) throws Exception {
        // Default configuration: external entities are resolved.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xmlString)));
    }
}
Reason for Vulnerability:
This code uses the default configuration of DocumentBuilderFactory, which allows external entity resolution, potentially leading to XXE attacks.
Fixed Code:
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XMLParser {
    public Document parseXML(String xmlString) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject any document that contains a DOCTYPE declaration.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth: keep external entities off even if DOCTYPEs are re-enabled later.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xmlString)));
    }
}
Reason for Fix:
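The fixed code rejects DOCTYPE declarations outright and, as defense in depth, turns off external general entities, external parameter entities, XInclude processing, and entity reference expansion, preventing XXE attacks.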
Java Example
Vulnerable Code:
import java.io.InputStream;
import org.dom4j.Document;
import org.dom4j.io.SAXReader;

public class XMLProcessor {
    public Document processXML(InputStream xmlStream) throws Exception {
        // Default SAXReader: external entities are resolved.
        SAXReader reader = new SAXReader();
        return reader.read(xmlStream);
    }
}
Reason for Vulnerability:
This code uses the default configuration of SAXReader, which allows external entity resolution, potentially leading to XXE attacks.
Fixed Code:
import java.io.InputStream;
import org.dom4j.Document;
import org.dom4j.io.SAXReader;

public class XMLProcessor {
    public Document processXML(InputStream xmlStream) throws Exception {
        SAXReader reader = new SAXReader();
        // Reject DOCTYPE declarations, then disable external entities as a second layer.
        reader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
        reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return reader.read(xmlStream);
    }
}
Reason for Fix:
The fixed code disables external entity resolution and DOCTYPE declarations, preventing XXE attacks when using dom4j.
Python Example
Vulnerable Code:
from lxml import etree

def parse_xml(xml_string):
    parser = etree.XMLParser()
    root = etree.fromstring(xml_string, parser)
    return root
Reason for Vulnerability:
This code uses the default XMLParser from lxml. In older lxml releases the default parser expands entities and resolves external ones (the default was tightened in lxml 5.0), potentially leading to XXE attacks.
Fixed Code:
from lxml import etree

def parse_xml(xml_string):
    parser = etree.XMLParser(resolve_entities=False, no_network=True, dtd_validation=False)
    root = etree.fromstring(xml_string, parser)
    return root
Reason for Fix:
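The fixed code configures the XMLParser to disable entity resolution, network access, and DTD validation, preventing XXE attacks when using lxml. resolve_entities=False is the setting that blocks entity expansion; no_network=True and dtd_validation=False match lxml's defaults and are stated explicitly so the intent survives future changes.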
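One way to check that a fix like the ones above actually holds is a small regression test that plants a canary file and feeds the parser a payload pointing at it. The sketch below is an assumption-laden illustration, not part of the original examples: leaks_canary, stdlib_parse, and the CANARY value are hypothetical names, and the harness treats any parser exception as a safe outcome.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

CANARY = "xxe-canary-7f3a"  # arbitrary marker; any unlikely string works


def leaks_canary(parse_to_text):
    """Return True if parse_to_text(payload) exposes the canary file's contents.

    parse_to_text is any callable that takes an XML string and returns the
    parsed document serialized back to text. A parser that raises is treated
    as safe, because the file contents never reach the caller.
    """
    with tempfile.TemporaryDirectory() as tmp:
        secret = Path(tmp) / "secret.txt"
        secret.write_text(CANARY)
        payload = (
            f'<!DOCTYPE r [<!ENTITY xxe SYSTEM "{secret.as_uri()}">]>'
            "<r>&xxe;</r>"
        )
        try:
            return CANARY in parse_to_text(payload)
        except Exception:
            return False


def stdlib_parse(xml_text):
    # xml.etree.ElementTree does not fetch external entities.
    return ET.tostring(ET.fromstring(xml_text), encoding="unicode")
```

To exercise one of the lxml functions instead, wrap it in a callable that serializes the returned element to a string and pass that wrapper to leaks_canary; a result of True means the parser read the canary file.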