XML External Entity Injection in PyWPS
PyWPS is an implementation of the Web Processing Service standard from the Open Geospatial Consortium. PyWPS is written in Python.
PyWPS was started by Jachym Cepicky as part of his project ‘Connecting of GRASS GIS with UMN MapServer’, supported by the German Foundation for Environment. He began to work on this project with a scholarship by GDF-Hannover that went from April to September of 2006.
PyWPS enables integration, publishing and execution of Python processes via the WPS standard.
PyWPS is Open Source and released under an MIT license.
The why
While bug hunting at a university that wishes to remain anonymous, I discovered that several of the hosted apps were handling huge geo data sets. The two apps that caught my eye were front ends for viewing historical weather data and data relating to farming ground fertility respectively.
The apps both consisted of a map view, and multiple overlays and filtering options for displaying the fetched data. The sheer number of user inputs available and the complexity of the apps were enough to pique my interest.
The Recon and initial finding
I decided to drill into one of the apps and see what I could find. I chose the farming-related app because... well, I live in the UK. The last thing I want is to get knee-deep in weather data for another cold, rainy place.
All of the intended input fields were well sanitized and where possible, limited to dropdown menus. In fact, half an hour into poking around I was starting to think I would not find a way to break anything.
I opened the web app in Zaproxy's built-in browser and interacted with it in every way I could. I had already tested most of the inputs manually to no avail, so my reason for proxying the app was to gain a deeper understanding of how requests for data were being handled under the hood.
I discovered that when the app first loaded, and whenever I selected a new data set, a query was made to another subdomain, which returned very large responses. The requests were being sent to a URL that looked like this:
“https://wps.domain.com:8081/?service=WPS&request=BlahBlahBlah“
Tampering with the parameters to try to manipulate responses was fruitless at first. Again, I was faced with solid input filtering. Anything out of place in the queries would simply result in a status code 400. If I passed an unkeyed or incorrect parameter or value via a POST request, the parameter name and value would be reflected in the error message, but there was no apparent route to exploit this. The only noteworthy part of the error response was a comment reading:
“<!-- Pywps 4.4.4 -->”
So I looked into PyWPS, and specifically at how it handled requests. This is when I discovered that some input was being passed to lxml, and that by default lxml was set to accept entity declarations.
Remember how my input was being reflected? Now I knew my input was reaching a misconfigured module. From here the exploitation was fairly trivial. I prepended the following to a legitimate input, then referenced the entity later in the request data:
“”””
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
“”””
I still got the status code 400 error, but this time the contents of /etc/passwd were right there in the response too, indicating that the entities I defined were being parsed server side. This finding was shared immediately with the organisation I was testing. I then set about figuring out whether this was a widespread issue.
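To make the root cause concrete, here is a minimal defensive sketch in Python. It is not the PyWPS patch, only an illustration built on the standard library's expat bindings: a hypothetical helper, reject_dtd, refuses any document carrying a DOCTYPE or entity declaration before it reaches the real parser. The names reject_dtd and ForbiddenXML are made up for this example. When parsing with lxml directly, creating the parser with resolve_entities=False is the commonly used equivalent.

```python
import xml.parsers.expat


class ForbiddenXML(ValueError):
    """Raised when a document carries a DTD or an entity declaration."""


def reject_dtd(xml_bytes: bytes) -> None:
    """Scan xml_bytes with expat and refuse DOCTYPE or entity declarations.

    Hypothetical pre-parse guard: returns None for a clean document and
    raises ForbiddenXML otherwise.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Fires as soon as expat sees "<!DOCTYPE ...", before any entity is used.
        raise ForbiddenXML(f"DOCTYPE declaration not allowed: {name}")

    def on_entity(name, is_parameter_entity, value, base,
                  system_id, public_id, notation_name):
        # Second line of defence in case a declaration gets past the first hook.
        raise ForbiddenXML(f"entity declaration not allowed: {name}")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.EntityDeclHandler = on_entity
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk


if __name__ == "__main__":
    # Same shape as the payload above, with a made-up <foo> body.
    suspicious = (
        b'<?xml version="1.0" encoding="utf-8"?>\n'
        b'<!DOCTYPE foo [\n'
        b'<!ELEMENT foo ANY >\n'
        b'<!ENTITY xxe SYSTEM "file:///etc/passwd">\n'
        b']>\n'
        b'<foo>&xxe;</foo>'
    )
    try:
        reject_dtd(suspicious)
    except ForbiddenXML as exc:
        print("rejected:", exc)
```

Rejecting the DOCTYPE outright is stricter than merely declining to resolve entities, which is a reasonable trade-off for a service that never expects clients to send their own DTD.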
One off or bigger issue?
Before notifying the maintainers of Pywps, I decided to verify my finding by hosting my own Pywps app. Fortunately, there is a project called birdhouse/emu that contains a fork of Pywps, perfectly configured for this test.
I confirmed that most if not all versions were vulnerable by downgrading my install and testing again.
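When repeating the same test against several local installs, it helps to check the response automatically rather than by eye. The following is a rough sketch under my own assumptions: the hypothetical function leaked_passwd_entries looks for passwd-shaped entries (seven colon-separated fields with numeric UID and GID) anywhere in a response body. It only applies to the file:///etc/passwd payload shown earlier and is run against a canned string here, not a live server.

```python
import re

# A passwd entry has seven colon-separated fields:
# name:password:UID:GID:GECOS:home:shell
PASSWD_ENTRY = re.compile(
    r"(?<![\w.-])"      # do not start in the middle of a longer token
    r"[A-Za-z_][\w.-]*" # login name
    r":[^:\s]*"         # password placeholder, usually "x"
    r":\d+:\d+"         # numeric UID and GID
    r":[^:\n]*"         # GECOS field, may contain spaces
    r":[^:\s]*"         # home directory
    r":[^:\s<]*"        # shell, stopping before any following markup
)


def leaked_passwd_entries(response_text: str) -> list[str]:
    """Return every passwd-shaped entry found in response_text."""
    return PASSWD_ENTRY.findall(response_text)


if __name__ == "__main__":
    # Canned example of an error body with reflected file contents.
    sample = (
        "<Exception>root:x:0:0:root:/root:/bin/bash\n"
        "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin</Exception>"
    )
    for entry in leaked_passwd_entries(sample):
        print(entry)
```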
Once I was sure of my finding, I emailed a couple of the maintainers and they began working on a fix.
The report timeline
-22/7/21 First reported
-23/7/21 Initial acknowledgement
-9/8/21 Maintainers' pull request merged for testing ahead of patching main
-10/8/21 I tested the proposed patch locally and confirmed fixed
-13/8/21 Patch merged
-13/8/21 Maintainer requested 2 week lead time before issue is made public
Further research
While the patch was being applied, I looked into how I might identify this vulnerability at scale. Dorking and Shodan revealed a number of vulnerable installations. These included a company using PyWPS for mapping things in space, several other universities and a few government-funded climate mapping organisations. I notified the ones I deemed most critical (alas, no bounty-paying organisations).
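One cheap signal for spotting candidate installations is the version comment seen in the error response earlier. A minimal sketch, assuming the comment always has the shape "<!-- Pywps X.Y.Z -->" (I only saw one instance, so treat that as an assumption); the function name pywps_version is hypothetical. A version banner alone does not prove an install is vulnerable, it only narrows down what is worth reporting to its owner.

```python
import re
from typing import Optional

# Matches the comment seen in the error response, e.g. "<!-- Pywps 4.4.4 -->".
VERSION_COMMENT = re.compile(
    r"<!--\s*PyWPS\s+(\d+(?:\.\d+)*)\s*-->",
    re.IGNORECASE,
)


def pywps_version(response_text: str) -> Optional[str]:
    """Return the version string from a PyWPS banner comment, or None."""
    match = VERSION_COMMENT.search(response_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    body = "<ExceptionReport>...</ExceptionReport>\n<!-- Pywps 4.4.4 -->"
    print(pywps_version(body))
```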