Mischiefblog
I make apps for other people

I’m not entirely stupid . . .

Posted by Chris Jones
On January 7th, 2008 at 17:11

Posted in Python

Now, I’m not entirely stupid. Lacking common sense? Yes. Forgetful of dates, appointments, anniversaries, bills? Yes. Easily confused? Absolutely.

But I’ve also been programming in Python for years and wrote my first CGI scripts in 1993 or 1994 or thereabouts. I’m kind of familiar with both environments, even if this is the first time I’ve actually put them together. Everything seemed to be meshing really, really well until I tried to get XML content from the web client through a DOM document passed into the XMLHttpRequest. (And yes, I am new to XMLHttpRequest dynamic app development. I’d always had the pleasure of using someone else’s library.)


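// Shared state, declared explicitly instead of relying on implicit globals.
var docGetTime, timerClient, timerId;

function createDocGetTime() {
    // Build <get><time/></get> as a DOM document.
    docGetTime = document.implementation.createDocument("", "", null);
    var getNode = docGetTime.createElement("get");
    var getTime = docGetTime.createElement("time");
    getNode.appendChild(getTime);
    docGetTime.appendChild(getNode);
}

function updateTime() {
    timerClient = new XMLHttpRequest();
    timerClient.onreadystatechange = timerHandler; // handler not shown here
    timerClient.open("POST", "xmltimer.py");
    timerClient.setRequestHeader("Content-Type", "text/xml;charset=UTF-8");
    timerClient.send(docGetTime); // the browser serializes the document as the request body
    // Pass the function itself rather than a string to be evaluated.
    timerId = setTimeout(updateTime, 1500);
}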

Evil code (there’s no way to stop calling updateTime() :)), tutorial-simple, and pretty obvious, or at least it should be. The problem looks to be Python.

From the old days of CGI, if you wanted to read the contents of a POSTed form, you’d need to split and parse the contents of the body of the HTTP request which should come in through STDIN. (The QUERY_STRING is only useful for GET requests.) This is old, ancient stuff, and should be possible with something like:


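import os
import sys

# Python 2 CGI script: echo the POSTed body back to the browser.
print "Content-Type: text/html\n\n<b>Body Content:</b><pre><![CDATA["
if "CONTENT_LENGTH" in os.environ:
    # CONTENT_LENGTH arrives as a string; read() needs an integer byte count.
    print sys.stdin.read(int(os.environ["CONTENT_LENGTH"]))
print "]]></pre>"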

Except that the body never gets read.
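
For comparison, here is a minimal Python 3 sketch of the same read, testable without a web server. The helper name read_body and its two parameters are my own invention, not part of any library. The one detail it is careful about is converting CONTENT_LENGTH, which is always a string in the environment, to an integer before calling read(). It is a sketch for ruling that mistake out, not a diagnosis of the empty STDIN.

```python
import io


def read_body(environ, stream):
    """Return the raw request body as bytes, or b"" if none was announced.

    `environ` is a mapping like os.environ; `stream` is a binary file-like
    object like sys.stdin.buffer. Both names are illustrative.
    """
    raw_length = environ.get("CONTENT_LENGTH", "")
    try:
        length = int(raw_length)  # the environment only holds strings
    except ValueError:
        return b""
    if length <= 0:
        return b""
    return stream.read(length)


# Simulate an 18-byte XML POST with an in-memory stream instead of stdin.
fake_environ = {"CONTENT_LENGTH": "18"}
fake_stdin = io.BytesIO(b"<get><time/></get>")
body = read_body(fake_environ, fake_stdin)
print(body)
```

In a real CGI script the call would be read_body(os.environ, sys.stdin.buffer), assuming the server hands the body over on standard input as CGI expects.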

In these paranoid times, the browser’s same-origin policy forbids an XMLHttpRequest to a different scheme, host, or port than the one that served up the webpage, since allowing it would open the door to cross-site attacks. That meant I had to shut down Apache and start netcat on the same port to verify that the problem wasn’t in how I was using XMLHttpRequest.

$ ./netcat -l -p 80
POST /demo/simple.py HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Content-Type: text/xml;charset=UTF-8
Referer: http://localhost/demo/xmlupdate.html
Content-Length: 18
Pragma: no-cache
Cache-Control: no-cache
 
<get><time /></get>

So I’m stuck wondering what I screwed up with my Python CGI configuration or with how I invoke the XMLHttpRequest.

I’ve been bugged by this since Saturday and haven’t figured it out yet. Does anyone have any idea why STDIN is empty?

  • Is it something to do with how Python is invoked (however unlikely)? This error should also cause the standard cgi library to fail–but it doesn’t.
  • Is it something to do with the content type? Changing to plain text/html doesn’t help, and neither does the incorrect application/x-www-form-urlencoded.

My code which dumps the POST content works fine with an HTML FORM submission:

$ ./netcat -l -p 80
POST /demo/simple.py HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost/demo/xmlupdate.html
Content-Type: application/x-www-form-urlencoded
Content-Length: 45
 
yourname=xxx&yourmood=yyy&Submit=Submit+Query

Output

<test>
<![CDATA[
<br/>
HTTP_REFERER=http://localhost/demo/xmlupdate.html<br/>
SERVER_SOFTWARE=Apache/2.0.54 (Win32)<br/>
SCRIPT_NAME=/demo/simple.py<br/>
 
. . . skipped . . .
 
CONTENT_LENGTH=45<br/>
HTTP_ACCEPT_CHARSET=ISO-8859-1,utf-8;q=0.7,*;q=0.7<br/>
 
. . . skipped . . .
 
CONTENT_TYPE=application/x-www-form-urlencoded<br/>
HTTP_ACCEPT_ENCODING=gzip,deflate<br/>
 
<pre>yourname=xxx&yourmood=yyy&Submit=Submit+Query</pre>
<br/>
</test>
]]>

So Python with CGI doesn’t appear to be entirely at fault.

The easy solution would appear to be to serialize the XML content to a string in JavaScript, shove it into a form field value, and submit the form. I hate to take that approach when I should be able to send native and correct XML through the DOM object. Another point of view may say, “If you want to save bandwidth, don’t use XML: use tokens and plain text instead.” I tend to regard that as premature optimization.
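
If I did fall back to the form-field workaround, the server side might look roughly like this Python 3 sketch. The field name "xml" and the function extract_xml_field are hypothetical, since nothing above names them. It decodes the urlencoded body with the standard library's parse_qs and hands the field value to ElementTree.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs


def extract_xml_field(body, field="xml"):
    """Pull an XML document out of a urlencoded form body.

    `field` is a hypothetical form field name. Returns the parsed root
    element, or None if the field is missing or empty.
    """
    fields = parse_qs(body)
    values = fields.get(field)
    if not values:
        return None
    return ET.fromstring(values[0])


# "<get><time/></get>" percent-encoded, as a browser form submission would send it.
form_body = "xml=%3Cget%3E%3Ctime%2F%3E%3C%2Fget%3E&Submit=Submit+Query"
root = extract_xml_field(form_body)
print(root.tag, [child.tag for child in root])
```

The cost of this route is visible in the sample body: every angle bracket and slash triples in size once it is percent-encoded.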

I have learned that sys.stdin.seek(0) is useless here: the CGI STDIN is a pipe rather than a regular file, so it can’t be rewound once it has been read. If you plan on using the cgi library alongside your own routine to read the body, pass the body to cgi.FieldStorage() after you’ve read and parsed it, or patch cgi (subclass or override) with a version that stores the unaltered body in a field.
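
A small Python 3 sketch of the read-once-and-replay idea, using a real OS pipe as a stand-in for the CGI STDIN (my assumption about what the server provides). The variable names are illustrative, and parse_qs plays the role of the second consumer in place of the cgi library. The body is read a single time, the attempt to rewind fails, and an in-memory copy serves any parser that needs a file-like object.

```python
import io
import os
from urllib.parse import parse_qs

# Stand-in for the CGI stdin: an OS pipe carrying the 45-byte form body.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"yourname=xxx&yourmood=yyy&Submit=Submit+Query")
os.close(write_fd)
stdin_like = os.fdopen(read_fd, "rb")

raw = stdin_like.read(45)  # the one and only read of the real stream

# A pipe cannot be rewound, so seek(0) raises instead of resetting it.
try:
    stdin_like.seek(0)
    rewound = True
except OSError:  # io.UnsupportedOperation is a subclass of OSError
    rewound = False
stdin_like.close()

# Keep the unaltered bytes and give later consumers a replayable copy.
replay = io.BytesIO(raw)
fields = parse_qs(replay.read().decode("ascii"))
print(rewound, fields["yourmood"])
```

Keeping raw around also covers the "store the unaltered body" case: the original bytes stay available no matter how many parsers run over the copies.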

I might find time to test if I can get the XML update request working with PHP.
