(hint: a vi keystroke sequence)
Just a sort of random thought that's been brewing in the back of my mind for a while, and exacerbated by playing with the editor built into this blog: it's a bad thing that we need to sanitize user inputs to make sure they're not malicious HTML or script. It's a bad thing that thousands of projects and millions of web developers have to even think about this. It's an unconscionable drag on productivity.
Possible solutions:
1) create a post-spammer, post-troll utopia where no one even wants to enter malicious text into the entry boxes.
2) differentiate markup and code from content.
Considering the amount of work that's gone into the Web as we know it, #2's not a likely option, so we're stuck with #1.
Seriously, I'm sure somebody could come up with a reason why this wouldn't work, or would be counterproductive -- it literally is something I just thought of -- but it seems to me there ought to be a better way. If it wasn't clear from the title, my model is vi. For the uninitiated, vi centers on a very basic idea: different modes for commands and content. Typing "G" in text (insert) mode gives you the letter "G". Hitting "G" in command mode brings you to the last line of the file/buffer. "4x" in command mode deletes 4 characters, starting with the one under the cursor. There are many keystrokes that will put you into text mode, but only one (AFAIK) that will put you back into command mode (ESC).
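To make the mode idea concrete, here is a toy sketch in Python. It is not how vi is actually implemented: `run_keys` is a made-up name, the buffer is a single line, and only "i", "x" (with an optional count), and ESC are handled. The point is just that the same keystroke means different things depending on the mode.

```python
ESC = "\x1b"

def run_keys(keys: str, text: str = "") -> str:
    """Toy model of vi's two modes (hypothetical helper, single-line buffer)."""
    buf = list(text)
    cursor = 0
    mode = "command"  # vi starts in command mode
    count = ""        # digits typed before a command, e.g. the "4" in "4x"
    for key in keys:
        if mode == "insert":
            if key == ESC:
                # ESC is the only way back to command mode in this model;
                # like vi, the cursor steps back one character.
                mode = "command"
                cursor = max(cursor - 1, 0)
            else:
                # In insert mode every other key is plain content.
                buf.insert(cursor, key)
                cursor += 1
        elif key.isdigit():
            count += key
        elif key == "x":
            # Delete `count` characters starting with the one under the cursor.
            n = int(count or "1")
            del buf[cursor:cursor + n]
            count = ""
        elif key == "i":
            mode = "insert"
            count = ""
        else:
            count = ""  # every other command is ignored in this sketch
    return "".join(buf)

print(run_keys("4x", "hello world"))    # prints: o world
print(run_keys("i4x" + ESC, "abc"))     # prints: 4xabc
```

The second call is the interesting one: "4x" typed in insert mode is just the text "4x", and nothing the user types there can turn into a command short of ESC.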
So, wouldn't it be nice if you could count on a "<script>SendInfoToEvilPerson(document.cookie)</script>" rendering as text content, and not executing as a script? Should this really be so hard? Discuss.
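For contrast, this is roughly what the status quo looks like: every place that echoes user input has to remember to escape it by hand. A minimal sketch in Python, using the standard library's `html.escape`; the `render_comment` helper is a made-up name for illustration, not anything from this blog's software.

```python
import html

def render_comment(user_text: str) -> str:
    """Wrap untrusted text in a paragraph, escaping it so it renders as text."""
    # html.escape replaces &, <, > (and quotes, by default) with entities,
    # so the browser shows the characters instead of parsing them as markup.
    return "<p>" + html.escape(user_text) + "</p>"

payload = "<script>SendInfoToEvilPerson(document.cookie)</script>"
print(render_comment(payload))
# prints: <p>&lt;script&gt;SendInfoToEvilPerson(document.cookie)&lt;/script&gt;</p>
```

It works, but only if nobody ever forgets the call, which is exactly the drag on productivity I'm complaining about. There's no "mode" here, just a convention.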
Update: Ugh. Tags: "Vi" and "Xss"? (I typed in "vi,XSS") Ugh. I mean, I like case-insensitivity as much as the next guy, but saying that searching for "VI" should return a match with "vi" isn't the same as saying that any rendering of these terms is as good as any other (if you click the tags, you'll see that the URL has them in all lower case, which would honestly be much better than "proper case" for both of these terms). Kind of weird, coming from a blogging tool by programmers, and probably mostly for programmers.