首页 > nginx > nginx location 配置踩坑过程分享

nginx location 配置踩坑过程分享

2015年2月6日 mood 发表评论阅读评论

问题描述
我们的业务系统比较复杂，但最终提供给用户的访问接口比较单一，都是使用 Nginx 来做一个代理转发，而这个代理转发，往往需要匹配很多种不同类型的 URL 转给不同的服务。这就使得我们的 Nginx 配置文件变得很复杂，粗略估计了下，我们有近20个 upstream，有近60个 location 匹配。这些配置按照模块分布在不同的文件中，虽然复杂，但是仍然在我们的努力下运行的良好。直到有一天，有位同事给我反映说偶尔有些 URL 会出现 404 的问题。一开始没太在意，因为他也说不准是哪一种 URL 才遇到这个问题。

问题查找
后来，慢慢的查找，找到了一些规律，一开始只知道是 tomcat 那边返回 404了，想到 Nginx 都代理给了 tomcat，一开始就怀疑是程序的问题，不会想到是 Nginx。

我开始查找代码的问题，我在本地的开发环境，尝试了很久，我使用 8080 端口访问，不论如何都是正确的结果，可是生产环境就是不行。然后我就听信了某坑友同事的理论，重启解决 95% 的问题，重装解决 100%的问题，我尝试重启了 tomcat 和 Nginx，依然不行，然后是重装，你猜结果如何？？？？？ ——想啥呢？当然也是不行！

后来就开始怀疑是生产环境和开发环境的差异，去服务器上访问 8080 端口，仍然是可以的。可是一经过 Nginx 代理，就不行。这个时候才开始怀疑是 Nginx 出了什么问题。

Nginx 怎么会出问题呢，业务系统中 URL 模式 /helloworld/* ,这样的 URL 我们都是统一处理的。怎么会出现一些行，一些不行呢。问题表现为 A URL （/helloworld/nn/hello/world）没问题，而 B URL(/helloworld/ii/hello/world) 有问题。

所以到目前为止，基本可以肯定是 Nginx 的 location 上出了一些问题。

问题解决
因篇幅有限，为了直面本次问题的核心，我不再贴出完整的 Nginx 配置，我简化此次问题的模型。请看如下 Nginx 配置，这是我们之前的会导致问题的错误配置模型。

worker_processes  1;
error_log  logs/error.log;
events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;
    log_format  main  '$remote_addr - $request_time - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/access.log main;

    sendfile        on;
    keepalive_timeout  65;

    gzip  on;

    server {
        listen       80;
        server_name  localhost;
        location / {
            root   html;
            index  index.html index.htm;
        }
        location = /helloworld {
                return 602;
        }
        location /helloworld {
                return 603;
        }
        
        ## 生产环境中如下两个 location 在另外一个文件中，通过 include 包含进来
        location /ii {
                return 604;
        }
        location ~ /ii/[^/]+/[^/]+ {
                return 605;
        }
        ##
        
        location ~ ^/helloworld/(scripts|styles|images).* {
                return 606;
        }
    }
}

worker_processes 1;

error_log logs/error.log;

events {

worker_connections 1024;

}

http {

include mime.types;

default_type application/octet-stream;

log_format main '$remote_addr - $request_time - $remote_user [$time_local] "$request" '

'$status $body_bytes_sent "$http_referer" '

'"$http_user_agent" "$http_x_forwarded_for"';

access_log logs/access.log main;

sendfile on;

keepalive_timeout 65;

gzip on;

server {

listen 80;

server_name localhost;

location / {

root html;

index index.html index.htm;

}

location = /helloworld {

return 602;

}

location /helloworld {

return 603;

}

## 生产环境中如下两个 location 在另外一个文件中，通过 include 包含进来

location /ii {

return 604;

}

location ~ /ii/[^/]+/[^/]+ {

return 605;

}

location ~ ^/helloworld/(scripts|styles|images).* {

return 606;

}

注意，这里有几点需要说明一下，生产环境的 Nginx 服务器配置文件比这里要复杂很多，而且是按模块分布在不同的文件中的。这里简化模型后，使用 Http 响应状态码 60x 来区分到底被哪个 location 匹配到了。
我针对当时的情况，做了大量尝试，最终的简化版本如下：

尝试1：http://localhost/helloworld ==> 602 符合预期
尝试2：http://localhost/helloworld/hello ==> 603 符合预期
尝试3：http://localhost/ii ==> 604 符合预期
尝试4：http://localhost/ii/oo ==> 604 符合预期
尝试5：http://localhost/ii/pp/kk ==> 605 符合预期
尝试6：http://localhost/ii/pp/kk/ll ==> 605 符合预期
尝试7：http://localhost/helloworld/scripts/aaa.js ==> 606 符合预期
尝试8：http://localhost/helloworld/ii/hello/world ==> 605 不符合预期，预期为【603】
上面这些尝试支持读者自行试验，Nginx 配置文件是完整可用的，我本地 Nginx 的版本是1.6.2

问题就在这里：我这里是事后，把这些匹配 location 标记成了不同的响应码，才方便查找问题。当发现这个不符合预期后，我还是难以理解，为何我一个以 /helloworld 开头的 URL 会被匹配到 605 这个以 /ii 开头的 location 里面来。在当时的生产环境中，以 /ii 的配置统一放在另外一个文件中，这里是很难直观的察觉出来这个 /ii 跟访问的 URL 里面的 /ii 的关系。

我不得不重新编译了 Nginx ，加上了调试参数，修改配置项，看调试日志了。

这里不再讲如何给 Nginx 加调试的编译参数，可自行查看相关文档。修改配置项很简单，只需要在

error_log logs/error.log;
后面加上 debug 就可以了。

打出详细调试日志后，访问
http://localhost/helloworld/ii/hello/world
我得到了这样的一段日志(省略掉了前后无用的日志，只保留有意义的一段)：

2015/02/02 15:38:48 [debug] 5801#0: *60 http request line: "GET /helloworld/ii/hello/world HTTP/1.1"
2015/02/02 15:38:48 [debug] 5801#0: *60 http uri: "/helloworld/ii/hello/world"
2015/02/02 15:38:48 [debug] 5801#0: *60 http args: ""
2015/02/02 15:38:48 [debug] 5801#0: *60 http exten: ""
2015/02/02 15:38:48 [debug] 5801#0: *60 http process request header line
2015/02/02 15:38:48 [debug] 5801#0: *60 http header: "User-Agent: curl/7.37.1"
2015/02/02 15:38:48 [debug] 5801#0: *60 http header: "Host: localhost"
2015/02/02 15:38:48 [debug] 5801#0: *60 http header: "Accept: */*"
2015/02/02 15:38:48 [debug] 5801#0: *60 http header done
2015/02/02 15:38:48 [debug] 5801#0: *60 event timer del: 4: 1422862788055
2015/02/02 15:38:48 [debug] 5801#0: *60 rewrite phase: 0
2015/02/02 15:38:48 [debug] 5801#0: *60 test location: "/"
2015/02/02 15:38:48 [debug] 5801#0: *60 test location: "ii"
2015/02/02 15:38:48 [debug] 5801#0: *60 test location: "helloworld"
2015/02/02 15:38:48 [debug] 5801#0: *60 test location: ~ "/ii/[^/]+/[^/]+"
2015/02/02 15:38:48 [debug] 5801#0: *60 using configuration "/ii/[^/]+/[^/]+"
2015/02/02 15:38:48 [debug] 5801#0: *60 http cl:-1 max:1048576
2015/02/02 15:38:48 [debug] 5801#0: *60 rewrite phase: 2
2015/02/02 15:38:48 [debug] 5801#0: *60 http finalize request: 605, "/helloworld/ii/hello/world?" a:1, c:1
2015/02/02 15:38:48 [debug] 5801#0: *60 http special response: 605, "/helloworld/ii/hello/world?"
2015/02/02 15:38:48 [debug] 5801#0: *60 http set discard body
2015/02/02 15:38:48 [debug] 5801#0: *60 posix_memalign: 00007FC3BB816000:4096 @16
2015/02/02 15:38:48 [debug] 5801#0: *60 HTTP/1.1 605
Server: nginx/1.6.2
Date: Mon, 02 Feb 2015 07:38:48 GMT
Content-Length: 0
Connection: keep-alive

2015/02/02 15:38:48 [debug] 5801#0: *60 http request line: "GET /helloworld/ii/hello/world HTTP/1.1"

2015/02/02 15:38:48 [debug] 5801#0: *60 http uri: "/helloworld/ii/hello/world"

2015/02/02 15:38:48 [debug] 5801#0: *60 http args: ""

2015/02/02 15:38:48 [debug] 5801#0: *60 http exten: ""

2015/02/02 15:38:48 [debug] 5801#0: *60 http process request header line

2015/02/02 15:38:48 [debug] 5801#0: *60 http header: "User-Agent: curl/7.37.1"

2015/02/02 15:38:48 [debug] 5801#0: *60 http header: "Host: localhost"

2015/02/02 15:38:48 [debug] 5801#0: *60 http header: "Accept: */*"

2015/02/02 15:38:48 [debug] 5801#0: *60 http header done

2015/02/02 15:38:48 [debug] 5801#0: *60 event timer del: 4: 1422862788055

2015/02/02 15:38:48 [debug] 5801#0: *60 rewrite phase: 0

2015/02/02 15:38:48 [debug] 5801#0: *60 test location: "/"

2015/02/02 15:38:48 [debug] 5801#0: *60 test location: "ii"

2015/02/02 15:38:48 [debug] 5801#0: *60 test location: "helloworld"

2015/02/02 15:38:48 [debug] 5801#0: *60 test location: ~ "/ii/[^/]+/[^/]+"

2015/02/02 15:38:48 [debug] 5801#0: *60 using configuration "/ii/[^/]+/[^/]+"

2015/02/02 15:38:48 [debug] 5801#0: *60 http cl:-1 max:1048576

2015/02/02 15:38:48 [debug] 5801#0: *60 rewrite phase: 2

2015/02/02 15:38:48 [debug] 5801#0: *60 http finalize request: 605, "/helloworld/ii/hello/world?" a:1, c:1

2015/02/02 15:38:48 [debug] 5801#0: *60 http special response: 605, "/helloworld/ii/hello/world?"

2015/02/02 15:38:48 [debug] 5801#0: *60 http set discard body

2015/02/02 15:38:48 [debug] 5801#0: *60 posix_memalign: 00007FC3BB816000:4096 @16

2015/02/02 15:38:48 [debug] 5801#0: *60 HTTP/1.1 605

Server: nginx/1.6.2

Date: Mon, 02 Feb 2015 07:38:48 GMT

Content-Length: 0

Connection: keep-alive

可以看到，Nginx 测试了几次 location 匹配，最终选择了
~ "/ii/[^/]+/[^/]+"
这个作为最终的匹配项。到这里问题就完全展现出来了，我们本来的意思，是要以 /ii 开头，后面有两个或者更多的 / 分割的 URL 模型才匹配，但是这里的正则表达式匹配写的不够精准，导致了匹配错误。正则表达式没有限制必须从开头匹配，所以才会匹配到 /helloworld/ii/hello/world 这样的 URL 。

解决办法就是在这个正则表达式前面加上 ^ 来强制 URL 必须以 /ii 开头才能匹配.
由
/ii/[^/]+/[^/]+
变成
^/ii/[^/]+/[^/]+
至此，这个坑被填上了，消耗的是五个小时和一个字符。

相信很多人在写 Nginx 的location 的时候都会 location ~ /xxx 或者 location /iii 这样简单了事，但是我想说的是能尽量精确就尽量精确，否则出现问题的时候，非常难以查找。

有关 Nginx 的 location 匹配规则，可以查看: http://nginx.org/en/docs/http/ngx_http_core_module.html

问题总结
这个问题看似简单，却也隐含了不少问题，值得我们深思。

计算机或者软件出的问题往往是确定的，你发现他捉摸不定的时候，往往是没有观察到问题点
追踪一个问题，如果有一个必现方式，一定要紧追不舍，这就是所谓线索
当你实在是找不到问题所在的时候，要怀疑一下之前被自己排除掉的可能性
借助各个组件的详细调试日志来查找问题，往往能得到意想不到的效果
程序员的价值不是用行数，字数，提交数衡量的！

原文地址http://blog.coding.net/blog/tips-in-configuring-Nginx-location

分类: nginx 标签: location

这几篇文章你可能也喜欢：

评论 (3) Trackbacks (1) 发表评论 Trackback

tdus

2016年7月5日11:14 | #1

回复 | 引用

感謝分享!
isaac

2016年1月22日14:29 | #2

回复 | 引用

写得不错，尤其是复杂问题简化讲解做得很漂亮，学习了，已收录到我的NGINX内部培训当参考资料了。
Aceslup

2015年2月28日23:24 | #3

回复 | 引用

这个排错经验对我很宝贵。感谢经验分享。

2018年4月23日14:41 | #1

nginx 错误日志 error_log 日志级别 – JerryChan的个人小站

nginx uWSGI Django python运行环境安装和配置 PHP当前时间毫秒