Table of Contents
1 问题
1.1 背景
在线上准备用fpm的access log 统计不同api的耗时和cpu使用情况,但是发现access log里显示的内容 不准确
通过定位,发现是使用了 fastcgi_finish_request 导致的这个问题
1.2 复现
基于 php-5.5.20 版本
在fpm里启用access log,格式设置为:
access.format = "%R - %u %t %T %s %{mili}d %{kilo}M %C%%"
1.2.1 code
<?php echo "hi\n"; fastcgi_finish_request();
1.2.2 期待结果
127.0.0.1 - 13/Jan/2015:12:30:07 +0800 13/Jan/2015:12:30:07 +0800 200 0.913 256 0.00%
1.2.3 实际结果
127.0.0.1 - 01/Jan/1970:08:00:00 +0800 13/Jan/2015:12:30:07 +0800 200 3896627338.811 256 0.00%
1.2.4 测试结果说明
至少有两个内容不符合预期
- 收到请求的时间,即%t: server time the request was received
- 处理时间,即%d: time taken to serve the request
2 定位
源码之下无秘密,so,看源码吧
2.1 打印access.log相关逻辑
在fpm的主函数里,当处理完一个请求的时候,就先调用fpm_request_end标记请求已经完成,并计算各种打印信息,最后调用fpm_log_write打印access.log
相关代码如下:
int main(int argc, char *argv[]) { //............. fcgi_init_request(&request, fcgi_fd); zend_first_try { while (fcgi_accept_request(&request) >= 0) { //................. fpm_request_executing(); php_execute_script(&file_handle TSRMLS_CC); //............ fpm_request_end(TSRMLS_C); fpm_log_write(NULL TSRMLS_CC); //............. } //............. } int fpm_log_write(char *log_format TSRMLS_DC) { //.................. s = log_format; while (*s != '\0') { //................ switch (*s) { case 't': /* time */ if (!test) { time_t *t; if (*s == 't') { t = &proc.accepted_epoch; //we need proc.accepted_epoch here //but when we use fastcgi_finish_request, we will clean the value of proc.accepted_epoch before call fpm_log_write } //................ } format[0] = '\0'; break; case 'u': //...................... } } return 0; }
2.2 fpm_request_finished相关逻辑
当php代码中走到 fastcgi_finish_request(),就会调用关闭与client的socket,同时调用fpm_request_finished设置进程的状态,但是这里却有一个问题
在fpm_request_finished里就把proc->accepted_epoch && proc->accepted清掉,导致后续打印日志的时候,就不符合预期了
php代码的调用逻辑如下:
main -> php_execute_script -> fastcgi_finish_request -> fcgi_close -> fpm_request_finished -> fpm_request_end -> fpm_log_write
相关代码:
PHP_FUNCTION(fastcgi_finish_request) { //................. if (request->fd >= 0) { //....................... fcgi_close(request, 0, 0); RETURN_TRUE; } //................ } void fcgi_close(fcgi_request *req, int force, int destroy) { //............. if ((force || !req->keep) && req->fd >= 0) { //............. req->fd = -1; fpm_request_finished(); } } void fpm_request_finished() { //..................... proc->request_stage = FPM_REQUEST_FINISHED; proc->tv = now; memset(&proc->accepted, 0, sizeof(proc->accepted));//~~~~~~~~~~~~~~~~~~~~~~~~~~~ proc->accepted_epoch = 0; //~~~~~~~~~~~~~~~~~~~~~~~~~~ fpm_scoreboard_proc_release(proc); }
2.3 结论
代码中使用fastcgi_finish_request的时候,会调用fpm_request_finished,但是在fpm_request_finished里会把统计时候用的参数清零,导致后续打印日志的时候就不符合预期了
3 解决
3.2 源码上解决
说实话,我没有想到一个比较好的解决办法
只是把fpm_request_finished里清空统计参数的那两句话注释了
虽然这个也能worker,但是感觉怪怪的,感觉在留坑
相关修改代码diff:
@@ -221,8 +221,6 @@ void fpm_request_finished() /* {{{ */ proc->request_stage = FPM_REQUEST_FINISHED; proc->tv = now; - memset(&proc->accepted, 0, sizeof(proc->accepted)); - proc->accepted_epoch = 0; fpm_scoreboard_proc_release(proc); } /* }}} */
并向官方提了一个patch, 目前已经被merge ,第二个被php官方AC的patch:
https://github.com/php/php-src/pull/995