Table of Contents
1 问题
1.1 背景
在线上准备用fpm的access log 统计不同api的耗时和cpu使用情况,但是发现access log里显示的内容 不准确
通过定位,发现是使用了 fastcgi_finish_request 导致的这个问题
1.2 复现
基于 php-5.5.20 版本
在fpm里启用access log,格式设置为:
access.format = "%R - %u %t %T %s %{mili}d %{kilo}M %C%%"
1.2.1 code
<?php
echo "hi\n";
fastcgi_finish_request();
1.2.2 期待结果
127.0.0.1 - 13/Jan/2015:12:30:07 +0800 13/Jan/2015:12:30:07 +0800 200 0.913 256 0.00%
1.2.3 实际结果
127.0.0.1 - 01/Jan/1970:08:00:00 +0800 13/Jan/2015:12:30:07 +0800 200 3896627338.811 256 0.00%
1.2.4 测试结果说明
至少有两个内容不符合预期
- 收到请求的时间,即%t: server time the request was received
- 处理时间,即%d: time taken to serve the request
2 定位
源码之下无秘密,so,看源码吧
2.1 打印access.log相关逻辑
在fpm的主函数里,当处理完一个请求的时候,就先调用fpm_request_end标记请求已经完成,并计算各种打印信息,最后调用fpm_log_write打印access.log
相关代码如下:
int main(int argc, char *argv[])
{
//.............
fcgi_init_request(&request, fcgi_fd);
zend_first_try {
while (fcgi_accept_request(&request) >= 0) {
//.................
fpm_request_executing();
php_execute_script(&file_handle TSRMLS_CC);
//............
fpm_request_end(TSRMLS_C);
fpm_log_write(NULL TSRMLS_CC);
//.............
}
//.............
}
int fpm_log_write(char *log_format TSRMLS_DC)
{
//..................
s = log_format;
while (*s != '\0') {
//................
switch (*s) {
case 't': /* time */
if (!test) {
time_t *t;
if (*s == 't') {
t = &proc.accepted_epoch; //we need proc.accepted_epoch here
//but when we use fastcgi_finish_request, we will clean the value of proc.accepted_epoch before call fpm_log_write
}
//................
}
format[0] = '\0';
break;
case 'u':
//......................
}
}
return 0;
}
2.2 fpm_request_finished相关逻辑
当php代码中走到 fastcgi_finish_request(),就会调用关闭与client的socket,同时调用fpm_request_finished设置进程的状态,但是这里却有一个问题
在fpm_request_finished里就把proc->accepted_epoch && proc->accepted清掉,导致后续打印日志的时候,就不符合预期了
php代码的调用逻辑如下:
main -> php_execute_script -> fastcgi_finish_request -> fcgi_close -> fpm_request_finished -> fpm_request_end -> fpm_log_write
相关代码:
PHP_FUNCTION(fastcgi_finish_request)
{
//.................
if (request->fd >= 0) {
//.......................
fcgi_close(request, 0, 0);
RETURN_TRUE;
}
//................
}
void fcgi_close(fcgi_request *req, int force, int destroy)
{
//.............
if ((force || !req->keep) && req->fd >= 0) {
//.............
req->fd = -1;
fpm_request_finished();
}
}
void fpm_request_finished()
{
//.....................
proc->request_stage = FPM_REQUEST_FINISHED;
proc->tv = now;
memset(&proc->accepted, 0, sizeof(proc->accepted));//~~~~~~~~~~~~~~~~~~~~~~~~~~~
proc->accepted_epoch = 0; //~~~~~~~~~~~~~~~~~~~~~~~~~~
fpm_scoreboard_proc_release(proc);
}
2.3 结论
代码中使用fastcgi_finish_request的时候,会调用fpm_request_finished,但是在fpm_request_finished里会把统计时候用的参数清零,导致后续打印日志的时候就不符合预期了
3 解决
3.2 源码上解决
说实话,我没有想到一个比较好的解决办法
只是把fpm_request_finished里清空统计参数的那两句话注释了
虽然这个也能worker,但是感觉怪怪的,感觉在留坑
相关修改代码diff:
@@ -221,8 +221,6 @@ void fpm_request_finished() /* {{{ */
proc->request_stage = FPM_REQUEST_FINISHED;
proc->tv = now;
- memset(&proc->accepted, 0, sizeof(proc->accepted));
- proc->accepted_epoch = 0;
fpm_scoreboard_proc_release(proc);
}
/* }}} */
并向官方提了一个patch, 目前已经被merge ,第二个被php官方AC的patch:
https://github.com/php/php-src/pull/995