The Effects of Question Order in the Measurement of Public Service Satisfaction: Evidence from an Embedded Survey Experiment
Wang Siqi (王思琦), Guo Jinyun (郭金云)
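Abstract:
Public service satisfaction is an important tool for performance evaluation. Research has found, however, that it is not an absolute, stable quantity: it is easily affected by the measurement method, including question wording and placement, which psychology calls priming effects or context effects. This study embedded a survey experiment in a commissioned project on expressway service satisfaction and randomly assigned the order of the "overall satisfaction" and "specific satisfaction" questions. The group asked about overall satisfaction before specific satisfaction (n=619) reported a higher mean overall satisfaction than the group asked about it afterward (n=601), confirming that question order effects exist in the Chinese context. Regression analysis further shows that the identification of the key drivers of overall satisfaction is also highly sensitive to question order. The findings have important policy implications: any management decision that relies on public service satisfaction must take these effects into account to ensure that the decision is sound.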
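Keywords: public service satisfaction; question order effect; priming effect; survey experiment
Foundation: Supported by the Ministry of Education Humanities and Social Sciences Research Youth Fund project "Measuring Government Trust and Its Influence Mechanisms Based on the IAT Experimental Method" (Grant No. 17XJC630009)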
References:
- Ji D,Guo Z,Hu P J.2016.Third-party evaluation of the quality of public services:Based on a pilot application in East China[J].Chinese Public Administration,(1):41-44.(in Chinese)
- Gerber A S,Green D P.2018.Field experiment:Design,analysis,and interpretation[M].Wang S Q,trans.Beijing:China Renmin University Press.(in Chinese)
- Jia Q F,Yin Z X,Zhou J.2018.Public satisfaction with government under the view of behavioral public administration:Concept,measurement and predictors[J].Journal of Public Administration,(1):62-82.(in Chinese)
- Jing H B.2015.Column lead:Several issues on strengthening experimental research in public administration[J].Journal of Public Administration,(3):120-125.(in Chinese)
- Li X Q.2018.Experimental research in behavioral public administration:An analysis of related literature based on SSCI-Indexed journals(1978-2016)[J].Journal of Public Administration,(1):37-61.(in Chinese)
- Nie X G,Chen P,Zhang Y B,et al.2018.Item Position Effect:Conceptualization,detection and developments[J].Advances in Psychological Science,26(2):368-380.(in Chinese)
- Wang J S,Sun S,Liang H,et al.2003.Design of investigating test paper about evaluate on service quality in expressway transportation[J].Communications Standardization,(11):77-80.(in Chinese)
- Wang S Q.2018.Field experiments in research of public administration and public policy:Causal inference and impact evaluation[J].Journal of Public Administration,(1):87-107.(in Chinese)
- Andersen S C,Hjortskov M.2016.Cognitive biases in performance evaluations[J].Journal of Public Administration Research and Theory,26(4):647-662.
- Bartels L M.2002.Question order and declining faith in elections[J].Public Opinion Quarterly,66(1):67-79.
- Benton J E,Daly J L.1991.A question order effect in a local government survey[J].The Public Opinion Quarterly,55(4):640-642.
- Benton J E,Daly J L.1993.Measuring citizen evaluations:The question of question order effects[J].Public Administration Quarterly,16(4):492-508.
- Bouckaert G,Van de Walle S.2003.Comparing measures of citizen trust and user satisfaction as indicators of“good governance”:Difficulties in linking trust and satisfaction indicators[J].International Review of Administrative Sciences,69(3):329-343.
- Bouckaert G,Van de Walle S,Kampen J K.2005.Potential for comparative public opinion research in public administration[J].International Review of Administrative Sciences,71(2):229-240.
- Bradburn N M,Mason W M.1964.The effect of question order on responses[J].Journal of Marketing Research,1(4):57-61.
- Charbonneau,Van Ryzin G G.2015.Benchmarks and citizen judgments of local government performance:Findings from a survey experiment[J].Public Management Review,17(2):288-304.
- Dietz T L,Jasinski J L.2007.The effect of item order on partner violence reporting:An examination of four versions of the revised conflict tactics scales[J].Social Science Research,36(1):353-373.
- DeMoranville C W,Bienstock C C.2003.Question order effects in measuring service quality[J].International Journal of Research in Marketing,20(3):217-231.
- Gerber A S,Green D P.2012.Field experiments:Design,analysis,and interpretation[M].New York:W.W.Norton.
- Giventer L.1996.Statistical analysis for public administration[M].Boston:Wadsworth.
- Hjortskov M.2017.Priming and context effects in citizen satisfaction surveys[J].Public Administration,95(4):912-926.
- James O,Jilke S,Van Ryzin G G.2017.Causal inference and the design and analysis of experiments[M]//James O,Jilke S,Van Ryzin G.Experiments in Public Management Research:Challenges and Contributions.Cambridge:Cambridge University Press,59-88.
- Kampen J K,Van de Walle S,Bouckaert G.2006.Assessing the relation between satisfaction with public service delivery and trust in government:The impact of the predisposition of citizens toward government on evaluations of its performance[J].Public Performance and Management Review,29(4):387-404.
- Kaplan S A,Luchman J N,Mock L.2013.General and specific question sequence effects in satisfaction surveys:Integrating directional and correlational effects[J].Journal of Happiness Studies,14(5):1443-1458.
- Kelly J M,Swindell D.2003.The case for the inexperienced user:Rethinking filter questions in citizen satisfaction surveys[J].The American Review of Public Administration,33(1):91-108.
- Lavrakas P J.2008.Encyclopedia of survey research methods[M].Thousand Oaks:SAGE Publications.
- McFarland S G.1981.Effects of question order on survey responses[J].Public Opinion Quarterly,45(2):208-215.
- Moore D W.2002.Measuring new types of question-order effects[J].Public Opinion Quarterly,66(1):80-91.
- Morgeson F V,Petrescu C.2011.Do they all perform alike?An examination of perceived performance,citizen satisfaction and trust with US federal agencies[J].International Review of Administrative Sciences,77(3):451-479.
- Mutz D C.2011.Population-based survey experiments[M].Princeton,NJ:Princeton University Press.
- Ngoye B,Sierra V,Ysa T,et al.2020.Priming in behavioral public administration:Methodological and practical considerations for research and scholarship[J].International Public Management Journal,23(1):113-137.
- Poister T H,Thomas J C.2011.The effect of expectations and expectancy confirmation/disconfirmation on motorists'satisfaction with state highways[J].Journal of Public Administration Research and Theory,21(4):601-617.
- Price V,Tewksbury D.1996.Measuring the third-person effect of news:The impact of question order,contrast and knowledge[J].International Journal of Public Opinion Research,8(2):120-141.
- Pustejovsky J E,Spillane J P.2009.Question-order effects in social network name generators[J].Social Networks,31(4):221-229.
- Ramirez I L,Straus M A.2006.The effect of question order on disclosure of intimate partner violence:An experimental test using the conflict tactics scales[J].Journal of Family Violence,21(1):1-9.
- Sigelman L.1981.Question-order effects on presidential popularity[J].Public Opinion Quarterly,45(2):199-207.
- Stipak B.1979.Citizen satisfaction with urban services:Potential misuse as a performance indicator[J].Public Administration Review,39(1):46-52.
- Tourangeau R,Rips L J,Rasinski K.2000.The psychology of survey response[M].Cambridge:Cambridge University Press.
- Van de Walle S,Van Ryzin G G.2011.The order of questions in a survey on citizen satisfaction with public services:Lessons from a split-ballot experiment[J].Public Administration,89(4):1436-1450.
- Van de Walle S.2018.Explaining citizen satisfaction and dissatisfaction with public services[M]//Ongaro E,van Thiel S.The Palgrave Handbook of Public Administration and Management in Europe.London:Palgrave Macmillan,227-241.
- Van Ryzin G G,Muzzio D,Immerwahr S,et al.2004.Drivers and consequences of citizen satisfaction:An application of the American Customer Satisfaction Index model to New York City[J].Public Administration Review,64(3):331-341.
- Van Ryzin G G.2006.Testing the expectancy disconfirmation model of citizen satisfaction with local government[J].Journal of Public Administration Research and Theory,16(4):599-611.
- Van Ryzin G G.2013.An experimental test of the expectancy-disconfirmation theory of citizen satisfaction[J].Journal of Policy Analysis and Management,32(3):597-614.
- Voicu B.2015.Priming effects in measuring life satisfaction[J].Social Indicators Research,124(3):993-1013.
- Walker R M,James O,Brewer G A.2017.Replication,experiments and knowledge in public management research[J].Public Management Review,19(9):1221-1234.
- Wang Z,Solloway T,Shiffrin R M,et al.2014.Context effects produced by question orders reveal quantum nature of human judgments[J].Proceedings of the National Academy of Sciences of the United States of America,111(26):9431-9436.
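- (1) Psychological research shows that context effects are most pronounced in attitude measurement. In this paper, a context effect is defined as an earlier question influencing (priming) responses to later survey questions; see Lavrakas (2008) and Hjortskov (2017).
- (1) Wang et al. (2014) argue that this context effect of question order reveals the quantum nature of human judgments. They propose the "quantum question equality" to explain the phenomenon: when the AB and BA question orders are compared, the change in the proportion of respondents answering "yes-yes" is offset by the change in the proportion answering "no-no"; likewise, the change in the proportion answering "yes-no" is offset by the change in the proportion answering "no-yes" (Wang et al., 2014). For a similar phenomenon in opinion polling, see https://www.cbsnews.com/news/why-question-order-changes-poll-results/.
- (1) In the survey methodology literature, priming effects are further divided into assimilation and contrast effects, also called carryover and backfire effects (Voicu, 2015). In satisfaction surveys, an assimilation effect means that respondents who answer an overall satisfaction question after the specific satisfaction questions will summarize, or assimilate, their earlier responses to the specific questions into the overall question. A contrast effect means that respondents who are asked the overall question first and the specific questions afterward tend to evaluate each specific question by comparing it with the overall question (DeMoranville and Bienstock, 2003).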
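- (1) In Van de Walle and Van Ryzin (2011), respondents were invited by email, and the response rate was only 45.6%.
- (1) It should be noted that when this study began, the commissioned project had already completed the survey on 2 of the expressway sections, so the experiment covered only 54 of the sections; we thank the reviewer for the reminder. Because the section names appear in the data only as codes, the sample distribution is not reported in the paper for brevity.
- (2) That is, respondents assigned to questionnaire version A and those assigned to version B differ persistently, markedly, and systematically in some intrinsic characteristics.
- (1) Admittedly, whether the people who entered expressway service areas and took our survey can represent the overall population of expressway service users may be open to some dispute. Experience and common sense suggest, however, that whether a driver enters a service area, and which one, is largely a matter of chance.
- (2) It should be noted that, including the experimental sample, the commissioned project distributed 11,894 questionnaires in total, of which 10,716 were valid. The experimental sample (treatment group A, n=619; treatment group B, n=601) is a small share of the project sample because not all of the project's original questionnaires could be converted to the experimental version without risking objections from the project's client; only a small portion of the questionnaires at each site therefore used the experimental version. Because the experiment relied entirely on the commissioned project for survey procedures, sites, and staffing, the experimental and original samples used the same criteria for excluding invalid responses, and the final valid response rate was 90% for both.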
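- (3) To ensure adequate statistical power for the effect estimates, the study used the power command for sample size estimation in STATA 15. With power set at 0.8 (that is, 1-β=0.8) and α=0.05, and assuming, on the basis of the earlier literature, a difference of 0.5 between the two groups' means on the outcome variable (overall satisfaction) and a standard deviation of 2, the minimum required sample size is 506/2=253 per group. Both groups in this experiment exceed that size, so the experiment is adequately powered. For the sample size formula for experimental research, see Gerber and Green (2012).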
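- (1) Van de Walle and Van Ryzin (2011) did not use a common measurement scale for the specific and overall satisfaction questions (5-point and 7-point Likert scales, respectively), so the absolute values of the two cannot be compared directly. Our design uses a 5-point scale for both, so they can be compared directly.
- (1) A reviewer suggested that "whether the respondent had filed a complaint" might affect the reliability of the conclusions. In our view, imbalance on an individual covariate of this kind may affect the precision of the results of a randomized experiment but is unlikely to cause serious bias. The literature holds that, in general, imbalance on individual covariates can occur in any single experiment unless we can repeat the randomized experiment many times (Gerber and Green, 2012). We controlled for the covariates in a regression and found no difference from the difference-in-means estimator. At the reviewer's suggestion, we also ran a series of robustness checks, such as dropping respondents with complaint experience and keeping only those without; the results of all analyses remained consistent with the full sample.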
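- (2) Kelly and Swindell (2003) point out that satisfaction is generally measured with a 5-point ordinal Likert-type scale running from "very dissatisfied" to "very satisfied", which makes it a categorical rather than a continuous variable, so the nonparametric Mann-Whitney U test (also known as the two-sample Wilcoxon rank-sum test) should be used to reflect the variable type and underlying distribution (Giventer, 1996). For this reason, the statistical tests of order effects in Table 4 do not use the t-test commonly applied to continuous variables.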
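- (1) That said, the P values for the between-group differences on these three items are all around 0.06, which is only marginally significant statistically; see Table 4.
- (1) It should be noted that the data used to compute specific satisfaction for the original questionnaire do not include all of the commissioned project's original questionnaire data; these data were, however, all collected shortly before and after the embedded experiment and are therefore closely comparable with it.
- (1) Van de Walle and Van Ryzin (2011) attribute this result to a complex interaction between the satisfaction questions and the opening questions of the questionnaire: the opening questions produced an order effect of their own before the overall satisfaction question was asked.
- (1) These respondents took part in the survey mainly to earn payment, and the sample is atypical in age, gender, education, and other respects; it is therefore not representative of the general public, and especially not of public service users.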
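
The sample size reasoning in the footnote on statistical power (mean difference 0.5, standard deviation 2, α=0.05, power 0.8) can be roughly checked with the textbook normal-approximation formula for a two-sided comparison of two means. This is a minimal sketch, not the STATA calculation the authors ran; the function name `n_per_group` is hypothetical. The approximation gives 252 per group, one fewer than the 253 reported, a gap of the size one would expect if the reported figure rests on a t-distribution rather than a normal one (an assumption here, not something the source states).

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta)^2,
    assuming equal group sizes and a common standard deviation.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)                 # round up to a whole respondent


# Inputs taken from the footnote: difference 0.5, standard deviation 2.
n = n_per_group(delta=0.5, sd=2.0)
print(n, 2 * n)  # per-group and total sample size under the approximation
```

Either figure is well below the realized group sizes (n=619 and n=601), which is the point the footnote makes.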
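
The footnote on ordinal Likert scales names the Mann-Whitney U test as the test used for order effects. The sketch below shows how such a test can be computed from scratch for 5-point ratings, using midranks for ties, a tie-corrected variance, and a normal approximation without continuity correction. The function name `mann_whitney_u` and the two rating lists are hypothetical illustrations and are not data from the study; statistical packages may apply a continuity correction or exact method and report slightly different p-values.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U for sample x, z statistic, two-sided p-value).
    """
    pooled = sorted(list(x) + list(y))
    n = len(pooled)
    ranks = {}       # value -> midrank shared by all tied observations
    tie_term = 0.0   # sum of (t^3 - t) over groups of t tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        ranks[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:   # every observation identical: no evidence of a difference
        return u_x, 0.0, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, z, p


# Hypothetical 5-point ratings (1 = very dissatisfied, 5 = very satisfied).
overall_first = [5, 4, 4, 5, 3, 4, 5, 4]
overall_last = [3, 4, 2, 3, 4, 3, 5, 3]
u_value, z_value, p_value = mann_whitney_u(overall_first, overall_last)
print(u_value, round(z_value, 3), round(p_value, 4))
```

Because a 5-point scale produces many tied values, the tie correction matters here; omitting it would overstate the variance of U and make the test more conservative.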