Assessing Information Integration Processes Using Between- or Within-Subjects Designs: Some More Evidence

Within-subject designs (WSDs) remain underappreciated in psychology, although many experimental tactics can reduce or eliminate the demand and order effects that WSDs tend to create. Comparative studies conducted in the Information Integration Theory (IIT) framework have shown that patterns of results observed using WSDs can largely be replicated using between-subject designs (BSDs). To add to this evidence, three additional studies were conducted to complement the data obtained in previous studies. One of these studies was about health risk perception and tested whether evidence for a disjunctive rule of information integration can be found using a BSD. The other two studies focused on the valuation process of IIT. The new findings regarding the disjunctive rule added support to the view that equivalent results can be obtained either with a highly economical repeated-measures design or with a much costlier independent factorial group arrangement. However, when the focus was on the valuation process and not on the integration process, ratings obtained in the BSD condition seemed to be restricted to a limited range of values by comparison with ratings obtained in the WSD condition. An explanation in terms of context effects is offered.

Despite their interest, within-subjects designs (WSDs) remain unwelcome in psychology. They are in bad repute for several reasons. Firstly, they are said to create demand effects. In WSDs, participants are presented with several stimuli in close succession. As these stimuli obviously vary as a function of one or several parameters, (a) participants' attention would be attracted by the information that varies from one stimulus to the other and (b) this information would, as a result, be given more weight during their judgment process than the information that does not vary. This belief is still very strong despite evidence showing that when the judgment task is borrowed from daily life, such a possibility is unlikely (Mullet et al., 2012, 2016). In field contexts, participants know in advance what information is, from their own perspective, important or not important. Huge variations in a factor that they consider irrelevant, given the judgment task at hand, would not make any difference. Nevertheless, in laboratory settings, such a possibility exists when the task is, from the participants' perspective, mostly meaningless. In this case, however, the real issue is: Why present people with material that does not make sense to them (Mullet, 2012)?
Secondly, they are said to create order effects. Because participants in WSDs are presented with stimuli in close succession, the first stimulus and the participant's immediate responses to it would contaminate the way the second one is perceived, processed and responded to. This second belief is also still very strong despite evidence showing that many experimental precautions can reduce or eliminate order effects. For example, a prior session of familiarization with the stimuli can eliminate warm-up, practice and learning effects (Anderson, 2001, 2002).
Thirdly, findings from WSDs are said not to be generalizable to the real world because the structure of the world is of a between-subjects kind. This third belief has been strong among authors working in the decision-making field although it is not only false but absurd. The flow of life makes us pass without a break from one situation to another, which is, for obvious reasons, closely similar to the previous one. To illustrate, think of doctors who receive, in close succession, several patients in consultation. These doctors' diagnoses and recommendations can in no way be made independently of professional training and professional experience; that is, of memory for past and current experiences. Think also in evolutionary terms. If the world structure were of a between-subjects type, that is, if it prevented living organisms from becoming aware of variations in the environment, no evolution would have been possible.

Six previous studies conducted in the Information Integration Theory framework
Methodological studies conducted in the Information Integration Theory framework (Anderson, 1996, 2016, 2018) have shown that findings obtained using WSDs can be replicated using BSDs. Howe and Loftus (1992) used scenarios that depicted a fight between two persons. Two factors were considered in these scenarios: (a) level of intent to harm on the part of the aggressor, and (b) consequences of the fight (e.g., severe injury). In the WSD condition, twelve participants were presented with the whole set of eight scenarios, in close succession. They were to judge the level of blame the aggressor deserved in each case. The BSD condition involved 96 participants, split into eight groups of twelve, who were each presented with only one scenario. In both conditions, (a) intent had the most substantial effect, (b) consequences had a weaker, although significant, effect and (c) no Intent x Consequences interaction was detected. In other words, the Blame = Intent + Consequences rule suggested by Leon (1980) was supported by both sets of empirical findings.
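The absence of an interaction is what an additive rule predicts: the curves for the different intent levels are parallel. The following is a minimal sketch of that property. The 2 x 4 layout and all scale values are hypothetical, chosen only for illustration; they are not data from Howe and Loftus (1992).

```python
# Hypothetical subjective values (illustrative only, not from the cited studies).
intent = {"low": 2, "high": 8}
consequences = {"none": 0, "mild": 1, "moderate": 3, "severe": 4}

def blame(i, c):
    """Additive rule: Blame = Intent + Consequences."""
    return intent[i] + consequences[c]

# One list of cell values per intent level, one entry per consequence level.
table = {i: [blame(i, c) for c in consequences] for i in intent}

# Under an additive rule the gap between the two curves is the same at
# every consequence level, so the curves are parallel (no interaction).
gaps = [hi - lo for lo, hi in zip(table["low"], table["high"])]
print(table)
print(gaps)
```

A constant gap across the consequence levels corresponds to a null Intent x Consequences interaction in the factorial plot.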
Mullet and Chasseigne (2017) also found support for additive-type integration rules using scenarios in which participants expressed willingness to forgive offenses under various circumstances. Each of their 376 participants was presented with one scenario depicting a situation in which Person A severely and intentionally hurt Person B, and then has or has not apologized for the harmful act. Intent, consequences, and apologies had significant effects and, more importantly, none of the two- and three-way interactions was significant. In other words, the Willingness to Forgive = Intent + Consequences - Apologies rule suggested by Girard and Mullet (1997) was supported by empirical findings resulting from the use of a BSD. Apologies had, however, a weaker effect in the BSD condition than in the original WSD study.
Howe and Loftus (1996) used scenarios that depicted legal punishments that varied as a function of certainty and severity. Their set of 216 participants was split into nine groups of 24. Each participant was presented with only one scenario and asked to judge the level of deterrence that could be expected from this kind of punishment. In this BSD condition, certainty and severity had a significant effect and, more importantly, the Certainty x Severity interaction was significant. The pattern of mean responses was fan-shaped and was open to the right. In other words, the Deterrence = Certainty x Severity rule previously found by these authors using a WSD was also found using a BSD.
Mullet and Chasseigne (2017) also found support for multiplicative-type integration rules using scenarios in which participants expressed probable consumers' intoxication levels as a function of the type of drink consumed (beer, wine or fortified wine) and the number of glasses of this drink (one, three or five). The size of the glass was kept constant. Their 376 participants were split into nine independent groups and each participant was presented with only one scenario. Type of drink and the number of glasses had significant effects and, more importantly, the Type x Number interaction was significant. The pattern of mean responses was fan-shaped and was open to the right. In other words, the Intoxication = Type of drink x Number of glasses multiplicative-type rule suggested by Muñoz Sastre, Mullet, and Sorum (2000) was also found using a BSD.
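A multiplicative rule produces the fan open to the right because the distance between curves grows with the level of the other factor. The sketch below illustrates this with the three drinks and the three numbers of glasses used in the study; the subjective strength values assigned to each drink are hypothetical.

```python
# Hypothetical subjective strength of each drink (illustrative only).
drink_strength = {"beer": 1, "wine": 2, "fortified wine": 3}
glasses = [1, 3, 5]

def intoxication(drink, n):
    """Multiplicative rule: Intoxication = Type of drink x Number of glasses."""
    return drink_strength[drink] * n

table = {d: [intoxication(d, n) for n in glasses] for d in drink_strength}

# Distance between the highest and the lowest curve at each number of glasses.
spread = [table["fortified wine"][k] - table["beer"][k]
          for k in range(len(glasses))]
print(table)
print(spread)
```

The spread increases from left to right, which is the diverging pattern described above.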
The designs used by Howe and Loftus in their two studies (1992, 1996) did not allow testing averaging information integration rules because they included no scenario in which a piece of information was missing. Mullet and Chasseigne (2017) tested whether averaging rules of information integration can be detected using complete and incomplete information BSDs. They presented two thirds of their 316 participants with one of 12 scenarios depicting a situation in which two persons share variable levels of passion for each other, experience variable levels of intimacy, and have or have not decided to marry (Sternberg, 1997). The remaining participants were presented with one of six scenarios depicting a similar situation but in which the commitment information was missing. In both conditions, participants were asked to assess the level of true love experienced by these persons. In the complete information condition, passion, intimacy, and commitment had significant effects, and none of the interactions was significant. The effects of passion and intimacy were, however, significantly weaker in this condition than in the incomplete information condition. This set of findings is the signature of an equal-weight averaging process of information integration. In other words, the Love = (w Passion + w' Intimacy + w'' Commitment) / (w + w' + w'') information integration rule suggested by Falconi and Mullet (2003) can also be found using a BSD. Ratings were, however, higher in the BSD condition than in the Falconi and Mullet (2003) study.
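The reason a weaker effect under complete information points to averaging can be seen in a small numerical sketch. An adding rule would predict the same passion effect whether or not commitment is described, whereas an averaging rule dilutes it. The scale values below are hypothetical, the weights are all set to 1, and the initial-state term sometimes included in averaging models is left out for simplicity.

```python
def average(values, weights):
    """Weighted average of the scale values that are actually presented."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical scale values on an arbitrary 0-10 scale; equal weights of 1.
passion_low, passion_high = 2.0, 8.0
intimacy, commitment = 5.0, 5.0

# Incomplete information: only passion and intimacy are described.
effect_incomplete = (average([passion_high, intimacy], [1, 1])
                     - average([passion_low, intimacy], [1, 1]))

# Complete information: commitment is described as well.
effect_complete = (average([passion_high, intimacy, commitment], [1, 1, 1])
                   - average([passion_low, intimacy, commitment], [1, 1, 1]))

print(effect_incomplete, effect_complete)
```

With these values the passion effect is 3 points when commitment is missing and 2 points when it is present, the direction of difference reported above.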
In a subsequent study, Mullet and Chasseigne (2017) also tested whether more complex, differential-weight averaging rules of information integration can be detected using, as in the previous study, complete and incomplete information BSDs. They presented two thirds of their 316 participants with one of 12 scenarios describing a person in terms of levels of agreeableness, openness and emotional stability (McCrae & Costa, 1996). The remaining participants were presented with one of six scenarios depicting a similar situation but in which the agreeableness information was missing. In both conditions, participants were asked to assess the level of attractiveness of the person depicted by the two or three adjectives. In the complete information condition, agreeableness, openness, and emotional stability had significant effects, and all two- and three-way interactions were significant. This result is consistent with the view that a differential weighting scheme has been implemented. The effects of openness and emotional stability were, however, significantly weaker in this condition than in the incomplete information condition. This second result is consistent with the view that an averaging scheme has been implemented. Taken together, both sets of findings are the signature of a differential-weight averaging process of information integration. In other words, the Attractiveness = (w_a Agreeableness + w_o' Openness + w_s'' Stability) / (w_a + w_o' + w_s'') information integration rule suggested by Cretenet, Mullet, and Dru (2015; see also Mullet, Cretenet, & Dru, 2014), in which the weight attributed to each factor depends on the level of this factor, can also be found using a BSD.
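When weights depend on the level of a factor, an averaging rule no longer yields parallel curves, which is why significant interactions are read as evidence of differential weighting. The sketch below uses only two factors for brevity, and the scale values and weights (low levels weighing more than high levels) are hypothetical assumptions, not estimates from the cited studies.

```python
def average(values, weights):
    """Weighted average of the presented scale values."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical scale values and level-dependent weights (illustrative only).
scale = {"low": 2.0, "high": 8.0}
weight = {"low": 3.0, "high": 1.0}

def attractiveness(openness, stability):
    """Differential-weight averaging over two factors."""
    levels = [openness, stability]
    return average([scale[l] for l in levels], [weight[l] for l in levels])

# Effect of openness (high minus low) at each level of emotional stability.
effect_when_stability_low = (attractiveness("high", "low")
                             - attractiveness("low", "low"))
effect_when_stability_high = (attractiveness("high", "high")
                              - attractiveness("low", "high"))

print(effect_when_stability_low, effect_when_stability_high)
```

The openness effect differs across stability levels (1.5 versus 4.5 here), so the curves are not parallel; with equal weights the two effects would be identical.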

The present set of studies
The present set of studies was aimed at completing the set of studies conducted by Howe and Loftus (1992, 1996) and by Mullet and Chasseigne (2017). As in these works, findings obtained using BSDs will be compared with findings obtained through the use of WSDs. The first study was about health risk perception and the disjunctive rule of information integration. The second study was also about health risk perception but focused on the valuation process of information integration.
Disjunction. In studies about risk perception, a pattern opposite to the fan-shaped and open-to-the-right one observed when a multiplicative rule is operative has been repeatedly observed (e.g., Hermand, Mullet & Lavieville, 1997). This pattern corresponds to the use of a disjunctive rule of information integration. In order to test whether such a pattern can be observed in a BSD condition, nine scenarios depicting a person's levels of daily intake of alcohol and tobacco were presented to 376 participants.
An example of one story is the following: "Each morning, Christopher smokes some cigarettes. At midday, he is used to have lunch in the company of his colleagues and to drink some glasses of wine. At the end of the day, he has usually smoked about one pack of cigarettes and drunk the equivalent of one liter of red wine. To what extent do you think that he runs the risk of having a type of cancer associated with this daily consumption?" Ratings were provided on a 16-point scale anchored by not a risk at all (0) and very high risk (15).

Figure 1 Patterns of ratings observed in the risk of cancer study under the Between-Subject condition (the left-hand panel) and the Within-Subject condition (the right-hand panel).
In each panel, the mean levels of judged risk are on the y-axis, the three levels of the alcohol consumption factor are on the x-axis, and the three curves correspond to the three levels of tobacco intake.
The panel on the left of Figure 1 shows the main results observed in the BSD condition. Curves are ascending and separated. In other words, both factors had the expected effect on the judged level of cancer risk. The pattern of curves is fan-shaped and open to the left, that is, an interaction was present, which, in this case, is the signature of a disjunctive process of information integration. Participants considered that indulging in only one of these two risky behaviors represented a high health risk. This pattern is highly similar to the one shown in the right panel, which presents results borrowed from the study by Hermand et al. (1997), in which the same scenarios were presented but a WSD was used. (Responses from 20 participants also aged 21-26, from both genders, were randomly selected.)
In each panel of Figure 2, the mean levels of judged risk are on the inverted y-axis, the three levels of the alcohol consumption factor are on the x-axis, and the three curves correspond to the three levels of tobacco intake.
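A fan open to the left means that the curves converge as alcohol consumption increases. One possible formalization of a disjunctive rule, used here purely as an assumption, is Risk = 1 - (1 - a)(1 - t), where a and t are subjective values between 0 and 1 for alcohol and tobacco; the source does not specify an algebraic form, and the values below are hypothetical.

```python
# Hypothetical subjective values on a 0-1 scale (illustrative only).
alcohol = [0.2, 0.5, 0.8]
tobacco = [0.2, 0.5, 0.8]

def risk(a, t):
    """Assumed disjunctive-type rule: risk is high if either factor is high."""
    return 1 - (1 - a) * (1 - t)

# One row per tobacco level (one curve), one column per alcohol level.
table = [[risk(a, t) for a in alcohol] for t in tobacco]

# Distance between the highest and the lowest curve at each alcohol level.
spread = [round(table[-1][k] - table[0][k], 2) for k in range(len(alcohol))]
print(spread)
```

The spread shrinks from left to right, and a high level on a single factor is already enough to produce a high judged risk, as in the pattern described above.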
An ANOVA on the whole set of data was performed with a Condition (BSD vs. WSD) x Tobacco Consumption x Alcohol Intake, 2 x 3 x 3 design. The Condition effect was not significant. The Tobacco x Alcohol interaction was significant and concentrated in its bilinear component, F(1, 359) = 24.79, p < .001. The Condition x Tobacco x Alcohol interaction was not significant, which is consistent with the view that the disjunctive rule of information integration holds similarly in both conditions.
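The bilinear component can be illustrated with a linear x linear contrast computed on a 3 x 3 table of cell means. The sketch below assumes equally spaced levels and the usual linear coefficients (-1, 0, 1); both tables of means are hypothetical, and the sketch returns only the contrast value, not the F test reported above, which requires the raw data and an error term.

```python
def bilinear_contrast(means, row_coefs=(-1, 0, 1), col_coefs=(-1, 0, 1)):
    """Linear x linear interaction contrast on a table of cell means."""
    return sum(r * c * means[i][j]
               for i, r in enumerate(row_coefs)
               for j, c in enumerate(col_coefs))

# Hypothetical cell means on a 0-15 scale: rows are tobacco levels,
# columns are alcohol levels.
converging = [[4, 8, 12], [8, 10, 13], [12, 13, 14]]   # fan open to the left
parallel = [[4, 8, 12], [6, 10, 14], [8, 12, 16]]      # additive pattern

print(bilinear_contrast(converging))
print(bilinear_contrast(parallel))
```

A negative value indicates converging curves, a positive value diverging curves, and zero indicates no linear x linear interaction.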
This finding was, in a certain way, implied by the findings reported by Howe and Loftus (1996) and by Mullet and Chasseigne (2017). If the response scale were inverted (as if participants were expected to assess levels of healthiness), and if the values on the x-axis were also inverted, the patterns of responses observed would be, as shown in Figure 2, very similar to the ones observed each time a multiplicative rule of information integration is applied.
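This equivalence can be made explicit under one possible formalization of the disjunctive rule. The form below is an assumption for illustration, not a formula given in the studies cited: $R$ is judged risk, and $a$ and $t$ are the subjective values of alcohol and tobacco intake, all on a 0-1 scale.

```latex
R = a + t - a\,t = 1 - (1 - a)(1 - t)
\quad\Longrightarrow\quad
\underbrace{1 - R}_{\text{judged healthiness}} = (1 - a)\,(1 - t)
```

Inverting the response scale ($1 - R$) and the stimulus scales ($1 - a$, $1 - t$) thus turns the assumed disjunctive rule into a product of two terms, which is the multiplicative form.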
Valuing. According to Anderson (1996), a series of three functions (Valuation, Integration, and Response) converts a set of external or internal stimuli into a single judgment. The Valuation function converts stimuli (e.g., number of glasses of some drink or feeling of unease) into concurrent psychological representations (e.g., elevated risk of cancer). In other words, this function assigns a value to each stimulus, and this value depends on the type and intensity of the stimulus and on the goal pursued (e.g., assessing health risk). Through psychological Integration, these values are weighted and combined into an overall implicit response. Finally, the judgment is generated by the Response function.
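The chain of three functions can be sketched as a simple pipeline. Everything specific in the sketch is an assumption made for illustration: the linear valuation functions, the ceilings of 40 cigarettes and 8 glasses, the weights, the averaging form chosen for integration, and the linear mapping onto a 0-15 rating scale.

```python
def valuation(cigarettes_per_day, glasses_per_day):
    """Map physical stimuli to subjective values on a 0-1 scale (assumed linear)."""
    s_tobacco = min(cigarettes_per_day / 40, 1.0)   # 40 is a hypothetical ceiling
    s_alcohol = min(glasses_per_day / 8, 1.0)       # 8 is a hypothetical ceiling
    return s_tobacco, s_alcohol

def integration(s_tobacco, s_alcohol, w_tobacco=2.0, w_alcohol=1.0):
    """Weight and combine subjective values into an implicit response
    (a weighted average is assumed here; weights are hypothetical)."""
    total = w_tobacco * s_tobacco + w_alcohol * s_alcohol
    return total / (w_tobacco + w_alcohol)

def response(implicit, scale_max=15):
    """Map the implicit response onto the observable rating scale."""
    return round(implicit * scale_max)

def judge(cigarettes_per_day, glasses_per_day):
    """Valuation, then Integration, then Response."""
    return response(integration(*valuation(cigarettes_per_day, glasses_per_day)))

print(judge(0, 0), judge(20, 2), judge(40, 8))
```

Studies focused on integration try to identify the form of the middle function, whereas studies focused on valuation try to trace the shape of the first one.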
In most IIT studies, the researchers' focus is on the integration process. In the study by Hermand et al. (1997) reported above, the focus was on the disjunctive rule of information integration, and the levels of the tobacco intake factor, for example, were selected in order to maximize the chances of correctly diagnosing the integration rule. In some studies, however, the focus is on the valuation process. In these studies, one factor (the factor of interest) has many levels and the other factor, the role of which is purely technical, has only two or three levels. For example, in a study conducted by Muñoz Sastre, Mullet, and Sorum (1999), one of the factors (tobacco consumption) had six levels and the other factor (type of tobacco) had three levels. The objective of the study was to examine whether lay people considered that the risk of cancer was a linear function of tobacco consumption. An example of a scenario used was the following: "Sydney smokes 35 cigarettes daily. He usually smokes light tobacco. To what extent do you think that he runs the risk of having a lung cancer associated with this daily consumption?" Ratings were provided on a 16-point scale anchored by no risk at all (0) and very high risk (15).
As this was the first instance of an apparent discrepancy between results obtained in BSD and WSD conditions, a second study was run that also focused on the valuation process. The scenarios used were borrowed from Muñoz Sastre, Pecarisi, Legrain, Mullet and Sorum (2007), who examined the acceptability of induced abortion among adolescents as a function of the fetus' age. An example of a scenario used is the following: "Pauline is 15 years old. She has always wanted to go to college. She is one-month pregnant and has told her doctor she wants an abortion. Her parents consent, as does her boyfriend. To what extent do you think that abortion is an acceptable procedure for Pauline in this case?" Ratings were provided on a 16-point scale anchored by not at all acceptable (0) and completely acceptable (15). The main results are shown in Figure 4, in which the mean levels of judged acceptability are on the y-axis, the nine levels of the fetus' age factor are on the x-axis, and the two curves correspond to the two levels of the adolescent's age factor. Curves are descending: the older the fetus, the less acceptable the procedure. Curves were, however, less steep and much higher than the ones shown in the original study. For example, at four months, the mean acceptability rating reported in the original paper was 5.15 (on a 0-15 scale). It was higher than 10 in the present study.

Discussion
Our new findings regarding the disjunctive rule of information integration add support to the view already expressed by Howe and Loftus (1996) that equivalent results can be obtained either with a highly economical repeated-measures design or with a much costlier independent factorial group arrangement. There is, however, one caveat to this optimistic stance. When the focus of the study is on the valuation process and not on the integration process, ratings obtained in the BSD condition seem to be restricted to a limited range of values by comparison with ratings obtained in the WSD condition.
There is no independent touchstone that would allow deciding which set of findings, the one that results from the application of a BSD or the one that results from the application of a WSD, is true and which is false. In other words, there is no independent touchstone that would allow deciding which methodology is, given the purpose of the study, appropriate and which is flawed. In view of Figures 3 and 4, it seems fair, however, to consider that, when the purpose of a study is to examine the values that people attach to different levels of a factor, using WSDs is a better option than using BSDs. This view rests on the empirical fact that discrimination between levels of the factor of interest (tobacco intake) is more fine-grained if a WSD rather than a BSD is used.
Findings observed in the two valuation studies are reminiscent of Birnbaum's (1999) results. This author had participants assess the subjective size of numbers using a 10-point scale, ranging from "very very small" to "very very large". Some participants judged the subjective size of the number 9 and other participants judged the subjective size of the number 221. Mean ratings in the 9 condition were significantly higher than ratings in the 221 condition.
According to Birnbaum (1999), the reason behind this perplexing finding was that when presented alone, the number 221 recalls a context of three-digit numbers (ranging from 0 to 999) among which, its first digit being 2, it may seem quite small. In contrast, the number 9 recalls a context of one-digit numbers (ranging from 0 to 9) among which it may seem quite large. When presented together or in close succession in the same session, both numbers (9 and 221) were, of course, judged entirely differently, and 221 was judged much higher than 9. In our two valuation studies, different contexts were, probably in the same way as in Birnbaum's study, evoked depending, for example, on whether the number of cigarettes was only 5, or 15, or as high as 30. Our results were not as perplexing as the ones reported by Birnbaum, but an evident shrinkage in subjective values associated with tobacco consumption levels was observed with the BSD.
As a result, neither Birnbaum's perplexing finding nor the new findings reported here in the BSD condition should be interpreted as additional demonstrations of cognitive biases in people's judgment capacities (Fox, 1992). They should rather be read as an additional demonstration that the exclusive use of BSDs can lead researchers to gather confusing data, which can lead them and the scientific community as a whole to enduring misconceptions regarding human performance (Mullet, 2012).
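Birnbaum's explanation can be sketched as a judgment made relative to the range that a stimulus evokes. The mapping below is a deliberate simplification adopted as an assumption: a number presented alone evokes the range from 0 to the largest number with the same count of digits, and its subjective size is its relative position in that range.

```python
def evoked_range(number):
    """Assumed context when a number is presented alone: 0 up to the
    largest number with the same count of digits (0-9, 0-999, ...)."""
    digits = len(str(number))
    return 0, 10 ** digits - 1

def relative_size(number, context=None):
    """Position of the number within its context, from 0 (small) to 1 (large)."""
    low, high = evoked_range(number) if context is None else context
    return (number - low) / (high - low)

# Between-subjects: each number is judged alone, in its own evoked context.
alone_9, alone_221 = relative_size(9), relative_size(221)

# Within-subjects: both numbers are judged in one shared context.
shared = (0, 999)
shared_9, shared_221 = relative_size(9, shared), relative_size(221, shared)

print(alone_9, round(alone_221, 3))
print(round(shared_9, 3), round(shared_221, 3))
```

Judged alone, 9 sits at the top of its range and 221 in the lower part of its range, so 9 appears larger; judged in a shared context, the ordering is the expected one.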

Figure 2 Patterns of ratings observed in the risk of cancer study under the Between-Subject condition (the left-hand panel) and the Within-Subject condition (the right-hand panel).

Figure 3 Patterns of ratings observed in the second risk of cancer study under the Between-Subject condition (the left-hand panel) and the Within-Subject condition (the right-hand panel). In each panel, the mean levels of judged risk are on the y-axis, the six levels of the tobacco intake factor are on the x-axis, and the three curves correspond to the three levels of nicotine content.

Figure 4 Patterns of ratings observed in the acceptability of abortion study under the Between-Subject condition