多模态语篇的连贯构建研究 以中国英语学习广告为例 英文版
7.1.4 Results concerning the three levels of coherence

The results are reported and discussed with reference to the three levels of coherence: global coherence, local coherence and surface cohesion. Specifically, Subsections 7.1.4.1, 7.1.4.2 and 7.1.4.3 elaborate respectively on the possible effects of global coherence, local coherence and surface cohesion on eye movement.

7.1.4.1 Global coherence

Before the data are reported and analyzed, a way of measuring the global coherence of the experiment materials must be devised, for two reasons: (i) to provide numeric data on global coherence, which makes statistical analysis with the eye movement data and other empirical data possible later in this chapter, and (ii) to investigate the extent to which the reader's expectation about the discourse correlates with the topical structure represented in the discourse. To this end, a small-scale survey on global coherence was conducted. This survey is introduced before the results of the eye-tracking experiment are reported and discussed.

7.1.4.1.1 A survey on reader's expectation of global coherence

The survey was designed to investigate the reader's expectation of the topics in the English learning advertisements. The underlying theoretical assumption derives from the discussion of the role of the reader's mental frame in the construction of global coherence (Chapter 4) and the central position of reader expectation in the mental frame (Section 2.1.2.2).

In the survey, 11 graduate students (5 females and 6 males, mean age = 26.6) at a well-known Chinese university were individually shown, on a computer, a series of discrete semiotic elements excerpted from the English learning advertisements (the same set of advertisements was also used as stimuli in the eye-tracking experiment of Chapter 7). As an important target group of these English training companies, the participants all had background knowledge of, or even experience with, the practices of profit-making English training services in China. None of them knew the specific purpose of the survey. They were asked to judge how well these semiotic elements fit in advertisements for English language schools, and gave their answers on a five-point Likert scale (ranging from 1 = not suitable at all to 5 = very suitable).

The results show that the elements can be grouped into four types according to their ratings. The highest rated elements (above 4) depict teaching activities or scenes with foreign teachers and Chinese students. The second highest rated (from 3 to 4) show people in the context of English learning and teaching, such as teachers represented by images of middle-aged foreigners, often in formal dress and accompanied by books. The third highest rated (from 2 to 3) are related not to English learning proper but to abstract concepts attached to it, such as hope (images of a paper plane or a ship), dream (image of a winged girl), opportunity (image of a gold key), success (image of a wine toast), going abroad (images of Big Ben or the Statue of Liberty) and upper-middle-class life. The lowest rated (below 2) are those with no relationship to either English (often represented by foreignness in this case) or teaching and learning, for example images of a balloon, a bumper car game and cartoon characters.

To triangulate the result, a brief interview with 4 of the participants was conducted approximately three months later. The interview comprised three questions. The first was “What do you expect to see in the advertisements of English language schools or companies?” The most popular answers were: (i) information about the courses they provide, (ii) why their courses are different from, or better than, those of other companies, and (iii) the teaching team they employ. The second question was “What factor(s) will you consider the most important if you are going to choose such an English training course?” The answers concentrated heavily on the quality of teaching. The third question was “What pictures do you expect to see in the leaflets of English language schools/companies, if there are any?” The answers were nearly unanimous: pictures showing teaching activities.

The triangulation of the survey and interview results shows that the reader's expectation about the discourse topical structure in these advertisements is rather consistent. Further, a comparison between these results and the textual analysis of the discourse topical structure in Section 4.3 shows that the reader's expectation matches, to a large extent, the actual representation in these advertisements. That is, readers predict that information about courses and their competitiveness will be obligatory in such discourse types and that pictures showing teaching/learning activities will most probably occur. These predictions are largely confirmed in the analysis of the stages and the semiotic patterns in these English learning ads. This suggests that the reader's processing and comprehension of the local relations in the texts take place under the top-down influence of the reader's cognition of the global topical network in the discourse.

Since the participants in this small survey were similar in age and background to those who participated in the empirical studies in Chapter 7, their rating data can also be used in the statistical analysis of the results of the eye-tracking experiment (in this section) and the memory recognition test (Section 7.2), and the qualitative data will be used in the discussion of the results of the evaluation investigation (Section 7.3).

7.1.4.1.2 Hypothesis on global coherence

Although van Dijk and Kintsch (1983) hold in their situation model that the reader's knowledge about superordinate ideas plays a part in building and retaining macrostructure, they admit that no ideal discourse comprehension model perfectly links the textual aspect with knowledge. Therefore, the experiment on global coherence in this chapter does not start from rigid hypotheses about the concrete ways in which global coherence affects discourse processing. Rather, it is an exploratory experiment, not a strictly controlled one designed to confirm or reject particular causal relationships between the reader's expectation of global coherence and his/her reading behavior. The hypothesis here is accordingly exploratory in nature.

A tentative prediction is formed on the basis of assumptions about cognitive mechanisms. Global coherence is related to the reader's expectation in terms of content. If a sign, either a picture or a verbal proposition, is surprising or unexpected to the reader/viewer, it tends to attract more attention, whereas a sign that is totally predictable in the context will be uninteresting to the reader/viewer and thus receive shorter gaze time. The hypothesis is therefore: a sign, either a picture or a verbal proposition, with lower ratings of global coherence tends to receive more looking time, whereas a picture or verbal proposition with higher ratings of global coherence is more likely to receive less looking time.

It should be noted that only pictorial signs are considered in the analysis; the verbal elements are not. The reason is as follows. The materials used in this experiment are the Reader Attraction (RA) parts of the English learning advertisements, specifically either the cover page of a leaflet or brochure or the most salient section of a flyer. They contain the key information of the overall document, but not the detailed information of a specific stage in the genre. The most frequently occurring pieces of information on these RA pages are the brand and logo of the English training company, the theme picture and the slogan. Some also include the name of the advertised course/program as the headline, or a paragraph that extends the slogan. All this verbal information is closely related to English learning, so it has a uniformly high degree of global coherence and is uninteresting to study with regard to global coherence and its possible influence on reading behavior. In contrast, the pictures on the RA materials differ greatly in content, which bears on both global coherence and local verbal-pictorial coherence. The hypothesis is therefore modified and rephrased as follows:

H1: A picture with lower ratings of global coherence tends to receive more looking time, whereas a picture with higher ratings of global coherence is more likely to receive less looking time.

7.1.4.1.3 Results and discussion

In order to conduct quantitative analysis of the results, global coherence needs to be measured numerically. For this purpose, the ratings from the small-scale survey described in Subsection 7.1.4.1.1 were used. The survey participants can be seen as comparable to the participants in the eye-tracking experiment because, as graduate students at Peking University, they have opportunities to visit foreign universities either during or after their master's program. Given this similarity in the participants' background, the ratings of global coherence were considered valid for the data analysis in this section.

The hypothesis concerning global coherence (H1) predicts a negative correlation between the global coherence of a pictorial image and its gaze time and, consequently, its share of the overall gaze time spent on the picture and the verbal slogan.

Given the unequal amount of information and the diverse patterns of spatial arrangement on each experiment page, absolute gaze time might not be an adequate indicator of the attention spent on the pictorial portion of the page. For this reason, a second indicator was introduced: the share of picture gaze time in the overall gaze time on picture and verbal slogan, that is,

Picture ratio = picture gaze time / (picture gaze time + slogan gaze time)

This indicator tells us the relative amount of attention participants spend looking at the picture. As the Pearson correlation test shows, the two indicators, the absolute picture gaze time and the relative picture ratio, are significantly correlated (r = 0.798, p < 0.01). This is to be expected, since picture gaze time enters both measures.
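The picture ratio defined above can be sketched in a few lines of Python. This is only an illustration: the function name `picture_ratio` and the gaze times in the example are hypothetical and are not taken from the experiment data.

```python
def picture_ratio(picture_ms: float, slogan_ms: float) -> float:
    """Share of picture gaze time in the combined picture + slogan gaze time."""
    total = picture_ms + slogan_ms
    if total <= 0:
        raise ValueError("combined gaze time must be positive")
    return picture_ms / total


# Hypothetical gaze times in milliseconds (illustrative only).
example = picture_ratio(picture_ms=600.0, slogan_ms=900.0)
print(round(example, 4))
```

Because the ratio is bounded between 0 and 1, it allows pages with very different overall looking times to be compared on the same scale.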

A Pearson correlation test was performed to test Hypothesis 1. The results (see Table 7.2 below) show that there is no significant correlation between global coherence and the gaze time of the picture, nor between global coherence and the picture ratio.

Table 7.2 Correlation test between global coherence ratings and the picture gaze time(absolute and relative)
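For readers who wish to reproduce this kind of test, the Pearson coefficient can be computed as in the following sketch. The original analysis was run in a statistics package; the function `pearson_r` and the sample ratings and gaze times below are assumptions made for illustration and do not reproduce Table 7.2.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-page values: mean coherence rating and picture gaze time (ms).
ratings = [2.1, 2.4, 3.2, 3.6, 4.3]
gaze_ms = [800.0, 1500.0, 400.0, 550.0, 900.0]
print(round(pearson_r(ratings, gaze_ms), 3))
```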

The absence of correlation between global coherence and picture gaze time (absolute and relative) might suggest that the hypothesis is not supported. However, this conclusion must be drawn with caution. As mentioned in the previous section, the materials used in the experiment are all natural, authentic data, which means that they differ in many respects, for example (i) the amount of information, (ii) patterns of spatial arrangement and layout, and (iii) the content of the pictorial and verbal information, which might differ in attractiveness to readers/viewers. The experiment is therefore not a strictly controlled or manipulated one in which a single variable is tested and the possible influences of all other variables are excluded. Instead, the various interfering variables introduce a great deal of uncertainty. The first variable, information amount, means that some pages show only one person or one object whereas others show three or four persons; pages containing more people or objects are very likely to lead to longer looking time. The second variable, spatial pattern, can mean two things. One is that some pages have a more complicated composition than others: signs in a simpler composition tend to be more prominent and thus attract more attention than those packed into a complicated structure. The other is that the spatial arrangement of different types of signs may differ considerably; for instance, the picture might be placed above or below the slogan, and the verbal information might be placed in one area or in two separate areas. The third variable is a familiar one in cognitive research, where images of different content have been found to differ in their attractiveness to human perception. For example, moving objects are more eye-catching than static objects, and images of people tend to attract more attention than images of objects. All these variables are very likely to lead to varied patterns of attention distribution in actual reading behavior. It is therefore risky to reject the hypothesis on the basis of the correlation test alone, since doing so might miss some interesting findings. A more careful examination of the experiment data is needed.

A possible solution to this problem is to set aside the purely quantitative and statistical approach and conduct an exploratory study on the basis of the existing natural data. This solution proves to be revealing. The procedure is as follows. First, the experiment materials were carefully scrutinized in order to group them, using the above-mentioned variables as criteria. Second, materials that are more or less homogeneous with respect to these variables were treated as members of a comparison group/pair. Third, each comparison group was examined for patterns relevant to the hypothesis.

Table 7.3 below is an example of such a comparison group:

Table 7.3 Comparison group:3 young people

The pictures on all three pages show three young people posing, without holding any objects. The complexity of their spatial structure and layout does not differ substantially either. They can therefore be studied as a comparison group. As Table 7.3 shows, the material labeled “EF_teen” has the lowest mean global coherence score (2.36), the longest picture gaze time (1502 ms) and the highest picture ratio (0.5777). The material labeled “HQSD_grad” has the highest mean global coherence score (3.64), the shortest picture gaze time (548 ms) and the lowest picture ratio (0.2839). The values for the material labeled “HQSD_jump” all lie in between. This pattern accords with the hypothesis that the degree of global coherence of a sign is negatively correlated with the looking time it receives.

Table 7.4 Comparison group:1 person 1 object

Table 7.4 shows the data for another comparison group (a pair). The two pages are paired because both have 1 person and 1 object in the pictorial part and thus a similar amount of information. The page labeled “XDF_wg” shows a young woman standing with a pair of wings on her back, and the page labeled “XDF_horn” shows a young woman blowing a horn, so the attractiveness of the two pictures should not differ much either. The former has a lower mean global coherence score (2.09), a longer gaze time (799 ms) and a higher picture ratio (0.3627). The latter has a higher mean global coherence score (3.18), a shorter gaze time (395 ms) and a lower picture ratio (0.2063).
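The ordering check applied to these comparison groups can be written out as a short sketch. The figures are the ones quoted in the running text for Tables 7.3 and 7.4; “HQSD_jump” is left out because its values are not given in the text. The helper names `strictly_falling` and `follows_h1` are hypothetical and are not part of the original analysis.

```python
# (label, mean global coherence rating, picture gaze time in ms, picture ratio)
GROUP_THREE_YOUNG_PEOPLE = [
    ("EF_teen", 2.36, 1502, 0.5777),
    ("HQSD_grad", 3.64, 548, 0.2839),
]
PAIR_ONE_PERSON_ONE_OBJECT = [
    ("XDF_wg", 2.09, 799, 0.3627),
    ("XDF_horn", 3.18, 395, 0.2063),
]


def strictly_falling(values):
    """True if each value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


def follows_h1(group):
    """True if gaze time and picture ratio both fall as coherence rises."""
    ordered = sorted(group, key=lambda row: row[1])  # ascending coherence
    gaze_times = [row[2] for row in ordered]
    ratios = [row[3] for row in ordered]
    return strictly_falling(gaze_times) and strictly_falling(ratios)


for name, group in [("3 young people", GROUP_THREE_YOUNG_PEOPLE),
                    ("1 person 1 object", PAIR_ONE_PERSON_ONE_OBJECT)]:
    print(name, follows_h1(group))
```

A check of this kind only confirms the direction of the pattern inside a group; with two or three pages per group it says nothing about statistical significance.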

Table 7.5 Comparison group:2 persons

To show the robustness of the negative correlation between global coherence and attention, data from the cover pages in the complete leaflets material series (mentioned in Section 7.1.2.2) are also employed. These materials are cover pages of English learning leaflets and are thus of the same type as those used here, namely the Reader Attraction stage of the English learning advertisements. Of the five cover pages in this material series, two show two persons, and they can be regarded as a comparison group together with another page showing 2 persons in the RA material series. It should be noted in passing that the average looking time per page in the leaflets series is much longer than that in the RA material series.

As shown in Table 7.5, the absolute looking time does not comply with the predicted pattern, which can be attributed to the difference in average looking time mentioned in the previous paragraph. The relative picture ratio, however, does comply with the hypothesis. This finding can be said to support Hypothesis 1.

In summary, because of the interfering variables in natural materials, it is difficult to obtain a neat and convincing pattern of correlation between global coherence and the amount of attention as predicted. However, the exploratory study of the more or less homogeneous comparison groups suggests that the global coherence of a sign in relation to reader expectation does influence the amount of attention it receives in the reading/viewing process. The pattern of this influence is in accordance with Hypothesis 1. Our conclusion is therefore that if the materials are comparable in terms of interfering variables, the hypothesis will be supported. More systematic empirical results will require future experiments with strictly controlled materials.

7.1.4.2 Local coherence

As elaborated in Chapter 5, local coherence is the semantic connection between a sign and its adjacent signs. Local coherence weaves the individual components on the page into a whole and thus renders it an intact text. If the semantic relation between two adjacent parts is strong and close, the reader/viewer is more likely to construct the overall topic of the page cognitively and to comprehend it better.

7.1.4.2.1 Hypothesis on the local coherence

Research in various fields of human communication and information processing has shown that the difficulty of a task correlates with the degree of effort in human cognitive processing. For example, studies on conversational implicature indicate that when participants in a conversation hear an utterance that has no relevance to the previous one, they go through several cognitive steps in order to establish an implicature that can explain the lack of relevance. Eye-tracking studies of reading also reveal that when readers are presented with texts whose sentences are not fully coherent with one another, they try to develop inferences in order to achieve a coherent understanding of the text, and thus show more look-back fixations and longer reading times. In the multimodal domain, Carney and Levin (2002) report that the difficulty of multimodal learning materials influences learners' reading behavior. Specifically, the more difficult the learning content is, the more frequently the learner looks at the visual displays that accompany the text.

It is assumed that pages showing a closer link between component parts will be easier for the reader to process, whereas those showing a weak semantic link between component parts require more effort from the reader's cognitive processing mechanism. That is, when readers look at a page with a strong semantic connection between its component parts, it is easy for them to establish a coherent understanding of it. They do not need to shift their eyes back and forth between the visual and the verbal to infer the connection, and they tend to spend less time processing it. When they look at pages with a weaker semantic connection between the main component parts, they need more effort to establish a coherent and understandable inferred connection in their mind. To do this, they need to examine the visual and the verbal repeatedly, so their eyes will switch between the two more often.

Therefore, the hypothesis concerning the relation between local coherence and eye movement data is:

H2: The stronger the local coherence/semantic connection between adjacent signs, the fewer gaze switches there will be between them.

7.1.4.2.2 Results on local coherence

Hypothesis 2 predicts a negative correlation between verbal-pictorial local coherence/semantic connection and the number of gaze switches between the two parts.

In the materials of the current study, there are usually more than four AOIs defined and thus many switches altogether between all these AOIs. However, only the switches between the picture AOI and the verbal slogan AOI were counted, since our research interest here is the semantic connection between adjacent verbal and pictorial parts. Moreover, the coding of the degree of verbal-pictorial local coherence was based on the semantic relation between the theme picture and the slogan on these advertisement pages. The examination of gaze switches should therefore also be concerned with these two AOIs.

To obtain the data on gaze switches between the picture AOI and the slogan AOI, every participant's gaze plots were carefully examined and the number of gaze switches between the two AOIs on each page was counted by hand. This was very easy in most cases, since the picture and slogan had been divided into different AOIs and the ClearView software had marked the path of the eye gaze on the gaze plot with lines. The number of gaze switches between different AOIs could therefore be obtained by counting the number of lines crossing between the AOIs concerned. For example, in the gaze plot shown in Figure 7.5, there are two lines crossing between the picture area and the slogan area, so the number of picture-slogan gaze switches is 2. The resulting counts are thus generally objective and highly reliable.
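The counting rule described above can be expressed as a short sketch. In the study the counts were made by hand from the gaze plots; the code below assumes, hypothetically, that each fixation has already been assigned an AOI label, and the function name `count_switches` and the sample sequence are illustrative only.

```python
def count_switches(aoi_sequence, a="picture", b="slogan"):
    """Count direct transitions between AOI a and AOI b, in either direction."""
    switches = 0
    for previous, current in zip(aoi_sequence, aoi_sequence[1:]):
        # A switch is a gaze line that starts in one of the two AOIs
        # and ends in the other; passing through a third AOI does not count.
        if {previous, current} == {a, b}:
            switches += 1
    return switches


# Hypothetical sequence of AOI labels, one per fixation, in viewing order.
fixations = ["logo", "picture", "picture", "slogan",
             "slogan", "picture", "logo", "slogan"]
print(count_switches(fixations))  # 2
```

Only direct picture-slogan transitions are counted here, which corresponds to counting the lines that cross between the two AOIs on a gaze plot.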

There are, of course, some rare cases in which the decision requires extra work because of ambiguous fixation points. For example, if a fixation point is located right on the boundary line between two AOIs, it is difficult to decide whether there is a line crossing between the two AOIs concerned. In this case, it falls to the researcher's judgment to decide which AOI this fixation point belongs to and, consequently, whether there is a crossing line. The premise is that the AOIs have been reasonably defined. As mentioned before, in the current study the AOIs are defined on the basis of a careful study of the hot spot as a collective reflection and of every participant's gaze plot as individual evidence. In turn, the rarity of the ambiguous cases supports the validity and precision of the principle of AOI definition adopted here.

Figure 7.8 An example of ambiguous switch

However, resolving the ambiguous cases was not as straightforward as it might sound; it required insight into visual cognition. The gaze plot in Figure 7.8 is an example of an ambiguous case. As it shows, Fixation 500 is located on the boundary between AOI No. 80 (picture) and AOI No. 79 (verbal slogan). In order to judge whether there is a gaze switch between the two AOIs, we must first decide whether Fixation 500 belongs to the picture AOI or the slogan AOI. This requires a qualitative observation of the data, starting with Fixation 500 and its neighboring fixations. As the gaze plot shows, the gaze came from the top right AOI (Fixation 498) and moved first to Fixation 499 in the picture AOI, which was located on the lowest part of the key image. Fixations 498, 499 and 500 lay on the same line of this gaze path. After Fixation 500 the gaze was attracted to the big print “金钥匙” in the right part of the slogan AOI, and then went back to the smaller print in the slogan (Fixations 502 and 503).

As acknowledged in eye-tracking studies, two mechanisms operate together whenever we look at things: overt attention and covert attention. The fixation points recorded by eye-tracking technology reflect only the overt attention in the viewing process, whereas covert attention remains unmeasured. Covert attention is nevertheless very important. As the Tobii Eye Tracking Whitepaper puts it, “a shift of our overall attention is commonly initiated by our covert attention quickly followed by a shift of our overt attention and the corresponding eye movements” (ibid.: 5). This means that people have already begun to see and process things (in their covert attention) when they move their eyes in that direction, before fixating on them. The overt attention represented by fixation points always lags slightly behind covert attention.

According to this principle, Fixation 500 should be assigned to the slogan AOI because it is located almost at the end of the eye movement line from the picture AOI to the slogan AOI. This means that the participant's overt attention has almost arrived at the slogan. Since overt attention always lags a little behind covert attention, we can infer that the participant's covert attention has already begun to read the words below Fixation 500. Fixation 500 is therefore attributed to the slogan area, and the ambiguity about the switch is resolved: there is one crossing line, and thus one switch between the picture AOI and the slogan AOI.

After the manual count of the number of switches between the picture and the slogan, the data were entered into SPSS 13.0 for statistical analysis. The purpose was to run a correlation test to find a possible correlation between the degree of local coherence and the number of gaze switches.

The descriptive statistics of the number of gaze switches in these materials show that the maximum mean number of gaze switches is 3.08 and the minimum is 1.60.

The local coherence of all these materials was coded independently by the author and another graduate student. The inter-rater reliability was 0.74. Where the two raters diverged, the average of the two codings was used.

The result of the Pearson correlation test is r = -0.466 (p > 0.05). That is, the predicted correlation between the degree of local coherence and the number of gaze switches between the picture and verbal slogan areas does not reach significance.

However, considering the high variability of the authentic materials used in the current experiment, it is wise not to reject the hypothesis immediately on the basis of the correlation test. A careful examination of the data is needed. The scatter plot of these data shows that one particular page is very different from the others. This indicates that the page might have some special features which prevent it from fitting into the overall pattern. As shown in Figure 7.9, the problem with this page lies in the inseparability of the verbal phrases from the image of the green tree. It is very different from other pages, where the picture (or at least its central part) and the verbal text can be divided into separate areas. For this page we cannot judge to what degree the participants process and interpret the advertisement through the image of the tree or through the verbal phrases on it, which echo the large print. Accordingly, it is impossible to judge what influence this combination of tree image and verbal phrases in the central AOI had on the participants' reading behavior. This page is therefore not a suitable piece of material for studying pictorial-verbal relations and should be excluded.

Figure 7.9 An exceptional case in terms of local coherence patterns

If this page is excluded from the material list for the correlation test, a significant negative correlation between the picture-slogan local coherence and the number of gaze switches is obtained, as predicted (r = -0.503, p < 0.05).

This means that Hypothesis 2 is supported. That is, the semantic connection between the picture and the slogan has a significant negative correlation with the number of switches between the two areas. The theoretical assumptions about local coherence in multimodal texts and the cognitive processing involved are thus supported.
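How strongly a single page can shift a correlation coefficient is easy to see in a small sketch. The numbers below are hypothetical and do not reproduce the experiment data, and the helper names `pearson_r` and `r_without` are assumptions made for illustration. In the study itself the page was excluded on design grounds (its picture and verbal text could not be separated into different AOIs), not because of its statistical influence.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_without(xs, ys, drop_index):
    """Correlation recomputed after leaving out the page at drop_index."""
    keep = [i for i in range(len(xs)) if i != drop_index]
    return pearson_r([xs[i] for i in keep], [ys[i] for i in keep])


# Hypothetical per-page data: local coherence coding and mean gaze switches.
coherence = [1, 2, 3, 4, 5, 3]
switches = [3.0, 2.6, 2.2, 1.8, 1.4, 3.0]  # the last page breaks the trend

r_all = pearson_r(coherence, switches)
r_trimmed = r_without(coherence, switches, drop_index=5)
print(round(r_all, 3), round(r_trimmed, 3))
```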

The correlation test also shows that the local coherence between picture and slogan and the average fixation duration on the slogan AOIs are significantly negatively correlated (r = -0.718, p < 0.01).

The three factors, namely the average fixation duration on the slogan, the local coherence between picture and slogan, and the number of switches between picture and slogan, are therefore interrelated.

The average fixation duration (AFD) measures the average length of time per fixation, calculated over all the fixations in a given area. It is an indicator of the speed and intensity of the participant's reading/viewing: a high AFD value implies that the participant is reading the sign or signs concerned slowly and carefully. The results of the correlation test can therefore be interpreted as follows. On the one hand, the significant negative correlation between local coherence and AFD indicates that the closer the semantic connection between picture and slogan, the more quickly viewers read the verbal slogan. On the other hand, the significant positive correlation between the number of switches and the AFD of the slogan AOI implies that when participants switch their gaze between the picture AOI and the slogan AOI, their reading speed for the slogan is slow. This finding sheds light on the cognitive mechanism of reading/viewing multimodal texts. When signs/elements are semantically closely related to adjacent signs/elements in a text, people tend to process the information (at least the verbal information) more quickly, and they also tend to switch less between these adjacent elements. This pattern implies that the local coherence or semantic connection between signs is related to ease of comprehension. Specifically, multimodal texts in which signs/elements of different semiotic systems are semantically connected are easier to comprehend, whereas multimodal texts in which signs of different semiotic systems are semantically unrelated are more difficult to understand and thus lead to slower reading/viewing and more gaze switches between the signs/elements.
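The AFD measure described above amounts to a simple mean, as the following sketch shows. The function name `average_fixation_duration` and the durations are hypothetical; in the study the values were taken from the eye-tracking software output.

```python
def average_fixation_duration(durations_ms):
    """Mean duration per fixation over all fixations recorded in one AOI."""
    if not durations_ms:
        raise ValueError("the AOI received no fixations")
    return sum(durations_ms) / len(durations_ms)


# Hypothetical fixation durations (ms) on a slogan AOI for one participant.
slogan_fixations = [180, 220, 260]
print(average_fixation_duration(slogan_fixations))  # 220.0
```

Unlike total gaze time, this measure does not grow with the number of fixations, which is why it can be read as an indicator of reading speed and intensity.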

This observation supports the theoretical assumption on which Hypothesis 2 was based, repeated here: pages showing a closer link between component parts will be easier for the reader to comprehend, whereas those showing a weak semantic link between component parts require more effort from the reader's cognitive processing mechanism. The AFD can serve as another indicator of cognitive effort alongside the number of shifts across the AOIs concerned. The results therefore support my hypothesis on the correlation between local coherence and the cognitive effort involved in the processing and comprehension of the discourse.

7.1.4.2.3 Discussion

The results of the current eye-tracking experiment indicate that the degree of information linking, as the measure of local coherence, affects the cognitive effort readers need to make in order to build coherence relations between verbal and pictorial components in multimodal discourse.

This conclusion also finds support in Meng (2010), in which a questionnaire survey was conducted to check the framework of information linking proposed in Subsection 5.3.1.1. The survey was carried out with 57 undergraduate students (29 males and 28 females, mean age = 20.78) at Peking University. The participants were shown 13 modified Reader Attraction pages from the English learning advertisements corpus. These pages represent five types of information linking between the verbal and the pictorial: multiple links, cross-cutting, bridging, associational and juxtaposition. The materials were largely the same as those in the eye-tracking experiment, except that in Meng (2010) the verbal slogans on these pages were removed. The participants' task was to choose the most appropriate verbal slogan for the page from four alternatives. The matching task was driven by the hypothesis that the degree of information linking is positively correlated with the rate of correct responses. Specifically, the stronger the visual-verbal information link, the more likely the participants are to arrive at the original (correct) verbal slogan.

The results show a very regular pattern in the relation between the degree of visual-verbal information link and the rate of correct answers in the matching test (see Figure 7.10).

In order to obtain more precise evidence, Meng (2010) also conducted a correlation test between the rate of correct answers and the degree of information linking for each of the experiment pages. The overall descriptive statistics are as follows: the mean rate of correct answers is 0.60755 and the mean degree of visual-verbal information link is 2.7727. The two sets of data have a correlation of 0.769, which is significant at the 0.01 level. The conclusion is therefore drawn that the degree of information linking between the verbal and the pictorial on these pages is significantly correlated with the difficulty of choosing the right match between the picture and the slogan. In all, the hypothesis is supported: the closer the connection between the picture and the verbal slogan on the page, the easier it is for people to reconstruct one part on the basis of the other.
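A correlation test of this kind can be sketched as follows. This is an illustration only, not the procedure used in Meng (2010): it assumes a Pearson product-moment correlation, and the per-page values in `link_degree` and `correct_rate` are hypothetical, since the original page-level data are not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two samples of equal length with at least 3 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical per-page values (not the data of Meng 2010).
link_degree = [1, 2, 3, 4, 5]
correct_rate = [0.30, 0.45, 0.55, 0.70, 0.90]

r = pearson_r(link_degree, correct_rate)
t = t_statistic(r, len(link_degree))
print(round(r, 3), round(t, 2))
```

The resulting t value would then be compared with the critical value for n - 2 degrees of freedom at the chosen significance level.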

Figure 7.10 Average rates of correct answers for different types of information linking

From this conclusion we can derive an implication for the construction of local coherence in multimodal discourses. Discourses in which the pictorial and the verbal are more closely connected in informational content are more readily regarded as a coherent whole, and the understanding of the discourse is achieved through the interaction between the cognitive processing of the picture and that of the verbal text. This result provides triangulation for the eye-tracking experiment in this section. The reader's cognitive effort in building coherence between the verbal and the pictorial is made visible in the eye movement data of the eye-tracking experiment. Therefore, the current eye-tracking experiment and Meng (2010) together support the validity of the information linking principle in local coherence proposed in Chapter 5.

The results and discussion here have implications in the following three aspects:

Firstly, the results of the two studies justify the duality of coherence, that is, discourse coherence lies partly in the text/discourse and partly in the reader's mind. On the one hand, on the level of representation, there must be some connection in semantic meaning between the picture and the verbal slogan in these materials in order to form a cohesive text. On the other hand, on the cognitive level, the implicit meaning connection between the picture and the verbal slogan must be intelligible to readers in order for them to construct a mental model of a coherent text. If the information link is too weak and elusive, the probability that the reader will successfully construct such a mental model tends to be low. Conversely, when the meaning connection is strong and obvious, the interpreter will have a greater chance of constructing a coherent understanding of the text.

Secondly, the findings here also provide empirical justification for the holistic view in discourse studies stressed by Jiang (2009). Jiang summarizes the development of the concept of holism in science and discourse studies, and argues that both the East and the West have a long history of studying discourse from a holistic approach. The gist of holism is the idea that the whole is more than the sum of its parts, and that the meaning of each part or element in the text should be examined with proper consideration of the whole text and its context. The findings of the current research show that the meaning of the picture and that of the verbal text are not merely two independent parts of the text; instead, when put together, they create a force which makes a significant difference to the comprehension of the whole text. This force of holism is usually invisible at the conscious level of mind and in discourse analysis; however, the eye-tracking data, which reveal the unconscious cognitive mechanism in the reading/viewing process, indicate that the principle of holism does play a substantial role in people's perception and comprehension. The content of each semiotic component contributes to the topic of the advertisement as a whole and thus to the comprehension of the whole advertisement.

Thirdly, the correlation between local coherence and reading behavior is consistent with the assumptions concerning discourse processing in van Dijk and Kintsch (1983). They assume:

The establishment of local coherence is strategic. We do not merely have rules which define the conditions that make sequences of propositions coherent, but also, or rather, we have strategies which process information from various sources in such a way that coherence can be established in an effective and flexible manner. It is not necessary for interpretations of sentences to be completed before beginning to establish coherence links among propositions. Instead, hypotheses about coherence links are made as the propositions themselves are being formed. These hypotheses must be based on partial information. Once the information is complete, a coherent structure has typically already been generated, though there may still be a need to check or revise it. (van Dijk and Kintsch, 1983: 151-152)

In the current eye-tracking experiment, this strategic establishment of local coherence can be interpreted as follows: in order to understand the message of the advertisement page, readers try to establish coherent links between the picture and the verbal slogan. When they look at either the pictorial or the verbal part first, they form hypotheses about possible links in the other part. If the semantic connection between the two parts is direct, the hypotheses will be confirmed when they process the other part. Otherwise, in cases where the semantic connection between the verbal and the pictorial is weak and elusive, the hypotheses tend to undergo revision. The revised hypotheses will be checked a second time and possibly more times, which requires readers to shift their gaze between the verbal and pictorial parts repeatedly. When the coherence link has finally been established or the attempt has been abandoned, the reading process terminates.

Although the current eye-tracking experiment was not conducted under controlled conditions, and thus only correlations rather than causal relationships can be obtained, the results shed some light on the establishment of local coherence in multimodal documents. They show that the concepts of local coherence and information linking can be successfully extended from purely verbal texts to the pictorial-verbal relations in multimodal texts, and that information linking there has effects similar to those between sentences in purely verbal texts. This cross-modal perspective indicates that human cognition operates on the level of the meaning contained in signs rather than the forms the signs take. Whether the component elements are pictorial or verbal does not matter much for readers' understanding of the discourse; it is the meaning relations contained in these elements that matter.

7.1.4.3 Surface cohesion

As discussed in Chapter 6, surface cohesion in layout is an important aspect of coherence and facilitates readers' processing and understanding of the discourse. In linguistics there is a large body of empirical studies on the effects of logical connectives (see, for example, Chung, 2000) and of structural markers such as topic headings (Hyönä and Lorch, 2004) and text visual structure (Natasha, 1999; Lemarié, Eyrolle and Cellier, 2008) on recall and comprehension. In visual discourses, however, empirical studies mainly focus on the effects of salience on reading processes (see, for example, Nothdurft, 2006). There is little empirical research on the relation between layout and reading processes from the perspective of multimodal discourse processing. The current eye-tracking experiment is therefore also exploratory in nature. As proposed in Section 6.2, there are two main groups of devices in layout which help organize the semiotic elements on the pages into a coherent, hierarchical discourse: structural cues and salience cues. Accordingly, the eye-tracking experiment will try to find evidence for these two aspects of surface cohesion.

7.1.4.3.1 Layout structure

As discussed in Section 6.3.1, the visual structure is formed by the simultaneous working of a series of structural cues, such as visual similarity, spatial nearness, and framing. These cues are visually and perceptually based. In multimodal discourses there are also some structure-signaling resources that are mainly attached to the verbal, among which the most important is the heading. All the visual, verbal and graphical resources work together to form the structure of the layout.

The structural aspect of cohesion is not easy to demonstrate by eye tracking, because people's perception of structure cannot be conveniently translated into fixation patterns, not to mention that fixation patterns are affected by both top-down and bottom-up processes triggered by many factors. However, this section tries to find evidence in the eye-tracking data that structural cues, as cohesion in layout, are related to the cognitive processes involved in reading. The basic approach is to compare the eye-tracking data of two pages, one with obvious structural cues and the other without.

For pages which have a clear visual structure, the participants tend to fixate on the section headings and ignore the detailed textual and pictorial information. The following hot spot (see Figure 7.11) is such an example. The visual field has an overall topic heading, “与国际儿童同步的纯外教英语培训” (English training taught entirely by foreign teachers, in step with children around the world). Under it there are three sections, each consisting of a heading (in both Chinese and English) and a verbal paragraph. The section at the bottom also includes a picture. The three sections are cued by spatial nearness inside each section and by white space marking the boundaries between them. The hot spot shows that the participants' fixation points concentrate on the headings, whereas the details inside the sections, including the verbal texts and the picture, receive few fixations.

Figure 7.11 A hot spot as example of fixations on headings

For pages in which the visual structure is not so regular or clearly signaled, the participants tend to read the details more carefully. In Figure 7.12, for example, the page is composed of four visual sections apart from the contact information in the bottom margin. The first three sections are marked by obvious structural cues, including headings, visual similarity, and number sequencing. In contrast, the last visual chunk, the one at the bottom right, is very different from the others in both content and composition, and thus it is not visually similar to them. Although it is separated by white space as an independent section like the others, it is not cued by such obvious structural markers as a heading or number sequencing. From the hot spot we can see that for the three structurally marked sections, the fixation points concentrate on the headings, whereas the verbal and pictorial details receive much less looking time. For the last section, however, the verbal details are read carefully by the participants. Given the small size of the verbal text, such careful reading is unusual, since the participants usually skip over small-print verbal text.

Figure 7.12 A hot spot as example of fixation patterns in pages with unsymmetrical structure

This difference may be attributed to the structural cues. The headings provide the gist of the information in the whole section and thus save readers the effort of reading the details. The visual similarity also indicates that the details inside the sections are similar in nature. These cohesive cues enhance readers' confidence in their expectations or predictions of the detailed information, so they tend to skip over these details. For a visual section without a heading, such as the bottom right section (the light area) in Figure 7.12, readers have no clue with which to predict the content and thus have to read the details carefully in order to get a general idea.

In summary, the structural cues appear to affect reading behavior to a large extent. This indicates that structural cues, as cohesive devices, are closely related to the cognitive processes in reading and comprehension.

7.1.4.3.2 Visual salience

Figure 7.13 A hot spot as example of fixation patterns in the light of spatial location

As reported in many studies of visual cognition (see, for example, Taylor et al., 1979), salience has a robust effect on cognition. The current eye-tracking experiment also finds considerable evidence supporting this view. As mentioned in Section 6.2, the frequently employed devices for salience in the English learning advertisements include color, size, and location, among others. For example, Figure 7.13 is a hot spot of a page spread which typically represents the fixation pattern in the light of spatial location. The page spread is composed of two pages, the left one concerning the company's teaching team and the right one concerning the course information. Apart from the topic heading, each page comprises similar elements arranged in different locations. Specifically, the left page comprises an array of photos of foreigners organized in rows and columns, and the right page is organized in the form of a table, with the verbal texts arranged in a matrix. Since these component elements are similar in both form and content, the only difference among them is their spatial location on the page. As mentioned in Section 6.2, the top is more salient than the bottom. From the hot spot, we can see a clear pattern: the elements at the top receive much longer looking time than those at the bottom, and the relative fixation length decreases in the downward direction.

Figure 7.14 A gaze plot as example of salience

The hot spot reflects the general gaze pattern of a group of participants. In the following we will look at an example which is more concrete and detailed. Figure 7.14 is an example of reading behavior in the light of size. As shown in the gaze plot, the most salient element on the page is the picture showing many people in the top right part, owing to its bright colors and its much bigger size than the other elements. At the bottom left of the page there is another picture, which is second in size to the top one. On the remaining field of the page there are verbal texts and some small pictures. Therefore the big picture at the top and the second biggest picture at the bottom left of the page should have a greater capacity to attract attention. The reading started from the top left corner (Fixation 762), as intended by the experiment design. The actual first fixation, Fixation 763, went in the direction of the big picture on the right. This shows that the participant sensed that there was a salient picture to the right and was attracted by it. Fixation 764 was on the large-print verbal heading, “瑞来英语”, and then Fixations 765-767 returned to the picture. This shows that the participant briefly scanned the heading which was nearest to his current fixation and then shifted his gaze back to the biggest picture. Fixations 768-773 concentrated on the red heading under the big picture, “360度的全角8cs完整学习法” (the 360-degree, all-round 8Cs complete learning method). Fixations 774-775 were on the picture to the left. Fixations 776-783 were on the verbal paragraphs under the heading “360度的全角8cs完整学习法”. The eye movement shows that the participant read the heading, but before he finished, he was attracted to the picture to the left. After scanning it with two fixations, he went back to read the verbal heading and its sections, each of which comprised a verbal paragraph and a small picture. It should be noted that when the participant went on to read the section in the top right corner (Fixation 783), he was again attracted to the big picture (Fixations 784-785). After that he went back to finish reading the parallel sections (Fixations 786-793) in a downward direction. The repeated look-backs to the biggest picture suggest that salient elements do have priority in attracting attention.
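The look-back pattern described above can be tallied with a minimal sketch. The fixation ranges are those reported in the text, but the element labels are shorthand introduced here for illustration, and the helper functions are hypothetical rather than part of any eye-tracking software.

```python
# (first fixation, last fixation, element) as described in the text;
# the element labels are illustrative shorthand, not the author's AOI names.
segments = [
    (762, 762, "start point"),
    (763, 763, "big picture"),
    (764, 764, "brand heading"),
    (765, 767, "big picture"),
    (768, 773, "red heading"),
    (774, 775, "left picture"),
    (776, 783, "sections"),
    (784, 785, "big picture"),
    (786, 793, "sections"),
]


def count_visits(segments, element):
    """Number of separate times the gaze entered the given element."""
    visits = 0
    previous = None
    for _first, _last, name in segments:
        if name == element and previous != element:
            visits += 1
        previous = name
    return visits


def fixation_count(segments, element):
    """Total number of fixations that landed on the given element."""
    return sum(last - first + 1
               for first, last, name in segments if name == element)


visits = count_visits(segments, "big picture")
print(visits)                                   # 3 visits, i.e. 2 look-backs
print(fixation_count(segments, "big picture"))  # 6
```

On this reading, the big picture was entered three times in the sequence, so two of the visits were returns after the gaze had moved to other elements.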

The data in this section support the view that salience cues, such as spatial location, color and size, affect the bottom-up processes in reading.

7.1.4.3.3 Discussion

The results of the eye movement data on surface cohesion are drawn from qualitative observation and description, and no quantitative data were obtained. The reasons are as follows:

The typographical features in the layout that visually structure the advertisements operate on the whole visual space, and thus quantitative data on them are not easy to obtain. The typographical features cueing salience are, on the one hand, attached to particular elements which vary greatly in global coherence and local coherence. On the other hand, they are relative attributes derived from comparison with their surroundings in the visual space rather than absolute ones. Therefore, they are very difficult to isolate as a measurable indicator under natural experimental conditions, so statistical analysis is not feasible.

However, in spite of the lack of quantitative data, the observation of the eye movement data on the typographical features in the layout can still support, to some extent, our theoretical assumptions concerning the effects of surface cohesion on comprehension. That is, readers make use of these features as cues to the global structure of the advertisement and to the relations among the semiotic elements, so that the features facilitate the comprehension process in which coherence is established.