Original source publication: Brito, M. A. and F. de Sá-Soares (2012). Supporting Intense Needs of Assessment in Computer Programming Disciplines. Proceedings of TECH-EDUCATION 2012, 3rd International Conference on Technology Enhanced Learning, Quality of Education and Education Reform. Barcelona (Spain).

Supporting Intense Needs of Assessment in Computer Programming Disciplines

Miguel A. Brito and Filipe de Sá-Soares

University of Minho, School of Engineering, Centro Algoritmi, Guimarães, Portugal

Abstract

After several years of experience teaching computer programming disciplines, the major insight about how to succeed became very clear: students must work flawlessly on a weekly basis. Instead, students tend to study occasionally, with strong peaks of work on the eve of assessments. However, implementing assessments on a weekly basis requires a lot of resources, which are not easy to obtain. At an earlier stage, a sequence of experiments proved the influence of weekly assessment on students' success in computer programming disciplines. A methodology to guide the weekly rhythm was developed and finally an automated assessment tool solved the problem of lack of resources.

Keywords: Programming Education; e-Learning Platform; Learning and Teaching Computer Programming to Novice Students; Assessment Frequency; Constructivism; Learning Management System

1. Constructivism and Computer Programming Education

Learning to program is mostly about developing the ability to do it rather than knowing how to do it. We usually illustrate this difference to our students using the metaphor of riding a bicycle. They can know all about the mechanics of it and fully understand how other people accelerate, brake, turn, etc., and yet mastering this knowledge will not make them able to ride a bike.

This need to do in order to learn, or better put, the need to do in order to achieve certain capabilities unreachable any other way, led us to the foundations of constructivism. This theory justifies our insight about the absolute need to practice programming weekly. Constructivism is a learning theory in which Jean Piaget argues that people (and children in particular) build their knowledge from their own experience rather than from some kind of information transmission.

Later, building on the work of Piaget and others within the experiential learning paradigm, important works have been developed, such as Kolb's Experiential Learning Model [Kolb and Fry 1975], which reinforces the role of personal experimentation in learning and systematizes iterations of reflection, conceptualization, testing and back again to new experiences. A rich set of works about constructivism in education can be found in [Steffe and Gale 1995].

Meanwhile, the discussion was brought to the computer science education field, with the claim that real understanding demands active learning in a lab environment, with the teacher's guidance ensuring reflection on the experience obtained from problem solving exercises. Passive computer programming learning will likely be condemned to failure [Ben-Ari 1998; Hadjerrouit 2005; Wulf 2005]. Indeed, constructivism can even be used to explain the problem of weak students and be part of the solution [Lui et al. 2004].

2. The Path to Weekly Assessment

Building on the belief in the importance of submitting students to more and more assessments, but also facing the strong restriction of human resources to implement those assessments, we progressively introduced more frequent assessments.

A first and strong indicator of the success of weekly assessment is the evolution of the percentage of students who stay until the end, i.e. who do not drop out in the middle of the semester.

In 2004/2005 a small project was assessed quarterly, i.e. there were two assessment points per semester. The forty-eight per cent of students who did not abandon the course was clearly insufficient.

During 2005/2006, the resolution of small problems in the computer lab was added to the assessment on a monthly basis. This assessment paid off in students' success but represented a heavy workload for teachers.

In 2006/2007 a weekly automated theoretical-practical (TP) assessment was implemented and complemented with a quarterly laboratorial (L) assessment. That meant more work in preparing the automated testing batteries, but it was an investment for the future, and it took us back to quarterly practical assessment. The results were similar to the preceding year, better than two years before, but still not satisfactory. The platform adopted was the Learning Management System (LMS) Moodle (http://moodle.org). Although the choice process is not relevant at this point, it is important to stress the option for an open source platform, as will become obvious a few paragraphs below.

During 2007/2008 the frequency of laboratorial assessment was increased to monthly. Benefiting from the previous year's investment in automated TP assessment, we needed to prove the importance of also increasing the frequency of laboratorial assessment. Finally, the result reached a very acceptable level of seventy-six per cent. This proved our insight about the effects of increasing assessment frequency but created a new problem: an unbearable workload for teachers.

So finally, 2008/2009 was undoubtedly the toughest year in this process, with an investment in a plugin for the Moodle platform to perform automated testing of programming procedures. The whole battery of programming tests was not completely finished in that year, but finally the whole assessment was automated and performed weekly.

Since then the percentage of students who do not give up on the discipline continues to increase, reaching a peak of eighty-six per cent in the last academic year.

This whole evolution not only led to a sustainable workload for teachers but also allowed us to engage less experienced teachers to help in classes and in the assessment process.

The approval rate was also growing: by 4 percentage points in 2005/2006, 2 the year after, 1 in 2007/2008, and finally a huge 13-point jump last year with the weekly fully automated assessment, for a 20-point total improvement in five years. It should be mentioned that this evolution was achieved neither by shrinking the syllabus nor by decreasing the level of rigor imposed on the course over the years.

3. The Learning Methodology

Weekly assessment induces a regular weekly rhythm. In order to take full advantage of it, a learning methodology was developed (cf. [Brito and de Sá-Soares 2010]). This methodology comprises a set of arguably good advice and tutorial support during classes, but in this context the focus is on the method the students need to follow each week.

Some authors ([Duke et al. 2000] and [Costello 2007]) agree on the merits of frequent assessment, but they have either a small enough number of students or a large enough number of teacher-hours. We do not have such resources: we have almost two hundred students and only one laboratory teacher. Nevertheless, a third factor in the equation has proved decisive: the automated assessment of the problems solved by students.

4. The Programming Plugin

The mentioned need to automate the programming questions led to the implementation of a quiz plugin with some additional features.

Fig. 1 presents the high-level architecture of the plugin, which is composed of five main modules (specific modules related to the integration of the plugin into the system are not depicted) and three storage resources.

The two modules “Student Interface” and “Teacher Interface” provide the interface to the plugin for students and teachers, respectively. These modules interact with the plugin's CSS definitions and with the language localization of the plugin (currently two languages are available, namely Portuguese and English, although extending the plugin to other languages is very easy).

When the student is undertaking the assessment, an extended interface is displayed for the programming questions.

In this kind of question, the student uses a Resposta (answer) box to write his/her solution to the problem. To avoid misspelled procedure names, the procedure header is already in the Resposta box and the student only fills in the respective body and any eventual auxiliary procedures.

If somehow the student loses track of what he/she has already done, it is possible at any time to press the Reinicializar (reset) button to restart from scratch; the box is cleared and refilled with just the original content.

The Testar (test) button allows the student to evaluate that particular question, so at any time the student can ask for an assessment of his/her answer. The Testar button is only available for the programming questions. Besides verifying basic syntactic correctness (for instance, if there is an unbalanced number of opening versus closing parentheses, the system informs the student of that situation), the Testar button provides an indication of how good the student's answer currently is, by showing the percentage grade the answer would get if the student submitted it as it stands.
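To make the syntax verification concrete, the following is a minimal sketch, in Python and for illustration only (the plugin itself is a Moodle extension), of the kind of parenthesis-balance check the Testar button performs; the function name is ours, not the plugin's:

    def check_balanced(code: str) -> bool:
        """Return True if opening and closing parentheses are balanced."""
        depth = 0
        for ch in code:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:  # a ")" appeared before its matching "("
                    return False
        return depth == 0  # zero means every "(" was matched by a ")"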

Figure 1: High-level Architecture of Programming Plugin

When defining a programming question, an extended interface is presented to the teacher. Besides the standard fields related to the definition of questions, the fields that are specific to the plugin are:

The “Grading Engine” is the module responsible for grading the student's answer. It is invoked when the student hits the Testar button (if displayed) or when the system is grading the student's answer. This module was designed according to the following top-level algorithmic steps (a sketch of these steps is given after the list):

  1. Strip the answer of programming language comments;

  2. Check answer syntax;

  3. Evaluate the answer by submitting the code to the battery of tests;

    1. For each pair of arguments (input)/expected return value (output), apply the procedure by calling the interpreter and accumulate partial grades;

    2. Prevent processes that take too long to complete and infinite loops;

  4. Calculate the percentage of correctness.
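As an illustration of these steps, here is a minimal sketch in Python under stated assumptions: the interpreter path, the comment delimiter (a semicolon, as in Lisp-family languages) and the test battery format are placeholders rather than the plugin's actual configuration, and check_balanced is the syntax check sketched earlier:

    import re
    import subprocess

    def grade_answer(answer, tests, interpreter="/usr/bin/interpreter",
                     timeout_s=5.0):
        """Grade an answer against a battery of (test expression, expected output) pairs."""
        # Step 1: strip programming language comments (here: ';' to end of line).
        code = re.sub(r";[^\n]*", "", answer)

        # Step 2: check answer syntax before running anything.
        if not check_balanced(code):
            return 0.0

        # Step 3: evaluate the answer by submitting the code to the battery of tests.
        passed = 0
        for expression, expected in tests:
            try:
                # 3.1: apply the procedure by calling the interpreter on the
                # student's code followed by the test expression.
                result = subprocess.run(
                    [interpreter],
                    input=code + "\n" + expression + "\n",
                    capture_output=True, text=True,
                    timeout=timeout_s)  # 3.2: guard against overly long or infinite runs
                if result.stdout.strip() == expected:
                    passed += 1
            except subprocess.TimeoutExpired:
                pass  # a non-terminating answer earns no credit for this test

        # Step 4: calculate the percentage of correctness.
        return 100.0 * passed / len(tests) if tests else 0.0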

The “Interpreter” is the programming language interpreter (or compiler) that is invoked to test the student's answer.

The “Configurator” module allows some low-level configuration of the “Grading Engine”, such as the comment opening and comment ending character sequences of the programming language; the interpreter/compiler to use, its location and calling options; and the time to wait before considering that a computational process is taking too long to complete.
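The following hypothetical configuration record illustrates the kind of settings the “Configurator” exposes; every key, path and value below is an assumption chosen for the sketch, not one of the plugin's real option names:

    # Illustrative low-level settings for the grading engine (all values hypothetical).
    GRADING_ENGINE_CONFIG = {
        "comment_open": ";",                          # comment opening character sequence
        "comment_close": "\n",                        # comment ending sequence (end of line)
        "interpreter_path": "/usr/bin/interpreter",   # interpreter/compiler location
        "interpreter_options": [],                    # calling options passed on invocation
        "timeout_seconds": 5.0,                       # wait before a process is deemed too long
    }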

5. Conclusions

Weekly assessment proved to be the solution to a major problem in computer programming teaching, transforming high percentages of student failure into high rates of approval.

However, it also brought a whole new set of issues to address. Some of these are mainly security issues, which were addressed by strict procedures during assessment and even by a browser specially developed for the purpose of these assessments. How these issues were addressed is a whole new story that does not fit in this document but will certainly be addressed in a future one.

This time, the resource issue that came with weekly assessment was solved with resourcefulness, through the construction of a Moodle quiz plugin that automates the assessment of tests.

The implementation of this plugin solved the issue of available resources, with the whole list of advantages inherent to weekly assessments, which were already exposed above in this document. However, two additional, non-negligible advantages were found:

In the future it would be very useful to extend this plugin to other programming languages. However, this requires finding very robust language interpreters, in order to ensure that testing does not break down during assessments. Other languages also demand an extended syntactic checker, to provide students with more support where heavy syntax is present. Another interesting improvement would be a first approach to the evaluation of programming style.

References