Jcepm Online ISSN 2233-9582 Dx.doi /10 .


KICEM Journal of Construction Engineering and Project Management
Online ISSN 2233-9582, EPM.2014.4.4.009

A Neural Network Model for Building Construction Projects Cost Estimating

Nabil Ibrahim El-Sawalhi 1 and Omar Shehatto 2

Abstract: The purpose of this paper is to develop a model for forecasting the early design construction cost of building projects using an Artificial Neural Network (ANN). Eighty questionnaires distributed among construction organizations were used to identify the significant parameters for building project costs. A total of 169 case studies of building projects were collected from the construction industry in the Gaza Strip and used to develop the ANN model. Eleven significant parameters were considered as independent input variables affecting project cost. The neural network model reasonably succeeded in estimating building project cost without the need for more detailed drawings. The average percentage error on the test dataset for the adopted model was largely acceptable (less than 6%). Sensitivity analysis showed that the area of the typical floor and the number of floors are the most influential parameters in building cost.

Keywords: Building construction projects, Artificial Neural Network, Gaza Strip, Cost estimation

I. INTRODUCTION

Traditional estimating techniques are inadequate at the conceptual stage, yet the most common estimating methods in the Gaza Strip are still the traditional and spreadsheet methods. It is therefore desirable to create a new method that helps a user with little knowledge to quickly produce an accurate cost estimate.

Moreover, at the conceptual stage the available data are limited and appropriate cost estimation methods are lacking. This calls for a technique that can deal with limited information and still give an accurate estimate.
ANNs are used as systems able to generalize solutions for these problems by learning from a set of historical examples with little time and effort.

Many researchers apply the neural network approach in various fields of engineering prediction and optimization. However, the authors consider that research on using neural networks to estimate the cost of construction projects at various stages of the work is very limited [12], [13].

This research developed an ANN model to estimate the cost of building projects at the conceptual stage, based on historical data of projects implemented in the Gaza Strip between 2009 and 2012, to help users predict project cost at an early stage with a high level of accuracy.

Cost is one of the three main challenges for the construction manager, where the success of a project is judged by meeting the criteria of cost within budget, schedule on time, and quality as specified by the owner [1]. Poor strategy or incorrect budget or schedule forecasting can easily turn an expected profit into a loss [2]. Therefore, effective estimating is one of the main factors of construction project success [3].

In recent decades, researchers and participants in the construction industry have recognized the potential impact of early planning on final project outcomes. They have therefore put more emphasis on the early planning process, in which project definition is an important factor leading to project success [4].

The cost estimate is one of the main pieces of information for decision making at the preliminary stage of construction. Thus, improved cost estimation techniques will facilitate more effective control of time and costs in construction projects [5]. Estimates are prepared and used for different purposes, including feasibility studies, the tendering phase, and avoiding misuse of funds during the project. The primary function of cost estimation is to produce a credible cost prediction of a construction project.
However, the predicted cost depends on the requirements of the client and on the information and data available [6].

The largest obstacles facing a cost estimate, particularly at an early stage, are the lack of preliminary information and the larger uncertainties resulting from engineering solutions.

To overcome this lack of detailed information, cost estimation techniques are used to approximate the cost within an acceptable accuracy range [7].

1.2

The aim of this research is to develop a prototype model for estimating the cost of building construction projects, including skeleton and architecture factors, using artificial neural networks (ANNs). The research also aims to give a better understanding of the underlying factors of building cost, which can help users prepare a preliminary cost estimate of building projects.

1 Deputy Dean, Faculty of Engineering, The Islamic University - Gaza, E-Mail: nsawalhi@iugaza.edu.ps (*Corresponding Author)
2 Master degree, The Islamic University - Gaza, E-Mail: shehato@hotmail.com

This research is limited to the buildings sector of

construction projects in the Gaza Strip, including the two main phases of building construction: the skeleton and finishing phases. Accordingly, data were collected on building projects implemented between 2009 and 2012. This study assumes that the initial architectural drawings of the building project are available and include the information on the adopted model factors.

The nine building functional element groups of the cost classification system are as follows: Slab on ground, Number of floors, Stairs and elevators, External walls, Windows, External doors, Floor height, Internal doors, Area of ground and typical slabs, and finally the columns quantity and the length in between [11].

Floor area, Number of stories, Slab type, Foundation type, Number of elevators, Type of project and External finishing were adopted as parameters by several researchers [5], [12], [13], [2], [4], [14], [15].

II. PARAMETRIC COST FACTORS

The "parametric" method of estimating involves collecting relevant historical data, usually at an aggregated level of detail, and relating it to the product to be estimated through the use of mathematical techniques. Since parametric methods typically capture cost at a very high level, less detail is required for this approach than for other methodologies [8].
Parametric estimation is expressed as an analytical function of a set of variables. These usually consist of some features of the project (performance, type of materials used) that are assumed to be the main influences on the final cost of the project. Commonly, these analytical functions are named "Cost Estimation Relationships" (CER) and are built through the application of statistical methodologies [9].

One of the most significant keys in modelling early-stage building cost estimating is identifying the factors that have a real impact on the cost of building projects. Because selecting these factors is so important, several techniques were carefully adopted to identify these parameters for building projects. The factors affecting the parametric cost estimate are described below.

Since a construction project's location affects the final cost, an estimator must understand what particular location factors will be encountered and what considerations should be taken into account when formulating the estimate.

General assumptions about soil conditions may be made early in the estimating process, but they may turn out to be wrong. As the estimate progresses, geotechnical data may help improve the information and prevent costly change orders and claims. In the early estimates, the assumptions regarding soil conditions and the potential effects of unknown soil conditions should be clearly documented. Soil conditions can be a significant cost risk to a project. The soil type will influence the chosen foundation type [10].

The quantity and type of a given material on a project impact the unit cost of constructing and/or supplying that item.
This is not simply a supply and demand issue, but also one of production efficiency and economy of scale. Generally speaking, the unit price for larger quantities of a given material will be less than for smaller quantities. Mobilization, overhead and profit are all spread out over a larger quantity, thus reducing their effect on each unit. Small quantities of items of work are less cost effective to construct and hence lead to higher unit prices. This practice increases a contractor's overhead and usually results in a mark-up being applied to those items [10].

The types of plastering, tiling, marble, electrical, sanitary, carpentry, metal, and aluminum works would largely affect cost.

III. NEURAL NETWORK

In recent years, new approaches based on the theory of computer systems that simulate the learning effect of the human brain, such as Artificial Neural Networks (ANNs), have grown in popularity [9]. The ANN is one of these new approaches. It is able to perform tasks involving incomplete data sets or fuzzy or incomplete information, and to handle highly complex and ill-defined problems. Moreover, it is able to deal with non-linear problems. One of the distinct characteristics of an ANN is its ability to learn from experience and examples and then to adapt to changing situations. It has a natural propensity for storing experiential knowledge and making it available for use [14]. Another major benefit of using an ANN is its ability to understand and simulate more complex functions than older methods such as linear regression [16]. In addition, it can approximate functions well without explaining them. This means that an output is generated based on different input signals, and by training those networks accurate estimates can be generated [7].
In spite of the great accuracy of the ANN model in cost estimation, it has a considerable defect: it depends mainly on historical data. This dependency has several disadvantages:

1. The diversity of variables for the effective factors is limited to what is available in the collected data.
2. The data should contain sufficient projects for each variable.
3. New variables that were not included in the adopted model will not be handled.

Vol.4, No.4 / Dec 2014

Despite the large number of researchers who have applied the neural network approach in various fields of engineering, research on using neural networks to estimate the cost of construction projects at various stages of the work is very limited [13].

Locally, there is a lack of ANN-based cost estimation research applied in the Gaza Strip. Arafa and Alqedra (2011) developed an ANN model to estimate the cost of building construction projects at early stages. A database of 71 building projects collected from the construction industry of the Gaza Strip was used to develop the ANN model. The model had one hidden layer with seven neurons. The results obtained from the trained models indicated that neural networks reasonably succeeded in predicting the early stage cost estimation of buildings

using basic information about the projects and without the need for a more detailed design [13].

Regionally, Elsawy et al. (2011) developed a neural network model to assess the percentage of site overhead costs for building projects in Egypt, which can assist decision makers during the tender analysis process [15].

Kim et al. (2004) applied hybrid models of ANN and Genetic Algorithm (GA) to estimate the preliminary cost of residential buildings. They first optimized the parameters of the back-propagation algorithm using genetic algorithms and then obtained a set of trained weights for the ANN model using GA. The results of the research revealed that optimizing each parameter of back-propagation networks using GA is most effective in estimating the preliminary costs of residential buildings [5]. Gunaydin and Dogan (2004) developed an ANN model to estimate the cost per square meter of the structural system of buildings in the early phases of the design process. The input layer of the trained ANN model comprised eight parameters available at the early design stage. The trained ANN model was capable of providing estimates of building cost per square meter that were at least 93% accurate [12].

Emsley et al. (2002) trained neural network cost models using a database of nearly 300 building projects.
They used linear regression techniques as a benchmark for evaluating the neural network models. The results showed the ability of neural networks to model the nonlinearity in the data: the model was capable of evaluating the total cost of construction, and the trained ANN model obtained a mean absolute percentage error of 16.6% [17].

The research reviewed above by the authors indicates that the application of artificial neural networks to estimate the early cost of construction projects is a promising area.

Number of rooms, Location of project, Number of staircases and Type of contract) and eighteen finishing phase parameters (Type of external plastering, Volume of air-conditioning, Area of curtain walls, Type of tiling, Type of water and sanitary works, Type of electrical works, Area of gypsum board and false ceiling, Area of marble works, Fire fighting and alarm works, Quantity of electrical works, Number of windows, Quantity of water and sanitary works, Type of carpentry works, Number of internal doors, Type of aluminium works, Type of carpentry works, Quantity of metal works and Type of painting) identified from the literature were evaluated. Eighty questionnaires were distributed to various engineering institutions. Fifty-seven questionnaires were correctly completed and returned, a response rate of 71%.

The case studies used in this research were collected from different institutions concerned with construction engineering in the Gaza Strip. A data sheet was prepared and used to extract all useful information from 193 bids of building projects between 2009 and 2012. However, in order to overcome any defects in the collected data, some basic assumptions and criteria were defined and applied to the collected projects. As a result, 24 of the 193 projects were eliminated, and 169 projects were used to build the neural network model.

4.2 Data Analysis

To measure the accuracy of the neural network model, several methods were used.
Mean Absolute Error (MAE) is one of many ways to quantify the difference between the estimated and the actual value of the projects being estimated. According to Willmott & Matsuura (2005), the MAE is relatively simple: it involves summing the magnitudes (absolute values) of the errors to obtain the "total error" and then dividing the total error by n. It can be defined by the following formula [18]:

$$\mathrm{MAE} = \frac{\sum_{j=0}^{P}\sum_{i=0}^{N}\left|dy_{ij}-dd_{ij}\right|}{N\,P}$$   Eq. (1)

Where:
P = number of output PEs.
N = number of exemplars in the data set.
dy_ij = denormalized network output for exemplar i at PE j.
dd_ij = denormalized desired output for exemplar i at PE j.

Mean Absolute Percentage Error (MAPE) is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. According to Principe et al. (2010), the MAPE is defined by the following formula [19]:

$$\mathrm{MAPE} = \frac{100}{N\,P}\sum_{j=0}^{P}\sum_{i=0}^{N}\frac{\left|dy_{ij}-dd_{ij}\right|}{dd_{ij}}$$   Eq. (2)

Where:
P = number of output PEs.
N = number of exemplars in the data set.
dy_ij = denormalized network output for exemplar i at PE j.
dd_ij = denormalized desired output for exemplar i at PE j.

According to Principe et al. (2010), the size of the mean square error (MSE) can be used to determine how well the network output fits the desired output, but it doesn't necessarily reflect whether the two sets of data

IV. METHODOLOGY

An extensive review of previous studies, together with a structured questionnaire and expert interviews, was used to identify the factors with the most influence on building project cost in the Gaza Strip. These influential factors are the independent input parameters of the neural network model, and they form the basis of the information collected from historical cases of building projects from municipalities, government ministries, engineering institutions, contractors and consultants. After analysing the data, many models were built and trained with various structures using the Neuro-Solution 5.07 application.

4.1 Questionnaire design

A questionnaire was designed according to the identified factors that affect the parametric cost estimate of projects.
Thirteen cost parameters in the skeleton phase (Area of typical floor, Number of floors, Area of retaining walls, Use of building, Type of foundation, Number of elevators, Type of slab, Length of spans, Number of columns,

move in the same direction. For instance, by simply scaling the network output, we can change the MSE without changing the directionality of the data. The correlation coefficient (r) solves this problem [19].

By definition, the correlation coefficient between a network output x and a desired output d is:

$$r = \frac{\dfrac{\sum_{i}(x_i-\bar{x})(d_i-\bar{d})}{N}}{\sqrt{\dfrac{\sum_{i}(d_i-\bar{d})^{2}}{N}}\;\sqrt{\dfrac{\sum_{i}(x_i-\bar{x})^{2}}{N}}}$$   Eq. (3)

TABLE I (rows 5 to 11)
No. | Skeleton and finishing phase input parameter | Range
5 | Type of slab | Solid, Ribbed, Slab with drop beams
6 | Number of elevators | 0, 1, 2
7 | Type of external finishing | Plaster, Natural stone, Others
8 | Presence of HVAC and false ceiling | Central conditioning, Split units
9 | Type of tiling | Ceramic, Terrazzo, Porcelain
10 | Type of electricity works | Basic, Luxury
11 | Type of mechanical works | Basic, Luxury

VI. MODEL DEVELOPMENT

The model developed in this research is based on the NeuroSolution 5.07 for Excel program. It was selected for its ease of use, speed of training, and flexibility in building and executing the ANN model. In addition, the modeller has the flexibility to specify the neural network type, learning rate, momentum, activation functions, number of hidden layers and neurons, and the graphical interpretation of the results.

Five steps for implementing the neural network model were followed, as shown in Figure I and illustrated below.

According to Hegazy & Ayed (1998), the total MAPE is determined by combining the errors of the three data sets [20]. The training phase (training and cross-validation sets) represents fifty percent of the total MAPE, and the test set represents the remaining fifty percent:

$$\text{Total MAPE} = \frac{\dfrac{\mathrm{MAPE}_{Tr}\,N_{Tr}+\mathrm{MAPE}_{C.V}\,N_{C.V}}{N_{Tr}+N_{C.V}}+\mathrm{MAPE}_{Test}}{2}$$   Eq. (4)

Where:
MAPE_Tr = mean absolute percentage error for the training data set.
N_Tr = number of exemplars in the training data set.
MAPE_C.V = mean absolute percentage error for the cross-validation data set.
N_C.V = number of exemplars in the cross-validation data set.
MAPE_Test = mean absolute percentage error for the test data set.

[Figure I, flowchart boxes: Start; Encode the data; Organize the data (training, cross-validation and testing sets); End]

V. RESULTS AND DISCUSSION

For the thirteen skeleton phase cost factors, most respondents stated that the area of the typical floor (90%) and the number of floors (90%) are the most influential factors on building cost, while the area of retaining walls (75%), type of building (73%), type of foundation (72%), number of elevators (71%) and type of slab (71%) have a moderate influence. The remaining parameters, namely length of span between columns (69%), number of columns (65%), number of rooms (59%), location of project (57%), number of staircases (57%), and type of contract (51%), have a lower influence on the project cost. Among the eighteen finishing parameters, the external finishing type has the highest rate (78%), while volume of air-conditioning, area of curtain walls, type of tiling, type of sanitary works, and type of electrical works have rates between 72% and 76%. Five experts in the construction field were then selected to reach a consensus on the key cost parameters. The results from those five experts were very close to the questionnaire results, and only three Delphi rounds were conducted because of the high degree of consensus. Table (I) shows the most influential parameters, which had the highest rates and were adopted in several previous studies.

Table I, input parameters 1 to 4: Area of typical floors; Number of storeys; Use of building; Type of foundation.

Artificial networks only deal with numeric input data. Therefore, the raw data must often be converted from the external environment to numeric form [21]. This may be challenging because there are many ways to do it and, unfortunately, some are better than others for neural network learning [19].
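The conversion of textual parameters to numeric form can be illustrated with a minimal sketch. The paper does not publish its coding scheme, so the integer codes, the dictionary names and the field names below are assumptions made purely for illustration; only the category labels come from Table I.

```python
# Hypothetical category-to-integer codes. The category labels follow Table I,
# but the numeric codes themselves are an assumption, not the authors' scheme.
SLAB_CODES = {"Solid": 1, "Ribbed": 2, "Slab with drop beams": 3}
FOUNDATION_CODES = {"Isolated": 1, "Strap": 2, "Piles": 3, "Mat": 4}


def encode_project(project):
    """Turn one project record (a dict with textual and numeric fields)
    into a flat list of numbers suitable as network input."""
    return [
        float(project["typical_floor_area_m2"]),
        float(project["number_of_floors"]),
        float(FOUNDATION_CODES[project["foundation_type"]]),
        float(SLAB_CODES[project["slab_type"]]),
    ]


# Hypothetical example record (not taken from the case studies).
example = {
    "typical_floor_area_m2": 450,
    "number_of_floors": 4,
    "foundation_type": "Isolated",
    "slab_type": "Ribbed",
}
encoded = encode_project(example)
```

Integer codes impose an ordering on the categories; other schemes, such as one indicator column per category, avoid that at the cost of more inputs.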
In this research the data are both textual and numeric, so they are encoded in numeric form.

Initially, the first step in implementing the neural network model in the Neuro-Solution application is to organize the Neuro-Solution Excel spreadsheet. The next step is to specify the input factors that have already been encoded, which consist of 11 factors: type of project, area of typical floor, number of floors, type of foundation, type of slab, number of elevators, type of external finishing, type of air-conditioning, type of tiling, type of electricity, and type of sanitary works. Finally, the desired parameter (output) is specified, which is the total cost of the project.

[Figure I, flowchart boxes (continued): Normalize the data; Build initial model]
FIGURE I. MODEL IMPLEMENTATION STEPS FLOWCHART

TABLE I. INFLUENTIAL PARAMETERS OF BUILDING PROJECT COST
Columns: No. | Skeleton and finishing phase input parameter | Range
Ranges for parameters 1 to 4: Less than 1200 m2; 1 to 8 storeys; Residential, Schools, Public; Isolated, Strap, Piles, Mat.

Determining the number of hidden layers and neurons is one of the main drawbacks of ANNs, because there is no specific rule and it requires many trial-and-error processes, while considerable time must be

spent [5]. Hegazy & Moselhi (1995) stated that one hidden layer with a number of hidden neurons equal to 0.5m, 0.75m, m, or 2m + 1, where m is the number of input neurons, is suitable for most applications [22].

The available data were divided into three sets: a training set, a cross-validation set and a test set. The training and cross-validation sets are used in learning the model: the training set is used to modify the network weights to minimize the network error, and the cross-validation set is used to monitor this error during the training process. The test set does not enter the training process and has no effect on it; it is used for measuring the generalization ability of the network and for evaluating network performance [13]. In the present study, the 169 available exemplars were divided randomly, in line with the literature, into three sets with the following ratio:

results, to be focused on in the following training process.

The training process started with selecting the neural network type, either an MLP or a GFF network. For each one, five types of learning rules were used, and with every learning rule six types of transfer functions were applied. Up to 3 hidden layers were then used, with the number of hidden nodes in each layer incremented from 1 node up to 40 nodes.

More than 1,500 trials, each varying the number of hidden nodes up to 40, were executed to obtain the best neural network model. Ten runs of 3000 epochs each were applied, where a run is a complete presentation of 3000 epochs and each epoch is one complete presentation of all of the data [19]. In each run, new weights were applied in the first epoch and then adjusted in the following epochs to minimize the percentage of error.
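The rule of thumb attributed to Hegazy & Moselhi earlier in this section can be made concrete for the m = 11 inputs used in this study. The sketch below is illustrative only: the function name is hypothetical, and rounding fractional sizes half up to a whole neuron is an assumption, since the rule does not say how to round.

```python
import math


def hidden_neuron_candidates(m):
    """Candidate hidden-layer sizes 0.5m, 0.75m, m and 2m + 1
    for a network with m input neurons."""
    raw_sizes = [0.5 * m, 0.75 * m, m, 2 * m + 1]
    # Assumption: round half up to the nearest whole neuron, at least 1.
    return [max(1, math.floor(size + 0.5)) for size in raw_sizes]


# m = 11 input parameters, as in the adopted model.
candidates = hidden_neuron_candidates(11)
```

These candidates only give starting points; the study itself swept the number of hidden nodes from 1 to 40 by trial and error.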
To avoid overtraining the network during the training process, the cross-validation option was selected, which computes the error on a cross-validation set at the same time that the network is being trained with the training set. The model was started with one hidden layer and one hidden node in order to begin with a simple architecture, and the number of hidden nodes was then increased one node at a time up to 40 hidden nodes.

Training set: 116 exemplars (69%).
Cross-validation set: 27 exemplars (16%).
Test set: 26 exemplars (15%).

Before starting the training phase, it is usually necessary to scale the data, or normalize it to the network's paradigm. Kshirsagar & Rathod (2012) and Gunaydın & Dogan (2004) stated that data are generally normalized for the purpose of confidentiality and for effective training of the model being developed, where the input data must be normalized between an upper and a lower bound [19]. Normalizing the training data is recognized to improve the performance of trained networks. Therefore, the input/output data were scaled in the Neuro-Solution program to suit neural network processing, with zero as the lower bound and one as the upper bound.

Once all data were prepared, the subsequent step was to create the initial network by selecting the network type, number of hidden layers and nodes, transfer function, learning rule, and number of epochs and runs. An initial Multilayer Perceptron (MLP) network, consisting of one input layer, one hidden layer and one output layer, was built, and a supervised learning control was checked to specify the maximum number of epochs and the termination limits.

6.2 Neural network testing

The purpose of the testing phase of the ANN model is to ensure that the developed model was successfully trained and that generalization is adequately achieved.
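The scaling to the range zero to one and the random division into 116, 27 and 26 exemplars described in this section can be sketched as follows. This is a minimal illustration, not the Neuro-Solution procedure: the function names, the fixed seed and the sample floor areas are assumptions.

```python
import random


def min_max_scale(values, lower=0.0, upper=1.0):
    """Linearly rescale a list of numbers to the interval [lower, upper]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to the lower bound.
        return [lower for _ in values]
    return [lower + (v - lo) * (upper - lower) / (hi - lo) for v in values]


def split_indices(n_total, n_train, n_cv, seed=0):
    """Shuffle exemplar indices and cut them into training,
    cross-validation and test sets (the test set takes the remainder)."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # the seed value is an assumption
    return (
        indices[:n_train],
        indices[n_train:n_train + n_cv],
        indices[n_train + n_cv:],
    )


# Set sizes reported in the text: 169 exemplars split 116 / 27 / 26.
train_idx, cv_idx, test_idx = split_indices(169, 116, 27)

# Hypothetical floor areas in m2, used only to show the scaling.
scaled = min_max_scale([200.0, 600.0, 1000.0])
```

In practice the minimum and maximum would be taken per input column, and the same bounds reused when a new project is presented to the trained network.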
Therefore, testing the network is essentially the same as training [15]. The testing set is critical to confirm that the network has not simply memorized a given set of data but has learned the general patterns involved within an application [21]. The testing data are a completely different set of data that the network is unaware of; after the training process is finished, the testing data are used for validation and generalization of the trained network. If the network is able to generalize the output for these testing data rather precisely, then the neural network is able to predict the output correctly for new data and hence the network is validated [24].

Through a system of trial and error guided by the earlier recommendations, the best model, which provided the most accurate cost estimate without being overly complex, was a Multilayer Perceptron (MLP) with one input layer of 11 input neurons, one hidden layer of 22 hidden neurons, and one output layer with one output neuron (total cost). However, the main downside of the Multilayer Perceptron network structure is that it required more nodes and more training epochs to achieve the desired results. Figure (II) summarizes the architecture of the model: the number of hidden layers and nodes, the type of network and the transfer function.

6.1 Neural network training

The objective of training a neural network is to obtain a network that performs best on unseen data by training many networks on a training set and comparing the errors of the networks on the validation set [23]. Therefore, networks with several parameter settings, such as the number of hidden layers, number of hidden nodes, transfer functions and learning rules, were trained multiple times to produce the best weights for the model.

As a preliminary step to filter the preferable neural network type, a test process was applied to most of the networks available in the application.
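The adopted architecture reported in this section (11 inputs, one hidden layer of 22 neurons, one output) can be sketched as a forward pass. This is only an illustration of the network shape: the weights are random rather than trained, the function names are hypothetical, and using tanh in the hidden layer with a linear output neuron is an assumption about how the layers are configured.

```python
import math
import random


def init_layer(n_inputs, n_neurons, rng):
    """One weight row per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def forward(inputs, hidden_weights, output_weights):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = [
        # zip stops at the shorter sequence, so the bias w[-1] is added once.
        math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, inputs)))
        for w in hidden_weights
    ]
    return [w[-1] + sum(wi * hi for wi, hi in zip(w, hidden))
            for w in output_weights]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hidden_weights = init_layer(11, 22, rng)   # 11 inputs -> 22 hidden neurons
output_weights = init_layer(22, 1, rng)    # 22 hidden -> 1 output (total cost)

# One normalized exemplar with all 11 inputs at mid-range (hypothetical).
output = forward([0.5] * 11, hidden_weights, output_weights)

# Number of adjustable weights and biases in this 11-22-1 network.
n_parameters = sum(len(row) for row in hidden_weights + output_weights)
```

With 287 adjustable parameters and 116 training exemplars, monitoring a cross-validation set during training is a sensible guard against overfitting.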
Two types, Multilayer Perceptron (MLP) and General Feed Forward (GFF) networks, were chosen due to their good initial

Architecture of the model
Model type | Transfer function | Update method | Gradient search
Multilayer Perceptron | Tanh | Batch | Momentum

No. of hidden layers | No. of PEs in the input layer | No. of PEs in the 1st hidden layer | No. of PEs in the output layer
1 | 11 | 22 | 1

FIGURE II. ARCHITECTURE OF THE ADOPTED MODEL

A review of many studies that used ANNs in cost estimation shows that no specific allowable percentage error for the model estimate is available. However, the acceptable error for an ANN model is 10% according to [25] and [26]. In this study, according to Eq. (4), the total MAPE is 10%, where this error covers all datasets: the training, cross-validation and test datasets.

The results of the performance measures are summarized in Table (II), where the accuracy performance of the adopted model is 94%, corresponding to an average error of 6%.

MLP model | MAE | MAPE | AP | R | Total MAPE
 | 33,757 | 6% | 94% | 0.995 | 10%

TABLE II. RESULTS OF PERFORMANCE MEASUREMENT

Figure IV compares the actual cost with the estimated cost for the cross-validation (C.V) dataset. There is only a slight difference between the two cost lines.

[Figure III: plot of estimated cost (x1000) against actual cost (x1000), R² = 0.9894]
FIGURE III. LINEAR REGRESSION OF ACTUAL AND ESTIMATED COSTS

The most common statistical performance measures were applied to the adopted model to ensure its validity in estimating the cost of new projects, as follows:

The Mean Absolute Error (MAE) for the adopted model equals 33,757, which is largely acceptable for projects worth hundreds of thousands of dollars.
However, it is not a significant indicator of model performance because it is an absolute measure: the same error may be negligible if the total cost of a project is large, and a large margin of error if the total cost of a project is small.

The mean absolute percentage error of the model is calculated from the test set and equals 6%. This result can be expressed in another form as accuracy performance (AP), according to Willmott and Matsuura (2005), which is defined as (100 − MAPE)% [18].

Regression analysis was used to ascertain the relationship between the estimated cost and the actual cost. The results of the linear regression are illustrated graphically in Figure III. The correlation coefficient (R) is 0.995, indicating a good linear correlation between the actual value and the cost estimated by the neural network at the testing phase.

FIGURE IV. DESIRED OUTPUT & ACTUAL NETWORK FOR C.V SET

For the test dataset, Figure V shows a very close agreement between the actual and estimated costs, which means the estimated values closely match the actual ones.
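The performance measures discussed above (MAE, MAPE, accuracy performance, the correlation coefficient and the total MAPE of Eq. 4) can be sketched for a single output as follows. The function names and all sample values are hypothetical and are not taken from the case studies; the code assumes one output PE, so the sums over outputs reduce to sums over exemplars.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average magnitude of the estimation errors (Eq. 1 with one output)."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average error as a percentage of the actual value (Eq. 2 with one output)."""
    ratios = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)


def accuracy_performance(mape_value):
    """Accuracy performance, defined in the text as (100 - MAPE)%."""
    return 100.0 - mape_value


def correlation(actual, predicted):
    """Correlation coefficient r between desired and network outputs (Eq. 3)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((p - mean_p) * (a - mean_a) for a, p in zip(actual, predicted)) / n
    var_a = sum((a - mean_a) ** 2 for a in actual) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    return cov / math.sqrt(var_a * var_p)


def total_mape(mape_tr, n_tr, mape_cv, n_cv, mape_test):
    """Eq. 4: exemplar-weighted training/C.V error averaged with the test error."""
    pooled = (mape_tr * n_tr + mape_cv * n_cv) / (n_tr + n_cv)
    return (pooled + mape_test) / 2.0


# Hypothetical costs (x1000), for illustration only.
actual_cost = [100.0, 200.0, 400.0]
estimated_cost = [110.0, 190.0, 400.0]

mae_value = mean_absolute_error(actual_cost, estimated_cost)
mape_value = mean_absolute_percentage_error(actual_cost, estimated_cost)
ap_value = accuracy_performance(mape_value)
r_value = correlation(actual_cost, estimated_cost)

# Hypothetical MAPEs combined with the set sizes reported in the paper.
total_value = total_mape(10.0, 116, 10.0, 27, 6.0)
```

Because the MAE is expressed in cost units while the MAPE is relative, reporting both shows whether a given absolute error matters for small or large projects.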

with small sensitivities can be discarded [19] (Principe et al., 2000). Finally, a report summarizing the variation of the output with respect to the variation of each input was generated and is presented in Figure VI.

Sensitivity about the mean per
