1992-8645 Evaluation Subquery Methods In Microsoft Sql Server 2008

Transcription

Journal of Theoretical and Applied Information Technology10th June 2014. Vol. 64 No.1 2005 - 2014 JATIT & LLS. All rights reserved.ISSN: 1992-8645www.jatit.orgE-ISSN: 1817-3195EVALUATION SUBQUERY METHODS IN MICROSOFT SQLSERVER 2008Tanty OktaviaBina Nusantara University, School of Information Systems, Jakarta – 11480, IndonesiaE-mail: tanty oktavia@yahoo.comABSTRACTFrequently user compiles a query to satisfy business process needs, whether directly using DBMS orconnecting into application system. There are many methods that can be used to generate desired results tosupport transaction, but all the query processing should be run effectively and efficiently. A significantaspect of query processing is how to choose an efficient execution strategy to minimize resource usage.Based on competitive environment in industry, many companies compete with each other to be a numberone. For this reason, company doing a lot of experiments, in order to accomplish a visions and missionobjectives. One of those is increasing performance of the system to support daily activities. Nowadays,most of business process in company already integrated with information technology. All of data isconsolidated by database, so user easily doing their job. According to this fact, user does not matter abouthow many transactions to be process per day, but how much time they need to process that transaction isthe priority concern. Because of that, query optimization techniques become more important to be appliedin many applications. In this research focused on measurement effectiveness of the method sub querywhich can be applied to reach optimal execution and present a comparative study of various cost to declareSub query. This study based on Microsoft SQL Server 2008 platforms.Keywords: Query Processing, Microsoft SQL Server 2008, Sub query, Cost Control1.INTRODUCTIONAlmost all of the business applications requiredatabase for the purpose of integration data as wellas data distribution among the system. DatabaseManagement System (DBMS) as software whichmanage data should be administered and optimizedfor better performance. This system should fastrespond any kind of possible threats that may befrighten. The performance of database system isinfluenced by several factors i.e.: database sizewhich growing proportional with the data,increased user base, increase in the user processes,improperly and un-tuned DBMS. All these factorsdegrade the system response time that wouldanticipate the performance degradation [1]. Thisincreased load has to be minimizing to stabilize theresponse rate of system.One of the major critical issues often happened in acompany was inadequate performance of queries toperform the suitable output. Many factors thatcause this occurrence, one of them are queryprocessing problem. Since then, a significantamount of research and observation has been doneto find an efficient solution for processing queries.A query may be expensive in terms of cost ofexecution if it is not optimized well [2]. For a longtime this matter can be a negative impact for acompany because decreasing business performance.The detection of performance degradation isdetected by continuously monitoring systemperformance parameter.In first generation structure database systems, thelow level procedural query language is generallyembedded in a high level programming languageand the programmer’s should select the mostappropriate execution strategy. In contrast, withdeclarative languages such as SQL, the userspecifies what data is required rather than how it isto retrieved [3]. This pattern transforms the userresponsibility to determine, or even know whatmethod to support good execution strategy. Themost important objectives to be considered in orderto improve the performance of DBMS are:designing an efficient data schema, optimizingindexes, analyzing execution plans, monitoringaccess to data, and optimizing query [4]. For thisresearch focused on optimizing query in MicrosoftSQL Server 2008 platform.255

Journal of Theoretical and Applied Information Technology10th June 2014. Vol. 64 No.1 2005 - 2014 JATIT & LLS. All rights reserved.ISSN: 1992-8645www.jatit.orgMicrosoft SQL Server 2008 is one of the mostwidely large scale databases in commercial. As aDatabase Management System (DBMS), it is asoftware product that function is to store andretrieve data as complied by software applications.This version aims to make data management selftuning, self organizing, and self maintaining, alsoprovide near zero downtime.Table 1. Rule Based Optimization Rankings [3]2.MATERIALS & METHODSGenerally, this study tries to compare the totalexecution time of all individual operation in orderto enhance query performance using sub querymethods. A large variety of processing techniquesare supported by Microsoft SQL Server 2008 in itsSQL Processing engine. For the analysis, the queryexecution plan and response time are considered.This study isiterative process to measure theimpact of sub query methods and reassessing thechanging from the query construction from theothermethodtodeterminesatisfactoryperformance, but need to realize the performancedepends on the amount of data and users’ activitiesthat access the application. There are still manyfactors that should be considered in order toincrease optimal performance of queries thatconnect with application [5]. High performance canbe reached by constructing an efficient code at theapplication level and design the database usingsuitable techniques. Nevertheless, using superiorspecification of hardware is one of theperformances influenced.3.RESULT AND DISCUSSIONInformation system application is growing faster inthe future and the database is going to be largerthan before to support business process. Day byday, data is manipulated by operation insert,E-ISSN: 1817-3195update, delete this condition makes the systemrunning slow and needs more time to execute thetransaction. Depending on application requirementsand user needs are force to retrieve the records of afile on either fast or sequential. Disk devices canstore records in some logical sequence. Assume onefile consists of many thousands or even a millionrecords and user want to retrieve a single recordsbased on a particularly criteria [3]. The disk devicemay be capable of going directly into the middle ofa file to show a record, but to accomplish that, thesystem need time execution to generate that recordwith many procedures compilation in the systemDBMS. To solve or prevent this indicationproblem, the database tunings can be the bestsolution. In this research, the method of databasetuning that being applied is the query optimizationprocess on sub query functions, which shouldoptimized for better performance.The process of DBMS manages query, consist ofseveral stages are as follow: After receiving queryfrom external level, query parser checkingsemantically and syntax. If there are violation ofstructure, user right, or procedure; an error messagewill send to user, otherwise query will be translatedinto internal level in relational algebra expression.Furthermore, query optimizer selects appropriateoptimal method to implement relational algebra andfinally generate query execution plan [6]. The queryplan is compiled code that contains the orderedsteps to carry out the query [7]. Identifying anappropriate plan for execution is very importantbecause these queries can be the determinants ofeffectiveness transaction. Using statistics on tablesand indexes, the optimizer predicts the cost of usingalternative access methods to resolve a particularquery.Queries in algebra are constructed using operators.Each relational query describes a step by stepprocedure for computing the desired output, basedon the order in which operators are applied [8].There are many variations of the operations that areincluded in relational algebra. The five fundamentaloperations in relational algebra: selection,projection, Cartesian product, union, set difference,join, intersection, and division operations. Theselection and projection operations are unaryoperations, which operate only on one relation. Theother operations work on pairs of relations istherefore called binary operations [3].256

Journal of Theoretical and Applied Information Technology10th June 2014. Vol. 64 No.1 2005 - 2014 JATIT & LLS. All rights reserved.ISSN: 1992-8645www.jatit.orgE-ISSN: 1817-3195just trying to read the data for applications, and alsodegrading query response time. Hence that fact,user should beware when applied index in query.The placement of index must be precise with thenecessity and procedures.Figure 1 Process Of Query Execution [6]A Sub query as one of the binary operations, is aquery that is nested anywhere inside DMLsyntaxes, such as INSERT, UPDATE, DELETE,OR SELECT statement. By default, the sub querylist only consists of a single column name orexpression, except for sub query that use EXISTSkeyword. Sub query can be combined with allfunction of SQL Server 2008. The SQL Server2008 will resolve the IN condition by accessing theindex for a number of times equal to the number ofvalues which to search. There are three types of subquery [3]: A scalar sub query returns a single columnand a single row. A row sub query returns multiple columns, butonly a single row. A table sub query returns one or more columnand multiple rows.The types of sub query can be applied inaccordance what data and how many value of data,a user want to return and satisfy the requirement.According to the utilization requirementsfor the system, Microsoft SQL Server 2008provides indexing method. It divided into clusteredindex and non clustered index. The clusteredindexes are recommended only for tables that arefrequently updated. Clustered type indexes areeffective when operators like BETWEEN, , , , , , ! . Non clustered indexes use isrecommended only for databases where updates areinfrequent and gives the optimal solution for the“exact match” [5]. Many questions are come up,which attributes are suitable to be applied index toget better performance. Gilenson states there aretwo sorts of possibilities: primary keys, and searchattributes [9].Indexes are extremely method forsearching data, but should keep in mind when therecord in a table is modified, the system must takethe time to update the table’s indexes too. It will doupdate automatically, but it needs time. If userupdates a lot of data, the time that it takes toexecute the updates operation and update all theindexes could slow down the operations that areIn transact SQL statements, there is usually noprocedure that regulate when to apply a sub queryor that does not, because it is not differencebetween them. User does not concern how toretrieve the data, but how many times the executionoccurred. However, in some cases where the data tobe returned numerous or the conditions from thequery are very complicated, it caused theperformance query is going to down. In case ofquery optimization, it is impractical to searchevaluation plans exhaustively, when theoptimization of query involves many relations [10].The table used in the experiment is nsactionDetailId, BinusianId, AttendDate,AttendPlace, Status, and InsertedDate. The table ispopulated with 500,000 transactional records. Theexperiments are performed on ten times withMicrosoft SQL Server 2008 platform. Thefollowing four queries are used in the experimentsfor observing time execution. Experiments areperformed using both INDEX and NONINDEX.Indexes are applied in ClassTransactionDetailIdAscendingly, BinusianId Ascendingly ingly, BinusianId Ascendingly includeStatus, InsertedDate.257Table 2. List Of Query Used For lassTransactionDetailId) CONVERT(VARCHAR(50),i.BinusianId) VERT(VARCHAR(50),ClassTransactionDetailId) CONVERT(VARCHAR(50),

Journal of Theoretical and Applied Information Technology10th June 2014. Vol. 64 No.1 2005 - 2014 JATIT & LLS. All rights reserved.ISSN: 1992-864523EXISTRELATIONALOPERATOR ( )www.jatit.orgBinusianId) CONVERT(VARCHAR(50),LastInsertedDate,13) AS [Key]FROM ertedDate) ASLastInsertedDateFROMTransactionAttendancesGROUP Messier.dbo.TransactionAttendancesiWHERE EXISTS(SELECT 1 FROM ertedDate) ASLastInsertedDateFROMTransactionAttendancesGROUP BYClassTransactionDetailId, ctionDetailId) CONVERT(VARCHAR(50),i.BinusianId) CONVERT(VARCHAR(50),i.InsertedDate,13) CONVERT(VARCHAR(50),ClassTransactionDetailId) CONVERT(VARCHAR(50),BinusianId) ARCHAR(50),i.ClassTransactionDetailId) CONVERT(VARCHAR(50),i.BinusianId) CONVERT(VARCHAR(50),i.InsertedDate,13) (SELECTCONVERT(VARCHAR(504E-ISSN: 1817-3195RELATIONALOPERATOR ( ) TOP),ClassTransactionDetailId) CONVERT(VARCHAR(50),BinusianId) CONVERT(VARCHAR(50),LastInsertedDate,13) AS [Key]FROM ertedDate) ASLastInsertedDateFROMTransactionAttendancesGROUP BYClassTransactionDetailId, BinusianId)xWHEREx.ClassTransactionDetailId i.ClassTransactionDetailIdAND x.BinusianId ndancesiWHERE i.TransactionAttendanceId (SELECT TOP 1TransactionAttendanceIdFROMTransactionAttendances xWHEREi.ClassTransactionDetailId x.ClassTransactionDetailIdAND i.BinusianId x.BinusianIdORDER BY x.InsertedDatedesc)GOBased on the above query can be checked, the userwant to retrieve the last transaction record thatrepresented by inserted date of each data. First tothird query, using an aggregate function that isMAX to return the latest record that user input. Forthe last query, the experiment try to change theMAX function with TOP, but still refer to the samemean of query and the same amount of result.258

Journal of Theoretical and Applied Information Technology10th June 2014. Vol. 64 No.1 2005 - 2014 JATIT & LLS. All rights reserved.ISSN: 1992-8645www.jatit.orgAfter doing the experiments, using the same queryand same platform, there are results from theexperiments on table 3:Table 3 Time Comparison Of Sub Query Method NoIndex AppliedThe time execution average shows relationaloperator sub query methods take the most time toprocess the query. Relational operators consist of , , , , , , ! in WHERE clause, or aHAVING clause. To apply this method enclosedthe query in parentheses and combine the relationaloperators corresponding to the requirements.IN and EXISTS method does not have a significantdifferentiation for time processing. On average thecomparison between IN and EXISTS only 147.3second faster if IN applied. Despite the fact IN onlyreturn one value, otherwise EXISTS, but IN andEXISTS has no high variance time. This fact can beas a guideline for user to choose the appropriatemethod, while using sub query method in queryapplication. It depends on how many data, userwants to return in query but for the processes havethe same time execution.E-ISSN: 1817-3195time execution, but relational operator and TOPonly need the minimal time to execute query. Ifuser decides to use relational operator for sub queryor using combination with TOP keyword,INDEXING is the standard requirement that querymust be applied to gain the best performance inquery execution.Utilization of IN and EXISTS still have the samelevels of time execution. The difference averageexecution is only 17.2 second faster if IN applied.This condition is contrast with before execution,when index were not applied. There are nosignificant difference between indexing applied ornot for IN and EXISTS keyword. It’s up to user tochoose what method to be applied in query.Figure 2 Time Comparison of Query Using INTable 4 Time Comparison Of Sub Query Method WithIndex AppliedFigure 3 Time Comparison of Query Using EXISTSTo overcome the boundary of existing condition,this study proposed an improved method of queryoptimization to reduce execution time. The manualquery tuning process is analyzed by applying indexmethod that affects the query performance, whereasautomatic query tuning process is provided byMicrosoft SQL Server 2008. After the indexing isapplied, there are extremely transformations of timeexecution from every activity than before. Averagetime execution of IN and EXISTS inversely withrelational operator. IN and EXISTS need the most259Figure 4 Time Comparison of Query Using RelationalOperator

Journal of Theoretical and Applied Information Technology10th June 2014. Vol. 64 No.1 2005 - 2014 JATIT & LLS. All rights reserved.ISSN: 1992-8645www.jatit.orgE-ISSN: 1817-3195REFERENCESFigure 5 Time Comparison of Query UsingCollaboration Operator and TOP4.CONCLUSIONIn Structured Query Languages (SQL), there aremany collaboration methods that can be applied toform sub query, such as IN, EXISTS, or RelationalOperator sub query. Based on the study, there arethe facts have founded:1. IN and EXISTS has no significant differenceon time execution, although IN only returnsingle value and EXISTS return many valuesfrom the query result. Likewise indexing isapplied in query.2. Relational operator sub query is notrecommended for the query because moretime consuming than the other methods of subquery if index are not applied. The situation isinversely if index are applied. Relationaloperator will extremely efficient than IN andEXISTS methods.3. Relational operator sub query can be veryeffective,ifthequerystructuredsystematically and use different method, i.e. itcan be combined with keyword TOP to returnquery in specific order particularly4. INDEXING extremely influenced queryprocessing with all methods sub query. Themethod can be applied to reach out optimalquery execution. This can happen because theitem in the index was sorted; each indexeditem was associated with a physical address.So the process can quickly find a record thatuser looking for [9]The above conclusions are based on experimentdata. The result may be varied in other environmentwith different configuration system or volume ofdata. According to these results, user can specifythe effective methods to get an optimal queryprocessing in order to support business process inorganization.[1] Verma, A. (2011). Enhanced Performance ofDatabase by. IJCSMS International Journal ofComputer Science & Management Studies .[2] Gupta, M. K., & Chandra, P. (2011). AnEmpirical Evaluation of LIKE Operator inOracle. BVICAM’s International Journal ofInformation Technology .[3] Connolly, T. M., & Begg, C. E. (2010).Database Systems : A Practical Approach toDesign, Implementation, and Management.Boston: Pearson Education.[4] Mercioiu, N., & Vladucu, V. (2010).ImprovingSQLServerPerformance.Informatica Economică[5] Lungu, I., Mercioiu, N., & Vladucu, V. (n.d.).Optimizing Queries in SQL Server 3008.Scientific Bulletin – Economic Sciences, Vol. 9(15)[6] Alamery, M., Faraahi, A., Javadi, H. H.,Nourossana, S., & Erfani, H. (2012).Application of Bees Algorithm in Multi-JoinQuery Optimization. ACSIJ Advances inComputer Science: an International Journal .[7] Karthik, P., Reddy, G. T., & Vanan, E. K.(2012). Tuning the SQL Query in order toReduceTimeConsumption.IJCSIInternational Journal of Computer Science[8] Lasya, S., & Tanuku, S. (2011). A Study ofLibrary Databases by Translating Those SQLQueries Into Relational Algebra andGenerating Query Trees. IJCSI InternationalJournal of Computer Science[9] Gillenson, M. L. (2012). Fundamentals ofDatabase Management Systems. USA: Wiley.[10] Mahajan, D. S., & Jadhav, V. P. DQUERIES. International Journal of DatabaseManagement Systems ( IJDMS )260

function of SQL Server 2008. The SQL Server 2008 will resolve the IN condition by accessing the index for a number of times equal to the number of values which to search. There are three types of sub query [3]: A scalar sub query returns a single column and a single row. A row sub query returns multiple columns, but only a single row.