2022 JETIR January 2022, Volume 9, Issue 1    www.jetir.org (ISSN-2349-5162)
JETIR2201228    Journal of Emerging Technologies and Innovative Research (JETIR)    www.jetir.org

Big Data Analytics for Processing Real-time Unstructured Data from CCTV in Traffic Management

Kevin Sheth
Dept. of Computer Engineering, Savitribai Phule Pune University, Pune, Maharashtra, India

Abstract:- Today, many devices generate data anytime and anywhere. The volume of data has increased significantly, and handling it has become complicated: the data is difficult to process and has unstable attributes. In traffic management, CCTV cameras are installed to monitor specific locations on the highway. CCTV produces unstructured data in image and video formats, and the complexity of this data makes it difficult to process. This survey proposes big data analytics to process unstructured real-time data from CCTV into information displayed on a dashboard. It uses the YOLO v4 architecture and the COCO dataset to implement the YOLO framework for traffic flow counting and for detection of illegal parking, which is classified as an anomalous situation. The unstructured data from CCTV is then converted to the semi-structured JSON format. The data can also be visualized in real time to help local governments understand highway conditions. Historical data is stored in a NoSQL database to give a deeper understanding of vehicle traffic patterns and more. The system requires a line drawn in the region of interest (ROI) as the trigger for counting passing vehicles.

Keywords:- Real-time, CCTV, unstructured data, big data, traffic management.

I. INTRODUCTION:-

Big data describes a large collection of data that grows rapidly and has complex types. It has several characteristics, such as volume, velocity, and variety. Data is of three types: i) structured, ii) unstructured, and iii) semi-structured. Structured data has a fixed format and can be easily processed using traditional databases such as an RDBMS. Semi-structured data has structure but dynamic attributes, as in the JSON or XML formats. Unstructured data has no predefined format, for example images and videos.
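To make the step from unstructured video to semi-structured JSON more concrete, the following is a minimal sketch of counting vehicles at an ROI line and serializing the result. It is an illustration under stated assumptions, not the paper's implementation: the `Detection` fields, the per-vehicle `track_id` (which presumes some tracker runs after the detector), the horizontal line position `line_y`, and the JSON field names `camera_id`, `timestamp`, and `counts` are all hypothetical, and the hand-written frames stand in for real YOLO v4 output.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    track_id: int  # hypothetical ID assigned by a tracker (assumption)
    label: str     # COCO class name such as "car", "truck", "bus"
    cy: float      # vertical centre of the bounding box, in pixels


def count_crossings(frames, line_y):
    """Count each track once, when its centre moves from above the
    ROI line (cy < line_y) to on or below it (cy >= line_y)."""
    last_cy = {}      # track_id -> centre in the previous frame it appeared in
    counted = set()   # track_ids that have already triggered the line
    counts = {}       # label -> number of vehicles counted
    for detections in frames:
        for det in detections:
            prev = last_cy.get(det.track_id)
            crossed = prev is not None and prev < line_y <= det.cy
            if crossed and det.track_id not in counted:
                counted.add(det.track_id)
                counts[det.label] = counts.get(det.label, 0) + 1
            last_cy[det.track_id] = det.cy
    return counts


def to_json_record(camera_id, timestamp, counts):
    """Serialize the counts as one semi-structured JSON document."""
    return json.dumps(
        {"camera_id": camera_id, "timestamp": timestamp, "counts": counts},
        sort_keys=True,
    )


# Hand-written stand-in for per-frame detector output (assumption).
frames = [
    [Detection(1, "car", 90.0), Detection(2, "truck", 150.0)],
    [Detection(1, "car", 105.0), Detection(2, "truck", 170.0)],
    [Detection(1, "car", 120.0), Detection(3, "bus", 95.0)],
    [Detection(3, "bus", 101.0)],
]

counts = count_crossings(frames, line_y=100.0)
record = to_json_record("cam-01", "2022-01-01T08:00:00Z", counts)
print(record)
# {"camera_id": "cam-01", "counts": {"bus": 1, "car": 1}, "timestamp": "2022-01-01T08:00:00Z"}
```

The truck is never counted because it is already below the line when first seen, which illustrates why the line acts as a trigger rather than a simple presence test. A record of this shape could be stored as one document in a NoSQL database, since the set of vehicle labels under `counts` can vary from record to record.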
Many applications today generate large amounts of data, which produces the big data phenomenon everywhere. It occurs in many fields, including industry, banking, media, tourism, health care, transportation, and more. Much of this data is in an unstructured format, so a lot of data is produced without a clear way to handle it.

One of the main areas is traffic management, where businesses or governments collect data from CCTV images and video. Many CCTV cameras are installed on highways to monitor traffic. The data generated by CCTV is one example of unstructured data: it is in video format, large in size, and grows very rapidly. Processing unstructured data requires a big data solution. At the same time, the complexity of traffic needs to be understood as soon as possible, which means that unstructured data from CCTV needs to be processed in real time.

There are two ways to process data: 1) batch processing and 2) real-time processing. Real-time processing is more complicated than batch processing because the data must be processed in a short amount of time, in real time or near real time. Real-time processing is required in several domains, one of which is traffic management. Real-time processing helps authorities instantly understand current traffic
conditions. One example is understanding traffic density. Big data analysis helps analyze CCTV traffic in real time and return structured reports to the responsible authorities.

II. LITERATURE SURVEY:-

1. Nada Elgendy and Ahmed Elragal, “Big Data Analytics: A Literature Review Paper”, Springer International Publishing Switzerland, August 2014
   Preprocessing: Due to the rapid pace of data growth, solutions must be discovered and provided to manage these data sets and extract value and knowledge from them.
   Feature extraction and classification: Big Data Analytics and Decision Making, Customer Intelligence, Supply Improvement, Risk Management and [...]
   90%. By using various methods, we can handle the data.
   Research gap identified: In future, research can focus on providing a road map or framework.

2. D. P. Acharjya, Kauser Ahmed P, “A Survey on Big Data Analytics: Challenges, Open Research Issues and Tools”, International Journal of Advanced Computer Science and Applications, Vol. 7, No. 2, 2016
   Preprocessing: Big data analysis requires efforts on multiple levels to extract the knowledge needed for decision making.
   Feature extraction and classification: -
   It provides a platform to explore big data at multiple stages.
   Research gap identified: Efficient tools to be developed must have provision to handle noisy and imbalanced data, uncertainty and inconsistency, and missing values.

3. Saurabh Malgaonkar, Sanchi Soral, Shailja Sumeet, Tanay Parekhji, “Study on Big Data Analytics Research Domains”
   Preprocessing: Data analytics is the trending field of data analysis to observe patterns and predict future outcomes.
   Feature extraction and classification: Cloud Systems, Data Analytics and Interoperability, Data Analysis, Machine Learning and Neural Networks
   91%. The final part of the article covers machine learning algorithms and neural networks to train a dataset to recognize patterns from the modeled data and to predict outcomes based on training and pattern recognition.
   Research gap identified: More research can be done in the same domain.

4. Pichaimuthu Mohankumar, “BIG Data Analytics: A Framework for Unstructured Data Analysis”, International Journal of Engineering and Technology (IJET), March 2013
   Preprocessing: Most companies have an unstructured model. Information retrieval and extraction is necessary and important work in the fields of the Semantic Web.
   Feature extraction and classification: HBase using Hadoop, real-time online or stream processing, batch processing
   89%. The paper finds an efficient way to store unstructured data and an appropriate approach to data retrieval.
   Research gap identified: By using text mining algorithms, we would try to get more insights.

5. Suyash Mishra, Dr Anuranjan Misra, “Structured and Unstructured Big Data Analytics” (ICCTCEEC 2017)
   Preprocessing: Data is generated in various formats, so it is difficult to analyze. There is no fixed format.
   They describe various techniques and software used to manage and process unstructured big data in an efficient manner, which increases the performance of complexity analysis.
   Research gap identified: By using MapReduce, unstructured data can be transformed and converted into structured data.

6. Jaein Kim, Nacwoo Kim, Byungtak Lee, Joonho Park, Kwangik Seo, “RUBA: Real-time Unstructured Big Data Analysis Framework”, October 2013
   Preprocessing: The recommendation framework provides dynamic modification and real-time analysis for unstructured big data analysis.
   Feature extraction and classification: Big Data, Unstructured Data, Real-time System, CEP, CQL
   88%. The object monitoring system is implemented as a test system applied to our framework and we have confirmed all the functionality and usability of our framework.
   Research gap identified: We can use Kafka streams or Spark streams to perform complex projects.

7. Rubal, Sheetal Kalra, “Real-Time Applications of Big Data- A Survey” (IJERT), Vol. 5 Issue 03, March-2016
   Preprocessing: A large amount of data is generated from different sources and can be structured or unstructured. This type of data is difficult to process and manage and contains millions of records, including social media, web sales, etc.
   Feature extraction and classification: Big data, Hadoop, HDFS, MapReduce, NoSQL, real-time data analytics
   90%. To improve the quality of information and decision making, it is important to effectively analyze this large volume of data to answer new challenges.
   Research gap identified: We would try to use the RUBA framework for real-time applications.

8. Sasan Amini, Ilias Gerostathopoulos, Christian Prehofer, “Big Data Analytics Architecture for Real Time Traffic Control”, Conference Paper, June 2017
   Preprocessing: There is a critical need to develop new tools and systems to keep pace with the rise of big data.
   Feature extraction and classification: Kafka, HDFS
   80%. They proposed a flexible structure, based primarily on a scientific evaluation of the necessities of the domain.
   Research gap identified: We can upgrade to Analysis as a Service (AaaS).

9. N. Naga Lakshmi and T. Asha Latha, “AUTOMATED TRAFFIC MANAGEMENT SYSTEM USING BIG DATA TECHNOLOGY” (IJLTET)
   Preprocessing: Large-scale calculations are very difficult to perform, and large amounts of data are generated from sensors.
   Feature extraction and classification: Captures snapshot of vehicle, sends SMS to owner
   89%. Smart sensors are used for identifying drivers who ignore traffic laws.
   Research gap identified: We can implement this system to reduce accidents.

10. Mauricio Perez, Alex C. Kot, Anderson Rocha, “DETECTION OF REAL-WORLD FIGHTS IN SURVEILLANCE VIDEOS”, IEEE 2019
   Preprocessing: Previous work was either too superficial or unrealistic. No one has done real-time detection on long-duration CCTV recordings.
   Feature extraction and classification: 3D-CNN, local interest points
   92%. Dataset containing 1000 videos of real fights; more than 8 hours of CCTV material with comments.
   Research gap identified: Spatial features have not been shown to complement the temporal information positively.

11. Payal Saha, Mohit Mittal, Shreya Gupta, “Big Data Trends and Analytics: A Survey” (IJCA) 2018
   Preprocessing: It is envisioned that the Big Data concept will ensure that huge chunks of data are reduced to a manageable form.
   Feature extraction and classification: Big data, Hadoop, MapReduce, data analytics, big data tools
   95%. Discussed concepts of Big Data and challenges.
   Research gap identified: More research is required because data is increasing day by day.

12. Subramaniyaswamy, Vijayakumar, Logesh R and Indragandhi V, “Unstructured Data Analysis on Big Data using Map Reduce” (ISBCC’15)
   Preprocessing: Social networking sites like Facebook and Twitter have discovered that data growth will get out of control in [...]
   Feature extraction and classification: Mahout, Maven, Sentiment Analysis
   85%. It processes data in parallel as fractions in distributed clusters and aggregates all data between clusters to get the final processed data.
   Research gap identified: New analysis is done.

13. Jeffrey Dean and Sanjay Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters”, OSDI 2004
   Preprocessing: MapReduce is a programming model and associated implementation for processing and generating datasets.
   Feature extraction and classification: Large-scale indexing
   98%. The MapReduce programming model has been successfully used at Google for various purposes.
   Research gap identified: Large datasets can be implemented through the MapReduce technique.

14. Joao Ricardo Lourenco, Veronika Abramova, Bruno Cabral, Jorge Bernardino, “NoSQL in practice: a write-heavy enterprise application”, 2015 IEEE International Congress on Big Data
   Preprocessing: Current benchmarks evaluate database performance by running specific queries on mostly aggregated data.
   Feature extraction and classification: NoSQL, Cassandra, SQL Server
   85%. A homogeneous cluster using four machines with similar hardware was chosen to host the databases.
   Research gap identified: We should use in real time.

15. Ankit Parag Shah, Jean Bapstite Lamare, Tuan Nguyen-Anh, and Alexander Hauptmann, “CADP: A Novel Dataset for CCTV Traffic Camera based Accident Analysis”, 2018 IEEE
   Preprocessing: Addresses the lack of public data for the study of automated spatiotemporal annotations for road safety.
   Feature extraction and classification: CN, ACM, RCNN
   80%. Demonstrated the performance of accident forecasting in the dataset using Faster R-CNN and an Accident LSTM architecture.
   Research gap identified: We should implement in the future.

III. ALGORITHMIC SURVEY:

1. Big Data Analytics: A Literature Review Paper
   Algorithm used: MapReduce and HDFS, BDAD framework
   Time complexity: O(K), O(1)
   Space complexity: O(M), [...]
   Accuracy: 80%
   Cost-effective solution. It is not always very easy to implement each and everything as a [...]

2. Study on Big Data Analytics Research Domains
   Algorithm used: Data Mining
   Time complexity: O(kn^2)
   Space complexity: O(kn)
   Accuracy: 85%
   [...] statistical significance of features. The assumptions of logistic regression.

3. BIG Data Analytics: A Framework for Unstructured Data Analysis
   Can store large data sets. No support for SQL structure.

4. DETECTION OF REAL-WORLD FIGHTS [...]
   It automatically detects the important features without any human supervision. CNNs do not encode the position and orientation of objects.

5. CADP: A Novel Dataset for CCTV Traffic Camera based Accident Analysis
   Algorithm used: R-CNN
   Time complexity: O(nt (ij jk [...]
   Very high accuracy in image recognition problems. CNNs do not encode the position and orientation of objects.

CONCLUSION:-

Deep learning algorithms and NoSQL databases are big data technologies that can process unstructured data in real time. They are very helpful for understanding traffic conditions and for police officers monitoring the highway. The proposed prototype can recognize objects such as cars, trucks, and buses, and aggregate vehicle types. The YOLO v4 model and the COCO dataset have been
trained to classify highway traffic objects. The system can also analyze the normal and unusual status of real-time unstructured data. Running on a Dell Inspiron with a 1050 Ti GPU and an Intel i7 processor, this system can monitor live CCTV surveillance at 10 fps.

REFERENCES:-

1. Nada Elgendy and Ahmed Elragal, “Big Data Analytics: A Literature Review Paper”, August 2014, DOI: 10.1007/978-3-319-08976-8_16
2. D. P. Acharjya, Kauser Ahmed P, “A Survey on Big Data Analytics: Challenges, Open Research Issues and Tools”, International Journal of Advanced Computer Science and Applications, Vol. 7, No. 2, 2016
3. Saurabh Malgaonkar, Sanchi Soral, Shailja Sumeet, Tanay Parekhji, “Study on Big Data Analytics Research Domains”
4. Pichaimuthu Mohankumar, “BIG Data Analytics: A Framework for Unstructured Data Analysis”, International Journal of Engineering and Technology (IJET), March 2013
5. Suyash Mishra, Dr Anuranjan Misra, “Structured and Unstructured Big Data Analytics” (ICCTCEEC-2017)
6. Jaein Kim, Nacwoo Kim, Byungtak Lee, Joonho Park, Kwangik Seo, “RUBA: Real-time Unstructured Big Data Analysis Framework”, October 2013
7. Rubal, Sheetal Kalra, “Real-Time Applications of Big Data- A Survey” (IJERT), Vol. 5 Issue 03, March-2016
8. Sasan Amini, Ilias Gerostathopoulos, Christian Prehofer, “Big Data Analytics Architecture for Real-Time Traffic Control”, Conference Paper, June 2017
9. N. Naga Lakshmi and T. Asha Latha, “AUTOMATED TRAFFIC MANAGEMENT SYSTEM USING BIG DATA TECHNOLOGY” (IJLTET)
10. Mauricio Perez, Alex C. Kot, Anderson Rocha, “DETECTION OF REAL-WORLD FIGHTS IN SURVEILLANCE VIDEOS”, IEEE 2019
11. Payal Saha, Mohit Mittal, Shreya Gupta, “Big Data Trends and Analytics: A Survey” (IJCA) 2018
12. Subramaniyaswamy, Vijayakumar, Logesh R and Indragandhi V, “Unstructured Data Analysis on Big Data using Map Reduce” (ISBCC’15)
13. Jeffrey Dean and Sanjay Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters”, OSDI 2004
14. Joao Ricardo Lourenco, Veronika Abramova, Bruno Cabral, Jorge Bernardino, “NoSQL in practice: a write-heavy enterprise application”, 2015 IEEE International Congress on Big Data
15. Ankit Parag Shah, Jean-Baptiste Lamare, Tuan Nguyen-Anh, and Alexander Hauptmann, “CADP: A Novel Dataset for CCTV Traffic Camera based Accident Analysis”, 2018 IEEE
