By: Kim Schroeder, Lecturer, SLIS, WSU
A Presentation to the NDSA and SAA Wayne State University Student Groups

The Problem
1) Digital files become unusable at alarming rates
  Hardware failure and obsolescence
  Storage media failure and obsolescence
  Software incompatibility and obsolescence

The Problem
2) Volume
  The sheer quantity of digital files is growing at an alarming rate.
  One billion emails from the first Obama Administration (National Archivist, personal communication).
  The world’s information is doubling every two years (EMC2/IDC 2011 Digital Universe Study: Extracting Value from Chaos).

The Problem
3) Skills
  Lack of concrete skills and training in a new field

The Problem
4) Time
  Lack of time and budget allocated to the high maintenance that digital files require

The Problem
5) Tools
  Lack of clear paths for evaluating tools, and fuzzy documentation of how the tools relate to each other in the workflow.
  Collectivematters.com

The Traditional Solution
Check one:
  Pack disks in boxes and wait
  Basic finding aid and wait
  Do nothing
  All of the above

Digital Forensics as a Discipline
Digital forensics examines computers and their files to trace the origin of criminal activity. This may mean piecing together files that were deleted, tracing logs of email activity to find a person of interest, or using any clues on a computer to find a missing person.
To do successful police work, staff need to understand the back-end pieces of how a computer works and how to recreate content that is damaged or only partially there. They need to assess duplicates, look for digital bombs that clear all data, and understand how to avoid viruses.
Many tools have evolved to help in these endeavors.

Digital Forensics as a Discipline
These more mature tools (with modifications) can be of value to archivists.
Cal Lee and Kam Woods of the UNC SILS have written grants to develop this opportunity.
BitCurator is their creation. It uses individual open source tools and develops APIs to unify them into one portal tool.

BitCurator Brings Together
  Imaging of drives
  Checksum functions
  Bulk extractors for big data management
  Metadata and reporting tools

BitCurator
  Runs on a virtual machine, which is often used for programming environments.
  Linux based.
  Some of the user interface is command line, but work is under way on a better menu-driven design.

BitCurator Types of Tools
Disk Imaging: creating a disk image that replicates a drive, including its directory hierarchy and file names. The goal is that all data is captured, including potentially hidden viruses.
“Disk images can serve as baselines for comparison for digital preservation activities, as they provide fail-safe mechanisms when curatorial actions make unexpected changes to data; enable access to potentially valuable data that resides below the file system level; and provide options for future analysis.” (http://www.ils.unc.edu/callee/p57-woods.pdf)
In BitCurator, Guymager does this.

BitCurator Types of Tools
Checksum processes create an identifier at the file level that is, for practical purposes, unique to the file's contents. MD5 is one widely used checksum algorithm. According to TechTarget.com it is “an algorithm that is used to verify data integrity through the creation of a 128-bit message digest from data input (which may be a message of any length) that is claimed to be as unique to that specific data as a fingerprint is to the specific individual.”
This program should be run on a regular basis to indicate file changes or corruption, whether from viruses or bit rot.
A checksum may look like this:
85c0455bd6ad1acf7d1eebb20488d812
An MD5 checksum is always 32 hexadecimal characters (0-9 and a-f). If you change one character in the file, the new checksum is completely different rather than differing in a single position.
In BitCurator, SDHash does this.
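The checksum routine described on this slide can be sketched in a few lines of Python using the standard hashlib module. This is only an illustration of the idea, not how BitCurator implements it; the function names md5_of_file and changed_files and the manifest dictionary are assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 digest of a file as 32 hexadecimal characters."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(manifest):
    """Return the paths whose current checksum differs from the recorded one.

    `manifest` (a hypothetical structure for this sketch) maps a file path
    to the checksum that was recorded at an earlier audit.
    """
    return [path for path, recorded in manifest.items()
            if md5_of_file(path) != recorded]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        sample = Path(folder) / "letter.txt"
        sample.write_text("Dear archivist")
        manifest = {str(sample): md5_of_file(sample)}
        print(changed_files(manifest))       # nothing has changed yet
        sample.write_text("Dear archivisT")  # alter a single character
        print(changed_files(manifest))       # the file is now reported
```

Recording the checksums once and re-running the comparison on a schedule is what turns a one-off checksum into an ongoing integrity check.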

BitCurator Types of Tools
Bulk Extractors: meant to manage large chunks of data. They can redact information (social security numbers and other sensitive content).
In BitCurator, the application marked “Bulk Extractor” does this.
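To make the redaction idea concrete, here is a minimal Python sketch that finds and masks strings shaped like U.S. Social Security numbers. It is not Bulk Extractor's code or interface; the pattern, the function names and the mask text are assumptions chosen for illustration, and a real scanner handles many more formats and works on raw disk images rather than plain text.

```python
import re

# Matches the common written form of a U.S. Social Security number,
# for example 123-45-6789. Other written forms are not covered here.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(text):
    """Return (offset, matched text) for every SSN-like string in `text`."""
    return [(match.start(), match.group())
            for match in SSN_PATTERN.finditer(text)]


def redact_ssn_like(text, mask="XXX-XX-XXXX"):
    """Return a copy of `text` with every SSN-like string replaced by `mask`."""
    return SSN_PATTERN.sub(mask, text)


if __name__ == "__main__":
    note = "Donor SSN 123-45-6789, phone 555-123-4567."
    print(find_ssn_like(note))
    print(redact_ssn_like(note))
```

Reporting the offset of each hit, rather than only masking it, lets a curator review what was found before deciding what to redact.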

BitCurator Types of Tools
Metadata Tools: many of these independent tools help pull existing information out in a batch.
Fiwalk is a tool that pulls the metadata from the digital content and can convert it to XML.
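As a rough picture of what batch metadata extraction to XML involves, the Python sketch below walks a folder and writes each file's name, size and modification time into an XML document. It is not fiwalk and its output is not a valid DFXML document; the function name directory_metadata_xml and the element names are assumptions for this example, and fiwalk itself reads disk images rather than mounted folders.

```python
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def directory_metadata_xml(root_dir):
    """Return an XML string describing every file under `root_dir`."""
    # Element names here are illustrative only, not a formal schema.
    root = ET.Element("fileset", source=os.path.abspath(root_dir))
    for folder, _subdirs, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            path = os.path.join(folder, name)
            info = os.stat(path)
            entry = ET.SubElement(root, "fileobject")
            ET.SubElement(entry, "filename").text = os.path.relpath(path, root_dir)
            ET.SubElement(entry, "filesize").text = str(info.st_size)
            # Store the modification time as an ISO 8601 timestamp in UTC.
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            ET.SubElement(entry, "mtime").text = modified.isoformat()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(directory_metadata_xml("."))
```

Because the result is structured XML, a later reporting step can count file types or sum sizes without touching the original files again.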


BitCurator Types of Tools
Reporting Tools: gather the information extracted and copied to generate better analysis of content, quality, integrity, etc.
This is done in BitCurator by the “BitCurator Reporting Tool”, which actually includes fiwalk, since it covers metadata, annotation and reporting. DFXML (Digital Forensics XML) also helps to convert the XML into human-readable report form.
Report generation can include: “visualizations, transcriptions of file system metadata, high level reports on file types, and overviews of features identified by the Bulk Extractor.”
(See page 36 for a small sample ickstart-v0.7.4.pdf)

Starting BitCurator
Open the Oracle Virtual Machine Environment

The Goal are

Learn in Pieces
  BitCurator
  Deleting Duplicate Files
  ckstart-v0.7.4.pdf
  Bitcurator.net
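For the duplicate-files topic listed above, one common approach is to group files by checksum and review any group with more than one member. The Python sketch below shows that approach under stated assumptions: the function name find_duplicates is hypothetical, each file is read into memory whole for brevity, and nothing is deleted, since deciding which copy to keep is a curatorial choice.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root_dir):
    """Return lists of paths under `root_dir` whose contents are identical."""
    by_digest = defaultdict(list)
    for folder, _subdirs, filenames in os.walk(root_dir):
        for name in filenames:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                # Whole-file read keeps the sketch short; chunked reading
                # would be preferable for very large files.
                digest = hashlib.md5(handle.read()).hexdigest()
            by_digest[digest].append(path)
    # Only checksums shared by two or more files indicate duplicates.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    for group in find_duplicates("."):
        print(group)
```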

Resources
“Automated Analysis and Visualization of Disk Images and File Systems for Preservation”
“Extending Digital Repository Architectures to Support Disk Image Preservation and Access” http://www.ils.unc.edu/callee/p57-woods.pdf
“Digital Forensics and Born-Digital Content in Cultural Heritage Collections” http://www.clir.org/pubs/reports/pub149/pub149.pdf

Resources curator.net/index.php?title Software
