File carving is a process used in computer forensics to extract data from a disk drive or other storage device without the assistance of the file system that originally created the file. It is a method that recovers files from unallocated space without any file information and is used to recover data and carry out a digital forensic investigation. It is also called “carving,” which is a general term for extracting structured data out of raw data, based on format-specific characteristics present in the structured data.
As a forensics technique that recovers files based merely on file structure and content and without any matching file system meta-data, file carving is most often used to recover files from the unallocated space in a drive. Unallocated space refers to the area of the drive which no longer holds any file information as indicated by the file system structures like the file table. In the case of damaged or missing file system structures, this may involve the whole drive. In simple words, many filesystems do not zero-out the data when they delete it. Instead, they simply remove the knowledge of where it is. File carving is the process of reconstructing files by scanning the raw bytes of the disk and reassembling them. This is usually done by examining the header (the first few bytes) and footer (the last few bytes) of a file.
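The scanning step described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular carving tool works: the function name `find_signatures`, the chunk size, and the choice of the three-byte JPEG start sequence as the example signature are my own assumptions. It reads a raw image in chunks and keeps a small overlap so that a signature split across two chunks is still found.

```python
import io

JPEG_START = b"\xff\xd8\xff"  # example signature (assumption for this sketch)


def find_signatures(stream, signature, chunk_size=4096):
    """Yield the absolute offset of every occurrence of `signature` in a binary stream."""
    overlap = len(signature) - 1   # bytes carried over between chunks
    tail = b""                     # end of the previous window
    base = 0                       # absolute offset where `tail + chunk` starts
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        pos = window.find(signature)
        while pos != -1:
            yield base + pos
            pos = window.find(signature, pos + 1)
        # The tail is shorter than the signature, so no hit is reported twice.
        tail = window[-overlap:] if overlap else b""
        base += len(window) - len(tail)


# Demo on an in-memory "disk image": the first signature straddles a chunk boundary.
image = b"\x00" * 4094 + JPEG_START + b"junk" + JPEG_START
offsets = list(find_signatures(io.BytesIO(image), JPEG_START))
print(offsets)
```

On a real case the stream would be a raw disk image opened in binary mode, and each reported offset is only a candidate start of a file that still has to be validated.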
File carving is a great method for recovering files and fragments of files when directory entries are corrupt or missing. It is used especially by forensics experts in criminal cases for recovering evidence. In certain cases related to child pornography, law enforcement agents are often able to recover more images from the suspect’s hard disks by using carving techniques. Another example is the hard disks and removable storage media that U.S. Navy SEALs took from Osama Bin Laden’s compound during their raid. Forensic experts used file carving techniques to squeeze every bit of information out of these media.
Difference between File Recovery and File Carving
After reading the above, I think you might be confused: If file carving is a method of file recovery, then what is the difference between file recovery and file carving?
Modern operating systems do not automatically eradicate a deleted file without prompting for the user’s confirmation. Deleted files are recoverable by using some forensic programs if the deleted file’s space is not overwritten by another file. A damaged file can only be recovered if its data is not corrupted beyond a minimal degree. File recovery is different from file restoration, in which a backup file stored in a compressed (encoded) form is restored to its usable (decoded) form. So there is a difference between the techniques. File recovery techniques make use of the file system information and, by using this information, many files can be recovered. If the information is not correct, then it will not work.
File carving works only on the raw data on the media and is not connected with the file system structure. It doesn’t care which file system is used for storing the files. In the FAT file system, for example, when a file is deleted, the file’s directory entry is marked as unallocated: the first character of the filename is replaced with a marker (the byte 0xE5), but the file data itself is left unchanged. Until it’s overwritten, the data is still present.
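To make the FAT example concrete, here is a small Python sketch that walks a block of 32-byte directory entries and lists the ones whose first name byte is the deleted marker 0xE5. It assumes the FAT32 short-name entry layout (first cluster split across offsets 20 and 26, size at offset 28). The helper names `make_entry` and `list_deleted_entries` are hypothetical and exist only for this illustration.

```python
import struct

ENTRY_SIZE = 32          # every FAT directory entry is 32 bytes long
DELETED_MARKER = 0xE5    # first name byte of a deleted entry
END_OF_DIRECTORY = 0x00  # first name byte of the first never-used entry
ATTR_LONG_NAME = 0x0F    # attribute value of long-file-name entries


def make_entry(name11: bytes, first_cluster: int, size: int) -> bytes:
    """Build a 32-byte short-name entry for the demo (hypothetical helper)."""
    assert len(name11) == 11
    return (
        name11
        + bytes([0x20])                            # attribute: archive
        + bytes(8)                                 # reserved, creation and access fields
        + struct.pack("<H", first_cluster >> 16)   # high word of the first cluster
        + bytes(4)                                 # last write time and date
        + struct.pack("<HI", first_cluster & 0xFFFF, size)
    )


def list_deleted_entries(directory: bytes):
    """Return (name, first_cluster, size) for each deleted short-name entry."""
    found = []
    for offset in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = directory[offset:offset + ENTRY_SIZE]
        if entry[0] == END_OF_DIRECTORY:
            break      # nothing is in use past this point
        if entry[0] != DELETED_MARKER:
            continue   # entry is still allocated
        if (entry[11] & 0x3F) == ATTR_LONG_NAME:
            continue   # long-name fragment, skipped in this sketch
        # The first character was overwritten by the marker, so show "?" instead.
        name = "?" + entry[1:8].decode("ascii", "replace").rstrip()
        ext = entry[8:11].decode("ascii", "replace").rstrip()
        (cluster_hi,) = struct.unpack_from("<H", entry, 20)
        cluster_lo, size = struct.unpack_from("<HI", entry, 26)
        full_name = f"{name}.{ext}" if ext else name
        found.append((full_name, (cluster_hi << 16) | cluster_lo, size))
    return found


live = make_entry(b"REPORT  TXT", 5, 1200)
deleted = bytes([DELETED_MARKER]) + make_entry(b"PHOTO   JPG", 9, 48213)[1:]
print(list_deleted_entries(live + deleted + bytes(ENTRY_SIZE)))
```

This is the file-recovery view: it still depends on the directory entry surviving. Carving is what remains when even these entries are gone.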
File systems Overview
Windows File systems: Microsoft Windows mainly uses two types of file system, FAT and NTFS.
A) FAT, which stands for “file allocation table,” is the simplest file system type. It consists of a boot sector, a file allocation table, and plain storage space to store files and folders. Over time, FAT has been extended through the FAT12, FAT16, and FAT32 variants. FAT32 is compatible with Windows-based storage devices, although the standard Windows format tools can’t create a FAT32 file system with a size of more than 32GB.
B) NTFS, or “new technology file system,” was introduced with Windows NT. NTFS is the default type for file systems over 32GB. This file system supports many file properties, including encryption and access control.
Linux File systems: As we already know, Linux is an open source operating system. It was developed for testing and development and aimed to try out different concepts for file systems, so a variety of file systems is available in Linux.
A) Ext2, Ext3, Ext4: These are the native Linux file systems. Generally, one of them is used as the root file system in most Linux distributions. Ext3 is an upgraded Ext2 that adds journaling (transactional file write operations). Ext4 is a further development of Ext3 that supports optimized file allocation information and file attributes.
B) ReiserFS: This file system is designed for storing huge numbers of small files. It has good file search capabilities, and it allocates files compactly by storing file tails or small files along with the metadata, so that large file system blocks are not wasted on them.
C) XFS: This file system was developed by SGI for its IRIX servers. XFS has great performance and is widely used to store files.
D) JFS: This file system was developed by IBM for powerful computing systems. It is supported by most modern Linux distributions.
MacOS File systems: Apple Macintosh OS has traditionally used the HFS+ file system, which is an extension of the HFS file system. HFS+ is used on Apple products, including Mac computers, iPhones, iPods, and Apple Xserve products. Advanced server products also use the Apple Xsan file system, a clustered file system derived from the StorNext or CentraVision file systems.
In addition to files and folders, HFS+ also stores Finder information such as directory views and window positions.
File Carving Techniques: During digital investigations, various types of media have to be analyzed. Relevant data can be found on various storage and networking devices and in computer memory. Various types of data such as emails, electronic documents, system logs, and multimedia files have to be analyzed. In this article, we focus on the recovery of multimedia files that are stored either on storage devices or in computer memory using the file carving approach. File carving is a recovery technique that considers only the contents and structures of files, not the file system structures or other metadata used to organize data on storage media. The figure below summarizes the file carving terminology.
The most common general file carving techniques are:
1. Header-footer or header-“maximum file size” carving: recover files based on known headers and footers or on a maximum file size
- JPEG: "\xFF\xD8" header and "\xFF\xD9" footer
- GIF: "\x47\x49\x46\x38\x37\x61" header and "\x00\x3B" footer
- PST: "!BDN" header and no footer
- If the file format has no footer, a maximum file size is used in the carving program.
2. File structure-based carving
- This technique uses the internal layout of a file
- Elements are header, footer, identifier strings, and size information
3. Content-based carving
- Content structure is loose (MBOX, HTML, XML)
- Content characteristics:
  - Character count
  - Text/language recognition
  - White and black listing of data
  - Statistical attributes (Chi^2)
  - Information entropy
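Of the content characteristics listed above, information entropy is the easiest to show in code. The sketch below computes the Shannon entropy of a block of bytes, in bits per byte, and applies it block by block. The function names and the 512-byte block size are assumptions for illustration. In general, values near 8 suggest compressed or encrypted data, while low values suggest text or sparse data, but any cutoff used for classification would have to be tuned on real samples.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)  # how often each byte value occurs
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def block_entropies(data: bytes, block_size: int = 512):
    """Return a list of (offset, entropy) pairs, one per block of `data`."""
    return [
        (offset, shannon_entropy(data[offset:offset + block_size]))
        for offset in range(0, len(data), block_size)
    ]


# Demo: two blocks of zeros followed by one block in which every byte value is equally likely.
sample = bytes(1024) + bytes(range(256)) * 2
for offset, entropy in block_entropies(sample):
    print(offset, round(entropy, 3))
```

A content-based carver could use such per-block values to decide which neighbouring blocks plausibly belong to the same kind of file.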
Tools widely used for file carving: Data recovery tools play an important role in most forensic investigations because smart malicious users will always try to delete evidence of their unlawful acts. Some important data recovery tools are:
- Magic Rescue
First, we are going to see how simple file carving works. Before beginning, we will have a look at the JPEG file structure. As an example, I am opening an image in a hex editor.
Basically, a JPEG file starts with FFD8FFE0, which is called the header.
And it ends with FFD9, which is called the trailer.
The rest is the JPEG file data itself.
So if we have any kind of document file that contains an image and we locate the header and trailer, we can recover that image from the document.
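The same idea can be tried out in Python. The following is a naive sketch rather than a production carver: it looks for the FFD8FFE0 header, then for the first FFD9 trailer after it, and copies everything in between. The function name `carve_jpegs` and the `max_size` search limit are assumptions of mine. Real images can contain FFD9 earlier, for example inside an embedded thumbnail, and fragmented files will not come out intact, so treat the output as candidates to be checked.

```python
JPEG_HEADER = bytes.fromhex("FFD8FFE0")
JPEG_TRAILER = bytes.fromhex("FFD9")


def carve_jpegs(data: bytes, max_size: int = 10 * 1024 * 1024):
    """Return (offset, jpeg_bytes) for each header that has a trailer within max_size."""
    carved = []
    start = data.find(JPEG_HEADER)
    while start != -1:
        # Only look for the trailer within max_size bytes of the header.
        limit = min(len(data), start + max_size)
        end = data.find(JPEG_TRAILER, start + len(JPEG_HEADER), limit)
        if end == -1:
            resume = start + 1             # no trailer in range: skip this header
        else:
            stop = end + len(JPEG_TRAILER)
            carved.append((start, data[start:stop]))
            resume = stop                  # continue after the carved image
        start = data.find(JPEG_HEADER, resume)
    return carved


# Demo: a fake "document" with a tiny stand-in for a JPEG stream in the middle.
fake_jpeg = JPEG_HEADER + b"\x00\x10JFIF\x00" + b"image-data" + JPEG_TRAILER
document = b"PK-doc-prefix" + fake_jpeg + b"document-suffix"
for offset, blob in carve_jpegs(document):
    print(offset, len(blob))
```

Each carved blob can then be written to its own file, for example named after its offset, and opened in an image viewer to confirm that it is a valid picture.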