DupeCare - Automatic Data Deduplication Software for Windows

Description

This documentation section provides a detailed description of the HHD Software DupeCare product.

Deduplicated Folder

The product allows you to create one or more deduplicated folders. The folder looks like a normal file system folder and can store other folders and files.

Initially, when a file is created inside a deduplicated folder, its full contents are stored on the volume, like those of any other file.

Folder Optimization

Periodically, DupeCare starts an optimization task that searches for all new files, de-duplicates and compresses their contents, and removes the full “cached” representation of each file from the volume. The default period of the optimization task is 1 hour. The user may also start the task manually at any time.

De-duplication means that parts of a file that are exactly the same as other parts of the same file, or of other files, are stored only once.
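The idea of storing identical parts only once can be illustrated with a minimal sketch. It assumes fixed-size blocks identified by a SHA-256 hash. DupeCare's actual block size, chunking strategy and hashing scheme are not described here, so the names `BLOCK_SIZE`, `store_file` and `rebuild_file` are purely hypothetical.

```python
import hashlib

# Hypothetical, tiny block size chosen only to keep the example readable.
BLOCK_SIZE = 4


def store_file(data: bytes, block_store: dict) -> list:
    """Split data into fixed-size blocks and keep each unique block once.

    Returns the ordered list of block hashes needed to rebuild the file.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        # The block is written only if an identical one is not already stored.
        block_store.setdefault(digest, block)
        recipe.append(digest)
    return recipe


def rebuild_file(recipe: list, block_store: dict) -> bytes:
    """Reassemble the original contents from the stored blocks."""
    return b"".join(block_store[digest] for digest in recipe)


store = {}
recipe_a = store_file(b"ABCDABCDEFGH", store)  # repeats a block within one file
recipe_b = store_file(b"EFGHABCD", store)      # shares blocks with the first file

print(len(store))                              # unique blocks actually stored
print(rebuild_file(recipe_a, store))
```

In this sketch the two files reference five blocks in total, but only two distinct blocks end up in the store.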

When the user deletes files and folders inside the deduplicated folder, they are removed from the internal storage the next time the optimization task runs. DupeCare also offers the ability to recover deleted files if the optimization task has not yet started.

A garbage-collection task, which also runs periodically (by default, once a day), purges deleted blocks from the internal storage.
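Conceptually, purging deleted blocks amounts to removing every stored block that no remaining file references. The following is a rough sketch under that assumption. The function name `collect_garbage` and the in-memory data layout are hypothetical and do not reflect DupeCare's internal storage format.

```python
def collect_garbage(block_store: dict, live_recipes: list) -> int:
    """Remove blocks that no live file references; return how many were purged."""
    # Gather every block identifier still used by at least one file.
    referenced = {block_id for recipe in live_recipes for block_id in recipe}
    # Collect the victims first so the dict is not modified while iterating.
    unreferenced = [block_id for block_id in block_store if block_id not in referenced]
    for block_id in unreferenced:
        del block_store[block_id]
    return len(unreferenced)


# Hypothetical store with three blocks; the file that used "h3" was deleted.
store = {"h1": b"AAAA", "h2": b"BBBB", "h3": b"CCCC"}
live_recipes = [["h1", "h2"], ["h2"]]

purged = collect_garbage(store, live_recipes)
print(purged, sorted(store))
```

A block shared by several files, such as `h2` above, survives as long as at least one of them still exists.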

Local Caching

When an application queries information about a file or folder inside the deduplicated folder, DupeCare fetches the information from its internal storage. The system caches this information.

When an application reads the contents of a file, DupeCare decompresses and reconstructs the original file data from its internal storage. The system caches this data, turning the deduplicated file into a normal file. All subsequent I/O operations are automatically executed against the cached copy, making them as fast as for any other “normal” file.

DupeCare automatically deletes all cached copies of metadata and file contents the next time the optimization task runs.

NOTE

Unfortunately, the underlying technology used by DupeCare (an optional Windows feature called Projected File System) always caches a full local copy of an entire file the first time an application tries to read data from it.

This is usually acceptable, but it makes it impractical to store extremely large files (tens of gigabytes) inside the deduplicated folder if they are accessed often. This includes, for example, virtual machine disk images. However, if large files are not accessed often, they can still be stored inside the deduplicated folder and benefit from high deduplication and compression ratios.

Internal Store

When a folder is de-duplicated, DupeCare creates a new hidden folder, which is used as the internal storage for all de-duplicated data. It is created in the same parent folder and has the same name with the “.dedup” extension appended. If, for example, you de-duplicate the c:\temp\test folder, a new hidden folder c:\temp\test.dedup is created.
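The naming rule can be expressed as a small helper, for example when building a backup or exclusion list. This is only a sketch based on the example above: the function name `internal_store_path` is hypothetical, and it assumes the extension is simply appended to the full folder name.

```python
from pathlib import PureWindowsPath


def internal_store_path(folder: str) -> PureWindowsPath:
    """Return the sibling ".dedup" folder for a deduplicated folder.

    Assumes ".dedup" is appended to the complete folder name.
    """
    path = PureWindowsPath(folder)
    return path.with_name(path.name + ".dedup")


print(internal_store_path(r"c:\temp\test"))  # c:\temp\test.dedup
```

`PureWindowsPath` is used so that the path is handled with Windows semantics on any platform. The helper raises `ValueError` for a drive root such as `c:\`, which has no folder name.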

WARNING

Never delete or modify this folder or any file inside it!

The original de-duplicated folder stores only small placeholders, as well as any new files and folders created after the last optimization operation. It still “looks” like a normal folder in Windows Explorer and any other application.

NOTE

It is recommended to exclude the hidden .dedup folder from virus scans and to disable indexing for it to improve optimization performance. Only compressed block files are written to this folder. The original de-duplicated folder may still be subject to virus scanning and indexing.

Limitations

The usage of Projected File System imposes the following limitations: