"When does a difference engine become a search for truth?" (I Robot)
I don't know why I did it, but I decided to translate the Perl module Algorithm::Diff into C#, since there aren't any C# libraries yet for finding the differences between two lists.
The diffing algorithm operates on any list of objects that can be compared: not just strings or characters. Custom comparison functions can be provided. The Diff and Patch classes also support writing out a diff in unified diff format. The library contains Patch and Merge classes, but these aren't well developed and I don't maintain them.
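To make "any list of objects that can be compared" concrete, here is a minimal sketch of a diff over generic lists with an optional custom comparison. It is not the library's API: the names `LcsDiffSketch` and `Diff` are hypothetical, and it uses a plain dynamic-programming longest-common-subsequence table rather than the approach taken by Algorithm::Diff.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the library described here.
public static class LcsDiffSketch
{
    // Returns one entry per element: "  x" if common to both lists,
    // "- x" if only in a, "+ x" if only in b.
    public static List<string> Diff<T>(IList<T> a, IList<T> b,
                                       IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        int n = a.Count, m = b.Count;

        // lcs[i, j] = length of the longest common subsequence of a[i..] and b[j..].
        var lcs = new int[n + 1, m + 1];
        for (int i = n - 1; i >= 0; i--)
            for (int j = m - 1; j >= 0; j--)
                lcs[i, j] = comparer.Equals(a[i], b[j])
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);

        // Walk the table from the start, emitting common, removed and added elements.
        var result = new List<string>();
        int x = 0, y = 0;
        while (x < n && y < m)
        {
            if (comparer.Equals(a[x], b[y])) { result.Add("  " + a[x]); x++; y++; }
            else if (lcs[x + 1, y] >= lcs[x, y + 1]) { result.Add("- " + a[x]); x++; }
            else { result.Add("+ " + b[y]); y++; }
        }
        while (x < n) result.Add("- " + a[x++]);
        while (y < m) result.Add("+ " + b[y++]);
        return result;
    }
}
```

Passing `StringComparer.OrdinalIgnoreCase` as the third argument makes the sketch treat "Hello" and "hello" as the same element, which is the kind of thing a custom comparison function is for. The table costs memory proportional to the product of the two list lengths, so this is only suitable for small inputs.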
You can download:
(updated November 27, 2006)
There's a bug in there somewhere that causes some hunks to be labeled as different when they're actually the same. It doesn't affect the contents of the diff that is returned, only the label, so you may just need to double-check that sections marked as different really are different.
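One way to do that double check is to compare the two ranges a hunk covers, element by element. The sketch below assumes you can get a start index and a count for each side of a hunk. The helper name `RangesEqual` and its parameters are hypothetical and not taken from the library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the library described here.
public static class HunkCheckSketch
{
    // Returns true when the two ranges hold the same elements in the same order,
    // i.e. a hunk covering them should not have been marked as different.
    public static bool RangesEqual<T>(IList<T> left, int leftStart, int leftCount,
                                      IList<T> right, int rightStart, int rightCount,
                                      IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;

        // Ranges of different lengths can never be identical.
        if (leftCount != rightCount) return false;

        for (int i = 0; i < leftCount; i++)
        {
            if (!comparer.Equals(left[leftStart + i], right[rightStart + i]))
                return false;
        }
        return true;
    }
}
```

If this returns true for a hunk that was flagged as different, you can treat that hunk as unchanged.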
I've had a few inquiries about using this for patching binary files. I've never tried it, but I don't think this algorithm is particularly well suited to the task. It's possible to diff two byte arrays, but it may not be fast or generate the best patches. The algorithm works best when the elements of the array are large enough that any given element doesn't occur too many times in a file. That makes it good for the lines of a text file, since lines tend to be fairly unique, and bad for bytes, since each byte value is going to occur many times in a binary file. But you're welcome to try it out.
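To get a rough feel for how repetitive the elements are, you can measure the share of distinct elements when the same data is treated as lines and as bytes. This is only a sketch: `DistinctRatio` is a hypothetical helper, and the ratio is a crude indicator, not something the diff algorithm itself computes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of the library described here.
public static class ElementUniquenessSketch
{
    // Fraction of elements that are distinct: 1.0 means no element repeats,
    // values near 0 mean the same few elements occur over and over.
    public static double DistinctRatio<T>(ICollection<T> items)
    {
        if (items.Count == 0) return 1.0;
        return (double)items.Distinct().Count() / items.Count;
    }
}
```

For example, splitting the text "alpha\nbeta\ngamma\ndelta\n" on newlines gives elements that are all distinct, while `Encoding.ASCII.GetBytes` on the same text gives bytes in which values such as 'a' and '\n' repeat. The gap only gets wider on real files, where there are at most 256 possible byte values.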
The code for diffing is derived from Algorithm::Diff (credits below), and it is protected by the Perl Artistic License. Feel free to use the rest of it (e.g. merging), which I wrote, any way you'd like.
The library is largely untested, especially merging and patching, so expect bugs.
/*
 * Diff Algorithm in C#
 * Based on Tye McQueen's Algorithm::Diff Perl module version 1.19_01
 * Converted to C# by Joshua Tauberer
 *
 * The Perl module's copyright notice:
 *   Parts Copyright (c) 2000-2004 Ned Konz. All rights reserved.
 *   Parts by Tye McQueen.
 *
 * The Perl module's readme has a ridiculously long list of
 * thanks for all of the previous authors, who are:
 *   Mario Wolczko (author of SmallTalk code the module is based on)
 *   Ned Konz
 *   Mark-Jason Dominus
 *   Mike Schilli
 *   Amir Karger
 *   Christian Murphy
 */