Software Development
Blogs and Discussion
developer.*
Books Articles Blogs Subscribe d.* Gear About Home

Awkward behavior of System.Collections. Generic.List. BinarySearch

I was trying to build my own morphoanalyzer for Russian language. First step is to search through the list of word radicals – prefixes which will remain constant in all word forms like "wom" for "woman"/"women". In Russian I get more 100K+ radicals, so search efficiency is definitely an issue. Since my initial list is presorted, but can contain duplicate radicals, I decided to choose .NET 2.0 Generic List over SortedList/Dictionary and use BinarySearch. One of the surprising results I've got is that BinarySearch (Int32, Int32, T, Generic IComparer) does not always return first occurrence of item, if there are duplicates. I realize this is caused by nature of binary search algorithm. After some time I even found corresponding note in MSDN, but I cannot come up with any reasons why someone might want behavior like that. Combination of List and BinarySearch is the only container that accepts duplicate items, and it is quite obvious that most of the people will use BinarySearch to find all occurrences of items, not the random one. I do realize that it will hurt performance insignificantly, but it will become right-out-of-the-box usable solution. I created my wrapper around BinarySearch, which produce correct results. The assumption is that number of duplicate items is significantly less than total number of items in the collection:

public static int CorrectBinarySearch(string prefix, int startIdx) {
int idx = List.BinarySearch(startIdx + 1, List.Count - startIdx - 1, prefix, null);

if (idx >= 0) {
while (idx > startIdx && List[idx].Key == prefix) {
idx--;
}
}

return ++idx;
}
Categories:  | |

User login

About our advertising.

Atom Feed

developer.* Blogs also has an Atom feed, located at this url.

Click here for more information about Atom.

A Jolt Award Finalist
Software Creativity 2.0
Foreword by Tom DeMarco

Recent Posters

Based on most recent 60 days, sorted by # of posts and name.

Google
Web developer.*

Who's online

There are currently 0 users and 19 guests online.

Syndicate

Syndicate content
All views expressed by authors, bloggers, and commentors are their own and do not necessarily reflect the views of developer.* or its proprietors.
Click to read the Copyright Notice.

All content copyright ©2000-2005 by the individual specified authors (and where not specified, copyright by Read Media, LLC). Reprint or redistribute only with written permission from the author and/or developer.*.

www.developerdotstar.com