Skip to main content

Find the shortest prefix to identify a string in Ruby

The built-in Abbrev module can calculate a set of unambiguous abbreviations for a set of strings, and then you can look for the shortest result for each string.

I had a list of strings, and I wanted to find the shortest prefix that would unambiguously identify each string.

For example, if this is my list:

fruits = ["apple", "banana", "blueberry", "coconut"]

then a one-letter prefix a/c will identify apple/coconut, but I need two letters ba/bl to identify banana/blueberry.

The Abbrev module has a function that will calculate all the unambiguous abbreviations for a set of strings:

require 'abbrev'

Abbrev.abbrev(fruits)

# {"apple"=>"apple", "appl"=>"apple", "app"=>"apple", "ap"=>"apple", "a"=>"apple",
#  "banana"=>"banana",  "bana"=>"banana", "ban"=>"banana", "ba"=>"banana",
#  "blueberry"=>"blueberry", "blueberr"=>"blueberry", "blueber"=>"blueberry", "bluebe"=>"blueberry", "blueb"=>"blueberry", "blue"=>"blueberry", "blu"=>"blueberry", "bl"=>"blueberry",
#  "coconut"=>"coconut", "coconu"=>"coconut", "cocon"=>"coconut", "coco"=>"coconut", "coc"=>"coconut", "co"=>"coconut", "c"=>"coconut"}

We can invert this to get a list of unambiguous abbreviations for each word:

Abbrev.abbrev(fruits).group_by { |_, v| v }

# {"apple"=> [["apple", "apple"], ["appl", "apple"], ["app", "apple"], ["ap", "apple"], ["a", "apple"]],
#  "banana"=> [["banana", "banana"], ["banan", "banana"], ["bana", "banana"], ["ban", "banana"], ["ba", "banana"]],
#  "blueberry"=> [["blueberry", "blueberry"], ["blueberr", "blueberry"], ["blueber", "blueberry"], ["bluebe", "blueberry"], ["blueb", "blueberry"], ["blue", "blueberry"], ["blu", "blueberry"], ["bl", "blueberry"]],
#  "coconut"=> [["coconut", "coconut"]}

And finally, we can flatten the list and pick out the shortest element:

Abbrev.abbrev(fruits)
      .group_by { |_, v| v }
      .transform_values { |v| v.flatten.min_by(&:length) }
# {"apple"=>"a",
#  "banana"=>"ba",
#  "blueberry"=>"bl",
#  "coconut"=>"c"}