Computer Science Canada

USACO: PROB Calf Flac

Author:  zero-impact [ Fri Aug 28, 2009 9:44 pm ]
Post subject:  USACO: PROB Calf Flac

Quote:
Calf Flac

It is said that if you give an infinite number of cows an infinite number of heavy-duty laptops (with very large keys), that they will ultimately produce all the world's great palindromes. Your job will be to detect these bovine beauties.

Ignore punctuation, whitespace, numbers, and case when testing for palindromes, but keep these extra characters around so that you can print them out as the answer; just consider the letters `A-Z' and `a-z'.

Find any largest palindrome in a string no more than 20,000 characters long. The largest palindrome is guaranteed to be at most 2,000 characters long before whitespace and punctuation are removed.
PROGRAM NAME: calfflac
INPUT FORMAT
A file with no more than 20,000 characters. The file has one or more lines. No line is longer than 80 characters (not counting the newline at the end).
SAMPLE INPUT (file calfflac.in)

Confucius say: Madam, I'm Adam.

OUTPUT FORMAT

The first line of the output should be the length of the longest palindrome found. The next line or lines should be the actual text of the palindrome (without any surrounding white space or punctuation but with all other characters) printed on a line (or more than one line if newlines are included in the palindromic text). If there are multiple palindromes of longest length, output the one that appears first.
SAMPLE OUTPUT (file calfflac.out)

11
Madam, I'm Adam


So far I have tried the brute force solution by just checking all the substrings from biggest to smallest to see if they are palindromes, which has O(n^3) time complexity as far as I can tell. It fails (1.2s ) on the 4th test. I have also tried copying a reversed string into another variable and checking for the longest common substring between the two (courtesy of slatpro) but I see now that that is basically my fist attempt in a different form.

I'm not sure what to try from here. I have been told to try using a suffix array/tree? I'm not quite sure on how to implement this for this problem.

Any other ideas? Is there something obvious I'm overlooking?

All help is appreciated.

Author:  saltpro15 [ Fri Aug 28, 2009 10:02 pm ]
Post subject:  RE:USACO: PROB Calf Flac

slatpro indeed : P hmmph

I found that article about how to find palindromes in linear time. It didn't make much sense to me, but you're welcome to try. Cheers, hopefully we get this one soon.

Author:  bbi5291 [ Sat Aug 29, 2009 10:28 am ]
Post subject:  Re: USACO: PROB Calf Flac

Actually, the LCS approach takes "straight" O(n^2) time if you use DP (this will still most likely time out). It can be made to run in O(n) using a suffix tree/array, but of course that is overkill for chapter 1.
The official solution is O(n^2), though. It's quite clever for chapter 1, so I'm not going to give it away. Kudos if you get a linear-time solution up-and-running, but that sorta defeats the purpose of USACO training.

Author:  A.J [ Sat Aug 29, 2009 11:42 am ]
Post subject:  RE:USACO: PROB Calf Flac

I actually did use the suffix tree algorithm for this question. I don't really feel that it is overkill for Ch 1; on the contrary, I feel that it is really good practice for certain other questions that you might encounter.

Having said that, I still do agree with Brian. The given O(n^2) algorithm, although pretty straightforward, is pretty insightful and clever.

Author:  bbi5291 [ Sat Aug 29, 2009 11:50 am ]
Post subject:  Re: USACO: PROB Calf Flac

You actually coded a suffix tree? Which algorithm did you use? PM me your code... (Wow, you are really starting to scare me :O)

Author:  riveryu [ Sat Aug 29, 2009 12:43 pm ]
Post subject:  RE:USACO: PROB Calf Flac

Hi zero-impact, I'm going to "expand" on bbi5291's helpful hint. My first thought is to brute force just like your solution. It doesn't work from what you said for USACO. For me its natural checking from the ends of a palindrome towards the center, but this is redundant. Why am I scanning through the same place again and again?

So here's the hint I hope you get it: check if a string is palindrome from the center.

This is the first time I see that problem and I've never coded a solution. I'm not sure if what I think is a solution is feasible within the time bound. (It should be O(n^2) though.)

Author:  bbi5291 [ Sat Aug 29, 2009 2:00 pm ]
Post subject:  Re: USACO: PROB Calf Flac

riveryu's solution is the same as the official solution. It is worst-case O(N^2) but even a slight deviation from worst-case drastically reduces the runtime, so it works fine on USACO.

Author:  zero-impact [ Sat Aug 29, 2009 3:49 pm ]
Post subject:  RE:USACO: PROB Calf Flac

While I was at work today I expanded on a previous idea I had, and it should be O(N*M).
Just go through the string from the beginning and for each character use that as the center of the palindrome. (as riveryu said). I guess I will have to treat even lengthed palindromes differently than odd lengthed ones. I figure O(N*M) because the problem states that the length of the palindrome will not exceed 2000.

Hopefully I will be able to code this up tonight.

Author:  A.J [ Sun Aug 30, 2009 2:28 pm ]
Post subject:  RE:USACO: PROB Calf Flac

I believe that that is still O(N^2).

Author:  bbi5291 [ Sun Aug 30, 2009 6:50 pm ]
Post subject:  Re: USACO: PROB Calf Flac

How is it not O(NM)? You won't be doing more than O(M) operations for each of O(N) positions... (Of course, it is also O(N^2), because of the way big-O notation is defined. But O(NM) is a better bound.)

Author:  A.J [ Sun Aug 30, 2009 8:04 pm ]
Post subject:  RE:USACO: PROB Calf Flac

Please forgive me, Brian, as I tend to take the pessimistic route about life.

Author:  bbi5291 [ Sun Aug 30, 2009 8:39 pm ]
Post subject:  Re: USACO: PROB Calf Flac

I guess if they didn't tell you the bound on M then they could give you a pathological case like a string consisting of 20000 a's. Then you'd have to examine approximately N^2 characters (the calculation is left as an exercise to the reader - calculating constant factors can be useful). That seems just a bit too slow to get AC on USACO (although it would probably AC on the PEG Judge.)

Author:  riveryu [ Sun Aug 30, 2009 10:17 pm ]
Post subject:  RE:USACO: PROB Calf Flac

I think there was a reason why they bounded the palindrome to be 2000. Perhaps as a hint? I don't know.

@ bbi5291
Whats AC? Alternating Current??

Author:  A.J [ Sun Aug 30, 2009 10:34 pm ]
Post subject:  RE:USACO: PROB Calf Flac

it's All Correct

Author:  bbi5291 [ Sun Aug 30, 2009 11:08 pm ]
Post subject:  Re: USACO: PROB Calf Flac

I guess it could be interpreted that way, but all the English-language online judges seem to use "accepted".

Author:  A.J [ Mon Aug 31, 2009 8:38 am ]
Post subject:  RE:USACO: PROB Calf Flac

I guess that does make more sense.

Author:  saltpro15 [ Mon Aug 31, 2009 4:11 pm ]
Post subject:  RE:USACO: PROB Calf Flac

I suppose using a trie wouldn't work? Just spitballing ideas at this point, I've tried pretty much every obvious approach I can think of.

Author:  bbi5291 [ Mon Aug 31, 2009 6:03 pm ]
Post subject:  Re: USACO: PROB Calf Flac

A prefix trie won't help you. A suffix trie takes O(N^2) time and O(N^2) space. With some path compression and leet hax you can reduce both of those requirements to linear. Congratulations, you have just constructed a suffix tree!

Author:  A.J [ Mon Aug 31, 2009 8:44 pm ]
Post subject:  Re: USACO: PROB Calf Flac

bbi5291 wrote:

With some path compression and leet hax you can reduce both of those requirements to linear.


I don't quite see how you can ever possibly get intimidated by me when you are talking about reducing the time and space required for a suffix trie to linear. I would be really impressed if you can present a program that performs in linear time.

Author:  bbi5291 [ Tue Sep 01, 2009 10:40 am ]
Post subject:  Re: USACO: PROB Calf Flac

Hold on a minute. You used Ukkonen's algorithm, and your solution isn't linear time? Ukkonen's algorithm was hailed as the first online linear-time suffix tree construction algorithm. Is it possible that what you built is actually a suffix trie?
btw, there is an "easy" linear-time construction - you use the linear-time divide-and-conquer suffix array construction described in Simple Linear Work Suffix Array Construction by J. Karkkainen, then the linear-time suffix-array-to-lcp-array algorithm described in Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications (T. Kasai, 2001), and finally you put these together to obtain the suffix tree (exercise for the reader).

Author:  A.J [ Tue Sep 01, 2009 11:27 am ]
Post subject:  RE:USACO: PROB Calf Flac

Alas, I am trying to reconstruct my memory of Ukkonen's algorithm...

How do you know this much, Brian?

Author:  bbi5291 [ Tue Sep 01, 2009 12:06 pm ]
Post subject:  Re: USACO: PROB Calf Flac

I don't know all that from memory, of course. I have various bookmarks to stuff like this.
There are some problems on SPOJ that are trivial if you can construct a suffix tree in linear time, so Ivan Katanic and I teamed up to find some info on suffix tree construction. That's how I found out all of this.
btw, here's a partial list (note: many of these have more elegant solutions):
BEADS
BWHEELER
DISUBSTR
LCS
LCS2
MINMOVE
PALIM
SUBST1
TWICE
WPUZZLES
BWHEELER and DISUBSTR are passable with quadratic time/space too, though.

Author:  riveryu [ Tue Sep 01, 2009 6:04 pm ]
Post subject:  RE:USACO: PROB Calf Flac

Can someone explain how LCS would connect to the Calf Flac problem? I'll read what LCS is right now... Pretend I don't know anything.

Author:  CodeMonkey2000 [ Wed Sep 02, 2009 12:15 am ]
Post subject:  Re: USACO: PROB Calf Flac

If you have 2 strings, the LCS algorithm can be used to figure out the largest common sub sequence. For example suppose you have the strings: ABCFG and ABTFG, then the longest common sub-sequence is of length 4 (the largest sequence in both is ABFG).

In this problem it is useful because you are looking for the largest common sub sequence between a string and the reversed version of the string. Although for palindromes, you need to change the the implementation of the algorithm a bit.


: