15

When I implemented a new algorithm in C and beat the previous implementation of a tool by 5 seconds in speed (17s best case before) and my mentor implemented the same algorithm's pseudo code beating the previous implementation by 14s.

Comments
  • 0
    Well, if your pseudo-code says "sort the array",... or something like that, there are a lot of ways to do it
  • 0
    @wil222 wasn't that simple. He did some insane bitset magic to squash the runtime of the code down to 2.3 seconds total, where the theoretically possible max speed is 1.4s. He's like the yoda of C. But I'm also fanboying a little- he worked on all kinds of protocols and file formats and generally is my coding hero. Even though he wouldn't know who yoda is.
  • 0
    @evilmupp3t
    I need someone like this
  • 1
    Can you post the problem I want to beat both of you.

    Bit magic is my middle name
  • 0
    @stevemk14ebr compare each string in a list of strings with each other string of another list of strings and output those that match best by qgrams of length 5 for each string. Both lists are 100k lines long for the test case.

    That about summons it up. Feel free to encode to.
  • 3
    Following this to see who beats who
  • 0
    @evilmupp3t what Is the encoding of the strings (ASCII?) and what is a qgram
  • 0
    @stevemk14ebr they're all 8bit ASCII. A qgram (or Ngram) is just another word for a substring of a string with length q (or n, respectively).
    It's essentially supposed to fuzzy match the strings.
    Post a link to a pastebin or sth if you've got something!
  • 0
  • 1
    I am unable to benchmark since I don't have your dataset but that's about as fast as I can think without using sse instructions. Please test and tell me how fast. Also you should pre-compute the length of the strings for this implementation (or if they are all guaranteed to be > 5 chars remove that logic).
  • 1
    Also this implementation assumes that data loading is taking longer than CPU cycles on your machine in these benchmarks so I've optimized for cache efficiency here rather than CPU cycles. By doing loop unrolling a super scalar CPU will reorder those joins of the string bytes so that if one blocks while gathering data it reads another, i.e. cpu always has work; this is why more instructions (4 shifts, 5 ors, and 1 compare) is more efficient than 5 compares in a loop. The conditional at the beginning of the function may make this slower than necessary, remove if possible, I'm not sure how.

    Basically I'm saying I assumed alot so it might not be faster.
  • 0
    Any news on how fast I was?
  • 0
    Sorry didn't get around to it yet! I'll let you know later today!
  • 0
    Ah, u misunderstood - not just first 5 characters. All qgrams of a string against all qgrams of all other strings in another list. I'll upload pseudo code in a sec :)
  • 0
  • 0
  • 0
    Link 404s. Is it like a 5 letter sliding window? Where there is a window on word1 and a window on word2 and any offset of either window can match if the 5 litters in the two windows match?
  • 0
    @stevemk14ebr goo.gl/E6kbFC How's that?
  • 1
    Eh, I'm done. Thats more work than fun at this point. Props in the speedups dude!
Add Comment