Vim Tip: Global Delete

Today I was asked to help debug a problem with our product's patcher. All of the debug information for the entire product goes into a single log file, and some processes are quite chatty. The log file containing the patcher information I needed was about 26.5MB by the time I got it.

All of the lines I was interested in were very easy to find, because they contained specific strings (yay). The problem was that they were scattered throughout the log, in between debug output for other processes. At first, I tried to just delete the lines that were meaningless to me by hand, but that got old very quickly. This is how I made my life easier using Vim.

It's possible to do a "global delete" on lines that don't contain the stuff you are interested in. The lines I wanted to see contained one of two words, but I'll just use foo and bar for this example:

:g!/\v(foo|bar)/d

This command will look for every line that contains neither foo nor bar and delete it. Here's the breakdown:

  • :g - The "global" command, which runs another command on every line that matches a pattern
  • ! - Negate the match (perform the pending command on every line that does not match the pattern). :v is shorthand for :g!
  • /\v(foo|bar)/ - The regular expression pattern
    • \v - Use of \v means that in the pattern after it all ASCII characters except '0'-'9', 'a'-'z', 'A'-'Z' and '_' have a special meaning (very magic). Basically, it removes the need to escape almost everything in your regex.
    • (foo|bar) - Find either foo or (|) bar
  • d - The command to perform on matching lines, which is delete in this case
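
For anyone who reads Python more easily than Vim regex, here is a rough sketch of what that command does to the buffer. The function name and the sample log lines are made up for illustration, and this only mimics the effect of the command, not how Vim implements :g.

```python
import re

# Same alternation as the Vim pattern; Python's regex syntax needs no \v here.
KEEP = re.compile(r"(foo|bar)")

def delete_nonmatching(lines):
    # Mimics :g!/\v(foo|bar)/d by dropping every line that matches neither word.
    return [line for line in lines if KEEP.search(line)]

# Hypothetical log lines, not taken from the real log file.
sample = [
    "patcher: foo started",
    "updater: heartbeat",
    "patcher: bar failed",
    "network: retrying",
]

for line in delete_nonmatching(sample):
    print(line)
```

Only the two patcher lines survive, which is the same result the Vim command leaves in the buffer.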

So, executing that command in the Vim window with the log file wiped out all of the lines that didn't have my magical keywords in them.

When I showed my co-worker how awesome Vim was, he was mildly impressed, and then he asked, "What about multiline log messages?" My particular case didn't have any multiline messages, but I wanted to figure it out anyway. I haven't been able to figure out an exact method for deleting the lines that don't match, but I have found a way to show only the lines that match:

:g!/\v^".+(foo|bar)\_.{-}^"/p

This command is pretty close to the previous one.

  • :g - Global command on lines that match a pattern
  • ! - Negate the match (seems a little backward this time)
  • /\v^".+(foo|bar)\_.{-}^"/ - The regular expression pattern
    • \v - Very magic
    • ^" - Find a line that starts with a double quote ("). Each of our individual log messages starts with a double quote that is guaranteed to be at the beginning of the line, so this is specific to our environment.
    • .+ - One or more characters between the " and foo or bar
    • (foo|bar) - Find either foo or (|) bar
    • \_.{-}^" - Non-greedy multiline match. Matches any character, including newlines (because of the \_), and continues matching until it reaches the next line that begins with a double quote. Again, that double quote is specific to our environment. The {-} is what makes this a "non-greedy" match--it's like using *, but it matches as few as possible of the preceding atom.
  • p - The command to perform on matching lines, which is print in this case. This prints each match in Vim's message pager below the buffer, leaving the file itself unchanged (which is why I mentioned the negation seemed a bit backward to me). Navigation and whatnot in the pager appears to be similar to less on the command line.
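
Since I never found a pure Vim way to delete the non-matching multiline messages, here is a minimal Python sketch of the idea, under a couple of assumptions. It assumes every message starts on a line beginning with a double quote (as in our logs) and that any following lines belong to that message. Unlike the Vim pattern, which needs foo or bar on the first line of the message, this keeps a message if either word appears anywhere in it. The function name and sample data are hypothetical.

```python
import re

KEEP = re.compile(r"foo|bar")

def keep_matching_messages(lines):
    # Group lines into messages: a line starting with a double quote opens a
    # new message, and every other line is a continuation of the current one.
    messages = []
    for line in lines:
        if line.startswith('"') or not messages:
            messages.append([line])
        else:
            messages[-1].append(line)

    # Keep every line of any message that mentions foo or bar.
    kept = []
    for message in messages:
        if any(KEEP.search(part) for part in message):
            kept.extend(message)
    return kept

# Hypothetical multiline log, not taken from the real log file.
log = [
    '"patcher: foo started',
    '  step 1 of 2',
    '"updater: heartbeat',
    '  still alive',
    '"patcher: bar failed"',
]

for line in keep_matching_messages(log):
    print(line)
```

The updater message and its continuation line are dropped, while the foo message keeps its second line.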

And there you have it! I hope you find this information as useful as it has been for me!
