Saturday, January 17, 2015

Regular Expressions

As a regular user of Perl regular expressions I know that this is going to suck, but... here we go. Having no idea what the syntax might be, I pulled up this site, http://www.tutorialspoint.com/python/python_reg_expressions.htm, and it seems simple enough.

Attempt 1:

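import re

# Read the whole file; the with-block closes it automatically.
with open("Text-1.txt", "r") as file_in:
    file_in_content = file_in.read()

# re.search scans the whole string; re.match would only test the very start.
pattern = r"<p><br /></p>"
if re.search(pattern, file_in_content):
    modified = re.sub(pattern, "<p>&nbsp;</p>", file_in_content)
    print(modified)
else:
    print("Nothing to do")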

Worked fine, and it brought to my attention that indentation is important in Python. I also modified the text to hit my "Nothing to do" branch, which worked fine as well. Imports are simple enough, which is going to become important in the next part, because now I want an HTML parser and I don't want to build my own.
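
One thing worth flagging for anyone else coming from Perl: in Python, re.match only tests at the very start of the string, while re.search behaves like an unanchored Perl match. Here is a minimal sketch of the difference, assuming a made-up sample string (the `sample` text and the variable names are mine for illustration, not the real contents of Text-1.txt). It also shows re.subn, which returns the number of replacements and could stand in for the separate "is it there?" test.

```python
import re

# Hypothetical sample text standing in for the contents of Text-1.txt.
sample = "<h1>Title</h1>\n<p><br /></p>\n<p>Body</p>"
pattern = r"<p><br /></p>"

# re.match only tests at the start of the string, so it misses the tag here.
found_by_match = re.match(pattern, sample) is not None

# re.search scans the whole string, like an unanchored Perl m// match.
found_by_search = re.search(pattern, sample) is not None

# re.subn returns the new text plus the number of replacements made.
modified, count = re.subn(pattern, "<p>&nbsp;</p>", sample)

print(found_by_match)
print(found_by_search)
print(count)
```

If the empty paragraph happens to sit at the very top of the file, re.match and re.search agree, which is probably why the difference is easy to miss on a small test file.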


