Python and Bioinformatics and Perl: Chomp in Python

Update: As many readers have commented, I missed the obvious: Python already has string methods that do this. See the comments section for details.

So I do a lot of file processing in my bioinformatics work, and I’ve always really liked the Perl function chomp.

I wanted to implement something like it in Python, but one that handles multiple line endings (that is, Linux, Windows, and Mac line endings) rather than only the current input record separator, which is all Perl’s chomp removes.

So this is chomp in Python: in a sense a def chomp, but I renamed it chomppy.
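
To make the problem concrete, here is a minimal sketch of what goes wrong when only "\n" is stripped. The sample lines are made up for illustration and are not from any real file.

```python
# Hypothetical sample lines with Linux, Windows, and older Mac endings.
sample_lines = ["ATGC\n", "GGCA\r\n", "TTAG\r"]

# Stripping only "\n" is enough for the Linux line but not for the others:
# the Windows and Mac lines keep a stray "\r" at the end.
stripped = [line.rstrip("\n") for line in sample_lines]

print(stripped)  # ['ATGC', 'GGCA\r', 'TTAG\r']
```

The leftover "\r" characters are exactly what a chomp-like helper needs to deal with.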

IMPORTANT: I am not guaranteeing in any way that this completely replicates chomp behavior. And, of course, this won’t work on more unusual systems that have different line ending conventions. In my work, I use UNIX/Linux, Windows, and older Mac stuff, so this works for those. And it handles ugly cases well, as you can see.

Enjoy! Comments welcome.

Also, this is not beautiful code! I threw this together because I was frustrated.

NOTE: you will have to adjust the tab spacing for the function to work; but you already know this… copying to HTML can be a pain…

>>> def chomppy(k):
    # remove one trailing line ending ('\r\n', '\n', or '\r') from k
    if k == "": return ""
    if k == "\n" or k == "\r\n" or k == "\r": return ""
    if len(k) == 1: return k  # relies on the case above not matching
    if len(k) == 2 and (k[-1] == '\n' or k[-1] == '\r'): return k[0]
    # done with weird cases, now deal with the average case
    lastend = k[-2:]  # get the last two characters
    if lastend == '\r\n':
        outstr = k[:-2]
        return outstr
    elif lastend[1] == "\n" or lastend[1] == "\r":
        outstr = k[:-1]
        return outstr
    return k
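
For comparison, the same idea can be sketched more compactly with str.endswith. This is only an illustration under the assumption that "remove at most one trailing line ending" is the behavior wanted; the name chomp_once is made up for this example.

```python
def chomp_once(text):
    """Remove at most one trailing line ending from text."""
    # Check the two-character Windows ending first, so that "\r\n"
    # is removed as a unit rather than as a lone "\n".
    for ending in ("\r\n", "\n", "\r"):
        if text.endswith(ending):
            return text[:-len(ending)]
    return text


print(repr(chomp_once("cat\r\n")))  # 'cat'
print(repr(chomp_once("hat")))      # 'hat'
print(repr(chomp_once("")))         # ''
```

Because the empty string never ends with any of the three endings, the special cases for short strings are not needed here.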

>>> chomppy('cow\n')
'cow'
>>> chomppy('')
''
>>> chomppy('hat')
'hat'
>>> chomppy('cat\r\n')
'cat'
>>> chomppy('\n')
''
>>> chomppy('\r\n')
''
>>> chomppy('cat\r')
'cat'
>>> chomppy('\r')
''

6 Responses

  1. Hi

    First thing to remember when using Python is that it has batteries included. Python has a string method called strip that removes leading and trailing whitespace (line endings included) from a string. You can also use rstrip (right) and lstrip (left).

  2. Why doesn’t .strip() work for you?

  3. Yes, just checked, and the string method .strip() gives the exact same results on your test cases:

    stringlist = [
        'cow\n',
        '',
        'hat',
        'cat\r\n',
        '\n',
        '\r\n',
        'cat\r',
        '\r',
    ]

    for substr in stringlist:
        c = chomppy(substr)
        s = substr.strip()
        print('%r : %s' % (substr, c == s))

  4. Have you tried rstrip()?

    From pydoc str:

    | rstrip(…)
    | S.rstrip([chars]) -> string or unicode
    |
    | Return a copy of the string S with trailing whitespace removed.
    | If chars is given and not None, remove characters in chars instead.
    | If chars is unicode, S will be converted to unicode before stripping

  5. Hi,
    Although chomp works very well in Perl, in Python I find I can always adapt the code to use strip(), rstrip(), or split(). I’ve not missed chomp in Python, but like you, I do use it in Perl.

    – Paddy.
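
To make the differences between the suggestions above concrete, here is a small sketch comparing strip(), rstrip(), and rstrip("\r\n") on a few made-up inputs. The sample strings are assumptions chosen to show where the methods disagree; they are not from the original test list.

```python
# Hypothetical inputs: leading spaces, a trailing space before the
# line ending, and two line endings in a row.
samples = ["  cow\n", "cat \r\n", "hat\n\n"]

results = {}
for text in samples:
    results[text] = (
        text.strip(),          # removes whitespace on both sides
        text.rstrip(),         # removes all trailing whitespace
        text.rstrip("\r\n"),   # removes only trailing "\r" and "\n" characters
    )
    print("%r -> %r" % (text, results[text]))
```

Note that rstrip("\r\n") removes every trailing "\r" and "\n", so "hat\n\n" becomes "hat", whereas chomppy removes a single line ending and returns "hat\n". Which of these is right depends on whether leading or trailing spaces and blank lines matter in the file being processed.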
