Free, easy, quick, great PDF creation: Try OpenOffice

keywords: free software, opensource, OpenOffice, grantwriting

I try to give credit where credit is due.

I have written before about using OpenOffice (version 2.4) for “real professional work.” In an earlier post, I described successfully writing an entire grant application using OpenOffice for word processing and figure creation, in conjunction with Zotero for references (and the grant was funded, so…).

PDF creation from OpenOffice (use “Export to PDF” in the File menu) simply works great. It is very fast and the PDF quality is excellent. One note: it does not open the PDF automatically; it just saves the file, so remember where you saved it. This works much better than printing to a PDF using the Adobe PDF printer or using the Microsoft Office 2007 export-to-PDF function (which, besides being slow, occasionally caused Microsoft Office to crash on my machine).

Also, before I forget, I really like OpenOffice Draw for scientific figure creation – I use it a lot in my work and I have been quite happy with it. I’m using Microsoft Office a fair amount now, but I still use Draw to make figures. I’ve used both Zotero and Draw fairly intensively for well over a year now.

Note: This is almost entirely based on using OpenOffice 2.4. The current version is 3.0, which I just downloaded.

Viewing Large Text Files (like big GFF files) in Windows

I know, I know, many of you will say “just use Linux”. And this is true, but SignalMap from NimbleGen, which is quite convenient for viewing GFF files of ChIP-chip data, is a Windows-only product (and yes, I did try running it under WINE).

So if you try to load a 380,000-line file into Notepad (or even a much smaller file), Notepad will blow up. Even WordPad behaves badly.

The good – no, great – free Windows solution is Notepad++, available from the Notepad++ website.
I’ve used it for a few years and it works great. It will easily load multiple 380,000-line files (like 40 MB GFF files).
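If you only need a quick summary of a big GFF file rather than a full view of it, a few lines of Python will do the job without loading the whole thing into memory. This is just a minimal sketch: the function name count_gff_features is made up for illustration, and it assumes a tab-delimited GFF file with the feature type in the third column and "#" at the start of comment lines.

```python
from collections import Counter


def count_gff_features(path):
    """Count records per feature type (column 3) while reading line by line."""
    counts = Counter()
    with open(path, "r") as handle:
        for line in handle:  # iterating the file reads one line at a time
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and comment/header lines
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts
```

Calling count_gff_features on a file path returns a Counter keyed by feature type, so memory use depends on the number of distinct feature types rather than on the file size.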

Notepad++ also fulfills other requirements for me: it clearly has a large-ish user base and is constantly being updated/upgraded. So it is a robust, free product.

Other good bits:
(1) will convert from Windows to Mac to Unix line endings
(2) automatically recognizes the line-ending type (important for looking at files)
(3) very good syntax highlighting for a wide variety of programming languages
(4) tabbed files means that you can easily switch from file to file
(5) it retains memory of your open files – so they will be there each time you open it
(6) good behavior when you move/change a file that you are editing – will ask you to reload/save/etc.

Some not-so-good bits:
(1) for big files, regular expression stuff is slow
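
The line-ending detection mentioned above is also easy to approximate yourself when Notepad++ is not around. The sketch below is only a rough guess at the idea, not how Notepad++ actually does it: the function name and the 1 MB sample size are my own assumptions, and it simply counts each kind of line ending at the start of the file.

```python
def detect_line_endings(path, sample_size=1024 * 1024):
    """Guess the line-ending convention from the first sample_size bytes."""
    with open(path, "rb") as handle:  # binary mode: no newline translation
        sample = handle.read(sample_size)
    crlf = sample.count(b"\r\n")
    lf = sample.count(b"\n") - crlf  # LF not preceded by CR
    cr = sample.count(b"\r") - crlf  # CR not followed by LF
    counts = {"windows (CRLF)": crlf, "unix (LF)": lf, "old mac (CR)": cr}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "none found"
```

Because only the first chunk is read, this stays fast even on very large files, at the cost of missing mixed line endings further down.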

Mark Bieda windows big files viewing gff files NimbleGen tutorial howto notepad problems

Free Multiplatform Reference Management? Try Zotero

Mark Bieda zotero references computer software citations

You use EndNote, RefMan, or one of the others. You want a free alternative because (1) you don’t want to worry about licensing issues (like buying a new copy for each computer), (2) you want something that will run under Windows, Linux, and Mac OS X, (3) you just don’t want to pay, (4) you want to move your references from place to place without having to adapt to the local software choice (e.g. some places will have EndNote, others will have RefMan, others will have other solutions), or (5) you just believe stuff like this should be free.

So: I have been using Zotero for over a year. Zotero is great for everyday web stuff, but here I will just talk about it as a reference manager.

As with my other software comments, this is based on my real experience. I recently wrote an entire grant using Zotero as my only reference manager. And it worked well.

A key thing:
Zotero is heavily and institutionally supported (see the webpage). From the forum comments, you can see that many users are in academe. So it should only get better.

Problems/Weaknesses:
(1) This is clearly still in development. As I said, I wrote a grant with it and it worked well for me, but it is not as smooth as EndNote in many ways.
(2) There are a limited number of citation styles, but this number is growing, and you can define your own. For things like grants, you usually get to choose the style. For a typical paper, you won’t have a large number of references, so a little manual editing is manageable. Still, because of this, EndNote really does have a big edge.

Getting it:
(1) Zotero is a Firefox extension and, judging from the site, seems more geared toward web-based research.
(2) Installation is superfast and easy. Firefox is the way to go; there is no Internet Explorer version.
(3) You will also need to download the plug-in for either Microsoft Word or OpenOffice Writer. I used OpenOffice Writer for my grant.

Basics:
(1) There is a tutorial on the website; unfortunately, it is oriented mostly toward MS Word usage. The same rules apply in OpenOffice Writer.
(2) If you are using OpenOffice Writer, here is something to be careful with: don’t save your files in .doc (MS Word) format. I usually do, because I need to send files to colleagues, all of whom have MS Word but not OpenOffice Writer. If you do this, you will lose the ability to handle your citations.

Getting going:
download and install Zotero from the Zotero website
download and install the appropriate word processing plugin

To get citations:
(1) You can import from many, many sites, notably PubMed. You just click a button when you find something you like and it gets imported into Zotero.

Recommendations:
(1) When I last looked (about April 2008), the documentation for Zotero was generally very good, but the documentation for the citation/reference aspects was very poor. So I strongly suggest that you download a few references and play with a throwaway test document to get a sense of how Zotero works and what the results look like. I did this and it really helped me use it. It only took a few minutes of playing around.

Python for Perl Programmers (and Bioinformatics people)

Mark Bieda python getting started quick tips hints tutorial

I wanted to write a short post about getting started in python.

What you will like about Python as a perl person:
(1) A great thing is the interpreter. This will allow really rapid learning of python. For a perl person, python should come really fast. I was very, very surprised at how quickly I was writing actually useful (not toy) programs to manipulate things.
(2) It is easy to install in windows and has a decent editor/run environment (IDLE). Python is now a standard part of Linux distros, except for the smallest ones (perl is everywhere, so an advantage to perl here, but only a small one).

Some key things:
(1) The online manuals for python are good (but maybe not great). The Guido tutorial is key; make sure that you get the latest one.
(2) If you like to have a Python book around (I always do for my programming language du jour), then make sure that you have the most recent one.
(3) Why the emphasis on the most recent? Python has added key new features in recent times – like even since version 2.4! So make sure that you have the latest documentation.

Installation and Usage:
(1) For windows people, use the IDLE editor. Really. You will find it very easy to use and efficient. It comes with the standard download, so there is nothing extra to install.
(2) To learn python really fast, just play with commands in the interpreter window. It really is easy and efficient – a very quick way to get up to speed on things.

Some key things for bioinformatics people, in particular:
(1) Sets. Sets are very nice. Intersection, union… all that stuff that you want to use.
(2) A lot of string manipulation functions (actually methods, technically) are available. These will do a lot of what you would do with regular expressions, but see the next point.
(3) Unfortunately, regular expressions are not built into the language syntax; they live in a standard library module (re) and are a bit different from perl in usage/implementation.
(4) Like perl, the built-in sorting in python is a bit weird (and annoying to set up for anything beyond the simple case), but very useful. Again, here, make sure that you look at the latest documentation.
(5) The sqlite library is now part of the standard package. I haven’t used it yet from python, but given that it is a standard part of the distribution, it seems like I could write code that uses it and not worry about portability issues. This is well worth looking at for bioinformatics people.
(6) Remember that tuples are unchangeable (immutable) and lists are changeable. So far, this has led me to be pretty list-oriented, but I am new to this.
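
To make the points above concrete, here is a short sketch that you could paste into the interpreter. The gene names and coordinates are invented for illustration, and the syntax is for a current Python 3 (set literals, for example, did not exist in 2.4), so treat it as a starting point rather than the one right way to do things.

```python
import re

# Hypothetical gene lists from two experiments.
chip_hits = {"GATA1", "TAL1", "MYC", "KLF1"}
expr_hits = {"MYC", "KLF1", "SOX2"}

shared = chip_hits & expr_hits     # intersection
either = chip_hits | expr_hits     # union
chip_only = chip_hits - expr_hits  # difference

# String methods often replace simple regular expressions.
record = "chr1\t1000\t2000\tGATA1\n"
chrom, start, end, name = record.rstrip("\n").split("\t")

# Regular expressions live in the standard-library "re" module.
match = re.match(r"chr(\w+)$", chrom)
chrom_id = match.group(1) if match else None

# Sorting with a key function: order (chrom, start) tuples by start position.
peaks = [("chr2", 500), ("chr1", 1500), ("chr1", 100)]
by_start = sorted(peaks, key=lambda peak: peak[1])
by_chrom_then_start = sorted(peaks)  # tuples compare element by element

# Tuples are immutable; copy one into a list to change it in place.
peak_list = list(peaks[0])
peak_list[1] = 600
```

The key argument to sorted() covers most of what you would do with a custom sort block in perl, which is why it is worth checking the current sorting documentation.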

I’ll leave it at that for now. I’ll write more about python later on.

I wish I had… started with python earlier…

So far, my bioinformatics work has used a melange of perl, R, and bash scripting. While this has worked pretty well, it does have limits. For one, the bash scripting is not at all portable. I’ve already had problems with distributing software.

I wanted something that I could distribute in an easier way, yet had the advantages of perl. I found Jython, which is Python-in-Java. For me, the big deal is not use of Java libraries, but rather that the language would compile to Java byte-code and hence would be easy to distribute.

But I found that Python is much more than this: the interactive environment, for one, makes me ok with not having my unix/linux toolbox when I am stuck on the windows side.

And Python has a lot of nice features for bioinformatics work, including convenient types like sets (as of version 2.4) and even comes with sqlite (which I have not used from python, but want to)…
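
For what it is worth, here is roughly what using the bundled sqlite looks like from python. This is a minimal sketch and not something from my own pipelines: the table name, columns, and score cutoff are all invented for illustration, and it uses an in-memory database so nothing is written to disk.

```python
import sqlite3

# ":memory:" creates a temporary database that disappears on close.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE peaks (chrom TEXT, start INTEGER, stop INTEGER, score REAL)"
)

# Hypothetical peak records: (chromosome, start, stop, score).
rows = [
    ("chr1", 100, 200, 2.5),
    ("chr1", 500, 650, 0.8),
    ("chr2", 50, 90, 3.1),
]
conn.executemany("INSERT INTO peaks VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Parameterized query: keep only peaks scoring above the cutoff.
cursor = conn.execute(
    "SELECT chrom, start FROM peaks WHERE score > ? ORDER BY chrom, start",
    (1.0,),
)
strong = cursor.fetchall()
conn.close()
```

Swapping ":memory:" for a file name gives you a single-file database that can be passed around with the script.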

Anyways, for now, I am a fan.

Howto install linux… basic options

Ok, so I have another post on howto install linux.

This is just a short list of the top options – all this stuff is covered elsewhere, so I won’t be boring.

The basic list is: a dual-boot windows machine (a favorite of mine), a Mac (ok, it’s BSD-derived Unix rather than linux, but you could also run linux under Parallels), VMware Player, VMware Server + your linux of choice, or a live CD (e.g. Knoppix, but there are a bunch now).

To me, it seems that VMware Server, a dual-boot windows machine, or a Mac are the real options…

Howto install linux to a computer… fast

Ok, so I have been in the unfortunate situation of jumping from one computer to another.

The fortunate thing is that, along the way, I’ve had to learn to get linux going quickly.

This entry is about getting linux going fast, as a temporary setup. You will want to look at my other post (coming), “installing linux… longterm”, for some other advice.

Here are the options that I have really used:

1. Knoppix live cd
The basic deal here is that you just boot the computer with the CD installed.
Ease: very easy
Minuses: slow boot
Pluses: Truly excellent hardware recognition – I’d recommend trying this for the cool new laptop that seems to hate standard distributions. Also, nice full system.

2. VMware Player + DSL (Damn Small Linux)
The basic deal here is that you download VMware Player (easy), install it like any other windows software, then download the Damn Small Linux virtual machine from the VMware site.
Ease: very easy
Minuses: several. DSL is a micro-distribution of linux. Yes, it is graphical, but a huge amount of stuff is missing.
Pluses: After you install VMware Player, you can always jump into linux – just start VMware Player. And you are just running it like any other app in windows, which means that you can be doing windows stuff in another window at the same time.

What do I do: well, I currently use the Vmware player + DSL. But this is just temporary… I need to do a few things fast…

When I finally do my long term deal, I will install linux fully to a partition… or try Vmware server…