Tuesday, November 22, 2011

Create HTML with Google Spreadsheet: Python Google Docs API

Recently, I helped set up the Gerstein lab's Alumni page. All the information of the current lab people and alumni are situated in a Google Spreadsheet: name, past affiliations, date joined etc. Google Docs API allows the creation of webpages directly based on these information. They recommend the use of the Python scripts. A predecessor did all the work. I merely modified his script to do what I wanted it to do.

M named it BLIS - Bioinformatics Lab Information System, which is quite true in the sense that the entire process is streamlined to minimize fuss and making consolidation of dynamic human resource data a "blissful" procedure for the general administrator. However, it can be a pain for the system administrator trying to put these together, as Mi can testify. So BLIS comprised of a couple of lab resources - people, papers, lectures and images - based heavily on Google spreadsheets. A sanitized code and somewhat abstract walkthrough (pardon me, because I could only better express this by SHOWING how it is done) can be found here. This resource is made free and public for all. Nothing very fancy but it works for the lab and it does make a lot of things easier once you get it to work.

Eventually, M wants to centralize all resources by the use of the Google Spreadsheet, grabbing information off each column. I can however, foresee that the spreadsheet will get outrageously big/messy/both at some point, making the new person looking at it a headache. Now, it is still rather manageable.

Now, Mi is trying to interface Googlegroups with the spreadsheet, to manage mailing lists in this "central repository". M wants me to help him, given my relative success with the People's page. Mi is pretty stuck now since he can't find a way around connecting Googlegroups with GoogleDocs. Any Google team or experts out there who can help?

Here on, I am going try documenting what I did for the People's page:
1) Found in the lab server, is a Python script, some sort of a wrapper, that
a) creates a HTML file for the People webpage (the people's page points to this file)
b) grabs information off GoogleSpreadsheet (refer to mini-walkthrough)
- This is essentially your workhorse.

2) Every time you update the Google Spreadsheet, you need to update the web page too. This can be done by running this file each time you update. A quick and dirty way that can be done is to circumvent the problem of accessing the file in the background is to create a link run the script manually.

I would say 95% of the credit that this works goes out to Mi and the predecessor working on this (I don't know the name). I am merely a modifier that builds on the established success.

Found a pretty neat page that talks more in detail: http://www.pearltrees.com/#/N-p=24598257&N-u=1_199189&N-fa=2133196&N-s=1_3262934&N-f=1_3262934

Google Spreadsheet gadgets targeted at data visualization mostly:
http://code.google.com/apis/spreadsheets/gadgets/

2 comments:

Anonymous said...

hi jm!

google spreadsheets are great when you need to collaborate on data processing. or when you need to have revision history.

I had different task - to implement simple database for storing strings and importing/exporting them in xml file.

after having some hard time with gdata lib, i've found gspread lib which can be helpful in your case. no tricky 'spreadsheet keys' are required, use a name of a spreadsheet and you're ready to go. i've no idea why google's implemented such a verbose client lib.

jm said...

hey warwick-st! I will go check that out! Thanks!!