'RJB'에 해당되는 글 1건

  1. 2009.09.07 Stories about Automating Excel in Ruby
We can work IT out2009. 9. 7. 11:56
Recently I needed to program some module in ruby that parses excel file. So I googled to find some solutions. As we all know, there are some open-source projects that provides us libraries for parsing and creating excel file and those were the results of my googling. PARSEEXCEL, SPREADSHEET, ROO are such examples.


1. Pure Ruby Libraries : ParseExcel, Spreadsheet, ROO

As a brief introduction, PARSEEXCEL is an translated module from Perl thing (I don't know the original name of perl module but easly can guess that has a the same name 'parseexcel') and only reads informations from excel file but cannot create or modify the file. No more updates happening after the translation had done few years ago, and has very poor documentation (but it does not needed as long as we have the perl document). Most of all, it has god damn slow performance on reading large size of excel. Don't waste your time on trying this.

Comparing to parseexcel, SPREADSHEET is very fast on reading large size and supports to create new excel file. And It's still on going project, which is pretty fast updating and can get API documentations on http://spreadsheet.rubyforge.org . In case of the ROO, it takes the most advantages from spreadsheet because it's based on that and offers common interface not only for XLS format but also for other various formats, for examples, OpenOffice, XLSX (new format of Excel 2007), and Google Spreadsheet.

But all those three tools has fatal weak point which those are not supported with Excel files including folumlars. Do you think that the excel file is useful when it doesn't have any formular? Absolutly not! Without formulars, what's difference between excel and tables on other MS Office document (ex. PowerPoint, Word) or HTML page! Oh yeah, I know the roo can handle some formulars, but has exception on XLS format which is commonly used as we call Excel.

My work was handling large sized XLS file which has plenty of formulas. So the three tools I had tried was useless. So far, this was the sad story about my attempts to solve excel problem with 'pure' ruby.


2. Unpure Solutions : WIN32OLE, ugly POI Ruby Binding

I'm not the kind of person who doesn't know the word "give up". But it's the work and I need to earn my salery from it. So I went to the second stage finding un-pure solution. Seems limitless things can be done with WIN32OLE, but has windows platform dependency and I love platform indendence.

And here's another sad story on the second stage that I failed again though I spent much time with it. It's about POI, very famous commonly used Java libaray which definitly supports formulars. And few years ago it had released Ruby binding. POI Ruby Binding looks quite messy, it compiles POI Java code to machine code (not to a bytecode sounds suit to Java) by using GCJ as a compiler, and then wrap it to Ruby interface by SIWG . POI project released only a shared library 'poi4r.so' compiled on GCC 3.4.3 but it does not working. Object file poi4r.so had compiled with very old GCC so it fails to link dynamically with my libgcj library on GCC 4.2.4. I'm not a such a idiot to downgrade my GCC version to 3.4.3 to get old version of libgcj library with no sure that it's successfully running on that version. So I tried to compile my own poi4r.so and spent a whole day to do so running over the several hurdles called compile errors. And I finally gave up.

You people involved in POI project! Make smart configuration or build script, or release updated binary with current GCC version. Or just stop your cons by dropping POI Ruby binding on your project web site which is totally useless.

3. Utilize POI or JExcelAPI with RJB

Then I went to the third stage. It's a success story. I was wandering about why POI hustlers didn't bind there Java POI code to Ruby directly. They had made a D-tour using SWIG and GCJ. The answer was so clear that there was no way to bind Java code to Ruby at the time of releasing POI ruby binding library. But now we have RJB (Ruby Java Bridge) that makes possbile to bind Java class as a Ruby class. This is my happy solution. We can use POI or JExcelAPI (This is pretty much better than POI, I think) libraries without wrapping it and usage of RJB is quit simple. An example is here.

#!/usr/bin/env ruby

require 'rubygems'
require 'rjb'

Rjb::load("./jxl.jar") # I prefer JExcelAPI rather than POI, but you can substitute poi.jar for jxl.jar.

# import java.io.File;
file_class = Rjb::import("java.io.File")
# import jxl.Workbook;
workbook_class = Rjb::import("jxl.Workbook")

workbook = workbook_class.getWorkbook(file_class.new("ExcelFile.xls"))
sheet = workbook.getSheet("First Sheet")
puts sheet.getCell(7,7).getContents() # the cell has data calculated by some formular
Posted by Lyle