LaTeX to Word format conversion how-to

Grant that I will never ever have to submit a paper in Word format again…

But in case that should happen, here is David’s step-by-step cookbook for converting a ground-breaking paper on building physics from LaTeX to Word format. The idea is to first convert the document to HTML, then let Word convert the HTML to its own format, something Word actually does quite well.

  1. Insert

    somewhere towards the end of the document preamble. (Your document will produce rubbish from this point on, so make sure to remove this when you want to typeset the real thing again.)

  2. Run LaTeX on the document. You should end up with a DVI file. (I say should, because on my system, for instance, /usr/bin/latex is a symbolic link to /usr/bin/pdfetex. The original LaTeX is invoked with /usr/bin/latexnoetex. Go figure.)
  3. Run tex4ht filename.
  4. Run t4ht filename. This will produce the HTML file itself and all accompanying pictures in the current directory. See t4ht’s documentation for alternative directories—I like having everything put in its own html/ directory by invoking t4ht -dhtml/ filename. And yes, that slash after html is needed.
  5. At this point you could upload your nice paper to this server and bypass the peer-review process altogether. Since you don’t want to do that, you should now open the filename.html file with Word.
  6. Select File->Save As and save the document in Word’s format.
  7. NO, we are not done yet… You must now also embed the pictures in the Word file, or your paper’s addressee will see only small red crosses instead of your equations. Do this by opening Edit->Links, selecting all the links, and clicking Save picture in document. Then save the document again.
  8. That should do it. Note that the whole TeX->HTML conversion should be doable in one step with htlatex or a similar program. On my system it fails because, as said above, latex does not produce DVI files by default, which is why I had to do it the pedestrian way.

Java GNU Scientific Library project update

The GNU Scientific Library (GSL) is a rich state-of-the-art C library of mathematical routines. Pretty much everything you could think of—special functions, linear algebra, function optimization, fast Fourier transforms, and much, much more—is covered by this library.

I have taken over the open-source Java GNU Scientific Library project at SourceForge. The plan is to write a Java wrapper around this library and make it available to Java developers. There is, for the time being, nothing at all to see on SourceForge; I’m still planning this whole thing.

I have done some preliminary testing, comparing the speed of calculating natural logarithms. I wrote the appropriate JNI wrappers and this test class:

import java.util.Random;

public class LogTest {
    public static void main(String[] argv) {
        double[] sample = new double[5000000];
        Random rnd = new Random();
        for (int i = 0; i < sample.length; i++) {
            sample[i] = rnd.nextDouble();
        }

        // Time Java's built-in Math.log over the whole sample.
        double tmp;
        long tic = System.currentTimeMillis();
        for (int i = 0; i < sample.length; i++) {
            tmp = Math.log(sample[i]);
        }
        long toc = System.currentTimeMillis();
        System.out.println("Built-in log: " + (toc - tic) + " ms.");

        // Time GSL's log, called through the JNI wrapper.
        tic = System.currentTimeMillis();
        for (int i = 0; i < sample.length; i++) {
            tmp = gsl_sf_log.gsl_sf_log(sample[i]);
        }
        toc = System.currentTimeMillis();
        System.out.println("GSL log: " + (toc - tic) + " ms.");
    }
}

In this program I time the computation of the natural logarithm of five million randomly chosen doubles between 0 and 1, first with Java’s built-in Math.log function (which I understand is based on netlib), and then with GSL’s. This is the output:

[lindelof@lesopriv3 jgsl_test]$ java -D -Djava.library.path=./ LogTest
Built-in log: 1327 ms.
GSL log: 1164 ms.

Not a big difference, though the GSL version runs slightly faster.
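For reference, the gsl_sf_log wrapper class that LogTest calls might look roughly like this. This is only a sketch: the real wrapper in the project may be laid out differently, and the class will only load once the matching native library (here assumed to be named gsl_sf_log) has been compiled and placed on java.library.path.

```java
// Hypothetical sketch of the JNI wrapper class used by LogTest above.
// The class name comes from the test code; the native library name is an assumption.
public class gsl_sf_log {
    static {
        // Loads e.g. libgsl_sf_log.so from java.library.path.
        System.loadLibrary("gsl_sf_log");
    }

    // Maps directly onto GSL's C function: double gsl_sf_log(const double x).
    public static native double gsl_sf_log(double x);
}
```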

My idea is to cross-compile the GSL library and its Java wrappers, at least for the i386 and arm targets, and package it together with the Java classes in target-specific jarfiles.

The idea is to start work on this right after I finish writing my PhD manuscript, by the end of June at the latest. Help and contributions welcome. I will regularly post updates about this project in this column.

International Building Physics Toolbox

Too much choice can sometimes be just as bad as not enough. Market research has shown that there is an optimal number of jam flavours a supermarket should keep on its shelves: fewer than that and customers feel constrained, but more than that and the variety confuses them. A recent paper in Energy and Buildings reminded me of the bewildering variety of building simulation software packages available today, many of them free of charge. The paper describes the International Building Physics Toolbox (IBPT), a software library for Simulink that provides all the building blocks you need to simulate heat, air and moisture in buildings.

A Simulink library for building physics is an excellent thing: the flexibility and speed of model development more than make up for the library’s slower execution speed, compared with software written in more pedestrian environments. And Simulink/Matlab provide an environment that can directly execute Java code, enabling developers to extend these building models with their own Java programs. For example, code running in a separate process, possibly on a remote machine, can interact with the building simulation while it runs, through e.g. Java RMI.
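A minimal sketch of what such an interaction could look like with plain Java RMI follows. Everything in it (the interface name, the method, the port number, the returned value) is hypothetical; a real bridge would expose whatever the Simulink model actually publishes.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface a running building simulation might expose.
interface SimulationMonitor extends Remote {
    double getZoneTemperature(int zone) throws RemoteException;
}

// Server side: in practice this object would live in the JVM embedded
// in Matlab/Simulink, next to the running model.
class SimulationMonitorImpl implements SimulationMonitor {
    public double getZoneTemperature(int zone) {
        return 21.5; // placeholder; a real monitor would query the model state
    }
}

public class RmiSketch {
    public static void main(String[] args) throws Exception {
        // Export the monitor and publish it in an in-process registry.
        SimulationMonitor stub =
            (SimulationMonitor) UnicastRemoteObject.exportObject(new SimulationMonitorImpl(), 0);
        Registry registry = LocateRegistry.createRegistry(4099);
        registry.rebind("monitor", stub);

        // A client (normally a separate process, possibly on a remote machine)
        // looks the monitor up and queries the running simulation.
        SimulationMonitor client = (SimulationMonitor) registry.lookup("monitor");
        System.out.println("Zone 1 temperature: " + client.getZoneTemperature(1));

        System.exit(0); // exported RMI objects keep non-daemon threads alive
    }
}
```

The lookup here happens in the same process only to keep the sketch self-contained; the point of RMI is that the client half can run anywhere on the network.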

But the IBPT is the second building simulation package for Simulink that I have heard of. The first was SIMBAD, which I used in my doctoral work. The IBPT paper mentions yet another, CARNOT. Nowhere does the paper discuss the pros and cons of these packages; readers are left to do their own research. Will a healthy, thriving community adopt the IBPT and maintain it, or will it fizzle out? There is no way to know.

I will, however, boldly suggest two criteria that may predict the success of a building simulation package. The first is that it must be developed by programmers, not by scientists. Of course the package’s physical models must be described and validated by academics, but the package’s architecture, and the models’ implementation, must be left to people who understand the principles of good software: simplicity, reusability and maintainability.

The second is that the package must be free of charge and open source. This is the best way to attract the talented developers and users who will ultimately build the package’s user community. Open-sourcing the software will not only help maintain its quality (though I have some reservations about that); it will also make the package attractive to bright, talented people interested in extending and improving it. And that has always been the secret of great open source projects.

CO2 concentrations and global temperature correlation

CS Lewis called this person the “embarrassing enthusiast”: someone deeply committed and very vocal about your common religion, yet who occasionally discredits your views by advancing weak arguments in their support. What to do with such a person’s views? You cannot oppose them, for that would work against your own. Yet you cannot endorse them either, for you want your creed to be built on unshakable foundations.

I started feeling that way about Al Gore when I read his book, An Inconvenient Truth. Mr Gore’s commitment to curbing world carbon emissions is commendable. We have arguably never seen someone deliver such an impassioned message with such energy and impact. And I happen to agree, on the basis of the physical evidence we have, that efforts must be made to diminish man-made CO2 emissions. So I had hoped that Mr Gore’s book would deliver a message both emotionally and factually flawless, but as it turns out, some of his facts should have been checked.

On 10 April I was a guest at a meeting of Toastmasters International (EPFL). One of the speakers delivered a wake-up call to all of us who would blindly believe anything coming out of the climate-change lobby. In particular, he pointed out that the famous correlation between CO2 concentrations and global temperatures observed over the past 100,000 years in ice core data, and used by Mr Gore as one of his strongest arguments, is not what we think it is. There is a correlation, yes, provided the CO2 data is shifted 800 years behind the temperature data.

The original paper this data came from happens to be in my thesis’s bibliography, so I checked the claim. And sure enough, the authors of that paper show quite clearly that the CO2 data lags behind the temperature data by 600±400 years, suggesting that historically, increases in CO2 concentration have been caused by temperature increases, and not the other way around.

I am surprised that climate-change skeptics have not pointed this out before. But it does not really matter, for the current global warming is happening within the same time frame as a dramatic increase in CO2 concentrations; I do not need to look several ice ages into the past to know that. I only wish Mr Gore had checked this before using the argument.

Choice of new programming language

If you regularly program in your work, you should invest some time in keeping your programming skills sharp. And one of the best ways to do that is to master a new programming language.

I have now been through two major programming projects in Java, and am starting on a new one in Python. I am relatively new to that language, and was not sure whether I should invest all my time in learning it. I wanted to master a new language, but was not sure which one.

Most of my programming consists of analyzing data and producing graphs and results. After some research I found five major languages that help with that: Lisp, Python, Ruby, S and Mathematica.

I then rated each language according to four criteria:

  1. Power. How powerful is the language at expressing things?
  2. Performance. How fast are typical implementations of the language?
  3. Libraries. Are vast libraries available, built into the language?
  4. Literate Programming. Does the language naturally support the Literate Programming style, proposed by Donald Knuth?

After some research I rated each language on a scale of 1 to 4 on each criterion. My judgement had at times to be very subjective, or based on hearsay. The results are summarized in the table below; the right-most column sums each language’s scores across the four criteria.

             Power  Performance  Libraries  Literate Programming  TOTAL
Lisp         ****   **           **         *                     9
Python       ***    ****         ****       **                    13
Ruby         ***    ***          ***        *                     10
S            ***    ***          ***        ****                  13
Mathematica  ****   ***          **         ***                   12

All languages are about equally powerful, but Lisp and Mathematica get the top prize. The performance ratings had to be very subjective: I have not run any benchmarks whatsoever, and just report the feeling I got from what I read about the languages. Libraries seem to be Lisp’s weak spot; the language does not seem to come with built-in libraries for regular expressions, database connections, networking or such things. I might be mistaken, but that is the impression I got.

Literate programming is, for me, the ability to write a report and program a data analysis in the same document. Here S (in particular its open-source implementation R) shines with its Sweave tool, which lets one mingle S code with LaTeX. The Mathematica notebook concept is somewhat similar but will not give results as beautiful. For Python I found a program called Leo, which is apparently a very direct implementation of Knuth’s original literate programming ideas.

So in the end I have a tie between S and Python. Mathematica follows closely, and perhaps also deserves 13 points, because its library is very powerful for mathematical applications.

What I did, in the end, was to use S as much as I could for analyzing data and producing reports and articles, and Python for everything else, including producing the data in the first place. That sounds like the best of all worlds.

Bug finding tools for Java

I came across an article in the 15th International Symposium on Software Reliability Engineering (2004) titled “A Comparison of Bug Finding Tools for Java”. The authors, Nick Rutar, Christian B. Almazan, and Jeffrey S. Foster, have carried out probably the first detailed comparison of the most popular automatic bug finding tools for Java.

They tested PMD, FindBugs, JLint, ESC/Java and Bandera. They ran most of these tools on five open source projects: Azureus, Art of Illusions, Tomcat, JBoss and Megamek.

From reading their article one gets the impression that the usefulness of these tools is greatly hurt by their tendency to report false positives, i.e. warnings about things that are not really bugs. Nor does there seem to be much overlap between the warnings the different tools report, but this might be considered a good thing: some tools are better at finding certain categories of bugs than others.

I have used FindBugs before on a small project with three developers, and it helped us uncover hidden bugs that might have caused us much shame. One of us, for example, was casting the result of an integer division to float, mistakenly assuming that 1/2 would equal 0.5. FindBugs catches that sort of thing.
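As an illustration (a reconstruction, not our actual project code), the bug looked essentially like this:

```java
// Reconstructed illustration of the integer-division bug FindBugs flagged.
public class IntDivisionBug {
    public static void main(String[] args) {
        float wrong = (float) (1 / 2); // 1 / 2 is integer division, evaluated before the cast
        double right = 1 / 2.0;        // a double operand forces floating-point division
        System.out.println("wrong = " + wrong); // prints 0.0
        System.out.println("right = " + right); // prints 0.5
    }
}
```

The cast comes too late: by the time the value is converted to float, the integer division has already discarded the fractional part.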

I have the feeling that the authors’ choice of test data was perhaps not optimal. Popular open source projects tend to have code of excellent quality, which might explain the high ratio of false positives: there simply aren’t enough bugs of the kind that automatic tools can catch. On more typical projects, I firmly believe that regular sweeps with tools of this kind have their place.

Oh, so what’s this got to do with smart buildings? Well, as I’ve argued elsewhere, the OSGi programming framework is ideally suited to programming home automation systems, and it runs on Java. I therefore expect a lot of Java code to come the way of home automation in the near future, and any tool that can help ensure that code’s quality is welcome.

“Ecobilans” software presentation

IBPSA-CH, the newly created regional affiliate of IBPSA for Switzerland, hosted on 19 October 2006 in Geneva a software presentation of so-called “Ecobilans” programs.

These are programs that help an organisation assess its social and environmental impact. The programs presented were Ecoentreprise, Green-E and Eco-Bat. They all try to solve the same problem, but from different perspectives. Eco-Bat, for instance, focuses exclusively on a building’s environmental impact throughout its whole life-cycle.

Two of them are server-based; the third is a standalone program. All can be tried in a demo version. I will not endorse any of these programs here, only note their existence. It is refreshing to see that simulation is slowly but surely gaining a foothold in Switzerland, although I have yet to see an integrated building simulation program in wide use in this country.

IBPSA Swiss regional affiliate created

The first General Assembly of IBPSA-CH took place on 1 September 2006 in Lucerne, Switzerland, at the University of Applied Sciences. The International Building Performance Simulation Association, the biggest international society of building performance simulation professionals, now has affiliates in sixteen countries or regions.

This first General Assembly focused on presenting the goals and projects of IBPSA-CH. Since all participating individuals automatically become members, we reviewed and voted on the proposed bylaws of the new association (which can be found on the association’s website).

Darren Robinson, Jessen Page and yours truly represented LESO-PB.

Energy-efficient appliances in Switzerland

Top Ten is a well-designed and user-friendly website that carries information on energy-efficient appliances and other equipment. It is mainly targeted at a Swiss audience, but I am sure most of these products can be found all over the European market.

On this site one can also find information on alternative energy sources and pointers to specialized companies that can help with their installation.