Saturday, May 26, 2007

Installing Ubuntu on a Fujitsu Lifebook A3120

As I mentioned in last week's post, I purchased a new Fujitsu Lifebook A3120. I have been busy lately, and consequently did not get time to do any online research, something I did when I purchased my last laptop. But I happened to be at Fry's Electronics (quite rare for me lately), I happened to be in the market for a new laptop, the machine looked really nice, and I remembered reading somewhere that Fujitsu machines have a good reputation in the Linux community for being fairly easy to convert. Linux installations have also been getting slicker, with more and more devices supported out of the box. So it was something of an impulse buy, with a giant leap of faith in Fujitsu and recent Linux distributions. Kind of dumb, now that I think about it, but it mostly worked out OK, at least so far, so I guess all's well that ends well.

As it turns out, the machine got rave reviews for its looks, but not so much for its performance. One thing that disappointed me was its poor battery capacity, about 1.5 hours. That's what my Toshiba gave me after about 2 years of use; when new, its battery life was close to 3 hours. It's not a catastrophe, since I use the battery during my commute to and from work, which is 50 minutes long. I just have to remember to charge the machine at home and at work. I was hoping they would sell a longer-lasting battery, so I could just upgrade at some point, but according to their website, that's the only model they sell. Not sure what the guys at Fujitsu were thinking, but this can't be good for their image. For my part, I feel disappointed and cheated, since I had just assumed that modern laptops are built to provide at least 3 hours of battery life, and up to 6 if you shell out the big bucks. I guess that's the price you pay for not researching before buying.

The Lifebook is an AMD Turion 64-bit dual-core machine. It came pre-installed with 32-bit Vista. I have been using Fedora for quite some time now, but a few colleagues suggested Ubuntu, which is based on the Debian Linux distribution. I have been thinking of switching to Debian for a while, but my last attempt was a disaster, so I switched back to Fedora at the time. However, Ubuntu comes packaged as a Live CD, which lets you run it off the CD to see if all your devices are recognized, whether X comes up correctly, and so on. So I decided to make the switch to Ubuntu.

To ease the transition, I bought and read Marcel Gagne's Moving to Ubuntu Linux, which was very helpful. It also came with a 32-bit Live CD, which worked fine. I tried downloading the 64-bit Ubuntu 6.06 (Dapper Drake), which promises long term support, and roasting it with xcdroast on my Fedora box, but after toasting (pun intended) two CD-Rs, a colleague burned me the 64-bit Live CD using Nero.

I installed the 64-bit Dapper Drake Ubuntu from the Live CD, which was surprisingly painless. X worked right out of the box, and everything looked good. I even loved the Human theme, with its swirling brown wallpaper, something I am told causes a lot of people extreme revulsion. The one thing that was almost completely unusable was the touchpad. It was way too sensitive, and would sometimes recognize a single tap as a double tap; at other times, I would have to double-click where I would normally expect to single-click. A bit of searching on the Ubuntu forums turned up a variety of suggestions, from installing QSynaptics and configuring the touchpad through it, to replacing the InputDevice section for the touchpad with a custom one.

Turns out I had the Alps GlidePoint, which works with the Synaptics driver, but requires a different configuration because the Alps hardware is slower than genuine Synaptics hardware.

sujit@sirocco:~$ cat /proc/bus/input/devices 
...
I: Bus=0011 Vendor=0002 Product=0008 Version=7321
N: Name="AlpsPS/2 ALPS GlidePoint"
P: Phys=isa0060/serio2/input0
S: Sysfs=/class/input/input3
H: Handlers=mouse2 ts2 event3 
B: EV=f
B: KEY=420 70000 0 0 0 0
B: REL=3
B: ABS=1000003
...

I installed qsynaptics as the threads suggested, but tweaking the parameters did not help much. I then copied the new configuration verbatim from this Ubuntu Forums thread, which actually solved the problem. The following section replaces the section called "Synaptics Touchpad" in the default /etc/X11/xorg.conf file. I did notice that the touchpad would revert to its old behavior each time I switched from AC to battery, but I am not seeing that anymore since the distribution upgrade to Feisty Fawn (more on that later).

Section "InputDevice"
  Identifier  "Synaptics Touchpad"
  Driver    "synaptics"
  Option    "AlwaysCore"
  Option    "SendCoreEvents"  "true"
  Option    "Device"  "/dev/input/event2"
  Option    "Protocol"  "event"
  Option    "LeftEdge"  "130"
  Option    "RightEdge"  "840"
  Option    "TopEdge"  "130"
  Option    "BottomEdge"  "640"
  Option    "FingerLow"  "7"
  Option    "FingerHigh"  "8"
  Option    "MaxTapTime"  "180"
  Option    "MaxTapMove"  "110"
  Option    "ClickTime"  "0"
  Option    "EmulateMidButtonTime"  "75"
  Option    "VertScrollDelta"  "20"
  Option    "HorizScrollDelta"  "20"
  Option    "MinSpeed"  "0.60"
  Option    "MaxSpeed"  "1.10"
  Option    "AccelFactor"  "0.030"
  Option    "EdgeMotionMinSpeed"  "200"
  Option    "EdgeMotionMaxSpeed"  "200"
  Option    "UpDownScrolling"  "1"
  Option    "CircularScrolling"  "1"
  Option    "CircScrollDelta"  "0.1"
  Option    "CircScrollTrigger"  "2"
  Option    "SHMConfig"  "on"
  Option    "Emulate3Buttons"  "on"
EndSection
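
As an aside (this is not from the forum thread, just something this config enables, as far as I know): with SHMConfig set to "on", the synclient utility that ships with the synaptics driver can read and tweak these parameters at runtime, which makes experimenting much less painful than editing xorg.conf and restarting X each time. A sketch:

synclient -l                  # list the current parameter values
synclient MaxTapTime=220      # try a new value on the fly, no X restart needed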

I then proceeded to install pwsafe, a simple command-line password manager I was already using. pwsafe is distributed as source, and surprisingly, a C compiler does not come standard with the Ubuntu desktop. Not that surprising, though, if one thinks about it, since it is built for home users. To build pwsafe, you do the usual configure, make, make install sequence. configure flagged the missing components, and I used apt-get install to pull them in. Once done, I just dropped in the .pwsafe.dat file from my old 32-bit machine and it just worked.
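
For the record, the sequence looked something like the sketch below; the tarball name and version are assumptions, and which packages configure complains about will vary:

sudo apt-get install build-essential    # pulls in gcc/g++, make and friends
tar xvzf pwsafe-0.2.0.tar.gz            # version number assumed for illustration
cd pwsafe-0.2.0
./configure                             # flags anything that is still missing
make
sudo make install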

Next I installed Java 6 (from Sun's website). Most Linux distributions of this vintage ship with an older Java (typically 1.4), which you can shadow by pointing the JAVA_HOME and PATH variables at the new install in your .bash_profile.
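
The Sun download is a self-extracting binary; here is a minimal sketch of the install, with the installer filename assumed (use whatever you actually downloaded):

chmod +x jdk-6u1-linux-amd64.bin        # filename assumed for illustration
cd /opt
sudo sh ~/jdk-6u1-linux-amd64.bin       # self-extracts into /opt/jdk1.6.0_01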

I then installed the latest versions of Maven (my favorite Java build tool) and Ant (the build tool in use at work), as well as Eclipse 3.2.2, my favorite IDE. This time, I opted to put all the optional software under /opt, to avoid confusion and keep things clean. These need extra environment variables, which I also set in my .bash_profile.

# Java settings
export JAVA_HOME=/opt/jdk1.6.0_01
export PATH=$JAVA_HOME/bin:$PATH
# Ant settings
export ANT_HOME=/opt/apache-ant-1.7.0
export PATH=$ANT_HOME/bin:$PATH
# Maven settings
export MAVEN_HOME=/opt/maven-2.0.6
export PATH=$MAVEN_HOME/bin:$PATH
# CVS Access (Work, sourceforge)
export CVS_RSH=ssh
#export CVSROOT=...
export CVSROOT=...

I also had to add this line to the end of /etc/profile for my .bash_profile to actually be sourced each time I logged in.

. $HOME/.bash_profile

The next step was to figure out how to connect to my employer's VPN. On my old Fedora laptop, I used PPTP Client for both my current and previous employers. This time around, I was unable to install the PPTPConfig component, a GUI setup tool, since some of the libraries PPTPConfig depends on are not available in the Debian repositories as 64-bit packages. Some threads suggested either installing the 32-bit packages and the associated 32-bit libraries, or compiling from source, but I was not familiar enough with Debian/Ubuntu to do either. A few other threads suggested using the NetworkManager applet available in Feisty Fawn (Ubuntu 7.04) in place of PPTPConfig.

So I had two options: either upgrade to 7.04 to get NetworkManager, or figure out how to do the configuration manually. I was actually able to connect to the VPN with the manual configuration, but none of the machines on the network behind the VPN were visible. This meant I had to change my routes to make these machines visible, but I knew almost nothing about routes. I also noticed that Dapper Drake was getting quite old (the Dapper repository gave me Apache 1.3, for example, whereas I have been using Apache 2 for at least a year on my old Fedora laptop), and since one of the main reasons I chose Debian/Ubuntu was its ease of upgrades, I decided to upgrade to Feisty Fawn.

Doing a distribution upgrade is actually quite easy. All you do is invoke the update manager, and a GUI tells you if a new distribution is available and leads you through the steps. Unfortunately, it only lets you upgrade one release at a time. It took me about 4 hours to upgrade from Dapper (6.06) to Edgy (6.10), and another 3 hours to upgrade from Edgy (6.10) to Feisty (7.04).
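
For reference, the non-GUI route the forums describe looks something like this sketch (one release hop at a time, just like the GUI):

sudo sed -i 's/dapper/edgy/g' /etc/apt/sources.list   # retarget the repositories
sudo apt-get update
sudo apt-get dist-upgrade                             # repeat with edgy -> feisty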

Distribution upgrades using the update manager (apt dist-upgrade) are also apparently fraught with danger, judging from the comments on the forums, and most people prefer doing a full install. I found out why, when X started crapping out with strange "command not found" errors after the upgrade to Edgy. The problem was that Edgy points /bin/sh at the stricter dash shell, and the fix was to change the #!/bin/sh at the top of /etc/gdm/Xsession to #!/bin/bash. I found this quite easily on the forums, however. There were no issues upgrading from Edgy to Feisty.
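
The change itself is a one-liner; something like this, after backing up the original:

sudo cp /etc/gdm/Xsession /etc/gdm/Xsession.orig
sudo sed -i '1s|^#!/bin/sh|#!/bin/bash|' /etc/gdm/Xsession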

With Feisty, there is a GUI that lets you configure your VPNs, see available wireless hotspots, and restart your wired connection. It's pretty neat and very convenient, but sadly, I was not able to connect to my employer's VPN with it. This is the error message I saw in /var/log/messages when I tried to connect. If anyone knows why, or if there is a fix or workaround, please let me know. I have also posted a question in the forums, and may file it as a bug if it's not already there.

May 26 14:41:00 localhost kernel: [ 1659.772358] nm-ppp-auth-dia[10538]: segfault at 
0000000000000088 rip 00002abc6a6d221b rsp 00007fff45f70720 error 4

So I had to fall back to the manual method. Luckily, this time (after half a day's searching), I found, in this thread, the magic route commands that let me see and ssh into the machines behind the VPN.

I ended up with a solution that is quite a hack. I mean, pppd is so nicely structured to call /etc/ppp/ip-up, which in turn calls /etc/ppp/ip-up.local and/or the scripts in the /etc/ppp/ip-up.d/ directory when the VPN link comes up, and the corresponding /etc/ppp/ip-down scripts when it is torn down. I could not make any of these work. Ultimately, as described in the thread, it turned out to be a shell script that calls pon and poff and runs the route commands after pon. However, what's important is that it works, and I can wait for the real fix to NetworkManager when it becomes available. Here is what my script looks like:

#!/bin/bash
case "$1" in
  'on')
    echo "Starting VPN tunnel to company..."
    sudo pon vpn_company
    sleep 10
    sudo route add -net 10.1.128.0 netmask 255.255.255.0 dev ppp0
    sudo route add -net 10.1.1.0 netmask 255.255.255.0 dev ppp0
    ;;
  'off')
    echo "Stopping VPN tunnel to company..."
    sudo poff vpn_company
    ;;
esac
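
Assuming the script is saved as ~/bin/vpn (the name is mine, pick anything) and made executable, usage looks like:

chmod +x ~/bin/vpn
~/bin/vpn on      # dial the tunnel, wait for ppp0, then add the routes
~/bin/vpn off     # tear the tunnel down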

And my manually configured tunnel file (/etc/ppp/peers/vpn_company) looks like this. It is mostly copied from the file generated by PPTPConfig on my Fedora laptop. The other files required for the configuration are listed in the "Configuring a tunnel by hand" section in the PPTPClient documentation.

pty "pptp vpn.mycompany.com --nolaunchpppd "
name MYCOMPANY\\sujit
remotename PPTP
require-mppe
file /etc/ppp/options.pptp
ipparam vpn_mycompany
usepeerdns
noauth
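
One of those files is /etc/ppp/chap-secrets, which holds the credentials. Its format, if memory serves, is client, server, secret, and allowed IPs, so the entry matching the name and remotename above would look roughly like this (the secret is obviously a placeholder):

# client            server   secret             IP addresses
MYCOMPANY\\sujit    PPTP     <your-password>    *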

There are still a few other things I want to get working. I do have a wireless access point at home, but I have to get my wife off her PC to be able to configure it, and it's set up to do MAC address authentication at the moment, so I was not able to test wireless. However, the laptop can see the access point, so I think wireless should already work. This was quite refreshing compared to Fedora Core 2, where I had to compile the madwifi drivers for the integrated Atheros wireless card into the kernel. I never got around to doing this after upgrading to Fedora Core 3, as a result of which I haven't used wireless for quite a while.

Thankfully, the laptop is at a point where it's usable. It has all the software I need, the touchpad works, and I can connect through the VPN. Major thanks are due to the good folks who post to the Ubuntu Forums, one of the most comprehensive and active forums I have seen lately. There will definitely be more things I find and want as I explore Ubuntu and the new software available here, and if I have issues installing or using them, I may write about it here in the future.

Saturday, May 19, 2007

The sum of Small Transitions

I don't normally write one of these "no stuff, just fluff" articles, but I don't have anything interesting (i.e. code related) to write about this week. The reason is a series of transitions I have been going through over the past two weeks or so. So I thought I'd write about the transitions themselves, which are probably quite uninteresting by themselves, but maybe a little less so when grouped together. So anyway, you've been warned.

Glasses to Contact Lenses

Most people automatically assume that this switch has something to do with vanity or being fashionable, but my reasons were solely to do with costs. My insurance coverage provides me with a free eye exam and subsidized lenses every year, and a subsidized frame every other year. My insurance agent would probably say "free" where I say "subsidized", but I use this word because the coverage amount is insufficient to cover what most people would consider decent eyewear. As a result, I end up spending about $100-150 on lenses every year, and an additional $150 or so every other year on frames.

I have been looking at contact lenses since last year, because I calculated that one would spend in the region of $50-$100 on contact lenses over the insurance coverage amount, which is much better than glasses. However, there is definitely a barrier to entry, since you would have to de-condition yourself to not take evasive action (i.e. blink) while you push a plastic thingy into your eye. I chickened out last year and got myself glasses instead, but this year, I decided to bite the bullet and go for it.

The optician provides some training before you can start using contact lenses. It took me over half an hour the first time I tried to put them in myself (at his office). Since then, it's gotten easier, and I spend about 3-10 minutes putting them in every morning and about 3-5 minutes taking them out every night. I especially enjoy being able to slip on a pair of $10 sunglasses on sunny days and still see through them.

Engineer to Manager

I also got promoted to a management position sometime last week. I have had mixed feelings about the promotion. I have actively shunned these positions in the past, because I love to write code, and once one goes into management, there is less time for it. Not that I am an engineering super-hotshot, of course; there are people in my group who are technically superior in various ways. However, I have noticed quite a few of my colleagues (in past jobs) go into management, and quite a few of them made the transition from effective engineer to effective manager quite well. Some have even become more effective engineers as managers, since they now have the help of a bunch of people as smart as they are to carry out and implement their ideas. So that gives me hope that perhaps I can do the same.

I know it's too early for me to have a "management style" yet, but what I would like is a collaborative style rather than one where I simply tell people what to do. The closest analogy I can think of is a quarterback in (American) football, as opposed to a coach. The quarterback is not necessarily the best player on the team, but he is generally one of the better ones. He has to throw/kick/run with the ball the same as the rest of the team, but he also has to understand and communicate strategy, so he ends up working harder. And no, I am not a Monday night football guy; about the only game I watch is the Super Bowl.

32-bit to 64-bit

I also bought a new laptop last Saturday. I used to have a Toshiba Satellite with a 2.8 GHz Intel Pentium 4 processor and 512MB of RAM. The new one is a Fujitsu Lifebook with a 1.85 GHz dual-core AMD Turion 64 processor and 2GB of RAM. The reviews are not so great, but it seems to work for my use. Not sure the switch to 64 bits actually bought me anything, probably not, but I thought it would be cool to have a 64-bit machine. And if I ever wanted to go beyond 2GB of RAM, a 64-bit machine would be able to address it.

Fedora to Ubuntu

I have long been impressed with Debian Linux, mainly because of its promise of install once, upgrade forever, even across major releases. My first Linux system was Red Hat 5.x, and I used Red Hat all the way through 9.x, then switched to Fedora when Red Hat split its distribution into the commercial RHEL line and the community Fedora project. My old laptop started with Red Hat 9.x and currently runs Fedora Core 3. Ubuntu has been getting quite popular lately, it's based on Debian, and a colleague was kind enough to burn me an ISO of the Live CD, so I thought I would give it a try on my new laptop.

Modern Linuxes are a breeze to install, and overwriting the pre-installed 32-bit Vista Home Premium with 64-bit Ubuntu 6.06 took about 15-20 minutes. However, there is a lot of other software I need that does not come built into the OS, so I spent the better part of the day installing it and making it work. I still have to move the data (basically code I write for fun, which is not checked into any repository) off my old laptop onto the new one.

Backpack to Messenger bag

So far, I've been carrying my laptop to work and back in a backpack, and I have always thought the guys carrying messenger-bag style computer cases looked kind of cool. Backpacks remind me of my school days, and I daresay the messenger bags make the owner look a little more like he is actually going to work. Since my old backpack was coming apart at one of its seams, I sprang for a messenger bag for my new laptop.

What I realize now is that the messenger bag may be good for bike messengers, but for people who have to walk part of the way to work or navigate narrow aisles aboard public transport, it is an ergonomic nightmare. I keep bumping the bag into other people and into turnstiles and such, and it's also harder to carry than the backpack. I'll probably switch back to a backpack once this one gets a bit old.

So what was my point again?

No real point, really. I just thought it was kind of odd being hit by so many little changes at the same time, so I thought I'd talk about it. Unfortunately, the sum of all these little changes meant that I did not have any time or opportunity to do the little experiments I usually do in my spare (commute) time and blog about. I have been working harder at work, obviously, because in addition to doing my own work, I have to assist and guide others; the latter is actually quite a large time sink, something that surprised me when I thought about it. I was also going back and forth between my new laptop and my old one, which meant I ended up not getting much done. Hopefully, come next week, I will be able to put my new laptop to good use.

Sunday, May 13, 2007

Spring Remoting Strategies compared

A few weeks ago, we needed to integrate a fairly resource-intensive module into our web application. Because it was so resource-intensive, it was not practical to run it embedded within the application, so we decided to isolate it into its own application and access it as a service. Since our application is Spring based, we decided to use one of the remoting strategies that come built into Spring. This post describes my preliminary investigation, and compares the performance of the various strategies.

The nice thing about Spring is that you can build your server and client without regard for the actual transport protocol you will finally use. The supported protocols are RMI, Burlap, Hessian and Spring's own Java serialization over HTTP (HttpInvoker). For each of the supported protocols, Spring provides an Exporter and a ProxyFactory component. The Exporter "exports" your server POJO as a service, while the ProxyFactory proxies the service interface at the client.

Java code

Since the client and server components are going to be sharing the classes for the service interface and common domain objects, I built this experiment up as a multi-module Maven2 project, with three sub-projects: service, common and client. To make this easy to create, I decided to follow the Account example in the Spring reference guide. However, my Account object is beefier, comparable in complexity to the type of object I expect to be sending and receiving remotely. Here is the service interface and the domain objects from the common module:

// AccountService.java
public interface AccountService {
  public void insertAccount(Account account);
  public List<Account> getAccounts(String accountName);
}

// Account.java
public class Account implements Serializable {

  private String customerName;
  private Map<String,List<Transaction>> transactions;
 ...
}

// Transaction.java
public class Transaction implements Serializable {

  private Date date;
  private String description;
  private int type;
  private Float amount;
  ...
}

I had originally created a Java 1.5 Enum for the Transaction.type variable, but that caused Burlap and Hessian remoting to fail, since there is no mapping for user-defined types. Even using the Java 1.4 style Type-safe Enum Pattern won't work, since they don't have the public default constructor that is needed for Burlap and Hessian remoting to work. I suspect that they also need public getter and setter methods, but did not test that, so can't say for sure.

The server component is just a shell that reports what came in and went out, without actually doing anything useful, such as writing to a database. The MockAccountHelper is a class in the common module that returns canned instances of Account objects.

public class AccountServiceImpl implements AccountService {
  private static final Logger LOGGER = Logger.getLogger(AccountServiceImpl.class);
  public void insertAccount(Account account) {
    LOGGER.debug("Server inserts:" + account.toString());
  }
  public List<Account> getAccounts(String name) {
    List<Account> accounts =  MockAccountHelper.getMockAccountList();
    LOGGER.debug("Server returns:" + accounts.toString());
    return accounts;
  }
}

The client component is a plain POJO which references the proxy implementation of the AccountService API and exposes methods which simply delegate to the proxy.

public class AccountClient {
  private AccountService accountService;

  public void setAccountService(AccountService accountService) {
    this.accountService = accountService;
  }
  public void insertAccount(Account account) {
    accountService.insertAccount(account);
  }
  public List<Account> getAccounts(String name) {
    return accountService.getAccounts(name);
  }
}

Configuration

We expect the server to be a web application that exposes the service over RMI or HTTP POST, so we have to set up the following files on the server:

<!--===== web.xml =====-->
<web-app ...>
  <servlet>
    <servlet-name>service</servlet-name>
    <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
  </servlet>
  <servlet-mapping>
    <servlet-name>service</servlet-name>
    <url-pattern>/service/*</url-pattern>
  </servlet-mapping>
</web-app>

<!-- ===== service-servlet.xml ===== -->
<beans ...>
  <bean id="accountService" class="com.mycompany.myapp.server.AccountServiceImpl"/>

  <!-- Specific exporter configuration goes here -->
  ...
</beans>

On the client, we set up our client and inject it with a reference to the appropriate proxy, as shown below. Our client will ultimately be a web application, but for our experiment, we just have a JUnit test, so we set up an applicationContext-client.xml as shown below.

<!-- ====== applicationContext-client.xml ===== -->
<beans ...>
  <bean id="accountClient" class="com.mycompany.myapp.client.AccountClient">
    <property name="accountService" ref="accountService"/>
  </bean>

  <!-- ProxyFactoryBean implementation for the selected protocol -->
  <bean id="accountService" class="...">
    ...
  </bean>

</beans>

Protocol specific configurations

We describe the server and client configurations for each of our supported protocols below. We just plug in the appropriate exporter configuration into service-servlet.xml and the corresponding proxy factory configuration into the applicationContext-client.xml file. Tinou has a nice diagram of this setup in his blog entry A critique of Spring, although he cites this as one of the things he dislikes about Spring. The exporter and proxy factory classes for each protocol (all under the org.springframework.remoting package tree) are:

Protocol             Service config (exporter)     Client config (proxy factory)
RMI                  RmiServiceExporter            RmiProxyFactoryBean
Burlap               BurlapServiceExporter         BurlapProxyFactoryBean
Hessian              HessianServiceExporter        HessianProxyFactoryBean
Spring HttpInvoker   HttpInvokerServiceExporter    HttpInvokerProxyFactoryBean

Testing

For testing, we have to bring up the server web application. We can do this using the Maven2 Jetty plugin, with the command mvn jetty6:run. On the client, we use a JUnit test to hit this service. By the way, I started using JUnit 4 quite recently, after I could not find answers to one of my JUnit 3.8 questions. You can get up and running with it quickly using Gunjan Doshi's blog post "JUnit 4.0 in 10 minutes". It is also backward compatible, so your existing 3.8 tests will run without problems under JUnit 4.0. Here is the code for my unit test, which I invoke from the command line as "mvn test -Dtest=AccountClientTest".

public class AccountClientTest {

  private static final Logger LOGGER = Logger.getLogger(AccountClientTest.class);

  private AccountClient client;

  @Before
  public void setUp() throws Exception {
    ApplicationContext context = new ClassPathXmlApplicationContext("classpath:applicationContext-client.xml");
    client = (AccountClient) context.getBean("accountClient");
  }

  @Test
  public void testInsertAccount() throws Exception {
    Account account = MockAccountHelper.getMockAccount();
    StopWatch watch = new StopWatch();
    watch.start("insert");
    for (int i = 0; i < 50; i++) {
      client.insertAccount(account);
    }
    watch.stop();
    LOGGER.debug("Avg insert time:" + (watch.getTotalTimeMillis() / 50) + "ms");  }

  @Test
  public void testGetAccounts() throws Exception {
    StopWatch watch = new StopWatch();
    watch.start("get");
    for (int i = 0; i < 50; i++) {
      List<Account> accounts = client.getAccounts("");
    }
    watch.stop();
    LOGGER.debug("Avg get time:" + (watch.getTotalTimeMillis() / 50) + "ms");
  }
}

Results

As you can see, all I am doing is hitting each exported remote method 50 times and computing the average response time. I also created a baseline by doing the same thing within the server application, so there is no network overhead. My objective was to compare the network overhead added by each protocol. Here are my results, running on my laptop with 512MB RAM and a 2.80 GHz Pentium 4 processor.

Protocol             insertAccount (ms)   getAccounts (ms)
Local (baseline)     9                    9
RMI                  13                   13
Burlap               19                   17
Hessian              17                   14
Spring HttpInvoker   41                   16

In short, RMI adds the least overhead over the local baseline, the HTTP-based protocols cost a few milliseconds more, and Spring's HttpInvoker stands out as noticeably slower on the insert call.

Saturday, May 05, 2007

Generic commons-collections

The Apache commons-collections library provides many interesting data structures that are quite useful and huge time-savers. Before I started using them, I would routinely build MultiMap or BidiMap implementations in my application code. I did not even know that there was a thing such as a Bag, something I used quite recently. In other words, a great library which I love and have used very heavily over the last few years.

However, ever since Java 1.5 came out with its support for Generics and generic Collection data structures in java.util, I find myself using the generic forms to declare and work with these data structures. In addition to the obvious advantages of type-safety and compile-time checking that generics bring to the code, I find the self-documenting nature of generic code very attractive. So for example, if you wanted to create a Map with BigDecimal keys and some application object as the value, you could simply state:

public static final Map<BigDecimal,AppObject> myMap = new HashMap<BigDecimal,AppObject>();
...
public Map<BigDecimal,AppObject> getMyMap() {
  ...
}
rather than:
public static final Map myMap = new HashMap(); // key=BigDecimal,value=AppObject
...
/**
 * @return a Map of {BigDecimal,AppObject}
 */
public Map getMyMap() {
  ...
}

Since commons-collections is non-generic (as of version 3.2), and the decision to make these classes generic is complicated by backward-compatibility considerations, it is likely that code using commons-collections will have to continue to mix generic and non-generic calls for the foreseeable future. This is not a huge deal, since the code will still work, but it does negate the advantages of using generics in Java code.

The collections15 project from Larvalabs provides a generic version of the commons-collections project. This seems to be a fairly well-kept secret; I stumbled upon it accidentally while browsing the commons-collections mailing list, looking for when (and if) a generic version of the library would become available.

This post describes a small example that converts application code from using a TransformIterator from commons-collections to using its generic replacement TransformIterator<I,O> from the collections15 project. Hopefully, it will illustrate how easy the conversion is, and how much more readable and type-safe the code becomes as a result.

My example is creating a dynamic SQL query with an IN clause. For example, to find all employees with first name 'Bob' in Engineering, Finance and Sales, we could write a query which looks something like this:

select * from employee
  where first_name = 'Bob'
  and dept_name in ('Engineering','Finance','Sales')
  order by last_name;

The list of departments is provided, as a List&lt;String&gt;, to the method generating the dynamic SQL string. So the original code looked like this:

import org.apache.commons.collections.Transformer;
import org.apache.commons.collections.iterators.TransformIterator;
...
public class MyClass {
  ...
  private TransformIterator quotingIterator = new TransformIterator();

  public MyClass() {
    // quote and SQL-escape each department name as it is iterated over
    quotingIterator.setTransformer(new Transformer() {
      public Object transform(Object input) {
        return "'" + StringEscapeUtils.escapeSql((String) input) + "'";
      }
    });
  }
  ...
  private String buildSql(String firstName, List<String> departments) {
    quotingIterator.setIterator(departments.iterator());
    String sql = "select * from employee where first_name = " + 
      "'" + StringEscapeUtils.escapeSql(firstName) + "' " +
      "and dept_name in (" +
      StringUtils.join(quotingIterator, ',') + 
      ") order by last_name";
    return sql;
  }
}

The generic version of the code is as simple as dropping in the TransformIterator replacement from the collections15 project. To do this, I needed to comment out the following dependency from my Maven2 pom.xml file and add the new declaration,

    <!--
    <dependency>
      <groupId>commons-collections</groupId>
      <artifactId>commons-collections</artifactId>
      <version>3.2</version>
      <scope>compile</scope>
    </dependency>
    -->
    <dependency>
      <groupId>net.sourceforge.collections</groupId>
      <artifactId>collections-generic</artifactId>
      <version>4.01</version>
      <scope>compile</scope>
    </dependency>

then regenerate my Eclipse project files using mvn eclipse:eclipse. This downloads the collections15 jar to my local repository and makes it visible to my project. The new code now looks like this:

import org.apache.commons.collections15.Transformer;
import org.apache.commons.collections15.iterators.TransformIterator;
...
public class MyClass {
  ...
  private TransformIterator<String,String> quotingIterator = new TransformIterator<String,String>();

  public MyClass() {
    // the typed transformer takes a String and returns a String, so no cast is needed
    quotingIterator.setTransformer(new Transformer<String,String>() {
      public String transform(String input) {
        return "'" + StringEscapeUtils.escapeSql(input) + "'";
      }
    });
  }
  ...
  private String buildSql(String firstName, List<String> departments) {
    quotingIterator.setIterator(departments.iterator());
    String sql = "select * from employee where first_name = " + 
      "'" + StringEscapeUtils.escapeSql(firstName) + "' " +
      "and dept_name in (" +
      StringUtils.join(quotingIterator, ',') + 
      ") order by last_name";
    return sql;
  }
}

As you can see, the actual buildSql() method is unchanged. However, the Transformer in the generified code explicitly accepts and returns a String in its transform() method, whereas the non-generic version has to cast, and would throw a ClassCastException at runtime if the input were not a String.

Also, to get started, it may just be a matter of replacing the imports from org.apache.commons.collections to org.apache.commons.collections15 using a sed script over the source tree. This removes the compile-time dependency on the commons-collections jar, after which we can, at our convenience, generify the code using the new classes from the collections15 project.
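
A one-liner along these lines should do it (a sketch; adjust the source directory to your layout):

find src -name '*.java' -print0 | xargs -0 \
  sed -i 's/org\.apache\.commons\.collections\./org.apache.commons.collections15./g'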

In this particular application, no third-party or framework code uses commons-collections, but given how useful these libraries are, my case is probably the odd one out. However, dependencies from framework or third-party code are runtime dependencies, which can be satisfied by linking in the Apache commons-collections jar at runtime in addition to the collections15 jar. Maven2 makes this quite easy: set up the former as a runtime dependency and the latter as a compile-time dependency.