I was invited by the SawMill Linux OS group (Yoonho Park, Richard Neves, Trent Jaeger, etc.), who collaborate closely with Prof. Jochen Liedtke. The goal of the visit was to find out whether COMPOST can help to refactor the Linux kernel for SawMill, and whether COMPOST components (boxes) can help to modularize the kernel in a new way.
First of all, I gave three talks; the slides are available.
IBM Hawthorne seems to be one of the best research labs in the world. On every floor, on every door, you meet well-known names from papers. They do a lot of very interesting work (see below). The lab lies in Westchester, north of New York, in rural surroundings. Around 500 researchers work there. The second lab, IBM Yorktown Heights, lies some miles further north, in the middle of nowhere (pampa). I stayed at the Hilton in Tarrytown, a small historic town on the banks of the Hudson River. It is a nice location, since the Hudson is a large river of truly American dimensions.
SawMill discussions
The group welcomed me warmly; it was a pleasure being with them. The talks on COMPOST raised a lot of questions. We found out that the SawMill group has at least the following problems for which we can try COMPOST:
They want to recognize certain spots in the Linux code (in particular procedure calls). At these calls, they insert #ifdef switches to introduce their new code (i.e. they use the calls as hooks and embed code that branches to their new modules).
They would like to recognize such calls and spots by pattern matching.
The pattern matching should rely on names and on constructs of the program model.
Certain device driver structs are accessed directly by field access. They would like to replace these field accesses with set/get access functions, and then generate the access function implementations (access adapter classes). A difficulty is that device driver structs are split so that one part lies in a different address space than the other. Those parts, however, share common fields, which should be kept consistent. Can such consistency maintenance be ensured by the implementation of the access functions?
Linux has a buffer cache and a page cache, where a page physically contains 4 buffers. The two are managed separately; however, they are views of each other. Can this be specified as views and merged automatically with package mergers?
They would like to have tracing of polymorphic function calls (calls through void* pointers).
The pfs driver (physical file system) should be split into two parts, one of which they want to replace. The split, however, cannot be made along a functional interface.
They want to use boxes for flexible configuration of the kernel.
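The question about split device driver structs in the list above can be illustrated with a minimal sketch. All names here (`dev_kernel_part`, `dev_user_part`, `dev_handle`, `dev_set_flags`, and so on) are hypothetical and taken neither from Linux nor from SawMill. The sketch also assumes that both parts are reachable through plain pointers; with truly separate address spaces, the second write in the setter would have to become a message to the other side.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical driver state, split into two parts that would live in
 * different address spaces. The field `flags` is shared by both parts. */
struct dev_kernel_part {
    uint32_t irq;
    uint32_t flags;      /* shared field, copy 1 */
};

struct dev_user_part {
    uint32_t flags;      /* shared field, copy 2 */
    uint32_t open_count;
};

/* Handle that knows where both parts are (assumption: plain pointers). */
struct dev_handle {
    struct dev_kernel_part *k;
    struct dev_user_part   *u;
};

/* Returns 1 if both copies of the shared field agree, 0 otherwise. */
static inline int dev_flags_consistent(const struct dev_handle *d)
{
    return d->k->flags == d->u->flags;
}

/* Get accessor: replaces a direct read such as `dev->flags`. */
static inline uint32_t dev_get_flags(const struct dev_handle *d)
{
    return d->k->flags;
}

/* Set accessor: replaces a direct write such as `dev->flags = x`.
 * Writing both copies in one place is what keeps them consistent. */
static inline void dev_set_flags(struct dev_handle *d, uint32_t flags)
{
    d->k->flags = flags;
    d->u->flags = flags;  /* would be an IPC message in a real split */
    assert(dev_flags_consistent(d));
}
```

Once every direct field access has been rewritten to `dev_get_flags`/`dev_set_flags`, the consistency rule lives in exactly one function. This is why the replacement of field accesses is a natural candidate for an automatic source transformation.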
It would be nice to offer them a future version of Inject/C. They will come over to Karlsruhe in May and July. By July, we should have our C system ready, and in autumn one of us should visit them to perform experiments on SawMill Linux.
Other meetings
After Yoonho Park had sent around my abstracts, several other people made appointments to discuss COMPOST and other things.
Harold Ossher, SOP and Hyperspaces. This is one of the most interesting groups at Hawthorne. Both SOP and Hyperspaces are novel approaches to software composition. SOP provides views for classes; Hyperspaces provide the identification of 'slices' (called hyperslices) of Java systems. These hyperslices are specified with a 'concern mapping', which identifies the parts of a system that are relevant to a concern (i.e. to an aspect). After identification (which can be done graphically), hyperslices can be composed flexibly with other hyperslices. We agreed that hyperslices are the same concept as boxes: just sets of arbitrary program elements.
Hyperspaces have the nice concept of 'concern mappings', which the box model currently lacks. However, they do not have the concept of 'declared hooks', and they also do not provide program transformations (which are the better fit for legacy systems such as Linux).
The group was also very interested in our work on aspect weaving with graph rewriting. They had the impression that this work goes further than AOP. Hence, this is an interesting field to pursue. We agreed to stay in contact.
John Field, program analysis. John is an expert in applying term rewriting to program analysis and program slicing. He was very much interested in contract checking and in the selection of data structures for scalable dataflow analysis. His group has worked with COBOL on the Y2K problem, and it turns out that a lot of analysis algorithms do not scale to large programs: it is very important to tune your data structures. This is also one concern in OPTIMIX.
Vivek Sarkar, head of the Jalapeno Java Compiler group. Jalapeno is the new Java-based Java compiler and JVM from IBM. It is based on dynamic profiling and modern optimization techniques. Vivek is an experienced optimizer builder and has worked a lot on profile-based optimization and parallelization. During the week, it became clear that Martin Trapp is going over to the Jalapeno Java Compiler group. This is an ideal place for him; he will be responsible for the Jalapeno work on Array-SSA. This could be an ideal opportunity to collaborate.
G. Ramalingam. Program analysis.
D. Grove, Jalapeno, ex-member of the Cecil group. David is an experienced optimizer person from the Cecil group (Vortex). He has done a nice piece of work giving an overview of optimizations. He is interested in contract checking and modern languages.
Frank Tip. Optimization of class hierarchies in byte code. Frank has developed a tool called JAX, which optimizes the space and time requirements of bytecode files by rearranging (specializing) class hierarchy structures. We will organize a Dagstuhl seminar on Java optimization in November.
Sanjiva Weerawarana, Bean Markup Language. This group is designing an XML-based configuration language for Java Beans. It can be used for runtime configuration of Bean systems. Very interesting. It could be optimized by COMPOST, and they are interested in applying it. I would like to invite him here.
Brent Hailpern, associate director. Was interested in COMPOST.
Morton G. Swimmer, anti-virus specialist. Viruses modify themselves nowadays. If you still want to be able to recognize them, you have to apply the diff algorithm to the program dependence graph, no longer to the text. This should be an interesting application of FIRM and graph rewriting.
General Remarks
It seems that IBM is looking for good people.
The Hilton charges enormously high rates for hotel telephone calls. Beware!
In Frankfurt, Hall A, there is a new express check-in counter for direct flights (counter 50). If you have a direct flight and your suitcase weighs less than 28 kg, you can check in there and avoid the long queue at the standard Lufthansa Economy class counters. You can even check in your luggage there.