Technicals writings from JLH: 2008

Wednesday, December 10, 2008

Get rid of Vodafone Mobile Connect Card driver for Linux

Back in August, I explained why I switched from shipped Xandros to the Debian Eee blend. I bought my EeePC with a 3G USB modem from Huawei:
Bus 001 Device 003: ID 12d1:1003 Huawei Technologies Co., Ltd. E220 HSDPA Modem / E270 HSDPA/HSUPA Modem
Thanks to the Vodafone Mobile Connect Card driver for Linux, I could use it very easily. But compared to the Asus dialer, it is way slow! Nearly two minutes pass between the time I click on the icon and the time I am connected, because the EeePC is not powerful enough to run this bloated Python software quickly.

While inspecting this program, you will notice that it uses wvdial behind the scenes:
jlh 11626 11579 0 15:22 pts/5 00:00:00 /opt/vmc/bin/wvdial --config /tmp/VMC_uJJ0r/VMCYy5LXzwvdial.conf connect
Just copy the configuration file to wvdial.conf:
[Dialer Defaults]
Phone = *99***1#
Username = slsfr
Password = slsfr
Stupid Mode = 1
Dial Command = ATDT
Check Def Route = on
Dial Attempts = 3

[Dialer connect]
Modem = /dev/ttyUSB0
Baud = 460800
Init2 = ATZ
Init3 = ATQ0 V1 E0 S0=0 &C1 &D2 +FCLASS=0
Init4 = AT+CGDCONT=1,"IP","slsfr"
ISDN = 0
Modem Type = Analog Modem

But running wvdial with this configuration will only work if you have already entered the PIN code with Vodafone Mobile Connect Card driver for Linux. I needed to find a way to automate this. I quickly googled for a solution without luck, so I devised a way to do it myself.

What I basically did was to run strace(1) on the Vodafone Mobile Connect Card driver for Linux just as I entered my PIN code. Then I looked for the PIN code itself in the output file and read the surrounding lines, so I could figure out the dialogue with the USB modem that unlocks it.

Let's say my PIN code is 5678. First check where my PIN code is used:
jlh@r2d2:~$ grep -n 5678 strace.vodafone
68172:12333 write(29, "AT+CPIN=5678\r\n"..., 14) = 14

Line 68172. The line is sent on file descriptor 29. Let's find where this file descriptor is opened before line 68172. It may be recycled during the execution of the program, so only take the last one:
jlh@r2d2:~$ cat -n strace.vodafone | head -n 68172 | grep 'open.*= 29' | tail -n 1
66643 12333 open("/dev/ttyUSB1", O_RDWR|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 29

And where it's closed:
jlh@r2d2:~$ cat -n strace.vodafone | tail -n +66643 | grep 'close(29'
74856 12333 close(29) = 0

We just have to get read(2) and write(2) calls on file descriptor 29 between line 66643 and line 74856:
jlh@r2d2:~$ sed -n '66643,74856{ /\(read\|write\)(29/p }' strace.vodafone
12333 write(29, "AT+CSQ\r\n"..., 8) = 8
12333 read(29, "AT+CSQ\r\r\n+CME ERROR: SIM PIN requ"..., 8192) = 39
12333 write(29, "ATZ\r\n"..., 5) = 5
12333 read(29, "ATZ\r"..., 8192) = 4
12333 read(29, "\r\nOK\r\n"..., 8192) = 6
12333 write(29, "ATE0\r\n"..., 6) = 6
12333 read(29, "ATE0\r\r\nOK\r\n"..., 8192) = 11
12333 write(29, "AT+CPIN?\r\n"..., 10) = 10
12333 read(29, "\r\n+CPIN: SIM PIN\r\n\r\nOK\r\n"..., 8192) = 24
12333 read(29, "\r\n^BOOT:12633134,0,0,0,64\r\n"..., 8192) = 27
12333 write(29, "AT+CPIN=5678\r\n"..., 14) = 14
12333 read(29, "\r\nOK\r\n"..., 8192) = 6
12333 read(29, "\r\n^SIMST:1\r\n"..., 8192) = 12
12333 read(29, "\r\n^SRVST:2\r\n"..., 8192) = 12
12333 read(29, "\r\n^RSSI:13\r\n"..., 8192) = 12
12333 read(29, "\r\n^BOOT:12633134,0,0,0,64\r\n"..., 8192) = 27
...
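The three steps above (find the write carrying the PIN, find the last open() of that descriptor before it, find the close() after it) can be folded into one small shell function. This is only a sketch: the name pin_dialog is made up for this example, and it assumes the strace output looks exactly like the lines shown above.

```shell
#!/bin/sh
# pin_dialog TRACE PIN
# Hypothetical helper (not part of any tool): print the read(2)/write(2)
# calls made on the descriptor that carried "AT+CPIN=<PIN>" in an strace log.
pin_dialog() {
    trace=$1
    pin=$2

    # First write(2) that sends the PIN: LINE:PID write(FD, "AT+CPIN=...
    hit=$(grep -n "write([0-9]*, \"AT+CPIN=$pin" "$trace" | head -n 1)
    [ -n "$hit" ] || return 1
    pin_line=${hit%%:*}
    fd=$(printf '%s\n' "$hit" | sed 's/.*write(\([0-9]*\),.*/\1/')

    # Descriptors are recycled, so keep the last open() before the PIN.
    open_line=$(head -n "$pin_line" "$trace" |
        grep -n "open.*= $fd\$" | tail -n 1 | cut -d: -f1)
    [ -n "$open_line" ] || open_line=1

    # First close() of that descriptor after it was opened, if any;
    # otherwise read up to the end of the file ('$' in sed).
    rel=$(tail -n "+$open_line" "$trace" |
        grep -n "close($fd)" | head -n 1 | cut -d: -f1)
    if [ -n "$rel" ]; then
        close_line=$((open_line + rel - 1))
    else
        close_line='$'
    fi

    # Keep only the reads and writes on that descriptor in the window.
    sed -n "${open_line},${close_line}p" "$trace" |
        grep -E "(read|write)\($fd,"
}
```

With the trace from this post, it would be called as: pin_dialog strace.vodafone 5678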

The important thing is to send the PIN, so we will just stop after sending the PIN (if it doesn't work, you can still try to add a few lines). Let's translate this to a chat(8) script:
ECHO ON
TIMEOUT 1
'' AT+CSQ
'SIM PIN required' ATZ
OK ATE0
OK AT+CPIN?
OK AT+CGSN
OK AT+CPIN=5678

Then, right before running wvdial with the stolen configuration file, use:
jlh@r2d2:~$ /usr/sbin/chat -f pin.chat < /dev/ttyUSB0 > /dev/ttyUSB0 [...]

Note that if you try to rerun this chat(8) script, it won't work because the modem won't return what is expected. Given there is no way to implement branches in chat(8), I used a very small timeout so it will exit very quickly if the modem is already unlocked.
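Since chat(8) cannot branch, the branching can live in the shell instead. The sketch below runs the PIN script, tolerates its failure (which is what happens when the SIM is already unlocked), and then dials. The function name connect_3g and the CHAT, WVDIAL, MODEM and PIN_SCRIPT variables are made up for this example; it assumes wvdial.conf holds the configuration shown above, with its "connect" section.

```shell
#!/bin/sh
# Hypothetical wrapper: unlock the SIM if needed, then dial.
# The variables only exist to make the commands and paths easy to change.
CHAT=${CHAT:-/usr/sbin/chat}
WVDIAL=${WVDIAL:-wvdial}
MODEM=${MODEM:-/dev/ttyUSB0}
PIN_SCRIPT=${PIN_SCRIPT:-pin.chat}

connect_3g() {
    # With TIMEOUT 1, chat gives up after about a second when the modem
    # does not answer "SIM PIN required", i.e. when it is already unlocked.
    if ! "$CHAT" -f "$PIN_SCRIPT" < "$MODEM" > "$MODEM"; then
        echo "PIN script failed; assuming the SIM is already unlocked" >&2
    fi
    # "connect" is the [Dialer connect] section of wvdial.conf.
    "$WVDIAL" connect
}
```

Calling connect_3g then replaces the two manual steps. Note that this cannot tell a wrong PIN from an already unlocked SIM, so it is no smarter than the small-timeout trick, just more convenient.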

Friday, September 26, 2008

Global Offset Table, why is it needed?

In ELF land, GOT stands for Global Offset Table. It is a very important component of Position Independent Code (PIC), used in shared libraries. In this article, we try to understand why it is needed.

Preface

Understanding linking and loading is difficult. At least it was for me, because I mainly read documentation without writing code, which usually helps things stick in mind. If you are really interested in linking and loading, I can only advise you to read the following documentation:

John R Levine's "Linkers and Loaders" book (ISBN 1-55860-496-0). This is the best documentation you can find on this topic. You can find the first chapter here, which is a very good read to understand the history of linkers.

Ian Lance Taylor's "Linkers" blog entries contain an overall explanation of the linking process.

Also, have a look at the bookmarks on my website: there are a couple of interesting pointers too in "Code"/"Linkers/Loaders".

So, why GOT?

Quote from Wikipedia in the Linker article:

In computer science, a linker or link editor is a program that takes one or more objects generated by a compiler and assembles them into a single executable program.

As you may already know, when linking binary objects together into a program, the linker has to handle relocations. Relocations are basically hints left by the assembler so the linker knows where there are addresses to fix once everything has been put together. Let's take a simple example to illustrate the need for relocations. Imagine that i is a global variable defined in one source file and that you assign it a value in another source file: i = 1. On i386, the compiler translates this into movl $1, i. When the assembler processes this, it doesn't know the actual address of i since it can only see one file at a time, so it just sets the address to zero; the linker, on the other hand, sees all files at once and can figure out the final value of a symbol (that is, the address of a variable). So in this case, the assembler generates a relocation telling the linker to put the value of symbol "i" at a given offset in the code section, once all binary objects have been combined and the final address of every symbol is known. I won't go further into the details of relocations; this deserves another blog article, I think. Just keep in mind that most types of relocation are used to "fix" the code itself. And this is not a problem when linking multiple object files (including static libraries) together into a program.

Unlike static libraries, shared libraries are dynamically linked, which means that the linker's job is deferred to runtime and is handled by the so-called dynamic linker. By definition, shared library code is shared among multiple processes: thanks to virtual memory, the same physical memory pages holding the library code are mapped into the address space of every process that needs it. The benefit is a saving of physical memory. Thus the shared code must obviously stay the same and should not be fixed directly to suit one particular process memory layout or another.

The difficulty comes from the fact that we don't know where the dynamic linker will decide to load the library in the process memory, as it depends on a number of factors such as the executable size and the number and size of the libraries loaded previously, not to mention that the actual load address might be randomized for security reasons on recent systems. Still, when the library code refers to a global variable, its address has to be stashed somewhere so the code can access it.

As we saw in the relocation example, this is normally done by fixing the code directly. In order to avoid modifying shared library code at runtime and losing memory sharing, we have to make the runtime relocations point to process-private data. This is done by asking the compiler to generate Position-Independent Code (PIC). The idea is simple: instead of having the final absolute address of a symbol in the code, the code refers to a particular entry of a private array containing this address. At runtime, the dynamic linker fills this array with the actual symbol values, so the same code accesses the addresses proper to each process. The sharp-minded reader may have noticed that this only moves the problem one step further, because the address of the private array itself is still unknown. This is resolved by a small calculation. Given that the library is loaded as a single unit in memory, the distance between the code and the private array is known at build time. The code just has to get the address of the current instruction, and the address of the private array can be figured out from it.

Ladies and gentlemen, this private array of addresses is what we call the Global Offset Table, or GOT.

It is actually perfectly valid to create a non-PIC shared library. It will work. But under the hood, the dynamic linker will probably patch most of the code pages of the library, which will then be duplicated by the copy-on-write mechanism of the virtual memory. Besides wasting memory, it may take more time to start: if a symbol is used thousands of times throughout the code, the dynamic linker will have thousands of relocations pointing into the code section to process at startup. On the other hand, with PIC there is one GOT entry per symbol, so there is only one relocation to perform. But to be honest, this is more a tradeoff than a pure win, because using the GOT implies a more expensive function prologue to compute the GOT address and an additional indirection for each access through the GOT.

Shared libraries all have their own GOT, and the program would have its own if it was compiled with PIC as well (this is called a Position Independent Executable, by the way). Actually, in any shared library, the GOT is used for every piece of data accessed with an absolute address in the code, which includes static variables and global variables defined in the library itself.

For static variables there is no need for a GOT entry: getting the address is straightforward because the distance between the GOT and the variable itself is known, being in the same loadable unit, so the code can directly access it.

One might think the same holds for global variables defined within the library itself. But this overlooks the fact that ELF allows the executable or a previously loaded shared library to redefine a global variable of a shared library. For example, you can perfectly well define a stdout variable in your program even though it is already defined in libc, but keep in mind that your variable will also be used by the libc functions referring to stdout, such as printf(3), so be careful that it points to a valid FILE object. In short, all global variables used by the shared library unconditionally have a GOT entry.

I hope you now better understand what the GOT is :).

Tuesday, August 19, 2008

Debian on Asus EeePC 701 with Huawei USB modem from SFR or Vodafone

A few months ago, I bought a neat bundle from SFR, a French mobile operator, containing an Asus EeePC 701 and a subscription to Internet access through 2G/3G/3G+. The connection is achieved thanks to an "E220 HSDPA Modem" USB key from Huawei Technologies. All this stuff has been working very well out of the box with the pre-installed Xandros distribution (based on Debian).

All I wanted from this netbook was to be able to surf the web and open terminals without wasting time administering the beast. And well, the standard EeePC distribution achieves this very well, with OpenOffice as a bonus. Of course, I was lacking some stuff like gcc, mplayer, screen... That's why I harvested a few unofficial package repositories to cram into my sources.list(5). There were obviously some conflicts between the repositories, but I really didn't care (although usually I'm very keen to make my package manager happy): I had a handful of configuration files backed up on a USB key and, in case of unrecoverable failure within the package system, I just had to restore the original state by pressing F9 at startup (the EeePC is shipped with a cunning disk setup: two partitions, the first one containing the original system and the second, empty one being mounted on top of the former using unionfs, so restoring the system basically means blanking the second partition).

But as time went on, one thing upset me more and more: no package updates from Asus. As you may already know, this EeePC is also shipped out-of-the-box with a remote root exploit (through Samba)... This was very annoying for me because I sometimes connect to other boxes using SSH, so one could hack my EeePC to steal my passwords or perform even more subtle things. So I turned off everything I could, because the kernel provided by Xandros doesn't contain IPTables. But honestly, I was still worried about security.

I finally decided to spare some time to install another, more up-to-date distribution when I noticed that I couldn't use Firefox 3 because most of the required libraries were not available. A friend of mine had tried Ubuntu EEE or EEE Ubuntu, whatever. At first, I thought it was a good choice because it could fit both the low-administration and up-to-date-ness requirements. But he quickly told me that Ubuntu was far too memory hungry. Moreover, I don't like this kind of bloated distribution; they somewhat remind me of Windows, where everything is done behind the scenes without giving me any choice unless I really dig deep to understand how things work together. So I forgot Ubuntu and kept on with Xandros.

Then I read that Asus was working with Debian in order to maybe replace Xandros on the EeePC some day. This caught my full attention, as it implied that Debian's EeePC support should be very good. What finally convinced me to give Debian Eee a shot, despite my disappointing experience with Debian on my girlfriend's laptop last year, was this post from the Debian Eee PC Team, which looked quite encouraging. Additionally, Debian is one of the cleanest distributions; or rather I should say it is one of the least messy ;-) (hey, it's Linux!).

I fetched the Debian Eee's WPA Installer and spread it on a USB key. I could install Debian flawlessly through my WPA/TKIP router. I went for a 256 MB swap partition and all the remaining space as one big partition, and asked for a desktop installation, hoping to find Xandros' much-missed usability again.

And I am very happy. Given this is based on Debian testing, all packages are fairly up to date: Firefox 3 is here! This is a great improvement because it can achieve true full screen like Opera; it's damn important because the EeePC 701's screen is really small for web surfing. All devices seem to be supported, although I haven't tested all of them; some shortcut keys are working (contrast keys for example), but volume keys don't. But honestly it's not a big deal compared to what I gained, and hopefully it will be resolved soon.

The "hardest" task was to make the Huawei modem USB key work. The current kernel (2.6.25...something) is supposed to support it, but anyway you have to somehow manage to enter the PIN code because there is a SIM card in it. Fortunately, the Vodafone Mobile Connect Card driver for Linux (wow, what a name!) handles it perfectly: just be sure to download the i386 installer, as it is not available as a package at the time of writing. You just have to run it and tell it that your user must belong to the "vmc" group. Then don't forget to log out so that the "vmc" group membership takes effect, and run "vodaphone-mobile-connect-card-driver-for-linux" (I'm not kidding). Edit the profile, and change the username, password and APN host to "slsfr" and the DNS servers to "172.20.2.10" and "172.20.2.39" as noted in this French forum post. And voila! You can connect to the Internet over 3G+ and even read the SMS and directory stored on the SIM card!

Ok, it's not a package and it spreads some files around but... it works! And if you ever want to remove it, you can still follow the installation script to know what has been copied (/etc/udev/rules/ and /usr/bin mainly, if I recall correctly). The graphical interface is heavy and I would have preferred a neat command-line tool, but I won't complain more. In open-source, if you want it, just code it! :-)

In summary, if you're fed up with your Xandros, go for Debian/Eee! (It's hard to say when you are a BSD guy ;p.)

Monday, June 23, 2008

Chicken and egg problem with Propolice in runtime linker/loader

Some background first: Back in 2006, I was frustrated because FreeBSD was somewhat lagging behind other open-source operating systems in terms of integrated security features. One of them is a GCC extension originally named Propolice, or SSP for Stack Smashing Protection. As its name suggests, it protects (very efficiently) against stack-based buffer overflows. Historically, Propolice was developed by Hiroaki Etoh at IBM for gcc-2.95.3 and then gcc-3.4.4 as an external patch, but it has now been included in mainline GCC, starting with gcc-4.1. The patch to integrate Propolice in FreeBSD has been available on my website for more than two years, but back then FreeBSD only provided gcc-3.4.4, and heavily patching contributed software is ruled out by policy, so it couldn't be committed to FreeBSD-6. I missed the FreeBSD-7 window for various reasons, and now I'm working to get it committed to FreeBSD-8 (aka CURRENT).

How does Propolice work? The compiler identifies functions that might be vulnerable (containing a stack based buffer) and during their prologue, pushes a one-word canary between the return address stored in the stack and the local variables. In the function's epilogue, the canary is checked against its original value and if it has changed then a buffer overflow occurred and the program is aborted. The canary is initially in the BSS segment but is initialized to a random value by a function called during the program startup (namely, a constructor). Both the canary and the initializer function are provided in FreeBSD's libc.

When I sent the patch for review back in April, Antoine Brodin noticed that when the world is built with -fstack-protector-all (which makes GCC protect all functions instead of only those containing a local buffer), it breaks the whole system. There were actually various problems, such as the initializer function being protected itself: during its prologue the canary was equal to zero, but by the time of the epilogue it had been set to a random value, so obviously the saved value didn't match... This problem was resolved quickly. The nasty problem lay in the runtime loader (aka rtld-elf): once it was installed, all programs would fail with SIGSEGV.

When a dynamically-linked program is run, the kernel always transfers control to rtld behind the scenes, instead of the actual program. The purpose is to do the runtime linking of the libraries needed by the program, which includes resolving symbols and performing relocations, before actually transferring control to it. So I recompiled rtld without SSP, but it was still crashing. I narrowed down the segfault to a call to mmap(2), which turned out to be the first call into libc, against which rtld is statically linked. One of the very first things rtld has to do is to relocate itself, mainly to be able to access global data, which is addressed through the GOT (Global Offset Table). This was the very problem. Given that all libc functions were protected with Propolice, mmap(2)'s prologue tried to push the canary, which is accessed through the «__stack_chk_guard» global symbol. This means it used a pointer from the GOT, which had not been initialized at this point.

As an additional note (and a reminder for me ;p), I came to think that the problem could also arise in the canary initializer, which lives in rtld's .init section. After some thought, I realized that the usual .init and .fini sections are handled by rtld itself, so I think rtld's own ones are actually never run.

Obviously rtld must be compiled without SSP. As a temporary solution, libc is not allowed to be compiled with -fstack-protector-all. I think a better solution would be to create a librtld containing the symbols required by rtld and compiled without SSP.

Sharp minds have certainly understood that if the original patch worked without -fstack-protector-all it was just a matter of chance because no functions during relocation of rtld's GOT entries had been elected by GCC to be protected.

Wednesday, March 12, 2008

Sketching Subversion

Subversion is easy to grasp for long-time CVS users, since it is conceptually very similar. The intent when creating Subversion was simply to create "a better CVS" as they state on their frontpage. I think they succeeded with regard to this.

Revisions
First, each commit is atomic and revision numbers are incremented for each commit repository-wide, whereas in CVS revision numbers are incremented for each commit file by file. This means each revision number represents a state of the whole filesystem tree. With CVS, if you want to get the diff of a commit, you can only approximate it with dates. In SVN, you simply ask for the diff between two revisions.

The puzzling side of this is that two successive revisions of a file may not have successive revision numbers. For example, I've recently worked on a script stored in a repository shared by multiple developers, and here are its successive revision numbers:

Externals
Another feature worth noting is called "externals". Each repository entry, file or directory, may carry metadata as name/value pairs, so-called "properties". Their name may be whatever you want, for instance to carry a copyright or even the full license text, but a few have a special meaning. I've only used the one named "svn:externals", which is extremely useful. I once worked on a CVS repository containing multiple projects and a couple of home-grown libs shared among them. In order to build a project, we had to write some shell script glue that checked out the required libs automatically into the project directory before building. As far as I know, there is no other way to work around this problem in CVS (let me know if there is one). This is exactly the problem that externals solve. By adding a couple of svn:externals properties to your project directory, you compel Subversion to pull down the required libraries along with your project. Supposing your repository is laid out like this:


    myproject/
    lib/mylib/

You can add a "svn:externals" property to the "myproject" directory containing:


    mylib /path/to/your/repository/lib/mylib

And the next time you check out or update your project, the mylib/ subdirectory will automatically be created. You can even use it as a working copy of lib/mylib/ and perform commits in it!

Branches and all
Subversion only knows about "cheap copies" of a directory or file somewhere else in the repository: no tag, no branch. Let me explain. Newly copied entries share their history with their ancestor, hence the cheap copy. From here onward, commits on one copy or the other won't be shared... I'm sure the concept of branch is already looming in your mind :-). You may also have already figured out that branches are addressed through the repository namespace: you don't need an extra piece of information as in CVS ("-r BRANCH"). This is why Subversion repositories are commonly laid out like this (this is the layout advised in the SVN documentation, though this is really a matter of policy):


    myproject/
        trunk/
        branches/
            1.0/
            1.1/
            2.0/

You may now wonder how to create a simple tag (i.e. not a branch tag, to use CVS terminology), unless you are especially keen and have already grasped the whole thing. Actually, there is no difference between a branch and a mere tag, except that you haven't performed any commits in the latter. A good practice with CVS when you want to create a new branch for a given release is to first tag and then branch, so you can address the exact branching point using the tag. In Subversion, a cheap copy creates a new revision number, so you only have to dig through the log to find it and ask for the diff.

This also means you can easily move a file in the repository without losing its history, by copying it and then deleting the ancestor. With CVS, this required a so-called "repo-copy", i.e. duplicating the RCS file on the repository side, which is somewhat a waste of space.

To sum up the whole thing, you have a namespace in which you can do cheap copies which will share their history up to the copy revision. Everything else is just a matter of policy.

Conclusion
I really like SVN, for all the aforementioned things.
But I won't ever blame CVS for its weaknesses. It was designed more than twenty years ago and relies on the RCS format, which was designed in the early eighties. It works so well that numerous projects are still using it, notably the FreeBSD project, which is known to have the biggest open-source repository ever.

Thursday, February 21, 2008

FreeBSD textdump(4) is awesome

FreeBSD has had the reputation of being rock solid for a long time. One of the reasons for this is that FreeBSD provides a great number of powerful debugging tools.

Especially, when your kernel panics, you have three options:

Live debug with ddb(4), but this is not always possible if the box has to be back up quickly.

Dump the memory to perform post-mortem analysis.

Do nothing and pray that the panic won't happen again too soon.

Memory dumps use the swap device. This is perfectly legal because once your OS has crashed, you won't do anything with the data in the swap anyway. On the next reboot, savecore(8) checks if the swap partition contains a memory dump and copies it into a file in /var/crash.

In the beginning, only full memory dumps were possible. In this case, if you have 1 GB of RAM, you need a swap partition of at least 1 GB too, and the file in /var/crash is just as big. This worked well, but given that most users are not kernel developers, kernel dumps are usually useless unless they are transmitted to the right people. But a 1 GB file is cumbersome.

In April 2006, Peter Wemm introduced minidumps. They are very similar to full dumps except that, from what I've understood, only the kernel memory is dumped. Typically, on my laptop with 1 GB of RAM, minidumps took about 150 MB. The problem, while lessened, was still there though.

A couple of weeks ago, Robert Watson committed a new feature called textdump(4) in FreeBSD 8.0-CURRENT. Basically, this is possible because of two new features of ddb(4):

It is possible to define "scripts" (no loop or condition, only a sequence of commands), with certain special names corresponding to events.

ddb(4) output can be captured in an internal buffer and dumped in place of the memory.

In this post, Robert Watson gives a lot of information about textdumps. I strongly advise you to read it. The very important thing is that most panics reported by users can be solved with a backtrace and a couple of DDB commands. This is precisely what this feature achieves. Moreover, textdumps rarely exceed one megabyte, which is far more convenient than dumps or minidumps and can easily be sent by e-mail.

Moreover, users running FreeBSD as a desktop obviously run X.org. When a panic arises, it is not possible to go back to console mode, so ddb(4) is not accessible. If you've asked your kernel to drop to ddb(4) on panic as I did, the kernel dump is not performed automatically and you're screwed. Textdumps remove this thorn from your foot.

Now let's see how to use them. FreeBSD automatically configures (mini)dumps for you. Switching to textdumps is possible in a single command:

root# ddb script kdb.enter.panic="textdump set; capture on; show pcpu;trace;show locks;ps;alltrace;show alllocks;show lockedvnods; call doadump"

"kdb.enter.panic" is a script name with a special meaning: as its name suggests, it is automatically executed on panic. The first command, "textdump set", forces the next dump to be the captured ddb(4) output instead of the traditional memory dump. The second one, "capture on"... enables the capture of command output. Next comes a bunch of commonly used ddb(4) commands. The final command, "call doadump", performs the actual dump. If you want to reboot automatically, you can add the "reset" command afterward.

As far as I know, there is no configuration sugar to enable this automatically at boot time, so for now I stuck it in /etc/rc.local.
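For what it's worth, here is roughly how such an /etc/rc.local fragment could look, with one ddb(4) command per line to keep it readable. The helper name ddb_join and the panic_script variable are made up for this sketch, and the guard on the ddb command is only there so the fragment stays harmless on a system without it.

```shell
#!/bin/sh
# Join the arguments with ";", which separates ddb(4) commands.
ddb_join() {
    joined=
    for cmd in "$@"; do
        joined="${joined:+$joined;}$cmd"
    done
    printf '%s\n' "$joined"
}

# Same command list as above, one ddb(4) command per argument.
panic_script=$(ddb_join \
    "textdump set" \
    "capture on" \
    "show pcpu" \
    "trace" \
    "show locks" \
    "ps" \
    "alltrace" \
    "show alllocks" \
    "show lockedvnods" \
    "call doadump")

# Only register the script where the ddb command is available.
if command -v ddb >/dev/null 2>&1; then
    ddb script "kdb.enter.panic=$panic_script"
fi
```

The commands are joined without spaces to keep the script as short as the one-liner above; adding "reset" as a last argument would reboot the box after the dump.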

Sunday, January 27, 2008

Quick HOWTO for building Xen 3.2 on Debian/Ubuntu

I finally carried out "make world" with Xen 3.2 on Debian/sid after much struggle.

First, contrary to xen-3.1.0-src.tgz, xen-3.2.0.tar.gz doesn't come with the linux-2.6-xen-sparse/ and patches/ directories, which allow building a «xenified» kernel from a vanilla kernel source. Thus it is impossible to use make world XEN_LINUX_SOURCE=tarball.

By default, make world will use Mercurial to pull down (or «clone» in Mercurial vocabulary) the xenified kernel from Xensource's Mercurial repository. Unfortunately, it seems that the current Mercurial version shipped with Debian/Ubuntu is outdated and cannot be used out-of-the-box. Nonetheless, it is possible to fetch the xenified kernel manually.

From my understanding, it is necessary to run "make prep-kernels" in order to create the kernel build directory. Indeed, if you put your .config file directly into the kernel tree itself, the kernel's build system will complain about the lack of cleanliness and will ask you to run "make mrproper". This is baffling, but it appears that whenever the kernel is asked to store the object files in a separate directory (namely build-linux-2.6.18-xen_x86_32/), it makes sure you didn't create your .config file in the wrong directory. I suppose this is a safeguard.

So I came up with the following process to build Xen 3.2.


root# mkdir build
root# cd build
root# wget http://bits.xensource.com/oss-xen/release/3.2.0/xen-3.2.0.tar.gz
root# tar xzf xen-3.2.0.tar.gz
# Download the xenified kernel tree manually, but NOT in xen-3.2.0/
# because the buildconfig/select-repository script would skip it.
# ! xen-3.2.0/ and linux-2.6.18-xen.hg/ must be at the same level !
root# hg clone http://xenbits.xensource.com/linux-2.6.18-xen.hg
root# cd xen-3.2.0
root# make prep-kernels
root# cp /boot/config-2.6.18-my build-linux-2.6.18-xen_x86_32/.config
# Using the world target will clean everything first.  Don't use it here.
root# make dist

Tuesday, January 22, 2008

Quick HOWTO for building Qumranet's KVM

This post really deserves the name scrawl, but I thought sharing my experience in building a Qumranet KVM snapshot on Linux could be helpful for others. Indeed it was not as straightforward as it seemed at first glance and it required some groping and wandering on Google.

You will need KVM (surprising, isn't it?) and Linux kernel sources corresponding to your kernel.

FWIW, KVM snapshots can be downloaded here.

First, let's prepare the kernel tree. This step is only important for people who use a packaged kernel (IOW, who don't build their kernel themselves and have extracted the kernel source tree for the sole purpose of building KVM). For others, this step can be skipped because the targets we will use are performed implicitly when building the kernel.


root# tar xzf linux-2.6.23.tar.gz
root# cd linux-2.6.23
root# cp /boot/config-2.6.23 .config
root# make oldconfig prepare scripts
root# cd -

Next we will build KVM. Beware of your GCC version! The current major branch is GCC 4 and it is shipped with almost all, if not all, recent Linux distributions. Unfortunately, QEMU (on which KVM's userland is heavily based, not to say they've reused QEMU with few modifications) doesn't build with GCC 4. It requires GCC 3.2 or GCC 3.4 (a thorough explanation is provided in Fabrice Bellard's paper, "QEMU, a fast and portable dynamic translator", USENIX 2005). So you need to install one of these versions of GCC as well. If you are running Debian for instance, this is pretty straightforward:


root# aptitude install gcc-3.4

And you will have a new compiler named gcc-3.4. On other distros, you will have to either find a package of GCC 3 or build it manually as described on Gentoo Wiki.
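
Because the compiler name varies between distributions, the value given to --qemu-cc can be looked up rather than hard-coded. The sketch below assumes candidate names such as gcc-3.4 or gcc34 (the latter is only a guess at what another distro might call it), and pick_qemu_cc is a hypothetical helper.

```shell
# Hypothetical helper: print the first candidate found in PATH.
# Returns non-zero if none of them is installed.
pick_qemu_cc() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

For instance, `qemu_cc=$(pick_qemu_cc gcc-3.4 gcc34 gcc-3.2) || echo "no GCC 3 found" >&2` leaves the chosen name in $qemu_cc, ready to be passed to the configure script.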

Finally you will simply have to build KVM with a few special arguments to the configure script:


root# tar xzf kvm-snapshot-20080117.tar.gz
root# cd kvm-snapshot-20080117
root# ./configure --qemu-cc=gcc-3.4 --kerneldir=$OLDPWD/linux-2.6.23 --prefix=/opt/kvm-snapshot-20080117
root# make all install

And this should work. :-)

As you may have noticed, I installed KVM in /opt in order to avoid messing with your tidy package management system.

Saturday, January 5, 2008

x86 assembly generated by various GCC releases for a classic function

In this new article, I will have a look at the same C source as in my previous article, except that it will be in its own function instead of being inlined in main(). We will see that some oddities spotted previously only occurred because we were in the main() function.

All tests are compiled with the following GCC command-line:

gcc -S -O test.c

The C source file is:


int     
function(int ac, char *av[])
{
        char buf[16];
        
        if (ac < 2)
                return 0;
        strcpy(buf, av[1]);
        return 1; 
}

int     
main(int ac, char *av[])
{

        function(ac, av);
        return 0;
}
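
To compare several releases, the command line above can be run once per compiler. Here is a small sketch, assuming the compilers are installed under names like gcc-3.4 and gcc-4.2. These names and the helpers asm_output_name and dump_asm are illustrative and not part of the original tests.

```shell
# Hypothetical helper: derive an output file name from the compiler
# name and the source file, e.g. gcc-3.4 + test.c -> test-gcc-3.4.s
asm_output_name() {
    printf '%s-%s.s\n' "${2%.c}" "$1"
}

# Hypothetical helper: run "<compiler> -S -O" on the source file for
# each compiler given, skipping those that are not installed.
dump_asm() {
    src=$1
    shift
    for cc in "$@"; do
        if command -v "$cc" >/dev/null 2>&1; then
            "$cc" -S -O -o "$(asm_output_name "$cc" "$src")" "$src"
        else
            echo "skipping $cc: not found" >&2
        fi
    done
}
```

Running `dump_asm test.c gcc-3.4 gcc-4.2` would leave one .s file per available compiler, which makes the listings easy to diff.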

Expectation, GCC 2.8.1 and GCC 2.95.3

The expectation, GCC 2.8.1 and GCC 2.95.3 assembly versions are the same as in the previous article. The stack frames are therefore identical too.

GCC 3.4.6

Fortunately, the assembly code generated by GCC 3.4.6 is far less puzzling when the C source code lives in a plain function instead of main(). Actually, the code is nearly identical to the one generated by GCC 2.95.3. The stack pointer has been aligned on a 16-byte boundary in the main() function.


function:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp        /* Alloc a 24 bytes buffer */
        movl    $0, %eax
        cmpl    $1, 8(%ebp)
        jle     .L1
        subl    $8, %esp         /* Alloc an unused 8 bytes */
                                 /* buffer */
        movl    12(%ebp), %eax
        pushl   4(%eax)
        leal    -24(%ebp), %eax  /* The 24 bytes buffer is */
                                 /* used for strcpy() */
        pushl   %eax
        call    strcpy
        movl    $1, %eax
.L1:    
        leave
        ret

The corresponding stack frame is similar to the GCC 2.95.3 one, except that the entire 24-byte buffer is provided to strcpy():


       |   av   |
       |   ac   |
       |   ret  |
 ebp-> |  sebp  |
       |/ / / / | ^
       | / / / /| |
       |/ / / / | |
       | / / / /| | buf, 24 bytes wide
       |/ / / / | |
       | / / / /| v
       |\\\\\\\\| ^
       |\\\\\\\\| v 8 unused bytes
       |  av[1] |
 esp-> |  &buf  |

GCC 4.2.1

Astonishingly, GCC 4.2.1 does not keep the 16-byte alignment, although it seemed to do so in the main() function.


function:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp        /* Alloc a 24 bytes buffer */
        movl    $0, %eax
        cmpl    $1, 8(%ebp)
        jle     .L4
        movl    12(%ebp), %edx
        movl    4(%edx), %eax
        movl    %eax, 4(%esp)    /* Fake push */
        leal    -16(%ebp), %eax  /* A 16 bytes buffer is */
                                 /* used for strcpy() */
        movl    %eax, (%esp)     /* Fake push */
        call    strcpy
        movl    $1, %eax
.L4:
        leave
        ret

And now the stack frame:


       |   av   |
       |   ac   |
       |   ret  |
 ebp-> |  sebp  |
       |/ / / / | ^ ^
       | / / / /| | |
       |/ / / / | | | buf, 16 bytes wide
       | / / / /| | v
       |  av[1] | |
 esp-> |  &buf  | v

The stack frame looks exactly like the expectation. One thing worth noting, however, is that GCC 4.2.1 always uses peculiar code for argument storage. Instead of using the push instruction, it reserves space for the arguments of later function calls at the same time as local variables are allocated. Arguments are then stored relative to %esp.
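
The two listings can also be compared by counting how many bytes lie between %ebp and %esp when strcpy() is called. The figures below are read straight from the assembly above, and bytes_below_ebp is just a hypothetical helper to make the arithmetic explicit.

```shell
# Hypothetical helper: bytes between %ebp and %esp at the call site.
#   $1: bytes reserved by the first subl
#   $2: bytes reserved by an additional subl (0 if none)
#   $3: number of 4-byte pushl instructions before the call
bytes_below_ebp() {
    echo $(( $1 + $2 + 4 * $3 ))
}

bytes_below_ebp 24 8 2   # GCC 3.4.6: subl $24, subl $8, two pushl
bytes_below_ebp 24 0 0   # GCC 4.2.1: subl $24 only, arguments stored with movl
```

This prints 40 for GCC 3.4.6 and 24 for GCC 4.2.1, which matches the ten and six 4-byte rows drawn below sebp in the two stack frames.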
