VOICE Home Page: http://www.os2voice.org
By Michal Necasek ©November 2000
First off, I'll finally explain what those acronyms are: LVM stands for Logical
Volume Manager and JFS is a Journaled File System. Not much clearer, is it? It will
be later - I hope.
LVM and JFS didn't originate on OS/2. They were created for AIX, IBM's high-end
Unix clone running on IBM's RS/6000 hardware. For the users that means that all
the really nasty bugs were ironed out long ago.
The role of LVM is to present a simple logical view of the underlying physical storage
space, i.e. the hard drive(s). LVM manages individual physical disks or, to be more precise,
the individual partitions present on them (for a short glossary of terms, look at the end
of the article). LVM hides the number, size and location of physical partitions from users.
Instead, it presents the concept of a logical volume. A logical volume
may correspond to a single physical partition (although that almost defeats the purpose
of LVM), but it doesn't have to. One volume may be composed of several partitions
located on multiple physical disks. Not only that, volumes can even be extended
(but not shrunk; people usually want more space, not less). They can even be extended
while the OS is running and the filesystem is being accessed! Of course, most home
and SOHO users don't have the hardware required for this.
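To make the idea of a volume spanning several partitions more concrete, here is a minimal Python sketch. It assumes that the partitions are simply concatenated one after another; the class name, method names and partition labels are invented for illustration and have nothing to do with the real LVM code or on-disk format.

```python
from bisect import bisect_right
from itertools import accumulate


class LogicalVolume:
    """Toy model: a volume is an ordered list of (partition name, size in bytes)."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self._rebuild()

    def _rebuild(self):
        # ends[i] is the logical offset one past the last byte of partition i.
        self.ends = list(accumulate(size for _, size in self.partitions))

    @property
    def size(self):
        return self.ends[-1] if self.ends else 0

    def extend(self, name, size):
        # Growing the volume only appends space; existing offsets keep their mapping.
        self.partitions.append((name, size))
        self._rebuild()

    def locate(self, offset):
        """Translate a logical byte offset into (partition name, offset inside it)."""
        if not 0 <= offset < self.size:
            raise ValueError("offset outside the volume")
        index = bisect_right(self.ends, offset)
        start = self.ends[index - 1] if index else 0
        return self.partitions[index][0], offset - start


# Hypothetical partitions on two disks, later extended with a third one.
vol = LogicalVolume([("disk1-part2", 1000), ("disk2-part1", 500)])
print(vol.locate(999))    # last byte of the first partition
print(vol.locate(1000))   # first byte of the second partition
vol.extend("disk3-part1", 2000)
print(vol.size)
```

The user of the volume only ever sees one range of offsets. Extending the volume adds space at the end, which hints at why growing is so much easier than shrinking.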
The more experienced readers are now probably wondering how 'traditional' file
systems like FAT or HPFS could be extended at runtime. The answer is, they can't.
To take full advantage of LVM, it is necessary to use a filesystem designed for
it. This file system is of course JFS. JFS is not really tied to LVM; both LVM and
JFS can exist separately. But only when working in concert can both reach their
full potential.
The entire file system space is divided into logical blocks that contain
file or directory data. For JFS, the logical blocks are always 4096 bytes (4K) in
size, but can be optionally subdivided into smaller fragments (512, 1024 or 2048
bytes).
An i-node is a logical entity that contains information about a file or
directory. There is a 1:1 relationship between i-nodes and files/directories. An
i-node contains file type, access permissions, user/group ID (UID/GID - unused on
OS/2), access times and points to actual logical blocks where file contents are
stored. The maximum file size allowed in JFS is 2TB (HPFS and FAT allow 2GB max).
It should be noted that the number of i-nodes is fixed. It is determined at file
system creation (FORMAT) time and depends on fragment size (which is user selectable).
In theory users could run out of i-nodes, meaning that they would be unable to create
more files even if there was enough free space. In practice this is extremely rare.
Fragments were already briefly mentioned in the discussion of logical
blocks. The JFS logical block size is fixed at 4K. This is a reasonable default,
but it means that the file system cannot allocate less than 4K for file storage.
If a file system stores large numbers of small files (< 2K), the wasted disk space
becomes significant. We have all come to know and hate this problem from FAT (a cluster
size of 32K leads to a massive waste of space, in some cases over 50%). JFS attacks
this by allowing fragmentation of logical blocks into smaller units, as small as
512 bytes (this is the sector size on hard drives and it is not possible to read or write
less than 512 bytes from/to disk). However, users should be careful because fragmentation
incurs additional overhead and hence slows down disk access. I would recommend using
fragments smaller than 4K only when the users know for sure that they will store
very large numbers of small files on the file system.
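The arithmetic behind the wasted space is easy to check. The following Python sketch is a simplification: it assumes space is handed out in whole units and ignores i-nodes and other metadata. The 700-byte file is just a made-up example.

```python
def allocated_size(file_size, unit):
    """Bytes consumed on disk when space is handed out in whole units."""
    if file_size == 0:
        return 0
    units = -(-file_size // unit)   # ceiling division
    return units * unit


def wasted(file_size, unit):
    """Slack space left over in the last allocation unit."""
    return allocated_size(file_size, unit) - file_size


file_size = 700  # a hypothetical small file
for unit in (512, 1024, 2048, 4096, 32 * 1024):
    print(f"{unit:>6}-byte units: {wasted(file_size, unit):>6} bytes wasted")
```

With 4K blocks the 700-byte file wastes 3396 bytes, with 512-byte fragments only 324, and with 32K FAT clusters 32068 bytes.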
The entire JFS volume space is subdivided into allocation groups. Each
allocation group contains i-nodes and data blocks. This enables the file system
to store i-nodes and their associated data in physical proximity (HPFS uses a very
similar technique). The allocation group size varies from 8MB to 64MB and depends
on fragment size and number of fragments it contains.
JFS uses a special log device to implement a circular journal. On AIX, several
JFS volumes can share a single log device. I'm not sure this is possible on OS/2;
I believe each JFS volume (corresponding to a drive letter) has its own 'inline'
log located inside the JFS volume, with a size that is selectable at FORMAT time.
It is important to note that JFS does not log (or journal) everything.
It only logs changes to file system meta-data. Simply speaking, the log
contains a record of changes to everything in the file system except actual file
data, i.e. changes to the superblock, i-nodes, directories and allocation structures.
It is clear that there must be some overhead here and indeed, performance may suffer
when applications are doing lots of synchronous (uncached) I/O or creating and/or
deleting many files in a short amount of time. The performance loss is, however, not
noticeable in most cases and is well worth the increased safety.
The log (or journal) occupies a dedicated area on disk and is written to immediately
when any meta-data change occurs. When the disk becomes idle, the actual file system
structure is updated according to the log. After a crash, all it usually takes to
restore the file system to full consistency is replaying the log; i.e. performing
the recorded transactions. Of course, if a process was in the middle of writing
a file when the system crashed or power died, the file could be inconsistent (the
app might not be able to read it again), but you will not lose this file nor
other files, as is often the case with other file systems.
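The log-then-apply cycle described above can be sketched in a few lines of Python. This is a toy model under heavy assumptions: the class, its methods and the example keys are invented for illustration, and the real JFS log format and recovery code are far more involved.

```python
class ToyJournaledFS:
    """Metadata-only journaling in miniature; file data is never logged."""

    def __init__(self):
        self.log = []        # stands in for the on-disk journal area
        self.metadata = {}   # stands in for on-disk i-nodes, directories, etc.

    def change_metadata(self, key, value):
        # Step 1: the intended change is recorded in the log right away.
        self.log.append((key, value))

    def checkpoint(self):
        # Step 2: when the disk is idle, apply logged changes to the real
        # structures and then discard the applied records.
        self.replay()
        self.log.clear()

    def replay(self):
        # Replaying is idempotent: applying a record twice gives the same state.
        for key, value in self.log:
            self.metadata[key] = value


fs = ToyJournaledFS()
fs.change_metadata("inode 7 size", 4096)
fs.change_metadata("dir /home entry", "report.txt -> inode 7")
# Simulated crash: checkpoint() never ran, so the metadata is still stale.
assert fs.metadata == {}
fs.replay()   # recovery after reboot
print(fs.metadata)
```

Note that nothing in this model protects the contents of report.txt itself, which matches the point made above: the file system structure comes back consistent, while a half-written file may not.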
FDISK.COM has been replaced by LVM.EXE and FDISKPM.EXE has been
replaced by LVMGUI.CMD. Please use one of these utilities.
It should be noted here that LVMGUI is a GUI app (as the name implies)
and requires Java, while LVM is a VIO app and can be run from a command
line boot. It looks and feels similar to FDISK, but it presents two views:
logical and physical. FDISK didn't differentiate between
the two. These views correspond to the concepts described at the beginning of this
article. Basically, the physical view shows physical disks and lets users manage
partitions, while the logical view presents volumes. One important concept must be introduced
here, and that is a compatibility volume. A compatibility volume corresponds
to old FDISK partitions. During WSeB installation, the installer automatically converts
all existing partitions to compatibility volumes. This conversion technically means
that the installer writes a special block of LVM data to the sector following the
partition table. OSes other than WSeB won't see any difference at all. It is, however,
necessary to manage all partitions/volumes exclusively with LVM after
this conversion.
I've captured several screenshots of LVM and LVMGUI to give
users unacquainted with LVM some idea of what they can expect. First, there's the
logical view of LVM:
Now there's the physical view of the same system.
And finally a glance at LVMGUI. It looks pretty cool but takes ages
to start. Personally I prefer the VIO version. Disk 3 is a ZIP-100 by the way and
G: is a FAT32 partition.
All FAT, HPFS, FAT32 etc. partitions can reside on either compatibility or LVM
volumes; however, other OSes will only be able to access them on compatibility volumes.
JFS, on the other hand, must be created on LVM volumes. Those were already
described above and enjoy all the flexibility of LVM, such as spanning multiple
physical disks or online expansion.
Each volume, compatibility or LVM, represents a single drive letter on an OS/2
system. LVM, however, is significantly more flexible than FDISK because the
drive letters are not assigned by a fixed algorithm. Instead, users can assign arbitrary
drive letters to volumes. The drive letters can even be changed at runtime, but
users have to understand the dangers before doing that. If you reassign the drive
letter of the boot volume, it doesn't take a genius to understand that a system
crash will be the most likely result.
Characteristic | Journaled File System (JFS) | 386 High Performance File System (386HPFS) | High Performance File System (HPFS) | FAT File System
Max volume size | 2TB (terabytes) | 64GB (gigabytes) | 64GB (gigabytes) | 2GB (gigabytes)
Max file size | 2TB (terabytes) | 2GB (gigabytes) | 2GB (gigabytes) | 2GB (gigabytes)
Allows spaces and periods in file names | Yes | Yes | Yes | No (8.3 format)
Standard directory and file attributes | Within file system | Within file system | Within file system | Within file system
Extended Attributes (64KB text or binary data with keywords) | Within file system | Within file system | Within file system | In separate file
Max path length | 260 characters 1) | 260 characters | 260 characters | 64 characters
Bootable | No 2) | Yes | Yes | Yes
Allows dynamic volume expansion | Yes | No | No | No
Scales with SMP | Yes | No | No | No
Local security support | No | Yes | No | No
Average wasted space per file | 256 to 2048 bytes | 256 bytes | 256 bytes | 1/2 cluster (1KB to 16KB)
Allocation information for files | Near each file in its i-node | Near each file in its FNODE | Near each file in its FNODE | Centralized near volume beginning
Directory structure | Sorted B+tree | Sorted B-tree | Sorted B-tree | Unsorted linear, must be searched exhaustively
Directory location | Close to files it contains | Near seek center of volume | Near seek center of volume | Root directory at beginning of volume; others scattered
Write-behind (lazy write) | Optional | Optional | Optional | Optional
Maximum cache size | Physical memory available | Physical memory available | 2MB | 14MB
Caching program | None (parameters set in CONFIG.SYS) | CACHE386.EXE | CACHE.EXE | None (parameters set in CONFIG.SYS)
LAN Server access control lists | Within file system | Within file system | In separate file (NET.ACC) | In separate file
1) JFS stores file and directory names in Unicode. This allows JFS to
always maintain proper sort order, regardless of the active codepage.
2) This is not a permanent limitation. It is just that no one has written a JFS micro- and
mini-IFS yet.
It might perhaps interest some users that JFS also seems to have built-in support
for DASD limits. I have, however, never tried to use this feature.
DASD limits, aka the Directory Limits feature of LAN Server, allow administrators to
control how much space a directory can take, effectively enabling them to limit
the disk space usage of users. Previously this feature only worked on HPFS386 volumes.
Obviously this is of no use to home users who have all their disk space to themselves,
but it can be very useful for system administrators.
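To illustrate the general idea of a directory limit (not how LAN Server or JFS actually implement it, which I have not examined), here is a small Python sketch. The function names, the paths and the limit are all made up, and the paths are written Unix-style purely for convenience.

```python
from pathlib import PurePosixPath


def directory_usage(file_sizes, directory):
    """Sum the sizes of all files located in or below `directory`."""
    root = PurePosixPath(directory)
    return sum(size for path, size in file_sizes.items()
               if root in PurePosixPath(path).parents)


def write_allowed(file_sizes, limits, new_path, new_size):
    """Refuse a new file if it would push a limited parent directory over its limit."""
    for directory, limit in limits.items():
        if PurePosixPath(directory) in PurePosixPath(new_path).parents:
            if directory_usage(file_sizes, directory) + new_size > limit:
                return False
    return True


# Hypothetical file sizes in bytes and a 1000-byte limit on one user's directory.
files = {"/users/anna/a.txt": 600, "/users/anna/docs/b.txt": 300, "/users/bob/c.txt": 50}
limits = {"/users/anna": 1000}
print(write_allowed(files, limits, "/users/anna/new.txt", 100))
print(write_allowed(files, limits, "/users/anna/new.txt", 101))
print(write_allowed(files, limits, "/users/bob/new.txt", 5000))
```

Anna already uses 900 bytes, so a 100-byte file still fits while a 101-byte file is refused; Bob has no limit set and is not affected.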
Parting note: Everything said here about WSeB will equally apply to eCS.