Last modified 08 Aug 1996
JAR File Format Specification version 1.0
Abstract
This specification defines a general purpose, compact archive format
for packaging the components of a Java application. The JAR format
supports Unicode names for entries, as well as a CRC for detecting
data corruption. The format is also designed to be stream based, so that
a JAR file can be created on any Java output stream, and likewise read
from any Java input stream. Additionally, an optional directory can be
included for random access to JAR file entries.
1. Introduction
This specification defines a general purpose data archive format that:
- Is very compact, with little overhead in terms of headers and optional
data
- Supports Unicode names in entry headers
- Is independent of both the CPU and operating system
- Allows the specification of additional data in headers for such purposes
as code signing
2. Specification
2.1. Conventions
The types u1, u2, and u4 represent unsigned
8-, 16-, and 32-bit integer values, respectively. All 16-bit and 32-bit
quantities are represented in network (big-endian) order, where the high
byte comes first. The type utf represents a Java UTF format string
as handled by the classes java.io.DataInputStream and
java.io.DataOutputStream.
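As an illustration of these conventions, the following minimal sketch reads each primitive type with java.io.DataInputStream, which already reads multi-byte values in big-endian order. The class JarPrimitives and its method names are hypothetical and are not defined by this specification.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class; the name and methods are not part of this specification.
final class JarPrimitives {
    // u1: unsigned 8-bit value, returned as an int in the range 0..255.
    static int readU1(DataInputStream in) {
        try {
            return in.readUnsignedByte();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // u2: unsigned 16-bit value, high byte first.
    static int readU2(DataInputStream in) {
        try {
            return in.readUnsignedShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // u4: unsigned 32-bit value, high byte first. Java has no unsigned int,
    // so the value is widened to a long and masked.
    static long readU4(DataInputStream in) {
        try {
            return in.readInt() & 0xFFFFFFFFL;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // utf: a 2-byte length followed by the string in Java's modified UTF-8.
    static String readUtf(DataInputStream in) {
        try {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing is symmetric: java.io.DataOutputStream provides writeByte, writeShort, writeInt, and writeUTF with the same byte order.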
The JAR format is described here using a C-like structure notation,
where successive fields appear in the structure sequentially without padding
or alignment. Additionally, header fields followed by (optional)
are meant to be optional depending on the value of the header flag byte.
2.2. Overview
A JAR file begins with a main header, followed by zero or more JAR
entries, an optional directory, and an end header. A JAR entry consists
of an entry header immediately followed by the entry data stored
in the ZLIB compressed data format (see references [1]
and [2] for more information on the ZLIB format).
Each header also contains a flag byte that specifies the type of header
as well as any optional fields present in the header. The least significant
2 bits together specify the header type, which can be one of the following:
HEAD_MAIN = 0
HEAD_ENTRY = 1
HEAD_DIR = 2
HEAD_END = 3
The most significant 6 bits specify the optional fields that are
contained in the header.
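The split of the flag byte into a 2-bit header type and 6 option bits can be sketched as follows. The class JarFlags and its method names are hypothetical; only the HEAD_* constants and the bit layout come from the text above.

```java
// Hypothetical helper class illustrating the flag byte layout.
final class JarFlags {
    static final int HEAD_MAIN = 0;
    static final int HEAD_ENTRY = 1;
    static final int HEAD_DIR = 2;
    static final int HEAD_END = 3;

    // The header type is held in the two least significant bits.
    static int headerType(int flags) {
        return flags & 0x03;
    }

    // The six most significant bits, shifted down so that bit 2 of the
    // flag byte becomes bit 0 of the result.
    static int optionBits(int flags) {
        return (flags & 0xFF) >>> 2;
    }

    // Tests a single bit of the flag byte, numbered from 0 (least significant).
    static boolean isSet(int flags, int bit) {
        return ((flags >>> bit) & 1) != 0;
    }
}
```

For example, a flag byte of 0x25 decodes to header type 1 (HEAD_ENTRY) with bits 2 and 5 set.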
Each header can also contain optional extra field data which has the
following format:
extra_data {
    u2 size;
    extra_data_entry entries[];
}
The field size specifies the total number of bytes of extra field
data. It is followed by entries, which contains the extra data itself.
Each entry has the following format:
extra_data_entry {
    u1 type;
    u2 size;
    u1 data[size];
}
The field type indicates the type of entry, the field size
the size of the entry data in bytes, and data the entry data.
Currently, the only recognized extra field data type is:
EDATA_COMMENT = 0
The type EDATA_COMMENT is used to specify an optional
comment for the header.
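The extra field layout can be walked with a simple loop. The sketch below is one possible reading, under the assumption that extra_data.size counts only the bytes of the entries that follow it and not its own two bytes; the specification does not state this explicitly. The class ExtraData and its members are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the extra_data structure.
final class ExtraData {
    static final int EDATA_COMMENT = 0;

    // One extra_data_entry: a type tag plus its raw payload bytes.
    static final class Entry {
        final int type;
        final byte[] data;

        Entry(int type, byte[] data) {
            this.type = type;
            this.data = data;
        }
    }

    // Parses a complete extra_data structure, starting at its u2 size field.
    static List<Entry> parse(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            // Assumption: size covers the entries only, not the size field itself.
            int remaining = in.readUnsignedShort();
            List<Entry> entries = new ArrayList<>();
            while (remaining > 0) {
                int type = in.readUnsignedByte();
                int size = in.readUnsignedShort();
                byte[] data = new byte[size];
                in.readFully(data);
                entries.add(new Entry(type, data));
                remaining -= 3 + size; // 1 byte type + 2 bytes size + payload
            }
            if (remaining != 0) {
                throw new IOException("extra data entries overrun the declared size");
            }
            return entries;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The payload is kept as raw bytes because the specification does not say how an EDATA_COMMENT payload is encoded.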
2.2. Main header
A JAR file begins with a main header, whose structure is given below,
followed by a description of each of the main header fields:
main_header {
    u4 magic;
    u1 flags;
    u2 major_version;
    u2 minor_version;
    extra_data edata; (optional)
    u2 crc;
}
- magic (magic number)
- This has the fixed value of magic = 0xC0C0ADAC and identifies
the file as being in the JAR file format.
- flags (header flags)
- The flag byte indicates the header type and any optional header fields,
and is divided into individual bits as follows:
    bit 0      0
    bit 1      0
    bit 2      FLAG_EXTRA
    bit 3-7    reserved
- major_version (major version)
- This is the major version of the JAR format, and currently has the value
of major_version = 1 to indicate version 1.0.
- minor_version (minor version)
- This is the minor version of the JAR format, and currently has the value
of minor_version = 0 to indicate version 1.0.
- edata (extra data)
- If FLAG_EXTRA is set, then optional extra field data is present
as described above.
- crc (header crc)
- The field crc specifies the 16-bit CRC of the main header
contents. The CRC-16 consists of the two least significant bytes of the
CRC-32 for all bytes of the main header up to but not including the CRC-16
field itself.
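The following sketch builds a version 1.0 main header without extra data and appends the CRC-16 described above. It assumes that the CRC-32 in question is the one implemented by java.util.zip.CRC32 (the same CRC-32 used by ZIP and GZIP); the specification does not name the polynomial. The class MainHeader and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;

// Hypothetical builder and checker for the main header.
final class MainHeader {
    static final int MAGIC = 0xC0C0ADAC;

    // The two least significant bytes of the CRC-32 of data[off..off+len).
    // Assumption: java.util.zip.CRC32 is the intended CRC-32.
    static int crc16(byte[] data, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(data, off, len);
        return (int) (crc.getValue() & 0xFFFF);
    }

    // Builds a main header with FLAG_EXTRA clear, so no edata field is written.
    static byte[] build() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(MAGIC);
            out.writeByte(0);   // flags: HEAD_MAIN, no optional fields
            out.writeShort(1);  // major_version
            out.writeShort(0);  // minor_version
            out.flush();
            byte[] body = buf.toByteArray();
            out.writeShort(crc16(body, 0, body.length));
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Recomputes the CRC-16 over everything before the trailing crc field.
    static boolean verify(byte[] header) {
        int n = header.length - 2;
        int stored = ((header[n] & 0xFF) << 8) | (header[n + 1] & 0xFF);
        return stored == crc16(header, 0, n);
    }
}
```

The same crc16 helper would apply unchanged to the entry, directory, and end headers, since each defines its CRC over all preceding bytes of that header.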
2.3. Entry header and data
The main header is followed by zero or more JAR entries. Each JAR entry
consists of an entry header, immediately followed by the entry data in
the ZLIB compressed data format. The structure of an entry header is
given below, followed by a description of each of the entry header fields:
entry_header {
    u1 flags;
    utf name; (optional)
    u4 size; (optional)
    u4 mtime; (optional)
    extra_data edata; (optional)
    u2 crc;
}
- flags (flag byte)
- The flag byte indicates the header type and optional field information,
and is divided into individual bits as follows:
    bit 0      1
    bit 1      0
    bit 2      FLAG_EXTRA
    bit 3      FLAG_MTIME
    bit 4      FLAG_SIZE
    bit 5      FLAG_NAME
    bit 6-7    reserved
- name (entry name)
- If FLAG_NAME is set, then the field name specifies
the name of the entry, represented as a Java UTF string.
- size (entry data size)
- If FLAG_SIZE is set, then the size field specifies the
total size of the uncompressed entry data in bytes.
- mtime (modification time)
- If FLAG_MTIME is set, then the field mtime specifies
the modification time of the entry, expressed as the number of seconds since
the epoch.
- edata (extra field data)
- If FLAG_EXTRA is set, then optional extra field data is present
as described above.
- crc (header crc)
- The field crc specifies the 16-bit CRC of the entry header
contents. The CRC-16 consists of the two least significant bytes of the
CRC-32 for all bytes of the entry header up to but not including the CRC-16
field itself.
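Reading an entry header means consulting the flag byte before each optional field, in the order the fields appear in the structure. The sketch below assumes that the name field is governed by FLAG_NAME (bit 5) and treats absent fields as null or -1, which is a convention of this example only. The class EntryHeader is a hypothetical name, and the CRC is read but not verified here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical reader for the entry_header structure.
final class EntryHeader {
    static final int FLAG_EXTRA = 1 << 2;
    static final int FLAG_MTIME = 1 << 3;
    static final int FLAG_SIZE = 1 << 4;
    static final int FLAG_NAME = 1 << 5;

    String name;      // null when FLAG_NAME is clear
    long size = -1;   // -1 when FLAG_SIZE is clear
    long mtime = -1;  // -1 when FLAG_MTIME is clear
    byte[] extra;     // raw extra data entries, null when FLAG_EXTRA is clear
    int crc;          // stored CRC-16, not verified by this sketch

    static EntryHeader read(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            EntryHeader h = new EntryHeader();
            int flags = in.readUnsignedByte();
            if ((flags & 0x03) != 1) {
                throw new IOException("not an entry header");
            }
            // Optional fields appear in structure order: name, size, mtime, edata.
            if ((flags & FLAG_NAME) != 0) {
                h.name = in.readUTF();
            }
            if ((flags & FLAG_SIZE) != 0) {
                h.size = in.readInt() & 0xFFFFFFFFL;
            }
            if ((flags & FLAG_MTIME) != 0) {
                h.mtime = in.readInt() & 0xFFFFFFFFL;
            }
            if ((flags & FLAG_EXTRA) != 0) {
                h.extra = new byte[in.readUnsignedShort()];
                in.readFully(h.extra);
            }
            h.crc = in.readUnsignedShort();
            return h;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```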
2.4. Directory header and data
The last JAR entry can be followed by an optional directory section that
can be used for random access to JAR entries. The optional directory
consists of a directory header immediately followed by the directory contents
stored in the ZLIB compressed data format. A JAR file can contain only one
directory and it must immediately precede the end header. The structure of
the directory header is given below, followed by a description of each of
the directory header fields:
dir_header {
    u1 flags;
    u4 count;
    extra_data edata; (optional)
    u2 crc;
}
- flags (flag byte)
- The flag byte indicates the header type and optional field information,
and is divided into individual bits as follows:
    bit 0      0
    bit 1      1
    bit 2      FLAG_EXTRA
    bit 3-7    reserved
- count (entry count)
- The field count indicates the total number of entries in the
directory, and must be the same as the total number of JAR file entries.
- edata (extra data)
- If FLAG_EXTRA is set, then extra field data is present as
specified above.
- crc (header crc)
- The field crc specifies the 16-bit CRC of the directory header
contents. The CRC-16 consists of the two least significant bytes of the
CRC-32 for all bytes of the directory header up to but not including the
CRC-16 field itself.
The directory data consists of count headers of the following format.
The headers appear in the same order as the corresponding JAR entries:
dir_entry {
    utf name;
    u4 size;
    u4 mtime;
    u4 head_off;
    u4 data_off;
}
- name (entry name)
- The field name specifies the name of the entry, represented
as a Java UTF string. An empty string indicates that the entry has no name.
- size (entry data size)
- The field size indicates the total number of bytes of uncompressed
entry data.
- mtime (modification time)
- The field mtime indicates the modification time of the entry, or
0 if not specified.
- head_off (entry header offset)
- The field head_off is the offset in bytes of the entry header
from the beginning of the JAR file.
- data_off (entry data offset)
- The field data_off is the offset in bytes of the entry data
from the start of the JAR file.
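Because the directory contents are stored in the ZLIB format, they can be written through java.util.zip.DeflaterOutputStream and read back through java.util.zip.InflaterInputStream, both of which use the ZLIB wrapper by default. The sketch below round-trips a sequence of dir_entry records this way. The class JarDirectory and its members are hypothetical names, and the directory header itself is not handled here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical encoder and decoder for the compressed directory contents.
final class JarDirectory {
    static final class DirEntry {
        final String name;
        final long size, mtime, headOff, dataOff;

        DirEntry(String name, long size, long mtime, long headOff, long dataOff) {
            this.name = name;
            this.size = size;
            this.mtime = mtime;
            this.headOff = headOff;
            this.dataOff = dataOff;
        }
    }

    // Serializes the entries in order and compresses them in the ZLIB format.
    static byte[] compress(DirEntry[] entries) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new DeflaterOutputStream(buf))) {
            for (DirEntry e : entries) {
                out.writeUTF(e.name);
                out.writeInt((int) e.size);     // u4 fields keep the low 32 bits
                out.writeInt((int) e.mtime);
                out.writeInt((int) e.headOff);
                out.writeInt((int) e.dataOff);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.toByteArray();
    }

    // Decompresses the directory contents and reads count dir_entry records.
    static DirEntry[] decompress(byte[] zdata, int count) {
        try (DataInputStream in = new DataInputStream(
                new InflaterInputStream(new ByteArrayInputStream(zdata)))) {
            DirEntry[] entries = new DirEntry[count];
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                long size = in.readInt() & 0xFFFFFFFFL;
                long mtime = in.readInt() & 0xFFFFFFFFL;
                long headOff = in.readInt() & 0xFFFFFFFFL;
                long dataOff = in.readInt() & 0xFFFFFFFFL;
                entries[i] = new DirEntry(name, size, mtime, headOff, dataOff);
            }
            return entries;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

The count argument corresponds to the count field of the directory header, which a complete reader would parse first.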
2.5. End header
Every JAR file includes an end header which has the following structure
and fields:
end_header {
    u1 flags;
    u4 dir_off; (optional)
    u4 dir_size; (optional)
    u4 mtime; (optional)
    extra_data edata; (optional)
    u4 end_off;
    u2 crc;
}
- flags (flag byte)
- The flag byte indicates the type of the header as well as optional field
information, and has the following bits:
    bit 0      1
    bit 1      1
    bit 2      FLAG_EXTRA
    bit 3      FLAG_MTIME
    bit 4      FLAG_DIR
    bit 5-7    reserved
- mtime (modification time)
- If FLAG_MTIME is set, then the mtime field indicates
the last modification time of the archive file, expressed as the number of
seconds since the epoch.
- dir_off (directory offset)
- If FLAG_DIR is set, then an entry directory is present and the
field dir_off indicates the offset in bytes of the directory header
from the start of the JAR file.
- dir_size (directory size)
- If FLAG_DIR is set, then an entry directory is present and the
field dir_size indicates the total size in bytes of the uncompressed
directory data.
- edata (extra field data)
- If FLAG_EXTRA is set, then optional extra field data is present
as described above.
- end_off (end header offset)
- The field end_off specifies the offset in bytes of the end
header from the start of the JAR file, and is used to locate the
optional directory from the end of the JAR file when random access to
JAR file entries is required.
- crc (header crc)
- The field crc specifies the 16-bit CRC of the end header
contents. The CRC-16 consists of the two least significant bytes of the
CRC-32 for all bytes of the end header up to but not including the CRC-16
field itself.
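Since end_off and crc are the last two fields of the end header, a reader with random access can find the directory by working backward from the end of the file. The sketch below assumes that the end header is the final structure in the file, so that end_off occupies the 4 bytes immediately before the trailing 2-byte crc. It operates on an in-memory byte array and therefore does not handle files of 2 GB or more. The class EndHeaderLocator is a hypothetical name, and the CRC is not verified.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that locates the directory via the end header.
final class EndHeaderLocator {
    static final int FLAG_DIR = 1 << 4;

    // Returns the offset of the directory header, or -1 if there is no directory.
    static long findDirectory(byte[] jar) {
        if (jar.length < 7) {
            throw new IllegalArgumentException("too short to contain an end header");
        }
        ByteBuffer buf = ByteBuffer.wrap(jar);      // ByteBuffer is big-endian by default
        // Assumption: the file ends with end_off (4 bytes) followed by crc (2 bytes).
        int endOff = buf.getInt(jar.length - 6);
        if (endOff < 0 || endOff > jar.length - 7) {
            throw new IllegalArgumentException("end_off out of range");
        }
        int flags = buf.get(endOff) & 0xFF;
        if ((flags & 0x03) != 3) {
            throw new IllegalArgumentException("end_off does not point at an end header");
        }
        if ((flags & FLAG_DIR) == 0) {
            return -1;                              // no directory present
        }
        // dir_off is the first optional field, directly after the flag byte.
        return buf.getInt(endOff + 1) & 0xFFFFFFFFL;
    }
}
```

A reader working on a real file could apply the same steps with java.io.RandomAccessFile, seeking to the file length minus 6 to read end_off.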
2.6. Limits
The size of a JAR file, and hence any JAR file entry, is limited to 2^32 bytes.
Additionally, the size of extra field data is limited to 64K bytes.
3. References
[1] Deutsch, L.P., "ZLIB Compressed Data Format Specification",
available at http://quest.jpl.nasa.gov/zlib/rfc-zlib.html
[2] Deutsch, L.P., "DEFLATE Compressed Data Format Specification",
available at http://quest.jpl.nasa.gov/zlib/rfc-deflate.html