Internet-Draft | G3FC File Format | July 2025 |
Guimaraes | Expires 28 January 2026 |
This document provides the complete technical specification for the G3FC (G3 File Container) binary format, version 1.0. G3FC is designed to store multiple files and directories in a single, robust container, which can optionally be split into multiple data blocks. The format includes features for data integrity via checksums, security through authenticated encryption, and resilience via forward error correction (FEC). This specification details the file structure, data types, field layouts, and the algorithms required to implement compatible software for reading and writing G3FC archives.
The reference implementations of the G3FC Archiver Tool are licensed under the GNU General Public License v2.0. This specification document may be freely distributed and used for implementation purposes.
This is the final technical specification for version 1.0.
Copyright (c) 2025 G3Pix - Lucas Guimaraes. All rights reserved.
https://g3pix.com.br/g3fc/
https://github.com/guimaraeslucas/g3fc/
The G3FC (G3 File Container) format provides a structured method for archiving multiple files and directories into a single container or a set of segmented files. It was designed with a focus on robustness, data integrity, security, and failure recovery. The format defines a clear layout with a header, a file index, data blocks, and a footer, allowing for efficient access and manipulation of the contained data.
This specification is intended for developers who need to implement G3FC-compatible tools for creating, reading, or modifying archives.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
A G3FC archive consists of a main index file (with a .g3fc extension) and zero or more data block files (with .g3fc<n> extensions). A non-split archive contains all components within the single .g3fc file.
The logical structure of a non-split archive is as follows:
In a split archive, the .g3fc file contains the Header, File Index, Metadata FEC Block, and Footer. The File Data is stored in separate data block files (e.g., archive.g3fc0, archive.g3fc1).
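As an illustration of the naming scheme, the following Python sketch derives the path of a data block file from the path of the main index file. The helper name `block_path` is hypothetical and not part of this specification. It assumes the block index is written in decimal and appended directly after the `.g3fc` extension, as in the examples above.

```python
from pathlib import Path


def block_path(main_file: str, block_index: int) -> Path:
    """Return the path of data block `block_index` of a split archive.

    Hypothetical helper. ASSUMPTION: the index is appended in decimal
    directly after ".g3fc" (archive.g3fc -> archive.g3fc0, archive.g3fc1).
    """
    main = Path(main_file)
    if main.suffix != ".g3fc":
        raise ValueError("main index file must have a .g3fc extension")
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    # Keep the directory, extend only the file name.
    return main.with_name(f"{main.name}{block_index}")
```

A reader could call `block_path("archive.g3fc", entry["block_file_index"])` to locate the block that holds a given entry's data.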
The Main Header is a fixed-size block of 331 bytes located at the beginning of the .g3fc file. It MUST NOT be compressed or encrypted. The fields are laid out sequentially with 1-byte packing, so there is no padding between fields.
Offset (B) | Size (B) | Data Type | Field Name | Description |
---|---|---|---|---|
0 | 4 | char[4] | magic_number | MUST contain the ASCII characters "G3FC". |
4 | 2 | uint16_t | format_version_major | Major version of the specification. SHALL be 1. |
6 | 2 | uint16_t | format_version_minor | Minor version of the specification. SHALL be 0. |
8 | 16 | byte[16] | container_uuid | A 16-byte UUID (v4 RECOMMENDED) that uniquely identifies the container. |
24 | 8 | int64_t | creation_timestamp | Timestamp of the container's creation. |
32 | 8 | int64_t | modification_timestamp | Timestamp of the last modification. |
40 | 4 | uint32_t | edit_version | Starts at 1 and MUST be incremented on each modification. |
44 | 32 | char[32] | creating_system | Name of the creating software, UTF-8, null-padded. |
76 | 32 | char[32] | software_version | Version of the creating software, UTF-8, null-padded. |
108 | 8 | uint64_t | file_index_offset | Absolute offset (in bytes) from the beginning of the file to the start of the File Index. |
116 | 8 | uint64_t | file_index_length | Length of the File Index in bytes (after compression and encryption). |
124 | 1 | uint8_t | file_index_compression | 0: None, 1: Zstandard. Current implementations SHALL use 1. |
125 | 1 | uint8_t | global_compression | 0: Per-file compression, 1: Zstandard compression on the entire data block. |
126 | 1 | uint8_t | encryption_mode | 0: None, 1: Single password for read/write. |
127 | 64 | byte[64] | read_salt | A 64-byte salt for the read password's KDF. MUST be zero-filled if not used. |
191 | 64 | byte[64] | write_salt | A 64-byte salt for the write password's KDF. MUST be zero-filled if not used. |
255 | 4 | uint32_t | kdf_iterations | Number of iterations for PBKDF2. SHOULD be a high value (e.g., >= 100,000). |
259 | 1 | uint8_t | fec_scheme | Forward Error Correction scheme. 0: None, 1: Reed-Solomon. |
260 | 1 | uint8_t | fec_level | Percentage of parity data for the Data FEC Block (0-50). Ignored for split archives. |
261 | 8 | uint64_t | fec_data_offset | Absolute offset to the Data FEC Block. In a split archive, this MUST be 0. |
269 | 8 | uint64_t | fec_data_length | Length of the Data FEC Block. In a split archive, this MUST be 0. |
277 | 4 | uint32_t | header_checksum | CRC-32 (IEEE 802.3 polynomial) of the header from byte 0 to byte 276, inclusive. |
281 | 50 | byte[50] | reserved | Reserved for future use. MUST be filled with null bytes (0x00). |
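The layout above maps directly onto a packed structure. The following Python sketch shows one way a reader might unpack and validate the header using only the standard library. The names `HEADER_STRUCT`, `MainHeader`, and `parse_main_header` are illustrative and not part of the format. The byte order is an assumption: this section does not state it, and the sketch uses little-endian.

```python
import struct
import zlib
from collections import namedtuple

# ASSUMPTION: little-endian byte order ("<"); the section above does not
# state the byte order. "<" also disables padding, matching 1-byte packing.
HEADER_STRUCT = struct.Struct("<4sHH16sqqI32s32sQQBBB64s64sIBBQQI50s")
HEADER_SIZE = 331      # total header size from the specification
CHECKSUM_OFFSET = 277  # header_checksum covers bytes 0..276

MainHeader = namedtuple("MainHeader", [
    "magic_number", "format_version_major", "format_version_minor",
    "container_uuid", "creation_timestamp", "modification_timestamp",
    "edit_version", "creating_system", "software_version",
    "file_index_offset", "file_index_length", "file_index_compression",
    "global_compression", "encryption_mode", "read_salt", "write_salt",
    "kdf_iterations", "fec_scheme", "fec_level", "fec_data_offset",
    "fec_data_length", "header_checksum", "reserved",
])

# Sanity check: the format string must describe exactly 331 bytes.
assert HEADER_STRUCT.size == HEADER_SIZE


def parse_main_header(raw: bytes) -> MainHeader:
    """Unpack the first 331 bytes of a .g3fc file and verify magic and CRC."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("truncated header")
    header = MainHeader._make(HEADER_STRUCT.unpack(raw[:HEADER_SIZE]))
    if header.magic_number != b"G3FC":
        raise ValueError("bad magic number")
    # zlib.crc32 implements the CRC-32 used by IEEE 802.3.
    if zlib.crc32(raw[:CHECKSUM_OFFSET]) != header.header_checksum:
        raise ValueError("header checksum mismatch")
    return header
```

Text fields such as `creating_system` come back as null-padded bytes; a caller would typically strip the padding with `rstrip(b"\x00")` before decoding as UTF-8.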
The File Index is a data block containing a catalog of all files and directories. The index SHALL be serialized using Concise Binary Object Representation (CBOR) [RFC8949]. The root object is a CBOR array, where each element is a CBOR map representing a file entry.
The entire serialized CBOR byte stream is then compressed using Zstandard [RFC8878] and MAY be encrypted.
Each entry in the CBOR array is a map with the following keys. The reference implementations also use additional fields to handle large files that are split into chunks; those fields are included here.
Key (string) | Value Type (CBOR) | Description |
---|---|---|
path | text string | Full, POSIX-style path using forward slashes (/). |
type | text string | MUST be "file" or "directory". |
uuid | byte string (16) | Unique 16-byte UUID for this entry. |
creation_time | integer | int64_t creation timestamp. |
modification_time | integer | int64_t modification timestamp. |
permissions | unsigned integer | uint16_t POSIX-style permissions (e.g., 0o755). |
status | unsigned integer | uint8_t entry status. 0: Normal, 1: Hidden, 2: Deleted. |
original_filename | text string | (Files only) The original filename. |
data_offset | unsigned integer | (Files only) uint64_t offset to the file's data within its data block. |
data_size | unsigned integer | (Files only) uint64_t size of the file's data in bytes (after per-file compression). |
uncompressed_size | unsigned integer | (Files only) uint64_t original size of the file in bytes. |
compression | unsigned integer | (Files only) uint8_t. 0: None, 1: Zstandard. Ignored if global_compression is active. |
checksum | unsigned integer | (Files only) uint32_t CRC-32 checksum of the uncompressed file data. |
block_file_index | unsigned integer | (Split files) uint32_t index of the data block file (e.g., 0 for .g3fc0). For non-split archives, this is 0. |
chunk_group_id | byte string (16) | (Split files) A 16-byte UUID shared by all chunks of a single original file. This is used to reassemble the file. |
chunk_index | unsigned integer | (Split files) uint32_t sequential index of this chunk for a given file (0, 1, 2...). |
total_chunks | unsigned integer | (Split files) uint32_t total number of chunks for the file this piece belongs to. |
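To show how the chunking fields fit together, the following Python sketch groups decoded index entries by `chunk_group_id` and orders each group by `chunk_index`. It operates on plain dictionaries, as a CBOR decoder would typically produce. The function name `group_chunks` is hypothetical, and treating an entry without `chunk_group_id` as an unchunked file is an assumption, not a rule stated in this specification.

```python
from collections import defaultdict


def group_chunks(entries):
    """Group file entries by chunk_group_id, ordered by chunk_index.

    `entries` is the decoded File Index: a list of dicts keyed as in the
    table above. Returns a dict {chunk_group_id: [entry, ...]}.
    """
    groups = defaultdict(list)
    for entry in entries:
        if entry.get("type") != "file":
            continue  # directories carry no data
        group_id = entry.get("chunk_group_id")
        if group_id is None:
            continue  # ASSUMPTION: no chunk_group_id means an unchunked file
        groups[group_id].append(entry)

    for group_id, chunks in groups.items():
        chunks.sort(key=lambda e: e["chunk_index"])
        expected = chunks[0]["total_chunks"]
        indices = [e["chunk_index"] for e in chunks]
        # Every index 0..total_chunks-1 must appear exactly once.
        if indices != list(range(expected)):
            raise ValueError(f"incomplete chunk set for group {group_id.hex()}")
    return dict(groups)
```

Reassembly would then read each chunk from the block named by its `block_file_index`, at `data_offset`, for `data_size` bytes, and concatenate the results in order.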
The File Data section contains the actual content of the files. Its structure depends on whether the archive is split.
The G3FC format uses Zstandard (Zstd) for compression [RFC8878].
Data integrity is verified using CRC-32 checksums with the IEEE 802.3 polynomial (0xEDB88320 in its bit-reflected representation).
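For reference, the following Python sketch computes this CRC-32 bit by bit and can be compared against `zlib.crc32` from the standard library. It assumes the conventional parameters of the IEEE 802.3 CRC-32 (initial value 0xFFFFFFFF, reflected input and output, final XOR with 0xFFFFFFFF); this section names only the polynomial, so those parameters are an assumption here.

```python
import zlib

POLY_REFLECTED = 0xEDB88320  # IEEE 802.3 polynomial, bit-reflected form


def crc32_bitwise(data: bytes) -> int:
    """Bitwise CRC-32 using the reflected polynomial.

    ASSUMPTION: initial value and final XOR are both 0xFFFFFFFF, as in the
    common zlib/Ethernet variant.
    """
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; apply the polynomial when the low bit was set.
            crc = (crc >> 1) ^ (POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def crc32_fast(data: bytes) -> int:
    """Same checksum via the standard library."""
    return zlib.crc32(data) & 0xFFFFFFFF
```

The bitwise version is useful mainly for checking a port to another language; production code would normally use a table-driven or library implementation.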
If the `fec_scheme` in the header is 1, Reed-Solomon is used to generate parity data, allowing for recovery from corruption.
Cryptographic keys SHALL be derived from user-supplied passwords using PBKDF2 with HMAC-SHA256 as the pseudo-random function. The inputs are the password, the `read_salt` from the header, and the `kdf_iterations` count from the header. The derived key MUST be 32 bytes (256 bits) long.
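The derivation described above can be sketched in Python with `hashlib.pbkdf2_hmac` from the standard library. The function name `derive_key` is illustrative. Encoding the password as UTF-8 is an assumption, since this section does not specify how the password string is converted to bytes.

```python
import hashlib

KEY_LENGTH = 32   # 256-bit key, as required above
SALT_LENGTH = 64  # size of read_salt in the Main Header


def derive_key(password: str, read_salt: bytes, kdf_iterations: int) -> bytes:
    """Derive the 32-byte AES key with PBKDF2-HMAC-SHA256."""
    if len(read_salt) != SALT_LENGTH:
        raise ValueError("read_salt must be 64 bytes")
    if kdf_iterations < 1:
        raise ValueError("kdf_iterations must be positive")
    # ASSUMPTION: the password is encoded as UTF-8 before derivation.
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        read_salt,
        kdf_iterations,
        dklen=KEY_LENGTH,
    )
```

Both `read_salt` and `kdf_iterations` would be taken from the parsed Main Header, so a reader needs only the password from the user.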
Data encryption SHALL be performed using AES-256 in GCM (Galois/Counter Mode). GCM provides both confidentiality and authenticity.
For interoperability between implementations (specifically C#, Python, and Go), the encrypted payload SHALL be structured as follows:

+------------------+--------------------------+------------------+
| Nonce (12 bytes) | Authentication Tag (16B) | Ciphertext (...) |
+------------------+--------------------------+------------------+
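The payload layout can be handled with plain byte slicing. The following Python sketch splits a payload into its three parts and reassembles it; the function names `split_payload` and `join_payload` are hypothetical. It does not perform any cryptography itself.

```python
NONCE_SIZE = 12  # bytes, per the layout above
TAG_SIZE = 16    # bytes, per the layout above


def split_payload(payload: bytes):
    """Split an encrypted payload into (nonce, tag, ciphertext)."""
    if len(payload) < NONCE_SIZE + TAG_SIZE:
        raise ValueError("payload too short to contain nonce and tag")
    nonce = payload[:NONCE_SIZE]
    tag = payload[NONCE_SIZE:NONCE_SIZE + TAG_SIZE]
    ciphertext = payload[NONCE_SIZE + TAG_SIZE:]
    return nonce, tag, ciphertext


def join_payload(nonce: bytes, tag: bytes, ciphertext: bytes) -> bytes:
    """Assemble a payload in nonce | tag | ciphertext order."""
    if len(nonce) != NONCE_SIZE or len(tag) != TAG_SIZE:
        raise ValueError("unexpected nonce or tag length")
    return nonce + tag + ciphertext
```

Note that the tag precedes the ciphertext here. Some AEAD libraries expect the tag appended to the ciphertext instead (for example, `AESGCM` in the Python `cryptography` package), in which case a caller would pass `ciphertext + tag` to the decryption routine.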
When encryption is active (`encryption_mode` > 0), the following blocks are encrypted:
The Main Header and the Footer MUST NOT be encrypted to allow for initial parsing of the archive.
Deletion: To mark a file or directory as deleted, its `status` field in the File Index SHALL be changed to 2. The actual file data is not removed from the data blocks. This allows for "undelete" functionality. A separate "compact" or "purge" operation MAY be implemented by an application to physically remove data marked as deleted and reclaim space.
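The soft-delete behavior can be illustrated on a decoded File Index held as a list of dictionaries. In this Python sketch the helper names `mark_deleted` and `live_entries` are hypothetical, and the sketch changes only the named entry: whether deleting a directory should also mark the entries beneath it is not stated here, so no recursion is attempted.

```python
STATUS_NORMAL = 0
STATUS_HIDDEN = 1
STATUS_DELETED = 2


def mark_deleted(entries, path: str) -> bool:
    """Set status to 2 (Deleted) for the entry at `path`.

    Returns True if an entry was found. The file data itself is untouched,
    which is what makes a later undelete possible.
    ASSUMPTION: only the exact path is affected; no recursion into
    directories is performed.
    """
    found = False
    for entry in entries:
        if entry.get("path") == path:
            entry["status"] = STATUS_DELETED
            found = True
    return found


def live_entries(entries):
    """Return the entries a compact/purge operation would keep."""
    return [e for e in entries if e.get("status", STATUS_NORMAL) != STATUS_DELETED]
```

After changing the index, a writer would also need to re-serialize it and increment `edit_version` in the Main Header, since that field MUST be incremented on each modification.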
This document requests the registration of a new media type in the "application" tree, as follows:
Implementers of this specification should be aware of the following security aspects:
The G3FC format relies on several established technologies. The authors of the specifications for these technologies are acknowledged for their foundational work.