Progress: can't really estimate (has been almost done for last six months)
NOTE: This is a devel-version; the specs are so unstable that we don't want to increase the version number every time we change something. MCF-files of different 0x00-versions might not be compatible. This also means that 0x00-version MCF-files won't work with 0x01-version (or later) software.
Please don't mirror this specification - we don't want old versions floating around the net. This specification lives in http://mcf.sourceforge.net/ and is very unlikely to disappear. Even if it does, developers have backup-copies of it and will set up a mirror somewhere. In such case you can find the new location by looking at your favorite message board.
Track contains one type of media (like video or subtitles) and controls some parameters needed for playing it.
Track Type defines what type of media a Track contains.
Track Entry defines a Track (gives it parameters such as Track Type and Format (compression) used).
Track Number - tracks get sequential numbers (in order of Track Entries), which are used for referring to a certain Track
Canvas is the video window of the player. Canvas is normally the same size as the video, but for files without video (but still with some graphics) it may differ.
Element is an element in our main file structure. Elements are those colorful things in the diagram below.
Part is a part of an element (for example, Main Header consists of four parts)
One octet is 8 bits.
One important thing for a cross-platform format is the naming of files. Not all operating systems support all characters, and even if the OS supports something, there will definitely be programs which don't. That's why we have defined two levels for naming: strict and relaxed. Filenames are limited to 64 characters in length. This really should be enough for everybody. If you think that YOU need more, send mail to the mcf-devel list at SourceForge, telling us why.
Strict names can only contain A-Z, a-z, 0-9, "_", "-", "(", ")" and ".". Strict names can contain several dots, but may not start or end with one or contain two or more in a row (".."). Relaxed names can also contain spaces (but only one in a row and not as the first or last char), several dots in a row and commas (if you need more chars, let us know).
When encoding, use of relaxed names should give warning to the user (saying that it might not work reliably in all systems) and use of names outside relaxed should give an error.
Fields which contain [parts of] filenames, and therefore must follow these rules, are labelled "FILENAME".
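The two naming levels can be sketched as a small classifier. This is only an illustration, not part of the format: the function name classify_filename is made up here, and two points the rules above leave open are treated as assumptions (an empty name is rejected, and relaxed names are still not allowed to start or end with a dot).

```python
import re

MAX_LEN = 64  # filenames are limited to 64 characters

# Character sets taken from the strict and relaxed rules above.
STRICT_CHARS = re.compile(r"[A-Za-z0-9_\-().]+")
RELAXED_CHARS = re.compile(r"[A-Za-z0-9_\-()., ]+")


def classify_filename(name: str) -> str:
    """Return 'strict', 'relaxed' or 'invalid' for a candidate filename."""
    if not name or len(name) > MAX_LEN:
        return "invalid"  # ASSUMPTION: empty names are not allowed
    if name.startswith(".") or name.endswith("."):
        return "invalid"  # ASSUMPTION: this rule also holds for relaxed names
    if STRICT_CHARS.fullmatch(name) and ".." not in name:
        return "strict"
    if (RELAXED_CHARS.fullmatch(name)
            and "  " not in name          # only one space in a row
            and not name.startswith(" ")
            and not name.endswith(" ")):
        return "relaxed"
    return "invalid"
```

An encoder could map "relaxed" to a warning and "invalid" to an error, as described above.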
A four-character string in square brackets is an identifier put in front of that element. It is normally NOT used for anything, but can be used for identifying elements when using a hex editor. Other possible uses are file validation and recovery.
Note that the format allows adding new elements between current elements and reorganizing elements, without losing compatibility. There are many ways to extend current headers too, but more about that later ..
Main Header is the only mandatory header, but if you don't have any Track Entries (no Tracks), the file can't contain Blocks (=data) either. Attached Files element only exists if you attach one or more files (doh). Codec Headers only exist if those are required. Signature Header and Chapter Definitions are optional.
Some people have speculated that the Main Header (Extended Info) will be our Achilles heel, but I don't really think so. If something in it becomes outdated, the field can be abandoned and we still have compatibility with old software. Space wasted by that is minimal and we can easily add new sub-parts to Main Header. There is also some space reserved inside, to allow adding new fields without adding new sub-parts.
In our mission to confuse you, we [may] have some exceptions to those rules.
This field doesn't apply very well in all situations though.
Main Header is the anchor point of everything in MCF. It must be the first thing in MCF-file and must start at offset 0. For streams and MCF-CDs the stream handling code must be able to figure out the MCF addressing space on the actual stream.
All positions in Main Header are relative to beginning of the file (as described in previous paragraph).
Note: in Type Header all strings are padded with 0x20, not with 0x00 (like elsewhere).
|Type Header (160 octets)|
|0x000||check||ASCII-string: "MCF - more info: http://mcf.sourceforge.net/". If you need to identify MCF-file, primarily use file extension (or MIME-type), but if that is not possible, you can use first 6 chars of this string ("MCF - "). To be sure, you can validate the checksum of Actual Header (or other parts).|
|0x037||-||0x0D (CR), 0x0A (LF), ASCII: "Site: ["|
|0x040||info||Site ("passed thru"-ads; ads ONLY allowed in this field), full URL or text (64 octets ASCII/Latin-1)|
|0x080||-||ASCII: "]", 0x0D (CR), 0x0A (LF), ASCII: "Size: ["|
|0x08A||check||File size, in octets, for easy detection of incomplete downloads (LW) (18 octets, ASCII-string, decimal, no formatting, left-aligned)|
|0x09C||-||ASCII: "]", 0x0D (CR), 0x0A (LF), 0x1A (EOF)|
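The offsets in the Type Header table can be checked with a short sketch that builds and parses the 160 octets. The function names are hypothetical. Padding the magic string, the Site field and the size field with 0x20 follows the note above about Type Header strings; using 18 zero octets for an unknown size follows the (LW) rule and is applied here only to the size field.

```python
from typing import Optional, Tuple

MAGIC = b"MCF - more info: http://mcf.sourceforge.net/"


def build_type_header(site: str, file_size: Optional[int]) -> bytes:
    """Build the 160-octet Type Header; file_size=None means Linear Write."""
    head = MAGIC.ljust(0x37, b" ")                 # 0x000, padded with 0x20
    head += b"\r\nSite: ["                         # 0x037
    site_raw = site.encode("latin-1")
    if len(site_raw) > 64:
        raise ValueError("Site field is limited to 64 octets")
    head += site_raw.ljust(64, b" ")               # 0x040
    head += b"]\r\nSize: ["                        # 0x080
    if file_size is None:
        head += bytes(18)                          # LW: binary zero
    else:
        digits = str(file_size).encode("ascii")
        if len(digits) > 18:
            raise ValueError("file size does not fit in 18 digits")
        head += digits.ljust(18, b" ")             # 0x08A, left-aligned
    head += b"]\r\n\x1a"                           # 0x09C
    assert len(head) == 160
    return head


def parse_type_header(head: bytes) -> Tuple[str, Optional[int]]:
    """Return (site, file_size); file_size is None for Linear Write files."""
    if len(head) < 160 or not head.startswith(b"MCF - "):
        raise ValueError("not an MCF Type Header")
    site = head[0x40:0x80].decode("latin-1").rstrip(" ")
    raw = head[0x8A:0x9C]
    size = None if raw == bytes(18) else int(raw.decode("ascii").strip())
    return site, size
```

Comparing the parsed size with the actual file size is the quick check for incomplete downloads that the table mentions.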
(LW) means Linear Write: when some properties of the file are not known at the beginning of write and it is impossible to seek back and fix those fields later, LW-mode can be used. In that mode, Main Header fields marked with (LW) can be omitted and filled with zero (binary, not a number). Never use LW, unless you really have to.
All elements, except Main Header, must be specified either in "big elements" or "small elements". Non-existent elements have zero as position. Elements cannot have gaps in between, overlap other elements, or extend beyond the end of the file. Any MCF-file with empty space between elements will fail the validation.
Last/lowest 4 bits in "small element" type denote error-tolerance of the element. Lower values denote elements with more importance. Main Header should be considered level 0. This value is used for determining how strong error correction different parts of MCF need (on MCF-CD, for example). Hardcode values listed here, don't change any. Other 24 bits are still undefined.
|Actual Header (864 octets)|
|0x0A0||info||Version (uint8, currently 0x00)|
|0x0A1||must||Minimum read-version (to use all features, uint8, currently 0x00)|
|0x0A2||must||Absolute minimum read-version (the lowest version with which it still makes some sense to read this file) (uint8, 0x00)|
|0x0A4||must||Total length (uint32, in milliseconds)|
|0x0A8||must||Size of Main Header (uint32)|
|0x0AC-||-||Reserved (fill with 0x00)|
|0x0B0||must||Big element #1: Position of Clusters (uint64)|
|0x0B8||info||Big element #1: Total size of Clusters (LW) (uint64)|
|0x0C0-||info||Reserved for up to 4 other big elements (uint64 position, uint64 size) (fill with 0x00)|
|0x100||must||Small element #1: Position of Track Entries (uint64)|
|0x108||must||Small element #1: Total size of Track Entries (uint32)|
|0x10C||must||Small element #1: element type (set to 0x00, 0x00, 0x00, 0x01) (4 octets)|
|0x110||must||Small element #2: Position of Codec Headers (uint64)|
|0x118||info||Small element #2: Total size of Codec Headers (uint32)|
|0x11C||info||Small element #2: element type (set to 0x00, 0x00, 0x00, 0x02) (4 octets)|
|0x120||must||Small element #3: Position of Seek Entries (uint64)|
|0x128||must||Small element #3: Total size of Seek Entries (uint32)|
|0x12C||info||Small element #3: element type (set to 0x00, 0x00, 0x00, 0x0E) (4 octets)|
|0x130||feat.||Small element #4: Position of Chapter Definitions (uint64)|
|0x138||feat.||Small element #4: Total size of Chapter Definitions (uint32)|
|0x13C||info||Small element #4: element type (set to 0x00, 0x00, 0x00, 0x08) (4 octets)|
|0x140||feat.||Small element #5: Position of Attached Files (uint64)|
|0x148||info||Small element #5: Total size of Attached Files (uint32)|
|0x14C||info||Small element #5: element type (set to 0x00, 0x00, 0x00, 0x09) (4 octets)|
|0x150||feat.||Small element #6: Position of Signature Header (uint64)|
|0x158||feat.||Small element #6: Total size of Signature Header (uint32)|
|0x15C||info||Small element #6: element type (set to 0x00, 0x00, 0x00, 0x0C) (4 octets)|
|0x160-||info||Reserved for up to 10 other small chunks (uint64 position, uint32 size, 4 octets type) (fill with 0x00)|
|0x200||feat.||Original filename, includes file extension (64 octets FILENAME)|
|0x240||feat.||Next part: Filename of next part (64 octets FILENAME) (loads from the same directory as current file (or from other, player-side configured paths), no path allowed)|
|0x280||feat.||Next part: Timecode of next part to start from (to allow seamless playback even with overlapping files) (uint32)|
|0x284||info||Muxing application or library (20 octets) ("libmcf-0.1.0", for example)|
|0x298||info||Writing application (24 octets) ("VirtualDub 2.1", for example)|
|0x2B0||check||Position of this Main Header copy (0 for the primary copy, other for backups) (uint64)|
|0x2B8-||-||Reserved (fill with 0x00)|
|0x300||info||Clusters: Number of Blocks = number of Block Headers (uint32)|
|0x304||must||Clusters: Size of a Cluster Header (currently 16) (uint8)|
|0x305||must||Clusters: Size of a Block Header (currently 10) (uint8)|
|0x306||must||Clusters: Size of a Cluster Footer (currently 4) (uint8)|
|0x307||must||Size of a Seek Entry (currently 12) (uint8)|
|0x308||check||Adler-32 checksum of Seek Entries|
|0x30C||must||Size of a Track Entry (uint16)|
|0x30E||must||Size of a Subtrack Entry (lives in the Codec Header of Audio Track) (uint16)|
|0x310||check||Attach: Adler-32 checksum of Attach Entries|
|0x314||must||Attach: Size of an Attach Entry (uint16)|
|0x316||-||Reserved (set to 0)|
|0x317||feat.||Chapter Definitions: Number of editions = number of Edition Entries (uint8)|
|0x318||check||Chapter Definitions: Adler-32 checksum of the whole Chapter Definitions|
|0x31C||feat.||Chapter Definitions: Size of a Chapter Block (uint16)|
|0x31E||feat.||Chapter Definitions: Size of an Edition Entry (uint16)|
|0x320-||-||Reserved (fill with 0x00)|
|0x3FC||check||Adler-32 checksum of [0x0A0,0x3FB] (Actual Header, excluding this field)|
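As a rough sketch of how a reader might use the Actual Header table, the code below validates the Adler-32 checksum at 0x3FC and lists the small element slots between 0x100 and 0x1FF. The function names are made up for illustration. Byte order is NOT stated in this section, so little-endian is purely an assumption here and is isolated in one constant.

```python
import struct
import zlib

BYTE_ORDER = "<"  # ASSUMPTION: little-endian; the byte order is not given in this section


def actual_header_checksum_ok(main_header: bytes) -> bool:
    """Check the Adler-32 of [0x0A0, 0x3FB] against the value stored at 0x3FC."""
    stored, = struct.unpack_from(BYTE_ORDER + "I", main_header, 0x3FC)
    return zlib.adler32(main_header[0x0A0:0x3FC]) == stored


def read_small_elements(main_header: bytes) -> list:
    """Return the small elements that exist (position != 0)."""
    elements = []
    # 6 defined slots plus 10 reserved ones, 16 octets each: 0x100..0x1FF
    for offset in range(0x100, 0x200, 16):
        position, size = struct.unpack_from(BYTE_ORDER + "QI", main_header, offset)
        type_octets = main_header[offset + 12:offset + 16]
        if position == 0:
            continue  # non-existent elements have zero as position
        elements.append({
            "position": position,
            "size": size,
            "type": type_octets,
            # lowest 4 bits of the last type octet: error-tolerance level
            "error_tolerance": type_octets[3] & 0x0F,
        })
    return elements
```

The type field is kept as raw octets so that reading the error-tolerance bits does not depend on the byte-order assumption.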
In the normal copy of Main Header (in the beginning of file), set Multiheader values to zero.
Linear mode should only be used in situations where it is impossible to know all values at the beginning of write and when it is impossible to seek back and insert that data later. One example of this would be directly capturing video to a CD-R.
Fields in Extended Info and Content-specific Info are fixed-length for many reasons. One is easy reading - to read a specific field (say, "Encoded by") I only need to seek once (to 0x680) and read fixed amount of data (128 octets) and then figure out what is inside. This makes things a lot easier. Another reason for fixed length is to allow changing data without rewriting the whole MCF-file. Imagine fixing a typo in one of these fields and then reading and writing 700 Mo of data, just to reposition it by one octet.
Comments-field can be used when other fields don't fit for the information you need.
Some strings are ASCII (and not UTF-8) for obvious reasons: URLs, e-mails, etc. cannot contain non-ASCII chars.
|Extended Info (3072 octets)|
|0x400||info||Title (192 octets)|
|0x4C0||info||Edition or episode ("director's cut" or "1x08: Firewall", for example) (128 octets)|
|0x540||info||Name of the original author (company, team or individual) (128 octets)|
|0x5C0||info||URL of the original author (movie's website) (128 octets ASCII)|
|0x640||info||E-mail or secondary URL of the author (64 octets ASCII)|
|0x680||info||Encoded by (whoever compressed it into MCF) (128 octets)|
|0x700||info||Encoder's URL (128 octets ASCII)|
|0x780||info||Encoder's e-mail (64 octets ASCII)|
|0x7C0||info||Comments/description (2048 octets)|
|0xFC0||info||RSACI Rating (parental guidance) (6 octets)|
|0xFC6||info||Flags (not yet defined, set to 0) (1 octet)|
|0xFC7||info||Content Type (Content Type page) (uint8)|
|0xFC8||info||Encoding date (seconds since 1970-01-01 00:00 UTC, aka UNIXTIME) (uint32, 0=undefined)|
|0xFCC||info||Last editing date (seconds since 1970-01-01 00:00 UTC, aka UNIXTIME) (uint32, 0=undefined)|
|0xFD0||info||Production year (uint16, 0 means undefined)|
|0xFD2||info||Country code (same as in Internet domains) (2 octets ASCII)|
|0xFD4||info||Language code (4 octets)|
|0xFD8-||-||Reserved (fill with 0x00)|
|0xFFC||check||Adler-32 checksum of [0x400,0xFFB] (Extended Info, excluding this field)|
Like the name of Content-specific Info suggests, the info depends on Content Type. This means that we only have generic fields (Textfield 1, for example), but once Content Type is selected (say, the user selects "Movie"), fields get names (Actors, Director, Writer, ..).
|Content-specific Info (1024 octets)|
|0x1000||info||Textfield 1 (128 octets)|
|0x1080||info||Textfield 2 (128 octets)|
|0x1100||info||Textfield 3 (128 octets)|
|0x1180||info||Textfield 4 (128 octets)|
|0x1200||info||Textfield 5 (128 octets)|
|0x1280||info||Textfield 6 (128 octets)|
|0x1300||info||Textfield 7 (128 octets)|
|0x1380||info||SmallText (64 octets)|
|0x13C0-||-||Reserved (fill with 0x00)|
|0x13F8||info||Checkboxes 1-8 (highest bit is checkbox 1) (1 octet)|
|0x13F9||info||RadioA selection (RadioA 1 = 0x00; maximum is 0x07) (uint8)|
|0x13FA||info||RadioB selection (RadioB 1 = 0x00; maximum is 0x07) (uint8)|
|0x13FB||info||Content subtype (uint8)|
|0x13FC||check||Adler-32 checksum of [0x1000,0x13FB] ("content-specific info", excluding this field)|
Extended info explained
Track Entry helps the player decide what it needs for presenting the AV-data, but the main reason for this is to allow any combination of audio/video/other media, like dubs for different languages, or maybe alternative angles, ..
Tracks are numbered in defining order, starting from 1.
|0x00||check||"TrkE" (ASCII, 4 octets)|
|0x04||must||Track Type (uint8)|
|0x06-||-||Reserved (fill with 0x00)|
|0x0C||info||Language code (4 octets)|
|0x10||must||Format (16 chars ASCII, look below, video-formats)|
|0x20||info||Format Version1 (encoding) (uint16)|
|0x22||must||Format Version2 (reading) (uint16)|
|0x24||must||Format Version3 (absolute minimum for reading) (uint16)|
|0x26-||-||Reserved (fill with 0x00)|
|0x28||must||Size of the Codec Header (uint32)|
|0x2C||check||Adler-32 checksum of codec-header|
|0x30||info||Codec (name and version) used for compressing (64 octets ASCII)|
|0x70||info||URL to codec's website (64 octets ASCII)|
|0xB0||info||Alternative URL (should be a free, preferably multi-platform, codec that can at least decompress it) (64 octets ASCII)|
|0xF0||info||Nanoseconds per Block (if not constant, set to 0) (unsigned int64)|
|0xF8-||-||Reserved (fill with 0x00)|
|0x100||info||Settings (bitrate, ..) used for compression-codec - free form string (64 octets ASCII)|
|0x140||must||Track name ("Finnish subtitles" would be a good name) (64 octets)|
|0x180||-||Very IMPORTANT field. Depends on Track Type, more info on Track Type page (188 octets)|
|-0x04||check||Adler-32 checksum of Track Entry, excluding this field|
Note: checksum position cannot be hardcoded - it should depend on Track Entry size. Also the area it is calculated from should be Track Entry minus 4 octets.
Format versions may be pure values (greater value means greater version), bitmasks (each bit denotes a feature) or something else. Codecs or wrappers should know how to read those fields, and with that data, know if they can play it or not. It is recommended for all three fields to have same format, but not required.
Nanoseconds per Block can be used for converting to old formats (like AVI), which only support constant framerates. You should never use it for MCF playback. NOTE: we can have "dropped frames" even if the rate is constant: conversion software must be able to detect this by comparing timecodes of Blocks to number_of_frame * nanosecs_per_frame. When writing MCF, make sure you define that field, as we really want users to be able to convert to AVI, if the material itself allows that.
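The dropped-frame comparison could look roughly like this in conversion software. It is a sketch under assumptions: the function name is hypothetical, the first frame slot is taken to start at timecode 0, and each Block is assigned to the nearest slot because timecodes are stored in milliseconds while the rate is given in nanoseconds.

```python
def find_dropped_frames(timecodes_ms, ns_per_block):
    """Return the constant-rate frame slots that no Block falls into."""
    if ns_per_block <= 0:
        raise ValueError("rate is not constant (Nanoseconds per Block is 0)")
    if not timecodes_ms:
        return []
    # Map every Block timecode to the nearest slot n, where slot n is
    # expected at n * ns_per_block nanoseconds.
    present = {round(t * 1_000_000 / ns_per_block) for t in timecodes_ms}
    return [n for n in range(max(present) + 1) if n not in present]
```

For a 25 fps Track (40000000 ns per Block) with Blocks at 0, 40, 80 and 160 ms, slot 3 is reported as dropped, so a converter would have to repeat the previous frame there.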
The actual data is stored in checksum protected Clusters. A cluster can have any number of Blocks inside, but it is recommended to have Clusters aligned with video keyframes. Also, Clusters should be small enough to be entirely loaded in memory, but parsers should also be able to handle situations where there is not enough memory for holding a Cluster.
Too small Clusters result in some wasted space, but too large Clusters reduce data recoverability and prevent error checking on the fly. Sizes between 5 Ko and 2 Mo are okay, but using sizes outside of that range should be avoided. If there is an error somewhere inside a Cluster, the entire Cluster will be skipped.
In a Cluster we have one or more Blocks with Block Headers. The header in front of a Block tells which Track the Block belongs to and how big the Block is.
The normal way to read data from MCF would be reading Cluster Header (which contains the size of the Cluster), then reading the whole Cluster to a buffer in memory. At that point the checksum can be easily validated. If the checksum is valid, the first Block Header in it can be read. With the information in Block Header the first Block can be read, then the second Block Header, and so on, until the size of the Cluster is reached.
A Block contains one video frame, a tiny piece of audio, one subtitle, a menu or something else.
Structure of a Cluster:
|Cluster Header||One or more Blocks||Cluster Footer|
|0x04||must||Size of this Cluster, including Cluster Header and Footer, in octets (uint32)|
|0x08||check||Exact position of this Cluster (uint64)|
Checksum includes Cluster Header (except "CHdr" and the checksum itself) and the content.
If the file is damaged, ASCII string "CHdr" can be searched. Exact location of the Cluster also helps recovering misaligned files.
|0x00||check||Adler-32 checksum of Cluster Header and Block(s) of the Cluster|
Checksum allows verifying data integrity during playback, and reliable recovery of broken files.
Cluster Header and Cluster Footer sizes are defined in Actual Header. In practice, those sizes are always constant, but parsers MUST read those from Actual Header, instead of hard-coding.
|Block Header||Data (or Lacing + Data frame(s))||Ending timecode, if gap flag enabled|
This is located in front of each Block and is used for figuring out how big the Block is, when it should be played and which Track it belongs to.
|0x00||must||Block size, including Block Header (uint32)|
|0x04||must||Timecode (milliseconds from start of file, uint32)|
|0x08||must||Track Number (Track Entry) (uint8)|
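The reading procedure described earlier (read the Cluster, validate the checksum, then step from Block Header to Block Header) might be sketched as follows. Several things are assumptions and not confirmed by this section: little-endian byte order, "CHdr" occupying the first 4 octets of the Cluster Header, and the checksum covering everything between "CHdr" and the Cluster Footer. The function name is hypothetical, and the flags in the Block Header (including gap ending timecodes) are not interpreted.

```python
import struct
import zlib

BYTE_ORDER = "<"  # ASSUMPTION: little-endian; not stated in this section


def iter_blocks(cluster: bytes, cluster_header_size: int = 16,
                block_header_size: int = 10, cluster_footer_size: int = 4):
    """Yield (track_number, timecode_ms, data) for each Block of one Cluster.

    The three sizes must come from Actual Header; the defaults are only
    the current values.
    """
    if cluster[:4] != b"CHdr":
        raise ValueError("Cluster does not start with 'CHdr'")
    size, = struct.unpack_from(BYTE_ORDER + "I", cluster, 0x04)
    if size != len(cluster):
        raise ValueError("Cluster size does not match the buffer")
    end = size - cluster_footer_size
    stored, = struct.unpack_from(BYTE_ORDER + "I", cluster, end)
    # ASSUMPTION: checksum covers the header after "CHdr" plus all Blocks.
    if zlib.adler32(cluster[4:end]) != stored:
        raise ValueError("checksum mismatch: skip the entire Cluster")
    offset = cluster_header_size
    while offset < end:
        block_size, timecode, track = struct.unpack_from(
            BYTE_ORDER + "IIB", cluster, offset)
        if block_size < block_header_size or offset + block_size > end:
            raise ValueError("Block size runs outside the Cluster")
        yield track, timecode, cluster[offset + block_header_size:offset + block_size]
        offset += block_size  # Block size includes the Block Header
```

Because this is a generator, the validation errors are raised when iteration starts, not when iter_blocks is called.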
Keyframe flag can only be set if the Block starts with a keyframe. This is not a problem for video, but some audio compressions might have keyframes and non-keyframes. If we store several frames in one Block, any keyframe (if there are keyframes in that Block) should be aligned to the start of the Block, not somewhere in the middle of it. MP3, Vorbis and all other compressions I am aware of only contain keyframes, so this is not a problem there.
When there is a gap in a Track, set the gap flag of the last Block before that gap. When the gap flag is set, you must append an ending timecode (uint32) to the Block, right after its contents (data). Once the ending timecode is reached, the Track is reset to the same state it is in at the beginning of the stream. In other words, it gets cleared. Audio tracks are silent, Video is black and Titles are completely transparent. The gap ends when there is another Block on the Track.
Frozen: structure, but new features may appear.
This is a special structure used for some stream control functions required in MCF.
|Track #0: special case of Block Header|
|0x00||must||Block size (uint32)|
|0x04||must||Timecode (milliseconds from start of file, uint32)|
|0x08||must||Track Number: 0x00 (uint8)|
|0x09||must||Magic Block type (uint8): 0 = deleted block, 1 = headers rebroadcast, 0xFF = stream reset: end of Clusters Element|
Clusters in MCF must always terminate with stream reset command. This command must be the last Block in MCF. It is considered as the end of stream. In normal files it is followed by other Elements (defined in Main Header) or Main Footer. In Linear Write files, it is always followed by the Main Footer that has Main Header copy in it. Broadcasts may use it differently (work in progress).
When reading, you have to skip Magic Blocks whose type you don't know. If you know the type, you'll know how to handle it.
Some (audio) formats (Vorbis) require many tiny frames in one Block. For this we have a system for sub-dividing a Block. We could just have each frame in its own Block, but that would cause too much overhead.
As an example, we have a Block with 5 frames in it. Frame sizes are 70, 260, 255, 1030 and 666. The Block starts with the following structure:
Each lace ends with the first (uint8) value smaller than 255. Frame size can be calculated by taking a sum of all values in a lace (for frame #4: 255 + 255 + 255 + 255 + 10 = 1030). Size for the last frame is not required because we already know the Block size and all the remaining octets in the Block are for the last frame.
NOTE: Only Audio Tracks are supported at the moment (other Track Types don't need this!). Remember to set/read the flag in Track Entry before using this feature.
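The lace arithmetic for the five-frame example can be sketched like this. The function names are hypothetical, and the sketch assumes the number of frames in the Block is already known to the reader; how that count is stored is not shown in this section.

```python
def encode_lace_sizes(frame_sizes):
    """Return the size octets for all frames except the last one."""
    out = bytearray()
    for size in frame_sizes[:-1]:
        out += b"\xff" * (size // 255)  # as many 255s as fit
        out.append(size % 255)          # terminating value, always < 255
    return bytes(out)


def decode_lace_sizes(payload: bytes, frame_count: int):
    """Return the frame sizes for a Block payload (laces followed by frames).

    frame_count is an ASSUMED input, see the note above.
    """
    sizes, pos = [], 0
    for _ in range(frame_count - 1):
        size = 0
        while True:
            value = payload[pos]
            pos += 1
            size += value
            if value < 255:             # a lace ends with the first value < 255
                break
        sizes.append(size)
    # All remaining octets in the Block belong to the last frame.
    sizes.append(len(payload) - pos - sum(sizes))
    return sizes
```

For the sizes 70, 260, 255, 1030 and 666 the encoder produces the ten octets 70, 255, 5, 255, 0, 255, 255, 255, 255, 10. Note that a frame of exactly 255 octets needs a trailing 0 to close its lace.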
Codec Headers contain (variable-sized) Track-specific data, which doesn't fit in the Track Entry. The actual content depends on Track Type and Format. Position of the Codec Headers element is defined in Actual Header. Codec Header size for each Track is defined in Track Entries. Codec Headers are sorted in Track order. Structure of Codec Headers element:
If a Track doesn't need Codec Header, size of it (in Track Entry) is set to 0.
Adler-32 checksum of the whole Codec Header element is stored in Actual Header.
|ASCII: "Seek" (4 octets)||Cluster Seek entries (one for each Cluster)||Block Seek information|
Seek Header is an index that allows fast seeking - for example, if you want to watch a part starting at 57 minutes, you could only download file headers (including Seek Header), then directly seek to the correct position and only download what you really want to. All players should read Seek Header to memory, for fast access. Space required for it is typically less than 20 kilobytes. Seek Header is mandatory.
Seek Header starts with ASCII "Seek" (4 octets), directly followed by one CS-entry for each Cluster of the file.
Size of a CS-entry: 12 octets (might increase, but very unlikely)
|CS-Entry (Cluster Seek)|
|0x00||must||Position of the Cluster (uint64, octets relative to start of file)|
|0x08||must||Timecode (milliseconds from start of file, uint32)|
Timecode points to the time "sync" is reached if decoding starts at the Cluster pointed to by that Seek Entry. Sync means that each Track either has reached keyframe or has a gap (is not active).
If the Cluster is the first Cluster in the file, timecode should be set to zero (because all tracks have gap until the first Block comes). If some Tracks never get synced (end of file is reached before they sync), set timecode to 0xFFFFFFFF.
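A player-side lookup over the CS-entries might look like the sketch below: pick the last Cluster whose sync timecode is not later than the requested time. Assumptions, none of them confirmed here: little-endian byte order, the number of Clusters being known to the caller (cluster_count is a hypothetical parameter), and entries marked 0xFFFFFFFF being unusable as seek targets. Function names are made up.

```python
import bisect
import struct

BYTE_ORDER = "<"          # ASSUMPTION: little-endian; not stated in this section
NEVER_SYNCED = 0xFFFFFFFF


def parse_cs_entries(seek_element: bytes, cluster_count: int, entry_size: int = 12):
    """Return a list of (timecode_ms, cluster_position) pairs.

    entry_size must come from Actual Header; 12 is only the current value.
    """
    if seek_element[:4] != b"Seek":
        raise ValueError("Seek Header does not start with 'Seek'")
    entries = []
    for i in range(cluster_count):
        position, timecode = struct.unpack_from(
            BYTE_ORDER + "QI", seek_element, 4 + i * entry_size)
        entries.append((timecode, position))
    return entries


def cluster_for_time(entries, target_ms: int):
    """Return the position of the Cluster to start decoding from, or None."""
    usable = sorted(e for e in entries if e[0] != NEVER_SYNCED)
    timecodes = [t for t, _ in usable]
    i = bisect.bisect_right(timecodes, target_ms) - 1
    return usable[i][1] if i >= 0 else None
```

Keeping the parsed entries in memory, as recommended above, makes each seek a single binary search.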
There is one big problem with this system: if there is some content that stays on-screen for a very long time (a logo, for example), seeking won't work as expected. As a workaround, such problematic Blocks are excluded from those timecodes (CS-Entries). Blocks excluded from the CS-index are described in Block Seek.
A good rule for deciding if a Block should go to normal or special index is displaying time: a Block that stays on screen over three or more Clusters should go to BS instead of CS.
Work in progress: Block Seek specs coming soon!
Frozen: All of it.
This allows bundling (little) files with the content. This could mean pictures of movie covers, posters, etc. Structure of the element is:
|Attach Entry (currently 0xD1 (209) octets)|
|0x00||must||File description (128 octets)|
|0x80||must||MIME-type (32 octets ASCII)|
|0xA0||must||File extension (32 octets FILENAME)|
|0xC0||must||Position of the file (uint64)|
|0xC8||must||File size in octets (uint32)|
|0xCC||check||Adler-32 checksum of the file|
|0xD0||must||File compression (0 = no compression, others not defined yet) (uint8)|
File extension is required for saving the file to disk and for determining its type (if the target system doesn't support MIME-types). It can also be used for defining part of the full filename, not just the extension. Extracted files get their basename (name up to the first dot) from the MCF-file they were extracted from. For example, an attachment with the extension field "instructions.html" in Example-movie.XviD.Vorbis.av.mcf would be extracted as "Example-movie.instructions.html". Never start an extension with a dot.
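The naming rule for extracted attachments is small enough to show directly. The function name is hypothetical; the behaviour follows the example in the paragraph above.

```python
def extracted_name(mcf_filename: str, extension_field: str) -> str:
    """Build the on-disk name for an attachment extracted from an MCF-file."""
    if extension_field.startswith("."):
        raise ValueError("an extension must never start with a dot")
    basename = mcf_filename.split(".", 1)[0]  # name up to the first dot
    return f"{basename}.{extension_field}"
```

With the file from the example, extracted_name("Example-movie.XviD.Vorbis.av.mcf", "instructions.html") gives "Example-movie.instructions.html".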
Frozen: all of it, but don't consider this freeze very strong.
This is a simple menu for jumping to certain positions of the movie - a menu listing these should be accessible with following buttons (depending on which buttons your hardware has): chapters, menu, tab, alt-c. You are not tied to using these buttons, but these are what we recommend. On Windows-platform this should be in the right-click menu (as one submenu).
Another use for this is defining different editions (cuts) of the material. Control-track (menu) is used for switching between editions and for actual jumps during playback. When edition is switched from a control-Block, entries available in player's chapter-menu should change.
The first thing in Chapter Definitions is one or more edition-entries. The edition defined first is edition #1 and will be the default.
|Edition Entry (66 octets each)|
|0x00||info||Name of the edition (64 octets)|
|0x40||must||Number of chapters in this edition (uint16)|
For each chapter of edition #1 there is one chapter-entry. Directly after those there are chapter-entries for chapters of edition #2 (if there are two or more editions), and so on.
Chapter Blocks are part of the chapter-header. Total size of the chapter-header is (no_of_chapters * size_of_chapterBlock + 4) octets.
|Chapter-entry (264 octets each)|
|0x00||must||Start-timecode of the chapter (uint32)|
|0x04||must||End-timecode of the chapter (uint32)|
|0x08||must||Name of the chapter (256 octets)|
We use 0x00 for separating menu levels:
Maximum number of menu-levels is 4. Total number of chapter-entries is limited to 4096. Maximum number of entries in one menu is limited to 256. If any of these limits are broken, players should give a warning and ignore the chapter-structure of that movie. The purpose of these limits is to stop MCF-files from overloading the user-interface.
This header is also used for next/prev chapter functions. Start/end-timecodes are used for figuring out which chapter is playing. If no chapters match (ie. the position playing is not included in the edition currently selected), we'll jump to timecode 0. Previous-button normally moves to the chapter right before current chapter, not to the beginning of current chapter. Play-button (if not shared with pause-button) should jump to beginning of current chapter.
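The next/prev logic could be sketched as below, with chapters of the selected edition given as (start_ms, end_ms, name) tuples. Function names are hypothetical. Two details are assumptions: the end-timecode is treated as exclusive, and pressing previous in the first chapter jumps to the start of that chapter. Menu levels are ignored.

```python
def current_chapter(chapters, position_ms):
    """Return the index of the chapter containing position_ms, or None."""
    for index, (start, end, _name) in enumerate(chapters):
        if start <= position_ms < end:  # ASSUMPTION: end is exclusive
            return index
    return None


def play_target(chapters, position_ms):
    """Timecode for the play-button: beginning of the current chapter."""
    index = current_chapter(chapters, position_ms)
    return 0 if index is None else chapters[index][0]


def previous_target(chapters, position_ms):
    """Timecode for the previous-button: the chapter right before the current one."""
    index = current_chapter(chapters, position_ms)
    if index is None:
        return 0  # no chapter matches: jump to timecode 0
    return chapters[max(index - 1, 0)][0]
```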
This goes to the end of file and contains a digital signature of the file, so that you can verify that the file is coming from where you think it is from - and unmodified.
|0x00||check||ASCII: "SigH" (4 octets)|
|0x04||must||Public key algorithm (0 = RSA) (uint8)|
|0x05||must||Hash algorithm (0 = SHA1-160) (uint8)|
|0x06||must||Size of the public key (octets, uint16)|
|0x08||must||Position of the first octet to sign (normally 160: beginning of the Actual Header) (uint64)|
|0x10||must||Position of the last octet to sign (normally the octet right before this header) (uint64)|
|0x1A||must||Public key (extremely long unsigned integer)|
|0x1A + keysize||must||Crypted hash (extremely long unsigned integer)|
|-0x04 (from end of header)||check||Adler-32 checksum of Signature Header, excluding this field|
Crypted hash size is determined by the total size of the Signature Header element (defined in Actual Header).
We are compatible with PGP/GPG - your keypairs work with this too.
Work in progress: Signature Header is likely to change
Work in progress! Really.
Sometimes files must be split, to fit on CDs, or to meet file size restrictions. It isn't always easy to find a spot with keyframe appearing for all tracks. MultiSegment allows splitting MCFs at any Cluster boundary, without having to recalculate checksums for Clusters.
A MultiSegment set of files should be treated just like one big file. All segments must have exactly same settings (no changing of codecs or anything is allowed).
All Small Elements (except MultiSeg Header) must be stored in the first segment; other segments only contain Main Header, MultiSegment, Clusters and Main Footer.
MultiSeg Header (alpha):
0x00 "MSeg"
0x04 Cur_seg_num (first is 0x01)
0x05 Total_segs (or 0x00 if unknown)
0x06 Number of segs defined here (must be greater or equal to cur_seg_num)
0x07 Reserved
0x10 Filename of segment 1 (64 octets)
0x50 Seg2: First timecode
0x54 Seg2: First synced time (or 0xFFFFFFFF if never gets synced in this segment)
0x58 Seg2: Offset (how many octets all positions inside Clusters Element are off)
0x60 Seg2: Filename (64 octets)
0xA0-0xF0 Seg3 (same structure as Seg2)
0xF0- .. and so on until "number of segs defined here" reached.
-0x04 Checksum of MultiSeg Header (offset relative to end of the header, defined by Element size).
Filenames must contain ".seg3of4.", where 3 and 4 would be decimal numbers for the current segment and the total number of segments. Players should not rely on this information, but use the names defined inside instead (and ask the user what to load in case something cannot be found). If the user can't supply some part, players should still handle the situation by skipping over the missing part. Even the case where the first segment is missing should be handled.
NOTE: Main Header backups inside Clusters might get replaced by this, have to think about it..
Work in progress!
This part is anchored at end of file and is mandatory. If Linear Write is used in the primary copy of Main Header, Main Header copy with all information filled in must be inserted into Main Footer. Main Footer can also contain other Elements; in that case, positions and sizes of those elements are defined in the Main Header copy inside Main Footer.
|-size_of_MF||-||Main Header and optionally other Elements (1024 or more octets)|
|-0x14||must||Main Footer size (uint32)|
|-0x10||-||"MCF ends here ->" (16 octets)|
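Since Main Footer is anchored at the end of the file, a reader can locate it from the last 20 octets. A minimal sketch, assuming little-endian byte order (not stated in this section) and using a made-up function name:

```python
import struct

BYTE_ORDER = "<"  # ASSUMPTION: little-endian; not stated in this section
END_MARK = b"MCF ends here ->"


def locate_main_footer(data: bytes):
    """Return (footer_start, footer_size) for a complete MCF-file image."""
    if len(data) < 0x14 or data[-0x10:] != END_MARK:
        raise ValueError("end marker not found: file truncated or not MCF")
    size, = struct.unpack_from(BYTE_ORDER + "I", data, len(data) - 0x14)
    if size < 0x14 or size > len(data):
        raise ValueError("implausible Main Footer size")
    # The footer starts size_of_MF octets before the end of the file.
    return len(data) - size, size
```

For a Linear Write file, the Main Header copy with all fields filled in would then be read starting at footer_start.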
Previous copy of Main Header may be at zero (if there only is one copy) or somewhere else (with Magic Blocks).
If MultiSegment and Linear Write are both used, Main Footer copy of Main Header
MCF uses checksums for all headers and data to make sure that those are not broken and to allow redownloading only the broken part, or playing other parts, while skipping the broken part. The algorithm used everywhere is Adler-32.
Links to Adler-32 resources:
I don't know if this part clarifies anything, but it's the best I can offer now. To access a low level (say, #3), you obviously need to know the higher levels (#1 and #2) too. Level #3 is surely the most common.
Not actually a level, but some software might be happy without knowing anything about other levels. That would mean software only reading Extended Info or some other field from Main Header.
Where required: search engines, very simple file validators
Each has its own color in our structure diagram
Where required: reordering MCF for better performance, selecting protected areas for MCF-CD
Where required: checksum validation of data
Where required: players, encoding software, simple editors
We could take the easy way and put each Frame into a separate Block (="single per block" framing mode), so that each would have uint32 fields for size and timecode. Unfortunately audio Frames can be very small (ranging from 32 bits of typical PCM to roughly 100-1500 octets of Vorbis etc). This would mean big overhead. Also, always handling data in such small pieces would be inefficient. That's why we often need to put many frames into a single Block.
On player-level, data is handled in Blocks: no knowledge of the underlying framing is required.
Where required: advanced editors, MCF-codecs, wrappers
Exceptions to above rules about levels: