mirror of
https://github.com/codedread/bitjs
synced 2025-10-03 09:39:16 +02:00
# bitjs.archive

This package includes objects for unarchiving binary data in popular archive formats (zip, rar, tar),
providing unzip, unrar and untar capabilities via JavaScript in the browser or various JavaScript
runtimes (Node, Deno, Bun).

A prototype version of a compressor that creates Zip files is also present. The decompression /
compression happens inside a Web Worker, if the runtime supports it (browsers, Deno).

The API is event-based; you will want to subscribe to some of these events:
* 'progress': Periodic updates on the progress (bytes processed).
* 'extract': Sent whenever a single file in the archive has been fully decompressed.
* 'finish': Sent when decompression/compression is complete.

## Decompressing

### Simple Example of unzip

Here is a simple example of unzipping a file. It is assumed the zip file exists as an
[`ArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer),
which you can get via
[`XHR`](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest_API/Sending_and_Receiving_Binary_Data),
from a [`Blob`](https://developer.mozilla.org/en-US/docs/Web/API/Blob/arrayBuffer),
[`Fetch`](https://developer.mozilla.org/en-US/docs/Web/API/Response/arrayBuffer),
[`FileReader`](https://developer.mozilla.org/en-US/docs/Web/API/FileReader/readAsArrayBuffer),
etc.

```javascript
import { Unzipper } from './bitjs/archive/decompress.js';
const unzipper = new Unzipper(zipFileArrayBuffer);
unzipper.addEventListener('extract', (evt) => {
  const {filename, fileData} = evt.unarchivedFile;
  console.log(`unzipped ${filename} (${fileData.byteLength} bytes)`);
  // Do something with fileData...
});
unzipper.addEventListener('finish', () => console.log(`Finished!`));
unzipper.start();
```

`start()` is an async method that returns a `Promise`, which resolves when the unzipping is
complete, so you can `await` it if you need to.

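Because `start()` returns a `Promise`, the event-based API can be wrapped in a single awaitable call. The following is a minimal sketch, not part of bitjs: `extractAll` is a hypothetical helper name, and it assumes only what the example above shows (an `'extract'` event carrying `unarchivedFile`, and an awaitable `start()`). It also assumes every `'extract'` event has been delivered by the time `start()` resolves.

```javascript
// Hypothetical helper (not part of bitjs): collects every extracted file
// and returns them once start() has resolved.
async function extractAll(unarchiver) {
  const files = [];
  unarchiver.addEventListener('extract', (evt) => {
    // evt.unarchivedFile has the {filename, fileData} shape shown above.
    files.push(evt.unarchivedFile);
  });
  // Assumption: all 'extract' events fire before this Promise resolves.
  await unarchiver.start();
  return files;
}
```

With the `Unzipper` from the example above, this could be called as `const files = await extractAll(new Unzipper(zipFileArrayBuffer));`.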
### Progressive unzipping

The unarchivers also support progressive decoding while the file is still streaming in, which helps if you are receiving
the zipped file from a slow source (a Cloud API, for instance). Pass the first `ArrayBuffer` to the
constructor, and send each subsequent `ArrayBuffer` using the `update()` method.

```javascript
import { Unzipper } from './bitjs/archive/decompress.js';
const unzipper = new Unzipper(anArrayBufferWithStartingBytes);
unzipper.addEventListener('extract', () => { /* ... */ });
unzipper.addEventListener('finish', () => { /* ... */ });
unzipper.start();
// ...
// after some time
unzipper.update(anArrayBufferWithMoreBytes);
// ...
// after some more time
unzipper.update(anArrayBufferWithYetMoreBytes);
```

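The constructor-then-`update()` pattern maps naturally onto any source that yields chunks over time. The sketch below is an illustration under stated assumptions, not bitjs code: `unarchiveFromChunks` and `viewToArrayBuffer` are hypothetical helper names, `chunks` is assumed to be an async iterable of `Uint8Array` chunks, and `createUnarchiver` is a caller-supplied factory that builds the unarchiver and attaches its event listeners.

```javascript
// Copies the bytes covered by a typed-array view into a standalone ArrayBuffer.
function viewToArrayBuffer(view) {
  return view.buffer.slice(view.byteOffset, view.byteOffset + view.byteLength);
}

// Hypothetical helper (not part of bitjs). `chunks` is any async iterable of
// Uint8Array chunks. `createUnarchiver` is a caller-supplied factory, e.g.
// (ab) => new Unzipper(ab), that should also attach event listeners.
async function unarchiveFromChunks(chunks, createUnarchiver) {
  let unarchiver = null;
  for await (const chunk of chunks) {
    const bytes = viewToArrayBuffer(chunk);
    if (unarchiver === null) {
      // The first chunk goes to the constructor, then decoding starts.
      unarchiver = createUnarchiver(bytes);
      unarchiver.start();
    } else {
      // Every later chunk is handed over with update().
      unarchiver.update(bytes);
    }
  }
  return unarchiver;
}
```

The factory keeps the helper independent of the archive type, so the same loop would work with any unarchiver that exposes `start()` and `update()`.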
### getUnarchiver()

If you don't want to bother with figuring out if you have a zip, rar, or tar file, you can use the
convenience method `getUnarchiver()`, which sniffs the bytes for you and creates the appropriate
unarchiver.

```javascript
import { getUnarchiver } from './bitjs/archive/decompress.js';
const unarchiver = getUnarchiver(anArrayBuffer);
unarchiver.addEventListener('extract', () => { /* ... */ });
// etc...
unarchiver.start();
```

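To make "sniffs the bytes" concrete, here is a minimal sketch of signature-based format detection. It is illustrative only and is not bitjs's actual detection logic; `sniffArchiveType` and `startsWith` are hypothetical names. The signatures used are the well-known ones for each format: `PK\x03\x04` for a zip local file header, `Rar!\x1a\x07` for rar, and `ustar` at byte offset 257 for POSIX tar.

```javascript
// Returns true if `bytes` contains `signature` starting at `offset`.
function startsWith(bytes, signature, offset = 0) {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((b, i) => bytes[offset + i] === b);
}

// Hypothetical helper (not part of bitjs): guesses the archive type from
// well-known magic numbers, or returns null if none match.
function sniffArchiveType(arrayBuffer) {
  const bytes = new Uint8Array(arrayBuffer);
  // ZIP local file header: "PK\x03\x04".
  if (startsWith(bytes, [0x50, 0x4b, 0x03, 0x04])) return 'zip';
  // RAR marker: "Rar!\x1a\x07".
  if (startsWith(bytes, [0x52, 0x61, 0x72, 0x21, 0x1a, 0x07])) return 'rar';
  // POSIX tar: "ustar" at offset 257 of the first header block.
  if (startsWith(bytes, [0x75, 0x73, 0x74, 0x61, 0x72], 257)) return 'tar';
  return null;
}
```

Older tar variants without the `ustar` magic would not be recognized by this sketch.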
### Non-Browser JavaScript Runtime Examples

The API works in other JavaScript runtimes too (Node, Deno, Bun).

#### NodeJS

```javascript
import * as fs from 'fs';
import { getUnarchiver } from './archive/decompress.js';

const nodeBuf = fs.readFileSync('comic.cbz');
// NOTE: Small files may not have a zero byte offset in Node, so we slice().
// See https://nodejs.org/api/buffer.html#bufbyteoffset.
const ab = nodeBuf.buffer.slice(nodeBuf.byteOffset, nodeBuf.byteOffset + nodeBuf.length);
const unarchiver = getUnarchiver(ab);
unarchiver.addEventListener('progress', () => process.stdout.write('.'));
unarchiver.addEventListener('extract', (evt) => {
  const {filename, fileData} = evt.unarchivedFile;
  console.log(`${filename} (${fileData.byteLength} bytes)`);
});
unarchiver.addEventListener('finish', () => console.log(`Done!`));
unarchiver.start();
```

#### Deno

```typescript
import { UnarchiveExtractEvent } from './archive/events.js';
import { getUnarchiver } from './archive/decompress.js';

// Deno.writeAll() is deprecated (and no longer available in Deno 2);
// Deno.stdout.writeSync() is sufficient for these short strings.
const print = (s: string) => Deno.stdout.writeSync(new TextEncoder().encode(s));

async function go() {
  const arr: Uint8Array = await Deno.readFile('example.zip');
  const unarchiver = getUnarchiver(arr.buffer);
  unarchiver.addEventListener('extract', (evt) => {
    const {filename, fileData} = (evt as UnarchiveExtractEvent).unarchivedFile;
    print(`\n${filename} (${fileData.byteLength} bytes)\n`);
    // Do something with fileData...
  });
  unarchiver.addEventListener('finish', () => { console.log(`Done!`); Deno.exit(); });
  unarchiver.addEventListener('progress', () => print('.'));
  unarchiver.start();
}

await go();
```

## Compressing

For now, the Zipper only supports creating zip files without compression (store only). The interface
is straightforward: there is no event-based / streaming API.

```javascript
import { Zipper } from './bitjs/archive/compress.js';
const zipper = new Zipper();
const now = Date.now();
// Create a zip file with files foo.jpg and bar.txt.
const zippedArrayBuffer = await zipper.start(
  [
    {
      fileName: 'foo.jpg',
      lastModTime: now,
      fileData: fooArrayBuffer,
    },
    {
      fileName: 'bar.txt',
      lastModTime: now,
      fileData: barArrayBuffer,
    }
  ],
  true /* isLastFile */);
```

## Implementation Details

All you generally need to worry about is calling `getUnarchiver()`, listening for events, and then calling `start()`. However, if you are interested in how it works under the covers, read on...

The implementations are written in pure JavaScript and communicate with the host software (the thing that wants to do the unzipping) via a `MessageChannel`. The host and implementation each own a `MessagePort` and pass messages to each other through it. In a web browser, the implementation is invoked as a Web Worker to keep the CPU-intensive work off the main UI thread.

```mermaid
sequenceDiagram
    participant Host Code
    participant Port1
    box Any JavaScript Context (could be a Web Worker)
    participant Port2
    participant unrar.js
    end
    Host Code->>Port1: postMessage(rar bytes)
    Port1-->>Port2: (MessageChannel)
    Port2->>unrar.js: onmessage(rar bytes)
    Note right of unrar.js: unrar the thing

    unrar.js->>Port2: postMessage(an extracted file)
    Port2-->>Port1: (MessageChannel)
    Port1->>Host Code: onmessage(an extracted file)

    unrar.js->>Port2: postMessage(2nd extracted file)
    Port2-->>Port1: (MessageChannel)
    Port1->>Host Code: onmessage(2nd extracted file)
```
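
The message flow in the diagram can be sketched with a plain `MessageChannel`, with both ends living in the same JavaScript context. This is an illustration only: the message shapes (`{type: 'extract', ...}`, `{type: 'finish'}`) and the name `runChannelDemo` are assumptions made for this sketch, not bitjs's actual wire protocol, and the "implementation" side is a stand-in that splits the input into two pieces instead of decoding a real archive.

```javascript
// Illustrative only: the message shapes are assumptions made for this sketch,
// not bitjs's actual protocol.
function runChannelDemo(archiveBytes) {
  return new Promise((resolve) => {
    const { port1, port2 } = new MessageChannel();
    const extracted = [];

    // Implementation side: owns port2. A real implementation would decode
    // the archive here; this stand-in splits the input into two "files".
    port2.onmessage = (evt) => {
      const bytes = new Uint8Array(evt.data.bytes);
      const mid = Math.floor(bytes.length / 2);
      port2.postMessage({ type: 'extract', filename: 'first.bin', fileData: bytes.slice(0, mid) });
      port2.postMessage({ type: 'extract', filename: 'second.bin', fileData: bytes.slice(mid) });
      port2.postMessage({ type: 'finish' });
    };

    // Host side: owns port1 and receives each extracted file as a message.
    port1.onmessage = (evt) => {
      if (evt.data.type === 'extract') {
        extracted.push(evt.data);
      } else if (evt.data.type === 'finish') {
        port1.close(); // Close the channel once the work is done.
        resolve(extracted);
      }
    };

    // Kick things off by sending the archive bytes to the implementation.
    port1.postMessage({ bytes: archiveBytes });
  });
}
```

Because the two sides only ever talk through their ports, the implementation side could be moved into a Web Worker without changing the host-side logic.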