diff --git a/README.md b/README.md index e00e7af..1b98922 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,9 @@ ## Introduction -A set of dependency-free JavaScript modules to handle binary data in JS (using Typed Arrays). Includes: +A set of dependency-free JavaScript modules to handle binary data in JS (using +[Typed Arrays](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray)). +Includes: * bitjs/archive: Unarchiving files (unzip, unrar, untar) in JavaScript, implemented as Web Workers where supported, and allowing progressive @@ -18,7 +20,7 @@ A set of dependency-free JavaScript modules to handle binary data in JS (using T ## Installation -Install it using your favourite package manager, the package is registered under `@codedread/bitjs`. +Install it using your favourite package manager, the package is registered under `@codedread/bitjs`. ```bash npm install @codedread/bitjs ``` @@ -29,10 +31,11 @@ yarn add @codedread/bitjs ### Using in Node -This module is an ES Module, which should work as expected in other projects using ES Modules. However, -if you are using a project that uses CommonJs modules, it's a little tricker to use. One valid example -of this is if a TypeScript project compiles to CommonJS, it will try to turn imports into require() -statements, which will break. The fix for this (unfortunately) is to update your tsconfig.json: +This module is an ES Module, which should work as expected in other projects using ES Modules. +However, if you are using a project that uses CommonJS modules, it's a little trickier to use. One +example of this is if a TypeScript project compiles to CommonJS, it will try to turn imports into +require() statements, which will break.
The fix for this (unfortunately) is to update your +tsconfig.json: ```json "moduleResolution": "Node16", @@ -48,93 +51,24 @@ const { getFullMIMEString } = await import('@codedread/bitjs'); ### bitjs.archive -This package includes objects for unarchiving binary data in popular archive formats (zip, rar, tar) providing unzip, unrar and untar capabilities via JavaScript in the browser. A prototype version of a compressor that creates Zip files is also present. The decompression/compression actually happens inside a Web Worker, when the runtime supports it (browsers, deno). +This package includes objects for unarchiving binary data in popular archive formats (zip, rar, tar). +Here is a simple example of unrar: #### Decompressing ```javascript -import { Unzipper } from './bitjs/archive/decompress.js'; -const unzipper = new Unzipper(zipFileArrayBuffer); -unzipper.addEventListener('progress', updateProgress); -unzipper.addEventListener('extract', receiveOneFile); -unzipper.addEventListener('finish', displayZipContents); -unzipper.start(); - -function updateProgress(e) { - // e.currentFilename is the file currently being unarchived/scanned. - // e.totalCompressedBytesRead has how many bytes have been unzipped so far -} - -function receiveOneFile(e) { - // e.unarchivedFile.filename: string - // e.unarchivedFile.fileData: Uint8Array -} - -function displayZipContents() { - // Now sort your received files and show them or whatever... -} +import { Unrarrer } from './bitjs/archive/decompress.js'; +const unrar = new Unrarrer(rarFileArrayBuffer); +unrar.addEventListener('extract', (e) => { + const {filename, fileData} = e.unarchivedFile; + console.log(`Extracted ${filename} (${fileData.byteLength} bytes)`); + // Do something with fileData... 
+}); +unrar.addEventListener('finish', () => console.log('Done')); +unrar.start(); ``` -The unarchivers also support progressively decoding while streaming the file, if you are receiving the zipped file from a slow place (a Cloud API, for instance). For example: - -```javascript -import { Unzipper } from './bitjs/archive/decompress.js'; -const unzipper = new Unzipper(anArrayBufferWithStartingBytes); -unzipper.addEventListener('progress', updateProgress); -unzipper.addEventListener('extract', receiveOneFile); -unzipper.addEventListener('finish', displayZipContents); -unzipper.start(); -... -// after some time -unzipper.update(anArrayBufferWithMoreBytes); -... -// after some more time -unzipper.update(anArrayBufferWithYetMoreBytes); -``` - -##### A NodeJS Example - -```javascript - import * as fs from 'fs'; - import { getUnarchiver } from './archive/decompress.js'; - - const nodeBuffer = fs.readFileSync('comic.cbz'); - const ab = nodeBuffer.buffer.slice(nodeBuffer.byteOffset, nodeBuffer.byteOffset + nodeBuffer.length); - const unarchiver = getUnarchiver(ab); - unarchiver.addEventListener('progress', () => process.stdout.write('.')); - unarchiver.addEventListener('extract', (evt) => { - const extractedFile = evt.unarchivedFile; - console.log(`${extractedFile.filename} (${extractedFile.fileData.byteLength} bytes)`); - }); - unarchiver.addEventListener('finish', () => console.log(`Done!`)); - unarchiver.start(); -``` - -#### Compressing - -The Zipper only supports creating zip files without compression (store only) for now. The interface -is pretty straightforward and there is no event-based / streaming API. - -```javascript -import { Zipper } from './bitjs/archive/compress.js'; -const zipper = new Zipper(); -const now = Date.now(); -// Zip files foo.jpg and bar.txt. 
-const zippedArrayBuffer = await zipper.start( - [ - { - fileName: 'foo.jpg', - lastModTime: now, - fileData: fooArrayBuffer, - }, - { - fileName: 'bar.txt', - lastModTime: now, - fileData: barArrayBuffer, - } - ], - true /* isLastFile */); -``` +More explanation and examples are located on [the API page](./docs/bitjs.archive.md). ### bitjs.codecs @@ -164,7 +98,8 @@ exec(cmd, (error, stdout) => { ### bitjs.file -This package includes code for dealing with files. It includes a sniffer which detects the type of file, given an ArrayBuffer. +This package includes code for dealing with files. It includes a sniffer which detects the type of +file, given an ArrayBuffer. ```javascript import { findMimeType } from './bitjs/file/sniffer.js'; @@ -173,7 +108,8 @@ const mimeType = findMimeType(someArrayBuffer); ### bitjs.image -This package includes code for dealing with binary images. It includes a module for converting WebP images into alternative raster graphics formats (PNG/JPG). +This package includes code for dealing with binary images. It includes a module for converting WebP +images into alternative raster graphics formats (PNG/JPG). ```javascript import { convertWebPtoPNG, convertWebPtoJPG } from './bitjs/image/webp-shim/webp-shim.js'; @@ -188,7 +124,8 @@ convertWebPtoPNG(webpBuffer).then(pngBuf => { ### bitjs.io -This package includes stream objects for reading and writing binary data at the bit and byte level: BitStream, ByteStream. +This package includes stream objects for reading and writing binary data at the bit and byte level: +BitStream, ByteStream. ```javascript import { BitStream } from './bitjs/io/bitstream.js'; @@ -199,8 +136,13 @@ const flagbits = bstream.peekBits(6); // look ahead at next 6 bits, but do not a ## Reference -* [UnRar](http://codedread.github.io/bitjs/docs/unrar.html): A work-in-progress description of the RAR file format. +* [UnRar](http://codedread.github.io/bitjs/docs/unrar.html): A work-in-progress description of the +RAR file format. 
## History -This project grew out of another project of mine, [kthoom](https://github.com/codedread/kthoom) (a comic book reader implemented in the browser). This repository was automatically exported from [my original repository on GoogleCode](https://code.google.com/p/bitjs) and has undergone considerable changes and improvements since then, including adding streaming support, starter RarVM support, tests, many bug fixes, and updating the code to ES6. +This project grew out of another project of mine, [kthoom](https://github.com/codedread/kthoom) (a +comic book reader implemented in the browser). This repository was automatically exported from +[my original repository on GoogleCode](https://code.google.com/p/bitjs) and has undergone +considerable changes and improvements since then, including adding streaming support, starter RarVM +support, tests, many bug fixes, and updating the code to modern JavaScript and supported features. diff --git a/archive/events.js b/archive/events.js index 4f46b32..1ae6bad 100644 --- a/archive/events.js +++ b/archive/events.js @@ -107,6 +107,7 @@ export class UnarchiveFinishEvent extends UnarchiveEvent { } } +// TODO(bitjs): Fully document these. They are confusing. /** * Progress event. */ diff --git a/docs/bitjs.archive.md b/docs/bitjs.archive.md new file mode 100644 index 0000000..83e88f4 --- /dev/null +++ b/docs/bitjs.archive.md @@ -0,0 +1,149 @@ +# bitjs.archive + +This package includes objects for unarchiving binary data in popular archive formats (zip, rar, tar), +providing unzip, unrar and untar capabilities via JavaScript in the browser or various JavaScript +runtimes (node, deno, bun). + +A prototype version of a compressor that creates Zip files is also present. The decompression / +compression happens inside a Web Worker, if the runtime supports it (browsers, deno). + +The API is event-based, so you will want to subscribe to some of these events: + * 'progress': Periodic updates on the progress (bytes processed).
+ * 'extract': Sent whenever a single file in the archive has been fully decompressed. + * 'finish': Sent when decompression/compression is complete. + +## Decompressing + +### Simple Example of unzip + +Here is a simple example of unzipping a file. It is assumed the zip file exists as an +[`ArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer), +which you can get via +[`XHR`](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest_API/Sending_and_Receiving_Binary_Data), +from a [`Blob`](https://developer.mozilla.org/en-US/docs/Web/API/Blob/arrayBuffer), +[`Fetch`](https://developer.mozilla.org/en-US/docs/Web/API/Response/arrayBuffer), +[`FileReader`](https://developer.mozilla.org/en-US/docs/Web/API/FileReader/readAsArrayBuffer), +etc. + +```javascript + import { Unzipper } from './bitjs/archive/decompress.js'; + const unzipper = new Unzipper(zipFileArrayBuffer); + unzipper.addEventListener('extract', (evt) => { + const {filename, fileData} = evt.unarchivedFile; + console.log(`unzipped ${filename} (${fileData.byteLength} bytes)`); + // Do something with fileData... + }); + unzipper.addEventListener('finish', () => console.log(`Finished!`)); + unzipper.start(); +``` + +`start()` is an async method that returns a `Promise` which resolves when the unzipping is complete, +so you can `await` it if you need to. + +### Progressive unzipping + +The unarchivers also support progressively decoding while streaming the file, if you are receiving +the zipped file from a slow place (a Cloud API, for instance). Send the first `ArrayBuffer` in the +constructor, and send subsequent `ArrayBuffer`s using the `update()` method. + +```javascript + import { Unzipper } from './bitjs/archive/decompress.js'; + const unzipper = new Unzipper(anArrayBufferWithStartingBytes); + unzipper.addEventListener('extract', () => {...}); + unzipper.addEventListener('finish', () => {...}); + unzipper.start(); + ...
+ // after some time + unzipper.update(anArrayBufferWithMoreBytes); + ... + // after some more time + unzipper.update(anArrayBufferWithYetMoreBytes); +``` + +### getUnarchiver() + +If you don't want to bother with figuring out if you have a zip, rar, or tar file, you can use the +convenience method `getUnarchiver()`, which sniffs the bytes for you and creates the appropriate +unarchiver. + +```javascript + import { getUnarchiver } from './bitjs/archive/decompress.js'; + const unarchiver = getUnarchiver(anArrayBuffer); + unarchiver.addEventListener('extract', () => {...}); + // etc... + unarchiver.start(); +``` + +### Non-Browser JavaScript Runtime Examples + +The API works in other JavaScript runtimes too (Node, Deno, Bun). + +#### NodeJS + +```javascript + import * as fs from 'fs'; + import { getUnarchiver } from './archive/decompress.js'; + + const nodeBuf = fs.readFileSync('comic.cbz'); + // NOTE: Small files may not have a zero byte offset in Node, so we slice(). + // See https://nodejs.org/api/buffer.html#bufbyteoffset.
+ const ab = nodeBuf.buffer.slice(nodeBuf.byteOffset, nodeBuf.byteOffset + nodeBuf.length); + const unarchiver = getUnarchiver(ab); + unarchiver.addEventListener('progress', () => process.stdout.write('.')); + unarchiver.addEventListener('extract', (evt) => { + const {filename, fileData} = evt.unarchivedFile; + console.log(`${filename} (${fileData.byteLength} bytes)`); + }); + unarchiver.addEventListener('finish', () => console.log(`Done!`)); + unarchiver.start(); +``` + +#### Deno + +```typescript + import { UnarchiveExtractEvent } from './archive/events.js'; + import { getUnarchiver} from './archive/decompress.js'; + + const print = (s: string) => Deno.writeAll(Deno.stdout, new TextEncoder().encode(s)); + + async function go() { + const arr: Uint8Array = await Deno.readFile('example.zip'); + const unarchiver = getUnarchiver(arr.buffer); + unarchiver.addEventListener('extract', (evt) => { + const {filename, fileData} = (evt as UnarchiveExtractEvent).unarchivedFile; + print(`\n${filename} (${fileData.byteLength} bytes)\n`); + // Do something with fileData... + }); + unarchiver.addEventListener('finish', () => { console.log(`Done!`); Deno.exit(); }); + unarchiver.addEventListener('progress', (evt) => print('.')); + unarchiver.start(); + } + + await go(); +``` + +## Compressing + +The Zipper only supports creating zip files without compression (store only) for now. The interface +is pretty straightforward and there is no event-based / streaming API. + +```javascript + import { Zipper } from './bitjs/archive/compress.js'; + const zipper = new Zipper(); + const now = Date.now(); + // Create a zip file with files foo.jpg and bar.txt. 
+ const zippedArrayBuffer = await zipper.start( + [ + { + fileName: 'foo.jpg', + lastModTime: now, + fileData: fooArrayBuffer, + }, + { + fileName: 'bar.txt', + lastModTime: now, + fileData: barArrayBuffer, + } + ], + true /* isLastFile */); +``` diff --git a/tests/unzipper-test.js b/tests/unzipper-test.js index 746bd07..f6983e1 100644 --- a/tests/unzipper-test.js +++ b/tests/unzipper-test.js @@ -26,7 +26,7 @@ async function getFiles(fileChangeEvt) { for (const b of buffers) { await new Promise((resolve, reject) => { - const unzipper = new Unzipper(b.buffer, { pathToBitJS: '../' }); + const unzipper = new Unzipper(b.buffer); unzipper.addEventListener(UnarchiveEventType.FINISH, () => { fileNum++; resolve(); diff --git a/tests/zipper-test.js b/tests/zipper-test.js index e0437ff..6c6f4f6 100644 --- a/tests/zipper-test.js +++ b/tests/zipper-test.js @@ -37,7 +37,6 @@ async function getFiles(fileChangeEvt) { result.innerHTML = `Loaded files`; const zipper = new Zipper({ - pathToBitJS: '../', zipCompressionMethod: ZipCompressionMethod.DEFLATE, }); byteArray = await zipper.start(fileInfos, true);
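Note on the NodeJS example added in docs/bitjs.archive.md: its comment says a small file may not start at byte offset zero, which is why the example calls `slice()`. The snippet below is a minimal sketch of that conversion on its own, using only Node's built-in `Buffer` and `ArrayBuffer`. The helper name `toArrayBuffer` is hypothetical and is not part of bitjs. It assumes the point of the slice is to hand the unarchiver only the bytes that belong to the file, not the whole backing memory.

```javascript
// Hypothetical helper (not part of bitjs): copy exactly the bytes that a Node
// Buffer views into a standalone ArrayBuffer.
function toArrayBuffer(nodeBuf) {
  // nodeBuf.buffer is the backing ArrayBuffer, which may be larger than the
  // Buffer itself and may be shared with other Buffers.
  return nodeBuf.buffer.slice(nodeBuf.byteOffset, nodeBuf.byteOffset + nodeBuf.byteLength);
}

// Small Buffers are often carved out of a shared internal pool, so their
// backing ArrayBuffer can be much bigger than the data they hold.
const small = Buffer.from('PK\x03\x04'); // the 4-byte zip local file header signature
const ab = toArrayBuffer(small);

console.log(`backing store: ${small.buffer.byteLength} bytes, offset ${small.byteOffset}`);
console.log(`sliced copy:   ${ab.byteLength} bytes`);
```

Passing `nodeBuf.buffer` directly would hand over the whole backing store, which for a pooled Buffer can include unrelated bytes before and after the file data. The slice produces a copy whose length equals the Buffer's length.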