Write a .zip #1442

Open
swankjesse opened this issue Feb 23, 2024 · 11 comments

Comments

@swankjesse
Collaborator

We should design an API to create a .zip file.

See also:
#1408

@swankjesse
Collaborator Author

Maybe something like this?

fun BufferedSink.writeZip(
  sourceFileSystem: FileSystem,
  baseDirectory: Path,
)

You’d create a real or fake FileSystem, populate a directory with content, then create a .zip from that content.

One drawback of this API is it’s awkward to create entries from a stream, like an HTTP response.

@swankjesse
Collaborator Author

Another option:

fun BufferedSink.writeZip(
  writeContents: FileSystem.() -> Unit,
)

@swankjesse
Collaborator Author

swankjesse commented Feb 24, 2024

A couple more considerations:

  • A new ZIP-writing API should allow the caller to supply timestamps. These could come from the originating file, or from the clock, or they could be constant 0 values. What’s the convention for zeroing out timestamps in zips? We should do that.

  • A new ZIP-writing API should allow the caller to configure either COMPRESSION_METHOD_DEFLATED or COMPRESSION_METHOD_STORED for each file.

  • For directory entries, we could always include them, always exclude them, let the user choose, or let the user choose on a case-by-case basis.
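
The per-file options in the list above can be illustrated on the JVM with java.util.zip. This is only a sketch of the behaviour being asked for, not the proposed Okio API: the helper name putFile and its parameters are hypothetical, and it is JVM-only. Note that when no time is set, the JDK's ZipOutputStream fills in the current clock time itself, which is the kind of implicit data this issue wants callers to control.

```kotlin
import java.io.ByteArrayOutputStream
import java.util.zip.CRC32
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream

// Hypothetical helper (not part of Okio): writes one entry, letting the
// caller pick the compression method and, optionally, the timestamp.
fun ZipOutputStream.putFile(
  name: String,
  content: ByteArray,
  compress: Boolean = true,
  lastModifiedAtMillis: Long? = null,
) {
  val entry = ZipEntry(name)
  if (lastModifiedAtMillis != null) entry.time = lastModifiedAtMillis
  if (compress) {
    entry.method = ZipEntry.DEFLATED
  } else {
    // STORED entries must declare their size and CRC-32 before the data is written.
    entry.method = ZipEntry.STORED
    entry.size = content.size.toLong()
    entry.compressedSize = content.size.toLong()
    entry.crc = CRC32().apply { update(content) }.value
  }
  putNextEntry(entry)
  write(content)
  closeEntry()
}

// Builds a two-entry archive in memory: one deflated file, one stored file.
fun buildSampleZip(): ByteArray {
  val bytes = ByteArrayOutputStream()
  ZipOutputStream(bytes).use { zip ->
    zip.putFile("hello.txt", "Hello World".toByteArray())
    zip.putFile("directory/child.txt", "Another file!".toByteArray(), compress = false)
  }
  return bytes.toByteArray()
}
```

No directory entry is written for directory/ here; readers still recover the hierarchy from the entry names.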

I suspect these requirements are deal-breakers for the APIs that use a FileSystem as the input or builder.

Here’s another API proposal. It ends up looking a lot like Moshi’s JsonUtf8Writer in name & usage.

class ZipWriter(sink: BufferedSink) : Closeable {
  inline fun <T> file(
    file: Path,
    compress: Boolean = true,
    lastModifiedAtMillis: Long? = null,
    lastAccessedAtMillis: Long? = null,
    createdAtMillis: Long? = null,
    writerAction: BufferedSink.() -> T,
  ): T

  fun directory(
    dir: Path,
    lastModifiedAtMillis: Long? = null,
    lastAccessedAtMillis: Long? = null,
    createdAtMillis: Long? = null,
  )
}

inline fun <T> BufferedSink.writeZip(writerAction: ZipWriter.() -> T): T

And a usage example of the above:

FileSystem.SYSTEM.write("greetings.zip".toPath()) {
  writeZip {
    file("hello.txt".toPath()) {
      writeUtf8("Hello World")
    }

    directory("directory".toPath())

    directory("directory/subdirectory".toPath())

    file(
      file = "directory/subdirectory/child.txt".toPath(),
      compress = false,
      lastModifiedAtMillis = Clock.System.now().toEpochMilliseconds(),
    ) {
      writeUtf8("Another file!")
    }
  }
}

@swankjesse
Collaborator Author

I think I’d canonicalize input paths by stripping a leading / if present. I think that’s more user-friendly than either crashing or creating a .zip file that includes an absolute path.
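
A minimal sketch of that canonicalization, working on the entry name as a plain string. The function name canonicalizeEntryName is hypothetical, and whether further rules apply (repeated slashes, .. segments) is not decided in this thread.

```kotlin
// Hypothetical helper: strips a single leading '/' so the archive never
// records an absolute path. Relative names pass through unchanged.
fun canonicalizeEntryName(path: String): String = path.removePrefix("/")
```

With this, "/directory/child.txt" and "directory/child.txt" both produce the same entry name.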

@swankjesse
Collaborator Author

I think I’d default timestamps to null/absent/0 rather than grabbing the host machine’s time and jamming that in there. Too many tools that produce .zip archives end up with non-deterministic outcomes because their libraries inserted data in the output that the author never really asked for.
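
For context on what "0" can mean here: ZIP headers store the modification time as two 16-bit MS-DOS fields, and that format cannot represent dates before 1980. The sketch below shows the packing; the function names are hypothetical, and which constant a ZipWriter should emit when no timestamp is supplied is an open question in this thread, not something the sketch settles.

```kotlin
// Packs a calendar date into the 16-bit MS-DOS date field:
// bits 15-9 = years since 1980, bits 8-5 = month (1-12), bits 4-0 = day (1-31).
fun dosDate(year: Int, month: Int, day: Int): Int {
  require(year in 1980..2107) { "year out of DOS range: $year" }
  return ((year - 1980) shl 9) or (month shl 5) or day
}

// Packs a time of day into the 16-bit MS-DOS time field:
// bits 15-11 = hour, bits 10-5 = minute, bits 4-0 = seconds / 2.
fun dosTime(hour: Int, minute: Int, second: Int): Int =
  (hour shl 11) or (minute shl 5) or (second / 2)
```

The earliest representable instant, 1980-01-01 00:00:00, packs to date 0x0021 and time 0x0000. All-zero date bits would mean month 0 and day 0, which is not a valid calendar date.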

@swankjesse
Collaborator Author

I think I’d produce .zip files that don’t include directory entries at all by default. I’d only add ’em if the user explicitly asked for them. This creates an escape hatch for developers that want empty directories in their .zip files, without creating a bunch of redundant data otherwise.

@swankjesse
Collaborator Author

I think I’d stream output to a BufferedSink, which should make it straightforward to create .zip files on-demand in web services or clients.

@vanniktech
Contributor

The API in #1442 (comment) looks really good and would suit most of my needs. I have a bunch of apps that let you export your data. Every table in my SQLite database gets a corresponding JSON file into which I dump all of its rows. Media files such as videos and images are stored so that they preserve their relative path from Context.filesDir, so for instance the zip file would contain an attachments/image_1664623103090.jpg file. It would be really amazing if ZipWriter could also stream files into the zip via a Source, maybe something like this:

class ZipWriter(sink: BufferedSink) : Closeable {
  fun copy(
    source: Source,
    compress: Boolean = true,
  )
}

Or would this just be achievable by something like this?

file("attachments/image_1664623103090.jpg".toPath()) {
  writeAll(fileSystem.source("attachments/image_1664623103090.jpg".toPath()))
}

@swankjesse
Collaborator Author

swankjesse commented Mar 28, 2024

@vanniktech We could include all kinds of helpers, possibly as extensions.

fun ZipWriter.copy(
  file: Path,
  compress: Boolean = true,
  lastModifiedAtMillis: Long? = null,
  lastAccessedAtMillis: Long? = null,
  createdAtMillis: Long? = null,
  openSource: () -> Source,
)

copy("attachments/image_1664623103090.jpg".toPath()) {
  fileSystem.source("attachments/image_1664623103090.jpg".toPath())
}

@mipastgt

Is there already some functionality to create a simple ZIP file of a directory or any ZIP file at all for native targets (iOS in my case)?

@swankjesse
Collaborator Author

@mipastgt not yet!

3 participants