Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(server): add an endpoint for retrieving a list of files #94

Merged
merged 49 commits into from
Aug 7, 2023

Conversation

ajbdev
Copy link
Contributor

@ajbdev ajbdev commented Jul 13, 2023

Description

Hi there! I forked rustypaste to add a feature I need for a frontend I am building utilizing rustypaste as a backend. It occurred to me that it may be beneficial for the upstream project, although I am not sure if the feature fits with the philosophy or roadmap of the project. If it doesn't feel free to decline this PR and I'll understand completely.

Please note: I am a total Rust beginner and may have implemented this completely wrong. Feel free to critique accordingly.

Motivation and Context

I'm using rustypaste as a lightweight fileshare backend on my local network and I need a way to list all files for a frontend UI component I am building. This is also a common feature of other pastebin services and I think it may have many potential use cases beyond my own.

I only listed files in the upload dir and did not list urls, one shot files, or one shot urls. For my use case this is all I need but I could change this if requested.

How Has This Been Tested?

I've tested it locally on my machine and written a test that covers functionality.

Changelog Entry

Added

  • Add a function that lists all files in the upload directory and transforms them into a JSON HTTP response of objects representing the files in the directory.
  • Add new variable to config.toml to expose the json list endpoint. By default, it is not enabled.

Changed

  • Modified the behavior of the GET /{file} route to conditionally execute and return the transformation function if the JSON index path configuration value matches the name of the requested file.

Types of Changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (no code change)
  • Refactor (refactoring production code)
  • Other

Checklist:

  • [?] My code follows the code style of this project.
  • I have updated the documentation accordingly.
  • I have formatted the code with rustfmt.
  • I checked the lints with clippy.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@ajbdev ajbdev requested a review from orhun as a code owner July 13, 2023 04:42
Cargo.toml Outdated Show resolved Hide resolved
@tessus
Copy link
Collaborator

tessus commented Jul 13, 2023

This looks great. I was thinking about implementing something similar. One suggestion though.

I think this should be guarded with an auth token. That is, if one is set. You can check how it was done for the /version and /upload endpoint.

Update: I also think that the expiration should be done as a point in time and not a duration. The closer to ISO 8601 (or RFC 3339), the better. But anything like YYYY-MM-DD HH:mm will do, since there is no ambiguity. A relative time value is only useful in self updating and responsive UIs, but even then I dislike relative time measurements with the heat of a thousand suns. But that's of course subjective.
However, in this case, especially if you save that output for later processing a duration is rather useless.

@ajbdev
Copy link
Contributor Author

ajbdev commented Jul 13, 2023

I'll implement the auth guard and make the change to the expiration variable (I'm not sure why I implemented it that way and agree that the expiration date definitely makes more sense!). Thanks for the feedback, let me know if there is anything else!

@tessus
Copy link
Collaborator

tessus commented Jul 13, 2023

I am just a user like yourself. I really only made suggestions, even though I think they made sense.
orhun is the owner of this project, so he is the one who makes all the decisions. ;-)

@tessus
Copy link
Collaborator

tessus commented Jul 14, 2023

Oh, I have just noticed that you are using the / endpoint for retrieving the list.

I am just playing the devil's advocate here. What would happen, if someone chose version as the json_index_path? Or any other endpoint that exists? I am not sure how actix handles that.

Maybe it would be better to use a hardcoded separate endpoint. Otherwise you migth have to check for collisions (make sure that no existing endpoints) are used.

Maybe orhun has some input on this as well.

@orhun orhun changed the title Server endpoint for retrieving a list of files as JSON feat(server): add an endpoint for retrieving a list of files as JSON Jul 15, 2023
README.md Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Jul 15, 2023

Codecov Report

Patch coverage: 75.75% and project coverage change: +1.91% 🎉

Comparison is base (33e038c) 66.92% compared to head (f7f477a) 68.84%.
Report is 7 commits behind head on master.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #94      +/-   ##
==========================================
+ Coverage   66.92%   68.84%   +1.91%     
==========================================
  Files          11       11              
  Lines         520      552      +32     
==========================================
+ Hits          348      380      +32     
  Misses        172      172              
Flag Coverage Δ
unit-tests 68.84% <75.75%> (+1.91%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
src/config.rs 66.66% <ø> (+16.66%) ⬆️
src/server.rs 80.61% <75.75%> (+1.47%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@orhun
Copy link
Owner

orhun commented Jul 15, 2023

Hello, thanks for submitting this PR, it is definitely a nice feature to have!

What would happen, if someone chose version as the json_index_path?

It is a valid concern. Having a hardcoded endpoint would be better in this case I think. But let us know if you made any tests about collisions etc. and if it is fine to have something customizable.

@tessus
Copy link
Collaborator

tessus commented Jul 15, 2023

@ajbdev if you go for a hardcoded endpoint, something simple and short like /list should do. ;-)

You can then later extend it for types (url, oneshot, ..), if you ever need that functionality. e.g.: /list/<type>

Btw, when you are done with your frontend, I'd be interested in checking it out in case you are going to make it public on github.

@ajbdev
Copy link
Contributor Author

ajbdev commented Jul 15, 2023

This all sounds good - I will change to the hardcoded endpoint! Yes, I'll be sharing my frontend publicly, although it won't be a very traditional pastebin UI most likely.

@tessus
Copy link
Collaborator

tessus commented Jul 15, 2023

although it won't be a very traditional pastebin UI most likely.

For me this is perfectly fine. I am just interested in what kind of frontend you are cooking up. I mostly use the command line, but now and then a UI (text or graphics based) can be useful.
I am also interested in this, because I really suck in anything frontend-like. Haha.

@ajbdev ajbdev force-pushed the json-index-path branch from bc044ce to 46a5ccf Compare July 21, 2023 05:30
@ajbdev
Copy link
Contributor Author

ajbdev commented Jul 21, 2023

I had some time to make these changes and add test coverage. Let me know if I missed anything or you see anything else to change.

@tessus
Copy link
Collaborator

tessus commented Jul 30, 2023

orhun squashes them when he merges the PR. I haven't seen any indication that during the lifetime of a PR he wanted to squash and force-push, but I might be wrong.

@tessus
Copy link
Collaborator

tessus commented Jul 30, 2023

I just noticed that the filename also includes the timestamp:

[
  {
    "filename": "foo.rjOq.bar.txt.1690750967938",
    "filesize": 12,
    "expires_at": "2023-07-30 21:02:47"
  }
]

Is this on purpose?

When you create a link BASE_URL + filename, the result (in a browser) is always that the file is downloaded, because the mimetype is not recognized with this (timestamp) extension.

For me this is not an issue, but might not be as intended. Maybe the timestamp should be moved to its own separate field. But I really do not know, and even if I were the person to decide, I still couldn't make the decision alone, because I am not sure what the best solution is. There are a few options and all of them have their pros/cons:

  • keep it as is
  • remove the timestamp from the filename and do nothing else
  • move the timestamp from the filename to a separate field

What do you guys think @orhun and @ajbdev?

Edit: One more thing. rustypaste uses UTC time for everything (logs and for timestamps). I think this should be noted in the README or denoted in the json output. Either in the field (expires_at_utc) or in the value (2023-07-30 21:02:47 +0000).
The easiest and most performant is to just add it to the key. The nicest is to add +0000 to the timestamp. But personally I can live with it the way it is currently.
Please note that rustypaste should not handle timezone info at all (other than UTC), otherwise chrono will become necessary.

@ajbdev
Copy link
Contributor Author

ajbdev commented Jul 31, 2023

I think we should just chop it off. The expiration is already a separate field. Adding tz to expiration key seems like a good idea, too.

@orhun
Copy link
Owner

orhun commented Jul 31, 2023

I think we should just chop it off. The expiration is already a separate field.

Yup, let's do that.

The easiest and most performant is to just add it to the key.

Agreed!

src/server.rs Outdated Show resolved Hide resolved
@orhun orhun self-requested a review July 31, 2023 10:16
src/server.rs Show resolved Hide resolved
@ajbdev
Copy link
Contributor Author

ajbdev commented Aug 1, 2023

Hey guys, I understand this PR may need further changes but for the purposes that I need its more than adequate. Unfortunately (with a newborn to take care of) I can only spend so much time tending to hobby projects, so if you'd like to merge the PR someone else will have to implement further changes. As much as I'd love to know how to be a competent rustacean sometimes life gets in the way :) I hope you all understand.

@tessus
Copy link
Collaborator

tessus commented Aug 1, 2023

Hey ajbdev, first of all - congrats. A newborn is certainly a handful.

I am more than happy to finish this up, but I can't push to your branch, so we might have to open a new PR. Or you can try to give me write access to your branch.
On the other side it seems to me that it's almost ready thus orhun might be able to do the last few changes.

I think the only thing that is still open is this #94 (comment) But I have already replied and let's see what orhun says.

Edit: thanks for the access. Just tested a push and it works!

@orhun
Copy link
Owner

orhun commented Aug 2, 2023

Congrats @ajbdev! Thank you for your efforts so far and definitely no worries, we will be finishing this up with @tessus for sure!

@tessus
Copy link
Collaborator

tessus commented Aug 2, 2023

The easiest and most performant is to just add it to the key.

Agreed!

I also thought about maybe adding one more method to the crate as_string_utc(). Because in reality with this kind of primitive transformation without any TZ awareness there are only 2 options to present a date/time to the user:

  • local (whatever that might be) without TZ info
  • UTC (also showing that it is in fact UTC)

No harm in adding the second one of these 2 options to the crate.

P.S.: I have added the method to the crate, but not used it in this PR. For this PR I just changed the key to expires_at_utc

@tessus
Copy link
Collaborator

tessus commented Aug 5, 2023

@orhun I think this should be ready. please also have a look at #94 (comment)

@orhun
Copy link
Owner

orhun commented Aug 7, 2023

I cleaned up the implementation in b9e15dc (it might be worth checking out)

Copy link
Owner

@orhun orhun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you both @ajbdev @tessus for your efforts 💖

@orhun orhun changed the title feat(server): add an endpoint for retrieving a list of files as JSON feat(server): add an endpoint for retrieving a list of files Aug 7, 2023
@orhun orhun merged commit d0a6775 into orhun:master Aug 7, 2023
@orhun
Copy link
Owner

orhun commented Aug 7, 2023

P.S. Would you be interested in adding this feature to rustypaste-cli as well?

@tessus
Copy link
Collaborator

tessus commented Aug 7, 2023

I cleaned up the implementation in b9e15dc (it might be worth checking out)

Nice refactor indeed.

Would you be interested in adding this feature to rustypaste-cli as well?

Yes, I was thinking about it, but was hung up on how to present the data. When I thought about the formatting, a few options came to mind. The best would probably be a table, but I'd have to iterate through the json to find the longest filename and the largest filesize to determine the width of the columns. This also means iterating through the JSON twice.
Or the first column could be the expiry timestamp, the second column the file size, and the third column the file name. Then the column widths could be static.

Then how about the file size? Bytes or units (kb, mb, gb, tb, ...). e.g. I use the term MB, but calculate with 1024. Many, many years back they renamed the MB that is binary based to MiB (which I really hate). I grew up with learning that in IT KB/MB/GB are not factors of 1000 but 1024.
Then these idiot hard disk manufacturers started to calculate with 1000 and people were angry and confused that they got less storage than they expected. As far as I know RAM manufacturers still use the term MB but caluclate with 1024.

Sorry for my rant, but this stuff still makes me angry every time I think about it.

@tessus
Copy link
Collaborator

tessus commented Aug 7, 2023

While working on the cli code (almost done, btw), I have noticed that expired files are also listed. There should be a check (on the server) inverse to what is done when removing expired files when building the list of files.

@orhun
Copy link
Owner

orhun commented Aug 9, 2023

Then these idiot hard disk manufacturers started to calculate with 1000 and people were angry and confused that they got less storage than they expected. As far as I know RAM manufacturers still use the term MB but caluclate with 1024.

Lol, yeah, I can totally relate with this rant 😄

@ajbdev ajbdev deleted the json-index-path branch August 31, 2023 06:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants